xen.git
16 years agoi386: fix handling of Xen entries in final L2 page table
Keir Fraser [Wed, 15 Jul 2009 12:07:30 +0000 (13:07 +0100)]
i386: fix handling of Xen entries in final L2 page table

Running Xen on top of KVM exposed an issue that is also latent on
real hardware: so far, updating any L3 entry caused the Xen-owned
part of the L2 table referenced by the final L3 entry to be
re-initialized. This was not only unnecessary; it also meant Xen
relied on the TLB entry mapping the L2 page being updated not going
away in the meantime, since as a first step the full range of
Xen-owned entries in the L2 was replaced by the respective ones from
the idle page table, and only then were the per-domain entries
rewritten to their intended values.

This part of the initialization only needs to be done once, when the
page becomes an L2-with-Xen-entries (PGT_pae_xen_l2) one, i.e. it can
be moved to alloc_l2_table(). Only the linear page table setup has to
remain where it always was.

Signed-off-by: Jan Beulich <jbeulich@novell.com>
16 years agostubdom: make -> $(MAKE)
Keir Fraser [Wed, 15 Jul 2009 09:31:50 +0000 (10:31 +0100)]
stubdom: make -> $(MAKE)

Signed-off-by: Christoph Egger <Christoph.Egger@amd.com>
16 years agoFix a couple of comment typos.
Keir Fraser [Wed, 15 Jul 2009 09:30:59 +0000 (10:30 +0100)]
Fix a couple of comment typos.

Signed-off-by: Christoph Egger <Christoph.Egger@amd.com>
16 years agox86: Fix an oversight of c/s 19927 - per-CPU data accesses must
Keir Fraser [Wed, 15 Jul 2009 08:14:19 +0000 (09:14 +0100)]
x86: Fix an oversight of c/s 19927 - per-CPU data accesses must
not be iterated over using NR_CPUS bound loops.

Signed-off-by: Jan Beulich <jbeulich@novell.com>
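
The loop pattern named in the commit above can be illustrated with a small sketch. It is not Xen's code: the mask, the storage array and the for_each_possible_cpu() stand-in are all hypothetical, and the assumption (not stated in the commit message) is that per-CPU storage exists only for CPUs present in a mask, so a plain 0..NR_CPUS-1 loop would touch storage that was never set up.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define NR_CPUS 64  /* compile-time upper bound (illustrative value) */

/* Hypothetical per-CPU slots; NULL means no per-CPU area was set up. */
static unsigned long *percpu_counter[NR_CPUS];

/* Bitmask of CPUs whose per-CPU area exists (stand-in for a cpumask). */
static uint64_t cpu_possible_mask;

/* Simplified stand-in: visit only CPUs that are set in the mask. */
#define for_each_possible_cpu(cpu)                      \
    for ((cpu) = 0; (cpu) < NR_CPUS; (cpu)++)           \
        if (cpu_possible_mask & (UINT64_C(1) << (cpu)))

void percpu_setup(unsigned int cpu, unsigned long *storage)
{
    assert(cpu < NR_CPUS);
    percpu_counter[cpu] = storage;
    cpu_possible_mask |= UINT64_C(1) << cpu;
}

/* Sums the counters without dereferencing slots of absent CPUs. */
unsigned long sum_counters(void)
{
    unsigned long total = 0;
    unsigned int cpu;

    for_each_possible_cpu(cpu)
        total += *percpu_counter[cpu];
    return total;
}
```

A loop written as "for (cpu = 0; cpu < NR_CPUS; cpu++) total += *percpu_counter[cpu];" would dereference NULL for every CPU that was never set up, which is the class of bug the sketch is meant to show.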
16 years agostubdom: don't leak include dir on distclean
Keir Fraser [Wed, 15 Jul 2009 08:11:40 +0000 (09:11 +0100)]
stubdom: don't leak include dir on distclean

Signed-off-by: Christoph Egger <Christoph.Egger@amd.com>
16 years agominios: switch to C99 integer types
Keir Fraser [Wed, 15 Jul 2009 08:09:48 +0000 (09:09 +0100)]
minios: switch to C99 integer types

This is a necessary step to make minios build on NetBSD.

Signed-off-by: Christoph Egger <Christoph.Egger@amd.com>
16 years agodebug=y default during development
Keir Fraser [Tue, 14 Jul 2009 20:25:24 +0000 (21:25 +0100)]
debug=y default during development
Signed-off-by: Keir Fraser <keir.fraser@citrix.com>
16 years agostubdom: fix stubdom-dm error path
Keir Fraser [Tue, 14 Jul 2009 13:46:04 +0000 (14:46 +0100)]
stubdom: fix stubdom-dm error path

Exit the shell and not a subshell.

Signed-off-by: Christoph Egger <Christoph.Egger@amd.com>
16 years agopython: Remove tab indents.
Keir Fraser [Tue, 14 Jul 2009 13:43:19 +0000 (14:43 +0100)]
python: Remove tab indents.
Signed-off-by: Keir Fraser <keir.fraser@citrix.com>
16 years agopass-through: use vdevfn in xm_pci_attach()
Keir Fraser [Tue, 14 Jul 2009 13:38:56 +0000 (14:38 +0100)]
pass-through: use vdevfn in xm_pci_attach()

Use vdevfn in xm_pci_attach() for non-zero functions;
the vslot element of dev dictionaries no longer exists.

Signed-off-by: Simon Horman <horms@verge.net.au>
16 years agodocs/xenapi: Update examples section reflecting the current behaviour.
Keir Fraser [Tue, 14 Jul 2009 13:37:53 +0000 (14:37 +0100)]
docs/xenapi: Update examples section reflecting the current behaviour.

Signed-off-by: Andreas Florath <xen@flonatel.org>
16 years agostubdom: Install and use stubdompath.sh
Keir Fraser [Mon, 13 Jul 2009 15:50:53 +0000 (16:50 +0100)]
stubdom: Install and use stubdompath.sh
Signed-off-by: Christoph Egger <Christoph.Egger@amd.com>
16 years agox86-64: reduce symbol table size
Keir Fraser [Mon, 13 Jul 2009 15:49:50 +0000 (16:49 +0100)]
x86-64: reduce symbol table size

With all of Xen's symbols sitting within a 2GB range on x86-64, they
can be referred to by the kallsyms-like offset table using 4- instead
of 8-byte slots.

The marker table can use 4-byte slots in all cases, just like the
table entry counts can (though that's only a minor improvement).

If ia64's PERCPU_ADDR were moved down to (KERNEL_START + 2GB -
PERCPU_PAGE_SIZE), it could also utilize the more compact form.

Signed-off-by: Jan Beulich <jbeulich@novell.com>
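
The space saving described in the commit above comes from storing each symbol as a 32-bit offset from a common base instead of a full 64-bit address. The following is a minimal sketch of that idea only; the structure, the function names and the base address used with it are hypothetical and do not reflect the layout of Xen's actual symbol table.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical compact symbol table: 32-bit offsets from a common base. */
struct compact_symtab {
    uint64_t base;            /* lowest address covered by the table */
    const uint32_t *offsets;  /* one 4-byte slot per symbol */
    size_t count;
};

/*
 * Encode an address as an offset from base. This works only while the
 * address lies within 4 GiB above base; the commit relies on the tighter
 * 2GB bound that holds for Xen's symbols on x86-64.
 */
int symtab_encode(uint64_t base, uint64_t addr, uint32_t *slot)
{
    if (addr < base || addr - base > UINT32_MAX)
        return -1;
    *slot = (uint32_t)(addr - base);
    return 0;
}

/* Recover the full address of symbol i. */
uint64_t symtab_address(const struct compact_symtab *t, size_t i)
{
    assert(i < t->count);
    return t->base + t->offsets[i];
}
```

Each slot takes half the space of a 64-bit address, at the cost of one addition per lookup.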
16 years agoMapping grant references into HVM guests, take 2
Keir Fraser [Mon, 13 Jul 2009 11:35:34 +0000 (12:35 +0100)]
Mapping grant references into HVM guests, take 2

After some discussion, here's a second version of the patch I posted a
couple of weeks back to map grant references into HVM guests.  As
before, this is done by modifying the P2M map, but this time there's
no new hypercall to do it.  Instead, the existing GNTTABOP_map is
overloaded to perform a P2M mapping if called from a shadow mode
translate guest.  This matches the IA64 API.

Signed-off-by: Steven Smith <steven.smith@citrix.com>
Acked-by: Tim Deegan <tim.deegan@citrix.com>
CC: Bhaskar Jayaraman <Bhaskar.Jayaraman@lsi.com>
16 years agoEliminate grant_table_op restriction
Keir Fraser [Mon, 13 Jul 2009 11:18:04 +0000 (12:18 +0100)]
Eliminate grant_table_op restriction

Eliminate the hard-coded, arbitrarily chosen limit of 512 grant table
ops a domain may submit at a time, and instead check for necessary
preemption after each individual element got processed, invoking the
hypercall continuation logic when necessary.

Signed-off-by: Jan Beulich <jbeulich@novell.com>
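
The control flow described in the commit above (process one element, then check for preemption, and continue later from where processing stopped) can be sketched as follows. Everything here is a hypothetical stand-in: the op structure, the budget-based preemption check and process_ops() are illustrative and are not Xen's grant table or hypercall continuation code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins; not Xen's real types or helpers. */
struct op { int arg; int result; };

static unsigned int budget;   /* ops allowed before "preemption" is due */

static bool preempt_needed(void)
{
    return budget == 0;
}

static void do_one(struct op *o)
{
    o->result = o->arg * 2;
    if (budget)
        budget--;
}

/*
 * Process ops[start..count-1] with no fixed batch limit. After each
 * element, check whether preemption is needed. Returns the index of the
 * first unprocessed op; a value below count means the caller should
 * arrange a continuation that re-enters with start set to that value.
 */
size_t process_ops(struct op *ops, size_t start, size_t count)
{
    size_t i = start;

    assert(start <= count);
    while (i < count) {
        do_one(&ops[i]);
        i++;
        if (i < count && preempt_needed())
            break;
    }
    return i;
}
```

Because progress is returned as an index, the number of ops a caller may submit is bounded only by how often the continuation is taken, not by a hard-coded limit.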
16 years agoAllow XENMEM_exchange to support exchange on foreign domains.
Keir Fraser [Mon, 13 Jul 2009 11:17:05 +0000 (12:17 +0100)]
Allow XENMEM_exchange to support exchange on foreign domains.

Signed-off-by: Yunhong Jiang <yunhong.jiang@intel.com>
16 years agoVT-d: fix assertion fault in pci passthrough code
Keir Fraser [Mon, 13 Jul 2009 10:52:49 +0000 (11:52 +0100)]
VT-d: fix assertion fault in pci passthrough code

Remove ASSERT(spin_is_locked(&pcidevs_lock)) in
pci_get_pdev_by_domain() to give the caller the flexibility of not
holding the lock if it does not care whether the device is hot-removed
between the time it gets the pdev and the time the pdev is used to get
the corresponding VT-d engine.  In the new RHSA use case, we just want
to get the VT-d engine of any device passed through to the domain.
Also, rename the RHSA field "domain" to "proximity domain" to avoid
overloading the term "domain" in a virtualization context.

Signed-off-by: Allen Kay <allen.m.kay@intel.com>
16 years agox86: merge final linking scripts
Keir Fraser [Mon, 13 Jul 2009 10:51:07 +0000 (11:51 +0100)]
x86: merge final linking scripts

While unrelated to the previous four patches, I realized that the two
scripts are nearly identical when coding those earlier patches, and
this patch depends on them in order to apply cleanly.

As an extra measure, it also adjusts the (unused) space freed at the
end of the per-CPU area to include all alignment space needed before
the first actual constituent of the .bss section (up to 7 pages on
x86-64).

Signed-off-by: Jan Beulich <jbeulich@novell.com>
16 years agoMove cpu_{sibling,core}_map into per-CPU space
Keir Fraser [Mon, 13 Jul 2009 10:45:31 +0000 (11:45 +0100)]
Move cpu_{sibling,core}_map into per-CPU space

These cpu maps get read from all CPUs, so apart from addressing the
quadratic (nr_cpus squared) growth of these objects, they also get
moved into the previously introduced read-mostly sub-section of the
per-CPU section. That avoids wasting a full cacheline to align (and
properly pad) them, which would be undue overhead on systems with low
NR_CPUS.

Signed-off-by: Jan Beulich <jbeulich@novell.com>
16 years agoIntroduce and use a per-CPU read-mostly sub-section
Keir Fraser [Mon, 13 Jul 2009 10:32:41 +0000 (11:32 +0100)]
Introduce and use a per-CPU read-mostly sub-section

Mixing data that only gets set up once and then (perhaps
frequently) gets read by remote CPUs with data that the local CPU may
modify (again, perhaps frequently) still causes undesirable cache
protocol related bus traffic, so separate the former class of objects
from the latter.

The objects converted here were picked based on their write-once
(or write-very-rarely) properties; some further adjustments may
be desirable subsequently. The primary users of the new sub-section
will result from the next patch.

Signed-off-by: Jan Beulich <jbeulich@novell.com>
16 years agox86: move ucode_cpu_info into per-CPU space
Keir Fraser [Mon, 13 Jul 2009 10:31:34 +0000 (11:31 +0100)]
x86: move ucode_cpu_info into per-CPU space

Signed-off-by: Jan Beulich <jbeulich@novell.com>
16 years agox86: move init_tss into per-CPU space
Keir Fraser [Mon, 13 Jul 2009 10:31:08 +0000 (11:31 +0100)]
x86: move init_tss into per-CPU space

Signed-off-by: Jan Beulich <jbeulich@novell.com>
16 years agoia64: consolidate final linking step
Keir Fraser [Mon, 13 Jul 2009 10:19:31 +0000 (11:19 +0100)]
ia64: consolidate final linking step

This basically makes the final linking stage identical to x86's (with
the sole difference being that ia64 has the linker generate a map
file, while x86 doesn't), so would generally allow moving the final
linking rule into xen/Rules.mk.

Signed-off-by: Jan Beulich <jbeulich@novell.com>
16 years agoi386: fix boot
Keir Fraser [Mon, 13 Jul 2009 10:18:57 +0000 (11:18 +0100)]
i386: fix boot

Since the Xen heap pages (which are the only ones mapped at this
point) don't get passed to init_boot_pages(), it has no place to store
the bootmem regions without faulting. Hence, a mapped page must be
passed to that function as the very first thing.

Signed-off-by: Jan Beulich <jbeulich@novell.com>
16 years agostubdom: Fix stubdom-dm after c/s 19818
Keir Fraser [Fri, 10 Jul 2009 17:12:13 +0000 (18:12 +0100)]
stubdom: Fix stubdom-dm after c/s 19818

19818 added the following line to stubdom-dm:
  . ./stubdompath.sh
and replaced many paths with variables.  However the path to
stubdompath.sh is obviously wrong and stubdompath.sh is nowhere to be
found anyway.  For the moment I am dropping . ./stubdompath.sh and
hardcoding the values of the variables.

Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
16 years agonetbsd: remove qemu-ifup-nbsd, now that it is in the right place
Keir Fraser [Thu, 9 Jul 2009 16:06:40 +0000 (17:06 +0100)]
netbsd: remove qemu-ifup-nbsd, now that it is in the right place
(ioemu c/s 5cc34ea27f1cbd1a0560cfca91fb89ccd6d5726f)

Signed-off-by: Christoph Egger <Christoph.Egger@amd.com>
16 years agoMake python check scripts use the $(PYTHON) make variable.
Keir Fraser [Thu, 9 Jul 2009 16:05:07 +0000 (17:05 +0100)]
Make python check scripts use the $(PYTHON) make variable.

Signed-off-by: Christoph Egger <Christoph.Egger@amd.com>
16 years agotools: python -> $(PYTHON)
Keir Fraser [Thu, 9 Jul 2009 15:06:52 +0000 (16:06 +0100)]
tools: python -> $(PYTHON)
Signed-off-by: Christoph Egger <Christoph.Egger@amd.com>
16 years agoUpdate QEMU_TAG to 5cc34ea27f1cbd1a0560cfca91fb89ccd6d5726f
Keir Fraser [Thu, 9 Jul 2009 15:06:01 +0000 (16:06 +0100)]
Update QEMU_TAG to 5cc34ea27f1cbd1a0560cfca91fb89ccd6d5726f

16 years agoUse $(PYTHON) Makefile variable when building the hypervisor.
Keir Fraser [Thu, 9 Jul 2009 14:26:24 +0000 (15:26 +0100)]
Use $(PYTHON) Makefile variable when building the hypervisor.

Signed-off-by: Christoph Egger <Christoph.Egger@amd.com>
16 years agox86_64: Fix Xen relocation size -- there is no longer an allocation
Keir Fraser [Thu, 9 Jul 2009 07:52:31 +0000 (08:52 +0100)]
x86_64: Fix Xen relocation size -- there is no longer an allocation
bitmap to account for.

Signed-off-by: Keir Fraser <keir.fraser@citrix.com>
16 years agoReplace boot-time free-pages bitmap with a region list.
Keir Fraser [Wed, 8 Jul 2009 21:08:31 +0000 (22:08 +0100)]
Replace boot-time free-pages bitmap with a region list.

Signed-off-by: Keir Fraser <keir.fraser@citrix.com>
16 years agoDo not use bitmap allocator after boot time.
Keir Fraser [Wed, 8 Jul 2009 15:47:58 +0000 (16:47 +0100)]
Do not use bitmap allocator after boot time.

Signed-off-by: Keir Fraser <keir.fraser@citrix.com>
16 years agox86 hvm: Use 'x' as parameter name for macros converting between
Keir Fraser [Wed, 8 Jul 2009 13:22:00 +0000 (14:22 +0100)]
x86 hvm: Use 'x' as parameter name for macros converting between
{vcpu,domain} and {vlapic,vpic,vrtc,hpet}. Completely avoids
accidental aliasing.

Signed-off-by: Keir Fraser <keir.fraser@citrix.com>
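
The aliasing that the commit above avoids can be shown with a small sketch. The structures and macros below are hypothetical simplifications, not the actual Xen definitions: when a macro parameter has the same name as a structure member used in the macro body, the preprocessor substitutes the argument into the member position too.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, heavily simplified structures. */
struct vlapic { int id; };
struct vcpu   { int vcpu_id; struct vlapic vlapic; };

#define container_of(ptr, type, member) \
    ((type *)((char *)(ptr) - offsetof(type, member)))

/*
 * Problematic form (left commented out): the parameter name `vlapic` is
 * also the member name, so vlapic_vcpu(p) would expand to
 * container_of((p), struct vcpu, p) and fail to compile unless the
 * argument happens to be spelled `vlapic`.
 *
 * #define vlapic_vcpu(vlapic) container_of((vlapic), struct vcpu, vlapic)
 */

/* A neutral parameter name does not collide with the member names. */
#define vcpu_vlapic(x) (&(x)->vlapic)
#define vlapic_vcpu(x) container_of((x), struct vcpu, vlapic)
```

With the neutral name, vlapic_vcpu(l) works for a pointer variable of any name, for example mapping the address of v.vlapic back to &v.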
16 years agoblktap2: Fix compile warning with gcc4.
Keir Fraser [Wed, 8 Jul 2009 10:00:23 +0000 (11:00 +0100)]
blktap2: Fix compile warning with gcc4.
Signed-off-by: Keir Fraser <keir.fraser@citrix.com>
16 years agofs-back: better error handling in fs-backend
Keir Fraser [Wed, 8 Jul 2009 09:58:09 +0000 (10:58 +0100)]
fs-back: better error handling in fs-backend

Currently most of the error checking in fs-backend is done by the use
of asserts that would terminate the daemon in case of a single error
on a single request.  This patch replaces the asserts with debugging
messages and terminates the connection on which the error occurred.
With this patch applied I was able to complete successfully over 1000
live migrations with stubdoms.

Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
16 years agostubdom: fix a race that affects live migration with stubdoms
Keir Fraser [Wed, 8 Jul 2009 09:51:00 +0000 (10:51 +0100)]
stubdom: fix a race that affects live migration with stubdoms

This patch fixes a race during live migration with stubdoms: right
after the stubdom dies the configuration file of the VM is removed by
stubdom-dm but, in case of a live migration, the configuration file
could be the one of the new VM in the process of being created.
Removing the config file before destroying the stubdom is enough to
solve the race.

Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
16 years agox86: extend mmu_update hypercall to allow update of foreign pagetables.
Keir Fraser [Tue, 7 Jul 2009 13:38:59 +0000 (14:38 +0100)]
x86: extend mmu_update hypercall to allow update of foreign pagetables.

Signed-off-by: Jiang, Yunhong <yunhong.jiang@intel.com>
16 years agox86,hvm: Allow delivery of timer interrupts to VCPUs != 0
Keir Fraser [Tue, 7 Jul 2009 13:21:16 +0000 (14:21 +0100)]
x86,hvm: Allow delivery of timer interrupts to VCPUs != 0

This patch is needed for kexec/kdump since VCPU#0 is halted.

Signed-off-by: Kouya Shimura <kouya@jp.fujitsu.com>
Signed-off-by: Keir Fraser <keir.fraser@citrix.com>
16 years agox86,hvm: cleanup hpet.c vcpu handling same as i8254.c/rtc.c
Keir Fraser [Tue, 7 Jul 2009 13:08:47 +0000 (14:08 +0100)]
x86,hvm: cleanup hpet.c vcpu handling same as i8254.c/rtc.c

- introduce macros: domain_vhpet, vcpu_vhpet, vhpet_domain, vhpet_vcpu
- remove *vcpu field from struct HPETState
- modify guest_time_hpet() to take *vhpet instead of *vcpu as 1st
  argument

Signed-off-by: Kouya Shimura <kouya@jp.fujitsu.com>
16 years agodocs: update vtd.txt for pv-ops dom0
Keir Fraser [Tue, 7 Jul 2009 13:07:08 +0000 (14:07 +0100)]
docs: update vtd.txt for pv-ops dom0

Now that VT-d works with pv-ops dom0, update vtd.txt to explain how to
build and use VT-d with pv-ops.

Signed-off-by: Weidong Han <weidong.han@intel.com>
16 years agovmx: Add support for Pause-Loop Exiting
Keir Fraser [Tue, 7 Jul 2009 13:06:35 +0000 (14:06 +0100)]
vmx: Add support for Pause-Loop Exiting

New NHM processors will support Pause-Loop Exiting by adding 2
VM-execution control fields:
PLE_Gap    - upper bound on the amount of time between two successive
             executions of PAUSE in a loop.
PLE_Window - upper bound on the amount of time a guest is allowed to
             execute in a PAUSE loop

If the time between this execution of PAUSE and the previous one
exceeds PLE_Gap, the processor considers this PAUSE to belong to a new
loop. Otherwise, the processor determines the total execution time of
this loop (since the first PAUSE in this loop) and triggers a VM exit
if the total time exceeds PLE_Window.
* Refer to SDM volume 3b, sections 21.6.13 & 22.1.3.

Pause-Loop Exiting can be used to detect Lock-Holder Preemption, where
one VP is scheduled out while holding a spinlock, and other VPs
contending for the same lock are then scheduled in only to waste CPU
time.

Our tests indicate that most spinlocks are held for less than 2^12
cycles.  Performance tests show that with 2X LP over-commitment we can
get a +2% performance improvement for kernel build (even more gain
with more LPs).

Signed-off-by: Zhai Edwin <edwin.zhai@intel.com>
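
The PLE_Gap / PLE_Window rule described in the commit above can be modelled in a few lines. This is a simplified software model of the description only, not the processor's implementation and not Xen code; the structure, the function name and the behaviour after a triggered exit are assumptions made for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical model state; times are in arbitrary cycle units. */
struct ple_state {
    uint64_t gap;      /* PLE_Gap: max time between two PAUSEs of one loop */
    uint64_t window;   /* PLE_Window: max total time spent in one loop */
    uint64_t first;    /* time of the first PAUSE in the current loop */
    uint64_t last;     /* time of the most recent PAUSE */
    bool in_loop;      /* false until the first PAUSE has been seen */
};

/* Record a PAUSE at time `now`; returns true if it would cause a VM exit. */
bool ple_on_pause(struct ple_state *s, uint64_t now)
{
    assert(!s->in_loop || now >= s->last);

    if (!s->in_loop || now - s->last > s->gap) {
        /* First PAUSE ever, or the gap was exceeded: start a new loop. */
        s->in_loop = true;
        s->first = now;
        s->last = now;
        return false;
    }

    /* Same loop: exit once the loop has run longer than the window. */
    s->last = now;
    return now - s->first > s->window;
}
```

With a gap of 41 and a window of 4096 (illustrative values; 4096 echoes the 2^12 cycles mentioned above), PAUSEs spaced 40 units apart stay in one loop and trigger an exit once the loop has lasted more than 4096 units, while a single spacing of 60 units starts a new loop.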
16 years agoxenstat: Use backend path which is compatible with pvops and 2.6.18-xen kernels.
Keir Fraser [Tue, 7 Jul 2009 13:01:30 +0000 (14:01 +0100)]
xenstat: Use backend path which is compatible with pvops and 2.6.18-xen kernels.

Signed-off-by: Ian Campbell <ian.campbell@citrix.com>
16 years agox86, hvm: fix get msix entry error
Keir Fraser [Mon, 6 Jul 2009 10:58:02 +0000 (11:58 +0100)]
x86, hvm: fix get msix entry error

The MSI-X entry number was computed incorrectly: it should be obtained
by division rather than by modulus.

Signed-off-by: Yang Zhang <yang.zhang@intel.com>
Signed-off-by: Qing He <qing.he@intel.com>
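
The difference between the two operators in the fix above can be shown with a short sketch. It assumes the standard 16-byte MSI-X table entry size from the PCI specification; the function names are hypothetical and this is not the code touched by the commit.

```c
#include <assert.h>
#include <stdint.h>

#define MSIX_ENTRY_SIZE 16u  /* bytes per MSI-X table entry (PCI spec) */

/* Index of the table entry containing byte `offset` of the table. */
unsigned int msix_entry_nr(uint32_t offset)
{
    return offset / MSIX_ENTRY_SIZE;
}

/* Byte position inside that entry; this is what modulus yields. */
unsigned int msix_entry_offset(uint32_t offset)
{
    return offset % MSIX_ENTRY_SIZE;
}
```

For a table offset of 0x24, division gives entry 2 while modulus gives 4, which is a position inside the entry rather than an entry number.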
16 years agoAMD IOMMU: Add suspend and resume support for amd iommu.
Keir Fraser [Mon, 6 Jul 2009 10:57:18 +0000 (11:57 +0100)]
AMD IOMMU: Add suspend and resume support for amd iommu.

Signed-off-by: Wei Wang <wei.wang2@amd.com>
16 years agoAMD IOMMU: Make iommu suspend & resume functions more generic.
Keir Fraser [Mon, 6 Jul 2009 10:56:51 +0000 (11:56 +0100)]
AMD IOMMU: Make iommu suspend & resume functions more generic.

Signed-off-by: Wei Wang <wei.wang2@amd.com>
16 years agoAMD IOMMU: Clean up hardware initialization functions to make them
Keir Fraser [Mon, 6 Jul 2009 10:56:17 +0000 (11:56 +0100)]
AMD IOMMU: Clean up hardware initialization functions to make them
more friendly to iommu suspend and resume operations.

Signed-off-by: Wei Wang <wei.wang2@amd.com>
16 years ago32on64: increase size of compat argument translation area to 2 pages.
Keir Fraser [Mon, 6 Jul 2009 10:55:17 +0000 (11:55 +0100)]
32on64: increase size of compat argument translation area to 2 pages.

The existing single page is not quite large enough to translate a
XENMEM_exchange hypercall with order=9. Since Linux uses
MAX_CONTIG_ORDER of 9 this seems like a reasonable upper bound to
support.

Signed-off-by: Ian Campbell <ian.campbell@citrix.com>
16 years agox86: PERCPU_SHIFT can be reduced to 12 now compat_arg_xlat_area is not
Keir Fraser [Mon, 6 Jul 2009 10:55:01 +0000 (11:55 +0100)]
x86: PERCPU_SHIFT can be reduced to 12 now compat_arg_xlat_area is not
directly a per-cpu object.

Signed-off-by: Keir Fraser <keir.fraser@citrix.com>
16 years agoAvoid compat_arg_xlat to be a large per-CPU object
Keir Fraser [Mon, 6 Jul 2009 10:51:19 +0000 (11:51 +0100)]
Avoid compat_arg_xlat to be a large per-CPU object

Signed-off-by: Jan Beulich <jbeulich@novell.com>
16 years agox86 shadow: disable fast np path in OOS.
Keir Fraser [Mon, 6 Jul 2009 10:50:30 +0000 (11:50 +0100)]
x86 shadow: disable fast np path in OOS.

Signed-off-by: Gianluca Guida <gianluca.guida@eu.citrix.com>

16 years agox86 shadow: Fix lock-less race between resync and fast path.
Keir Fraser [Mon, 6 Jul 2009 10:49:56 +0000 (11:49 +0100)]
x86 shadow: Fix lock-less race between resync and fast path.

Signed-off-by: Gianluca Guida <gianluca.guida@eu.citrix.com>
16 years agoxend: allow pv_ops kernel driver pci-stub to hide devices for assignment
Keir Fraser [Mon, 6 Jul 2009 10:48:44 +0000 (11:48 +0100)]
xend: allow pv_ops kernel driver pci-stub to hide devices for assignment

pciback is used by VT-d to hide devices for assignment. But in pv-ops
dom0, pciback is not supported yet. Fortunately, the pci-stub module is
used to hide devices in Linux for KVM VT-d device assignment, and it is
included in pv-ops dom0. So pci-stub can be used to hide devices for
assignment.

A device must be hidden before assignment. The control panel checks
whether devices can be assigned, and can list assignable devices by
reading the devices owned by pciback. This patch changes the checks and
also lists assignable devices that are owned by pci-stub. With pci-stub
hiding the devices and this patch passing the checks in the control
panel, device assignment with VT-d works on Xen with pv-ops dom0.

Signed-off-by: Weidong Han <weidong.han@intel.com>
16 years agoblktap2: fix save/restore/migration
Keir Fraser [Mon, 6 Jul 2009 10:47:34 +0000 (11:47 +0100)]
blktap2: fix save/restore/migration

blktap2 devices use a regular 'phy' vbd blkback backend, causing
Blktap2Controller to trample the devices' parameters. This causes
problems with save/restore and managed domains, among other
things. This patch modifies Blktap2Controller to store both the vbd
and tap2 parameters in xenstore, and stops it from trampling the
device's config on device creation.

 * store blktap2 parameters in xenstore
 * restore blktap2 device config to original state once the underlying
   vbd device is created (this fixes managed domains)
 * use blktap2 parameters rather than vbd parameters when building
   blktap2 device configurations
 * remove blktap2 specific code from XendConfig

Signed-off-by: Ryan O'Connor <rjo@cs.ubc.ca>
16 years agoblktap2: separate blktap1/blktap2 disk types
Keir Fraser [Mon, 6 Jul 2009 10:47:02 +0000 (11:47 +0100)]
blktap2: separate blktap1/blktap2 disk types

 * separate blktap1/blktap2 disk types
 * use blktap1 when driver is not in explicit list of blktap2 drivers,
   rather than current check against list of blktap1 only drivers
 * remove 'tapdisk' disk type (it is not a tapdisk disk type) and fix
   tapdisk disk type check in XenConfig

Signed-off-by: Ryan O'Connor <rjo@cs.ubc.ca>
16 years agox86: Process only pending timers in acpi idle handler, not all
Keir Fraser [Mon, 6 Jul 2009 10:46:22 +0000 (11:46 +0100)]
x86: Process only pending timers in acpi idle handler, not all
softirqs. This fixes a bug where bailing into SCHEDULE_SOFTIRQ may not
actually return.

From: Ke Yu <ke.yu@intel.com>
Signed-off-by: Keir Fraser <keir.fraser@citrix.com>
16 years agolibfsimage: Support for zfs version 16.
Keir Fraser [Mon, 6 Jul 2009 10:42:05 +0000 (11:42 +0100)]
libfsimage: Support for zfs version 16.

Remove version checks to support boot of ZFS root filesystem version
16.

Signed-off-by: Susan Kamm-Worrell <susan.kamm-worrell@sun.com>
16 years agoFix c/s 19886: Must free pages after synchronously scrubbing them
Keir Fraser [Fri, 3 Jul 2009 07:54:51 +0000 (08:54 +0100)]
Fix c/s 19886: Must free pages after synchronously scrubbing them
Signed-off-by: Keir Fraser <keir.fraser@citrix.com>
16 years agotools: Always check for __linux__ not __Linux__
Keir Fraser [Thu, 2 Jul 2009 20:45:30 +0000 (21:45 +0100)]
tools: Always check for __linux__ not __Linux__
Signed-off-by: Keir Fraser <keir.fraser@citrix.com>
16 years agoRemove page-scrub lists and async scrubbing.
Keir Fraser [Thu, 2 Jul 2009 15:45:31 +0000 (16:45 +0100)]
Remove page-scrub lists and async scrubbing.

The original user for this was domain destruction. Now that this is
preemptible all the way back up to dom0 userspace, asynchrony is
better introduced at that level, if at all, imo.

Signed-off-by: Keir Fraser <keir.fraser@citrix.com>
16 years agokexec: switch to a known good/static GDT before kexec
Keir Fraser [Thu, 2 Jul 2009 15:16:15 +0000 (16:16 +0100)]
kexec: switch to a known good/static GDT before kexec

kexec has been failing (at least on 32on64, didn't try others) since
18771:8e18dd41c6c7 "x86: reduce GDT switching". Ensure that we are
using a known good GDT before attempting to switch to compatibility
mode.

Signed-off-by: Ian Campbell <ian.campbell@citrix.com>
16 years agogtraceview: compile fixes for NetBSD.
Keir Fraser [Thu, 2 Jul 2009 10:36:17 +0000 (11:36 +0100)]
gtraceview: compile fixes for NetBSD.

Signed-off-by: Christoph Egger <Christoph.Egger@amd.com>
16 years agoxend: Remove disused constants
Keir Fraser [Thu, 2 Jul 2009 10:35:30 +0000 (11:35 +0100)]
xend: Remove disused constants

This patch removes disused constants from XendConstants.py.

Signed-off-by: Masaki Kanno <kanno.masaki@jp.fujitsu.com>
16 years agoxend: fix an undefined name error: mac may be referenced before definition.
Keir Fraser [Thu, 2 Jul 2009 10:34:48 +0000 (11:34 +0100)]
xend: fix an undefined name error: mac may be referenced before definition.

Signed-off-by: Zhigang Wang <zhigang.x.wang@oracle.com>
16 years agolibxc: Fix bugs in xc_exchange_page: pfn_type indexed by gpfn.
Keir Fraser [Thu, 2 Jul 2009 10:33:43 +0000 (11:33 +0100)]
libxc: Fix bugs in xc_exchange_page: pfn_type indexed by gpfn.

Signed-off-by: Jiang, Yunhong <yunhong.jiang@intel.com>
16 years agoxend: Restore uname of blktap for managed domains
Keir Fraser [Thu, 2 Jul 2009 10:32:49 +0000 (11:32 +0100)]
xend: Restore uname of blktap for managed domains

Signed-off-by: Masaki Kanno <kanno.masaki@jp.fujitsu.com>
16 years agox86 hvm: Remove assertion that PIC IRQs are delivered only to VCPU0.
Keir Fraser [Thu, 2 Jul 2009 10:31:58 +0000 (11:31 +0100)]
x86 hvm: Remove assertion that PIC IRQs are delivered only to VCPU0.

It's no longer true, if the guest reprograms IOAPIC pin0 for example.

Signed-off-by: Kouya Shimura <kouya@jp.fujitsu.com>
16 years agobuild: Fix the detection of udev with udevadm version < 128
Keir Fraser [Thu, 2 Jul 2009 10:31:00 +0000 (11:31 +0100)]
build: Fix the detection of udev with udevadm version < 128

Signed-off-by: Marc-A. Dahlhaus <mad@wol.de>
16 years agox86 hvm: Allow delivery of legacy 8259 interrupts to VCPUs != 0.
Keir Fraser [Wed, 1 Jul 2009 19:22:29 +0000 (20:22 +0100)]
x86 hvm: Allow delivery of legacy 8259 interrupts to VCPUs != 0.

Signed-off-by: Keir Fraser <keir.fraser@citrix.com>
16 years agox86 hvm: Fix #UD interception.
Keir Fraser [Wed, 1 Jul 2009 13:58:31 +0000 (14:58 +0100)]
x86 hvm: Fix #UD interception.
 * Interception should be standard part of HVM_TRAP_MASK
 * Failed intercept should quietly forward #UD to the guest

Signed-off-by: Keir Fraser <keir.fraser@citrix.com>
16 years agoRemove redundant semicolons
Keir Fraser [Wed, 1 Jul 2009 09:54:25 +0000 (10:54 +0100)]
Remove redundant semicolons

Signed-off-by: Rikiya Ayukawa <ayukawa.rikiya@jp.fujitsu.com>
16 years agoblktap2: add blktap2 device class and device controller
Keir Fraser [Wed, 1 Jul 2009 09:53:15 +0000 (10:53 +0100)]
blktap2: add blktap2 device class and device controller

blktap2 devices must be handled differently than blktap1
devices. blktap2 devices require a sysfs write to close the underlying
device, as well as extra sysfs writes when the domU is
paused/unpaused. The differences between blktap1 and blktap2 are great
enough to warrant the creation of a new device class, 'tap2', and
device controller for blktap2 devices.

  * add a new device controller (Blktap2Controller) and device class
    (tap2) for blktap2 devices
  * move blktap2 specific code from DevController to Blktap2Controller
  * if possible, check xenstore to determine block device class
  * use vmpath (/vm/<uuid>/) when releasing devices
  * modify linux hotplug cleanup script to handle blktap2 device
    removal

Signed-off-by: Ryan O'Connor <rjo@cs.ubc.ca>
16 years agoUpdate QEMU_TAG to b471f03d51cde3976b6d52179ca2a86d8906a587
Keir Fraser [Tue, 30 Jun 2009 15:00:57 +0000 (16:00 +0100)]
Update QEMU_TAG to b471f03d51cde3976b6d52179ca2a86d8906a587

16 years agoVT-d: Remove the dprintk() in alloc_pgtable_maddr()
Keir Fraser [Tue, 30 Jun 2009 15:00:29 +0000 (16:00 +0100)]
VT-d: Remove the dprintk() in alloc_pgtable_maddr()

The trivial message is printed too many times when Xen boots and when
we create HVM guests with devices assigned.

Signed-off-by: Dexuan Cui <dexuan.cui@intel.com>
16 years agox86 hvm mce: Support HVM Guest virtual MCA handling.
Keir Fraser [Tue, 30 Jun 2009 14:40:39 +0000 (15:40 +0100)]
x86 hvm mce: Support HVM Guest virtual MCA handling.

When an MCE# happens, if the error has been contained/recovered by Xen
and it impacts one guest domain (DOM0/HVM guest/PV guest), we inject
the corresponding vMCE# into the impacted domain. The guest OS then
carries out its own recovery if it has an MCA handler.

Signed-off-by: Liping Ke <liping.ke@intel.com>
Signed-off-by: Yunhong Jiang <yunhong.jiang@intel.com>
16 years agoxend: get rid of hardcoded path in xend config file
Keir Fraser [Tue, 30 Jun 2009 14:37:14 +0000 (15:37 +0100)]
xend: get rid of hardcoded path in xend config file

* Change default settings to relative paths.
* Make xend prepend the install directory if entries have no absolute
  path

Signed-off-by: Christoph Egger <Christoph.Egger@amd.com>
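
The rule in the commit above (entries without an absolute path get the install directory prepended) is simple enough to sketch. xend itself is written in Python; the C function below is only an illustration of the rule, and its name, signature and the example prefix are hypothetical.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Write the effective path for `entry` into `out`.
 * Absolute entries are used unchanged; relative entries get `prefix`
 * (the install directory) prepended.
 * Returns 0 on success, -1 if the result does not fit into `out`.
 */
int resolve_path(const char *prefix, const char *entry,
                 char *out, size_t outlen)
{
    int n;

    assert(prefix != NULL && entry != NULL && out != NULL);
    if (entry[0] == '/')
        n = snprintf(out, outlen, "%s", entry);
    else
        n = snprintf(out, outlen, "%s/%s", prefix, entry);
    return (n < 0 || (size_t)n >= outlen) ? -1 : 0;
}
```

With a prefix of "/usr", the relative entry "lib/xen/bin" resolves to "/usr/lib/xen/bin", while "/etc/xen" is left untouched.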
16 years agox86 svm: Fix svm_update_guest_efer() after c/s 19856.
Keir Fraser [Mon, 29 Jun 2009 14:50:32 +0000 (15:50 +0100)]
x86 svm: Fix svm_update_guest_efer() after c/s 19856.

Signed-off-by: Keir Fraser <keir.fraser@citrix.com>
16 years agox86 Cx tracing: adds gtraceview & gtracestat utilities
Keir Fraser [Mon, 29 Jun 2009 10:03:24 +0000 (11:03 +0100)]
x86 Cx tracing: adds gtraceview & gtracestat utilities

Signed-off-by: Lu Guanqun <guanqun.lu@intel.com>
16 years agox86 Cx tracing: export the Cx exit reason (pending interrupt during Cx)
Keir Fraser [Mon, 29 Jun 2009 10:01:50 +0000 (11:01 +0100)]
x86 Cx tracing: export the Cx exit reason (pending interrupt during Cx)

Signed-off-by: Lu Guanqun <guanqun.lu@intel.com>
16 years agox86 Cx tracing: export expected/predicted Cx to xentrace
Keir Fraser [Mon, 29 Jun 2009 10:00:56 +0000 (11:00 +0100)]
x86 Cx tracing: export expected/predicted Cx to xentrace

Signed-off-by: Lu Guanqun <guanqun.lu@intel.com>
16 years agox86 tboot: Fix c/s 19577
Keir Fraser [Mon, 29 Jun 2009 09:58:56 +0000 (10:58 +0100)]
x86 tboot: Fix c/s 19577

Signed-off-by: Shane Wang <shane.wang@intel.com>
16 years agoMerge with ia64 tree
Keir Fraser [Mon, 29 Jun 2009 09:51:35 +0000 (10:51 +0100)]
Merge with ia64 tree

16 years ago[IA64] replace MAX_VCPUS with d->max_vcpus where necessary.
Isaku Yamahata [Mon, 29 Jun 2009 02:26:05 +0000 (11:26 +0900)]
[IA64] replace MAX_VCPUS with d->max_vcpus where necessary.

Don't use MAX_VCPUS; use d->max_vcpus instead.
Changeset 2f9e1348aa98 introduced max_vcpus to allow more vcpus
per guest. This patch is the ia64 counterpart.

Signed-off-by: Isaku Yamahata <yamahata@valinux.co.jp>
16 years ago[IA64] plumb tmem hypercall entry point on ia64.
Isaku Yamahata [Mon, 29 Jun 2009 02:23:53 +0000 (11:23 +0900)]
[IA64] plumb tmem hypercall entry point on ia64.

add do_tmem_op() to ia64_hypercall_table to support tmem.

Signed-off-by: Isaku Yamahata <yamahata@valinux.co.jp>
16 years ago[IA64] remove a warning
Isaku Yamahata [Mon, 29 Jun 2009 02:23:31 +0000 (11:23 +0900)]
[IA64] remove a warning

This patch removes the following warning.
> unwind.c:40:1: warning: "write_trylock" redefined
> In file included from xen/include/xen/sched.h:7,
>                  from unwind.c:33:
> xen/include/xen/spinlock.h:115:1: warning: this is the location of the previous definition

Signed-off-by: Isaku Yamahata <yamahata@valinux.co.jp>
16 years agofs-backend: fix default export and filename checks
Keir Fraser [Sat, 27 Jun 2009 09:47:38 +0000 (10:47 +0100)]
fs-backend: fix default export and filename checks

This patch changes fs-backend to use /var/lib/xen as default export
and check all the file names and paths given by the frontend against the
export path, so that the frontend can only operate under the export
directory.

Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
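
The confinement check described in the commit above can be sketched as a prefix test with a path-component boundary. This is an illustration under stated assumptions, not the fs-backend code: the function name is hypothetical, and both paths are assumed to be absolute and already canonical (no "..", "." or symlinks), for example as produced by realpath(3).

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/*
 * Returns true if `path` equals `export_dir` or lies beneath it.
 * Both arguments are assumed to be absolute, canonical paths.
 */
bool path_is_under(const char *export_dir, const char *path)
{
    size_t n = strlen(export_dir);

    assert(n > 0 && export_dir[0] == '/');

    /* Ignore trailing slashes on the export directory ("/" stays "/"). */
    while (n > 1 && export_dir[n - 1] == '/')
        n--;

    if (strncmp(path, export_dir, n) != 0)
        return false;
    if (n == 1)                 /* export_dir is the root directory */
        return true;

    /* The match must end at a component boundary. */
    return path[n] == '\0' || path[n] == '/';
}
```

The boundary test matters: with an export of /var/lib/xen, a bare string-prefix comparison would also accept /var/lib/xenfoo.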
16 years agotmem: extra stats
Keir Fraser [Sat, 27 Jun 2009 09:40:11 +0000 (10:40 +0100)]
tmem: extra stats

This patch collects a few additional valuable per-domain
performance stats.

Signed-off-by: Dan Magenheimer <dan.magenheimer@oracle.com>
16 years agominios: fix free_fbfront
Keir Fraser [Sat, 27 Jun 2009 09:39:10 +0000 (10:39 +0100)]
minios: fix free_fbfront

When a stubdom is destroyed, fbfront tries to unbind the evtchn
twice.

Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
16 years agodocs: Add network_setup.txt file explaining bridge setup.
Keir Fraser [Sat, 27 Jun 2009 09:37:51 +0000 (10:37 +0100)]
docs: Add network_setup.txt file explaining bridge setup.

Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
16 years agox86 svm: Make 32bit legacy guests boot again
Keir Fraser [Sat, 27 Jun 2009 09:33:33 +0000 (10:33 +0100)]
x86 svm: Make 32bit legacy guests boot again

Attached patch fixes a bug introduced in c/s 19648.

32bit legacy guests have the sysenter/sysexit instructions available.
Therefore, we have to disable intercepts for the sysenter MSRs;
otherwise the guest gets stuck in an infinite loop of #GPs.

For guests in 64bit mode and 32bit compat mode, sysenter/sysexit
instructions aren't available. The sysenter MSRs have to be
intercepted to make the instruction emulation work.

Signed-off-by: Christoph Egger <Christoph.Egger@amd.com>
Signed-off-by: Keir Fraser <keir.fraser@citrix.com>
16 years agoUpdate QEMU_TAG for d67e46f6860bfbd8991b7691efc1f67b7bc413bc
Keir Fraser [Sat, 27 Jun 2009 09:02:52 +0000 (10:02 +0100)]
Update QEMU_TAG for d67e46f6860bfbd8991b7691efc1f67b7bc413bc

16 years agoxend: pass-through: Implement least-mapping of virtual functions
Keir Fraser [Sat, 27 Jun 2009 09:01:20 +0000 (10:01 +0100)]
xend: pass-through: Implement least-mapping of virtual functions

This is an alternative to identity mapping virtual functions.

It works by assigning the numerically lowest virtual function that is
available.

* The order of assignment thus depends on the order in which physical
  functions are specified.

  e.g.
  config         physical     virtual
  01.00.0,2  ->  01:00.0  ->  00:07.0
             ->  01:00.2  ->  00:07.1

  is different to

  config         physical     virtual
  01.00.2,0  ->  01:00.2  ->  00:07.0
             ->  01:00.0  ->  00:07.1

* Physical function 0 need not be present

  e.g.
  config         physical     virtual
  01.00.1,2  ->  01:00.1  ->  00:07.0
             ->  01:00.2  ->  00:07.1

* Functions from the same physical multi-function device
  may be exported as multiple multi-function and single-function
  devices

  e.g.
  01.00.0,2  ->  01:00.0  ->  00:07.0
             ->  01:00.2  ->  00:07.1
  and
  01.00.1,3  ->  01:00.1  ->  00:08.0
             ->  01:00.3  ->  00:08.1
  and
  01.00.5    ->  01:00.5  ->  00:09.0
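
A minimal sketch of the assignment rule, assuming each configured device
is given its own virtual slot and functions are numbered from 0 upwards.
The helper name least_map is hypothetical and is not part of xend.

```python
def least_map(phys_funcs, vslot):
    """Map physical functions, in the order given, to the lowest free
    virtual function numbers of a single virtual slot.

    Returns a list of (physical_function, (vslot, virtual_function)).
    """
    if len(phys_funcs) > 8:
        raise ValueError("a PCI slot holds at most 8 functions")
    # The slot starts out empty, so the lowest available virtual function
    # for the n-th physical function listed is simply n.
    return [(pfunc, (vslot, vfunc)) for vfunc, pfunc in enumerate(phys_funcs)]
```

For example, least_map([0, 2], 7) and least_map([2, 0], 7) place physical
function 0 at 00:07.0 and 00:07.1 respectively, matching the first two
tables above.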

Cc: Dexuan Cui <dexuan.cui@intel.com>
Cc: Masaki Kanno <kanno.masaki@jp.fujitsu.com>
Signed-off-by: Simon Horman <horms@verge.net.au>
16 years agoxend: pass-through: rename vslot to vdevfn and vslots to vdevfns
Keir Fraser [Sat, 27 Jun 2009 09:00:24 +0000 (10:00 +0100)]
xend: pass-through: rename vslot to vdevfn and vslots to vdevfns

This is a noisy patch that makes no functional changes.
It renames vslot to vdevfn and vslots to vdevfns.

Cc: Dexuan Cui <dexuan.cui@intel.com>
Cc: Masaki Kanno <kanno.masaki@jp.fujitsu.com>
Signed-off-by: Simon Horman <horms@verge.net.au>
16 years agoxend: pass-through: Parse command line for multi-function hot-plug and unplug
Keir Fraser [Sat, 27 Jun 2009 08:56:15 +0000 (09:56 +0100)]
xend: pass-through: Parse command line for multi-function hot-plug and unplug

Hook things up to allow multi-function pass-through.

This includes making sure that the request is valid.
In the case of pci-detach:

* All the functions requested must be attached to the same virtual
  slot; and
* A request must include all the functions attached to a virtual slot
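
The two pci-detach rules can be sketched as below. This assumes the
request has already been resolved to (vslot, vfunc) pairs and reads the
second rule as "every function attached to the slot"; validate_detach is
a hypothetical name, not a function in xend.

```python
def validate_detach(requested, attached):
    """Check a detach request against the two rules above.

    requested: (vslot, vfunc) pairs named in the detach request.
    attached:  all (vslot, vfunc) pairs currently attached to the guest.
    Returns the virtual slot being detached, or raises ValueError.
    """
    requested = set(requested)
    vslots = {vslot for vslot, _ in requested}
    # Rule 1: every requested function sits in the same virtual slot.
    if len(vslots) != 1:
        raise ValueError("functions must all be in the same virtual slot")
    (vslot,) = vslots
    # Rule 2: the request covers exactly the functions attached to it.
    in_slot = {df for df in attached if df[0] == vslot}
    if requested != in_slot:
        raise ValueError("request must name every function in the slot")
    return vslot
```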

Cc: Dexuan Cui <dexuan.cui@intel.com>
Cc: Masaki Kanno <kanno.masaki@jp.fujitsu.com>
Signed-off-by: Simon Horman <horms@verge.net.au>
16 years agoxend: pass-through: Add key to pci device dictionary
Keir Fraser [Sat, 27 Jun 2009 08:55:42 +0000 (09:55 +0100)]
xend: pass-through: Add key to pci device dictionary

This will be used to identify the functions belonging to
a multi-function device.

Cc: Dexuan Cui <dexuan.cui@intel.com>
Cc: Masaki Kanno <kanno.masaki@jp.fujitsu.com>
Signed-off-by: Simon Horman <horms@verge.net.au>
16 years agoxend: pass-through: Add pci_dict_bin_to_str()
Keir Fraser [Sat, 27 Jun 2009 08:54:54 +0000 (09:54 +0100)]
xend: pass-through: Add pci_dict_bin_to_str()

Break out the device list gathering code from xm_pci_list()
so that it can be re-used by subsequent changes.

Cc: Dexuan Cui <dexuan.cui@intel.com>
Cc: Masaki Kanno <kanno.masaki@jp.fujitsu.com>
Signed-off-by: Simon Horman <horms@verge.net.au>
16 years agoxend: pass-through: Only tell qemu-xen to unplug function 0
Keir Fraser [Sat, 27 Jun 2009 08:54:29 +0000 (09:54 +0100)]
xend: pass-through: Only tell qemu-xen to unplug function 0

When unplugging a function, all functions in the same vslot must be
unplugged, and function 0 must be one of the functions present when a
vslot is hot-plugged. Telling qemu-dm to unplug function 0 also tells
it to unplug all other functions in the same vslot.

Cc: Dexuan Cui <dexuan.cui@intel.com>
Cc: Masaki Kanno <kanno.masaki@jp.fujitsu.com>
Signed-off-by: Simon Horman <horms@verge.net.au>
16 years agoxend: pass-through: Allow multi-function device specifications to be parsed
Keir Fraser [Sat, 27 Jun 2009 08:53:56 +0000 (09:53 +0100)]
xend: pass-through: Allow multi-function device specifications to be parsed

The general format is as follows:

  Now: SEQ:BUS:DEV.FUNC[@VSLOT][,OPT...]
  New: SEQ:BUS:DEV.FUNC0-FUNCN[@VSLOT][,OPT...]
       SEQ:BUS:DEV.FUNC0,FUNCM,FUNCN[@VSLOT][,OPT...]
       SEQ:BUS:DEV.*[@VSLOT][,OPT...]

  In the case of unplug the VSLOT and OPT must be omitted.

  Xm expands this notation out and passes
  more conventional parameters to qemu-xen.

  E.g:
       0000:00:01.00-03 becomes:
         0000:00:01.00
         0000:00:01.01
         0000:00:01.02
         0000:00:01.03

       0000:00:01.00,03,05,07 becomes:
         0000:00:01.00
         0000:00:01.03
         0000:00:01.05
         0000:00:01.07

       For a device that has functions 0, 1, 2, 3, 5 and 7,
       0000:00:01.* becomes:
         0000:00:01.00
         0000:00:01.01
         0000:00:01.02
         0000:00:01.03
         0000:00:01.05
         0000:00:01.07
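
  The expansion can be sketched as follows. This is an illustration, not
  the xm parser: expand_functions is a hypothetical name, it handles only
  the SEQ:BUS:DEV.FUNC part (no @VSLOT or options), and it assumes that
  function numbers are written in hexadecimal.

```python
def expand_functions(spec, present=None):
    """Expand 'SEQ:BUS:DEV.FUNCS' into one 'SEQ:BUS:DEV.FUNC' per function.

    present: the function numbers the device actually has; it is only
    consulted when FUNCS is '*'.
    """
    dev, funcs = spec.rsplit(".", 1)
    if funcs == "*":
        if present is None:
            raise ValueError("'*' needs the list of functions present")
        numbers = sorted(present)
    elif "-" in funcs:
        # Range form: FUNC0-FUNCN, inclusive at both ends.
        first, last = (int(f, 16) for f in funcs.split("-"))
        numbers = list(range(first, last + 1))
    else:
        # List form: FUNC0,FUNCM,FUNCN (a single function also lands here).
        numbers = [int(f, 16) for f in funcs.split(",")]
    if any(n < 0 or n > 7 for n in numbers):
        raise ValueError("PCI function numbers range from 0 to 7")
    return ["%s.%02x" % (dev, n) for n in numbers]
```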

Cc: Dexuan Cui <dexuan.cui@intel.com>
Cc: Masaki Kanno <kanno.masaki@jp.fujitsu.com>
Cc: Akio Takebe <takebe_akio@jp.fujitsu.com>
Signed-off-by: Simon Horman <horms@verge.net.au>
16 years agohvmloader: pass-through: multi-function PCI hot-plug
Keir Fraser [Sat, 27 Jun 2009 08:53:19 +0000 (09:53 +0100)]
hvmloader: pass-through: multi-function PCI hot-plug

This registers information to allow guests to recognise non-zero
functions when hot-plugged.

Cc: Dexuan Cui <dexuan.cui@intel.com>
Cc: Masaki Kanno <kanno.masaki@jp.fujitsu.com>
Signed-off-by: Simon Horman <horms@verge.net.au>
16 years agoxend: pass-through: use devfn instead of slots as the unit for pass-through
Keir Fraser [Sat, 27 Jun 2009 08:51:37 +0000 (09:51 +0100)]
xend: pass-through: use devfn instead of slots as the unit for pass-through

Instead of supplying a slot number to qemu-xen, supply a devfn.

This and subsequent other changes will allow xend to ask
for more than one function to be inserted into a single slot -
by specifying which function of the slot should be used.

This is a minimal patch for this change. A subsequent
patch that has a lot of noise to rename slot to devfn
is intended to follow.
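
For reference, a devfn in the usual PCI convention packs a 5-bit slot
(device) number and a 3-bit function number into one byte. The sketch
below shows that encoding; make_devfn and split_devfn are hypothetical
helper names, and whether xend passes the value in exactly this form is
an assumption.

```python
def make_devfn(slot, func):
    """Pack a 5-bit slot number and a 3-bit function number into a devfn."""
    if not 0 <= slot < 32:
        raise ValueError("slot must be in the range 0-31")
    if not 0 <= func < 8:
        raise ValueError("function must be in the range 0-7")
    return (slot << 3) | func


def split_devfn(devfn):
    """Return the (slot, function) pair encoded in a devfn."""
    return devfn >> 3, devfn & 0x7
```

With this encoding, slot 7 function 1 becomes 0x39, so two functions of
the same slot differ only in the low three bits.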

Cc: Dexuan Cui <dexuan.cui@intel.com>
Cc: Masaki Kanno <kanno.masaki@jp.fujitsu.com>
Signed-off-by: Simon Horman <horms@verge.net.au>