Keir Fraser [Fri, 11 Dec 2009 08:47:51 +0000 (08:47 +0000)]
libxenlight: fix cd-insert cli arguments parsing
Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
Keir Fraser [Fri, 11 Dec 2009 08:46:02 +0000 (08:46 +0000)]
libxenlight: add a cli option to exit right after domain creation
This patch adds a command line option in xl to exit right after domain
creation and not wait in background for the death of the domain.
Users should be aware that if they use this option, they always have
to destroy the domain manually after the guest shuts down.
Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
Keir Fraser [Fri, 11 Dec 2009 08:45:26 +0000 (08:45 +0000)]
libxenlight: fix two memory related issues
- LIBXL_MAXMEM_CONSTANT is 1MB but must be expressed in KB;
- xc_dom_linux_build should take target_memkb instead of max_memkb as
an argument.
Thanks to Andres for spotting the latter.
Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
Keir Fraser [Fri, 11 Dec 2009 08:44:33 +0000 (08:44 +0000)]
domain builder: multiboot-like module support
This defines how multiple modules can be passed to a domain by packing
them together into a "multiboot module" in a way very similar to the
multiboot standard. An SIF_ flag is added to announce such a package.
This also adds a packing implementation to PV-GRUB.
Signed-Off-By: Samuel Thibault <samuel.thibault@ens-lyon.org>
Keir Fraser [Fri, 11 Dec 2009 08:42:28 +0000 (08:42 +0000)]
PoD: appropriate BUG_ON when domain is dying
The BUG_ON(d->is_dying) in p2m_pod_cache_add(), which was introduced in
c/s 20426, is not proper, since dom->is_dying is set asynchronously.
For example, MMU_UPDATE hypercalls from qemu and the
DOMCTL_destroydomain hypercall from xend can be issued simultaneously.
Also this patch lets p2m_pod_empty_cache() wait by spin_barrier
until another PoD operation ceases.
Signed-off-by: Kouya Shimura <kouya@jp.fujitsu.com>
Acked-by: George Dunlap <george.dunlap@eu.citrix.com>
Keir Fraser [Wed, 9 Dec 2009 10:59:31 +0000 (10:59 +0000)]
x86-32/pod: fix map_domain_page() leak
The 'continue' in the if() part of the conditional at the end of
p2m_pod_zero_check() was causing this, but there also really is no
point in retaining the mapping after having checked page contents,
so fix it both ways. Additionally there is no point in updating
map[] at this point anymore.
Signed-off-by: Jan Beulich <jbeulich@novell.com>
Keir Fraser [Wed, 9 Dec 2009 10:58:52 +0000 (10:58 +0000)]
tools: simplify PYTHON_PATH computation (and fixes for NetBSD)
Doesn't work when build-time python path differs from install-time. Do
we care about this given tools should be packaged/built for the
specific run-time distro?
Signed-off-by: Christoph Egger <Christoph.Egger@amd.com>
Keir Fraser [Wed, 9 Dec 2009 10:46:11 +0000 (10:46 +0000)]
tmem, xentop: Report a few key per-domain tmem statistics in xentop.
Signed-off-by: Dan Magenheimer <dan.magenheimer@oracle.com>
Keir Fraser [Wed, 9 Dec 2009 10:44:56 +0000 (10:44 +0000)]
tmem: reclaim minimal memory proactively
When a single domain is using most/all of tmem memory
for ephemeral pages belonging to the same object, e.g.
when copying a single huge file larger than ephemeral
memory, long lists are traversed looking for a page to
evict that doesn't belong to this object (as pages in
the object for which a page is currently being inserted
are locked and cannot be evicted). This is essentially
a livelock.
Avoid this by proactively ensuring there is a margin
of available memory (1MB) before locks are taken on
the object.
Signed-off-by: Dan Magenheimer <dan.magenheimer@oracle.com>
Keir Fraser [Wed, 9 Dec 2009 10:44:11 +0000 (10:44 +0000)]
libxenlight: implement libxl_set_memory_target
This patch adds a target_memkb parameter to libxl_domain_build_info to
set the target memory for the VM at build time and a new function
called libxl_set_memory_target to dynamically modify the memory target
of a VM at run time. Finally a new command "mem-set" is added to xl
that directly calls libxl_set_memory_target.
Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
Keir Fraser [Wed, 9 Dec 2009 10:43:33 +0000 (10:43 +0000)]
libxenlight: xenstore data path writable by the guest
Make the data path on xenstore writable by the guest
because Citrix pv drivers require it.
Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
Keir Fraser [Wed, 9 Dec 2009 10:42:53 +0000 (10:42 +0000)]
SRAT memory hotplug 2/2: Support overlapped and sparse node memory arrangement.
Currently the Xen hypervisor uses nodes to keep the start/end address of
each node. It assumes that memory among nodes does not overlap, which is
not always true, especially if the system supports memory hotplug.
This patch backports the Linux kernel's memblks to support overlap
among nodes. The memblks are used both for checking conflicts and for
calculating memnode_shift.
Also, currently if no memory is populated in a node when the system
boots, the node is unparsed later and the corresponding CPU's
NUMA information is removed as well. This patch keeps the CPU
information.
One thing to note is that we currently calculate memnode_shift with all
memory, including unpopulated ranges. This should work if the smallest
chunk is not too small. Another option would be flags in the page_info
structure, etc.
The memnodemap is changed from paddr to pdx, both to save space and
because most accesses currently come from a pfn.
A flag, mem_hotplug, is added if there is a hotplug memory range.
Signed-off-by: Jiang, Yunhong <yunhong.jiang@intel.com>
Keir Fraser [Wed, 9 Dec 2009 10:41:37 +0000 (10:41 +0000)]
SRAT memory hotplug 1/2: Revert 20053:ebb07c5934c8.
Signed-off-by: Jiang, Yunhong <yunhong.jiang@intel.com>
Keir Fraser [Tue, 8 Dec 2009 14:14:27 +0000 (14:14 +0000)]
hvm: Share ASID logic between VMX and SVM.
Signed-off-by: Keir Fraser <keir.fraser@citrix.com>
Keir Fraser [Tue, 8 Dec 2009 10:33:08 +0000 (10:33 +0000)]
hvm: Pull SVM ASID management into common HVM code where it can be shared.
Signed-off-by: Keir Fraser <keir.fraser@citrix.com>
Keir Fraser [Tue, 8 Dec 2009 07:55:21 +0000 (07:55 +0000)]
Track free pages live rather than count pages in all nodes/zones
Trying to fix a livelock condition in tmem that occurs
only when the system is totally out of memory requires
the ability to easily determine if all zones in all
nodes are empty, and this must be checked at a fairly
high frequency. So to avoid walking all the zones in
all the nodes each time, I'd like a fast way to determine
if "free_pages" is zero. This patch tracks the sum
of the free pages in all nodes/zones. Since I think
the value is modified only when heap_lock is held,
it need not be atomic.
I don't know this for sure, but suspect this will be
useful in other future memory utilization code, e.g.
page sharing.
This has had limited testing, though I did drive free
memory down to zero and up and down a few times with
debug on and no asserts were triggered.
Signed-off-by: Dan Magenheimer <dan.magenheimer@oracle.com>
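A minimal sketch of the idea in the commit above, not the hypervisor code itself: a running total is adjusted wherever a per-zone count changes, so the "is everything empty?" check no longer needs to walk all nodes and zones. The array sizes and all names (zone_free, total_free_pages, pages_freed, pages_allocated) are hypothetical, and the sketch assumes, as the commit message suggests, that every update happens under one lock, which is why a plain integer is used.

```c
#include <assert.h>
#include <stddef.h>

#define NR_NODES 2   /* hypothetical sizes, for illustration only */
#define NR_ZONES 4

/* Per-node, per-zone free page counts, as a heap allocator might keep them. */
static unsigned long zone_free[NR_NODES][NR_ZONES];

/* Running total, updated alongside zone_free. Assumed to be modified only
 * while the heap lock is held, so it does not need to be atomic. */
static long total_free_pages;

/* Slow path the commit wants to avoid: walk every node and zone. */
static unsigned long count_free_pages_slow(void)
{
    unsigned long sum = 0;
    for (size_t n = 0; n < NR_NODES; n++)
        for (size_t z = 0; z < NR_ZONES; z++)
            sum += zone_free[n][z];
    return sum;
}

/* Pages returned to the heap: bump both the zone count and the total. */
static void pages_freed(size_t node, size_t zone, unsigned long nr)
{
    zone_free[node][zone] += nr;
    total_free_pages += (long)nr;
}

/* Pages taken from the heap: returns -1 if the zone cannot satisfy it. */
static int pages_allocated(size_t node, size_t zone, unsigned long nr)
{
    if (zone_free[node][zone] < nr)
        return -1;
    zone_free[node][zone] -= nr;
    total_free_pages -= (long)nr;
    assert(total_free_pages >= 0);
    return 0;
}

/* Fast check: O(1) instead of O(nodes * zones). */
static int heap_is_empty(void)
{
    return total_free_pages == 0;
}
```

The slow walk is kept here only so the two values can be compared in a debug build.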
Keir Fraser [Tue, 8 Dec 2009 07:51:30 +0000 (07:51 +0000)]
VT-d: per-iommu domain-id
Currently, Xen uses a shared iommu domain-id across all the VT-d units
in the platform. The number of iommu domain-ids (NR_DID, e.g. 256)
supported by each VT-d unit is reported in its Capability register. The
limitation of the current implementation is that it can support at most
NR_DID domains with VT-d in the entire platform, even though the
platform could support N * NR_DID (where N is the number of VT-d
units). Imagine a platform with several SR-IOV NICs, each supporting
128 VFs: the number of domains can easily exceed NR_DID.
This patch implements iommu domain-id management per iommu (VT-d
unit), and hence removes the above limitation. It removes the global
domain-id bitmap and instead uses a domain-id bitmap in struct iommu.
It also introduces an array that maps guest domain-ids to iommu
domain-ids, which is used to look up the iommu domain-id when flushing
the context cache or IOTLB. When a device is assigned to a guest, an
available iommu domain-id is chosen from the device's iommu and the
guest domain-id is recorded in the domain-id mapping array. When a
device is deassigned from a guest, the domain-id bit in the domain-id
bitmap and the corresponding entry in the domain-id map array are
cleared if the guest owns no other devices under the same iommu.
Signed-off-by: Weidong Han <weidong.han@intel.com>
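The per-unit bookkeeping described in the commit above can be sketched roughly as follows. This is an illustration under stated assumptions, not the VT-d driver code: struct iommu_unit, the helper names, the DID_UNUSED sentinel and the fixed NR_DID of 256 are all hypothetical. Each unit has its own bitmap of allocated iommu domain-ids plus an array recording which guest domain owns each id.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define NR_DID     256       /* hypothetical per-unit limit (capability register) */
#define DID_UNUSED 0xffffu   /* hypothetical "no guest" marker */

struct iommu_unit {
    uint8_t  did_bitmap[NR_DID / 8];  /* which iommu domain-ids are in use */
    uint16_t guest_of[NR_DID];        /* iommu domain-id -> guest domain-id */
};

static void iommu_unit_init(struct iommu_unit *u)
{
    memset(u->did_bitmap, 0, sizeof(u->did_bitmap));
    for (int i = 0; i < NR_DID; i++)
        u->guest_of[i] = DID_UNUSED;
}

/* Look up the iommu domain-id used for a guest on this unit, or -1.
 * This is the lookup needed before a context-cache or IOTLB flush. */
static int iommu_find_did(const struct iommu_unit *u, uint16_t guest)
{
    for (int did = 0; did < NR_DID; did++)
        if ((u->did_bitmap[did / 8] & (1u << (did % 8))) &&
            u->guest_of[did] == guest)
            return did;
    return -1;
}

/* Device assignment: reuse the guest's id on this unit or take a free one. */
static int iommu_get_did(struct iommu_unit *u, uint16_t guest)
{
    int did = iommu_find_did(u, guest);
    if (did >= 0)
        return did;
    for (did = 0; did < NR_DID; did++) {
        if (!(u->did_bitmap[did / 8] & (1u << (did % 8)))) {
            u->did_bitmap[did / 8] |= (uint8_t)(1u << (did % 8));
            u->guest_of[did] = guest;
            return did;
        }
    }
    return -1;  /* this unit has run out of domain-ids */
}

/* Called when the guest's last device under this unit is deassigned. */
static void iommu_put_did(struct iommu_unit *u, uint16_t guest)
{
    int did = iommu_find_did(u, guest);
    if (did < 0)
        return;
    u->did_bitmap[did / 8] &= (uint8_t)~(1u << (did % 8));
    u->guest_of[did] = DID_UNUSED;
}
```

Because each unit hands out ids independently, the same guest may have different iommu domain-ids on different units, and the platform-wide limit becomes N * NR_DID rather than NR_DID.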
Keir Fraser [Tue, 8 Dec 2009 07:49:54 +0000 (07:49 +0000)]
xend: Add keymap to vfb config for existing hvm guests
I submitted a patch a while back to add keymap to vfb config for hvm
guests. This patch works fine for new config (xm create|new) but not
existing, managed guests. To cover the latter case I've introduced a
validator method in XendConfig.
Signed-off-by: Jim Fehlig <jfehlig@novell.com>
Keir Fraser [Tue, 8 Dec 2009 07:48:45 +0000 (07:48 +0000)]
Make tsc_mode=3 (pvrdtscp) work correctly.
Initial tsc_mode patch contained a rough cut at pvrdtscp mode. This
patch gets it working correctly. For the record, pvrdtscp mode allows
an application to obtain information from Xen to descale/de-offset
a physical tsc value to obtain "nsec since VM start". Though the
raw tsc value may change across migration due to different Hz rates
and different start times of different physical machines, applying
the pvrdtscp algorithm to a raw tsc value guarantees that the result
will always be both a fixed known rate (nanoseconds) and monotonically
increasing. BUT, pvrdtscp will only be fast on physical machines that
support the rdtscp instruction AND on which tsc is "safe"; on other
machines both the rdtsc and rdtscp instructions will be emulated.
Also note that when tsc_mode=3 is enabled, tsc-sensitive applications
that do NOT implement the pvrdtscp algorithm will behave incorrectly.
So, tsc_mode=3 should only be used when all apps are either
tsc-resilient or pvrdtscp-modified, and only has a performance
advantage on very recent generation processors.
Signed-off-by: Dan Magenheimer <dan.magenheimer@oracle.com>
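The commit above does not spell out the pvrdtscp algorithm, so the following is only a sketch of what "descale/de-offset" could look like, assuming pvclock-style fixed-point parameters (an offset, a shift, and a 32-bit fractional multiplier). The struct, its field names and tsc_to_ns() are hypothetical, and how an application would obtain these values from Xen is not shown.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical per-VM scaling parameters an application would get from Xen. */
struct tsc_scale {
    uint64_t tsc_offset;  /* raw TSC value corresponding to "VM start" */
    uint32_t mul_frac;    /* 32-bit fixed-point fraction: ns per (shifted) tick */
    int      shift;       /* pre-shift applied to the tick delta */
};

/* Convert a raw TSC reading to nanoseconds since VM start:
 *   ns = (((raw - offset) << shift) * mul_frac) >> 32
 * The multiply is split into two 32-bit halves so it stays within 64 bits
 * for realistic deltas. */
static uint64_t tsc_to_ns(uint64_t raw_tsc, const struct tsc_scale *s)
{
    uint64_t delta = raw_tsc - s->tsc_offset;   /* de-offset */

    if (s->shift < 0)
        delta >>= -s->shift;
    else
        delta <<= s->shift;

    /* descale: (delta * mul_frac) >> 32 */
    uint64_t hi = (delta >> 32) * s->mul_frac;
    uint64_t lo = (delta & 0xffffffffu) * s->mul_frac;
    return hi + (lo >> 32);
}
```

With a 2 GHz TSC, for example, mul_frac would be 0x80000000 (0.5 ns per tick) with shift 0, so 2000 ticks after the offset map to 1000 ns. After migration to a machine with a different frequency, only the parameters change and the result stays in nanoseconds.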
Keir Fraser [Tue, 8 Dec 2009 07:47:52 +0000 (07:47 +0000)]
libxenlight: implement cdrom insert/eject
This patch implements functions in libxenlight to change the cdrom in
a VM at run time and to handle cdrom eject requests from guests.
This patch adds two new commands to xl: cd-insert and cd-eject; it
also modifies xl to handle cdrom eject requests coming from guests
(actually coming from qemu).
Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
Keir Fraser [Tue, 8 Dec 2009 07:45:15 +0000 (07:45 +0000)]
fs-backend: add a backend cleanup function
This patch implements a backend cleanup function in fs-backend so that
when the connection to the frontend is closed we don't leak nodes on
xenstore.
Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
Keir Fraser [Tue, 8 Dec 2009 07:44:45 +0000 (07:44 +0000)]
libxenlight: minimal vfs support
This patch adds minimal support for fs-backend and minios' fs-front
to libxenlight:
- it creates a vfs directory on the stubdom's xenstore
device path and allows the stubdom to write to it;
- it doesn't try to cleanly shut down the vfs backend.
Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
Keir Fraser [Mon, 7 Dec 2009 14:10:27 +0000 (14:10 +0000)]
Keir Fraser [Sat, 5 Dec 2009 12:32:34 +0000 (12:32 +0000)]
x86_32: Fix build after 20575:0930d17589a6
Signed-off-by: Keir Fraser <keir.fraser@citrix.com>
Keir Fraser [Sat, 5 Dec 2009 12:30:46 +0000 (12:30 +0000)]
libxenlight: physmap slack for pv domains
Contemplate a memory space slack for PV domains,
since they do ballooning (or flipping network rx)
and need some extra room in their pfn space.
Note that this does not allocate any extra memory
to the domain; it simply extends the physmap with
some extra room for "bounce buffering back" pfn's
that are yielded to dom0.
The default slack is set at 8MB.
Signed-off-by: Andres Lagar-Cavilla <andres@lagarcavilla.com>
Acked-by: Vincent Hanquez <vincent.hanquez@eu.citrix.com>
Keir Fraser [Sat, 5 Dec 2009 12:29:48 +0000 (12:29 +0000)]
Keir Fraser [Fri, 4 Dec 2009 07:11:44 +0000 (07:11 +0000)]
libxenlight: get state for one domain
Simple function to get the dominfo state of a single domain.
Signed-off-by: Andres Lagar-Cavilla <andres@lagarcavilla.com>
Keir Fraser [Fri, 4 Dec 2009 07:11:06 +0000 (07:11 +0000)]
libxenlight: domain resume
Added libxenlight implementation for resume domain.
This brings back a cooperative pv domain from the
shutdown state after save, enabling checkpointing.
Signed-off-by: Andres Lagar-Cavilla <andres@lagarcavilla.com>
Keir Fraser [Fri, 4 Dec 2009 07:10:22 +0000 (07:10 +0000)]
libxenlight: Destroy device model only for domains that have it
Destroy device model only for domains that have it.
Signed-off-by: Andres Lagar-Cavilla <andres@lagarcavilla.com>
Keir Fraser [Fri, 4 Dec 2009 07:09:44 +0000 (07:09 +0000)]
libxenlight: avoid writing empty values to xenstore
Prevent segmentation fault caused by empty values
in key-value pairs for the /vm/ subdirectory
when restoring a pv domain.
Signed-off-by: Andres Lagar-Cavilla <andres@lagarcavilla.com>
Keir Fraser [Fri, 4 Dec 2009 07:06:47 +0000 (07:06 +0000)]
libxenlight: disk and nic destroy calls
Expose disk and nic device destroy calls
Also removes the obsolete device shutdown calls.
Signed-off-by: Andres Lagar-Cavilla <andres@lagarcavilla.com>
Keir Fraser [Fri, 4 Dec 2009 07:03:45 +0000 (07:03 +0000)]
libxenlight: refactor libxl destroy code
Refactor libxl device destroy code. Abstract function
waiting for the watch on the state node to fire.
Create a generic device delete function.
Only a single LIBXL_DESTROY_TIMEOUT elapses when
waiting for destruction of all the devices of a
domain.
Signed-off-by: Andres Lagar-Cavilla <andres@lagarcavilla.com>
Keir Fraser [Fri, 4 Dec 2009 07:02:49 +0000 (07:02 +0000)]
libxenlight: fix GC when cloning contexts
Provide a function to clone a context. This is necessary
because simply copying the structs will eventually
corrupt the GC: maxsize is updated in the cloned context
but not in the originating one, yet they share the same array
of referenced pointers, alloc_ptrs.
Signed-off-by: Andres Lagar-Cavilla <andres@lagarcavilla.com>
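The hazard described in the commit above can be illustrated with a simplified context. This is a sketch, not libxl's actual structure or clone function: struct ctx, INITIAL_SLOTS, the verbosity field and the helper names are hypothetical. A plain struct copy would leave both contexts pointing at one alloc_ptrs array while tracking its size separately; the clone below copies the plain fields but gives the new context its own, empty tracking array.

```c
#include <assert.h>
#include <stdlib.h>

#define INITIAL_SLOTS 4   /* hypothetical starting size of the tracking array */

/* Simplified context with a pointer-tracking "GC" array. */
struct ctx {
    void **alloc_ptrs;     /* pointers to free when the context is torn down */
    int    alloc_maxsize;  /* number of slots in alloc_ptrs */
    int    verbosity;      /* stands in for the other, plain-data fields */
};

static int ctx_init(struct ctx *c)
{
    c->verbosity = 0;
    c->alloc_maxsize = INITIAL_SLOTS;
    c->alloc_ptrs = calloc((size_t)c->alloc_maxsize, sizeof(void *));
    return c->alloc_ptrs ? 0 : -1;
}

/* Clone: copy plain fields, but never share the tracking array. */
static int ctx_clone(const struct ctx *from, struct ctx *to)
{
    *to = *from;                       /* copies verbosity etc. */
    to->alloc_maxsize = INITIAL_SLOTS; /* fresh, independent bookkeeping */
    to->alloc_ptrs = calloc((size_t)to->alloc_maxsize, sizeof(void *));
    return to->alloc_ptrs ? 0 : -1;
}

/* Free everything the context tracked, then the array itself. */
static void ctx_free(struct ctx *c)
{
    for (int i = 0; i < c->alloc_maxsize; i++)
        free(c->alloc_ptrs[i]);        /* free(NULL) is a no-op */
    free(c->alloc_ptrs);
    c->alloc_ptrs = NULL;
    c->alloc_maxsize = 0;
}
```

With separate arrays, growing or freeing one context's array cannot leave the other with a stale pointer or a stale size.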
Keir Fraser [Fri, 4 Dec 2009 07:00:25 +0000 (07:00 +0000)]
xend: Fix parameters to PyArg_ParseTupleAndKeywords()
The kwd_list parameter of PyArg_ParseTupleAndKeywords() must be a
NULL-terminated list.
Signed-off-by: KUWAMURA Shin'ya <kuwa@jp.fujitsu.com>
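To show why the terminator matters without pulling in the Python headers, here is a small stand-alone sketch of how any consumer of such a list has to walk it. The function count_keywords() and the keyword names "dom" and "vcpu" are made up for illustration; they are not taken from the xend sources.

```c
#include <assert.h>
#include <stddef.h>

/* A consumer of a keyword list has no length argument, so the only way
 * to find the end is to walk until the NULL sentinel. */
static size_t count_keywords(char *const kwd_list[])
{
    size_t n = 0;
    while (kwd_list[n] != NULL)
        n++;
    return n;
}

/* Correct: the final NULL marks the end of the list. Without it, the loop
 * above would read past the end of the array (undefined behaviour). */
static char *good_kwd_list[] = { "dom", "vcpu", NULL };
```

A list declared as { "dom", "vcpu" } compiles just as well, which is why this kind of bug tends to go unnoticed until the memory after the array happens to be non-zero.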
Keir Fraser [Fri, 4 Dec 2009 06:59:33 +0000 (06:59 +0000)]
x86: XENMEM_add_to_physmap should propagate errors from guest_physmap_add_page().
Authored-by: David Lively
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Keir Fraser [Fri, 4 Dec 2009 06:58:08 +0000 (06:58 +0000)]
Add keyhandler 'g' to print all active grant table entries.
Authored-By: Robert Phillips
Signed-off-By: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Keir Fraser [Fri, 4 Dec 2009 06:51:53 +0000 (06:51 +0000)]
libxenlight: Get rid of the dependency on the LIBCONFIG_SOURCE directory.
Signed-off-by: Jean Guyader <jean.guyader@eu.citrix.com>
Keir Fraser [Fri, 4 Dec 2009 06:50:46 +0000 (06:50 +0000)]
libxenlight: Delete dep files on 'make clean', and include them in Makefile rules.
Signed-off-by: Keir Fraser <keir.fraser@citrix.com>
Keir Fraser [Thu, 3 Dec 2009 13:52:02 +0000 (13:52 +0000)]
grant-tables: do not fail attempts to GNTTABOP_set_version to the current version.
...even if there are active grants.
This triggers when checkpointing a guest, which essentially resumes
without actually having gone through the suspend, so the domain is
already latched to v2 inside Xen.
Also return the current actual version on success and failure. This is
not terribly useful with only 2 options but is more robust to future
developments.
Signed-off-by: Ian Campbell <ian.campbell@citrix.com>
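The behaviour described in the commit above might be sketched like this. It is a simplified model, not the hypercall implementation: struct grant_table, its fields, the function name and the choice of -EBUSY and -EINVAL as error codes are assumptions made for illustration.

```c
#include <assert.h>
#include <errno.h>

/* Simplified per-domain grant table state. */
struct grant_table {
    unsigned int version;    /* currently latched version: 1 or 2 */
    unsigned int nr_active;  /* number of active grants */
};

/* Request *version; on return *version holds the version actually in
 * effect, whether or not the call succeeded. */
static int gnttab_set_version_sketch(struct grant_table *gt,
                                     unsigned int *version)
{
    int rc = 0;

    if (*version != 1 && *version != 2) {
        rc = -EINVAL;
    } else if (*version != gt->version) {
        /* A real change is refused while grants are active. */
        if (gt->nr_active > 0)
            rc = -EBUSY;
        else
            gt->version = *version;
    }
    /* Requesting the current version falls through with rc == 0,
     * even if there are active grants. */

    *version = gt->version;
    return rc;
}
```

A checkpointed guest that asks for v2 again while already on v2 therefore succeeds, and a caller that fails can still see which version it is running with.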
Keir Fraser [Thu, 3 Dec 2009 13:51:20 +0000 (13:51 +0000)]
xend: Add GPL license stanza to MemoryPool.py
Signed-off-by: James Song (Wei) <jsong@novell.com>
Keir Fraser [Thu, 3 Dec 2009 13:50:43 +0000 (13:50 +0000)]
Remus: fall back to xenstore if necessary
This is primarily for pvops until it gets a dedicated suspend
event channel.
Signed-off-by: Brendan Cully <brendan@cs.ubc.ca>
Keir Fraser [Thu, 3 Dec 2009 13:50:14 +0000 (13:50 +0000)]
Remus: fix shadow memory allocation, broken by 20558:4ed3b9b1de3f
This approach is perhaps a little cleaner than directly calling
balloon.free.
Signed-off-by: Brendan Cully <brendan@cs.ubc.ca>
Keir Fraser [Wed, 2 Dec 2009 18:46:14 +0000 (18:46 +0000)]
x86 hvm: fix up the unified HAP nested-pagefault handler.
A guest PFN may have been marked dirty and switched to p2m_ram_rw by
another CPU between the VMEXIT and lookup in this handler, so
we can't just check for p2m_ram_logdirty. Also, handle_mmio
doesn't handle passthrough MMIO.
Signed-off-by: Tim Deegan <Tim.Deegan@citrix.com>
Keir Fraser [Wed, 2 Dec 2009 18:43:28 +0000 (18:43 +0000)]
xentop: Allow full domain name display
Add a '-f' option to xentop to allow the full domain name to be
displayed. This is the original behavior which can cause the display
to be unaligned. Customers have requested this because only the
trailing characters of their domain names are unique and therefore
cannot be distinguished when the display is limited to a 10 character
width.
Signed-off-by: Charles Arnold <carnold@novell.com>
Keir Fraser [Wed, 2 Dec 2009 18:42:36 +0000 (18:42 +0000)]
libxenlight: fix multiple xenstore watches problem
This patch fixes the multiple xenstore watches problem in libxenlight
by opening a new xenstore connection to set and read temporary watches
on the device state nodes. This way they don't interfere with other
long-running watches.
Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
Keir Fraser [Wed, 2 Dec 2009 18:42:03 +0000 (18:42 +0000)]
libxenlight: use watch and select in libxl_wait_for_device_model
This patch reimplements libxl_wait_for_device_model using a xenstore
watch and a select loop.
Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
Keir Fraser [Wed, 2 Dec 2009 18:41:31 +0000 (18:41 +0000)]
libxenlight: fix dm_xenstore_record_pid
The function dm_xenstore_record_pid is executed by a child of the main
process and therefore shouldn't use the same xenstore connection:
currently it opens a new connection but still uses the old one.
Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
Keir Fraser [Wed, 2 Dec 2009 13:45:35 +0000 (13:45 +0000)]
xenstat: Fixes for 20528:e6e3bf767d16 (stats for dom0 network bonding)
In the above c/s I introduced dom0 statistics for the case where network
bonding is used. The indentation did not match the xenstat C codebase,
and some changes were also made to the logic: parsed variables we don't
care about are no longer stored (we care only about
{tx|rx}{bytes,packets,errs,drops} and no other variable from
/proc/net/dev); NULLs are passed for them instead. The dom0 statistics
alteration was also fixed to include {tx|rx}{drop,errs} for dom0 (the
previous version of my patch did not have this code applied).
Signed-off-by: Michal Novotny <minovotn@redhat.com>
Keir Fraser [Wed, 2 Dec 2009 13:43:37 +0000 (13:43 +0000)]
xend, vt-d: do not reserve vtd_mem if iommu is not enabled
Signed-off-by: Dexuan Cui <dexuan.cui@intel.com>
Keir Fraser [Wed, 2 Dec 2009 13:39:07 +0000 (13:39 +0000)]
vmx: During task-switch, read instr-len VMCS field only when valid.
Otherwise we can crash on the BUG_ON() in __get_instruction_length().
Signed-off-by: Keir Fraser <keir.fraser@citrix.com>
Keir Fraser [Wed, 2 Dec 2009 08:52:50 +0000 (08:52 +0000)]
VT-d: Fix indentation to make log messages more readable in dmar.c
Signed-off-by: Weidong Han <weidong.han@intel.com>
Keir Fraser [Wed, 2 Dec 2009 08:51:59 +0000 (08:51 +0000)]
pci: Correct BDF format from B:D:F to B:D.F in log messages.
Signed-off-by: Weidong Han <weidong.han@intel.com>
Keir Fraser [Wed, 2 Dec 2009 08:51:12 +0000 (08:51 +0000)]
xend: Memory pool for pv guest on systems with >128G memory
The main idea of this patch is:
1) The admin sets aside some memory below 128G for 32-bit paravirtual
domain creation (via dom0_mem=-<value> in kernel command line).
2) The admin also explicitly states to the tools (i.e. xend) how much
memory is supposed to be left untouched by 64-bit domains
3) If a 32-bit pv DomU gets created, no ballooning ought to be
necessary (since if it is, no guarantee can be made about the address
range of the memory ballooned out), and memory gets allocated from the
reserved range.
4) Upon 64-bit (or 32-bit HVM or HVM) DomU creation, the tools
determine the amount of memory to be ballooned out of Dom0 by adding
the amount needed for the new guest and the amount still in the
reserved pool (and then of course subtracting the total amount of
memory the hypervisor has available for guest use).
Signed-off-by: james song (wei) <jsong@novell.com>
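Step 4 of the commit above is plain arithmetic, sketched here for illustration. The function and parameter names are hypothetical and the real logic lives in xend's Python code; the sketch only shows the formula as the message states it, clamped at zero.

```c
#include <assert.h>
#include <stdint.h>

/* Amount (in KiB) to balloon out of Dom0 when creating a guest that does
 * not draw from the reserved pool:
 *   needed for the new guest + still-reserved pool - free in the hypervisor
 * If the hypervisor already has enough free memory, nothing is ballooned. */
static uint64_t dom0_balloon_kb(uint64_t guest_need_kb,
                                uint64_t pool_reserved_kb,
                                uint64_t hyp_free_kb)
{
    uint64_t wanted = guest_need_kb + pool_reserved_kb;
    return wanted > hyp_free_kb ? wanted - hyp_free_kb : 0;
}
```

For example, a 4 GiB guest with 2 GiB still held in the pool and 5 GiB free in the hypervisor would require ballooning 1 GiB out of Dom0, so that the pool stays intact for later 32-bit pv guests.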
Keir Fraser [Wed, 2 Dec 2009 08:48:36 +0000 (08:48 +0000)]
VT-d: get rid of hardcode in iommu_flush_cache_entry
Currently iommu_flush_cache_entry uses a fixed size 8 bytes to flush
cache. But it also needs to flush caches with different sizes,
e.g. struct root_entry is 16 bytes. This patch fixes the hardcode by
using a parameter "size" to flush caches with different sizes.
Signed-off-by: Weidong Han <weidong.han@intel.com>
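A rough sketch of a size-aware flush, as described in the commit above. The 64-byte line size, the callback type and the function name are assumptions; the real function flushes with a CPU instruction rather than a callback, which is replaced here so the example stays portable.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define CACHE_LINE 64u   /* assumed cache line size in bytes */

/* Hypothetical hook that flushes the one cache line containing line_addr. */
typedef void (*flush_line_fn)(uintptr_t line_addr);

/* Flush every cache line touched by [addr, addr + size).
 * Returns the number of lines visited; flush may be NULL for a dry run. */
static unsigned int flush_cache_range(uintptr_t addr, size_t size,
                                      flush_line_fn flush)
{
    unsigned int n = 0;

    if (size == 0)
        return 0;

    uintptr_t first = addr & ~(uintptr_t)(CACHE_LINE - 1);
    uintptr_t last  = (addr + size - 1) & ~(uintptr_t)(CACHE_LINE - 1);

    for (uintptr_t line = first; ; line += CACHE_LINE) {
        if (flush)
            flush(line);
        n++;
        if (line == last)
            break;
    }
    return n;
}
```

An 8-byte entry always fits in one line when naturally aligned, whereas a larger structure that straddles a line boundary needs more than one flush, which a hardcoded 8-byte flush would miss.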
Keir Fraser [Wed, 2 Dec 2009 08:47:49 +0000 (08:47 +0000)]
xm: fix message in OptionError deprecated since Python 2.6
BaseException.message has been deprecated since Python 2.6. To
prevent DeprecationWarning from popping up over this pre-existing
attribute, use a new property that takes lookup precedence.
Signed-off-by: Wei Kong <weikong.cn@gmail.com>
Keir Fraser [Wed, 2 Dec 2009 08:46:47 +0000 (08:46 +0000)]
docs: new tsc_mode VM configuration option
Signed-off-by: Dan Magenheimer <dan.magenheimer@oracle.com>
Keir Fraser [Wed, 2 Dec 2009 08:46:11 +0000 (08:46 +0000)]
remus: Skip Linux-specific build components on other OSes
Signed-off-by: Christoph Egger <Christoph.Egger@amd.com>
Acked-by: Brendan Cully <brendan@cs.ubc.ca>
Keir Fraser [Wed, 2 Dec 2009 08:45:16 +0000 (08:45 +0000)]
libxenlight: write stubdoms logs to file
It turns out that there is a better way to write stubdoms logs to file
than using libxl_console_attach: qemu is the one that provides the
console backend for stubdoms and qemu is able to redirect a serial to
file, so we can use this feature to make sure the first stubdom
console is always redirected to a logfile.
Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
Keir Fraser [Wed, 2 Dec 2009 08:44:40 +0000 (08:44 +0000)]
libxenlight: two small fixes
- set the domid of the guest and not the one of the stubdom in the
libxl_device_model_starting returned to the user;
- check that the length of the two strings matches in
libxl_name_to_domid, otherwise we can get a match for two different
domains that have the same initial part of the name.
Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
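The second fix in the commit above concerns prefix matching. The sketch below reproduces the class of bug in isolation; the function names and domain names are made up and this is not the libxl_name_to_domid code.

```c
#include <assert.h>
#include <string.h>

/* Buggy comparison: only the first strlen(wanted) characters are compared,
 * so "web" also matches "web-staging". */
static int name_matches_buggy(const char *wanted, const char *candidate)
{
    return strncmp(wanted, candidate, strlen(wanted)) == 0;
}

/* Fixed comparison: the lengths must match as well. */
static int name_matches_fixed(const char *wanted, const char *candidate)
{
    size_t len = strlen(wanted);
    return strlen(candidate) == len && strncmp(wanted, candidate, len) == 0;
}
```

With the length check in place, looking up "web" can no longer return the domid of "web-staging" just because that domain happens to be listed first.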
Keir Fraser [Wed, 2 Dec 2009 08:44:10 +0000 (08:44 +0000)]
libxl: include signal.h, required for SIGKILL definition
...makes libxl build on NetBSD.
Signed-off-by: Christoph Egger <Christoph.Egger@amd.com>
Keir Fraser [Tue, 1 Dec 2009 14:19:28 +0000 (14:19 +0000)]
x86: Correctly allocate module-relocation area and bzimage headroom.
Without this patch, loading a bzimage dom0 kernel while also
requesting a dynamically-allocated crashkernel area is broken.
Signed-off-by: Keir Fraser <keir.fraser@citrix.com>
Keir Fraser [Tue, 1 Dec 2009 14:08:27 +0000 (14:08 +0000)]
hvmloader: Fix bug in 20510:749b5d46e7a9 (GPE notifications)
The GPE notification decision tree was inverted.
Signed-off-by: Keir Fraser <keir.fraser@citrix.com>
Keir Fraser [Tue, 1 Dec 2009 14:03:42 +0000 (14:03 +0000)]
libxenlight: wait for pv qemu initialization
This patch makes libxl_create_stubdom wait for pv qemu to be properly
initialized before unpausing the stubdom.
A new libxl_device_model_starting pointer is used to wait for pv qemu
initialization while the libxl_device_model_starting pointer given by
the user is initialized to a new structure with an empty for_spawn
member, because nothing that was spawned has to be waited for anymore.
Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
Keir Fraser [Tue, 1 Dec 2009 14:02:00 +0000 (14:02 +0000)]
x86: fix MCE/NMI injection
This attempts to address all the concerns raised in
http://lists.xensource.com/archives/html/xen-devel/2009-11/msg01195.html,
but I'm nevertheless still not convinced that all aspects of the
injection handling really work reliably. In particular, while the
patch here, on top of the fixes for the problems mentioned in the
referenced mail, also adds code to keep send_guest_trap() from
injecting multiple events at a time, I don't think this is the right
mechanism - it should be possible to handle NMI/MCE nested within
each other.
Another fix on top of the ones for the earlier described problems is
that the vCPU affinity restore logic didn't account for software
injected NMIs - these never set cpu_affinity_tmp, but due to it most
likely being different from cpu_affinity it would have got restored
(to a potentially random value) nevertheless.
Signed-off-by: Jan Beulich <jbeulich@novell.com>
Keir Fraser [Tue, 1 Dec 2009 13:59:47 +0000 (13:59 +0000)]
xen: turn numa=on by default
I did some benchmark runs (lmbench & kernel compile) with a number of
guests running in parallel to compare the performance of numa=on vs.
numa=off. As soon as one starts to load the machine, the performance
goes down in the numa=off case. The tests were done on an 8-node
machine (4 cores each). lmbench (actually copying large amounts of
memory) shows a dramatic drop, but I even noticed a significant
performance decrease for a tmpfs based Linux kernel compile. Here is a
summary of the data:
lmbench's rd benchmark (normalized to native Linux (=100)):
guests         numa=off                 numa=on           avg increase
          min    avg    max       min    avg    max
   1            78.0                   102.3
   7     37.4   45.6   62.0      90.6  102.3  110.9         124.4%
  15     21.0   25.8   31.7      41.7   48.7   54.1          88.2%
  23     13.4   17.5   23.2      25.0   28.0   30.1          60.2%
kernel compile in tmpfs, 1 VCPU, 2GB RAM, average of elapsed time:
guests   numa=off   numa=on   increase
   1      480.610   464.320     3.4%
   7      482.109   461.721     4.2%
  15      515.297   477.669     7.3%
  23      548.427   495.180     9.7%
again with 2 VCPUs and make -j2:
   1      264.580   261.690     1.1%
   7      279.763   258.907     7.7%
  15      330.385   272.762    17.4%
  23      463.510   390.547    15.7%  (46 VCPUs on 32 pCPUs)
Selected tests on a 4-node machine showed similar behavior (7.9 %
increase with 6 parallel guests on the 2 VCPU kernel compile
benchmark).
Note that this does not affect non-NUMA machines at all, since NUMA
will be turned off again by the code if no NUMA topology is detected.
Signed-off-by: Andre Przywara <andre.przywara@amd.com>
Keir Fraser [Tue, 1 Dec 2009 13:57:02 +0000 (13:57 +0000)]
libxc: pass the restore_context through function and allocate the context on the restore function stack.
Signed-off-by: Vincent Hanquez <vincent.hanquez@eu.citrix.com>
Keir Fraser [Tue, 1 Dec 2009 13:56:26 +0000 (13:56 +0000)]
libxc: pass the suspend_context through function and allocate the context on the save function stack.
Signed-off-by: Vincent Hanquez <vincent.hanquez@eu.citrix.com>
Keir Fraser [Tue, 1 Dec 2009 13:55:50 +0000 (13:55 +0000)]
libxc: move the domain_info_context into the restore_context
Signed-off-by: Vincent Hanquez <vincent.hanquez@eu.citrix.com>
Keir Fraser [Tue, 1 Dec 2009 13:55:15 +0000 (13:55 +0000)]
libxc: move domain_info_context into the save_context
Signed-off-by: Vincent Hanquez <vincent.hanquez@eu.citrix.com>
Keir Fraser [Tue, 1 Dec 2009 13:54:36 +0000 (13:54 +0000)]
libxc: move restore global variable to a global static context
Signed-off-by: Vincent Hanquez <vincent.hanquez@eu.citrix.com>
Keir Fraser [Tue, 1 Dec 2009 13:54:01 +0000 (13:54 +0000)]
libxc: create a global context structure to record global variables in save
Signed-off-by: Vincent Hanquez <vincent.hanquez@eu.citrix.com>
Keir Fraser [Tue, 1 Dec 2009 13:53:14 +0000 (13:53 +0000)]
libxc: create a domain_info_context structure to store guest_width and p2m_size for macros.
Macro now refers to guest_width and p2m_size through a dinfo pointer.
Signed-off-by: Vincent Hanquez <vincent.hanquez@eu.citrix.com>
Keir Fraser [Tue, 1 Dec 2009 13:49:33 +0000 (13:49 +0000)]
libxenlight: enables less than maximum vcpus
Enable turning on a different amount of vcpus than
the maximum during domain creation/restore.
Signed-off-by: Andres Lagar-Cavilla <andres@lagarcavilla.com>
Keir Fraser [Tue, 1 Dec 2009 13:48:48 +0000 (13:48 +0000)]
libxenlight: allow domain to publish its suspend evtchn
Allow domain to publish its suspend event channel.
Otherwise, the fast event-channel-based suspend
path is disabled.
Signed-off-by: Andres Lagar-Cavilla <andres@lagarcavilla.com>
Keir Fraser [Tue, 1 Dec 2009 13:48:03 +0000 (13:48 +0000)]
libxenlight: write vcpu availability paths in xenstore
Write cpu availability paths to xenstore. Otherwise,
no vcpus other than the first are enabled.
Signed-off-by: Andres Lagar-Cavilla <andres@lagarcavilla.com>
Keir Fraser [Tue, 1 Dec 2009 13:47:18 +0000 (13:47 +0000)]
libxenlight: remove vss and xapi path on domain destroy
Signed-off-by: Andres Lagar-Cavilla <andres@lagarcavilla.com>
Keir Fraser [Tue, 1 Dec 2009 13:46:31 +0000 (13:46 +0000)]
libxenlight: set domain handle
Set domain handle much like xend does, identical to
the uuid. This allows obtaining the uuid of a domain
from the handle in the dominfo struct.
Signed-off-by: Andres Lagar-Cavilla <andres@lagarcavilla.com>
Keir Fraser [Tue, 1 Dec 2009 13:45:45 +0000 (13:45 +0000)]
libxenlight: fix uuid code
- Use proper constants
- Use functions from the uuid library
- Fix broken pointer handling in libxl_dominfo
Signed-off-by: Andres Lagar-Cavilla <andres@lagarcavilla.com>
Keir Fraser [Tue, 1 Dec 2009 13:44:13 +0000 (13:44 +0000)]
libxenlight: avoid writing empty values to xenstore
Prevent segmentation fault caused by empty values
in key-value pairs for the /vm/ subdirectory
when creating a pv domain.
Signed-off-by: Andres Lagar-Cavilla <andres@lagarcavilla.com>
Keir Fraser [Tue, 1 Dec 2009 13:41:38 +0000 (13:41 +0000)]
sysctl: Fix mis-allocation of number for XEN_SYSCTL_lockprof_op
Signed-off-by: Keir Fraser <keir.fraser@citrix.com>
Keir Fraser [Tue, 1 Dec 2009 13:39:51 +0000 (13:39 +0000)]
Revert 20523:bd52fff29e6e "Remove redundant tests in __start_xen()"
Consensus is that code is clearer with the tests, even though they are
redundant.
Signed-off-by: Keir Fraser <keir.fraser@citrix.com>
Keir Fraser [Tue, 1 Dec 2009 13:38:18 +0000 (13:38 +0000)]
xentop: Add tmem-freeable info when tmem is active
(No change to xentop output when tmem is inactive.)
Signed-off-by: Dan Magenheimer <dan.magenheimer@oracle.com>
Keir Fraser [Tue, 1 Dec 2009 13:37:20 +0000 (13:37 +0000)]
xenstat: Linux dom0 statistics for case we use network bonding
I've created a patch that alters dom0 statistics (if empty like in
case of network bonding) and puts network bridge statistics
instead. It's been tested with network bonding both enabled and
disabled and also by creating a standalone network bridge without
bonding... It was working fine in all my tests...
Signed-off-by: Michal Novotny <minovotn@redhat.com>
Keir Fraser [Tue, 1 Dec 2009 13:36:22 +0000 (13:36 +0000)]
Report hardware tsc frequency even for emulated tsc
I was starting some documentation for tsc_mode and
realized this discussion was never resolved. Currently
when TSC is emulated the pvclock algorithm reports
to a PV OS Xen's system clock hz rate (1GHz). Linux
at boottime samples the TSC rate and shows it in
dmesg and the rate is also shown in the "cpu MHz"
field in /proc/cpuinfo. So when TSC is emulated,
it appears that the processor MHz is 1000.0, which
is likely to be confusing to many Xen users.
This patch changes the reported hz rate to the
hz rate of the initial machine on which the guest
is booted and retains that reported hz rate across
save/restore/migration.
Jeremy has pointed out that reporting 1000.0 MHz is
useful because it shows that TSC is being emulated.
However, with the new tsc_mode default where
a guest may start with native TSC and switch to
emulated TSC after migration, users are likely to
get even more confused. And "xm debug-key s"
reveals not only whether TSC is being emulated but
also the frequency so is more descriptive anyway.
Signed-off-by: Dan Magenheimer <dan.magenheimer@oracle.com>
Keir Fraser [Tue, 1 Dec 2009 13:35:28 +0000 (13:35 +0000)]
tools: avoid cpu over-commitment if numa=on
Signed-off-by: Andre Przywara <andre.przywara@amd.com>
Keir Fraser [Tue, 1 Dec 2009 13:34:38 +0000 (13:34 +0000)]
libxenlight: fix segfault when reading blktap2 devs
This patch fixes a possible segfault when reading from
/sys/class/blktap2/devices, if the line read is empty.
Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
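The kind of guard this fix describes can be sketched as follows. This is a minimal illustration only, not the libxenlight code: the helper name line_is_usable is hypothetical, and the sketch assumes the crash comes from indexing the last character of a line without first checking that the line is non-empty.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper (not part of libxenlight): strips one trailing
 * newline and reports whether any content is left.
 * Returns 1 if the line is usable, 0 if it is NULL or empty. */
static int line_is_usable(char *line)
{
    size_t len;

    if (line == NULL)
        return 0;

    len = strlen(line);

    /* Check len before indexing: line[len - 1] on an empty string
     * would read before the start of the buffer. */
    if (len > 0 && line[len - 1] == '\n')
        line[--len] = '\0';

    return len > 0;
}
```

A caller reading the sysfs file line by line would skip any line for which the helper returns 0 before parsing it further.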
Keir Fraser [Tue, 1 Dec 2009 13:34:10 +0000 (13:34 +0000)]
libxenlight: fix multiple console with stubdoms
libxenlight doesn't properly handle the multiple PV console case,
needed to support an emulated serial port in HVM guests with stubdoms.
This patch fixes it.
Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
Keir Fraser [Mon, 30 Nov 2009 11:48:36 +0000 (11:48 +0000)]
x86: Remove redundant tests in __start_xen()
Signed-off-by: Xiao Guangrong <xiaoguangrong@cn.fujitsu.com>
Keir Fraser [Mon, 30 Nov 2009 10:58:23 +0000 (10:58 +0000)]
ia64: eliminate build warnings
Various warnings appeared since 3.4 - eliminate at least some of them.
Signed-off-by: Jan Beulich <jbeulich@novell.com>
Keir Fraser [Mon, 30 Nov 2009 10:57:42 +0000 (10:57 +0000)]
xend: fix bugs in c/s 20321:
7a69f773548e "add a config description item for each guest"
Signed-off-by: james song (wei) <jsong@novell.com>
Keir Fraser [Mon, 30 Nov 2009 10:54:20 +0000 (10:54 +0000)]
libxenlight: implement blktap2 support
This patch implements blktap2 support in libxenlight; blktap2 is only
enabled if it is actually supported by the host, otherwise we fall
back to the previous code. Also, for the moment, disk type "file" is
treated as tap:aio.
Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
Keir Fraser [Mon, 30 Nov 2009 10:53:39 +0000 (10:53 +0000)]
libxenlight: fix suspend/resume
Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
Keir Fraser [Mon, 30 Nov 2009 10:47:36 +0000 (10:47 +0000)]
libxenlight: add console command
This patch adds "xl console" command similar to "xm console".
Signed-off-by: Tomasz Wroblewski <tomasz.wroblewski@citrix.com>
Signed-off-by: Keir Fraser <keir.fraser@citrix.com>
Acked-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
Keir Fraser [Mon, 30 Nov 2009 10:41:28 +0000 (10:41 +0000)]
libxenlight: fix hvm flag when no hvmloader
Signed-off-by: Tomasz Wroblewski <tomasz.wroblewski@citrix.com>
Signed-off-by: Keir Fraser <keir.fraser@citrix.com>
Acked-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
Keir Fraser [Mon, 30 Nov 2009 10:38:58 +0000 (10:38 +0000)]
x86/mm: set_p2m_entry() should return 0 on error
set_p2m_entry() ignores errors that occur partway through an update.
It should return 0 on error.
Signed-off-by: Kouya Shimura <kouya@jp.fujitsu.com>
Acked-by: Tim Deegan <Tim.Deegan@citrix.com>
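The pattern behind this fix can be sketched as follows. This is an illustration under stated assumptions, not the Xen implementation: set_range, set_chunk_fn and fail_on are hypothetical names, and the sketch assumes the update is done in several steps where a later success could otherwise mask an earlier failure.

```c
#include <stddef.h>

/* Hypothetical per-step setter: returns 1 on success, 0 on failure. */
typedef int (*set_chunk_fn)(unsigned long gfn, void *ctx);

/* Returns 1 only if every step succeeded, 0 if any step failed. */
static int set_range(unsigned long gfn, unsigned long count,
                     set_chunk_fn set_chunk, void *ctx)
{
    int rc = 1;
    unsigned long i;

    for (i = 0; i < count; i++) {
        /* Remember a failure instead of letting a later success
         * overwrite it. */
        if (!set_chunk(gfn + i, ctx))
            rc = 0;
    }
    return rc;
}

/* Example setter for demonstration: fails for one specific gfn. */
static int fail_on(unsigned long gfn, void *ctx)
{
    return gfn != *(unsigned long *)ctx;
}
```

The sketch keeps going after a failed step and only records the failure; stopping at the first error would be an equally valid design, as long as the caller sees 0.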
Keir Fraser [Fri, 27 Nov 2009 08:09:26 +0000 (08:09 +0000)]
xm: Allow detaching vif by MAC address
Signed-off-by: Masaki Kanno <kanno.masaki@jp.fujitsu.com>
Keir Fraser [Fri, 27 Nov 2009 08:05:18 +0000 (08:05 +0000)]
VT-d: Free unused interrupt remapping table entry
This patch changes the IRTE allocation method, and frees unused
IRTEs when a device is de-assigned.
Signed-Off-By: Zhai Edwin <edwin.zhai@intel.com>
Keir Fraser [Fri, 27 Nov 2009 07:56:38 +0000 (07:56 +0000)]
build: Execute mk_dsdt with path
Signed-off-by: Simon Horman <horms@verge.net.au>
Keir Fraser [Thu, 26 Nov 2009 15:27:00 +0000 (15:27 +0000)]
hvmloader: Auto-generate IRQ routing tables in ACPI DSDT.
Signed-off-by: Keir Fraser <keir.fraser@citrix.com>
Keir Fraser [Thu, 26 Nov 2009 14:49:40 +0000 (14:49 +0000)]
libxenlight: implement pause and unpause
This patch adds domain pause and unpause commands to xl, implementing
them using the already existing functions libxl_domain_pause and
libxl_domain_unpause.
Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>