xen.git
12 years agopvh: introduce PVH guest type
Mukesh Rathor [Wed, 13 Nov 2013 08:33:12 +0000 (09:33 +0100)]
pvh: introduce PVH guest type

Introduce new PVH guest type, flags to create it, and ways to identify it.

To begin with, it will inherit functionality marked hvm_container.

Code to actually check for hardware support, in the VMX case, will be added
in future patches.

Signed-off-by: Mukesh Rathor <mukesh.rathor@oracle.com>
Signed-off-by: George Dunlap <george.dunlap@eu.citrix.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Keir Fraser <keir@xen.org>
Acked-by: Eddie Dong <eddie.dong@intel.com>
12 years agopvh prep: introduce pv guest type and has_hvm_container macros
Mukesh Rathor [Wed, 13 Nov 2013 08:30:09 +0000 (09:30 +0100)]
pvh prep: introduce pv guest type and has_hvm_container macros

The goal of this patch is to classify conditionals more clearly, as to
whether they relate to pv guests, hvm-only guests, or guests with an
"hvm container" (which will eventually include PVH).

This patch introduces an enum for guest type, as well as two new macros
for switching behavior on and off: is_pv_* and has_hvm_container_*.  At the
moment is_pv_* <=> !has_hvm_container_*.  The purpose of having two is that
taking a path because something does *not* have PV structures seems to me
different from taking a path because it *does* have HVM structures, even if
the two happen to coincide 100% at the moment.  The exact usage is occasionally
a bit fuzzy though, and a judgement call just needs to be made on which is
clearer.

In general, a switch should use is_pv_* (or !is_pv_*) if the code in question
relates directly to a PV guest.  Examples include use of pv_vcpu structs or
other behavior directly related to PV domains.
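
As a rough illustration of the two predicates, the sketch below uses a
simplified stand-in for the domain structure.  The struct layout, the enum
member names and the helper names are assumptions made for this example and
are not taken from the patch itself.

```c
#include <stdbool.h>

/* Hypothetical, simplified guest type enum for this sketch. */
enum guest_type {
    guest_type_pv,
    guest_type_hvm,
};

/* Stand-in for struct domain, reduced to the one field the sketch needs. */
struct domain {
    enum guest_type guest_type;
};

/* True when the domain uses PV structures (pv_vcpu and similar). */
static inline bool is_pv_domain(const struct domain *d)
{
    return d->guest_type == guest_type_pv;
}

/*
 * True when the domain has HVM structures.  At the moment this is just the
 * negation of is_pv_domain(); it is kept as a separate predicate so that the
 * call site documents *why* a path is taken.
 */
static inline bool has_hvm_container_domain(const struct domain *d)
{
    return !is_pv_domain(d);
}
```

A call site touching pv_vcpu state would test is_pv_domain(d), while one
touching HVM state would test has_hvm_container_domain(d).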

hvm_container is more of a fuzzy concept, but in general:

* Most core HVM behavior will be included in this.  Behavior not
appropriate for PVH mode will be disabled in later patches

* Hypercalls related to HVM guests will *not* be included by default;
functionality needed by PVH guests will be enabled in future patches

* The following functionality is not considered part of the HVM
container, and PVH will end up behaving like PV by default: Event
channel, vtsc offset, code related to emulated timers, nested HVM,
emuirq, PoD

* Some features are left to implement for PVH later: vpmu, shadow mode

Signed-off-by: Mukesh Rathor <mukesh.rathor@oracle.com>
Signed-off-by: George Dunlap <george.dunlap@eu.citrix.com>
Acked-by: Tim Deegan <tim@xen.org>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Keir Fraser <keir@xen.org>
Acked-by: Eddie Dong <eddie.dong@intel.com>
12 years agopvh: tolerate HVM guests having no ioreq page
George Dunlap [Wed, 13 Nov 2013 08:29:02 +0000 (09:29 +0100)]
pvh: tolerate HVM guests having no ioreq page

PVH guests don't have a backing device model emulator (qemu); just
tolerate this situation explicitly, rather than special-casing PVH.

For unhandled IO, hvmemul_do_io() will now return X86EMUL_OKAY, which
I believe matches the effect of qemu not having a handler
for the IO.

This also fixes a potential DoS in the host from the reworked series:
If the guest makes a hypercall which sends an invalidate request, it
would have crashed the host.
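
The idea of tolerating a missing ioreq page, rather than testing for PVH,
might look roughly like the sketch below.  The structure, the function name
and the return codes are all hypothetical stand-ins and do not reproduce the
real hvmemul_do_io().

```c
#include <stddef.h>

/* Illustrative return codes; the real ones belong to Xen's x86 emulator. */
enum { SKETCH_EMUL_OKAY = 0, SKETCH_EMUL_RETRY = 1 };

/* Hypothetical stand-in for the shared page used to talk to a device model. */
struct ioreq_page {
    int pending;
};

static int sketch_do_io(struct ioreq_page *page)
{
    if (page == NULL) {
        /* No device model exists: report the access as completed. */
        return SKETCH_EMUL_OKAY;
    }

    /* Otherwise hand the request over and wait for the device model. */
    page->pending = 1;
    return SKETCH_EMUL_RETRY;
}
```

The check is on the page itself, so any guest without a device model takes
the same path without the code needing to know its guest type.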

Signed-off-by: George Dunlap <george.dunlap@eu.citrix.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Keir Fraser <keir@xen.org>
Acked-by: Eddie Dong <eddie.dong@intel.com>
12 years agopvh prep: code motion
Mukesh Rathor [Wed, 13 Nov 2013 08:26:38 +0000 (09:26 +0100)]
pvh prep: code motion

There are many functions where PVH requires some code in common with
HVM.  Rearrange some of these functions so that the code is together.

In general, the HVM code that PVH also uses includes:
 - cacheattr functionality
 - paging
 - hvm_funcs
 - hvm_assert_evtchn_irq tasklet
 - tm_list
 - hvm_params

And code that PVH shares with PV but not with HVM:
 - updating the domain wallclock
 - setting v->is_initialized

There should be no end-to-end changes in behavior.

Signed-off-by: Mukesh Rathor <mukesh.rathor@oracle.com>
Signed-off-by: George Dunlap <george.dunlap@eu.citrix.com>
Acked-by: Tim Deegan <tim@xen.org>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Keir Fraser <keir@xen.org>
Acked-by: Eddie Dong <eddie.dong@intel.com>
12 years agolibxc: move temporary grant table mapping to end of memory
Roger Pau Monné [Wed, 13 Nov 2013 08:26:13 +0000 (09:26 +0100)]
libxc: move temporary grant table mapping to end of memory

In order to set up the grant table for HVM guests, libxc needs to map
the grant table temporarily.  At the moment, it does this by adding the
grant page to the HVM guest's p2m table in the MMIO hole (at gfn 0xFFFFE),
then mapping that gfn, setting up the table, then unmapping the gfn and
removing it from the p2m table.

This breaks with PVH guests with 4G or more of ram, because there is
no MMIO hole; so it ends up clobbering a valid RAM p2m entry, then
leaving a "hole" when it removes the grant map from the p2m table.
Since the guest thinks this is normal ram, when it maps it and tries
to access the page, it crashes.

This patch maps the page at max_gfn+1 instead.
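
The arithmetic behind the problem can be sketched as follows, assuming 4K
pages and guest RAM that is contiguous from gfn 0 (a simplification).  The
helper names are invented for this example.

```c
#include <stdbool.h>
#include <stdint.h>

#define SKETCH_PAGE_SHIFT 12
#define OLD_SCRATCH_GFN   0xFFFFEull   /* gfn named in the commit message */

/* Highest gfn of a guest whose RAM is contiguous from gfn 0 (assumption). */
static uint64_t max_gfn_for_ram(uint64_t ram_bytes)
{
    return (ram_bytes >> SKETCH_PAGE_SHIFT) - 1;
}

/* The fixed gfn lands on RAM once the guest has no hole below 4G. */
static bool old_scratch_clobbers_ram(uint64_t ram_bytes)
{
    return OLD_SCRATCH_GFN <= max_gfn_for_ram(ram_bytes);
}

/* Placing the temporary mapping just past the end of memory avoids that. */
static uint64_t new_scratch_gfn(uint64_t ram_bytes)
{
    return max_gfn_for_ram(ram_bytes) + 1;
}
```

With 4G of RAM the highest gfn is 0xFFFFF, so 0xFFFFE is ordinary RAM, whereas
max_gfn+1 (0x100000) is guaranteed to be unused.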

Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>
Acked-by: Tim Deegan <tim@xen.org>
Acked-by: Ian Campbell <ian.campbell@citrix.com>
Acked-by: Keir Fraser <keir@xen.org>
Acked-by: Eddie Dong <eddie.dong@intel.com>
12 years agoVMX: allow vmx_update_debug_state to be called when v!=current
George Dunlap [Wed, 13 Nov 2013 08:25:36 +0000 (09:25 +0100)]
VMX: allow vmx_update_debug_state to be called when v!=current

Removing the assert allows the PVH code to call this during vmcs
construction in a later patch, making the code more robust by removing
duplicate code.

Signed-off-by: George Dunlap <george.dunlap@eu.citrix.com>
Acked-by: Tim Deegan <tim@xen.org>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Keir Fraser <keir@xen.org>
Acked-by: Eddie Dong <eddie.dong@intel.com>
12 years agolibxl: Avoid realloc(,0) when libxl__xs_directory returns empty list
Ian Jackson [Thu, 18 Apr 2013 15:27:46 +0000 (16:27 +0100)]
libxl: Avoid realloc(,0) when libxl__xs_directory returns empty list

If the named path is a leaf node, libxl__xs_directory can succeed,
returning non-null, but set *nb to 0.

In three places in libxl this may result in a zero size argument being
passed to malloc() or realloc(), which is not advisable.
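
A minimal sketch of the kind of guard involved is shown below.  The function
name and signature are hypothetical and are not the libxl code; the point is
only that the zero-length case returns before any allocator call, since the
result of a zero-size request is implementation-defined.

```c
#include <stdlib.h>

/*
 * Copy nb entry pointers into a freshly allocated array.
 * Returns 0 on success (with *out == NULL when nb is 0), -1 on failure.
 */
static int copy_entries(char ***out, char *const *entries, unsigned int nb)
{
    *out = NULL;

    if (nb == 0)
        return 0;   /* empty list: never call malloc(0) */

    char **copy = malloc(nb * sizeof(*copy));
    if (copy == NULL)
        return -1;

    for (unsigned int i = 0; i < nb; i++)
        copy[i] = entries[i];

    *out = copy;
    return 0;
}
```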

Signed-off-by: Ian Jackson <Ian.Jackson@eu.citrix.com>
Acked-by: Roger Pau Monné <roger.pau@citrix.com>
Acked-by: Ian Campbell <ian.campbell@citrix.com>
12 years agolibxl: Deprecate synchronous waiting for the device model
Ian Jackson [Mon, 14 Oct 2013 16:26:01 +0000 (17:26 +0100)]
libxl: Deprecate synchronous waiting for the device model

libxl__wait_for_device_model blocks, with the ctx lock held, waiting
for a response from the device model.  If the dm doesn't respond
quickly (for example, because it has crashed), this may block the
whole process.  Explain this in a comment, rename the function to
libxl__wait_for_device_model_deprecated, and explain what to use
instead.

libxl__wait_for_offspring is the core implementation for the above.
Its name leads people to think it might be generally useful for
waiting for children, which is far from the case.  It only waits for
xenstore.  Also it has the problems described above.  Explain this,
rename it to libxl__xenstore_child_wait_deprecated, and explain what
to use instead.

Signed-off-by: Ian Jackson <Ian.Jackson@eu.citrix.com>
Acked-by: Ian Campbell <ian.campbell@citrix.com>
12 years agolibxl: Do not generate short block in libxl__datacopier_prefixdata
Ian Jackson [Tue, 3 Sep 2013 12:41:46 +0000 (13:41 +0100)]
libxl: Do not generate short block in libxl__datacopier_prefixdata

libxl__datacopier_prefixdata would prepend a deliberately short block
(not just a half-full one, but one with a short buffer) to the
dc->bufs queue.  However, this is wrong because datacopier_readable
will find it and try to continue to fill it up.

Instead, allocate a full-sized buffer.

Signed-off-by: Ian Jackson <Ian.Jackson@eu.citrix.com>
Tested-by: Chunyan Liu <cyliu@suse.com>
Acked-by: Ian Campbell <ian.campbell@citrix.com>
12 years agolibxl: Introduce nested async operations (nested ao)
Ian Jackson [Mon, 4 Nov 2013 17:56:15 +0000 (17:56 +0000)]
libxl: Introduce nested async operations (nested ao)

This allows a long-running ao to avoid accumulating memory.  Each
nested ao has its own gc.

Signed-off-by: Ian Jackson <Ian.Jackson@eu.citrix.com>
Acked-by: Roger Pau Monné <roger.pau@citrix.com>
Tested-by: Roger Pau Monné <roger.pau@citrix.com>
Acked-by: Ian Campbell <ian.campbell@citrix.com>
12 years agocommon/vsprintf: fix signed->unsigned error, causing glacial performance
Andrew Cooper [Tue, 12 Nov 2013 16:20:34 +0000 (17:20 +0100)]
common/vsprintf: fix signed->unsigned error, causing glacial performance

The original patch for

  c/s 67a3542c5bc356e6452d8305991617c875f87de4
  "common/vsprintf: Refactor string() out of vsnprintf()"

specifically used signed integers, identical to the code copied out of vsprintf.

When committed, these had changed to unsigned integers, which causes a
functional change.  This causes glacial boot performance and an excessive
quantity of spaces printed to the serial console, as we loop to the upper
bound of a 32bit integer.
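
The effect can be reproduced in isolation.  The sketch below assumes that an
unspecified field width is represented as -1, as is common in vsnprintf
implementations; the function names are invented for the example.

```c
#include <stdbool.h>

/* Padding test with signed operands: -1 means "no width", so no padding. */
static bool needs_pad_signed(int len, int field_width)
{
    return len < field_width;
}

/* Same test with unsigned operands: -1 becomes UINT_MAX, so it is true. */
static bool needs_pad_unsigned(unsigned int len, unsigned int field_width)
{
    return len < field_width;
}

/* Number of pad characters a signed loop of this shape would emit. */
static unsigned int pad_count_signed(int len, int field_width)
{
    unsigned int n = 0;

    while (len < field_width--)
        n++;

    return n;
}
```

With the unsigned variant the same loop would run until the counter reached
the string length from the top of the 32-bit range, emitting a space each
time.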

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
12 years agoQEMU_TAG update
Ian Jackson [Tue, 12 Nov 2013 15:41:37 +0000 (15:41 +0000)]
QEMU_TAG update

12 years agolibxl: save/restore errno in SIGCHLD handler
Ian Jackson [Mon, 11 Nov 2013 17:17:55 +0000 (17:17 +0000)]
libxl: save/restore errno in SIGCHLD handler

Without this, code interrupted by SIGCHLD may experience strange
values of errno.  (As far as I know this is not the cause of any
reported bugs.)

This fix should be backported in due course.
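
The usual pattern looks like the sketch below.  The self-pipe descriptor and
the handler name are hypothetical; only the save and restore of errno around
the handler body is the point.

```c
#include <errno.h>
#include <signal.h>
#include <unistd.h>

/* Hypothetical write end of a self-pipe used to wake the main loop. */
static int sigchld_pipe_wfd = -1;

static void sigchld_handler(int signo)
{
    int saved_errno = errno;    /* value seen by the interrupted code */

    (void)signo;

    if (sigchld_pipe_wfd >= 0) {
        char b = 0;
        /* write() may fail and overwrite errno (e.g. EAGAIN, EBADF). */
        ssize_t r = write(sigchld_pipe_wfd, &b, 1);
        (void)r;
    }

    errno = saved_errno;        /* hide any change from the interrupted code */
}
```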

Signed-off-by: Ian Jackson <Ian.Jackson@eu.citrix.com>
Acked-by: Ian Campbell <ian.campbell@citrix.com>
12 years agox86: eliminate has_arch_mmios()
Jan Beulich [Tue, 12 Nov 2013 15:28:47 +0000 (16:28 +0100)]
x86: eliminate has_arch_mmios()

... as being generally insufficient: Either has_arch_pdevs() or
cache_flush_permitted() should be used (in particular, it is
insufficient to consider MMIO ranges alone - I/O port ranges have the
same requirements if available to a guest).

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Keir Fraser <keir@xen.org>
12 years agoevtchn/fifo: don't spin indefinitely when setting LINK
David Vrabel [Tue, 12 Nov 2013 12:19:25 +0000 (13:19 +0100)]
evtchn/fifo: don't spin indefinitely when setting LINK

A malicious or buggy guest can cause another domain to spin
indefinitely by repeatedly writing to an event word when the other
guest is trying to link a new event.  The cmpxchg() in
evtchn_fifo_set_link() will repeatedly fail and the loop may never
terminate.

Fixing this requires a change to the ABI which is documented in draft
H of the design.

  http://xenbits.xen.org/people/dvrabel/event-channels-H.pdf

Since a well-behaved guest only makes a limited set of state changes,
the loop can terminate early if the guest makes an invalid state
transition.

The guest may:

- clear LINKED and LINK.
- clear PENDING
- set MASKED
- clear MASKED

Currently a guest may mask and unmask an event at any time, so
specify that it is not valid for a guest to clear MASKED while Xen is
trying to update LINK.  Indicate this to the guest with an additional
BUSY bit in the event word.  The guest must not clear MASKED if BUSY
is set and it should spin until BUSY is cleared.

The remaining valid writes (clear LINKED, clear PENDING, set MASKED,
clear MASKED by Xen) will limit the number of failures of the
cmpxchg() to at most 4.  A clear of LINKED will also terminate the
loop early. Therefore, the loop can then be limited to at most 4
iterations.

If the buggy or malicious guest does cause the loop to exit with
LINKED set and LINK unset then that buggy guest will lose events.
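
A bounded retry loop of this shape could be sketched with C11 atomics as
below.  The bit positions, the helper name and the return convention are
assumptions for illustration; this is not the evtchn_fifo_set_link()
implementation.

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

/* Bit layout assumed for this sketch only. */
#define EW_LINKED      (1u << 29)
#define EW_BUSY        (1u << 28)
#define EW_LINK_MASK   ((1u << 17) - 1)
#define MAX_LINK_TRIES 4

/*
 * Try to store `link` in the event word.  Returns true if the link was
 * written, false if the guest cleared LINKED or the retry budget ran out.
 */
static bool set_link_bounded(_Atomic uint32_t *word, uint32_t link)
{
    /* Set BUSY so a well-behaved guest will not clear MASKED meanwhile. */
    uint32_t old = atomic_fetch_or(word, EW_BUSY) | EW_BUSY;

    for (int attempt = 0; attempt < MAX_LINK_TRIES; attempt++) {
        if (!(old & EW_LINKED)) {
            /* The guest unlinked the event: terminate early. */
            atomic_fetch_and(word, ~EW_BUSY);
            return false;
        }

        /* New value: same flags, BUSY dropped, LINK field replaced. */
        uint32_t new = (old & ~(EW_BUSY | EW_LINK_MASK)) |
                       (link & EW_LINK_MASK);

        /* On failure `old` is refreshed with the current word value. */
        if (atomic_compare_exchange_strong(word, &old, new))
            return true;
    }

    /* Only a misbehaving guest can cause this many failures. */
    atomic_fetch_and(word, ~EW_BUSY);
    return false;
}
```

Because each valid guest write can make the compare-and-exchange fail at most
once, a fixed retry budget is enough for well-behaved guests.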

Reported-by: Anthony Liguori <aliguori@amazon.com>
Signed-off-by: David Vrabel <david.vrabel@citrix.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
12 years agoVMX: don't crash processing 'd' debug key
Jan Beulich [Tue, 12 Nov 2013 10:52:19 +0000 (11:52 +0100)]
VMX: don't crash processing 'd' debug key

There's a window during scheduling where "current" and the active VMCS
may disagree: The former gets set much earlier than the latter. Since
both vmx_vmcs_enter() and vmx_vmcs_exit() immediately return when the
subject vCPU is "current", accessing VMCS fields would, depending on
whether there is any currently active VMCS, either read wrong data, or
cause a crash.

Going forward we might want to consider reducing the window during
which vmx_vmcs_enter() might fail (e.g. doing a plain __vmptrld() when
v->arch.hvm_vmx.vmcs != this_cpu(current_vmcs) but arch_vmx->active_cpu
== -1), but that would add complexities (acquiring and - more
importantly - properly dropping v->arch.hvm_vmx.vmcs_lock) that don't
look worthwhile adding right now.

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
Acked-by: Keir Fraser <keir@xen.org>
12 years agonested SVM: adjust guest handling of structure mappings
Jan Beulich [Tue, 12 Nov 2013 10:51:15 +0000 (11:51 +0100)]
nested SVM: adjust guest handling of structure mappings

For one, nestedsvm_vmcb_map() error checking must not consist of using
assertions: Global (permanent) mappings can fail, and hence failure
needs to be dealt with properly. And non-global (transient) mappings
can't fail anyway.

And then the I/O port access bitmap handling was broken: It checked
only the first of the accessed ports rather than each of them.
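
Checking every port covered by an access, rather than only the first, can be
sketched as below.  The bitmap layout (one bit per port, set meaning
intercepted) and the function name are assumptions for this example.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Return true if any port touched by an access of `bytes` bytes starting at
 * `port` is marked in the bitmap.  The caller must supply a bitmap large
 * enough to cover port + bytes - 1.
 */
static bool io_intercepted(const uint8_t *iopm, uint32_t port,
                           unsigned int bytes)
{
    for (unsigned int i = 0; i < bytes; i++) {
        uint32_t p = port + i;

        if (iopm[p / 8] & (1u << (p % 8)))
            return true;
    }

    return false;
}
```

A two-byte access at 0x70 also touches 0x71, so it must be intercepted when
only 0x71 is marked; a check of the first port alone would miss it.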

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Christoph Egger <chegger@amazon.de>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
Acked-by: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>
12 years agoMAINTAINERS: Add KEXEC maintainer
David Vrabel [Tue, 12 Nov 2013 10:47:36 +0000 (11:47 +0100)]
MAINTAINERS: Add KEXEC maintainer

Signed-off-by: David Vrabel <david.vrabel@citrix.com>
Acked-by: Keir Fraser <keir@xen.org>
12 years agox86: check kexec relocation code fits in a page
David Vrabel [Tue, 12 Nov 2013 10:47:26 +0000 (11:47 +0100)]
x86: check kexec relocation code fits in a page

The kexec relocation (control) code must fit in a single page so add a
link time check for this.

Signed-off-by: David Vrabel <david.vrabel@citrix.com>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
Reviewed-by: Don Slutz <dslutz@verizon.com>
Tested-by: Don Slutz <dslutz@verizon.com>
Reviewed-by: Daniel Kiper <daniel.kiper@oracle.com>
Tested-by: Daniel Kiper <daniel.kiper@oracle.com>
Acked-by: Keir Fraser <keir@xen.org>
12 years agolibxc: add API for kexec hypercall
David Vrabel [Tue, 12 Nov 2013 10:47:07 +0000 (11:47 +0100)]
libxc: add API for kexec hypercall

Add xc_kexec_exec(), xc_kexec_get_ranges(), xc_kexec_load(), and
xc_kexec_unload().  The load and unload calls require the v2 load and
unload ops.

Signed-off-by: David Vrabel <david.vrabel@citrix.com>
Acked-by: Ian Campbell <ian.campbell@citrix.com>
Reviewed-by: Daniel Kiper <daniel.kiper@oracle.com>
Tested-by: Daniel Kiper <daniel.kiper@oracle.com>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
Reviewed-by: Don Slutz <dslutz@verizon.com>
Tested-by: Don Slutz <dslutz@verizon.com>
Acked-by: Keir Fraser <keir@xen.org>
12 years agolibxc: add hypercall buffer arrays
David Vrabel [Tue, 12 Nov 2013 10:46:39 +0000 (11:46 +0100)]
libxc: add hypercall buffer arrays

Hypercall buffer arrays are used when a hypercall takes a variable
length array of buffers.

Signed-off-by: David Vrabel <david.vrabel@citrix.com>
Acked-by: Ian Campbell <ian.campbell@citrix.com>
Reviewed-by: Daniel Kiper <daniel.kiper@oracle.com>
Tested-by: Daniel Kiper <daniel.kiper@oracle.com>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
Reviewed-by: Don Slutz <dslutz@verizon.com>
Tested-by: Don Slutz <dslutz@verizon.com>
Acked-by: Keir Fraser <keir@xen.org>
12 years agokexec crash image when dom0 crashes
David Vrabel [Tue, 12 Nov 2013 10:46:06 +0000 (11:46 +0100)]
kexec crash image when dom0 crashes

Signed-off-by: David Vrabel <david.vrabel@citrix.com>
Reviewed-by: Daniel Kiper <daniel.kiper@oracle.com>
Tested-by: Daniel Kiper <daniel.kiper@oracle.com>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
Reviewed-by: Don Slutz <dslutz@verizon.com>
Tested-by: Don Slutz <dslutz@verizon.com>
Acked-by: Keir Fraser <keir@xen.org>
12 years agokexec: extend hypercall with improved load/unload ops
David Vrabel [Tue, 12 Nov 2013 10:44:41 +0000 (11:44 +0100)]
kexec: extend hypercall with improved load/unload ops

In the existing kexec hypercall, the load and unload ops depend on
internals of the Linux kernel (the page list and code page provided by
the kernel).  The code page is used to transition between Xen context
and the image so using kernel code doesn't make sense and will not
work for PVH guests.

Add replacement KEXEC_CMD_kexec_load and KEXEC_CMD_kexec_unload ops
that no longer require a code page to be provided by the guest -- Xen
now provides the code for calling the image directly.

The new load op looks similar to the Linux kexec_load system call and
allows the guest to provide the image data to be loaded.  The guest
specifies the architecture of the image which may be a 32-bit subarch
of the hypervisor's architecture (i.e., an EM_386 image on an
EM_X86_64 hypervisor).

The toolstack can now load images without kernel involvement.  This is
required for supporting kexec when using a dom0 with an upstream
kernel.

Crash images are copied directly into the crash region on load.
Default images are copied into domheap pages and a list of source and
destination machine addresses is created.  This list is used in
kexec_reloc() to relocate the image to its destination.

The old load and unload sub-ops are still available (as
KEXEC_CMD_load_v1 and KEXEC_CMD_unload_v1) and are implemented on top
of the new infrastructure.

Signed-off-by: David Vrabel <david.vrabel@citrix.com>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
Reviewed-by: Don Slutz <dslutz@verizon.com>
Tested-by: Don Slutz <dslutz@verizon.com>
Reviewed-by: Daniel Kiper <daniel.kiper@oracle.com>
Tested-by: Daniel Kiper <daniel.kiper@oracle.com>
Acked-by: Keir Fraser <keir@xen.org>
12 years agokexec: add infrastructure for handling kexec images
David Vrabel [Tue, 12 Nov 2013 10:41:02 +0000 (11:41 +0100)]
kexec: add infrastructure for handling kexec images

Add the code needed to handle and load kexec images into Xen memory or
into the crash region.  This is needed for the new KEXEC_CMD_load and
KEXEC_CMD_unload hypercall sub-ops.

Much of this code is derived from the Linux kernel.

Signed-off-by: David Vrabel <david.vrabel@citrix.com>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
Reviewed-by: Don Slutz <dslutz@verizon.com>
Tested-by: Don Slutz <dslutz@verizon.com>
Reviewed-by: Daniel Kiper <daniel.kiper@oracle.com>
Tested-by: Daniel Kiper <daniel.kiper@oracle.com>
Acked-by: Keir Fraser <keir@xen.org>
12 years agokexec: add public interface for improved load/unload sub-ops
David Vrabel [Tue, 12 Nov 2013 10:39:29 +0000 (11:39 +0100)]
kexec: add public interface for improved load/unload sub-ops

Add replacement KEXEC_CMD_load and KEXEC_CMD_unload sub-ops to the
kexec hypercall.  These new sub-ops allow a privileged guest to
provide the image data to be loaded into Xen memory or the crash
region instead of guests loading the image data themselves and
providing the relocation code and metadata.

The old interface is provided to guests requesting an interface
version prior to 4.4.

Bump __XEN_LATEST_INTERFACE_VERSION__ to 0x00040400.
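
For readers unfamiliar with the version constant, the sketch below decodes it
and shows the comparison that decides which interface a guest sees.  The
encoding (major in bits 16-23, minor in bits 8-15) and the helper names are
assumptions for illustration; in the public headers the selection is normally
made at compile time by the preprocessor rather than by a function.

```c
#include <stdbool.h>
#include <stdint.h>

#define KEXEC_NEW_ABI_VERSION 0x00040400u   /* value from the commit message */

/* Assumed encoding: 0x00MMmm00 with MM = major and mm = minor. */
static unsigned int iface_major(uint32_t version)
{
    return (version >> 16) & 0xff;
}

static unsigned int iface_minor(uint32_t version)
{
    return (version >> 8) & 0xff;
}

/* Guests asking for anything older than 4.4 keep the old kexec interface. */
static bool guest_gets_old_kexec_abi(uint32_t requested_version)
{
    return requested_version < KEXEC_NEW_ABI_VERSION;
}
```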

Signed-off-by: David Vrabel <david.vrabel@citrix.com>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
Reviewed-by: Don Slutz <dslutz@verizon.com>
Tested-by: Don Slutz <dslutz@verizon.com>
Reviewed-by: Daniel Kiper <daniel.kiper@oracle.com>
Tested-by: Daniel Kiper <daniel.kiper@oracle.com>
Acked-by: Keir Fraser <keir@xen.org>
12 years agox86: give FIX_EFI_MPF its own fixmap entry
David Vrabel [Tue, 12 Nov 2013 10:37:19 +0000 (11:37 +0100)]
x86: give FIX_EFI_MPF its own fixmap entry

FIX_EFI_MPF was the same as FIX_KEXEC_BASE_0 which is going away.  So
add its own entry.

Signed-off-by: David Vrabel <david.vrabel@citrix.com>
Reviewed-by: Daniel Kiper <daniel.kiper@oracle.com>
Tested-by: Daniel Kiper <daniel.kiper@oracle.com>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
Reviewed-by: Don Slutz <dslutz@verizon.com>
Tested-by: Don Slutz <dslutz@verizon.com>
Acked-by: Keir Fraser <keir@xen.org>
12 years agocommon/symbols: Remove print_symbol() and associated infrastructure
Andrew Cooper [Tue, 12 Nov 2013 10:11:30 +0000 (11:11 +0100)]
common/symbols: Remove print_symbol() and associated infrastructure

Also adjust the one common user of print_symbol() to use the new printk()
format.  While adjusting the format string, increase the width so a
long-to-expire plt_overflow() timer doesn't break the column alignment.

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Keir Fraser <keir@xen.org>
12 years agoarm: Replace print_symbol() with new %ps/%pS format
Andrew Cooper [Tue, 12 Nov 2013 10:11:05 +0000 (11:11 +0100)]
arm: Replace print_symbol() with new %ps/%pS format

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Acked-by: Ian Campbell <ian.campbell@citrix.com>
Acked-by: Keir Fraser <keir@xen.org>
12 years agox86: Replace print_symbol() with new %ps/%pS format
Andrew Cooper [Tue, 12 Nov 2013 10:10:35 +0000 (11:10 +0100)]
x86: Replace print_symbol() with new %ps/%pS format

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Keir Fraser <keir@xen.org>
12 years agocommon/vsprintf: Add %ps and %pS format specifier support
Andrew Cooper [Tue, 12 Nov 2013 10:09:12 +0000 (11:09 +0100)]
common/vsprintf: Add %ps and %pS format specifier support

Introduce the %ps and %pS format options for printing a symbol.

  %ps will print the symbol name and optional offset and size
  %pS will print the symbol name and unconditional offset and size
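
The difference between the two specifiers can be sketched with an ordinary
helper.  This assumes "optional" means the offset and size are printed only
when the offset is non-zero, and the name+offset/size layout is likewise an
assumption; the helper is not part of the patch.

```c
#include <stdbool.h>
#include <stdio.h>

/*
 * Format a symbol reference into buf.
 * always == false models %ps: offset/size only when offset is non-zero.
 * always == true  models %pS: offset/size unconditionally.
 */
static int format_symbol(char *buf, size_t len, const char *name,
                         unsigned long offset, unsigned long size,
                         bool always)
{
    if (always || offset != 0)
        return snprintf(buf, len, "%s+0x%lx/0x%lx", name, offset, size);

    return snprintf(buf, len, "%s", name);
}
```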

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Keir Fraser <keir@xen.org>
12 years agocommon/vsprintf: Refactor pointer() out of vsnprintf()
Andrew Cooper [Tue, 12 Nov 2013 10:06:45 +0000 (11:06 +0100)]
common/vsprintf: Refactor pointer() out of vsnprintf()

No functional change.

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Keir Fraser <keir@xen.org>
12 years agocommon/vsprintf: Refactor string() out of vsnprintf()
Andrew Cooper [Tue, 12 Nov 2013 10:06:09 +0000 (11:06 +0100)]
common/vsprintf: Refactor string() out of vsnprintf()

No functional change.

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Keir Fraser <keir@xen.org>
12 years agonuma-sched: leave node-affinity alone if not in "auto" mode
Dario Faggioli [Tue, 12 Nov 2013 09:54:28 +0000 (10:54 +0100)]
numa-sched: leave node-affinity alone if not in "auto" mode

If the domain's NUMA node-affinity is being specified by the
user/toolstack (instead of being automatically computed by Xen),
we really should stick to that. This means domain_update_node_affinity()
is wrong when it filters out some stuff from there even in "!auto"
mode.

This commit fixes that. Of course, this does not mean node-affinity
is always honoured (e.g., a vcpu won't run on a pcpu of a different
cpupool) but the necessary logic for taking into account all the
possible situations lives in the scheduler code, where it belongs.

What could happen without this change is that, under certain
circumstances, the node-affinity of a domain may change when the
user modifies the vcpu-affinity of the domain's vcpus. This, even
if probably not a real bug, is at least something the user does
not expect, so let's avoid it.

Signed-off-by: Dario Faggioli <dario.faggioli@citrix.com>
Reviewed-by: George Dunlap <george.dunlap@eu.citrix.com>
Acked-by: Keir Fraser <keir@xen.org>
12 years agoxen/arm: more info on the ARM ABI
Stefano Stabellini [Thu, 7 Nov 2013 14:52:50 +0000 (14:52 +0000)]
xen/arm: more info on the ARM ABI

Add more information about the exported ARM ABI.

Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
12 years agoxen/midway: Add 1:1 workaround
Julien Grall [Wed, 6 Nov 2013 19:37:15 +0000 (19:37 +0000)]
xen/midway: Add 1:1 workaround

Signed-off-by: Julien Grall <julien.grall@linaro.org>
Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
Acked-by: Ian Campbell <ian.campbell@citrix.com>
12 years agoxl: remove pointless null pointer check
Matthew Daley [Fri, 8 Nov 2013 00:45:11 +0000 (13:45 +1300)]
xl: remove pointless null pointer check

poolinfo is guaranteed non-null here.

Signed-off-by: Matthew Daley <mattjd@gmail.com>
Acked-by: Ian Campbell <ian.campbell@citrix.com>
12 years agolibxc: remove pointless null pointer check
Matthew Daley [Fri, 8 Nov 2013 00:45:10 +0000 (13:45 +1300)]
libxc: remove pointless null pointer check

ctxt_buf is guaranteed non-null here.

Signed-off-by: Matthew Daley <mattjd@gmail.com>
Acked-by: Ian Campbell <ian.campbell@citrix.com>
12 years agoxen/video: remove pointless if subcondition
Matthew Daley [Fri, 8 Nov 2013 00:45:09 +0000 (13:45 +1300)]
xen/video: remove pointless if subcondition

It's already handled just above.

Signed-off-by: Matthew Daley <mattjd@gmail.com>
Acked-by: Ian Campbell <ian.campbell@citrix.com>
12 years agoxen/arm: remove pointless if subcondition
Matthew Daley [Fri, 8 Nov 2013 00:45:08 +0000 (13:45 +1300)]
xen/arm: remove pointless if subcondition

It's already handled just above.

Signed-off-by: Matthew Daley <mattjd@gmail.com>
Acked-by: Ian Campbell <ian.campbell@citrix.com>
12 years agolibxl: correct strtod error check
Matthew Daley [Fri, 8 Nov 2013 00:32:58 +0000 (13:32 +1300)]
libxl: correct strtod error check

Signed-off-by: Matthew Daley <mattjd@gmail.com>
Acked-by: Ian Campbell <ian.campbell@citrix.com>
12 years agoxen/arm: Device Tree cpu clock-frequency
Jon Fraser [Thu, 7 Nov 2013 23:50:28 +0000 (18:50 -0500)]
xen/arm: Device Tree cpu clock-frequency

When creating CPU device tree properties, copy the
clock-frequency if present.

Quiets annoying messages from the Linux kernel:
"/cpus/cpu@0 missing clock-frequency property"

Signed-off-by: Jon Fraser <jfraser@broadcom.com>
Acked-by: Ian Campbell <ian.campbell@citrix.com>
12 years agooxenstored: allow updates regardless of quota
Zheng Li [Thu, 31 Oct 2013 16:32:56 +0000 (16:32 +0000)]
oxenstored: allow updates regardless of quota

Allow a domain to update existing xenstore keys even if it has already reached
its max entries limit.

As updating an existing key won't increase the number of entries belonging to a
domain, we should avoid checking the max entries limit prematurely. The patch
addresses this issue in the following functions: write/add, mkdir, setperms.
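
The rule can be sketched as below.  oxenstored itself is written in OCaml;
this C version, including the structure and function names, is only an
illustration of the ordering of the checks.

```c
#include <stdbool.h>

/* Hypothetical per-domain quota bookkeeping. */
struct quota_state {
    unsigned int entries;       /* keys currently owned by the domain */
    unsigned int max_entries;   /* configured limit */
};

/* Decide whether a write may proceed, counting only newly created keys. */
static bool write_allowed(struct quota_state *q, bool key_exists)
{
    if (key_exists)
        return true;            /* update: the entry count does not change */

    if (q->entries >= q->max_entries)
        return false;           /* a new key would exceed the quota */

    q->entries++;
    return true;
}
```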

Signed-off-by: Zheng Li <zheng.li@eu.citrix.com>
Acked-by: Ian Campbell <ian.campbell@citrix.com>
12 years agolibxl: ocaml: use CAMLlocal1 macro rather than value-type in auto-generated C-code
Rob Hoes [Wed, 6 Nov 2013 17:50:04 +0000 (17:50 +0000)]
libxl: ocaml: use CAMLlocal1 macro rather than value-type in auto-generated C-code

Signed-off-by: Rob Hoes <rob.hoes@citrix.com>
Acked-by: David Scott <dave.scott@eu.citrix.com>
Acked-by: Ian Campbell <ian.campbell@citrix.com>
12 years agolibxl: ocaml: provide defaults for libxl types
Rob Hoes [Wed, 6 Nov 2013 17:50:03 +0000 (17:50 +0000)]
libxl: ocaml: provide defaults for libxl types

Libxl functions such as libxl_domain_create_new take large structs
of configuration parameters. Often, we would like to use the default
values for many of these parameters.

The struct and keyed-union types in libxl have init functions, which
fill in the defaults for a given type. This commit provides an OCaml
interface to obtain records of defaults by calling the relevant init
function.

These default records can be used as a base to construct your own
records, and to selectively override parameters where needed.

For example, a Domain_create_info record can now be created as follows:

  Xenlight.Domain_create_info.({ default ctx () with
    ty = Xenlight.DOMAIN_TYPE_PV;
    name = Some vm_name;
    uuid = vm_uuid;
  })

For types with KeyedUnion fields, such as Domain_build_info, a record
with defaults is obtained by specifying the type key:

  Xenlight.Domain_build_info.default ctx ~ty:Xenlight.DOMAIN_TYPE_HVM ()

Signed-off-by: Rob Hoes <rob.hoes@citrix.com>
Acked-by: David Scott <dave.scott@eu.citrix.com>
Acked-by: Ian Campbell <ian.campbell@citrix.com>
12 years agolibxl: ocaml: in send_debug_keys, clean up before raising exception
Rob Hoes [Wed, 6 Nov 2013 17:50:02 +0000 (17:50 +0000)]
libxl: ocaml: in send_debug_keys, clean up before raising exception

Signed-off-by: Rob Hoes <rob.hoes@citrix.com>
Acked-by: David Scott <dave.scott@eu.citrix.com>
Acked-by: Ian Campbell <ian.campbell@citrix.com>
12 years agolibxl: ocaml: add PCI device helper functions
Rob Hoes [Wed, 6 Nov 2013 17:49:59 +0000 (17:49 +0000)]
libxl: ocaml: add PCI device helper functions

Signed-off-by: Rob Hoes <rob.hoes@citrix.com>
Acked-by: David Scott <dave.scott@eu.citrix.com>
Acked-by: Ian Campbell <ian.campbell@citrix.com>
12 years agolibxl: ocaml: add NIC helper functions
Rob Hoes [Wed, 6 Nov 2013 17:49:58 +0000 (17:49 +0000)]
libxl: ocaml: add NIC helper functions

Signed-off-by: Rob Hoes <rob.hoes@citrix.com>
Acked-by: David Scott <dave.scott@eu.citrix.com>
Acked-by: Ian Campbell <ian.campbell@citrix.com>
12 years agolibxl: ocaml: add dominfo_list and dominfo_get
Rob Hoes [Wed, 6 Nov 2013 17:49:54 +0000 (17:49 +0000)]
libxl: ocaml: add dominfo_list and dominfo_get

Signed-off-by: Rob Hoes <rob.hoes@citrix.com>
Acked-by: Ian Jackson <ian.jackson@eu.citrix.com>
Acked-by: David Scott <dave.scott@eu.citrix.com>
12 years agolibxl: ocaml: use the "string option" type for IDL strings
Rob Hoes [Wed, 6 Nov 2013 17:49:53 +0000 (17:49 +0000)]
libxl: ocaml: use the "string option" type for IDL strings

The libxl IDL is based on C type "char *", and therefore "strings" can
be NULL, or be an actual string. In ocaml, it is common to encode such
things as option types.

Signed-off-by: Rob Hoes <rob.hoes@citrix.com>
Acked-by: David Scott <dave.scott@eu.citrix.com>
Acked-by: Ian Campbell <ian.campbell@citrix.com>
12 years agolibxl: ocaml: fix the handling of enums in the bindings generator
Rob Hoes [Wed, 6 Nov 2013 17:49:52 +0000 (17:49 +0000)]
libxl: ocaml: fix the handling of enums in the bindings generator

Signed-off-by: Rob Hoes <rob.hoes@citrix.com>
Acked-by: Ian Campbell <ian.campbell@citrix.com>
Acked-by: Ian Jackson <ian.jackson@eu.citrix.com>
Acked-by: David Scott <dave.scott@eu.citrix.com>
12 years agolibxl: ocaml: add domain_build/create_info/config and events to the bindings.
Rob Hoes [Wed, 6 Nov 2013 17:49:51 +0000 (17:49 +0000)]
libxl: ocaml: add domain_build/create_info/config and events to the bindings.

We now have enough infrastructure in place to do this trivially.

Signed-off-by: Ian Campbell <ian.campbell@citrix.com>
Signed-off-by: Rob Hoes <rob.hoes@citrix.com>
Acked-by: Ian Jackson <ian.jackson@eu.citrix.com>
Acked-by: David Scott <dave.scott@eu.citrix.com>
12 years agolibxl: ocaml: make Val_defbool GC-proof
Rob Hoes [Wed, 6 Nov 2013 17:49:50 +0000 (17:49 +0000)]
libxl: ocaml: make Val_defbool GC-proof

In order to prevent newly created OCaml values from being GC'ed, they must be
registered as roots with the GC, before an iteration of the GC may happen. The
Val_* functions potentially allocate new values on the OCaml heap, and may
trigger an iteration of the OCaml GC.

The way to register a value with the GC is to assign it to a variable declared
with a CAMLparam or CAMLlocal macro, which put the value into a struct that
can be reached from a GC root.

This leads to slightly weird looking C code, but avoids hard to find segfaults.

Signed-off-by: Rob Hoes <rob.hoes@citrix.com>
Acked-by: David Scott <dave.scott@eu.citrix.com>
Acked-by: Ian Campbell <ian.campbell@citrix.com>
12 years agolibxl: ocaml: propagate the libxl return error code in exceptions
Rob Hoes [Wed, 6 Nov 2013 17:49:49 +0000 (17:49 +0000)]
libxl: ocaml: propagate the libxl return error code in exceptions

Signed-off-by: Ian Campbell <ian.campbell@citrix.com>
Signed-off-by: Rob Hoes <rob.hoes@citrix.com>
Acked-by: David Scott <dave.scott@eu.citrix.com>
12 years agolibxl: ocaml: generate string_of_* functions for enums
Rob Hoes [Wed, 6 Nov 2013 17:49:48 +0000 (17:49 +0000)]
libxl: ocaml: generate string_of_* functions for enums

Signed-off-by: Rob Hoes <rob.hoes@citrix.com>
Acked-by: David Scott <dave.scott@eu.citrix.com>
Acked-by: Ian Campbell <ian.campbell@citrix.com>

12 years agolibxl: make the libxl error type an IDL enum
Rob Hoes [Wed, 6 Nov 2013 17:49:47 +0000 (17:49 +0000)]
libxl: make the libxl error type an IDL enum

This makes it easier to use in language bindings.

Signed-off-by: Rob Hoes <rob.hoes@citrix.com>
Acked-by: Ian Campbell <ian.campbell@citrix.com>
12 years agolibxl: idl: add Enumeration.value_namespace property
Rob Hoes [Wed, 6 Nov 2013 17:49:46 +0000 (17:49 +0000)]
libxl: idl: add Enumeration.value_namespace property

This allows setting the namespace for values of an Enumeration to be different
from the namespace of the Enumeration itself.
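
A minimal sketch of the idea, assuming a naming scheme of
<value namespace><ENUM>_<VALUE>; the class below is a toy stand-in, not
the real IDL class, and the default "libxl_" prefix is an assumption:

```python
class Enumeration:
    """Toy stand-in for an IDL enumeration (hypothetical, for illustration)."""

    def __init__(self, name, values, namespace="libxl_", value_namespace=None):
        self.name = name
        self.values = values
        self.namespace = namespace
        # Values share the type's namespace unless one is given explicitly.
        if value_namespace is None:
            value_namespace = namespace
        self.value_namespace = value_namespace

    def c_type(self):
        return self.namespace + self.name

    def c_values(self):
        prefix = self.value_namespace.upper()
        return [prefix + (self.name + "_" + v).upper() for _, v in self.values]
```

With an empty value namespace the type keeps its prefix while the values
lose theirs.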

Signed-off-by: Rob Hoes <rob.hoes@citrix.com>
Acked-by: Ian Campbell <ian.campbell@citrix.com>
12 years agolibxl: ocaml: switch all functions over to take a context.
Rob Hoes [Wed, 6 Nov 2013 17:49:45 +0000 (17:49 +0000)]
libxl: ocaml: switch all functions over to take a context.

Since the context has a logger we can get rid of the logger built into these
bindings and use the xentoollog bindings instead.

The gc is of limited use when most things are freed with libxl_FOO_dispose,
so get rid of that too.

Signed-off-by: Ian Campbell <ian.campbell@citrix.com>
Signed-off-by: Rob Hoes <rob.hoes@citrix.com>
Acked-by: Ian Jackson <ian.jackson@eu.citrix.com>
Acked-by: David Scott <dave.scott@eu.citrix.com>
12 years agolibxl: ocaml: allocate a long lived libxl context.
Rob Hoes [Wed, 6 Nov 2013 17:49:44 +0000 (17:49 +0000)]
libxl: ocaml: allocate a long lived libxl context.

Rather than allocating a new context for every libxl call begin to
switch to a model where a context is allocated by the caller and may
then be used for multiple calls down into the library.

Signed-off-by: Ian Campbell <ian.campbell@citrix.com>
Signed-off-by: Rob Hoes <rob.hoes@citrix.com>
Acked-by: David Scott <dave.scott@eu.citrix.com>
12 years agolibxc: ocaml: add simple binding for xentoollog (output only).
Rob Hoes [Wed, 6 Nov 2013 17:49:43 +0000 (17:49 +0000)]
libxc: ocaml: add simple binding for xentoollog (output only).

These bindings allow ocaml code to receive log messages via xentoollog
but do not support injecting messages into xentoollog from ocaml.
Receiving log messages from libx{c,l} and forwarding them to ocaml is
the use case which is needed by the following patches.

Add a simple noddy test case (tools/ocaml/test).

Signed-off-by: Ian Campbell <ian.campbell@citrix.com>
Signed-off-by: Rob Hoes <rob.hoes@citrix.com>
Acked-by: David Scott <dave.scott@eu.citrix.com>
[ ijc -- dropped the xtl test harness, it failed to link ]

12 years agolibxl: ocaml: add some more builtin types.
Rob Hoes [Wed, 6 Nov 2013 17:49:42 +0000 (17:49 +0000)]
libxl: ocaml: add some more builtin types.

  * bitmaps
  * string_list
  * key_value_list
  * cpuid_policy_list (left "empty" for now)

None of these are used yet, so no change to the generated code.

Bitmap_val requires a ctx, so leave it as an abort for now.

Signed-off-by: Ian Campbell <ian.campbell@citrix.com>
Signed-off-by: Rob Hoes <rob.hoes@citrix.com>
Acked-by: David Scott <dave.scott@eu.citrix.com>
12 years agolibxl: ocaml: support for KeyedUnion in the bindings generator.
Rob Hoes [Wed, 6 Nov 2013 17:49:41 +0000 (17:49 +0000)]
libxl: ocaml: support for KeyedUnion in the bindings generator.

A KeyedUnion consists of two fields in the containing struct. First an
enum field ("e") used as a discriminator and second a union ("u")
containing potentially anonymous structs associated with each enum
value.

We map the anonymous structs to structs named after the discriminator
field ("e") and the specific enum values. We then declare an ocaml
variant type named e__union mapping each enum value to its associated
struct.

So given IDL:

foo = Enumeration("foo", [
    (0, "BAR"),
    (1, "BAZ"),
])
s = Struct("s", [
    ("u", KeyedUnion(none, foo, "blargle", [
        ("bar", Struct(...xxx...)),
        ("baz", Struct(...yyy...)),
    ])),
])

We generate C:

enum foo { BAR, BAZ };
struct s {
    enum foo blargle;
    union {
        struct { ...xxx... } bar;
        struct { ...yyy... } baz;
    } u;
};

and map this to ocaml

type foo = BAR | BAZ;

module S = struct

    type blargle_bar = ...xxx...;

    type blargle_baz = ...yyy...;

    type blargle__union = Bar of blargle_bar | Baz of blargle_baz;

    type t =
    {
        blargle : blargle__union;
    }
end

These type names are OK because they are already within the namespace
associated with the struct "s".

If the struct associated with bar is empty then we don't bother with
blargle_bar or "of blargle_bar".

No actual change in the generated code since we don't generate any
KeyedUnions yet.

The actual implementation was inspired by
http://www.linux-nantes.org/~fmonnier/ocaml/ocaml-wrapping-c.php#ref_constvrnt
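
The naming scheme described above could be sketched roughly as follows.
This is an illustration only; gen_union_type and its argument shape are
hypothetical, not the generator's real interface:

```python
def gen_union_type(keyvar, fields):
    """Build the OCaml variant declaration for a keyed union.

    keyvar: name of the discriminator field (e.g. "blargle").
    fields: list of (name, has_members) pairs, one per enum value.
    """
    cases = []
    for name, has_members in fields:
        ctor = name.capitalize()
        if has_members:
            cases.append("%s of %s_%s" % (ctor, keyvar, name))
        else:
            # Empty struct: a constant constructor with no payload type.
            cases.append(ctor)
    return "type %s__union = %s" % (keyvar, " | ".join(cases))
```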

Signed-off-by: Ian Campbell <ian.campbell@citrix.com>
Signed-off-by: Rob Hoes <rob.hoes@citrix.com>
Acked-by: David Scott <dave.scott@eu.citrix.com>
12 years agolibxl: ocaml: avoid reserved words in type and field names.
Rob Hoes [Wed, 6 Nov 2013 17:49:40 +0000 (17:49 +0000)]
libxl: ocaml: avoid reserved words in type and field names.

Do this by adding a "xl_" prefix to all names that are OCaml keywords.
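
A minimal sketch of such a renaming step; munge_name is a hypothetical
name here, and the keyword set is deliberately partial:

```python
# Partial list of OCaml keywords, for illustration only.
OCAML_KEYWORDS = {
    "and", "as", "begin", "class", "do", "done", "else", "end",
    "exception", "external", "for", "fun", "function", "if", "in",
    "let", "match", "method", "module", "mutable", "new", "object",
    "of", "open", "private", "rec", "sig", "struct", "then", "to",
    "try", "type", "val", "virtual", "when", "while", "with",
}

def munge_name(name):
    """Prefix names that collide with an OCaml keyword; leave others alone."""
    return "xl_" + name if name in OCAML_KEYWORDS else name
```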

Signed-off-by: Rob Hoes <rob.hoes@citrix.com>
Acked-by: Ian Campbell <ian.campbell@citrix.com>
12 years agolibxl: ocaml: support for Arrays in bindings generator.
Rob Hoes [Wed, 6 Nov 2013 17:49:39 +0000 (17:49 +0000)]
libxl: ocaml: support for Arrays in bindings generator.

No change in generated code because no arrays are currently generated.

Signed-off-by: Ian Campbell <ian.campbell@citrix.com>
Signed-off-by: Rob Hoes <rob.hoes@citrix.com>
Acked-by: David Scott <dave.scott@eu.citrix.com>
12 years agotools: support system supplied ovmf binary
Ian Campbell [Tue, 29 Oct 2013 11:39:50 +0000 (11:39 +0000)]
tools: support system supplied ovmf binary

Debian Jessie at least contains an ovmf package that includes
/usr/share/ovmf/OVMF.fd. It's also possible that user may want to supply
his/her own ovmf binary.

Signed-off-by: Ian Campbell <ian.campbell@citrix.com>
Signed-off-by: Wei Liu <wei.liu2@citrix.com>
12 years agotools: clone ovmf to ovmf-dir directory
Ian Campbell [Tue, 29 Oct 2013 11:39:49 +0000 (11:39 +0000)]
tools: clone ovmf to ovmf-dir directory

for consistency with other foo-dir e.g. qemu, seabios.

Remove obsolete ovmf-find target.

Signed-off-by: Ian Campbell <ian.campbell@citrix.com>
Signed-off-by: Wei Liu <wei.liu2@citrix.com>
12 years agoConfig.mk: update OVMF changeset
Wei Liu [Tue, 29 Oct 2013 11:39:48 +0000 (11:39 +0000)]
Config.mk: update OVMF changeset

Signed-off-by: Wei Liu <wei.liu2@citrix.com>
Acked-by: Ian Campbell <ian.campbell@citrix.com>
12 years agolibxl: use macro GCNEW in libxl_qmp.c
Kelley Nielsen [Mon, 11 Nov 2013 10:27:54 +0000 (02:27 -0800)]
libxl: use macro GCNEW in libxl_qmp.c

The new coding style uses the convenience macro GCNEW as declared in
libxl_internal.h. Substitute an invocation of this macro for its
body at the one place it occurs in libxl_qmp.c.

Suggested-by: Anthony PERARD <anthony.perard@citrix.com>
Signed-off-by: Kelley Nielsen <kelleynnn@gmail.com>
Acked-by: Ian Campbell <ian.campbell@citrix.com>
12 years agolibxl: use macro CTX in libxl_qmp.c
Kelley Nielsen [Mon, 11 Nov 2013 10:08:58 +0000 (02:08 -0800)]
libxl: use macro CTX in libxl_qmp.c

The new coding style uses the convenience macro CTX as declared in
libxl_internal.h. Substitute an invocation of this macro for its body
at the one place it occurs in libxl_qmp.c.

Suggested-by: Anthony PERARD <anthony.perard@citrix.com>
Signed-off-by: Kelley Nielsen <kelleynnn@gmail.com>
Acked-by: Ian Campbell <ian.campbell@citrix.com>
12 years agolibxl: add convenience macros to qmp_send() in libxl_qmp.c
Kelley Nielsen [Mon, 11 Nov 2013 10:08:57 +0000 (02:08 -0800)]
libxl: add convenience macros to qmp_send() in libxl_qmp.c

Update qmp_send() in libxl_qmp.c to use the new convenience macros
declared in libxl_internal.h. Uses GC_INIT at the top of the function,
and GC_FREE at the exit. Since GC_INIT returns a libxl__gc by reference
and not by value, remove the address operator from the left of the
variable gc where it is passed as a parameter.

Suggested-by: Anthony PERARD <anthony.perard@citrix.com>
Signed-off-by: Kelley Nielsen <kelleynnn@gmail.com>
Acked-by: Ian Campbell <ian.campbell@citrix.com>
12 years agolibxl: macro LOG() used in place of LIBXL__LOG in libxl_qmp.c
Kelley Nielsen [Sun, 10 Nov 2013 03:05:05 +0000 (19:05 -0800)]
libxl: macro LOG() used in place of LIBXL__LOG in libxl_qmp.c

Code cleanup -- no functional changes

Coding style has recently been changed for libxl. The convenience macro
LOG() has been introduced, and it is intended that calls to the old
macro LIBXL__LOG() be replaced with it. Change 7 occurrences of the old
macro (in functions that have a local libxl__gc *gc) to the new one.

Signed-off-by: Kelley Nielsen <kelleynnn@gmail.com>
Acked-by: Ian Campbell <ian.campbell@citrix.com>
12 years agolibxl: Use new LOG* and GCSPRINTF macros in tools/libxl/libxl_device.c
Alexandra Sava [Mon, 11 Nov 2013 09:00:39 +0000 (11:00 +0200)]
libxl: Use new LOG* and GCSPRINTF macros in tools/libxl/libxl_device.c

Replace libxl__sprintf, LIBXL__LOG and LIBXL__LOG_ERRNO with new
"Convenience macros": GCSPRINTF, LOG, LOGE in tools/libxl/libxl_device.c

Signed-off-by: Alexandra Sava <alexandrasava18@gmail.com>
Acked-by: Ian Campbell <ian.campbell@citrix.com>
12 years agox86/idle: reduce contention on ACPI register accesses
Jan Beulich [Mon, 11 Nov 2013 10:01:04 +0000 (11:01 +0100)]
x86/idle: reduce contention on ACPI register accesses

Other than when they're located in I/O port space, accessing them when
in MMIO space (currently) implies usage of some sort of global lock: In
-unstable this would be due to the use of vmap(), in older trees the
necessary locking was introduced by 2ee9cbf9 ("ACPI: fix
acpi_os_map_memory()"). This contention was observed to result in Dom0
kernel soft lockups during the loading of the ACPI processor driver
there on systems with very many CPU cores.

There are a couple of things being done for this:
- re-order elements of an if() condition so that the register access
  only happens when we really need it
- turn off arbitration disabling only when the first CPU leaves C3
  (paralleling how arbitration disabling gets turned on)
- only set the (global) bus master reload flag once (when the first
  target CPU gets processed)

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Keir Fraser <keir@xen.org>
12 years agox86/Intel: don't probe CPUID faulting on family 0xf CPUs
Jan Beulich [Mon, 11 Nov 2013 10:00:21 +0000 (11:00 +0100)]
x86/Intel: don't probe CPUID faulting on family 0xf CPUs

These are known to not support the feature, so we can save ourselves
from emitting the resulting #GP fault recovery related message (which
might worry people looking at the logs).

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Liu Jinsong <jinsong.liu@intel.com>
Acked-by: Keir Fraser <keir@xen.org>
12 years agonested VMX: VMLAUNCH/VMRESUME emulation must check permission first thing
Jan Beulich [Mon, 11 Nov 2013 08:15:04 +0000 (09:15 +0100)]
nested VMX: VMLAUNCH/VMRESUME emulation must check permission first thing

Otherwise uninitialized data may be used, leading to crashes.

This is CVE-2013-4551 / XSA-75.

Reported-and-tested-by: Jeff Zimmerman <Jeff_Zimmerman@McAfee.com>
Signed-off-by: Jan Beulich <jbeulich@suse.com>
Reviewed-and-tested-by: Andrew Cooper <andrew.cooper3@citrix.com>
Acked-by: Ian Campbell <ian.campbell@citrix.com>
12 years agomini-os: remove repeated statement in netfront_input
Matthew Daley [Fri, 8 Nov 2013 10:10:08 +0000 (11:10 +0100)]
mini-os: remove repeated statement in netfront_input

Signed-off-by: Matthew Daley <mattjd@gmail.com>
Acked-by: Samuel Thibault <samuel.thibault@ens-lyon.org>
12 years agonestedsvm: remove pointless BUG_ON
Matthew Daley [Fri, 8 Nov 2013 10:09:37 +0000 (11:09 +0100)]
nestedsvm: remove pointless BUG_ON

It's already handled just above.

Signed-off-by: Matthew Daley <mattjd@gmail.com>
12 years agox86/EFI: drop redundant newlines from blexit() argument strings
Jan Beulich [Fri, 8 Nov 2013 10:09:06 +0000 (11:09 +0100)]
x86/EFI: drop redundant newlines from blexit() argument strings

The function issues a newline itself.

Also correct two slightly misplaced __initdata annotations.

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
Acked-by: Keir Fraser <keir@xen.org>
12 years agox86/EFI: make trampoline allocation more flexible
Jan Beulich [Fri, 8 Nov 2013 10:08:32 +0000 (11:08 +0100)]
x86/EFI: make trampoline allocation more flexible

Certain UEFI implementations reserve all memory below 1MB at boot time,
making it impossible to properly allocate the chunk necessary for the
trampoline. Fall back to simply grabbing a chunk from EfiBootServices*
regions immediately prior to calling ExitBootServices().

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Keir Fraser <keir@xen.org>
12 years agox86/hvm: fix restart of RTC periodic timer with vpt_align=1
Kouya Shimura [Fri, 8 Nov 2013 10:07:14 +0000 (11:07 +0100)]
x86/hvm: fix restart of RTC periodic timer with vpt_align=1

The commit 58afa7ef "x86/hvm: Run the RTC periodic timer on a
consistent time series" aligns the RTC periodic timer to the VM's boot time.
However, it's aligned later again to the system time in create_periodic_time()
with vpt_align=1. The next tick might be skipped.

Signed-off-by: Kouya Shimura <kouya@jp.fujitsu.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Tim Deegan <tim@xen.org>
12 years agox86/msi: Refactor msi_compose_message() to not require an irq_desc
Andrew Cooper [Thu, 7 Nov 2013 14:17:48 +0000 (15:17 +0100)]
x86/msi: Refactor msi_compose_message() to not require an irq_desc

Subsequent changes will cause HPET MSIs to not have an associated IRQ.

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Tim Deegan <tim@xen.org>
12 years agox86/hpet: Fix ambiguity in broadcast info message
Andrew Cooper [Thu, 7 Nov 2013 14:15:45 +0000 (15:15 +0100)]
x86/hpet: Fix ambiguity in broadcast info message

"$N will be used for broadcast" is ambiguous between "$N timers" or "timer
$N", particularly when N is 0.

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Reviewed-by: Tim Deegan <tim@xen.org>
12 years agox86/acpi: Warn about multiple HPET tables
Andrew Cooper [Thu, 7 Nov 2013 14:15:28 +0000 (15:15 +0100)]
x86/acpi: Warn about multiple HPET tables

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Reviewed-by: Tim Deegan <tim@xen.org>
12 years agoMAINTAINERS: remove Jacob Shin
Jan Beulich [Wed, 6 Nov 2013 09:33:17 +0000 (10:33 +0100)]
MAINTAINERS: remove Jacob Shin

As was requested by him, now that he left AMD.

Signed-off-by: Jan Beulich <jbeulich@suse.com>
12 years agocall sched_destroy_domain before cpupool_rm_domain
Nathan Studer [Wed, 6 Nov 2013 09:21:09 +0000 (10:21 +0100)]
call sched_destroy_domain before cpupool_rm_domain

The domain destruction code removes a domain from its cpupool
before attempting to destroy its scheduler information.  Since
the scheduler framework uses the domain's cpupool information
to decide on which scheduler ops to use, this results in the
wrong scheduler's destroy domain function being called
when the cpupool scheduler and the initial scheduler are
different.

Correct this by destroying the domain's scheduling information
before removing it from the pool.
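
The ordering problem can be shown with a toy model. This is not Xen
code; all names are illustrative, and it assumes that a domain without a
cpupool falls back to the initial scheduler when its ops are looked up:

```python
INITIAL_SCHEDULER = "initial"

class Domain:
    def __init__(self, pool_scheduler):
        # Scheduler of the cpupool the domain currently belongs to.
        self.pool_scheduler = pool_scheduler

def scheduler_for(domain):
    # Assumed lookup rule: no pool means the initial scheduler is used.
    return domain.pool_scheduler or INITIAL_SCHEDULER

def cpupool_rm_domain(domain):
    domain.pool_scheduler = None

def sched_destroy_domain(domain):
    """Return the scheduler whose destroy hook would be invoked."""
    return scheduler_for(domain)

def destroy_old_order(domain):
    cpupool_rm_domain(domain)
    return sched_destroy_domain(domain)   # wrong scheduler is chosen

def destroy_new_order(domain):
    used = sched_destroy_domain(domain)   # pool scheduler still known
    cpupool_rm_domain(domain)
    return used
```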

Signed-off-by: Nathan Studer <nate.studer@dornerworks.com>
Reviewed-by: Juergen Gross <juergen.gross@ts.fujitsu.com>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
Reviewed-by: George Dunlap <george.dunlap@eu.citrix.com>
Acked-by: Keir Fraser <keir@xen.org>
12 years agoevtchn: don't lose pending state if FIFO event array page is missing
David Vrabel [Wed, 6 Nov 2013 09:20:24 +0000 (10:20 +0100)]
evtchn: don't lose pending state if FIFO event array page is missing

When the FIFO-based ABI is in use, if an event is bound when the
corresponding event array page is missing any attempt to set the event
pending will lose the event (because there is nowhere to write the
pending state).

This wasn't initially considered an issue because guests were expected
to only bind events once they had expanded the event array, however:

1. A domain may start with events already bound (by the toolstack).

2. The guest does not know what the port number will be until the
   event is bound (it doesn't know how many already bound events there
   are), so it does not know how many event array pages are required.
   This makes it difficult to expand in advance (the current Linux
   implementation expands after binding for example).

To prevent pending events from being lost because there is no array
page, temporarily store the pending state in evtchn->pending.  When an
array page is added, use this state to set the port as pending.
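
A toy model of this behavior (not the hypervisor code). The class and
its fields are hypothetical, and 1024 events per array page is an
assumption made for the sketch:

```python
EVENTS_PER_PAGE = 1024  # assumed: 4 KiB page of 32-bit event words

class FifoEventArray:
    def __init__(self):
        self.pages = []    # each page is a list of pending flags
        self.held = set()  # ports whose pending state has no page yet

    def set_pending(self, port):
        page, slot = divmod(port, EVENTS_PER_PAGE)
        if page < len(self.pages):
            self.pages[page][slot] = True
        else:
            # No array page to write to: hold the state instead of losing it.
            self.held.add(port)

    def add_array_page(self):
        index = len(self.pages)
        self.pages.append([False] * EVENTS_PER_PAGE)
        # Replay held pending state for ports that live on the new page.
        for port in [p for p in self.held if p // EVENTS_PER_PAGE == index]:
            self.pages[index][port % EVENTS_PER_PAGE] = True
            self.held.discard(port)

    def is_pending(self, port):
        page, slot = divmod(port, EVENTS_PER_PAGE)
        return page < len(self.pages) and self.pages[page][slot]
```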

Signed-off-by: David Vrabel <david.vrabel@citrix.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Keir Fraser <keir@xen.org>
12 years agoMAINTAINERS: Add FIFO-based event channel ABI maintainer
David Vrabel [Wed, 6 Nov 2013 09:20:07 +0000 (10:20 +0100)]
MAINTAINERS: Add FIFO-based event channel ABI maintainer

Suggested-by: Jan Beulich <jbeulich@suse.com>
Signed-off-by: David Vrabel <david.vrabel@citrix.com>
Acked-by: Keir Fraser <keir@xen.org>
12 years agoVMX: flush cache when vmentry back to UC guest
Liu Jinsong [Wed, 6 Nov 2013 09:13:20 +0000 (10:13 +0100)]
VMX: flush cache when vmentry back to UC guest

This patch flushes the cache on vmentry back to a UC guest, to prevent
the cache from being polluted by hypervisor accesses to guest memory
during UC mode.

The elegant way to do this would be to simply add a wbinvd just before
vmentry. However, a wbinvd before vmentry currently triggers a
mysterious LAPIC timer interrupt storm, hanging the boot stage for
10s ~ 60s. The root cause of the interrupt storm has not been found yet,
so for now this patch adds a flag indicating that the hypervisor has
accessed UC guest memory, to prevent the interrupt storm, though it
still leaves some aspects unaddressed, e.g. speculative reads and
multi-vCPU issues.

Once the interrupt storm has been root-caused and fixed, the protection
flag can be removed; that would be the final, clean and elegant approach
to cache flushing before vmentry.

This is CVE-2013-2212 / XSA-60.

Suggested-by: Jan Beulich <jbeulich@suse.com>
Suggested-by: Andrew Cooper <andrew.cooper3@citrix.com>
Signed-off-by: Liu Jinsong <jinsong.liu@intel.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Jun Nakajima <jun.nakajima@intel.com>
12 years agoVMX: fix cr0.cd handling
Liu Jinsong [Wed, 6 Nov 2013 09:12:36 +0000 (10:12 +0100)]
VMX: fix cr0.cd handling

This patch fixes the XSA-60 security hole:
1. For a guest without VT-d, and for a guest with VT-d but snooped, Xen
needs to do nothing, since the hardware snoop mechanism ensures cache
coherency.

2. For a guest with VT-d but non-snooped, cache coherency cannot be
guaranteed by hardware snooping, so Xen needs to emulate the UC type for
the guest:
2.1). if it works with Intel EPT, set the guest IA32_PAT fields to UC so
that guest memory types are all UC.
2.2). if it works with shadow, drop all shadows so that any new ones are
created on demand with UC.

This patch also fixes a bug in the shadow cr0.cd setting. Shadow
currently has a small window between cache flush and TLB invalidation,
resulting in possible cache pollution. This patch pauses the vcpus so
that no vcpu context is involved in the window.

This is CVE-2013-2212 / XSA-60.

Signed-off-by: Liu Jinsong <jinsong.liu@intel.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Tim Deegan <tim@xen.org>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
Acked-by: Jun Nakajima <jun.nakajima@intel.com>
Acked-by: Keir Fraser <keir@xen.org>
12 years agoVMX: remove the problematic set_uc_mode logic
Liu Jinsong [Wed, 6 Nov 2013 09:12:00 +0000 (10:12 +0100)]
VMX: remove the problematic set_uc_mode logic

The XSA-60 security hole comes from the problematic vmx_set_uc_mode.
This patch removes the vmx_set_uc_mode logic, which will be replaced by
a PAT approach in a later patch.

This is CVE-2013-2212 / XSA-60.

Signed-off-by: Liu Jinsong <jinsong.liu@intel.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
Reviewed-by: Tim Deegan <tim@xen.org>
Acked-by: Jun Nakajima <jun.nakajima@intel.com>
12 years agoVMX: disable EPT when !cpu_has_vmx_pat
Liu Jinsong [Wed, 6 Nov 2013 09:11:18 +0000 (10:11 +0100)]
VMX: disable EPT when !cpu_has_vmx_pat

Recently Oracle developers found a Xen security issue, a DoS, named
XSA-60. Please refer to http://xenbits.xen.org/xsa/advisory-60.html
Basically it involves how the guest cr0.cd setting is handled, which in
some environments consumes so much time that it results in DoS-like
behavior.

This is a preparatory patch for fixing XSA-60. A later patch will fix
XSA-60 via PAT in the Intel EPT case, which depends on cpu_has_vmx_pat.

This is CVE-2013-2212 / XSA-60.

Signed-off-by: Liu Jinsong <jinsong.liu@intel.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
Reviewed-by: Tim Deegan <tim@xen.org>
Acked-by: Jun Nakajima <jun.nakajima@intel.com>
12 years agox86/viridian: TSC and APIC Frequency MSRs
Paul Durrant [Wed, 6 Nov 2013 09:04:56 +0000 (10:04 +0100)]
x86/viridian: TSC and APIC Frequency MSRs

These viridian MSRs are read-only sources of the TSC and APIC frequency (in
units of Hz).

Signed-off-by: Paul Durrant <paul.durrant@citrix.com>
Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
12 years agox86/viridian: Time Reference Count MSR
Paul Durrant [Wed, 6 Nov 2013 09:04:12 +0000 (10:04 +0100)]
x86/viridian: Time Reference Count MSR

This viridian MSR is a read-only source of time (in units of 100ns) since the
domain started.
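
The unit conversion amounts to the following (a sketch only;
time_ref_count is a hypothetical helper and assumes the elapsed time is
available in nanoseconds):

```python
def time_ref_count(ns_since_domain_start):
    """Convert elapsed nanoseconds into 100ns reference-counter units."""
    return ns_since_domain_start // 100
```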

Signed-off-by: Paul Durrant <paul.durrant@citrix.com>
Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
12 years agox86: command line parameter to disable ARAT
Andrew Cooper [Tue, 5 Nov 2013 15:21:51 +0000 (16:21 +0100)]
x86: command line parameter to disable ARAT

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
12 years agox86/HVM: 32-bit IN result must be zero-extended to 64 bits
Jan Beulich [Tue, 5 Nov 2013 13:51:53 +0000 (14:51 +0100)]
x86/HVM: 32-bit IN result must be zero-extended to 64 bits

Just like for all other operations with 32-bit operand size.
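
A small model of the register update rule on x86-64, for illustration
(write_in_result is a hypothetical name, not the emulator's function):
8-bit and 16-bit results merge into the low bits of the destination,
while a 32-bit result clears the upper half.

```python
MASK64 = (1 << 64) - 1

def write_in_result(rax, value, size):
    """Return the new 64-bit rax after an IN of `size` bytes (1, 2 or 4)."""
    if size == 4:
        # 32-bit operand size: result is zero-extended to 64 bits.
        return value & 0xFFFFFFFF
    mask = (1 << (8 * size)) - 1
    # 8/16-bit operand size: upper bits of the register are preserved.
    return (rax & ~mask & MASK64) | (value & mask)
```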

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
Acked-by: Keir Fraser <keir@xen.org>
12 years ago.config: Allow all URL(s) to be specified in .config
Don Slutz [Tue, 5 Nov 2013 01:56:23 +0000 (20:56 -0500)]
.config: Allow all URL(s) to be specified in .config

This allows building Xen from source on a system without internet
access, using either the environment or .config.

Treat XEN_EXTFILES_URL, QEMU_REMOTE, IPXE_GIT_URL, and
IPXE_TARBALL_URL the same way as QEMU_UPSTREAM_URL, SEABIOS_UPSTREAM_URL, etc.

Signed-off-by: Don Slutz <dslutz@verizon.com>
Acked-by: Ian Campbell <ian.campbell@citrix.com>
12 years agolibxl: make hotplug execution conditional on backend_domid == domid
Roger Pau Monne [Wed, 2 Oct 2013 09:24:25 +0000 (11:24 +0200)]
libxl: make hotplug execution conditional on backend_domid == domid

libxl currently refuses to execute hotplug scripts if the backend
domid of a device is different from LIBXL_TOOLSTACK_DOMID. This
prevents libxl from executing hotplug scripts when running in a domain
other than LIBXL_TOOLSTACK_DOMID. We should instead check whether
backend_domid is different from the current domid.
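
The change in the condition can be sketched as below. The function
names are hypothetical, and the value 0 for LIBXL_TOOLSTACK_DOMID is an
assumption made for the example:

```python
LIBXL_TOOLSTACK_DOMID = 0  # assumed value, for illustration

def runs_hotplug_old(backend_domid):
    # Old check: only backends in the fixed toolstack domain qualify.
    return backend_domid == LIBXL_TOOLSTACK_DOMID

def runs_hotplug_new(backend_domid, current_domid):
    # New check: run scripts when the backend lives in the domain that
    # libxl itself is running in.
    return backend_domid == current_domid
```

For libxl running in a driver domain with domid 5 and a backend in that
same domain, the old check refuses to run the scripts while the new one
runs them.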

Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>
Acked-by: Ian Campbell <ian.campbell@citrix.com>
Acked-by: Ian Jackson <ian.jackson@eu.citrix.com>
12 years agolibxl: remove unneeded libxl_domain_info in wait_device_connection
Roger Pau Monne [Wed, 2 Oct 2013 09:24:24 +0000 (11:24 +0200)]
libxl: remove unneeded libxl_domain_info in wait_device_connection

The info fetched by libxl_domain_info is not used anywhere in the
function.

Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>
Acked-by: Ian Campbell <ian.campbell@citrix.com>
Acked-by: Ian Jackson <Ian.Jackson@eu.citrix.com>
12 years agolibxl/hotplug: add support for getting domid
Roger Pau Monne [Wed, 2 Oct 2013 09:24:23 +0000 (11:24 +0200)]
libxl/hotplug: add support for getting domid

This patch writes Dom0 domid on xenstore (like it's done for other
guests), and adds a libxl helper function to fetch that domid from
xenstore.

Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>
Acked-by: Ian Campbell <ian.campbell@citrix.com>
Acked-by: Ian Jackson <Ian.Jackson@eu.citrix.com>
[ ijc -- dropped xencommons hunk, same change was made independently
  in 02ebea7768fe ]

12 years agoxen: arm: constify platform compatibility lists and use __initconst
Ian Campbell [Fri, 1 Nov 2013 14:03:12 +0000 (14:03 +0000)]
xen: arm: constify platform compatibility lists and use __initconst

Signed-off-by: Ian Campbell <ian.campbell@citrix.com>
Acked-by: Julien Grall <julien.grall@linaro.org>
12 years agoxen/arm: Blacklist sun7i UARTs
Ian Campbell [Fri, 1 Nov 2013 14:03:11 +0000 (14:03 +0000)]
xen/arm: Blacklist sun7i UARTs

These are in the same page as the UART which is used as the Xen console. We are
not currently smart enough to avoid passing them through to the guest,
accidentally giving the guest access to the Xen console UART.

We blacklist them all. If necessary, in the future we can split the list into
two halves and make a per-platform decision about which half should be
blacklisted on a given platform, depending on which UART is wired up as the
console.

Signed-off-by: Ian Campbell <ian.campbell@citrix.com>
Acked-by: Julien Grall <julien.grall@linaro.org>