Ian Campbell [Thu, 10 Oct 2013 14:43:41 +0000 (15:43 +0100)]
xen: arm: Enable 40 bit addressing in VTCR for arm64
This requires setting the v8 specific VTCR_EL2.PS field. These bits are
UNK/SBZP on v7.
Also the T0SZ field is described slightly differently for v8, so update the
comment to reflect this.
Signed-off-by: Ian Campbell <ian.campbell@citrix.com>
Acked-by: Julien Grall <julien.grall@linaro.org>
Ian Campbell [Thu, 10 Oct 2013 14:43:40 +0000 (15:43 +0100)]
xen: correct xenheap_bits after "xen: support RAM at addresses 0 and 4096"
This is incorrect after commit 1aac966e24e which shuffled the zones up by one.
I've observed failures on arm64 systems with RAM at 0x8,00000000-0x8,7fffffff
since xenheap_bits ends up as 35 instead of 36 (which is the zone with all the
RAM).
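A rough illustration of the arithmetic only (a sketch, not Xen's code; the helper name bits_needed is made up here, and how bit counts map onto heap zones is specific to Xen's allocator and not modelled): every address in that RAM bank needs 36 address bits, so a 35-bit limit excludes all of it.

```c
#include <stdint.h>

/* Hypothetical helper: number of address bits needed to represent addr,
 * i.e. the 1-based position of its most significant set bit (0 for 0). */
unsigned int bits_needed(uint64_t addr)
{
    unsigned int bits = 0;

    while (addr != 0) {
        bits++;
        addr >>= 1;
    }
    return bits;
}
```

Both ends of the bank, 0x800000000 and 0x87fffffff, need 36 bits, while the highest address reachable with 35 bits is 0x7ffffffff.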
Signed-off-by: Ian Campbell <ian.campbell@citrix.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Keir Fraser <keir@xen.org>
Cc: Tim Deegan <tim@xen.org>
Ian Campbell [Mon, 21 Oct 2013 09:21:23 +0000 (10:21 +0100)]
xen: arm: fix usage of bootargs for Xen.
The chosen node's bootargs property should be used for Xen if there is a dom0
kernel multiboot module with a command line, not just if xen,dom0-bootargs is
present.
Signed-off-by: Ian Campbell <ian.campbell@citrix.com>
Acked-by: Julien Grall <julien.grall@linaro.org>
Ian Campbell [Tue, 22 Oct 2013 16:12:14 +0000 (17:12 +0100)]
xen: arm: correct XEN_COMPILE_ARCH autodetection for arm64
At least on aarch64 openSUSE running with qemu-user-aarch64 "uname -m" reports
"aarch64" and not "armv8" so include that in the seddery. There's no harm
leaving the existing armv8 rune too so do so.
Signed-off-by: Ian Campbell <ian.campbell@citrix.com>
Acked-by: Julien Grall <julien.grall@linaro.org>
Julien Grall [Tue, 22 Oct 2013 10:51:48 +0000 (11:51 +0100)]
xen/arm: Allocate memory for dom0 from the bottom with the 1:1 Workaround
On Linux, the option CONFIG_ARM_PATCH_PHYS_VIRT (enabled by default) allows
the kernel to be loaded anywhere (or nearly) by patching the phys<->virt
translation at boot time.
The current solution in Linux assumes that the delta physical address -
virtual address is always negative. A positive delta defeats the
optimisation of modifying only part of the translation instruction (add/sub).
By default, Xen allocates memory from the top of memory and works
downwards. To avoid boot issues with Linux, we must allocate memory
from the bottom (i.e. starting from 0).
Signed-off-by: Julien Grall <julien.grall@linaro.org>
Acked-by: Ian Campbell <ian.campbell@citrix.com>
Chen Baozi [Tue, 15 Oct 2013 08:45:31 +0000 (16:45 +0800)]
xen/arm: implement smp initialization callbacks for omap5
Signed-off-by: Chen Baozi <baozich@gmail.com>
Acked-by: Julien Grall <julien.grall@linaro.org>
Chen Baozi [Tue, 15 Oct 2013 08:45:29 +0000 (16:45 +0800)]
xen/arm: fix a typo in comment of PLATFORM_QUIRK_DOM0_MAPPING_11
Signed-off-by: Chen Baozi <baozich@gmail.com>
Acked-by: Julien Grall <julien.grall@linaro.org>
Paul Durrant [Thu, 24 Oct 2013 08:47:50 +0000 (09:47 +0100)]
netif.h: Add IPv6 related changes
My recent patch series to Linux netback added IPv6 checksum
offload and GSO support. This involved making some changes to the
copy of netif.h in Linux.
This patch adds those changes to the canonical copy of netif.h.
Signed-off-by: Paul Durrant <paul.durrant@citrix.com>
Acked-by: Wei Liu <wei.liu2@citrix.com>
Acked-by: Ian Campbell <ian.campbell@citrix.com>
Julien Grall [Wed, 23 Oct 2013 16:28:47 +0000 (17:28 +0100)]
xen/arm: add_to_physmap_one: Avoid to map mfn 0 if an error occurs
By default, the function add_to_physmap_one sets mfn to 0. Some code paths that
result in an error continue and then map mfn 0 (valid on ARM) to the
slot given by the guest.
To fix the problem, return an error directly if a sanity check has failed.
Signed-off-by: Julien Grall <julien.grall@linaro.org>
Acked-by: Tim Deegan <tim@xen.org>
Acked-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
Andrew Cooper [Tue, 22 Oct 2013 15:16:29 +0000 (17:16 +0200)]
spinlock: ensure the flags parameter is wide enough
Because of the construction of spin_lock_irq() (and variants), the flags
parameter could be truncated. Use a BUILD_BUG_ON() to verify the width of the
parameter.
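A minimal sketch of such a width check, assuming a C11 compiler. BUILD_BUG_ON() is re-created here with _Static_assert, and fake_irq_save() and FAKE_SAVED_FLAGS are placeholders invented for illustration, not Xen's macros.

```c
/* Stand-in for Xen's BUILD_BUG_ON(): compilation fails if cond is true. */
#define BUILD_BUG_ON(cond) _Static_assert(!(cond), "BUILD_BUG_ON failed")

/* Placeholder for the saved processor flags; a real implementation
 * would read them from the CPU. */
#define FAKE_SAVED_FLAGS 0x246UL

/* Macro form, as with the irqsave helpers: 'flags' is assigned to directly,
 * so a narrower variable would silently truncate the value. The size check
 * turns that mistake into a build failure. */
#define fake_irq_save(flags)                                  \
    do {                                                      \
        BUILD_BUG_ON(sizeof(flags) != sizeof(unsigned long)); \
        (flags) = FAKE_SAVED_FLAGS;                           \
    } while (0)
```

With an unsigned long argument this compiles and stores the full value; passing an unsigned int on a 64-bit build would stop the compilation instead of dropping the upper bits.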
Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Acked-by: Keir Fraser <keir@xen.org>
Acked-by: Ian Campbell <ian.campbell@citrix.com>
Andrew Cooper [Tue, 22 Oct 2013 15:15:58 +0000 (17:15 +0200)]
widen flags parameter for spinlock_irqsave() and friends
These issues were detected using the subsequent patch which forces a
compilation error if the result from local_irq_save() would be truncated.
Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Acked-by: Keir Fraser <keir@xen.org>
Andrew Cooper [Tue, 22 Oct 2013 15:11:16 +0000 (17:11 +0200)]
x86/irq: local_irq_restore() should not blindly popf
local_irq_restore() should only be concerned with possibly changing the
interrupt flag. A blind popf could corrupt other system flags.
While playing in this area, fixup an opencoded use of X86_EFLAGS_IF.
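The difference can be modelled in plain C (an illustrative sketch only; the real code is inline assembly, and the function names below are invented): a restore limited to the interrupt flag merges just that bit into the current flags, whereas a blind restore overwrites everything.

```c
#define X86_EFLAGS_IF 0x00000200UL  /* interrupt enable flag, bit 9 */

/* Model of a restore that only touches IF: every other bit comes from the
 * current flags, only the IF bit comes from the saved value. */
unsigned long restore_if_only(unsigned long current, unsigned long saved)
{
    return (current & ~X86_EFLAGS_IF) | (saved & X86_EFLAGS_IF);
}

/* Model of a blind popf: the saved value replaces all flags. */
unsigned long restore_blind(unsigned long current, unsigned long saved)
{
    (void)current;
    return saved;
}
```

If the direction flag (0x400) was set after the flags were saved, the blind variant silently clears it again, while the IF-only variant leaves it alone.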
Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Acked-by: Keir Fraser <keir@xen.org>
Jan Beulich [Mon, 21 Oct 2013 15:26:16 +0000 (17:26 +0200)]
x86/xsave: also save/restore XCR0 across suspend (ACPI S3)
Signed-off-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
Acked-by: Keir Fraser <keir@xen.org>
Marc Carino [Wed, 16 Oct 2013 22:57:06 +0000 (15:57 -0700)]
xen/arm: Add CPU ID for Broadcom Brahma-B15
Let Xen recognize the Broadcom Brahma-B15 CPU by adding the appropriate
MIDR mask to the initialization phase. Further, ensure that the console
output properly reports the CPU manufacturer as "Broadcom Corporation".
Signed-off-by: Marc Carino <marc.ceeeee@gmail.com>
Acked-by: Ian Campbell <ian.campbell@citrix.com>
Jan Beulich [Thu, 17 Oct 2013 09:35:26 +0000 (11:35 +0200)]
x86: print relevant (tail) part of filename for warnings and crashes
In particular when the origin construct is in a header file (and
hence the file name is an absolute path instead of just the file name
portion) the information can otherwise become rather useless when the
build tree isn't sitting relatively close to the file system root.
Signed-off-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Keir Fraser <keir@xen.org>
Ian Campbell [Mon, 23 Sep 2013 12:19:16 +0000 (13:19 +0100)]
tools: update to SeaBIOS 1.7.3.1
Signed-off-by: Ian Campbell <ian.campbell@citrix.com>
Ian Campbell [Fri, 11 Oct 2013 11:49:05 +0000 (12:49 +0100)]
xend: Drop long deprecation warning in /var/run not /tmp
Signed-off-by: Ian Campbell <ian.campbell@citrix.com>
Acked-by: Ian Jackson <ian.jackson@eu.citrix.com>
Wei Liu [Fri, 11 Oct 2013 15:31:57 +0000 (16:31 +0100)]
xen: arm: Emacs style fix
Signed-off-by: Wei Liu <wei.liu2@citrix.com>
Acked-by: Ian Campbell <ian.campbell@citrix.com>
Juergen Gross [Wed, 16 Oct 2013 10:28:04 +0000 (12:28 +0200)]
add cap value to credit scheduler debug info
Currently the weight is the only scheduling parameter printed for
domains in the credit scheduler key handler. Print the cap value
as well.
Signed-off-by: Juergen Gross <juergen.gross@ts.fujitsu.com>
Acked-by: George Dunlap <george.dunlap@eu.citrix.com>
Juergen Gross [Wed, 16 Oct 2013 10:26:48 +0000 (12:26 +0200)]
credit: unpause parked vcpu before destroying it
A capped out vcpu must be unpaused in case of moving it to another cpupool,
otherwise it will be paused forever.
Signed-off-by: Juergen Gross <juergen.gross@ts.fujitsu.com>
Acked-by: George Dunlap <george.dunlap@eu.citrix.com>
Julien Grall [Mon, 14 Oct 2013 22:19:37 +0000 (23:19 +0100)]
xen/evtchn: Fix build on ARM
The recent event channel changes introduced by commit a77eb86 and before...
break the compilation on Xen ARM. This commit adds missing includes in
common/event_fifo.c and include/xen/sched.h.
Signed-off-by: Julien Grall <julien.grall@linaro.org>
Acked-by: Ian Campbell <ian.campbell@citrix.com>
Fabio Fantoni [Mon, 30 Sep 2013 11:53:08 +0000 (13:53 +0200)]
libxl: remove qemu default devices for upstream qemu
Remove the default devices created by qemu. Qemu will create only the devices
defined by xen, since devices not defined by xen are not usable.
Remove the deletion of the empty floppy, which is no longer needed with nodefault.
(Removed a whitespace error. -iwj)
Signed-off-by: Fabio Fantoni <fabio.fantoni@m2r.biz>
Acked-by: Anthony PERARD <anthony.perard@citrix.com>
Signed-off-by: Ian Jackson <ian.jackson@eu.citrix.com>
Ian Campbell [Thu, 10 Oct 2013 09:37:37 +0000 (10:37 +0100)]
pygrub: Support (/dev/xvda) style disk specifications
You get these if you install Debian Wheezy as HVM and then try to convert to
PV.
This is Debian bug #603391.
Signed-off-by: Ian Campbell <ian.campbell@citrix.com>
Tested-by: Tril <tril@metapipe.net>
Acked-by: Ian Jackson <ian.jackson@eu.citrix.com>
David Vrabel [Mon, 14 Oct 2013 08:25:39 +0000 (10:25 +0200)]
libxl,xl: add max_event_channels option to xl configuration file
Add the 'max_event_channels' option to the xl configuration file to
limit the number of event channels that a domain may use.
Plumb this option through to libxl via a new libxl_build_info field
and call xc_domain_set_max_evtchn() in the post build stage of domain
creation.
A new LIBXL_HAVE_BUILDINFO_EVENT_CHANNELS #define indicates that this
new field is available.
The default value of 1023 limits the domain to using the minimum
amount of global mapping pages and at most 5 xenheap pages.
Signed-off-by: David Vrabel <david.vrabel@citrix.com>
Acked-by: Ian Campbell <ian.campbell@citrix.com>
Acked-by: Keir Fraser <keir@xen.org>
David Vrabel [Mon, 14 Oct 2013 08:24:03 +0000 (10:24 +0200)]
libxc: add xc_domain_set_max_evtchn()
Add xc_domain_set_max_evtchn(), a wrapper around the
DOMCTL_set_max_evtchn hypercall.
Signed-off-by: David Vrabel <david.vrabel@citrix.com>
Acked-by: Ian Campbell <ian.campbell@citrix.com>
Acked-by: Keir Fraser <keir@xen.org>
David Vrabel [Mon, 14 Oct 2013 08:23:10 +0000 (10:23 +0200)]
Add DOMCTL to limit the number of event channels a domain may use
Add XEN_DOMCTL_set_max_evtchn which may be used during domain creation to
set the maximum event channel port a domain may use. This may be used to
limit the amount of Xen resources (global mapping space and xenheap) that
a domain may use for event channels.
A domain that does not have a limit set may use all the event channels
supported by the event channel ABI in use.
Signed-off-by: David Vrabel <david.vrabel@citrix.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Daniel De Graaf <dgdegra@tycho.nsa.gov>
Acked-by: Keir Fraser <keir@xen.org>
David Vrabel [Mon, 14 Oct 2013 08:22:07 +0000 (10:22 +0200)]
evtchn: add FIFO-based event channel hypercalls and port ops
Add the implementation for the FIFO-based event channel ABI. The new
hypercall sub-ops (EVTCHNOP_init_control, EVTCHNOP_expand_array) and
the required evtchn_ops (set_pending, unmask, etc.).
Signed-off-by: David Vrabel <david.vrabel@citrix.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Keir Fraser <keir@xen.org>
David Vrabel [Mon, 14 Oct 2013 08:21:06 +0000 (10:21 +0200)]
evtchn: implement EVTCHNOP_set_priority and add the set_priority hook
Implement EVTCHNOP_set_priority. A new set_priority hook added to
struct evtchn_port_ops will do the ABI specific validation and setup.
If an ABI does not provide a set_priority hook (as is the case of the
2-level ABI), the sub-op will return -ENOSYS.
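The optional-hook pattern can be sketched as follows. This is a simplified stand-in, not the Xen source: struct port_ops, the hook signature and the example hook are assumptions made for illustration.

```c
#include <errno.h>
#include <stddef.h>

/* Simplified stand-in for struct evtchn_port_ops; only the optional hook
 * is modelled, and its signature here is an assumption. */
struct port_ops {
    int (*set_priority)(unsigned int port, unsigned int priority);
};

/* Dispatch to the ABI's hook if it has one, otherwise report -ENOSYS. */
int do_set_priority(const struct port_ops *ops,
                    unsigned int port, unsigned int priority)
{
    if (ops->set_priority == NULL)
        return -ENOSYS;
    return ops->set_priority(port, priority);
}

/* Example hook for an ABI with 16 priorities (0..15); the validation
 * shown is hypothetical. */
int fifo_like_set_priority(unsigned int port, unsigned int priority)
{
    (void)port;
    return priority < 16 ? 0 : -EINVAL;
}
```

An ops table that leaves set_priority as NULL models the 2-level case: the caller gets -ENOSYS without any ABI check in the generic code.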
Signed-off-by: David Vrabel <david.vrabel@citrix.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Keir Fraser <keir@xen.org>
David Vrabel [Mon, 14 Oct 2013 08:20:02 +0000 (10:20 +0200)]
evtchn: add FIFO-based event channel ABI
Add the event channel hypercall sub-ops and the definitions for the
shared data structures for the FIFO-based event channel ABI.
The design document for this new ABI is available here:
http://xenbits.xen.org/people/dvrabel/event-channels-F.pdf
In summary, events are reported using a per-domain shared event array
of event words. Each event word has PENDING, LINKED and MASKED bits
and a LINK field for pointing to the next event in the event queue.
There are 16 event queues (with different priorities) per-VCPU.
Key advantages of this new ABI include:
- Support for over 100,000 events (2^17).
- 16 different event priorities.
- Improved fairness in event latency through the use of FIFOs.
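A sketch of how such an event word could be manipulated. The bit positions used below (PENDING 31, MASKED 30, LINKED 29, 17-bit LINK field in the low bits) are assumptions for illustration and should be checked against the design document; the EW_ and ew_ names are invented.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed layout, for illustration only. */
#define EW_PENDING   (UINT32_C(1) << 31)
#define EW_MASKED    (UINT32_C(1) << 30)
#define EW_LINKED    (UINT32_C(1) << 29)
#define EW_LINK_BITS 17
#define EW_LINK_MASK ((UINT32_C(1) << EW_LINK_BITS) - 1)

/* Point this event at the next port in the queue and mark it LINKED. */
uint32_t ew_link(uint32_t word, uint32_t next_port)
{
    return (word & ~EW_LINK_MASK) | (next_port & EW_LINK_MASK) | EW_LINKED;
}

/* Read back the port number stored in the LINK field. */
uint32_t ew_get_link(uint32_t word)
{
    return word & EW_LINK_MASK;
}

/* An event is deliverable when it is pending and not masked. */
bool ew_deliverable(uint32_t word)
{
    return (word & EW_PENDING) != 0 && (word & EW_MASKED) == 0;
}
```

A 17-bit LINK field can name 2^17 ports, which is where the "over 100,000 events" figure comes from.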
Signed-off-by: David Vrabel <david.vrabel@citrix.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Keir Fraser <keir@xen.org>
David Vrabel [Mon, 14 Oct 2013 08:19:21 +0000 (10:19 +0200)]
evtchn: allow many more evtchn objects to be allocated per domain
Expand the number of event channels that can be supported internally
by altering how struct evtchn's are allocated.
The objects are indexed using a two level scheme of groups and buckets
(instead of only buckets). Each group is a page of bucket pointers.
Each bucket is a page-sized array of struct evtchn's.
The optimal number of evtchns per bucket is calculated at compile
time.
If XSM is not enabled, struct evtchn is 16 bytes and each bucket
contains 256, requiring only 1 group of 512 pointers for 2^17
(131,072) event channels. With XSM enabled, struct evtchn is 24
bytes, each bucket contains 128 and 2 groups are required.
For the common case of a domain with only a few event channels,
instead of requiring an additional allocation for the group page, the
first bucket is indexed directly.
As a consequence of this, struct domain shrinks by at least 232 bytes
as 32 bucket pointers are replaced with 1 bucket pointer and (at most)
2 group pointers.
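The figures above can be reproduced with a little arithmetic. This is a sketch under stated assumptions: 4096-byte pages, 8-byte pointers, and a per-bucket count rounded down to a power of two (inferred from the 24-byte case yielding 128, not confirmed by the text). All names are illustrative.

```c
#define EX_PAGE_SIZE      4096u
#define EX_MAX_EVTCHNS    (1u << 17)            /* 131,072 */
#define EX_PTRS_PER_GROUP (EX_PAGE_SIZE / 8u)   /* 512 with 8-byte pointers */

/* Largest power of two that is <= n (n must be > 0). */
unsigned int round_down_pow2(unsigned int n)
{
    unsigned int p = 1;

    while (p * 2 <= n)
        p *= 2;
    return p;
}

/* Number of struct evtchn's that fit in one page-sized bucket. */
unsigned int evtchns_per_bucket(unsigned int evtchn_size)
{
    return round_down_pow2(EX_PAGE_SIZE / evtchn_size);
}

/* Number of group pages needed to cover all 2^17 event channels. */
unsigned int groups_needed(unsigned int evtchn_size)
{
    unsigned int per_group =
        evtchns_per_bucket(evtchn_size) * EX_PTRS_PER_GROUP;

    return (EX_MAX_EVTCHNS + per_group - 1) / per_group;
}

/* 32 bucket pointers replaced by 1 bucket pointer plus 2 group pointers. */
unsigned int domain_bytes_saved(void)
{
    return 32u * 8u - (1u + 2u) * 8u;
}
```

For 16-byte objects this gives 256 per bucket and 1 group; for 24-byte objects, 128 per bucket and 2 groups; the saving in struct domain comes out at 232 bytes.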
[ Based on a patch from Wei Liu with improvements from Malcolm
Crossley. ]
Signed-off-by: David Vrabel <david.vrabel@citrix.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Keir Fraser <keir@xen.org>
David Vrabel [Mon, 14 Oct 2013 08:18:24 +0000 (10:18 +0200)]
evtchn: use a per-domain variable for the max number of event channels
Use d->max_evtchns instead of the MAX_EVTCHNS(d) macro. This
avoids having to repeatedly check the ABI type.
Signed-off-by: David Vrabel <david.vrabel@citrix.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Keir Fraser <keir@xen.org>
David Vrabel [Mon, 14 Oct 2013 08:17:14 +0000 (10:17 +0200)]
evtchn: print ABI specific state with the 'e' debug key
In the output of the 'e' debug key, print some ABI specific state in
addition to the (p)ending and (m)asked bits.
For the 2-level ABI, print the state of that event's selector
bit. e.g.,
(XEN) port [p/m/s]
(XEN) 1 [0/0/1]: s=3 n=0 x=0 d=0 p=74
(XEN) 2 [0/0/1]: s=3 n=0 x=0 d=0 p=75
Signed-off-by: David Vrabel <david.vrabel@citrix.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Keir Fraser <keir@xen.org>
David Vrabel [Mon, 14 Oct 2013 08:15:49 +0000 (10:15 +0200)]
evtchn: refactor low-level event channel port ops
Use functions for the low-level event channel port operations
(set/clear pending, unmask, is_pending and is_masked).
Group these functions into a struct evtchn_port_op so they can be
replaced by alternate implementations (for different ABIs) on a
per-domain basis.
Signed-off-by: David Vrabel <david.vrabel@citrix.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Keir Fraser <keir@xen.org>
David Vrabel [Mon, 14 Oct 2013 08:14:38 +0000 (10:14 +0200)]
debug: remove some event channel info from the 'i' and 'q' debug keys
The 'i' key would always use VCPU0's selector word when printing the
event channel state. Remove the incorrect output as a subsequent
change will add the (correct) information to the 'e' key instead.
When dumping domain information, printing the state of the VIRQ_DEBUG
port is redundant -- this information is available via the 'e' key.
Signed-off-by: David Vrabel <david.vrabel@citrix.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Keir Fraser <keir@xen.org>
Jan Beulich [Mon, 14 Oct 2013 07:54:09 +0000 (09:54 +0200)]
x86/HVM: cache emulated instruction for retry processing
Rather than re-reading the instruction bytes upon retry processing,
stash away and re-use what we already read. That way we can be certain
that the retry won't do something different from what requested the
retry, getting once again closer to real hardware behavior (where what
we use retries for is simply a bus operation, not involving redundant
decoding of instructions).
Signed-off-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
Acked-by: Keir Fraser <keir@xen.org>
Jan Beulich [Mon, 14 Oct 2013 07:53:31 +0000 (09:53 +0200)]
x86/HVM: properly deal with hvm_copy_*_guest_phys() errors
In memory read/write handling the default case should tell the caller
that the operation cannot be handled rather than the operation having
succeeded, so that when new HVMCOPY_* states get added not handling
them explicitly will not result in errors being ignored.
In task switch emulation code stop handling some errors, but not
others.
Signed-off-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
Acked-by: Keir Fraser <keir@xen.org>
Jan Beulich [Mon, 14 Oct 2013 07:52:33 +0000 (09:52 +0200)]
x86/HVM: don't ignore hvm_copy_to_guest_phys() errors during I/O intercept
Building upon the extended retry logic we can now also make sure to
not ignore errors resulting from writing data back to guest memory.
Signed-off-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
Acked-by: Keir Fraser <keir@xen.org>
Jan Beulich [Mon, 14 Oct 2013 07:51:40 +0000 (09:51 +0200)]
x86/HVM: fix direct PCI port I/O emulation retry and error handling
dpci_ioport_{read,write}() guest memory access failure handling should
be modelled after process_portio_intercept()'s (and others): Upon
encountering an error on other than the first iteration, the count
successfully handled needs to be stored and X86EMUL_OKAY returned, in
order for the generic instruction emulator to update register state
correctly before reporting failure or retrying (both of which would
only happen after re-invoking emulation).
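The rule in the paragraph above can be sketched like this. It is a simplified model, not the Xen code: the return codes, the per-iteration callback and all names are hypothetical, and a first-iteration failure is simply reported as unhandleable here.

```c
enum { EMUL_OKAY = 0, EMUL_UNHANDLEABLE = 1 };

/* Hypothetical per-iteration callback: returns 0 on success. */
typedef int (*copy_fn)(unsigned long index, void *opaque);

/* Run up to *reps iterations. A failure on the first iteration is reported
 * directly; a failure on a later one stores the number of iterations that
 * succeeded and returns EMUL_OKAY, so the caller can update register state
 * before the failing iteration is attempted again. */
int emulate_rep(copy_fn copy_one, void *opaque, unsigned long *reps)
{
    unsigned long i;

    for (i = 0; i < *reps; i++) {
        if (copy_one(i, opaque) != 0) {
            if (i == 0)
                return EMUL_UNHANDLEABLE;
            break;
        }
    }
    *reps = i;  /* iterations actually completed */
    return EMUL_OKAY;
}

/* Example callback that fails at the index stored in *opaque. */
int fail_at(unsigned long index, void *opaque)
{
    return index == *(unsigned long *)opaque ? -1 : 0;
}
```

With 8 requested iterations and a failure at index 3, the caller sees EMUL_OKAY with a count of 3; the failure itself only surfaces when emulation is re-invoked and index 3 becomes the first iteration.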
Further we leverage (and slightly extend, due to the above mentioned
need to return X86EMUL_OKAY) the "large MMIO" retry model.
Note that there is still a special case not explicitly taken care of
here: While the first retry on the last iteration of a "rep ins"
correctly recovers the already read data, an eventual subsequent retry
is being handled by the pre-existing mmio-large logic (through
hvmemul_do_io() storing the [recovered] data [again], also taking into
consideration that the emulator converts a single iteration "ins" to
->read_io() plus ->write()).
Also fix an off-by-one in the mmio-large-read logic, and slightly
simplify the copying of the data.
Signed-off-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
Acked-by: Keir Fraser <keir@xen.org>
Jan Beulich [Mon, 14 Oct 2013 07:50:16 +0000 (09:50 +0200)]
x86/HVM: properly handle backward string instruction emulation
Multiplying a signed 32-bit quantity with an unsigned 32-bit quantity
produces an unsigned 32-bit result, yet for emulation of backward
string instructions we need the result sign extended before getting
added to the base address.
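A small demonstration of the pitfall, assuming a platform where int is 32 bits wide. The function names are invented, and widening before the multiplication is just one way to obtain a sign-extended offset; it is not necessarily how the patch does it.

```c
#include <stdint.h>

/* Problematic form: int32_t * uint32_t is computed as unsigned 32-bit, so a
 * negative step turns into a large positive offset once it is widened to
 * 64 bits for the addition. */
uint64_t addr_wrong(uint64_t base, int32_t step, uint32_t count)
{
    return base + step * count;
}

/* Widen both operands to signed 64-bit first so the product keeps its sign;
 * adding the result as uint64_t then wraps to base minus the magnitude. */
uint64_t addr_sign_extended(uint64_t base, int32_t step, uint32_t count)
{
    return base + (uint64_t)((int64_t)step * (int64_t)count);
}
```

With base 0x100000000, step -4 and count 3, the first form yields 0x1FFFFFFF4 while the second yields 0xFFFFFFF4, i.e. base minus 12.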
Signed-off-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
Acked-by: Keir Fraser <keir@xen.org>
Andrew Cooper [Mon, 14 Oct 2013 07:07:44 +0000 (09:07 +0200)]
sched: Correct function prototypes
struct vcpu pointers are traditionally v rather than d.
Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Acked-by: Keir Fraser <keir@xen.org>
Jan Beulich [Mon, 14 Oct 2013 07:07:02 +0000 (09:07 +0200)]
x86/MSI: fix locking in pci_restore_msi_state()
Right after the loop the lock is being dropped, so all loop exits
should happen with the lock still held.
Reported-by: Kristoffer Egefelt <kristoffer@itoc.dk>
Signed-off-by: Jan Beulich <jbeulich@suse.com>
Tested-by: Kristoffer Egefelt <kristoffer@itoc.dk>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
David Vrabel [Mon, 14 Oct 2013 06:58:31 +0000 (08:58 +0200)]
sched: fix race between sched_move_domain() and vcpu_wake()
From: David Vrabel <david.vrabel@citrix.com>
sched_move_domain() changes v->processor for all the domain's VCPUs.
If another domain, softirq etc. triggers a simultaneous call to
vcpu_wake() (e.g., by setting an event channel as pending), then
vcpu_wake() may lock one schedule lock and try to unlock another.
vcpu_schedule_lock() attempts to handle this but only does so for the
window between reading the schedule_lock from the per-CPU data and the
spin_lock() call. This does not help with sched_move_domain()
changing v->processor between the calls to vcpu_schedule_lock() and
vcpu_schedule_unlock().
Fix the race by taking the schedule_lock for v->processor in
sched_move_domain().
Signed-off-by: David Vrabel <david.vrabel@citrix.com>
Acked-by: Juergen Gross <juergen.gross@ts.fujitsu.com>
Use vcpu_schedule_lock_irq() (which now returns the lock) to properly
retry the locking should the to be used lock have changed in the course
of acquiring it (issue pointed out by George Dunlap).
Add a comment explaining the state after the v->processor adjustment.
Signed-off-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
Acked-by: Keir Fraser <keir@xen.org>
Jan Beulich [Mon, 14 Oct 2013 06:57:56 +0000 (08:57 +0200)]
scheduler: adjust internal locking interface
Make the locking functions return the lock pointers, so they can be
passed to the unlocking functions (which in turn can check that the
lock is still actually providing the intended protection, i.e. the
parameters determining which lock is the right one didn't change).
Further use proper spin lock primitives rather than open coded
local_irq_...() constructs, so that interrupts can be re-enabled as
appropriate while spinning.
Signed-off-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
Acked-by: Keir Fraser <keir@xen.org>
Jan Beulich [Mon, 14 Oct 2013 06:52:18 +0000 (08:52 +0200)]
x86: fix bug_line()
Due to the packing into a bit field together with a relocated field,
the computation can overflow when the relocated field ends up getting a
negative value stored. Hence it isn't sufficient to correct the value
by 1 in this case, but we also need to mask the result to the width of
the original bit field.
Signed-off-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Keir Fraser <keir@xen.org>
Ian Jackson [Fri, 11 Oct 2013 18:05:31 +0000 (19:05 +0100)]
Revert "QEMU_TAG update"
(My script edited the wrong xen.git branch)
This reverts commit 363cfda13a58eab51a4a85f30c7c740990b53c3a.
Ian Jackson [Fri, 11 Oct 2013 18:04:25 +0000 (19:04 +0100)]
QEMU_TAG update
Ian Jackson [Fri, 11 Oct 2013 11:10:45 +0000 (12:10 +0100)]
libxl: make libxl__poller_put tolerate p==NULL
This is less fragile, and more in keeping with the usual style of
initialising everything to 0 and freeing things unconditionally.
Correspondingly, remove the tests at the call sites.
Apropos of c1f3f174. No overall functional change.
Signed-off-by: Ian Jackson <Ian.Jackson@eu.citrix.com>
Acked-by: Ian Campbell <ian.campbell@citrix.com>
Jan Beulich [Fri, 11 Oct 2013 07:31:16 +0000 (09:31 +0200)]
x86: check for canonical address before doing page walks
... as there doesn't really exist any valid mapping for them.
Particularly in the case of do_page_walk() this also avoids returning
non-NULL for such invalid input.
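For reference, a canonical-address test can be written as below. This is a sketch assuming 48-bit virtual addresses (as on x86-64 parts of that era); the function name is illustrative and this is not Xen's macro.

```c
#include <stdbool.h>
#include <stdint.h>

/* With 48-bit virtual addresses, bits 63..47 must be all zeros or all
 * ones, i.e. the address is the sign extension of its low 48 bits. */
bool is_canonical(uint64_t addr)
{
    uint64_t top = addr >> 47;  /* the upper 17 bits */

    return top == 0 || top == 0x1FFFF;
}
```

0x00007FFFFFFFFFFF and 0xFFFF800000000000 pass; everything strictly between them fails, and no page walk is needed to reject it.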
Suggested-by: Andrew Cooper <andrew.cooper3@citrix.com>
Signed-off-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
Acked-by: Keir Fraser <keir@xen.org>
Jan Beulich [Fri, 11 Oct 2013 07:30:31 +0000 (09:30 +0200)]
x86: use {rd,wr}{fs,gs}base when available
... as being intended to be faster than MSR reads/writes.
In the case of emulate_privileged_op() also use these in favor of the
cached (but possibly stale) addresses from arch.pv_vcpu. This allows
entirely removing the code that was the subject of XSA-67.
Signed-off-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
Acked-by: Keir Fraser <keir@xen.org>
Jan Beulich [Fri, 11 Oct 2013 07:29:43 +0000 (09:29 +0200)]
x86: add address validity check to guest_map_l1e()
Just like for guest_get_eff_l1e() this prevents accessing as page
tables (and with the wrong memory attribute) internal data inside Xen
happening to be mapped with 1Gb pages.
Signed-off-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Andrew Cooper <andrew.cooper@citrix.com>
Acked-by: Keir Fraser <keir@xen.org>
Jan Beulich [Fri, 11 Oct 2013 07:28:26 +0000 (09:28 +0200)]
x86: correct LDT checks
- MMUEXT_SET_LDT should behave as similarly to the LLDT instruction as
possible: fail only if the base address is non-canonical
- instead LDT descriptor accesses should fault if the descriptor
address ends up being non-canonical (by ensuring this we at once
avoid reading an entry from the mach-to-phys table and consider it a
page table entry)
- fault propagation on using LDT selectors must distinguish #PF and #GP
(the latter must be raised for a non-canonical descriptor address,
which also applies to several other uses of propagate_page_fault(),
and hence the problem is being fixed there)
- map_ldt_shadow_page() should properly wrap addresses for 32-bit VMs
At once remove the odd invocation of map_ldt_shadow_page() from the
MMUEXT_SET_LDT handler: There's nothing really telling us that the
first LDT page is going to be preferred over others.
Signed-off-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
Acked-by: Keir Fraser <keir@xen.org>
Matthew Daley [Tue, 10 Sep 2013 10:18:46 +0000 (22:18 +1200)]
libxl: fix out-of-memory error handling in libxl_list_cpupool
...otherwise it will return freed memory. All the current users of this
function check already for a NULL return, so use that.
Coverity-ID: 1056194
This is CVE-2013-4371 / XSA-70
Signed-off-by: Matthew Daley <mattjd@gmail.com>
Acked-by: Ian Campbell <ian.campbell@citrix.com>
Matthew Daley [Tue, 10 Sep 2013 11:12:45 +0000 (23:12 +1200)]
tools/ocaml: fix erroneous free of cpumap in stub_xc_vcpu_getaffinity
Not sure how it got there...
Coverity-ID: 1056196
This is CVE-2013-4370 / XSA-69
Signed-off-by: Matthew Daley <mattjd@gmail.com>
Acked-by: Ian Campbell <ian.campbell@citrix.com>
Ian Jackson [Thu, 10 Oct 2013 14:48:55 +0000 (15:48 +0100)]
libxl: fix vif rate parsing
strtok can return NULL here. We don't need to use strtok anyway, so just
use a simple strchr method.
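A sketch of a strchr-based split, not the libxl code: the function name, the buffer handling and the sample string "10Mb/s@50ms" are assumptions for illustration. Unlike a second strtok() call, a missing separator is handled explicitly.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical splitter for "<rate>@<interval>": copies the rate part into
 * rate_out and returns a pointer to the interval part inside spec, or NULL
 * if there is no '@'. If the rate part does not fit, rate_out is set to the
 * empty string and NULL is returned. */
const char *split_rate(const char *spec, char *rate_out, size_t rate_size)
{
    const char *at = strchr(spec, '@');
    size_t rate_len = at ? (size_t)(at - spec) : strlen(spec);

    if (rate_size == 0)
        return NULL;
    if (rate_len >= rate_size) {
        rate_out[0] = '\0';
        return NULL;
    }
    memcpy(rate_out, spec, rate_len);
    rate_out[rate_len] = '\0';
    return at ? at + 1 : NULL;
}
```

The input string is never modified, and the "no interval given" case is an ordinary NULL return that the caller has to check.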
Coverity-ID: 1055642
This is CVE-2013-4369 / XSA-68
Signed-off-by: Matthew Daley <mattjd@gmail.com>
Fix type. Add test case
Signed-off-by: Ian Campbell <Ian.campbell@citrix.com>
Matthew Daley [Thu, 10 Oct 2013 13:19:53 +0000 (15:19 +0200)]
x86: check segment descriptor read result in 64-bit OUTS emulation
When emulating such an operation from a 64-bit context (CS has long
mode set), and the data segment is overridden to FS/GS, the result of
reading the overridden segment's descriptor (read_descriptor) is not
checked. If it fails, data_base is left uninitialized.
This can lead to 8 bytes of Xen's stack being leaked to the guest
(implicitly, i.e. via the address given in a #PF).
Coverity-ID: 1055116
This is CVE-2013-4368 / XSA-67.
Signed-off-by: Matthew Daley <mattjd@gmail.com>
Fix formatting.
Signed-off-by: Jan Beulich <jbeulich@suse.com>
Jaeyong Yoo [Fri, 4 Oct 2013 04:44:02 +0000 (13:44 +0900)]
xen/arm: Fixing clear_guest_offset macro
Fix the broken macro 'clear_guest_offset' in arm.
Signed-off-by: Jaeyong Yoo <jaeyong.yoo@samsung.com>
Reviewed-by: Julien Grall <julien.grall@linaro.org>
Acked-by: Ian Campbell <ian.campbell@citrix.com>
Ian Campbell [Thu, 10 Oct 2013 11:41:10 +0000 (12:41 +0100)]
Merge branch 'staging' of ssh://xenbits.xen.org/home/xen/git/xen into staging
Dario Faggioli [Thu, 3 Oct 2013 17:46:02 +0000 (19:46 +0200)]
libxl: introduce libxl_node_to_cpumap
As a helper for the special case (of libxl_nodemap_to_cpumap) when
one wants the cpumap for just one node.
Signed-off-by: Dario Faggioli <dario.faggioli@citrix.com>
Acked-by: Ian Jackson <ian.jackson@eu.citrix.com>
Dario Faggioli [Thu, 3 Oct 2013 17:45:47 +0000 (19:45 +0200)]
xl: fix a typo in main_vcpulist()
which was preventing `xl vcpu-list -h' from working.
Signed-off-by: Dario Faggioli <dario.faggioli@citrix.com>
Acked-by: Ian Campbell <ian.campbell@citrix.com>
Dario Faggioli [Thu, 3 Oct 2013 17:45:38 +0000 (19:45 +0200)]
xl: update the manpage about "cpus=" and NUMA node-affinity
Since d06b1bf169a01a9c7b0947d7825e58cb455a0ba5 ('libxl: automatic placement
deals with node-affinity') it is no longer true that, if no "cpus=" option
is specified, xl picks up some pCPUs by default and pins the domain there.
In fact, it is the NUMA node-affinity that is affected by automatic
placement, not vCPU to pCPU pinning.
Update the xl config file documentation accordingly, as it seems to have
been forgotten at that time.
Signed-off-by: Dario Faggioli <dario.faggioli@citrix.com>
Acked-by: Ian Jackson <ian.jackson@eu.citrix.com>
Andrew Cooper [Thu, 10 Oct 2013 11:23:10 +0000 (12:23 +0100)]
tools/migrate: Fix regression when migrating from older version of Xen
Commit 00a4b65f8534c9e6521eab2e6ce796ae36037774 Sep 7 2010
"libxc: provide notification of final checkpoint to restore end"
broke migration from any version of Xen using tools from prior to that commit.
Older tools have no idea about an XC_SAVE_ID_LAST_CHECKPOINT, causing newer
tools' xc_domain_restore() to start reading the qemu save record, as
ctx->last_checkpoint is 0.
The failure looks like:
xc: error: Max batch size exceeded (1970103633). Giving up.
where 1970103633 = 0x756d6551 = *(uint32_t*)"Qemu"
With this fix in place, the behaviour for normal migrations is reverted to how
it was before the regression; the migration is considered non-checkpointed
right from the start. A XC_SAVE_ID_LAST_CHECKPOINT chunk seen in the
migration stream is a nop. For checkpointed migrations the behaviour is
unchanged.
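The bogus batch size quoted in the error message can be checked independently. This is an illustrative helper (the name is invented) that assembles the first four bytes of a string as a little-endian 32-bit value, regardless of the host's own byte order.

```c
#include <stdint.h>

/* Assemble four bytes as a little-endian 32-bit value, independent of the
 * host's byte order. */
uint32_t le32_from_bytes(const char *s)
{
    const unsigned char *b = (const unsigned char *)s;

    return (uint32_t)b[0] | ((uint32_t)b[1] << 8) |
           ((uint32_t)b[2] << 16) | ((uint32_t)b[3] << 24);
}
```

Applied to "Qemu" this gives 0x756d6551, i.e. 1970103633: the restore side was interpreting the start of the qemu record as a batch size.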
Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Acked-by: Ian Campbell <Ian.Campbell@citrix.com>
CC: Ian Jackson <Ian.Jackson@eu.citrix.com>
Acked-by: Shriram Rajagopalan <rshriram@cs.ubc.ca> (Remus bits)
Fabio Fantoni [Fri, 27 Sep 2013 14:00:46 +0000 (16:00 +0200)]
tools: adds tracer on qemu-xen debug configure options
When building tools in debug mode (debug=y), pass also
--enable-trace-backend=stderr when configuring qemu-xen.
Useful to improve debug.
Signed-off-by: Fabio Fantoni <fabio.fantoni@m2r.biz>
Acked-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
Julien Grall [Mon, 7 Oct 2013 14:44:35 +0000 (15:44 +0100)]
xen/arm32: Call start_xen only on the boot CPU
The boot CPU can have a CPU ID non-equal to zero. Xen needs to check the
logical CPU ID (in r12) to know if the CPU is the boot one.
Signed-off-by: Julien Grall <julien.grall@linaro.org>
Acked-by: Ian Campbell <ian.campbell@citrix.com>
Anthony PERARD [Tue, 8 Oct 2013 12:59:57 +0000 (13:59 +0100)]
qemu-xen: Set localstatedir to /var.
This path is used by the QEMU build system to create the /run directory.
If local-state-dir is not set, the result becomes $prefix/var, which
is not an acceptable path.
Signed-off-by: Anthony PERARD <anthony.perard@citrix.com>
Acked-by: Ian Campbell <ian.campbell@citrix.com>
Anthony PERARD [Tue, 8 Oct 2013 12:59:56 +0000 (13:59 +0100)]
qemu-xen: Disabling build of guest-agent.
It is not used when QEMU is run with Xen.
Signed-off-by: Anthony PERARD <anthony.perard@citrix.com>
Acked-by: Ian Campbell <ian.campbell@citrix.com>
Andrew Cooper [Wed, 9 Oct 2013 10:11:48 +0000 (12:11 +0200)]
hvm/viridian: Avoid printing page_to_mfn(NULL) on error paths
While working in the viridian code, I noticed that 4cb6c4f4941
"x86/hvm: Use get_page_from_gfn() instead of get_gfn()/put_gfn."
introduced two error paths where page_to_mfn(NULL) would be formatted and
presented as a bad MFN. This provides junk in the warning rather than
something useful.
These two codepaths are fixed up to match their counterpart in
wrmsr_hypervisor_regs().
While auditing the other changes from 4cb6c4f4941, I noticed a small
optimisation which could be made by changing the order of the validity checks
to remove 6 NULL pointer checks.
Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Acked-by: Keir Fraser <keir@xen.org>
Andrew Cooper [Wed, 9 Oct 2013 10:10:46 +0000 (12:10 +0200)]
x86/traps: improvements to {rd,wr}msr_hypervisor_regs()
Coverity ID: 1055249 1055250
Coverity was complaining that the switch statements contained dead code in
their default statements. While this is quite minor, the code flow in
wrmsr_hypervisor_regs() was sufficiently opaque that I felt it appropriate to
fix.
Other improvements include:
* not shadowing the function parameter 'idx'.
* use of PAGE_{SHIFT,SIZE} instead of opencoded numbers.
* a more descriptive error message for attempting to write invalid indices
for hypercall pages.
There is no behavioural change as a result.
Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Julien Grall [Tue, 8 Oct 2013 16:48:33 +0000 (17:48 +0100)]
xen/x86: Remove GB macro in asm-x86/config.h
Commit 983843e "xen: Add macros MB and GB" introduced a generic GB macro.
By mistake, the macro in asm-x86/config.h was not removed. This results in
a compilation error when Xen is built for x86.
Signed-off-by: Julien Grall <julien.grall@linaro.org>
CC: Keir Fraser <keir@xen.org>
CC: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
Julien Grall [Fri, 27 Sep 2013 16:56:37 +0000 (17:56 +0100)]
xen/dts: Support Linux initrd DT bindings
Linux uses the property linux,initrd-start and linux,initrd-end to know where
the initrd lives in memory.
Signed-off-by: Julien Grall <julien.grall@linaro.org>
Acked-by: Ian Campbell <ian.campbell@citrix.com>
Julien Grall [Fri, 27 Sep 2013 16:56:36 +0000 (17:56 +0100)]
xen/arm: Add support to load initrd in dom0
Signed-off-by: Julien Grall <julien.grall@linaro.org>
Acked-by: Ian Campbell <ian.campbell@citrix.com>
Julien Grall [Fri, 27 Sep 2013 16:56:35 +0000 (17:56 +0100)]
xen/dts: Use ROUNDUP macro instead of the internal ALIGN
Signed-off-by: Julien Grall <julien.grall@linaro.org>
Acked-by: Ian Campbell <ian.campbell@citrix.com>
Julien Grall [Fri, 27 Sep 2013 16:56:34 +0000 (17:56 +0100)]
xen: Add macro ROUNDUP
Signed-off-by: Julien Grall <julien.grall@linaro.org>
Acked-by: Keir Fraser <keir@xen.org>
CC: Jan Beulich <jbeulich@suse.com>
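For illustration, a rounding macro of this kind is commonly written with the
power-of-two mask idiom. The definition below is a sketch under that
assumption and is not necessarily the one this commit adds; round_to_page()
and the 4096-byte page size are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Illustrative definition only: valid when 'a' is a power of two. */
#define ROUNDUP(x, a) (((x) + (a) - 1) & ~((a) - 1))

/* Hypothetical helper: round a byte count up to a 4 KiB boundary. */
static inline uint64_t round_to_page(uint64_t bytes)
{
    const uint64_t page = 4096;

    /* The mask trick only works for power-of-two alignments. */
    assert((page & (page - 1)) == 0);
    return ROUNDUP(bytes, page);
}
```

With this definition, values already on a boundary are returned unchanged and
everything else moves up to the next multiple of the alignment.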
Julien Grall [Fri, 27 Sep 2013 16:56:33 +0000 (17:56 +0100)]
xen: Add macros MB and GB
Signed-off-by: Julien Grall <julien.grall@linaro.org>
Acked-by: Keir Fraser <keir@xen.org>
CC: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
Andrew Cooper [Tue, 8 Oct 2013 09:09:22 +0000 (11:09 +0200)]
x86/HPET: basic cleanup
* Strip trailing whitespace
* Remove redundant definitions
* Update stale documentation links
* Move hpet_address into __initdata
Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Andrew Cooper [Tue, 8 Oct 2013 09:06:48 +0000 (11:06 +0200)]
VT-d: fix suspected data race condition in iommu_set_root_entry()
Coverity ID: 1054967
Coverity spotted that iommu->root_maddr was optionally allocated within the
protection of the iommu->lock, but was referenced with the protection of the
iommu->register_lock, and freed without any lock.
Luckily, the code as-is is not vulnerable to the potential risks identified.
However, the alloc_pgtable_maddr() is far more appropriately done in
iommu_alloc(), removing a set of spinlock calls, and a possibility for the
iommu setup to fail later than iommu_alloc() with an -ENOMEM.
Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Acked-by: Xiantao Zhang <xiantao.zhang@intel.com>
Jan Beulich [Mon, 7 Oct 2013 07:42:51 +0000 (09:42 +0200)]
libxc: add LZ4 decompression support
Since there's no shared or static library to link against, this simply
re-uses the hypervisor side code. However, I only audited the code
added here for possible security issues, not the referenced code in
the hypervisor tree.
Signed-off-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Keir Fraser <keir@xen.org>
Acked-by: Ian Campbell <ian.campbell@citrix.com>
Kyungsik Lee [Mon, 7 Oct 2013 07:40:35 +0000 (09:40 +0200)]
xen: add LZ4 decompression support
Add support for LZ4 decompression in Xen. LZ4 Decompression APIs for
Xen are based on LZ4 implementation by Yann Collet.
Benchmark Results(PATCH v3)
Compiler: Linaro ARM gcc 4.6.2
1. ARMv7, 1.5GHz based board
   Kernel: linux 3.4
   Uncompressed Kernel Size: 14MB
        Compressed Size  Decompression Speed
   LZO  6.7MB            20.1MB/s, 25.2MB/s(UA)
   LZ4  7.3MB            29.1MB/s, 45.6MB/s(UA)
2. ARMv7, 1.7GHz based board
   Kernel: linux 3.7
   Uncompressed Kernel Size: 14MB
        Compressed Size  Decompression Speed
   LZO  6.0MB            34.1MB/s, 52.2MB/s(UA)
   LZ4  6.5MB            86.7MB/s
   - UA: Unaligned memory Access support
   - Latest patch set for LZO applied
This patch set is for adding support for LZ4-compressed Kernel. LZ4 is a
very fast lossless compression algorithm and it also features an extremely
fast decoder [1].
But we already have five decompressors, which raises the question of
where we stop adding new ones. This issue has been discussed and a
conclusion was reached [2].
Russell King said that we should have:
- one decompressor which is the fastest
- one decompressor for the highest compression ratio
- one popular decompressor (eg conventional gzip)
If we have a replacement one for one of these, then it should do exactly
that: replace it.
The benchmark shows an 8% increase in image size against a 66% increase
in decompression speed compared to LZO (which has been known as the
fastest decompressor in the Kernel). Therefore the "fast but may not be
small" compression title has clearly been taken by LZ4 [3].
[1] http://code.google.com/p/lz4/
[2] http://thread.gmane.org/gmane.linux.kbuild.devel/9157
[3] http://thread.gmane.org/gmane.linux.kbuild.devel/9347
LZ4 homepage: http://fastcompression.blogspot.com/p/lz4.html
LZ4 source repository: http://code.google.com/p/lz4/
Signed-off-by: Kyungsik Lee <kyungsik.lee@lge.com>
Signed-off-by: Yann Collet <yann.collet.73@gmail.com>
Signed-off-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Keir Fraser <keir@xen.org>
Andrew Cooper [Fri, 4 Oct 2013 10:58:20 +0000 (12:58 +0200)]
x86: Improve information from domain_crash_synchronous
As it currently stands, the string "domain_crash_sync called from entry.S" is
not helpful at identifying why the domain was crashed, and a debug build of
Xen doesn't help the matter.
This patch improves the information printed, by pointing to where the crash
decision was made.
Specific improvements include:
* Moving the ascii string "domain_crash_sync called from entry.S\n" away from
some semi-hot code cache lines.
* Moving the printk into C code (especially as this_cpu() is miserable to use
in assembly code)
* Undo the previous confusing situation of having the
domain_crash_synchronous() as a macro in C code, yet a global symbol in
assembly code.
Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Acked-by: Keir Fraser <keir@xen.org>
Andrew Cooper [Fri, 4 Oct 2013 10:57:43 +0000 (12:57 +0200)]
x86/traps: Record last extable faulting address
... so the following patch can identify the location of faults leading to a
decision to crash a domain.
Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Acked-by: Keir Fraser <keir@xen.org>
Konrad Rzeszutek Wilk [Fri, 4 Oct 2013 10:54:38 +0000 (12:54 +0200)]
x86: allow HVM guests to make console_io hypercall
The console_io hypercall is provided for PV guests and for HVM
guests it is done via the 0xe9 port. However the PV hypercall
is more efficient as it takes a string rather than one character
per write.
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Daniel De Graaf [Fri, 4 Oct 2013 10:52:56 +0000 (12:52 +0200)]
xsm: clean up unneeded current references
Some XSM hooks in dummy.h used current->domain when this was also passed
as a parameter; use the parameter in these cases. There are two hooks
where this does not apply and which are not immediately obvious:
xsm_set_target's parameters are the device model and HVM domains, and
xsm_mem_sharing_op's first parameter is the source of the shared page,
not the domain making the hypercall.
Reported-by: Jan Beulich <jbeulich@suse.com>
Signed-off-by: Daniel De Graaf <dgdegra@tycho.nsa.gov>
Daniel De Graaf [Fri, 4 Oct 2013 10:51:44 +0000 (12:51 +0200)]
xsm: forbid PV guest console reads
The CONSOLEIO_read operation was incorrectly allowed to PV guests if the
hypervisor was compiled in debug mode (with VERBOSE defined).
Reported-by: Jan Beulich <jbeulich@suse.com>
Signed-off-by: Daniel De Graaf <dgdegra@tycho.nsa.gov>
Jan Beulich [Fri, 4 Oct 2013 10:32:25 +0000 (12:32 +0200)]
x86: make hvm_cpuid() tolerate NULL pointers
Now that other HVM code started making more extensive use of
hvm_cpuid(), let's not force every caller to declare dummy variables
for output not cared about.
Signed-off-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
Acked-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Acked-by: Jun Nakajima <jun.nakajima@intel.com>
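A minimal sketch of the pattern described above: an output-parameter function
that skips NULL pointers, so a caller interested in one register does not need
dummy variables for the rest. query_leaf() and leaf_edx() are hypothetical
stand-ins with made-up register values; this does not reproduce hvm_cpuid()
itself.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for a CPUID-style query; the values are made up. */
static void query_leaf(uint32_t leaf, uint32_t *eax, uint32_t *ebx,
                       uint32_t *ecx, uint32_t *edx)
{
    const uint32_t regs[4] = { leaf, leaf + 1, leaf + 2, leaf + 3 };

    /* Only store the outputs the caller actually asked for. */
    if (eax)
        *eax = regs[0];
    if (ebx)
        *ebx = regs[1];
    if (ecx)
        *ecx = regs[2];
    if (edx)
        *edx = regs[3];
}

/* A caller that cares about a single register passes NULL for the others. */
static uint32_t leaf_edx(uint32_t leaf)
{
    uint32_t edx;

    query_leaf(leaf, NULL, NULL, NULL, &edx);
    return edx;
}
```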
Yang Zhang [Fri, 4 Oct 2013 10:30:09 +0000 (12:30 +0200)]
Nested VMX: fix IA32_VMX_CR4_FIXED1 msr emulation
Currently, a hardcoded value is used for IA32_VMX_CR4_FIXED1. This is wrong.
We should check the guest's cpuid to know which bits are writeable in CR4 by the
guest, and allow the guest to set a bit only when it has the corresponding feature.
Signed-off-by: Yang Zhang <yang.z.zhang@Intel.com>
Cleanup.
Signed-off-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Jun Nakajima <jun.nakajima@intel.com>
Jan Beulich [Fri, 4 Oct 2013 10:29:08 +0000 (12:29 +0200)]
VMX: clean up capability checks
VMCS size validation on APs should check against BP's size.
No need for a separate cpu_has_vmx_ins_outs_instr_info variable
anymore.
Use proper symbolics.
Signed-off-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Jun Nakajima <jun.nakajima@intel.com>
Yang Zhang [Fri, 4 Oct 2013 10:28:14 +0000 (12:28 +0200)]
Nested VMX: check VMX capability before read VMX related MSRs
VMX MSRs are only available when the CPU supports the VMX feature. In addition,
VMX_TRUE* MSRs are only available when bit 55 of the VMX_BASIC MSR is set.
Signed-off-by: Yang Zhang <yang.z.zhang@Intel.com>
Cleanup.
Signed-off-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Jun Nakajima <jun.nakajima@intel.com>
Andrew Cooper [Fri, 4 Oct 2013 10:24:34 +0000 (12:24 +0200)]
x86/percpu: Force INVALID_PERCPU_AREA into the non-canonical address region
This causes accidental uses of per_cpu() on a pcpu with an INVALID_PERCPU_AREA
to result in a #GP for attempting to access the middle of the non-canonical
virtual address region.
This is preferable to the current behaviour, where incorrect use of per_cpu()
will result in an effective NULL structure dereference which has security
implications in the context of PV guests.
Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Acked-by: Keir Fraser <keir@xen.org>
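To illustrate why a poison value in the non-canonical region is safer than one
near zero, the sketch below assumes 48-bit virtual addresses, where an address
is canonical only if bits 63..47 are all equal. POISON_BASE is a hypothetical
constant chosen in the middle of the non-canonical hole; it is not claimed to
be the value Xen uses.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumption: 48 implemented virtual address bits. */
#define VA_BITS 48

/* Hypothetical poison base in the middle of the non-canonical hole. */
#define POISON_BASE 0x8000000000000000ULL

/* Canonical means bits 63..47 are all zeros or all ones (17 bits). */
static bool is_canonical(uint64_t va)
{
    uint64_t top = va >> (VA_BITS - 1);

    return top == 0 || top == 0x1FFFF;
}

/* Address formed by adding a per-cpu variable's offset to an area base. */
static uint64_t percpu_addr(uint64_t base, uint64_t field_offset)
{
    return base + field_offset;
}
```

With a base near zero, a small field offset yields a canonical low address
that may be mapped; with the poison base, the same offset stays in the hole,
so the access faults instead.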
Andrew Cooper [Fri, 4 Oct 2013 10:23:23 +0000 (12:23 +0200)]
x86/idle: Fix get_cpu_idle_time()'s interaction with offline pcpus
Checking for "idle_vcpu[cpu] != NULL" is insufficient protection against
offline pcpus. From a hypercall, vcpu_runstate_get() will determine "v !=
current", and try to take the vcpu_schedule_lock(). This will try to look up
per_cpu(schedule_data, v->processor) and promptly suffer a NULL structure
dereference as v->processor's __per_cpu_offset is INVALID_PERCPU_AREA.
One example might look like this:
...
Xen call trace:
   [<ffff82c4c0126ddb>] vcpu_runstate_get+0x50/0x113
   [<ffff82c4c0126ec6>] get_cpu_idle_time+0x28/0x2e
   [<ffff82c4c012b5cb>] do_sysctl+0x3db/0xeb8
   [<ffff82c4c023280d>] compat_hypercall+0xbd/0x116
Pagetable walk from 0000000000000040:
 L4[0x000] = 0000000186df8027 0000000000028207
 L3[0x000] = 0000000188e36027 00000000000261c9
 L2[0x000] = 0000000000000000 ffffffffffffffff
****************************************
Panic on CPU 11:
...
get_cpu_idle_time() has been updated to correctly deal with offline pcpus
itself by returning 0, in the same way as it would if it was missing the
idle_vcpu[] pointer.
In doing so, XENPF_getidletime needed updating to correctly retain its
described behaviour of clearing bits in the cpumap for offline pcpus.
As this crash can only be triggered with toolstack hypercalls, it is not a
security issue and just a simple bug.
Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Acked-by: Keir Fraser <keir@xen.org>
Matthew Daley [Sun, 29 Sep 2013 05:47:37 +0000 (18:47 +1300)]
libxl: correctly handle libxl_get_cpu_topology failure in libxl_{cpu, node}map_to_{node, cpu}map
Initialize nr_cpus to 0 so that if it is unchanged by a failing
libxl_get_cpu_topology, libxl_cputopology_list_free still works OK
afterward.
Coverity-ID: 1055294
Coverity-ID: 1055295
Signed-off-by: Matthew Daley <mattjd@gmail.com>
Acked-by: Dario Faggioli <dario.faggioli@citrix.com>
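The failure mode can be sketched as follows, assuming a getter that leaves the
count untouched on failure and a free routine that walks that many entries.
All names and types here are hypothetical stand-ins, not the libxl API.

```c
#include <stdlib.h>

/* Hypothetical stand-in for a topology entry. */
typedef struct { int core; } topo_entry;

/* On failure returns NULL and leaves *nr untouched. */
static topo_entry *get_topology(int *nr, int fail)
{
    if (fail)
        return NULL;
    *nr = 2;
    return calloc(2, sizeof(topo_entry));
}

static void topology_list_free(topo_entry *list, int nr)
{
    for (int i = 0; i < nr; i++)
        list[i].core = 0;   /* per-entry cleanup touches list[i] */
    free(list);             /* free(NULL) is a no-op */
}

static int map_cpus(int fail)
{
    int nr = 0;             /* known count even if the getter fails */
    topo_entry *t = get_topology(&nr, fail);
    int rc = t ? 0 : -1;

    topology_list_free(t, nr);
    return rc;
}
```

Without the initialisation, a failed call would leave the count indeterminate
and the cleanup loop could index into a NULL list.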
Stefano Stabellini [Mon, 30 Sep 2013 12:06:12 +0000 (13:06 +0100)]
xen/arm: map_domain_page: reuse slots with avail == 0
If a slot has avail == 0 but still points to the right mfn, reuse it.
Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
Acked-by: Tim Deegan <tim@xen.org>
Acked-by: Ian Campbell <ian.campbell@citrix.com>
Matthew Daley [Sun, 29 Sep 2013 05:24:36 +0000 (18:24 +1300)]
libxl: only put poller if already gotten in libxl_event_wait
Coverity-ID: 1055292
Signed-off-by: Matthew Daley <mattjd@gmail.com>
Acked-by: Ian Campbell <ian.campbell@citrix.com>
Matthew Daley [Sun, 29 Sep 2013 01:35:02 +0000 (14:35 +1300)]
libxc: only munmap when something has actually been mapped in change_pte
Coverity-ID: 1055269
Signed-off-by: Matthew Daley <mattjd@gmail.com>
Acked-by: Ian Campbell <ian.campbell@citrix.com>
Zhu Yanhai [Mon, 30 Sep 2013 08:12:10 +0000 (16:12 +0800)]
xm-test: fix the ip allocation function
__findFirstOctetIP() is expecting min and max available octets according to
its code, however the caller getFreeIP() gives it the min octet and (max -
min + 1), which is the length instead.
Signed-off-by: Zhu Yanhai <gaoyang.zyh@taobao.com>
Acked-by: Ian Campbell <ian.campbell@citrix.com>
Julien Grall [Fri, 27 Sep 2013 16:49:52 +0000 (17:49 +0100)]
xen/arm32: don't export v7_init
Signed-off-by: Julien Grall <julien.grall@linaro.org>
Acked-by: Ian Campbell <ian.campbell@citrix.com>
Ian Campbell [Fri, 27 Sep 2013 10:16:22 +0000 (11:16 +0100)]
xl: fork before execing vncviewer
Otherwise we don't daemonize to monitor the domain.
Heavily cargo-culted from autoconnect-console and only compile tested.
Signed-off-by: Ian Campbell <ian.campbell@citrix.com>
Tested-by: Olaf Hering <olaf@aepfle.de>
Matthew Daley [Fri, 27 Sep 2013 11:29:10 +0000 (23:29 +1200)]
libxl: handle null lists in libxl_string_list_length
After commit b0be2b12 ("libxl: fix libxl_string_list_length and its only
caller") libxl_string_list_length no longer handles null (empty) lists. Fix
so they are handled, returning length 0.
While at it, remove the unnecessary undereferenced null pointer check
and tidy the layout of the function.
Reported-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Signed-off-by: Matthew Daley <mattjd@gmail.com>
Acked-by: Ian Campbell <ian.campbell@citrix.com>
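The behaviour described above can be sketched as below, assuming the list is a
NULL-terminated array of strings and that a NULL list pointer denotes an empty
list. string_list_length() is an illustrative function, not the libxl
implementation.

```c
#include <stddef.h>

/* Illustrative only: count entries in a NULL-terminated string array. */
static int string_list_length(char *const *list)
{
    int n = 0;

    /* A NULL list is treated as empty and yields length 0. */
    if (list)
        while (list[n])
            n++;
    return n;
}
```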
Jan Beulich [Mon, 30 Sep 2013 13:28:12 +0000 (15:28 +0200)]
x86: don't blindly create L3 tables for the direct map
Now that the direct map area can extend all the way up to almost the
end of address space, this is wasteful.
Also fold two almost redundant messages in SRAT parsing into one.
Signed-off-by: Jan Beulich <jbeulich@suse.com>
Tested-by: Malcolm Crossley <malcolm.crossley@citrix.com>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
Acked-by: Keir Fraser <keir@xen.org>
Jan Beulich [Mon, 30 Sep 2013 12:18:58 +0000 (14:18 +0200)]
x86: properly set up fbld emulation operand address
This is CVE-2013-4361 / XSA-66.
Signed-off-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Ian Jackson <ian.jackson@eu.citrix.com>
Tim Deegan [Mon, 30 Sep 2013 12:18:25 +0000 (14:18 +0200)]
x86/mm/shadow: Fix initialization of PV shadow L4 tables.
Shadowed PV L4 tables must have the same Xen mappings as their
unshadowed equivalent. This is done by copying the Xen entries
verbatim from the idle pagetable, and then using guest_l4_slot()
in the SHADOW_FOREACH_L4E() iterator to avoid touching those entries.
adc5afbf1c70ef55c260fb93e4b8ce5ccb918706 (x86: support up to 16Tb)
changed the definition of ROOT_PAGETABLE_XEN_SLOTS to extend right to
the top of the address space, which causes the shadow code to
copy Xen mappings into guest-kernel-address slots too.
In the common case, all those slots are zero in the idle pagetable,
and no harm is done. But if any slot above #271 is non-zero, Xen will
crash when that slot is later cleared (it attempts to drop
shadow-pagetable refcounts on its own L4 pagetables).
Fix by using the new ROOT_PAGETABLE_PV_XEN_SLOTS when appropriate.
Monitor pagetables need the full Xen mappings, so they keep using the
old name (with its new semantics).
This is CVE-2013-4356 / XSA-64.
Signed-off-by: Tim Deegan <tim@xen.org>
Reviewed-by: Jan Beulich <jbeulich@suse.com>