xen.git
12 years ago libxc: fix memory leak in move_l3_below_4G error handling
Matthew Daley [Wed, 30 Oct 2013 07:51:39 +0000 (20:51 +1300)]
libxc: fix memory leak in move_l3_below_4G error handling

...otherwise mmu gets leaked.

Coverity-ID: 1055844
Signed-off-by: Matthew Daley <mattjd@gmail.com>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
Acked-by: Ian Campbell <ian.campbell@citrix.com>
12 years ago libxc: fix retrieval of and remove pointless check on gzip size
Matthew Daley [Thu, 31 Oct 2013 02:58:53 +0000 (15:58 +1300)]
libxc: fix retrieval of and remove pointless check on gzip size

Coverity-ID: 1055587
Coverity-ID: 1055963
Signed-off-by: Matthew Daley <mattjd@gmail.com>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
Acked-by: Ian Campbell <ian.campbell@citrix.com>
12 years ago xencommons: write domain 0's domid to xenstore
Matthew Daley [Thu, 31 Oct 2013 06:03:55 +0000 (19:03 +1300)]
xencommons: write domain 0's domid to xenstore

libvchan's init_xs_srv (server-side xenstore-related initialization)
expects to find the current domain's domid at this xenstore key. libxl
(and xend) write this for domains they create. Do the same for domain 0,
allowing the use of libvchan in dom0.

Signed-off-by: Matthew Daley <mattjd@gmail.com>
Acked-by: Roger Pau Monné <roger.pau@citrix.com>
Acked-by: Ian Campbell <ian.campbell@citrix.com>
12 years ago libxl: HVM domain S3 bugfix
Liu Jinsong [Fri, 23 Aug 2013 15:30:23 +0000 (23:30 +0800)]
libxl: HVM domain S3 bugfix

Currently Xen HVM S3 has a bug arising from a difference between
qemu-traditional and qemu-xen. For qemu-traditional, the way to
resume from HVM S3 is via the 'xl trigger' command. For qemu-xen,
however, the way to resume from HVM S3 is inherited from standard
qemu, i.e. via QMP, and it doesn't work under Xen.

The root cause is that, for qemu-xen, the 'xl trigger' command didn't
reset devices, while QMP didn't unpause the HVM domain even though it
did perform a qemu system reset.

There are two qemu patches and one xl patch to fix the HVM S3 bug.
This patch is the xl patch. It invokes QMP system_wakeup so that
the qemu logic for HVM S3 can be triggered.

Signed-off-by: Liu Jinsong <jinsong.liu@intel.com>
Acked-by: Ian Campbell <ian.campbell@citrix.com>
12 years ago tools: remove unnecessary null pointer checks before frees
Matthew Daley [Tue, 15 Oct 2013 05:18:02 +0000 (18:18 +1300)]
tools: remove unnecessary null pointer checks before frees

Patch generated by the following semantic patch
(http://coccinelle.lip6.fr/):

@@
expression *P;
@@

- if(P) free(P);
+ free(P);

...and then by filtering through the following command:

filterdiff -p1 -x 'stubdom/*' -x 'tools/firmware/*' -x 'tools/qemu-*' -x 'tools/blktap*'

Signed-off-by: Matthew Daley <mattjd@gmail.com>
Acked-by: Samuel Thibault <samuel.thibault@ens-lyon.org>
Acked-by: Ian Campbell <ian.campbell@citrix.com>
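
The semantic patch in the commit above is safe because the C standard defines free(NULL) as a no-op. A minimal sketch of the resulting style follows; struct config and config_dispose are hypothetical names used only for illustration and do not come from the Xen tree.

```c
#include <stdlib.h>

/* Hypothetical holder with an optional heap-allocated field. */
struct config {
    char *name; /* may be NULL */
};

/* Release the optional field without a NULL guard:
 * free(NULL) is defined to do nothing, so "if (p) free(p);" is redundant. */
void config_dispose(struct config *c)
{
    free(c->name);
    c->name = NULL;
}
```

Calling config_dispose() on a structure whose name field was never allocated is therefore as safe as calling it on a populated one.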
12 years ago libxl: ocaml: add META to list of generated files in Makefile
Rob Hoes [Mon, 21 Oct 2013 13:32:31 +0000 (14:32 +0100)]
libxl: ocaml: add META to list of generated files in Makefile

Signed-off-by: Rob Hoes <rob.hoes@citrix.com>
Acked-by: Ian Campbell <ian.campbell@citrix.com>
Acked-by: Ian Jackson <ian.jackson@eu.citrix.com>
12 years ago libxl: idl: allow KeyedUnion members to be empty
Rob Hoes [Fri, 4 Oct 2013 15:58:19 +0000 (16:58 +0100)]
libxl: idl: allow KeyedUnion members to be empty

This is useful when the key enum has an "invalid" option and avoids
the need to declare a dummy struct. Use this for domain_build_info
resulting in the generated API changing like so:
    --- tools/libxl/_libxl_BACKUP_types.h
    +++ tools/libxl/_libxl_types.h
    @@ -377,8 +377,6 @@ typedef struct libxl_domain_build_info {
                 const char * features;
                 libxl_defbool e820_host;
             } pv;
    -        struct {
    -        } invalid;
         } u;
     } libxl_domain_build_info;
     void libxl_domain_build_info_dispose(libxl_domain_build_info *p);

+ a related change to the JSON generation.

Signed-off-by: Ian Campbell <ian.campbell@citrix.com>
Signed-off-by: Rob Hoes <rob.hoes@citrix.com>
Acked-by: Ian Jackson <ian.jackson@eu.citrix.com>
12 years ago Revert "VMX: Eliminate cr3 store/load vmexit when UG enabled"
Jan Beulich [Wed, 30 Oct 2013 15:20:29 +0000 (16:20 +0100)]
Revert "VMX: Eliminate cr3 store/load vmexit when UG enabled"

This reverts commit c9efe34c119418a5ac776e5d91aeefcce4576518. It
doesn't work right on non-UG hardware.

12 years ago tools: xenstored: if the reply is too big then send E2BIG error
Ian Jackson [Tue, 29 Oct 2013 15:45:53 +0000 (15:45 +0000)]
tools: xenstored: if the reply is too big then send E2BIG error

This fixes the issue for both C and ocaml xenstored, however only the ocaml
xenstored is vulnerable in its default configuration.

Adding a new error appears to be safe, since both libxenstore and the Linux
driver at least treat an unknown error code as EINVAL.

This is XSA-72 / CVE-2013-4416.

Original ocaml patch by Jerome Maloberti <jerome.maloberti@citrix.com>
Signed-off-by: Ian Campbell <ian.campbell@citrix.com>
Signed-off-by: Thomas Sanders <thomas.sanders@citrix.com>
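
The behaviour described in the commit above can be sketched as a size check made before a reply is queued. This is only an illustration: PAYLOAD_MAX and check_reply_size are assumed names, and the 4096-byte limit is an assumption about the protocol's maximum payload, not something stated in the commit message.

```c
#include <errno.h>
#include <stddef.h>

/* Assumed protocol limit on a single message payload, in bytes. */
#define PAYLOAD_MAX 4096

/*
 * Decide what to do with a reply of reply_len bytes.
 * Returns 0 if the reply fits and can be sent as-is, or E2BIG if an
 * error reply should be sent instead of the oversized payload.
 */
int check_reply_size(size_t reply_len)
{
    return reply_len > PAYLOAD_MAX ? E2BIG : 0;
}
```

A client that does not know E2BIG would, per the commit message, fall back to treating it as EINVAL.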
12 years ago fix locking in cpu_disable_scheduler()
Jan Beulich [Tue, 29 Oct 2013 08:57:14 +0000 (09:57 +0100)]
fix locking in cpu_disable_scheduler()

So commit eedd6039 ("scheduler: adjust internal locking interface")
uncovered - by now using proper spin lock constructs - a bug after all:
When bringing down a CPU, cpu_disable_scheduler() gets called with
interrupts disabled, and hence the use of vcpu_schedule_lock_irq() was
never really correct (i.e. the caller ended up with interrupts enabled
despite having disabled them explicitly).

Fixing this however surfaced another problem: The call path
vcpu_migrate() -> evtchn_move_pirqs() wants to acquire the event lock,
which however is a non-IRQ-safe one, and hence check_lock() doesn't
like this lock to be acquired when interrupts are already off. As we're
in stop-machine context here, getting things wrong wrt interrupt state
management during lock acquire/release is out of the question though, so
the simple solution to this appears to be to just suppress spin lock
debugging for the period of time while the stop machine callback gets
run.

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
Acked-by: Keir Fraser <keir@xen.org>
12 years ago VMX: Eliminate cr3 store/load vmexit when UG enabled
Yang Zhang [Tue, 29 Oct 2013 08:55:23 +0000 (09:55 +0100)]
VMX: Eliminate cr3 store/load vmexit when UG enabled

With the feature of unrestricted guest, Xen should not cause
vmexits for cr3 accesses in non-paging mode.

Signed-off-by: Yang Zhang <yang.z.zhang@Intel.com>
Acked-by: Jun Nakajima <jun.nakajima@intel.com>
12 years ago xl: don't emit misleading daemon pid message
Matthew Daley [Sun, 27 Oct 2013 06:49:14 +0000 (19:49 +1300)]
xl: don't emit misleading daemon pid message

After creating a domain, xl forks off a process to handle domain events
(shutdown, disk eject, ...). It prints out the pid of the process
created by the fork to stdout. However, the newly forked process soon
after calls daemon(), which itself forks once more (and exit()s the
original process). This means that the pid printed out is not the pid of
the actual process which remains in the background after all is said and
done, instead it is the pid of the transient process that exists between
xl's fork() and the fork'd process's daemon() call.

We could resolve this by printing the correct pid, i.e. by open-coding
daemon() (we already do most of the heavy lifting it does ourselves by
fiddling with the standard fds). However, since no-one seems to be
complaining about the misleading message to begin with, and since it
seems like a pointless message anyway, just remove it outright instead.

Signed-off-by: Matthew Daley <mattjd@gmail.com>
Acked-by: Ian Jackson <ian.jackson@eu.citrix.com>
12 years ago libxenstore: Use PTHREAD_STACK_MIN
Ian Campbell [Fri, 25 Oct 2013 07:47:35 +0000 (08:47 +0100)]
libxenstore: Use PTHREAD_STACK_MIN

The existing value of 16K is smaller than the arm64 minimum stack size, which
is 128K. PTHREAD_STACK_MIN appears to be standard
http://pubs.opengroup.org/onlinepubs/009695399/functions/pthread_attr_setstacksize.html

Considered setting a lower bound but the stack requirements of the watcher
thread are pretty minimal (tens of bytes from the looks of it) and unlikely to
blow PTHREAD_STACK_MIN on any useful platform.

Signed-off-by: Ian Campbell <ian.campbell@citrix.com>
Acked-by: Ian Jackson <ian.jackson@eu.citrix.com>
12 years ago libxl: remove spurious newline from LOG() message
Ian Campbell [Fri, 25 Oct 2013 07:47:24 +0000 (08:47 +0100)]
libxl: remove spurious newline from LOG() message

The macro already includes this.

Signed-off-by: Ian Campbell <ian.campbell@citrix.com>
Acked-by: Ian Jackson <ian.jackson@eu.citrix.com>
12 years ago libelf: improve errors in elf_xen_note_check()
Andrew Cooper [Mon, 28 Oct 2013 11:30:35 +0000 (11:30 +0000)]
libelf: improve errors in elf_xen_note_check()

I recently debugged an isolated failure to boot, with no information other
than the logs.

The "Will only load images built for the generic loader or Linux images"
string was missing a newline, leading to the subsequent error being appended
to this line, rather than having its own line with correctly identified
function.

Furthermore, error messages which state "param containing $FOO is not $BAR" are
fairly useless for debugging without identifying which bad $FOO caused the
failure.

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
CC: Keir Fraser <keir@xen.org>
CC: Jan Beulich <JBeulich@suse.com>
CC: Ian Campbell <Ian.Campbell@citrix.com>
Acked-by: Ian Jackson <ian.jackson@eu.citrix.com>
12 years ago x86/irq: print direct vector mappings in the 'i' debug key
Andrew Cooper [Mon, 28 Oct 2013 11:01:19 +0000 (12:01 +0100)]
x86/irq: print direct vector mappings in the 'i' debug key

Also adjust the initial print message, as the IRQ loop has contained
non-guest interrupts for a while now.

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
12 years ago x86: refine address validity checks before accessing page tables
Jan Beulich [Mon, 28 Oct 2013 11:00:36 +0000 (12:00 +0100)]
x86: refine address validity checks before accessing page tables

In commit 40d66baa ("x86: correct LDT checks") and d06a0d71 ("x86: add
address validity check to guest_map_l1e()") I didn't really pay
attention to the fact that these checks would better be done before the
paging_mode_translate() ones, as there's also no equivalent check down
the shadow code paths involved here (at least not up to the first use
of the address), and such generic checks shouldn't really be done by
particular backend functions anyway.

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Tim Deegan <tim@xen.org>
12 years ago common/initcall: extern linker symbols with correct types
Andrew Cooper [Mon, 28 Oct 2013 10:58:44 +0000 (11:58 +0100)]
common/initcall: extern linker symbols with correct types

Coverity IDs 1054956, 1054957

Coverity pointed out that we were applying array operations based on an
expression which yielded singleton pointers.  The problem is actually that the
externs were typed incorrectly.

Correct the extern declaration to avoid straying into undefined behaviour
and relying on the lenience of GCC to work.

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Acked-by: Keir Fraser <keir@xen.org>
12 years ago xenctx: fix typo in arm64 output
Ian Campbell [Mon, 7 Oct 2013 16:39:53 +0000 (17:39 +0100)]
xenctx: fix typo in arm64 output

Signed-off-by: Ian Campbell <ian.campbell@citrix.com>
Acked-by: Julien Grall <julien.grall@linaro.org>
12 years ago xen: arm: Ensure HCR_EL2.RW is set correctly when building dom0
Ian Campbell [Thu, 10 Oct 2013 14:43:45 +0000 (15:43 +0100)]
xen: arm: Ensure HCR_EL2.RW is set correctly when building dom0

copy_to_user and friends rely on this, since the address translation
functions (guest VA -> MFN) will truncate VA to the appropriate size.

Signed-off-by: Ian Campbell <ian.campbell@citrix.com>
Acked-by: Julien Grall <julien.grall@linaro.org>
12 years ago xen: arm: correctly round down MFN to 1GB boundary make sure pagetable mask macros...
Ian Campbell [Thu, 10 Oct 2013 14:43:44 +0000 (15:43 +0100)]
xen: arm: correctly round down MFN to 1GB boundary make sure pagetable mask macros as physaddr size

~FIRST_MASK is nothing like correct for rounding down an MFN. It is the
inverse *and* an address not a framenumber so wrong in every dimension! We
cannot use FIRST_MASK since that would mask off any zeroeth level bits.
Instead calculate the correct value from FIRST_SIZE.

Signed-off-by: Ian Campbell <ian.campbell@citrix.com>
Acked-by: Julien Grall <julien.grall@linaro.org>
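
The rounding described in the commit above can be sketched as follows. This is an illustration under stated assumptions, not the code from the patch: PAGE_SHIFT is assumed to be 12 (4 KiB pages), FIRST_SIZE is assumed to be 1 GiB, and mfn_round_down_1gb is a hypothetical helper name.

```c
#include <stdint.h>

#define PAGE_SHIFT  12                   /* assumed: 4 KiB pages */
#define FIRST_SIZE  (UINT64_C(1) << 30)  /* assumed: one first-level entry maps 1 GiB */

/*
 * Round a machine frame number down to a 1 GiB boundary.
 * The mask is built in frame units from FIRST_SIZE, rather than by
 * reusing an address mask, which would be in the wrong units.
 */
uint64_t mfn_round_down_1gb(uint64_t mfn)
{
    uint64_t frames_per_gb = FIRST_SIZE >> PAGE_SHIFT;  /* 0x40000 frames */

    return mfn & ~(frames_per_gb - 1);
}
```

With these assumptions, frame 0x80123 rounds down to 0x80000, the first frame of the 1 GiB region that contains it.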
12 years ago xen: arm: make sure pagetable mask macros have appropriate size
Ian Campbell [Thu, 10 Oct 2013 14:43:43 +0000 (15:43 +0100)]
xen: arm: make sure pagetable mask macros have appropriate size

{ZEROETH,FIRST,SECOND,THIRD}_MASK are used with physical addresses which may
be larger than 32 bits. Therefore ensure that they are wide enough by casting
to paddr_t otherwise we may truncate addresses on 32-bit.

Signed-off-by: Ian Campbell <ian.campbell@citrix.com>
Acked-by: Julien Grall <julien.grall@linaro.org>
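
The truncation that the commit above guards against can be reproduced in a few lines. The names SZ_1G, NARROW_MASK, WIDE_MASK, mask_narrow and mask_wide are illustrative only, and paddr_t is assumed here to be a 64-bit type.

```c
#include <stdint.h>

typedef uint64_t paddr_t;   /* assumed: physical addresses are 64 bits wide */

#define SZ_1G        0x40000000u                 /* unsigned int, 32 bits wide */
#define NARROW_MASK  (~(SZ_1G - 1))              /* 0xC0000000, still 32-bit */
#define WIDE_MASK    (~((paddr_t)SZ_1G - 1))     /* 0xFFFFFFFFC0000000 */

/* The 32-bit mask is zero-extended before the AND, so bits 32 and up are lost. */
paddr_t mask_narrow(paddr_t pa) { return pa & NARROW_MASK; }

/* Casting before the complement keeps the upper address bits intact. */
paddr_t mask_wide(paddr_t pa)   { return pa & WIDE_MASK; }
```

For the address 0x840123456, the narrow mask yields 0x40000000 while the wide mask yields 0x840000000, which is the intended 1 GiB aligned address.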
12 years ago xen: arm: map entire memory banks on arm64
Ian Campbell [Thu, 10 Oct 2013 14:43:42 +0000 (15:43 +0100)]
xen: arm: map entire memory banks on arm64

Currently we only map regions which are not part of boot modules. However we
subsequently free at least some of those modules to the heaps in
discard_initial_modules and if we were unlucky with sizing/location we might
end up adding unmapped pages to the heap.

The heaps on 64-bit use 1GB mappings, so in practice this is probably pretty
unlikely and I've not actually seen it.

Signed-off-by: Ian Campbell <ian.campbell@citrix.com>
Acked-by: Julien Grall <julien.grall@linaro.org>
12 years ago xen: arm: Enable 40 bit addressing in VTCR for arm64
Ian Campbell [Thu, 10 Oct 2013 14:43:41 +0000 (15:43 +0100)]
xen: arm: Enable 40 bit addressing in VTCR for arm64

This requires setting the v8 specific VTCR_EL2.PS field. These bits are
UNK/SBZP on v7.

Also the T0SZ field is described slightly differently for v8, so update the
comment to reflect this.

Signed-off-by: Ian Campbell <ian.campbell@citrix.com>
Acked-by: Julien Grall <julien.grall@linaro.org>
12 years ago xen: correct xenheap_bits after "xen: support RAM at addresses 0 and 4096"
Ian Campbell [Thu, 10 Oct 2013 14:43:40 +0000 (15:43 +0100)]
xen: correct xenheap_bits after "xen: support RAM at addresses 0 and 4096"

This is incorrect after commit 1aac966e24e which shuffled the zones up by one.
I've observed failures on arm64 systems with RAM at 0x8,00000000-0x8,7fffffff
since xenheap_bits ends up as 35 instead of 36 (which is the zone with all the
RAM).

Signed-off-by: Ian Campbell <ian.campbell@citrix.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Keir Fraser <keir@xen.org>
Cc: Tim Deegan <tim@xen.org>
12 years ago xen: arm: fix usage of bootargs for Xen.
Ian Campbell [Mon, 21 Oct 2013 09:21:23 +0000 (10:21 +0100)]
xen: arm: fix usage of bootargs for Xen.

The chosen node's bootargs property should be used for Xen if there is a dom0
kernel multiboot module with a command line, not just if xen,dom0-bootargs is
present.

Signed-off-by: Ian Campbell <ian.campbell@citrix.com>
Acked-by: Julien Grall <julien.grall@linaro.org>
12 years ago xen: arm: correct XEN_COMPILE_ARCH autodetection for arm64
Ian Campbell [Tue, 22 Oct 2013 16:12:14 +0000 (17:12 +0100)]
xen: arm: correct XEN_COMPILE_ARCH autodetection for arm64

At least on aarch64 openSUSE running with qemu-user-aarch64 "uname -m" reports
"aarch64" and not "armv8" so include that in the seddery. There's no harm
leaving the existing armv8 rune too so do so.

Signed-off-by: Ian Campbell <ian.campbell@citrix.com>
Acked-by: Julien Grall <julien.grall@linaro.org>
12 years ago xen/arm: Allocate memory for dom0 from the bottom with the 1:1 Workaround
Julien Grall [Tue, 22 Oct 2013 10:51:48 +0000 (11:51 +0100)]
xen/arm: Allocate memory for dom0 from the bottom with the 1:1 Workaround

On Linux, the option CONFIG_ARM_PATCH_PHYS_VIRT (enabled by default) allows
the kernel to be loaded (nearly) anywhere by patching the phys<->virt
translation at boot time.

The current solution in Linux assumes that the delta (physical address -
virtual address) is always negative. A positive delta defeats the
optimisation of modifying only part of the translation instruction (add/sub).

By default, Xen allocates memory from the top of memory and works
downwards. To avoid boot issues with Linux, we must allocate memory
from the bottom (i.e. starting from 0).

Signed-off-by: Julien Grall <julien.grall@linaro.org>
Acked-by: Ian Campbell <ian.campbell@citrix.com>
12 years ago xen/arm: implement smp initialization callbacks for omap5
Chen Baozi [Tue, 15 Oct 2013 08:45:31 +0000 (16:45 +0800)]
xen/arm: implement smp initialization callbacks for omap5

Signed-off-by: Chen Baozi <baozich@gmail.com>
Acked-by: Julien Grall <julien.grall@linaro.org>
12 years ago xen/arm: fix a typo in comment of PLATFORM_QUIRK_DOM0_MAPPING_11
Chen Baozi [Tue, 15 Oct 2013 08:45:29 +0000 (16:45 +0800)]
xen/arm: fix a typo in comment of PLATFORM_QUIRK_DOM0_MAPPING_11

Signed-off-by: Chen Baozi <baozich@gmail.com>
Acked-by: Julien Grall <julien.grall@linaro.org>
12 years ago netif.h: Add IPv6 related changes
Paul Durrant [Thu, 24 Oct 2013 08:47:50 +0000 (09:47 +0100)]
netif.h: Add IPv6 related changes

My recent patch series to Linux netback added IPv6 checksum
offload and GSO support. This involved making some changes to the
copy of netif.h in Linux.
This patch adds those changes to the canonical copy of netif.h.

Signed-off-by: Paul Durrant <paul.durrant@citrix.com>
Acked-by: Wei Liu <wei.liu2@citrix.com>
Acked-by: Ian Campbell <ian.campbell@citrix.com>
12 years ago xen/arm: add_to_physmap_one: Avoid to map mfn 0 if an error occurs
Julien Grall [Wed, 23 Oct 2013 16:28:47 +0000 (17:28 +0100)]
xen/arm: add_to_physmap_one: Avoid to map mfn 0 if an error occurs

By default, the function add_to_physmap_one sets mfn to 0. Some code paths that
result in an error continue anyway and then map mfn 0 (valid on ARM) to the
slot given by the guest.

To fix the problem, return an error directly if a sanity check has failed.

Signed-off-by: Julien Grall <julien.grall@linaro.org>
Acked-by: Tim Deegan <tim@xen.org>
Acked-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
12 years ago spinlock: ensure the flags parameter is wide enough
Andrew Cooper [Tue, 22 Oct 2013 15:16:29 +0000 (17:16 +0200)]
spinlock: ensure the flags parameter is wide enough

Because of the construction of spin_lock_irq() (and variants), the flags
parameter could be truncated.  Use a BUILD_BUG_ON() to verify the width of the
parameter.

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Acked-by: Keir Fraser <keir@xen.org>
Acked-by: Ian Campbell <ian.campbell@citrix.com>
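
A compile-time width check of the kind described in the commit above might look like the sketch below. It is written against C11 _Static_assert rather than Xen's own BUILD_BUG_ON definition, and fake_read_flags, demo_irq_save and demo_save are hypothetical stand-ins, not functions from the tree.

```c
/* Compile-time check in the spirit of BUILD_BUG_ON (C11 sketch). */
#define BUILD_BUG_ON(cond) _Static_assert(!(cond), "BUILD_BUG_ON: " #cond)

/* Hypothetical stand-in for reading the CPU flags register. */
unsigned long fake_read_flags(void)
{
    return 0x200UL;
}

/* Save flags into 'f', refusing to compile if 'f' has the wrong width. */
#define demo_irq_save(f)                                      \
    do {                                                      \
        BUILD_BUG_ON(sizeof(f) != sizeof(unsigned long));     \
        (f) = fake_read_flags();                              \
    } while (0)

unsigned long demo_save(void)
{
    unsigned long flags;

    demo_irq_save(flags);   /* accepted: full-width variable */
    /* Passing an 'unsigned int' here would fail to build where
     * unsigned int is narrower than unsigned long. */
    return flags;
}
```

Because the check uses sizeof on the macro argument, a caller passing a narrower variable is rejected at build time instead of silently losing the upper bits.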
12 years ago widen flags parameter for spinlock_irqsave() and friends
Andrew Cooper [Tue, 22 Oct 2013 15:15:58 +0000 (17:15 +0200)]
widen flags parameter for spinlock_irqsave() and friends

These issues were detected using the subsequent patch which forces a
compilation error if the result from local_irq_save() would be truncated.

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Acked-by: Keir Fraser <keir@xen.org>
12 years ago x86/irq: local_irq_restore() should not blindly popf
Andrew Cooper [Tue, 22 Oct 2013 15:11:16 +0000 (17:11 +0200)]
x86/irq: local_irq_restore() should not blindly popf

local_irq_restore() should only be concerned with possibly changing the
interrupt flag.  A blind popf could corrupt other system flags.

While playing in this area, fixup an opencoded use of X86_EFLAGS_IF.

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Acked-by: Keir Fraser <keir@xen.org>
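
The difference between the two approaches in the commit above can be modelled on plain integers. This is only a model of the flag arithmetic, not the assembly used in Xen; restore_if_only and restore_blind are hypothetical names, while 0x200 is the architectural position of the interrupt flag in EFLAGS.

```c
#define X86_EFLAGS_IF 0x200UL   /* interrupt-enable flag, bit 9 of EFLAGS */

/* Model of a restore that touches only IF: every other bit of the
 * current flags value is preserved. */
unsigned long restore_if_only(unsigned long cur, unsigned long saved)
{
    return (cur & ~X86_EFLAGS_IF) | (saved & X86_EFLAGS_IF);
}

/* Model of a blind popf: every flag is overwritten by the saved value. */
unsigned long restore_blind(unsigned long cur, unsigned long saved)
{
    (void)cur;
    return saved;
}
```

If another flag changed between the save and the restore, the first model keeps the new value of that flag, whereas the second silently reverts it.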
12 years ago x86/xsave: also save/restore XCR0 across suspend (ACPI S3)
Jan Beulich [Mon, 21 Oct 2013 15:26:16 +0000 (17:26 +0200)]
x86/xsave: also save/restore XCR0 across suspend (ACPI S3)

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
Acked-by: Keir Fraser <keir@xen.org>
12 years ago xen/arm: Add CPU ID for Broadcom Brahma-B15
Marc Carino [Wed, 16 Oct 2013 22:57:06 +0000 (15:57 -0700)]
xen/arm: Add CPU ID for Broadcom Brahma-B15

Let Xen recognize the Broadcom Brahma-B15 CPU by adding the appropriate
MIDR mask to the initialization phase. Further, ensure that the console
output properly reports the CPU manufacturer as "Broadcom Corporation".

Signed-off-by: Marc Carino <marc.ceeeee@gmail.com>
Acked-by: Ian Campbell <ian.campbell@citrix.com>
12 years ago x86: print relevant (tail) part of filename for warnings and crashes
Jan Beulich [Thu, 17 Oct 2013 09:35:26 +0000 (11:35 +0200)]
x86: print relevant (tail) part of filename for warnings and crashes

In particular when the origin construct is in a header file (and
hence the file name is an absolute path instead of just the file name
portion) the information can otherwise become rather useless when the
build tree isn't sitting relatively close to the file system root.

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Keir Fraser <keir@xen.org>
12 years ago tools: update to SeaBIOS 1.7.3.1
Ian Campbell [Mon, 23 Sep 2013 12:19:16 +0000 (13:19 +0100)]
tools: update to SeaBIOS 1.7.3.1

Signed-off-by: Ian Campbell <ian.campbell@citrix.com>
12 years ago xend: Drop long deprecation warning in /var/run not /tmp
Ian Campbell [Fri, 11 Oct 2013 11:49:05 +0000 (12:49 +0100)]
xend: Drop long deprecation warning in /var/run not /tmp

Signed-off-by: Ian Campbell <ian.campbell@citrix.com>
Acked-by: Ian Jackson <ian.jackson@eu.citrix.com>
12 years ago xen: arm: Emacs style fix
Wei Liu [Fri, 11 Oct 2013 15:31:57 +0000 (16:31 +0100)]
xen: arm: Emacs style fix

Signed-off-by: Wei Liu <wei.liu2@citrix.com>
Acked-by: Ian Campbell <ian.campbell@citrix.com>
12 years ago add cap value to credit scheduler debug info
Juergen Gross [Wed, 16 Oct 2013 10:28:04 +0000 (12:28 +0200)]
add cap value to credit scheduler debug info

Currently the weight is the only scheduling parameter printed for
domains in the credit scheduler key handler. Add the cap value to be
printed as well.

Signed-off-by: Juergen Gross <juergen.gross@ts.fujitsu.com>
Acked-by: George Dunlap <george.dunlap@eu.citrix.com>
12 years ago credit: unpause parked vcpu before destroying it
Juergen Gross [Wed, 16 Oct 2013 10:26:48 +0000 (12:26 +0200)]
credit: unpause parked vcpu before destroying it

A capped out vcpu must be unpaused in case of moving it to another cpupool,
otherwise it will be paused forever.

Signed-off-by: Juergen Gross <juergen.gross@ts.fujitsu.com>
Acked-by: George Dunlap <george.dunlap@eu.citrix.com>
12 years ago xen/evtchn: Fix build on ARM
Julien Grall [Mon, 14 Oct 2013 22:19:37 +0000 (23:19 +0100)]
xen/evtchn: Fix build on ARM

The recent event channel changes introduced by commit a77eb86 and before...
break the compilation on Xen ARM. This commit adds missing includes in
common/event_fifo.c and include/xen/sched.h.

Signed-off-by: Julien Grall <julien.grall@linaro.org>
Acked-by: Ian Campbell <ian.campbell@citrix.com>
12 years ago libxl: remove qemu default devices for upstream qemu
Fabio Fantoni [Mon, 30 Sep 2013 11:53:08 +0000 (13:53 +0200)]
libxl: remove qemu default devices for upstream qemu

Remove the default devices created by qemu. Qemu will create only the devices
defined by xen, since devices not defined by xen are not usable.
Remove the deletion of the empty floppy drive, which is no longer needed with
nodefault.

(Removed a whitespace error. -iwj)

Signed-off-by: Fabio Fantoni <fabio.fantoni@m2r.biz>
Acked-by: Anthony PERARD <anthony.perard@citrix.com>
Signed-off-by: Ian Jackson <ian.jackson@eu.citrix.com>
12 years ago pygrub: Support (/dev/xvda) style disk specifications
Ian Campbell [Thu, 10 Oct 2013 09:37:37 +0000 (10:37 +0100)]
pygrub: Support (/dev/xvda) style disk specifications

You get these if you install Debian Wheezy as HVM and then try to convert to
PV.

This is Debian bug #603391.

Signed-off-by: Ian Campbell <ian.campbell@citrix.com>
Tested-by: Tril <tril@metapipe.net>
Acked-by: Ian Jackson <ian.jackson@eu.citrix.com>
12 years ago libxl,xl: add max_event_channels option to xl configuration file
David Vrabel [Mon, 14 Oct 2013 08:25:39 +0000 (10:25 +0200)]
libxl,xl: add max_event_channels option to xl configuration file

Add the 'max_event_channels' option to the xl configuration file to
limit the number of event channels that a domain may use.

Plumb this option through to libxl via a new libxl_build_info field
and call xc_domain_set_max_evtchn() in the post build stage of domain
creation.

A new LIBXL_HAVE_BUILDINFO_EVENT_CHANNELS #define indicates that this
new field is available.

The default value of 1023 limits the domain to using the minimum
amount of global mapping pages and at most 5 xenheap pages.

Signed-off-by: David Vrabel <david.vrabel@citrix.com>
Acked-by: Ian Campbell <ian.campbell@citrix.com>
Acked-by: Keir Fraser <keir@xen.org>
12 years ago libxc: add xc_domain_set_max_evtchn()
David Vrabel [Mon, 14 Oct 2013 08:24:03 +0000 (10:24 +0200)]
libxc: add xc_domain_set_max_evtchn()

Add xc_domain_set_max_evtchn(), a wrapper around the
DOMCTL_set_max_evtchn hypercall.

Signed-off-by: David Vrabel <david.vrabel@citrix.com>
Acked-by: Ian Campbell <ian.campbell@citrix.com>
Acked-by: Keir Fraser <keir@xen.org>
12 years ago Add DOMCTL to limit the number of event channels a domain may use
David Vrabel [Mon, 14 Oct 2013 08:23:10 +0000 (10:23 +0200)]
Add DOMCTL to limit the number of event channels a domain may use

Add XEN_DOMCTL_set_max_evtchn which may be used during domain creation to
set the maximum event channel port a domain may use.  This may be used to
limit the amount of Xen resources (global mapping space and xenheap) that
a domain may use for event channels.

A domain that does not have a limit set may use all the event channels
supported by the event channel ABI in use.

Signed-off-by: David Vrabel <david.vrabel@citrix.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Daniel De Graaf <dgdegra@tycho.nsa.gov>
Acked-by: Keir Fraser <keir@xen.org>
12 years ago evtchn: add FIFO-based event channel hypercalls and port ops
David Vrabel [Mon, 14 Oct 2013 08:22:07 +0000 (10:22 +0200)]
evtchn: add FIFO-based event channel hypercalls and port ops

Add the implementation for the FIFO-based event channel ABI: the new
hypercall sub-ops (EVTCHNOP_init_control, EVTCHNOP_expand_array) and
the required evtchn_ops (set_pending, unmask, etc.).

Signed-off-by: David Vrabel <david.vrabel@citrix.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Keir Fraser <keir@xen.org>
12 years ago evtchn: implement EVTCHNOP_set_priority and add the set_priority hook
David Vrabel [Mon, 14 Oct 2013 08:21:06 +0000 (10:21 +0200)]
evtchn: implement EVTCHNOP_set_priority and add the set_priority hook

Implement EVTCHNOP_set_priority.  A new set_priority hook added to
struct evtchn_port_ops will do the ABI specific validation and setup.

If an ABI does not provide a set_priority hook (as is the case of the
2-level ABI), the sub-op will return -ENOSYS.

Signed-off-by: David Vrabel <david.vrabel@citrix.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Keir Fraser <keir@xen.org>
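
The optional-hook dispatch described in the commit above can be sketched as follows. The structure and function names here (struct port_ops, two_level_ops, fifo_ops, do_set_priority) are simplified, hypothetical stand-ins rather than the definitions in the Xen tree, and the priority range check is illustrative.

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical, simplified per-ABI operations table. */
struct port_ops {
    void (*set_pending)(unsigned int port);
    int  (*set_priority)(unsigned int port, unsigned int priority); /* optional */
};

static void noop_set_pending(unsigned int port)
{
    (void)port;
}

/* Illustrative ABI-specific validation: accept priorities 0..15 only. */
static int fifo_set_priority(unsigned int port, unsigned int priority)
{
    (void)port;
    return priority < 16 ? 0 : -EINVAL;
}

/* An ABI without priorities leaves the hook unset. */
const struct port_ops two_level_ops = { noop_set_pending, NULL };
const struct port_ops fifo_ops      = { noop_set_pending, fifo_set_priority };

/* Generic sub-op handler: report -ENOSYS when the ABI has no hook. */
int do_set_priority(const struct port_ops *ops,
                    unsigned int port, unsigned int priority)
{
    if (ops->set_priority == NULL)
        return -ENOSYS;
    return ops->set_priority(port, priority);
}
```

Keeping the NULL check in the generic handler means an ABI that has no notion of priority needs no stub function of its own.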
12 years ago evtchn: add FIFO-based event channel ABI
David Vrabel [Mon, 14 Oct 2013 08:20:02 +0000 (10:20 +0200)]
evtchn: add FIFO-based event channel ABI

Add the event channel hypercall sub-ops and the definitions for the
shared data structures for the FIFO-based event channel ABI.

The design document for this new ABI is available here:

http://xenbits.xen.org/people/dvrabel/event-channels-F.pdf

In summary, events are reported using a per-domain shared event array
of event words.  Each event word has PENDING, LINKED and MASKED bits
and a LINK field for pointing to the next event in the event queue.

There are 16 event queues (with different priorities) per-VCPU.

Key advantages of this new ABI include:

- Support for over 100,000 events (2^17).
- 16 different event priorities.
- Improved fairness in event latency through the use of FIFOs.

Signed-off-by: David Vrabel <david.vrabel@citrix.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Keir Fraser <keir@xen.org>
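
An event word with PENDING, MASKED and LINKED bits plus a LINK field, as summarised in the commit above, could be decoded as in the sketch below. The bit positions and the EVT_* and evt_* names are assumptions made for illustration; the commit message only fixes the field names and implies a 17-bit link through the 2^17 event limit. The design document linked in the commit is the authority on the real layout.

```c
#include <stdint.h>

/* Assumed layout, for illustration only. */
#define EVT_PENDING_BIT 31
#define EVT_MASKED_BIT  30
#define EVT_LINKED_BIT  29
#define EVT_LINK_BITS   17      /* enough to name any of 2^17 ports */
#define EVT_LINK_MASK   ((UINT32_C(1) << EVT_LINK_BITS) - 1)

int evt_is_pending(uint32_t w) { return (w >> EVT_PENDING_BIT) & 1; }
int evt_is_masked(uint32_t w)  { return (w >> EVT_MASKED_BIT) & 1; }
int evt_is_linked(uint32_t w)  { return (w >> EVT_LINKED_BIT) & 1; }

/* Port number of the next event in the queue. */
uint32_t evt_link(uint32_t w)  { return w & EVT_LINK_MASK; }

/* Return a copy of 'w' whose LINK field points at next_port. */
uint32_t evt_set_link(uint32_t w, uint32_t next_port)
{
    return (w & ~EVT_LINK_MASK) | (next_port & EVT_LINK_MASK);
}
```

Walking a queue then amounts to following evt_link() from word to word for as long as the words are marked as linked.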
12 years ago evtchn: allow many more evtchn objects to be allocated per domain
David Vrabel [Mon, 14 Oct 2013 08:19:21 +0000 (10:19 +0200)]
evtchn: allow many more evtchn objects to be allocated per domain

Expand the number of event channels that can be supported internally
by altering how struct evtchn's are allocated.

The objects are indexed using a two level scheme of groups and buckets
(instead of only buckets).  Each group is a page of bucket pointers.
Each bucket is a page-sized array of struct evtchn's.

The optimal number of evtchns per bucket is calculated at compile
time.

If XSM is not enabled, struct evtchn is 16 bytes and each bucket
contains 256, requiring only 1 group of 512 pointers for 2^17
(131,072) event channels.  With XSM enabled, struct evtchn is 24
bytes, each bucket contains 128 and 2 groups are required.

For the common case of a domain with only a few event channels,
instead of requiring an additional allocation for the group page, the
first bucket is indexed directly.

As a consequence of this, struct domain shrinks by at least 232 bytes
as 32 bucket pointers are replaced with 1 bucket pointer and (at most)
2 group pointers.

[ Based on a patch from Wei Liu with improvements from Malcolm
Crossley. ]

Signed-off-by: David Vrabel <david.vrabel@citrix.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Keir Fraser <keir@xen.org>
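
The bucket and group figures quoted in the commit above can be checked with a little arithmetic. The sketch assumes 4096-byte pages and 8-byte pointers, and assumes the per-bucket count is rounded down to a power of two (which matches the 128 quoted for a 24-byte struct); the function names are hypothetical.

```c
#define PAGE_SIZE    4096u        /* assumed page size */
#define MAX_EVTCHNS  (1u << 17)   /* 131,072 event channels */
#define PTR_SIZE     8u           /* assumed 64-bit pointers */

/* Largest power of two that is <= n (n must be > 0). */
unsigned int round_down_pow2(unsigned int n)
{
    unsigned int p = 1;

    while (p * 2 <= n)
        p *= 2;
    return p;
}

/* Number of struct evtchn objects held by one page-sized bucket. */
unsigned int evtchns_per_bucket(unsigned int evtchn_size)
{
    return round_down_pow2(PAGE_SIZE / evtchn_size);
}

/* Number of group pages needed to point at every bucket. */
unsigned int groups_needed(unsigned int evtchn_size)
{
    unsigned int buckets = MAX_EVTCHNS / evtchns_per_bucket(evtchn_size);
    unsigned int buckets_per_group = PAGE_SIZE / PTR_SIZE;   /* 512 */

    return (buckets + buckets_per_group - 1) / buckets_per_group;
}
```

A 16-byte struct gives 256 per bucket and 512 buckets, which fit in one group; a 24-byte struct gives 128 per bucket and 1024 buckets, which need two.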
12 years ago evtchn: use a per-domain variable for the max number of event channels
David Vrabel [Mon, 14 Oct 2013 08:18:24 +0000 (10:18 +0200)]
evtchn: use a per-domain variable for the max number of event channels

Use d->max_evtchns instead of the MAX_EVTCHNS(d) macro.  This
avoids having to repeatedly check the ABI type.

Signed-off-by: David Vrabel <david.vrabel@citrix.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Keir Fraser <keir@xen.org>
12 years ago evtchn: print ABI specific state with the 'e' debug key
David Vrabel [Mon, 14 Oct 2013 08:17:14 +0000 (10:17 +0200)]
evtchn: print ABI specific state with the 'e' debug key

In the output of the 'e' debug key, print some ABI specific state in
addition to the (p)ending and (m)asked bits.

For the 2-level ABI, print the state of that event's selector
bit. e.g.,

(XEN)     port [p/m/s]
(XEN)        1 [0/0/1]: s=3 n=0 x=0 d=0 p=74
(XEN)        2 [0/0/1]: s=3 n=0 x=0 d=0 p=75

Signed-off-by: David Vrabel <david.vrabel@citrix.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Keir Fraser <keir@xen.org>
12 years ago evtchn: refactor low-level event channel port ops
David Vrabel [Mon, 14 Oct 2013 08:15:49 +0000 (10:15 +0200)]
evtchn: refactor low-level event channel port ops

Use functions for the low-level event channel port operations
(set/clear pending, unmask, is_pending and is_masked).

Group these functions into a struct evtchn_port_ops so they can be
replaced by alternate implementations (for different ABIs) on a
per-domain basis.

Signed-off-by: David Vrabel <david.vrabel@citrix.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Keir Fraser <keir@xen.org>
12 years ago debug: remove some event channel info from the 'i' and 'q' debug keys
David Vrabel [Mon, 14 Oct 2013 08:14:38 +0000 (10:14 +0200)]
debug: remove some event channel info from the 'i' and 'q' debug keys

The 'i' key would always use VCPU0's selector word when printing the
event channel state. Remove the incorrect output as a subsequent
change will add the (correct) information to the 'e' key instead.

When dumping domain information, printing the state of the VIRQ_DEBUG
port is redundant -- this information is available via the 'e' key.

Signed-off-by: David Vrabel <david.vrabel@citrix.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Keir Fraser <keir@xen.org>
12 years agox86/HVM: cache emulated instruction for retry processing
Jan Beulich [Mon, 14 Oct 2013 07:54:09 +0000 (09:54 +0200)]
x86/HVM: cache emulated instruction for retry processing

Rather than re-reading the instruction bytes upon retry processing,
stash away and re-use what we already read. That way we can be certain
that the retry won't do something different from what requested the
retry, getting once again closer to real hardware behavior (where what
we use retries for is simply a bus operation, not involving redundant
decoding of instructions).

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
Acked-by: Keir Fraser <keir@xen.org>
12 years agox86/HVM: properly deal with hvm_copy_*_guest_phys() errors
Jan Beulich [Mon, 14 Oct 2013 07:53:31 +0000 (09:53 +0200)]
x86/HVM: properly deal with hvm_copy_*_guest_phys() errors

In memory read/write handling the default case should tell the caller
that the operation cannot be handled rather than the operation having
succeeded, so that when new HVMCOPY_* states get added not handling
them explicitly will not result in errors being ignored.

In the task switch emulation code, stop handling only some errors
while ignoring others.

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
Acked-by: Keir Fraser <keir@xen.org>
12 years agox86/HVM: don't ignore hvm_copy_to_guest_phys() errors during I/O intercept
Jan Beulich [Mon, 14 Oct 2013 07:52:33 +0000 (09:52 +0200)]
x86/HVM: don't ignore hvm_copy_to_guest_phys() errors during I/O intercept

Building upon the extended retry logic we can now also make sure to
not ignore errors resulting from writing data back to guest memory.

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
Acked-by: Keir Fraser <keir@xen.org>
12 years agox86/HVM: fix direct PCI port I/O emulation retry and error handling
Jan Beulich [Mon, 14 Oct 2013 07:51:40 +0000 (09:51 +0200)]
x86/HVM: fix direct PCI port I/O emulation retry and error handling

dpci_ioport_{read,write}() guest memory access failure handling should
be modelled after process_portio_intercept()'s (and others): Upon
encountering an error on other than the first iteration, the count
successfully handled needs to be stored and X86EMUL_OKAY returned, in
order for the generic instruction emulator to update register state
correctly before reporting failure or retrying (both of which would
only happen after re-invoking emulation).

Further we leverage (and slightly extend, due to the above mentioned
need to return X86EMUL_OKAY) the "large MMIO" retry model.

Note that there is still a special case not explicitly taken care of
here: While the first retry on the last iteration of a "rep ins"
correctly recovers the already read data, an eventual subsequent retry
is being handled by the pre-existing mmio-large logic (through
hvmemul_do_io() storing the [recovered] data [again], also taking into
consideration that the emulator converts a single iteration "ins" to
->read_io() plus ->write()).

Also fix an off-by-one in the mmio-large-read logic, and slightly
simplify the copying of the data.

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
Acked-by: Keir Fraser <keir@xen.org>
12 years agox86/HVM: properly handle backward string instruction emulation
Jan Beulich [Mon, 14 Oct 2013 07:50:16 +0000 (09:50 +0200)]
x86/HVM: properly handle backward string instruction emulation

Multiplying a signed 32-bit quantity with an unsigned 32-bit quantity
produces an unsigned 32-bit result, yet for emulation of backward
string instructions we need the result sign extended before getting
added to the base address.
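
The arithmetic problem can be reproduced in isolation. This is only an
illustration of the C conversion rules involved, assuming a platform where
int is 32 bits wide; the function names are hypothetical and this is not
the emulator code:

```c
#include <stdint.h>

/*
 * step * count is evaluated in unsigned 32-bit arithmetic, so a negative
 * step yields a large positive value that is then zero-extended.
 */
uint64_t offset_buggy(uint64_t base, int32_t step, uint32_t count)
{
    return base + step * count;
}

/*
 * Widening step first makes the multiplication signed 64-bit, so the
 * product is sign extended before being added to the base.
 */
uint64_t offset_fixed(uint64_t base, int32_t step, uint32_t count)
{
    return base + (uint64_t)((int64_t)step * count);
}
```

With base 0x1000, step -4 and count 3, the first variant adds 0xFFFFFFF4
to the base while the second subtracts 12, which is what a backward string
operation needs.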

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
Acked-by: Keir Fraser <keir@xen.org>
12 years agosched: Correct function prototypes
Andrew Cooper [Mon, 14 Oct 2013 07:07:44 +0000 (09:07 +0200)]
sched: Correct function prototypes

struct vcpu pointers are traditionally v rather than d.

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Acked-by: Keir Fraser <keir@xen.org>
12 years agox86/MSI: fix locking in pci_restore_msi_state()
Jan Beulich [Mon, 14 Oct 2013 07:07:02 +0000 (09:07 +0200)]
x86/MSI: fix locking in pci_restore_msi_state()

Right after the loop the lock is being dropped, so all loop exits
should happen with the lock still held.

Reported-by: Kristoffer Egefelt <kristoffer@itoc.dk>
Signed-off-by: Jan Beulich <jbeulich@suse.com>
Tested-by: Kristoffer Egefelt <kristoffer@itoc.dk>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
12 years agosched: fix race between sched_move_domain() and vcpu_wake()
David Vrabel [Mon, 14 Oct 2013 06:58:31 +0000 (08:58 +0200)]
sched: fix race between sched_move_domain() and vcpu_wake()

From: David Vrabel <david.vrabel@citrix.com>

sched_move_domain() changes v->processor for all the domain's VCPUs.
If another domain, softirq etc. triggers a simultaneous call to
vcpu_wake() (e.g., by setting an event channel as pending), then
vcpu_wake() may lock one schedule lock and try to unlock another.

vcpu_schedule_lock() attempts to handle this but only does so for the
window between reading the schedule_lock from the per-CPU data and the
spin_lock() call.  This does not help with sched_move_domain()
changing v->processor between the calls to vcpu_schedule_lock() and
vcpu_schedule_unlock().

Fix the race by taking the schedule_lock for v->processor in
sched_move_domain().

Signed-off-by: David Vrabel <david.vrabel@citrix.com>
Acked-by: Juergen Gross <juergen.gross@ts.fujitsu.com>
Use vcpu_schedule_lock_irq() (which now returns the lock) to properly
retry the locking should the to be used lock have changed in the course
of acquiring it (issue pointed out by George Dunlap).

Add a comment explaining the state after the v->processor adjustment.

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
Acked-by: Keir Fraser <keir@xen.org>
12 years agoscheduler: adjust internal locking interface
Jan Beulich [Mon, 14 Oct 2013 06:57:56 +0000 (08:57 +0200)]
scheduler: adjust internal locking interface

Make the locking functions return the lock pointers, so they can be
passed to the unlocking functions (which in turn can check that the
lock is still actually providing the intended protection, i.e. the
parameters determining which lock is the right one didn't change).
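
A single-threaded sketch of the interface shape, assuming toy types: the
"lock" is a plain flag and toy_vcpu, toy_lock, toy_vcpu_lock and
toy_vcpu_unlock are hypothetical names, not Xen's functions:

```c
#include <assert.h>
#include <stdbool.h>

struct toy_lock { bool held; };
struct toy_vcpu { unsigned int processor; };

static struct toy_lock per_cpu_lock[4];

/* Acquire the lock that currently covers v and return it to the caller. */
struct toy_lock *toy_vcpu_lock(struct toy_vcpu *v)
{
    for ( ; ; )
    {
        struct toy_lock *lock = &per_cpu_lock[v->processor];

        lock->held = true;              /* stands in for spin_lock() */
        /* Retry if v->processor changed while the lock was being taken. */
        if ( lock == &per_cpu_lock[v->processor] )
            return lock;
        lock->held = false;             /* stands in for spin_unlock() */
    }
}

/* Release exactly the lock that was taken, checking it is still right. */
void toy_vcpu_unlock(struct toy_lock *lock, struct toy_vcpu *v)
{
    assert(lock == &per_cpu_lock[v->processor]);
    lock->held = false;
}
```

Passing the pointer back means the unlock side never has to recompute
which lock applies, and it can detect a change in the meantime.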

Further use proper spin lock primitives rather than open coded
local_irq_...() constructs, so that interrupts can be re-enabled as
appropriate while spinning.

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
Acked-by: Keir Fraser <keir@xen.org>
12 years agox86: fix bug_line()
Jan Beulich [Mon, 14 Oct 2013 06:52:18 +0000 (08:52 +0200)]
x86: fix bug_line()

Due to the packing into a bit field together with a relocated field,
the computation can overflow when the relocated field ends up getting a
negative value stored. Hence it isn't sufficient to correct the value
by 1 in this case, but we also need to mask the result to the width of
the original bit field.

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Keir Fraser <keir@xen.org>
12 years agoRevert "QEMU_TAG update"
Ian Jackson [Fri, 11 Oct 2013 18:05:31 +0000 (19:05 +0100)]
Revert "QEMU_TAG update"

(My script edited the wrong xen.git branch)

This reverts commit 363cfda13a58eab51a4a85f30c7c740990b53c3a.

12 years agoQEMU_TAG update
Ian Jackson [Fri, 11 Oct 2013 18:04:25 +0000 (19:04 +0100)]
QEMU_TAG update

12 years agolibxl: make libxl__poller_put tolerate p==NULL
Ian Jackson [Fri, 11 Oct 2013 11:10:45 +0000 (12:10 +0100)]
libxl: make libxl__poller_put tolerate p==NULL

This is less fragile, and more in keeping with the usual style of
initialising everything to 0 and freeing things unconditionally.
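
The pattern, sketched with a hypothetical toy_poller type rather than
libxl's actual poller structure:

```c
#include <stddef.h>

struct toy_poller { int in_use; };

/* Like free(NULL): releasing "nothing" is a harmless no-op. */
void toy_poller_put(struct toy_poller *p)
{
    if ( p == NULL )
        return;
    p->in_use = 0;
}
```

A caller can then initialise its pointer to NULL and call the put function
unconditionally on every exit path.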

Correspondingly, remove the tests at the call sites.

Apropos of c1f3f174.  No overall functional change.

Signed-off-by: Ian Jackson <Ian.Jackson@eu.citrix.com>
Acked-by: Ian Campbell <ian.campbell@citrix.com>
12 years agox86: check for canonical address before doing page walks
Jan Beulich [Fri, 11 Oct 2013 07:31:16 +0000 (09:31 +0200)]
x86: check for canonical address before doing page walks

... as there doesn't really exist any valid mapping for them.

Particularly in the case of do_page_walk() this also avoids returning
non-NULL for such invalid input.
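
For reference, a canonical-address test can be sketched as follows,
assuming 48-bit virtual addresses (bits 63..47 must all be equal). The
function name is hypothetical and this is not Xen's own helper:

```c
#include <stdbool.h>
#include <stdint.h>

/* True if bits 63..47 of addr are all zeros or all ones. */
bool toy_is_canonical_48(uint64_t addr)
{
    uint64_t top = addr >> 47;      /* the 17 uppermost bits */

    return top == 0 || top == 0x1FFFF;
}
```

A page walk would return early (NULL, or a fault) when this check fails,
before touching any page table.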

Suggested-by: Andrew Cooper <andrew.cooper3@citrix.com>
Signed-off-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
Acked-by: Keir Fraser <keir@xen.org>
12 years agox86: use {rd,wr}{fs,gs}base when available
Jan Beulich [Fri, 11 Oct 2013 07:30:31 +0000 (09:30 +0200)]
x86: use {rd,wr}{fs,gs}base when available

... as being intended to be faster than MSR reads/writes.

In the case of emulate_privileged_op() also use these in favor of the
cached (but possibly stale) addresses from arch.pv_vcpu. This allows
entirely removing the code that was the subject of XSA-67.

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
Acked-by: Keir Fraser <keir@xen.org>
12 years agox86: add address validity check to guest_map_l1e()
Jan Beulich [Fri, 11 Oct 2013 07:29:43 +0000 (09:29 +0200)]
x86: add address validity check to guest_map_l1e()

Just like for guest_get_eff_l1e() this prevents accessing as page
tables (and with the wrong memory attribute) internal data inside Xen
happening to be mapped with 1Gb pages.

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Andrew Cooper <andrew.cooper@citrix.com>
Acked-by: Keir Fraser <keir@xen.org>
12 years agox86: correct LDT checks
Jan Beulich [Fri, 11 Oct 2013 07:28:26 +0000 (09:28 +0200)]
x86: correct LDT checks

- MMUEXT_SET_LDT should behave as similarly to the LLDT instruction as
  possible: fail only if the base address is non-canonical
- instead LDT descriptor accesses should fault if the descriptor
  address ends up being non-canonical (by ensuring this we at once
  avoid reading an entry from the mach-to-phys table and consider it a
  page table entry)
- fault propagation on using LDT selectors must distinguish #PF and #GP
  (the latter must be raised for a non-canonical descriptor address,
  which also applies to several other uses of propagate_page_fault(),
  and hence the problem is being fixed there)
- map_ldt_shadow_page() should properly wrap addresses for 32-bit VMs

At once remove the odd invocation of map_ldt_shadow_page() from the
MMUEXT_SET_LDT handler: There's nothing really telling us that the
first LDT page is going to be preferred over others.

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
Acked-by: Keir Fraser <keir@xen.org>
12 years agolibxl: fix out-of-memory error handling in libxl_list_cpupool
Matthew Daley [Tue, 10 Sep 2013 10:18:46 +0000 (22:18 +1200)]
libxl: fix out-of-memory error handling in libxl_list_cpupool

...otherwise it will return freed memory. All the current users of this
function check already for a NULL return, so use that.

Coverity-ID: 1056194

This is CVE-2013-4371 / XSA-70

Signed-off-by: Matthew Daley <mattjd@gmail.com>
Acked-by: Ian Campbell <ian.campbell@citrix.com>
12 years agotools/ocaml: fix erroneous free of cpumap in stub_xc_vcpu_getaffinity
Matthew Daley [Tue, 10 Sep 2013 11:12:45 +0000 (23:12 +1200)]
tools/ocaml: fix erroneous free of cpumap in stub_xc_vcpu_getaffinity

Not sure how it got there...

Coverity-ID: 1056196

This is CVE-2013-4370 / XSA-69

Signed-off-by: Matthew Daley <mattjd@gmail.com>
Acked-by: Ian Campbell <ian.campbell@citrix.com>
12 years agolibxl: fix vif rate parsing
Ian Jackson [Thu, 10 Oct 2013 14:48:55 +0000 (15:48 +0100)]
libxl: fix vif rate parsing

strtok can return NULL here. We don't need to use strtok anyway, so just
use a simple strchr method.
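
A sketch of the strchr approach, assuming a rate specification of the
form "RATE" or "RATE@INTERVAL". The function name and its interface are
hypothetical, not the libxl code:

```c
#include <stddef.h>
#include <string.h>

/*
 * Split spec at the first '@' without modifying it.  Returns the length
 * of the rate part; *interval points at the text after '@', or is NULL
 * when no interval was given.
 */
size_t toy_split_rate(const char *spec, const char **interval)
{
    const char *at = strchr(spec, '@');

    if ( at == NULL )
    {
        *interval = NULL;
        return strlen(spec);
    }

    *interval = at + 1;
    return (size_t)(at - spec);
}
```

Unlike strtok, this leaves the input untouched and makes the "no
separator" case an explicit branch rather than a NULL to be forgotten.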

Coverity-ID: 1055642

This is CVE-2013-4369 / XSA-68

Signed-off-by: Matthew Daley <mattjd@gmail.com>
Fix type. Add test case

Signed-off-by: Ian Campbell <Ian.campbell@citrix.com>
12 years agox86: check segment descriptor read result in 64-bit OUTS emulation
Matthew Daley [Thu, 10 Oct 2013 13:19:53 +0000 (15:19 +0200)]
x86: check segment descriptor read result in 64-bit OUTS emulation

When emulating such an operation from a 64-bit context (CS has long
mode set), and the data segment is overridden to FS/GS, the result of
reading the overridden segment's descriptor (read_descriptor) is not
checked. If it fails, data_base is left uninitialized.

This can lead to 8 bytes of Xen's stack being leaked to the guest
(implicitly, i.e. via the address given in a #PF).

Coverity-ID: 1055116

This is CVE-2013-4368 / XSA-67.

Signed-off-by: Matthew Daley <mattjd@gmail.com>
Fix formatting.

Signed-off-by: Jan Beulich <jbeulich@suse.com>
12 years agoxen/arm: Fixing clear_guest_offset macro
Jaeyong Yoo [Fri, 4 Oct 2013 04:44:02 +0000 (13:44 +0900)]
xen/arm: Fixing clear_guest_offset macro

Fix the broken macro 'clear_guest_offset' in arm.

Signed-off-by: Jaeyong Yoo <jaeyong.yoo@samsung.com>
Reviewed-by: Julien Grall <julien.grall@linaro.org>
Acked-by: Ian Campbell <ian.campbell@citrix.com>
12 years agoMerge branch 'staging' of ssh://xenbits.xen.org/home/xen/git/xen into staging
Ian Campbell [Thu, 10 Oct 2013 11:41:10 +0000 (12:41 +0100)]
Merge branch 'staging' of ssh://xenbits.xen.org/home/xen/git/xen into staging

12 years agolibxl: introduce libxl_node_to_cpumap
Dario Faggioli [Thu, 3 Oct 2013 17:46:02 +0000 (19:46 +0200)]
libxl: introduce libxl_node_to_cpumap

As a helper for the special case (of libxl_nodemap_to_cpumap) when
one wants the cpumap for just one node.

Signed-off-by: Dario Faggioli <dario.faggioli@citrix.com>
Acked-by: Ian Jackson <ian.jackson@eu.citrix.com>
12 years agoxl: fix a typo in main_vcpulist()
Dario Faggioli [Thu, 3 Oct 2013 17:45:47 +0000 (19:45 +0200)]
xl: fix a typo in main_vcpulist()

which was preventing `xl vcpu-list -h' from working.

Signed-off-by: Dario Faggioli <dario.faggioli@citrix.com>
Acked-by: Ian Campbell <ian.campbell@citrix.com>
12 years agoxl: update the manpage about "cpus=" and NUMA node-affinity
Dario Faggioli [Thu, 3 Oct 2013 17:45:38 +0000 (19:45 +0200)]
xl: update the manpage about "cpus=" and NUMA node-affinity

Since d06b1bf169a01a9c7b0947d7825e58cb455a0ba5 ('libxl: automatic placement
deals with node-affinity') it is no longer true that, if no "cpus=" option
is specified, xl picks up some pCPUs by default and pins the domain there.

In fact, it is the NUMA node-affinity that is affected by automatic
placement, not vCPU to pCPU pinning.

Update the xl config file documentation accordingly, as it seems to have
been forgotten at that time.

Signed-off-by: Dario Faggioli <dario.faggioli@citrix.com>
Acked-by: Ian Jackson <ian.jackson@eu.citrix.com>
12 years agotools/migrate: Fix regression when migrating from older version of Xen
Andrew Cooper [Thu, 10 Oct 2013 11:23:10 +0000 (12:23 +0100)]
tools/migrate: Fix regression when migrating from older version of Xen

Commit 00a4b65f8534c9e6521eab2e6ce796ae36037774 Sep 7 2010
  "libxc: provide notification of final checkpoint to restore end"
broke migration from any version of Xen using tools from prior to that commit.

Older tools have no idea about an XC_SAVE_ID_LAST_CHECKPOINT, causing newer
tools' xc_domain_restore() to start reading the qemu save record, as
ctx->last_checkpoint is 0.

The failure looks like:
  xc: error: Max batch size exceeded (1970103633). Giving up.
where 1970103633 = 0x756d6551 = *(uint32_t*)"Qemu"
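
The arithmetic behind that number can be checked with a few lines of C.
The helper below is hypothetical and assembles the value in little-endian
byte order explicitly, which is what the pointer cast yields on x86:

```c
#include <stdint.h>

/* Interpret the first four bytes of s as a little-endian 32-bit value. */
uint32_t toy_le32(const char *s)
{
    const unsigned char *b = (const unsigned char *)s;

    return (uint32_t)b[0]
         | ((uint32_t)b[1] << 8)
         | ((uint32_t)b[2] << 16)
         | ((uint32_t)b[3] << 24);
}
```

'Q', 'e', 'm', 'u' are 0x51, 0x65, 0x6d, 0x75, so the start of the qemu
save record reads back as 0x756d6551 when taken for a batch size.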

With this fix in place, the behaviour for normal migrations is reverted to how
it was before the regression; the migration is considered non-checkpointed
right from the start.  A XC_SAVE_ID_LAST_CHECKPOINT chunk seen in the
migration stream is a nop.  For checkpointed migrations the behaviour is
unchanged.

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Acked-by: Ian Campbell <Ian.Campbell@citrix.com>
CC: Ian Jackson <Ian.Jackson@eu.citrix.com>
Acked-by: Shriram Rajagopalan <rshriram@cs.ubc.ca> (Remus bits)
12 years agotools: adds tracer on qemu-xen debug configure options
Fabio Fantoni [Fri, 27 Sep 2013 14:00:46 +0000 (16:00 +0200)]
tools: adds tracer on qemu-xen debug configure options

When building tools in debug mode (debug=y), also pass
--enable-trace-backend=stderr when configuring qemu-xen.
This makes debugging easier.

Signed-off-by: Fabio Fantoni <fabio.fantoni@m2r.biz>
Acked-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
12 years agoxen/arm32: Call start_xen only on the boot CPU
Julien Grall [Mon, 7 Oct 2013 14:44:35 +0000 (15:44 +0100)]
xen/arm32: Call start_xen only on the boot CPU

The boot CPU can have a CPU ID not equal to zero. Xen needs to check the
logical CPU ID (in r12) to know if the CPU is the boot one.

Signed-off-by: Julien Grall <julien.grall@linaro.org>
Acked-by: Ian Campbell <ian.campbell@citrix.com>
12 years agoqemu-xen: Set localstatedir to /var.
Anthony PERARD [Tue, 8 Oct 2013 12:59:57 +0000 (13:59 +0100)]
qemu-xen: Set localstatedir to /var.

This path is used by the QEMU build system to create the /run directory.
If local-state-dir is not set, the result becomes $prefix/var which
is not an acceptable path.

Signed-off-by: Anthony PERARD <anthony.perard@citrix.com>
Acked-by: Ian Campbell <ian.campbell@citrix.com>
12 years agoqemu-xen: Disabling build of guest-agent.
Anthony PERARD [Tue, 8 Oct 2013 12:59:56 +0000 (13:59 +0100)]
qemu-xen: Disabling build of guest-agent.

It is not used when QEMU is run with Xen.

Signed-off-by: Anthony PERARD <anthony.perard@citrix.com>
Acked-by: Ian Campbell <ian.campbell@citrix.com>
12 years agohvm/vidirian: Avoid printing page_to_mfn(NULL) on error paths
Andrew Cooper [Wed, 9 Oct 2013 10:11:48 +0000 (12:11 +0200)]
hvm/vidirian: Avoid printing page_to_mfn(NULL) on error paths

While working in the viridian code, I noticed that 4cb6c4f4941

"x86/hvm: Use get_page_from_gfn() instead of get_gfn()/put_gfn."

introduced two error paths where page_to_mfn(NULL) would be formatted and
presented as a bad MFN.  This provides junk in the warning rather than
something useful.

These two codepaths are fixed up to match their counterpart in
wrmsr_hypervisor_regs()

While auditing the other changes from 4cb6c4f4941, I noticed a small
optimisation which could be made by changing the order of the validity checks
to remove 6 NULL pointer checks.

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Acked-by: Keir Fraser <keir@xen.org>
12 years agox86/traps: improvements to {rd,wr}msr_hypervisor_regs()
Andrew Cooper [Wed, 9 Oct 2013 10:10:46 +0000 (12:10 +0200)]
x86/traps: improvements to {rd,wr}msr_hypervisor_regs()

Coverity ID: 1055249 1055250

Coverity was complaining that the switch statements contained dead code in
their default statements.  While this is quite minor, the code flow in
wrmsr_hypervisor_regs() was sufficiently opaque that I felt it appropriate to
fix.

Other improvements include:
 * not shadowing the function parameter 'idx'.
 * use of PAGE_{SHIFT,SIZE} instead of opencoded numbers.
 * a more descriptive error message for attempting to write invalid indices
   for hypercall pages.

There is no behavioural change as a result.

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
12 years agoxen/x86: Remove GB macro in asm-x86/config.h
Julien Grall [Tue, 8 Oct 2013 16:48:33 +0000 (17:48 +0100)]
xen/x86: Remove GB macro in asm-x86/config.h

Commit 983843e "xen: Add macros MB and GB" introduced a generic GB macro.
By mistake, the macro in asm-x86/config.h was not removed. This results in
a compilation error when Xen is built for x86.

Signed-off-by: Julien Grall <julien.grall@linaro.org>
CC: Keir Fraser <keir@xen.org>
CC: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
12 years agoxen/dts: Support Linux initrd DT bindings
Julien Grall [Fri, 27 Sep 2013 16:56:37 +0000 (17:56 +0100)]
xen/dts: Support Linux initrd DT bindings

Linux uses the property linux,initrd-start and linux,initrd-end to know where
the initrd lives in memory.

Signed-off-by: Julien Grall <julien.grall@linaro.org>
Acked-by: Ian Campbell <ian.campbell@citrix.com>
12 years agoxen/arm: Add support to load initrd in dom0
Julien Grall [Fri, 27 Sep 2013 16:56:36 +0000 (17:56 +0100)]
xen/arm: Add support to load initrd in dom0

Signed-off-by: Julien Grall <julien.grall@linaro.org>
Acked-by: Ian Campbell <ian.campbell@citrix.com>
12 years agoxen/dts: Use ROUNDUP macro instead of the internal ALIGN
Julien Grall [Fri, 27 Sep 2013 16:56:35 +0000 (17:56 +0100)]
xen/dts: Use ROUNDUP macro instead of the internal ALIGN

Signed-off-by: Julien Grall <julien.grall@linaro.org>
Acked-by: Ian Campbell <ian.campbell@citrix.com>
12 years agoxen: Add macro ROUNDUP
Julien Grall [Fri, 27 Sep 2013 16:56:34 +0000 (17:56 +0100)]
xen: Add macro ROUNDUP
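
The commit carries no description. As an illustration only, a conventional
round-up macro for power-of-two alignments looks like the sketch below; the
name TOY_ROUNDUP is hypothetical and the definition in the tree may differ:

```c
/*
 * Round x up to the next multiple of a.  Assumes a is a power of two and
 * that x and a have the same unsigned type.
 */
#define TOY_ROUNDUP(x, a) (((x) + (a) - 1) & ~((a) - 1))
```

For example, rounding 4097 up to a 4096-byte boundary gives 8192, while a
value that is already aligned is returned unchanged.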

Signed-off-by: Julien Grall <julien.grall@linaro.org>
Acked-by: Keir Fraser <keir@xen.org>
CC: Jan Beulich <jbeulich@suse.com>
12 years agoxen: Add macros MB and GB
Julien Grall [Fri, 27 Sep 2013 16:56:33 +0000 (17:56 +0100)]
xen: Add macros MB and GB

Signed-off-by: Julien Grall <julien.grall@linaro.org>
Acked-by: Keir Fraser <keir@xen.org>
CC: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
12 years agox86/HPET: basic cleanup
Andrew Cooper [Tue, 8 Oct 2013 09:09:22 +0000 (11:09 +0200)]
x86/HPET: basic cleanup

* Strip trailing whitespace
* Remove redundant definitions
* Update stale documentation links
* Move hpet_address into __initdata

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
12 years agoVT-d: fix suspected data race condition in iommu_set_root_entry()
Andrew Cooper [Tue, 8 Oct 2013 09:06:48 +0000 (11:06 +0200)]
VT-d: fix suspected data race condition in iommu_set_root_entry()

Coverity ID: 1054967

Coverity spotted that iommu->root_maddr was optionally allocated within the
protection of the iommu->lock, but was referenced with the protection of the
iommu->register_lock, and freed without any lock.

Luckily, the code as-is is not vulnerable to the potential risks identified.

However, the alloc_pgtable_maddr() is far more appropriately done in
iommu_alloc(), removing a set of spinlock calls, and a possibility for the
iommu setup to fail later than iommu_alloc() with an -ENOMEM.

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Acked-by: Xiantao Zhang <xiantao.zhang@intel.com>
12 years agolibxc: add LZ4 decompression support
Jan Beulich [Mon, 7 Oct 2013 07:42:51 +0000 (09:42 +0200)]
libxc: add LZ4 decompression support

Since there's no shared or static library to link against, this simply
re-uses the hypervisor side code. However, I only audited the code
added here for possible security issues, not the referenced code in
the hypervisor tree.

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Keir Fraser <keir@xen.org>
Acked-by: Ian Campbell <ian.campbell@citrix.com>