xen.git
13 years agoamd iommu: add 2 helper functions: iommu_is_pte_present and iommu_next_level
Wei Wang [Tue, 11 Sep 2012 12:00:04 +0000 (14:00 +0200)]
amd iommu: add 2 helper functions: iommu_is_pte_present and iommu_next_level

Signed-off-by: Wei Wang <wei.wang2@amd.com>
Committed-by: Jan Beulich <jbeulich@suse.com>
13 years agox86: refactor mce code
Christoph Egger [Tue, 11 Sep 2012 10:28:32 +0000 (12:28 +0200)]
x86: refactor mce code

Factor common mc code out of intel specific code and move it into
common files. No functional changes.

Signed-off-by: Christoph Egger <Christoph.Egger@amd.com>
Committed-by: Jan Beulich <jbeulich@suse.com>
13 years agox86: make the dom0_max_vcpus option more flexible
David Vrabel [Tue, 11 Sep 2012 10:26:25 +0000 (12:26 +0200)]
x86: make the dom0_max_vcpus option more flexible

The dom0_max_vcpus command line option only allows the exact number of
VCPUs for dom0 to be set.  It is not possible to say "up to N VCPUs
but no more than the number physically present."

Allow a range for the option to set a minimum number of VCPUs, and a
maximum which does not exceed the number of PCPUs.

For example, with "dom0_max_vcpus=4-8":

    PCPUs  Dom0 VCPUs
     2      4
     4      4
     6      6
     8      8
    10      8

Existing command lines with "dom0_max_vcpus=N" still work as before
(and are equivalent to dom0_max_vcpus=N-N).
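
The table above amounts to clamping the PCPU count into the requested range. A minimal sketch, assuming a hypothetical helper name (`dom0_vcpus_for` is not taken from the patch):

```c
/* Hypothetical helper, not Xen's actual code: choose the dom0 VCPU count
 * for "dom0_max_vcpus=min-max" given the number of physical CPUs. */
static unsigned int dom0_vcpus_for(unsigned int min, unsigned int max,
                                   unsigned int pcpus)
{
    unsigned int n = pcpus;

    if (n > max)
        n = max;   /* never exceed the requested maximum */
    if (n < min)
        n = min;   /* but always provide at least the minimum */
    return n;
}
```

With min=4 and max=8 this reproduces each row of the table.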

Signed-off-by: David Vrabel <david.vrabel@citrix.com>
Committed-by: Jan Beulich <jbeulich@suse.com>
13 years agopowernow: Update P-state directly when _PSD's CoordType is DOMAIN_COORD_TYPE_HW_ALL
Boris Ostrovsky [Tue, 11 Sep 2012 08:57:36 +0000 (10:57 +0200)]
powernow: Update P-state directly when _PSD's CoordType is DOMAIN_COORD_TYPE_HW_ALL

When _PSD's CoordType is DOMAIN_COORD_TYPE_HW_ALL (i.e. shared_type is
CPUFREQ_SHARED_TYPE_HW), which most often is the case on servers, there
is no reason to go into on_selected_cpus() code; we can call
transition_pstate() directly.

Signed-off-by: Boris Ostrovsky <boris.ostrovsky@amd.com>
Committed-by: Jan Beulich <jbeulich@suse.com>
13 years agox86/HVM: assorted RTC emulation adjustments
Jan Beulich [Tue, 11 Sep 2012 08:00:06 +0000 (10:00 +0200)]
x86/HVM: assorted RTC emulation adjustments

- don't look at RTC_PIE in rtc_timer_update(), and hence don't call the
  function on REG_B writes at all
- only call alarm_timer_update() on REG_B writes when relevant bits
  change
- only call check_update_timer() on REG_B writes when SET changes
- instead properly handle AF and PF when the guest is not also setting
  AIE/PIE respectively (for UF this was already the case, only a
  comment was slightly inaccurate)
- raise the RTC IRQ not only when UIE gets set while UF was already
  set, but generalize this to cover AIE and PIE as well
- properly mask off bit 7 when retrieving the hour values in
  alarm_timer_update(), and properly use RTC_HOURS_ALARM's bit 7 when
  converting from 12- to 24-hour value
- also handle the two other possible clock bases
- use RTC_* names in a couple of places where literal numbers were used
  so far
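
The 12- to 24-hour item can be illustrated as follows. This is only a sketch: it assumes the hour register is in binary (not BCD) mode, and `rtc_hour_12_to_24` is a hypothetical name, not the function touched by the patch.

```c
/* Hypothetical sketch: convert a 12-hour-mode RTC hour value (binary
 * mode assumed) to 24-hour form. Bit 7 of the register is the PM flag. */
static unsigned int rtc_hour_12_to_24(unsigned char raw)
{
    unsigned int hour = raw & 0x7f;   /* mask off the PM flag (bit 7) */

    hour %= 12;                       /* hour 12 wraps to 0 */
    if (raw & 0x80)
        hour += 12;                   /* PM: add twelve hours */
    return hour;
}
```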

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Keir Fraser <keir@xen.org>
13 years agox86/hvm: don't give vector callback higher priority than NMI/MCE
Jan Beulich [Mon, 10 Sep 2012 14:47:31 +0000 (16:47 +0200)]
x86/hvm: don't give vector callback higher priority than NMI/MCE

Those two should always be delivered first imo.

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Keir Fraser <keir@xen.org>
Acked-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
13 years agodocs: document "ucode=" hypervisor command line option
Jan Beulich [Mon, 10 Sep 2012 10:13:56 +0000 (11:13 +0100)]
docs: document "ucode=" hypervisor command line option

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Ian Campbell <ian.campbell@citrix.com>
Committed-by: Ian Campbell <ian.campbell@citrix.com>
13 years agodocs: correct formatting errors in xmdomain.cfg
Matt Wilson [Mon, 10 Sep 2012 10:13:55 +0000 (11:13 +0100)]
docs: correct formatting errors in xmdomain.cfg

This patch corrects the following errors produced by pod2man:

Hey! The above document had some coding errors, which are explained
below:

Around line 301:
    You can't have =items (as at line 305) unless the first thing after
    the =over is an =item

Around line 311:
    '=item' outside of any '=over'

Signed-off-by: Matt Wilson <msw@amazon.com>
Acked-by: Ian Campbell <ian.campbell@citrix.com>
Committed-by: Ian Campbell <ian.campbell@citrix.com>
13 years agoxl.cfg: gfx_passthru documentation improvements
Pasi Kärkkäinen [Mon, 10 Sep 2012 10:13:54 +0000 (11:13 +0100)]
xl.cfg: gfx_passthru documentation improvements

gfx_passthru: Document that gfx_passthru makes the GPU become primary in
the guest, and add other generic info about gfx_passthru.

Signed-off-by: Pasi Kärkkäinen <pasik@iki.fi>
Acked-by: Ian Campbell <ian.campbell@citrix.com>
Committed-by: Ian Campbell <ian.campbell@citrix.com>
13 years agolibxl: fix error message in device_backend_callback
Roger Pau Monne [Mon, 10 Sep 2012 10:13:53 +0000 (11:13 +0100)]
libxl: fix error message in device_backend_callback

device_backend_callback error path always says "unable to disconnect",
but this can also happen during the connection of a device. Fix the
error message using the information in aodev->action.

Signed-off-by: Roger Pau Monne <roger.pau@citrix.com>
Acked-by: Ian Campbell <ian.campbell@citrix.com>
Committed-by: Ian Campbell <ian.campbell@citrix.com>
13 years agounmodified_drivers: handle IRQF_SAMPLE_RANDOM
Olaf Hering [Mon, 10 Sep 2012 08:54:13 +0000 (10:54 +0200)]
unmodified_drivers: handle IRQF_SAMPLE_RANDOM

The flag IRQF_SAMPLE_RANDOM was removed in 3.6-rc1. Add it only if it is
defined. An additional call to add_interrupt_randomness is apparently
not needed because it's now called unconditionally in
handle_irq_event_percpu().
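
One common way to express "add it only if it is defined" is a fallback definition. The sketch below is an assumption about the idiom, not a copy of the patch, and `irq_request_flags` is a hypothetical helper:

```c
/* If the kernel headers no longer provide IRQF_SAMPLE_RANDOM, define it
 * as 0 so that OR-ing it into the flags becomes a no-op. */
#ifndef IRQF_SAMPLE_RANDOM
#define IRQF_SAMPLE_RANDOM 0
#endif

/* Hypothetical helper: build the flags passed when requesting an IRQ. */
static unsigned long irq_request_flags(unsigned long base_flags)
{
    return base_flags | IRQF_SAMPLE_RANDOM;
}
```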

Signed-off-by: Olaf Hering <olaf@aepfle.de>
Committed-by: Jan Beulich <jbeulich@suse.com>
13 years agoVT-d: split .ack and .disable DMA-MSI actors
Jan Beulich [Mon, 10 Sep 2012 07:45:30 +0000 (09:45 +0200)]
VT-d: split .ack and .disable DMA-MSI actors

Calling irq_complete_move() from .disable is wrong, breaking S3 resume.

Comparing with all other .ack actors, it was also missing a call to
move_{native,masked}_irq(). As the actor is masking its interrupt
anyway (albeit it's not immediately obvious why), the latter is the
better choice.

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Keir Fraser <keir@xen.org>
Acked-by: Xiantao Zhang <xiantao.zhang@intel.com>

13 years agoadjust a few RCU domain locking calls
Jan Beulich [Fri, 7 Sep 2012 15:58:12 +0000 (17:58 +0200)]
adjust a few RCU domain locking calls

x86's do_physdev_op() had a case where the locking was entirely
superfluous. Its physdev_map_pirq() further had a case where the lock
was being obtained too early, needlessly complicating early exit paths.

Grant table code had two open coded instances of
rcu_lock_target_domain_by_id(), and a third code section could be
consolidated by using the newly introduced helper function.

The memory hypercall code had two more instances of open coding
rcu_lock_target_domain_by_id(), but note that here this is not just
cleanup, but also fixes an error return path in memory_exchange() to
actually return an error.

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Keir Fraser <keir@xen.org>
13 years agox86/MSI: fix 2nd S3 resume with interrupt remapping enabled
Jan Beulich [Fri, 7 Sep 2012 15:57:10 +0000 (17:57 +0200)]
x86/MSI: fix 2nd S3 resume with interrupt remapping enabled

The first resume from S3 was corrupting internal data structures (in
that pci_restore_msi_state() updated the globally stored MSI message
from traditional to interrupt remapped format, which would then be
translated a second time during the second resume, breaking interrupt
delivery).

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Keir Fraser <keir@xen.org>
13 years agox86/32-on-64: adjust Dom0 initial page table layout
Jan Beulich [Fri, 7 Sep 2012 13:01:39 +0000 (15:01 +0200)]
x86/32-on-64: adjust Dom0 initial page table layout

Drop the unnecessary reservation of the L4 page for 32on64 Dom0, and
allocate its L3 first (to match behavior when running identical bit-
width hypervisor and Dom0 kernel).

Reported-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Signed-off-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Keir Fraser <keir@xen.org>
13 years agodocs: remove WIP notice from command line docs
Ian Campbell [Fri, 7 Sep 2012 12:44:21 +0000 (13:44 +0100)]
docs: remove WIP notice from command line docs

I'm sure they aren't perfect but various people have done a pass over
them recently and they are much improved. I don't think we need to
continue to describe them so pessimistically.

Signed-off-by: Ian Campbell <ian.campbell@citrix.com>
Acked-by: Ian Jackson <Ian.Jackson@eu.citrix.com>
Committed-by: Ian Campbell <ian.campbell@citrix.com>
13 years agoxen: clamp bitmaps to correct number of bits
Ian Campbell [Fri, 7 Sep 2012 12:23:45 +0000 (14:23 +0200)]
xen: clamp bitmaps to correct number of bits

Valgrind running xl create reports:
 ==24777== Invalid read of size 4
 ==24777==    at 0x4072805: libxl__get_numa_candidate (libxl_numa.c:203)
 ==24777==    by 0x40680B6: libxl__build_pre (libxl_dom.c:166)
 ==24777==    by 0x405B82E: libxl__domain_build (libxl_create.c:323)
 ==24777==    by 0x405BB9C: domcreate_bootloader_done (libxl_create.c:747)
 ==24777==    by 0x407AD27: bootloader_local_detached_cb (libxl_bootloader.c:281)
 ==24777==    by 0x40508D8: local_device_detach_cb (libxl.c:2470)
 ==24777==    by 0x4052B10: libxl__device_disk_local_initiate_detach (libxl.c:2445)
 ==24777==    by 0x407AE9F: bootloader_callback (libxl_bootloader.c:265)
 ==24777==    by 0x407C69A: libxl__bootloader_run (libxl_bootloader.c:392)
 ==24777==    by 0x405CB24: do_domain_create (libxl_create.c:687)
 ==24777==    by 0x405CC5E: libxl_domain_create_new (libxl_create.c:1177)
 ==24777==    by 0x805BDE2: create_domain (xl_cmdimpl.c:1812)
 ==24777==  Address 0x42dbdd8 is 8 bytes after a block of size 48 alloc'd
 ==24777==    at 0x4023340: calloc (vg_replace_malloc.c:593)
 ==24777==    by 0x406D479: libxl__zalloc (libxl_internal.c:88)
 ==24777==    by 0x404EF38: libxl_get_cpu_topology (libxl.c:3707)
 ==24777==    by 0x4072232: libxl__get_numa_candidate (libxl_numa.c:314)
 ==24777==    by 0x40680B6: libxl__build_pre (libxl_dom.c:166)
 ==24777==    by 0x405B82E: libxl__domain_build (libxl_create.c:323)
 ==24777==    by 0x405BB9C: domcreate_bootloader_done (libxl_create.c:747)
 ==24777==    by 0x407AD27: bootloader_local_detached_cb (libxl_bootloader.c:281)
 ==24777==    by 0x40508D8: local_device_detach_cb (libxl.c:2470)
 ==24777==    by 0x4052B10: libxl__device_disk_local_initiate_detach (libxl.c:2445)
 ==24777==    by 0x407AE9F: bootloader_callback (libxl_bootloader.c:265)
 ==24777==    by 0x407C69A: libxl__bootloader_run (libxl_bootloader.c:392)

This is because with nr_cpus=4 the bitmask returned from Xen
contains 0xff rather than 0x0f, but our bitmap walking routines (e.g.
libxl_for_each_set_bit) round up to the next byte (so it iterates
e.g. 8 times not 4). This then causes us to access off the end of
whatever array we are walking through each set bit for.

The principle of least surprise suggests that these bits ought not to
be set and this is not a hot path, so fix this at the hypervisor layer
by clamping the bits in the returned bitmap to the correct limit.
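
A minimal sketch of such clamping on a byte-array bitmap, assuming bit i lives in byte i/8 at position i%8. `clamp_last_byte` is a hypothetical helper, not the hypervisor function changed here:

```c
#include <stdint.h>

/* Hypothetical helper: clear the bits at positions >= nr_bits in the
 * last, partially used byte of a byte-array bitmap. */
static void clamp_last_byte(uint8_t *bytes, unsigned int nr_bits)
{
    unsigned int remainder = nr_bits % 8;

    if (remainder == 0)
        return;   /* the last byte is fully used (or the map is empty) */

    bytes[nr_bits / 8] &= (uint8_t)((1u << remainder) - 1);
}
```

For nr_bits=4 a byte of 0xff becomes 0x0f, so a walker that rounds up to whole bytes sees only the four valid bits.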

Signed-off-by: Ian Campbell <ian.campbell@citrix.com>
Acked-by: Keir Fraser <keir@xen.org>
Committed-by: Jan Beulich <jbeulich@suse.com>
13 years agoamd iommu: dump flags of IO page faults
Wei Wang [Fri, 7 Sep 2012 12:23:20 +0000 (14:23 +0200)]
amd iommu: dump flags of IO page faults

Signed-off-by: Wei Wang <wei.wang2@amd.com>
Committed-by: Jan Beulich <jbeulich@suse.com>
13 years agoUpdate Xen version to 4.3-unstable
Keir Fraser [Fri, 7 Sep 2012 11:55:26 +0000 (12:55 +0100)]
Update Xen version to 4.3-unstable

13 years agoDefault to debug builds.
Keir Fraser [Fri, 7 Sep 2012 11:54:22 +0000 (12:54 +0100)]
Default to debug builds.

13 years agoAdded tag 4.2.0-branched for changeset 528f0708b6db
Keir Fraser [Fri, 7 Sep 2012 11:22:44 +0000 (12:22 +0100)]
Added tag 4.2.0-branched for changeset 528f0708b6db

13 years agoAdded signature for changeset 68640a3c99ce
Keir Fraser [Fri, 7 Sep 2012 11:18:46 +0000 (12:18 +0100)]
Added signature for changeset 68640a3c99ce

13 years agoAdded tag 4.2.0-rc4 for changeset 68640a3c99ce
Keir Fraser [Fri, 7 Sep 2012 11:08:27 +0000 (12:08 +0100)]
Added tag 4.2.0-rc4 for changeset 68640a3c99ce

13 years agoUpdate Xen version to 4.2.0-rc4
Keir Fraser [Fri, 7 Sep 2012 11:08:10 +0000 (12:08 +0100)]
Update Xen version to 4.2.0-rc4

13 years agoDefault to non-debug build.
Keir Fraser [Fri, 7 Sep 2012 10:09:46 +0000 (11:09 +0100)]
Default to non-debug build.

Signed-off-by: Keir Fraser <keir@xen.org>
13 years agoQEMU_TAG update (security fix XSA-15)
Ian Jackson [Thu, 6 Sep 2012 16:08:44 +0000 (17:08 +0100)]
QEMU_TAG update (security fix XSA-15)

13 years agotimer: remove stray local_irq_enable()
David Vrabel [Thu, 6 Sep 2012 14:39:01 +0000 (16:39 +0200)]
timer: remove stray local_irq_enable()

migrate_timers_from_cpu() has a stray local_irq_enable() that does
nothing (it's immediately after a spin_unlock_irq()) and has no
matching local_irq_disable().

Signed-off-by: David Vrabel <david.vrabel@citrix.com>
Acked-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Keir Fraser <keir@xen.org>
Committed-by: Jan Beulich <jbeulich@suse.com>
13 years agox86: fix RCU locking in PHYSDEVOP_get_free_pirq
Jan Beulich [Wed, 5 Sep 2012 13:09:48 +0000 (15:09 +0200)]
x86: fix RCU locking in PHYSDEVOP_get_free_pirq

Apart from properly pairing locks with unlocks, also reduce the lock
scope - no need to do the copy_{from,to}_guest()-s inside the protected
region.

I actually wonder whether the RCU locks are needed here at all.

Reported-by: Tim Deegan <tim@xen.org>
Signed-off-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Keir Fraser <keir@xen.org>
13 years agox86: drop "index" parameter from get_free_pirq()
Jan Beulich [Wed, 5 Sep 2012 13:07:42 +0000 (15:07 +0200)]
x86: drop "index" parameter from get_free_pirq()

It's unused.

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Keir Fraser <keir@xen.org>
13 years agoQEMU_TAG update (XSA-17 / CVE-2012-3515)
Ian Jackson [Wed, 5 Sep 2012 11:38:40 +0000 (12:38 +0100)]
QEMU_TAG update (XSA-17 / CVE-2012-3515)

13 years agoxen/gnttab: Validate input to GNTTABOP_swap_grant_ref
Ian Jackson [Wed, 5 Sep 2012 11:30:26 +0000 (12:30 +0100)]
xen/gnttab: Validate input to GNTTABOP_swap_grant_ref

xen-unstable c/s 24548:d115844ebfbb introduces a new GNTTABOP to swap
grant refs.  However, it fails to validate the two refs passed from
the guest.

The result is that passing out-of-range refs can cause Xen to read
past the end of the grant_table->active[] array, and dereference
whatever it finds.  Typically, this results in Xen trying to dereference
a low pointer and failing with a page-fault.

As this hypercall can be issued by an unprivileged guest, this is a
Denial of Service against Xen.  This is XSA-18 / CVE-2012-3516.
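
The shape of the missing check can be sketched with a toy table. Everything below (`toy_grant_table`, `toy_swap_grant_ref`, the fixed size) is hypothetical and only illustrates validating both refs before indexing:

```c
#define TOY_NR_ENTRIES 4

/* Hypothetical stand-in for a grant table with an active[] array. */
struct toy_grant_table {
    unsigned int nr_entries;
    int active[TOY_NR_ENTRIES];
};

/* Swap two entries; returns 0 on success, -1 if either ref is out of range. */
static int toy_swap_grant_ref(struct toy_grant_table *gt,
                              unsigned int ref_a, unsigned int ref_b)
{
    int tmp;

    /* Validate both guest-supplied refs before touching the array. */
    if (ref_a >= gt->nr_entries || ref_b >= gt->nr_entries)
        return -1;

    tmp = gt->active[ref_a];
    gt->active[ref_a] = gt->active[ref_b];
    gt->active[ref_b] = tmp;
    return 0;
}
```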

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Acked-by: Paul Durrant <paul.durrant@citrix.com>
13 years agox86/pvhvm: properly range-check PHYSDEVOP_map_pirq/MAP_PIRQ_TYPE_GSI
Ian Jackson [Wed, 5 Sep 2012 11:29:52 +0000 (12:29 +0100)]
x86/pvhvm: properly range-check PHYSDEVOP_map_pirq/MAP_PIRQ_TYPE_GSI

This is being used as an array index, and hence must be validated before
use.

This is XSA-16 / CVE-2012-3498.

Signed-off-by: Jan Beulich <jbeulich@suse.com>
13 years agoxen: Don't BUG_ON() PoD operations on a non-translated guest.
Ian Jackson [Wed, 5 Sep 2012 11:29:03 +0000 (12:29 +0100)]
xen: Don't BUG_ON() PoD operations on a non-translated guest.

This is XSA-14 / CVE-2012-3496

Signed-off-by: Tim Deegan <tim@xen.org>
Reviewed-by: Ian Campbell <ian.campbell@citrix.com>
Tested-by: Ian Campbell <ian.campbell@citrix.com>
13 years agoxen: prevent a 64 bit guest setting reserved bits in DR7
Ian Jackson [Wed, 5 Sep 2012 11:27:25 +0000 (12:27 +0100)]
xen: prevent a 64 bit guest setting reserved bits in DR7

The upper 32 bits of this register are reserved and should be written as
zero.

This is XSA-12 / CVE-2012-3494
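
As an illustration only, the reserved-bit rule reduces to a check like the following (`dr7_value_ok` is a hypothetical name; the real fix may be structured differently):

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical check: a 64-bit DR7 value is acceptable only if its
 * upper 32 bits, which are reserved, are all zero. */
static bool dr7_value_ok(uint64_t value)
{
    return (value >> 32) == 0;
}
```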

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Ian Campbell <ian.campbell@citrix.com>
13 years agoxl.cfg: videoram and stdvga documentation improvements
Pasi Kärkkäinen [Mon, 3 Sep 2012 10:22:02 +0000 (11:22 +0100)]
xl.cfg: videoram and stdvga documentation improvements

- videoram: Document that only the qemu-xen-traditional device-model
  currently supports changing the amount of video memory for the stdvga
  graphics device.

- videoram: Better document the default amount of videoram for both stdvga
  and Cirrus.

- stdvga: Add a note that stdvga allows a bigger amount of videoram and
  bigger resolutions.

Signed-off-by: Pasi Kärkkäinen <pasik@iki.fi>
Acked-by: Ian Campbell <ian.campbell@citrix.com>
Committed-by: Ian Campbell <ian.campbell@citrix.com>
13 years agolibxl: fix api check Makefile
Ian Jackson [Mon, 3 Sep 2012 10:22:01 +0000 (11:22 +0100)]
libxl: fix api check Makefile

Touch the libxl.api-ok stamp file, and unconditionally put in place
the new _libxl.api-for-check.  This avoids needlessly rerunning the
preprocessor on libxl.h each time we call "make".

Ensure that _libxl.api-for-check gets the CFLAGS used for xl, so that
if it is asked for in a standalone make run it can find xentoollog.h.

Remove *.api-ok on clean.

Also fix .gitignore.

Signed-off-by: Ian Jackson <ian.jackson@eu.citrix.com>
Acked-by: Ian Campbell <ian.campbell@citrix.com>
Committed-by: Ian Campbell <ian.campbell@citrix.com>
13 years agoarm: correctly check for error on dom0 allocation
Ian Campbell [Mon, 3 Sep 2012 10:22:00 +0000 (11:22 +0100)]
arm: correctly check for error on dom0 allocation

Drop the redundant printk

Signed-off-by: Ian Campbell <ian.campbell@citrix.com>
Acked-by: Jan Beulich <JBeulich@suse.com>
Committed-by: Ian Campbell <ian.campbell@citrix.com>
13 years agodocs/command line: Clarify the behavior with invalid input.
Andrew Cooper [Mon, 3 Sep 2012 10:22:00 +0000 (11:22 +0100)]
docs/command line: Clarify the behavior with invalid input.

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Acked-by: Jan Beulich <jbeulich@suse.de>
Committed-by: Ian Campbell <ian.campbell@citrix.com>
13 years agolibxl/xl: implement support for guest ioport and irq permissions.
Ian Campbell [Mon, 3 Sep 2012 10:21:59 +0000 (11:21 +0100)]
libxl/xl: implement support for guest ioport and irq permissions.

This is useful for passing legacy ISA devices (e.g. com ports,
parallel ports) to guests.

Supported syntax is as described in
http://cmrg.fifthhorseman.net/wiki/xen#grantingaccesstoserialhardwaretoadomU

I tested this using Xen's 'q' key handler which prints out the I/O
port and IRQ ranges allowed for each domain. e.g.:

(XEN) Rangesets belonging to domain 31:
(XEN)     I/O Ports  { 2e8-2ef, 2f8-2ff }
(XEN)     Interrupts { 3, 5-6 }

Signed-off-by: Ian Campbell <ian.campbell@citrix.com>
Tested-by: Dieter Bloms <dieter@bloms.de>
Acked-by: Ian Jackson <ian.jackson@eu.citrix.com>
Committed-by: Ian Campbell <ian.campbell@citrix.com>
13 years agomake domain_create() return a proper error code
Jan Beulich [Mon, 3 Sep 2012 07:40:38 +0000 (09:40 +0200)]
make domain_create() return a proper error code

While triggered by the XSA-9 fix, this really is of more general use;
that fix just pointed out very sharply that the current situation
with all domain creation failures reported to user (tools) space as
-ENOMEM is very unfortunate (actively misleading users _and_ support
personnel).

Pull over the pointer <-> error code conversion infrastructure from
Linux, and use it in domain_create() and all its callers.
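
For readers unfamiliar with the Linux idiom, a simplified sketch follows. The macro names mirror Linux's err.h, but the definitions are re-typed here for illustration, and `toy_domain_create` is a hypothetical stand-in rather than Xen's domain_create():

```c
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>

#define MAX_ERRNO 4095

/* Encode a negative errno value in a pointer, and decode it again. */
static inline void *ERR_PTR(long error) { return (void *)(intptr_t)error; }
static inline long PTR_ERR(const void *ptr) { return (long)(intptr_t)ptr; }

/* Error pointers occupy the top MAX_ERRNO addresses of the address space. */
static inline bool IS_ERR(const void *ptr)
{
    return (uintptr_t)ptr >= (uintptr_t)-MAX_ERRNO;
}

struct toy_domain { int id; };

/* Hypothetical creator: reports *why* it failed instead of returning NULL. */
static struct toy_domain *toy_domain_create(int id)
{
    static struct toy_domain d;

    if (id < 0)
        return ERR_PTR(-EINVAL);
    d.id = id;
    return &d;
}
```

A caller tests the result with IS_ERR() and, on failure, propagates PTR_ERR() rather than a blanket -ENOMEM.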

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Keir Fraser <keir@xen.org>
13 years agox86/HVM: RTC periodic timer emulation adjustments
Jan Beulich [Mon, 3 Sep 2012 06:35:41 +0000 (08:35 +0200)]
x86/HVM: RTC periodic timer emulation adjustments

- don't call rtc_timer_update() on REG_A writes when the value didn't
  change (doing the call always was reported to cause wall clock time
  lagging with the JVM running on Windows)
- don't call rtc_timer_update() on REG_B writes when RTC_PIE didn't
  change

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Keir Fraser <keir@xen.org>
13 years agox86: comment opaque expression in __page_to_virt()
Jan Beulich [Mon, 3 Sep 2012 06:17:50 +0000 (08:17 +0200)]
x86: comment opaque expression in __page_to_virt()

mm.h's __page_to_virt() has a rather opaque expression. Comment it.

Reported-By: Ian Campbell <ian.campbell@citrix.com>
Suggested-by: Ian Jackson <ian.jackson@eu.citrix.com>
Signed-off-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Keir Fraser <keir@xen.org>
13 years agonestedsvm: fix interrupt handling
Christoph Egger [Fri, 31 Aug 2012 20:15:31 +0000 (21:15 +0100)]
nestedsvm: fix interrupt handling

Give the l2 guest a chance to finish the delivery of the last injected
interrupt or exception before we emulate a VMEXIT.
For example after a NPF handled by the host there can be an interrupt
for the l1 guest.

Signed-off-by: Christoph Egger <Christoph.Egger@amd.com>
Committed-by: Keir Fraser <keir@xen.org>
13 years agotmem: add matching unlock for an about-to-be-destroyed object
Dan Magenheimer [Fri, 31 Aug 2012 20:13:39 +0000 (21:13 +0100)]
tmem: add matching unlock for an about-to-be-destroyed object

A 4.2 changeset forces a preempt_disable/enable with
every lock/unlock.

Tmem has dynamically allocated "objects" that contain a
lock.  The lock is held when the object is destroyed.
No reason to unlock something that's about to be destroyed!
But with the preempt_enable/disable in the generic locking code,
and the fact that do_softirq ASSERTs that preempt_count
must be zero, a crash occurs soon after any object is
destroyed.

So force lock to be released before destroying objects.
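
The imbalance can be modelled with a toy counter. All names below are hypothetical; the sketch only shows why destroying a still-locked object leaves the count non-zero unless the lock is dropped first:

```c
#include <stdlib.h>

/* Toy stand-in for the per-CPU preempt count touched by lock/unlock. */
static int toy_preempt_count;

struct toy_obj { int locked; };

static struct toy_obj *toy_obj_new(void)
{
    return calloc(1, sizeof(struct toy_obj));
}

static void toy_lock(struct toy_obj *o)   { toy_preempt_count++; o->locked = 1; }
static void toy_unlock(struct toy_obj *o) { o->locked = 0; toy_preempt_count--; }

/* Release the lock before freeing, so the count returns to zero. */
static void toy_obj_destroy(struct toy_obj *o)
{
    if (o->locked)
        toy_unlock(o);
    free(o);
}
```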

Signed-off-by: Dan Magenheimer <dan.magenheimer@oracle.com>
Committed-by: Keir Fraser <keir@xen.org>
13 years agolibxl: fix double free on some config parser errors
Ian Jackson [Fri, 31 Aug 2012 11:24:57 +0000 (12:24 +0100)]
libxl: fix double free on some config parser errors

If libxlu_cfg_y.y encountered a config file error, the code generated
by bison would sometimes _both_ run the %destructor _and_ call
xlu__cfg_set_store for the same XLU_ConfigSetting* semantic value.
The result would be a double free.

This appears to be because of the use of a mid-rule action.  There is
some discussion of the problems with destructors and mid-rule action
error handling in "(bison)Mid-Rule Actions".  This area is complex and
best avoided.

So fix the bug by abolishing the use of a mid-rule action, which was
in any case not necessary here.

Also while we are there rename the nonterminal rule "setting" to
"assignment", to avoid confusion with the token type "setting", which
had an identical name in a different namespace.  This was especially
confusing because the nonterminal "setting" did not have "setting" as
the type of its semantic value!  (In fact the nonterminal, now called
"assignment", does not have a value so it does not have a value type.)

Signed-off-by: Ian Jackson <ian.jackson@eu.citrix.com>
Acked-by: Ian Campbell <ian.campbell@citrix.com>
Committed-by: Ian Campbell <ian.campbell@citrix.com>
13 years agotools: remove --disable-pythontools option
Ian Campbell [Fri, 31 Aug 2012 10:13:49 +0000 (11:13 +0100)]
tools: remove --disable-pythontools option

This incorrectly removes the $(PYTHON) variable which is used at build
time as well as by the tools.

Remove and revisit for 4.3.

Signed-off-by: Ian Campbell <ian.campbell@citrix.com>
Acked-by: Ian Jackson <ian.jackson@eu.citrix.com>
Committed-by: Ian Campbell <ian.campbell@citrix.com>
13 years agoxencommons: Attempt to load blktap2 driver
Ian Campbell [Fri, 31 Aug 2012 10:13:48 +0000 (11:13 +0100)]
xencommons: Attempt to load blktap2 driver

Older kernels, such as those found in Debian Squeeze:
* Have bugs in handling of AIO into foreign pages
* Have blktap modules, which will cause qemu not to use AIO, but
  which are not loaded on boot.

Attempt to load blktap in xencommons, to make sure modern qemu versions
which use AIO will work properly on those kernels.

Signed-off-by: George Dunlap <george.dunlap@eu.citrix.com>
Acked-by: Ian Jackson <ian.jackson@eu.citrix.com>

Prefer to load blktap2 if it exists. This is the name of the driver in
classic-Xen ports, while in mainline kernels the driver is called just
blktap.

Signed-off-by: Ian Campbell <ian.campbell@citrix.com>
Acked-by: Jan Beulich <JBeulich@suse.com>
Committed-by: Ian Campbell <ian.campbell@citrix.com>
13 years agotools: remove vestigial default_lib.m4 macros and adjust substitutions
Matt Wilson [Fri, 31 Aug 2012 09:42:09 +0000 (10:42 +0100)]
tools: remove vestigial default_lib.m4 macros and adjust substitutions

LIB_PATH is no longer used, so the AX_DEFAULT_LIB macro is no longer
needed. Additionally lower case make variables are now used as
autoconf substitutions, which allows for more correct overrides at
build time.

I've checked the file layout in dist/install from the build made
before this change versus after with ./configure values of:
 1) ./configure (no flags provided)
 2) ./configure --libdir=/usr/lib/x86_64-linux-gnu (Debian style)
 3) ./configure --libdir='${exec_prefix}/lib' (late variable expansion)

Signed-off-by: Matt Wilson <msw@amazon.com>
Acked-by: Ian Jackson <ian.jackson@eu.citrix.com>
Acked-by: Ian Campbell <ian.campbell@citrix.com>
[ ijc - reran autogen.sh ]
Committed-by: Ian Campbell <ian.campbell@citrix.com>
13 years agouninstall: push tools uninstall down into tools/Makefile
Ian Campbell [Fri, 31 Aug 2012 09:42:08 +0000 (10:42 +0100)]
uninstall: push tools uninstall down into tools/Makefile

Many of the rules here depend on having run configure and the
variables which it defines in config/Tools.mk

Signed-off-by: Ian Campbell <ian.campbell@citrix.com>
Looks-good: Jan Beulich <JBeulich@suse.com>
Acked-by: Ian Jackson <ian.jackson@eu.citrix.com>
Committed-by: Ian Campbell <ian.campbell@citrix.com>
13 years agouninstall: do not remove kernels or modules on uninstall.
Ian Campbell [Fri, 31 Aug 2012 09:42:08 +0000 (10:42 +0100)]
uninstall: do not remove kernels or modules on uninstall.

The pattern used is very broad and will delete any kernel with xen in
its filename, likewise modules, including those which come packaged
from the distribution etc.

I don't think this was ever the right thing to do but it is doubly
wrong now that Xen does not even build or install a kernel by default.

Push cleanup of the installed hypervisor down into xen/Makefile so that
it can clean up exactly what it actually installs.

Signed-off-by: Ian Campbell <ian.campbell@citrix.com>
Looks-good: Jan Beulich <JBeulich@suse.com>
Acked-by: Ian Jackson <ian.jackson@eu.citrix.com>
Committed-by: Ian Campbell <ian.campbell@citrix.com>
13 years agohotplug/NetBSD: check type of file to attach from params
Roger Pau Monne [Fri, 31 Aug 2012 09:42:07 +0000 (10:42 +0100)]
hotplug/NetBSD: check type of file to attach from params

xend used to set the xenbus backend entry "type" to either "phy" or
"file", but now libxl sets it to "phy" for both file and block device.
We have to manually check for the type of the "param" field in order
to detect if we are trying to attach a file or a block device.

Signed-off-by: Christoph Egger <Christoph.Egger@amd.com>
Signed-off-by: Roger Pau Monne <roger.pau@citrix.com>
Acked-by: Ian Jackson <ian.jackson@eu.citrix.com>
Committed-by: Ian Campbell <ian.campbell@citrix.com>
13 years agohotplug/NetBSD: write error message to hotplug-error
Roger Pau Monne [Fri, 31 Aug 2012 09:42:06 +0000 (10:42 +0100)]
hotplug/NetBSD: write error message to hotplug-error

As recommended by Ian Campbell, write the hotplug error to
hotplug-error, just as the Linux hotplug script does.

Signed-off-by: Roger Pau Monne <roger.pau@citrix.com>
Acked-by: Ian Jackson <ian.jackson@eu.citrix.com>
Committed-by: Ian Campbell <ian.campbell@citrix.com>
13 years agohotplug/NetBSD: fix xenstore_write usage in error
Roger Pau Monne [Fri, 31 Aug 2012 09:42:05 +0000 (10:42 +0100)]
hotplug/NetBSD: fix xenstore_write usage in error

xenstore_write doesn't exist, use xenstore-write instead. The error
function is currently broken without this change.

Signed-off-by: Roger Pau Monne <roger.pau@citrix.com>
Signed-off-by: Christoph Egger <Christoph.Egger@amd.com>
Acked-by: Ian Jackson <ian.jackson@eu.citrix.com>
Committed-by: Ian Campbell <ian.campbell@citrix.com>
13 years agoxenconsoled: clean-up after all dead domains
David Vrabel [Fri, 31 Aug 2012 09:42:04 +0000 (10:42 +0100)]
xenconsoled: clean-up after all dead domains

xenconsoled expected domains that are being shut down to end up in
the DYING state and would only clean up such domains.  HVM domains
either didn't enter the DYING state or weren't in it long enough for
xenconsoled to notice.

For every shutdown HVM domain, xenconsoled would leak memory, grow its
list of domains and (if guest console logging was enabled) leak the
log file descriptor.  If the file descriptors were leaked and enough
HVM domains were shut down, no more console connections would work as
the evtchn device could not be opened.  Guests would then block
waiting to send console output.

Fix this by tagging domains that exist in enum_domains().  Afterwards,
all untagged domains are assumed to be dead and are shut down and
cleaned up.
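
The tagging approach is a mark-and-sweep over the known-domain list. A minimal sketch, using hypothetical types and names rather than xenconsoled's own:

```c
#include <stdbool.h>
#include <stddef.h>

struct toy_dom {
    int domid;
    bool seen;    /* tagged while enumerating live domains */
    bool closed;  /* set once the domain has been cleaned up */
};

/* Tag every known domain that still exists, then clean up the rest. */
static void mark_and_sweep(struct toy_dom *known, size_t nr_known,
                           const int *live, size_t nr_live)
{
    size_t i, j;

    for (i = 0; i < nr_known; i++)
        known[i].seen = false;

    for (j = 0; j < nr_live; j++)
        for (i = 0; i < nr_known; i++)
            if (known[i].domid == live[j])
                known[i].seen = true;

    for (i = 0; i < nr_known; i++)
        if (!known[i].seen)
            known[i].closed = true;   /* untagged: assumed dead */
}
```

This way cleanup does not depend on ever observing a domain in a particular state.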

Signed-off-by: David Vrabel <david.vrabel@citrix.com>
Acked-by: Ian Campbell <ian.campbell@citrix.com>
Acked-by: Ian Jackson <ian.jackson@eu.citrix.com>
Committed-by: Ian Campbell <ian.campbell@citrix.com>
13 years agoREADME: Update references to PyXML to lxml
Ian Campbell [Fri, 31 Aug 2012 09:42:04 +0000 (10:42 +0100)]
README: Update references to PyXML to lxml

Signed-off-by: Ian Campbell <ian.campbell@citrix.com>
Acked-by: Ian Jackson <ian.jackson@eu.citrix.com>
Committed-by: Ian Campbell <ian.campbell@citrix.com>
13 years agodocs: improve documentation of Xen command line parameters
Matt Wilson [Fri, 31 Aug 2012 09:42:03 +0000 (10:42 +0100)]
docs: improve documentation of Xen command line parameters

This change improves documentation for several Xen command line
parameters. Some of the Itanium-specific options are now removed. A
more thorough check should be performed to remove any other remnants.

I've reformatted some of the entries to fit in 80 column terminals.

Options that are yet undocumented but accept standard boolean /
integer values are now annotated as such.

The size suffixes have been corrected to use the binary prefixes
instead of decimal prefixes.

Signed-off-by: Matt Wilson <msw@amazon.com>
Acked-by: Andrew Cooper <andrew.cooper3@citrix.com>
Acked-by: Ian Jackson <ian.jackson@eu.citrix.com>
Committed-by: Ian Campbell <ian.campbell@citrix.com>
13 years agox86/i8259: Clean up _mask_and_ack_8259A_irq().
Keir Fraser [Thu, 30 Aug 2012 17:17:20 +0000 (18:17 +0100)]
x86/i8259: Clean up _mask_and_ack_8259A_irq().

Signed-off-by: Keir Fraser <keir@xen.org>
13 years agox86/i8259: Handle bogus spurious interrupts more quietly
Andrew Cooper [Thu, 30 Aug 2012 17:06:39 +0000 (18:06 +0100)]
x86/i8259: Handle bogus spurious interrupts more quietly

c/s 25336:edd7c7ad1ad2 introduced the concept of a bogus vector, for
irqs delivered through the i8259 PIC after IO-APICs had been set
up.

However, if spurious PIC vectors are received, many "No irq handler
for vector" log messages can be seen on the console.

This patch adds to the bogus vector logic to detect spurious PIC
vectors and simply ignore them.  _mask_and_ack_8259A_irq() has been
modified to return a boolean indicating whether the irq is real or
not, and in the case of a spurious vector, the error in do_IRQ() is
not printed.

One complication is that now, _mask_and_ack_8259A_irq() can get called
whatever the ack mode is, so has been altered to work out whether it
should EOI the irq or not.

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Committed-by: Keir Fraser <keir@xen.org>
13 years agonvmx: fix unhandled nested XSETBV VMExit
Dongxiao Xu [Thu, 30 Aug 2012 16:58:23 +0000 (17:58 +0100)]
nvmx: fix unhandled nested XSETBV VMExit

If the L2 guest issues an XSETBV instruction, we need to deliver the
resulting VMExit to the L1 guest.

This could fix the boot hang seen when Fedora 17 runs as an L2 guest.

Signed-off-by: Dongxiao Xu <dongxiao.xu@intel.com>
Committed-by: Keir Fraser <keir@xen.org>
13 years agodocs: update xenpaging.txt
Olaf Hering [Thu, 30 Aug 2012 16:57:31 +0000 (17:57 +0100)]
docs: update xenpaging.txt

Signed-off-by: Olaf Hering <olaf@aepfle.de>
Committed-by: Keir Fraser <keir@xen.org>
13 years agonvmx: fix resource relinquish for nested VMX
Dongxiao Xu [Thu, 30 Aug 2012 16:55:31 +0000 (17:55 +0100)]
nvmx: fix resource relinquish for nested VMX

The previous order of resource relinquishing is:
relinquish_domain_resources() -> vcpu_destroy() ->
nvmx_vcpu_destroy().  However, some L1 resources such as nv_vvmcx and
io_bitmaps are freed only in nvmx_vcpu_destroy(), so
relinquish_domain_resources() does not reduce the refcnt of the domain
to 0, and the later vcpu release functions are never called.

To fix this issue, we need to release the nv_vvmcx and io_bitmaps in
relinquish_domain_resources().

Besides, after destroying the nested vcpu, we need to switch the
vmx->vmcs back to the L1 one and let the vcpu_destroy() logic free the
L1 VMCS page.

Signed-off-by: Dongxiao Xu <dongxiao.xu@intel.com>
Committed-by: Keir Fraser <keir@xen.org>
13 years agox86: Prefer multiboot-provided e820 over bios-provided e801 memory info.
Keir Fraser [Tue, 28 Aug 2012 21:40:45 +0000 (22:40 +0100)]
x86: Prefer multiboot-provided e820 over bios-provided e801 memory info.

Some UEFI systems do not provide e820 information. In this case we
should take the detailed memory map provided by a multiboot-capable
loader, rather than rely on very conservative values from the e801
bios call. Using the latter on any modern system really hardly makes
good sense.

[Excellent candidate for 4.1 backport]

Signed-off-by: Keir Fraser <keir@xen.org>
Tested-by: Jonathan Tripathy <jonnyt@abpni.co.uk>
13 years agotools/xl: Fix uninitialized variable error.
Andrew Cooper [Tue, 28 Aug 2012 13:46:30 +0000 (14:46 +0100)]
tools/xl: Fix uninitialized variable error.

c/s 25779:4ca40e0559c3 introduced a compilation error for any build
system using -Werror=uninitialized, such as the default CentOS 5.7
version of gcc.

And with good reason, because if the global libxl
default_output_format is neither OUTPUT_FORMAT_SXP nor
OUTPUT_FORMAT_JSON, the variable hand will be used before being
initialised.

The attached patch fixes the warning, and further fixes the logic to
work correctly when a new OUTPUT_FORMAT is added to xl.

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Acked-by: Ian Campbell <ian.campbell@citrix.com>
Committed-by: Ian Campbell <ian.campbell@citrix.com>
13 years agolibxl: Rerun bison
Ian Jackson [Fri, 24 Aug 2012 11:38:18 +0000 (12:38 +0100)]
libxl: Rerun bison

This updates libxlu_cfg_y.[ch] to code generated by bison from
Debian squeeze (1:2.4.1.dfsg-3 i386).

There should be no functional change since there is no change to the
source file, but we will inherit bugfixes and behavioural changes from
the new version of bison.  So this is more a matter of hope than
knowledge.

Signed-off-by: Ian Jackson <ian.jackson@eu.citrix.com>
Committed-by: Ian Jackson <ian.jackson@eu.citrix.com>
13 years agolibxl: Rerun flex
Ian Jackson [Fri, 24 Aug 2012 11:38:16 +0000 (12:38 +0100)]
libxl: Rerun flex

This undoes some systematic changes which were made to
libxlu_cfg_l.[ch] along with manually-edited files (eg, whitespace
changes, emacs local variables) and returns these two files to exactly
the output of flex (Debian squeeze 2.5.35-10 i386).

Signed-off-by: Ian Jackson <ian.jackson@eu.citrix.com>
Committed-by: Ian Jackson <ian.jackson@eu.citrix.com>
13 years agolibxl: provide "make realclean" target
Ian Jackson [Fri, 24 Aug 2012 11:38:14 +0000 (12:38 +0100)]
libxl: provide "make realclean" target

This removes all the autogenerated files.

Signed-off-by: Ian Jackson <ian.jackson@eu.citrix.com>
Acked-by: Ian Campbell <ian.campbell@citrix.com>
Committed-by: Ian Jackson <ian.jackson@eu.citrix.com>
13 years agonested vmx: Don't set bit 55 in IA32_VMX_BASIC_MSR
Zhang Xiantao [Fri, 24 Aug 2012 08:49:47 +0000 (09:49 +0100)]
nested vmx: Don't set bit 55 in IA32_VMX_BASIC_MSR

None of the related IA32_VMX_TRUE_*_MSR are implemented,
so set this bit to 0; otherwise the L1 VMM may
get incorrect default1 class settings.
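
As a rough illustration of the masking described above, the sketch below
clears bit 55 from a 64-bit MSR value before it would be shown to an L1
guest. The constant and function names are hypothetical and the sample
value is made up; this is not the Xen implementation.

```python
# Bit 55 of IA32_VMX_BASIC advertises the IA32_VMX_TRUE_*_CTLS MSRs.
# VMX_BASIC_TRUE_CTLS and l1_visible_vmx_basic are hypothetical names.
VMX_BASIC_TRUE_CTLS = 1 << 55
MASK_64 = (1 << 64) - 1


def l1_visible_vmx_basic(host_value: int) -> int:
    """Return the value with bit 55 cleared, keeping all other bits."""
    return host_value & ~VMX_BASIC_TRUE_CTLS & MASK_64


if __name__ == "__main__":
    sample = (1 << 55) | 0x10  # made-up value, for illustration only
    print(hex(l1_visible_vmx_basic(sample)))
```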

Signed-off-by: Zhang Xiantao <xiantao.zhang@intel.com>
Committed-by: Keir Fraser <keir@xen.org>
13 years agonested vmx: VM_ENTRY_IA32E_MODE shouldn't be in default1 class
Zhang Xiantao [Fri, 24 Aug 2012 08:49:14 +0000 (09:49 +0100)]
nested vmx: VM_ENTRY_IA32E_MODE shouldn't be in default1 class
for IA32_VM_ENTRY_CTLS_MSR.

If set to 1, the L2 guest's paging mode may be misjudged
and mis-set.

Signed-off-by: Zhang Xiantao <xiantao.zhang@intel.com>
Committed-by: Keir Fraser <keir@xen.org>
13 years agoxl: make "xl list -l" proper JSON
Ian Campbell [Thu, 23 Aug 2012 18:12:28 +0000 (19:12 +0100)]
xl: make "xl list -l" proper JSON

Bastian Blank reports that the output of this command is just multiple
JSON objects concatenated and is not a single properly formed JSON
object.

Fix this by wrapping in an array. This turned out to be a bit more
intrusive than I was expecting due to the requirement to keep
supporting the SXP output mode.

Python's json module is happy to parse the result...
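
A minimal sketch of the difference, using Python's json module. The
domain records below are invented for illustration and do not reflect
the real field layout of "xl list -l".

```python
import json

# Invented stand-ins for per-domain records.
domains = [{"domid": 0, "name": "Domain-0"}, {"domid": 1, "name": "guest"}]


def concatenated(objs):
    """Old-style output: one JSON object after another."""
    return "\n".join(json.dumps(o) for o in objs)


def as_array(objs):
    """New-style output: all objects wrapped in a single JSON array."""
    return json.dumps(objs)


def parses(text):
    """Report whether text is a single well-formed JSON document."""
    try:
        json.loads(text)
        return True
    except json.JSONDecodeError:
        return False


if __name__ == "__main__":
    print(parses(concatenated(domains)))
    print(parses(as_array(domains)))
```

The concatenated form is rejected because a parser sees extra data after
the first object, while the array form is one document.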

Signed-off-by: Ian Campbell <ian.campbell@citrix.com>
Acked-by: Ian Jackson <ian.jackson@eu.citrix.com>
13 years agolibxl: make domain resume API asynchronous
Ian Campbell [Thu, 23 Aug 2012 18:00:09 +0000 (19:00 +0100)]
libxl: make domain resume API asynchronous

Although the current implementation has no asynchronous parts I can
envisage it needing to do bits of create/destroy like functionality
which may need async support in the future.

To do this make the meat into an internal libxl__domain_resume
function in order to satisfy the no-internal-callers rule for the
async function.

Since I needed to touch the logging to s/ctx/CTX/ anyway switch to the
LOG* helper macros.

Signed-off-by: Ian Campbell <ian.campbell@citrix.com>
Acked-by: Roger Pau Monne <roger.pau@citrix.com>
Acked-by: Ian Jackson <ian.jackson@eu.citrix.com>
Committed-by: Ian Jackson <ian.jackson@eu.citrix.com>
13 years agoUpdate Xen version to 4.2.0-rc4-pre
Keir Fraser [Thu, 23 Aug 2012 14:06:11 +0000 (15:06 +0100)]
Update Xen version to 4.2.0-rc4-pre

13 years agoAdded signature for changeset d44f290e81df
Keir Fraser [Thu, 23 Aug 2012 14:05:48 +0000 (15:05 +0100)]
Added signature for changeset d44f290e81df

13 years agoAdded tag 4.2.0-rc3 for changeset d44f290e81df
Keir Fraser [Thu, 23 Aug 2012 14:05:36 +0000 (15:05 +0100)]
Added tag 4.2.0-rc3 for changeset d44f290e81df

13 years agoUpdate Xen version to 4.2.0-rc3
Keir Fraser [Thu, 23 Aug 2012 14:05:30 +0000 (15:05 +0100)]
Update Xen version to 4.2.0-rc3

13 years agox86,cmdline: Fix setting skip_realmode boolean on no-real-mode and tboot options
Keir Fraser [Thu, 23 Aug 2012 14:02:04 +0000 (15:02 +0100)]
x86,cmdline: Fix setting skip_realmode boolean on no-real-mode and tboot options
...effect should be cumulative.

Signed-off-by: Keir Fraser <keir@xen.org>
13 years agoDump IOMMU p2m table
Santosh Jodh [Wed, 22 Aug 2012 21:29:06 +0000 (22:29 +0100)]
Dump IOMMU p2m table

New key handler 'o' to dump the IOMMU p2m table for each domain.
Skips dumping table for domain 0.
Intel and AMD specific iommu_ops handler for dumping p2m table.

Incorporated feedback from Jan Beulich and Wei Wang.
Fixed indent printing with %*s.
Removed superfluous superpage and other attribute prints.
Make next_level use consistent for AMD IOMMU dumps. Warn if found
inconsistent.
AMD IOMMU does not skip levels. Handle 2mb and 1gb IOMMU page size for
AMD.

Signed-off-by: Santosh Jodh <santosh.jodh@citrix.com>
Committed-by: Keir Fraser <keir@xen.org>
13 years agoFix shared entry status for grant copy operation on paged-out gfn
Andres Lagar-Cavilla [Wed, 22 Aug 2012 21:27:50 +0000 (22:27 +0100)]
Fix shared entry status for grant copy operation on paged-out gfn

The unwind path was not clearing the shared entry status bits. This
was BSOD-ing guests on network activity under certain configurations.

Also:
 * sed the fixup method name to signal it's related to grant copy.
 * use atomic clear flag ops during fixup.

Signed-off-by: Andres Lagar-Cavilla <andres@lagarcavilla.org>
Committed-by: Keir Fraser <keir@xen.org>
13 years agohvm: Remove VM generation ID device and incr_generationid from build_info.
Paul Durrant [Wed, 22 Aug 2012 21:26:27 +0000 (22:26 +0100)]
hvm: Remove VM generation ID device and incr_generationid from build_info.

Microsoft have now published their VM generation ID specification at
https://www.microsoft.com/en-us/download/details.aspx?id=30707.
It differs from the original specification upon which I based my
implementation in several key areas. Particularly, it is no longer
an incrementing 64-bit counter and so this patch is to remove
the incr_generationid field from the build_info and also disable the
ACPI device before 4.2 is released.

I will follow up with further patches to implement the VM generation
ID to the new specification.

Signed-off-by: Paul Durrant <paul.durrant@citrix.com>
Committed-by: Keir Fraser <keir@xen.org>
13 years agolibxc: Support save/restore of up to 4096 VCPUs (increase from 64 VCPUs).
Keir Fraser [Wed, 22 Aug 2012 21:20:42 +0000 (22:20 +0100)]
libxc: Support save/restore of up to 4096 VCPUs (increase from 64 VCPUs).

Signed-off-by: Keir Fraser <keir@xen.org>
13 years agoflask/policy: add accesses used by newer dom0s
Daniel De Graaf [Wed, 22 Aug 2012 21:15:36 +0000 (22:15 +0100)]
flask/policy: add accesses used by newer dom0s

Signed-off-by: Daniel De Graaf <dgdegra@tycho.nsa.gov>
Committed-by: Keir Fraser <keir@xen.org>
13 years agoxsm/flask: remove page-to-domain lookups from XSM hooks
Daniel De Graaf [Wed, 22 Aug 2012 21:14:52 +0000 (22:14 +0100)]
xsm/flask: remove page-to-domain lookups from XSM hooks

Doing a reverse lookup from MFN to its owning domain is redundant with
the internal checks Xen does on pages. Change the checks to operate
directly on the domain owning the pages for normal memory; MMIO areas
are still checked with security_iomem_sid.

This fixes a hypervisor crash when a domU attempts to map an MFN that
is free in Xen's heap: the XSM hook is called before the validity
check, and page_get_owner returns garbage when called on these
pages. While explicitly checking for such pages using
page_get_owner_and_reference is a possible solution, this ends up
duplicating parts of get_page_from_l1e.

Signed-off-by: Daniel De Graaf <dgdegra@tycho.nsa.gov>
Committed-by: Keir Fraser <keir@xen.org>
13 years agoxsm: Add missing dummy hooks
Daniel De Graaf [Wed, 22 Aug 2012 21:13:32 +0000 (22:13 +0100)]
xsm: Add missing dummy hooks

A few XSM hooks have been defined without implementation in dummy.c;
these will cause a null function pointer dereference if called. Also
implement the efi_call hook, which was incorrectly added without any
implementations.

Signed-off-by: Daniel De Graaf <dgdegra@tycho.nsa.gov>
Committed-by: Keir Fraser <keir@xen.org>
13 years agox86-64: refine the XSA-9 fix
Jan Beulich [Mon, 20 Aug 2012 06:46:47 +0000 (08:46 +0200)]
x86-64: refine the XSA-9 fix

Our product management wasn't happy with the "solution" for XSA-9, and
demanded that customer systems must continue to boot. Rather than
having our and perhaps other distros carry non-trivial patches, allow
for more fine grained control (panic on boot, deny guest creation, or
merely warn) by means of a single line change.

Also, as this was found to be a problem with remotely managed systems,
don't default to boot denial (just deny guest creation).

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Keir Fraser <keir@xen.org>
13 years agox86: don't expose SYSENTER on unknown CPUs
Jan Beulich [Mon, 20 Aug 2012 06:40:01 +0000 (08:40 +0200)]
x86: don't expose SYSENTER on unknown CPUs

So far we only ever set up the respective MSRs on Intel CPUs, yet we
hide the feature only on a 32-bit hypervisor. That prevents booting of
PV guests on top of a 64-bit hypervisor making use of the instruction
on unknown CPUs (VIA in this case).

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Keir Fraser <keir@xen.org>
13 years agodocs: console: correct example console type definition
Ian Campbell [Fri, 17 Aug 2012 13:57:29 +0000 (14:57 +0100)]
docs: console: correct example console type definition

I think this is intended to be under the specific console's directory.

Signed-off-by: Ian Campbell <ian.campbell@citrix.com>
Acked-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
Committed-by: Ian Campbell <ian.campbell@citrix.com>
13 years agotools/python: Clean python correctly
Andrew Cooper [Fri, 17 Aug 2012 13:46:49 +0000 (14:46 +0100)]
tools/python: Clean python correctly

Cleaning the python directory should completely remove the build/
directory, otherwise subsequent builds may be short-circuited and a
stale build installed.

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Acked-by: Ian Campbell <ian.campbell@citrix.com>
Committed-by: Ian Campbell <ian.campbell@citrix.com>
13 years agolibxc/Linux: Add VM_DONTCOPY flag of the VMA of the hypercall buffer
Wangzhenguo [Fri, 17 Aug 2012 13:46:48 +0000 (14:46 +0100)]
libxc/Linux: Add VM_DONTCOPY flag of the VMA of the hypercall buffer

This avoids the hypercall buffer becoming CoW on fork.

In a multi-threaded, multi-process environment, e.g. a process with two
threads, thread A may issue a hypercall while thread B calls fork() to create
a child process. After forking, all pages of the process, including hypercall
buffers, are CoW. This causes a write-protection fault and an EFAULT error if
the hypervisor calls copy_to_user during the hypercall in thread A's context.

Fix:
1. Before hypercall: use MADV_DONTFORK of madvise syscall to make the hypercall
   buffer not to be copied to child process after fork.
2. After hypercall: undo the effect of MADV_DONTFORK for the hypercall buffer
   by using MADV_DOFORK of madvise syscall.
3. Use mmap/munmap for memory alloc/free instead of malloc/free to bypass libc.
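
The three steps above can be sketched with Python's mmap module, assuming
Linux and Python 3.8 or later. The helper name with_dontfork_buffer and
its callback are hypothetical; the real change lives in libxc's C code.

```python
import mmap

PAGE = mmap.PAGESIZE


def with_dontfork_buffer(nbytes, use):
    """Run use(buf) on a private anonymous mapping excluded from fork().

    Hypothetical helper: mmap/close replace malloc/free, and the mapping
    is marked MADV_DONTFORK only while `use` runs.
    """
    size = -(-nbytes // PAGE) * PAGE  # round up to whole pages
    buf = mmap.mmap(-1, size, flags=mmap.MAP_PRIVATE | mmap.MAP_ANONYMOUS)
    # The MADV_* constants exist only on platforms that support them.
    dontfork = getattr(mmap, "MADV_DONTFORK", None)
    dofork = getattr(mmap, "MADV_DOFORK", None)
    try:
        if dontfork is not None:
            buf.madvise(dontfork)  # step 1: child will not inherit pages
        result = use(buf)          # stands in for the hypercall
        if dofork is not None:
            buf.madvise(dofork)    # step 2: restore default fork behaviour
        return result
    finally:
        buf.close()                # step 3: unmap instead of free()


if __name__ == "__main__":
    def fill(buf):
        buf[0:4] = b"xen!"
        return bytes(buf[0:4])

    print(with_dontfork_buffer(10, fill))
```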

Note:
Child processes must not use the opened xc_{interface,evtchn,gnttab,gntshr}
handle that inherits from parents. They should reopen the handle if they want
to interact with xc. Otherwise, accessing the hypercall buffer caches of the
handle may cause a segmentation fault.

Signed-off-by: Zhenguo Wang <wangzhenguo@huawei.com>
Signed-off-by: Xiaowei Yang <xiaowei.yang@huawei.com>
Acked-by: Ian Campbell <ian.campbell@citrix.com>
[ ijc -- s/ptr/p/ to fix build & tweaked the wording of the comments
         slightly. ]
Committed-by: Ian Campbell <ian.campbell@citrix.com>
13 years agoxend: Replace the use of XMLPrettyPrint from PyXML with stdlib functionality.
M A Young [Fri, 17 Aug 2012 13:10:26 +0000 (14:10 +0100)]
xend: Replace the use of XMLPrettyPrint from PyXML with stdlib functionality.

This appears to have been missed by changeset 22235:b8cc53d22545
"Replace pyxml/xmlproc-based XML validator with lxml based one"

This was reported by Toshio Ernie Kuratomi at
https://bugzilla.redhat.com/show_bug.cgi?id=842843

Signed-off-by: Michael Young <m.a.young@durham.ac.uk>
Acked-by: Ian Campbell <ian.campbell@citrix.com>
Committed-by: Ian Campbell <ian.campbell@citrix.com>
13 years agofix typos in xen/arch/x86/hvm/vmx/vmcs.c
Yongjie Ren [Fri, 17 Aug 2012 10:36:38 +0000 (12:36 +0200)]
fix typos in xen/arch/x86/hvm/vmx/vmcs.c

Signed-off-by: Yongjie Ren <yongjie.ren@intel.com>
Committed-by: Jan Beulich <jbeulich@suse.com>
13 years agox86/ucode: don't crash during AP bringup on non-Intel, non-AMD CPUs
Jan Beulich [Fri, 17 Aug 2012 09:36:08 +0000 (11:36 +0200)]
x86/ucode: don't crash during AP bringup on non-Intel, non-AMD CPUs

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Keir Fraser <keir@xen.org>
13 years agoEPT/PoD: fix interaction with 1Gb pages
Jan Beulich [Thu, 16 Aug 2012 16:38:05 +0000 (17:38 +0100)]
EPT/PoD: fix interaction with 1Gb pages

When PoD got enabled to support 1Gb pages, ept_get_entry() didn't get
updated to match - the assertion in there triggered, indicating that
the call to p2m_pod_demand_populate() needed adjustment.

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Tim Deegan <tim@xen.org>
Committed-by: Tim Deegan <tim@xen.org>
13 years agox86/mm: update max_mapped_pfn on MMIO mappings too.
Tim Deegan [Thu, 16 Aug 2012 13:31:09 +0000 (14:31 +0100)]
x86/mm: update max_mapped_pfn on MMIO mappings too.

max_mapped_pfn should reflect the highest mapping we've ever seen of
any type, or the tests in the lookup functions will be wrong.  As it
happens, the highest mapping has always been a RAM one, but this is no
longer the case when we allow 64-bit BARs.
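
A toy model of the invariant, with all names (P2M, set_entry, lookup)
invented for illustration: the high-water mark is updated for every
mapping type, so a lookup bounded by it still finds MMIO entries placed
above the highest RAM page.

```python
class P2M:
    """Toy physical-to-machine map with a high-water mark (hypothetical)."""

    def __init__(self):
        self.entries = {}
        self.max_mapped_pfn = 0

    def set_entry(self, gfn, mfn, kind):
        self.entries[gfn] = (mfn, kind)
        # Track the highest mapping of any kind, not only RAM.
        self.max_mapped_pfn = max(self.max_mapped_pfn, gfn)

    def lookup(self, gfn):
        # Fast-path test that relies on max_mapped_pfn being accurate.
        if gfn > self.max_mapped_pfn:
            return None
        return self.entries.get(gfn)


if __name__ == "__main__":
    p2m = P2M()
    p2m.set_entry(0x100, 0x5000, "ram")
    p2m.set_entry(0x400000, 0xFE000, "mmio")  # e.g. a 64-bit BAR
    print(p2m.lookup(0x400000))
```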

Reported-by: Xudong Hao <xudong.hao@intel.com>
Signed-off-by: Tim Deegan <tim@xen.org>
Committed-by: Tim Deegan <tim@xen.org>
13 years agox86/PoD: clean up types
Jan Beulich [Thu, 16 Aug 2012 08:16:19 +0000 (10:16 +0200)]
x86/PoD: clean up types

GMFN values must undoubtedly be "unsigned long". "count" and
"entry_count", since they are signed types, should also be "long" as
otherwise they can't fit all values that can fit into "d->tot_pages"
(which currently is "uint32_t").
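
The range argument can be checked with a short sketch. It assumes a
32-bit "int" and a 64-bit "long" (as on LP64 x86-64), modelled here with
ctypes.c_int32 and ctypes.c_int64; the variable names are illustrative.

```python
import ctypes

# Largest value a uint32_t such as d->tot_pages can hold.
TOT_PAGES_MAX = 0xFFFFFFFF

# Stored in a signed 32-bit type, the value wraps to a negative number.
as_int = ctypes.c_int32(TOT_PAGES_MAX).value

# Stored in a signed 64-bit type, the value is preserved.
as_long = ctypes.c_int64(TOT_PAGES_MAX).value

if __name__ == "__main__":
    print(as_int, as_long)
```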

Beyond that, the patch doesn't convert everything to "long" as in many
places it is clear that "int" suffices. In places where "long" is being
used partially already, the change is however being done.

Furthermore, page order values have no use of being "long".

Finally, in the course of updating a few printk messages anyway, some
also get slightly shortened (to focus on the relevant information).

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Acked-by: George Dunlap <george.dunlap@eu.citrix.com>
13 years agox86/PoD: prevent guest from being destroyed upon early access to its memory
Jan Beulich [Thu, 16 Aug 2012 08:14:11 +0000 (10:14 +0200)]
x86/PoD: prevent guest from being destroyed upon early access to its memory

When an external agent (e.g. a monitoring daemon) happens to access the
memory of a PoD guest prior to setting the PoD target, that access must
fail for there not being any page in the PoD cache, and only the space
above the low 2Mb gets scanned for victim pages (while only the low 2Mb
got real pages populated so far).

To accommodate this
- set the PoD target first
- do all physmap population in PoD mode (i.e. not just large [2Mb or
  1Gb] pages)
- slightly lift the restrictions enforced by p2m_pod_set_mem_target()
  to accommodate the changed tools behavior

Tested-by: Jürgen Groß <juergen.gross@ts.fujitsu.com>
           (in a 4.0.x based incarnation)
Signed-off-by: Jan Beulich <jbeulich@suse.com>
Acked-by: George Dunlap <george.dunlap@eu.citrix.com>
13 years agoetherboot: Build fixes for gcc 4.7.
Keir Fraser [Wed, 15 Aug 2012 08:41:21 +0000 (09:41 +0100)]
etherboot: Build fixes for gcc 4.7.

Signed-off-by: Keir Fraser <keir@xen.org>
13 years agoacpi: Make sure valid CPU is passed to do_pm_op()
Boris Ostrovsky [Wed, 15 Aug 2012 07:43:25 +0000 (09:43 +0200)]
acpi: Make sure valid CPU is passed to do_pm_op()

Passing an invalid CPU value to do_pm_op() will trigger an assertion
in cpu_online().

Signed-off-by: Boris Ostrovsky <boris.ostrovsky@amd.com>
Such checks would, at a first glance, then also be missing at the top
of various helper functions, but these checks really were already
redundant with the check in do_pm_op(). Remove the redundant checks
for clarity and brevity.

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Committed-by: Jan Beulich <jbeulich@suse.com>
13 years agox86-64/EFI: add CFLAGS to check compile
Daniel De Graaf [Wed, 15 Aug 2012 07:42:14 +0000 (09:42 +0200)]
x86-64/EFI: add CFLAGS to check compile

Without this, the compilation of check.c could fail due to compiler
features such as -fstack-protector being enabled, which causes a
missing __stack_chk_fail symbol error.

Signed-off-by: Daniel De Graaf <dgdegra@tycho.nsa.gov>
Rather than using plain CFLAGS here, remove CFLAGS-y from them to
particularly get rid of the -MF argument referencing (the undefined
here) $(@F).

The use of CFLAGS at once allows dropping the explicit use of -Werror.

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Committed-by: Jan Beulich <jbeulich@suse.com>
13 years agoQEMU_TAG update
Ian Jackson [Tue, 14 Aug 2012 14:59:38 +0000 (15:59 +0100)]
QEMU_TAG update

13 years agox86/PoD: fix (un)locking after 24772:28edc2b31a9b
Jan Beulich [Tue, 14 Aug 2012 08:28:14 +0000 (10:28 +0200)]
x86/PoD: fix (un)locking after 24772:28edc2b31a9b

That c/s introduced a double unlock on the out-of-memory error path of
p2m_pod_demand_populate().

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Acked-by: George Dunlap <george.dunlap@eu.citrix.com>
Acked-by: Andres Lagar-Cavilla <andres@lagarcavilla.org>
13 years agoconfig: Split debug build from debug symbols
Andrew Cooper [Mon, 13 Aug 2012 17:09:33 +0000 (18:09 +0100)]
config: Split debug build from debug symbols

RPM based packaging systems expect binaries to have debug symbols which get
placed in a separate debuginfo RPM.

Split the concept of a debug build up so that binaries can be built with
debugging symbols without having the other gubbins which $(debug) implies, most
notably frame pointers.

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Tested-by: Ian Jackson <ian.jackson@eu.citrix.com>
Committed-by: Ian Jackson <ian.jackson@eu.citrix.com>