xen.git
Andrew Cooper [Sat, 15 Mar 2014 20:18:45 +0000 (20:18 +0000)]
tools/libxc: C implementation of stream format

Provide the C structures matching the binary (wire) format of the new
stream format.  All header/record fields are naturally aligned and
explicit padding fields are used to ensure the correct layout (i.e.,
there is no need for any non-standard structure packing pragma or
attribute).
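
As a minimal sketch of the layout rule described above (natural alignment
plus explicit padding, so no packing pragma is needed), consider the
following. The structure demo_rec_hdr and its fields are hypothetical and are
not the actual stream format; the static assertions only show that the
compiler has no implicit padding left to insert.

```c
#include <assert.h>   /* static_assert (C11) */
#include <stddef.h>
#include <stdint.h>

/* Hypothetical record header, for illustration only (not the real format). */
struct demo_rec_hdr {
    uint16_t type;    /* offset 0 */
    uint16_t _pad;    /* offset 2: explicit padding, keeps 'length' aligned */
    uint32_t length;  /* offset 4 */
    uint64_t cookie;  /* offset 8 */
};

/* Every field sits at a multiple of its own size and there are no holes. */
static_assert(offsetof(struct demo_rec_hdr, length) == 4, "length misplaced");
static_assert(offsetof(struct demo_rec_hdr, cookie) == 8, "cookie misplaced");
static_assert(sizeof(struct demo_rec_hdr) == 16, "unexpected padding");
```

Because the padding is spelled out as a field, the in-memory layout and the
wire layout agree without any compiler-specific attribute.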

Provide some helper functions for converting types to string for
diagnostic purposes.

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Acked-by: Ian Campbell <Ian.Campbell@citrix.com>
Andrew Cooper [Sat, 15 Mar 2014 18:50:31 +0000 (18:50 +0000)]
tools/libxc: Migration v2 framework

For testing purposes, the environment variable "XG_MIGRATION_V2" allows the
two save/restore codepaths to coexist, with a runtime switch between them.

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Acked-by: Ian Campbell <Ian.Campbell@citrix.com>
Andrew Cooper [Thu, 24 Jul 2014 12:05:27 +0000 (13:05 +0100)]
libxc/progress: Extend the progress interface

Progress information is logged via a different logger to regular libxc log
messages, and currently can only express a range.  However, not everything
which needs reporting as progress comes with a range.  Extend the interface to
allow reporting of a single statement.

The programming interface now looks like:
  xc_set_progress_prefix()
    set the prefix string to be used
  xc_report_progress_single()
    report a single action
  xc_report_progress_step()
    report $X of $Y

The new programming interface is implemented in a compatible way with the
existing caller interface (by reporting a single action as "0 of 0").
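
The shape of such an interface can be sketched as follows. This is an
illustration under assumptions, not the libxc code: every demo_* name, the
callback signature and the fixed-size prefix buffer are hypothetical. It
shows a single action being delivered through the same range-style callback
as "0 of 0".

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical caller-side callback: receives "done of total" for 'doing'. */
typedef void demo_progress_fn(const char *doing, unsigned long done,
                              unsigned long total, void *opaque);

struct demo_progress {
    char prefix[64];        /* prefix string used for step reports */
    demo_progress_fn *cb;
    void *opaque;
};

/* Example callback: remembers the most recent report. */
struct demo_last {
    const char *doing;
    unsigned long done, total;
};

void demo_record(const char *doing, unsigned long done,
                 unsigned long total, void *opaque)
{
    struct demo_last *last = opaque;

    last->doing = doing;
    last->done = done;
    last->total = total;
}

void demo_set_progress_prefix(struct demo_progress *p, const char *prefix)
{
    assert(p != NULL && prefix != NULL);
    snprintf(p->prefix, sizeof(p->prefix), "%s", prefix);
}

/* Report $X of $Y under the current prefix. */
void demo_report_progress_step(struct demo_progress *p,
                               unsigned long done, unsigned long total)
{
    assert(p != NULL && p->cb != NULL);
    p->cb(p->prefix, done, total, p->opaque);
}

/* Report a single action: reuse the range callback with "0 of 0". */
void demo_report_progress_single(struct demo_progress *p, const char *doing)
{
    assert(p != NULL && p->cb != NULL && doing != NULL);
    p->cb(doing, 0, 0, p->opaque);
}
```

An existing callback keeps working unchanged, since it still only ever sees
a description and two numbers.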

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Acked-by: Ian Campbell <Ian.Campbell@citrix.com>
Andrew Cooper [Tue, 1 Jul 2014 18:10:35 +0000 (19:10 +0100)]
tools/libxc: Implement writev_exact() in the same style as write_exact()

This implementation of writev_exact() will cope with an iovcnt greater than
IOV_MAX because glibc will actually let this work anyway, and it is very
useful not to have to worry about this in the caller of writev_exact().  The
caller is still required to ensure that the sum of iov_len's doesn't overflow
a ssize_t.
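
For readers unfamiliar with the pattern, here is a hedged sketch of a
"write everything or fail" wrapper around writev(). It is not the libxc
implementation: demo_writev_all and DEMO_CHUNK_IOVS are hypothetical names,
and unlike the approach described above it chunks the vector explicitly
instead of relying on glibc accepting a large iovcnt. It also modifies the
caller's iov[] while consuming partial writes.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <sys/types.h>
#include <sys/uio.h>
#include <unistd.h>

/* Illustrative chunk size; POSIX guarantees IOV_MAX is at least 16. */
enum { DEMO_CHUNK_IOVS = 16 };

/* Write all of iov[0..iovcnt-1] to fd. Returns 0, or -1 with errno set. */
int demo_writev_all(int fd, struct iovec *iov, int iovcnt)
{
    int idx = 0;

    assert(iov != NULL || iovcnt <= 0);

    while (idx < iovcnt) {
        ssize_t len;
        int n = iovcnt - idx;

        if (iov[idx].iov_len == 0) {    /* entry fully consumed or empty */
            idx++;
            continue;
        }
        if (n > DEMO_CHUNK_IOVS)
            n = DEMO_CHUNK_IOVS;

        len = writev(fd, &iov[idx], n);
        if (len < 0) {
            if (errno == EINTR)
                continue;               /* interrupted: retry */
            return -1;
        }
        if (len == 0) {                 /* no progress: give up */
            errno = EIO;
            return -1;
        }

        /* Advance past what was written; the last entry may be partial. */
        while (len > 0) {
            if ((size_t)len >= iov[idx].iov_len) {
                len -= (ssize_t)iov[idx].iov_len;
                idx++;
            } else {
                iov[idx].iov_base = (char *)iov[idx].iov_base + len;
                iov[idx].iov_len -= (size_t)len;
                len = 0;
            }
        }
    }
    return 0;
}

/* Usage: 40 one-byte entries (more than one chunk) sent through a pipe. */
void demo_writev_example(void)
{
    char src[40], out[40];
    struct iovec iov[40];
    int fds[2], i, rc;
    ssize_t got;

    rc = pipe(fds);
    assert(rc == 0);
    for (i = 0; i < 40; i++) {
        src[i] = (char)('a' + i % 26);
        iov[i].iov_base = &src[i];
        iov[i].iov_len = 1;
    }
    rc = demo_writev_all(fds[1], iov, 40);
    assert(rc == 0);
    got = read(fds[0], out, sizeof(out));
    assert(got == 40);
    for (i = 0; i < 40; i++)
        assert(out[i] == src[i]);
    close(fds[0]);
    close(fds[1]);
}
```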

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Acked-by: Ian Campbell <Ian.Campbell@citrix.com>
Pranavkumar Sawargaonkar [Wed, 29 Apr 2015 09:38:27 +0000 (15:08 +0530)]
xen: arm: X-Gene Storm check GIC DIST address for EOI quirk

In old X-Gene Storm firmware and DT, secure mode addresses have been
mentioned in GICv2 node. In this case maintenance interrupt is used
instead of EOI HW method.

This patch checks the GIC Distributor Base Address to enable EOI quirk
for old firmware.

Ref:
http://lists.xen.org/archives/html/xen-devel/2014-07/msg01263.html

Signed-off-by: Pranavkumar Sawargaonkar <pranavkumar@linaro.org>
Tested-by: Christoffer Dall <christoffer.dall@linaro.org>
Reviewed-by: Julien Grall <julien.grall@citrix.com>
Acked-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
Acked-by: Ian Campbell <ian.campbell@citrix.com>
Julien Grall [Mon, 27 Apr 2015 14:58:33 +0000 (15:58 +0100)]
xen/arm: p2m: Add an ASSERT to check that p2m lock is taken in __p2m_lookup

__p2m_lookup should be called with the p2m lock taken. Add an ASSERT in
order to catch incorrect callers in debug builds.

Signed-off-by: Julien Grall <julien.grall@citrix.com>
Acked-by: Ian Campbell <ian.campbell@citrix.com>
Olaf Hering [Fri, 24 Apr 2015 09:07:14 +0000 (09:07 +0000)]
libxl: convert strings and ints to xenbus_state

Convert all plain ints and strings which are used for xenbus "state"
files to xenbus_state. This makes it easier to find code which deals
with backend/frontend state changes.

Convert usage of libxl__sprintf to GCSPRINTF.

No change in behaviour is expected by this change, besides a small
increase of runtime memory usage in places that used a string constant.

Signed-off-by: Olaf Hering <olaf@aepfle.de>
Cc: Ian Jackson <ian.jackson@eu.citrix.com>
Cc: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
Cc: Ian Campbell <ian.campbell@citrix.com>
Cc: Wei Liu <wei.liu2@citrix.com>
Acked-by: Wei Liu <wei.liu2@citrix.com>
Acked-by: Ian Campbell <ian.campbell@citrix.com>
Boris Ostrovsky [Thu, 23 Apr 2015 02:49:18 +0000 (22:49 -0400)]
tools/libxc: Set HVM_PARAM_CONSOLE_EVTCHN during restore

When resuming, the guest needs to check whether the port has changed. HVM
guests use this parameter to get the port number.

(We can't always use xenstore where this value is also written: for example
on Linux the console is resumed very early, before the store is up).

Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
Acked-by: Wei Liu <wei.liu2@citrix.com>
Kai Huang [Mon, 4 May 2015 10:19:25 +0000 (12:19 +0200)]
p2m/ept: enable PML in p2m-ept for log-dirty

This patch firstly enables EPT A/D bits if PML is used, as PML depends on EPT
A/D bits to work. A bit is set for all present p2m types in middle and leaf EPT
entries, and D bit is set for all writable types in EPT leaf entry, except for
log-dirty type with PML.

With PML, for 4K pages, instead of setting EPT entry to read-only, we just need
to clear D bit in order to log that GFN. For superpages, we still need to set it
to read-only as we need to split superpage to 4K pages in EPT violation.

Signed-off-by: Kai Huang <kai.huang@linux.intel.com>
Acked-by: Tim Deegan <tim@xen.org>
Acked-by: Kevin Tian <kevin.tian@intel.com>
Kai Huang [Mon, 4 May 2015 10:18:51 +0000 (12:18 +0200)]
log-dirty: refine common code to support PML

With PML, dirty GPAs may still be logged in vcpus' PML buffers when userspace
peeks at or clears dirty pages, so we need to flush those buffers before
reporting dirty pages to userspace. This applies to both video RAM tracking and
paging_log_dirty_op.

This patch adds new p2m layer functions to enable/disable PML and flush PML
buffers. The new functions are named to be generic to cover potential further
PML-like features for other platforms.

Signed-off-by: Kai Huang <kai.huang@linux.intel.com>
Acked-by: Tim Deegan <tim@xen.org>
Kai Huang [Mon, 4 May 2015 10:17:43 +0000 (12:17 +0200)]
vmx: disable PML in vmx_vcpu_destroy

It's possible that the domain is still in log-dirty mode when it is about to be
destroyed, in which case we should manually disable PML for it.

Signed-off-by: Kai Huang <kai.huang@linux.intel.com>
Acked-by: Tim Deegan <tim@xen.org>
Acked-by: Kevin Tian <kevin.tian@intel.com>
Kai Huang [Mon, 4 May 2015 10:17:10 +0000 (12:17 +0200)]
vmx: handle PML enabling in vmx_vcpu_initialise

It's possible that the domain is already in log-dirty mode when a vcpu is created, in
which case we should enable PML for this vcpu if PML has been enabled for the
domain.

Signed-off-by: Kai Huang <kai.huang@linux.intel.com>
Acked-by: Tim Deegan <tim@xen.org>
Acked-by: Kevin Tian <kevin.tian@intel.com>
Kai Huang [Mon, 4 May 2015 10:15:49 +0000 (12:15 +0200)]
vmx: handle PML buffer full VMEXIT

We need to flush PML buffer when it's full.

Signed-off-by: Kai Huang <kai.huang@linux.intel.com>
Acked-by: Tim Deegan <tim@xen.org>
Acked-by: Kevin Tian <kevin.tian@intel.com>
Kai Huang [Mon, 4 May 2015 10:15:07 +0000 (12:15 +0200)]
vmx: add help functions to support PML

This patch adds helper functions to enable/disable PML, and to flush the PML
buffer for a single vcpu or a particular domain, for further use.

Signed-off-by: Kai Huang <kai.huang@linux.intel.com>
Acked-by: Tim Deegan <tim@xen.org>
Acked-by: Kevin Tian <kevin.tian@intel.com>
Kai Huang [Mon, 4 May 2015 10:14:15 +0000 (12:14 +0200)]
vmx: add new data structure member to support PML

A new 4K page pointer is added to arch_vmx_struct as PML buffer for vcpu. And a
new 'status' field is added to vmx_domain to indicate whether PML is enabled for
the domain or not.

Signed-off-by: Kai Huang <kai.huang@linux.intel.com>
Acked-by: Tim Deegan <tim@xen.org>
Acked-by: Kevin Tian <kevin.tian@intel.com>
Kai Huang [Mon, 4 May 2015 10:12:11 +0000 (12:12 +0200)]
vmx: add PML definition and feature detection

The patch adds PML definition and feature detection. Note PML won't be detected
if PML is disabled from boot parameter. PML is also disabled in construct_vmcs,
as it will only be enabled when domain is switched to log dirty mode.

Signed-off-by: Kai Huang <kai.huang@linux.intel.com>
Acked-by: Tim Deegan <tim@xen.org>
Acked-by: Kevin Tian <kevin.tian@intel.com>
Kai Huang [Mon, 4 May 2015 10:10:41 +0000 (12:10 +0200)]
log-dirty: add new paging_mark_gfn_dirty

PML logs GPAs in the PML buffer. The original paging_mark_dirty takes an MFN as
parameter, but it gets the guest pfn internally and uses it as the index for
looking up the radix log-dirty tree. When flushing the PML buffer, calling
paging_mark_dirty directly introduces redundant p2m lookups (gfn->mfn->gfn),
therefore we introduce paging_mark_gfn_dirty, which is the bulk of
paging_mark_dirty but takes a guest pfn as parameter, and when flushing the PML
buffer we call paging_mark_gfn_dirty directly. The original paging_mark_dirty
then simply becomes a wrapper around paging_mark_gfn_dirty.
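
The refactoring pattern can be sketched as below. It is a toy under stated
assumptions: all demo_* names are hypothetical, the mfn to gfn translation is
a flat array, and a 64-bit bitmap stands in for the radix log-dirty tree.

```c
#include <assert.h>
#include <stdint.h>

enum { DEMO_PAGES = 64 };

static unsigned long demo_m2p[DEMO_PAGES];  /* toy mfn -> gfn table */
static uint64_t demo_dirty_bitmap;          /* bit n set => gfn n is dirty */
static unsigned int demo_m2p_lookups;       /* translations performed */

/* Core routine: mark a guest frame dirty; no translation needed. */
void demo_mark_gfn_dirty(unsigned long gfn)
{
    assert(gfn < DEMO_PAGES);
    demo_dirty_bitmap |= (uint64_t)1 << gfn;
}

/* Wrapper for callers that only have an MFN: translate, then delegate. */
void demo_mark_dirty(unsigned long mfn)
{
    assert(mfn < DEMO_PAGES);
    demo_m2p_lookups++;
    demo_mark_gfn_dirty(demo_m2p[mfn]);
}
```

A caller that already holds a guest frame number, as a PML flush does, goes
straight to the core routine and pays for no translation.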

Signed-off-by: Kai Huang <kai.huang@linux.intel.com>
Acked-by: Tim Deegan <tim@xen.org>
Kai Huang [Mon, 4 May 2015 10:09:03 +0000 (12:09 +0200)]
vmx: add new boot parameter to control PML enabling

A top level EPT parameter "ept=<options>" and a sub boolean "opt_pml_enabled"
are added to control PML. Other booleans can be further added for any other EPT
related features.

The documentation for the new parameter is also added.

Signed-off-by: Kai Huang <kai.huang@linux.intel.com>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
Acked-by: Tim Deegan <tim@xen.org>
Acked-by: Kevin Tian <kevin.tian@intel.com>
Eugene Korenevsky [Mon, 4 May 2015 09:56:21 +0000 (11:56 +0200)]
test_x86_emulate: extend EFLAGS check of CMPXCHG test

CMPXCHG: in the case of inequality of the rAX and the operand,
need to check CF, PF, AF, SF and OF flags as well.

This adjustment covers the fix of incorrect comparison during
CMPXCHG emulation.

Signed-off-by: Eugene Korenevsky <ekorenevsky@gmail.com>
Eugene Korenevsky [Mon, 4 May 2015 09:55:41 +0000 (11:55 +0200)]
x86_emulate: fix EFLAGS setting of CMPXCHG emulation

CMPXCHG sets CF, PF, AF, SF, and OF flags according to the results of the
comparison of rAX with the operand of the instruction.
rAX must be the first argument of the comparison (a minuend), the operand
must be the second one (a subtrahend).

Due to improper order of comparison arguments, CF, PF, AF, SF and OF flags were
set incorrectly in the case of inequality. Need to swap them.
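
Why the operand order matters can be seen from a small model of the
comparison. This is a sketch, not the emulator code: demo_cmp_flags is a
hypothetical helper that computes only CF, ZF and SF for 32-bit operands.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

struct demo_flags {
    bool cf, zf, sf;
};

/* Flags of (minuend - subtrahend), as a CMP on 32-bit operands sets them. */
struct demo_flags demo_cmp_flags(uint32_t minuend, uint32_t subtrahend)
{
    uint32_t diff = minuend - subtrahend;
    struct demo_flags f = {
        .cf = minuend < subtrahend,    /* borrow out of the subtraction */
        .zf = diff == 0,
        .sf = (diff >> 31) != 0,       /* sign bit of the result */
    };

    return f;
}

/* Usage: rAX = 1, operand = 2. Only the first order is what CMPXCHG does. */
void demo_cmpxchg_example(void)
{
    struct demo_flags right = demo_cmp_flags(1, 2);   /* rAX - operand */
    struct demo_flags wrong = demo_cmp_flags(2, 1);   /* operand - rAX */

    assert(right.cf && right.sf && !right.zf);
    assert(!wrong.cf && !wrong.sf && !wrong.zf);
}
```

ZF is the same in both orders, which is why the swap only shows up in the
other flags when the values differ.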

Signed-off-by: Eugene Korenevsky <ekorenevsky@gmail.com>
Chao Peng [Mon, 4 May 2015 09:54:39 +0000 (11:54 +0200)]
x86: improve psr scheduling code

Switching RMID from the previous vcpu to the next vcpu only needs to write
MSR_IA32_PSR_ASSOC once. Writing it with the value of the next vcpu is enough;
there is no need to write '0' first. The idle domain has RMID set to 0, and
because the MSR is already updated lazily, it is simply switched like any other.

Also move the initialization of the per-CPU variable which is used for the lazy
update from context switch to CPU starting.
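
A lazy register update of this kind can be sketched as follows. The model is
hypothetical throughout: a single "CPU", a fake MSR, and a write counter so
the effect is observable; none of the demo_* names come from the patch.

```c
#include <assert.h>
#include <stdint.h>

static uint64_t demo_msr;            /* stand-in for the hardware register */
static uint64_t demo_cached_assoc;   /* last value written on this CPU */
static unsigned int demo_msr_writes; /* how many (expensive) writes happened */

static void demo_wrmsr(uint64_t val)
{
    demo_msr = val;
    demo_msr_writes++;
}

/* Run once when the CPU starts, not on every context switch. */
void demo_cpu_start(void)
{
    demo_msr = 0;
    demo_cached_assoc = 0;
    demo_msr_writes = 0;
}

/* Context switch: program the next vcpu's RMID only if it differs. */
void demo_switch_to(uint64_t next_rmid)
{
    if (demo_cached_assoc != next_rmid) {
        demo_wrmsr(next_rmid);
        demo_cached_assoc = next_rmid;
    }
}

/* Usage: the idle domain (RMID 0) is handled like any other vcpu. */
void demo_psr_example(void)
{
    demo_cpu_start();
    demo_switch_to(5);
    assert(demo_msr_writes == 1);
    demo_switch_to(5);               /* same RMID: no write */
    assert(demo_msr_writes == 1);
    demo_switch_to(0);               /* idle: one write, no zeroing first */
    assert(demo_msr_writes == 2 && demo_msr == 0);
}
```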

Signed-off-by: Chao Peng <chao.p.peng@linux.intel.com>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
Reviewed-by: Dario Faggioli <dario.faggioli@citrix.com>
Jan Beulich [Fri, 24 Apr 2015 10:15:15 +0000 (12:15 +0200)]
libxlu: don't crash on empty lists

Prior to 1a09c5113a ("libxlu: rework internal representation of
setting") empty lists in config files did get accepted. Restore that
behavior.

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Wei Liu <wei.liu2@citrix.com>
Paul Durrant [Fri, 24 Apr 2015 10:14:23 +0000 (12:14 +0200)]
x86/hvm: implicitly disable an ioreq server when it is destroyed

Currently, unless a (non-default) ioreq server is explicitly disabled before
being destroyed, its gmfns will not be placed back into the p2m but still
released back into the ioreq_gmfn mask. This is somewhat counter-intuitive
and easily remedied by this small patch.

Signed-off-by: Paul Durrant <paul.durrant@citrix.com>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
Paul Durrant [Fri, 24 Apr 2015 10:13:48 +0000 (12:13 +0200)]
x86/hvm: actually release ioreq server pages

hvm_free_ioreq_gmfn has the sense of the ioreq_gmfn mask inverted; it
needs to set a bit to release the gmfn, not clear it.
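
The sense of such a mask can be illustrated with a toy allocator. This is a
sketch under assumptions: struct demo_gmfn_pool and both functions are
hypothetical, and a set bit is taken to mean "free", matching the
description above.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

struct demo_gmfn_pool {
    unsigned long base;   /* first gmfn covered by the mask */
    unsigned long mask;   /* bit i set => gmfn (base + i) is free */
};

/* Take a free gmfn. Returns 0 and stores it in *gmfn, or -1 if none left. */
int demo_alloc_gmfn(struct demo_gmfn_pool *p, unsigned long *gmfn)
{
    unsigned int i;

    assert(p != NULL && gmfn != NULL);

    for (i = 0; i < sizeof(p->mask) * CHAR_BIT; i++) {
        if (p->mask & (1UL << i)) {
            p->mask &= ~(1UL << i);    /* allocate: clear the bit */
            *gmfn = p->base + i;
            return 0;
        }
    }
    return -1;
}

/* Give a gmfn back to the pool. */
void demo_free_gmfn(struct demo_gmfn_pool *p, unsigned long gmfn)
{
    unsigned long i;

    assert(p != NULL && gmfn >= p->base);
    i = gmfn - p->base;
    assert(i < sizeof(p->mask) * CHAR_BIT);

    p->mask |= 1UL << i;               /* release: set the bit again */
}
```

With the free path clearing the bit instead, a released gmfn could never be
handed out again.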

Signed-off-by: Paul Durrant <paul.durrant@citrix.com>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
Vitaly Kuznetsov [Fri, 24 Apr 2015 10:07:00 +0000 (12:07 +0200)]
use 'Hardware domain' instead of 'Domain 0' in hwdom_shutdown()

hwdom_shutdown() operates with hardware domains, use the proper wording.
Eliminate pointless braces from switch cases.

Use hardware_domain->domain_id instead of hardware_domid to print the actual
domain ID as in some cases it can differ (e.g. Dom0 dies before the actual HW
domain got created, kexec for the HW domain is being performed,...).

Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Signed-off-by: Jan Beulich <jbeulich@suse.com>
Jan Beulich [Fri, 24 Apr 2015 10:06:26 +0000 (12:06 +0200)]
AMD IOMMU: only translate remapped IO-APIC RTEs

1aeb1156fa ("x86 don't change affinity with interrupt unmasked")
introducing RTE reads prior to the respective interrupt having got
enabled for the first time uncovered a bug in 2ca9fbd739 ("AMD IOMMU:
allocate IRTE entries instead of using a static mapping"): We obviously
shouldn't be translating RTEs for which remapping didn't get set up
yet.

Reported-by: Sander Eikelenboom <linux@eikelenboom.it>
Signed-off-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Suravee Suthikulpanit <Suravee.Suthikulpanit@amd.com>
Tim Deegan [Fri, 24 Apr 2015 10:04:57 +0000 (12:04 +0200)]
passthrough/amd: avoid reading an uninitialized variable

update_intremap_entry_from_msi() doesn't write to its data pointer on
some error paths, so copying that variable into the msg would count
as undefined behaviour.

Signed-off-by: Tim Deegan <tim@xen.org>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Suravee Suthikulpanit <Suravee.Suthikulpanit@amd.com>
Jan Beulich [Thu, 23 Apr 2015 11:10:19 +0000 (13:10 +0200)]
x86/shadow: fix big-memory build

Modifiers to the pointer passed into list_next_entry() are also being
applied to the macro's return type, and hence if the input pointer is
const-qualified, a variable the result gets assigned to would also need
to be. As that doesn't seem desirable here, drop the const qualifier
on the input pointer instead.

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Tim Deegan <tim@xen.org>
Jan Beulich [Thu, 23 Apr 2015 11:09:10 +0000 (13:09 +0200)]
refine C++ header checking compiler invocation

g++ 4.1.x dies with "cc1plus: error: output filename specified twice"
on the currently used construct. That's apparently due to it converting
the manually specified "c++" into "c++-header", and mis-handling that
(which, when using "c++-header" explicitly btw gets mis-handled even
with 4.9.x and also, using "c-header", by the plain C compiler).

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Tim Deegan <tim@xen.org>
Jan Beulich [Thu, 23 Apr 2015 11:08:40 +0000 (13:08 +0200)]
adjust assertion in alloc_heap_pages()

Older gcc warns (and due to -Werror fails) on this ASSERT() now that
"node" is of unsigned type. Make it more useful at once.

Coverity-ID: 1055630

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Tim Deegan <tim@xen.org>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
Andrew Cooper [Thu, 23 Apr 2015 11:07:59 +0000 (13:07 +0200)]
sysctl: zero structures on the stack

None of these structures currently contain a hole.  However, there is a risk
that a change to the structure might introduce a hole, and thus create a
hypervisor stack leak to the toolstack.

Mitigate this risk by preemptively zeroing these structures.  These are not
hotpaths, so the slight overhead is not an issue.
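
The kind of leak being guarded against can be sketched like this. The
structure and function are hypothetical, and memcpy() stands in for the copy
to the caller. Note the sketch relies on the compiler leaving padding alone
on member stores, which holds for common compilers but is not strictly
guaranteed by the C standard.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical reply; common ABIs put 3 padding bytes after 'flags'. */
struct demo_info {
    uint8_t  flags;
    uint32_t nr_cpus;
};

/* Build the reply on the stack and copy it out in full. */
void demo_get_info(void *caller_buf)
{
    struct demo_info info;

    assert(caller_buf != NULL);

    memset(&info, 0, sizeof(info));   /* clears any padding holes too */
    info.flags = 1;
    info.nr_cpus = 8;

    /* Stand-in for the copy to the toolstack: every byte is exposed. */
    memcpy(caller_buf, &info, sizeof(info));
}
```

Without the memset(), the bytes between flags and nr_cpus would carry
whatever happened to be on the stack.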

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Jan Beulich [Thu, 23 Apr 2015 11:05:33 +0000 (13:05 +0200)]
VT-d: replace bogus gprintk()

Just like the other messages in this function this one should be issued
through plain printk() - the current vCPU is irrelevant here. (Noticed
while backporting to older trees, which don't have gprintk().)

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Kevin Tian <kevin.tian@intel.com>
Tim Deegan [Thu, 16 Apr 2015 16:34:24 +0000 (17:34 +0100)]
x86/hvm: refactor code that allocates ioreq gfns.

It was confusing GCC's uninitialized-variable detection.

Signed-off-by: Tim Deegan <tim@xen.org>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
Ian Jackson [Thu, 16 Apr 2015 18:23:28 +0000 (19:23 +0100)]
libxl: fd events: Suppress spurious fd events

Always recheck with poll() right before making the callback.

All sorts of things may have happened since poll() originally signaled
the fd.  We would like the main functional libxl code not to have to
worry about spurious wakeups.

In particular, this fixes a bug in the save/restore callout: the save
helper message reader operates with the fd in blocking mode.  In a
multithreaded program one thread might have eaten all the messages out
of the fd while another one is busy returning from poll and reacquiring
the libxl lock, possibly resulting in a deadlock.
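
A recheck of this kind can be sketched with a zero-timeout poll(). This is
an illustration, not the libxl function: demo_fd_recheck is a hypothetical
name and the error handling is reduced to a bare -1.

```c
#include <assert.h>
#include <errno.h>
#include <poll.h>
#include <unistd.h>

/*
 * Ask the kernel whether 'events' are pending on fd right now.
 * Returns the current revents (0 if nothing is pending), or -1 on error.
 */
int demo_fd_recheck(int fd, short events)
{
    struct pollfd pfd = { .fd = fd, .events = events, .revents = 0 };
    int r;

    do {
        r = poll(&pfd, 1, 0);        /* zero timeout: never blocks */
    } while (r < 0 && errno == EINTR);

    if (r < 0)
        return -1;
    return r > 0 ? pfd.revents : 0;
}

/* Usage: readiness seen earlier can vanish before the callback would run. */
void demo_recheck_example(void)
{
    int fds[2], rc;
    char c = 'x';

    rc = pipe(fds);
    assert(rc == 0);
    assert(demo_fd_recheck(fds[0], POLLIN) == 0);      /* empty pipe */

    rc = (int)write(fds[1], &c, 1);
    assert(rc == 1);
    assert(demo_fd_recheck(fds[0], POLLIN) & POLLIN);  /* data pending */

    rc = (int)read(fds[0], &c, 1);                     /* "another thread" */
    assert(rc == 1);
    assert(demo_fd_recheck(fds[0], POLLIN) == 0);      /* now spurious */

    close(fds[0]);
    close(fds[1]);
}
```

A callback that does a blocking read would hang in the last state, which is
why the recheck happens immediately before the callback is made.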

(Also, we abolish the anomalous direct caller of efd->func.)

Reported-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Reported-by: Jim Fehlig <jfehlig@suse.com>
Signed-off-by: Ian Jackson <Ian.Jackson@eu.citrix.com>
CC: Jim Fehlig <jfehlig@suse.com>
CC: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Tested-by: Jim Fehlig <jfehlig@suse.com>
Acked-by: Ian Campbell <ian.campbell@citrix.com>
Ian Jackson [Thu, 16 Apr 2015 18:23:27 +0000 (19:23 +0100)]
libxl: fd events: Break out fd_occurs

No functional change, only code motion.

Currently, contrary to this function's name, there are two sites where
efd->func() is called so one of them doesn't go through here just yet.
That will be dealt with in the next commit.

Signed-off-by: Ian Jackson <Ian.Jackson@eu.citrix.com>
CC: Jim Fehlig <jfehlig@suse.com>
CC: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Tested-by: Jim Fehlig <jfehlig@suse.com>
Acked-by: Ian Campbell <ian.campbell@citrix.com>
Ian Jackson [Thu, 16 Apr 2015 18:23:26 +0000 (19:23 +0100)]
libxl: fd events: Break out libxl__fd_poll_recheck

Replaces two call sites where a rechecking poll() was open-coded.

No functional change, other than to highly unusual error path
diagnosis, and debug and error message output.

Signed-off-by: Ian Jackson <Ian.Jackson@eu.citrix.com>
CC: Jim Fehlig <jfehlig@suse.com>
CC: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Tested-by: Jim Fehlig <jfehlig@suse.com>
Acked-by: Ian Campbell <ian.campbell@citrix.com>
Andrew Cooper [Tue, 21 Apr 2015 15:47:25 +0000 (16:47 +0100)]
docs/build: Support generation of pandoc documents

pandoc is a superset of markdown

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Acked-by: Ian Campbell <Ian.Campbell@citrix.com>
Andrew Cooper [Tue, 21 Apr 2015 15:47:05 +0000 (16:47 +0100)]
docs/build: Move install checks into individual build targets

For top-level targets which use more than a single program to produce content
(txt already, pdf once pandoc is supported), these current checks are
unsuitable.

By moving the install checks to the rules which actually use the programs,
it is now possible to build a subset of a top-level target depending on the
installed programs.

As a bonus, it removes the need to recurse for txt, man-pages and pdf targets.

A side effect of this is that every individual source which cannot be
generated will have a specific message logged, giving the file and program.
As such, these messages are updated to consistently report the target file
which was not generated.

Finally, update "ifdef foo" to "ifneq($(foo),)" to be more resilient to errors
caused by having foo defined as an empty string.

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Acked-by: Ian Campbell <ian.campbell@citrix.com>
Andrew Cooper [Mon, 20 Apr 2015 10:49:24 +0000 (11:49 +0100)]
docs/build: Do not create directories if we are not going to use them

and be quiet about doing so; these are only intermediate directories.

No practical change, but the build log is roughly halved.

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
CC: Ian Campbell <Ian.Campbell@citrix.com>
CC: Ian Jackson <Ian.Jackson@eu.citrix.com>
CC: Wei Liu <wei.liu2@citrix.com>
Acked-by: Ian Campbell <Ian.Campbell@citrix.com>
Andrew Cooper [Mon, 20 Apr 2015 10:49:23 +0000 (11:49 +0100)]
docs/build: Do not use move-if-changed

Nothing expensive depends on these results.

Also prefer $(INSTALL_DATA) over cp to get correct file attributes (see
fb33b2b "docs: make .txt files over-writable when building from r/o sources")

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
CC: Ian Campbell <Ian.Campbell@citrix.com>
CC: Ian Jackson <Ian.Jackson@eu.citrix.com>
CC: Wei Liu <wei.liu2@citrix.com>
Acked-by: Ian Campbell <ian.campbell@citrix.com>
Andrew Cooper [Mon, 20 Apr 2015 10:49:22 +0000 (11:49 +0100)]
docs/build: Move two rules for consistency, and comment sections

No functional change.

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
CC: Ian Campbell <Ian.Campbell@citrix.com>
CC: Ian Jackson <Ian.Jackson@eu.citrix.com>
CC: Wei Liu <wei.liu2@citrix.com>
Acked-by: Ian Campbell <Ian.Campbell@citrix.com>
Andrew Cooper [Mon, 20 Apr 2015 10:49:21 +0000 (11:49 +0100)]
docs/build: Do not open-code $*

Sometimes there is already a round enough wheel to hand.

No functional change.

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
CC: Ian Campbell <Ian.Campbell@citrix.com>
CC: Ian Jackson <Ian.Jackson@eu.citrix.com>
CC: Wei Liu <wei.liu2@citrix.com>
Acked-by: Ian Campbell <Ian.Campbell@citrix.com>
Andrew Cooper [Mon, 20 Apr 2015 10:49:20 +0000 (11:49 +0100)]
docs/build: Misc cleanup

 * Use $(PANDOC) from ./configure
 * Swap '-N' for its less-obscure longer form
 * Don't explicitly echo about markdown.  The call to markdown is emitted
 * Whitespace cleanup

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
CC: Ian Campbell <Ian.Campbell@citrix.com>
CC: Ian Jackson <Ian.Jackson@eu.citrix.com>
CC: Wei Liu <wei.liu2@citrix.com>
Acked-by: Ian Campbell <Ian.Campbell@citrix.com>
Gustau Perez [Mon, 20 Apr 2015 07:12:52 +0000 (09:12 +0200)]
hotplug/FreeBSD: set network interface MTU to bridge MTU

At creation time, tap and xnb interfaces are created with an mtu of
1500 bytes, assuming the bridge will have the same value.
Instead, check the bridge mtu and configure the new xnb or
tap interface with the same value.

The tools used are sed and ifconfig, both included in base. No need
to install additional ports (no new dependencies).

Signed-off-by: Gustau Perez <gustau.perez@gmail.com>
Acked-by: Roger Pau Monné <roger.pau@citrix.com>
Acked-by: Wei Liu <wei.liu2@citrix.com>
Acked-by: Ian Campbell <ian.cmapbell@citrix.com>
[ ijc -- clarified title ]

Giuseppe Mazzotta [Fri, 17 Apr 2015 15:36:34 +0000 (17:36 +0200)]
libxl: document foreground '-F' option of create command

Signed-off-by: Giuseppe Mazzotta <g.mazzotta@iragan.com>
Acked-by: Ian Campbell <ian.campbell@citrix.com>
Wei Liu [Fri, 17 Apr 2015 11:31:29 +0000 (12:31 +0100)]
libxl: use DEBUG log level instead of INFO

Make libxl less noisy when destroying a domain.

Signed-off-by: Wei Liu <wei.liu2@citrix.com>
Cc: Ian Campbell <ian.campbell@citrix.com>
Cc: Ian Jackson <ian.jackson@eu.citrix.com>
Acked-by: Ian Campbell <ian.campbell@citrix.com>
Linda Jacobson [Wed, 15 Apr 2015 17:02:07 +0000 (11:02 -0600)]
libxl: provide libxl_bitmap_{or,and}

New functions to provide logical AND and OR of two bitmaps.  These are
generically useful utility functions added to the public API for the
benefit of libxl's users.

In the future they may also be useful internally, e.g. in the
vNUMA configuration check function.
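
What such helpers do can be sketched on a toy bitmap type. Nothing here is
the libxl API: struct demo_bitmap and both functions are hypothetical, and
for simplicity all three bitmaps are required to have the same size.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Toy bitmap: 'size' bytes of storage. Not the libxl type. */
struct demo_bitmap {
    size_t size;
    uint8_t *map;
};

/* dst = a | b */
void demo_bitmap_or(struct demo_bitmap *dst,
                    const struct demo_bitmap *a, const struct demo_bitmap *b)
{
    size_t i;

    assert(dst->size == a->size && a->size == b->size);
    for (i = 0; i < dst->size; i++)
        dst->map[i] = a->map[i] | b->map[i];
}

/* dst = a & b */
void demo_bitmap_and(struct demo_bitmap *dst,
                     const struct demo_bitmap *a, const struct demo_bitmap *b)
{
    size_t i;

    assert(dst->size == a->size && a->size == b->size);
    for (i = 0; i < dst->size; i++)
        dst->map[i] = a->map[i] & b->map[i];
}
```

A check such as "is every vcpu covered by some node" then reduces to an OR
over the per-node maps followed by a comparison.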

Signed-off-by: Linda Jacobson <lindaj@jma3.com>
Acked-by: Ian Campbell <ian.campbell@citrix.com>
[ ijc -- rewrote commit message and fixed typo ]

Tamas K Lengyel [Mon, 20 Apr 2015 15:06:24 +0000 (17:06 +0200)]
xen/arm: Enable mem_access on ARM

Signed-off-by: Tamas K Lengyel <tklengyel@sec.in.tum.de>
Acked-by: Ian Campbell <ian.campbell@citrix.com>
Reviewed-by: Julien Grall <julien.grall@linaro.org>
Tamas K Lengyel [Mon, 20 Apr 2015 15:06:23 +0000 (17:06 +0200)]
tools/tests: Enable xen-access on ARM

Switch to use maximum gpfn as the limit to setting permissions. Also,
move HAS_MEM_ACCESS definition into config.

Signed-off-by: Tamas K Lengyel <tklengyel@sec.in.tum.de>
Acked-by: Ian Campbell <ian.campbell@citrix.com>
[ ijc -- removed obsolete reference to test_and_set_bit from the
         commit message ]

Tamas K Lengyel [Mon, 20 Apr 2015 15:06:22 +0000 (17:06 +0200)]
tools/libxc: Allocate magic page for mem access on ARM

Signed-off-by: Tamas K Lengyel <tklengyel@sec.in.tum.de>
Reviewed-by: Julien Grall <julien.grall@linaro.org>
Acked-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
Acked-by: Ian Campbell <ian.campbell@citrix.com>
Julien Grall [Mon, 20 Apr 2015 15:06:21 +0000 (17:06 +0200)]
xen/arm: Implement domain_get_maximum_gpfn

The function domain_get_maximum_gpfn is returning the maximum gpfn ever
mapped in the guest. We can use d->arch.p2m.max_mapped_gfn for this purpose.

We use this in xenaccess so as to avoid the user attempting to set page
permissions on pages which don't exist for the domain, as a non-arch specific
sanity check.

Signed-off-by: Julien Grall <julien.grall@linaro.org>
Acked-by: Ian Campbell <ian.campbell@citrix.com>
Tamas K Lengyel [Mon, 20 Apr 2015 15:06:19 +0000 (17:06 +0200)]
xen/arm: Instruction prefetch abort (X) mem_access event handling

Add missing structure definition for iabt and update the trap handling
mechanism to only inject the exception if the mem_access checker
decides to do so.

Signed-off-by: Tamas K Lengyel <tklengyel@sec.in.tum.de>
Acked-by: Ian Campbell <ian.campbell@citrix.com>
Tamas K Lengyel [Mon, 20 Apr 2015 15:06:18 +0000 (17:06 +0200)]
xen/arm: Data abort exception (R/W) mem_access events

This patch enables storing, setting, checking and delivering LPAE R/W mem_events.
As the LPAE PTEs lack enough available software programmable bits,
we store the permissions in a Radix tree. The tree is only looked at if
mem_access_enabled is turned on.

Signed-off-by: Tamas K Lengyel <tklengyel@sec.in.tum.de>
Acked-by: Ian Campbell <ian.campbell@citrix.com>
Tamas K Lengyel [Mon, 20 Apr 2015 15:06:17 +0000 (17:06 +0200)]
xen/arm: Allow hypervisor access to mem_access protected pages

The hypervisor may use the MMU to verify that the given guest has read/write
access to a given page during hypercalls. As we may have custom mem_access
permissions set on these pages, we do a software-based type checking in case
the MMU based approach failed, but only if mem_access_enabled is set.

These memory accesses are not forwarded to the mem_event listener. Accesses
performed by the hypervisor are currently not part of the mem_access scheme.
This is consistent with the behaviour on the x86 side.

Signed-off-by: Tamas K Lengyel <tklengyel@sec.in.tum.de>
Reviewed-by: Julien Grall <julien.grall@citrix.com>
Acked-by: Ian Campbell <ian.campbell@citrix.com>
Tamas K Lengyel [Mon, 20 Apr 2015 15:06:16 +0000 (17:06 +0200)]
xen/arm: groundwork for mem_access support on ARM

Add necessary changes for page table construction routines to pass
the default access information and hypercall continuation mask. Also,
define necessary functions and data fields to be used later by mem_access.

The p2m_access_t info will be stored in a Radix tree as the PTE lacks
enough software programmable bits, thus in this patch we add the radix-tree
construction/destruction portions. The tree itself will be used later
by mem_access.

Signed-off-by: Tamas K Lengyel <tklengyel@sec.in.tum.de>
Acked-by: Ian Campbell <ian.campbell@citrix.com>
Ian Jackson [Tue, 21 Apr 2015 10:27:59 +0000 (11:27 +0100)]
Config.mk: Fix (and, effectively, update) QEMU_TAG

In 952944f7 "QEMU_TAG update" my tag update script mangled the
machinery which sets QEMU_TRADITIONAL_REVISION, by replacing the first
assignment to QEMU_TRADITIONAL_REVISION it found rather than the one
which ought to have been replaced.

The result was that:
 * From that commit on, QEMU_TAG was no longer honoured although
   QEMU_TRADITIONAL_REVISION still was
 * That particular update to QEMU_TRADITIONAL_REVISION's default
   value was effective
 * The next attempt to update QEMU_TRADITIONAL_REVISION, in
   1fc3aeb3 "libxl: use new QEMU xenstore protocol" was totally
   ineffective.

Fix this by restoring the transfer from QEMU_TAG.  The effects are:
 * Once more, honour QEMU_TAG.
 * Belatedly apply the qemu-trad change part of "libxl: use new QEMU
   xenstore protocol".

(I have also fixed my script to not do this again.)

Signed-off-by: Ian Jackson <Ian.Jackson@eu.citrix.com>
CC: Wei Liu <wei.liu2@citrix.com>
CC: Ian Campbell <ian.campbell@citrix.com>
CC: George Dunlap <george.dunlap@eu.citrix.com>
CC: Jan Beulich <jbeulich@suse.com>
Reported-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Jan Beulich <jbeulich@suse.com>
10 years agosysctl: make XEN_SYSCTL_numainfo a little more efficient
Boris Ostrovsky [Tue, 21 Apr 2015 07:06:00 +0000 (09:06 +0200)]
sysctl: make XEN_SYSCTL_numainfo a little more efficient

A number of changes to XEN_SYSCTL_numainfo interface:

* Make sysctl NUMA topology query use fewer copies by combining some
  fields into a single structure and copying distances for each node
  in a single copy.
* NULL meminfo and distance handles are a request for maximum number
  of nodes (num_nodes). If those handles are valid and num_nodes is
  smaller than the number of nodes in the system then -ENOBUFS is
  returned (and correct num_nodes is provided).
* Instead of using max_node_index for passing number of nodes keep this
  value in num_nodes: almost all uses of max_node_index required adding
  or subtracting one to eventually get to number of nodes anyway.
* Replace INVALID_NUMAINFO_ID with XEN_INVALID_MEM_SZ and add
  XEN_INVALID_NODE_DIST.
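
The size-query convention described above can be sketched as a two-call
pattern. This is only an illustration: query_numainfo() and HOST_NODES are
hypothetical stand-ins, not the real sysctl or its structures, and the
memory and distance values are made up.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

#define HOST_NODES 4u   /* hypothetical: node count of the simulated host */

/* Hypothetical stand-in for the query; not the real Xen interface. */
static int query_numainfo(uint64_t *meminfo, uint32_t *distance,
                          unsigned int *num_nodes)
{
    assert(num_nodes != NULL);

    /* Both handles NULL: the caller only asks how many nodes exist. */
    if (meminfo == NULL && distance == NULL) {
        *num_nodes = HOST_NODES;
        return 0;
    }

    /* Buffers too small: report the required count and fail. */
    if (*num_nodes < HOST_NODES) {
        *num_nodes = HOST_NODES;
        return -ENOBUFS;
    }

    for (unsigned int i = 0; i < HOST_NODES; i++) {
        if (meminfo != NULL)
            meminfo[i] = (uint64_t)(i + 1) << 30;   /* made-up sizes */
        if (distance != NULL)
            for (unsigned int j = 0; j < HOST_NODES; j++)
                distance[i * HOST_NODES + j] = (i == j) ? 10 : 20;
    }

    *num_nodes = HOST_NODES;
    return 0;
}
```

A caller would first pass NULL handles to learn num_nodes, allocate
num_nodes memory entries and num_nodes * num_nodes distances, then call
again with the real buffers.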

Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Acked-by: Ian Campbell <ian.campbell@citrix.com>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
10 years agox86/domctl: don't allow a toolstack domain to pause itself
Andrew Cooper [Tue, 21 Apr 2015 07:05:26 +0000 (09:05 +0200)]
x86/domctl: don't allow a toolstack domain to pause itself

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Reviewed-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
10 years agox86/domctl: cleanup
Andrew Cooper [Tue, 21 Apr 2015 07:04:45 +0000 (09:04 +0200)]
x86/domctl: cleanup

 * latch curr/currd once at start
 * drop redundant "ret = 0" and braces
 * use "copyback = 1" when appropriate
 * move break statements inside case-specific braced scopes
 * don't bother checking for NULL before calling xfree()
 * eliminate trailing whitespace
 * Xen style corrections

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Reviewed-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
10 years agodomctl/sysctl: don't leak hypervisor stack to toolstacks
Andrew Cooper [Tue, 21 Apr 2015 07:03:15 +0000 (09:03 +0200)]
domctl/sysctl: don't leak hypervisor stack to toolstacks

This is CVE-2015-3340 / XSA-132.

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Ian Campbell <ian.campbell@citrix.com>
10 years agox86/efi: Reserve SMBIOS table region when EFI booting
Ross Lagerwall [Fri, 17 Apr 2015 08:44:48 +0000 (10:44 +0200)]
x86/efi: Reserve SMBIOS table region when EFI booting

Some EFI firmware implementations may place the SMBIOS table in RAM
marked as BootServicesData, which Xen does not consider as reserved.
When dom0 tries to access the SMBIOS, the region is not contained in the
initial P2M and it crashes with a page fault. To fix this, reserve the
SMBIOS region.

Also, fix the memcmp checks for existence of the SMBIOS.

Signed-off-by: Ross Lagerwall <ross.lagerwall@citrix.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
10 years agopublic/grant_table.h: fix description of GNTTABOP_map_grant_ref
Rafał Wojdyła [Fri, 17 Apr 2015 08:44:29 +0000 (10:44 +0200)]
public/grant_table.h: fix description of GNTTABOP_map_grant_ref

Error code is not returned in the <handle> field of the
gnttab_map_grant_ref structure but in the <status> field only.

Signed-off-by: Rafał Wojdyła <omeg@invisiblethingslab.com>
10 years agoVMX: replace some plain numbers
Liang Li [Fri, 17 Apr 2015 08:42:13 +0000 (10:42 +0200)]
VMX: replace some plain numbers

... making the code better document itself. No functional change
intended.

Signed-off-by: Liang Li <liang.z.li@intel.com>
Acked-by: Kevin Tian <kevin.tian@intel.com>
10 years agovtpmmgr: execute deep quote in locality 0
Emil Condrea [Wed, 15 Apr 2015 18:00:14 +0000 (21:00 +0300)]
vtpmmgr: execute deep quote in locality 0

Enables deep quote execution for vtpmmgr which can not be started
using locality 2. Flags are used to request additional data to be
present when executing quote. They are interpreted as a bitmask of:
 * VTPM_QUOTE_FLAGS_HASH_UUID
 * VTPM_QUOTE_FLAGS_VTPM_MEASUREMENTS
 * VTPM_QUOTE_FLAGS_GROUP_INFO
 * VTPM_QUOTE_FLAGS_GROUP_PUBKEY

The externData param for TPM_Quote is calculated as:
externData = SHA1 (
       extraInfoFlags
       requestData
       [SHA1 (
          [SHA1 (UUIDs if requested)]
          [SHA1 (vTPM measurements if requested)]
          [SHA1 (vTPM group update policy if requested)]
          [SHA1 (vTPM group public key if requested)]
       ) if flags !=0 ]
)

The response param pcrValues is an array containing requested hashes used
for externData calculation: UUIDs, vTPM measurements, vTPM group update
policy, group public key. At the end of these hashes the PCR values are
appended.

Signed-off-by: Emil Condrea <emilcondrea@gmail.com>
Acked-by: Daniel De Graaf <dgdegra@tycho.nsa.gov>
10 years agovtpm: deep quote flags
Emil Condrea [Wed, 15 Apr 2015 18:00:13 +0000 (21:00 +0300)]
vtpm: deep quote flags

Currently, the flags are not interpreted by vTPM. They are just
packed and sent to vtpmmgr.

Signed-off-by: Emil Condrea <emilcondrea@gmail.com>
Acked-by: Daniel De Graaf <dgdegra@tycho.nsa.gov>
10 years agoxen/vm_event: Add RESUME option to vm_event_op domctl
Tamas K Lengyel [Thu, 9 Apr 2015 14:32:53 +0000 (16:32 +0200)]
xen/vm_event: Add RESUME option to vm_event_op domctl

Thus far mem_access and mem_sharing memops had been able to signal
to Xen to start pulling responses off the corresponding rings. In this patch
we retire these memops and add them as an option to the vm_event_op domctl.

The vm_event_op domctl suboptions are the same for each ring thus we
consolidate them into XEN_VM_EVENT_ENABLE/DISABLE/RESUME.

As part of this patch in libxc we also rename the mem_access_enable/disable
functions to monitor_enable/disable and move them into xc_monitor.c.

Signed-off-by: Tamas K Lengyel <tamas.lengyel@zentific.com>
Acked-by: Wei Liu <wei.liu2@citrix.com>
Acked-by: Tim Deegan <tim@xen.org>
10 years agoxen/xsm: Split vm_event_op into three separate labels
Tamas K Lengyel [Thu, 9 Apr 2015 14:32:52 +0000 (16:32 +0200)]
xen/xsm: Split vm_event_op into three separate labels

The XSM label vm_event_op has been used to control the three memops
controlling mem_access, mem_paging and mem_sharing. While these systems
rely on vm_event, these are not vm_event operations themselves. Thus,
in this patch we introduce three separate labels for each of these memops.

Signed-off-by: Tamas K Lengyel <tamas.lengyel@zentific.com>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
Acked-by: Daniel De Graaf <dgdegra@tycho.nsa.gov>
Acked-by: Tim Deegan <tim@xen.org>
10 years agoxen/vm_event: Relocate memop checks
Tamas K Lengyel [Thu, 9 Apr 2015 14:32:51 +0000 (16:32 +0200)]
xen/vm_event: Relocate memop checks

The memop handler function for paging/sharing responsible for calling XSM
doesn't really have anything to do with vm_event, thus in this patch we
relocate it into mem_paging_memop and mem_sharing_memop. This has already
been the approach in mem_access_memop, so in this patch we just make it
consistent.

Signed-off-by: Tamas K Lengyel <tamas.lengyel@zentific.com>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
Reviewed-by: Tim Deegan <tim@xen.org>
10 years agoxen/vm_event: Decouple vm_event and mem_access.
Tamas K Lengyel [Thu, 9 Apr 2015 14:32:50 +0000 (16:32 +0200)]
xen/vm_event: Decouple vm_event and mem_access.

The vm_event subsystem has been artificially tied to the presence of mem_access.
While mem_access does depend on vm_event, vm_event is an entirely independent
subsystem that can be used for arbitrary function-offloading to helper apps in
domains. This patch removes the dependency that mem_access needs to be supported
in order to enable vm_event.

A new vm_event_resume function is introduced which pulls all responses off from
given ring and delegates handling to appropriate helper functions (if
necessary). By default, vm_event_resume just pulls the response from the ring
and unpauses the corresponding vCPU. This approach reduces code duplication
and presents a single point of entry for the entire vm_event subsystem's
response handling mechanism.

Signed-off-by: Tamas K Lengyel <tamas.lengyel@zentific.com>
Acked-by: Daniel De Graaf <dgdegra@tycho.nsa.gov>
Acked-by: Tim Deegan <tim@xen.org>
10 years agoxen/vm_event: Deprecate VM_EVENT_FLAG_DUMMY flag
Tamas K Lengyel [Thu, 9 Apr 2015 14:32:49 +0000 (16:32 +0200)]
xen/vm_event: Deprecate VM_EVENT_FLAG_DUMMY flag

There are no use-cases for this flag.

Signed-off-by: Tamas K Lengyel <tamas.lengyel@zentific.com>
Acked-by: Tim Deegan <tim@xen.org>
10 years agoxen: Introduce monitor_op domctl
Tamas K Lengyel [Thu, 9 Apr 2015 14:32:48 +0000 (16:32 +0200)]
xen: Introduce monitor_op domctl

In preparation for allowing for introspecting ARM and PV domains the old
control interface via the hvm_op hypercall is retired. A new control mechanism
is introduced via the domctl hypercall: monitor_op.

This patch aims to establish a base API on which future applications can
build.

Suggested-by: Andrew Cooper <andrew.cooper3@citrix.com>
Signed-off-by: Tamas K Lengyel <tamas.lengyel@zentific.com>
Acked-by: Ian Campbell <ian.campbell@citrix.com>
Acked-by: Kevin Tian <kevin.tian@intel.com>
Acked-by: Daniel De Graaf <dgdegra@tycho.nsa.gov>
Acked-by: Tim Deegan <tim@xen.org>
10 years agolibxenstat: qmp_read fix and cleanup
Wei Liu [Wed, 8 Apr 2015 16:08:22 +0000 (17:08 +0100)]
libxenstat: qmp_read fix and cleanup

The second argument of poll(2) is the number of file descriptors. POLLIN
is defined as 1 so it happens to work. Also reduce the size of array to
one as there is only one file descriptor.
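
A minimal sketch of the corrected call shape, assuming a single descriptor
is being watched. wait_readable() is a hypothetical helper written for this
illustration and is not the libxenstat code itself.

```c
#include <assert.h>
#include <poll.h>
#include <unistd.h>

/*
 * Returns 1 if fd has data to read, 0 otherwise (timeout or some other
 * event), and -1 if poll() itself failed.
 */
static int wait_readable(int fd, int timeout_ms)
{
    struct pollfd pfd[1];       /* exactly one descriptor is watched */
    int rc;

    assert(fd >= 0);
    pfd[0].fd = fd;
    pfd[0].events = POLLIN;
    pfd[0].revents = 0;

    /* The second argument is the array length, not an event mask. */
    rc = poll(pfd, 1, timeout_ms);
    if (rc < 0)
        return -1;

    return (rc > 0 && (pfd[0].revents & POLLIN)) ? 1 : 0;
}
```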

Signed-off-by: Wei Liu <wei.liu2@citrix.com>
Cc: Ian Campbell <ian.campbell@citrix.com>
Cc: Ian Jackson <ian.jackson@eu.citrix.com>
Cc: Charles Arnold <carnold@suse.com>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
Acked-by: Ian Campbell <ian.campbell@citrix.com>
10 years agolibxenstat: always free qmp_stats
Wei Liu [Wed, 8 Apr 2015 16:08:21 +0000 (17:08 +0100)]
libxenstat: always free qmp_stats

Originally qmp_stats was only freed in the failure path and leaked in the
success path.

Instead of wiring up the success path, rearrange the code a bit to
always free qmp_stats before checking if info is NULL.

Signed-off-by: Wei Liu <wei.liu2@citrix.com>
Cc: Ian Campbell <ian.campbell@citrix.com>
Cc: Ian Jackson <ian.jackson@eu.citrix.com>
Cc: Charles Arnold <carnold@suse.com>
Acked-by: Ian Jackson <ian.jackson@eu.citrix.com>
Acked-by: Ian Campbell <ian.campbell@citrix.com>
10 years agolibxenstat: YAJL_GET_STRING may return NULL
Wei Liu [Wed, 8 Apr 2015 16:08:20 +0000 (17:08 +0100)]
libxenstat: YAJL_GET_STRING may return NULL

Passing NULL to strcmp can cause segmentation fault. Continue in that
case.
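
The guard can be sketched without YAJL as follows. count_matches() and its
array of possibly-NULL strings are hypothetical; they only stand in for
values obtained from a getter that may return NULL.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Counts the entries equal to key. NULL entries are skipped rather than
 * handed to strcmp(), whose behaviour on NULL is undefined.
 */
static size_t count_matches(const char *const *names, size_t n,
                            const char *key)
{
    size_t hits = 0;

    assert(key != NULL);
    for (size_t i = 0; i < n; i++) {
        if (names[i] == NULL)
            continue;           /* nothing to compare, move on */
        if (strcmp(names[i], key) == 0)
            hits++;
    }
    return hits;
}
```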

Signed-off-by: Wei Liu <wei.liu2@citrix.com>
Cc: Ian Campbell <ian.campbell@citrix.com>
Cc: Ian Jackson <ian.jackson@eu.citrix.com>
Cc: Charles Arnold <carnold@suse.com>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
Acked-by: Ian Jackson <ian.jackson@eu.citrix.com>
Acked-by: Ian Campbell <ian.campbell@citrix.com>
10 years agolibxenstat: reuse xc_handle open in xenstat_init
Wei Liu [Wed, 8 Apr 2015 16:08:19 +0000 (17:08 +0100)]
libxenstat: reuse xc_handle open in xenstat_init

Signed-off-by: Wei Liu <wei.liu2@citrix.com>
Cc: Ian Campbell <ian.campbell@citrix.com>
Cc: Ian Jackson <ian.jackson@eu.citrix.com>
Cc: Charles Arnold <carnold@suse.com>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
Acked-by: Ian Campbell <ian.campbell@citrix.com>
10 years agolibxl: check return value of libxl_vcpu_setaffinity
Wei Liu [Wed, 8 Apr 2015 16:05:24 +0000 (17:05 +0100)]
libxl: check return value of libxl_vcpu_setaffinity

That function can fail.

Signed-off-by: Wei Liu <wei.liu2@citrix.com>
Cc: Ian Campbell <ian.campbell@citrix.com>
Cc: Ian Jackson <ian.jackson@eu.citrix.com>
Cc: Dario Faggioli <dario.faggioli@citrix.com>
Reviewed-by: Dario Faggioli <dario.faggioli@citrix.com>
Acked-by: Ian Jackson <ian.jackson@eu.citrix.com>
Acked-by: Ian Campbell <ian.campbell@citrix.com>
10 years agoxen/arm: Don't write to GICH_MISR
Edgar E. Iglesias [Fri, 10 Apr 2015 06:21:10 +0000 (16:21 +1000)]
xen/arm: Don't write to GICH_MISR

GICH_MISR is read-only in GICv2.

Signed-off-by: Edgar E. Iglesias <edgar.iglesias@xilinx.com>
Reviewed-by: Julien Grall <julien.grall@citrix.com>
Acked-by: Ian Campbell <ian.campbell@citrix.com>
10 years agoREADME: Reference some more comprehensive docs from the Quick-start
Ian Campbell [Tue, 14 Apr 2015 15:25:49 +0000 (16:25 +0100)]
README: Reference some more comprehensive docs from the Quick-start

The quick-start is not terribly comprehensive for beginners.

Signed-off-by: Ian Campbell <ian.campbell@citrix.com>
Acked-by: Jan Beulich <JBeulich@suse.com>
10 years agoxenstore: document xs_set_permissions
Wei Liu [Tue, 31 Mar 2015 12:26:11 +0000 (13:26 +0100)]
xenstore: document xs_set_permissions

Signed-off-by: Wei Liu <wei.liu2@citrix.com>
Cc: Ian Campbell <ian.campbell@citrix.com>
Cc: Ian Jackson <ian.jackson@eu.citrix.com>
Acked-by: Ian Campbell <ian.campbell@citrix.com>
10 years agolibxl/vcpu-set - allow to decrease vcpu count on overcommitted guests (v5)
Konrad Rzeszutek Wilk [Fri, 3 Apr 2015 20:02:34 +0000 (16:02 -0400)]
libxl/vcpu-set - allow to decrease vcpu count on overcommitted guests (v5)

We have a check to warn the user if they are overcommitting.
But the check only checks the host's CPU count and does
not take into account the case when the user is trying to fix
the overcommit. That is - they want to limit the amount of
online VCPUs.

This fix allows the user to offline vCPUs without any
warnings when they are running an overcommitted guest.

Also fix the extra space in the message.

Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Acked-by: Ian Campbell <ian.campbell@citrix.com>
10 years agolibxl/vcpuset: Remove useless limit on max_vcpus.
Konrad Rzeszutek Wilk [Fri, 3 Apr 2015 20:02:33 +0000 (16:02 -0400)]
libxl/vcpuset: Remove useless limit on max_vcpus.

The check is superfluous. If the 'max_vcpus' (argument
value) is greater than pCPU and --ignore-host has not
been supplied we would print a warning and return
and not call this code.

If the --ignore-host parameter had been used we would
never end up in this condition and enforce 'max_vcpus'.

The only time it would be invoked is if max_vcpus < host_cpu
in which case it would set max_vcpus to max_vcpus.

In short - it is dead code.

Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Acked-by: Ian Campbell <ian.campbell@citrix.com>
10 years agolibxl/vcpuset: Return error value if failed.
Konrad Rzeszutek Wilk [Fri, 3 Apr 2015 20:02:32 +0000 (16:02 -0400)]
libxl/vcpuset: Return error value if failed.

The function does not return any values at all. Convert the
internal libxl errors (ERROR_FAIL, ..., etc) to 1.

Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Acked-by: Ian Campbell <ian.campbell@citrix.com>
10 years agolibxl/vcpuset: Print error if libxl_set_vcpuonline returns ERROR_DOMAIN_NOTFOUND
Konrad Rzeszutek Wilk [Fri, 3 Apr 2015 20:02:31 +0000 (16:02 -0400)]
libxl/vcpuset: Print error if libxl_set_vcpuonline returns ERROR_DOMAIN_NOTFOUND

Instead of just printing a generic error.

Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Acked-by: Ian Campbell <ian.campbell@citrix.com>
10 years agolibxl: In libxl_set_vcpuonline check for maximum number of VCPUs against the cpumap.
Konrad Rzeszutek Wilk [Fri, 3 Apr 2015 20:02:29 +0000 (16:02 -0400)]
libxl: In libxl_set_vcpuonline check for maximum number of VCPUs against the cpumap.

There is no sense in trying to online (or offline) CPUs when the size of
cpumap is greater than the maximum number of VCPUs the guest can go to.

As such fail the operation if the count of CPUs to online is greater
than what the guest started with. For the offline case we do not
check (as the bits are unset in the cpumap) and let it go through.
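
As a rough sketch of the check described above, assuming the cpumap fits in
a single 64-bit word: check_online_request() and ERR_INVAL are hypothetical
names for illustration, not libxl's own functions or error codes.

```c
#include <assert.h>
#include <stdint.h>

#define ERR_INVAL (-1)  /* hypothetical error code, not libxl's */

/* Counts the set bits, i.e. the vCPUs the caller wants online. */
static unsigned int count_set_bits(uint64_t map)
{
    unsigned int n = 0;

    while (map) {
        map &= map - 1;         /* clear the lowest set bit */
        n++;
    }
    return n;
}

/* Fails if more vCPUs are requested online than the guest's maximum. */
static int check_online_request(uint64_t cpumap, unsigned int max_vcpus)
{
    assert(max_vcpus <= 64);
    if (count_set_bits(cpumap) > max_vcpus)
        return ERR_INVAL;
    return 0;                   /* offlining (cleared bits) always passes */
}
```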

We coalesce some of the underlying libxl_set_vcpuonline code
together which was duplicated in QMP and XenStore codepaths.

Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Acked-by: Ian Campbell <ian.campbell@citrix.com>
10 years agolibxl: Add ERROR_DOMAIN_NOTFOUND for libxl_domain_info when it cannot find the domain
Konrad Rzeszutek Wilk [Fri, 3 Apr 2015 20:02:28 +0000 (16:02 -0400)]
libxl: Add ERROR_DOMAIN_NOTFOUND for libxl_domain_info when it cannot find the domain

And use that for all of its callers in the tree.

Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Acked-by: Ian Campbell <ian.campbell@citrix.com>
10 years agolibxl: Cope with pipes which signal POLLHUP|POLLIN on read eof
Ian Jackson [Tue, 7 Apr 2015 13:05:28 +0000 (14:05 +0100)]
libxl: Cope with pipes which signal POLLHUP|POLLIN on read eof

Some operating systems (including Linux and FreeBSD[1]) signal not
(only) POLLIN when a reading pipe reaches EOF, but POLLHUP (with or
without POLLIN).  This is permitted[2].  The implications are that in
the general case it is not possible to determine whether POLLHUP
indicates an error or simply eof without attempting a read.

Datacopiers mishandle this, because they always treat POLLHUP
exceptionally (either reporting it via callback_pollhup, or treating
it as an error).  datacopiers reading from pipes on such OSs can fail
(perhaps leaving some data unprocessed) rather than completing
successfully.

[1] http://www.greenend.org.uk/rjk/tech/poll.html
[2] http://pubs.opengroup.org/onlinepubs/9699919799/functions/poll.html

Distinguishing POLLHUP is needed for pty fds, but most callers in
libxl do not care about POLLHUP except as an error or eof condition.

So change the datacopier semantics so that if callback_pollhup is not
specified we treat POLLHUP almost like POLLIN.  The difference is that
if we get HUP from poll, but EWOULDBLOCK from read, we must signal an
error rather than attempting the read again.
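
The semantics above can be sketched with plain poll(2) and read(2). This is
a simplified, blocking illustration and not the datacopier code:
drain_until_eof() is a hypothetical helper, and the one second timeout is an
arbitrary choice.

```c
#include <assert.h>
#include <errno.h>
#include <poll.h>
#include <unistd.h>

/* Reads fd until EOF. Returns the total bytes read, or -1 on error. */
static long drain_until_eof(int fd)
{
    char buf[256];
    long total = 0;

    assert(fd >= 0);
    for (;;) {
        struct pollfd pfd = { .fd = fd, .events = POLLIN, .revents = 0 };
        int rc = poll(&pfd, 1, 1000);
        ssize_t n;

        if (rc < 0) {
            if (errno == EINTR)
                continue;
            return -1;
        }
        if (rc == 0)
            return -1;          /* timed out waiting for data or EOF */
        if (!(pfd.revents & (POLLIN | POLLHUP)))
            return -1;          /* POLLERR, POLLNVAL or similar */

        /* POLLHUP is handled like POLLIN: only read() can tell us EOF. */
        n = read(fd, buf, sizeof(buf));
        if (n > 0) {
            total += n;
            continue;
        }
        if (n == 0)
            return total;       /* genuine EOF */
        if (errno == EINTR)
            continue;
        /* Without POLLHUP, a would-block result just means "poll again". */
        if ((errno == EAGAIN || errno == EWOULDBLOCK) &&
            !(pfd.revents & POLLHUP))
            continue;
        return -1;              /* HUP plus EWOULDBLOCK is an error */
    }
}
```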

This fixes the problem which 7e9ec50b0535 was aimed at.

Signed-off-by: Ian Jackson <Ian.Jackson@eu.citrix.com>
CC: Ian Campbell <ian.campbell@citrix.com>
CC: Andrew Cooper <andrew.cooper3@citrix.com>
CC: Roger Pau Monné <roger.pau@citrix.com>
CC: Ross Lagerwall <ross.lagerwall@citrix.com>
CC: Wei Liu <wei.liu2@citrix.com>
Acked-by: Wei Liu <wei.liu2@citrix.com>
Acked-by: Ian Campbell <ian.campbell@citrix.com>
10 years agolibxl: datacopier: Avoid eof/POLLHUP race
Ian Jackson [Tue, 7 Apr 2015 13:05:27 +0000 (14:05 +0100)]
libxl: datacopier: Avoid eof/POLLHUP race

When the bootloader exits, several things change, all at once:
 (a) The master pty fd (held by libxl) starts to signal POLLHUP
    and maybe also POLLIN.
 (b) The child exits (so that the SIGCHLD self-pipe signals POLLIN,
    which will be handled by the libxl child process code).
 (c) reads on the master pty fd start to return EOF

From the point of view of the datacopier these might happen in any
order.

(c) can be detected only after a previous POLLIN without POLLHUP and
that previous POLLIN would be associated with data which was read,
which must therefore have ended up in the dc's buffer.  But nothing
stops the dc from writing that data into the output fd and reporting
eof before it calls poll again.

This race is unlikely.  But nevertheless it should be fixed.

We solve the race with a poll of the reading fd, to double-check, when
we detect eof via read.  (This is only necessary if the caller has
specified callback_pollhup, as otherwise POLLHUP|POLLIN - and,
presumably, POLLIN followed perhaps by POLLHUP|POLLIN, is to be
treated as eof anyway.)

With a testing patch supplied by me, Roger Pau Monné has reproduced
the failure on FreeBSD and verified that this patch fixes the problem.

Signed-off-by: Ian Jackson <Ian.Jackson@eu.citrix.com>
Tested-by: Roger Pau Monné <roger.pau@citrix.com>
Acked-by: Ian Campbell <ian.campbell@citrix.com>
CC: Andrew Cooper <andrew.cooper3@citrix.com>
CC: Ross Lagerwall <ross.lagerwall@citrix.com>
CC: Wei Liu <wei.liu2@citrix.com>
10 years agox86/vMSI-X: add valid bits for read acceleration
Jan Beulich [Tue, 14 Apr 2015 14:51:18 +0000 (16:51 +0200)]
x86/vMSI-X: add valid bits for read acceleration

Again because Xen doesn't get to see all guest writes, it shouldn't
serve reads from its cache before having seen a write to the respective
address.

Also use DECLARE_BITMAP() in a related field declaration instead of
open coding it.
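
The idea of gating cached reads on a per-entry valid bit can be sketched as
below. struct reg_cache, NR_ENTRIES and the helper names are hypothetical,
and BITMAP_WORDS() is a local stand-in for what a DECLARE_BITMAP()-style
macro would size; none of this is the actual vMSI-X code.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>
#include <stdint.h>

#define NR_ENTRIES      64u     /* hypothetical table size */
#define BITS_PER_WORD   (sizeof(unsigned long) * CHAR_BIT)
#define BITMAP_WORDS(n) (((n) + BITS_PER_WORD - 1) / BITS_PER_WORD)

struct reg_cache {
    uint32_t value[NR_ENTRIES];
    unsigned long valid[BITMAP_WORDS(NR_ENTRIES)];  /* one bit per entry */
};

/* Records an observed write and marks the entry as valid. */
static void cache_write(struct reg_cache *c, unsigned int idx, uint32_t v)
{
    assert(idx < NR_ENTRIES);
    c->value[idx] = v;
    c->valid[idx / BITS_PER_WORD] |= 1UL << (idx % BITS_PER_WORD);
}

/* Serves a read from the cache only if a write was seen for this entry. */
static bool cache_read(const struct reg_cache *c, unsigned int idx,
                       uint32_t *out)
{
    assert(idx < NR_ENTRIES);
    if (!(c->valid[idx / BITS_PER_WORD] & (1UL << (idx % BITS_PER_WORD))))
        return false;           /* caller must fall back to the slow path */
    *out = c->value[idx];
    return true;
}
```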

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
10 years agox86/vMSI-X: honor all mask requests
Jan Beulich [Tue, 14 Apr 2015 14:50:35 +0000 (16:50 +0200)]
x86/vMSI-X: honor all mask requests

Commit 74fd0036de ("x86: properly handle MSI-X unmask operation from
guests") didn't go far enough: it fixed an issue with unmasking, but
left an issue with masking in place: Due to the (late) point in time
when qemu requests the hypervisor to set up MSI-X interrupts (which is
where the MMIO intercept gets put in place), the hypervisor doesn't
see all guest writes, and hence shouldn't make assumptions on the state
the virtual MSI-X resources are in. Bypassing the rest of the logic on
a guest mask operation leads to

[00:04.0] pci_msix_write: Error: Can't update msix entry 1 since MSI-X is already enabled.

which surprisingly enough doesn't lead to the device not working
anymore (I didn't dig in deep enough to figure out why that is). But it
does prevent the IRQ from being migrated inside the guest, i.e. all
interrupts will always arrive in vCPU 0.

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
10 years agox86: use real assert frames for ASSERT_INTERRUPTS_{EN,DIS}ABLED
Andrew Cooper [Tue, 14 Apr 2015 13:29:19 +0000 (15:29 +0200)]
x86: use real assert frames for ASSERT_INTERRUPTS_{EN,DIS}ABLED

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
10 years agox86: infrastructure to create BUG_FRAMES in asm code
Andrew Cooper [Tue, 14 Apr 2015 13:07:24 +0000 (15:07 +0200)]
x86: infrastructure to create BUG_FRAMES in asm code

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
10 years agox86: set regs->entry_vector for early_page_fault
Don Slutz [Tue, 14 Apr 2015 13:03:27 +0000 (15:03 +0200)]
x86: set regs->entry_vector for early_page_fault

This changes:

(XEN) Early fatal page fault at e008:ffff82d080164252 (cr2=0000000000000000, ec=0000)
(XEN) ----[ Xen-4.6-unstable  x86_64  debug=y  Not tainted ]----
(XEN) CPU:    0
(XEN) RIP:    e008:[<ffff82d080164252>] arch_domain_create+0x3e/0x4ef
...
(XEN) Xen call trace:
(XEN)    [<ffff82d080164252>] arch_domain_create+0x3e/0x4ef
(XEN)    [<ffff82d080105262>] domain_create+0x384/0x556
(XEN)    [<ffff82d0802a0de4>] scheduler_init+0x1c4/0x244
(XEN)    [<ffff82d0802be359>] __start_xen+0x1d0e/0x22a1
(XEN)    [<ffff82d080100067>] __high_start+0x53/0x58
(XEN)
(XEN)
(XEN) ****************************************
(XEN) Panic on CPU 0:
(XEN) FATAL TRAP: vector = 0 (divide error)
(XEN) [error_code=0000] , IN INTERRUPT CONTEXT
(XEN) ****************************************
...

to:

(XEN) Early fatal page fault at e008:ffff82d080164252 (cr2=0000000000000000, ec=0000)
(XEN) ----[ Xen-4.6-unstable  x86_64  debug=y  Not tainted ]----
(XEN) CPU:    0
(XEN) RIP:    e008:[<ffff82d080164252>] arch_domain_create+0x3e/0x4ef
...
(XEN) Xen call trace:
(XEN)    [<ffff82d080164252>] arch_domain_create+0x3e/0x4ef
(XEN)    [<ffff82d080105262>] domain_create+0x384/0x556
(XEN)    [<ffff82d0802a0de4>] scheduler_init+0x1c4/0x244
(XEN)    [<ffff82d0802be359>] __start_xen+0x1d0e/0x22a1
(XEN)    [<ffff82d080100067>] __high_start+0x53/0x58
(XEN)
(XEN) Faulting linear address: 0000000000000000
(XEN) Pagetable walk from 0000000000000000:
(XEN)  L4[0x000] = 000000083a1a6063 ffffffffffffffff
(XEN)  L3[0x000] = 000000083a1a5063 ffffffffffffffff
(XEN)  L2[0x000] = 000000083a1a4063 ffffffffffffffff
(XEN)  L1[0x000] = 0000000000000000 ffffffffffffffff
(XEN)
(XEN) ****************************************
(XEN) Panic on CPU 0:
(XEN) FATAL TRAP: vector = 14 (page fault)
(XEN) [error_code=0000] , IN INTERRUPT CONTEXT
(XEN) ****************************************
...

Signed-off-by: Don Slutz <dslutz@verizon.com>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
10 years agox86/mtrr: include asm/atomic.h
David Vrabel [Tue, 14 Apr 2015 13:02:32 +0000 (15:02 +0200)]
x86/mtrr: include asm/atomic.h

asm/atomic.h is needed but only included indirectly via
asm/spinlock.h.

Signed-off-by: David Vrabel <david.vrabel@citrix.com>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
10 years agox86/hvm: don't include asm/spinlock.h
David Vrabel [Tue, 14 Apr 2015 13:02:10 +0000 (15:02 +0200)]
x86/hvm: don't include asm/spinlock.h

asm/spinlock.h should not be included directly.

Signed-off-by: David Vrabel <david.vrabel@citrix.com>
Acked-by: Andrew Cooper <andrew.cooper3@citrix.com>
10 years agoRevert "x86/hvm: wait for at least one ioreq server to be enabled"
Wei Liu [Tue, 14 Apr 2015 13:01:14 +0000 (15:01 +0200)]
Revert "x86/hvm: wait for at least one ioreq server to be enabled"

We don't need this workaround anymore since we have fixed the toolstack
interlock problem that affects stubdom.

Signed-off-by: Wei Liu <wei.liu2@citrix.com>
Acked-by: Ian Campbell <ian.campbell@citrix.com>
Reviewed-by: Paul Durrant <paul.durrant@citrix.com>
10 years agox86: clean up psr boot parameter parsing
Chao Peng [Tue, 14 Apr 2015 13:00:44 +0000 (15:00 +0200)]
x86: clean up psr boot parameter parsing

Change type of opt_psr from bool to int so more psr features can fit.

Introduce a new routine to parse bool parameter so that both cmt and
future psr features like cat can use it.
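
One way such a shared helper could look, purely as a sketch: the PSR_CMT and
PSR_CAT bit values, the function name and the accepted "name", "name=1" and
"name=0" token syntax are all assumptions made for this illustration, not
the parser added by the patch.

```c
#include <assert.h>
#include <string.h>

#define PSR_CMT (1u << 0)   /* hypothetical bit assignments */
#define PSR_CAT (1u << 1)

/*
 * If token is "name" or "name=1", sets mask in *flags; if it is "name=0",
 * clears mask. Returns 1 when the token matched name, 0 otherwise.
 */
static int parse_bool_feature(const char *token, const char *name,
                              unsigned int mask, unsigned int *flags)
{
    size_t len;

    assert(token != NULL && name != NULL && flags != NULL);
    len = strlen(name);
    if (strncmp(token, name, len) != 0)
        return 0;

    if (token[len] == '\0' || strcmp(token + len, "=1") == 0) {
        *flags |= mask;
        return 1;
    }
    if (strcmp(token + len, "=0") == 0) {
        *flags &= ~mask;
        return 1;
    }
    return 0;                   /* e.g. "cmtx" or "cmt=maybe" */
}
```

Keeping the flags in one integer means a new feature needs only a new bit
and one more call to the same helper.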

Signed-off-by: Chao Peng <chao.p.peng@linux.intel.com>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
10 years agodocs: efi: given some hint about the dom0 command line
Ian Campbell [Tue, 14 Apr 2015 13:00:28 +0000 (15:00 +0200)]
docs: efi: given some hint about the dom0 command line

Suggested-by: Carlos Gustavo Ramirez Rodriguez <carlosgrr@gmail.com>
Signed-off-by: Ian Campbell <ian.campbell@citrix.com>
Reviewed-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
10 years agox86/traps: identify the vcpu in context when dumping registers
Andrew Cooper [Tue, 14 Apr 2015 12:59:53 +0000 (14:59 +0200)]
x86/traps: identify the vcpu in context when dumping registers

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
10 years agox86/cpuidle: identify a legitimate fallthrough case
Andrew Cooper [Tue, 14 Apr 2015 12:59:37 +0000 (14:59 +0200)]
x86/cpuidle: identify a legitimate fallthrough case

to appease the Missing Break checker.

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Coverity-id: 1291938

10 years agosched_credit2: more info when dumping
Dario Faggioli [Tue, 14 Apr 2015 12:58:52 +0000 (14:58 +0200)]
sched_credit2: more info when dumping

more specifically, for each runqueue, print what pCPUs
belong to it, which ones are idle and which ones have
been tickled.

While there, also convert the whole file to use
keyhandler_scratch for printing cpumask-s.

Signed-off-by: Dario Faggioli <dario.faggioli@citrix.com>
Reviewed-by: George Dunlap <george.dunlap@eu.citrix.com>