Shuai Ruan [Wed, 25 Nov 2015 16:23:51 +0000 (17:23 +0100)]
x86/xsaves: enable xsaves/xrstors for hvm guest
This patch enables xsaves for hvm guests. It includes:
1. handle xsaves vmcs init and vmexit.
2. add logic to write/read the XSS msr.
Add IA32_XSS_MSR save/restore support.
Signed-off-by: Shuai Ruan <shuai.ruan@linux.intel.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
Shuai Ruan [Wed, 25 Nov 2015 16:20:05 +0000 (17:20 +0100)]
x86/xsaves: enable xsaves/xrstors/xsavec in xen
This patch uses xsaves/xrstors/xsavec instead of xsaveopt/xrstor
to perform the xsave_area switching so that xen itself
can benefit from them when available.
For xsaves/xrstors/xsavec only the compact format is used. Add format
conversion support for guest OS migration. Also, pv guests will not
support xsaves/xrstors.
Signed-off-by: Shuai Ruan <shuai.ruan@linux.intel.com>
[dropped redundant uses of XRSTOR_FIXUP and fix formatting]
Signed-off-by: Jan Beulich <jbeulich@suse.com>
Shuai Ruan [Wed, 25 Nov 2015 16:19:45 +0000 (17:19 +0100)]
x86/xsaves: use named operands instead of numbered operands in xrstor
This is a prerequisite patch for the later xsaves patches. It introduces
a macro to handle the restore fixup, and uses named operands instead of
numbered operands in the restore fixup code.
Signed-off-by: Shuai Ruan <shuai.ruan@intel.com>
[with the expectation of later doing some cleanup:]
Acked-by: Jan Beulich <jbeulich@suse.com>
Jonathan Creekmore [Wed, 25 Nov 2015 16:19:01 +0000 (17:19 +0100)]
build: remove .d files from xen/ on a clean
Dependency files were getting left behind in the xen
directory (since
8b6ef9c152edceabecc7f90c811cd538a7b7a110),
so append the $(DEPS) to the clean rule that runs in the
hypervisor directory.
Signed-off-by: Jonathan Creekmore <jonathan.creekmore@gmail.com>
Jan Beulich [Wed, 25 Nov 2015 16:18:21 +0000 (17:18 +0100)]
console: make printk() line continuation tracking per-CPU
This avoids cases where split messages (with other than the initial
part not carrying a log level; single line messages only of course)
issued on multiple CPUs interfere with each other, causing messages to
be issued which are supposed to be suppressed due to the log level
setting. E.g.
    CPU A                     CPU B
    XENLOG_G_DEBUG "abc"
                              XENLOG_G_DEBUG "def\n"
    "xyz\n"
would cause the last message to be logged despite this obviously not
being intended (at default log levels).
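The interference described above can be modelled in a few lines. This is
only an illustrative sketch: the Console class, the two log levels and the
fallback to a "warning" default for messages without a level are all
assumptions made for the example, not Xen's actual printk() code.

```python
# Illustrative model only: names and levels are assumptions, not Xen's code.
DEBUG, WARNING = 0, 1
THRESHOLD = WARNING  # messages below this level are suppressed


class Console:
    def __init__(self, per_cpu):
        self.per_cpu = per_cpu
        self.in_progress = {}  # tracking key -> level of the unfinished line
        self.logged = []

    def printk(self, cpu, text, level=None):
        key = cpu if self.per_cpu else "global"
        if level is None:
            # No explicit level: continue the unfinished line if there is
            # one, otherwise fall back to the default level.
            level = self.in_progress.get(key, WARNING)
        if text.endswith("\n"):
            self.in_progress.pop(key, None)
        else:
            self.in_progress[key] = level
        if level >= THRESHOLD:
            self.logged.append((cpu, text))


def replay(per_cpu):
    """Replay the CPU A / CPU B sequence from the commit message."""
    con = Console(per_cpu)
    con.printk("A", "abc", DEBUG)
    con.printk("B", "def\n", DEBUG)
    con.printk("A", "xyz\n")
    return con.logged
```

With a single global tracking state, replay(False) lets the "xyz\n"
continuation through; with per-CPU tracking, replay(True) suppresses it.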
Suggested-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Signed-off-by: Jan Beulich <jbeulich@suse.com>
Tested-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Acked-by: Ian Campbell <ian.campbell@citrix.com>
Julien Grall [Wed, 18 Nov 2015 17:28:06 +0000 (17:28 +0000)]
xen/arm: vgic-v3: Make clear that GICD_*SPI_* registers are reserved
Our vGIC emulation has GICD_TYPER.MBIS set to 0, which means that
GICD_*SPI_* registers are reserved. Implement them using the *_reserved
labels.
Also, implement these registers for the read part.
Signed-off-by: Julien Grall <julien.grall@citrix.com>
Acked-by: Ian Campbell <ian.campbell@citrix.com>
Julien Grall [Wed, 18 Nov 2015 17:28:05 +0000 (17:28 +0000)]
xen/arm: vgic-v3: Don't implement write-only register read as zero
A read from a write-only register returns an unknown value. Use a
memorable value to differentiate from an actual RAZ register.
Signed-off-by: Julien Grall <julien.grall@citrix.com>
Acked-by: Ian Campbell <ian.campbell@citrix.com>
Julien Grall [Wed, 18 Nov 2015 17:28:04 +0000 (17:28 +0000)]
xen/arm: vgic-v3: Remove spurious return in GICR_INVALLR
Signed-off-by: Julien Grall <julien.grall@citrix.com>
Acked-by: Ian Campbell <ian.campbell@citrix.com>
Julien Grall [Wed, 18 Nov 2015 17:28:03 +0000 (17:28 +0000)]
xen/arm: vgic-v3: Emulate read to GICD_ICACTIVER<n>
The GICD_ICACTIVER<n> registers are missing in the read emulation of the
distributor.
Call the common emulation for the whole range.
Signed-off-by: Julien Grall <julien.grall@citrix.com>
Acked-by: Ian Campbell <ian.campbell@citrix.com>
Julien Grall [Wed, 18 Nov 2015 17:28:02 +0000 (17:28 +0000)]
xen/arm: vgic: Re-order the register emulations to match the memory map
It helps to find quickly whether we forgot to emulate a register or not.
At the same time add the missing reserved/implementation defined
registers. All other missing registers will be added in a follow-up if
necessary.
Note that only the distributor register map explicitly states the
size of a register (see 8.8 in ARM IHI 0069A). When the size is not
known, the implementation defined/reserved registers may not be
emulated correctly.
Signed-off-by: Julien Grall <julien.grall@citrix.com>
Acked-by: Ian Campbell <ian.campbell@citrix.com>
Julien Grall [Wed, 18 Nov 2015 17:28:01 +0000 (17:28 +0000)]
xen/arm: vgic-v3: Remove GICR_MOVALLR and GICR_MOVLPIR
The 2 registers are not described in the software spec (ARM IHI 0069A)
and their offsets are marked "implementation defined".
Signed-off-by: Julien Grall <julien.grall@citrix.com>
Acked-by: Ian Campbell <ian.campbell@citrix.com>
Julien Grall [Wed, 18 Nov 2015 17:28:00 +0000 (17:28 +0000)]
xen/arm: vgic: Properly emulate the full register
The offset in the emulation is byte-based. As most of the registers
are 32 or 64 bits wide, they span multiple bytes.
However, the current emulation only cares about the first offset. As a
result, an access to the register at any other offset is not properly
emulated.
Introduce new macros to help implement accesses spanning multiple bytes
and use them throughout the vGIC emulation.
Note that I didn't convert the reserved/implementation defined
registers. It will be done in a follow-up.
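As a rough illustration of matching every byte offset that belongs to a
register rather than only the first one, consider the sketch below. The
two offsets are the usual distributor offsets for GICD_CTLR and
GICD_TYPER; the helper names vreg32() and decode() are hypothetical and
are not the macros introduced by the patch.

```python
# Offsets follow the GIC distributor map; helper names are made up.
GICD_CTLR = 0x000   # 32-bit register
GICD_TYPER = 0x004  # 32-bit register


def vreg32(reg):
    """All byte offsets covered by a 32-bit register starting at reg."""
    return range(reg, reg + 4)


def decode(offset):
    """Map a byte offset to the register that contains it, if any."""
    if offset in vreg32(GICD_CTLR):
        return "GICD_CTLR"
    if offset in vreg32(GICD_TYPER):
        return "GICD_TYPER"
    return None
```

A byte access at offset 0x001 now resolves to GICD_CTLR instead of
falling through to the "unhandled" path.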
Signed-off-by: Julien Grall <julien.grall@citrix.com>
Acked-by: Ian Campbell <ian.campbell@citrix.com>
Julien Grall [Wed, 18 Nov 2015 17:27:59 +0000 (17:27 +0000)]
xen/arm: vgic-v3: Only emulate identification registers required by the spec
Most of the identification registers space contains implementation
defined registers (see 8.1.13 in ARM IHI 0069A) and only GIC{D,R}_PIDR2
is required to be implemented.
Currently the emulation of those registers mimics the ARM implementation,
but it's untrue to say that we properly emulate such an implementation.
Keep only GIC{D,R}_PIDR2 implemented with the "implementation defined
bits" set to zero and the ArchRev field (bits[7:4]) set to 0x3 as we
emulate a GICv3.
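For illustration, the resulting register value can be computed as
follows. The function name pidr2() is hypothetical; only the field
position and the ArchRev value come from the text above.

```python
ARCH_REV_GICV3 = 0x3  # ArchRev value for GICv3, as stated above


def pidr2(arch_rev=ARCH_REV_GICV3):
    """PIDR2 with ArchRev in bits[7:4] and all other bits zero."""
    return (arch_rev & 0xF) << 4
```

Under these assumptions a guest reading GIC{D,R}_PIDR2 would see 0x30.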
Note that the emulation of the range wasn't valid anyway because the
registers are split in 2 sets (PIDR4-PIDR7 and PIDR0-PIDR2).
Signed-off-by: Julien Grall <julien.grall@citrix.com>
Acked-by: Ian Campbell <ian.campbell@citrix.com>
Julien Grall [Wed, 18 Nov 2015 17:27:58 +0000 (17:27 +0000)]
xen/arm: vgic-v3: Use the correct offset GICR_IGRPMODR0
The offset is 0x0D00 and not 0x0F80.
Also re-order the definition to keep all the definitions ordered.
Signed-off-by: Julien Grall <julien.grall@citrix.com>
Acked-by: Ian Campbell <ian.campbell@citrix.com>
Julien Grall [Wed, 18 Nov 2015 17:27:57 +0000 (17:27 +0000)]
xen/arm: vgic-v3: Don't try to emulate IROUTER which do not exist in the spec
The valid IROUTER<n> registers are n = 32 - 1019 (see 8.9.13 in IHI 0069A),
which correspond to the offsets 0x6100-0x7FD8.
Other offsets are invalid and therefore should not be emulated.
Also remove the now unused labels read_as_zero_64 and write_ignore_64.
Note that GICD_IROUTER is kept to accommodate the GICv3 driver, which has
been taken in part from Linux.
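The offset range quoted above follows from 8-byte registers laid out
from a base of 0x6000. A small sketch of the arithmetic, assuming that
base (it is inferred from the quoted range, not stated in the commit)
and using hypothetical helper names:

```python
GICD_IROUTER = 0x6000  # assumed base; inferred from the range quoted above


def irouter_offset(n):
    """Byte offset of IROUTER<n>; only n = 32..1019 exist."""
    if not 32 <= n <= 1019:
        raise ValueError("IROUTER%d does not exist" % n)
    return GICD_IROUTER + 8 * n


def irouter_index(offset):
    """Return n if offset falls inside a valid IROUTER<n>, else None."""
    n = (offset - GICD_IROUTER) // 8
    return n if 32 <= n <= 1019 else None
```

irouter_index() rejects everything below 0x6100, including the slots
that would correspond to n = 0..31.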
Signed-off-by: Julien Grall <julien.grall@citrix.com>
Acked-by: Ian Campbell <ian.campbell@citrix.com>
Julien Grall [Wed, 18 Nov 2015 17:27:56 +0000 (17:27 +0000)]
xen/arm: vgic-v2: Implement correctly ICFGR{0, 1} read-only
Each ICFGR register is 4 bytes wide and the offset is in bytes.
The current implementation computes the offsets of ICFGR1 and ICFGR2
incorrectly, with the result that only the first 2 bytes of the ICFGR<n>
range are emulated as read-only. The rest is treated as read-write.
For convenience introduce ICFGR1 and ICFGR2.
Signed-off-by: Julien Grall <julien.grall@citrix.com>
Acked-by: Ian Campbell <ian.campbell@citrix.com>
[ ijc -- typoes in commit message ]
Julien Grall [Wed, 18 Nov 2015 16:42:43 +0000 (16:42 +0000)]
xen/arm: vgic-v3: Support 32-bit access for 64-bit registers
Based on 8.1.3 (IHI 0069A), unless stated otherwise, 64-bit registers
support both 32-bit and 64-bit accesses.
All the registers we properly emulate (i.e. not RAZ/WI) support 32-bit access.
For RAZ/WI, it also seems to be the case but I'm not 100% sure. Anyway,
emulating 32-bit access for them doesn't hurt. Note that we will need
some extra care when they are implemented (for instance GICR_PROPBASER).
Signed-off-by: Julien Grall <julien.grall@citrix.com>
Acked-by: Ian Campbell <ian.campbell@citrix.com>
Julien Grall [Wed, 18 Nov 2015 16:42:42 +0000 (16:42 +0000)]
xen/arm: vgic: Introduce helpers to extract/update/clear/set vGIC register ...
and use them in the vGIC emulation.
The GIC registers may support different access sizes. Rather than open
coding the access for every register, provide a set of helpers to access
them.
The caller will have to call vgic_regN_* where N is the size of the
emulated register.
The new helpers support any access size and expect the caller to
validate the access size supported by the emulated register.
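A minimal sketch of what such size-agnostic helpers could look like is
given here. The function names and signatures are hypothetical and do
not reproduce the vgic_regN_* helpers themselves; offset is the byte
offset of the access, size its width in bytes, and reg_bytes the width
of the emulated register. As in the description, the caller is expected
to have validated the access size beforehand.

```python
def _mask(offset, size, reg_bytes):
    """Mask and shift selecting `size` bytes at `offset` within the register."""
    shift = 8 * (offset & (reg_bytes - 1))
    return ((1 << (8 * size)) - 1) << shift, shift


def reg_extract(reg, offset, size, reg_bytes=4):
    mask, shift = _mask(offset, size, reg_bytes)
    return (reg & mask) >> shift


def reg_update(reg, val, offset, size, reg_bytes=4):
    mask, shift = _mask(offset, size, reg_bytes)
    return (reg & ~mask) | ((val << shift) & mask)


def reg_setbits(reg, bits, offset, size, reg_bytes=4):
    mask, shift = _mask(offset, size, reg_bytes)
    return reg | ((bits << shift) & mask)


def reg_clearbits(reg, bits, offset, size, reg_bytes=4):
    mask, shift = _mask(offset, size, reg_bytes)
    return reg & ~((bits << shift) & mask)
```

The same helpers cover a 32-bit access to the upper half of a 64-bit
register by passing reg_bytes=8, offset 4 and size 4.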
Finally, take the opportunity to fix the coding style in the sections we
are modifying.
Signed-off-by: Julien Grall <julien.grall@citrix.com>
Acked-by: Ian Campbell <ian.campbell@citrix.com>
Julien Grall [Wed, 18 Nov 2015 16:42:41 +0000 (16:42 +0000)]
xen/arm: vgic: Optimize the way to store the target vCPU in the rank
Xen is currently directly storing the value of GICD_ITARGETSR register
(for GICv2) and GICD_IROUTER (for GICv3) in the rank. This makes the
emulation of register accesses very simple but makes the code to get
the target vCPU for a given vIRQ more complex.
While the target vCPU of a vIRQ is retrieved every time a vIRQ is
injected into the guest, accesses to the register occur less often.
So the data structure should be optimized for the most common case
rather than the inverse.
This patch introduces an array to store the target vCPU for every
interrupt in the rank. This makes the code to get the target very
quick. The emulation code will now have to generate the GICD_ITARGETSR
and GICD_IROUTER registers for read accesses and split them on writes
to store them in a convenient way.
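The trade-off can be sketched for the GICv2 case as follows. The
function names and the flat list used for the rank are assumptions for
the example; the sketch also assumes vCPU ids below 8, since each
GICD_ITARGETSR byte is an 8-bit CPU mask.

```python
def get_target_vcpu(targets, virq):
    """Hot path: a plain array lookup."""
    return targets[virq]


def read_itargetsr(targets, first_virq):
    """Rebuild the 32-bit GICD_ITARGETSR value covering 4 vIRQs.

    Each byte is a CPU bitmask, so a stored vCPU id v becomes 1 << v.
    """
    reg = 0
    for i in range(4):
        reg |= (1 << targets[first_virq + i]) << (8 * i)
    return reg
```

The register value is regenerated only on the rarely used read path,
while interrupt injection no longer has to decode a bitmask.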
With the new way to store the target vCPU, the structure vgic_irq_rank
is shrunk down from 320 bytes to 92 bytes. This is saving about 228
bytes of memory allocated separately per vCPU.
Note that with these changes, any read of those registers will list only
the target vCPU used by Xen. As the spec is not clear whether this is a
valid choice or not, OSes which have a different interpretation of the
spec (i.e. OSes which perform read-modify-write operations on these
registers) may not boot anymore on Xen. However, I think this is a fair
trade-off between memory usage in Xen (1KB less on a domain using 4 vCPUs
with no SPIs) and a strict interpretation of the spec (though not all
cases are clearly defined).
Furthermore, the implementation of the callback get_target_vcpu is now
exactly the same. Consolidate the implementation in the common vGIC code
and drop the callback.
Finally, take the opportunity to fix coding style and replace "irq" by
"virq" to make clear that we are dealing with virtual IRQs in the
sections we are modifying.
Signed-off-by: Julien Grall <julien.grall@citrix.com>
Acked-by: Ian Campbell <ian.campbell@citrix.com>
Julien Grall [Wed, 18 Nov 2015 16:42:40 +0000 (16:42 +0000)]
xen/arm: vgic-v2: Don't ignore a write in ITARGETSR if one field is 0
The current implementation ignores the whole write if one of the fields is
0. However, based on the spec (4.3.12 IHI 0048B.b), 0 is a valid value
when:
- The interrupt is not wired in the distributor. From the Xen
point of view, it means that the corresponding bit is not set in
d->arch.vgic.allocated_irqs.
- The user wants to disable the IRQ forwarding in the distributor.
I.e. the IRQ stays pending in the distributor and is never received by
the guest.
Implementing the latter will require more work in Xen because we always
assume the interrupt is forwarded to a valid vCPU. So for now, ignore
any field where the value is 0.
The emulation of the write access of ITARGETSR has been reworked and
moved to a new function because it would have been difficult to
implement the behavior properly with the current code.
The new implementation breaks the register into 4 distinct bytes. For
each byte, it will check the validity of the target list, find the new
target, migrate the interrupt and store the value if necessary.
In the new implementation there is nearly no distinction based on the
access size, to avoid having too many different paths, which are harder
to test.
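The per-byte handling described above could be sketched like this. It is
an illustration under stated assumptions, not the function added by the
patch: store_itargetsr() and its parameters are hypothetical, bits for
non-existent vCPUs are simply masked off, the lowest-numbered vCPU in
the list is picked as the target, and interrupt migration is reduced to
a comment.

```python
def store_itargetsr(targets, first_virq, value, nr_vcpus):
    """Apply a 32-bit ITARGETSR write to the per-vIRQ target array."""
    valid = (1 << nr_vcpus) - 1
    for i in range(4):
        field = (value >> (8 * i)) & 0xFF & valid
        if field == 0:
            continue  # 0 (or no valid vCPU): ignore this field only
        # Assumption: target the lowest-numbered vCPU in the list.
        new_target = (field & -field).bit_length() - 1
        if targets[first_virq + i] != new_target:
            # A real implementation would migrate the interrupt here.
            targets[first_virq + i] = new_target
```

A zero byte now leaves only its own vIRQ untouched instead of discarding
the whole write.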
Signed-off-by: Julien Grall <julien.grall@citrix.com>
Acked-by: Ian Campbell <ian.campbell@citrix.com>
Julien Grall [Wed, 18 Nov 2015 16:42:39 +0000 (16:42 +0000)]
xen/arm: vgic-v2: Handle correctly byte write in ITARGETSR
During a store, the byte is always in the low part of the register (i.e.
[0:7]).
We are incorrectly masking the register by using a shift of the byte
offset in the ITARGETSR while the byte is always in r[0:7]. This will
result in a target list equal to 0 which is ignored by the emulation.
Because of that the guest will only be able to modify the first byte in
each ITARGETSR.
Furthermore, the body of the loop is retrieving the old target list
using the index of the byte.
To avoid modifying the loop too much, shift the byte stored to the correct
offset.
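The difference between the two behaviours can be shown with a small
model. Both functions are hypothetical reconstructions from the
description above; r is the value of the source register and offset the
byte offset of the access.

```python
def buggy_target_list(r, offset):
    """Mask r at the byte's position in ITARGETSR.

    For a byte store the data sits in r[0:7], so this yields 0 whenever
    the access is not to the first byte of the register.
    """
    shift = 8 * (offset & 0x3)
    return r & (0xFF << shift)


def fixed_target_list(r, offset):
    """Move the stored byte from r[0:7] to its position in ITARGETSR."""
    shift = 8 * (offset & 0x3)
    return (r & 0xFF) << shift
```

For a byte store of 0x02 to the third byte of a register, the first
variant produces an empty target list while the second produces
0x020000.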
Signed-off-by: Julien Grall <julien.grall@citrix.com>
Acked-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
Julien Grall [Wed, 18 Nov 2015 16:42:38 +0000 (16:42 +0000)]
xen/arm: vgic-v2: Implement correctly ITARGETSR0 - ITARGETSR7 read-only
Each ITARGETSR register is 4 bytes wide and the offset is in bytes.
The current implementation computes the end of the range incorrectly,
resulting in only ITARGETSR{0,1} being emulated as read-only. The rest
is treated as read-write.
As 8 registers should be read-only, the end of the range should be
ITARGETSR + (4 * 8) - 1.
For convenience introduce ITARGETSR7 and ITARGETSR8.
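The range computation above works out as follows. The base offset 0x800
is the GICv2 distributor offset of GICD_ITARGETSR and is an assumption
here, as it is not given in the commit message; is_read_only() is a
hypothetical helper.

```python
GICD_ITARGETSR = 0x800  # GICv2 distributor offset (assumed here)
GICD_ITARGETSR7 = GICD_ITARGETSR + 4 * 7
GICD_ITARGETSR8 = GICD_ITARGETSR + 4 * 8
READ_ONLY_END = GICD_ITARGETSR + (4 * 8) - 1  # last byte of ITARGETSR7


def is_read_only(offset):
    """True for any byte offset inside ITARGETSR0 - ITARGETSR7."""
    return GICD_ITARGETSR <= offset <= READ_ONLY_END
```

The last read-only byte is the one just before ITARGETSR8, which is why
having both ITARGETSR7 and ITARGETSR8 defined is convenient.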
Signed-off-by: Julien Grall <julien.grall@citrix.com>
Reviewed-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
Stefano Stabellini [Thu, 12 Nov 2015 17:46:04 +0000 (17:46 +0000)]
xen/arm: move ticks conversions function declarations to the header file
This is just a cleanup, not required at the moment.
Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
Reviewed-by: Julien Grall <julien.grall@citrix.com>
Acked-by: Ian Campbell <ian.campbell@citrix.com>
Stefano Stabellini [Thu, 12 Nov 2015 17:46:03 +0000 (17:46 +0000)]
arm: export platform_op XENPF_settime64
Call update_domain_wallclock_time at domain initialization.
Set time_offset_seconds to the number of seconds between physical boot
and domain initialization: it is going to be used to get/set the
wallclock time.
Add time_offset_seconds to system_time before calling do_settime,
so that system_time actually accounts for all the time in nsec between
machine boot and when the wallclock was set.
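The arithmetic can be sketched as below. This is a simplified model with
hypothetical names: it assumes the system_time supplied with the request
counts nanoseconds since domain initialization, and that the value
ultimately stored is the wallclock time at machine boot, split into
seconds and nanoseconds.

```python
NSEC_PER_SEC = 1_000_000_000


def settime64(secs, nsecs, system_time, time_offset_seconds):
    """Return (wc_sec, wc_nsec): the wallclock value at machine boot."""
    # Rebase system_time from domain initialization to machine boot.
    system_time_base = system_time + time_offset_seconds * NSEC_PER_SEC
    boot_ns = secs * NSEC_PER_SEC + nsecs - system_time_base
    return divmod(boot_ns, NSEC_PER_SEC)
```

Without the offset, the seconds elapsed between machine boot and domain
initialization would be missing from the computed boot time.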
Expose xsm_platform_op to ARM.
Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
Acked-by: Daniel De Graaf <dgdegra@tycho.nsa.gov>
Reviewed-by: Julien Grall <julien.grall@citrix.com>
Acked-by: Ian Campbell <ian.campbell@citrix.com>
CC: dgdegra@tycho.nsa.gov
Stefano Stabellini [Thu, 12 Nov 2015 17:46:02 +0000 (17:46 +0000)]
xen: move wallclock functions from x86 to common
Remove dummy arm implementation of wallclock_time.
Use shared_info() in common code rather than x86-ism to access it, when
possible.
Define the static variable wc_sec, and the local variable sec in
update_domain_wallclock_time, as uint64_t instead of unsigned long, to
avoid size issue on arm.
Take a uint64_t sec parameter in do_settime for the same reason.
Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
Acked-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Ian Campbell <ian.campbell@citrix.com>
CC: JBeulich@suse.com
CC: andrew.cooper3@citrix.com
[ ijc -- typoes in commit message ]
Brendan Gregg [Wed, 25 Nov 2015 10:12:55 +0000 (11:12 +0100)]
x86/VPMU: return correct fixed PMC count
Fixes a register typo.
Signed-off-by: Brendan Gregg <bgregg@netflix.com>
Reviewed-by: Dietmar Hahn <dietmar.hahn@ts.fujitsu.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
Paul Durrant [Wed, 25 Nov 2015 10:12:40 +0000 (11:12 +0100)]
public/io/netif.h: tidy up and remove duplicate comments
Now that requests and response types and extra info segments are
documented in block comments, we can get rid of the inline comments
in the structures. This has the happy side-effect of making the Linux
checkpatch.pl script make fewer complaints after import.
This patch also fixes a small whitespace issue in the initial boiler-
plate comment, and a typo in one of the ascii-art diagrams.
Signed-off-by: Paul Durrant <paul.durrant@citrix.com>
Acked-by: Ian Campbell <ian.campbell@citrix.com>
Paul Durrant [Wed, 25 Nov 2015 10:12:34 +0000 (11:12 +0100)]
public/io/netif.h: add definition of gso_prefix flag
This flag is defined here only for compatibility with the Linux variant of
this header. The feature has never been documented and should be
considered deprecated.
Signed-off-by: Paul Durrant <paul.durrant@citrix.com>
Acked-by: Ian Campbell <ian.campbell@citrix.com>
Paul Durrant [Wed, 25 Nov 2015 10:12:26 +0000 (11:12 +0100)]
public/io/netif.h: document the reality of netif_rx_request/response
Because GSO metadata is passed from backend to frontend using
netif_extra_info segments, which do not carry information stating which
netif_rx_request_t was consumed to free up their slot, frontends must
assume some form of identity relation between ring slot and request.
Hence, so that it is able to use GSO metadata, Linux netfront simply
assumes rx responses appear in the same ring slot as their corresponding
request.
This patch documents the assumption made by Linux netfront and the
necessity of the assumption (to support GSO) so that backends are coded
to be compatible.
Signed-off-by: Paul Durrant <paul.durrant@citrix.com>
Acked-by: Ian Campbell <ian.campbell@citrix.com>
Boris Ostrovsky [Tue, 24 Nov 2015 17:33:08 +0000 (18:33 +0100)]
x86/VPMU: Initialize VPMU's lvtpc vector
If a guest sets up performance counters so that they can generate
a PMC interrupt but does not initialize the APIC LVTPC register, the
resulting interrupt will cause an APIC error.
Note that a guest deciding to clear LVTPC in order to induce the error
will not be successful in achieving its goal: emulation code only
looks at the mask bit and always sets the vector to PMU_APIC_VECTOR.
Only the initial value of LVTPC (which is zero) that gets loaded into
APIC as result of PMC initialization is the problem.
Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Jan Beulich [Tue, 24 Nov 2015 17:32:20 +0000 (18:32 +0100)]
x86/vPMU: document as unsupported
This is XSA-163.
Signed-off-by: Jan Beulich <jbeulich@suse.com>
Ian Campbell [Tue, 24 Nov 2015 16:50:27 +0000 (16:50 +0000)]
Merge branch 'staging' of ssh://xenbits.xen.org/home/xen/git/xen into staging
Andrew Cooper [Tue, 24 Nov 2015 16:41:04 +0000 (17:41 +0100)]
x86/kexec: hide more kexec infrastructure behind CONFIG_KEXEC
Experimenting with the kconfig series showed that various bits of kexec
infrastructure were still being unconditionally included. Make them
conditional on CONFIG_KEXEC.
Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Reviewed-by: David Vrabel <david.vrabel@citrix.com>
Jan Beulich [Tue, 24 Nov 2015 16:40:18 +0000 (17:40 +0100)]
x86: drop MAX_APICID
It's unused and wrong (we already have MAX_LOCAL_APIC and MAX_APICS).
Signed-off-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
Wei Liu [Tue, 17 Nov 2015 16:19:20 +0000 (16:19 +0000)]
libxl: fix line wrapping issues introduced by automatic replacement
Signed-off-by: Wei Liu <wei.liu2@citrix.com>
Acked-by: Ian Campbell <ian.campbell@citrix.com>
Wei Liu [Tue, 17 Nov 2015 16:19:19 +0000 (16:19 +0000)]
libxl: convert libxl__sprintf(gc) to GCSPRINTF
The rune used is:
sed -i 's/libxl__sprintf(gc,\s*/GCSPRINTF(/g' libxl*.c
This rune is simple and better than trying to match every possible
pattern.
Two instances in libxl_dm.c need fixing up. They are in fact better off
just using libxl__strdup.
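For readers who want to try the substitution outside sed, here is an
equivalent sketch using Python's re module. The convert() helper is
hypothetical; the pattern is the one from the rune, with the opening
parenthesis escaped as Python's regex syntax requires.

```python
import re

# Python spelling of the sed pattern: "(" must be escaped here.
RUNE = re.compile(r"libxl__sprintf\(gc,\s*")


def convert(source):
    """Apply the replacement line by line, as sed does."""
    # Working per line keeps the whitespace match from swallowing newlines.
    return "\n".join(RUNE.sub("GCSPRINTF(", line) for line in source.split("\n"))
```

Because the replacement shortens the call, continuation lines that were
aligned to the old opening parenthesis end up misaligned and need manual
rewrapping.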
Signed-off-by: Wei Liu <wei.liu2@citrix.com>
Acked-by: Ian Campbell <ian.campbell@citrix.com>
Olaf Hering [Thu, 19 Nov 2015 08:32:52 +0000 (08:32 +0000)]
tools/hotplug: quote all variables in vif-bridge
Cosmetics: most of the variables used in vif-bridge are already quoted.
Add quoting also to the remaining shell variables.
Signed-off-by: Olaf Hering <olaf@aepfle.de>
Cc: Ian Jackson <ian.jackson@eu.citrix.com>
Cc: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
Cc: Ian Campbell <ian.campbell@citrix.com>
Cc: Wei Liu <wei.liu2@citrix.com>
Acked-by: Ian Campbell <ian.campbell@citrix.com>
Paul Durrant [Tue, 17 Nov 2015 11:32:05 +0000 (11:32 +0000)]
docs: Introduce xenstore paths for guest network address information
It is useful for a toolstack to be able to see the network addresses
in use by a domain for a particular vif in xenstore for display
purposes and, for example, so that a VNC session can be established
to the guest GUI.
This patch documents paths to allow a domain to advertise an interface
name, MAC (unicast and multicast) and IP (version 4 and 6) address
information.
Signed-off-by: Paul Durrant <paul.durrant@citrix.com>
Cc: Ian Campbell <ian.campbell@citrix.com>
Cc: Ian Jackson <ian.jackson@eu.citrix.com>
Cc: Jan Beulich <jbeulich@suse.com>
Cc: Keir Fraser <keir@xen.org>
Cc: Tim Deegan <tim@xen.org>
Acked-by: Ian Jackson <ian.jackson@eu.citrix.com>
Paul Durrant [Tue, 17 Nov 2015 11:32:04 +0000 (11:32 +0000)]
docs: Introduce xenstore paths for hotplug features
Without some indication from a guest it is not possible for a
toolstack to know whether instantiation of a new vbd or vif should
result in a new PV device of the appropriate type being brought online.
(In other words whether guest PV drivers are present and functioning).
This patch documents two paths which vif and vbd frontend drivers can
use to advertise their ability to respond to new vif or vbd
instantiations.
Signed-off-by: Paul Durrant <paul.durrant@citrix.com>
Cc: Ian Campbell <ian.campbell@citrix.com>
Cc: Ian Jackson <ian.jackson@eu.citrix.com>
Cc: Jan Beulich <jbeulich@suse.com>
Cc: Keir Fraser <keir@xen.org>
Cc: Tim Deegan <tim@xen.org>
Acked-by: Ian Jackson <ian.jackson@eu.citrix.com>
Paul Durrant [Tue, 17 Nov 2015 11:32:03 +0000 (11:32 +0000)]
docs: Introduce xenstore paths for PV driver information
For domain management purposes it is convenient to be able to see
information about PV drivers in xenstore. The XAPI toolstack in
XenServer has always created a ~/drivers path for this purpose.
This patch documents that path and also adds a specification of how
it should be used.
Signed-off-by: Paul Durrant <paul.durrant@citrix.com>
Cc: Ian Campbell <ian.campbell@citrix.com>
Cc: Ian Jackson <ian.jackson@eu.citrix.com>
Cc: Jan Beulich <jbeulich@suse.com>
Cc: Keir Fraser <keir@xen.org>
Cc: Tim Deegan <tim@xen.org>
Acked-by: Ian Jackson <ian.jackson@eu.citrix.com>
Paul Durrant [Tue, 17 Nov 2015 11:32:02 +0000 (11:32 +0000)]
docs: Introduce xenstore paths for PV control features
XenServer already makes use of ~/control/feature-suspend being written
to advertise guest capability of responding to 'suspend' when written to
~/control/shutdown and, since they are derived from XenServer drivers,
the Xen Project Windows PV drivers attempt to write this value. The write
currently fails for libxl provisioned VMs because ~/control is read-only
to the guest (only ~/control/shutdown is writable, for acknowledgement
purposes).
This patch documents feature-suspend and also a set of similar control
feature flags, so that they may be added to libxl provisioned
guests by subsequent patches:
feature-poweroff: PV drivers/agent can shut down the guest
feature-reboot: PV drivers/agent can reboot the guest
feature-s3: PV drivers/agent can trigger guest sleep (HVM only)
feature-s4: PV drivers/agent can trigger guest hibernate (HVM only)
The patch (because it adds features relating to S3 and S4 power states)
also clarifies that the initial set of platform properties mentioned are
booleans, and updates the specifier accordingly.
Signed-off-by: Paul Durrant <paul.durrant@citrix.com>
Cc: Ian Campbell <ian.campbell@citrix.com>
Acked-by: Ian Jackson <ian.jackson@eu.citrix.com>
Cc: Jan Beulich <jbeulich@suse.com>
Cc: Keir Fraser <keir@xen.org>
Cc: Tim Deegan <tim@xen.org>
Joe Perches [Thu, 19 Nov 2015 08:43:53 +0000 (08:43 +0000)]
get_maintainer: fix perl 5.22/5.24 deprecated/incompatible "\C" use
Perl 5.22 emits a deprecation message when "\C" is used in a regex. Perl
5.24 will disallow it altogether.
Fix it by using [A-Z] instead of \C.
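The effect of the replacement can be illustrated with an explicit
character class. The sketch is in Python rather than Perl, and the exact
shape of the pattern (a one-letter tag, a colon, then the value, as in
MAINTAINERS entries) is an assumption made for the example rather than a
quote from the script.

```python
import re

# Assumed shape of the pattern: a one-letter tag, a colon, then the value.
ENTRY = re.compile(r"^([A-Z]):\s*(.*)")


def parse_entry(line):
    """Split a 'X: value' line into (tag, value), or return None."""
    m = ENTRY.match(line)
    return (m.group(1), m.group(2)) if m else None
```

An explicit [A-Z] class states which tags are accepted instead of
relying on a single-byte match.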
[ Upstream commit
ce8155f7a3d59ce868ea16d8891edda4d865e873 ]
Signed-off-by: Olaf Hering <olaf@aepfle.de>
Cc: Ian Campbell <ian.campbell@citrix.com>
Cc: Ian Jackson <ian.jackson@eu.citrix.com>
Cc: Jan Beulich <jbeulich@suse.com>
Cc: Keir Fraser <keir@xen.org>
Cc: Tim Deegan <tim@xen.org>
Acked-by: Ian Campbell <ian.campbell@citrix.com>
Andrew Cooper [Thu, 19 Nov 2015 12:43:52 +0000 (12:43 +0000)]
tools/libxl: Drop dead code following calls to libxl__exec()
libxl__exec() doesn't ever return. Inform the compiler of this, and
remove all dead code.
No functional change.
Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Acked-by: Ian Campbell <ian.campbell@citrix.com>
Julien Grall [Thu, 19 Nov 2015 12:46:09 +0000 (12:46 +0000)]
xen/arm: use masking operation instead of test_bit for MCSF bits
This is a follow-up to commit
90f2e2a307fc6a6258c39cc87b3b2bf9441c0fa7 "use
masking operation instead of test_bit for MCSF bits" where the ARM
changes were missing.
Signed-off-by: Julien Grall <julien.grall@citrix.com>
Acked-by: Ian Campbell <ian.campbell@citrix.com>
Ian Campbell [Fri, 20 Nov 2015 14:22:11 +0000 (14:22 +0000)]
MAINTAINERS: mini-os patches should be copied to minios-devel
Signed-off-by: Ian Campbell <ian.campbell@citrix.com>
Cc: samuel.thibault@ens-lyon.org
Cc: stefano.stabellini@eu.citrix.com
Cc: minios-devel@lists.xenproject.org
Acked-by: Samuel Thibault <samuel.thibault@ens-lyon.org>
Ian Campbell [Tue, 24 Nov 2015 16:10:32 +0000 (16:10 +0000)]
MINIOS_UPSTREAM_REVISION Update
Signed-off-by: Ian Campbell <ian.campbell@citrix.com>
Ian Campbell [Wed, 18 Nov 2015 12:01:33 +0000 (12:01 +0000)]
Config.mk: Update SEABIOS_UPSTREAM_TAG to rel-1.9.0
Signed-off-by: Ian Campbell <ian.campbell@citrix.com>
Acked-by: Wei Liu <wei.liu2@citrix.com>
Dario Faggioli [Tue, 24 Nov 2015 13:50:30 +0000 (14:50 +0100)]
sched: get rid of the per domain vCPU list in Credit2
As, currently, there is no reason to bother having
it and keeping it updated.
In fact, it is only used for dumping and changing
vCPUs parameters, but that can be achieved easily with
for_each_vcpu.
While there, improve alignment of comments, and
add a const qualifier to a pointer, making things
more consistent with what happens everywhere else
in the source file.
This also allows us to kill one of the remaining
FIXMEs in the code, which is always good.
Signed-off-by: Dario Faggioli <dario.faggioli@citrix.com>
Acked-by: George Dunlap <george.dunlap@eu.citrix.com>
Dario Faggioli [Tue, 24 Nov 2015 13:50:09 +0000 (14:50 +0100)]
sched: get rid of the per domain vCPU list in RTDS
As, currently, there is no reason to bother having
it and keeping it updated.
In fact, it is only used for dumping and changing
vCPUs parameters, but that can be achieved easily with
for_each_vcpu.
While there, take care of the case when
XEN_DOMCTL_SCHEDOP_getinfo is called but no vCPUs have
been allocated yet (by returning the default scheduling
parameters).
Signed-off-by: Dario Faggioli <dario.faggioli@citrix.com>
Reviewed-by: Meng Xu <mengxu@cis.upenn.edu>
Dario Faggioli [Tue, 24 Nov 2015 13:49:47 +0000 (14:49 +0100)]
sched: better handle (not) inserting idle vCPUs in runqueues
Idle vCPUs are set to run immediately, as a part of their
own initialization, so we shouldn't even try to put them
in a runqueue. In fact, no scheduler does that, even when
asked to (that is rather explicit in Credit2 and RTDS, a
bit less evident in Credit1).
Let's make things look as follows:
- in generic code, explicitly avoid even trying to
insert idle vCPUs in runqueues;
- in specific schedulers' code, enforce that.
Note that, as csched_vcpu_insert() is no longer being
called, during boot (from sched_init_vcpu()) we can
safely avoid saving the flags when taking the runqueue
lock.
Signed-off-by: Dario Faggioli <dario.faggioli@citrix.com>
Acked-by: George Dunlap <george.dunlap@eu.citrix.com>
Reviewed-by: Juergen Gross <jgross@suse.com>
Dario Faggioli [Tue, 24 Nov 2015 13:49:09 +0000 (14:49 +0100)]
sched: clarify use cases of schedule_cpu_switch()
schedule_cpu_switch() is meant to be only used for moving
pCPUs from a cpupool to no cpupool, and from there back
to a cpupool, *not* to move them directly from one cpupool
to another.
This is something inherent to the way the function is
implemented and called, but is not that clear, just by the
look of it.
Make it more evident by:
- adding commentary and ASSERT()s;
- update the cpupool per-CPU variable (mapping pCPUs to
pools) directly in schedule_cpu_switch(), rather than
in various places in cpupool.c.
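The constraint described above can be expressed as a small model. The
dictionary standing in for the per-CPU cpupool variable, the use of None
for "no cpupool" and the function signature are all assumptions made for
the illustration.

```python
def schedule_cpu_switch(cpu_to_pool, cpu, new_pool):
    """Move a pCPU into a pool (from no pool) or out of one (to no pool)."""
    old_pool = cpu_to_pool.get(cpu)
    # Exactly one of old/new must be "no pool" (None).
    if (old_pool is None) == (new_pool is None):
        raise ValueError("pool-to-pool moves must go through 'no pool'")
    # The per-CPU mapping is updated here, not by the callers.
    cpu_to_pool[cpu] = new_pool
```

Moving a pCPU between two pools therefore takes two calls: one to
release it and one to assign it again.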
Signed-off-by: Dario Faggioli <dario.faggioli@citrix.com>
Acked-by: Juergen Gross <jgross@suse.com>
Reviewed-by: George Dunlap <george.dunlap@eu.citrix.com>
Dario Faggioli [Tue, 24 Nov 2015 13:48:34 +0000 (14:48 +0100)]
sched: fix locking for insert_vcpu() in credit1 and RTDS
The insert_vcpu() hook is handled with inconsistent locking.
In fact, schedule_cpu_switch() calls the hook with runqueue
lock held, while sched_move_domain() relies on the hook
implementations to take the lock themselves (and, since that
is not done in Credit1 and RTDS, such operation is not safe
in those cases).
This is fixed as follows:
- take the lock in the hook implementations, in specific
schedulers' code;
- avoid calling insert_vcpu(), for the idle vCPU, in
schedule_cpu_switch(). In fact, idle vCPUs are set to run
immediately, and the various schedulers won't insert them
in their runqueues anyway, even when explicitly asked to.
While there, still in schedule_cpu_switch(), locking with
_irq() is enough (there's no need to do *_irqsave()).
Signed-off-by: Dario Faggioli <dario.faggioli@citrix.com>
Reviewed-by: Meng Xu <mengxu@cis.upenn.edu>
Reviewed-by: George Dunlap <george.dunlap@citrix.com>
Jan Beulich [Tue, 24 Nov 2015 11:31:13 +0000 (12:31 +0100)]
x86/HVM: type adjustments
- constify struct hvm_trap * function parameters
- width reduce and shuffle some struct hvm_trap members
- use bool_t for boolean fields of struct hvm_function_table
- use unsigned for struct hvm_function_table's hap_capabilities field
Signed-off-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
Reviewed-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Acked-by: Kevin Tian <kevin.tian@intel.com>
Jan Beulich [Tue, 24 Nov 2015 11:30:31 +0000 (12:30 +0100)]
VMX: fix/adjust trap injection
In the course of investigating the 4.1.6 backport issue of the XSA-156
patch I realized that #DB injection has always been broken, but with it
now getting always intercepted the problem has got worse: Documentation
clearly states that neither DR7.GD nor DebugCtl.LBR get cleared before
the intercept, so this is something we need to do before reflecting the
intercepted exception.
While adjusting this (and also with 4.1.6's strange use of
X86_EVENTTYPE_SW_EXCEPTION for #DB in mind) I further realized that
the special casing of individual vectors shouldn't be done for
software interrupts (resulting from INT $nn).
And then some code movement: Setting of CR2 for #PF can be done in the
same switch() statement (no need for a separate if()), and reading of
intr_info is better done close to the consumption of the variable
(allowing the compiler to generate better code / use fewer registers
for variables).
Signed-off-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Kevin Tian <kevin.tian@intel.com>
Bob Moore [Tue, 24 Nov 2015 11:25:37 +0000 (12:25 +0100)]
ACPI 6.0: Add changes for FADT table
ACPICA commit 72b0b6741990f619f6aaa915302836b7cbb41ac4
One new 64-bit field at the end of the table.
FADT version is now 6.
Signed-off-by: Bob Moore <robert.moore@intel.com>
[Linux commit aeb823bbacc2a3aaee29eda5875b58a049fa1f78]
Signed-off-by: Shannon Zhao <shannon.zhao@linaro.org>
Acked-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
Naresh Bhat [Tue, 24 Nov 2015 11:18:02 +0000 (12:18 +0100)]
acpi/NUMA: build NUMA for x86 only
NUMA is currently not supported for ARM in Xen. Add a new compilation
option HAS_NUMA for NUMA. Configure and build NUMA only for x86
architecture now.
Signed-off-by: Naresh Bhat <naresh.bhat@linaro.org>
Signed-off-by: Parth Dixit <parth.dixit@linaro.org>
Signed-off-by: Shannon Zhao <shannon.zhao@linaro.org>
Feng Wu [Tue, 24 Nov 2015 11:14:17 +0000 (12:14 +0100)]
VT-d: dump the posted format IRTE
Add the utility to dump the posted format IRTE.
Signed-off-by: Feng Wu <feng.wu@intel.com>
Acked-by: Kevin Tian <kevin.tian@intel.com>
Feng Wu [Tue, 24 Nov 2015 11:13:58 +0000 (12:13 +0100)]
vt-d: extend struct iremap_entry to support VT-d Posted-Interrupts
Extend struct iremap_entry according to VT-d Posted-Interrupts Spec.
Signed-off-by: Feng Wu <feng.wu@intel.com>
Acked-by: Kevin Tian <kevin.tian@intel.com>
Feng Wu [Tue, 24 Nov 2015 11:13:03 +0000 (12:13 +0100)]
VT-d: remove pointless casts
Suggested-by: Jan Beulich <jbeulich@suse.com>
Signed-off-by: Feng Wu <feng.wu@intel.com>
Reviewed-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Acked-by: Kevin Tian <kevin.tian@intel.com>
Feng Wu [Tue, 24 Nov 2015 11:12:39 +0000 (12:12 +0100)]
vmx: initialize VT-d Posted-Interrupts Descriptor
This patch initializes the VT-d Posted-interrupt Descriptor.
Signed-off-by: Feng Wu <feng.wu@intel.com>
Acked-by: Kevin Tian <kevin.tian@intel.com>
Reviewed-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Feng Wu [Tue, 24 Nov 2015 11:11:00 +0000 (12:11 +0100)]
vmx: add some helper functions for Posted-Interrupts
This patch adds some helper functions to manipulate the
Posted-Interrupts Descriptor.
Signed-off-by: Feng Wu <feng.wu@intel.com>
Reviewed-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Acked-by: Kevin Tian <kevin.tian@intel.com>
Feng Wu [Tue, 24 Nov 2015 11:10:36 +0000 (12:10 +0100)]
vmx: extend struct pi_desc to support VT-d Posted-Interrupts
Extend struct pi_desc according to VT-d Posted-Interrupts Spec.
Signed-off-by: Feng Wu <feng.wu@intel.com>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
Acked-by: Kevin Tian <kevin.tian@intel.com>
Reviewed-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Feng Wu [Tue, 24 Nov 2015 11:10:10 +0000 (12:10 +0100)]
VT-d Posted-Interrupts feature detection
VT-d Posted-Interrupts is an enhancement to CPU side Posted-Interrupt.
With VT-d Posted-Interrupts enabled, external interrupts from
direct-assigned devices can be delivered to guests without VMM
intervention when guest is running in non-root mode.
Signed-off-by: Feng Wu <feng.wu@intel.com>
Reviewed-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Kevin Tian <kevin.tian@intel.com>
Feng Wu [Tue, 24 Nov 2015 11:09:28 +0000 (12:09 +0100)]
iommu: add iommu_intpost to control VT-d Posted-Interrupts feature
VT-d Posted-Interrupts is an enhancement to CPU side Posted-Interrupt.
With VT-d Posted-Interrupts enabled, external interrupts from
direct-assigned devices can be delivered to guests without VMM
intervention when guest is running in non-root mode.
This patch adds the variable 'iommu_intpost' to control whether VT-d
posted-interrupts are enabled in the generic IOMMU code.
Signed-off-by: Feng Wu <feng.wu@intel.com>
Reviewed-by: Kevin Tian <kevin.tian@intel.com>
Reviewed-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Acked-by: Jan Beulich <jbeulich@suse.com>
Jan Beulich [Tue, 24 Nov 2015 11:07:27 +0000 (12:07 +0100)]
vVMX: use latched VMCS machine address
Instead of calling domain_page_map_to_mfn() over and over, latch the
guest VMCS machine address unconditionally (i.e. independent of whether
VMCS shadowing is supported by the hardware).
Since this requires altering the parameters of __[gs]et_vmcs{,_real}()
(and hence all their callers) anyway, take the opportunity to also drop
the bogus double underscores from their names (and from
__[gs]et_vmcs_virtual() as well).
Signed-off-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Kevin Tian <kevin.tian@intel.com>
Jan Beulich [Tue, 24 Nov 2015 11:06:26 +0000 (12:06 +0100)]
VMX: allocate VMCS pages from domain heap
There being only very few uses of the virtual address of a VMCS,
convert these cases to establish a mapping and lift the Xen heap
restriction from the VMCS allocation.
Signed-off-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Kevin Tian <kevin.tian@intel.com>
Ian Campbell [Mon, 23 Nov 2015 09:39:17 +0000 (09:39 +0000)]
MINIOS_UPSTREAM_REVISION Update
Signed-off-by: Ian Campbell <ian.campbell@citrix.com>
Andrew Cooper [Thu, 19 Nov 2015 14:45:41 +0000 (14:45 +0000)]
tools/libxc: Correct XC_DOM_PAGE_SIZE() to return a long long
c/s abdf3c5b "libxc: create p2m list outside of kernel mapping if supported"
introduces a use which Coverity objects to; an int used to mask a uint64_t.
The result needs to be signed to allow ~XC_DOM_PAGE_SIZE() to function
correctly, and long long to function properly in 32bit builds.
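
The signedness point can be illustrated with a small sketch. The macro and function names are hypothetical stand-ins (not the libxc definitions), and a 4 KiB page size is assumed. An inverted unsigned 32-bit mask is zero-extended when combined with a uint64_t and so clears the upper half of the address, whereas an inverted signed long long mask keeps all upper bits set.

```c
#include <stdint.h>

#define EX_PAGE_SHIFT     12                     /* assumption: 4 KiB pages */
#define EX_PAGE_SIZE_U32  (1u  << EX_PAGE_SHIFT) /* unsigned, 32 bits wide */
#define EX_PAGE_SIZE_LL   (1LL << EX_PAGE_SHIFT) /* signed, 64 bits wide */

/* The inverted mask is 0xfffff000u and gets zero-extended: bits 32+ are lost. */
uint64_t ex_align_down_u32(uint64_t addr)
{
    return addr & ~(EX_PAGE_SIZE_U32 - 1);
}

/* The inverted mask is -4096LL, i.e. all upper bits set: the address survives. */
uint64_t ex_align_down_ll(uint64_t addr)
{
    return addr & ~(EX_PAGE_SIZE_LL - 1);
}
```

For the address 0x123456789abc, the long long variant yields 0x123456789000 while the unsigned 32-bit variant yields 0x56789000.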
Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Acked-by: Wei Liu <wei.liu2@citrix.com>
Juergen Gross [Thu, 19 Nov 2015 16:11:08 +0000 (17:11 +0100)]
libxl: correct bug in domain builder regarding page tables for pvh
Commit 81a76e4b12961a9f54f5021809074196dfe6dbba ("libxc: rework of
domain builder's page table handler") dropped a special case for pvh
resulting in page tables being mapped read-only. This led to a panic
of the domain in early boot.
Correct this error.
Signed-off-by: Juergen Gross <jgross@suse.com>
Reviewed-by: Wei Liu <wei.liu2@citrix.com>
Tested-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Jan Beulich [Fri, 20 Nov 2015 11:38:33 +0000 (12:38 +0100)]
x86/P2M: consolidate handling of types not requiring a valid MFN
As noted regarding the mixture of checks in p2m_pt_set_entry(),
introduce a new P2M type group that can be used everywhere we just
care about accepting operations with either a valid MFN or a type
that may be used without a (valid) MFN.
Note that p2m_mmio_dm is not included in P2M_NO_MFN_TYPES, as for the
intended purpose that one ought to be treated similar to p2m_invalid
(perhaps the two should ultimately get folded anyway).
Note further that PoD superpages now get INVALID_MFN used when creating
page table entries (was _mfn(0) before).
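
A rough sketch of what such a type-group check could look like. Everything here is illustrative: the enum is a hypothetical, shortened type list, the membership of the group is an assumption (only the exclusion of the mmio_dm type is taken from the text above), and the MFN validity test is simplified to a comparison against an invalid marker.

```c
#include <stdbool.h>

/* Hypothetical type list; Xen's real p2m_type_t has more members. */
typedef enum {
    EX_P2M_RAM_RW,
    EX_P2M_INVALID,
    EX_P2M_MMIO_DM,
    EX_P2M_POD,        /* populate-on-demand */
    EX_P2M_PAGED_OUT
} ex_p2m_type_t;

#define EX_TO_MASK(t)    (1UL << (t))

/* Assumed membership: types that may be used without a valid MFN. */
#define EX_NO_MFN_TYPES  (EX_TO_MASK(EX_P2M_POD) | \
                          EX_TO_MASK(EX_P2M_PAGED_OUT))

#define EX_INVALID_MFN   (~0UL)

/* Simplified validity test; the real one also checks the MFN range. */
bool ex_mfn_valid(unsigned long mfn)
{
    return mfn != EX_INVALID_MFN;
}

/* Accept an operation if the MFN is valid or the type does not need one. */
bool ex_entry_acceptable(unsigned long mfn, ex_p2m_type_t t)
{
    return ex_mfn_valid(mfn) || (EX_TO_MASK(t) & EX_NO_MFN_TYPES) != 0;
}
```

With a single group mask, every call site can use the same test instead of its own mixture of per-type checks.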
Signed-off-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
Reviewed-by: George Dunlap <george.dunlap@citrix.com>
Jan Beulich [Fri, 20 Nov 2015 11:37:37 +0000 (12:37 +0100)]
x86/PoD: tighten conditions for checking super page
Since calling the function isn't cheap, try to avoid the call when we
know up front it won't help; see the code comment for details on those
conditions.
Signed-off-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: George Dunlap <george.dunlap@citrix.com>
Jan Beulich [Thu, 19 Nov 2015 15:46:10 +0000 (16:46 +0100)]
x86/IO-APIC: fix setting of destinations
In commit a85da715cf ("x86/IO-APIC: adjust setting of destinations") I
made a pretty blatant mistake: get_apic_id() can be used there only
when running APICs in physical mode. For both flat and clustered modes
the change was wrong, causing different kinds of boot problems on
affected systems. Don't revert that change though, but use TARGET_CPUS
(equaling cpu_online_map, and with there only being a single online CPU
fulfilling the original commit's intention).
Signed-off-by: Jan Beulich <jbeulich@suse.com>
Tested-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
Andrew Cooper [Thu, 19 Nov 2015 15:44:59 +0000 (16:44 +0100)]
x86: fixes to LAPIC probing
* Fix (unsafe) assumption that X86_FEATURE_APIC resided in feature word 0.
* All 64bit processors have local APICs; drop the vendor check.
* Unconditionally probe MSR_IA32_APICBASE (safely, to fail more gracefully in
broken situations) and avoid a redundant double rdmsr().
* Avoid repeatedly OR'ing APICBASE_ENABLE and DEFAULT_PHYS_BASE when
attempting to reenable the LAPIC.
Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
Jan Beulich [Tue, 17 Nov 2015 12:23:11 +0000 (13:23 +0100)]
ns16550: limit mapped MMIO size
There's no point in mapping more than the memory we actually may need
to touch, and in fact the too large region could actually extend into
another device's one (which currently is benign on x86 since only a
single page gets mapped anyway, but which is a latent bug on ARM
whenever PCI support gets enabled there).
Signed-off-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
Acked-by: Ian Campbell <ian.campbell@citrix.com>
Jan Beulich [Tue, 17 Nov 2015 12:22:44 +0000 (13:22 +0100)]
ns16550: reset bar_64 on each iteration
Re-using the possibly non-zero value from a previous iteration can't
do any good.
Take the opportunity and
- limit a few other variables' scopes at once,
- adjust a few types.
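
To illustrate why a stale upper half matters, here is a simplified, hypothetical BAR decoding loop (not the ns16550 code). The function name and the array-based interface are assumptions; the BAR bit layout (bit 0 for I/O space, bits 2:1 equal to 0b10 for a 64-bit memory BAR) follows the PCI specification. Declaring bar_64 inside the loop resets it on each iteration, so a 32-bit BAR that follows a 64-bit one does not inherit the previous upper 32 bits.

```c
#include <stdint.h>

#define EX_BAR_SPACE_IO   0x1u         /* bit 0: I/O space BAR */
#define EX_BAR_TYPE_MASK  0x6u         /* bits 2:1: memory BAR type */
#define EX_BAR_TYPE_64    0x4u         /* 0b10: 64-bit memory BAR */
#define EX_BAR_ADDR_MASK  0xfffffff0u  /* address bits of a memory BAR */

/*
 * Decode the memory BARs found in regs[0..n-1] into addr[] and return
 * how many were decoded. I/O BARs are skipped.
 */
unsigned int ex_decode_mem_bars(const uint32_t *regs, unsigned int n,
                                uint64_t *addr)
{
    unsigned int i, count = 0;

    for (i = 0; i < n; i++) {
        uint32_t bar = regs[i];
        uint32_t bar_64 = 0;  /* fresh on every iteration */

        if (bar & EX_BAR_SPACE_IO)
            continue;
        /* A 64-bit BAR keeps its upper half in the next register. */
        if ((bar & EX_BAR_TYPE_MASK) == EX_BAR_TYPE_64 && i + 1 < n)
            bar_64 = regs[++i];
        addr[count++] = ((uint64_t)bar_64 << 32) | (bar & EX_BAR_ADDR_MASK);
    }
    return count;
}
```

If bar_64 were instead declared outside the loop and never cleared, the second memory BAR in a sequence such as { 64-bit BAR, its upper half, 32-bit BAR } would be combined with the first BAR's upper half.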
Signed-off-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
Acked-by: Ian Campbell <ian.campbell@citrix.com>
Feng Wu [Tue, 17 Nov 2015 12:21:52 +0000 (13:21 +0100)]
x86: move some APIC related macros to apicdef.h
Move some APIC related macros to apicdef.h, so they can be used
outside of vlapic.c.
Signed-off-by: Feng Wu <feng.wu@intel.com>
Acked-by: Jan Beulich <jbeulich@suse.com>
Feng Wu [Tue, 17 Nov 2015 12:21:33 +0000 (13:21 +0100)]
x86: add cmpxchg16b support
This patch adds cmpxchg16b support for x86-64, so software
can perform 128-bit atomic write/read.
Signed-off-by: Feng Wu <feng.wu@intel.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
Bob Liu [Tue, 17 Nov 2015 12:21:13 +0000 (13:21 +0100)]
blkif.h: document blkif multi-queue/ring extension
Document the multi-queue/ring feature in terms of XenStore keys to be written
by the backend and by the frontend.
Signed-off-by: Bob Liu <bob.liu@oracle.com>
Reviewed-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Ian Campbell [Mon, 16 Nov 2015 13:38:33 +0000 (13:38 +0000)]
Merge branch 'staging' of ssh://xenbits.xen.org/home/xen/git/xen into staging
Juergen Gross [Thu, 12 Nov 2015 13:43:36 +0000 (14:43 +0100)]
libxc: create p2m list outside of kernel mapping if supported
In case the kernel of a new pv-domU indicates it is supporting a p2m
list outside the initial kernel mapping by specifying INIT_P2M, let
the domain builder allocate the memory for the p2m list from physical
guest memory only and map it to the address the kernel is expecting.
This will enable loading pv-domUs larger than 512 GB.
Signed-off-by: Juergen Gross <jgross@suse.com>
Acked-by: Wei Liu <wei.liu2@citrix.com>
Juergen Gross [Thu, 12 Nov 2015 13:43:35 +0000 (14:43 +0100)]
libxc: rework of domain builder's page table handler
In order to prepare a p2m list outside of the initial kernel mapping
do a rework of the domain builder's page table handler. The goal is
to be able to use common helpers for page table allocation and setup
for initial kernel page tables and page tables mapping the p2m list.
This is achieved by supporting multiple mapping areas. The mapped
virtual addresses of the single areas must not overlap, while the
page tables of a new area added might already be partially present.
In particular, the top level page table exists only once, of course.
For now restrict the number of mappings to 1, because the only mapping
at present is the initial mapping created by the toolstack. No
behavioural or guest-visible change is intended.
Signed-off-by: Juergen Gross <jgross@suse.com>
Reviewed-by: Wei Liu <wei.liu2@citrix.com>
Juergen Gross [Thu, 12 Nov 2015 13:43:34 +0000 (14:43 +0100)]
libxc: split p2m allocation in domain builder from other magic pages
Carve out the p2m list allocation from the .alloc_magic_pages hook of
the domain builder in order to prepare allocating the p2m list outside
of the initial kernel mapping. This will be needed to support loading
domains with huge memory (>512 GB).
Signed-off-by: Juergen Gross <jgross@suse.com>
Acked-by: Ian Campbell <ian.campbell@citrix.com>
Acked-by: Wei Liu <wei.liu2@citrix.com>
Juergen Gross [Thu, 12 Nov 2015 13:43:33 +0000 (14:43 +0100)]
libxc: create unmapped initrd in domain builder if supported
In case the kernel of a new pv-domU indicates it is supporting an
unmapped initrd, don't waste precious virtual space for the initrd,
but allocate only guest physical memory for it.
Signed-off-by: Juergen Gross <jgross@suse.com>
Acked-by: Wei Liu <wei.liu2@citrix.com>
Juergen Gross [Thu, 12 Nov 2015 13:43:32 +0000 (14:43 +0100)]
libxc: use domain builder architecture private data for x86 pv domains
Move some data private to the x86 domain builder to the private data
section. Remove extra_pages as they are used nowhere.
Signed-off-by: Juergen Gross <jgross@suse.com>
Acked-by: Wei Liu <wei.liu2@citrix.com>
Juergen Gross [Thu, 12 Nov 2015 13:43:31 +0000 (14:43 +0100)]
libxc: introduce domain builder architecture specific data
Reorganize struct xc_dom_image to contain a pointer to domain builder
architecture specific private data. This will abstract the architecture
or domain type specific data from the generally used data.
The new area is allocated as soon as the domain type is known.
Signed-off-by: Juergen Gross <jgross@suse.com>
Acked-by: Wei Liu <wei.liu2@citrix.com>
Juergen Gross [Thu, 12 Nov 2015 13:43:30 +0000 (14:43 +0100)]
libxc: rename domain builder count_pgtables to alloc_pgtables
Rename the count_pgtables hook of the domain builder to alloc_pgtables
and do the allocation of the guest memory for page tables inside this
hook. This will remove the need for accessing the x86 specific pgtables
member of struct xc_dom_image in the generic domain builder code.
Signed-off-by: Juergen Gross <jgross@suse.com>
Acked-by: Wei Liu <wei.liu2@citrix.com>
Juergen Gross [Thu, 12 Nov 2015 13:43:29 +0000 (14:43 +0100)]
xen: add generic flag to elf_dom_parms indicating support of unmapped initrd
Support of an unmapped initrd is indicated by the kernel of the domain
via elf notes. In order not to have to use raw elf data in the tools
for support of an unmapped initrd add a flag to the parsed data area
to indicate the kernel supporting this feature.
Switch to using this flag in the hypervisor domain builder.
Cc: andrew.cooper3@citrix.com
Cc: jbeulich@suse.com
Cc: keir@xen.org
Suggested-by: Ian Campbell <ian.campbell@citrix.com>
Signed-off-by: Juergen Gross <jgross@suse.com>
Acked-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
Juergen Gross [Thu, 12 Nov 2015 13:43:28 +0000 (14:43 +0100)]
libxc: reorganize domain builder guest memory allocator
Guest memory allocation in the domain builder of libxc is done via
virtual addresses only. In order to be able to support preallocated
areas not virtually mapped reorganize the memory allocator to keep
track of allocated pages globally and in allocated segments.
This requires an interface change of the allocate callback of the
domain builder which currently is using the last mapped virtual
address as a parameter. This is no problem as the only user of this
callback is stubdom/grub/kexec.c using this virtual address to
calculate the last used pfn.
Signed-off-by: Juergen Gross <jgross@suse.com>
Acked-by: Wei Liu <wei.liu2@citrix.com>
Jan Beulich [Mon, 16 Nov 2015 12:12:20 +0000 (13:12 +0100)]
x86: drop hard_smp_processor_id()
... and use what it aliased to directly.
Signed-off-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
Jan Beulich [Mon, 16 Nov 2015 12:11:59 +0000 (13:11 +0100)]
x86/IO-APIC: adjust setting of destinations
setup_IO_APIC_irqs() runs before APs get brought up, so using
desc->arch.cpu_mask at best risks it being either empty or having bits
for CPUs other than the BP set. Just use the APIC ID of the only
online CPU directly.
Signed-off-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
Jan Beulich [Mon, 16 Nov 2015 12:11:08 +0000 (13:11 +0100)]
x86/IO-APIC: fix setup of Xen internally used IRQs (take 2)
..., namely that of a PCI serial card with an IRQ above the
legacy range. This had got broken by the switch to cpumask_any() in
cpu_mask_to_apicid_phys(). Fix this by allowing all CPUs for that IRQ
(via setup_vector_irq() properly updating a booting CPU's vector_irq[],
thus avoiding "No irq handler for vector" messages and the interrupt
not working).
Cleanup coding style and types there at once.
While doing this I also noticed that io_apic_set_pci_routing() can't
be quite right: It sets up the destination _before_ getting a vector
allocated (which on other than systems using the flat APIC mode
affects the possible destinations), and also didn't restrict affinity
to ->arch.cpu_mask (as established by assign_irq_vector()).
Signed-off-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
Ian Campbell [Mon, 16 Nov 2015 11:29:45 +0000 (11:29 +0000)]
MINIOS_UPSTREAM_REVISION Update
Signed-off-by: Ian Campbell <ian.campbell@citrix.com>
Andrew Cooper [Tue, 10 Nov 2015 10:46:44 +0000 (10:46 +0000)]
tools/ocaml/xb: Correct calculations of data/space in the ring
ml_interface_{read,write}() would miscalculate the quantity of
data/space in the ring if it crossed the ring boundary, and incorrectly
return a short read/write.
This causes a protocol stall, as either side of the ring ends up waiting
for what they believe to be the other side needing to take the next
action.
Correct the calculations to cope with crossing the ring boundary.
In addition, correct the error detection. It is a hard error if the
producer index gets more than a ring size ahead of the consumer, or if
the consumer ever overtakes the producer.
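
The calculation can be sketched as follows. This is an illustrative model, not the code from the patch: the function names are hypothetical, and it assumes a power-of-two ring of 1024 bytes with free-running 32-bit producer/consumer indices. Unsigned subtraction gives the amount of data even when the indices wrap, a difference larger than the ring size flags both error cases, and the read is split into two copies when it crosses the ring boundary.

```c
#include <stdint.h>
#include <string.h>

#define EX_RING_SIZE     1024u  /* assumption: power of two */
#define EX_RING_MASK(i)  ((i) & (EX_RING_SIZE - 1))

/* Bytes available to read, or -1 if the indices are inconsistent. */
int ex_ring_data(uint32_t cons, uint32_t prod)
{
    uint32_t used = prod - cons;  /* wraps modulo 2^32 */

    /*
     * Covers a producer more than one ring ahead as well as a consumer
     * that overtook the producer (the difference then becomes huge).
     */
    return used > EX_RING_SIZE ? -1 : (int)used;
}

/* Bytes free for writing, or -1 if the indices are inconsistent. */
int ex_ring_space(uint32_t cons, uint32_t prod)
{
    int used = ex_ring_data(cons, prod);

    return used < 0 ? -1 : (int)EX_RING_SIZE - used;
}

/* Copy up to len bytes out of the ring, following the wrap-around. */
int ex_ring_read(const uint8_t *ring, uint32_t *cons, uint32_t prod,
                 uint8_t *buf, uint32_t len)
{
    int avail = ex_ring_data(*cons, prod);
    uint32_t first;

    if (avail < 0)
        return -1;
    if (len > (uint32_t)avail)
        len = (uint32_t)avail;

    /* First chunk runs up to the end of the ring, the rest restarts at 0. */
    first = EX_RING_SIZE - EX_RING_MASK(*cons);
    if (first > len)
        first = len;
    memcpy(buf, ring + EX_RING_MASK(*cons), first);
    memcpy(buf + first, ring, len - first);

    *cons += len;
    return (int)len;
}
```

With cons = 1020 and prod = 1030, ex_ring_data() reports 10 bytes, and ex_ring_read() returns all 10 in one call rather than a short read of the 4 bytes before the boundary.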
Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Reviewed-by: Wei Liu <wei.liu2@citrix.com>
Reviewed-by: Samuel Thibault <samuel.thibault@ens-lyon.org>
Reviewed-by: David Scott <dave@recoil.org>
Jim Fehlig [Fri, 13 Nov 2015 02:40:46 +0000 (19:40 -0700)]
libxl: relax readonly check introduced by XSA-142 fix
The fix for XSA-142 is quite a big hammer, rejecting readonly
disk configuration even when the requested backend is known to
support readonly. While it is true that qemu doesn't support
readonly for emulated IDE or AHCI disks
$ /usr/lib/xen/bin/qemu-system-i386 \
-drive file=/tmp/disk.raw,if=ide,media=disk,format=raw,readonly=on
qemu-system-i386: Can't use a read-only drive
$ /usr/lib/xen/bin/qemu-system-i386 -device ahci,id=ahci0 \
-drive file=/tmp/disk.raw,if=none,id=ahcidisk-0,format=raw,readonly=on \
-device ide-hd,bus=ahci0.0,unit=0,drive=ahcidisk-0
qemu-system-i386: -device ide-hd,bus=ahci0.0,unit=0,drive=ahcidisk-0:
Can't use a read-only drive
It does support readonly SCSI disks
$ /usr/lib/xen/bin/qemu-system-i386 \
-drive file=/tmp/disk.raw,if=scsi,media=disk,format=raw,readonly=on
[ok]
Inside a guest using such a disk, the SCSI kernel driver sees write
protect on
[ 7.339232] sd 2:0:1:0: [sdb] Write Protect is on
Also, PV drivers support readonly, but the patch rejects such
configuration even when PV drivers (vdev=xvd*) have been explicitly
specified and creation of an emulated twin is skipped.
This follow-up patch loosens the restriction to reject readonly when
creating an emulated IDE or AHCI disk, but allows it when the backend
is known to support readonly.
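
The relaxed rule amounts to a small decision function. The sketch below is a hypothetical model, not libxl code: the enum and function names are made up for illustration, and the classification of a disk into one of four kinds is an assumption about how the caller has already resolved the configuration.

```c
#include <stdbool.h>

/* Hypothetical classification of how a disk is presented to the guest. */
enum ex_disk_kind {
    EX_DISK_EMULATED_IDE,
    EX_DISK_EMULATED_AHCI,
    EX_DISK_EMULATED_SCSI,
    EX_DISK_PV_ONLY        /* vdev=xvd*, no emulated twin created */
};

/* Return true if the disk configuration has to be rejected. */
bool ex_reject_disk(enum ex_disk_kind kind, bool readonly)
{
    if (!readonly)
        return false;

    switch (kind) {
    case EX_DISK_EMULATED_IDE:
    case EX_DISK_EMULATED_AHCI:
        return true;   /* qemu refuses: "Can't use a read-only drive" */
    case EX_DISK_EMULATED_SCSI:
    case EX_DISK_PV_ONLY:
        return false;  /* backend is known to support readonly */
    }
    return false;
}
```

Read-write disks are never rejected by this check; only the readonly IDE and AHCI cases are.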
Signed-off-by: Jim Fehlig <jfehlig@suse.com>
Acked-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
Acked-by: Ian Campbell <ian.campbell@citrix.com>
Juergen Gross [Fri, 23 Oct 2015 13:05:01 +0000 (15:05 +0200)]
libxc: remove xc_get_bit_size() from tools/libxc/xc_dom_compat_linux.c
xc_get_bit_size() is being used by the unused python wrapper
xc.getBitSize() only. Remove the wrapper and xc_get_bit_size().
Signed-off-by: Juergen Gross <jgross@suse.com>
Acked-by: Wei Liu <wei.liu2@citrix.com>
Acked-by: Ian Campbell <ian.campbell@citrix.com>
Juergen Gross [Fri, 23 Oct 2015 13:05:00 +0000 (15:05 +0200)]
libxc: remove most of tools/libxc/xc_dom_compat_linux.c
In tools/libxc/xc_dom_compat_linux.c xc_linux_build() is the only
domain building function used by an in-tree component (qemu-xen) which
is really necessary.
Remove the other domain building functions and the unused python
wrapper xc.linux_build() referencing one of the to be removed
functions.
Suggested-by: Ian Campbell <ian.campbell@citrix.com>
Signed-off-by: Juergen Gross <jgross@suse.com>
Acked-by: Wei Liu <wei.liu2@citrix.com>
Acked-by: Ian Campbell <ian.campbell@citrix.com>
Wei Liu [Thu, 12 Nov 2015 10:06:58 +0000 (10:06 +0000)]
Config.mk: update OVMF changeset
The new osstest tested head contains a fix for gcc-4.4 toolchain.
Signed-off-by: Wei Liu <wei.liu2@citrix.com>
Acked-by: Ian Campbell <ian.campbell@citrix.com>
Jonathan Davies [Wed, 11 Nov 2015 11:21:53 +0000 (11:21 +0000)]
oxenstored: Quota.merge: don't assume domain already exists
In Quota.merge, we merge two quota hashtables, orig_quota and mod_quota, putting
the results into dest_quota. These hashtables map domids to the number of
entries currently owned by that domain.
When mod_quota contains an entry for a domid that was not present in orig_quota
(or dest_quota), the call to get_entry caused Quota.merge to raise a Not_found
exception. This propagates back to the client as an ENOENT error, which is not
an appropriate return value from some operations, such as transaction_end.
This situation can arise when a transaction that introduces a domain (hence
calling Quota.add_entry) needs to be coalesced due to concurrent xenstore
activity.
This patch handles the merge in the case where mod_quota contains an entry not
present in orig_quota (or in dest_quota) by treating that hashtable as having
existing value 0.
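
A sketch of the fixed behaviour, written in C rather than the OCaml of oxenstored. The tiny fixed-capacity table and all ex_* names are hypothetical, and the merge rule used here (apply the per-domain difference between mod_quota and orig_quota on top of dest_quota) is an assumption about the merge semantics. The point taken from the text above is that a lookup for an absent domid yields 0 instead of failing.

```c
#include <assert.h>
#include <stddef.h>

#define EX_MAX_DOMS 8  /* assumption: tiny table, for illustration only */

struct ex_quota {
    size_t n;
    struct { int domid; int entries; } slot[EX_MAX_DOMS];
};

/* Entry count for domid, or 0 if the domain is not present. */
int ex_quota_get(const struct ex_quota *q, int domid)
{
    for (size_t i = 0; i < q->n; i++)
        if (q->slot[i].domid == domid)
            return q->slot[i].entries;
    return 0;
}

/* Set the entry count for domid, adding the domain if necessary. */
void ex_quota_set(struct ex_quota *q, int domid, int entries)
{
    for (size_t i = 0; i < q->n; i++) {
        if (q->slot[i].domid == domid) {
            q->slot[i].entries = entries;
            return;
        }
    }
    assert(q->n < EX_MAX_DOMS);
    q->slot[q->n].domid = domid;
    q->slot[q->n].entries = entries;
    q->n++;
}

/* Apply the change between orig and mod, per domain, on top of dest. */
void ex_quota_merge(const struct ex_quota *orig, const struct ex_quota *mod,
                    struct ex_quota *dest)
{
    for (size_t i = 0; i < mod->n; i++) {
        int domid = mod->slot[i].domid;
        int diff = mod->slot[i].entries - ex_quota_get(orig, domid);

        if (diff != 0)
            ex_quota_set(dest, domid, ex_quota_get(dest, domid) + diff);
    }
}
```

A domain that a transaction introduces (present only in mod) thus ends up in dest with the number of entries it gained, rather than aborting the merge.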
Signed-off-by: Jonathan Davies <jonathan.davies@citrix.com>
Acked-by: Ian Jackson <ian.jackson@eu.citrix.com>
Oleksandr Tyshchenko [Thu, 5 Nov 2015 17:53:07 +0000 (19:53 +0200)]
xen/serial: Return actual bytes stored in TX FIFO for OMAP
This is intended to decrease the time spent in the transmitter
while waiting for free space in the TX FIFO, and as a result to
reduce the impact of hvc on the entire system running on
OMAP5/DRA7XX based platforms.
Signed-off-by: Oleksandr Tyshchenko <oleksandr.tyshchenko@globallogic.com>
CC: Ian Campbell <ian.campbell@citrix.com>
CC: Julien Grall <julien.grall@citrix.com>
CC: Stefano Stabellini <stefano.stabellini@citrix.com>
Cc: Jan Beulich <jbeulich@suse.com>
Acked-by: Ian Campbell <ian.campbell@citrix.com>
Oleksandr Tyshchenko [Thu, 5 Nov 2015 17:53:06 +0000 (19:53 +0200)]
xen/serial: Move any OMAP specific things to OMAP UART driver
The 8250-uart.h contains extra serial register definitions
for the internal UARTs in TI OMAP SoCs which are used in
OMAP UART driver only.
In order to clean up code move these definitions to omap-uart.c.
Also rename some definitions to follow the UART_OMAP* prefix.
Signed-off-by: Oleksandr Tyshchenko <oleksandr.tyshchenko@globallogic.com>
CC: Ian Campbell <ian.campbell@citrix.com>
CC: Julien Grall <julien.grall@citrix.com>
CC: Stefano Stabellini <stefano.stabellini@citrix.com>
Cc: Jan Beulich <jbeulich@suse.com>
Acked-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Ian Campbell <ian.campbell@citrix.com>