Yi Sun [Tue, 19 Dec 2017 00:42:15 +0000 (08:42 +0800)]
tools: create general interfaces to support psr allocation features
This patch creates general interfaces in libxl to support all psr
allocation features.
Add 'LIBXL_HAVE_PSR_GENERIC' to indicate interface change.
Please note, the functionality cannot work until later patches
are applied.
Signed-off-by: Yi Sun <yi.y.sun@linux.intel.com>
Reviewed-by: Roger Pau Monné <roger.pau@citrix.com>
Acked-by: Wei Liu <wei.liu2@citrix.com>
Roger Pau Monne [Thu, 18 Jan 2018 10:34:04 +0000 (10:34 +0000)]
xen/pvshim: switch shim.c to use typesafe mfn_to_page and virt_to_mfn
No functional change intended.
Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>
Requested-by: Andrew Cooper <andrew.cooper3@citrix.com>
Acked-by: Andrew Cooper <andrew.cooper3@citrix.com>
Reviewed-by: Wei Liu <wei.liu2@citrix.com>
Roger Pau Monne [Wed, 17 Jan 2018 08:37:54 +0000 (08:37 +0000)]
firmware/shim: fix build process to use POSIX find options
The -printf find option is not POSIX compatible, so replace it with
another rune.
Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>
Acked-by: Ian Jackson <ian.jackson@eu.citrix.com>
Acked-by: Wei Liu <wei.liu2@citrix.com>
Roger Pau Monne [Wed, 17 Jan 2018 09:29:35 +0000 (09:29 +0000)]
xen/pvshim: fix coding style issues
Fix a couple of coding style issues.
No code or functional change.
Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>
Reviewed-by: Wei Liu <wei.liu2@citrix.com>
Acked-by: Jan Beulich <jbeulich@suse.com>
Roger Pau Monne [Wed, 17 Jan 2018 09:24:03 +0000 (09:24 +0000)]
xen/pvshim: re-order replace_va_mapping code
No functional change.
Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>
Reviewed-by: Wei Liu <wei.liu2@citrix.com>
Acked-by: Jan Beulich <jbeulich@suse.com>
Roger Pau Monne [Wed, 17 Jan 2018 09:20:05 +0000 (09:20 +0000)]
xen/pvshim: identity pin shim vCPUs to pCPUs
Since VCPUOP_{up/down} already identity maps vCPU hotplug to pCPU
hotplug also identity pin the vCPUs to the pCPUs in the scheduler.
This prevents vCPU migration and should improve performance.
While there also use __cpumask_set_cpu instead of cpumask_set_cpu,
there's no need to use the locked variant.
Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>
Reviewed-by: Wei Liu <wei.liu2@citrix.com>
Acked-by: Jan Beulich <jbeulich@suse.com>
Roger Pau Monne [Wed, 17 Jan 2018 08:34:26 +0000 (08:34 +0000)]
xen/pvh: place the trampoline starting at MFN 1
Since PVH guest jump straight into trampoline_setup trampoline_phys is
not initialized, thus the trampoline is relocated to address 0.
This works, but has the undesirable effect of having VA 0 mapped to
MFN 0, which means NULL pointed dereferences no longer trigger a page
fault.
In order to solve this, place the trampoline starting at MFN 1 and
reserve the memory used by it.
Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>
Reviewed-by: Wei Liu <wei.liu2@citrix.com>
Acked-by: Jan Beulich <jbeulich@suse.com>
Roger Pau Monne [Wed, 17 Jan 2018 08:34:19 +0000 (08:34 +0000)]
xen/pvshim: map vcpu_info earlier for APs
Or else init_percpu_time is going to dereference a NULL pointer when
trying to access vcpu_info.
Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>
Reviewed-by: Wei Liu <wei.liu2@citrix.com>
Acked-by: Jan Beulich <jbeulich@suse.com>
Wei Liu [Wed, 17 Jan 2018 16:43:54 +0000 (16:43 +0000)]
ocaml: fix arm build
ARM doesn't have emulation_flags in the arch_domainconfig.
Signed-off-by: Wei Liu <wei.liu2@citrix.com>
Reviewed-by: Julien Grall <julien.grall@linaro.org>
Doug Goldstein [Tue, 16 Jan 2018 16:28:56 +0000 (10:28 -0600)]
docs: note default for timer_mode in xl.cfg man
There was no default documented but inspecting
libxl__domain_build_info_setdefault() shows the default to be
LIBXL_TIMER_MODE_NO_DELAY_FOR_MISSED_TICKS.
Signed-off-by: Doug Goldstein <cardoe@cardoe.com>
Reviewed-by: Roger Pau Monné <roger.pau@citrix.com>
Reviewed-by: Wei Liu <wei.liu2@citrix.com>
Michael Young [Thu, 18 Jan 2018 10:49:38 +0000 (10:49 +0000)]
-xen-attach is needed for pvh boot with qemu-xen
Currently the boot of a pvh guest using the qemu-xen device model
fails with the error
xen emulation not implemented (yet)
in the qemu-dm log file. This patch adds the missing -xen-attach
argument.
Signed-off-by: Michael Young <m.a.young@durham.ac.uk>
Reviewed-by: Roger Pau Monné <roger.pau@citrix.com>e
Acked-by: Wei Liu <wei.liu2@citrix.com>
[ wei: ported to staging ]
Signed-off-by: Wei Liu <wei.liu2@citrix.com>
Juergen Gross [Fri, 1 Dec 2017 14:14:07 +0000 (15:14 +0100)]
libxl: put RSDP for PVH guest near 4GB
Instead of locating the RSDP table below 1MB put it just below 4GB
like the rest of the ACPI tables in case of PVH guests. This will
avoid punching more holes than necessary into the memory map.
Signed-off-by: Juergen Gross <jgross@suse.com>
Acked-by: Wei Liu <wei.liu2@citrix.com>
Reviewed-by: Roger Pau Monné <roger.pau@citrix.com>
Ian Jackson [Tue, 17 Oct 2017 16:44:11 +0000 (17:44 +0100)]
MAINTAINERS: Make Christian Lindig maintainer for ocaml tools
oxenstored is our default implementation of xenstore, for platforms
that have ocaml support. We need it to be maintained. Dave Scott,
the only existing maintainer, has had limited availability.
Christian has been reveiwing patches and offering opinions where
necessary, although activity in this area has been quiet and there has
not been a great deal of new development.
Christian's contributions have been sensible and I think it would be a
good idea now to formally make him a maintainer.
CC: Christian Lindig <christian.lindig@citrix.com>
CC: David Scott <dave@recoil.org>
Signed-off-by: Ian Jackson <Ian.Jackson@eu.citrix.com>
Acked-by: David Scott <dave@recoil.org>
Acked-by: Wei Liu <wei.liu2@citrix.com>
Julien Grall [Tue, 16 Jan 2018 14:23:37 +0000 (14:23 +0000)]
xen/arm64: Implement branch predictor hardening for affected Cortex-A CPUs
Cortex-A57, A72, A73 and A75 are susceptible to branch predictor
aliasing and can theoritically be attacked by malicious code.
This patch implements a PSCI-based mitigation for these CPUs when
available. The call into firmware will invalidate the branch predictor
state, preventing any malicious entries from affection other victim
contexts.
Ported from Linux git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux.git
branch kpti.
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
This is part of XSA-254.
Signed-off-by: Julien Grall <julien.grall@linaro.org>
Reviewed-by: Stefano Stabellini <sstabellini@kernel.org>
Julien Grall [Tue, 16 Jan 2018 14:23:36 +0000 (14:23 +0000)]
xen/arm64: Add skeleton to harden the branch predictor aliasing attacks
Aliasing attacked against CPU branch predictors can allow an attacker to
redirect speculative control flow on some CPUs and potentially divulge
information from one context to another.
This patch adds initial skeleton code behind a new Kconfig option to
enable implementation-specific mitigations against these attacks for
CPUs that are affected.
Most of the mitigations will have to be applied when entering to the
hypervisor from the guest context. For safety, it is applied at every
exception entry. So there are potential for optimizing when receiving
an exception at the same level.
Because the attack is against branch predictor, it is not possible to
safely use branch instruction before the mitigation is applied.
Therefore, this has to be done in the vector entry before jump to the
helper handling a given exception.
On Arm64, each vector can hold 32 instructions. This leave us 31
instructions for the mitigation. The last one is the branch instruction
to the helper.
Because a platform may have CPUs with different micro-architectures,
per-CPU vector table needs to be provided. Realistically, only a few
different mitigations will be necessary. So provide a small set of
vector tables. They will be re-used and patch with the mitigations
on-demand.
This is based on the work done in Linux (see [1]).
This is part of XSA-254.
[1] git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux.git
branch ktpi
Signed-off-by: Julien Grall <julien.grall@linaro.org>
Reviewed-by: Stefano Stabellini <sstabellini@kernel.org>
Julien Grall [Tue, 16 Jan 2018 14:23:35 +0000 (14:23 +0000)]
xen/arm: cpuerrata: Add MIDR_ALL_VERSIONS
Introduce a new macro MIDR_ALL_VERSIONS to match all variant/revision of a
given CPU model.
This is part of XSA-254.
Signed-off-by: Julien Grall <julien.grall@linaro.org>
Reviewed-by: Stefano Stabellini <sstabellini@kernel.org>
Julien Grall [Tue, 16 Jan 2018 14:23:34 +0000 (14:23 +0000)]
xen/arm64: Add missing MIDR values for Cortex-A72, A73 and A75
Cortex-A72, A73 and A75 MIDR will be used to a follow-up for hardening
the branch predictor.
This is part of XSA-254.
Signed-off-by: Julien Grall <julien.grall@linaro.org>
Acked-by: Stefano Stabellini <sstabellini@kernel.org>
Julien Grall [Tue, 16 Jan 2018 14:23:33 +0000 (14:23 +0000)]
xen/arm: Introduce enable callback to enable a capabilities on each online CPU
Once Xen knows what features/workarounds present on the platform, it
might be necessary to configure each online CPU.
Introduce a new callback "enable" that will be called on each online CPU to
configure the "capability".
The code is based on Linux v4.14 (where cpufeature.c comes from), the
explanation of why using stop_machine_run is kept as we have similar
problem in the future.
Lastly introduce enable_errata_workaround that will be called once CPUs
have booted and before the hardware domain is created.
This is part of XSA-254.
Signed-of-by: Julien Grall <julien.grall@linaro.org>
Reviewed-by: Stefano Stabellini <sstabellini@kernel.org>
Ian Jackson [Wed, 17 Jan 2018 14:29:24 +0000 (14:29 +0000)]
Revert "xl: Default guest mode changed from PV to PVH with PV shim"
This breaks ARM. It should be protected by some x86 #if. For now,
revert it, as it's not critical (and it isn't included in the
comet/vixen security patch branches published via XSA-254).
This reverts commit
63080b704351022cb7badb73339d47646fb465bd.
Signed-off-by: Ian Jackson <ian.jackson@eu.citrix.com>
CC: Wei Liu <wei.liu2@citrix.com>
CC: Julien Grall <julien.grall@linaro.org>
Wei Liu [Wed, 17 Jan 2018 09:50:27 +0000 (09:50 +0000)]
tools: fix arm build after
bdf693ee61b48
The ramdisk fields were removed. We should use modules[0] instead.
Signed-off-by: Wei Liu <wei.liu2@citrix.com>
Acked-by: Ian Jackson <ian.jackson@eu.citrix.com>
Wei Liu [Tue, 16 Jan 2018 18:56:45 +0000 (18:56 +0000)]
Don't build xen-shim for 32 bit build host
Signed-off-by: Wei Liu <wei.liu2@citrix.com>
Ian Jackson [Fri, 5 Jan 2018 16:13:31 +0000 (16:13 +0000)]
xl: Default guest mode changed from PV to PVH with PV shim
If the config file specifies a type (or builder), it overrides this
default. But if it doesn't, you now get a PV-in-PVH guest.
Signed-off-by: Ian Jackson <Ian.Jackson@eu.citrix.com>
Ian Jackson [Fri, 22 Dec 2017 16:12:23 +0000 (16:12 +0000)]
xl: pvshim: Provide and document xl config
Signed-off-by: Ian Jackson <Ian.Jackson@eu.citrix.com>
Signed-off-by: Wei Liu <wei.liu2@citrix.com>
Ian Jackson [Fri, 5 Jan 2018 15:59:29 +0000 (15:59 +0000)]
libxl: pvshim: Introduce pvshim_extra
And move the debugging options from the default config into a doc
comment in libxl_types.idl.
Signed-off-by: Ian Jackson <Ian.Jackson@eu.citrix.com>
Ian Jackson [Thu, 14 Dec 2017 16:16:20 +0000 (16:16 +0000)]
libxl: pvshim: Provide first-class config settings to enable shim mode
This is API-compatible because old callers are supposed to call
libxl_*_init to initialise the struct; and the updated function clears
these members.
It is ABI-compatible because the new fields make this member of the
guest type union larger but only within the existing size of that
union.
Unfortunately it is not easy to backport because it depends on the PVH
domain type. Attempts to avoid use of the PVH domain type involved
working with two views of the configuration: the "underlying" domain
type and the "visible" type (and corresponding config info). Also
there are different sets of config settings for PV and PVH, which
callers would have to know to set.
And, unfortunately, it will not be possible, with this approach, to
enable the shim by default for all libxl callers. (Although it could
perhaps be done in xl.)
For now, our config defaults are:
* if enabled, path is "xen-shim" in the xen firmware directory
* if enabled, cmdline is the one we are currently debugging with
The debugging arguments will be rationalised in a moment.
Signed-off-by: Ian Jackson <Ian.Jackson@eu.citrix.com>
Signed-off-by: George Dunlap <george.dunlap@citrix.com>
Signed-off-by: Wei Liu <wei.liu2@citrix.com>
Roger Pau Monne [Thu, 11 Jan 2018 11:41:21 +0000 (11:41 +0000)]
xen/shim: allow DomU to have as many vcpus as available
Since the shim VCPUOP_{up/down} hypercall is wired to the plug/unplug
of CPUs to the shim itself, start the shim DomU with only the BSP
online, and let the guest bring up other CPUs as it needs them.
Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>
Roger Pau Monne [Thu, 11 Jan 2018 11:41:21 +0000 (11:41 +0000)]
xen/shim: crash instead of reboot in shim mode
All guest shutdown operations are forwarded to L0, so the only native
calls to machine_restart happen from crash related paths inside the
hypervisor, hence switch the reboot code to instead issue a crash
shutdown.
Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>
Acked-by: Jan Beulich <jbeulich@suse.com>
[ wei: fix arm build ]
Signed-off-by: Wei Liu <wei.liu2@citrix.com>
Roger Pau Monne [Thu, 11 Jan 2018 11:41:20 +0000 (11:41 +0000)]
xen/pvshim: use default position for the m2p mappings
When running a 32bit kernel as Dom0 on a 64bit hypervisor the
hypervisor will try to shrink the hypervisor hole to the minimum
needed, and thus requires the Dom0 to use XENMEM_machphys_mapping in
order to fetch the position of the start of the hypervisor virtual
mappings.
Disable this feature when running as a PV shim, since some DomU
kernels don't implemented XENMEM_machphys_mapping and break if the m2p
doesn't begin at the default address.
NB: support for the XENMEM_machphys_mapping was added in Linux by
commit 7e7750.
Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>
Acked-by: Jan Beulich <jbeulich@suse.com>
Roger Pau Monne [Thu, 11 Jan 2018 11:41:20 +0000 (11:41 +0000)]
xen/shim: modify shim_mem parameter behaviour
shim_mem will now account for both the memory used by the hypervisor
loaded in memory and the free memory slack given to the shim for
runtime usage.
From experimental testing it seems like the total amount of MiB used
by the shim (giving it ~1MB of free memory for runtime) is:
memory/113 + 20
Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>
Roger Pau Monne [Thu, 11 Jan 2018 11:41:20 +0000 (11:41 +0000)]
xen/pvshim: memory hotplug
Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>
Roger Pau Monne [Thu, 11 Jan 2018 11:41:20 +0000 (11:41 +0000)]
xen/pvshim: support vCPU hotplug
Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>
Roger Pau Monne [Thu, 11 Jan 2018 11:41:20 +0000 (11:41 +0000)]
xen/pvshim: set max_pages to the value of tot_pages
So that the guest is not able to deplete the memory pool of the shim
itself by trying to balloon up.
Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>
Acked-by: Jan Beulich <jbeulich@suse.com>
Sergey Dyasli [Thu, 11 Jan 2018 11:41:20 +0000 (11:41 +0000)]
xen/pvshim: add shim_mem cmdline parameter
Signed-off-by: Sergey Dyasli <sergey.dyasli@citrix.com>
Roger Pau Monne [Thu, 11 Jan 2018 11:41:19 +0000 (11:41 +0000)]
xen/pvshim: add migration support
Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>
Sergey Dyasli [Thu, 11 Jan 2018 11:45:23 +0000 (11:45 +0000)]
x86/pv-shim: shadow PV console's page for L2 DomU
Signed-off-by: Sergey Dyasli <sergey.dyasli@citrix.com>
Signed-off-by: Wei Liu <wei.liu2@citrix.com>
[remove notify_guest helper and directly use pv_shim_inject_evtchn]
Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>
Signed-off-by: Wei Liu <wei.liu2@citrix.com>
Roger Pau Monne [Thu, 11 Jan 2018 11:41:19 +0000 (11:41 +0000)]
xen/pvshim: add grant table operations
Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>
Signed-off-by: Anthony Liguori <aliguori@amazon.com>
Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Roger Pau Monne [Thu, 11 Jan 2018 11:41:19 +0000 (11:41 +0000)]
xen/pvshim: forward evtchn ops between L0 Xen and L2 DomU
Note that the unmask and the virq operations are handled by the shim
itself, and that FIFO event channels are not exposed to the guest.
Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>
Signed-off-by: Anthony Liguori <aliguori@amazon.com>
Signed-off-by: Sergey Dyasli <sergey.dyasli@citrix.com>
Roger Pau Monne [Thu, 11 Jan 2018 11:41:19 +0000 (11:41 +0000)]
xen/pvshim: set correct domid value
If domid is not provided by L0 set domid to 1 by default. Note that L0
not provinding the domid can cause trouble if the guest tries to use
it's domid instead of DOMID_SELF when performing hypercalls that are
forwarded to the L0 hypervisor.
Since the domain created is no longer the hardware domain add a hook
to the domain shutdown path in order to forward shutdown operations to
the L0 hypervisor.
Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>
Signed-off-by: Sergey Dyasli <sergey.dyasli@citrix.com>
Roger Pau Monne [Thu, 11 Jan 2018 11:41:18 +0000 (11:41 +0000)]
xen/pvshim: modify Dom0 builder in order to build a DomU
According to the PV ABI the initial virtual memory regions should
contain the xenstore and console pages after the start_info. Also set
the correct values in the start_info for DomU operation.
Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>
Roger Pau Monne [Thu, 11 Jan 2018 11:41:18 +0000 (11:41 +0000)]
xen: mark xenstore/console pages as RAM
This si required so that later they can be shared with the guest if
Xen is running in shim mode.
Also prevent them from being used by Xen by marking them as bad pages
in init_boot_pages.
Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>
Signed-off-by: Wei Liu <wei.liu2@citrix.com>
Roger Pau Monne [Thu, 11 Jan 2018 11:41:18 +0000 (11:41 +0000)]
xen/pvshim: skip Dom0-only domain builder parts
Do not allow access to any iomem or ioport by the shim, and also
remove the check for Dom0 kernel support.
Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>
Acked-by: Jan Beulich <jbeulich@suse.com>
Roger Pau Monne [Thu, 11 Jan 2018 11:41:18 +0000 (11:41 +0000)]
xen/pvh: do not mark the low 1MB as IO mem
On PVH there's nothing special on the low 1MB.
This is an optional patch that doesn't affect the functionality of the
shim.
Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>
Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Roger Pau Monne [Tue, 28 Nov 2017 09:54:17 +0000 (09:54 +0000)]
xen/x86: make VGA support selectable
Through a Kconfig option. Enable it by default, and disable it for the
PV-in-PVH shim.
Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
Andrew Cooper [Wed, 22 Nov 2017 13:31:26 +0000 (13:31 +0000)]
tools/firmware: Build and install xen-shim
Link a minimum set of files to build the shim. The linkfarm rune can
handle creation and deletion of files. Introduce build-shim and
install-shim targets in xen/Makefile.
We can do better by properly generate the dependency from the list of
files but that's an improvement for later.
Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Signed-off-by: Wei Liu <wei.liu2@citrix.com>
Andrew Cooper [Fri, 10 Nov 2017 16:35:26 +0000 (16:35 +0000)]
x86/shim: Kconfig and command line options
Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Signed-off-by: Wei Liu <wei.liu2@citrix.com>
Sergey Dyasli [Fri, 24 Nov 2017 11:21:17 +0000 (11:21 +0000)]
x86/guest: use PV console for Xen/Dom0 I/O
Signed-off-by: Sergey Dyasli <sergey.dyasli@citrix.com>
Signed-off-by: Wei Liu <wei.liu2@citrix.com>
Sergey Dyasli [Fri, 24 Nov 2017 11:07:32 +0000 (11:07 +0000)]
x86/guest: add PV console code
Signed-off-by: Sergey Dyasli <sergey.dyasli@citrix.com>
Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Signed-off-by: Wei Liu <wei.liu2@citrix.com>
Roger Pau Monne [Tue, 9 Jan 2018 12:51:37 +0000 (12:51 +0000)]
x86/guest: setup event channel upcall vector
And a dummy event channel upcall handler.
Note that with the current code the underlying Xen (L0) must support
HVMOP_set_evtchn_upcall_vector or else event channel setup is going to
fail. This limitation can be lifted by implementing more event channel
interrupt injection methods as a backup.
Register callback_irq to trick toolstack to think the domain is
enlightened.
Signed-off-by: Sergey Dyasli <sergey.dyasli@citrix.com>
Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>
Signed-off-by: Wei Liu <wei.liu2@citrix.com>
Wei Liu [Thu, 11 Jan 2018 13:45:48 +0000 (13:45 +0000)]
x86: don't swallow the first command line item in guest mode
Signed-off-by: Wei Liu <wei.liu2@citrix.com>
Wei Liu [Fri, 17 Nov 2017 15:19:09 +0000 (15:19 +0000)]
x86: read wallclock from Xen when running in pvh mode
Signed-off-by: Wei Liu <wei.liu2@citrix.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
Wei Liu [Fri, 17 Nov 2017 12:46:41 +0000 (12:46 +0000)]
x86: APIC timer calibration when running as a guest
The timer calibration currently depends on PIT. Introduce a variant
to wait for a tick's worth of time to elapse when running as a PVH
guest.
Signed-off-by: Wei Liu <wei.liu2@citrix.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
Wei Liu [Thu, 16 Nov 2017 17:56:18 +0000 (17:56 +0000)]
x86: xen pv clock time source
It is a variant of TSC clock source.
Signed-off-by: Wei Liu <wei.liu2@citrix.com>
Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>
Roger Pau Monne [Thu, 28 Dec 2017 15:22:34 +0000 (15:22 +0000)]
x86/guest: map per-cpu vcpu_info area.
Mapping the per-vcpu vcpu_info area is required in order to use more
than XEN_LEGACY_MAX_VCPUS.
Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>
Signed-off-by: Wei Liu <wei.liu2@citrix.com>
Roger Pau Monne [Wed, 27 Dec 2017 09:23:01 +0000 (09:23 +0000)]
xen/guest: fetch vCPU ID from Xen
If available.
Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
[ wei: fix non-shim build ]
Signed-off-by: Wei Liu <wei.liu2@citrix.com>
Roger Pau Monne [Tue, 9 Jan 2018 11:19:44 +0000 (11:19 +0000)]
x86/guest: map shared_info page
Use an unpopulated PFN in order to map it.
Signed-off-by: Roger Pau Monne <roger.pau@citrix.com>
Signed-off-by: Wei Liu <wei.liu2@citrix.com>
Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Wei Liu [Wed, 3 Jan 2018 16:50:24 +0000 (16:50 +0000)]
xen/pvshim: keep track of used PFN ranges
Simple infrastructure to keep track of PFN space usage, so that we can
use unpopulated PFNs to map special pages like shared info and grant
table.
As rangeset depends on malloc being ready so hypervisor_setup is
introduced for things that can be initialised late in the process.
Note that the PFN is marked as reserved at least up to 4GiB (or more
if the guest has more memory). This is not a perfect solution but
avoids using the MMIO hole below 4GiB. Ideally the shim (L1) should
have a way to ask the underlying Xen (L0) which memory regions are
populated, unpopulated, or MMIO space.
Signed-off-by: Roger Pau Monne <roger.pau@citrix.com>
Signed-off-by: Wei Liu <wei.liu2@citrix.com>
Wei Liu [Wed, 3 Jan 2018 16:38:54 +0000 (16:38 +0000)]
xen: introduce rangeset_claim_range
Reserve a hole in a rangeset.
Signed-off-by: Roger Pau Monne <roger.pau@citrix.com>
Signed-off-by: Wei Liu <wei.liu2@citrix.com>
Wei Liu [Thu, 11 Jan 2018 10:18:09 +0000 (10:18 +0000)]
xen/console: Introduce console=xen
This specifies whether to use Xen specific console output. There are
two variants: one is the hypervisor console, the other is the magic
debug port 0xe9.
Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Signed-off-by: Wei Liu <wei.liu2@citrix.com>
Wei Liu [Tue, 14 Nov 2017 18:19:09 +0000 (18:19 +0000)]
x86/pvh: Retrieve memory map from Xen
Signed-off-by: Wei Liu <wei.liu2@citrix.com>
Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>
Andrew Cooper [Tue, 21 Nov 2017 14:43:32 +0000 (14:43 +0000)]
x86/shutdown: Support for using SCHEDOP_{shutdown,reboot}
Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Signed-off-by: Wei Liu <wei.liu2@citrix.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
Andrew Cooper [Tue, 21 Nov 2017 13:54:47 +0000 (13:54 +0000)]
x86/guest: Hypercall support
Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Signed-off-by: Wei Liu <wei.liu2@citrix.com>
Andrew Cooper [Tue, 28 Nov 2017 14:53:51 +0000 (14:53 +0000)]
x86/entry: Probe for Xen early during boot
Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Andrew Cooper [Wed, 22 Nov 2017 11:39:04 +0000 (11:39 +0000)]
x86/boot: Map more than the first 16MB
TODO: Replace somehow (bootstrap_map() ?)
Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Wei Liu [Mon, 13 Nov 2017 17:32:19 +0000 (17:32 +0000)]
x86/entry: Early PVH boot code
Signed-off-by: Wei Liu <wei.liu2@citrix.com>
Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Wei Liu [Fri, 10 Nov 2017 16:19:40 +0000 (16:19 +0000)]
x86: produce a binary that can be booted as PVH
Produce a binary that can be booted as PVH. It doesn't do much yet.
Signed-off-by: Wei Liu <wei.liu2@citrix.com>
Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Wei Liu [Fri, 10 Nov 2017 12:36:49 +0000 (12:36 +0000)]
x86: introduce ELFNOTE macro
It is needed later for introducing PVH entry point.
Signed-off-by: Wei Liu <wei.liu2@citrix.com>
Andrew Cooper [Wed, 22 Nov 2017 11:09:41 +0000 (11:09 +0000)]
x86/link: Relocate program headers
When the xen binary is loaded by libelf (in the future) we rely on the
elf loader to load the binary accordingly. Specify the load address so
that the resulting binary can make p_vaddr and p_paddr have different
values.
Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Signed-off-by: Wei Liu <wei.liu2@citrix.com>
Andrew Cooper [Fri, 10 Nov 2017 16:35:26 +0000 (16:35 +0000)]
x86/Kconfig: Options for Xen and PVH support
Introduce two options. One to detect whether the binary is running on
Xen, the other enables PVH ABI support.
The former will be useful to PV in HVM approach. Both will be used by
PV in PVH approach.
Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Signed-off-by: Wei Liu <wei.liu2@citrix.com>
Andrew Cooper [Thu, 11 Jan 2018 17:48:00 +0000 (17:48 +0000)]
x86: Common cpuid faulting support
With CPUID Faulting offered to SVM guests, move Xen's faulting code to being
common rather than Intel specific.
This is necessary for nested Xen (inc. pv-shim mode) to prevent PV guests from
finding the outer HVM Xen leaves via native cpuid.
Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Signed-off-by: Wei Liu <wei.liu2@citrix.com>
Andrew Cooper [Thu, 11 Jan 2018 17:48:00 +0000 (17:48 +0000)]
x86/fixmap: Modify fix_to_virt() to return a void pointer
Almost all users of fix_to_virt() actually want a pointer. Include the cast
within the definition, so the callers don't need to.
Two users which need the integer value are switched to using __fix_to_virt()
directly. A few users stay fully unchanged, due to GCC's void pointer
arithmetic extension causing the same behaviour. Most users however have
their explicit casting dropped.
Since __iomem is not used consistently in Xen, we drop it too.
No functional change.
Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Reviewed-by: Wei Liu <wei.liu2@citrix.com>
Signed-off-by: Wei Liu <wei.liu2@citrix.com>
Jon Ludlam [Thu, 11 Jan 2018 17:47:59 +0000 (17:47 +0000)]
tools/ocaml: Extend domain_create() to take arch_domainconfig
No longer passing NULL into xc_domain_create() allows for the creation
of PVH guests.
Signed-off-by: Jon Ludlam <jonathan.ludlam@citrix.com>
Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Andrew Cooper [Thu, 11 Jan 2018 17:47:59 +0000 (17:47 +0000)]
tools/ocaml: Expose arch_config in domaininfo
Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Andrew Cooper [Thu, 11 Jan 2018 17:47:59 +0000 (17:47 +0000)]
xen/domctl: Return arch_config via getdomaininfo
This allows toolstack software to distinguish HVM from PVH guests.
Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Signed-off-by: Wei Liu <wei.liu2@citrix.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
Bob Moore [Thu, 11 Jan 2018 17:47:59 +0000 (17:47 +0000)]
ACPICA: Make ACPI Power Management Timer (PM Timer) optional.
PM Timer is now optional.
This support is already in Windows8 and "SHOULD" come out in ACPI 5.0A
(if all goes well).
The change doesn't affect Xen directly, because it does not rely
on the presence of the PM timer.
Signed-off-by: Bob Moore <robert.moore@intel.com>
Signed-off-by: Lv Zheng <lv.zheng@intel.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
[ported to Xen]
Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>
Andrew Cooper [Thu, 11 Jan 2018 17:47:59 +0000 (17:47 +0000)]
x86/link: Introduce and use SECTION_ALIGN
... to reduce the quantity of #ifdef EFI.
Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Reviewed-by: Wei Liu <wei.liu2@citrix.com>
Andrew Cooper [Thu, 11 Jan 2018 17:47:59 +0000 (17:47 +0000)]
x86/time: Print a more helpful error when a platform timer can't be found
Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Reviewed-by: Wei Liu <wei.liu2@citrix.com>
Andrew Cooper [Thu, 11 Jan 2018 17:47:58 +0000 (17:47 +0000)]
xen/common: Widen the guest logging buffer slightly
This reduces the amount of line wrapping from guests; Xen in particular likes
to print lines longer than 80 characters.
Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Reviewed-by: Wei Liu <wei.liu2@citrix.com>
Jonathan Ludlam [Thu, 11 Jan 2018 17:47:58 +0000 (17:47 +0000)]
tools/libxc: Multi modules support
Signed-off-by: Jonathan Ludlam <jonathan.ludlam@citrix.com>
Signed-off-by: Sergey Dyasli <sergey.dyasli@citrix.com>
Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Signed-off-by: Wei Liu <wei.liu2@citrix.com>
Wei Liu [Thu, 11 Jan 2018 17:47:58 +0000 (17:47 +0000)]
tools/libelf: fix elf notes check for PVH guest
PVH only requires PHYS32_ENTRY to be set. Return immediately if that's
the case.
Also remove the printk in pvh_load_kernel.
Signed-off-by: Wei Liu <wei.liu2@citrix.com>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
Wei Liu [Thu, 11 Jan 2018 17:47:58 +0000 (17:47 +0000)]
tools/libxc: remove extraneous newline in xc_dom_load_acpi
Signed-off-by: Wei Liu <wei.liu2@citrix.com>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
Roger Pau Monne [Thu, 11 Jan 2018 17:47:58 +0000 (17:47 +0000)]
xen/x86: report domain id on cpuid
Use the ECX register of the hypervisor leaf 5. The EAX register on
this leaf is a flags field that can be used to notice the presence of
the domain id in ECX. Note that this is only available to HVM guests.
Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>
Andrew Cooper [Thu, 11 Jan 2018 17:47:57 +0000 (17:47 +0000)]
x86/svm: Offer CPUID Faulting to AMD HVM guests as well
CPUID Faulting can be virtulised for HVM guests without hardware support,
meaning it can be offered to SVM guests.
Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Andrew Cooper [Fri, 3 Nov 2017 16:12:13 +0000 (16:12 +0000)]
x86/cmdline: Introduce a command line option to disable IBRS/IBPB, STIBP and IBPB
Instead of gaining yet another top level boolean, introduce a more generic
cpuid= option. Also introduce a helper function to parse a generic boolean
value.
This is part of XSA-254.
Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
Andrew Cooper [Fri, 3 Nov 2017 16:12:13 +0000 (16:12 +0000)]
x86/feature: Definitions for Indirect Branch Controls
Contemporary processors are gaining Indirect Branch Controls via microcode
updates. Intel are introducing one bit to indicate IBRS and IBPB support, and
a second bit for STIBP. AMD are introducing IBPB only, so enumerate it with a
separate bit.
Furthermore, depending on compiler and microcode availability, we may want to
run Xen with IBRS set, or clear.
To use these facilities, we synthesise separate IBRS and IBPB bits for
internal use. A lot of infrastructure is required before these features are
safe to offer to guests.
This is part of XSA-254.
Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Wei Liu <wei.liu2@citrix.com>
Andrew Cooper [Mon, 18 Dec 2017 13:54:25 +0000 (13:54 +0000)]
x86: Introduce alternative indirect thunks
Depending on hardware and microcode availability, we will want to replace
IND_THUNK_REPOLINE with other implementations.
For AMD hardware, choose IND_THUNK_LFENCE in preference to retpoline if lfence
is known to be (or was successfully made) dispatch serialising.
This is part of XSA-254.
Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
Andrew Cooper [Sun, 17 Dec 2017 16:20:50 +0000 (16:20 +0000)]
x86/amd: Try to set lfence as being Dispatch Serialising
This property is required for the AMD's recommended mitigation for Branch
Target Injection, but Xen needs to cope with being unable to detect or modify
the MSR.
This is part of XSA-254.
Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
Andrew Cooper [Thu, 16 Nov 2017 16:09:44 +0000 (16:09 +0000)]
x86/boot: Report details of speculative mitigations
Nothing very interesting at the moment, but the logic will grow as new
mitigations are added.
This is part of XSA-254.
Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Acked-by: Jan Beulich <jbeulich@suse.com>
Andrew Cooper [Mon, 18 Dec 2017 13:54:25 +0000 (13:54 +0000)]
x86: Support indirect thunks from assembly code
Introduce INDIRECT_CALL and INDIRECT_JMP which either degrade to a normal
indirect branch, or dispatch to the __x86_indirect_thunk_* symbols.
Update all the manual indirect branches in to use the new thunks. The
indirect branches in the early boot and kexec path are left intact as we can't
use the compiled-in thunks at those points.
This is part of XSA-254.
Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
Andrew Cooper [Mon, 18 Dec 2017 13:54:25 +0000 (13:54 +0000)]
x86: Support compiling with indirect branch thunks
Use -mindirect-branch=thunk-extern/-mindirect-branch-register when available.
To begin with, use the retpoline thunk. Later work will add alternative
thunks which can be selected at boot time.
This is part of XSA-254.
Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Acked-by: Jan Beulich <jbeulich@suse.com>
Jan Beulich [Tue, 16 Jan 2018 16:50:59 +0000 (17:50 +0100)]
x86: allow Meltdown band-aid to be disabled
First of all we don't need it on AMD systems. Additionally allow its use
to be controlled by command line option. For best backportability, this
intentionally doesn't use alternative instruction patching to achieve
the intended effect - while we likely want it, this will be later
follow-up.
Signed-off-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
Jan Beulich [Tue, 16 Jan 2018 16:49:03 +0000 (17:49 +0100)]
x86: Meltdown band-aid against malicious 64-bit PV guests
This is a very simplistic change limiting the amount of memory a running
64-bit PV guest has mapped (and hence available for attacking): Only the
mappings of stack, IDT, and TSS are being cloned from the direct map
into per-CPU page tables. Guest controlled parts of the page tables are
being copied into those per-CPU page tables upon entry into the guest.
Cross-vCPU synchronization of top level page table entry changes is
being effected by forcing other active vCPU-s of the guest into the
hypervisor.
The change to context_switch() isn't strictly necessary, but there's no
reason to keep switching page tables once a PV guest is being scheduled
out.
This isn't providing full isolation yet, but it should be covering all
pieces of information exposure of which would otherwise require an XSA.
There is certainly much room for improvement, especially of performance,
here - first and foremost suppressing all the negative effects on AMD
systems. But in the interest of backportability (including to really old
hypervisors, which may not even have alternative patching) any such is
being left out here.
Signed-off-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
Andrew Cooper [Fri, 1 Sep 2017 11:15:39 +0000 (12:15 +0100)]
x86/mm: Always set _PAGE_ACCESSED on L4e updates
Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
Andrew Cooper [Fri, 6 Oct 2017 13:21:32 +0000 (13:21 +0000)]
x86/Rules: Use -mskip-rax-setup if the compiler supports it
This option is available from GCC 5 onwards, and was specifically introduced
as an optimisation for Linux. When using variadic functions, the caller needs
to know how many floating point arguments were passed. Xen, like Linux,
doesn't uses floating point arguments, so doesn't need to emit code to inform
variadic functions such as printk() that there are zero arguments.
The net delta for a release build is:
add/remove: 0/0 grow/shrink: 35/625 up/down: 603/-5489 (-4886)
with the single biggest change being:
x86_emulate 101933 101751 -182
Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
Andrew Cooper [Fri, 29 Dec 2017 12:56:24 +0000 (12:56 +0000)]
x86/bitops: Introduce variable/constant pairs for __{set,clear,change}_bit()
Just as with test_bit, the non-atomic set/clear/change helpers can be better
optimised by the compiler in the case that the nr parameter is constant, and
it often is.
This results in a general replacement of `mov $imm, %reg; bt* %reg, mem` with
the shorter and more simple `op $imm, mem`, also reducing register pressure.
The net diffstat is:
add/remove: 0/1 grow/shrink: 5/17 up/down: 90/-301 (-211)
As a piece of minor cleanup, drop unnecessary brackets in the test_bit()
macro, and fix the indentation.
Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Reviewed-by: Dario Faggioli <dfaggioli@suse.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
Andrew Cooper [Fri, 29 Dec 2017 13:06:14 +0000 (13:06 +0000)]
xen/sched_rt: Move repl_timer into struct rt_private
struct timer is only 48 bytes and repl_timer has a 1-to-1 correspondance with
struct rt_private, so having it referenced by pointer is wasteful.
This avoids one memory allocation in rt_init(), and the resulting diffstat is:
add/remove: 0/0 grow/shrink: 0/7 up/down: 0/-156 (-156)
function old new delta
rt_switch_sched 134 133 -1
rt_context_saved 278 271 -7
rt_vcpu_remove 253 245 -8
rt_vcpu_sleep 234 218 -16
repl_timer_handler 761 744 -17
rt_deinit 44 20 -24
rt_init 219 136 -83
As an extra bit of cleanup noticed while making this change, there is no need
to call cpumask_clear() on an zeroed memory allocation.
Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Acked-by: Dario Faggioli <dfaggioli@suse.com>
Reviewed-by: Meng Xu <mengxu@cis.upenn.edu>
Andrew Cooper [Fri, 29 Dec 2017 12:56:34 +0000 (12:56 +0000)]
xen/credit2: Drop unnecessary bit test
Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Acked-by: George Dunlap <george.dunlap@citrix.com>
Reviewed-by: Dario Faggioli <dfaggioli@suse.com>
Andrew Cooper [Thu, 11 Jan 2018 23:38:40 +0000 (23:38 +0000)]
x86/boot: Fix boot following c/s
b6c2c7f48a
c/s
b6c2c7f48a unfortunately broke booting on affected systems. Most of the
time, ioemul_handle_quirk() doesn't write a custom stub, and the redundant
call was depending on the seemingly-pointless writing of the default stub.
Alter the ioemul_handle_quirk() API to return a boolean if a custom stub was
written, allowing its caller to know whether it should write a default stub
instead.
Finally, adjust the /* Regular stubs */ comment to make it clearer that the 16
refers to the length of the emul stub opcode.
Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
Dario Faggioli [Wed, 10 Jan 2018 18:20:34 +0000 (19:20 +0100)]
MAINTAINERS: update my entries to new email address.
Signed-off-by: Dario Faggioli <dfaggioli@suse.com>
Acked-by: Wei Liu <wei.liu2@citrix.com>
Acked-by: Stefano Stabellini <sstabellini@kernel.org>
Acked-by: Meng Xu <mengxu@cis.upenn.edu>
Acked-by: Juergen Gross <jgross@suse.com>
Acked-by: George Dunlap <george.dunlap@citrix.com>
Acked-by: Andrew Cooper <andrew.cooper3@citrix.com>
Andrew Cooper [Tue, 28 Nov 2017 19:11:12 +0000 (19:11 +0000)]
x86/microcode: Use the exported bootstrap_map() function
... rather than obtaining it via function pointer. The internal ucode_mod_map
function pointer can also be dropped.
Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Acked-by: Jan Beulich <jbeulich@suse.com>
Andrew Cooper [Tue, 28 Nov 2017 19:07:02 +0000 (19:07 +0000)]
x86/xsm: Use the exported bootstrap_map() function
... rather than obtaining it via function pointer.
Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Acked-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Daniel De Graaf <dgdegra@tycho.nsa.gov>