Andrew Cooper [Fri, 13 Sep 2019 16:17:21 +0000 (17:17 +0100)]
drivers/acpi: Drop "ERST table was not found" message
ERST isn't a mandatory table, and also isn't very common to find. The message
is unnecessary noise during boot. Furthermore, it is redundant with the list
of found ACPI tables printed just ahead.
Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Acked-by: Jan Beulich <jbeulich@suse.com>
Andrew Cooper [Fri, 13 Sep 2019 16:13:35 +0000 (17:13 +0100)]
x86/vpmu: Drop "VPMU: disabled" message
Printing "$foo disabled" is unnecessary noise during boot. All other VPMU
settings emit a message, so this doesn't result in any ambiguity.
Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Acked-by: Jan Beulich <jbeulich@suse.com>
Juergen Gross [Fri, 6 Sep 2019 12:41:03 +0000 (14:41 +0200)]
tools/libs: put common Makefile parts into new libs.mk
The Makefiles below tools/libs have a lot in common. Put those common
parts into a new libs.mk and include that from the specific Makefiles.
Signed-off-by: Juergen Gross <jgross@suse.com>
Acked-by: Wei Liu <wl@xen.org>
Roger Pau Monné [Tue, 17 Sep 2019 14:13:39 +0000 (16:13 +0200)]
vpci: honor read-only devices
Don't allow the hardware domain write access to the PCI config space of
devices marked as read-only.
Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
Paul Durrant [Tue, 17 Sep 2019 14:12:47 +0000 (16:12 +0200)]
sysctl / libxl: report whether IOMMU/HAP page table sharing is supported
This patch defines a new bit reported in the hw_cap field of struct
xen_sysctl_physinfo to indicate whether the platform supports sharing of
HAP page tables (i.e. the P2M) with the IOMMU. This informs the toolstack
whether the domain needs extra memory to store discrete IOMMU page tables
or not.
NOTE: This patch makes sure iommu_hap_pt_shared is clear if HAP is not
supported or the IOMMU is disabled, and defines it to false if
!CONFIG_HVM.
Signed-off-by: Paul Durrant <paul.durrant@citrix.com>
Signed-off-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Christian Lindig <christian.lindig@citrix.com>
Acked-by: Wei Liu <wl@xen.org>
Acked-by: Julien Grall <julien.grall@arm.com>
Paul Durrant [Tue, 17 Sep 2019 14:11:48 +0000 (16:11 +0200)]
use is_iommu_enabled() where appropriate...
...rather than testing the global iommu_enabled flag and ops pointer.
Now that there is a per-domain flag indicating whether the domain is
permitted to use the IOMMU (which determines whether the ops pointer will
be set), many tests of the global iommu_enabled flag and ops pointer can
be translated into tests of the per-domain flag. Some of the other tests of
purely the global iommu_enabled flag can also be translated into tests of
the per-domain flag.
NOTE: The comment in iommu_share_p2m_table() is also fixed; need_iommu()
disappeared some time ago. Also, whilst the style of the 'if' in
flask_iommu_resource_use_perm() is fixed, I have not translated any
instances of u32 into uint32_t to keep consistency. IMO such a
translation would be better done globally for the source module in
a separate patch.
The change to the definition of iommu_call() is to keep the PV shim
build happy. Without this change it will fail to compile with errors
of the form:
iommu.c:361:32: error: unused variable ‘hd’ [-Werror=unused-variable]
const struct domain_iommu *hd = dom_iommu(d);
^~
Signed-off-by: Paul Durrant <paul.durrant@citrix.com>
Reviewed-by: "Roger Pau Monné" <roger.pau@citrix.com>
Reviewed-by: Kevin Tian <kevin.tian@intel.com>
Acked-by: Daniel De Graaf <dgdegra@tycho.nsa.gov>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Julien Grall <julien.grall@arm.com>
Paul Durrant [Tue, 17 Sep 2019 14:10:38 +0000 (16:10 +0200)]
domain: introduce XEN_DOMCTL_CDF_iommu flag
This patch introduces a common domain creation flag to determine whether
the domain is permitted to make use of the IOMMU. Currently the flag is
always set for both dom0 and any domU created by libxl if the IOMMU is
globally enabled (i.e. iommu_enabled == 1). sanitise_domain_config() is
modified to reject the flag if !iommu_enabled.
A new helper function, is_iommu_enabled(), is added to test the flag and
iommu_domain_init() will return immediately if !is_iommu_enabled(). This is
slightly different to the previous behaviour based on !iommu_enabled where
the call to arch_iommu_domain_init() was made regardless, however it appears
that this call was only necessary to initialize the dt_devices list for ARM
such that iommu_release_dt_devices() can be called unconditionally by
domain_relinquish_resources(). Adding a simple check of is_iommu_enabled()
into iommu_release_dt_devices() keeps this unconditional call working.
No functional change should be observed with this patch applied.
Subsequent patches will allow the toolstack to control whether use of the
IOMMU is enabled for a domain.
NOTE: The introduction of the is_iommu_enabled() helper function might
seem excessive but its use is expected to increase with subsequent
patches. Also, having iommu_domain_init() bail before calling
arch_iommu_domain_init() is not strictly necessary, but I think the
consequent addition of the call to is_iommu_enabled() in
iommu_release_dt_devices() makes the code clearer.
Signed-off-by: Paul Durrant <paul.durrant@citrix.com>
Reviewed-by: "Roger Pau Monné" <roger.pau@citrix.com>
Acked-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Christian Lindig <christian.lindig@citrix.com>
Acked-by: Julien Grall <julien.grall@arm.com>
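A minimal sketch of the per-domain check described in the XEN_DOMCTL_CDF_iommu commit above. Only the names is_iommu_enabled() and iommu_domain_init() come from the commit message; the struct layout, the flag value and the "initialised" field are assumptions made for illustration and are not Xen's actual definitions.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical flag value; the real XEN_DOMCTL_CDF_iommu bit may differ. */
#define SKETCH_CDF_IOMMU (1u << 4)

/* Hypothetical, heavily reduced stand-in for struct domain. */
struct sketch_domain {
    uint32_t options;        /* flags supplied at domain creation */
    bool iommu_initialised;  /* set once per-domain IOMMU state exists */
};

/* Test the per-domain flag instead of a global iommu_enabled variable. */
bool is_iommu_enabled(const struct sketch_domain *d)
{
    assert(d);
    return d->options & SKETCH_CDF_IOMMU;
}

/* Return immediately when the domain may not use the IOMMU. */
int iommu_domain_init(struct sketch_domain *d)
{
    if ( !is_iommu_enabled(d) )
        return 0;

    d->iommu_initialised = true;
    return 0;
}
```

In this sketch a domain created without the flag never gets any per-domain IOMMU state, which is why a later release path would need the same check.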
Juergen Gross [Tue, 17 Sep 2019 14:09:50 +0000 (16:09 +0200)]
sched: populate cpupool0 only after all cpus are up
Simplify cpupool initialization by populating cpupool0 with cpus only
after all cpus are up. This avoids having to call the cpu notifier
directly for cpu 0.
With that in place there is no need to create cpupool0 earlier, so
do that just before assigning the cpus. Initialize free cpus with all
online cpus at that time in order to be able to add the cpu notifier
late, too.
Signed-off-by: Juergen Gross <jgross@suse.com>
Reviewed-by: Dario Faggioli <dfaggioli@suse.com>
Juergen Gross [Tue, 17 Sep 2019 14:08:48 +0000 (16:08 +0200)]
spinlocks: print lock profile info in panic()
Print the lock profile data when the system crashes and add some more
information for each lock data (lock address, cpu holding the lock).
While at it use the PRI_stime format specifier for printing time data.
This is especially beneficial for watchdog triggered crashes in case
of deadlocks.
In order to have the cpu holding the lock available let the
lock profile config option select DEBUG_LOCKS.
As printing the lock profile data will make use of locking, too, we
need to disable spinlock debugging before calling
spinlock_profile_printall() from panic().
While at it remove a superfluous #ifdef CONFIG_LOCK_PROFILE and rename
CONFIG_LOCK_PROFILE to CONFIG_DEBUG_LOCK_PROFILE.
Also move the .lockprofile.data section to init area in linker scripts
as the data is no longer needed after boot.
Signed-off-by: Juergen Gross <jgross@suse.com>
Acked-by: Jan Beulich <jbeulich@suse.com>
Juergen Gross [Tue, 17 Sep 2019 14:08:03 +0000 (16:08 +0200)]
xen: add new CONFIG_DEBUG_LOCKS option
Instead of enabling debugging for debug builds only add a dedicated
Kconfig option for that purpose which defaults to DEBUG.
Signed-off-by: Juergen Gross <jgross@suse.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
Juergen Gross [Tue, 17 Sep 2019 14:07:11 +0000 (16:07 +0200)]
spinlocks: in debug builds store cpu holding the lock
Add the cpu currently holding the lock to struct lock_debug. This makes
analysis of locking errors easier and it can be tested whether the
correct cpu is releasing a lock again.
Signed-off-by: Juergen Gross <jgross@suse.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
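The idea of the commit above (recording the holding cpu so that a release by the wrong cpu can be detected) can be sketched as follows. This is a single-threaded illustration without atomics; all names and the SKETCH_NO_CPU marker are hypothetical, and Xen's struct lock_debug looks different.

```c
#include <assert.h>
#include <stdbool.h>

#define SKETCH_NO_CPU (-1)   /* hypothetical "nobody holds the lock" marker */

/* Hypothetical reduced debug record. */
struct sketch_lock_debug {
    int cpu;                 /* cpu currently holding the lock */
};

struct sketch_lock {
    bool locked;
    struct sketch_lock_debug debug;
};

/* Record the owner when the lock is taken. */
void sketch_lock_acquire(struct sketch_lock *l, int cpu)
{
    assert(!l->locked);
    l->locked = true;
    l->debug.cpu = cpu;
}

/* Release only if the caller is the recorded owner; report the outcome. */
bool sketch_lock_release(struct sketch_lock *l, int cpu)
{
    bool owner_ok = l->locked && l->debug.cpu == cpu;

    if ( owner_ok )
    {
        l->locked = false;
        l->debug.cpu = SKETCH_NO_CPU;
    }

    return owner_ok;
}
```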
Jan Beulich [Tue, 17 Sep 2019 14:06:15 +0000 (16:06 +0200)]
x86/PCI: read MSI-X table entry count early
Rather than doing this every time we set up interrupts for a device
anew (and then in two distinct places) fill this invariant field
right after allocating struct arch_msix.
While at it also obtain the MSI-X capability structure position just
once, in msix_capability_init(), rather than in each caller.
Furthermore take the opportunity and eliminate the multi_msix_capable()
alias of msix_table_size().
Signed-off-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Roger Pau Monné <roger.pau@citrix.com>
Acked-by: Andrew Cooper <andrew.cooper3@citrix.com>
Jan Beulich [Tue, 17 Sep 2019 14:05:34 +0000 (16:05 +0200)]
AMD/IOMMU: let callers of amd_iommu_alloc_intremap_table() handle errors
Additional users of the function will want to handle errors more
gracefully. Remove the BUG_ON()s and make the current caller panic()
instead.
Signed-off-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Andrew Cooper <andrew.cooper3@citrix.com>
Jan Beulich [Tue, 17 Sep 2019 14:05:01 +0000 (16:05 +0200)]
AMD/IOMMU: introduce a "valid" flag for IVRS mappings
For us to no longer blindly allocate interrupt remapping tables for
everything the ACPI tables name, we can't use struct ivrs_mappings'
intremap_table field anymore to also have the meaning of "this entry
is valid". Add a separate boolean field instead.
Signed-off-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Andrew Cooper <andrew.cooper3@citrix.com>
Jan Beulich [Tue, 17 Sep 2019 14:03:44 +0000 (16:03 +0200)]
AMD/IOMMU: don't free shared IRT multiple times
Calling amd_iommu_free_intremap_table() for every IVRS entry is correct
only in per-device-IRT mode. Use a NULL 2nd argument to indicate that
the shared table should be freed, and call the function exactly once in
shared mode.
Signed-off-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Andrew Cooper <andrew.cooper3@citrix.com>
Pawel Wieczorkiewicz [Wed, 21 Aug 2019 10:04:30 +0000 (10:04 +0000)]
livepatch: always print XENLOG_ERR information (ARM, ELF)
This complements [1] commit for ARM and livepatch_elf files.
[1] 4470efeae4 livepatch: always print XENLOG_ERR information
Signed-off-by: Pawel Wieczorkiewicz <wipawel@amazon.de>
Acked-by: Julien Grall <julien.grall@arm.com>
Reviewed-by: Ross Lagerwall <ross.lagerwall@citrix.com>
Chao Gao [Fri, 13 Sep 2019 10:31:34 +0000 (12:31 +0200)]
microcode: pass a patch pointer to apply_microcode()
apply_microcode()'s always loading the cached ucode patch forces
a patch to be stored before being loaded. Make apply_microcode()
accept a patch pointer to remove the limitation so that a patch
can be stored after a successful loading.
Signed-off-by: Chao Gao <chao.gao@intel.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
Chao Gao [Fri, 13 Sep 2019 10:31:01 +0000 (12:31 +0200)]
microcode/amd: call svm_host_osvw_init() in common code
Introduce a vendor hook, .end_update_percpu, for svm_host_osvw_init().
The hook function is called on each cpu after loading an update.
It is a preparation for splitting out apply_microcode() from
cpu_request_microcode().
Note that svm_host_osvw_init() should be called regardless of the
result of loading an update.
Signed-off-by: Chao Gao <chao.gao@intel.com>
Reviewed-by: Roger Pau Monné <roger.pau@citrix.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
Chao Gao [Fri, 13 Sep 2019 10:30:12 +0000 (12:30 +0200)]
microcode: remove pointless 'cpu' parameter
Some callbacks in microcode_ops or related functions take a cpu
id parameter. But at current call sites, the cpu id parameter is
always equal to current cpu id. Some of them even use an assertion
to guarantee this. Remove this redundant 'cpu' parameter.
Signed-off-by: Chao Gao <chao.gao@intel.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
Chao Gao [Fri, 13 Sep 2019 10:28:44 +0000 (12:28 +0200)]
microcode: remove struct ucode_cpu_info
Remove the per-cpu cache field in struct ucode_cpu_info since it has
been replaced by a global cache. This leaves only one field
remaining in ucode_cpu_info, so the struct is removed and the
remaining field (cpu signature) is stored in the per-cpu area.
The cpu status notifier is also removed. It was used to free the "mc"
field to avoid memory leak.
Signed-off-by: Chao Gao <chao.gao@intel.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
Chao Gao [Fri, 13 Sep 2019 10:28:13 +0000 (12:28 +0200)]
microcode: clean up microcode_resume_cpu
Previously, a per-cpu ucode cache was maintained. Each CPU had one
per-cpu update cache and there might be multiple versions of microcode.
Thus microcode_resume_cpu tried its best to update microcode by loading
every update cache until a successful load.
But now the cache struct is simplified a lot and only a single ucode is
cached. A single invocation of ->apply_microcode() loads the cache
and updates the microcode.
Signed-off-by: Chao Gao <chao.gao@intel.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
Chao Gao [Fri, 13 Sep 2019 10:27:42 +0000 (12:27 +0200)]
microcode: introduce a global cache of ucode patch
to replace the current per-cpu cache 'uci->mc'.
With the assumption that all CPUs in the system have the same signature
(family, model, stepping and 'pf'), a microcode update that matches
one cpu should match the others. Having differing microcode revisions
on cpus would make the system unstable and should be avoided. Hence, caching
one microcode update is good enough for all cases.
Introduce a global variable, microcode_cache, to store the newest
matching microcode update. Whenever we get a new valid microcode update,
its revision id is compared against that of the microcode update to
determine whether the "microcode_cache" needs to be replaced. And
this global cache is loaded to cpu in apply_microcode().
All operations on the cache are protected by 'microcode_mutex'.
Note that I deliberately avoid touching the old per-cpu cache ('uci->mc')
as I am going to remove it completely in the following patches. We copy
everything to create the new cache blob to avoid reusing some buffers
previously allocated for the old per-cpu cache. It is not so efficient,
but it is already corrected by a patch later in this series.
Signed-off-by: Chao Gao <chao.gao@intel.com>
Reviewed-by: Roger Pau Monné <roger.pau@citrix.com>
Acked-by: Jan Beulich <jbeulich@suse.com>
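The replacement rule for the global cache in the commit above (keep only the newest matching update) could look roughly like this. The descriptor layout and all sketch_* names are assumptions; the locking via microcode_mutex mentioned in the commit message is deliberately left out, and no data is copied here, unlike what the commit describes.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical reduced patch descriptor. */
struct sketch_patch {
    uint32_t signature;   /* cpu signature the update targets */
    uint32_t revision;    /* revision id of the update */
};

/* Single global cache; NULL until the first update is stored. */
static const struct sketch_patch *sketch_cache;

/*
 * Store cand in the cache if it matches the cpu signature and is newer
 * than what is cached. Returns true when the cache was replaced.
 */
bool sketch_cache_update(const struct sketch_patch *cand, uint32_t cpu_sig)
{
    assert(cand);

    if ( cand->signature != cpu_sig )
        return false;

    if ( sketch_cache && cand->revision <= sketch_cache->revision )
        return false;

    sketch_cache = cand;
    return true;
}
```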
Chao Gao [Fri, 13 Sep 2019 10:26:51 +0000 (12:26 +0200)]
microcode/amd: distinguish old and mismatched ucode in microcode_fits()
Sometimes, an ucode with a level lower than or equal to current CPU's
patch level is useful. For example, to work around a broken bios which
only loads ucode for BSP, when BSP parses an ucode blob during bootup,
it is better to save an ucode with lower or equal level for APs.
No functional change is made in this patch. But following patch would
handle "old ucode" and "mismatched ucode" separately.
Signed-off-by: Chao Gao <chao.gao@intel.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
Chao Gao [Fri, 13 Sep 2019 10:26:16 +0000 (12:26 +0200)]
microcode/intel: extend microcode_update_match()
to a more generic function. So that it can be used alone to check
an update against the CPU signature and current update revision.
Note that enum microcode_match_result will be used in common code
(aka microcode.c), so it has been placed in the common header. Also
constify the parameter of microcode_sanity_check() so that it
can be called by microcode_update_match().
Signed-off-by: Chao Gao <chao.gao@intel.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
Paul Durrant [Fri, 13 Sep 2019 10:21:47 +0000 (12:21 +0200)]
public/xen.h: update the comment explaining 'Wallclock time'
Since commit 0629adfd80e "Actually set a HVM domain's time offset when it
sets the RTC", the comment in the public header has been misleading, since
it claims that wallclock time is only updated by control software.
Moreover, the comments stating that wc_sec and wc_nsec are seconds and
nanoseconds (respectively) in UTC since the Unix epoch are bogus. Their
values are adjusted by the domain's time_offset_seconds value, which is
updated by a guest write to the emulated RTC and hence the wallclock
timezone is under guest control.
This patch attempts to bring the comment in line with reality whilst
keeping it reasonably short.
Signed-off-by: Paul Durrant <paul.durrant@citrix.com>
Acked-by: Jan Beulich <jbeulich@suse.com>
Paul Durrant [Thu, 12 Sep 2019 14:18:47 +0000 (15:18 +0100)]
Update my MAINTAINERS entries
My Citrix email address will expire shortly.
Signed-off-by: Paul Durrant <paul.durrant@citrix.com>
Acked-by: Julien Grall <julien.grall@arm.com>
Juergen Gross [Fri, 13 Sep 2019 06:15:05 +0000 (08:15 +0200)]
debugtrace: fix Arm build
Add missing #includes.
Signed-off-by: Juergen Gross <jgross@suse.com>
Acked-by: Jan Beulich <jbeulich@suse.com>
Julien Grall [Wed, 11 Sep 2019 15:31:34 +0000 (16:31 +0100)]
xen/arm: setup: Relocate the Device-Tree later on in the boot
At the moment, the Device-Tree is relocated into xenheap while setting
up the memory subsystem. This is actually not necessary because the
early mapping is still present and we don't require the virtual address
to be stable until unflattening the Device-Tree.
So the relocation can safely be moved after the memory subsystem is
fully set up. This has the nice advantage of making the relocation common
and letting the xenheap allocator decide where to put it.
Lastly, the device-tree is not going to be used on ACPI systems, so
there is no need to relocate it and it can just be discarded.
Signed-off-by: Julien Grall <julien.grall@arm.com>
Reviewed-by: Volodymyr Babchuk <volodymyr_babchuk@epam.com>
Acked-by: Stefano Stabellini <sstabellini@kernel.org>
Julien Grall [Wed, 11 Sep 2019 15:19:42 +0000 (16:19 +0100)]
xen/arm: bootfd: Fix indentation in process_multiboot_node()
One line in process_multiboot_node() is using hard tab rather than soft
tab. So fix it!
Signed-off-by: Julien Grall <julien.grall@arm.com>
Reviewed-by: Volodymyr Babchuk <volodymyr_babchuk@epam.com>
Acked-by: Stefano Stabellini <sstabellini@kernel.org>
Lars Kurth [Fri, 30 Aug 2019 19:35:13 +0000 (20:35 +0100)]
scripts/add_maintainers.pl: Add logic to use V entry
Add logic to use V section entry in THE REST for identifying xen trees
Specifically:
* Move check until after the MAINTAINERS file has been read
* Add get_xen_maintainers_file_version() for check
* Remove top_of_tree as not needed any more
* Fail with extended error message when used out of tree
Signed-off-by: Lars Kurth <lars.kurth@citrix.com>
Acked-by: Julien Grall <julien.grall@arm.com>
Acked-by: Ian Jackson <ian.jackson@eu.citrix.com>
Lars Kurth [Fri, 30 Aug 2019 17:42:56 +0000 (18:42 +0100)]
MAINTAINERS: Add V section entry to allow identification of Xen file
This change provides sufficient information to allow get_maintainer.pl /
add_maintainers.pl scripts to be run on xen sister repositories such as
mini-os.git, osstest.git, etc
A suggested template for sister repositories of Xen is
========================================================
This file follows the same conventions as outlined in
xen.git:MAINTAINERS. Please refer to the file in xen.git
for more information.
THE REST
M: MAINTAINER1 <maintainer1@email.com>
M: MAINTAINER2 <maintainer2@email.com>
L: xen-devel@lists.xenproject.org
S: Supported
F: *
F: */
V: xen-maintainers-1
========================================================
Signed-off-by: Lars Kurth <lars.kurth@citrix.com>
Acked-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Julien Grall <julien.grall@arm.com>
Lars Kurth [Fri, 30 Aug 2019 17:18:16 +0000 (18:18 +0100)]
scripts/add_maintainers.pl: Remove hardcoding
Instead of using a hardcoded location, inherit the
location from $0
Signed-off-by: Lars Kurth <lars.kurth@citrix.com>
Acked-by: Julien Grall <julien.grall@arm.com>
Juergen Gross [Thu, 12 Sep 2019 13:13:47 +0000 (15:13 +0200)]
debugtrace: add entry when entry count is wrapping
The debugtrace entry count is a 32 bit variable, so it can wrap when
lots of trace entries are being produced. Making it wider would result
in a waste of buffer space as the printed count value would consume
more bytes when not wrapping.
So instead of letting the count value grow to huge values let it wrap
and add a wrap counter printed in this situation. This will keep the
needed buffer space at today's value without losing a way to
sort all entries in case multiple trace buffers are involved.
Note that the wrap message will be printed before the first trace
entry in case output is switched to console early. This is on purpose
in order to enable a future support of debugtrace to console without
any allocated buffer.
Signed-off-by: Juergen Gross <jgross@suse.com>
Acked-by: Jan Beulich <jbeulich@suse.com>
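A small sketch of the wrap handling described in the commit above: the 32 bit entry count is allowed to wrap, and a separate wrap counter keeps entries sortable. The struct and function names are hypothetical, and the combined 64 bit key is only an illustration of how an analysis tool might order entries; the commit itself prints the wrap count as a trace entry.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical bookkeeping for one trace stream. */
struct sketch_trace {
    uint32_t count;   /* per-entry sequence number, allowed to wrap */
    uint32_t wraps;   /* how many times count has wrapped */
};

/* Advance the sequence number; returns true when the counter wrapped. */
bool sketch_trace_advance(struct sketch_trace *t)
{
    assert(t);

    if ( ++t->count != 0 )
        return false;

    t->wraps++;
    return true;
}

/* Key that still orders entries correctly after a wrap. */
uint64_t sketch_trace_key(const struct sketch_trace *t)
{
    return ((uint64_t)t->wraps << 32) | t->count;
}
```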
Juergen Gross [Thu, 12 Sep 2019 13:12:21 +0000 (15:12 +0200)]
debugtrace: add per-cpu buffer option
debugtrace is normally writing trace entries into a single trace
buffer. There are cases where this is not optimal, e.g. when hunting
a bug which requires writing lots of trace entries and one cpu is
stuck. This will result in other cpus filling the trace buffer and
finally overwriting the interesting trace entries of the hanging cpu.
In order to be able to debug such situations add the capability to use
per-cpu trace buffers. This can be selected by specifying the
debugtrace boot parameter with the modifier "cpu:", like:
debugtrace=cpu:16
At the same time switch the parsing function to accept size modifiers
(e.g. 4M or 1G).
Printing out the trace entries is done for each buffer in order to
minimize the effort needed during printing. As each entry is prefixed
with its sequence number sorting the entries can easily be done when
analyzing them.
Signed-off-by: Juergen Gross <jgross@suse.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
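To illustrate the option syntax from the commit above ("cpu:" modifier plus size suffixes), here is a stand-alone parsing sketch. It is not Xen's parser: the function and struct names are hypothetical, and treating a bare number as KiB is an assumption of this sketch rather than something the commit message states.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical parse result. */
struct sketch_dt_opt {
    bool per_cpu;     /* "cpu:" modifier was given */
    uint64_t bytes;   /* requested buffer size in bytes */
};

/* Parse "[cpu:]<number>[K|M|G]"; returns 0 on success, -1 on bad input. */
int sketch_parse_debugtrace(const char *s, struct sketch_dt_opt *opt)
{
    char *end;
    unsigned long long val;
    unsigned int shift;

    assert(s && opt);

    opt->per_cpu = false;
    if ( strncmp(s, "cpu:", 4) == 0 )
    {
        opt->per_cpu = true;
        s += 4;
    }

    /* Require at least one digit so that "cpu:" alone is rejected. */
    if ( *s < '0' || *s > '9' )
        return -1;

    val = strtoull(s, &end, 0);

    switch ( *end )
    {
    case 'G': shift = 30; end++; break;
    case 'M': shift = 20; end++; break;
    case 'K': shift = 10; end++; break;
    case '\0': shift = 10; break;   /* assumption: bare numbers are KiB */
    default: return -1;
    }

    if ( *end != '\0' )
        return -1;

    opt->bytes = (uint64_t)val << shift;
    return 0;
}
```

Overflow of very large values is not checked in this sketch.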
Roger Pau Monne [Tue, 10 Sep 2019 15:25:38 +0000 (17:25 +0200)]
sysctl: report shadow paging capability
Report whether shadow paging is supported by the hypervisor, since it
can be disabled at build time.
Reuse and tweak LIBXL_HAVE_PHYSINFO_CAP_HAP as it hasn't appeared in a
released version of Xen yet.
Requested-by: Andrew Cooper <andrew.cooper3@citrix.com>
Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>
Acked-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Ian Jackson <ian.jackson@eu.citrix.com>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
Andrew Cooper [Thu, 12 Sep 2019 09:57:37 +0000 (10:57 +0100)]
x86/msr: Fix 'plaform' typo
Reported-by: Jan Beulich <jbeulich@suse.com>
Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Acked-by: Jan Beulich <jbeulich@suse.com>
Roger Pau Monné [Wed, 11 Sep 2019 12:55:20 +0000 (14:55 +0200)]
sysctl/libxl: choose a sane default for HAP
Current libxl code will always enable Hardware Assisted Paging (HAP),
expecting that the hypervisor will fall back to shadow if HAP is not
available. With the changes to DOMCTL_createdomain that's not the case
any longer, and the hypervisor will raise an error if HAP is not
available instead of silently falling back to shadow.
In order to keep the previous functionality report whether HAP is
available or not in XEN_SYSCTL_physinfo, so that the toolstack can
select a sane default if there's no explicit user selection of whether
HAP should be used.
Note that on ARM hardware HAP capability is always reported since it's
a required feature in order to run Xen.
Fixes: d0c0ba7d3de ('x86/hvm/domain: remove the 'hap_enabled' flag')
Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>
Reviewed-by: Paul Durrant <paul.durrant@citrix.com>
Acked-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Julien Grall <julien.grall@arm.com>
Acked-by: Ian Jackson <ian.jackson@eu.citrix.com>
Acked-by: Christian Lindig <christian.lindig@citrix.com>
Jan Beulich [Wed, 11 Sep 2019 12:54:34 +0000 (14:54 +0200)]
x86/shadow: fold p2m page accounting into sh_min_allocation()
This is to make the function live up to the promise its name makes. And
it simplifies all callers.
Suggested-by: Andrew Cooper <andrew.cooper3@citrix.com>
Signed-off-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Roger Pau Monné <roger.pau@citrix.com>
Acked-by: Andrew Cooper <andrew.cooper3@citrix.com>
Acked-by: Tim Deegan <tim@xen.org>
Ian Jackson [Tue, 10 Sep 2019 15:16:51 +0000 (16:16 +0100)]
tools/ocaml: abi check: #include on x86 only. Spotted by Gitlab CI
Reported-by: Andrew Cooper <Andrew.Cooper3@citrix.com>
Signed-off-by: Ian Jackson <ian.jackson@eu.citrix.com>
Acked-by: Andrew Cooper <andrew.cooper3@citrix.com>
Jan Beulich [Tue, 10 Sep 2019 14:35:09 +0000 (16:35 +0200)]
x86emul: fix test harness and fuzzer build dependencies
Commit fd35f32b4b ("tools/x86emul: Use struct cpuid_policy in the
userspace test harnesses") didn't account for the dependencies of
cpuid-autogen.h to potentially change between incremental builds. In
particular the harness has a "run" goal which is supposed to be usable
independently of the rest of the tools sub-tree building, and both the
harness and the fuzzer code are also supposed to be buildable
independently. Therefore a re-build of the generated header needs to be
triggered first, which is achieved by introducing a new top-level target
pattern (for just the "run" part for now).
Further cpuid.o did not have any dependencies added for it.
Finally, while at it, add a "run" target to the cpu-policy test harness.
Signed-off-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Andrew Cooper <andrew.cooper3@citrix.com>
Acked-by: Ian Jackson <ian.jackson@eu.citrix.com>
Jan Beulich [Tue, 10 Sep 2019 14:34:21 +0000 (16:34 +0200)]
x86/IRQ: make 'i' debug output more tabular again
Since the affinity values are no longer of uniform width, move them
further to the right such that as much of the output as possible comes
out aligned with one another.
Signed-off-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Andrew Cooper <andrew.cooper3@citrix.com>
Roger Pau Monné [Tue, 10 Sep 2019 14:32:47 +0000 (16:32 +0200)]
ioreq: fix hvm_all_ioreq_servers_add_vcpu fail path cleanup
The loop in FOR_EACH_IOREQ_SERVER is backwards hence the cleanup on
failure needs to be done forwards.
Fixes: 97a5a3e30161 ('x86/hvm/ioreq: maintain an array of ioreq servers rather than a list')
Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>
Reviewed-by: Paul Durrant <paul.durrant@citrix.com>
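The fix in the commit above relies on a simple observation: if the main loop walks the server array from the highest id downwards, the entries already handled at the point of failure are the ones with a higher id, so the undo path has to walk upwards. A sketch under assumed names (none of the sketch_* identifiers exist in Xen, and the fail_add field is only a hook to provoke the error path):

```c
#include <assert.h>
#include <stdbool.h>

#define SKETCH_MAX_SERVERS 4

/* Hypothetical reduced ioreq server. */
struct sketch_server {
    bool present;    /* slot is in use */
    bool fail_add;   /* hook to force a failure in this sketch */
    int vcpus;       /* number of vcpus added so far */
};

static int sketch_add_vcpu(struct sketch_server *s)
{
    if ( s->fail_add )
        return -1;

    s->vcpus++;
    return 0;
}

/* Add a vcpu to every present server, undoing partial work on failure. */
int sketch_all_servers_add_vcpu(struct sketch_server *srv)
{
    int id;

    assert(srv);

    /* Backwards walk, mirroring the iteration order named in the commit. */
    for ( id = SKETCH_MAX_SERVERS - 1; id >= 0; id-- )
    {
        if ( !srv[id].present )
            continue;

        if ( sketch_add_vcpu(&srv[id]) )
            goto fail;
    }

    return 0;

 fail:
    /* Servers with a higher id were handled first: clean up forwards. */
    while ( ++id < SKETCH_MAX_SERVERS )
        if ( srv[id].present )
            srv[id].vcpus--;

    return -1;
}
```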
Andrew Cooper [Tue, 10 Sep 2019 14:04:55 +0000 (15:04 +0100)]
tools/ocaml: Fix build error with CentOS 7
gcc (GCC) 4.8.5 20150623 (Red Hat 4.8.5-28) complains:
xenctrl_stubs.c: In function 'stub_xc_domain_create':
xenctrl_stubs.c:216:28: error: 'val' may be used uninitialized
in this function [-Werror=maybe-uninitialized]
cfg.arch.emulation_flags = ocaml_list_to_c_bitmap
^
xenctrl_stubs.c:198:12: error: 'val' may be used uninitialized
in this function [-Werror=maybe-uninitialized]
cfg.flags = ocaml_list_to_c_bitmap
^
cc1: all warnings being treated as errors
GCC doesn't point at the correct piece of code, but the diagnostic text is
correct, and can occur when the list is empty. Initialise val to 0.
Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Acked-by: Ian Jackson <ian.jackson@eu.citrix.com>
Acked-by: Christian Lindig <christian.lindig@citrix.com>
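Why the empty-list case matters in the commit above can be shown with a plain-C analogue. The real ocaml_list_to_c_bitmap operates on OCaml values; the linked list and all names below are hypothetical stand-ins used only to show the accumulator pattern.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical plain-C stand-in for an OCaml list of flag indices. */
struct sketch_cell {
    unsigned int bit;
    const struct sketch_cell *next;
};

/* OR together one bit per list element. */
unsigned int sketch_list_to_bitmap(const struct sketch_cell *list)
{
    /*
     * The initialiser is the point of the fix: with an empty list the
     * loop body never runs, and val would otherwise be returned unset.
     */
    unsigned int val = 0;

    for ( ; list != NULL; list = list->next )
    {
        assert(list->bit < 32);
        val |= 1u << list->bit;
    }

    return val;
}
```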
Andrew Cooper [Tue, 10 Sep 2019 11:17:30 +0000 (12:17 +0100)]
tools/ocaml: abi: Use formal conversion and check in more places
Now we have a caller for ocaml_list_to_c_bitmap.
Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Signed-off-by: Ian Jackson <ian.jackson@eu.citrix.com>
Acked-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Andrew Cooper <Andrew.Cooper3@citrix.com>
Acked-by: Christian Lindig <christian.lindig@citrix.com>
Ian Jackson [Tue, 10 Sep 2019 11:34:03 +0000 (12:34 +0100)]
tools/ocaml: tools/ocaml: Add missing CDF_* values
Signed-off-by: Ian Jackson <ian.jackson@eu.citrix.com>
Acked-by: Andrew Cooper <Andrew.Cooper3@citrix.com>
Acked-by: Jan Beulich <jbeulich@suse.com>
Ian Jackson [Tue, 10 Sep 2019 11:27:45 +0000 (12:27 +0100)]
tools/ocaml: abi-check: Check properly.
Fix a broken regexp which would mention `$/' when it ought to have
mentioned `$'. The result would be that it would match lines like
type some_ocaml_type = Thing | Other_Thing
but ignore everything but the type name, giving wrong answers.
Check that we check mentioned types. Otherwise if we fail to spot
some suitable thing in the ocaml, we would just omit checking this
type !
Signed-off-by: Ian Jackson <ian.jackson@eu.citrix.com>
Acked-by: Andrew Cooper <Andrew.Cooper3@citrix.com>
Acked-by: Christian Lindig <christian.lindig@citrix.com>
Andrew Cooper [Tue, 10 Sep 2019 11:14:51 +0000 (12:14 +0100)]
tools/ocaml: Reformat domain_create_flag
This will allow us to apply the abi checker soon.
No functional change.
Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Reviewed-by: Ian Jackson <ian.jackson@eu.citrix.com>
Signed-off-by: Ian Jackson <ian.jackson@eu.citrix.com>
Acked-by: Christian Lindig <christian.lindig@citrix.com>
Ian Jackson [Tue, 10 Sep 2019 11:25:26 +0000 (12:25 +0100)]
tools/ocaml: abi-check: Cope with multiple conversions of same type
Signed-off-by: Ian Jackson <ian.jackson@eu.citrix.com>
Acked-by: Andrew Cooper <Andrew.Cooper3@citrix.com>
Acked-by: Christian Lindig <christian.lindig@citrix.com>
Ian Jackson [Tue, 10 Sep 2019 11:34:38 +0000 (12:34 +0100)]
tools/ocaml: abi-check: Improve output and error messages
In the generated C, add some comments saying where we found the ocaml
type. This helps with debugging. (I considered emitting #line
directives but decided this would be more confusing than helpful.)
Improve two dies.
Use better-named filehandles (perl prints their names when it dies).
Signed-off-by: Ian Jackson <ian.jackson@eu.citrix.com>
Acked-by: Andrew Cooper <Andrew.Cooper3@citrix.com>
Acked-by: Christian Lindig <christian.lindig@citrix.com>
Andrew Cooper [Tue, 10 Sep 2019 11:18:45 +0000 (12:18 +0100)]
tools/ocaml: abi handling: Provide ocaml->C conversion/check
No users of this yet so no overall change.
Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Reviewed-by: Ian Jackson <ian.jackson@eu.citrix.com>
Signed-off-by: Ian Jackson <ian.jackson@eu.citrix.com>
Acked-by: Christian Lindig <christian.lindig@citrix.com>
Ian Jackson [Tue, 10 Sep 2019 11:12:44 +0000 (12:12 +0100)]
tools/ocaml: abi-check: Add comments
Provide interface documentation for this script.
Explain why we check .ml not .mli.
Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Reviewed-by: Ian Jackson <ian.jackson@eu.citrix.com>
Signed-off-by: Ian Jackson <ian.jackson@eu.citrix.com>
Acked-by: Christian Lindig <christian.lindig@citrix.com>
Andrew Cooper [Tue, 10 Sep 2019 10:41:33 +0000 (11:41 +0100)]
xen/domctl: Drop guest suffix from XEN_DOMCTL_CDF_hvm
The suffix is redundant, and dropping it helps to simplify the Ocaml/C
ABI checking.
Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Ian Jackson [Mon, 9 Sep 2019 17:12:06 +0000 (18:12 +0100)]
tools/ocaml: Introduce xenctrl ABI build-time checks
c/s f089fddd941 broke the Ocaml ABI by renumbering
XEN_SYSCTL_PHYSCAP_directio without adjusting the Ocaml
physinfo_cap_flag enumeration.
Add build machinery which will check the ABI correspondence.
This will result in a compile time failure whenever constants get
renumbered/added without a compatible adjustment to the Ocaml ABI.
Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Signed-off-by: Ian Jackson <ian.jackson@eu.citrix.com>
Acked-by: Christian Lindig <christian.lindig@citrix.com>
Acked-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Andrew Cooper <Andrew.Cooper3@citrix.com>
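The kind of build-time check introduced by the commit above can be approximated in plain C11 with static assertions. The actual machinery generates its checks from the OCaml sources; the constants below are hypothetical copies whose only grounded property is the ordering HVM, PV, DirectIO mentioned in the related CAP_PV fix.

```c
#include <assert.h>

/* Hypothetical C-side bit positions. */
#define SKETCH_PHYSCAP_hvm      0
#define SKETCH_PHYSCAP_pv       1
#define SKETCH_PHYSCAP_directio 2

/* Hypothetical mirror of the OCaml constructor order. */
enum sketch_ocaml_cap {
    SKETCH_CAP_HVM,
    SKETCH_CAP_PV,
    SKETCH_CAP_DirectIO,
};

/* Compilation fails as soon as one side is renumbered without the other. */
static_assert(SKETCH_CAP_HVM == SKETCH_PHYSCAP_hvm, "CAP_HVM out of sync");
static_assert(SKETCH_CAP_PV == SKETCH_PHYSCAP_pv, "CAP_PV out of sync");
static_assert(SKETCH_CAP_DirectIO == SKETCH_PHYSCAP_directio,
              "CAP_DirectIO out of sync");

/* Convert a constructor index into the corresponding capability bit. */
unsigned int sketch_cap_mask(enum sketch_ocaml_cap cap)
{
    return 1u << cap;
}
```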
Andrew Cooper [Mon, 9 Sep 2019 17:12:05 +0000 (18:12 +0100)]
tools/ocaml: Add missing CAP_PV
c/s f089fddd941 broke the Ocaml ABI by renumbering XEN_SYSCTL_PHYSCAP_directio
without adjusting the Ocaml physinfo_cap_flag enumeration. Fix this by
inserting CAP_PV between CAP_HVM and CAP_DirectIO.
Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Reviewed-by: Ian Jackson <ian.jackson@eu.citrix.com>
Signed-off-by: Ian Jackson <ian.jackson@eu.citrix.com>
Ian Jackson [Mon, 9 Sep 2019 17:12:04 +0000 (18:12 +0100)]
tools/ocaml: Add missing X86_EMU_VPCI
This was missing from x86_arch_emulation_flags.
Signed-off-by: Ian Jackson <ian.jackson@eu.citrix.com>
Acked-by: Andrew Cooper <Andrew.Cooper3@citrix.com>
Andrew Cooper [Mon, 9 Sep 2019 10:35:03 +0000 (11:35 +0100)]
x86/boot: Improve code generation from bootsym()
The code generation for bootsym() is atrocious, and unnecessarily complicated.
Given the appropriate physical address, all we need is to construct a virtual
address of the appropriate type.
add/remove: 0/0 grow/shrink: 0/9 up/down: 0/-4256 (-4256)
Function old new delta
kexec_reserve_area.constprop 165 159 -6
reset_videomode_after_s3 231 70 -161
identify_cpu 1341 1176 -165
parse_acpi_sleep 408 240 -168
early_init_intel 632 440 -192
__cpu_up 1983 1682 -301
do_platform_op 6469 5526 -943
compat_platform_op 6433 5482 -951
__start_xen 12939 11570 -1369
Total: Before=3341298, After=3337042, chg -0.13%
No functional change.
Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Acked-by: Jan Beulich <jbeulich@suse.com>
Andrew Cooper [Mon, 9 Sep 2019 15:53:28 +0000 (16:53 +0100)]
x86/cpuid: Fix build with CentOS 6 following c/s 7479151106
GCC of a CentOS 6 vintage complains:
cpuid.c: In function 'parse_xen_cpuid':
cpuid.c:32: error: 'mid' may be used uninitialized in this function
This can't occur in practice because the while() loop is guaranteed to be
entered, but initialise mid to work around the issue.
Spotted by Gitlab CI.
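For illustration only, the pattern looks roughly like the sketch below. The table and the function name are hypothetical and are not taken from cpuid.c; the point is that mid is read after the loop, so older compilers that cannot prove the loop body runs may warn unless it is given an initial value.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical sorted table of names, for illustration only. */
static const char *const ex_names[] = { "avx", "fpu", "sse", "xsave" };

/* Binary search; returns the index of s in ex_names[], or -1 if absent. */
int ex_find_name(const char *s)
{
    size_t lhs = 0, rhs = sizeof(ex_names) / sizeof(ex_names[0]);
    size_t mid = 0; /* Initialised only to placate older compilers. */
    int res = -1;

    while ( lhs < rhs )
    {
        mid = lhs + (rhs - lhs) / 2;
        res = strcmp(s, ex_names[mid]);

        if ( res == 0 )
            break;
        if ( res < 0 )
            rhs = mid;
        else
            lhs = mid + 1;
    }

    /* mid is only meaningful here when a match was found. */
    return res == 0 ? (int)mid : -1;
}
```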
Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Acked-by: Jan Beulich <jbeulich@suse.com>
Andrew Cooper [Fri, 6 Sep 2019 15:59:02 +0000 (16:59 +0100)]
x86/cpuid: Fix handling of the CPUID.7[0].eax levelling MSR
7a0 is an integer field, not a mask - taking the logical and of the hardware
and policy values results in nonsense. Instead, take the policy value
directly.
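A small sketch of why masking is wrong for an integer field. The function names are hypothetical and the values are picked only to show the effect: with a hardware value of 2 and a policy value of 1, the AND yields 0, which is neither of the two.

```c
#include <stdint.h>

/* Treating the field as a mask: wrong for an integer such as a max subleaf. */
uint32_t ex_level_by_mask(uint32_t hw, uint32_t policy)
{
    return hw & policy;
}

/* Taking the policy value directly, as the commit message describes. */
uint32_t ex_level_by_policy(uint32_t hw, uint32_t policy)
{
    (void)hw; /* The hardware value does not participate. */
    return policy;
}
```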
Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Reviewed-by: Roger Pau Monné <roger.pau@citrix.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
Juergen Gross [Mon, 9 Sep 2019 12:37:25 +0000 (14:37 +0200)]
xen: refactor debugtrace data
As a preparation for per-cpu buffers do a little refactoring of the
debugtrace data: put the needed buffer admin data into the buffer as
it will be needed for each buffer. In order not to limit buffer size
switch the related fields from unsigned int to unsigned long, as on
huge machines with RAM in the TB range it might be interesting to
support buffers >4GB.
While at it switch debugtrace_send_to_console and debugtrace_used to
bool and delete an empty line.
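The layout described above, with the admin data living inside the buffer itself, could look roughly like this sketch. The struct and function names are hypothetical and do not reflect the actual debugtrace code; the size and producer index are unsigned long so a buffer is not limited to 4GB.

```c
#include <stdlib.h>

/* Hypothetical trace buffer: admin fields precede the data they describe. */
struct ex_trace_buf {
    unsigned long bytes; /* Size of buf[] in bytes. */
    unsigned long prd;   /* Producer index: next byte to be written. */
    char buf[];          /* Trace data, used as a ring. */
};

/* Allocate one buffer with room for 'bytes' bytes of trace data. */
struct ex_trace_buf *ex_trace_alloc(unsigned long bytes)
{
    struct ex_trace_buf *t = calloc(1, sizeof(*t) + bytes);

    if ( t )
        t->bytes = bytes;

    return t;
}

/* Append a string, wrapping around at the end of the buffer. */
void ex_trace_add(struct ex_trace_buf *t, const char *s)
{
    if ( !t || !t->bytes )
        return;

    for ( ; *s; s++ )
    {
        t->buf[t->prd++] = *s;
        if ( t->prd == t->bytes )
            t->prd = 0;
    }
}
```

Because each buffer carries its own size and producer index, having one buffer per CPU later only needs one such allocation per CPU.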
Signed-off-by: Juergen Gross <jgross@suse.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
Juergen Gross [Mon, 9 Sep 2019 12:36:10 +0000 (14:36 +0200)]
xen: move debugtrace coding to common/debugtrace.c
Instead of living in drivers/char/console.c move the debugtrace
related coding to a new file common/debugtrace.c
No functional change, code movement only.
Signed-off-by: Juergen Gross <jgross@suse.com>
Acked-by: Jan Beulich <jbeulich@suse.com>
Juergen Gross [Mon, 9 Sep 2019 12:34:37 +0000 (14:34 +0200)]
xen: fix debugtrace clearing
After dumping the debugtrace buffer it is cleared. This results in some
entries not being printed in case the buffer is dumped again before
having wrapped.
While at it remove the trailing zero byte in the buffer as it is no
longer needed. Commit b5e6e1ee8da59f introduced passing the number of
chars to be printed in the related interfaces, so the trailing 0 byte
is no longer required.
Signed-off-by: Juergen Gross <jgross@suse.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
Roger Pau Monne [Fri, 6 Sep 2019 14:30:20 +0000 (16:30 +0200)]
sysctl: report existing physcaps on Arm
Current physcaps in XEN_SYSCTL_physinfo are only used by x86, albeit
the capabilities themselves are not x86 specific.
This patch adds support for also reporting the current capabilities on
Arm hardware. Note that on Arm PHYSCAP_hvm is always reported, and
setting PHYSCAP_directio has been moved to common code since the same
logic to set it is used by x86 and Arm.
Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>
Reviewed-by: Paul Durrant <paul.durrant@citrix.com>
Acked-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Julien Grall <julien.grall@arm.com>
Julien Grall [Mon, 22 Jul 2019 13:24:43 +0000 (14:24 +0100)]
xen/arm32: head: Don't setup the fixmap on secondary CPUs
setup_fixmap() will setup the fixmap in the boot page tables in order to
use earlyprintk and also update the register r11 holding the address to
the UART.
However, secondary CPUs are not using earlyprintk between turning the
MMU on and switching to the runtime page table. So setting up the
fixmap in the boot pages table is pointless.
This means most of setup_fixmap() is not necessary for the secondary
CPUs. The update of UART address is now moved out of setup_fixmap() and
duplicated in the CPU boot and secondary CPUs boot. Additionally, the
call to setup_fixmap() is removed from secondary CPUs boot.
Lastly, take the opportunity to replace load from literal pool with the
new macro mov_w.
Signed-off-by: Julien Grall <julien.grall@arm.com>
Reviewed-by: Stefano Stabellini <sstabellini@kernel.org>
Julien Grall [Sat, 20 Apr 2019 17:18:01 +0000 (18:18 +0100)]
xen/arm32: head: Move assembly switch to the runtime PT in secondary CPUs path
The assembly switch to the runtime PT is only necessary for the
secondary CPUs. So move the code in the secondary CPUs path.
This is definitely not compliant with the Arm Arm, as we are
switching between two different sets of page-tables without turning off
the MMU. However, turning off the MMU is impossible here as the ID map may
clash with other mappings in the runtime page-tables. This will require more
rework to avoid the problem. So for now add a TODO in the code.
Finally, the code currently assumes that r5 will be properly set to 0
beforehand. This is done by create_page_tables() which is called quite
early in the boot process. There is a risk this may be overlooked in the
future, thereby breaking secondary CPUs boot. Instead, set r5 to 0
just before using it.
Signed-off-by: Julien Grall <julien.grall@arm.com>
Reviewed-by: Stefano Stabellini <sstabellini@kernel.org>
Julien Grall [Sat, 20 Apr 2019 12:33:31 +0000 (13:33 +0100)]
xen/arm32: head: Document enable_mmu()
Document the behavior and the main registers usage within enable_mmu().
Signed-off-by: Julien Grall <julien.grall@arm.com>
Reviewed-by: Stefano Stabellini <sstabellini@kernel.org>
Julien Grall [Sun, 21 Jul 2019 18:35:19 +0000 (19:35 +0100)]
xen/arm32: head: Document create_pages_tables()
Document the behavior and the main registers usage within the function.
Note that r6 is now only used within the function, so it does not need
to be part of the common register.
Signed-off-by: Julien Grall <julien.grall@arm.com>
Acked-by: Stefano Stabellini <sstabellini@kernel.org>
Julien Grall [Wed, 26 Jun 2019 20:23:50 +0000 (21:23 +0100)]
xen/arm32: head: Rework and document zero_bss()
On secondary CPUs, zero_bss() will be a NOP because BSS only needs to be
zeroed once at boot. So the call in the secondary CPUs path can be
removed.
Lastly, document the behavior and the main registers usage within the
function.
Signed-off-by: Julien Grall <julien.grall@arm.com>
Reviewed-by: Stefano Stabellini <sstabellini@kernel.org>
Julien Grall [Tue, 16 Apr 2019 13:53:19 +0000 (14:53 +0100)]
xen/arm32: head: Rework and document check_cpu_mode()
A branch in the success case can be avoided by inverting the branch
condition. At the same time, remove a pointless comment as Xen can only
run at Hypervisor Mode.
Lastly, document the behavior and the main registers usage within the
function.
Signed-off-by: Julien Grall <julien.grall@arm.com>
Reviewed-by: Stefano Stabellini <sstabellini@kernel.org>
Julien Grall [Wed, 26 Jun 2019 12:46:56 +0000 (13:46 +0100)]
xen/arm32: head: Introduce distinct paths for the boot CPU and secondary CPUs
The boot code is currently quite difficult to go through because of the
lack of documentation and a number of indirections to avoid executing
some paths in either the boot CPU or secondary CPUs.
In an attempt to make the boot code easier to follow, each part of the
boot is now in a separate function. Furthermore, the paths for the boot
CPU and secondary CPUs are now distinct and for now will call each
function.
Follow-ups will remove unnecessary calls and do further improvement
(such as adding documentation and reshuffling).
Note that the switch from using the ID mapping to the runtime mapping
is duplicated for each path. This is because in the future we will need
to stay longer in the ID mapping for the boot CPU.
Lastly, it is now required to save lr in cpu_init() because the
function will call other functions and therefore clobber lr.
Signed-off-by: Julien Grall <julien.grall@arm.com>
Reviewed-by: Stefano Stabellini <sstabellini@kernel.org>
Julien Grall [Mon, 15 Apr 2019 22:11:42 +0000 (23:11 +0100)]
xen/arm32: head: Introduce print_reg
At the moment, the user should save r14/lr if it cares about it.
Follow-up patches will introduce more uses of putn in places where lr
should be preserved.
Furthermore, any user of putn should also move the value to register r0
if it was stored in a different register.
For convenience, a new macro is introduced to print a given register.
The macro will take care of moving the value to r0 and also
preserving lr.
Lastly the new macro is used to replace all the call sites of putn. This
will simplify rework/review later on.
Signed-off-by: Julien Grall <julien.grall@arm.com>
Reviewed-by: Stefano Stabellini <sstabellini@kernel.org>
Julien Grall [Mon, 15 Apr 2019 21:16:25 +0000 (22:16 +0100)]
xen/arm32: head: Rework UART initialization on boot CPU
Anything executed after the label common_start can be executed on all
CPUs. However most of the instructions executed between the label
common_start and init_uart are not executed on the boot CPU.
The only instructions executed are to lookup the CPUID so it can be
printed on the console (if earlyprintk is enabled). Printing the CPUID
is not entirely useful to have for the boot CPU and requires a
conditional branch to bypass unused instructions.
Furthermore, the function init_uart is only called for boot CPU
requiring another conditional branch. This makes the code a bit tricky
to follow.
The UART initialization is now moved before the label common_start. This
now requires a slightly altered print for the boot CPU and setting
the early UART base address in each of the two paths (boot CPU and
secondary CPUs).
This has the nice effect of removing a couple of conditional branches in
the code.
After this rework, the CPUID is only used at the very beginning of the
secondary CPUs boot path. So there is no need to "reserve" x24 for the
CPUID.
Lastly, take the opportunity to replace load from literal pool with the
new macro mov_w.
Signed-off-by: Julien Grall <julien.grall@arm.com>
Reviewed-by: Stefano Stabellini <sstabellini@kernel.org>
Julien Grall [Mon, 15 Apr 2019 14:57:38 +0000 (15:57 +0100)]
xen/arm32: head: Don't clobber r14/lr in the macro PRINT
The current implementation of the macro PRINT will clobber r14/lr. This
means the user should save r14 if it cares about it.
Follow-up patches will introduce more uses of PRINT in places where lr
should be preserved. Rather than requiring all users to preserve lr,
the macro PRINT is modified to save and restore it.
While the comment states r3 will be clobbered, this is not the case. So
PRINT will use r3 to preserve lr.
Lastly, take the opportunity to move the comment on top of PRINT and use
PRINT in init_uart. Both changes will be helpful in a follow-up patch.
Signed-off-by: Julien Grall <julien.grall@arm.com>
Reviewed-by: Stefano Stabellini <sstabellini@kernel.org>
Julien Grall [Wed, 26 Jun 2019 11:29:54 +0000 (12:29 +0100)]
xen/arm32: head: Mark the end of subroutines with ENDPROC
putn() and puts() are two subroutines. Add ENDPROC for the benefit of
static analysis tools and the reader.
Signed-off-by: Julien Grall <julien.grall@arm.com>
Acked-by: Stefano Stabellini <sstabellini@kernel.org>
Julien Grall [Mon, 15 Apr 2019 20:58:51 +0000 (21:58 +0100)]
xen/arm32: head: Add a macro to move an immediate constant into a 32-bit register
The current boot code is using the pattern ldr rX, =... to move an
immediate constant into a 32-bit register.
This pattern implies to load the immediate constant from a literal pool,
meaning a memory access will be performed.
The memory access can be avoided by using movw/movt instructions.
A new macro is introduced to move an immediate constant into a 32-bit
register without a memory load. Follow-up patches will make use of it.
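To illustrate how a 32-bit constant can be built from two 16-bit immediates, here is a small C model of the movw/movt pair. The function names are hypothetical and this is not the assembly macro from the patch: ex_movw() models writing the low half while zeroing the top half, and ex_movt() models writing the top half while keeping the low half.

```c
#include <stdint.h>

/* Model of movw: load a 16-bit immediate, zeroing the upper 16 bits. */
uint32_t ex_movw(uint16_t imm16)
{
    return imm16;
}

/* Model of movt: replace the upper 16 bits, keeping the lower 16 bits. */
uint32_t ex_movt(uint32_t reg, uint16_t imm16)
{
    return (reg & 0xffffu) | ((uint32_t)imm16 << 16);
}

/* Build a full 32-bit constant from the two halves, with no memory load. */
uint32_t ex_mov_w(uint32_t imm)
{
    return ex_movt(ex_movw((uint16_t)(imm & 0xffffu)), (uint16_t)(imm >> 16));
}
```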
Signed-off-by: Julien Grall <julien.grall@arm.com>
Acked-by: Stefano Stabellini <sstabellini@kernel.org>
Julien Grall [Wed, 31 Jul 2019 19:26:19 +0000 (20:26 +0100)]
xen/arm64: head: Fix typo in the documentation on top of init_uart()
Signed-off-by: Julien Grall <julien.grall@arm.com>
Acked-by: Stefano Stabellini <sstabellini@kernel.org>
Julien Grall [Mon, 17 Jun 2019 13:51:21 +0000 (14:51 +0100)]
xen/arm64: head: Introduce a macro to get a PC-relative address of a symbol
Arm64 provides instructions to load a PC-relative address, but with some
limitations:
- adr is able to cope with +/-1MB
- adrp is able to cope with +/-4GB but relative to a 4KB page
  address
Because of that, the code requires 2 instructions to load any Xen
symbol. To make the code more obvious, a new macro adr_l is
introduced.
The new macro is used to replace a couple of open-coded use in
efi_xen_start.
The macro is copied from Linux 5.2-rc4.
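The two-instruction sequence can be modelled in C as below. This is a sketch under the assumption that the macro pairs adrp with an add of the low 12 bits of the symbol address; the function names are hypothetical. The first step yields the 4KB page of the symbol relative to the page of the PC, and the second step adds the offset within that page.

```c
#include <stdint.h>

#define EX_PAGE_OFFSET_MASK 0xfffULL /* Low 12 bits: offset in a 4KB page. */

/* Model of adrp: PC page plus the page delta encoded in the instruction. */
uint64_t ex_adrp(uint64_t pc, uint64_t sym)
{
    uint64_t pc_page = pc & ~EX_PAGE_OFFSET_MASK;
    uint64_t page_delta = (sym & ~EX_PAGE_OFFSET_MASK) - pc_page;

    return pc_page + page_delta;
}

/* Model of the pair: page address from adrp, then add the low 12 bits. */
uint64_t ex_adr_l(uint64_t pc, uint64_t sym)
{
    return ex_adrp(pc, sym) + (sym & EX_PAGE_OFFSET_MASK);
}
```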
Signed-off-by: Julien Grall <julien.grall@arm.com>
Acked-by: Stefano Stabellini <sstabellini@kernel.org>
Julien Grall [Sat, 13 Apr 2019 21:55:18 +0000 (22:55 +0100)]
xen/arm64: head: Setup TTBR_EL2 in enable_mmu() and add missing isb
At the moment, TTBR_EL2 is setup in create_page_tables(). This is fine
as it is called by every CPU.
However, such an assumption may not hold in the future. To make changes
easier, TTBR_EL2 is now set up in enable_mmu().
Take the opportunity to add the missing isb() to ensure the TTBR_EL2 is
seen before the MMU is turned on.
Signed-off-by: Julien Grall <julien.grall@arm.com>
Reviewed-by: Stefano Stabellini <sstabellini@kernel.org>
Julien Grall [Mon, 15 Apr 2019 11:24:30 +0000 (12:24 +0100)]
xen/arm64: head: Rework and document launch()
Boot CPU and secondary CPUs will use different entry points to C code. At
the moment, the decision on which entry to use is taken within launch().
In order to avoid a branch for the decision and make the code clearer,
launch() is reworked to take in parameters the entry point and its
arguments.
Lastly, document the behavior and the main registers usage within the
function.
Signed-off-by: Julien Grall <julien.grall@arm.com>
Reviewed-by: Stefano Stabellini <sstabellini@kernel.org>
Julien Grall [Tue, 6 Aug 2019 17:14:08 +0000 (18:14 +0100)]
xen/arm: lpae: Allow more LPAE helpers to be used in assembly
A follow-up patch will require to use *_table_offset() and *_MASK helpers
from assembly. This can be achieved by using _AT() macro to remove the type
when called from assembly.
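A minimal sketch of the technique, assuming the usual convention that __ASSEMBLY__ is defined when a header is preprocessed for an assembly file. The EX_* constants and the helper are hypothetical and only show how a mask written with _AT() stays usable from both C and assembly.

```c
#ifdef __ASSEMBLY__
/* The assembler does not understand C casts, so drop the type. */
#define _AT(T, X) X
#else
/* In C, apply the type so the constant has the intended width. */
#define _AT(T, X) ((T)(X))
#endif

/* Hypothetical constants written so that both C and assembly can use them. */
#define EX_SHIFT 12
#define EX_SIZE  (_AT(unsigned long long, 1) << EX_SHIFT)
#define EX_MASK  (~(EX_SIZE - 1))

#ifndef __ASSEMBLY__
/* C-only helper: round an address down to the start of its page. */
unsigned long long ex_page_base(unsigned long long addr)
{
    return addr & EX_MASK;
}
#endif
```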
Signed-off-by: Julien Grall <julien.grall@arm.com>
Acked-by: Stefano Stabellini <sstabellini@kernel.org>
Andrew Cooper [Mon, 26 Nov 2018 17:06:23 +0000 (17:06 +0000)]
x86/cpuid: Extend the cpuid= option to support all named features
For gen-cpuid.py, fix a comment describing self.names, and generate the
reverse mapping in self.values. Write out INIT_FEATURE_NAMES which maps a
string name to a bit position.
For parse_cpuid(), use cmdline_strcmp() and perform a binary search over
INIT_FEATURE_NAMES. A tweak to cmdline_strcmp() is needed to break at equals
signs as well.
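The comparison described above can be sketched as follows. This is a hypothetical helper, not the real cmdline_strcmp(): it compares a command line fragment against a name and treats the end of the name as a match when the fragment continues with a delimiter. Treating a comma as a delimiter is an assumption made for the example; the equals sign is the case the commit message mentions.

```c
/*
 * Compare a command line fragment against a name. Returns 0 on a match,
 * otherwise the difference of the first mismatching characters.
 */
int ex_frag_strcmp(const char *frag, const char *name)
{
    for ( ; ; frag++, name++ )
    {
        unsigned char f = (unsigned char)*frag, n = (unsigned char)*name;
        int res = f - n;

        if ( res || n == '\0' )
        {
            /* End of name lining up with a delimiter counts as a match. */
            if ( n == '\0' && (f == ',' || f == '=') )
                res = 0;

            return res;
        }
    }
}
```

With such a helper, a fragment like "avx=0" matches the name "avx", while "avx2=0" does not, so the result can drive a binary search over a sorted name table.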
Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
Bandan Das [Fri, 6 Sep 2019 15:07:55 +0000 (17:07 +0200)]
x86/apic: do not initialize LDR and DFR for bigsmp
Legacy apic init uses bigsmp for smp systems with 8 and more CPUs. The
bigsmp APIC implementation uses physical destination mode, but it
nevertheless initializes LDR and DFR. The LDR even ends up incorrectly with
multiple bits being set.
This does not cause a functional problem because LDR and DFR are ignored
when physical destination mode is active, but it triggered a problem on a
32-bit KVM guest which jumps into a kdump kernel.
The multiple bits set unearthed a bug in the KVM APIC implementation. The
code which creates the logical destination map for VCPUs ignores the
disabled state of the APIC and ends up overwriting an existing valid entry
and as a result, APIC calibration hangs in the guest during kdump
initialization.
Remove the bogus LDR/DFR initialization.
This is not intended to work around the KVM APIC bug. The LDR/DFR
initialization is wrong on its own.
Suggested-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Bandan Das <bsd@redhat.com>
[Linux commit bae3a8d3308ee69a7dbdf145911b18dfda8ade0d]
Drop init_apic_ldr_x2apic_phys() at the same time.
Signed-off-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Andrew Cooper <andrew.cooper3@citrix.com>
Bandan Das [Fri, 6 Sep 2019 15:07:14 +0000 (17:07 +0200)]
x86/apic: include the LDR when clearing out APIC registers
Although APIC initialization will typically clear out the LDR before
setting it, the APIC cleanup code should reset the LDR.
This was discovered with a 32-bit KVM guest jumping into a kdump
kernel. The stale bits in the LDR triggered a bug in the KVM APIC
implementation which caused the destination mapping for VCPUs to be
corrupted.
Note that this isn't intended to paper over the KVM APIC bug. The kernel
has to clear the LDR when resetting the APIC registers except when X2APIC
is enabled.
Signed-off-by: Bandan Das <bsd@redhat.com>
[Linux commit 558682b5291937a70748d36fd9ba757fb25b99ae]
Signed-off-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Andrew Cooper <andrew.cooper3@citrix.com>
Jan Beulich [Fri, 6 Sep 2019 15:06:19 +0000 (17:06 +0200)]
x86: drop CONFIG_X86_MCE_THERMAL
There's no point having this if it's not exposed through Kconfig.
Take the liberty and also drop an unnecessary "return" in context.
Signed-off-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Andrew Cooper <andrew.cooper3@citrix.com>
Zhang Rui [Fri, 6 Sep 2019 15:05:39 +0000 (17:05 +0200)]
x86/mwait-idle: add support for Jacobsville
Jacobsville uses the same C-states as Denverton.
Signed-off-by: Zhang Rui <rui.zhang@intel.com>
[Linux commit 04b1d5d098491244f506c4265cc95b87210eef2f]
Signed-off-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Andrew Cooper <andrew.cooper3@citrix.com>
Roger Pau Monné [Fri, 6 Sep 2019 15:04:39 +0000 (17:04 +0200)]
x86/xstate: make use_xsave non-init
LLVM code generation can attempt to load from a variable in the next
condition of an expression under certain circumstances, thus
attempting to load use_xsave regardless of the value of the bsp
variable, which leads to a page fault when the init section has
already been unmapped.
Fix this by making use_xsave non-init, thus preventing the page fault;
use __read_mostly instead. The LLVM bug with the discussion about this
issue can be found at:
https://bugs.llvm.org/show_bug.cgi?id=39707
Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>
Acked-by: Jan Beulich <jbeulich@suse.com>
Andrew Cooper [Fri, 6 Sep 2019 12:33:19 +0000 (13:33 +0100)]
Revert "x86/shim: Refresh pvshim_defconfig"
This reverts commit 32b1d62887d01f85f0c1d2e0103f69f74e1f6fa3 and its fixup
060f4eee0fb408b316548775ab921e16b7acd0e0, which are still causing build and
test problems.
Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Andrew Cooper [Thu, 27 Dec 2018 15:14:01 +0000 (15:14 +0000)]
x86/AMD: Fix handling of x87 exception pointers on Fam17h hardware
AMD Pre-Fam17h CPUs "optimise" {F,}X{SAVE,RSTOR} by not saving/restoring
FOP/FIP/FDP if an x87 exception isn't pending. This causes an information
leak, CVE-2006-1056, which is worked around by several OSes, including Xen. AMD
Fam17h CPUs no longer have this leak, and advertise so in a CPUID bit.
Introduce the RSTR_FP_ERR_PTRS feature, as specified by AMD, and expose to all
guests by default. While adjusting libxl's cpuid table, add CLZERO which
looks to have been omitted previously.
Also introduce an X86_BUG bit to trigger the (F)XRSTOR workaround, and set it
on AMD hardware where RSTR_FP_ERR_PTRS is not advertised. Optimise the
conditions for the workaround paths.
Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
Andrew Cooper [Thu, 27 Dec 2018 15:13:55 +0000 (15:13 +0000)]
x86/feature: Generalise synth and introduce a bug word
Future changes are going to want to use cpu_bug_* in a manner similar to
Linux. Introduce one bug word, and generalise the calculation of
NCAPINTS.
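One way to picture the generalised calculation is the sketch below. The word counts and helper names are assumptions made for the example and may not match the real headers: the total number of capability words is the number of architectural feature words plus the synthetic words plus the bug words, with the bug word placed last.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical word counts, chosen only for illustration. */
#define FSCAPINTS    10 /* Architectural feature words. */
#define X86_NR_SYNTH 1  /* Synthetic feature words. */
#define X86_NR_BUG   1  /* Bug words. */

#define NCAPINTS (FSCAPINTS + X86_NR_SYNTH + X86_NR_BUG)

/* Bit numbers for synthetic features and for bugs. */
#define X86_SYNTH(x) (FSCAPINTS * 32 + (x))
#define X86_BUG(x)   ((FSCAPINTS + X86_NR_SYNTH) * 32 + (x))

static uint32_t ex_caps[NCAPINTS];

/* Set one capability or bug bit. */
void ex_set_cap(unsigned int bit)
{
    ex_caps[bit / 32] |= 1u << (bit % 32);
}

/* Test one capability or bug bit. */
bool ex_has_cap(unsigned int bit)
{
    return (ex_caps[bit / 32] >> (bit % 32)) & 1u;
}
```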
Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Acked-by: Jan Beulich <jbeulich@suse.com>
Andrew Cooper [Tue, 27 Nov 2018 15:27:41 +0000 (15:27 +0000)]
x86/vtd: Drop struct intel_iommu
The sole remaining member of struct intel_iommu is the drhd backpointer. Move
this into struct vtd_iommu, replacing the 'intel' pointer.
This removes one dynamic memory allocation per IOMMU on the system.
Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Acked-by: Kevin Tian <kevin.tian@intel.com>
Andrew Cooper [Tue, 27 Nov 2018 15:06:15 +0000 (15:06 +0000)]
x86/vtd: Drop struct iommu_flush
It is unclear why this abstraction exists, but iommu_get_flush() returns
possibly NULL and every user unconditionally dereferences the result. In
practice, I can't spot a path where iommu is NULL, so I think it is mostly
dead.
Move the two function pointers into struct vtd_iommu (using a flush prefix),
and delete iommu_get_flush(). Furthermore, there is no need to pass the IOMMU
pointer to the callbacks via a void pointer, so change the parameter to be
correctly typed as struct vtd_iommu. Clean up bool_t to bool in surrounding
context.
Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Acked-by: Kevin Tian <kevin.tian@intel.com>
Andrew Cooper [Tue, 27 Nov 2018 15:02:18 +0000 (15:02 +0000)]
x86/vtd: Drop struct ir_ctrl
It is unclear why this abstraction exists, but iommu_ir_ctrl() returns
possibly NULL and every user unconditionally dereferences the result. In
practice, I can't spot a path where iommu is NULL, so I think it is mostly
dead.
Move the fields into struct vtd_iommu, and delete iommu_ir_ctrl().
Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Acked-by: Kevin Tian <kevin.tian@intel.com>
Andrew Cooper [Tue, 27 Nov 2018 14:57:14 +0000 (14:57 +0000)]
x86/vtd: Drop struct qi_ctrl
It is unclear why this abstraction exists, but iommu_qi_ctrl() returns
possibly NULL and every user unconditionally dereferences the result. In
practice, I can't spot a path where iommu is NULL, so I think it is mostly
dead.
Move the sole member into struct vtd_iommu, and delete iommu_qi_ctrl().
Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Acked-by: Kevin Tian <kevin.tian@intel.com>
Andrew Cooper [Tue, 27 Nov 2018 15:05:48 +0000 (15:05 +0000)]
x86/vtd: Rename struct iommu to vtd_iommu
VT-d's local struct iommu is an overly-generic name, for a structure which in
practice maps 1-to-1 with the real IOMMUs in the system.
No functional change.
Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Acked-by: Kevin Tian <kevin.tian@intel.com>
Jan Beulich [Thu, 5 Sep 2019 08:02:11 +0000 (10:02 +0200)]
VT-d/ATS: tidy device_in_domain()
Use appropriate types. Drop unnecessary casts. Check for failures which
can (at least in theory because of non-obvious breakage elsewhere)
occur, instead of ones which really can't (map_domain_page() won't
return NULL).
Signed-off-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Kevin Tian <kevin.tian@intel.com>
Juergen Gross [Thu, 5 Sep 2019 08:00:36 +0000 (10:00 +0200)]
x86: remove sched-if.h includes from various sources
xen/sched-if.h is included in multiple sources where it isn't directly
needed. Remove those #include statements.
Suggested-by: Jan Beulich <jbeulich@suse.com>
Signed-off-by: Juergen Gross <jgross@suse.com>
Acked-by: Andrew Cooper <andrew.cooper3@citrix.com>
Jan Beulich [Thu, 5 Sep 2019 08:00:07 +0000 (10:00 +0200)]
x86/cpu-policy: work around bogus warning in test harness
Despite %.12s properly limiting the number of characters read from
ident[], gcc 9 (at least up to 9.2.0) warns about the strings not
being nul-terminated:
test-cpu-policy.c:64:18: error: '%.12s' directive argument is not a nul-terminated string [-Werror=format-overflow=]
64 | fail(" Test '%.12s', expected vendor %u, got %u\n",
| ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
test-cpu-policy.c:20:12: note: in definition of macro 'fail'
20 | printf(fmt, ##__VA_ARGS__); \
| ^~~
test-cpu-policy.c:64:27: note: format string is defined here
64 | fail(" Test '%.12s', expected vendor %u, got %u\n",
| ^~~~~
test-cpu-policy.c:44:7: note: referenced argument declared here
44 | } tests[] = {
| ^~~~~
The issue was reported against gcc in their bugzilla (bug 91667).
Re-order array entries, oddly enough suppressing the warning.
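For reference, the precision in %.12s bounds how many bytes are read from the argument, so a 12-byte array without a terminating nul can be printed safely. The helper below is a hypothetical sketch of that usage and is not taken from the test harness.

```c
#include <stdio.h>

/*
 * Format a 12-byte vendor identifier which is not nul-terminated.
 * The precision stops the read after 12 bytes.
 */
int ex_format_vendor(char *out, size_t len, const char ident[12])
{
    return snprintf(out, len, "%.12s", ident);
}
```

Called with the 12 bytes of "GenuineIntel" and a sufficiently large output buffer, this writes exactly those 12 characters followed by a nul.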
Reported-by: Christopher Clark <christopher.w.clark@gmail.com>
Reported-by: Dario Faggioli <dfaggioli@suse.com>
Signed-off-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Andrew Cooper <andrew.cooper3@citrix.com>
Roger Pau Monné [Thu, 5 Sep 2019 07:59:26 +0000 (09:59 +0200)]
p2m/ept: add _subtree suffix to ept_invalidate_emt
So that the name implies the function is used to walk the page table
pointer passed as parameter. Drop the parent_ prefix from the level
parameter, since the level passed is the one matching the EPT entry
passed in the mfn parameter.
While there also change bool_t to bool and add an assert to make sure
no level 0 entries (ie: 4K EPT leaf entries) are passed as parameters.
No functional change intended.
Suggested-by: Kevin Tian <kevin.tian@intel.com>
Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>
Reviewed-by: Kevin Tian <kevin.tian@intel.com>
Jan Beulich [Thu, 5 Sep 2019 07:58:17 +0000 (09:58 +0200)]
VT-d: avoid PCI device lookup
The two uses of pci_get_pdev_by_domain() lack proper locking, but are
also only used to get hold of a NUMA node ID. Calculate and store the
node ID earlier on and remove the lookups (in lieu of fixing the
locking).
While doing this it became apparent that iommu_alloc()'s use of
alloc_pgtable_maddr() would occur before RHSAs would have been parsed:
iommu_alloc() gets called from the DRHD parsing routine, which - on
spec conforming platforms - happens strictly before RHSA parsing. Defer
the allocation until after all ACPI table parsing has finished,
establishing the node ID there first.
Suggested-by: Kevin Tian <kevin.tian@intel.com>
Signed-off-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Kevin Tian <kevin.tian@intel.com>
Jan Beulich [Thu, 5 Sep 2019 07:57:44 +0000 (09:57 +0200)]
VT-d: tidy <X>_to_<Y>() functions
Drop iommu_to_drhd() altogether - there's no need for a loop here, the
corresponding DRHD is a field in struct intel_iommu.
Constify drhd_to_rhsa()'s parameter and adjust style.
Signed-off-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Kevin Tian <kevin.tian@intel.com>
Jan Beulich [Thu, 5 Sep 2019 07:56:42 +0000 (09:56 +0200)]
x86/shadow: don't enable shadow mode with too small a shadow allocation (part 2)
Commit 2634b997af ("x86/shadow: don't enable shadow mode with too small
a shadow allocation") was incomplete: The adjustment done there to
shadow_enable() is also needed in shadow_one_bit_enable(). The (new)
problem report was (apparently) a failed PV guest migration followed by
another migration attempt for that same guest. Disabling log-dirty mode
after the first one had left a couple of shadow pages allocated (perhaps
something that also wants fixing), and hence the second enabling of
log-dirty mode wouldn't have allocated anything further.
Reported-by: James Wang <jnwang@suse.com>
Signed-off-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Tim Deegan <tim@xen.org>