Hans van Kranenburg [Thu, 3 Jan 2019 23:35:45 +0000 (00:35 +0100)]
t/h/L/vif-common.sh: disable handle_iptable
Also see Debian bug #894013. The current attempt at providing
anti-spoofing rules results in a situation that does not have any
effect. Also note that forwarding bridged traffic to iptables is not
enabled by default, and that for openvswitch users it does not make any
sense.
So, stop cluttering the live iptables ruleset.
This functionality seems to have been introduced before 2004 and has not
received any attention since.
It would be nice to have a proper discussion upstream about how Xen
could provide some anti mac/ip spoofing in the dom0. It does not seem to
be a trivial thing to do, since it requires having quite some knowledge
about what the domU is allowed to do or not (e.g. a domU can be a
router...).
Signed-off-by: Hans van Kranenburg <hans@knorrie.org>
Ian Jackson [Fri, 12 Oct 2018 16:56:56 +0000 (17:56 +0100)]
docs/man/xen-vbd-interface.7: Provide properly-formatted NAME section
This manpage was omitted from
docs/man: Provide properly-formatted NAME sections
because I was previously building with markdown not installed.
Signed-off-by: Ian Jackson <ian.jackson@citrix.com>
Ian Jackson [Fri, 12 Oct 2018 17:17:10 +0000 (17:17 +0000)]
shim: Provide separate install-shim target
When building on a 32-bit userland, the user wants to build 32-bit
tools and a 64-bit hypervisor. This involves setting XEN_TARGET_ARCH
to different values for the tools build and the hypervisor build.
So the user must invoke the tools build and the hypervisor build
separately.
However, although the shim is done by the tools/firmware Makefile, its
bitness needs to be the same as the hypervisor, not the same as the
tools. When run with XEN_TARGET_ARCH=x86_32, it is skipped, which is
wrong.
So the user must invoke the shim build separately. This can be done
with
make -C tools/firmware/xen-dir XEN_TARGET_ARCH=x86_64
However, tools/firmware/xen-dir has no `install' target. The
installation of all `firmware' is done in tools/firmware/Makefile. It
might be possible to fix this, but it is not trivial. For example,
the definitions of INST_DIR and DEBG_DIR would need to be copied, as
would an appropriate $(INSTALL_DIR) call.
For now, provide an `install-shim' target in tools/firmware/Makefile.
This has to be called from `install' of course. We can't make it
a dependency of `install' because it might be run before `all' has
completed. We could make it depend on a `shim' target but such
a target is nearly impossible to write because everything is done by
the inflexible subdir-$@ machinery.
The overall result of this patch is that existing make invocations
work as before. But additionally, the user can say
make -C tools/firmware install-shim XEN_TARGET_ARCH=x86_64
to install the shim. The user must have built it already.
Unlike the build rune, this install-rune is properly conditional
so it is OK to call on ARM.
What a mess.
Signed-off-by: Ian Jackson <ijackson@chiark.greenend.org.uk>
Ian Jackson [Fri, 12 Oct 2018 16:00:16 +0000 (16:00 +0000)]
config/Tools.mk.in: Respect caller's CONFIG_PV_SHIM
This makes it easier to disable the shim build. (In Debian we need to
build the shim separately because it needs different compiler flags).
Signed-off-by: Ian Jackson <ijackson@chiark.greenend.org.uk>
[ Hans: adjust from tools/firmware/Makefile to config/Tools.mk.in to
follow changes that happened in 8845155c83 ("pvshim: make PV shim build
selectable from configure") ]
Signed-off-by: Hans van Kranenburg <hans@knorrie.org>
Ian Jackson [Fri, 5 Oct 2018 17:05:48 +0000 (18:05 +0100)]
.gitignore: Add configure output which we always delete and regenerate
Signed-off-by: Ian Jackson <ian.jackson@citrix.com>
Ian Jackson [Wed, 3 Oct 2018 15:25:58 +0000 (16:25 +0100)]
autoconf: Provide libexec_libdir_suffix
This is going to be used to put libfsimage.so into a path containing
the multiarch triplet.
Signed-off-by: Ian Jackson <ian.jackson@citrix.com>
Hans van Kranenburg [Mon, 25 May 2020 15:08:18 +0000 (17:08 +0200)]
tools-libfsimage-prefix.diff
\o/
Ian Jackson [Thu, 20 Sep 2018 17:10:14 +0000 (18:10 +0100)]
Do not build the instruction emulator
Signed-off-by: Ian Jackson <ian.jackson@citrix.com>
Bastian Blank [Sat, 5 Jul 2014 09:47:29 +0000 (11:47 +0200)]
Remove static solaris support from pygrub
Patch-Name: tools-pygrub-remove-static-solaris-support
Gbp-Pq: Topic misc
Gbp-Pq: Name tools-pygrub-remove-static-solaris-support
Bastian Blank [Sat, 5 Jul 2014 09:47:30 +0000 (11:47 +0200)]
Do not ship COPYING into /usr/include
This is not wanted in Debian. COPYING ends up in
/usr/share/doc/xen-*copyright.
Patch-Name: tools-include-no-COPYING.diff
Signed-off-by: Ian Jackson <ian.jackson@citrix.com>
Bastian Blank [Sat, 5 Jul 2014 09:46:45 +0000 (11:46 +0200)]
config-prefix.diff
Patch-Name: config-prefix.diff
Gbp-Pq: Topic prefix-abiname
Gbp-Pq: Name config-prefix.diff
Bastian Blank [Sat, 5 Jul 2014 09:46:43 +0000 (11:46 +0200)]
Display Debian package version in hypervisor log
During hypervisor boot, disable the banner and nicely display the xen
version as well as the Maintainer address from debian/control.
For this to work the SOURCE_BASE_DIR variable needs to be set by the
build system to the top directory, i.e. where the debian folder is.
Original patch by Bastian Blank <waldi@debian.org>
Modified by
Hans van Kranenburg <hans@knorrie.org>
Maximilian Engelhardt <maxi@daemonizer.de>
Ian Jackson [Wed, 19 Sep 2018 15:53:22 +0000 (16:53 +0100)]
Delete configure output
These autogenerated files are not useful in Debian; dh_autoreconf will
regenerate them.
If this patch does not apply when rebasing, you can simply delete the
files again.
Signed-off-by: Ian Jackson <ian.jackson@citrix.com>
Ian Jackson [Wed, 19 Sep 2018 15:45:49 +0000 (16:45 +0100)]
Delete config.sub and config.guess
dh_autoreconf will provide these back.
If this patch does not apply when rebasing, you can simply delete the
files again.
Signed-off-by: Ian Jackson <ian.jackson@citrix.com>
Hans van Kranenburg [Sat, 18 Dec 2021 17:27:39 +0000 (18:27 +0100)]
debian/control: change Uploaders address for Ian
Ian does not work at Citrix any more now, but he keeps the xenproject
address.
Signed-off-by: Hans van Kranenburg <hans@knorrie.org>
Hans van Kranenburg [Fri, 17 Dec 2021 10:51:13 +0000 (11:51 +0100)]
debian/changelog: finish 4.16.0-1~exp1
Signed-off-by: Hans van Kranenburg <hans@knorrie.org>
Maximilian Engelhardt [Sun, 6 Dec 2020 15:17:15 +0000 (16:17 +0100)]
debian/rules: remove unused pybuild settings
These are currently not used and not needed.
Signed-off-by: Maximilian Engelhardt <maxi@daemonizer.de>
Maximilian Engelhardt [Sun, 6 Dec 2020 15:17:15 +0000 (16:17 +0100)]
debian: fix dependency generation for python (Closes: #976597)
During the Debian 11 release cycle, we did a Xen upload at the same time
when a transition of default Python version (from 3.8 to 3.9) was
happening. This exposed a problem: Our xen-utils-V package just depended on
'python3'.
Why was this a problem? We ship a compiled extension, which gets built for
the *default* Python version in the system. In this case it was the
xenfsimage.cpython-39-x86_64-linux-gnu.so file (note the 39 in the name).
Having just the quite generic 'python3' dependency allowed our packages to
enter Debian testing before the transition of default Python version (from
3.8 to 3.9) did. As a result, the xenfsimage library could not be found,
because the user system would still be looking for [...]cpython-38[...].
To generate correct dependencies, dh-python >= 5.20211016 (which
includes a fix for #980303) is needed.
In the above case, the correct dependencies that would need to be added
are: 'python3 (<< 3.10), python3 (>= 3.9~)'. Having these dependencies in
place would make sure our packages enter testing at the same time as the
default change to Python would.
The actual patch in here is quite small, but not easy to understand as
multiple other bugs are also intertwined, so here follows more explanation.
First, this patch reverts 1ca529cc3c ("debian/control: fix xen-utils-4.14
python3 dependency"), which was a workaround for a missing python3
dependency in the xen-utils-V package. This problem happened due to
the dynamic ${python3:Depends} back in) is a prerequisite for the second part
below.
Second, this patch adds scanning of the private directory
"/usr/lib/xen-$(upstream_version)/lib/python" for dh-python to detect
our private python extension and generate proper version dependencies
for python3. This unfortunately was broken in dh-python (#980303), but
is fixed since 5.20211016.
This part is the fix for #976597.
We were thinking about adding a build dependency to dh-python >=
5.20211016 to the xen source package, but decided to omit the version
from the dependency for the following reasons:
* A dh-python with all relevant fixes is meanwhile in unstable and
testing.
* We wanted to make backporting xen to bullseye easy and dh-python >=
5.20211016 is not in bullseye. Adding the versioned dependency would
mean more manual work while doing the backport.
We believe this is safe to do for the following reasons:
* When building for unstable or testing in an up-to-date chroot there
should be no issues as all relevant bugs in dh-python are fixed.
* When building a backport for bullseye, proper version dependencies for
python3 will not be generated, but also have not been generated in the
past, so this is not a regression. And, the default python3 version in
bullseye will very likely never change.
* The revert of 1ca529cc3c might be seen as a problem as the dependency
on python3 in xen-utils might be missing when compiled for bullseye.
However this luckily does not happen because of our additional scan
for private python extension in this commit. While the bullseye
dh-python is buggy in a way that it doesn't generate proper version
dependencies for python3 it still detects python3 usage and adds a
'python3' dependency.
Signed-off-by: Maximilian Engelhardt <maxi@daemonizer.de>
[ adding first three introductory paragraphs in the commit message ]
Signed-off-by: Hans van Kranenburg <hans@knorrie.org>
Maximilian Engelhardt [Sun, 24 Oct 2021 17:24:17 +0000 (19:24 +0200)]
debian: call update-grub when installing/removing xen-hypervisor-common
This fixes #988901 as now update-grub will be called by all packages
that can affect the state of the update-grub generated output.
Signed-off-by: Maximilian Engelhardt <maxi@daemonizer.de>
Maximilian Engelhardt [Sun, 24 Oct 2021 16:34:40 +0000 (18:34 +0200)]
d/salsa-ci.yml: disable cross building as it's currently not working
Cross building xen currently fails due to Debian bug #982406 in
markdown. It can be tried again when there are better chances of it
building successfully.
Signed-off-by: Maximilian Engelhardt <maxi@daemonizer.de>
Maximilian Engelhardt [Wed, 15 Sep 2021 17:53:42 +0000 (19:53 +0200)]
d/salsa-ci.yml: Explicitly set RELEASE variable to unstable
This makes it easier to switch to another distribution.
Signed-off-by: Maximilian Engelhardt <maxi@daemonizer.de>
Maximilian Engelhardt [Thu, 4 Mar 2021 22:25:31 +0000 (23:25 +0100)]
d/rules: remove reproducible=+fixfilepath from DEB_BUILD_MAINT_OPTIONS
It is enabled by default now.
Signed-off-by: Maximilian Engelhardt <maxi@daemonizer.de>
Hans van Kranenburg [Mon, 13 Dec 2021 17:59:31 +0000 (18:59 +0100)]
debian: No longer build for i386
It was already not possible to use x86_32 hardware because the i386
packages already shipped a 64-bit hypervisor and PV shim. Running 32-bit
utils with a 64-bit hypervisor requires using a compatibility layer that
is fragile and becomes harder to maintain and test upstream.
The libc6-xen package is not being built any more, and 32-bit PV support
also has been removed from the Linux kernel now.
This change ends the 'grace period' in which users should have moved to
using a fully 64-bit dom0.
As a result the only reverse dependency for the libc6-xen package in
Debian is also removed, so this also (Closes: #992909) in the process.
Signed-off-by: Hans van Kranenburg <hans@knorrie.org>
Hans van Kranenburg [Mon, 13 Dec 2021 17:20:15 +0000 (18:20 +0100)]
debian/control.md5sum: remove this obsolete file
This was used in the past to check if input files for generating
configuration were changed. It's no longer used now.
Signed-off-by: Hans van Kranenburg <hans@knorrie.org>
Maximilian Engelhardt [Tue, 7 Dec 2021 20:52:15 +0000 (21:52 +0100)]
debian/rules: provide SOURCE_BASE_DIR
This adds the SOURCE_BASE_DIR variable pointing to the top directory.
The variable is then later used by the patch in our delta queue that
adjusts the first hypervisor log line to insert Debian specific
information.
Signed-off-by: Maximilian Engelhardt <maxi@daemonizer.de>
Maximilian Engelhardt [Sat, 4 Dec 2021 18:34:38 +0000 (19:34 +0100)]
debian: libxenstore went from 3.0 to 4
In upstream commit d2cad41def ("tools/libs/store: cleanup libxenstore
interface") a number of functions in this library were removed, which
results in a soname bump.
Follow this change by now shipping libxenstore4 instead of 3.0.
Maximilian Engelhardt [Sat, 4 Dec 2021 18:01:20 +0000 (19:01 +0100)]
debian: update debian/control for xen-4.16 build
replace 4.14 with 4.16
Signed-off-by: Maximilian Engelhardt <maxi@daemonizer.de>
Maximilian Engelhardt [Sat, 4 Dec 2021 17:47:19 +0000 (18:47 +0100)]
debian: follow upstream removal of '.sh' suffix in xl bash_completion file
Signed-off-by: Maximilian Engelhardt <maxi@daemonizer.de>
Hans van Kranenburg [Sat, 4 Dec 2021 14:34:33 +0000 (15:34 +0100)]
Update changelog for new upstream 4.16.0
[git-debrebase changelog: new upstream 4.16.0]
Hans van Kranenburg [Sat, 4 Dec 2021 14:34:33 +0000 (15:34 +0100)]
Update to upstream 4.16.0
[git-debrebase anchor: new upstream 4.16.0, merge]
Hans van Kranenburg [Sat, 27 Nov 2021 14:45:55 +0000 (15:45 +0100)]
debian/changelog: finish 4.14.3+32-g9de3671772-1
Ian Jackson [Tue, 30 Nov 2021 11:42:42 +0000 (11:42 +0000)]
xen/Makefile: Set 4.16 version
Signed-off-by: Ian Jackson <iwj@xenproject.org>
Ian Jackson [Tue, 30 Nov 2021 11:40:21 +0000 (11:40 +0000)]
CHANGELOG.md: Set 4.16 version and date
Signed-off-by: Ian Jackson <iwj@xenproject.org>
Ian Jackson [Tue, 30 Nov 2021 11:38:20 +0000 (11:38 +0000)]
README: make heading say 4.16
Signed-off-by: Ian Jackson <iwj@xenproject.org>
Ian Jackson [Tue, 30 Nov 2021 11:14:00 +0000 (11:14 +0000)]
SUPPORT.md: Define support lifetime
Signed-off-by: Ian Jackson <iwj@xenproject.org>
Acked-by: Jan Beulich <jbeulich@suse.com>
Ian Jackson [Tue, 30 Nov 2021 11:32:36 +0000 (11:32 +0000)]
Config.mk: Bump tags to 4.16.0 final
No actual change to the code since RC4.
Signed-off-by: Ian Jackson <iwj@xenproject.org>
Roger Pau Monne [Wed, 24 Nov 2021 11:24:03 +0000 (12:24 +0100)]
CHANGELOG: add missing entries for work during the 4.16 release cycle
Document some of the relevant changes during the 4.16 release cycle.
Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>
Release-Acked-by: Ian Jackson <iwj@xenproject.org>
Hans van Kranenburg [Sat, 27 Nov 2021 14:09:47 +0000 (15:09 +0100)]
Update changelog for new upstream 4.14.3+32-g9de3671772
[git-debrebase changelog: new upstream 4.14.3+32-g9de3671772]
Hans van Kranenburg [Sat, 27 Nov 2021 14:09:47 +0000 (15:09 +0100)]
Update to upstream 4.14.3+32-g9de3671772
[git-debrebase anchor: new upstream 4.14.3+32-g9de3671772, merge]
Hans van Kranenburg [Mon, 13 Sep 2021 11:38:58 +0000 (13:38 +0200)]
debian/changelog: finish 4.14.3-1
Andrew Cooper [Wed, 24 Nov 2021 21:11:52 +0000 (21:11 +0000)]
Revert "x86/CPUID: shrink max_{,sub}leaf fields according to actual leaf contents"
OSSTest has identified a 3rd regression caused by this change. Migration
between Xen 4.15 and 4.16 on the nocera pair of machines (AMD Opteron 4133)
fails with:
xc: error: Failed to set CPUID policy: leaf 00000000, subleaf ffffffff, msr ffffffff (22 = Invalid argument): Internal error
xc: error: Restore failed (22 = Invalid argument): Internal error
which is a safety check to prevent resuming the guest when the CPUID data has
been truncated. The problem is caused by shrinking of the max policies, which
is an ABI that needs handling compatibly between different versions of Xen.
Furthermore, shrinking of the default policies also breaks things in some
cases, because certain cpuid= settings in a VM config file which used to have
an effect will now be silently discarded.
This reverts commit 540d911c2813c3d8f4cdbb3f5672119e5e768a3d, as well as the
partial fix attempt in 81da2b544cbb003a5447c9b14d275746ad22ab37 (which added
one new case where cpuid= settings might not apply correctly) and restores the
same behaviour as Xen 4.15.
Fixes: 540d911c2813 ("x86/CPUID: shrink max_{,sub}leaf fields according to actual leaf contents")
Fixes: 81da2b544cbb ("x86/cpuid: prevent shrinking migrated policies max leaves")
Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Acked-by: Roger Pau Monné <roger.pau@citrix.com>
Acked-by: Jan Beulich <jbeulich@suse.com>
Release_Acked-by: Ian Jackson <iwj@xenproject.org>
Ian Jackson [Tue, 23 Nov 2021 16:43:31 +0000 (16:43 +0000)]
Turn off debug by default
Signed-off-by: Ian Jackson <iwj@xenproject.org>
Ian Jackson [Tue, 23 Nov 2021 16:41:30 +0000 (16:41 +0000)]
SUPPORT.md: Set Release Notes link
Signed-off-by: Ian Jackson <iwj@xenproject.org>
Ian Jackson [Tue, 23 Nov 2021 16:39:47 +0000 (16:39 +0000)]
Config.mk: switch to named tags (for stable branch)
Signed-off-by: Ian Jackson <iwj@xenproject.org>
Jan Beulich [Tue, 23 Nov 2021 12:30:09 +0000 (13:30 +0100)]
x86/P2M: deal with partial success of p2m_set_entry()
M2P and PoD stats need to remain in sync with P2M; if an update succeeds
only partially, respective adjustments need to be made. If updates get
made before the call, they may also need undoing upon complete failure
(i.e. including the single-page case).
Log-dirty state would better also be kept in sync.
Note that the change to set_typed_p2m_entry() may not be strictly
necessary (due to the order restriction enforced near the top of the
function), but is being kept here to be on the safe side.
This is CVE-2021-28705 and CVE-2021-28709 / XSA-389.
Signed-off-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Roger Pau Monné <roger.pau@citrix.com>
master commit: 74a11c43fd7e074b1f77631b446dd2115eacb9e8
master date: 2021-11-22 12:27:30 +0000
Jan Beulich [Tue, 23 Nov 2021 12:29:54 +0000 (13:29 +0100)]
x86/PoD: handle intermediate page orders in p2m_pod_cache_add()
p2m_pod_decrease_reservation() may pass pages to the function which
aren't 4k, 2M, or 1G. Handle all intermediate orders as well, to avoid
hitting the BUG() at the switch() statement's "default" case.
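A minimal sketch of the idea, not the actual Xen code: the cache is modelled as two counters (2M "super" entries and 4k "single" entries), and an intermediate order is broken down into entries of the largest cached size that fits. The struct, the function name and the counter model are assumptions made for illustration; only the x86 page orders (0, 9, 18) are taken as given.

```c
#include <assert.h>

#define ORDER_4K 0u
#define ORDER_2M 9u
#define ORDER_1G 18u

/* Hypothetical model of the PoD cache: it only tracks 2M and 4k entries. */
struct pod_cache {
    unsigned long super_entries;   /* number of 2M entries */
    unsigned long single_entries;  /* number of 4k entries */
};

/*
 * Add a page range of the given order to the cache. Instead of accepting
 * only the exact orders 4k, 2M and 1G, every order in between is split
 * into 2M entries (order >= 9) or 4k entries (order < 9).
 */
void cache_add(struct pod_cache *c, unsigned int order)
{
    assert(order <= ORDER_1G);

    if (order >= ORDER_2M)
        c->super_entries += 1UL << (order - ORDER_2M);
    else
        c->single_entries += 1UL << order;
}
```

With this shape there is no "default" case left to hit: an order-10 range simply becomes two 2M entries, and an order-3 range becomes eight 4k entries.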
This is CVE-2021-28708 / part of XSA-388.
Fixes: 3c352011c0d3 ("x86/PoD: shorten certain operations on higher order ranges")
Signed-off-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Roger Pau Monné <roger.pau@citrix.com>
master commit: 8ec13f68e0b026863d23e7f44f252d06478bc809
master date: 2021-11-22 12:27:30 +0000
Jan Beulich [Tue, 23 Nov 2021 12:29:41 +0000 (13:29 +0100)]
x86/PoD: deal with misaligned GFNs
Users of XENMEM_decrease_reservation and XENMEM_populate_physmap aren't
required to pass in order-aligned GFN values. (While I consider this
bogus, I don't think we can fix this there, as that might break existing
code, e.g. Linux's swiotlb, which - while affecting PV only - until
recently had been enforcing only page alignment on the original
allocation.) Only non-PoD code paths (guest_physmap_{add,remove}_page(),
p2m_set_entry()) look to be dealing with this properly (in part by being
implemented inefficiently, handling every 4k page separately).
Introduce wrappers taking care of splitting the incoming request into
aligned chunks, without putting much effort in trying to determine the
largest possible chunk at every iteration.
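The splitting described above can be sketched as follows. This is an illustration under assumptions, not the wrapper from the patch: the function names and the callback type are hypothetical. The chunk order is computed once, as the lowest set bit of (gfn | size), so every chunk is aligned to its own size without searching for the largest possible chunk at each step.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

/* Hypothetical worker callback, invoked once per aligned chunk. */
typedef void (*chunk_fn)(unsigned long gfn, unsigned int order, void *ctx);

/* Index of the lowest set bit; v must be non-zero. */
unsigned int lowest_set_bit(unsigned long v)
{
    unsigned int bit = 0;

    assert(v != 0);
    while (!(v & 1UL)) {
        v >>= 1;
        bit++;
    }
    return bit;
}

/*
 * Split the range [gfn, gfn + 2^order) into equally sized chunks, each
 * aligned to its own size. Returns the number of chunks processed.
 */
unsigned long for_each_aligned_chunk(unsigned long gfn, unsigned int order,
                                     chunk_fn fn, void *ctx)
{
    unsigned long left, count = 0;
    unsigned int chunk_order;

    assert(order < sizeof(unsigned long) * CHAR_BIT);
    left = 1UL << order;
    /* The size bit bounds chunk_order by order, so the loop terminates. */
    chunk_order = lowest_set_bit(gfn | left);

    do {
        if (fn)
            fn(gfn, chunk_order, ctx);
        gfn += 1UL << chunk_order;
        left -= 1UL << chunk_order;
        count++;
    } while (left);

    return count;
}
```

For a well-behaved caller (gfn aligned to the requested order) this performs exactly one iteration, which matches the note that behavior is unchanged for such callers.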
Also "handle" p2m_set_entry() failure for non-order-0 requests by
crashing the domain in one more place. Alongside putting a log message
there, also add one to the other similar path.
Note regarding locking: This is left in the actual worker functions on
the assumption that callers aren't guaranteed atomicity wrt acting on
multiple pages at a time. For mis-aligned GFNs gfn_lock() wouldn't have
locked the correct GFN range anyway, if it didn't simply resolve to
p2m_lock(), and for well-behaved callers there continues to be only a
single iteration, i.e. behavior is unchanged for them. (FTAOD pulling
out just pod_lock() into p2m_pod_decrease_reservation() would result in
a lock order violation.)
This is CVE-2021-28704 and CVE-2021-28707 / part of XSA-388.
Fixes: 3c352011c0d3 ("x86/PoD: shorten certain operations on higher order ranges")
Signed-off-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Roger Pau Monné <roger.pau@citrix.com>
master commit: 182c737b9ba540ebceb1433f3940fbed6eac4ea9
master date: 2021-11-22 12:27:30 +0000
Julien Grall [Tue, 23 Nov 2021 12:29:09 +0000 (13:29 +0100)]
xen/page_alloc: Harden assign_pages()
domain_tot_pages() and d->max_pages are 32-bit values. While the order
should always be quite small, it would still be possible to overflow
if domain_tot_pages() is near to (2^32 - 1).
As this code may be called by a guest via XENMEM_increase_reservation
and XENMEM_populate_physmap, we want to make sure the guest is not going
to be able to allocate more than it is allowed.
Rework the allocation check to avoid any possible overflow. While the
check domain_tot_pages() < d->max_pages should technically not be
necessary, it is probably best to have it to catch any possible
inconsistencies in the future.
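To illustrate the overflow concern, here is a small sketch with hypothetical function names; it is not the code from the patch. The naive form adds first and can wrap around in 32-bit arithmetic, while the reworked form compares against the remaining headroom and therefore never overflows.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Naive check: tot_pages + nr may wrap around and wrongly pass. */
bool may_allocate_naive(uint32_t tot_pages, uint32_t max_pages,
                        unsigned int order)
{
    uint32_t nr = UINT32_C(1) << order;

    return tot_pages + nr <= max_pages;
}

/* Overflow-safe check: subtract instead of add, after a sanity check. */
bool may_allocate(uint32_t tot_pages, uint32_t max_pages, unsigned int order)
{
    uint32_t nr;

    assert(order < 32);
    nr = UINT32_C(1) << order;

    /* Catches inconsistent state and keeps the subtraction below safe. */
    if (tot_pages > max_pages)
        return false;

    return nr <= max_pages - tot_pages;
}
```

With tot_pages = 2^32 - 2, max_pages = 2^32 - 1 and order 2, the naive sum wraps to 2 and the request is accepted, whereas the safe variant sees only one page of headroom and rejects it.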
This is CVE-2021-28706 / part of XSA-385.
Signed-off-by: Julien Grall <jgrall@amazon.com>
Signed-off-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Roger Pau Monné <roger.pau@citrix.com>
master commit: 143501861d48e1bfef495849fd68584baac05849
master date: 2021-11-22 11:11:05 +0000
Jan Beulich [Mon, 22 Nov 2021 11:12:32 +0000 (11:12 +0000)]
x86/P2M: deal with partial success of p2m_set_entry()
M2P and PoD stats need to remain in sync with P2M; if an update succeeds
only partially, respective adjustments need to be made. If updates get
made before the call, they may also need undoing upon complete failure
(i.e. including the single-page case).
Log-dirty state would better also be kept in sync.
Note that the change to set_typed_p2m_entry() may not be strictly
necessary (due to the order restriction enforced near the top of the
function), but is being kept here to be on the safe side.
This is CVE-2021-28705 and CVE-2021-28709 / XSA-389.
Signed-off-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Roger Pau Monné <roger.pau@citrix.com>
Jan Beulich [Mon, 22 Nov 2021 11:11:44 +0000 (11:11 +0000)]
x86/PoD: handle intermediate page orders in p2m_pod_cache_add()
p2m_pod_decrease_reservation() may pass pages to the function which
aren't 4k, 2M, or 1G. Handle all intermediate orders as well, to avoid
hitting the BUG() at the switch() statement's "default" case.
This is CVE-2021-28708 / part of XSA-388.
Fixes: 3c352011c0d3 ("x86/PoD: shorten certain operations on higher order ranges")
Signed-off-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Roger Pau Monné <roger.pau@citrix.com>
Jan Beulich [Mon, 22 Nov 2021 11:11:44 +0000 (11:11 +0000)]
x86/PoD: deal with misaligned GFNs
Users of XENMEM_decrease_reservation and XENMEM_populate_physmap aren't
required to pass in order-aligned GFN values. (While I consider this
bogus, I don't think we can fix this there, as that might break existing
code, e.g. Linux's swiotlb, which - while affecting PV only - until
recently had been enforcing only page alignment on the original
allocation.) Only non-PoD code paths (guest_physmap_{add,remove}_page(),
p2m_set_entry()) look to be dealing with this properly (in part by being
implemented inefficiently, handling every 4k page separately).
Introduce wrappers taking care of splitting the incoming request into
aligned chunks, without putting much effort in trying to determine the
largest possible chunk at every iteration.
Also "handle" p2m_set_entry() failure for non-order-0 requests by
crashing the domain in one more place. Alongside putting a log message
there, also add one to the other similar path.
Note regarding locking: This is left in the actual worker functions on
the assumption that callers aren't guaranteed atomicity wrt acting on
multiple pages at a time. For mis-aligned GFNs gfn_lock() wouldn't have
locked the correct GFN range anyway, if it didn't simply resolve to
p2m_lock(), and for well-behaved callers there continues to be only a
single iteration, i.e. behavior is unchanged for them. (FTAOD pulling
out just pod_lock() into p2m_pod_decrease_reservation() would result in
a lock order violation.)
This is CVE-2021-28704 and CVE-2021-28707 / part of XSA-388.
Fixes: 3c352011c0d3 ("x86/PoD: shorten certain operations on higher order ranges")
Signed-off-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Roger Pau Monné <roger.pau@citrix.com>
Julien Grall [Mon, 22 Nov 2021 11:11:05 +0000 (11:11 +0000)]
xen/page_alloc: Harden assign_pages()
domain_tot_pages() and d->max_pages are 32-bit values. While the order
should always be quite small, it would still be possible to overflow
if domain_tot_pages() is near to (2^32 - 1).
As this code may be called by a guest via XENMEM_increase_reservation
and XENMEM_populate_physmap, we want to make sure the guest is not going
to be able to allocate more than it is allowed.
Rework the allocation check to avoid any possible overflow. While the
check domain_tot_pages() < d->max_pages should technically not be
necessary, it is probably best to have it to catch any possible
inconsistencies in the future.
This is CVE-2021-28706 / part of XSA-385.
Signed-off-by: Julien Grall <jgrall@amazon.com>
Signed-off-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Roger Pau Monné <roger.pau@citrix.com>
Roger Pau Monne [Thu, 18 Nov 2021 08:28:06 +0000 (09:28 +0100)]
efi: fix alignment of function parameters in compat mode
Currently the max_store_size, remain_store_size and max_size in
compat_pf_efi_runtime_call are 4 byte aligned, which makes clang
13.0.0 complain with:
In file included from compat.c:30:
./runtime.c:646:13: error: passing 4-byte aligned argument to 8-byte aligned parameter 2 of 'QueryVariableInfo' may result in an unaligned pointer access [-Werror,-Walign-mismatch]
&op->u.query_variable_info.max_store_size,
^
./runtime.c:647:13: error: passing 4-byte aligned argument to 8-byte aligned parameter 3 of 'QueryVariableInfo' may result in an unaligned pointer access [-Werror,-Walign-mismatch]
&op->u.query_variable_info.remain_store_size,
^
./runtime.c:648:13: error: passing 4-byte aligned argument to 8-byte aligned parameter 4 of 'QueryVariableInfo' may result in an unaligned pointer access [-Werror,-Walign-mismatch]
&op->u.query_variable_info.max_size);
^
Fix this by bouncing the variables on the stack in order for them to
be 8 byte aligned.
Note this could be done in a more selective manner to only apply to
compat code calls, but given the overhead of making an EFI call doing
an extra copy of 3 variables doesn't seem to warrant the special
casing.
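The bouncing approach can be sketched like this. The struct layout, the stand-in firmware function and the returned values are assumptions for illustration only, and the attribute syntax is GCC/clang specific. The point is that the callee only ever sees pointers to naturally aligned stack variables, and the results are copied back afterwards.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical compat layout: the 64-bit fields end up 4-byte aligned. */
struct compat_query_info {
    uint32_t attr;
    uint64_t max_store_size;
    uint64_t remain_store_size;
    uint64_t max_size;
} __attribute__((packed, aligned(4)));

_Static_assert(offsetof(struct compat_query_info, max_store_size) % 8 != 0,
               "field is only 4-byte aligned in this layout");

/* Stand-in for the firmware call; it expects naturally aligned pointers. */
int fake_query_variable_info(uint32_t attr, uint64_t *max_store,
                             uint64_t *remain_store, uint64_t *max_size)
{
    (void)attr;
    assert((uintptr_t)max_store % _Alignof(uint64_t) == 0);
    assert((uintptr_t)remain_store % _Alignof(uint64_t) == 0);
    assert((uintptr_t)max_size % _Alignof(uint64_t) == 0);

    *max_store = 4096;      /* made-up values for the sketch */
    *remain_store = 1024;
    *max_size = 512;
    return 0;
}

/* Bounce the three outputs through aligned locals, then copy them back. */
int compat_query(struct compat_query_info *op)
{
    uint64_t max_store_size = 0, remain_store_size = 0, max_size = 0;
    int rc = fake_query_variable_info(op->attr, &max_store_size,
                                      &remain_store_size, &max_size);

    op->max_store_size = max_store_size;
    op->remain_store_size = remain_store_size;
    op->max_size = max_size;
    return rc;
}
```

Three extra 64-bit copies per call are cheap next to the cost of the call itself, which is the trade-off the message describes.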
Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>
Release-Acked-by: Ian Jackson <iwj@xenproject.org>
Reviewed-by: Ian Jackson <iwj@xenproject.org>
Signed-off-by: Ian Jackson <iwj@xenproject.org>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
---
Changes since v3:
- Remove hard tabs. Apply Jan's r-b as authorised in email.
Changes since v2:
- Adjust the commentary as per discussion.
Changes since v1:
- Copy back the results.
Anthony PERARD [Fri, 19 Nov 2021 10:29:48 +0000 (10:29 +0000)]
golang/xenlight: regen generated code
Fixes: 7379f9e10a3b ("gnttab: allow setting max version per-domain")
Fixes: 1e6706b0d123 ("xen/arm: Introduce gpaddr_bits field to struct xen_domctl_getdomaininfo")
Signed-off-by: Anthony PERARD <anthony.perard@citrix.com>
Acked-by: Nick Rosbrook <rosbrookn@ainfosec.com>
Acked-by: Ian Jackson <iwj@xenproject.org>
Release-Acked-by: Ian Jackson <iwj@xenproject.org>
Jan Beulich [Fri, 19 Nov 2021 14:14:08 +0000 (15:14 +0100)]
VT-d: fix reduced page table levels support when sharing tables
domain_pgd_maddr() contains logic to adjust the root address to be put
in the context entry in case 4-level page tables aren't supported by an
IOMMU. This logic may not be bypassed when sharing page tables.
This is CVE-2021-28710 / XSA-390.
Fixes: 25ccd093425c ("iommu: remove the share_p2m operation")
Signed-off-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Kevin Tian <kevin.tian@intel.com>
Release-Acked-by: Ian Jackson <iwj@xenproject.org>
Jan Beulich [Fri, 19 Nov 2021 08:42:10 +0000 (09:42 +0100)]
public/gnttab: relax v2 recommendation
With there being a way to disable v2 support, telling new guests to use
v2 exclusively is not a good suggestion.
Signed-off-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Luca Fancellu <luca.fancellu@arm.com>
master commit: 2d72d2784eb71d8532bfbd6462d261739c9e82e4
master date: 2021-11-16 17:34:06 +0100
Jan Beulich [Fri, 19 Nov 2021 08:41:41 +0000 (09:41 +0100)]
x86/APIC: avoid iommu_supports_x2apic() on error path
The value it returns may change from true to false in case
iommu_enable_x2apic() fails and, as a side effect, clears iommu_intremap
(as can happen at least on AMD). Latch the return value from the first
invocation to replace the second one.
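A toy model of the latching pattern, with entirely hypothetical names and state: the query result depends on a flag that the failing enable step clears as a side effect, so asking again on the error path yields a different answer than the one the decision was originally based on.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical state: the query below depends on this flag. */
struct iommu_state {
    bool intremap;
};

bool supports_x2apic(const struct iommu_state *s)
{
    return s->intremap;
}

/* Models a failing enable that clears the flag as a side effect. */
int enable_x2apic(struct iommu_state *s)
{
    s->intremap = false;
    return -1;
}

/*
 * Returns the "supported" value the error path acts on. With latch set,
 * the value from the first query is reused; without it, the query is
 * repeated and observes the side effect of the failed enable.
 */
bool error_path_view(struct iommu_state *s, bool latch)
{
    bool supported;

    assert(s);
    supported = supports_x2apic(s);
    if (!supported)
        return false;

    if (enable_x2apic(s) != 0)
        return latch ? supported : supports_x2apic(s);

    return supported;
}
```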
Signed-off-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Roger Pau Monné <roger.pau@citrix.com>
master commit: 0f50d1696b3c13cbf0b18fec817fc291d5a30a31
master date: 2021-11-04 14:44:43 +0100
Jan Beulich [Fri, 19 Nov 2021 08:41:09 +0000 (09:41 +0100)]
x86/IOMMU: mark IOMMU / intremap not in use when ACPI tables are missing
x2apic_bsp_setup() gets called ahead of iommu_setup(), and since x2APIC
mode (physical vs clustered) depends on iommu_intremap, that variable
needs to be set to off as soon as we know we can't / won't enable
interrupt remapping, i.e. in particular when parsing of the respective
ACPI tables failed. Move the turning off of iommu_intremap from AMD
specific code into acpi_iommu_init(), accompanying it by clearing of
iommu_enable.
Take the opportunity and also fully skip ACPI table parsing logic on
VT-d when both "iommu=off" and "iommu=no-intremap" are in effect anyway,
like was already the case for AMD.
The tag below only references the commit uncovering a pre-existing
anomaly.
Fixes: d8bd82327b0f ("AMD/IOMMU: obtain IVHD type to use earlier")
Reported-by: Andrew Cooper <andrew.cooper3@citrix.com>
Signed-off-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Roger Pau Monné <roger.pau@citrix.com>
Reviewed-by: Kevin Tian <kevin.tian@intel.com>
master commit: 46c4061cd2bf69e8039021af615c2bdb94e50088
master date: 2021-11-04 14:44:01 +0100
Marek Marczykowski-Górecki [Fri, 19 Nov 2021 08:40:44 +0000 (09:40 +0100)]
x86/xstate: reset cached register values on resume
set_xcr0() and set_msr_xss() use cached value to avoid setting the
register to the same value over and over. But suspend/resume implicitly
reset the registers and since percpu areas are not deallocated on
suspend anymore, the cache gets stale.
Reset the cache on resume, to ensure the next write will really hit the
hardware. Choose value 0, as it will never be a legitimate write to
those registers - and so, will force write (and cache update).
Note the cache is used in get_xcr0() and get_msr_xss() too, but:
- set_xcr0() is called a few lines below in xstate_init(), so it will
update the cache with appropriate value
- get_msr_xss() is not used anywhere - and thus not before any
set_msr_xss() that will fill the cache
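The stale-cache problem and the fix can be modelled in a few lines. This is a simplified sketch with hypothetical names; the "hardware" is just a struct field, and the reset value used in the usage note is made up. It follows the message in using 0 as the value that is never legitimately written, so resetting the cache to 0 forces the next write through.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical model: hw is the real register, cache the last value written. */
struct cached_reg {
    uint64_t hw;
    uint64_t cache;
    unsigned int hw_writes;  /* counts writes that reached the hardware */
};

/* Write through the cache, skipping the hardware if the value is unchanged. */
void reg_write(struct cached_reg *r, uint64_t val)
{
    assert(val != 0);  /* 0 is reserved as the "nothing cached" value */

    if (r->cache == val)
        return;

    r->hw = val;
    r->hw_writes++;
    r->cache = val;
}

/* Called on resume: forget the cached value so the next write is real. */
void reg_invalidate_cache(struct cached_reg *r)
{
    r->cache = 0;
}
```

If the hardware value is reset behind the cache's back (as suspend/resume does), a subsequent reg_write() of the old value is silently skipped; calling reg_invalidate_cache() first makes it reach the hardware again.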
Fixes: aca2a985a55a "xen: don't free percpu areas during suspend"
Signed-off-by: Marek Marczykowski-Górecki <marmarek@invisiblethingslab.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
master commit: f7f4a523927fa4c7598e4647a16bc3e3cf8009d0
master date: 2021-11-04 14:42:37 +0100
Andrew Cooper [Fri, 19 Nov 2021 08:40:19 +0000 (09:40 +0100)]
x86/traps: Fix typo in do_entry_CP()
The call to debugger_trap_entry() should pass the correct vector. The
break-for-gdbsx logic is in practice unreachable because PV guests can't
generate #CP, but it will interfere with anyone inserting custom debugging
into debugger_trap_entry().
Fixes: 5ad05b9c2490 ("x86/traps: Implement #CP handler and extend #PF for shadow stacks")
Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
master commit:
512863ed238d7390f74d43f0ba298b1dfa8f4803
master date: 2021-11-03 19:13:17 +0000
Andrew Cooper [Fri, 19 Nov 2021 08:39:46 +0000 (09:39 +0100)]
x86/shstk: Fix use of shadow stacks with XPTI active
The call to setup_cpu_root_pgt(0) in smp_prepare_cpus() is too early. It
clones the BSP's stack while the .data mapping is still in use, causing all
mappings to be fully read/write (and with no guard pages either). This
ultimately causes #DF when trying to enter the dom0 kernel for the first time.
Defer setting up BSP's XPTI pagetable until reinit_bsp_stack() after we've set
up proper shadow stack permissions.
Fixes: 60016604739b ("x86/shstk: Rework the stack layout to support shadow stacks")
Fixes: b60ab42db2f0 ("x86/shstk: Activate Supervisor Shadow Stacks")
Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
master commit:
b2851580b1f2ff121737a37cb25a370d7692ae3b
master date: 2021-11-03 13:08:42 +0000
Dongli Zhang [Fri, 19 Nov 2021 08:39:09 +0000 (09:39 +0100)]
update system time immediately when VCPUOP_register_vcpu_info
The guest may access the pv vcpu_time_info immediately after
VCPUOP_register_vcpu_info. This is to borrow the idea of
VCPUOP_register_vcpu_time_memory_area, where the
force_update_vcpu_system_time() is called immediately when the new memory
area is registered.
Otherwise, we may observe clock drift at the VM side if the VM accesses
the clocksource immediately after VCPUOP_register_vcpu_info().
Reference: https://lists.xenproject.org/archives/html/xen-devel/2021-10/msg00571.html
Signed-off-by: Dongli Zhang <dongli.zhang@oracle.com>
Acked-by: Stefano Stabellini <sstabellini@kernel.org>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
master commit:
b67f09721f136cc3a9afcb6a82466d1bd27aa6c0
master date: 2021-11-03 10:19:06 +0100
Jan Beulich [Fri, 19 Nov 2021 08:38:42 +0000 (09:38 +0100)]
x86/paging: restrict physical address width reported to guests
Modern hardware may report more than 48 bits of physical address width.
For paging-external guests our P2M implementation does not cope with
larger values. Telling the guest of more available bits means misleading
it into perhaps trying to actually put some page there (like was e.g.
intermediately done in OVMF for the shared info page).
While there also convert the PV check to a paging-external one (which in
our current code base are synonyms of one another anyway).
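The restriction can be pictured as a simple clamp. This is only a sketch under assumptions: the 48-bit cap is inferred from the text above, the helper name guest_paddr_bits() is made up, and passing other guest types through unchanged is a simplification.

```c
#include <stdbool.h>

/* Assumed cap, taken from the "more than 48 bits" wording above. */
#define GUEST_MAX_PADDR_BITS 48u

/*
 * Hypothetical helper: physical address width to report to a guest.
 * Paging-external guests never see more than the cap; everything else
 * is passed through unchanged in this sketch.
 */
unsigned int guest_paddr_bits(unsigned int hw_bits, bool paging_external)
{
    if ( paging_external && hw_bits > GUEST_MAX_PADDR_BITS )
        return GUEST_MAX_PADDR_BITS;

    return hw_bits;
}
```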
Fixes: 5dbd60e16a1f ("x86/shadow: Correct guest behaviour when creating PTEs above maxphysaddr")
Signed-off-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Roger Pau Monné <roger.pau@citrix.com>
master commit:
b7635526acffbe4ad8ad16fd92812c57742e54c2
master date: 2021-10-19 10:08:30 +0200
Jan Beulich [Fri, 19 Nov 2021 08:38:09 +0000 (09:38 +0100)]
x86/AMD: make HT range dynamic for Fam17 and up
At the time of
d838ac2539cf ("x86: don't allow Dom0 access to the HT
address range") documentation correctly stated that the range was
completely fixed. For Fam17 and newer, it lives at the top of physical
address space, though.
To correctly determine the top of physical address space, we need to
account for their physical address reduction, hence the calculation of
paddr_bits also gets adjusted.
While for paddr_bits < 40 the HT range is completely hidden, there's no
need to suppress the range insertion in that case: It'll just have no
real meaning.
Reported-by: Igor Druzhinin <igor.druzhinin@citrix.com>
Signed-off-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Roger Pau Monné <roger.pau@citrix.com>
master commit:
d6e38eea2d806c53d976603717aebf6e5de30a1e
master date: 2021-10-19 10:04:13 +0200
Jan Beulich [Fri, 19 Nov 2021 08:37:37 +0000 (09:37 +0100)]
x86emul: de-duplicate scatters to the same linear address
The SDM specifically allows for earlier writes to fully overlapping
ranges to be dropped. If a guest did so, hvmemul_phys_mmio_access()
would crash it if varying data was written to the same address. Detect
overlaps early, as doing so in hvmemul_{linear,phys}_mmio_access() would
be quite a bit more difficult. To maintain proper faulting behavior,
instead of dropping earlier write instances of fully overlapping slots
altogether, write the data of the final of these slots multiple times.
(We also can't pull ahead the [single] write of the data of the last of
the slots, clearing all involved slots' op_mask bits together, as this
would yield incorrect results if there were intervening partially
overlapping ones.)
Note that due to cache slot use being linear address based, there's no
similar issue with multiple writes to the same physical address (mapped
through different linear addresses).
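A simplified sketch of the "write the data of the final slot multiple times" idea. It assumes equal-sized elements, so that equal addresses mean fully overlapping writes, and it leaves out op_mask handling and partially overlapping slots; dedup_scatter() is a hypothetical name, not the emulator's function.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * For every element whose address is written again by a later element,
 * replace its data with that of the last element targeting the same
 * address. All writes are still issued (preserving faulting behavior),
 * but writes to one address now always carry identical data.
 */
void dedup_scatter(const uint64_t *addr, uint32_t *data, size_t n)
{
    for ( size_t i = 0; i < n; i++ )
    {
        /* Search backwards so the first hit is the final matching slot. */
        for ( size_t j = n - 1; j > i; j-- )
        {
            if ( addr[j] == addr[i] )
            {
                data[i] = data[j];
                break;
            }
        }
    }
}
```

The final slot of each group is never modified itself, so copying from it is safe regardless of the order in which earlier slots are processed.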
Signed-off-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Roger Pau Monné <roger.pau@citrix.com>
master commit:
a8cddbac5051020bb4a59a7f0ea27500c51063fb
master date: 2021-10-19 10:02:39 +0200
Jan Beulich [Fri, 19 Nov 2021 08:37:10 +0000 (09:37 +0100)]
x86/HVM: correct cleanup after failed viridian_vcpu_init()
This happens after nestedhvm_vcpu_initialise(), so its effects also need
to be undone.
Fixes: 40a4a9d72d16 ("viridian: add init hooks")
Signed-off-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
master commit:
66675056c6e59b8a8b651a29ef53c63e9e04f58d
master date: 2021-10-18 14:21:17 +0200
Anthony PERARD [Fri, 19 Nov 2021 08:36:36 +0000 (09:36 +0100)]
build: fix dependencies in arch/x86/boot
Temporarily fix the list of headers that cmdline.c and reloc.c depend
on, until the next time the list is out of sync again.
Also, add the linker script to the list.
Signed-off-by: Anthony PERARD <anthony.perard@citrix.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
master commit:
2f5f0a1b77161993c16c4cc243467d75e5b7633b
master date: 2021-10-14 12:35:42 +0200
Roger Pau Monné [Wed, 17 Nov 2021 11:43:05 +0000 (12:43 +0100)]
tools/python: fix python libxc bindings to pass a max grant version
Such max version should be provided by the caller, otherwise the
bindings will default to specifying a max version of 2, which is
in line with the current defaults in the hypervisor.
Fixes: 7379f9e10a ('gnttab: allow setting max version per-domain')
Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>
Acked-by: Andrew Cooper <andrew.cooper3@citrix.com>
Acked-by: Ian Jackson <iwj@xenproject.org>
Acked-by: Marek Marczykowski-Górecki <marmarek@invisiblethingslab.com>
Release-Acked-by: Ian Jackson <iwj@xenproject.org>
Roger Pau Monné [Wed, 17 Nov 2021 11:35:26 +0000 (12:35 +0100)]
CHANGELOG: set Xen 4.15 release date
Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>
Acked-by: Jan Beulich <jbeulich@suse.com>
Release-Acked-by: Ian Jackson <iwj@xenproject.org>
Roger Pau Monné [Wed, 17 Nov 2021 07:13:18 +0000 (08:13 +0100)]
test/tsx: set grant version for created domains
Set the grant table version for the created domains to use version 1,
as such test domains don't require the usage of the grant table at
all. A TODO note is added to switch those dummy domains to not have a
grant table at all when possible. Without setting the grant version
the domains for the tests cannot be created.
Fixes: 7379f9e10a ('gnttab: allow setting max version per-domain')
Reported-by: Andrew Cooper <andrew.cooper3@citrix.com>
Reported-by: Jan Beulich <jbeulich@suse.com>
Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
Release-Acked-by: Ian Jackson <iwj@xenproject.org>
Roger Pau Monné [Wed, 17 Nov 2021 07:13:02 +0000 (08:13 +0100)]
tests/resource: set grant version for created domains
Set the grant table version for the created domains to use version 1,
as that's the version used by the test cases. Without setting the grant
version the domains for the tests cannot be created.
Fixes: 7379f9e10a ('gnttab: allow setting max version per-domain')
Reported-by: Andrew Cooper <andrew.cooper3@citrix.com>
Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>
Acked-by: Andrew Cooper <andrew.cooper3@citrix.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
Release-Acked-by: Ian Jackson <iwj@xenproject.org>
Roger Pau Monné [Wed, 17 Nov 2021 07:12:00 +0000 (08:12 +0100)]
domctl: introduce a macro to set the grant table max version
Such macro just clamps the passed version to fit in the designated
bits of the domctl field. The main purpose is to make it clearer in
the code when max grant version is being set in the grant_opts field.
Existing users that were setting the version in the grant_opts field
are switched to use the macro.
No functional change intended.
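A sketch of what such a macro could look like. The names GRANT_OPTS_VERSION_MASK, GRANT_OPTS_VERSION() and make_grant_opts() are illustrative only, and the assumption that the version occupies the low 4 bits of the field is not stated in this message. "Clamping" is modelled here as plain masking.

```c
#include <stdint.h>

/* Assumed layout: the max grant version lives in the low 4 bits. */
#define GRANT_OPTS_VERSION_MASK 0xfu
#define GRANT_OPTS_VERSION(v)   ((uint32_t)(v) & GRANT_OPTS_VERSION_MASK)

/* Hypothetical helper: combine unrelated option bits with a max version. */
uint32_t make_grant_opts(uint32_t other_flags, unsigned int max_version)
{
    return (other_flags & ~GRANT_OPTS_VERSION_MASK) |
           GRANT_OPTS_VERSION(max_version);
}
```

Going through the macro at every call site makes it obvious in the code where a version is being stored into the field, which is the stated purpose of the change.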
Requested-by: Jan Beulich <jbeulich@suse.com>
Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Andrew Cooper <andrew.cooper3@citrix.com>
Reviewed-by: Bertrand Marquis <bertrand.marquis@arm.com>
Acked-by: Julien Grall <jgrall@amazon.com>
Reviewed-by: Ian Jackson <iwj@xenproject.org>
Release-Acked-by: Ian Jackson <iwj@xenproject.org>
Jan Beulich [Tue, 16 Nov 2021 16:34:06 +0000 (17:34 +0100)]
public/gnttab: relax v2 recommendation
With there being a way to disable v2 support, telling new guests to use
v2 exclusively is not a good suggestion.
Signed-off-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Luca Fancellu <luca.fancellu@arm.com>
Release-Acked-by: Ian Jackson <iwj@xenproject.org>
Jane Malalane [Fri, 12 Nov 2021 14:48:21 +0000 (14:48 +0000)]
tests/resource: Extend to check that the grant frames are mapped correctly
Previously, we checked that we could map 40 pages with nothing
complaining. Now we're adding extra logic to check that those 40
frames are "correct".
Suggested-by: Andrew Cooper <andrew.cooper3@citrix.com>
Signed-off-by: Jane Malalane <jane.malalane@citrix.com>
Release-Acked-by: Ian Jackson <iwj@xenproject.org>
Reviewed-by: Roger Pau Monné <roger.pau@citrix.com>
Roger Pau Monne [Wed, 10 Nov 2021 17:40:59 +0000 (18:40 +0100)]
x86/cpuid: prevent shrinking migrated policies max leaves
CPUID policies of guests being migrated shouldn't have their maximum
leaves shrunk, as that would be a guest visible change. The hypervisor
has no knowledge of whether a guest has been migrated or is built from
scratch, and hence it must not blindly shrink the CPUID policy in
recalculate_cpuid_policy. Remove the
x86_cpuid_policy_shrink_max_leaves call from recalculate_cpuid_policy.
Removing such call could be seen as a partial revert of
540d911c28.
Instead let the toolstack shrink the policies for newly created
guests, while keeping the previous values for guests that are migrated
in. Note that guests migrated in without a CPUID policy won't get any
kind of shrinking applied.
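The split of responsibilities can be sketched as follows. This is a simplified model with hypothetical names (struct cpuid_leaf, shrink_max_leaf(), toolstack_max_leaf()); it only handles a flat array of basic leaves and ignores subleaves.

```c
#include <stdbool.h>
#include <stdint.h>

struct cpuid_leaf { uint32_t a, b, c, d; };

static bool leaf_is_empty(const struct cpuid_leaf *l)
{
    return !(l->a | l->b | l->c | l->d);
}

/* Lower max_leaf until it points at a leaf with actual contents. */
unsigned int shrink_max_leaf(const struct cpuid_leaf *leaves,
                             unsigned int max_leaf)
{
    while ( max_leaf > 0 && leaf_is_empty(&leaves[max_leaf]) )
        max_leaf--;

    return max_leaf;
}

/*
 * Toolstack-side decision: only newly built guests get shrunk, while
 * guests migrated in keep the value they were already seeing.
 */
unsigned int toolstack_max_leaf(const struct cpuid_leaf *leaves,
                                unsigned int max_leaf, bool migrated_in)
{
    return migrated_in ? max_leaf : shrink_max_leaf(leaves, max_leaf);
}
```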
Fixes: 540d911c28 ('x86/CPUID: shrink max_{,sub}leaf fields according to actual leaf contents')
Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>
Acked-by: Andrew Cooper <andrew.cooper3@citrix.com>
Release-Acked-by: Ian Jackson <iwj@xenproject.org>
Jan Beulich [Fri, 12 Nov 2021 12:56:51 +0000 (13:56 +0100)]
VT-d: per-domain IOMMU bitmap needs to have dynamic size
With no upper bound (anymore) on the number of IOMMUs, a fixed-size
64-bit map may be insufficient (systems with 40 IOMMUs have already been
observed).
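For illustration, a bitmap sized at run time from the number of IOMMUs might look like the sketch below. The helper names are hypothetical and calloc() stands in for the hypervisor's allocator.

```c
#include <assert.h>
#include <limits.h>
#include <stdlib.h>

#define BITS_PER_LONG    (sizeof(unsigned long) * CHAR_BIT)
#define BITS_TO_LONGS(n) (((n) + BITS_PER_LONG - 1) / BITS_PER_LONG)

/* Allocate a zeroed bitmap with room for nr_iommus bits. */
unsigned long *iommu_bitmap_alloc(unsigned int nr_iommus)
{
    return calloc(BITS_TO_LONGS(nr_iommus), sizeof(unsigned long));
}

void iommu_bitmap_set(unsigned long *map, unsigned int bit)
{
    assert(map != NULL);
    map[bit / BITS_PER_LONG] |= 1UL << (bit % BITS_PER_LONG);
}

int iommu_bitmap_test(const unsigned long *map, unsigned int bit)
{
    assert(map != NULL);
    return (map[bit / BITS_PER_LONG] >> (bit % BITS_PER_LONG)) & 1;
}
```

Sizing the allocation from the IOMMU count means bit indices beyond the width of a single word remain addressable, which a fixed one-word map cannot guarantee.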
Fixes: 27713fa2aa21 ("VT-d: improve save/restore of registers across S3")
Signed-off-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Kevin Tian <kevin.tian@intel.com>
Release-Acked-by: Ian Jackson <iwj@xenproject.org>
Stefano Stabellini [Fri, 5 Nov 2021 15:44:45 +0000 (08:44 -0700)]
MAINTAINERS: add Bertrand to the ARM reviewers
Signed-off-by: Stefano Stabellini <stefano.stabellini@xilinx.com>
Acked-by: Bertrand Marquis <bertrand.marquis@arm.com>
Acked-by: Julien Grall <jgrall@amazon.com>
Release-Acked-by: Ian Jackson <iwj@xenproject.org>
Stefano Stabellini [Wed, 10 Nov 2021 20:55:55 +0000 (12:55 -0800)]
xen/arm: allocate_bank_memory: don't create memory banks of size zero
allocate_bank_memory can be called with a tot_size of zero, as an
example see the implementation of allocate_memory which can call
allocate_bank_memory with a tot_size of zero for the second memory bank.
If tot_size == 0, don't create an empty memory bank, just return
immediately without error. Otherwise a zero-size memory bank will be
added to the domain device tree.
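The early return can be sketched like this. The types and the add_bank() helper are simplified, hypothetical stand-ins, not the real allocate_bank_memory(), which also allocates and maps the memory.

```c
#include <stdbool.h>
#include <stdint.h>

#define NR_MEM_BANKS 4  /* arbitrary limit for this sketch */

struct membank { uint64_t start, size; };
struct meminfo {
    unsigned int nr_banks;
    struct membank bank[NR_MEM_BANKS];
};

/* Record a bank, but treat a zero-size request as a successful no-op. */
bool add_bank(struct meminfo *mem, uint64_t start, uint64_t tot_size)
{
    if ( tot_size == 0 )
        return true;            /* nothing to do, and not an error */

    if ( mem->nr_banks >= NR_MEM_BANKS )
        return false;

    mem->bank[mem->nr_banks].start = start;
    mem->bank[mem->nr_banks].size = tot_size;
    mem->nr_banks++;

    return true;
}
```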
Note that Linux is known to be able to cope with zero-size memory banks,
and Xen more recently gained the ability to do so as well (
5a37207df520
"xen/arm: bootfdt: Ignore empty memory bank"). However, there might be
other non-Linux OSes that are not able to cope with empty memory banks
as well as Linux (and now Xen). It would be more robust to avoid
zero-size memory banks unless required.
Moreover, the code to find empty address regions in make_hypervisor_node
in Xen is not able to cope with empty memory banks today and would
result in a Xen crash. This is only a latent bug because
make_hypervisor_node is only called for Dom0 at present and
allocate_memory is only called for DomU at the moment. (But if
make_hypervisor_node was to be called for a DomU, then the Xen crash
would become manifest.)
Fixes: f2931b4233ec ("xen/arm: introduce allocate_memory")
Signed-off-by: Stefano Stabellini <stefano.stabellini@xilinx.com>
Reviewed-by: Oleksandr Tyshchenko <oleksandr_tyshchenko@epam.com>
Reviewed-by: Julien Grall <jgrall@amazon.com>
Release-Acked-by: Ian Jackson <iwj@xenproject.org>
Stefano Stabellini [Wed, 10 Nov 2021 20:18:12 +0000 (12:18 -0800)]
xen/arm: don't assign domU static-mem to dom0 as reserved-memory
DomUs static-mem ranges are added to the reserved_mem array for
accounting, but they shouldn't be assigned to dom0 as the other regular
reserved-memory ranges in device tree.
In make_memory_nodes, fix the error by skipping banks with xen_domain
set to true in the reserved-memory array. Also make sure to use the
first valid (!xen_domain) start address for the memory node name.
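A sketch of the skipping logic, using simplified hypothetical types and helper names rather than the real make_memory_nodes(): banks flagged xen_domain are ignored both when counting the ranges to hand to dom0 and when picking the start address used for the node name.

```c
#include <stdbool.h>
#include <stdint.h>

struct rsv_bank {
    uint64_t start, size;
    bool xen_domain;    /* true: domU static-mem, not for dom0 */
};

/* Number of reserved-memory banks that should be exposed to dom0. */
unsigned int count_dom0_banks(const struct rsv_bank *b, unsigned int n)
{
    unsigned int count = 0;

    for ( unsigned int i = 0; i < n; i++ )
        if ( !b[i].xen_domain )
            count++;

    return count;
}

/* Find the first valid (!xen_domain) start address, if there is one. */
bool first_dom0_start(const struct rsv_bank *b, unsigned int n,
                      uint64_t *start)
{
    for ( unsigned int i = 0; i < n; i++ )
    {
        if ( !b[i].xen_domain )
        {
            *start = b[i].start;
            return true;
        }
    }

    return false;
}
```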
Fixes: 41c031ff437b ("xen/arm: introduce domain on Static Allocation")
Signed-off-by: Stefano Stabellini <stefano.stabellini@xilinx.com>
Reviewed-by: Penny Zheng <penny.zheng@arm.com>
Reviewed-by: Julien Grall <jgrall@amazon.com>
Release-Acked-by: Ian Jackson <iwj@xenproject.org>
Roger Pau Monne [Tue, 9 Nov 2021 09:47:21 +0000 (10:47 +0100)]
tools/configure: make iPXE dependent on QEMU traditional
iPXE is only used by QEMU traditional, so make it off by default
unless QEMU traditional is enabled.
Reported-by: Andrew Cooper <andrew.cooper3@citrix.com>
Fixes: bcf77ce510 ('configure: modify default of building rombios')
Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>
Release-Acked-by: Ian Jackson <iwj@xenproject.org>
Reviewed-by: Ian Jackson <iwj@xenproject.org>
Roger Pau Monne [Thu, 4 Nov 2021 10:48:34 +0000 (11:48 +0100)]
gnttab: allow setting max version per-domain
Introduce a new domain create field so that toolstack can specify the
maximum grant table version usable by the domain. This is plumbed into
xl and settable by the user as max_grant_version.
Previously this was only settable on a per host basis using the
gnttab command line option.
Note the version is specified using 4 bits, which leaves room to
specify up to grant table version 15. Given that we only have 2 grant
table versions right now, and a new version is unlikely in the near
future using 4 bits seems more than enough.
xenstored stubdomains are limited to grant table v1 because the
current MiniOS code used to build them only has support for grants v1.
There are existing limits set for xenstored stubdomains at creation
time that already match the defaults in MiniOS.
Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>
Acked-by: Christian Lindig <christian.lindig@citrix.com>
Reviewed-by: Ian Jackson <iwj@xenproject.org>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
Release-Acked-by: Ian Jackson <iwj@xenproject.org>
Andrew Cooper [Fri, 29 Oct 2021 17:38:13 +0000 (18:38 +0100)]
xen: Report grant table v1/v2 capabilities to the toolstack
In order to let the toolstack be able to set the gnttab version on a
per-domain basis, it needs to know which ABIs Xen supports. Introduce
XEN_SYSCTL_PHYSCAP_gnttab_v{1,2} for the purpose, and plumb it down into
userspace.
Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Acked-by: Christian Lindig <christian.lindig@citrix.com>
Reviewed-by: Ian Jackson <iwj@xenproject.org>
Reviewed-by: Roger Pau Monné <roger.pau@citrix.com>
Release-Acked-by: Ian Jackson <iwj@xenproject.org>
Luca Fancellu [Fri, 5 Nov 2021 13:07:28 +0000 (13:07 +0000)]
xen/efi: Fix Grub2 boot on arm64
The code introduced by commit
a1743fc3a9fe9b68c265c45264dddf214fd9b882
("arm/efi: Use dom0less configuration when using EFI boot") is
introducing a problem when booting Xen using Grub2 on ARM machines using EDK2.
Contrary to the UEFI specification, EDK2+Grub2 is returning a NULL DeviceHandle
inside the interface given by the LOADED_IMAGE_PROTOCOL service, this
handle is used later by efi_bs->HandleProtocol(...) inside
get_parent_handle(...) when requesting the SIMPLE_FILE_SYSTEM_PROTOCOL
interface, causing Xen to stop the boot because of an EFI_INVALID_PARAMETER
error.
Before the commit above, the function was never called because the
logic was skipping the call when there were multiboot modules in the
DT because the filesystem was never used and the bootloader had
put in place all the right modules in memory and the addresses
in the DT.
To fix the problem the old logic is put back in place. Because the handle
was given to the efi_check_dt_boot(...), but the revert put the handle
out of scope, the signature of the function is changed to use an
EFI_LOADED_IMAGE handle and request the EFI_FILE_HANDLE only when
needed (module found using xen,uefi-binary).
Another problem is found when the UEFI stub tries to check if Dom0
image or DomUs are present.
The logic doesn't work when the UEFI stub is not responsible for loading
any modules, so the efi_check_dt_boot(...) return value is modified
to return the number of multiboot modules found and not only the number
of modules loaded by the stub.
Taking the occasion to update the comment in handle_module_node(...)
to explain why we return success even if xen,uefi-binary is not found.
Fixes: a1743fc3a9 ("arm/efi: Use dom0less configuration when using EFI boot")
Signed-off-by: Luca Fancellu <luca.fancellu@arm.com>
Release-Acked-by: Ian Jackson <iwj@xenproject.org>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Bertrand Marquis <bertrand.marquis@arm.com>
Juergen Gross [Thu, 4 Nov 2021 16:11:21 +0000 (17:11 +0100)]
tools: disable building qemu-trad per default
Using qemu-traditional as device model has been deprecated for some time now.
So change the default for building it to "disable". This will affect
ioemu-stubdom, too, as there is a direct dependency between the two.
Today it is possible to use a PVH/HVM Linux-based stubdom as device
model. Additionally, using ioemu-stubdom doesn't really help
security, as it requires running a very old and potentially buggy qemu
version in a PV domain. This probably adds more security problems
than it removes by using a stubdom.
Signed-off-by: Juergen Gross <jgross@suse.com>
Reviewed-by: Samuel Thibault <samuel.thibault@ens-lyon.org>
Acked-by: Ian Jackson <iwj@xenproject.org>
Release-acked-by: Ian Jackson <iwj@xenproject.org>
Juergen Gross [Thu, 4 Nov 2021 16:11:20 +0000 (17:11 +0100)]
configure: modify default of building rombios
The tools/configure script will default to build rombios if qemu
traditional is enabled. If rombios is being built, ipxe will be built
per default, too.
This results in rombios and ipxe no longer being built by default when
disabling qemu traditional.
Fix that by rearranging the dependencies:
- build ipxe by default
- build rombios by default if either ipxe or qemu traditional are
being built
This modification prepares not building qemu traditional by default
without affecting build of rombios and ipxe.
Signed-off-by: Juergen Gross <jgross@suse.com>
Reviewed-by: Ian Jackson <iwj@xenproject.org>
Release-acked-by: Ian Jackson <iwj@xenproject.org>
Juergen Gross [Thu, 4 Nov 2021 14:42:42 +0000 (15:42 +0100)]
tools/helpers: fix broken xenstore stubdom init
Commit
1787cc167906f3f ("libs/guest: Move the guest ABI check earlier
into xc_dom_parse_image()") broke starting the xenstore stubdom. This
is due to a rather special way the xenstore stubdom domain config is
being initialized: in order to support both, PV and PVH stubdom,
init-xenstore-domain is using xc_dom_parse_image() to find the correct
domain type. Unfortunately above commit requires xc_dom_boot_xen_init()
to have been called before using xc_dom_parse_image(). This requires
the domid, which is known only after xc_domain_create(), which requires
the domain type.
In order to break this circular dependency, call xc_dom_boot_xen_init()
with an arbitrary domid first, and then set dom->guest_domid later.
Fixes: 1787cc167906f3f ("libs/guest: Move the guest ABI check earlier into xc_dom_parse_image()")
Signed-off-by: Juergen Gross <jgross@suse.com>
Release-acked-by: Ian Jackson <iwj@xenproject.org>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
Jan Beulich [Thu, 4 Nov 2021 13:44:43 +0000 (14:44 +0100)]
x86/APIC: avoid iommu_supports_x2apic() on error path
The value it returns may change from true to false in case
iommu_enable_x2apic() fails and, as a side effect, clears iommu_intremap
(as can happen at least on AMD). Latch the return value from the first
invocation to replace the second one.
Signed-off-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Roger Pau Monné <roger.pau@citrix.com>
Release-Acked-by: Ian Jackson <iwj@xenproject.org>
Jan Beulich [Thu, 4 Nov 2021 13:44:01 +0000 (14:44 +0100)]
x86/IOMMU: mark IOMMU / intremap not in use when ACPI tables are missing
x2apic_bsp_setup() gets called ahead of iommu_setup(), and since x2APIC
mode (physical vs clustered) depends on iommu_intremap, that variable
needs to be set to off as soon as we know we can't / won't enable
interrupt remapping, i.e. in particular when parsing of the respective
ACPI tables failed. Move the turning off of iommu_intremap from AMD
specific code into acpi_iommu_init(), accompanying it by clearing of
iommu_enable.
Take the opportunity and also fully skip ACPI table parsing logic on
VT-d when both "iommu=off" and "iommu=no-intremap" are in effect anyway,
like was already the case for AMD.
The tag below only references the commit uncovering a pre-existing
anomaly.
Fixes: d8bd82327b0f ("AMD/IOMMU: obtain IVHD type to use earlier")
Reported-by: Andrew Cooper <andrew.cooper3@citrix.com>
Signed-off-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Roger Pau Monné <roger.pau@citrix.com>
Reviewed-by: Kevin Tian <kevin.tian@intel.com>
Release-Acked-by: Ian Jackson <iwj@xenproject.org>
Marek Marczykowski-Górecki [Thu, 4 Nov 2021 13:42:37 +0000 (14:42 +0100)]
x86/xstate: reset cached register values on resume
set_xcr0() and set_msr_xss() use cached value to avoid setting the
register to the same value over and over. But suspend/resume implicitly
reset the registers and since percpu areas are not deallocated on
suspend anymore, the cache gets stale.
Reset the cache on resume, to ensure the next write will really hit the
hardware. Choose value 0, as it will never be a legitimate write to
those registers - and so, will force write (and cache update).
Note the cache is used in get_xcr0() and get_msr_xss() too, but:
- set_xcr0() is called a few lines below in xstate_init(), so it will
update the cache with appropriate value
- get_msr_xss() is not used anywhere - and thus not before any
set_msr_xss() that will fill the cache
Fixes: aca2a985a55a "xen: don't free percpu areas during suspend"
Signed-off-by: Marek Marczykowski-Górecki <marmarek@invisiblethingslab.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
Release-Acked-by: Ian Jackson <iwj@xenproject.org>
Andrew Cooper [Tue, 28 Sep 2021 20:55:56 +0000 (21:55 +0100)]
x86/traps: Fix typo in do_entry_CP()
The call to debugger_trap_entry() should pass the correct vector. The
break-for-gdbsx logic is in practice unreachable because PV guests can't
generate #CP, but it will interfere with anyone inserting custom debugging
into debugger_trap_entry().
Fixes: 5ad05b9c2490 ("x86/traps: Implement #CP handler and extend #PF for shadow stacks")
Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
Release-Acked-by: Ian Jackson <iwj@xenproject.org>
Oleksandr Andrushchenko [Tue, 2 Nov 2021 11:20:41 +0000 (13:20 +0200)]
xen/arm: fix SBDF calculation for vPCI MMIO handlers
While in vPCI MMIO trap handlers for the guest PCI host bridge it is not
enough for SBDF translation to simply call VPCI_ECAM_BDF(info->gpa) as
the base address may not be aligned in a way that makes the translation
always work. If not adjusted with respect to the base address, it may not be
able to properly convert the SBDF.
Fix this by adjusting the gpa with respect to the host bridge base address,
in the same way as is done for x86.
Please note that this change is not strictly required given the current
value of GUEST_VPCI_ECAM_BASE which has bits 0 to 27 clear, but could cause
issues if such value is changed, or when handlers for dom0 ECAM
regions are added as those will be mapped over existing hardware
regions that could use non-aligned base addresses.
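To illustrate why the adjustment matters, here is a small sketch based on the standard ECAM layout (bus in address bits 20-27, device in bits 15-19, function in bits 12-14). The macro and function names are made up for this example, and the base addresses used with it are arbitrary.

```c
#include <stdint.h>

/* Extract bus/device/function from an ECAM offset (bits 12 to 27). */
#define ECAM_BDF(addr) ((uint32_t)(((addr) & 0x0ffff000u) >> 12))

/* Translation straight from the guest physical address. */
uint32_t ecam_bdf_naive(uint64_t gpa)
{
    return ECAM_BDF(gpa);
}

/* Translation relative to the host bridge's ECAM base address. */
uint32_t ecam_bdf_adjusted(uint64_t gpa, uint64_t base)
{
    return ECAM_BDF(gpa - base);
}
```

With a base that has bits 0 to 27 clear, both variants agree. With a base of, say, 0x10100000, an access to bus 1, device 2, function 3 (offset 0x113000) decodes correctly only in the adjusted variant; the naive one folds part of the base into the bus number.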
Fixes: d59168dc05a5 ("xen/arm: Enable the existing x86 virtual PCI support for ARM")
Signed-off-by: Oleksandr Andrushchenko <oleksandr_andrushchenko@epam.com>
Acked-by: Stefano Stabellini <sstabellini@kernel.org>
Release-Acked-by: Ian Jackson <iwj@xenproject.org>
Ian Jackson [Wed, 3 Nov 2021 15:20:02 +0000 (15:20 +0000)]
Revert "tools: disable building qemu-trad per default"
Unfortunately this breaks the gitlab CI. See mails on-list.
This reverts commit
ce309942c791628ff42082d1b74bfaeaa5267ae0.
Andrew Cooper [Mon, 1 Nov 2021 20:45:26 +0000 (20:45 +0000)]
x86/shstk: Fix use of shadow stacks with XPTI active
The call to setup_cpu_root_pgt(0) in smp_prepare_cpus() is too early. It
clones the BSP's stack while the .data mapping is still in use, causing all
mappings to be fully read/write (and with no guard pages either). This
ultimately causes #DF when trying to enter the dom0 kernel for the first time.
Defer setting up BSP's XPTI pagetable until reinit_bsp_stack() after we've set
up proper shadow stack permissions.
Fixes: 60016604739b ("x86/shstk: Rework the stack layout to support shadow stacks")
Fixes: b60ab42db2f0 ("x86/shstk: Activate Supervisor Shadow Stacks")
Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
Release-Acked-by: Ian Jackson <iwj@xenproject.org>
Juergen Gross [Fri, 10 Sep 2021 05:55:18 +0000 (07:55 +0200)]
tools: disable building qemu-trad per default
Using qemu-traditional as device model has been deprecated for some time now.
So change the default for building it to "disable". This will affect
ioemu-stubdom, too, as there is a direct dependency between the two.
Today it is possible to use a PVH/HVM Linux-based stubdom as device
model. Additionally, using ioemu-stubdom doesn't really help
security, as it requires running a very old and potentially buggy qemu
version in a PV domain. This probably adds more security problems
than it removes by using a stubdom.
Signed-off-by: Juergen Gross <jgross@suse.com>
Reviewed-by: Samuel Thibault <samuel.thibault@ens-lyon.org>
Acked-by: Ian Jackson <iwj@xenproject.org>
Release-Acked-by: Ian Jackson <iwj@xenproject.org>
Dongli Zhang [Wed, 3 Nov 2021 09:19:06 +0000 (10:19 +0100)]
update system time immediately when VCPUOP_register_vcpu_info
The guest may access the pv vcpu_time_info immediately after
VCPUOP_register_vcpu_info. This is to borrow the idea of
VCPUOP_register_vcpu_time_memory_area, where the
force_update_vcpu_system_time() is called immediately when the new memory
area is registered.
Otherwise, we may observe clock drift at the VM side if the VM accesses
the clocksource immediately after VCPUOP_register_vcpu_info().
Reference: https://lists.xenproject.org/archives/html/xen-devel/2021-10/msg00571.html
Signed-off-by: Dongli Zhang <dongli.zhang@oracle.com>
Acked-by: Stefano Stabellini <sstabellini@kernel.org>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
Release-Acked-by: Ian Jackson <iwj@xenproject.org>
Jan Beulich [Wed, 3 Nov 2021 09:17:47 +0000 (10:17 +0100)]
x86: de-duplicate MONITOR/MWAIT CPUID-related definitions
As of
724b55f48a6c ("x86: introduce MWAIT-based, ACPI-less CPU idle
driver") they (also) live in asm/mwait.h; no idea how I missed the
duplicates back at the time.
Signed-off-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Roger Pau Monné <roger.pau@citrix.com>
Release-Acked-by: Ian Jackson <iwj@xenproject.org>
Ian Jackson [Mon, 1 Nov 2021 12:36:26 +0000 (12:36 +0000)]
README, xen/Makefile: Change version to 4.16-rc
Signed-off-by: Ian Jackson <iwj@xenproject.org>
Ian Jackson [Mon, 1 Nov 2021 12:33:54 +0000 (12:33 +0000)]
Config.mk: pin QEMU_UPSTREAM_REVISION (prep for Xen 4.16 RC1)
Signed-off-by: Ian Jackson <iwj@xenproject.org>
Stefano Stabellini [Fri, 29 Oct 2021 16:33:38 +0000 (09:33 -0700)]
automation: add a QEMU based x86_64 Dom0/DomU test
Introduce a test based on QEMU to run Xen, Dom0 and start a DomU.
This is similar to the existing qemu-alpine-arm64.sh script and test.
The only differences are:
- use Debian's qemu-system-x86_64 (on ARM we build our own)
- use ipxe instead of u-boot and ImageBuilder
Signed-off-by: Stefano Stabellini <stefano.stabellini@xilinx.com>
Reviewed-by: Anthony PERARD <anthony.perard@citrix.com>
Stefano Stabellini [Tue, 26 Oct 2021 00:55:40 +0000 (17:55 -0700)]
automation: Linux 5.10.74 test-artifact
Build a 5.10 kernel to be used as Dom0 and DomU kernel for testing. This
is almost the same as the existing ARM64 recipe for Linux 5.9, the
only differences are:
- upgrade to latest 5.10.x stable
- force Xen modules to built-in (on ARM it was already done by defconfig)
Also add the exporting job to build.yaml so that the binary can be used
during gitlab-ci runs.
Signed-off-by: Stefano Stabellini <stefano.stabellini@xilinx.com>
Reviewed-by: Anthony PERARD <anthony.perard@citrix.com>