xen.git
Jan Beulich [Mon, 19 Jul 2021 10:28:50 +0000 (12:28 +0200)]
x86/AMD: adjust SYSCFG, TOM, etc exposure to deal with running nested

In the original change I neglected to consider the case of us running as
L1 under another Xen. In this case we're not Dom0, so the underlying Xen
wouldn't permit us access to these MSRs. As an immediate workaround use
rdmsr_safe(); I don't view this as the final solution though, as the
original problem the earlier change tried to address also applies when
running nested. Yet it is then unclear to me how to properly address the
issue: We shouldn't generally expose the MSR values, but handing back
zero (or effectively any other static value) doesn't look appropriate
either.

Fixes: bfcdaae9c210 ("x86/AMD: expose SYSCFG, TOM, TOM2, and IORRs to Dom0")
Signed-off-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Julien Grall <jgrall@amazon.com>
Jan Beulich [Mon, 19 Jul 2021 10:28:09 +0000 (12:28 +0200)]
libxl/x86: check return value of SHADOW_OP_SET_ALLOCATION domctl

The hypervisor may not have enough memory to satisfy the request. While
there, make the unit of the value clear by renaming the local variable.

Requested-by: Andrew Cooper <andrew.cooper3@citrix.com>
Signed-off-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
Reviewed-by: Anthony PERARD <anthony.perard@citrix.com>
Julien Grall [Tue, 13 Jul 2021 09:20:19 +0000 (10:20 +0100)]
stubdom: foreignmemory: Fix build after 0dbb4be739c5

Commit 0dbb4be739c5 added the inclusion of xenctrl.h from private.h and
wrecked the build in an interesting way:

In file included from xen/stubdom/include/xen/domctl.h:39:0,
                 from xen/tools/include/xenctrl.h:36,
                 from private.h:4,
                 from minios.c:29:
xen/include/public/memory.h:407:5: error: expected specifier-qualifier-list before ‘XEN_GUEST_HANDLE_64’
     XEN_GUEST_HANDLE_64(const_uint8) buffer;
     ^~~~~~~~~~~~~~~~~~~

This is happening because xenctrl.h defines __XEN_TOOLS__ and therefore
the public headers will start to expose the non-stable ABI. However,
xen.h has already been included by a mini-OS header beforehand. So
there is a mismatch in the way the headers are included.

For now solve it in a very simple (and gross) way by including
xenctrl.h before the mini-os headers.

Fixes: 0dbb4be739c5 ("tools/libs/foreignmemory: Fix PAGE_SIZE redefinition error")
Signed-off-by: Julien Grall <jgrall@amazon.com>
Acked-by: Andrew Cooper <andrew.cooper3@citrix.com>
Jan Beulich [Tue, 13 Jul 2021 08:17:33 +0000 (10:17 +0200)]
CHANGELOG: record changed PCI device quarantining default

This amends commit 980d6acf1517 ("IOMMU: make DMA containment of
quarantined devices optional").

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Paul Durrant <paul@xen.org>
Jan Beulich [Tue, 13 Jul 2021 08:16:18 +0000 (10:16 +0200)]
IOMMU: correct parsing of "quarantine=scratch-page"

During the multiple renames of the sub-option I apparently forgot to
update the left side of the &&, and this pretty consistently.

Fixes: 980d6acf1517 ("IOMMU: make DMA containment of quarantined devices optional")
Reported-by: Andrew Cooper <andrew.cooper3@citrix.com>
Signed-off-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Paul Durrant <paul@xen.org>
Andrew Cooper [Tue, 15 Jun 2021 15:02:29 +0000 (16:02 +0100)]
tests/xenstore: Rework Makefile

In particular, fill in the install/uninstall rules so this test can be
packaged to be automated sensibly.

This causes the code to be noticed by CI, which objects as follows:

  test-xenstore.c: In function 'main':
  test-xenstore.c:486:5: error: ignoring return value of 'asprintf', declared
  with attribute warn_unused_result [-Werror=unused-result]
       asprintf(&path, "%s/%u", TEST_PATH, getpid());
       ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Address the CI failure by checking the asprintf() return value and exiting.

Rename xs-test to test-xenstore to be consistent with other tests.  Honour
APPEND_FLAGS too.

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
Andrew Cooper [Tue, 15 Jun 2021 14:37:49 +0000 (15:37 +0100)]
tests/cpu-policy: Rework Makefile

In particular, fill in the install/uninstall rules so this test can be
packaged to be automated sensibly.

Rework TARGET-y to be TARGETS, drop redundant -f's for $(RM), drop the
unconditional -O3 and use the default instead, and drop CFLAGS from the link
line but honour APPEND_LDFLAGS.

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
Andrew Cooper [Tue, 15 Jun 2021 14:22:11 +0000 (15:22 +0100)]
tests/resource: Rework Makefile

In particular, fill in the install/uninstall rules so this test can be
packaged to be automated sensibly.

Make all object files depend on the Makefile, drop redundant -f's for $(RM),
and use $(TARGET) when appropriate.

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
Andrew Cooper [Tue, 15 Jun 2021 13:19:15 +0000 (14:19 +0100)]
tools/tests: Drop obsolete mce-test infrastructure

mce-test has a test suite, but it depends on xend, needs to run in-tree, and
requires manual setup of at least one guest, and manual parameters to pass
into cases.  Drop the test infrastructure.

Move the one useful remaining item, xen-mceinj, into misc/, fixing some minor
style issues as it goes.

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Acked-by: Jan Beulich <jbeulich@suse.com>
Tamas K Lengyel [Fri, 7 May 2021 15:28:36 +0000 (11:28 -0400)]
tools/misc/xen-vmtrace: handle more signals and install by default

Signed-off-by: Tamas K Lengyel <tamas@tklengyel.com>
Acked-by: Andrew Cooper <andrew.cooper3@citrix.com>
Olaf Hering [Fri, 9 Jul 2021 14:32:48 +0000 (16:32 +0200)]
automation: provide pciutils in opensuse packages

qemu-xen-traditional may make use of pciutils-devel, for PCI passthrough.

Signed-off-by: Olaf Hering <olaf@aepfle.de>
Acked-by: Andrew Cooper <andrew.cooper3@citrix.com>
Olaf Hering [Fri, 9 Jul 2021 14:32:47 +0000 (16:32 +0200)]
automation: provide SDL and SDL2 in opensuse images

qemu-xen-traditional may make use of SDL, qemu-xen may make use of SDL2.
Use pkgconfig() as resolvable instead of a rpm name, the latter may change.

Signed-off-by: Olaf Hering <olaf@aepfle.de>
Acked-by: Andrew Cooper <andrew.cooper3@citrix.com>
Olaf Hering [Fri, 9 Jul 2021 14:06:53 +0000 (16:06 +0200)]
automation: add meson and ninja to tumbleweed container

qemu uses meson for configuration, and requires ninja for building.

Signed-off-by: Olaf Hering <olaf@aepfle.de>
Acked-by: Andrew Cooper <andrew.cooper3@citrix.com>
Costin Lupu [Tue, 8 Jun 2021 12:35:29 +0000 (15:35 +0300)]
tools/ocaml: Fix redefinition errors

If PAGE_SIZE is already defined in the system (e.g. in /usr/include/limits.h
header) then gcc will trigger a redefinition error because of -Werror. This
patch replaces usage of PAGE_* macros with XC_PAGE_* macros in order to avoid
confusion between control domain page granularity (PAGE_* definitions) and
guest domain page granularity (which is what we are dealing with here).

The same issue applies to redefinitions of the Val_none and Some_val macros,
which may already be defined in the OCaml system headers (e.g.
/usr/lib/ocaml/caml/mlvalues.h).

Signed-off-by: Costin Lupu <costin.lupu@cs.pub.ro>
Reviewed-by: Julien Grall <jgrall@amazon.com>
Acked-by: Ian Jackson <iwj@xenproject.org>
Tested-by: Dario Faggioli <dfaggioli@suse.com>
Costin Lupu [Tue, 8 Jun 2021 12:35:28 +0000 (15:35 +0300)]
tools/libs/gnttab: Fix PAGE_SIZE redefinition error

If PAGE_SIZE is already defined in the system (e.g. in /usr/include/limits.h
header) then gcc will trigger a redefinition error because of -Werror. This
patch replaces usage of PAGE_* macros with XC_PAGE_* macros in order to avoid
confusion between control domain page granularity (PAGE_* definitions) and
guest domain page granularity.

The exception is in osdep_xenforeignmemory_map() where we need the system page
size to check whether the PFN array should be allocated with mmap() or with
dynamic allocation.

Signed-off-by: Costin Lupu <costin.lupu@cs.pub.ro>
Reviewed-by: Julien Grall <jgrall@amazon.com>
Acked-by: Ian Jackson <iwj@xenproject.org>
Costin Lupu [Tue, 8 Jun 2021 12:35:27 +0000 (15:35 +0300)]
tools/libs/foreignmemory: Fix PAGE_SIZE redefinition error

If PAGE_SIZE is already defined in the system (e.g. in /usr/include/limits.h
header) then gcc will trigger a redefinition error because of -Werror. This
patch replaces usage of PAGE_* macros with XC_PAGE_* macros in order to avoid
confusion between control domain page granularity (PAGE_* definitions) and
guest domain page granularity.

The exception is in osdep_xenforeignmemory_map() where we need the system page
size to check whether the PFN array should be allocated with mmap() or with
dynamic allocation.

Signed-off-by: Costin Lupu <costin.lupu@cs.pub.ro>
Reviewed-by: Julien Grall <jgrall@amazon.com>
Acked-by: Ian Jackson <iwj@xenproject.org>
Costin Lupu [Tue, 8 Jun 2021 12:35:26 +0000 (15:35 +0300)]
tools/libfsimage: Fix PATH_MAX redefinition error

If PATH_MAX is already defined in the system (e.g. in /usr/include/limits.h
header) then gcc will trigger a redefinition error because of -Werror.

Signed-off-by: Costin Lupu <costin.lupu@cs.pub.ro>
Reviewed-by: Julien Grall <jgrall@amazon.com>
Acked-by: Ian Jackson <iwj@xenproject.org>
Costin Lupu [Tue, 8 Jun 2021 12:35:25 +0000 (15:35 +0300)]
tools/debugger: Fix PAGE_SIZE redefinition error

If PAGE_SIZE is already defined in the system (e.g. in /usr/include/limits.h
header) then gcc will trigger a redefinition error because of -Werror. This
patch replaces usage of PAGE_* macros with KDD_PAGE_* macros in order to avoid
confusion between control domain page granularity (PAGE_* definitions) and
guest domain page granularity (which is what we are dealing with here).

We chose to define the KDD_PAGE_* macros instead of using XC_PAGE_* macros
because (1) the code in kdd.c should not include any Xen headers and (2) to add
consistency for code in both kdd.c and kdd-xen.c.

Signed-off-by: Costin Lupu <costin.lupu@cs.pub.ro>
Reviewed-by: Tim Deegan <tim@xen.org>
Acked-by: Ian Jackson <iwj@xenproject.org>
Olaf Hering [Thu, 8 Jul 2021 14:56:28 +0000 (16:56 +0200)]
automation: document how to refresh a container

The Tumbleweed container should be updated often.
Describe the necessary steps to refresh and test it before
pushing the new image to gitlab.

Signed-off-by: Olaf Hering <olaf@aepfle.de>
Acked-by: Andrew Cooper <andrew.cooper3@citrix.com>
Olaf Hering [Thu, 8 Jul 2021 14:56:49 +0000 (16:56 +0200)]
automation: avoid globbering the docker run args

containerize bash -c './configure && make' fails due to shell expansion.

Collect all arguments for the script and pass them verbatim to the
docker run command.
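
The difference between passing arguments verbatim and re-expanding them can
be modelled as below. This is only an illustrative Python sketch: the real
containerize script is a shell script, and build_docker_cmd and
"example-image" are hypothetical names that do not come from the patch.

```python
def build_docker_cmd(image, script_args):
    """Return an argv list for `docker run`, keeping each argument intact."""
    # One list element per argument: nothing is joined into a single string,
    # so nothing can be split apart again later.
    return ["docker", "run", "--rm", image] + list(script_args)


args = ["bash", "-c", "./configure && make"]
cmd = build_docker_cmd("example-image", args)

# Joining everything into one string and splitting on whitespace breaks the
# quoted command apart, which models what unquoted shell expansion does.
naive = " ".join(["docker", "run", "--rm", "example-image"] + args).split()

print(cmd)
print(naive)
```

In the first list the compound command stays a single argument; in the
second it is split into "./configure", "&&" and "make".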

Signed-off-by: Olaf Hering <olaf@aepfle.de>
Acked-by: Andrew Cooper <andrew.cooper3@citrix.com>
Olaf Hering [Thu, 8 Jul 2021 13:57:04 +0000 (15:57 +0200)]
automation: use zypper dup in tumbleweed dockerfile

The 'dup' command aligns the installed packages with the packages
found in the enabled repositories, taking the repository priorities
into account. Using this command is generally a safe thing to do.

In the context of Tumbleweed using 'dup' is essential, because package
versions might be downgraded, and package names occasionally change.
Only 'dup' will do the correct thing in such cases.

Signed-off-by: Olaf Hering <olaf@aepfle.de>
Acked-by: Andrew Cooper <andrew.cooper3@citrix.com>
Andrew Cooper [Thu, 8 Jul 2021 11:52:14 +0000 (12:52 +0100)]
cirrus-ci: Fix FreeBSD build following QEMU update

QEMU requires ninja and bash to build now.

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Acked-by: Roger Pau Monné <roger.pau@citrix.com>
Richard Kojedzinszky [Fri, 9 Jul 2021 08:06:45 +0000 (10:06 +0200)]
tools/libxenstat: fix populating vbd.rd_sect

Fixes: 91c3e3dc91d6 ("tools/xentop: Display '-' when stats are not available.")
Signed-off-by: Richard Kojedzinszky <richard@kojedz.in>
Acked-by: Andrew Cooper <andrew.cooper3@citrix.com>
Olaf Hering [Wed, 16 Jun 2021 13:14:35 +0000 (15:14 +0200)]
tools: ipxe: update for fixing build with GCC11

Use a snapshot which includes commit
f3f568e382a5f19824b3bfc6081cde39eee661e8 ("[crypto] Add
memory output constraints for big-integer inline assembly"),
which fixes build with gcc11.

Signed-off-by: Olaf Hering <olaf@aepfle.de>
Acked-by: Andrew Cooper <andrew.cooper3@citrix.com>
Jan Beulich [Fri, 9 Jul 2021 06:32:07 +0000 (08:32 +0200)]
x86: mark hypercall argument regs clobbering for intended fall-through

The CIDs below are all for the PV side of things, yet while at it take
care of the HVM side as well.

Coverity-ID: 1485896, 1485901, 1485906, 1485910, 1485911
Signed-off-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Andrew Cooper <andrew.cooper3@citrix.com>
Jan Beulich [Fri, 9 Jul 2021 06:31:28 +0000 (08:31 +0200)]
x86emul: pad blob-execution "okay" messages

We already do so in the native execution case, and a few descriptions (I
did notice this with SHA ones) are short enough for the output to look
slightly odd.

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Jan Beulich [Fri, 9 Jul 2021 06:30:35 +0000 (08:30 +0200)]
x86/AMD: drop MSR_K7_HWCR

We don't support any K7 (32-bit only) hardware anymore, and the MSR is
accessible as MSR_K8_HWCR as well. Using the K7 name was particularly
odd for Hygon as well as in a Fam0F-specific piece of code.

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Jan Beulich [Fri, 9 Jul 2021 06:28:14 +0000 (08:28 +0200)]
x86/AMD: expose SYSCFG, TOM, TOM2, and IORRs to Dom0

Sufficiently old Linux (3.12-ish) accesses these MSRs (with the
exception of IORRs) in an unguarded manner. Furthermore these same MSRs,
at least on Fam11 and older CPUs, are also consulted by modern Linux,
and their (bogus) built-in zapping of #GP faults from MSR accesses leads
to it effectively reading zero instead of the intended values, which are
relevant for PCI BAR placement (which ought to all live in MMIO-type
space, not in DRAM-type one).

For SYSCFG, only certain bits get exposed. Since MtrrVarDramEn also
covers the IORRs, expose them as well. Introduce (consistently named)
constants for the bits we're interested in and use them in pre-existing
code as well. While there also drop the unused and somewhat questionable
K8_MTRR_RDMEM_WRMEM_MASK. To complete the set of memory type and DRAM vs
MMIO controlling MSRs, also expose TSEG_{BASE,MASK} (the former also
gets read by Linux, dealing with which was already the subject of
6eef0a99262c ["x86/PV: conditionally avoid raising #GP for early guest
MSR reads"]).

As a welcome side effect, verbosity on/of debug builds gets (perhaps
significantly) reduced.

Note that at least as far as those MSR accesses by Linux are concerned,
there's no similar issue for DomU-s, as the accesses sit behind PCI
device matching logic. The checked for devices would never be exposed to
DomU-s in the first place. Nevertheless I think that at least for HVM we
should return sensible values, not 0 (as svm_msr_read_intercept() does
right now). The intended values may, however, need to be determined by
hvmloader, and then get made known to Xen.

Fixes: 322ec7c89f66 ("x86/pv: disallow access to unknown MSRs")
Reported-by: Olaf Hering <olaf@aepfle.de>
Signed-off-by: Jan Beulich <jbeulich@suse.com>
Daniel P. Smith [Fri, 9 Jul 2021 06:26:42 +0000 (08:26 +0200)]
docs/designs/launch: Hyperlaunch device tree

Adds a design document for Hyperlaunch device tree structure.

Signed-off-by: Christopher Clark <christopher.clark@starlab.io>
Signed-off-by: Daniel P. Smith <dpsmith@apertussolutions.com>

Daniel P. Smith [Fri, 9 Jul 2021 06:19:47 +0000 (08:19 +0200)]
docs/designs/launch: Hyperlaunch design document

Adds a design document for Hyperlaunch, formerly DomB mode of dom0less.

Signed-off-by: Christopher Clark <christopher.clark@starlab.io>
Signed-off-by: Daniel P. Smith <dpsmith@apertussolutions.com>
Reviewed-by: Rich Persaud <rp@stacktrust.org>
Olaf Hering [Thu, 8 Jul 2021 06:54:35 +0000 (08:54 +0200)]
automation: collect log files in subdirectories

The current single *.log pattern collects just config.log, which
usually contains little useful information.
Collect also log files in subdirectories, tools/config.log usually
contains information about configure failures.

Signed-off-by: Olaf Hering <olaf@aepfle.de>
Acked-by: Andrew Cooper <andrew.cooper3@citrix.com>
Olaf Hering [Thu, 8 Jul 2021 06:29:22 +0000 (08:29 +0200)]
automation: dump contents of /etc/os-release

To aid debugging build failures, dump /etc/os-release during build.
This helps with rolling releases such as Tumbleweed to understand the
state of the build container.

Signed-off-by: Olaf Hering <olaf@aepfle.de>
Acked-by: Andrew Cooper <andrew.cooper3@citrix.com>
Anthony PERARD [Wed, 7 Jul 2021 16:40:01 +0000 (17:40 +0100)]
automation: Check if ninja is available before building QEMU

ninja is now required to build the latest version of QEMU, and not all
distros have a suitable version.  Skip the QEMU build when ninja is not
available.

Signed-off-by: Anthony PERARD <anthony.perard@citrix.com>
Acked-by: Andrew Cooper <andrew.cooper3@citrix.com>
Anthony PERARD [Wed, 7 Jul 2021 16:40:00 +0000 (17:40 +0100)]
automation: Adding ninja-build to some docker images

This is to allow building the latest version of QEMU.

fedora/29:
    In addition to adding "ninja", I've had to make some other
    changes: some `go build` failed with `mkdir /.cache` no
    permission, so I've created a user.
    (this was discovered while testing the new container with the
    script containerize.)

Signed-off-by: Anthony PERARD <anthony.perard@citrix.com>
Acked-by: Andrew Cooper <andrew.cooper3@citrix.com>
Anthony PERARD [Wed, 7 Jul 2021 15:51:49 +0000 (17:51 +0200)]
build,tools: have default rules depends on symbols

No need to call $(MAKE) again.

Signed-off-by: Anthony PERARD <anthony.perard@citrix.com>
Acked-by: Jan Beulich <jbeulich@suse.com>
Anthony PERARD [Wed, 7 Jul 2021 15:51:34 +0000 (17:51 +0200)]
build: use $(kconfig) shortcut in clean rule

Signed-off-by: Anthony PERARD <anthony.perard@citrix.com>
Acked-by: Jan Beulich <jbeulich@suse.com>
Anthony PERARD [Wed, 7 Jul 2021 15:51:18 +0000 (17:51 +0200)]
build: clean "lib.a"

Signed-off-by: Anthony PERARD <anthony.perard@citrix.com>
Acked-by: Jan Beulich <jbeulich@suse.com>
Rahul Singh [Tue, 6 Jul 2021 10:53:59 +0000 (11:53 +0100)]
xen/arm: smmuv1: Switch from kzalloc_array(..) to devm_kcalloc(..)

Switch from kzalloc_array(..) to devm_kcalloc(..) when allocating the
SMR to make code coherent.

Signed-off-by: Rahul Singh <rahul.singh@arm.com>
Reviewed-by: Bertrand Marquis <bertrand.marquis@arm.com>
Acked-by: Julien Grall <jgrall@amazon.com>
Michal Orzel [Tue, 6 Jul 2021 10:28:53 +0000 (12:28 +0200)]
arm: Fix arch_initialise_vcpu to be unsupported

Function arch_initialise_vcpu is not reachable as the
VCPUOP_initialise is an unsupported operation on arm.
Modify the function by adding ASSERT_UNREACHABLE() and
returning -EOPNOTSUPP.

Suggested-by: Jan Beulich <jbeulich@suse.com>
Signed-off-by: Michal Orzel <michal.orzel@arm.com>
Reviewed-by: Bertrand Marquis <bertrand.marquis@arm.com>
Acked-by: Julien Grall <jgrall@amazon.com>
Bertrand Marquis [Tue, 6 Jul 2021 15:28:57 +0000 (16:28 +0100)]
tools: Fix CPSR/SPSR print size

918b8842a852 changed CPSR and SPSR to be stored as 64bit values.

This is fixing the print size in some tools to use 64bit type.

Fixes: 918b8842a852 ("arm64: Change type of hsr, cpsr, spsr_el1 to uint64_t")
Signed-off-by: Bertrand Marquis <bertrand.marquis@arm.com>
Reviewed-by: Michal Orzel <michal.orzel@arm.com>
Tested-by: Michal Orzel <michal.orzel@arm.com>
Julien Grall [Tue, 6 Jul 2021 13:20:00 +0000 (14:20 +0100)]
tools/xen-foreign: Update the size for vcpu_guest_{core_regs, context}

Commit 918b8842a852 ("arm64: Change type of hsr, cpsr, spsr_el1 to
uint64_t") updated the size of the structure vcpu_guest_core_regs and
indirectly vcpu_guest_context.

On Arm, the two structures are only accessible to the tools and the
hypervisor (and therefore not stable). However, they are still checked,
because the scripts in tools/include/xen-foreign are not able to
understand that.

Ideally we should rework the scripts so we don't have to update
the size for non-stable structures. But I have limited time
to spend on the issue. So choose the simple solution and update
the size accordingly.

Note that we need to keep vcpu_guest_core_regs around because
the structure is used by vcpu_guest_context and therefore the
scripts expect the generated header to contain it.

Fixes: 918b8842a852 ("arm64: Change type of hsr, cpsr, spsr_el1 to uint64_t")
Reported-by: Andrew Cooper <andrew.cooper3@citrix.com>
Signed-off-by: Julien Grall <jgrall@amazon.com>
Reviewed-by: Bertrand Marquis <bertrand.marquis@arm.com>
Reviewed-by: Michal Orzel <michal.orzel@arm.com>
Tested-by: Michal Orzel <michal.orzel@arm.com>
Jan Beulich [Wed, 7 Jul 2021 10:35:54 +0000 (12:35 +0200)]
x86/mem-sharing: mov {get,put}_two_gfns()

There's no reason for every CU including p2m.h to have these two
functions compiled, when they're both mem-sharing specific right now and
for the foreseeable future.

Largely just code movement, with some style tweaks, the inline-s
dropped, and "put" being made consistent with "get" as to their NULL
checking of the passed in pointer to struct two_gfns.

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Tamas K Lengyel <tamas@tklengyel.com>
Jan Beulich [Wed, 7 Jul 2021 10:35:12 +0000 (12:35 +0200)]
x86/mem-sharing: ensure consistent lock order in get_two_gfns()

While the comment validly says "Sort by domain, if same domain by gfn",
the implementation also included equal domain IDs in the first part of
the check, thus rendering the second part entirely dead and leaving
deadlock potential when there's only a single domain involved.
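
The effect can be illustrated with a small Python model. It is only a
sketch: the actual C expressions are not quoted in this message, and the
function names and the (domain ID, gfn) tuples are hypothetical.

```python
def lock_order_buggy(a, b):
    """Pick the locking order for two (domain_id, gfn) pairs (buggy variant)."""
    (dom_a, gfn_a), (dom_b, gfn_b) = a, b
    # "<=" already matches equal domain IDs, so the gfn comparison after
    # "or" can never change the outcome: it is dead code.
    if dom_a <= dom_b or (dom_a == dom_b and gfn_a <= gfn_b):
        return a, b
    return b, a


def lock_order_fixed(a, b):
    """Sort by domain, and by gfn if the domain is the same."""
    (dom_a, gfn_a), (dom_b, gfn_b) = a, b
    if dom_a < dom_b or (dom_a == dom_b and gfn_a <= gfn_b):
        return a, b
    return b, a


x, y = (1, 5), (1, 9)  # same domain, different gfns

# The buggy variant keeps the caller's order, so two callers passing the
# pairs in opposite order would take the locks in opposite order.
print(lock_order_buggy(x, y), lock_order_buggy(y, x))

# The fixed variant yields the same order regardless of argument order.
print(lock_order_fixed(x, y), lock_order_fixed(y, x))
```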

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Tamas K Lengyel <tamas@tklengyel.com>
Jan Beulich [Wed, 7 Jul 2021 10:32:45 +0000 (12:32 +0200)]
IOMMU: make DMA containment of quarantined devices optional

Containing still in flight DMA was introduced to work around certain
devices / systems hanging hard upon hitting a "not-present" IOMMU fault.
Passing through (such) devices (on such systems) is inherently insecure
(as guests could easily arrange for IOMMU faults of any kind to occur).
Defaulting to a mode where admins may not even become aware of issues
with devices can be considered undesirable. Therefore convert this mode
of operation to an optional one, not one enabled by default.

This involves resurrecting code commit ea38867831da ("x86 / iommu: set
up a scratch page in the quarantine domain") did remove, in a slightly
extended and abstracted fashion. Here, instead of reintroducing a pretty
pointless use of "goto" in domain_context_unmap(), and instead of making
the function (at least temporarily) inconsistent, take the opportunity
and replace the other similarly pointless "goto" as well.

In order to key the re-instated bypasses off of there (not) being a root
page table this further requires moving the allocate_domain_resources()
invocation from reassign_device() to amd_iommu_setup_domain_device() (or
else reassign_device() would allocate a root page table anyway); this is
benign to the second caller of the latter function.

In VT-d's domain_context_unmap(), instead of adding yet another
"goto out" when all that's wanted is a "return", eliminate the "out"
label at the same time.

Take the opportunity and also limit the control to builds supporting
PCI.

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Paul Durrant <paul@xen.org>
Reviewed-by: Kevin Tian <kevin.tian@intel.com>
Olaf Hering [Thu, 1 Jul 2021 09:56:08 +0000 (11:56 +0200)]
tools/migration: unify type checking for data pfns in migration stream

Introduce a helper which decides if a given pfn type has data
in the migration stream.

No change in behaviour intended, except for invalid page types which now
have a safer default.

Signed-off-by: Olaf Hering <olaf@aepfle.de>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
Reviewed-by: Juergen Gross <jgross@suse.com>
Olaf Hering [Thu, 1 Jul 2021 09:56:07 +0000 (11:56 +0200)]
tools/migration: unify type checking for data pfns in the VM

Introduce a helper which decides if a given pfn in the migration
stream is backed by memory.

This highlights more clearly that type XEN_DOMCTL_PFINFO_XALLOC (a
synthetic toolstack-only type used between Xen 4.2 and 4.5 which
indicated a dirty page on the sending side for which no data will be
sent in the initial iteration) does get populated in the VM.

No change in behaviour intended, except for invalid page types which now
have a safer default.

Signed-off-by: Olaf Hering <olaf@aepfle.de>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
Olaf Hering [Thu, 1 Jul 2021 09:56:05 +0000 (11:56 +0200)]
tools/migration: unify known page type checking

Users of xc_get_pfn_type_batch may want to sanity check the data
returned by Xen. Add helpers for this purpose:

is_known_page_type verifies the type returned by Xen on the saving
side, or the incoming type for a page on the restoring side, is known
by the save/restore code.

Signed-off-by: Olaf Hering <olaf@aepfle.de>
Reviewed-by: Juergen Gross <jgross@suse.com>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
Olaf Hering [Thu, 1 Jul 2021 09:56:01 +0000 (11:56 +0200)]
tools/python: fix Python3.4 TypeError in format string

Using the first element of a tuple for a format specifier fails with
python3.4 as included in SLE12:
    b = b"string/%x" % (i, )
TypeError: unsupported operand type(s) for %: 'bytes' and 'tuple'

It happens to work with python 2.7 and 3.6.
To support older Py3, format as strings and explicitly encode as ASCII.
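
A minimal sketch of that approach follows; make_path is a hypothetical
helper name used for illustration, not one taken from the patch.

```python
def make_path(i):
    """Build the bytes b"string/<hex>" without bytes %-formatting.

    %-formatting on bytes objects is missing from Python 3 releases
    before 3.5, so format a text string first and encode it explicitly.
    """
    return ("string/%x" % (i, )).encode("ascii")


print(make_path(255))
```

Formatting a text string and encoding it afterwards behaves the same on
Python 2.7 and on every Python 3 release, since str %-formatting is
available everywhere.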

Signed-off-by: Olaf Hering <olaf@aepfle.de>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
Acked-by: Marek Marczykowski-Górecki <marmarek@invisiblethingslab.com>
Olaf Hering [Thu, 1 Jul 2021 09:56:00 +0000 (11:56 +0200)]
tools/python: handle libxl__physmap_info.name properly in convert-legacy-stream

The trailing member name[] in libxl__physmap_info is written as a
cstring into the stream. The current code does a sanity check if the
last byte is zero. This attempt fails with python3 because name[-1]
returns a type int. As a result the comparison with byte(\00) fails:

  File "/usr/lib/xen/bin/convert-legacy-stream", line 347, in read_libxl_toolstack
    raise StreamError("physmap name not NUL terminated")
  StreamError: physmap name not NUL terminated

To handle both python variants, cast to bytearray().
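
The check can be sketched as below; is_nul_terminated is a hypothetical
helper name, and the real script performs the check inline rather than
necessarily in this form.

```python
def is_nul_terminated(name):
    """Return True if the byte string `name` ends with a NUL byte.

    Indexing bytes yields a one-character str on Python 2 but an int on
    Python 3. Indexing a bytearray yields an int on both, so a single
    comparison against 0 works everywhere.
    """
    return len(name) > 0 and bytearray(name)[-1] == 0


print(is_nul_terminated(b"vga\x00"))
print(is_nul_terminated(b"vga"))
```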

Signed-off-by: Olaf Hering <olaf@aepfle.de>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
Acked-by: Marek Marczykowski-Górecki <marmarek@invisiblethingslab.com>
Michal Orzel [Mon, 5 Jul 2021 06:39:52 +0000 (08:39 +0200)]
arm64: Change type of hsr, cpsr, spsr_el1 to uint64_t

AArch64 registers are 64bit whereas AArch32 registers
are 32bit or 64bit. MSR/MRS are expecting 64bit values thus
we should get rid of helpers READ/WRITE_SYSREG32
in favour of using READ/WRITE_SYSREG.
We should also use register_t type when reading sysregs
which can correspond to uint64_t or uint32_t.
Even though many AArch64 registers have upper 32bit reserved
it does not mean that they can't be widened in the future.

Modify type of hsr, cpsr, spsr_el1 to uint64_t.
Previously we relied on the padding after spsr_el1.
As we removed the padding, modify the union to be 64bit so we don't corrupt spsr_fiq.
No need to modify the assembly code because the accesses were based on 64bit
registers as there was a 32bit padding after spsr_el1.

Remove 32bit padding in cpu_user_regs before spsr_fiq
as it is no longer needed due to upper union being 64bit now.
Add 64bit padding in cpu_user_regs before spsr_el1
because the kernel frame should be 16-byte aligned.

Change type of cpsr to uint64_t in the public outside interface
"public/arch-arm.h" to allow ABI compatibility between 32bit and 64bit.
Increment XEN_DOMCTL_INTERFACE_VERSION.

Change type of cpsr to uint64_t in the public outside interface
"public/vm_event.h" to allow ABI compatibility between 32bit and 64bit.

Signed-off-by: Michal Orzel <michal.orzel@arm.com>
Reviewed-by: Bertrand Marquis <bertrand.marquis@arm.com>
Reviewed-by: Julien Grall <jgrall@amazon.com>
Oleksandr Tyshchenko [Mon, 5 Jul 2021 17:48:51 +0000 (20:48 +0300)]
xen/arm: bootfdt: Always sort memory banks

At the moment, Xen on Arm64 expects the memory banks to be ordered.
Unfortunately, there may be a case where a device tree updated by
firmware contains unordered banks. This means Xen will panic
when setting xenheap mappings for the subsequent bank with start
address being less than xenheap_mfn_start (start address of
the first bank).

As there is no clear requirement regarding ordering in the device
tree, update the code to be able to deal with this by sorting memory
banks. There is only one heap region on Arm32, so the sorting
is fine to be done in the common code.
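
A minimal sketch of the idea, written in Python rather than Xen's C and
using a hypothetical MemBank type (not a Xen structure): once the banks
are sorted by start address, no later bank can start below the first one.

```python
from dataclasses import dataclass


@dataclass
class MemBank:
    """Hypothetical stand-in for one memory bank read from the device tree."""
    start: int
    size: int


def sort_banks(banks):
    """Return the banks ordered by ascending start address."""
    return sorted(banks, key=lambda bank: bank.start)


# Firmware handed us the banks in descending order (illustrative addresses).
banks = [
    MemBank(start=0x8_8000_0000, size=0x4000_0000),
    MemBank(start=0x4000_0000, size=0x8000_0000),
]
ordered = sort_banks(banks)
first_start = ordered[0].start
```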

Suggested-by: Julien Grall <jgrall@amazon.com>
Signed-off-by: Oleksandr Tyshchenko <oleksandr_tyshchenko@epam.com>
Reviewed-by: Stefano Stabellini <sstabellini@kernel.org>
Reviewed-by: Bertrand Marquis <bertrand.marquis@arm.com>
Acked-by: Julien Grall <jgrall@amazon.com>
4 years agotools/xenstored: Stash the correct request in lu_status->in
Julien Grall [Thu, 1 Jul 2021 14:03:07 +0000 (15:03 +0100)]
tools/xenstored: Stash the correct request in lu_status->in

When Live-Updating with some load, Xenstored may hit the assert
req->in == lu_status->in in do_lu_start().

This is happening because the request is stashed when Live-Update
begins. This happens in a different request (see the call to lu_begin()
when selecting the new binary) from the one performing Live-Update.

To avoid the problem, stash the request in lu_start().

Fixes: 65f19ed62aa1 ("tools/xenstore: Don't assume conn->in points to the LU request")
Reported-by: Michael Kurth <mku@amazon.com>
Signed-off-by: Julien Grall <jgrall@amazon.com>
Reviewed-by: luca.fancellu@arm.com
Reviewed-by: Juergen Gross <jgross@suse.com>
4 years agolibxl/arm: provide guests with random seed
Sergiy Kibrik [Tue, 6 Jul 2021 06:33:45 +0000 (06:33 +0000)]
libxl/arm: provide guests with random seed

Pass 128 bytes of random seed via FDT, so that guests' CRNGs are better seeded
early at boot. This is larger than ChaCha20 key size of 32, so each byte of
CRNG state will be mixed 4 times using this seed. There does not seem to be
advantage in larger seed though.

Depending on its configuration Linux can use the seed as device randomness
or to just quickly initialize CRNG.
In either case this will provide extra randomness to further harden CRNG.

Signed-off-by: Sergiy Kibrik <Sergiy_Kibrik@epam.com>
Reviewed-by: Julien Grall <julien@xen.org>
Reviewed-by: Michal Orzel <michal.orzel@arm.com>
4 years agoMAINTAINERS: Updating after change to tools/include/
Anthony PERARD [Mon, 5 Jul 2021 14:48:06 +0000 (16:48 +0200)]
MAINTAINERS: Updating after change to tools/include/

The LIBS section doesn't mention the headers associated with the
libraries, same for LIBXENLIGHT section.

There aren't any ':' in other section names, so remove it.

Fixes: 4664034cdc72 ("tools/libs: move official headers to common directory")
Fixes: f7079d7ef69f ("MAINTAINERS: add myself as tools/libs reviewer")
Signed-off-by: Anthony PERARD <anthony.perard@citrix.com>
Reviewed-by: Juergen Gross <jgross@suse.com>
4 years agobuild: fix %.s: %.S rule
Anthony PERARD [Mon, 5 Jul 2021 14:47:51 +0000 (16:47 +0200)]
build: fix %.s: %.S rule

Fixes: e321576f4047 ("xen/build: start using if_changed")
Signed-off-by: Anthony PERARD <anthony.perard@citrix.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
4 years agox86/shadow: drop callback_mask pseudo-variables
Jan Beulich [Mon, 5 Jul 2021 14:46:46 +0000 (16:46 +0200)]
x86/shadow: drop callback_mask pseudo-variables

In commit 90629587e16e ("x86/shadow: replace stale literal numbers in
hash_{vcpu,domain}_foreach()") I had to work around Clang not following
gcc in certain relaxed requirements as to the expressions usable with
_Static_assert() (gcc tolerates static const variables in otherwise
integer constant expressions). Roberto suggests that we'd better not
rely on such behavior. Drop the involved static const-s, using their
"expansions" in both of the prior use sites each. This then allows
dropping the short-circuiting of the check for clang.

Requested-by: Roberto Bagnara <roberto.bagnara@bugseng.com>
Signed-off-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Andrew Cooper <andrew.cooper3@citrix.com>
Acked-by: Tim Deegan <tim@xen.org>
4 years agotools/libxenguest: Fix migration's debug option
Andrew Cooper [Fri, 2 Jul 2021 18:08:46 +0000 (19:08 +0100)]
tools/libxenguest: Fix migration's debug option

The code has gone through many refactors, but the first refactor was the one
which broke it by inverting the check with respect to checkpointed streams.

Fixes: 7449fb36c6c8 ("migration/save: pass checkpointed_stream from libxl to libxc")
Reported-by: Olaf Hering <olaf@aepfle.de>
Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Acked-by: Olaf Hering <olaf@aepfle.de>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
4 years agotools/libxenguest: Fix max_extd_leaf calculation for legacy restore
Andrew Cooper [Fri, 2 Jul 2021 17:37:57 +0000 (18:37 +0100)]
tools/libxenguest: Fix max_extd_leaf calculation for legacy restore

0x1c is lower than any value which will actually be observed in
p->extd.max_leaf, but higher than the logical 9 leaves worth of extended data
on Intel systems, causing x86_cpuid_copy_to_buffer() to fail with -ENOBUFS.

Correct the calculation.

The problem was first noticed in c/s 34990446ca9 "libxl: don't ignore the
return value from xc_cpuid_apply_policy" but introduced earlier.

Fixes: 111c8c33a8a1 ("x86/cpuid: do not expand max leaves on restore")
Reported-by: Olaf Hering <olaf@aepfle.de>
Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
4 years agotools: use integer division in convert-legacy-stream
Olaf Hering [Thu, 1 Jul 2021 09:55:59 +0000 (11:55 +0200)]
tools: use integer division in convert-legacy-stream

A single slash gives a float, a double slash gives an int.

    bitmap = unpack_exact("Q" * ((max_id/64) + 1))
TypeError: can't multiply sequence by non-int of type 'float'

Use future division to remain compatible with python 2.
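
A small sketch of the difference, assuming Python 3 and an illustrative
max_id value. On Python 2 the same "/" behaviour would only appear after
"from __future__ import division", which is omitted here so the snippet
can be pasted anywhere.

```python
max_id = 130  # illustrative value, not taken from a real stream

# Floor division always yields an int, so it can repeat a sequence.
words = (max_id // 64) + 1
fmt = "Q" * words

# True division yields a float, which cannot repeat a sequence.
try:
    "Q" * ((max_id / 64) + 1)
    float_multiply_worked = True
except TypeError:
    float_multiply_worked = False
```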

Signed-off-by: Olaf Hering <olaf@aepfle.de>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
4 years agotools: fix comment typo in libxl__cpuid_legacy
Olaf Hering [Thu, 1 Jul 2021 10:30:48 +0000 (12:30 +0200)]
tools: fix comment typo in libxl__cpuid_legacy

Replace emualted with emulated.

Signed-off-by: Olaf Hering <olaf@aepfle.de>
Acked-by: Andrew Cooper <andrew.cooper3@citrix.com>
4 years agolibxl: Fix QEMU cmdline for scsi device
Anthony PERARD [Mon, 28 Jun 2021 10:01:57 +0000 (11:01 +0100)]
libxl: Fix QEMU cmdline for scsi device

Usage of 'scsi-disk' device is deprecated and removed from QEMU,
instead we need to use 'scsi-hd' for hard drives.
See QEMU 879be3af49 (hw/scsi: remove 'scsi-disk' device)

Signed-off-by: Anthony PERARD <anthony.perard@citrix.com>
Reviewed-by: Jason Andryuk <jandryuk@gmail.com>
4 years agolibxl: Replace short-form boolean for QEMU's -vnc
Anthony PERARD [Mon, 28 Jun 2021 10:01:56 +0000 (11:01 +0100)]
libxl: Replace short-form boolean for QEMU's -vnc

f3f778c81769 forgot one boolean parameter.

Fixes: f3f778c81769 ("libxl: Replace QEMU's command line short-form boolean option")
Signed-off-by: Anthony PERARD <anthony.perard@citrix.com>
Reviewed-by: Jason Andryuk <jandryuk@gmail.com>
4 years agoConfig.mk: re-pin OVMF changeset and unpin qemu-xen
Anthony PERARD [Mon, 28 Jun 2021 13:42:17 +0000 (14:42 +0100)]
Config.mk: re-pin OVMF changeset and unpin qemu-xen

The qemu-xen tree has an osstest gate and doesn't need to be pinned.

On the other hand, OVMF's xen repository doesn't have a gate and needs
to be pinned. The "master" branch now corresponds to the tag
"edk2-stable202105", so pin to that commit.

Fixes: a04509d34d72 ("Branching: Update version files etc. for newly unstable")
Signed-off-by: Anthony PERARD <anthony.perard@citrix.com>
Acked-by: Ian Jackson <iwj@xenproject.org>
4 years agoAMD/IOMMU: re-work locking around sending of commands
Jan Beulich [Tue, 29 Jun 2021 10:35:12 +0000 (12:35 +0200)]
AMD/IOMMU: re-work locking around sending of commands

It appears unhelpful to me for flush_command_buffer() to block all
progress elsewhere for the given IOMMU by holding its lock while waiting
for command completion. There's no real need for callers of that
function or of send_iommu_command() to hold the lock. Contain all
command sending related locking to the latter function.

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Paul Durrant <paul@xen.org>
4 years agoAMD/IOMMU: redo awaiting of command completion
Jan Beulich [Tue, 29 Jun 2021 10:34:37 +0000 (12:34 +0200)]
AMD/IOMMU: redo awaiting of command completion

The present abuse of the completion interrupt does not only stand in the
way of, down the road, using it for its actual purpose, but also
requires holding the IOMMU lock while waiting for command completion,
limiting parallelism and keeping interrupts off for non-negligible
periods of time. Have the IOMMU do an ordinary memory write instead of
signaling an otherwise disabled interrupt (by just updating a status
register bit).

Since IOMMU_COMP_WAIT_I_FLAG_SHIFT is now unused and
IOMMU_COMP_WAIT_[FS]_FLAG_SHIFT already were, drop all three of them
while at it.

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Paul Durrant <paul@xen.org>
4 years agox86emul: avoid using _PRE_EFLAGS() in a few cases
Jan Beulich [Tue, 29 Jun 2021 10:33:37 +0000 (12:33 +0200)]
x86emul: avoid using _PRE_EFLAGS() in a few cases

The macro expanding to quite a few insns, replace its use by simply
clearing the status flags when the to be executed insn doesn't depend
on their initial state, in cases where this is easily possible. (There
are more cases where the uses are hidden inside macros, and where some
of the users of the macros want guest flags put in place before running
the insn, i.e. the macros can't be updated as easily.)

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
4 years agox86/mm: pull a sanity check earlier in xenmem_add_to_physmap_one()
Jan Beulich [Tue, 29 Jun 2021 09:03:29 +0000 (11:03 +0200)]
x86/mm: pull a sanity check earlier in xenmem_add_to_physmap_one()

We should try to limit the failure reasons after we've started making
changes.

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Andrew Cooper <andrew.cooper3@citrix.com>
4 years agox86/paging: deal with log-dirty stats overflow
Jan Beulich [Tue, 29 Jun 2021 09:02:35 +0000 (11:02 +0200)]
x86/paging: deal with log-dirty stats overflow

While the precise values are unlikely to be of interest once they exceed 4
billion (allowing us to leave alone the domctl struct), we still
shouldn't wrap or truncate the actual values. It is in particular
problematic if the truncated values were zero (causing libxenguest to
skip an iteration altogether) or a very small value (leading to
premature exiting of the pre-copy phase).

Change the internal fields to unsigned long, and suitably saturate for
copying to guest context.
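
A rough illustration of why saturating is preferable to truncating,
assuming a 32-bit field in the interface structure. The helper names are
made up for this sketch and are not Xen functions.

```python
UINT32_MAX = 0xFFFFFFFF


def truncate_u32(value):
    """Model what a plain narrowing store does: keep only the low 32 bits."""
    return value & UINT32_MAX


def saturate_u32(value):
    """Clamp a non-negative counter to the 32-bit range instead of wrapping."""
    return min(value, UINT32_MAX)


# Exactly 2**32 dirty pages: truncation reports zero, saturation does not.
dirty_count = 1 << 32
truncated = truncate_u32(dirty_count)
saturated = saturate_u32(dirty_count)
```

With truncation the consumer would see 0 and could skip an iteration;
with saturation it sees the largest representable value.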

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Andrew Cooper <andrew.cooper3@citrix.com>
4 years agofully replace mfn_to_gmfn()
Jan Beulich [Tue, 29 Jun 2021 09:00:51 +0000 (11:00 +0200)]
fully replace mfn_to_gmfn()

Convert the two remaining uses as well as Arm's stub to the properly
named and type-safe mfn_to_gfn(), dropping x86's definition (where we
already have mfn_to_gfn()).

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Julien Grall <julien@xen.org>
Acked-by: Andrew Cooper <andrew.cooper3@citrix.com>
4 years agoReplace FSF street address with canonical URL (again)
Andrew Cooper [Fri, 25 Jun 2021 13:35:02 +0000 (14:35 +0100)]
Replace FSF street address with canonical URL (again)

As recommended in http://www.gnu.org/licenses/gpl-howto.en.html.

Exactly as per changeset 443701ef0c7ff3 - Some errors have crept back in in
the past 6 years.

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Acked-by: Julien Grall <jgrall@amazon.com>
4 years agoxen/arm: smmuv1: Set privileged attr to 'default'
Rahul Singh [Fri, 25 Jun 2021 16:37:27 +0000 (17:37 +0100)]
xen/arm: smmuv1: Set privileged attr to 'default'

Backport commit e19898077cfb642fe151ba22981e795c74d9e114
"iommu/arm-smmu: Set privileged attribute to 'default' instead of
'unprivileged'"

Original commit message:
    Currently the driver sets all the device transactions privileges
    to UNPRIVILEGED, but there are cases where the iommu masters wants
    to isolate privileged supervisor and unprivileged user.
    So don't override the privileged setting to unprivileged, instead
    set it to default as incoming and let it be controlled by the
    pagetable settings.

Acked-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Sricharan R <sricharan@codeaurora.org>
Signed-off-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Rahul Singh <rahul.singh@arm.com>
Acked-by: Stefano Stabellini <sstabellini@kernel.org>
Tested-by: Stefano Stabellini <sstabellini@kernel.org>
4 years agoxen/arm: smmuv1: Fixed stream matching register allocation
Rahul Singh [Fri, 25 Jun 2021 16:37:26 +0000 (17:37 +0100)]
xen/arm: smmuv1: Fixed stream matching register allocation

SMR allocation should be based on the number of supported stream
matching registers for each SMMU device.

Issue introduced by commit 5e08586afbb90b2e2d56c175c07db77a4afa873c
when backporting the patches from Linux to Xen to fix the stream match
conflict issue when two devices have the same stream-id.

Acked-by: Stefano Stabellini <sstabellini@kernel.org>
Tested-by: Stefano Stabellini <sstabellini@kernel.org>
Signed-off-by: Rahul Singh <rahul.singh@arm.com>
4 years agoIOMMU/PCI: don't let domain cleanup continue when device de-assignment failed
Jan Beulich [Fri, 25 Jun 2021 12:06:55 +0000 (14:06 +0200)]
IOMMU/PCI: don't let domain cleanup continue when device de-assignment failed

Failure here could in principle mean the device may still be issuing DMA
requests, which would continue to be translated by the page tables the
device entry currently points at. With this we cannot allow the
subsequent cleanup step of freeing the page tables to occur, to prevent
use-after-free issues. We would need to accept, for the time being, that
in such a case the remaining domain resources will all be leaked, and
the domain will continue to exist as a zombie.

However, with flushes no longer timing out (and with proper timeout
detection for device I/O TLB flushing yet to be implemented), there's no
way anymore for failures to occur, except due to bugs elsewhere. Hence
the change here is merely a "just in case" one.

In order to continue the loop in spite of an error, we can't use
pci_get_pdev_by_domain() anymore. I have no idea why it was used here in
the first place, instead of the cheaper list iteration.

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Paul Durrant <paul@xen.org>
4 years agolibxencall: Bump SONAME following new functionality
Andrew Cooper [Thu, 24 Jun 2021 17:49:14 +0000 (18:49 +0100)]
libxencall: Bump SONAME following new functionality

Fixes: bef64f2c00 ("libxencall: introduce variant of xencall2() returning long")
Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Reviewed-by: Ian Jackson <iwj@xenproject.org>
4 years agotools/xenstored: Correctly read the requests header from the stream
Julien Grall [Fri, 25 Jun 2021 06:45:22 +0000 (07:45 +0100)]
tools/xenstored: Correctly read the requests header from the stream

Commit c0fe360f42 ("tools/xenstored: Extend restore code to handle
multiple input buffer") extended read_buffered_state() to support
multiple input buffers. Unfortunately, the commit didn't go far
enough and still used sc->data (start of the buffers) for retrieving
the header. This would lead to reading the wrong headers for the second
and follow-up commands.

Use data in place of sc->data for the source of the memcpy()s.
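
The class of bug can be sketched as follows, in Python and with an
illustrative four-field header layout (an assumption for this example,
not the exact stream format): each header has to be read at the current
offset, never at the start of the buffer.

```python
import struct

# Illustrative header: type, req_id, tx_id, len (all 32-bit, little-endian).
HDR = struct.Struct("<IIII")


def pack_request(msg_type, req_id, payload):
    """Build one request: header followed by its payload."""
    return HDR.pack(msg_type, req_id, 0, len(payload)) + payload


def parse_requests(buf):
    """Walk back-to-back requests, advancing the read offset each time."""
    requests = []
    offset = 0
    while offset + HDR.size <= len(buf):
        # Reading at `offset` (not at 0) is the point of the fix.
        msg_type, req_id, _tx_id, length = HDR.unpack_from(buf, offset)
        offset += HDR.size
        requests.append((msg_type, req_id, buf[offset:offset + length]))
        offset += length
    return requests


stream = pack_request(1, 10, b"abc") + pack_request(2, 11, b"de")
requests = parse_requests(stream)
```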

Fixes: c0fe360f42 ("tools/xenstored: Extend restore code to handle multiple input buffer")
Reported-by: Raphael Ning <raphning@amazon.com>
Signed-off-by: Julien Grall <jgrall@amazon.com>
Reviewed-by: Juergen Gross <jgross@suse.com>
4 years agotools/xenstored: Remove redundant check in socket_can_process()
Julien Grall [Thu, 24 Jun 2021 14:55:03 +0000 (15:55 +0100)]
tools/xenstored: Remove redundant check in socket_can_process()

Commit 3adfb50315d9 ("tools/xenstored: Introduce a wrapper for
conn->funcs->can_{read, write}") consolidated the check
!conn->is_ignored in two new wrappers.

This means the check in socket_can_process() is now redundant. In
fact it should have been removed in the original commit (as was done
for the domain helpers).

Reported-by: Raphael Ning <raphning@amazon.com>
Signed-off-by: Julien Grall <jgrall@amazon.com>
Reviewed-by: Juergen Gross <jgross@suse.com>
4 years agolibxc: make xc_domain_maximum_gpfn() endianness-agnostic
Jan Beulich [Thu, 24 Jun 2021 14:40:57 +0000 (16:40 +0200)]
libxc: make xc_domain_maximum_gpfn() endianness-agnostic

libxc generally uses uint32_t to represent domain IDs. This is fine as
long as addresses of such variables aren't taken, to then pass into
hypercalls: To the hypervisor, a domain ID is a 16-bit value. Introduce
a wrapper struct to deal with the issue. (On architectures with
arguments passed in registers, an intermediate variable would have been
created by the compiler already anyway, just one of the wrong type.)
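
The endianness problem can be illustrated with Python's struct module,
using an arbitrary domain ID of 5: a 16-bit read through the address of a
32-bit variable only happens to work on little-endian machines.

```python
import struct

domid = 5  # arbitrary example value

# The bytes of a 32-bit variable as laid out on each kind of machine.
little = struct.pack("<I", domid)
big = struct.pack(">I", domid)

# The callee reads only the first two bytes as a 16-bit value.
seen_on_le = struct.unpack_from("<H", little)[0]
seen_on_be = struct.unpack_from(">H", big)[0]

# Passing a genuinely 16-bit field gives the right answer on both.
fixed_on_be = struct.unpack_from(">H", struct.pack(">H", domid))[0]
```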

The public interface change is both source and binary compatible for
the architectures we currently support.

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Ian Jackson <iwj@xenproject.org>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
4 years agolibxencall: drop bogus mentioning of xencall6()
Jan Beulich [Thu, 24 Jun 2021 14:39:55 +0000 (16:39 +0200)]
libxencall: drop bogus mentioning of xencall6()

There's no xencall6(), so the version script also shouldn't mention it.
If such a function would ever appear, it shouldn't land in version 1.0.

No change to the generated binary, nor abi-dumper's view of the object.

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Ian Jackson <iwj@xenproject.org>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
4 years agolibxc: use multicall for memory-op on Linux (and Solaris)
Jan Beulich [Thu, 24 Jun 2021 14:39:26 +0000 (16:39 +0200)]
libxc: use multicall for memory-op on Linux (and Solaris)

Some sub-functions, XENMEM_maximum_gpfn and XENMEM_maximum_ram_page in
particular, can return values requiring more than 31 bits to represent.
Hence we cannot issue the hypercall directly when the return value of
ioctl() is used to propagate this value. This is the case for Linux
and Solaris (and hence needs changing), while the BSDs avoid using the
return value for dual purposes altogether, and MiniOS already wraps all
hypercalls in a multicall.

Suggested-by: Jürgen Groß <jgross@suse.com>
Signed-off-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Ian Jackson <iwj@xenproject.org>
Acked-by: Andrew Cooper <andrew.cooper3@citrix.com>
4 years agolibxencall: introduce variant of xencall2() returning long
Jan Beulich [Thu, 24 Jun 2021 14:39:02 +0000 (16:39 +0200)]
libxencall: introduce variant of xencall2() returning long

Some hypercalls, memory-op in particular, can return values requiring
more than 31 bits to represent. Hence the underlying layers need to make
sure they won't truncate such values.
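
To make the truncation concrete, here is a sketch using ctypes fixed-width
integers to model an int and a 64-bit long return type. The value is
illustrative, and the helper names are not part of libxencall.

```python
import ctypes


def as_int(value):
    """Model storing a result in a 32-bit signed C int."""
    return ctypes.c_int32(value).value


def as_long(value):
    """Model storing it in a 64-bit signed C long (as on LP64 systems)."""
    return ctypes.c_int64(value).value


max_gpfn = 0x8000_0000  # first value that no longer fits in 31 bits

via_int = as_int(max_gpfn)
via_long = as_long(max_gpfn)
```

Squeezed through an int, the value turns negative and would be mistaken
for an error code; the wider type preserves it.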

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Ian Jackson <iwj@xenproject.org>
Acked-by: Andrew Cooper <andrew.cooper3@citrix.com>
4 years agolibxencall: osdep_hypercall() should return long
Jan Beulich [Thu, 24 Jun 2021 14:38:37 +0000 (16:38 +0200)]
libxencall: osdep_hypercall() should return long

Some hypercalls, memory-op in particular, can return values requiring
more than 31 bits to represent. Hence the underlying layers need to make
sure they won't truncate such values. (Note that for Solaris the
function also gets renamed, to match the other OSes.)

Due to them merely propagating ioctl()'s return value, this change is
benign on Linux and Solaris. IOW there's an actual effect here only for
the BSDs and MiniOS, but even then further adjustments are needed at the
xencall<N>() level.

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Andrew Cooper <andrew.cooper3@citrix.com>
Acked-by: Ian Jackson <iwj@xenproject.org>
4 years agox86/HVM: wire up multicalls
Jan Beulich [Thu, 24 Jun 2021 14:35:39 +0000 (16:35 +0200)]
x86/HVM: wire up multicalls

To be able to use them from, in particular, the tool stack, they need to
be supported for all guest types. Note that xc_resource_op() already
does, so would not work without this on PVH Dom0.

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Begrudgingly acked-by: Andrew Cooper <andrew.cooper3@citrix.com>
Acked-by: Ian Jackson <iwj@xenproject.org>
4 years agoVT-d: drop/move a few QI related constants
Jan Beulich [Thu, 24 Jun 2021 14:30:51 +0000 (16:30 +0200)]
VT-d: drop/move a few QI related constants

Replace uses of QINVAL_ENTRY_ORDER and QINVAL_INDEX_SHIFT, such that
the constants can be dropped. Move the remaining QINVAL_* ones to the
single source file using them.

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Kevin Tian <kevin.tian@intel.com>
4 years agoVT-d: centralize mapping of QI entries
Jan Beulich [Thu, 24 Jun 2021 14:30:32 +0000 (16:30 +0200)]
VT-d: centralize mapping of QI entries

Introduce a helper function to reduce redundancy. Take the opportunity
to express the logic without using the somewhat odd QINVAL_ENTRY_ORDER.
Also take the opportunity to uniformly unmap after updating queue tail
and dropping the lock (like was done so far only by
queue_invalidate_context_sync()).

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Kevin Tian <kevin.tian@intel.com>
4 years agoVT-d: don't lose errors when flushing TLBs on multiple IOMMUs
Jan Beulich [Thu, 24 Jun 2021 14:30:06 +0000 (16:30 +0200)]
VT-d: don't lose errors when flushing TLBs on multiple IOMMUs

While no longer an immediate problem with flushes no longer timing out,
errors (if any) get properly reported by iommu_flush_iotlb_{dsi,psi}().
Overwriting such an error with, perhaps, a success indicator received
from another IOMMU will misguide callers. Record the first error, but
don't bail from the loop (such that further necessary invalidation gets
carried out on a best effort basis).
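
The pattern described above, sketched in Python with made-up names
(flush_all and fake_flush are not VT-d functions): keep the first
non-zero error, but still visit every IOMMU.

```python
def flush_all(iommus, flush_one):
    """Flush every IOMMU; return the first non-zero error code, or 0."""
    ret = 0
    for iommu in iommus:
        rc = flush_one(iommu)
        if rc and not ret:
            ret = rc  # remember the first failure, keep going regardless
    return ret


# Fake per-IOMMU results; negative values stand in for error codes.
results = {"iommu0": 0, "iommu1": -5, "iommu2": 0, "iommu3": -16}
visited = []


def fake_flush(name):
    visited.append(name)
    return results[name]


rc = flush_all(list(results), fake_flush)
```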

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Kevin Tian <kevin.tian@intel.com>
4 years agoVT-d: clear_fault_bits() should clear all fault bits
Jan Beulich [Thu, 24 Jun 2021 14:29:42 +0000 (16:29 +0200)]
VT-d: clear_fault_bits() should clear all fault bits

If there is any way for one fault to be left set in the recording
registers, there's no reason there couldn't also be multiple ones. If
PPF is set (being the OR of all F fields), simply loop over the entire
range of fault recording registers, clearing F everywhere.

Since PPF is a r/o bit, also remove it from DMA_FSTS_FAULTS (arguably
the constant's name is ambiguous as well).

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Kevin Tian <kevin.tian@intel.com>
4 years agoVT-d: adjust domid map updating when unmapping context
Jan Beulich [Thu, 24 Jun 2021 14:29:13 +0000 (16:29 +0200)]
VT-d: adjust domid map updating when unmapping context

When an earlier error occurred, cleaning up the domid mapping data is
wrong, as references likely still exist. The only exception to this is
when the actual unmapping worked, but some flush failed (supposedly
impossible after XSA-373). The guest will get crashed in such a case
though, so add fallback cleanup to domain destruction to cover this
case. This in turn makes it desirable to silence the dprintk() in
domain_iommu_domid().

Note that no error will be returned anymore when the lookup fails - in
the common case lookup failure would already have caused
domain_context_unmap_one() to fail, yet even from a more general
perspective it doesn't look right to fail domain_context_unmap() in such
a case when this was the last device, but not when any earlier unmap was
otherwise successful.

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Kevin Tian <kevin.tian@intel.com>
4 years agoVT-d: undo device mappings upon error
Jan Beulich [Thu, 24 Jun 2021 14:28:25 +0000 (16:28 +0200)]
VT-d: undo device mappings upon error

When
 - flushes (supposedly not possible anymore after XSA-373),
 - secondary mappings for legacy PCI devices behind bridges,
 - secondary mappings for chipset quirks, or
 - find_upstream_bridge() invocations
fail, the successfully established device mappings should not be left
around.

Further, when (parts of) unmapping fail, simply returning an error is
typically not enough. Crash the domain instead in such cases, arranging
for domain cleanup to continue in a best effort manner despite such
failures.

Finally make domain_context_unmap()'s error behavior consistent in the
legacy PCI device case: Don't bail from the function in one special
case, but always just exit the switch statement.

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Paul Durrant <paul@xen.org>
Reviewed-by: Kevin Tian <kevin.tian@intel.com>
4 years agotools/xenstored: Don't crash xenstored when Live-Update is cancelled
Julien Grall [Thu, 24 Jun 2021 11:15:49 +0000 (12:15 +0100)]
tools/xenstored: Don't crash xenstored when Live-Update is cancelled

As Live-Update is asynchronous, it is possible to receive a request to
cancel it (either on the same connection or from a different one).

Currently, this will crash xenstored because do_lu_start() assumes
lu_status will be valid. This is not the case when Live-Update has been
cancelled. This will result in dereferencing a NULL pointer and
crashing Xenstored.

Rework do_lu_start() to check if lu_status is NULL and return an
error in this case.

Fixes: af216a99fb ("tools/xenstore: add the basic framework for doing the live update")
Signed-off-by: Julien Grall <jgrall@amazon.com>
Reviewed-by: Luca Fancellu <luca.fancellu@arm.com>
Reviewed-by: Juergen Gross <jgross@suse.com>
4 years agotools/xenstored: Delay new transaction while Live-Update is pending
Julien Grall [Thu, 24 Jun 2021 11:14:56 +0000 (12:14 +0100)]
tools/xenstored: Delay new transaction while Live-Update is pending

At the moment, Live-Update will, by default, not proceed if there are
in-flight transactions. It is possible to force it by passing -F but this
will break any connection with in-flight transactions.

There are PV drivers out there that may never terminate some
transactions. On hosts running such guests, we would need to use -F.
Unfortunately, this also risks breaking well-behaving guests (and even
dom0) because Live-Update will happen as soon as the timeout is hit.

Ideally, we would want to preserve transactions but this requires
some work and a lot of testing to be able to use it in production.

As a stop gap, we want to limit the damage of -F. This patch will delay
any transactions that are started after Live-Update has been requested.

If the request cannot be delayed, the connection will be stalled to
avoid losing requests.

If the connection already has a pending transaction before Live-Update,
then new transactions will not be delayed. This is to avoid stalling
the connection.

With this stop gap in place, domains with long running transactions will
still break when using -F, but other domains which start a transaction
in the middle of Live-Update will continue to work.

Signed-off-by: Julien Grall <jgrall@amazon.com>
Reviewed-by: Luca Fancellu <luca.fancellu@arm.com>
Reviewed-by: Juergen Gross <jgross@suse.com>
4 years agotools/xenstored: Dump delayed requests
Julien Grall [Thu, 24 Jun 2021 11:12:54 +0000 (12:12 +0100)]
tools/xenstored: Dump delayed requests

Currently, only Live-Update requests can be delayed. In a follow-up,
we will want to delay more requests (e.g. transaction start).
Therefore we want to preserve delayed requests across Live-Update.

Delayed requests are just complete "in" buffers. So the code is
refactored to allow sharing the code to dump "in" buffers.

Signed-off-by: Julien Grall <jgrall@amazon.com>
Reviewed-by: Luca Fancellu <luca.fancellu@arm.com>
Reviewed-by: Juergen Gross <jgross@suse.com>
4 years agomaintainers: adding new reviewer for xsm
Daniel P. Smith [Thu, 17 Jun 2021 23:49:55 +0000 (19:49 -0400)]
maintainers: adding new reviewer for xsm

Would like to add myself as a reviewer for XSM.

Signed-off-by: Daniel P. Smith <dpsmith@apertussolutions.com>
Acked-by: Stefano Stabellini <sstabellini@kernel.org>
Acked-by: Andrew Cooper <andrew.cooper3@citrix.com>
4 years agoiommu/arm: ipmmu-vmsa: Add compatible for Renesas R-Car M3-W+ SoC
Oleksandr Tyshchenko [Mon, 14 Jun 2021 19:18:12 +0000 (22:18 +0300)]
iommu/arm: ipmmu-vmsa: Add compatible for Renesas R-Car M3-W+ SoC

The "renesas,r8a77961" string identifies M3-W+ (aka M3-W ES3.0)
instead of "renesas,r8a7796" since Linux commit:
"9c9f7891093b02eb64ca4e1c7ab776a4296c058f soc: renesas: Identify R-Car M3-W+".
Add new compatible to the Xen driver.

Signed-off-by: Oleksandr Tyshchenko <oleksandr_tyshchenko@epam.com>
Acked-by: Julien Grall <jgrall@amazon.com>
4 years agotools/xenstored: Extend restore code to handle multiple input buffer
Julien Grall [Thu, 24 Jun 2021 10:41:00 +0000 (11:41 +0100)]
tools/xenstored: Extend restore code to handle multiple input buffer

Currently, the restore code assumes the stream will contain at
most one in-flight request per connection. In follow-up changes, we
will want to transfer multiple in-flight requests.

The function read_state_buffered() is now extended to restore multiple
in-flight requests. Complete requests will be queued as delayed
requests; if there is a partial request (only the last one can be) then
it will be used as the current in-flight request.

Note that we want to bypass the quota check for delayed requests as
the new Xenstore may have a lower limit.

Lastly, there is no need to change the specification as there was
no restriction on the number of in-flight requests preserved.

Signed-off-by: Julien Grall <jgrall@amazon.com>
Reviewed-by: Luca Fancellu <luca.fancellu@arm.com>
Reviewed-by: Juergen Gross <jgross@suse.com>
4 years agotools/xenstored: delay_request: don't assume conn->in == in
Julien Grall [Thu, 24 Jun 2021 08:08:56 +0000 (09:08 +0100)]
tools/xenstored: delay_request: don't assume conn->in == in

delay_request() currently assumes that the delayed request is
always conn->in. This is currently correct, but it invites
a latent bug as the function allows the caller to specify any request.

To prevent any future surprise, check if the request delayed is the
current one.

Fixes: c5ca1404b4 ("tools/xenstore: add support for delaying execution of a xenstore request")
Signed-off-by: Julien Grall <jgrall@amazon.com>
Reviewed-by: Luca Fancellu <luca.fancellu@arm.com>
Reviewed-by: Juergen Gross <jgross@suse.com>
4 years agotools/xenstored: Introduce a wrapper for conn->funcs->can_{read, write}
Julien Grall [Thu, 24 Jun 2021 08:08:42 +0000 (09:08 +0100)]
tools/xenstored: Introduce a wrapper for conn->funcs->can_{read, write}

Currently, the callbacks can_read and can_write are called directly. This
doesn't allow us to add generic checks and therefore requires duplication.

At the moment, one check that would benefit from being common is whether
the connection should be ignored. The position is slightly different
between domain and socket because for the latter we want to check the
state of the file descriptor first.

In follow-up patches, there will be more potential generic checks.

This patch provides wrappers to read/write a connection and moves
the check ->is_ignored after the callback for everyone.

This also requires replacing the direct calls to domain_can_read()
and domain_can_write() with the new wrappers. At the same time,
both functions can now be static. Note that the implementations need
to be moved earlier in the file xenstored_domain.c to avoid
declaring the prototypes.

Signed-off-by: Julien Grall <jgrall@amazon.com>
Reviewed-by: Luca Fancellu <luca.fancellu@arm.com>
Reviewed-by: Juergen Gross <jgross@suse.com>
4 years agotools/xenstored: xenstored_core.h should include fcntl.h
Julien Grall [Thu, 24 Jun 2021 08:07:52 +0000 (09:07 +0100)]
tools/xenstored: xenstored_core.h should include fcntl.h

xenstored_core.h will consider Live-Update unsupported if
O_CLOEXEC doesn't exist. However, the header doesn't include the one
defining O_CLOEXEC (i.e. fcntl.h). This means that, depending on
the headers included, some source files will think Live-Update is not
supported.

I am not aware of any issue with the existing code. Therefore this is
just a latent bug so far.

Prevent any potential issue by including fcntl.h in xenstored_core.h.
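
The ordering problem can be sketched as follows. The macro name
NO_LIVE_UPDATE and the helper live_update_supported() are assumptions
used for illustration; the point is only that the #ifndef test is
meaningful once fcntl.h has been included ahead of it:

```c
#include <fcntl.h>      /* defines O_CLOEXEC where the platform provides it */
#include <stdbool.h>

/*
 * Without the include above, this test would depend on whatever the
 * including source file happened to pull in beforehand.
 */
#ifndef O_CLOEXEC
#define O_CLOEXEC 0
#define NO_LIVE_UPDATE  /* hypothetical marker name */
#endif

/* Hypothetical helper reporting the compile-time decision. */
static bool live_update_supported(void)
{
#ifdef NO_LIVE_UPDATE
    return false;
#else
    return true;
#endif
}
```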

Fixes: cd831ee438 ("tools/xenstore: handle CLOEXEC flag for local files and pipes")
Signed-off-by: Julien Grall <jgrall@amazon.com>
Reviewed-by: Luca Fancellu <luca.fancellu@arm.com>
Reviewed-by: Juergen Gross <jgross@suse.com>
4 years agotools/xenstored: Limit the number of requests a connection can delay
Julien Grall [Thu, 24 Jun 2021 08:07:30 +0000 (09:07 +0100)]
tools/xenstored: Limit the number of requests a connection can delay

Currently, only Live-Update requests can be delayed. Such a request can
only be performed by a privileged connection (e.g. dom0), so it is fine
to have no limit.

In a follow-up patch we will want to delay requests for unprivileged
connections as well, so it is best to apply a limit.

For now and for simplicity, only a single request can be delayed
for a given unprivileged connection.

Take the opportunity to tweak the prototype and provide a way to
bypass the quota check. This would be useful when the function
is called from the restore code.
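
A minimal sketch of the limit with its bypass flag. The fields
"privileged" and "delayed_count", the parameter name no_quota_check and
the ENOSPC return value are assumptions made for this illustration:

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical, simplified connection state. */
struct connection {
    bool privileged;
    unsigned int delayed_count;     /* requests currently delayed */
};

/*
 * Delay one request for conn. An unprivileged connection may have at
 * most one delayed request, unless the caller asks to skip the quota
 * check (e.g. when restoring state).
 */
static int delay_request(struct connection *conn, bool no_quota_check)
{
    if (!no_quota_check && !conn->privileged && conn->delayed_count >= 1)
        return ENOSPC;

    conn->delayed_count++;

    return 0;
}
```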

Signed-off-by: Julien Grall <jgrall@amazon.com>
Reviewed-by: Luca Fancellu <luca.fancellu@arm.com>
Reviewed-by: Juergen Gross <jgross@suse.com>
4 years agotools/xenstore: Don't assume conn->in points to the LU request
Julien Grall [Thu, 24 Jun 2021 08:06:58 +0000 (09:06 +0100)]
tools/xenstore: Don't assume conn->in points to the LU request

call_delayed() currently assumes that conn->in is NULL when
handling a delayed request. However, the connection is not paused.
Therefore new requests can be processed and conn->in may be non-NULL
if we have only received a partial request.

Furthermore, as we overwrite conn->in, the current partial request
will not be transferred. This will corrupt the connection.

Rather than updating conn->in, stash the LU request in lu_status and
let each callback for delayed requests update conn->in when
necessary.

To keep a sane interface, the code to write the "OK" response to the
LU request is moved to xenstored_core.c.
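
The stashing approach could look roughly like the sketch below. The
structures and the helpers lu_stash_request() and lu_run_callback() are
hypothetical simplifications; they only illustrate that a partially
received request in conn->in survives the handling of the delayed LU
request:

```c
#include <stddef.h>

/* Hypothetical, simplified stand-ins for the xenstored structures. */
struct buffered_data {
    const char *payload;
};

struct connection {
    struct buffered_data *in;       /* may hold a partial request */
};

struct live_update {
    struct connection *conn;        /* connection that requested LU */
    struct buffered_data *in;       /* stashed LU request */
};

/* Remember the LU request without touching conn->in. */
static void lu_stash_request(struct live_update *lu, struct connection *conn,
                             struct buffered_data *req)
{
    lu->conn = conn;
    lu->in = req;
}

/*
 * A delayed callback that needs conn->in to point at the LU request
 * swaps it in temporarily and restores the previous value afterwards.
 * Returns the request the callback operated on.
 */
static struct buffered_data *lu_run_callback(struct live_update *lu)
{
    struct buffered_data *saved = lu->conn->in;
    struct buffered_data *seen;

    lu->conn->in = lu->in;
    seen = lu->conn->in;            /* the callback would work on this */
    lu->conn->in = saved;

    return seen;
}
```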

Fixes: c5ca1404b4 ("tools/xenstore: add support for delaying execution of a xenstore request")
Fixes: ed6eebf17d ("tools/xenstore: dump the xenstore state for live update")
Signed-off-by: Julien Grall <jgrall@amazon.com>
Reviewed-by: Juergen Gross <jgross@suse.com>
4 years agotools/xenstored: Introduce lu_get_connection() and use it
Julien Grall [Thu, 24 Jun 2021 08:02:59 +0000 (09:02 +0100)]
tools/xenstored: Introduce lu_get_connection() and use it

At the moment, dump_state_buffered_data() takes two connections
as parameters (one is the connection to dump, the other is the
connection used to request LU). The naming (c vs conn) doesn't help
to distinguish them and this has already led to several mistakes
while modifying the function.

To remove the confusion, introduce a helper lu_get_connection() that
will return the connection used to request LU and use it
in place of the existing parameter.
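
A small sketch of such a helper, assuming the LU state is kept in a
file-local lu_status pointer that is NULL when no Live-Update is in
progress. The struct layouts are simplified assumptions:

```c
#include <stddef.h>

/* Hypothetical, simplified stand-ins for the xenstored structures. */
struct connection {
    int id;
};

struct live_update {
    struct connection *conn;        /* connection that requested LU */
};

/* NULL while no Live-Update is in progress. */
static struct live_update *lu_status;

/* Return the connection used to request LU, or NULL if there is none. */
static struct connection *lu_get_connection(void)
{
    return lu_status ? lu_status->conn : NULL;
}
```

With the helper in place, a dump function only needs the connection
being dumped as a parameter, which removes the c vs conn ambiguity.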

Signed-off-by: Julien Grall <jgrall@amazon.com>
Reviewed-by: Luca Fancellu <luca.fancellu@arm.com>
Reviewed-by: Juergen Gross <jgross@suse.com>