Andrew Cooper [Fri, 20 Dec 2019 19:38:26 +0000 (19:38 +0000)]
tools/libxc: Restore CPUID/MSR data found in the migration stream
With all other pieces in place, it is now safe to restore the CPUID and MSR
data in the migration stream, rather than discarding them and using the higher
level toolstack's compatibility logic.
While this is a small patch, it has large implications for migrated/resumed
domains. Most obviously, the CPU family/model/stepping data,
cache/tlb/etc. will no longer change behind the guest's back.
Another change is the interpretation of the Xend cpuid strings. The 'k'
option is not a sensible thing to have ever supported, and 's' is how the
stream will end up behaving.
Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Acked-by: Ian Jackson <ian.jackson@eu.citrix.com>
Andrew Cooper [Mon, 16 Dec 2019 19:03:14 +0000 (19:03 +0000)]
tools/libx[cl]: Plumb 'missing' through static_data_done() up into libxl
Pre Xen-4.14 streams will not contain any CPUID/MSR information. There is
nothing libxc can do about this, and it will have to rely on the higher level
toolstack to provide backwards compatibility.
To facilitate this, extend the static_data_done() callback, highlighting the
missing information, and modify libxl to use it. At the libxc level, this
requires an arch-specific hook which, for now, always reports CPUID and MSR as
missing. This will be adjusted in a later change.
No overall functional change - this is just plumbing.
Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Acked-by: Ian Jackson <ian.jackson@eu.citrix.com>
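A minimal sketch of the plumbing described in the commit above, written in Python for brevity. It models a restore layer reporting which static data never arrived, and a toolstack callback that falls back to its own compatibility logic only for the missing pieces. The `Missing` flags, the function signatures and the log messages are hypothetical stand-ins, not the libxc or libxl API.

```python
from enum import IntFlag


class Missing(IntFlag):
    """Hypothetical bitmask of static data absent from the stream."""
    NONE = 0
    CPUID = 1 << 0
    MSR = 1 << 1


def arch_static_data_complete(records):
    """Model of an arch hook: report which static records never arrived."""
    missing = Missing.NONE
    if "X86_CPUID_DATA" not in records:
        missing |= Missing.CPUID
    if "X86_MSR_DATA" not in records:
        missing |= Missing.MSR
    return missing


def static_data_done(missing, log):
    """Model of the toolstack callback: fall back only for missing data."""
    if missing & Missing.CPUID:
        log.append("toolstack: applying compatibility CPUID logic")
    if missing & Missing.MSR:
        log.append("toolstack: applying compatibility MSR logic")
    return 0  # 0 means success in this sketch
```

In the state described by the commit, the arch hook would always report both flags; a later change would clear them when the records are present.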
Andrew Cooper [Tue, 17 Dec 2019 12:41:02 +0000 (12:41 +0000)]
libxc/save: Write X86_{CPUID,MSR}_DATA records
With the destination side now able to understand X86_{CPUID,MSR}_DATA
records (and compatibly handle their absence), update the sending logic to
obtain and forward this data from Xen.
Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Acked-by: Ian Jackson <ian.jackson@eu.citrix.com>
Andrew Cooper [Wed, 18 Dec 2019 18:51:01 +0000 (18:51 +0000)]
libxc/restore: Handle X86_{CPUID,MSR}_DATA records
For now, the data are just stashed, and discarded at the end.
A future change will restore the data, once libxl has been adjusted to avoid
clobbering the data.
No functional change.
Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Acked-by: Ian Jackson <ian.jackson@eu.citrix.com>
Andrew Cooper [Tue, 17 Dec 2019 13:10:04 +0000 (13:10 +0000)]
docs/migration: Specify X86_{CPUID,MSR}_POLICY records
These two records move blobs from the XEN_DOMCTL_{get,set}_cpu_policy
hypercall.
Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Acked-by: Ian Jackson <ian.jackson@eu.citrix.com>
Andrew Cooper [Thu, 2 Jan 2020 19:44:36 +0000 (19:44 +0000)]
tools/libxl: Re-position CPUID handling during domain construction
CPUID handling needs to be earlier in construction. Move it from its current
position in libxl__build_post() to libxl__build_pre() for fresh builds, and
libxl__srm_callout_callback_static_data_done() for the migration/resume case.
Later changes will make the migration/resume case conditional on whether CPUID
data was present in the migration stream, and the libxc layer took care of
restoring it.
No functional change.
Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Acked-by: Ian Jackson <ian.jackson@eu.citrix.com>
Andrew Cooper [Fri, 20 Dec 2019 11:58:03 +0000 (11:58 +0000)]
tools/libxl: Provide a static_data_done callback for domain restore
This will be needed shortly to provide backwards compatibility for migration
streams which do not have CPUID information contained within them.
No functional change yet.
Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Acked-by: Ian Jackson <ian.jackson@eu.citrix.com>
Andrew Cooper [Tue, 17 Dec 2019 12:29:42 +0000 (12:29 +0000)]
libxc/save: Write a v3 stream
Introduce a new static_data() hook which is responsible for writing out
any static data records. The HVM side continues to be a no-op, while
the PV side moves write_x86_pv_info() into this earlier hook. Then the
common code writes out a STATIC_DATA_END record, and the stream version
is bumped to 3.
Update convert-legacy-stream to write a v3 stream, because this will
bypass the compatibility logic in libxc.
Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Acked-by: Ian Jackson <ian.jackson@eu.citrix.com>
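The ordering described in the commit above (static data records, then STATIC_DATA_END, then everything else) can be sketched as follows. This is an illustration only: the record type numbers are placeholders invented for the sketch, and the framing used here (32-bit type, 32-bit length, body padded to 8 bytes) is an assumption rather than a statement of the stream specification.

```python
import io
import struct

# Placeholder record type numbers, chosen for this sketch only.
REC_STATIC_EXAMPLE = 0x1001
REC_STATIC_DATA_END = 0x1002
REC_PAGE_DATA_EXAMPLE = 0x1003


def write_record(out, rec_type, body=b""):
    """Write a (type, length) header, then the body padded to 8 bytes."""
    out.write(struct.pack("<II", rec_type, len(body)))
    out.write(body)
    out.write(b"\0" * (-len(body) % 8))


def write_stream(out, static_records, dynamic_records):
    """Static data first, then the end marker, then everything else."""
    for rec_type, body in static_records:
        write_record(out, rec_type, body)
    write_record(out, REC_STATIC_DATA_END)
    for rec_type, body in dynamic_records:
        write_record(out, rec_type, body)
```

Passing an `io.BytesIO()` as `out` is enough to inspect the resulting bytes.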
Andrew Cooper [Mon, 16 Dec 2019 19:03:14 +0000 (19:03 +0000)]
libxc/restore: STATIC_DATA_END inference for v2 compatibility
A v3 stream can compatibly read a v2 stream by inferring the position of the
STATIC_DATA_END record.
v2 compatibility is only needed for x86. No other architectures exist yet,
but they will have a minimum of v3 when introduced.
The x86 HVM compatibility point being in handle_page_data() (which is common
code) is a bit awkward. However, as the two compatibility points are subtly
different, and it is (intentionally) not possible to call into arch specific
code from common code (except via the ops hooks), use some #ifdef-ary and
opencode the check, rather than make handle_page_data() a per-arch helper.
Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Acked-by: Ian Jackson <ian.jackson@eu.citrix.com>
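A rough sketch of the inference described in the commit above: a v2 stream has no STATIC_DATA_END record, so the reader synthesises one at a compatibility point. The HVM point (before the first page data) follows the commit text; the PV point (just after X86_PV_INFO) is an assumption of this sketch, as are the function name and the use of plain strings for records.

```python
def with_inferred_static_data_end(version, guest_type, records):
    """Yield record names, inserting STATIC_DATA_END for v2 streams.

    Assumed compatibility points for this sketch: just after X86_PV_INFO
    for PV guests, and just before the first PAGE_DATA for HVM guests.
    """
    seen_end = version >= 3  # v3 streams carry the record explicitly
    for rec in records:
        if not seen_end and guest_type == "hvm" and rec == "PAGE_DATA":
            yield "STATIC_DATA_END"
            seen_end = True
        yield rec
        if not seen_end and guest_type == "pv" and rec == "X86_PV_INFO":
            yield "STATIC_DATA_END"
            seen_end = True
```

The two points differ subtly (one fires before a record, the other after), which matches the commit's remark that they cannot easily share a single helper.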
Andrew Cooper [Mon, 16 Dec 2019 19:03:14 +0000 (19:03 +0000)]
libxc/restore: Support v3 streams and handle STATIC_DATA_END
Higher level toolstacks may wish to know when the static data is complete, so
introduce a restore_callback for the purpose.
No functional change.
Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Acked-by: Ian Jackson <ian.jackson@eu.citrix.com>
Andrew Cooper [Mon, 16 Dec 2019 19:39:43 +0000 (19:39 +0000)]
python/migration: Update validation logic to understand a v3 stream
Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Acked-by: Ian Jackson <ian.jackson@eu.citrix.com>
Andrew Cooper [Thu, 5 Dec 2019 15:57:13 +0000 (15:57 +0000)]
docs/migration: Specify migration v3 and STATIC_DATA_END
Migration data can be split into two parts - that which is invariant of
guest execution, and that which is not. Separate these two with the
STATIC_DATA_END record.
In the short term, we want to move the x86 CPU Policy data into the stream.
In the longer term, we want to provisionally send the static data only
to the destination as a more robust compatibility check. In both cases,
we will want a callback into the higher level toolstack.
Mandate the presence of the STATIC_DATA_END record, and declare this v3,
along with instructions for how to compatibly interpret a v2 stream.
Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Acked-by: Ian Jackson <ian.jackson@eu.citrix.com>
Andrew Cooper [Mon, 16 Dec 2019 18:53:02 +0000 (18:53 +0000)]
tools/migration: Drop IHDR_VERSION constant from libxc and python
Migration v3 is in the process of being introduced, meaning that the code has
to cope with both versions. Use an explicit 2 for now.
For the verify-stream-v2 and convert-legacy-stream scripts, update text to say
"v2 (or later)". What matters is the distinction vs legacy streams.
Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Acked-by: Ian Jackson <ian.jackson@eu.citrix.com>
Tamas K Lengyel [Fri, 29 May 2020 16:22:34 +0000 (17:22 +0100)]
tools/libxl: fix setting altp2m param broken by 1e9bc407cf0
The patch 1e9bc407cf0 mistakenly converted the altp2m config option to a
boolean. This is incorrect and breaks external-only use cases of altp2m, which
are set with a value of 2.
Signed-off-by: Tamas K Lengyel <tamas@tklengyel.com>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
Acked-by: Ian Jackson <ian.jackson@eu.citrix.com>
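The failure mode in the commit above can be illustrated with a few lines of Python. The function names and the `EXTERNAL_ONLY` constant are hypothetical; only the value 2 for external-only use comes from the commit message.

```python
EXTERNAL_ONLY = 2  # value named in the commit message


def to_bool_setting(value):
    """Lossy conversion: any non-zero mode collapses to 1."""
    return int(bool(value))


def to_integer_setting(value):
    """Lossless conversion: the requested mode is preserved."""
    return int(value)
```

With the boolean conversion, a request for `EXTERNAL_ONLY` is indistinguishable from a request for mode 1, which is the breakage being fixed.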
Juergen Gross [Fri, 29 May 2020 11:37:09 +0000 (12:37 +0100)]
docs: update xenstore-migration.md
Update connection record details:
- make flags common for sockets and domains (makes it easier to have a
C union for conn-spec)
- add pending incoming data (needed for handling partially read
requests when doing live update)
- add partial response length (needed for proper split to individual
responses after live update)
Signed-off-by: Juergen Gross <jgross@suse.com>
Reviewed-by: Paul Durrant <paul@xen.org>
Roger Pau Monné [Fri, 29 May 2020 15:52:06 +0000 (17:52 +0200)]
clang: don't define nocall
Clang doesn't support attribute error, and the possible equivalents
like diagnose_if don't seem to work well in this case as they trigger
even when the function is not called (just by being used by the
APPEND_CALL macro).
Define nocall to a noop on clang until a proper solution can be found.
Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>
Acked-by: Julien Grall <jgrall@amazon.com>
[jb: error -> __error__]
Acked-by: Jan Beulich <jbeulich@suse.com>
Juergen Gross [Fri, 29 May 2020 10:29:53 +0000 (11:29 +0100)]
tools: fix Rules.mk library make variables
Both SHDEPS_libxendevicemodel and SHDEPS_libxenhypfs are buggy in that they
add $(SHLIB_xencall) instead of $(SHLIB_libxencall).
The former seems not to have any negative impact, probably because
it is not used anywhere in Xen without the correct $(SHLIB_libxencall)
being used, too.
Fixes: 86234eafb95295 ("libs: add libxenhypfs")
Signed-off-by: Juergen Gross <jgross@suse.com>
Acked-by: Juergen Gross <jgross@suse.com>
Jan Beulich [Fri, 29 May 2020 15:35:09 +0000 (17:35 +0200)]
x86emul: support FXSAVE/FXRSTOR
Note that FPU selector handling as well as MXCSR mask saving for now
does not honor differences between host and guest visible featuresets.
While for Intel operation of the insns with CR4.OSFXSR=0 is
implementation dependent, use the easiest solution there: Simply don't
look at the bit in the first place. For AMD and alike the behavior is
well defined, so it gets handled together with FFXSR.
Signed-off-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Andrew Cooper <andrew.cooper3@citrix.com>
Jan Beulich [Fri, 29 May 2020 15:34:31 +0000 (17:34 +0200)]
x86emul: support FLDENV and FRSTOR
While the Intel SDM claims that FRSTOR itself may raise #MF upon
completion, this was confirmed by Intel to be a doc error which will be
corrected in due course; behavior is like FLDENV, and like old hard copy
manuals describe it.
Re-arrange a switch() statement's case label order to allow for
fall-through from FLDENV handling to FNSTENV's.
Signed-off-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Andrew Cooper <andrew.cooper3@citrix.com>
Jan Beulich [Fri, 29 May 2020 15:33:54 +0000 (17:33 +0200)]
x86emul: support FNSTENV and FNSAVE
To avoid introducing another boolean into emulator state, the
rex_prefix field gets (ab)used to convey the real/VM86 vs protected mode
info (affecting structure layout, albeit not size) to x86_emul_blk().
Signed-off-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Andrew Cooper <andrew.cooper3@citrix.com>
Jan Beulich [Fri, 29 May 2020 15:32:55 +0000 (17:32 +0200)]
x86emul: support ENQCMD insns
Note that the ISA extensions document revision 038 doesn't specify
exception behavior for ModRM.mod == 0b11; assuming #UD here.
No tests are being added to the harness - this would be quite hard,
we can't just issue the insns against RAM. Their similarity with
MOVDIR64B should have the test case there be good enough to cover any
fundamental flaws.
Signed-off-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Andrew Cooper <andrew.cooper3@citrix.com>
Jan Beulich [Fri, 29 May 2020 15:32:14 +0000 (17:32 +0200)]
x86emul: support MOVDIR{I,64B} insns
Introduce a new blk() hook, paralleling the rmw() one in a certain way,
but being intended for larger data sizes, and hence its HVM intermediate
handling function doesn't fall back to splitting the operation if the
requested virtual address can't be mapped.
Note that SDM revision 071 doesn't specify exception behavior for
ModRM.mod == 0b11; assuming #UD here.
Signed-off-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Paul Durrant <paul@xen.org>
Acked-by: Andrew Cooper <andrew.cooper@citrix.com>
Jan Beulich [Fri, 29 May 2020 15:31:13 +0000 (17:31 +0200)]
x86emul: disable FPU/MMX/SIMD insn emulation when !HVM
In a pure PV environment (the PV shim in particular) we don't really
need emulation of all these. To limit #ifdef-ary utilize some of the
CASE_*() macros we have, by providing variants expanding to
(effectively) nothing (really a label, which in turn requires passing
-Wno-unused-label to the compiler when building such configurations).
Due to the mixture of macro and #ifdef use, the placement of some of
the #ifdef-s is a little arbitrary.
The resulting object file's .text is less than half the size of the
original, and looks to also be compiling a little more quickly.
This is meant as a first step; more parts can likely be disabled down
the road.
Suggested-by: Andrew Cooper <andrew.cooper3@citrix.com>
Signed-off-by: Jan Beulich <jbeulich@suse.com>
Begrudgingly-acked-by: Andrew Cooper <andrew.cooper3@citrix.com>
Jan Beulich [Fri, 29 May 2020 15:30:35 +0000 (17:30 +0200)]
Merge branch 'staging' of xenbits.xen.org:/home/xen/git/xen into staging
Jan Beulich [Fri, 29 May 2020 15:29:59 +0000 (17:29 +0200)]
x86emul: also test decoding and mem access / write logic
x86emul_is_mem_{access,write}() (and their interaction with
x86_decode()) have become sufficiently complex that we should have a way
to test this logic. Start by covering legacy encoded GPR insns, with the
exception of a few the main emulator doesn't support yet (left as
comments in the respective tables, or about to be added by subsequent
patches). This has already helped spot a few flaws in said logic,
addressed by (revised) earlier patches.
Signed-off-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Andrew Cooper <andrew.cooper3@citrix.com>
Jan Beulich [Fri, 29 May 2020 15:28:45 +0000 (17:28 +0200)]
x86emul: rework CMP and TEST emulation
Unlike similarly encoded insns these don't write their memory operands,
and hence x86_is_mem_write() should return false for them. However,
rather than adding special logic there, rework how their emulation gets
done, by making decoding attributes properly describe the r/o nature of
their memory operands:
- change the table entries for opcodes 0x38 and 0x39, with no other
adjustments to the attributes later on,
- for the other opcodes, leave the table entries as they are, and
override the attributes for the specific sub-cases (identified by
ModRM.reg).
For opcodes 0x38 and 0x39 the change of the table entries implies
changing the order of operands as passed to emulate_2op_SrcV(), hence
the splitting of the cases in the main switch().
Note how this also allows dropping custom LOCK prefix checks.
Signed-off-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Andrew Cooper <andrew.cooper3@citrix.com>
Jan Beulich [Fri, 29 May 2020 15:28:04 +0000 (17:28 +0200)]
x86emul: address x86_insn_is_mem_{access,write}() omissions
First of all explain in comments what the functions' purposes are. Then
make them actually match their comments.
Note that fc6fa977be54 ("x86emul: extend x86_insn_is_mem_write()
coverage") didn't actually fix the function's behavior for {,V}STMXCSR:
Both are covered by generic code higher up in the function, due to
x86_decode_twobyte() already doing suitable adjustments. And VSTMXCSR
wouldn't have been covered anyway without a further X86EMUL_OPC_VEX()
case label. Keep the inner case label in a comment for reference.
Signed-off-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Andrew Cooper <andrew.cooper3@citrix.com>
Andrew Cooper [Thu, 28 May 2020 13:03:22 +0000 (14:03 +0100)]
x86/hvm: Improve error information in handle_pio()
domain_crash() should always have a message which is emitted even in release
builds, so something more useful than this is presented to the user.
(XEN) domain_crash called from io.c:171
(XEN) domain_crash called from io.c:171
(XEN) domain_crash called from io.c:171
...
To avoid possibly printing stack rubble, initialise data to ~0 right away.
Furthermore, the maximum access size is 4, so drop data from long to int.
Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Reviewed-by: Roger Pau Monné <roger.pau@citrix.com>
Acked-by: Jan Beulich <jbeulich@suse.com>
Juergen Gross [Fri, 29 May 2020 10:22:50 +0000 (12:22 +0200)]
SUPPORT.md: add hypervisor file system
Signed-off-by: Juergen Gross <jgross@suse.com>
Acked-by: Jan Beulich <jbeulich@suse.com>
Juergen Gross [Fri, 29 May 2020 10:22:42 +0000 (12:22 +0200)]
CHANGELOG: add hypervisor file system support
Signed-off-by: Juergen Gross <jgross@suse.com>
Reviewed-by: Paul Durrant <paul@xen.org>
Juergen Gross [Fri, 29 May 2020 10:20:31 +0000 (12:20 +0200)]
xen: remove XEN_SYSCTL_set_parameter support
The functionality of XEN_SYSCTL_set_parameter is available via hypfs
now, so it can be removed.
This allows removing the kernel_param structure for runtime parameters
by putting the now only used structure element into the hypfs node
structure of the runtime parameters.
Signed-off-by: Juergen Gross <jgross@suse.com>
Acked-by: Jan Beulich <jbeulich@suse.com>
Juergen Gross [Fri, 29 May 2020 10:20:16 +0000 (12:20 +0200)]
tools/libxc: remove xc_set_parameters()
There is no user of xc_set_parameters() left, so remove it.
Signed-off-by: Juergen Gross <jgross@suse.com>
Acked-by: Wei Liu <wl@xen.org>
Juergen Gross [Fri, 29 May 2020 10:20:08 +0000 (12:20 +0200)]
tools/libxl: use libxenhypfs for setting xen runtime parameters
Instead of xc_set_parameters() use xenhypfs_write() for setting
parameters of the hypervisor.
Signed-off-by: Juergen Gross <jgross@suse.com>
Acked-by: Wei Liu <wl@xen.org>
Juergen Gross [Fri, 29 May 2020 10:18:36 +0000 (12:18 +0200)]
xen: add runtime parameter access support to hypfs
Add support to read and modify values of hypervisor runtime parameters
via the hypervisor file system.
As runtime parameters can be modified via a sysctl, too, this path has
to take the hypfs rw_lock as writer.
For custom runtime parameters the connection between the parameter
value and the file system is done via an init function which will set
the initial value (if needed) and the leaf properties.
Signed-off-by: Juergen Gross <jgross@suse.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Julien Grall <jgrall@amazon.com>
Juergen Gross [Fri, 29 May 2020 10:15:54 +0000 (12:15 +0200)]
xen: add /buildinfo/config entry to hypervisor filesystem
Add the /buildinfo/config entry to the hypervisor filesystem. This
entry contains the .config file used to build the hypervisor.
Signed-off-by: Juergen Gross <jgross@suse.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
Juergen Gross [Fri, 29 May 2020 10:14:51 +0000 (12:14 +0200)]
xen: provide version information in hypfs
Provide version and compile information in /buildinfo/ node of the
Xen hypervisor file system. As this information is accessible by dom0
only no additional security problem arises.
Signed-off-by: Juergen Gross <jgross@suse.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
Juergen Gross [Fri, 29 May 2020 10:14:24 +0000 (12:14 +0200)]
xen/hypfs: make struct hypfs_entry_leaf initializers work with gcc 4.1
gcc 4.1 has problems with static initializers for anonymous unions.
Fix this by naming the union in struct hypfs_entry_leaf.
Signed-off-by: Juergen Gross <jgross@suse.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
Tested-by: Jan Beulich <jbeulich@suse.com>
Juergen Gross [Fri, 29 May 2020 08:20:32 +0000 (10:20 +0200)]
tools: add xenfs tool
Add the xenfs tool for accessing the hypervisor filesystem.
Signed-off-by: Juergen Gross <jgross@suse.com>
Acked-by: Wei Liu <wl@xen.org>
Juergen Gross [Fri, 29 May 2020 08:20:21 +0000 (10:20 +0200)]
libs: add libxenhypfs
Add the new library libxenhypfs for access to the hypervisor filesystem.
Signed-off-by: Juergen Gross <jgross@suse.com>
Acked-by: Wei Liu <wl@xen.org>
Juergen Gross [Fri, 29 May 2020 08:15:50 +0000 (10:15 +0200)]
xen: add basic hypervisor filesystem support
Add the infrastructure for the hypervisor filesystem.
This includes the hypercall interface and the base functions for
entry creation, deletion and modification.
In order not to have to repeat the same pattern multiple times where a
failure to add a new node should result in a BUG(), the helpers for adding
a node (hypfs_add_dir() and hypfs_add_leaf()) get a nofault parameter
causing a BUG() in case of a failure.
Signed-off-by: Juergen Gross <jgross@suse.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Julien Grall <jgrall@amazon.com>
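A toy model of the entry-creation helpers described in the commit above, in Python rather than hypervisor C. The names `add_dir`, `add_leaf` and `nofault` echo the commit message, but the signatures, the `Dir` class, the `read` helper and the use of an exception to stand in for BUG() are all assumptions of this sketch.

```python
import errno


class HypfsError(Exception):
    """Stands in for BUG() in this sketch."""


class Dir:
    def __init__(self, name):
        self.name = name
        self.entries = {}


def _add(parent, name, entry, nofault):
    if name in parent.entries:
        if nofault:
            # Caller declared that failure is a bug, so stop hard.
            raise HypfsError(f"duplicate entry {name!r}")
        return -errno.EEXIST  # ordinary callers get an error code
    parent.entries[name] = entry
    return 0


def add_dir(parent, new_dir, nofault=False):
    return _add(parent, new_dir.name, new_dir, nofault)


def add_leaf(parent, name, value, nofault=False):
    return _add(parent, name, value, nofault)


def read(root, path):
    """Resolve a '/'-separated path; no error handling in this sketch."""
    node = root
    for part in path.strip("/").split("/"):
        node = node.entries[part]
    return node
```

The point of `nofault` is that boot-time callers can write one line per node instead of repeating an error check after every call.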
Juergen Gross [Fri, 29 May 2020 08:15:35 +0000 (10:15 +0200)]
docs: add feature document for Xen hypervisor sysfs-like support
At the 2019 Xen developer summit there was agreement that the Xen
hypervisor should gain support for a hierarchical name-value store
similar to the Linux kernel's sysfs.
In the beginning there should only be basic support: entries can be
added from the hypervisor itself only, and there is a simple hypercall
interface to read the data.
Add a feature document for setting the base of a discussion regarding
the desired functionality and the entries to add.
Signed-off-by: Juergen Gross <jgross@suse.com>
Acked-by: Julien Grall <jgrall@amazon.com>
Juergen Gross [Fri, 29 May 2020 08:15:08 +0000 (10:15 +0200)]
xen: add a generic way to include binary files as variables
Add a new script xen/tools/binfile for including a binary file at build
time being usable via a pointer and a size variable in the hypervisor.
Make use of that generic tool in xsm.
Signed-off-by: Juergen Gross <jgross@suse.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Wei Liu <wl@xen.org>
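One common way to expose a binary file as a pointer plus a size variable is to generate a small assembler file that uses `.incbin`. The sketch below, in Python, emits such a file as text. It is not the xen/tools/binfile script; the symbol naming, the `.long` size field and the GNU assembler directives are assumptions made for illustration.

```python
def binfile_asm(varname, path, align=0):
    """Return assembler text exposing `path` as `varname` plus a size."""
    return "\n".join([
        "    .section .rodata",
        f"    .p2align {align}",
        f"    .global {varname}",
        f"{varname}:",
        f'    .incbin "{path}"',
        f".Lend_{varname}:",
        f"    .global {varname}_size",
        f"{varname}_size:",
        # Size is computed by the assembler from the two labels.
        f"    .long .Lend_{varname} - {varname}",
        "",
    ])
```

C code would then declare the two symbols as `extern` and use them like any other read-only data.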
George Dunlap [Thu, 28 May 2020 11:20:57 +0000 (12:20 +0100)]
automation/containerize: Add a shortcut for Debian unstable
Signed-off-by: George Dunlap <george.dunlap@citrix.com>
Acked-by: Wei Liu <wl@xen.org>
George Dunlap [Thu, 28 May 2020 11:20:56 +0000 (12:20 +0100)]
automation: Add golang packages to various dockerfiles
Specifically, Fedora 29, Archlinux, and Debian unstable. This will
cause the CI loop to detect golang build failures.
CentOS 6 and 7 don't have golang packages, and the packages in
stretch, jessie, xenial, and trusty are too old.
Signed-off-by: George Dunlap <george.dunlap@citrix.com>
Acked-by: Wei Liu <wl@xen.org>
George Dunlap [Thu, 28 May 2020 11:20:55 +0000 (12:20 +0100)]
automation/archlinux: Add 32-bit glibc headers
This fixes the following build error in hvmloader:
usr/include/gnu/stubs.h:7:11: fatal error: gnu/stubs-32.h: No such file or directory
Signed-off-by: George Dunlap <george.dunlap@citrix.com>
Acked-by: Anthony PERARD <anthony.perard@citrix.com>
Acked-by: Wei Liu <wl@xen.org>
George Dunlap [Thu, 28 May 2020 11:20:54 +0000 (12:20 +0100)]
golang/xenlight: Get rid of GOPATH-based build artefacts
The original build setup used a "fake GOPATH" in tools/golang to test
the mechanism of building from go package files installed on a
filesystem. With the move to modules, this isn't necessary, and leads
to potentially confusing directories being created. (I.e., it might
not be obvious that files under tools/golang/src shouldn't be edited.)
Get rid of the code that creates this (now unused) intermediate
directory. Add direct dependencies from 'build' onto the source
files.
Signed-off-by: George Dunlap <george.dunlap@citrix.com>
Reviewed-by: Nick Rosbrook <rosbrookn@ainfosec.com>
George Dunlap [Thu, 28 May 2020 11:20:53 +0000 (12:20 +0100)]
libxl: Generate golang bindings in libxl Makefile
The generated golang bindings (types.gen.go and helpers.gen.go) are
left checked in so that they can be fetched from xenbits using the
golang tooling. This means that they must be updated whenever
libxl_types.idl (or other dependencies) are updated. However, the
golang bindings are only built optionally; we can't assume that anyone
updating libxl_types.idl will also descend into the tools/golang tree
to re-generate the bindings.
Fix this by re-generating the golang bindings from the libxl Makefile
when the IDL dependencies are updated, so that anyone who updates
libxl_types.idl will also end up updating the golang generated files
as well.
- Make a variable for the generated files, and a target in
xenlight/Makefile which will only re-generate the files.
- Add a target in libxl/Makefile to call external idl generation
targets (currently only golang).
For ease of testing, also add a specific target in libxl/Makefile just
to check and update files generated from the IDL.
This does mean that there are two potential paths for generating the
files during a parallel build; but that shouldn't be an issue, since
tools/golang/xenlight should never be built until after tools/libxl
has completed building anyway.
Signed-off-by: George Dunlap <george.dunlap@citrix.com>
Reviewed-by: Nick Rosbrook <rosbrookn@ainfosec.com>
Acked-by: Ian Jackson <ian.jackson@eu.citrix.com>
Jan Beulich [Thu, 28 May 2020 10:03:25 +0000 (12:03 +0200)]
VT-x: extend LBR Broadwell errata coverage
For lbr_tsx_fixup_check() simply name a few more specific erratum
numbers.
For bdf93_fixup_check(), however, more models are affected. Oddly enough
despite being the same model and stepping, the erratum is listed for
Xeon E3 but not its Core counterpart. Apply the workaround uniformly,
and also for Xeon D, which only has the LBR-from one listed in its spec
update.
Seeing this broader applicability, rename anything BDF93-related to more
generic names.
Signed-off-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
Reviewed-by: Kevin Tian <kevin.tian@intel.com>
Jan Beulich [Thu, 28 May 2020 10:00:24 +0000 (12:00 +0200)]
x86: relax LDT check in arch_set_info_guest()
It is wrong for us to check the base address when there's no LDT in the
first place. Once we don't do this check anymore we can also set the
base address to a non-canonical value when the LDT is empty.
Signed-off-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Roger Pau Monné <roger.pau@citrix.com>
Acked-by: Andrew Cooper <andrew.cooper3@citrix.com>
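The relaxed check in the commit above can be sketched as follows. Only the rule "do not look at the base when there is no LDT" comes from the commit; the function names are hypothetical, 48-bit virtual addresses are assumed, and any further conditions of the real check are not modelled.

```python
def is_canonical(addr, va_bits=48):
    """True if bits [63:va_bits-1] of a 64-bit address are all equal."""
    top = addr >> (va_bits - 1)
    return top == 0 or top == (1 << (64 - va_bits + 1)) - 1


def ldt_ok(ldt_base, ldt_ents):
    """Only validate the base when the LDT actually has entries."""
    if ldt_ents == 0:
        return True
    return is_canonical(ldt_base)
```

With this shape, an empty LDT may carry any base value, including a non-canonical one.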
Stefano Stabellini [Wed, 15 Apr 2020 01:02:55 +0000 (18:02 -0700)]
xen/arm: call iomem_permit_access for passthrough devices
iomem_permit_access should be called for MMIO regions of devices
assigned to a domain. Currently it is not called for MMIO regions of
passthrough devices of Dom0less guests. This patch fixes it.
Signed-off-by: Stefano Stabellini <stefano.stabellini@xilinx.com>
Acked-by: Julien Grall <julien@xen.org>
Andrew Cooper [Wed, 27 May 2020 12:48:45 +0000 (13:48 +0100)]
x86/boot: Fix load_system_tables() to be NMI/#MC-safe
During boot, load_system_tables() is used in reinit_bsp_stack() to switch the
virtual addresses used from their .data/.bss alias, to their directmap alias.
The structure assignment is implemented as a memset() to zero first, then a
copy-in of the new data. This causes the NMI/#MC stack pointers to
transiently become 0, at a point where we may have an NMI watchdog running.
Rewrite the logic using a volatile tss pointer (equivalent to, but more
readable than, using ACCESS_ONCE() for all writes).
This does drop the zeroing side effect for holes in the structure, but the
backing memory for the TSS is fully zeroed anyway, and architecturally, they
are all reserved.
Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
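Python has no equivalent of a volatile pointer, so the sketch below only models the states an interrupt could observe under the two update strategies described in the commit above. The field names `ist_nmi` and `ist_mce` and both function names are hypothetical.

```python
def assign_by_zeroing(tss, new):
    """Model of 'memset then copy': yields every intermediate state."""
    for key in tss:
        tss[key] = 0
    yield dict(tss)  # an NMI arriving here would see zeroed stack pointers
    for key, val in new.items():
        tss[key] = val
    yield dict(tss)


def assign_fieldwise(tss, new):
    """Model of per-field stores: each field goes old -> new directly."""
    for key, val in new.items():
        tss[key] = val
        yield dict(tss)
```

In the field-wise version every observable state holds either the old or the new pointer for each field, never zero.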
Tamas K Lengyel [Wed, 27 May 2020 07:50:55 +0000 (09:50 +0200)]
x86/mem_sharing: gate enabling on cpu_has_vmx
It is unclear whether mem_sharing was ever made to work on other architectures
but at this time the only verified platform for it is vmx. There are no plans to support
or maintain it on other architectures. Make this explicit by checking during
initialization.
Signed-off-by: Tamas K Lengyel <tamas@tklengyel.com>
Reviewed-by: Wei Liu <wl@xen.org>
Jan Beulich [Wed, 27 May 2020 07:49:37 +0000 (09:49 +0200)]
x86: clear RDRAND CPUID bit on AMD family 15h/16h
Inspired by Linux commit c49a0a80137c7ca7d6ced4c812c9e07a949f6f24:
There have been reports of RDRAND issues after resuming from suspend on
some AMD family 15h and family 16h systems. This issue stems from a BIOS
not performing the proper steps during resume to ensure RDRAND continues
to function properly.
Update the CPU initialization to clear the RDRAND CPUID bit for any family
15h and 16h processor that supports RDRAND. If it is known that the family
15h or family 16h system does not have an RDRAND resume issue or that the
system will not be placed in suspend, the "cpuid=rdrand" kernel parameter
can be used to stop the clearing of the RDRAND CPUID bit.
Note, that clearing the RDRAND CPUID bit does not prevent a processor
that normally supports the RDRAND instruction from executing it. So any
code that determined the support based on family and model won't #UD.
Warn if no explicit choice was given on affected hardware.
Check RDRAND functions at boot as well as after S3 resume (the retry
limit chosen is entirely arbitrary).
Signed-off-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Roger Pau Monné <roger.pau@citrix.com>
Acked-by: Andrew Cooper <andrew.cooper3@citrix.com>
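A bounded-retry sanity check of the kind mentioned at the end of the commit above might look like the sketch below. The acceptance criterion (two differing values within the retry limit) is an assumption of this sketch and is not taken from the Xen code; `read_rdrand` is a hypothetical callable standing in for the instruction.

```python
def rdrand_looks_sane(read_rdrand, retries=5):
    """Return True if two differing values are obtained within `retries`.

    `read_rdrand` is a hypothetical callable returning (ok, value), where
    `ok` models the carry flag set by a successful RDRAND.
    """
    seen = set()
    for _ in range(retries):
        ok, value = read_rdrand()
        if ok:
            seen.add(value)
        if len(seen) >= 2:
            return True
    return False
```

The same function could be called once at boot and again after S3 resume, with the retry limit being as arbitrary here as the commit says it is there.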
Andrew Cooper [Mon, 27 Apr 2020 12:19:21 +0000 (13:19 +0100)]
x86/ioemul: Rewrite stub generation to be shadow stack compatible
The logic is completely undocumented and almost impossible to follow. It
actually uses return oriented programming. Rewrite it to conform to more
normal call mechanics, and leave a big comment explaining things. As well as
the code being easier to follow, it will execute faster as it isn't fighting
the branch predictor.
Move the ioemul_handle_quirk() function pointer from traps.c to
ioport_emulate.c. There is no reason for it to live outside the two
translation units which use it. Alter the behaviour to return the number of
bytes written into the stub.
Introduce a new nocall annotation using __attribute__((error)) to prohibit
calls being made.
Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
George Dunlap [Tue, 26 May 2020 11:01:27 +0000 (12:01 +0100)]
golang: Add a variable for the libxl source directory
...rather than duplicating the path in several places.
Signed-off-by: George Dunlap <george.dunlap@citrix.com>
Reviewed-by: Nick Rosbrook <rosbrookn@ainfosec.com>
George Dunlap [Tue, 26 May 2020 11:01:26 +0000 (12:01 +0100)]
golang: Add a minimum go version to go.mod
`go build` wants to add the current go version to go.mod as the
minimum every time we run `make` in the directory. Add 1.11 (the
earliest Go version that supports modules) there to make it happy.
Signed-off-by: George Dunlap <george.dunlap@citrix.com>
Reviewed-by: Nick Rosbrook <rosbrookn@ainfosec.com>
Andrew Cooper [Thu, 21 May 2020 08:45:27 +0000 (09:45 +0100)]
x86/shadow: Reposition sh_remove_write_access_from_sl1p()
When compiling with SHOPT_OUT_OF_SYNC disabled, the build fails with:
common.c:41:12: error: ‘sh_remove_write_access_from_sl1p’ declared ‘static’ but never defined [-Werror=unused-function]
static int sh_remove_write_access_from_sl1p(struct domain *d, mfn_t gmfn,
^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
due to an unguarded forward declaration.
It turns out there is no need to forward declare
sh_remove_write_access_from_sl1p() to begin with, so move it to just ahead of
its first user, which is within a larger #ifdef'd SHOPT_OUT_OF_SYNC block.
Fix up for style while moving it. No functional change.
Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Reviewed-by: Roger Pau Monné <roger.pau@citrix.com>
Reviewed-by: Tim Deegan <tim@xen.org>
Juergen Gross [Mon, 25 May 2020 06:21:55 +0000 (08:21 +0200)]
vmx: let opt_ept_ad always reflect the current setting
In case opt_ept_ad has not been set explicitly by the user via command
line or runtime parameter, it is treated as "no" on Avoton cpus.
Change that handling by setting opt_ept_ad to 0 for this cpu type
explicitly if no user value has been set.
By putting this into the (renamed) boot time initialization of vmcs.c
_vmx_cpu_up() can be made static.
Signed-off-by: Juergen Gross <jgross@suse.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Kevin Tian <kevin.tian@intel.com>
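The change in the commit above amounts to resolving a tri-state option once at boot, so that the variable itself always shows the effective setting. A small sketch, assuming a hypothetical `UNSET` sentinel and assuming the default is "on" for non-Avoton CPUs:

```python
UNSET = -1  # hypothetical sentinel for "no explicit user choice"


def resolve_ept_ad(opt_ept_ad, is_avoton):
    """Return the value the option should hold after boot-time init."""
    if opt_ept_ad != UNSET:
        return opt_ept_ad          # an explicit choice always wins
    return 0 if is_avoton else 1   # assumed defaults for this sketch
```

After this runs, readers of the option no longer need to special-case the CPU model.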
Julien Grall [Sat, 16 May 2020 19:16:57 +0000 (20:16 +0100)]
xen/arm: plat: Allocate as much as possible memory below 1GB for dom0 for RPI
The Raspberry PI 4 has devices that can only DMA into the first GB of
the RAM. Therefore we want to allocate as much memory as possible below 1GB
for dom0.
Use the recently introduced dma_bitsize field to specify the DMA width
supported.
Signed-off-by: Julien Grall <jgrall@amazon.com>
Reported-by: Corey Minyard <minyard@acm.org>
Tested-by: Corey Minyard <cminyard@mvista.com>
Reviewed-by: Stefano Stabellini <sstabellini@kernel.org>
Julien Grall [Sat, 16 May 2020 10:57:00 +0000 (11:57 +0100)]
xen/arm: Take into account the DMA width when allocating Dom0 memory banks
At the moment, Xen is assuming that all the devices are at least 32-bit
DMA capable. However, some SoCs have devices that may be able to access
a much restricted range. For instance, the Raspberry PI 4 has devices
that can only access the first GB of RAM.
The function arch_get_dma_bit_size() will return the lowest DMA width on
the platform. Use it to decide what is the limit for the low memory.
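A minimal sketch of how a DMA width might translate into a low-memory limit. `example_dma_bit_size` is a hypothetical stand-in for the value returned by arch_get_dma_bit_size(), and the helper names are illustrative only, not the code in this change.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for arch_get_dma_bit_size(); 30 bits covers 1GB. */
static unsigned int example_dma_bit_size = 30;

/* First physical address the most restricted device cannot reach. */
static uint64_t example_lowmem_limit(void)
{
    return UINT64_C(1) << example_dma_bit_size;
}

/* True if the bank [start, start + size) lies entirely below the limit. */
static bool example_bank_is_dma_reachable(uint64_t start, uint64_t size)
{
    uint64_t limit = example_lowmem_limit();

    /* Written this way to avoid overflow in start + size. */
    return size <= limit && start <= limit - size;
}
```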
Signed-off-by: Julien Grall <jgrall@amazon.com>
Tested-by: Corey Minyard <cminyard@mvista.com>
Reviewed-by: Stefano Stabellini <sstabellini@kernel.org>
Julien Grall [Sat, 16 May 2020 10:41:16 +0000 (11:41 +0100)]
xen/arm: Allow a platform to override the DMA width
At the moment, Xen is assuming that all the devices are at least 32-bit
DMA capable. However, some SoCs have devices that may be able to access
a much restricted range. For instance, the RPI has devices that can
only access the first 1GB of RAM.
The structure platform_desc is now extended to allow a platform to
override the DMA width. The new field is used to implement
arch_get_dma_bit_size().
The prototype is now moved to asm-arm/mm.h as the function is not NUMA
specific. The implementation is done in platform.c so we don't have to
include platform.h everywhere. This should be fine as the function is
not expected to be called in hotpath.
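The override could look roughly like the sketch below. `struct example_platform_desc` and `example_dma_bit_size_for()` are hypothetical names, and treating 0 as "no override" is an assumption; the 32-bit fallback reflects the existing assumption described above.

```c
#include <stddef.h>

/* Hypothetical, trimmed-down platform description. */
struct example_platform_desc {
    const char *name;
    unsigned int dma_bitsize; /* assumed: 0 means "no override" */
};

/* Fall back to 32 bits unless the platform asks for something else. */
static unsigned int
example_dma_bit_size_for(const struct example_platform_desc *plat)
{
    if (plat != NULL && plat->dma_bitsize != 0)
        return plat->dma_bitsize;

    return 32;
}
```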
Signed-off-by: Julien Grall <jgrall@amazon.com>
Reviewed-by: Volodymyr Babchuk <volodymyr_babchuk@epam.com>
Tested-by: Corey Minyard <cminyard@mvista.com>
Reviewed-by: Stefano Stabellini <sstabellini@kernel.org>
Nick Rosbrook [Thu, 21 May 2020 14:55:25 +0000 (10:55 -0400)]
golang/xenlight: add an empty line after DO NOT EDIT comment
When generating documentation, pkg.go.dev and godoc.org assume a comment
that immediately precedes the package declaration is a "package
comment", and should be shown in the documentation. Add an empty line
after the DO NOT EDIT comment in generated files to prevent these
comments from appearing as "package comments."
Signed-off-by: Nick Rosbrook <rosbrookn@ainfosec.com>
Reviewed-by: George Dunlap <george.dunlap@citrix.com>
Andrew Cooper [Thu, 21 May 2020 08:19:33 +0000 (09:19 +0100)]
xen/trace: Don't dump offline CPUs in debugtrace_dump_worker()
The 'T' debugkey reliably wedges on one of my systems, which has a sparse
APIC_ID layout due to a non power-of-2 number of cores per socket. The
per_cpu(dt_cpu_data, cpu) calculation falls over the deliberately non-canonical
poison value.
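The general shape of the fix can be sketched as follows, using plain arrays in place of Xen's per-CPU machinery. All names here are hypothetical and the loop only counts the buffers it would dump.

```c
#include <stdbool.h>
#include <stddef.h>

#define EXAMPLE_NR_CPUS 8

static bool example_cpu_online[EXAMPLE_NR_CPUS];
static const char *example_trace_buf[EXAMPLE_NR_CPUS];

/* Returns the number of per-CPU buffers that would be dumped. */
static unsigned int example_dump_online_cpus(void)
{
    unsigned int cpu, dumped = 0;

    for (cpu = 0; cpu < EXAMPLE_NR_CPUS; cpu++) {
        /* Per-CPU data of an offline CPU is not valid: never touch it. */
        if (!example_cpu_online[cpu])
            continue;

        if (example_trace_buf[cpu] != NULL)
            dumped++;
    }

    return dumped;
}
```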
Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Acked-by: Jan Beulich <jbeulich@suse.com>
Andrew Cooper [Fri, 22 May 2020 14:46:44 +0000 (15:46 +0100)]
x86/idle: Extend ISR/C6 erratum workaround to Haswell
This bug was first discovered against Haswell. It is definitely affected.
(The XenServer ticket for this bug was opened on 2013-05-30 which is coming up
on 7 years old, and predates Broadwell).
Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Acked-by: Jan Beulich <jbeulich@suse.com>
Andrew Cooper [Mon, 18 May 2020 15:13:33 +0000 (16:13 +0100)]
x86/traps: Rework #PF[Rsvd] bit handling
The reserved_bit_page_fault() paths effectively turn reserved bit faults into
a warning, but in the light of L1TF, the real impact is far more serious.
Make #PF[Rsvd] a hard error, irrespective of mode. Any new panic() caused by
this constitutes pagetable corruption, and probably an L1TF gadget needing
fixing.
Drop the PFEC_reserved_bit check in __page_fault_type() which has been made
dead by the rearrangement in do_page_fault().
Additionally, drop the comment for do_page_fault(). It is inaccurate (bit 0
being set isn't always a protection violation) and stale (missing bits
5,6,15,31).
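For reference, the reserved-bit condition is reported in bit 3 of the x86 page fault error code. A minimal sketch of the test, with hypothetical names (the real constant in Xen is PFEC_reserved_bit):

```c
#include <stdbool.h>
#include <stdint.h>

/* Bit 3 of the page fault error code: a reserved bit was set in a PTE. */
#define EXAMPLE_PFEC_RESERVED_BIT (UINT32_C(1) << 3)

/* True if the fault must be treated as pagetable corruption. */
static bool example_pf_is_reserved_bit_fault(uint32_t error_code)
{
    return (error_code & EXAMPLE_PFEC_RESERVED_BIT) != 0;
}
```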
Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Acked-by: Jan Beulich <jbeulich@suse.com>
Jan Beulich [Fri, 22 May 2020 14:10:40 +0000 (16:10 +0200)]
x86/PV: polish pv_set_gdt()
There's no need to invoke get_page_from_gfn(), and there's also no need
to update the passed in frames[]. Invoke get_page_and_type() directly.
Also make the function's frames[] parameter const, change its return
type to int, and drop the bogus casts from two of its invocations.
Finally a little bit of cosmetics.
Signed-off-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Andrew Cooper <andrew.cooper3@citrix.com>
Jan Beulich [Fri, 22 May 2020 14:09:54 +0000 (16:09 +0200)]
x86: relax GDT check in arch_set_info_guest()
It is wrong for us to check frames beyond the guest specified limit
(in the compat case another loop bound is already correct).
Signed-off-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Roger Pau Monné <roger.pau@citrix.com>
Acked-by: Andrew Cooper <andrew.cooper3@citrix.com>
Roger Pau Monné [Fri, 22 May 2020 14:08:54 +0000 (16:08 +0200)]
x86/idle: prevent entering C3/C6 on some Intel CPUs due to errata
Apply a workaround for errata BA80, AAK120, AAM108, AAO67, BD59,
AAY54: Rapid Core C3/C6 Transition May Cause Unpredictable System
Behavior.
Limit maximum C state to C1 when SMT is enabled on the affected CPUs.
Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>
Acked-by: Andrew Cooper <andrew.cooper3@citrix.com>
Roger Pau Monné [Fri, 22 May 2020 14:07:38 +0000 (16:07 +0200)]
x86/idle: prevent entering C6 with in service interrupts on Intel
Apply a workaround for Intel errata BDX99, CLX30, SKX100, CFW125,
BDF104, BDH85, BDM135, KWB131: "A Pending Fixed Interrupt May Be
Dispatched Before an Interrupt of The Same Priority Completes".
Apply the errata to all server and client models (big cores) from
Broadwell to Cascade Lake. The workaround is grouped together with the
existing fix for errata AAJ72, and the eoi from the function name is
removed.
Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
Jan Beulich [Fri, 22 May 2020 12:41:15 +0000 (14:41 +0200)]
x86/HVM: cosmetics to hvm_set_cr3()
Eliminate the not really useful local variable "old". Reduce the scope
of "page". Rename the latched "current".
Signed-off-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Roger Pau Monné <roger.pau@citrix.com>
Acked-by: Andrew Cooper <andrew.cooper3@citrix.com>
Jan Beulich [Fri, 22 May 2020 12:40:30 +0000 (14:40 +0200)]
x86/HVM: refuse CR3 loads with reserved (upper) bits set
While bits 11 and below are, if not used for other purposes, reserved
but ignored, bits beyond physical address width are supposed to raise
exceptions (at least in the non-nested case; I'm not convinced the
current nested SVM/VMX behavior of raising #GP(0) here is correct, but
that's not the subject of this change).
Introduce currd as a local variable, and replace other v->domain
instances at the same time.
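A minimal sketch of the check described above, assuming the caller already knows the guest's physical address width and has dealt with any bit that is not part of the address. The function name and parameters are hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * True if any bit at or above the physical address width is set.
 * Bits 11 and below are deliberately not examined here.
 */
static bool example_cr3_has_reserved_upper_bits(uint64_t cr3,
                                                unsigned int paddr_bits)
{
    /* Shifting by 64 or more would be undefined, so guard against it. */
    return paddr_bits < 64 && (cr3 >> paddr_bits) != 0;
}
```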
Signed-off-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Roger Pau Monné <roger.pau@citrix.com>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
Jan Beulich [Fri, 22 May 2020 12:37:09 +0000 (14:37 +0200)]
x86/HVM: move NOFLUSH handling out of hvm_set_cr3()
The bit is meaningful only for MOV-to-CR3 insns, not anywhere else, in
particular not when loading nested guest state.
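One way to picture the split: the MOV-to-CR3 path decodes the no-flush bit (bit 63, meaningful when CR4.PCIDE is set) and hands a clean value on, while other callers pass their value through untouched. The sketch below uses hypothetical names and is not the code in this change.

```c
#include <stdbool.h>
#include <stdint.h>

#define EXAMPLE_CR3_NOFLUSH (UINT64_C(1) << 63)

struct example_cr3_load {
    uint64_t value; /* value to hand on to the CR3 setter */
    bool noflush;   /* whether the TLB flush may be skipped */
};

/* Only the MOV-to-CR3 emulation path would call this. */
static struct example_cr3_load example_decode_mov_to_cr3(uint64_t operand,
                                                         bool pcide)
{
    struct example_cr3_load load = { operand, false };

    if (pcide && (operand & EXAMPLE_CR3_NOFLUSH)) {
        load.noflush = true;
        load.value = operand & ~EXAMPLE_CR3_NOFLUSH;
    }

    return load;
}
```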
Signed-off-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Paul Durrant <paul@xen.org>
Acked-by: Andrew Cooper <andrew.cooper3@citrix.com>
Jan Beulich [Fri, 22 May 2020 12:35:04 +0000 (14:35 +0200)]
x86emul: correct test harness {evex} assembler capability check
The {evex} pseudo prefix gets rejected by gas for insns not allowing
EVEX encoding. Except there's a gas bug due to which its check gets
bypassed for insns without operands. Let's not rely on that bug to
remain there.
Signed-off-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Andrew Cooper <andrew.cooper3@citrix.com>
George Dunlap [Fri, 22 May 2020 09:35:10 +0000 (10:35 +0100)]
golang: Update generated files after libxl_types.idl change
c/s
7efd9f3d45 ("libxl: Handle Linux stubdomain specific QEMU
options.") modified libxl_types.idl. Run gengotypes.py again to update
the generated golang bindings.
Signed-off-by: George Dunlap <george.dunlap@citrix.com>
Acked-by: Wei Liu <wl@xen.org>
Anthony PERARD [Wed, 20 May 2020 16:39:42 +0000 (17:39 +0100)]
tools/xenstore: mark variable in header as extern
This patch fixes the "multiple definition of `xprintf'" (or xgt_handle)
build error with GCC 10.1.0.
These are the errors reported:
gcc xs_tdb_dump.o utils.o tdb.o talloc.o -o xs_tdb_dump
/usr/bin/ld: utils.o:./utils.h:27: multiple definition of `xprintf'; xs_tdb_dump.o:./utils.h:27: first defined here
[...]
gcc xenstored_core.o xenstored_watch.o xenstored_domain.o xenstored_transaction.o xenstored_control.o xs_lib.o talloc.o utils.o tdb.o hashtable.o xenstored_posix.o -lsystemd -Wl,-rpath-link=... ../libxc/libxenctrl.so -lrt -o xenstored
/usr/bin/ld: xenstored_watch.o:./xenstored_core.h:207: multiple definition of `xgt_handle'; xenstored_core.o:./xenstored_core.h:207: first defined here
/usr/bin/ld: xenstored_domain.o:./xenstored_core.h:207: multiple definition of `xgt_handle'; xenstored_core.o:./xenstored_core.h:207: first defined here
/usr/bin/ld: xenstored_transaction.o:./xenstored_core.h:207: multiple definition of `xgt_handle'; xenstored_core.o:./xenstored_core.h:207: first defined here
/usr/bin/ld: xenstored_control.o:./xenstored_core.h:207: multiple definition of `xgt_handle'; xenstored_core.o:./xenstored_core.h:207: first defined here
/usr/bin/ld: xenstored_posix.o:./xenstored_core.h:207: multiple definition of `xgt_handle'; xenstored_core.o:./xenstored_core.h:207: first defined here
A difference that I noticed with an earlier version of the build chain is
that before, I had:
$ nm xs_tdb_dump.o | grep xprintf
0000000000000008 C xprintf
And now, it's:
0000000000000000 B xprintf
With the patch applied, the symbol isn't in xs_tdb_dump.o anymore.
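The pattern behind the fix, condensed into a single file for illustration. GCC 10 defaults to -fno-common, so a variable defined without `extern` in a header ends up defined in every object that includes it. The identifiers below are hypothetical, not the ones in utils.h or xenstored_core.h.

```c
/* What belongs in the header: a declaration only, no storage. */
extern int example_handle;

/* What belongs in exactly one .c file: the single definition. */
int example_handle = 42;

/* Any other translation unit just uses the declared symbol. */
static int example_read_handle(void)
{
    return example_handle;
}
```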
Signed-off-by: Anthony PERARD <anthony.perard@citrix.com>
Acked-by: Andrew Cooper <andrew.cooper3@citrix.com>
Reviewed-by: Ian Jackson <ian.jackson@eu.citrix.com>
Jan Beulich [Wed, 20 May 2020 10:49:28 +0000 (12:49 +0200)]
x86/mem-paging: further adjustments to p2m_mem_paging_prep()'s error handling
Address late comments on
ecb913be4aaa ("x86/mem-paging: correct
p2m_mem_paging_prep()'s error handling"):
- insert a gprintk() ahead of domain_crash(),
- add a comment.
Requested-by: Andrew Cooper <andrew.cooper3@citrix.com>
Signed-off-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
Roger Pau Monné [Wed, 20 May 2020 10:48:37 +0000 (12:48 +0200)]
x86/idle: rework C6 EOI workaround
Change the C6 EOI workaround (errata AAJ72) to use x86_match_cpu. Also
call the workaround from mwait_idle, previously it was only used by
the ACPI idle driver. Finally make sure the routine is called for all
states equal or greater than ACPI_STATE_C3, note that the ACPI driver
doesn't currently handle them, but the errata condition shouldn't be
limited by that.
Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
David Woodhouse [Wed, 20 May 2020 10:47:48 +0000 (12:47 +0200)]
x86/setup: lift dom0 creation out into create_dom0() function
The creation of dom0 can be relatively self-contained. Shift it into
a separate function and simplify __start_xen() a little bit.
This is a cleanup in its own right, but will be even more desirable
when live update provides an alternative path through __start_xen()
that doesn't involve creating a new dom0 at all.
Move the calculation of the 'initrd' parameter for create_dom0()
down past the cosmetic printk about NX support, because in the fullness
of time the whole initrd and create_dom0() part will be under the same
"not live update" conditional. And in the meantime it's just neater.
Also drop the explicit check for initrd to be module #0 since that would
be the dom0 kernel and the corresponding bit is always clear in
module_map.
Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
Acked-by: Jan Beulich <jbeulich@suse.com>
Jason Andryuk [Tue, 19 May 2020 01:55:03 +0000 (21:55 -0400)]
libxl: Check stubdomain kernel & ramdisk presence
Just out of context is the following comment for libxl__domain_make:
/* fixme: this function can leak the stubdom if it fails */
When the stubdomain kernel or ramdisk is not present, the domid and
stubdomain name will indeed be leaked. Avoid the leak by checking the
file presence and erroring out when absent. It doesn't fix all cases,
but it avoids a big one when using a linux device model stubdomain.
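A minimal sketch of such a presence check using POSIX access(). The function name and return convention are hypothetical; the actual change uses libxl's own helpers and error codes.

```c
#include <stddef.h>
#include <unistd.h>

/*
 * Return 0 if both files are readable, -1 otherwise. Checking before
 * any domain is created means there is nothing to leak on failure.
 */
static int example_check_stubdom_files(const char *kernel,
                                       const char *ramdisk)
{
    if (kernel == NULL || access(kernel, R_OK) != 0)
        return -1;

    if (ramdisk == NULL || access(ramdisk, R_OK) != 0)
        return -1;

    return 0;
}
```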
Signed-off-by: Jason Andryuk <jandryuk@gmail.com>
Acked-by: Ian Jackson <ian.jackson@eu.citrix.com>
Jason Andryuk [Tue, 19 May 2020 01:55:02 +0000 (21:55 -0400)]
docs: Add device-model-domid to xenstore-paths
Document device-model-domid for when using a device model stubdomain.
Signed-off-by: Jason Andryuk <jandryuk@gmail.com>
Acked-by: Ian Jackson <ian.jackson@eu.citrix.com>
Marek Marczykowski-Górecki [Tue, 19 May 2020 01:55:01 +0000 (21:55 -0400)]
libxl: consider also qemu in stubdomain in libxl__dm_active check
Since qemu-xen can now run in stubdomain too, handle this case when
checking its state too.
Signed-off-by: Marek Marczykowski-Górecki <marmarek@invisiblethingslab.com>
Reviewed-by: Jason Andryuk <jandryuk@gmail.com>
Signed-off-by: Jason Andryuk <jandryuk@gmail.com>
Acked-by: Ian Jackson <ian.jackson@eu.citrix.com>
Marek Marczykowski-Górecki [Tue, 19 May 2020 01:55:00 +0000 (21:55 -0400)]
libxl: ignore emulated IDE disks beyond the first 4
Qemu supports only 4 emulated IDE disks; when given more (or with higher
indexes), it will fail to start. Since the disks can still be accessible
using the PV interface, just ignore the emulated path and log a warning, instead
of rejecting the configuration altogether.
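The decision can be sketched as below. The limit of 4 comes from the description above; the function name, the index convention and the warning text are hypothetical.

```c
#include <stdbool.h>
#include <stdio.h>

#define EXAMPLE_MAX_EMULATED_IDE_DISKS 4

/*
 * True if the disk should also get an emulated IDE path. Disks beyond
 * the limit stay reachable through the PV interface only.
 */
static bool example_disk_gets_emulated_ide(int disk_index)
{
    if (disk_index >= 0 && disk_index < EXAMPLE_MAX_EMULATED_IDE_DISKS)
        return true;

    fprintf(stderr, "warning: disk %d: no emulated IDE path, PV only\n",
            disk_index);
    return false;
}
```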
Signed-off-by: Marek Marczykowski-Górecki <marmarek@invisiblethingslab.com>
Reviewed-by: Jason Andryuk <jandryuk@gmail.com>
Signed-off-by: Jason Andryuk <jandryuk@gmail.com>
Acked-by: Ian Jackson <ian.jackson@eu.citrix.com>
Marek Marczykowski-Górecki [Tue, 19 May 2020 01:54:59 +0000 (21:54 -0400)]
libxl: require qemu in dom0 for multiple stubdomain consoles
Device model stubdomains (both Mini-OS + qemu-trad and linux + qemu-xen)
are always started with at least 3 consoles: log, save, and restore.
Until xenconsoled learns how to handle multiple consoles, this is needed
for save/restore support.
For Mini-OS stubdoms, this is a bug. In practice, it works in most
cases because there is something else that triggers qemu in dom0 too:
vfb/vkb added if vnc/sdl/spice is enabled.
Additionally, Linux-based stubdomain waits for all the backends to
initialize during boot. Lack of some console backends results in
stubdomain startup timeout.
This is a temporary patch until xenconsoled is improved.
Signed-off-by: Marek Marczykowski-Górecki <marmarek@invisiblethingslab.com>
[Updated commit message with Marek's explanation from mailing list.]
Signed-off-by: Jason Andryuk <jandryuk@gmail.com>
Acked-by: Ian Jackson <ian.jackson@eu.citrix.com>
Marek Marczykowski-Górecki [Tue, 19 May 2020 01:54:58 +0000 (21:54 -0400)]
libxl: use vchan for QMP access with Linux stubdomain
Access to QMP of QEMU in Linux stubdomain is possible over vchan
connection. Handle the actual vchan connection in a separate process
(vchan-socket-proxy). This simplifies integration with QMP (already
quite complex), but also allows preliminary filtering of (potentially
malicious) QMP input.
Since only one client can be connected to vchan server at the same time
and it is not enforced by the libxenvchan itself, additional client-side
locking is needed. It is implicitly implemented by vchan-socket-proxy,
as it handles only one connection at a time. Note that qemu supports only
one simultaneous client on a control socket anyway (but in UNIX socket
case, it enforces it server-side), so it doesn't add any extra
limitation.
libxl qmp client code already has locking to handle concurrent access
attempts to the same qemu qmp interface.
Signed-off-by: Marek Marczykowski-Górecki <marmarek@invisiblethingslab.com>
Squash in changes of regenerated autotools files.
Kill the vchan-socket-proxy so we don't leak the daemonized processes.
libxl__stubdomain_is_linux_running() works against the guest_domid, but
the xenstore path is beneath the stubdomain. This leads to the use of
libxl_is_stubdom in addition to libxl__stubdomain_is_linux_running() so
that the stubdomain calls kill for the qmp-proxy.
Also call libxl__qmp_cleanup() to remove the unix sockets used by
vchan-socket-proxy. vchan-socket-proxy only creates qmp-libxl-$domid,
and libxl__qmp_cleanup removes that as well as qmp-libxenstat-$domid.
However, it tolerates ENOENT, and a stray qmp-libxenstat-$domid should
not exist.
Signed-off-by: Jason Andryuk <jandryuk@gmail.com>
Acked-by: Ian Jackson <ian.jackson@eu.citrix.com>
Jason Andryuk [Tue, 19 May 2020 01:54:57 +0000 (21:54 -0400)]
libxl: Refactor kill_device_model to libxl__kill_xs_path
Move kill_device_model to libxl__kill_xs_path so we have a helper to
kill a process from a pid stored in xenstore. We'll be using it to kill
vchan-qmp-proxy.
libxl__kill_xs_path takes a "what" string for use in printing error
messages. kill_device_model is retained in libxl_dm.c to provide the
string.
Signed-off-by: Jason Andryuk <jandryuk@gmail.com>
Acked-by: Ian Jackson <ian.jackson@eu.citrix.com>
Marek Marczykowski-Górecki [Tue, 19 May 2020 01:54:56 +0000 (21:54 -0400)]
tools: add simple vchan-socket-proxy
Add a simple proxy for tunneling socket connection over vchan. This is
based on existing vchan-node* applications, but extended with socket
support. vchan-socket-proxy serves both as a client and as a server,
depending on parameters. It can be used to transparently communicate
with an application in another domain that normally exposes a UNIX socket
interface. Specifically, it's written to communicate with qemu running
within stubdom.
Server mode listens for vchan connections and when one is opened,
connects to a pointed UNIX socket. Client mode listens on UNIX
socket and when someone connects, opens a vchan connection. Only
a single connection at a time is supported.
Additionally, socket can be provided as a number - in which case it's
interpreted as an already open FD (in case of a UNIX listening socket -
listen() needs to be already called). Or "-" meaning stdin/stdout - in
which case it is reduced to vchan-node2 functionality.
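The three ways of interpreting the socket argument could be told apart roughly as follows. This is a sketch with hypothetical names, not the parsing code in vchan-socket-proxy itself.

```c
#include <errno.h>
#include <limits.h>
#include <stdlib.h>
#include <string.h>

enum example_socket_arg {
    EXAMPLE_ARG_STDIO, /* "-": use stdin/stdout */
    EXAMPLE_ARG_FD,    /* a number: already open file descriptor */
    EXAMPLE_ARG_PATH,  /* anything else: path of a UNIX socket */
};

/* Classify arg; on EXAMPLE_ARG_FD the descriptor is stored in *fd_out. */
static enum example_socket_arg example_classify_socket_arg(const char *arg,
                                                           int *fd_out)
{
    char *end;
    long value;

    if (strcmp(arg, "-") == 0)
        return EXAMPLE_ARG_STDIO;

    errno = 0;
    value = strtol(arg, &end, 10);
    if (errno == 0 && end != arg && *end == '\0' &&
        value >= 0 && value <= INT_MAX) {
        *fd_out = (int)value;
        return EXAMPLE_ARG_FD;
    }

    return EXAMPLE_ARG_PATH;
}
```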
Example usage:
1. (in dom0) vchan-socket-proxy --mode=client <DOMID>
/local/domain/<DOMID>/data/vchan/1234 /run/qemu.(DOMID)
2. (in DOMID) vchan-socket-proxy --mode=server 0
/local/domain/<DOMID>/data/vchan/1234 /run/qemu.(DOMID)
This will listen on /run/qemu.(DOMID) in dom0 and whenever connection is
made, it will connect to DOMID, where server process will connect to
/run/qemu.(DOMID) there. When client disconnects, vchan connection is
terminated and server vchan-socket-proxy process also disconnects from
qemu.
Signed-off-by: Marek Marczykowski-Górecki <marmarek@invisiblethingslab.com>
Reviewed-by: Jason Andryuk <jandryuk@gmail.com>
Signed-off-by: Jason Andryuk <jandryuk@gmail.com>
Acked-by: Ian Jackson <ian.jackson@eu.citrix.com>
Marek Marczykowski-Górecki [Tue, 19 May 2020 01:54:55 +0000 (21:54 -0400)]
tools: add missing libxenvchan cflags
libxenvchan.h includes xenevtchn.h and xengnttab.h, so applications built
with it need the applicable -I in CFLAGS too.
Signed-off-by: Marek Marczykowski-Górecki <marmarek@invisiblethingslab.com>
Reviewed-by: Jason Andryuk <jandryuk@gmail.com>
Signed-off-by: Jason Andryuk <jandryuk@gmail.com>
Acked-by: Ian Jackson <ian.jackson@eu.citrix.com>
Marek Marczykowski-Górecki [Tue, 19 May 2020 01:54:54 +0000 (21:54 -0400)]
libxl: add save/restore support for qemu-xen in stubdomain
Rely on a wrapper script in stubdomain to attach relevant consoles to
qemu. The save console (1) must be attached to fdset/1. When
performing a restore, $STUBDOM_RESTORE_INCOMING_ARG must be replaced on
the qemu command line by "fd:$FD", where $FD is an open file descriptor
number to the restore console (2).
Signed-off-by: Marek Marczykowski-Górecki <marmarek@invisiblethingslab.com>
Address TODO in dm_state_save_to_fdset: Only remove savefile for
non-stubdom.
Use $STUBDOM_RESTORE_INCOMING_ARG instead of fd:3 and update commit
message.
Signed-off-by: Jason Andryuk <jandryuk@gmail.com>
Acked-by: Ian Jackson <ian.jackson@eu.citrix.com>
Marek Marczykowski-Górecki [Tue, 19 May 2020 01:54:53 +0000 (21:54 -0400)]
tools/libvchan: notify server when client is connected
Let the server know when the client is connected. Otherwise the server will
notice only when the client sends some data.
This change does not break existing clients, as libvchan user should
handle spurious notifications anyway (for example acknowledge of remote
side reading the data).
Cc: Daniel De Graaf <dgdegra@tycho.nsa.gov>
Signed-off-by: Marek Marczykowski-Górecki <marmarek@invisiblethingslab.com>
Replace spaces with tabs to match the file's whitespace.
Signed-off-by: Jason Andryuk <jandryuk@gmail.com>
Acked-by: Ian Jackson <ian.jackson@eu.citrix.com>
Marek Marczykowski-Górecki [Tue, 19 May 2020 01:54:52 +0000 (21:54 -0400)]
xl: add stubdomain related options to xl config parser
Signed-off-by: Marek Marczykowski-Górecki <marmarek@invisiblethingslab.com>
Reviewed-by: Jason Andryuk <jandryuk@gmail.com>
Signed-off-by: Jason Andryuk <jandryuk@gmail.com>
Acked-by: Ian Jackson <ian.jackson@eu.citrix.com>
Marek Marczykowski-Górecki [Tue, 19 May 2020 01:54:51 +0000 (21:54 -0400)]
libxl: write qemu arguments into separate xenstore keys
This allows using arguments with spaces, like -append, without
nominating any special "separator" character.
Signed-off-by: Marek Marczykowski-Górecki <marmarek@invisiblethingslab.com>
Reviewed-by: Jason Andryuk <jandryuk@gmail.com>
Write arguments in dm-argv directory instead of overloading mini-os's
dmargs string.
Make libxl__write_stub_dmargs vary behaviour based on the
is_linux_stubdom flag.
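To illustrate the idea of one xenstore key per argument, the sketch below builds a key name under a dm-argv directory. The base path, the zero-padded numbering and the function name are assumptions made for illustration and are not taken from this change.

```c
#include <stddef.h>
#include <stdio.h>

/*
 * Build "<base>/dm-argv/NNN" for argument number index.
 * Returns 0 on success, -1 if the key does not fit in buf.
 */
static int example_dm_argv_key(char *buf, size_t len,
                               const char *base, unsigned int index)
{
    int n = snprintf(buf, len, "%s/dm-argv/%03u", base, index);

    return (n < 0 || (size_t)n >= len) ? -1 : 0;
}
```

Because each argument is stored as its own value, an argument containing spaces needs no quoting or separator character.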
Signed-off-by: Jason Andryuk <jandryuk@gmail.com>
Acked-by: Ian Jackson <ian.jackson@eu.citrix.com>
Jason Andryuk [Tue, 19 May 2020 01:54:50 +0000 (21:54 -0400)]
libxl: Use libxl__xs_* in libxl__write_stub_dmargs
Re-work libxl__write_stub_dmargs to use libxl__xs_* functions in a loop.
Signed-off-by: Jason Andryuk <jandryuk@gmail.com>
Acked-by: Ian Jackson <ian.jackson@eu.citrix.com>
Anthony PERARD [Wed, 11 Mar 2020 17:59:33 +0000 (17:59 +0000)]
tools: Use INSTALL_PYTHON_PROG
Whenever python scripts are installed, have the shebang be modified to use
whatever PYTHON_PATH is. This is useful for systems where python isn't available, or
where the package build tools prevent unversioned shebangs.
INSTALL_PYTHON_PROG only looks for "#!/usr/bin/env python".
Signed-off-by: Anthony PERARD <anthony.perard@citrix.com>
Acked-by: Wei Liu <wl@xen.org>
Anthony PERARD [Wed, 11 Mar 2020 17:59:32 +0000 (17:59 +0000)]
tools/python: Fix install-wrap
This allows install-wrap to be used when the source script is in a
subdirectory.
Signed-off-by: Anthony PERARD <anthony.perard@citrix.com>
Acked-by: Wei Liu <wl@xen.org>
Eric Shelton [Tue, 19 May 2020 01:54:49 +0000 (21:54 -0400)]
libxl: Handle Linux stubdomain specific QEMU options.
This patch creates an appropriate command line for the QEMU instance
running in a Linux-based stubdomain.
NOTE: a number of items are not currently implemented for Linux-based
stubdomains, such as:
- save/restore
- QMP socket
- graphics output (e.g., VNC)
Signed-off-by: Eric Shelton <eshelton@pobox.com>
Simon:
* fix disk path
* fix cdrom path and "format"
Signed-off-by: Simon Gaiser <simon@invisiblethingslab.com>
[drop Qubes-specific parts]
Signed-off-by: Marek Marczykowski-Górecki <marmarek@invisiblethingslab.com>
Allow setting stubdomain_ramdisk independently from stubdomain_kernel
Add a qemu- prefix for qemu-stubdom-linux-{kernel,rootfs} since stubdom
doesn't convey device-model. Use qemu- since this code is qemu specific.
Signed-off-by: Jason Andryuk <jandryuk@gmail.com>
Acked-by: Ian Jackson <ian.jackson@eu.citrix.com>
Marek Marczykowski-Górecki [Tue, 19 May 2020 01:54:48 +0000 (21:54 -0400)]
libxl: Allow running qemu-xen in stubdomain
No longer prohibit using a stubdomain with qemu-xen.
To help distinguish MiniOS and Linux stubdomains, add helper inline
functions libxl__stubdomain_is_linux() and
libxl__stubdomain_is_linux_running(). Those should be used where really
the difference is about MiniOS/Linux, not qemu-xen/qemu-xen-traditional.
Signed-off-by: Marek Marczykowski-Górecki <marmarek@invisiblethingslab.com>
Signed-off-by: Jason Andryuk <jandryuk@gmail.com>
Acked-by: Ian Jackson <ian.jackson@eu.citrix.com>
Marek Marczykowski-Górecki [Tue, 19 May 2020 01:54:47 +0000 (21:54 -0400)]
libxl: fix qemu-trad cmdline for no sdl/vnc case
When qemu is running in stubdomain, any attempt to initialize vnc/sdl
there will crash it (on failed attempt to load a keymap from a file). If
vfb is present, all those cases are skipped. But since
b053f0c4c9e533f3d97837cf897eb920b8355ed3 "libxl: do not start dom0 qemu
for stubdomain when not needed" it is possible to create a stubdomain
without vfb and, contrary to the comment, -vnc none does trigger VNC
initialization code (just skips exposing it externally).
Change the implicit SDL avoiding method to -nographics option, used when
none of SDL or VNC is enabled.
Signed-off-by: Marek Marczykowski-Górecki <marmarek@invisiblethingslab.com>
Reviewed-by: Jason Andryuk <jandryuk@gmail.com>
Acked-by: Ian Jackson <ian.jackson@eu.citrix.com>
Acked-by: Wei Liu <wei.liu2@citrix.com>
Signed-off-by: Jason Andryuk <jandryuk@gmail.com>
Marek Marczykowski-Górecki [Tue, 19 May 2020 01:54:46 +0000 (21:54 -0400)]
Document ioemu Linux stubdomain protocol
Add documentation for upcoming Linux stubdomain for qemu-upstream.
Signed-off-by: Marek Marczykowski-Górecki <marmarek@invisiblethingslab.com>
Signed-off-by: Jason Andryuk <jandryuk@gmail.com>
Acked-by: Ian Jackson <ian.jackson@eu.citrix.com>
Marek Marczykowski-Górecki [Tue, 19 May 2020 01:54:45 +0000 (21:54 -0400)]
Document ioemu MiniOS stubdomain protocol
Add documentation based on reverse-engineered toolstack-ioemu stubdomain
protocol.
Signed-off-by: Marek Marczykowski-Górecki <marmarek@invisiblethingslab.com>
Signed-off-by: Jason Andryuk <jandryuk@gmail.com>
Acked-by: Ian Jackson <ian.jackson@eu.citrix.com>
Olaf Hering [Mon, 18 May 2020 14:44:00 +0000 (16:44 +0200)]
tools: use HOSTCC/CPP to compile rombios code and helper
Use also HOSTCFLAGS for biossums while touching the code.
Spotted by inspecting build logfile.
Signed-off-by: Olaf Hering <olaf@aepfle.de>
Acked-by: Andrew Cooper <andrew.cooper3@citrix.com>