Jan Beulich [Tue, 15 Aug 2017 13:07:25 +0000 (15:07 +0200)]
gnttab: split maptrack lock to make it fulfill its purpose again
The way the lock is currently being used in get_maptrack_handle(), it
protects only the maptrack limit: The function acts on current's list
only, so races on list accesses are impossible even without the lock.
Otoh list access races are possible between __get_maptrack_handle() and
put_maptrack_handle(), due to the invocation of the former for other
than current from steal_maptrack_handle(). Introduce a per-vCPU lock
for list accesses to become race free again. This lock will be
uncontended except when it becomes necessary to take the steal path,
i.e. in the common case there should be no meaningful performance
impact.
When get_maptrack_handle() adds a stolen entry to a fresh, empty
freelist, there is probably no concurrency. However, this is not a
fast path, and adding the locking there makes the code clearly
correct.
Also, while we are here: the stolen maptrack_entry's tail pointer was
not properly set. Set it.
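A toy model of the locking rule described above, not Xen code: each vCPU owns a free list plus its own lock, and the steal path takes the victim's lock before touching the victim's list. The class, the function names and the field name maptrack_freelist_lock are assumptions made for this sketch.

```python
import threading


class VCpu:
    """Toy stand-in for one vCPU's maptrack free list (hypothetical model)."""

    def __init__(self, handles=()):
        # Assumed name for the new per-vCPU lock.
        self.maptrack_freelist_lock = threading.Lock()
        self.freelist = list(handles)


def get_handle(vcpu):
    # Uncontended in the common case: only a stealing vCPU competes for it.
    with vcpu.maptrack_freelist_lock:
        return vcpu.freelist.pop(0) if vcpu.freelist else None


def put_handle(vcpu, handle):
    with vcpu.maptrack_freelist_lock:
        vcpu.freelist.append(handle)


def steal_handle(thief, vcpus):
    # The steal path operates on another vCPU's list, so every access
    # goes through that vCPU's lock (taken inside get_handle()).
    for victim in vcpus:
        if victim is thief:
            continue
        handle = get_handle(victim)
        if handle is not None:
            return handle
    return None
```

The lock is taken per list rather than per domain, which is why the common case (a vCPU working on its own list) stays uncontended.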
This is CVE-2017-12136 / XSA-228.
Reported-by: Ian Jackson <ian.jackson@eu.citrix.com>
Signed-off-by: Jan Beulich <jbeulich@suse.com>
Signed-off-by: Ian Jackson <Ian.Jackson@eu.citrix.com>
Andrew Cooper [Tue, 15 Aug 2017 13:06:45 +0000 (15:06 +0200)]
x86/grant: disallow misaligned PTEs
Pagetable entries must be aligned to function correctly. Disallow attempts
from the guest to have a grant PTE created at a misaligned address, which
would result in corruption of the L1 table with largely-guest-controlled
values.
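A minimal sketch of the kind of alignment check described above, assuming 8-byte page table entries as on 64-bit x86. The function names and the error value are hypothetical; the real check lives in the hypervisor's C code.

```python
PTE_SIZE = 8            # bytes per entry; assumes 64-bit x86 page tables
ERR_MISALIGNED = -1     # hypothetical error value for this sketch


def is_aligned_pte_addr(addr: int) -> bool:
    """True if addr is a multiple of the PTE size."""
    return addr & (PTE_SIZE - 1) == 0


def create_grant_pte(addr: int) -> int:
    # Reject misaligned addresses before anything is written, so a write
    # can never straddle two neighbouring entries of the L1 table.
    if not is_aligned_pte_addr(addr):
        return ERR_MISALIGNED
    return 0
```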
This is CVE-2017-12137 / XSA-227.
Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
Boris Ostrovsky [Mon, 14 Aug 2017 15:18:49 +0000 (17:18 +0200)]
mm: clean up free_heap_pages()
Make buddy merging part of free_heap_pages() a bit more readable.
Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Reviewed-by: Wei Liu <wei.liu2@citrix.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
Julien Grall [Mon, 14 Aug 2017 15:17:44 +0000 (17:17 +0200)]
grant_table: include mm.h in xen/grant_table.h
While re-ordering the includes alphabetically in arch/arm/domain.c, I got
a compilation error because grant_table.h uses gfn_t before it has been
defined:
In file included from domain.c:14:0:
xen/xen/include/xen/grant_table.h:153:29: error: unknown type name ‘gfn_t’
gfn_t *gfn, uint16_t *status);
^
Fix it by including xen/mm.h in it.
Signed-off-by: Julien Grall <julien.grall@arm.com>
Acked-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Wei Liu <wei.liu2@citrix.com>
Andrew Cooper [Mon, 14 Aug 2017 10:52:20 +0000 (11:52 +0100)]
x86/page: Introduce and use PAGE_HYPERVISOR_UC
Always map the PCI MMCFG region as strongly uncacheable. Nothing good will
happen if stray MTRR settings end up converting UC- to WC.
Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
Andrew Cooper [Mon, 14 Aug 2017 10:42:24 +0000 (11:42 +0100)]
x86/page: Rename PAGE_HYPERVISOR_NOCACHE to PAGE_HYPERVISOR_UCMINUS
To better describe its actual function.
No functional change.
Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
Andrew Cooper [Fri, 11 Aug 2017 13:35:48 +0000 (14:35 +0100)]
x86/config: Fix stale documentation concerning virtual layout
The hypercall argument translation area lives in the per-domain mappings in
PML4 slot 260. Nothing currently resides in the lower canonical half above
the 4GB boundary in a 32bit PV guest.
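As a worked example of where PML4 slot 260 sits in the address space, the sketch below computes the canonical base address of a slot, assuming 4-level paging with 48-bit virtual addresses. The helper name is made up for illustration.

```python
PML4_SHIFT = 39  # each PML4 slot covers 2**39 bytes (512 GiB)


def pml4_slot_base(slot: int) -> int:
    """Canonical 64-bit virtual address at which a PML4 slot begins."""
    addr = slot << PML4_SHIFT
    if addr & (1 << 47):        # upper canonical half: sign-extend bit 47
        addr |= 0xFFFF << 48
    return addr
```

With these assumptions, pml4_slot_base(260) evaluates to 0xffff820000000000, which is in the upper canonical half, while slots 0 to 255 stay in the lower half.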
Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Acked-by: Jan Beulich <jbeulich@suse.com>
Wei Liu [Thu, 10 Aug 2017 17:22:53 +0000 (18:22 +0100)]
xen: remove struct domain and vcpu declarations from types.h
They don't belong there. Removing them causes build errors in several
places. Add the forward declarations in those places.
Signed-off-by: Wei Liu <wei.liu2@citrix.com>
Acked-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Julien Grall <julien.grall@arm.com>
Andrew Cooper [Fri, 23 Jun 2017 10:56:37 +0000 (10:56 +0000)]
xen/flask: Switch to using bool
Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Reviewed-by: Wei Liu <wei.liu2@citrix.com>
Acked-by: Daniel De Graaf <dgdegra@tycho.nsa.gov>
Andrew Cooper [Thu, 10 Aug 2017 13:13:00 +0000 (14:13 +0100)]
xsm/flask: Fix build following "xsm: correct AVC lookups for two sysctls"
avc_current_has_perm() takes 4 arguments, not 3. Spotted by a Travis
randconfig run which actually turned XSM on.
https://travis-ci.org/xen-project/xen/jobs/263063220
Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Acked-by: Daniel De Graaf <dgdegra@tycho.nsa.gov>
Andrew Cooper [Wed, 26 Jul 2017 09:18:02 +0000 (10:18 +0100)]
common/domain_page: Drop domain_mmap_cache infrastructure
This infrastructure is used exclusively by the x86 do_mmu_update() hypercall.
Mapping and unmapping domain pages is probably not the slow part of that
function, but even with an opencoded caching implementation, Bloat-o-meter
reports:
function old new delta
do_mmu_update 6815 6573 -242
The !CONFIG_DOMAIN_PAGE stub code has a mismatch between mapping and
unmapping, which is a latent bug.
Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
Wei Liu [Wed, 9 Aug 2017 12:35:19 +0000 (13:35 +0100)]
x86/psr: remove useless check in free_socket_resources
The check is useless because pointer arithmetic ensures "info" is
always non-zero.
Replace it with an ASSERT for socket_info. The only caller of
free_socket_resources already ensures socket_info is not NULL before
calling it.
Coverity-ID: 1416344
Signed-off-by: Wei Liu <wei.liu2@citrix.com>
Acked-by: Jan Beulich <jbeulich@suse.com>
Jan Beulich [Thu, 10 Aug 2017 10:37:24 +0000 (12:37 +0200)]
x86/HVM: fix boundary check in hvmemul_insn_fetch() (again)
Commit 5a992b670b ("x86/hvm: Fix boundary check in
hvmemul_insn_fetch()") went a little too far in its correction to
commit 0943a03037 ("x86/hvm: Fixes to hvmemul_insn_fetch()"): Keep the
start offset check, but restore the original end offset one.
Signed-off-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Paul Durrant <paul.durrant@citrix.com>
Jan Beulich [Thu, 10 Aug 2017 10:36:58 +0000 (12:36 +0200)]
x86/mm: make various hotplug related functions static
Signed-off-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Andrew Cooper <andrew.cooper3@citrix.com>
Jan Beulich [Thu, 10 Aug 2017 10:36:24 +0000 (12:36 +0200)]
IOMMU/PCI: properly annotate setup_one_hwdom_device()
Its sole caller is __hwdom_init, so it can be such itself, too.
Signed-off-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Andrew Cooper <andrew.cooper3@citrix.com>
Reviewed-by: Roger Pau Monné <roger.pau@citrix.com>
Christopher Clark [Thu, 10 Aug 2017 10:35:50 +0000 (12:35 +0200)]
cpufreq: only stop ondemand governor if already started
On CPUFREQ_GOV_STOP in cpufreq_governor_dbs, shortcut to
return success if the governor is already stopped.
Avoid executing dbs_timer_exit, to prevent tripping an assertion
within a call to kill_timer on a timer that has not been prepared
with init_timer, if the CPUFREQ_GOV_START case has not
run beforehand.
kill_timer validates timer state:
* itself, via BUG_ON(this_cpu(timers).running == timer);
* within active_timer, ASSERTing timer->status is within bounds;
* within list_del, which ASSERTs timer inactive list membership.
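The guard can be pictured with a small state model, given here as a sketch rather than the actual governor code. The class and attribute names are hypothetical; the assert stands in for the checks kill_timer performs.

```python
class DbsGovernor:
    """Toy model of per-CPU governor state (names are hypothetical)."""

    def __init__(self):
        self.started = False
        self.timer_initialised = False

    def start(self):
        self.timer_initialised = True   # stands in for init_timer()
        self.started = True
        return 0

    def _timer_exit(self):
        # Stands in for kill_timer(), which trips on a timer that was
        # never prepared with init_timer().
        assert self.timer_initialised, "kill_timer on uninitialised timer"
        self.timer_initialised = False

    def stop(self):
        if not self.started:
            return 0                    # already stopped: report success
        self._timer_exit()
        self.started = False
        return 0
```

Calling stop() before start(), or twice in a row, returns success without ever reaching the timer teardown.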
Patch is synonymous to an OpenXT patch produced at Citrix prior to
June 2014.
Signed-off-by: Christopher Clark <christopher.clark6@baesystems.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
Daniel De Graaf [Thu, 10 Aug 2017 10:35:28 +0000 (12:35 +0200)]
xsm: correct AVC lookups for two sysctls
The current code was incorrectly using SECCLASS_XEN instead of
SECCLASS_XEN2, resulting in the wrong permission being checked.
GET_CPU_LEVELLING_CAPS was checking MTRR_DEL
GET_CPU_FEATURESET was checking MTRR_READ
The default XSM policy only allowed these permissions to dom0, so this
didn't result in a security issue there.
Signed-off-by: Daniel De Graaf <dgdegra@tycho.nsa.gov>
Acked-by: Andrew Cooper <andrew.cooper3@citrix.com>
Christopher Clark [Thu, 10 Aug 2017 10:34:58 +0000 (12:34 +0200)]
x86/tboot: disable interrupts after map_pages_to_xen() in tboot_shutdown()
Move the point where interrupts are disabled in tboot_shutdown
to slightly later, to after the call to map_pages_to_xen.
This patch originated in OpenXT with the following report:
"Disabling interrupts early causes debug assertions.
This is only seen with debug builds but since it causes assertions it is
probably a bigger problem. It clearly says in map_pages_to_xen that it
should not be called with interrupts disabled. Moved disabling to just
after that call."
The Xen code comment ahead of map_pages_to_xen notes that the CPU cache
flushing in map_pages_to_xen differs depending on whether interrupts are
enabled or not. The flush logic with interrupts enabled is more
conservative, flushing all CPUs' TLBs/caches, rather than just local.
This is just before the tboot memory integrity MAC calculation is performed
in the case of entering S3.
Original patch author credit: Ross Philipson.
Signed-off-by: Christopher Clark <christopher.clark6@baesystems.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
Jan Beulich [Thu, 10 Aug 2017 10:34:21 +0000 (12:34 +0200)]
AMD IOMMU: drop amd_iommu_setup_hwdom_device()
By moving its bridge special casing to amd_iommu_add_device(), we can
pass the latter to setup_hwdom_pci_devices() and at once consistently
handle bridges discovered at boot time as well as such reported by Dom0
later on.
Signed-off-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Roger Pau Monné <roger.pau@citrix.com>
Boqun Feng (Intel) [Thu, 10 Aug 2017 10:33:27 +0000 (12:33 +0200)]
x86/cpufeatures: expose UMIP to HVM guests
User-Mode Instruction Prevention (UMIP) is a security feature present in
new Intel processors. With this feature, when the UMIP bit in CR4 is set,
the following instructions cannot be executed if CPL > 0: SGDT, SIDT,
SLDT, SMSW, and STR. An attempt at such execution causes a
general-protection exception (#GP).
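The rule can be stated as a small predicate. This is an illustrative sketch only; it assumes CR4.UMIP is bit 11 as documented in the Intel SDM, and the function name is made up.

```python
CR4_UMIP = 1 << 11  # CR4.UMIP bit position (per the Intel SDM)
UMIP_INSNS = {"SGDT", "SIDT", "SLDT", "SMSW", "STR"}


def raises_gp(insn: str, cr4: int, cpl: int) -> bool:
    """True if executing insn would fault with #GP because of UMIP."""
    return insn.upper() in UMIP_INSNS and bool(cr4 & CR4_UMIP) and cpl > 0
```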
This patch simply adds necessary definitions to expose this feature to
hvm guests.
Signed-off-by: Boqun Feng (Intel) <boqun.feng@gmail.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
Chao Gao [Thu, 10 Aug 2017 10:32:16 +0000 (12:32 +0200)]
VT-d PI: disable VT-d PI when CPU-side PI isn't enabled
From the context calling pi_desc_init(), we can conclude the current
implementation of VT-d PI depends on CPU-side PI. If we enable VT-d PI
and disable CPU-side PI by disabling APICv explicitly in xen boot
command line, we would get an assertion failure.
This patch clears iommu_intpost once it finds that CPU-side PI won't be
enabled. This is safe because it is done before the flag starts taking
effect. Also take this chance to remove the useless check of "acknowledge
interrupt on exit", which is a minimal requirement that has already been
checked earlier.
Signed-off-by: Chao Gao <chao.gao@intel.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Kevin Tian <kevin.tian@intel.com>
Wei Liu [Wed, 9 Aug 2017 13:04:33 +0000 (14:04 +0100)]
Config.mk: update mini-os changeset
Pull in the change to fix stubdom build with gcc 7.
Signed-off-by: Wei Liu <wei.liu2@citrix.com>
Olaf Hering [Fri, 23 Jun 2017 17:35:04 +0000 (19:35 +0200)]
vtpmmgr: make inline functions static
gcc7 is stricter about functions marked as inline. They are not
automatically inlined. Instead a function call is generated, but the
actual code is not visible to the linker.
Make a mechanical change and mark every 'inline' as 'static inline'. For
simpler review the static goes on an extra line.
Signed-off-by: Olaf Hering <olaf@aepfle.de>
Tested-by: Wei Liu <wei.liu2@citrix.com>
Acked-by: Wei Liu <wei.liu2@citrix.com>
Roger Pau Monné [Wed, 9 Aug 2017 10:18:20 +0000 (11:18 +0100)]
x86/hvm: fix arch_set_info_hvm_guest SEG macro
Commit 6c9abf0e802 modified the SEG macro in arch_set_info_hvm_guest and
inverted the limit and base fields. Restore the correct order.
Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
Andrew Cooper [Tue, 8 Aug 2017 10:47:07 +0000 (11:47 +0100)]
common/page_alloc: Drop BOOT_BUG_ON()
Regular BUG_ON()'s work fine by this point on all architectures, so drop the
custom infrastructure. Substitute BUG_ON(1) for BUG().
Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Acked-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Julien Grall <julien.grall@arm.com>
Yi Sun [Mon, 7 Aug 2017 01:50:49 +0000 (09:50 +0800)]
x86: adjust place of an ASSERT to avoid crash when destroy a domain.
In 'psr_free_cos', we should not use 'ASSERT(socket_info)' at the beginning
because 'socket_info' is allocated only if the 'psr' boot parameter is set.
So move the ASSERT to avoid a crash.
Signed-off-by: Yi Sun <yi.y.sun@linux.intel.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Andrew Cooper <andrew.cooper3@citrix.com>
Yi Sun [Mon, 7 Aug 2017 10:44:15 +0000 (11:44 +0100)]
docs: remove a special character to avoid html creation error.
The '®' (a special character) may cause html document creation
failure. So remove it from the feature document.
Signed-off-by: Yi Sun <yi.y.sun@linux.intel.com>
Tested-by: Wei Liu <wei.liu2@citrix.com>
Acked-by: Wei Liu <wei.liu2@citrix.com>
xiliang [Tue, 1 Aug 2017 15:57:50 +0000 (23:57 +0800)]
xl: add --clear option to dmesg command
The xl manual says the --clear option is supported, and that option
worked for xm. Add it to xl now.
Signed-off-by: Xiao Liang <xiliang@redhat.com>
Acked-by: Wei Liu <wei.liu2@citrix.com>
Wei Liu [Mon, 31 Jul 2017 11:22:48 +0000 (12:22 +0100)]
docs: hook up process/ to build system
Signed-off-by: Wei Liu <wei.liu2@citrix.com>
Acked-by: Ian Jackson <ian.jackson@eu.citrix.com>
Wei Liu [Mon, 31 Jul 2017 11:22:47 +0000 (12:22 +0100)]
docs: add xen-release-management.pandoc
A document for the release manager.
Signed-off-by: Wei Liu <wei.liu2@citrix.com>
Acked-by: Ian Jackson <ian.jackson@eu.citrix.com>
Reviewed-by: Juergen Gross <jgross@suse.com>
Wei Liu [Mon, 31 Jul 2017 11:22:46 +0000 (12:22 +0100)]
docs: consolidate release related documents
Move the existing docs from misc to docs/process.
Signed-off-by: Wei Liu <wei.liu2@citrix.com>
Acked-by: Ian Jackson <ian.jackson@eu.citrix.com>
Marek Marczykowski-Górecki [Fri, 28 Jul 2017 16:42:14 +0000 (18:42 +0200)]
libxl: do not start dom0 qemu for stubdomain when not needed
Do not set up vfb+vkb when no access method was configured. Then check if
qemu is really needed.
After this change, the only non-configurable thing that forces qemu to run
in dom0 is the consoles used for save/restore. But even in that case, a
much smaller part of qemu is exposed.
Signed-off-by: Marek Marczykowski-Górecki <marmarek@invisiblethingslab.com>
Acked-by: Wei Liu <wei.liu2@citrix.com>
Wei Liu [Tue, 1 Aug 2017 11:25:39 +0000 (12:25 +0100)]
libxc: check pointer is not null before printing
Signed-off-by: Wei Liu <wei.liu2@citrix.com>
Acked-by: Ian Jackson <ian.jackson@eu.citrix.com>
Rusty Bird [Thu, 3 Aug 2017 10:40:25 +0000 (12:40 +0200)]
VT-d: don't panic/warn on iommu=no-igfx
When operating on an Intel graphics device, iommu_enable_translation()
panicked (force_iommu==1) or warned (force_iommu==0) about the BIOS if
is_igd_vt_enabled_quirk() returned 0. That's good if the actual BIOS
problem has been detected. But since commit 1463411, returning 0 could
also happen if the user simply passed "iommu=no-igfx", in which case
bailing out with an info message (instead of a panic/warning) would be
more appropriate.
The panic broke the combination "iommu=force,no-igfx", and also the case
where "iommu=no-igfx" is passed but force_iommu=1 is set automatically
by x2apic_bsp_setup().
Move the iommu_igfx check from is_igd_vt_enabled_quirk() into its only
caller iommu_enable_translation(), and tweak the logic.
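The resulting decision can be sketched as below. This is a simplified model of the logic described in this message, not the actual VT-d code; the function name and the string results are hypothetical.

```python
def igd_translation_action(iommu_igfx: bool, quirk_ok: bool,
                           force_iommu: bool) -> str:
    """Decide what to do for an Intel graphics device (illustrative only)."""
    if not iommu_igfx:
        # The user passed iommu=no-igfx: bail out with an info message,
        # whether or not force_iommu is set.
        return "info"
    if not quirk_ok:
        # A real BIOS problem was detected.
        return "panic" if force_iommu else "warn"
    return "enable"
```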
Signed-off-by: Rusty Bird <rustybird@openmailbox.org>
Acked-by: Kevin Tian <kevin.tian@intel.com>
Yi Sun [Tue, 1 Aug 2017 09:05:00 +0000 (11:05 +0200)]
docs: add L2 CAT description in docs.
This patch adds L2 CAT description in related documents.
Signed-off-by: He Chen <he.chen@linux.intel.com>
Signed-off-by: Yi Sun <yi.y.sun@linux.intel.com>
Acked-by: Wei Liu <wei.liu2@citrix.com>
Yi Sun [Tue, 1 Aug 2017 09:05:00 +0000 (11:05 +0200)]
tools: L2 CAT: support set cbm for L2 CAT.
This patch implements the xl/xc changes to support setting the CBM
for L2 CAT.
A new level option is added to the original CAT setting
command in order to set the CBM for the specified cache level.
- 'xl psr-cat-set' is updated to set cache capacity bitmasks (CBM)
for a domain according to the input cache level.
root@:~$ xl psr-cat-set -l2 1 0x7f
root@:~$ xl psr-cat-show -l2 1
Socket ID : 0
Default CBM : 0xff
ID NAME CBM
1 ubuntu14 0x7f
Signed-off-by: He Chen <he.chen@linux.intel.com>
Signed-off-by: Yi Sun <yi.y.sun@linux.intel.com>
Acked-by: Wei Liu <wei.liu2@citrix.com>
Yi Sun [Tue, 1 Aug 2017 09:05:00 +0000 (11:05 +0200)]
tools: L2 CAT: support show cbm for L2 CAT.
This patch implements the xl/xc changes to support
showing the CBM of L2 CAT.
A new level option is added to the original CAT showing
command in order to show the CBM for the specified cache level.
- 'xl psr-cat-show' is updated to show the CBM of a domain
according to the input cache level.
Examples:
root@:~$ xl psr-cat-show -l2 1
Socket ID : 0
Default CBM : 0xff
ID NAME CBM
1 ubuntu14 0x7f
Signed-off-by: He Chen <he.chen@linux.intel.com>
Signed-off-by: Yi Sun <yi.y.sun@linux.intel.com>
Acked-by: Wei Liu <wei.liu2@citrix.com>
Yi Sun [Tue, 1 Aug 2017 09:05:00 +0000 (11:05 +0200)]
tools: L2 CAT: support get HW info for L2 CAT.
This patch implements the xl/xc changes to support getting HW info
for L2 CAT.
'xl psr-hwinfo' is updated to show both L3 CAT and L2 CAT
info.
Example(on machine which only supports L2 CAT):
Cache Monitoring Technology (CMT):
Enabled : 0
Cache Allocation Technology (CAT): L2
Socket ID : 0
Maximum COS : 3
CBM length : 8
Default CBM : 0xff
Signed-off-by: He Chen <he.chen@linux.intel.com>
Signed-off-by: Yi Sun <yi.y.sun@linux.intel.com>
Acked-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Wei Liu <wei.liu2@citrix.com>
Yi Sun [Tue, 1 Aug 2017 09:05:00 +0000 (11:05 +0200)]
x86: L2 CAT: implement set value flow.
This patch implements L2 CAT set value related callback function
and domctl interface.
Signed-off-by: Yi Sun <yi.y.sun@linux.intel.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
Yi Sun [Tue, 1 Aug 2017 09:05:00 +0000 (11:05 +0200)]
x86: L2 CAT: implement get value flow.
This patch implements L2 CAT get value interface in domctl.
Signed-off-by: Yi Sun <yi.y.sun@linux.intel.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
Yi Sun [Tue, 1 Aug 2017 09:05:00 +0000 (11:05 +0200)]
x86: L2 CAT: implement get hw info flow.
This patch implements L2 CAT get HW info flow and interface in sysctl.
Signed-off-by: Yi Sun <yi.y.sun@linux.intel.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
Yi Sun [Tue, 1 Aug 2017 09:05:00 +0000 (11:05 +0200)]
x86: L2 CAT: implement CPU init flow.
This patch implements the CPU init flow for L2 CAT.
Signed-off-by: Yi Sun <yi.y.sun@linux.intel.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
Yi Sun [Tue, 1 Aug 2017 09:05:00 +0000 (11:05 +0200)]
x86: refactor psr: CDP: implement set value callback function.
This patch implements L3 CDP set value related callback function.
With this patch, 'psr-cat-cbm-set' command can work for L3 CDP.
Signed-off-by: Yi Sun <yi.y.sun@linux.intel.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
Yi Sun [Tue, 1 Aug 2017 09:04:00 +0000 (11:04 +0200)]
x86: refactor psr: CDP: implement get hw info flow.
This patch implements get HW info flow for CDP including L3 CDP callback
function. The flow is almost same as L3 CAT.
With this patch, 'psr-hwinfo' can work for L3 CDP.
Signed-off-by: Yi Sun <yi.y.sun@linux.intel.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
Yi Sun [Thu, 3 Aug 2017 02:25:00 +0000 (04:25 +0200)]
x86: refactor psr: CDP: implement CPU init flow.
This patch implements the CPU init flow for CDP. The flow is almost
same as L3 CAT.
Signed-off-by: Yi Sun <yi.y.sun@linux.intel.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
Yi Sun [Tue, 1 Aug 2017 09:04:00 +0000 (11:04 +0200)]
x86: refactor psr: L3 CAT: set value: implement write msr flow.
Continue from previous patch:
'x86: refactor psr: L3 CAT: set value: implement cos id picking flow.'
We now have the feature value and the COS ID to set. Then, we write the MSRs
of the designated feature.
With this, the set value process is complete.
Signed-off-by: Yi Sun <yi.y.sun@linux.intel.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
Yi Sun [Tue, 1 Aug 2017 09:04:00 +0000 (11:04 +0200)]
x86: refactor psr: L3 CAT: set value: implement cos id picking flow.
Continue from previous patch:
'x86: refactor psr: L3 CAT: set value: implement cos finding flow.'
If we fail to find a COS ID, we need to pick a new COS ID for the domain.
Only a COS ID whose ref[COS_ID] is 1 or 0 can be picked to hold a new set of
feature values.
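A sketch of the picking rule, under two assumptions that this message does not spell out: a COS ID with ref == 1 is only reusable when its single user is the domain being updated, and COS 0 is reserved for the default values. The function name and signature are hypothetical.

```python
def pick_avail_cos(ref, old_cos, cos_max):
    """Return a COS ID that may be overwritten, or None if none is free."""
    # Assumption: ref == 1 qualifies only if the caller is that sole user.
    if old_cos != 0 and ref[old_cos] == 1:
        return old_cos
    # Assumption: COS 0 holds the defaults, so the search starts at 1.
    for cos in range(1, cos_max + 1):
        if ref[cos] == 0:
            return cos
    return None
```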
Signed-off-by: Yi Sun <yi.y.sun@linux.intel.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
Yi Sun [Tue, 1 Aug 2017 09:04:00 +0000 (11:04 +0200)]
x86: refactor psr: L3 CAT: set value: implement cos finding flow.
Continue from patch:
'x86: refactor psr: L3 CAT: set value: assemble features value array'
We try to find a COS ID on which all features' COS register values are the
same as the array assembled before.
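The search can be sketched as a column comparison. The layout cos_reg_val[feature][cos] and the function name are assumptions made for this illustration.

```python
def find_cos(cos_reg_val, val):
    """Return the first COS ID whose registers match val for every feature.

    cos_reg_val[f][cos] is feature f's register value for that COS ID;
    val[f] is the wanted value for feature f.
    """
    n_cos = min(len(regs) for regs in cos_reg_val)
    for cos in range(n_cos):
        if all(regs[cos] == v for regs, v in zip(cos_reg_val, val)):
            return cos
    return None
```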
Signed-off-by: Yi Sun <yi.y.sun@linux.intel.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
Yi Sun [Tue, 1 Aug 2017 09:04:00 +0000 (11:04 +0200)]
x86: refactor psr: L3 CAT: set value: assemble features value array.
Only one COS ID can be used by a domain at any one time. That means all
enabled features' COS registers at this COS ID are valid for this domain at
that time.
When the user updates a feature's value, we need to make sure no other
feature's value is affected. So, we first gather an array which contains the
current values of all features, and replace the value of the feature being
set with the new value.
Then, we can try to find a COS ID on which all features' COS register values
are the same as the array. If we find one, we just use this COS ID. If not,
we need to pick a new COS ID.
This patch implements the value array assembling flow.
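The assembling step amounts to copying one column and overriding one entry. A minimal sketch, with the layout cos_reg_val[feature][cos] and all names assumed for illustration:

```python
def gather_val_array(cos_reg_val, old_cos, feat_idx, new_val):
    """Collect every feature's value at old_cos, then override one of them."""
    val = [regs[old_cos] for regs in cos_reg_val]   # current values
    val[feat_idx] = new_val                         # the feature being set
    return val
```

The register arrays themselves are left untouched; only the returned copy carries the new value.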
Signed-off-by: Yi Sun <yi.y.sun@linux.intel.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
Yi Sun [Tue, 1 Aug 2017 09:04:00 +0000 (11:04 +0200)]
x86: refactor psr: L3 CAT: set value: implement framework.
As the set value flow is the most complicated one in psr, it is
divided into several patches to make things clearer. This patch
implements the set value framework to show the whole picture first.
It also changes the domctl interface to make it more general.
To make the set value flow general and able to support multiple features
at the same time, it includes the steps below:
1. Test and set dom_ids bit corresponding to the domain. If the old bit is 0
which means the domain's COS ID is invalid, restore COS ID to 0. If the
COS ID is valid, get the COS ID that current domain is using.
2. Gather a value array to store all features current value
into it and replace the current value of the feature which is
being set to the new input value.
3. Find if there is already a COS ID on which all features'
values are same as the array. Then, we can reuse this COS
ID.
4. If none is found, we need to pick an available COS ID. Only a COS ID
whose ref is 0 or 1 can be picked.
5. Write the feature's MSRs according to the COS ID.
6. Update ref according to COS ID.
7. Save the COS ID into current domain's psr_cos_ids[socket] so that we
can know which COS the domain is using on the socket.
So, some functions are abstracted, and the callback functions will be
implemented in the next patches.
Here is an example to understand the process. The CPU supports
two features, e.g. L3 CAT and L2 CAT. The user wants to set L3 CAT
of Dom1 to 0x1ff.
1. At the initial time, the old_cos of Dom1 is 0. The COS register values
are as below at this time.
           -------------------------------
           | COS 0 | COS 1 | COS 2 | ... |
           -------------------------------
   L3 CAT  | 0x7ff | 0x7ff | 0x7ff | ... |
           -------------------------------
   L2 CAT  | 0xff  | 0xff  | 0xff  | ... |
           -------------------------------
2. Gather the value array and insert new value into it:
val[0]: 0x1ff
val[1]: 0xff
3. It cannot find a matching COS.
4. Pick COS 1 to store the value set.
5. Write the L3 CAT COS 1 registers. The COS registers values are
changed to below now.
           -------------------------------
           | COS 0 | COS 1 | COS 2 | ... |
           -------------------------------
   L3 CAT  | 0x7ff | 0x1ff | ...   | ... |
           -------------------------------
   L2 CAT  | 0xff  | 0xff  | ...   | ... |
           -------------------------------
6. The ref[1] is increased to 1 because Dom1 is using it now.
7. Save 1 to Dom1's psr_cos_ids[socket].
Then, user wants to set L3 CAT of Dom2 to 0x1ff too. The old_cos
of Dom2 is 0 too. Repeat above flow.
The val array assembled is:
val[0]: 0x1ff
val[1]: 0xff
So, it can find a matching COS, COS 1. Then, it can reuse COS 1
for Dom2.
The ref[1] is increased to 2 now because both Dom1 and Dom2 are
using this COS ID. Set 1 to Dom2's psr_cos_ids[socket].
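The walk-through above can be replayed with a compact model of the seven steps. This is a sketch, not the hypervisor implementation: the class, its attributes, the handling of COS 0 as an untracked default, and the rule that ref == 1 only allows reuse by the sole user are all assumptions.

```python
class Socket:
    """Toy per-socket state for the set value flow (hypothetical model)."""

    def __init__(self, defaults, n_cos):
        # regs[f][cos]: one row per feature, filled with its default CBM.
        self.regs = [[d] * n_cos for d in defaults]
        self.ref = [0] * n_cos      # domains using each COS ID (COS 0 untracked)
        self.cos_of = {}            # domain -> COS ID, like psr_cos_ids[socket]

    def set_val(self, dom, feat, new_val):
        n_cos = len(self.ref)
        old = self.cos_of.get(dom, 0)                        # step 1
        val = [row[old] for row in self.regs]                # step 2
        val[feat] = new_val
        cos = next((c for c in range(n_cos)                  # step 3
                    if all(r[c] == v for r, v in zip(self.regs, val))), None)
        if cos is None:                                      # step 4
            if old != 0 and self.ref[old] == 1:
                cos = old
            else:
                cos = next((c for c in range(1, n_cos)
                            if self.ref[c] == 0), None)
            if cos is None:
                raise RuntimeError("no free COS ID")
            for row, v in zip(self.regs, val):               # step 5
                row[cos] = v
        if cos != old:                                       # step 6
            if old != 0:
                self.ref[old] -= 1
            if cos != 0:
                self.ref[cos] += 1
        self.cos_of[dom] = cos                               # step 7
        return cos
```

With defaults [0x7ff, 0xff], setting L3 CAT (feature 0) to 0x1ff for Dom1 picks COS 1; the same request for Dom2 finds and reuses COS 1, leaving ref[1] at 2.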
One thing needs emphasizing: a domain's COS ID must be restored to 0 when
a socket goes offline. Otherwise, a wrong COS ID would be used when the
socket comes online again, and the user might see a wrong CBM. But
iterating over all domains to restore their COS IDs to 0 takes a long
time. So, we define 'dom_ids[]' to represent all domains, one bit per
domain. If the bit is 0 when entering 'psr_ctxt_switch_to', either this
is the first time the domain is switched to this socket or the domain's
COS ID has not been set since the socket came online. So, the COS ID
written to the ASSOC register on this socket should be the default value,
0. If the bit is 1, the domain's COS ID was set while the socket was
online, so it is valid and can be used directly. We restore the domain's
COS ID to 0 when 'psr_get_val' or 'psr_set_val' is called and the bit
corresponding to the domain is 0 but the domain's COS ID is not 0. This
avoids the CPU serialization that would occur if the restore were
executed in 'psr_ctxt_switch_to'.
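The lazy fix-up on the get/set path can be sketched as follows. A Python set stands in for the dom_ids[] bitmap, and the class and function names are hypothetical.

```python
class SocketInfo:
    """Toy per-socket state; a set stands in for the dom_ids[] bitmap."""

    def __init__(self):
        self.dom_ids = set()

    def offline(self):
        # One cheap clear instead of walking every domain.
        self.dom_ids.clear()


def current_cos(sock, dom_id, psr_cos_ids, socket):
    """COS ID to use for dom_id on this socket, fixing up stale state."""
    if dom_id not in sock.dom_ids:      # test-and-set of the domain's bit
        sock.dom_ids.add(dom_id)
        psr_cos_ids[socket] = 0         # value predates the last offline
    return psr_cos_ids[socket]
```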
Signed-off-by: Yi Sun <yi.y.sun@linux.intel.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
Yi Sun [Tue, 1 Aug 2017 09:04:00 +0000 (11:04 +0200)]
x86: refactor psr: L3 CAT: implement get value flow.
There is an interface in user space to show feature value of
domains.
This patch implements get value flow in hypervisor.
It also changes domctl interface to make it more general.
With this patch, 'psr-cat-show' can work for L3 CAT but not for
L3 code/data which is implemented in CDP related patches.
Signed-off-by: Yi Sun <yi.y.sun@linux.intel.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
Yi Sun [Tue, 1 Aug 2017 09:04:00 +0000 (11:04 +0200)]
x86: refactor psr: L3 CAT: implement get hw info flow.
This patch implements get HW info flow including L3 CAT callback
function.
It also changes sysctl interface to make it more general.
With this patch, 'psr-hwinfo' can work for L3 CAT.
Signed-off-by: Yi Sun <yi.y.sun@linux.intel.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
Yi Sun [Tue, 1 Aug 2017 09:04:00 +0000 (11:04 +0200)]
x86: refactor psr: L3 CAT: implement Domain init/free and schedule flows.
This patch implements the Domain init/free and schedule flows.
- When a domain is initialized, its psr resource should be allocated.
- When a domain is freed, its psr resource should be freed too.
- When a domain is scheduled, its COS ID on the socket should be
set into the ASSOC register to make the corresponding COS MSR value
take effect.
Signed-off-by: Yi Sun <yi.y.sun@linux.intel.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
Yi Sun [Tue, 1 Aug 2017 09:04:00 +0000 (11:04 +0200)]
x86: refactor psr: L3 CAT: implement main data structures, CPU init and free flows.
To construct an extensible framework, we need to analyze the PSR features
and separate what is common from what is feature specific, then
encapsulate them into different data structures.
Analyzing the PSR features gives the map below.
                +------+------+------+
      --------->| Dom0 | Dom1 | ...  |
      |         +------+------+------+
      |            |
      |Dom ID      | cos_id of domain
      |            V
      |        +-----------------------------------------------------------------------------+
User --------->| PSR                                                                         |
     Socket ID |  +--------------+---------------+---------------+                           |
               |  | Socket0 Info | Socket 1 Info |      ...      |                           |
               |  +--------------+---------------+---------------+                           |
               |    |                  cos_id=0                cos_id=1             ...      |
               |    |          +-----------------------+-----------------------+-----------+ |
               |    |->Ref   : |         ref 0         |         ref 1         |    ...    | |
               |    |          +-----------------------+-----------------------+-----------+ |
               |    |          +-----------------------+-----------------------+-----------+ |
               |    |->L3 CAT: |         cos 0         |         cos 1         |    ...    | |
               |    |          +-----------------------+-----------------------+-----------+ |
               |    |          +-----------------------+-----------------------+-----------+ |
               |    |->L2 CAT: |         cos 0         |         cos 1         |    ...    | |
               |    |          +-----------------------+-----------------------+-----------+ |
               |    |          +-----------+-----------+-----------+-----------+-----------+ |
               |    |->CDP   : | cos0 code | cos0 data | cos1 code | cos1 data |    ...    | |
               |               +-----------+-----------+-----------+-----------+-----------+ |
               +-----------------------------------------------------------------------------+
So, we need to define a socket info data structure, 'struct
psr_socket_info', to manage per-socket information. It contains a
reference count array indexed by COS ID and a feature array to
manage all enabled features. Each entry of the reference count
array records how many domains are using the COS registers
for that COS ID. For example, if L3 CAT and L2 CAT are enabled,
Dom1 uses the COS_ID=1 registers of both features to save CBM values, as
below.
        +-------+-------+-------+-----+
        | COS 0 | COS 1 | COS 2 | ... |
        +-------+-------+-------+-----+
L3 CAT  | 0x7ff | 0x1ff | ...   | ... |
        +-------+-------+-------+-----+
L2 CAT  | 0xff  | 0xff  | ...   | ... |
        +-------+-------+-------+-----+
If Dom2 has the same CBM values, it can reuse the registers with COS_ID=1.
That means both Dom1 and Dom2 use the same COS registers (ID=1) to keep the
same L3/L2 values. So, the value of ref[1] is 2, which means 2 domains are
using COS_ID 1.
To manage a feature, we need to define a feature node data structure,
'struct feat_node', to manage the feature's specific HW info and an array of
all COS register values of this feature.
To manage feature properties, we need to define a feature property data
structure, 'struct feat_props', to manage common properties: the callback
functions (all feature specific behaviors are encapsulated in these
callbacks) and generic values (e.g. the cos_max), i.e. the feature
independent values.
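The three structures can be pictured roughly as follows. This is a schematic model in Python, not the C definitions; every field name apart from cos_max and cos_reg_val is a guess made for illustration, and only one example callback is shown.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class FeatProps:
    """Properties common to a feature type: callbacks plus generic values."""
    cos_max: int                               # generic value named above
    get_feat_info: Callable[["FeatNode"], dict]  # example callback (assumed)


@dataclass
class FeatNode:
    """One enabled feature on one socket."""
    cbm_len: int                               # assumed example of HW info
    cos_reg_val: List[int] = field(default_factory=list)


@dataclass
class PsrSocketInfo:
    """Per-socket state: enabled features and a ref count per COS ID."""
    features: List[FeatNode] = field(default_factory=list)
    cos_ref: List[int] = field(default_factory=list)
```

For the table above, a socket would hold two FeatNode entries (L3 CAT and L2 CAT) and cos_ref[1] == 2 once both Dom1 and Dom2 use COS 1.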
CDP is a special feature which uses two entries of the array
for one COS ID. So, the number of CDP COS registers is half that of L3
CAT. E.g. if L3 CAT has 16 COS registers, then CDP has 8 COS registers if
it is enabled. CDP uses the COS registers array as below.
                         +-----------+-----------+-----------+-----------+-----------+
CDP cos_reg_val[] index: |     0     |     1     |     2     |     3     |    ...    |
                         +-----------+-----------+-----------+-----------+-----------+
                  value: | cos0 code | cos0 data | cos1 code | cos1 data |    ...    |
                         +-----------+-----------+-----------+-----------+-----------+
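The index mapping in the table can be written down directly. The helper names below are made up for this sketch.

```python
def cdp_index(cos: int, is_data: bool) -> int:
    """Index into cos_reg_val[] for a COS ID's code or data entry."""
    return 2 * cos + (1 if is_data else 0)


def cdp_cos_count(l3_cat_cos_regs: int) -> int:
    """COS IDs available to CDP when each one consumes two registers."""
    return l3_cat_cos_regs // 2
```

For example, the data entry of COS 1 is at index 3, and 16 L3 CAT COS registers leave 8 for CDP.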
For more details, please refer to the SDM and the patches implementing 'get
value' and 'set value'.
This patch also implements the CPU init and free flows, including L3 CAT
initialization and freeing of some resources. It includes the flows below:
1. presmp init:
   - parse command line parameter.
   - allocate socket info for every socket.
   - allocate feature resource.
   - initialize socket info, get feature info and add feature into feature
     array per cpuid result.
   - free resources allocated if error happens.
   - register cpu notifier to handle cpu events.
2. cpu notifier:
   - handle cpu online events, if initialization work has been done before,
     do nothing.
   - handle cpu offline events, if it is the last cpu offline, free some
     socket resources.
Signed-off-by: Yi Sun <yi.y.sun@linux.intel.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
Yi Sun [Tue, 1 Aug 2017 09:04:00 +0000 (11:04 +0200)]
x86: refactor psr: remove L3 CAT/CDP codes.
The current cache allocation code in psr.c does not anticipate the
addition of future features and is not easy to extend.
To make psr.c more flexible for adding new features and to follow
the open/closed principle (open for extension but closed for
modification), we have to refactor psr.c:
1. Analyze cache allocation features and abstract general data
structures.
2. Analyze the init flow and all other function flows, and abstract
all steps for which different features may have different
implementations. Make these steps callback functions and register
feature specific functions. Then the main processes will not need to
change when a new feature is introduced.
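The callback approach in step 2 can be sketched as below. It is a minimal illustration, assuming a single 'get value' step; `struct feat_ops`, `cat_ops`, `cdp_ops` and `feat_get_val()` are hypothetical names and do not reflect the structures the later patches introduce.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical per-feature callback table. */
struct feat_ops {
    const char *name;
    uint32_t (*get_val)(const uint32_t *regs, unsigned int cos);
};

/* L3 CAT: one register per COS ID. */
static uint32_t cat_get_val(const uint32_t *regs, unsigned int cos)
{
    return regs[cos];
}

/* CDP: two registers per COS ID; this sketch returns the code entry. */
static uint32_t cdp_get_code_val(const uint32_t *regs, unsigned int cos)
{
    return regs[cos * 2];
}

static const struct feat_ops cat_ops = { "L3 CAT", cat_get_val };
static const struct feat_ops cdp_ops = { "L3 CDP", cdp_get_code_val };

/* Generic flow: unchanged no matter which feature is registered. */
static uint32_t feat_get_val(const struct feat_ops *ops,
                             const uint32_t *regs, unsigned int cos)
{
    assert(ops != NULL && ops->get_val != NULL);
    return ops->get_val(regs, cos);
}
```

Adding a new feature then means adding another `struct feat_ops` instance, while `feat_get_val()` stays untouched.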
Because the amount of refactored code is large and the logic changes
a lot, simply changing the old code would confuse reviewers, who
would have to understand both the old code and the new
implementation. After review iterations from V1 to V3, Jan
proposed removing all the old cache allocation code first and then
implementing the new code step by step. This makes the code
easier to review.
There is no construction without destruction. So, this patch
removes all current L3 CAT/CDP code in psr.c. The following
patches will introduce the new mechanism.
Signed-off-by: Yi Sun <yi.y.sun@linux.intel.com>
Acked-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Reviewed-by: Wei Liu <wei.liu2@citrix.com>
Yi Sun [Tue, 1 Aug 2017 09:04:00 +0000 (11:04 +0200)]
x86: move cpuid_count_leaf from cpuid.c to processor.h.
This patch moves 'cpuid_count_leaf' from cpuid.c to processor.h to
make it available to external code.
Signed-off-by: Yi Sun <yi.y.sun@linux.intel.com>
Acked-by: Jan Beulich <jbeulich@suse.com>
Yi Sun [Tue, 1 Aug 2017 09:04:00 +0000 (11:04 +0200)]
docs: create Cache Allocation Technology (CAT) and Code and Data Prioritization (CDP) feature document
This patch creates the CAT and CDP feature document in doc/features/. It describes
the key points of implementing L3 CAT/CDP and L2 CAT, which are described in detail in
Intel SDM "INTEL® RESOURCE DIRECTOR TECHNOLOGY (INTEL® RDT) ALLOCATION FEATURES".
Signed-off-by: Yi Sun <yi.y.sun@linux.intel.com>
Reviewed-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Reviewed-by: Wei Liu <wei.liu2@citrix.com>
Sergej Proskurin [Thu, 3 Aug 2017 10:27:15 +0000 (12:27 +0200)]
move PAGE_*_* macros to xen/page-defs.h
Move pre-existing PAGE_(SHIFT|SIZE|MASK|ALIGN)_(4K|64K) and
introduce corresponding defines for 16K page granularity to/in a
common place in xen/page-defs.h to allow later commits to use the
consolidated defines.
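A sketch of what such per-granularity defines might look like is given below. The macro names follow the PAGE_(SHIFT|SIZE|MASK|ALIGN)_(4K|16K|64K) pattern named in the commit message, but the bodies are assumptions written for illustration and may differ from the definitions in xen/page-defs.h.

```c
#include <assert.h>

/* Assumed definitions; only the names come from the commit message. */
#define PAGE_SHIFT_4K       12
#define PAGE_SIZE_4K        (1UL << PAGE_SHIFT_4K)
#define PAGE_MASK_4K        (~(PAGE_SIZE_4K - 1))
#define PAGE_ALIGN_4K(a)    (((a) + ~PAGE_MASK_4K) & PAGE_MASK_4K)

#define PAGE_SHIFT_16K      14
#define PAGE_SIZE_16K       (1UL << PAGE_SHIFT_16K)
#define PAGE_MASK_16K       (~(PAGE_SIZE_16K - 1))
#define PAGE_ALIGN_16K(a)   (((a) + ~PAGE_MASK_16K) & PAGE_MASK_16K)

#define PAGE_SHIFT_64K      16
#define PAGE_SIZE_64K       (1UL << PAGE_SHIFT_64K)
#define PAGE_MASK_64K       (~(PAGE_SIZE_64K - 1))
#define PAGE_ALIGN_64K(a)   (((a) + ~PAGE_MASK_64K) & PAGE_MASK_64K)
```

Each ALIGN macro rounds an address up to the next boundary of its granularity and leaves already aligned addresses unchanged.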
Signed-off-by: Sergej Proskurin <proskurin@sec.in.tum.de>
Acked-by: Jan Beulich <jbeulich@suse.com>
Praveen Kumar [Thu, 3 Aug 2017 10:24:25 +0000 (12:24 +0200)]
rbtree: changes to align the code with Linux tree
The patch aligns the code of the rbtree related files with the Linux tree.
This will minimize conflicts during any future porting from the Linux tree.
Linux commit till f4b477c47332367d35686bd2b808c2156b96d7c7 for rbtree.h
This includes the addition of commented inline functions in rbtree.h, to have
a complete replica of the Linux tree.
Linux commit till 4c60117811171d867d4f27f17ea07d7419d45dae for rbtree.c
This includes updates to the comments in the header note in rbtree.c.
Signed-off-by: Praveen Kumar <kpraveen.lkml@gmail.com>
Acked-by: Jan Beulich <jbeulich@suse.com>
Jan Beulich [Thu, 3 Aug 2017 10:20:17 +0000 (12:20 +0200)]
IOMMU/PCI: make a few functions static
Add forward declarations in order to not move things around.
Signed-off-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Wei Liu <wei.liu2@citrix.com>
Reviewed-by: Roger Pau Monné <roger.pau@citrix.com>
Andrii Anisov [Thu, 27 Jul 2017 14:50:13 +0000 (17:50 +0300)]
xen:arm: earlyprintk configuration for R-Car Gen3 boards
Introduce an earlyprintk configuration for R-Car Gen3 SoC based development
boards, like:
- Salvator-X [http://elinux.org/R-Car/Boards/Salvator-X]
- M3ULCB [http://elinux.org/R-Car/Boards/M3SK]
- H3ULCB [http://elinux.org/R-Car/Boards/H3SK]
Signed-off-by: Iurii Konovalenko <iurii.konovalenko@globallogic.com>
Signed-off-by: Iurii Mykhalskyi <iurii.mykhalskyi@globallogic.com>
Signed-off-by: Andrii Anisov <andrii_anisov@epam.com>
Acked-by: Julien Grall <julien.grall@arm.com>
Acked-by: Stefano Stabellini <sstabellini@kernel.org>
Iurii Konovalenko [Thu, 27 Jul 2017 14:50:12 +0000 (17:50 +0300)]
xen:arm64: Add SCIF UART support for earlyprintk
Add support for a SCIF compatible UART found in Renesas R-Car Gen3 SoCs.
Signed-off-by: Iurii Konovalenko <iurii.konovalenko@globallogic.com>
Signed-off-by: Iurii Mykhalskyi <iurii.mykhalskyi@globallogic.com>
Signed-off-by: Andrii Anisov <andrii_anisov@epam.com>
Acked-by: Julien Grall <julien.grall@arm.com>
Acked-by: Stefano Stabellini <sstabellini@kernel.org>
Andrii Anisov [Thu, 27 Jul 2017 14:55:10 +0000 (17:55 +0300)]
xen/arm: Fix comments coding style in assembler files
Signed-off-by: Andrii Anisov <andrii_anisov@epam.com>
Reviewed-by: Stefano Stabellini <sstabellini@kernel.org>
Olaf Hering [Wed, 26 Jul 2017 14:39:50 +0000 (16:39 +0200)]
docs: add pod variant of xl-numa-placement
Convert source for xl-numa-placement.7 from markdown to pod.
This removes the buildtime requirement for pandoc, and subsequently the
need for ghc, in the chain for BuildRequires of xen.rpm.
Signed-off-by: Olaf Hering <olaf@aepfle.de>
Reviewed-by: Dario Faggioli <dario.faggioli@citrix.com>
Acked-by: Wei Liu <wei.liu2@citrix.com>
Olaf Hering [Wed, 26 Jul 2017 14:39:49 +0000 (16:39 +0200)]
docs: add pod variant of xl-network-configuration.5
Convert source for xl-network-configuration.5 from markdown to pod.
This removes the buildtime requirement for pandoc, and subsequently the
need for ghc, in the chain for BuildRequires of xen.rpm.
Signed-off-by: Olaf Hering <olaf@aepfle.de>
Acked-by: Wei Liu <wei.liu2@citrix.com>
Olaf Hering [Wed, 26 Jul 2017 14:39:48 +0000 (16:39 +0200)]
docs: add pod variant of xen-pv-channel.7
Convert source for xen-pv-channel.7 from markdown to pod.
This removes the buildtime requirement for pandoc, and subsequently the
need for ghc, in the chain for BuildRequires of xen.rpm.
Signed-off-by: Olaf Hering <olaf@aepfle.de>
Acked-by: Wei Liu <wei.liu2@citrix.com>
Petre Pircalabu [Thu, 27 Jul 2017 17:08:27 +0000 (20:08 +0300)]
Makefile: Fix uninstall target
Running "make uninstall" does not remove all installed files, a
situation which might cause link related issues if xen is re-installed
in a different location.
In order for uninstall to remove the files correctly, it is best done
recursively, mirroring each "install" target with an "uninstall"
target that removes the installed files.
An exception to this rule is uninstalling the files produced by
"qemu-xen-dir-remote" and "qemu-xen-traditional-dir", which are external
to the project. These projects do not implement an "uninstall" target so
the files have to be removed manually.
Signed-off-by: Petre Pircalabu <ppircalabu@bitdefender.com>
Acked-by: Wei Liu <wei.liu2@citrix.com>
Marek Marczykowski-Górecki [Wed, 26 Jul 2017 21:27:14 +0000 (23:27 +0200)]
libvchan: Fix cleanup when xc_gntshr_open failed
If xc_gntshr_open failed, the only cleanup needed is to free the allocated
memory. So instead of calling libxenvchan_close (which assumes that valid
calculated buffers have already been mmapped), free the memory and return.
Signed-off-by: Marek Marczykowski-Górecki <marmarek@invisiblethingslab.com>
Acked-by: Wei Liu <wei.liu2@citrix.com>
Julien Grall [Wed, 26 Jul 2017 17:17:06 +0000 (18:17 +0100)]
scripts/get_maintainers.pl: Don't blindly drop "THE REST" maintainers
"THE REST" maintainers should always be CCed for any modification that
doesn't fall under the responsibility of a specific component maintainer.
However, the script get_maintainers.pl removes "THE REST"
maintainers as soon as one maintainer of a specific component is
present.
Fix the script once and for all.
Signed-off-by: Julien Grall <julien.grall@arm.com>
Acked-by: Wei Liu <wei.liu2@citrix.com>
Andrew Cooper [Tue, 27 Jun 2017 12:43:06 +0000 (12:43 +0000)]
x86/cpuid: Rename *_policy to *_cpuid_policy
In the future, there will be other policy objects, e.g. MSR.
No functional change.
Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Reviewed-by: Wei Liu <wei.liu2@citrix.com>
Andrew Cooper [Wed, 26 Jul 2017 11:48:58 +0000 (12:48 +0100)]
x86/svm: Drop unused SVM_REG_* definitions
These are entirely unused, and are actually the general x86 register encoding,
rather than being SVM specific.
Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Reviewed-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Andrew Cooper [Fri, 30 Jun 2017 12:24:19 +0000 (12:24 +0000)]
x86/svm: Alias the VMCB segment registers as an array
This allows svm_{get,set}_segment_register() to access the user segments by
array index, as the x86_seg_* constants match the hardware encoding.
While making this alteration, add some newlines for clarity, switch an int for
a bool, and make the functions fail safe in a release build, rather than
crashing Xen.
Bloat-o-meter reports some modest improvements:
add/remove: 0/0 grow/shrink: 0/2 up/down: 0/-130 (-130)
function old new delta
svm_set_segment_register 662 653 -9
svm_get_segment_register 409 288 -121
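The aliasing idea can be sketched with an anonymous union, as below. This is an illustration under assumptions: `struct vmcb_segs`, `enum x86_seg` and `get_seg()` are hypothetical stand-ins, and `struct segment_register` is simplified. Only the hardware encoding order (ES, CS, SS, DS, FS, GS = 0..5) is taken as given.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* x86 hardware segment encoding: ES=0, CS=1, SS=2, DS=3, FS=4, GS=5. */
enum x86_seg { seg_es, seg_cs, seg_ss, seg_ds, seg_fs, seg_gs, seg_count };

/* Simplified stand-in for the common segment register structure. */
struct segment_register {
    uint16_t sel;
    uint16_t attr;
    uint32_t limit;
    uint64_t base;
};

/* Hypothetical container: named fields aliased by an array (C11). */
struct vmcb_segs {
    union {
        struct segment_register sreg[seg_count];
        struct {
            struct segment_register es, cs, ss, ds, fs, gs;
        };
    };
};

/* Build-time check that a named field lines up with its array slot. */
_Static_assert(offsetof(struct vmcb_segs, ds) ==
               seg_ds * sizeof(struct segment_register),
               "ds must alias sreg[seg_ds]");

/* Fail safe: an out-of-range index yields NULL rather than a wild access. */
static const struct segment_register *get_seg(const struct vmcb_segs *v,
                                              enum x86_seg seg)
{
    if ((unsigned int)seg >= seg_count)
        return NULL;
    return &v->sreg[seg];
}
```

Indexing by the encoding replaces a per-register switch statement, which is one plausible source of the size reduction reported above.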
Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Reviewed-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Andrew Cooper [Wed, 19 Jul 2017 11:37:53 +0000 (12:37 +0100)]
x86/vvmx: Fix auditing of MSR_BITMAP parameter
The MSR_BITMAP field is required to be page aligned. Also switch gpa to be a
uint64_t, as the MSR_BITMAP is strictly a 64bit VMCS field.
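A page-alignment audit of this kind can be sketched as follows. The helper `msr_bitmap_gpa_aligned()` and the 4K `PAGE_SIZE` are assumptions for the example. The actual audit in Xen may check more than alignment.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define PAGE_SIZE 4096ULL   /* assumed 4K pages */

/*
 * Hypothetical helper: accept a guest physical address for the MSR bitmap
 * only if it is page aligned.  The parameter is 64 bits wide so that
 * addresses above 4GiB are not truncated before the check.
 */
static bool msr_bitmap_gpa_aligned(uint64_t gpa)
{
    return (gpa & (PAGE_SIZE - 1)) == 0;
}
```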
Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Acked-by: Kevin Tian <kevin.tian@intel.com>
Andrew Cooper [Wed, 19 Jul 2017 09:28:03 +0000 (10:28 +0100)]
x86/vvmx: Fix handling of the MSR_BITMAP field with VMCS shadowing
Currently, the following sequence of actions:
* VMPTRLD (creates a mapping, likely pointing at gfn 0 for an empty vmcs)
* VMWRITE CPU_BASED_VM_EXEC_CONTROL (completed by hardware)
* VMWRITE MSR_BITMAP (completed by hardware)
* VMLAUNCH
results in an L2 guest running with ACTIVATE_MSR_BITMAP set, but Xen using a
stale mapping (likely gfn 0) when reading the interception bitmap. The
MSR_BITMAP field needs unconditionally intercepting even with VMCS shadowing,
so Xen's mapping of the bitmap can be updated.
Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Reviewed-by: Sergey Dyasli <sergey.dyasli@citrix.com>
Acked-by: Kevin Tian <kevin.tian@intel.com>
Andrew Cooper [Tue, 18 Jul 2017 14:55:03 +0000 (14:55 +0000)]
x86/vvmx: Switch nested MSR intercept handling to use struct vmx_msr_bitmap
Rename vmx_check_msr_bitmap() to vmx_msr_is_intercepted() in order to more
clearly identify what the boolean return value means. Change the int
access_type to bool is_write.
The NULL pointer check is moved out, as it doesn't pertain to whether the MSR
is intercepted or not. The check is moved into nvmx_n2_vmexit_handler(),
where it becomes a hard error in the case that ACTIVATE_MSR_BITMAP is set.
Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Acked-by: Kevin Tian <kevin.tian@intel.com>
Andrew Cooper [Tue, 18 Jul 2017 14:44:05 +0000 (14:44 +0000)]
x86/vmx: Introduce and use struct vmx_msr_bitmap
This avoids opencoding the bitmap bases in accessor functions. Introduce a
build_assertions() function to check the structure layout against the manual
definition. In addition, drop some stale comments and ASSERT() that callers
pass an in-range MSR.
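A structure of this shape might look like the sketch below. The layout follows the architectural MSR bitmap (four 1KiB regions: read low, read high, write low, write high), but `struct msr_bitmap`, its byte-array fields and `msr_is_intercepted()` are illustrative assumptions rather than the definitions added by the patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* One bit per MSR in each 1KiB region of the 4KiB bitmap page. */
struct msr_bitmap {
    uint8_t read_low[1024];    /* MSRs 0x00000000 - 0x00001fff */
    uint8_t read_high[1024];   /* MSRs 0xc0000000 - 0xc0001fff */
    uint8_t write_low[1024];
    uint8_t write_high[1024];
};

/* Build-time checks of the layout against the manual's offsets. */
_Static_assert(sizeof(struct msr_bitmap) == 4096, "bitmap is one page");
_Static_assert(offsetof(struct msr_bitmap, read_high) == 1024, "read high");
_Static_assert(offsetof(struct msr_bitmap, write_low) == 2048, "write low");
_Static_assert(offsetof(struct msr_bitmap, write_high) == 3072, "write high");

/* Accesses to MSRs outside both ranges always cause a VM exit. */
static bool msr_is_intercepted(const struct msr_bitmap *b, uint32_t msr,
                               bool is_write)
{
    const uint8_t *map;

    if (msr <= 0x1fff)
        map = is_write ? b->write_low : b->read_low;
    else if (msr >= 0xc0000000 && msr <= 0xc0001fff) {
        map = is_write ? b->write_high : b->read_high;
        msr &= 0x1fff;
    } else
        return true;

    return (map[msr / 8] & (1u << (msr % 8))) != 0;
}
```

Naming the four regions removes the need to compute base offsets by hand in every accessor.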
Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Acked-by: Kevin Tian <kevin.tian@intel.com>
Andrew Cooper [Tue, 18 Jul 2017 14:33:13 +0000 (14:33 +0000)]
x86/vpmu: Use vmx_{clear,set}_msr_intercept() rather than opencoding them
No functional change.
Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Reviewed-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Acked-by: Kevin Tian <kevin.tian@intel.com>
Andrew Cooper [Tue, 18 Jul 2017 14:14:32 +0000 (14:14 +0000)]
x86/vmx: Improvements to vmx_{dis,en}able_intercept_for_msr()
* Shorten the names to vmx_{clear,set}_msr_intercept()
* Use an enumeration for MSR_TYPE rather than a plain integer
* Introduce VMX_MSR_RW, as most callers alter both the read and write
intercept at the same time.
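The enumeration can be sketched as a pair of flag bits with a combined value. VMX_MSR_RW is named in the commit message; the VMX_MSR_R and VMX_MSR_W names, the numeric values, and the toy `struct intercepts` state are assumptions made for this example.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed encoding: read and write are independent bits. */
enum vmx_msr_intercept_type {
    VMX_MSR_R  = 1,
    VMX_MSR_W  = 2,
    VMX_MSR_RW = VMX_MSR_R | VMX_MSR_W,
};

/* Toy per-MSR state standing in for the real bitmap. */
struct intercepts {
    uint8_t read;
    uint8_t write;
};

static void set_msr_intercept(struct intercepts *i,
                              enum vmx_msr_intercept_type type)
{
    if (type & VMX_MSR_R)
        i->read = 1;
    if (type & VMX_MSR_W)
        i->write = 1;
}

static void clear_msr_intercept(struct intercepts *i,
                                enum vmx_msr_intercept_type type)
{
    if (type & VMX_MSR_R)
        i->read = 0;
    if (type & VMX_MSR_W)
        i->write = 0;
}
```

A caller that previously made two calls, one per direction, can pass VMX_MSR_RW once.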
No functional change.
Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Acked-by: Kevin Tian <kevin.tian@intel.com>
Andrew Cooper [Tue, 25 Jul 2017 18:48:43 +0000 (19:48 +0100)]
x86/hvm: Fix boundary check in hvmemul_insn_fetch()
c/s 0943a03037 added some extra protection against overflowing the emulation
instruction cache, but Coverity points out that the boundary condition is off by
one when memcpy()'ing out of the buffer.
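The commit message does not show the condition itself, so the following is a generic illustration of the boundary check involved when copying out of a fixed-size buffer, not the code in hvmemul_insn_fetch(). The helper `buf_read()` and its parameters are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/*
 * Copy 'bytes' bytes starting at offset 'off' out of a buffer holding
 * 'len' valid bytes.  The read is in bounds exactly when
 * off + bytes <= len; a read ending on the last byte must be accepted and
 * one ending a byte later must be rejected.  The test is written as
 * bytes <= len - off so that off + bytes cannot overflow.
 */
static bool buf_read(void *dst, const unsigned char *buf, size_t len,
                     size_t off, size_t bytes)
{
    if (off > len || bytes > len - off)
        return false;

    memcpy(dst, buf + off, bytes);
    return true;
}
```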
Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Reviewed-by: Paul Durrant <paul.durrant@citrix.com>
Wei Liu [Wed, 26 Jul 2017 07:44:56 +0000 (08:44 +0100)]
libxc: bail immediately when PV superpage is discovered
The original code was added with the hope that PV superpage migration
might work. But it was never proven that the code actually worked.
Now that PV superpage is gone, simplify the code by returning error
immediately.
Suggested-by: Andrew Cooper <andrew.cooper3@citrix.com>
Signed-off-by: Wei Liu <wei.liu2@citrix.com>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
Acked-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Wei Liu [Wed, 26 Jul 2017 07:44:55 +0000 (08:44 +0100)]
tools: nuke superpage parameters in code
Also fix the manpage, because there is no superpages option in xl.cfg.
Signed-off-by: Wei Liu <wei.liu2@citrix.com>
Acked-by: Ian Jackson <ian.jackson@eu.citrix.com>
Acked-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Wei Liu [Wed, 26 Jul 2017 07:44:54 +0000 (08:44 +0100)]
x86: nuke PV superpage option and code
Delete the user visible option and code for PV superpage support. The
mm code is modified as if the option is set to false (the default
value).
Return the address space occupied by spage_info back to the reserved
address space.
Signed-off-by: Wei Liu <wei.liu2@citrix.com>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
Acked-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Konrad Rzeszutek Wilk [Wed, 26 Jul 2017 17:18:45 +0000 (10:18 -0700)]
xen:arm earlyprintk configuration for Hikey 960 boards
Introduce an earlyprintk configuration of Hikey 960 boards.
Tested with:
https://github.com/96boards-hikey/edk2.git #testing/hikey960_v2.5
https://github.com/96boards-hikey/OpenPlatformPkg.git #testing/hikey960_v1.3.4
https://git.savannah.gnu.org/git/grub.git #master
https://github.com/96boards-hikey/linux.git #hikey960-upstream-rebase
For GRUB, the following stanza was used:
GRUB_MODULES="boot chain configfile echo efinet eval ext2 fat font gettext gfxterm gzio help linux loadenv lsefi normal part_gpt part_msdos read regexp search search_fs_file search_fs_uuid search_label terminal terminfo test tftp time xen_boot"
grub-install/usr/bin/grub-mkimage \
--config grub.config \
--dtb linux/arch/arm64/boot/dts/hisilicon/hi3660-hikey960.dtb \
--directory=grub/usr/lib64/grub/arm64-efi \
--output=grubaa64.efi \
--format=arm64-efi \
--prefix="/boot/grub" \
$GRUB_MODULES
And grub.config:
search.fs_label rootfs root
set prefix=($root)/boot/grub
configfile $prefix/grub.cfg
Signed-off-by: Konrad Rzeszutek Wilk <konrad@kernel.org>
Signed-off-by: Stefano Stabellini <sstabellini@kernel.org>
Reviewed-by: Stefano Stabellini <sstabellini@kernel.org>
Acked-by: Julien Grall <julien.grall@arm.com>
Dario Faggioli [Wed, 26 Jul 2017 14:55:29 +0000 (15:55 +0100)]
tools: tracing: handle null scheduler's events
In both xentrace and xenalyze.
Signed-off-by: Dario Faggioli <dario.faggioli@citrix.com>
Acked-by: George Dunlap <george.dunlap@citrix.com>
Dario Faggioli [Wed, 26 Jul 2017 14:55:29 +0000 (15:55 +0100)]
xen: sched_null: add some tracing
In line with what is there in all the other schedulers.
Signed-off-by: Dario Faggioli <dario.faggioli@citrix.com>
Dario Faggioli [Wed, 26 Jul 2017 14:55:28 +0000 (15:55 +0100)]
xen: sched-null: support soft-affinity
The null scheduler does not really use hard-affinity for
scheduling, it uses it for 'placement', i.e., for deciding
to what pCPU to statically assign a vCPU.
Let's use soft-affinity in the same way, of course with the
difference that, if there's no free pCPU within the vCPU's
soft-affinity, we go checking the hard-affinity, instead of
putting the vCPU in the waitqueue.
This has no impact on the scheduling overhead, because
soft-affinity is only considered in the cold path (like when a
vCPU joins the scheduler for the first time, or is manually
moved between pCPUs by the user).
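The two-step placement described above can be sketched with plain bitmasks. This is an illustration only: `pick_pcpu()` and the use of 64-bit masks in place of Xen's cpumasks are assumptions, and a return value of -1 stands for "put the vCPU in the waitqueue".

```c
#include <assert.h>
#include <stdint.h>

/*
 * Bit n set means pCPU n.  First look for a free pCPU inside both the soft
 * and the hard affinity; if there is none, fall back to hard affinity
 * alone.  Return -1 if no free pCPU is allowed at all.
 */
static int pick_pcpu(uint64_t free_cpus, uint64_t hard, uint64_t soft)
{
    const uint64_t steps[2] = {
        free_cpus & hard & soft,   /* step 1: soft-affinity */
        free_cpus & hard,          /* step 2: hard-affinity */
    };

    for (int s = 0; s < 2; s++)
        for (int cpu = 0; cpu < 64; cpu++)
            if (steps[s] & (UINT64_C(1) << cpu))
                return cpu;

    return -1;
}
```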
Signed-off-by: Dario Faggioli <dario.faggioli@citrix.com>
Reviewed-by: George Dunlap <george.dunlap@citrix.com>
Dario Faggioli [Wed, 26 Jul 2017 14:55:27 +0000 (15:55 +0100)]
xen: sched_null: check for pending tasklet work a bit earlier
Whether or not there is pending tasklet work to do is
something we know from the tasklet_work_scheduled parameter.
Deal with that as soon as possible, like all other schedulers
do.
Signed-off-by: Dario Faggioli <dario.faggioli@citrix.com>
Reviewed-by: George Dunlap <george.dunlap@citrix.com>
Dario Faggioli [Wed, 26 Jul 2017 14:55:27 +0000 (15:55 +0100)]
xen: sched: factor affinity helpers out of sched_credit.c
In fact, we want to be able to use them from any scheduler.
While there, make the moved code use 'v' for struct_vcpu*
variable, like it should be done everywhere.
No functional change intended.
Signed-off-by: Dario Faggioli <dario.faggioli@citrix.com>
Signed-off-by: Justin T. Weaver <jtweaver@hawaii.edu>
Reviewed-by: George Dunlap <george.dunlap@citrix.com>
Andrew Cooper [Mon, 5 Jun 2017 16:19:27 +0000 (17:19 +0100)]
x86/emul: Drop segment_attributes_t
The amount of namespace resolution is unnecessarily large, as all code deals
in terms of struct segment_register. This removes the attr.fields part of all
references, and alters attr.bytes to just attr.
Three areas of code using initialisers for segment_register are tweaked to
compile with older versions of GCC. arch_set_info_hvm_guest() has its SEG()
macros altered to use plain comma-based initialisation, while
{rm,vm86}_{cs,ds}_attr are simplified to plain numbers which matches their
description in the manuals.
No functional change. (For some reason, the old {rm,vm86}_{cs,ds}_attr cause
GCC to create variables in .rodata, whereas the new code uses immediate
operands. As a result, vmx_{get,set}_segment_register() are slightly
shorter.)
Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
Andrew Cooper [Mon, 5 Jun 2017 16:19:27 +0000 (17:19 +0100)]
x86/hvm: Rearrange check_segment() to use a switch statement
This simplifies the logic by separating the x86_segment check from the type
check. No functional change.
Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
Andrew Cooper [Fri, 30 Jun 2017 12:12:00 +0000 (12:12 +0000)]
x86/svm: Drop svm_segment_register_t
Most SVM code already uses struct segment_register. Drop the typedef and
adjust the definitions in struct vmcb_struct, and svm_dump_sel(). Introduce
some build-time assertions that struct segment_register from the common
emulation code is usable in struct vmcb_struct.
While making these adjustments, fix some comments to not mix decimal and
hexadecimal offsets, and drop all trailing whitespace in vmcb.h.
No functional change.
Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Andrew Cooper [Mon, 24 Jul 2017 16:28:25 +0000 (17:28 +0100)]
x86/pagewalk: Remove opt_allow_superpage check from guest_can_use_l2_superpages()
The purpose of guest_walk_tables() is to match the behaviour of real hardware.
A PV guest can have 2M superpages in its pagetables, via the M2P (and for dom0
via the initial P2M), even if the guest isn't permitted to create arbitrary 2M
superpage mappings.
guest_can_use_l2_superpages() checking opt_allow_superpage is a piece of PV
guest policy enforcement, rather than its intended purpose of meaning "would
hardware tolerate finding an L2 superpage with these control settings?"
Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Reviewed-by: Tim Deegan <tim@xen.org>
Reviewed-by: Wei Liu <wei.liu2@citrix.com>
Andrew Cooper [Wed, 18 Jan 2017 18:02:19 +0000 (18:02 +0000)]
x86/mm: Rename get_page_and_type_from_pagenr() to get_page_and_type_from_mfn()
'pagenr' is actually an mfn. Rename the function to use consistent
terminology, switching it to use a typesafe mfn_t.
No functional change.
Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Reviewed-by: Wei Liu <wei.liu2@citrix.com>
Andrew Cooper [Wed, 18 Jan 2017 17:58:42 +0000 (17:58 +0000)]
x86/mm: Rename get_page_from_pagenr() to get_page_from_mfn()
'pagenr' is actually an mfn. Rename the function to use consistent
terminology, switching it to use a typesafe mfn_t and boolean return type.
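The benefit of a typesafe frame number can be sketched as below. The wrapper type mirrors the idea of mfn_t, but `mfn_wrap()`, `mfn_unwrap()`, `MAX_MFN` and `page_exists()` are hypothetical names for this example and are not Xen's own helpers.

```c
#include <assert.h>
#include <stdbool.h>

/* Wrapping the integer in a struct stops it mixing with other integers. */
typedef struct { unsigned long mfn; } mfn_t;

static inline mfn_t mfn_wrap(unsigned long n)
{
    return (mfn_t){ n };
}

static inline unsigned long mfn_unwrap(mfn_t m)
{
    return m.mfn;
}

#define MAX_MFN 1024UL   /* hypothetical size of the frame table */

/*
 * Boolean result instead of an int: true if the frame is valid.  Passing
 * a bare unsigned long (e.g. a gfn) here is a compile error.
 */
static bool page_exists(mfn_t mfn)
{
    return mfn_unwrap(mfn) < MAX_MFN;
}
```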
No functional change.
Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Reviewed-by: Wei Liu <wei.liu2@citrix.com>
Chao Gao [Tue, 25 Jul 2017 10:48:26 +0000 (18:48 +0800)]
Revert "VT-d: fix VF of RC integrated PF matched to wrong VT-d unit"
This reverts commit 89df98b77d28136c4d7aade13a1c8bc154d2919f, which
causes a Xen crash when loading the VF driver. The reason seems to be that
pci_get_pdev() can't be called with interrupts disabled. I don't have a
quick fix for this; therefore revert this patch to let the common cases
work well. As for the corner case I intended to fix, I will propose another
solution later.
Below is the call trace of Xen crash:
(XEN) Xen BUG at spinlock.c:47
(XEN) ----[ Xen-4.10-unstable x86_64 debug=y Tainted: C ]----
(XEN) CPU: 2
(XEN) RIP: e008:[<ffff82d08023513c>] spinlock.c#check_lock+0x3c/0x40
(XEN) RFLAGS: 0000000000010046 CONTEXT: hypervisor (d0v2)
(XEN) rax: 0000000000000000 rbx: ffff82d08043b9c8 rcx: 0000000000000001
(XEN) rdx: 0000000000000000 rsi: 0000000000000000 rdi: ffff82d08043b9ce
(XEN) rbp: ffff83043c47fa50 rsp: ffff83043c47fa50 r8: 0000000000000000
(XEN) r9: 0000000000000000 r10: 0000000000000000 r11: 0000ffff0000ffff
(XEN) r12: 0000000000000001 r13: 0000000000000000 r14: 0000000000000072
(XEN) r15: ffff83043c006c00 cr0: 0000000080050033 cr4: 00000000003526e0
(XEN) cr3: 000000081b39a000 cr2: ffff88016c058548
(XEN) ds: 0000 es: 0000 fs: 0000 gs: 0000 ss: e010 cs: e008
(XEN) Xen code around <ffff82d08023513c> (spinlock.c#check_lock+0x3c/0x40):
(XEN) 98 83 f2 01 39 d0 75 02 <0f> 0b 5d c3 55 48 89 e5 f0 ff 05 a1 f6 1e 00 5d
(XEN) Xen stack trace from rsp=ffff83043c47fa50:
(XEN) ffff83043c47fa68 ffff82d080235234 0000000000000005 ffff83043c47fa78
(XEN) ffff82d080251df3 ffff83043c47fab8 ffff82d080251e80 ffff83043c47fac8
(XEN) ffff83043c422580 ffff83042e973cd0 0000000000000005 ffff83042e9609e0
(XEN) 0000000000000072 ffff83043c47fae8 ffff82d08025795a ffff83043c47fb18
(XEN) ffff83043c47fc18 ffff83043c47fc18 ffff83042e9609e0 ffff83043c47fba8
(XEN) ffff82d080259be1 ffff83043c47fb10 ffff82d08023516b 0000000000000246
(XEN) ffff83043c47fb28 0000000000000206 0000000000000002 ffff83043c47fb58
(XEN) ffff82d080290e38 ffff83042e973cd0 ffff83043c532000 ffff83043c532000
(XEN) ffff83042e973db0 ffff83043c47fb68 ffff82d080354dd0 ffff83043c47fc18
(XEN) ffff82d080274e07 0000000000000040 ffff83042e9609e0 ffff83043c47fc18
(XEN) ffff83043c47fc18 0000000000000072 ffff83043c006c00 ffff83043c47fbb8
(XEN) ffff82d0802526f7 ffff83043c47fc08 ffff82d080273c17 ffff83043ff99d90
(XEN) ffff83043c006c00 ffff83043c47fc08 ffff83043c006c00 ffff83042e9609e0
(XEN) ffff83043c47fc18 0000000000000072 ffff83043c006c00 ffff83043c47fc48
(XEN) ffff82d0802754d1 00000000feeff00c 00000fff000041ca 0000000000000002
(XEN) ffff83042e9609e0 ffff83042e973cd0 0000000000000002 ffff83043c47fc88
(XEN) ffff82d0802755a8 ffff83043c47fc70 0000000000000246 ffff83043c532000
(XEN) 000000000000006c ffff83043c006c00 0000000000000000 ffff83043c47fd28
(XEN) ffff82d080279b4f ffff83043c532000 ffff83043c47fe00 ffff83043c47fcd8
(XEN) ffff83042e973d20 ffff83043c47fcf0 ffff830400000325 0000000000000246
(XEN) Xen call trace:
(XEN) [<ffff82d08023513c>] spinlock.c#check_lock+0x3c/0x40
(XEN) [<ffff82d080235234>] _spin_is_locked+0x11/0x4d
(XEN) [<ffff82d080251df3>] pcidevs_locked+0x10/0x17
(XEN) [<ffff82d080251e80>] pci_get_pdev+0x2f/0xfd
(XEN) [<ffff82d08025795a>] acpi_find_matched_drhd_unit+0x4d/0x11a
(XEN) [<ffff82d080259be1>] msi_msg_write_remap_rte+0x2f/0x749
(XEN) [<ffff82d0802526f7>] iommu_update_ire_from_msi+0x36/0x38
(XEN) [<ffff82d080273c17>] msi.c#write_msi_msg+0x3f/0x188
(XEN) [<ffff82d0802754d1>] __setup_msi_irq+0x3a/0x5c
(XEN) [<ffff82d0802755a8>] setup_msi_irq+0xb5/0xf7
(XEN) [<ffff82d080279b4f>] map_domain_pirq+0x445/0x653
(XEN) [<ffff82d08027aa99>] allocate_and_map_msi_pirq+0x10d/0x184
(XEN) [<ffff82d080291258>] physdev_map_pirq+0x1f8/0x26b
(XEN) [<ffff82d0802919a6>] do_physdev_op+0x595/0x110f
(XEN) [<ffff82d080352db0>] pv_hypercall+0x1ef/0x42c
(XEN) [<ffff82d080356606>] entry.o#test_all_events+0/0x30
(XEN)
(XEN)
(XEN) ****************************************
(XEN) Panic on CPU 2:
(XEN) Xen BUG at spinlock.c:47
(XEN) ****************************************
(XEN)
(XEN) Reboot in five seconds...
Signed-off-by: Chao Gao <chao.gao@intel.com>
Acked-by: Andrew Cooper <andrew.cooper3@citrix.com>
Andrew Cooper [Tue, 25 Jul 2017 10:40:40 +0000 (11:40 +0100)]
xen: Drop repeated semicolons
No functional change.
Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Acked-by: Julien Grall <julien.grall@arm.com>
Reviewed-by: Wei Liu <wei.liu2@citrix.com>
Acked-by: George Dunlap <george.dunlap@citrix.com>
David Woodhouse [Tue, 25 Jul 2017 09:21:37 +0000 (10:21 +0100)]
xen/link: Move .data.rel.ro sections into .rodata for final link
This includes stuff like the hypercall tables which we really kind of want
to be read-only. And they were going into .data.read-mostly.
Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
Reviewed-by: Wei Liu <wei.liu2@citrix.com>
Acked-by: Julien Grall <julien.grall@arm.com>
Acked-by: Andrew Cooper <andrew.cooper3@citrix.com>
Andrii Anisov [Tue, 18 Jul 2017 16:45:30 +0000 (19:45 +0300)]
xen:Kconfig: Make SCIF built by default for ARM
Both Renesas R-Car Gen2(ARM32) and Gen3(ARM64) are utilizing SCIF IP,
so make its serial driver built by default for ARM.
Signed-off-by: Andrii Anisov <andrii_anisov@epam.com>
Acked-by: Julien Grall <julien.grall@arm.com>
Olaf Hering [Wed, 24 May 2017 09:12:40 +0000 (11:12 +0200)]
docs: correct paragraph indention in xen-tscmode
Signed-off-by: Olaf Hering <olaf@aepfle.de>
Acked-by: Ian Jackson <ian.jackson@eu.citrix.com>
Acked-by: Wei Liu <wei.liu2@citrix.com>
Olaf Hering [Wed, 24 May 2017 09:12:24 +0000 (11:12 +0200)]
docs: replace xm with xl in xen-tscmode
Signed-off-by: Olaf Hering <olaf@aepfle.de>
Acked-by: Ian Jackson <ian.jackson@eu.citrix.com>
Acked-by: Wei Liu <wei.liu2@citrix.com>