x86/ctxt-switch: Document and improve GDT handling

Calling virt_to_mfn() in the context switch path is a lot
of wasted cycles for a result which is constant after boot.

Begin by documenting how Xen handles the GDTs across context switch.
The loop in write_full_gdt_ptes() is unnecessary, because
NR_RESERVED_GDT_PAGES is 1. Dropping it makes the code substantially
clearer, and with it dropped, write_full_gdt_ptes() is more obviously
a poor name, so rename it to update_xen_slot_in_full_gdt().
Furthermore, load_full_gdt() is completely independent of the current
CPU, and load_default_gdt() only needs the current CPU's regular
GDT. (This is a change in behaviour, as previously it may have used the
compat GDT, but either will do.)
Add two extra per-cpu variables which cache the L1e for the regular and compat
GDT, calculated in cpu_smpboot_alloc()/trap_init() as appropriate, so
update_xen_slot_in_full_gdt() doesn't need to waste time performing the same
calculation on every context switch.
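As an illustration only, the caching idea can be modelled in plain C outside the hypervisor. This is a sketch under stated assumptions, not the code in this patch: every model_* name, the cached_* arrays, NR_CPUS, PTE_FLAGS and the call counter are hypothetical stand-ins, and model_virt_to_mfn() merely imitates an address translation. The point it shows is that the translation runs once per GDT at CPU bring-up, while the context switch path only copies a precomputed value.

```c
#include <assert.h>
#include <stdint.h>

#define NR_CPUS    8        /* hypothetical CPU count for this model */
#define PAGE_SHIFT 12
#define PTE_FLAGS  0x3ULL   /* stand-in for "present + writable" bits */

typedef uint64_t l1e_t;     /* hypothetical model of an L1 page table entry */

/* Counts translations, so the saving on the hot path is observable. */
static unsigned int translate_calls;

/* Stand-in for virt_to_mfn(): the work we want to avoid repeating. */
static uint64_t model_virt_to_mfn(uintptr_t va)
{
    translate_calls++;
    return (uint64_t)va >> PAGE_SHIFT;
}

/* Build an entry from a frame number plus fixed flags. */
static l1e_t model_l1e_from_mfn(uint64_t mfn)
{
    return (mfn << PAGE_SHIFT) | PTE_FLAGS;
}

/* Model of the two per-CPU cache variables (regular and compat GDT). */
static l1e_t cached_gdt_l1e[NR_CPUS];
static l1e_t cached_compat_gdt_l1e[NR_CPUS];

/* Runs once per CPU at bring-up: translate each GDT address a single time. */
static void model_cpu_init(unsigned int cpu, uintptr_t gdt_va,
                           uintptr_t compat_gdt_va)
{
    assert(cpu < NR_CPUS);
    cached_gdt_l1e[cpu] = model_l1e_from_mfn(model_virt_to_mfn(gdt_va));
    cached_compat_gdt_l1e[cpu] =
        model_l1e_from_mfn(model_virt_to_mfn(compat_gdt_va));
}

/* Context switch path: copy the cached entry; no translation happens here. */
static void model_update_xen_slot(l1e_t *slot, unsigned int cpu, int is_compat)
{
    assert(cpu < NR_CPUS);
    *slot = is_compat ? cached_compat_gdt_l1e[cpu] : cached_gdt_l1e[cpu];
}
```

In this model, calling model_cpu_init() once for a CPU and then model_update_xen_slot() any number of times leaves translate_calls at 2, which is the property the per-cpu cache is meant to provide.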
One performance scenario of Juergen's (time to build the hypervisor on
an 8 CPU system, with two single-vCPU MiniOS VMs constantly interrupting
dom0 with events) shows the following, averaged over 5 measurements:

              elapsed    user  system
  Unpatched     66.51  232.93  109.21
  Patched       57.00  225.47  105.47

which is a substantial improvement: elapsed time drops by roughly 14%.
Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Reviewed-by: Juergen Gross <jgross@suse.com>
Tested-by: Juergen Gross <jgross@suse.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>