x86: improve MSR_SHADOW_GS accesses
Instead of using RDMSR/WRMSR, on fsgsbase-capable systems use a double
SWAPGS combined with RDGSBASE/WRGSBASE. This halves execution time for
a shadow GS update alone on my Haswell (and we have indications of
good performance improvements by this on Skylake too), while the win is
even higher when e.g. updating more than one base (as may and commonly
will happen in load_segments()).
Signed-off-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
Reviewed-by: Kevin Tian <kevin.tian@intel.com>