x86/pv: Optimise prefetching in svm_load_segs()
Split svm_load_segs() into two functions: passing a load of zeros in just to
get the prefetch results in unnecessary caller setup code.
Update the prefetching comment to note that the main point is the TLB fill.
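As a rough sketch of the split (the structure layout, field names and the
pointer parameter below are illustrative placeholders, not the exact Xen
definitions), the prefetch-only helper amounts to touching the host VMCB so
the TLB entry and first cache line are hot before the real writes happen:

    #include <stdint.h>

    /* Illustrative stand-in for the VMCB save area; the real layout differs
     * in detail. */
    struct vmcb_seg {
        uint16_t sel, attr;
        uint32_t limit;
        uint64_t base;
    };

    struct vmcb_struct {
        struct vmcb_seg es, cs, ss, ds, fs, gs, gdtr, ldtr, idtr, tr;
        /* ... remainder of the save area ... */
    };

    /* Prefetch-only path: the main point is the TLB fill for the host VMCB
     * mapping.  Prefetching the lowest field later written (fs) also primes
     * the hardware's next-line prefetcher. */
    void svm_load_segs_prefetch(struct vmcb_struct *vmcb)
    {
        __builtin_prefetch(&vmcb->fs, 1 /* prefetch for write */);
    }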
Reorder the writes in svm_load_segs() to access the VMCB fields in ascending
address order, which gets better next-line prefetch behaviour out of the
hardware. Update the prefetch instruction to match.
The net delta is:
  add/remove: 1/0 grow/shrink: 0/2 up/down: 38/-39 (-1)
  Function                                    old     new   delta
  svm_load_segs_prefetch                        -      38     +38
  __context_switch                            967     951     -16
  svm_load_segs                               291     268     -23
Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>