x86/emul: Optimise decode_register() somewhat
author    Andrew Cooper <andrew.cooper3@citrix.com>
          Thu, 25 Jan 2018 12:16:12 +0000 (12:16 +0000)
committer Andrew Cooper <andrew.cooper3@citrix.com>
          Wed, 31 Jan 2018 12:23:11 +0000 (12:23 +0000)
commit    004f98ffc7e57cac17365341c07389aa8d0418be
tree      ab0271110c93871e1674b5aa3f5f7ecd6ca74626
parent    95f64ae3c97613a13f24ab12f9910580b048c112
x86/emul: Optimise decode_register() somewhat

The positions of GPRs inside struct cpu_user_regs don't follow any
particular order, so, as compiled, decode_register() becomes a jump table to
16 blocks which each calculate the appropriate offset, at a total of 207 bytes.

Instead, pre-compute the offsets at build time and use pointer arithmetic to
calculate the result.  By observation, most callers in x86_emulate() inline
and constant-propagate the highbyte_regs value of 0.
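
A minimal sketch of the replacement idea, reusing the hypothetical struct
from the sketch above (the table and helper names are illustrative, not
necessarily those used in the patch):

    #include <stddef.h>
    #include <stdint.h>

    #define GPR_OFF(r) offsetof(struct cpu_user_regs, r)

    /* Offset of each GPR within the frame, indexed by its ModRM encoding
     * and computed at build time. */
    static const uint8_t gpr_offsets[16] = {
        GPR_OFF(rax), GPR_OFF(rcx), GPR_OFF(rdx), GPR_OFF(rbx),
        GPR_OFF(rsp), GPR_OFF(rbp), GPR_OFF(rsi), GPR_OFF(rdi),
        GPR_OFF(r8),  GPR_OFF(r9),  GPR_OFF(r10), GPR_OFF(r11),
        GPR_OFF(r12), GPR_OFF(r13), GPR_OFF(r14), GPR_OFF(r15),
    };

    /* Post-patch shape: a single table load plus pointer arithmetic. */
    static void *decode_gpr(struct cpu_user_regs *regs, unsigned int modrm)
    {
        /* Masking stands in for bounds checking of the caller-supplied index. */
        return (char *)regs + gpr_offsets[modrm & 15];
    }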

The splitting of the general and legacy byte-op cases means that we will now
hit an ASSERT if any code path tries to use the legacy byte-op encoding with a
REX prefix.
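
A hedged sketch of what that split might look like, again with hypothetical
names, and Xen's ASSERT() approximated by <assert.h>:

    #include <assert.h>

    /* Legacy byte-op case: with no REX prefix, encodings 4-7 select
     * AH/CH/DH/BH, i.e. byte 1 (little-endian) of rAX/rCX/rDX/rBX.
     * All other decodes go via the general decode_gpr() path above. */
    static void *decode_byte_reg_legacy(struct cpu_user_regs *regs,
                                        unsigned int modrm, unsigned int rex)
    {
        /* Reaching here with a REX prefix is a caller bug. */
        assert(!rex);
        assert(modrm < 8);

        if ( modrm < 4 )
            return (char *)regs + gpr_offsets[modrm];      /* AL/CL/DL/BL */

        return (char *)regs + gpr_offsets[modrm & 3] + 1;  /* AH/CH/DH/BH */
    }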

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
xen/arch/x86/x86_emulate/x86_emulate.c