In this case, we know better than the compiler.
GCC 4.7 (Debian Wheezy) declines to inline stac() and clac(). Instead, even
for non-debug builds, it emits translation-unit-local out-of-line copies of
both functions and calls them.
$ objdump -d xen-syms | grep -c "<stac>:"
6
$ objdump -d xen-syms | grep -o "callq [0-9a-f]\+ <stac>" | uniq -c
      5 callq ffff82d0801166c9 <stac>
     20 callq ffff82d08015ef99 <stac>
      4 callq ffff82d080165169 <stac>
      8 callq ffff82d080188cb9 <stac>
      3 callq ffff82d080228779 <stac>
      4 callq ffff82d08022c5c9 <stac>
Forcing always_inline removes these out-of-line functions, and replaces each
of the callqs with the expected 3-byte nops.
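For illustration only, the effect can be reproduced outside the tree with a
minimal sketch. The always_inline definition below is an assumption about how
such a macro is commonly spelled for GCC, and stac_sketch(), calls and
count_after_two() are hypothetical names which exist only in this example.

```c
/* Assumed spelling of the macro; the in-tree definition may differ. */
#define always_inline inline __attribute__((__always_inline__))

/* Hypothetical counter standing in for the real side effect of stac(). */
static int calls;

/*
 * With plain "static inline", the compiler is free to emit a local
 * out-of-line copy and call it.  The always_inline attribute asks GCC to
 * inline the body at every call site instead.
 */
static always_inline void stac_sketch(void)
{
    ++calls;
}

/* Two call sites; each one should receive an inlined copy of the body. */
int count_after_two(void)
{
    calls = 0;
    stac_sketch();
    stac_sketch();
    return calls;
}
```

Compiling this with "gcc -O2 -c" and running the same objdump/grep check as
above against the resulting object should find no <stac_sketch> symbol and no
callq to it.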
Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
#define ASM_STAC ASM_AC(STAC)
#define ASM_CLAC ASM_AC(CLAC)
#else
-static inline void clac(void)
+static always_inline void clac(void)
{
/* Note: a barrier is implicit in alternative() */
alternative(ASM_NOP3, __stringify(__ASM_CLAC), X86_FEATURE_SMAP);
}
-static inline void stac(void)
+static always_inline void stac(void)
{
/* Note: a barrier is implicit in alternative() */
alternative(ASM_NOP3, __stringify(__ASM_STAC), X86_FEATURE_SMAP);