There's no need to subtract _QR_BIAS from the lock value for storing
in the local cnts variable in the read lock slow path: the users of
the value in cnts only care about the writer-related bits and use a
mask to get the value.
Note that further setting of cnts in rspin_until_writer_unlock already
do not subtract _QR_BIAS.
Originally _QR_BIAS was subtracted from the result of
atomic_add_return_acquire in order to prevent GCC from emitting an
unneeded ADD instruction. This being in the lock slow path such
optimizations don't seem likely to make any relevant performance
difference. Also modern GCC and CLANG versions will already avoid
emitting the ADD instruction.
Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Luca Fancellu <luca.fancellu@arm.com>
while ( atomic_read(&lock->cnts) & _QW_WMASK )
cpu_relax();
- cnts = atomic_add_return(_QR_BIAS, &lock->cnts) - _QR_BIAS;
+ cnts = atomic_add_return(_QR_BIAS, &lock->cnts);
rspin_until_writer_unlock(lock, cnts);
/*