xen/smp: Speed up on_selected_cpus()
author Andrew Cooper <andrew.cooper3@citrix.com>
Fri, 4 Feb 2022 20:12:04 +0000 (20:12 +0000)
committer Andrew Cooper <andrew.cooper3@citrix.com>
Mon, 7 Feb 2022 17:41:24 +0000 (17:41 +0000)
cpumask_weight() is an incredibly expensive way to find if no bits are set,
made worse by the fact that the calculation is performed with the global
call_lock held.

This appears to be a missing optimisation from c/s 433f14699d48 ("x86: Clean
up smp_call_function handling.") in 2011 which dropped the logic requiring the
count of CPUs.

Switch to using cpumask_empty() instead, which will short circuit as soon as
it finds any set bit in the cpumask.
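
To illustrate the cost difference, here is a minimal standalone sketch. It is
not Xen's implementation: the sketch_* names, the fixed 256-bit mask and the
word-at-a-time loops are assumptions made for illustration only. The weight
helper must visit every word and count every set bit, while the empty helper
can return on the first non-zero word.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for a cpumask; sizes are chosen for illustration. */
#define SKETCH_NR_CPUS 256
#define SKETCH_BITS_PER_LONG (sizeof(unsigned long) * CHAR_BIT)
#define SKETCH_NR_LONGS \
    ((SKETCH_NR_CPUS + SKETCH_BITS_PER_LONG - 1) / SKETCH_BITS_PER_LONG)

struct sketch_cpumask {
    unsigned long bits[SKETCH_NR_LONGS];
};

/* Mark one CPU as selected. */
static void sketch_set_cpu(struct sketch_cpumask *m, unsigned int cpu)
{
    assert(cpu < SKETCH_NR_CPUS);
    m->bits[cpu / SKETCH_BITS_PER_LONG] |= 1UL << (cpu % SKETCH_BITS_PER_LONG);
}

/* Count set bits: always walks the whole mask, whatever it contains. */
static unsigned int sketch_weight(const struct sketch_cpumask *m)
{
    unsigned int count = 0;

    for ( size_t i = 0; i < SKETCH_NR_LONGS; i++ )
        for ( unsigned long w = m->bits[i]; w; w &= w - 1 )
            count++;    /* each iteration clears the lowest set bit */

    return count;
}

/* Emptiness test: stops at the first word with any bit set. */
static bool sketch_empty(const struct sketch_cpumask *m)
{
    for ( size_t i = 0; i < SKETCH_NR_LONGS; i++ )
        if ( m->bits[i] )
            return false;

    return true;
}
```

Under these assumptions, a caller that only tests "weight == 0" pays for the
full count and then discards it, which is the pattern this change removes.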

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Reviewed-by: Julien Grall <jgrall@amazon.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
diff --git a/xen/common/smp.c b/xen/common/smp.c
index 781bcf2c246c6375a463831ba451c87f898a5f4f..a011f541f1eae0ae8af327cb64bff336186cc0dc 100644
--- a/xen/common/smp.c
+++ b/xen/common/smp.c
@@ -50,8 +50,6 @@ void on_selected_cpus(
     void *info,
     int wait)
 {
-    unsigned int nr_cpus;
-
     ASSERT(local_irq_is_enabled());
     ASSERT(cpumask_subset(selected, &cpu_online_map));
 
@@ -59,8 +57,7 @@ void on_selected_cpus(
 
     cpumask_copy(&call_data.selected, selected);
 
-    nr_cpus = cpumask_weight(&call_data.selected);
-    if ( nr_cpus == 0 )
+    if ( cpumask_empty(&call_data.selected) )
         goto out;
 
     call_data.func = func;