The introduction of vcpu_force_reschedule() in 14320:
215b799fa181 was
incompatible with the SEDF scheduler: Any vCPU using
VCPUOP_stop_periodic_timer (e.g. any vCPU of half way modern PV Linux
guests) ends up on pCPU0 after that call. Obviously, running all PV
guests' (and namely Dom0's) vCPU-s on pCPU0 causes problems for those
guests rather sooner than later.
So the main thing that was clearly wrong (and bogus from the beginning)
was the use of cpumask_first() in sedf_pick_cpu(). It is being replaced
by a construct that prefers to put back the vCPU on the pCPU that it
got launched on.
However, there's one more glitch: When reducing the affinity of a vCPU
temporarily, and then widening it again to a set that includes the pCPU
that the vCPU was last running on, the generic scheduler code would not
force a migration of that vCPU, and hence it would forever stay on the
pCPU it last ran on. Since that can again create a load imbalance, the
SEDF scheduler wants a migration to happen regardless of it being
apparently unnecessary.
Of course, an alternative to checking for SEDF explicitly in
vcpu_set_affinity() would be to introduce a flags field in struct
scheduler, and have SEDF set a "always-migrate-on-affinity-change"
flag.
Signed-off-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Keir Fraser <keir@xen.org>
online = cpupool_scheduler_cpumask(v->domain->cpupool);
cpumask_and(&online_affinity, v->cpu_affinity, online);
- return cpumask_first(&online_affinity);
+ return cpumask_cycle(v->vcpu_id % cpumask_weight(&online_affinity) - 1,
+ &online_affinity);
}
/*
vcpu_schedule_lock_irq(v);
cpumask_copy(v->cpu_affinity, affinity);
- if ( !cpumask_test_cpu(v->processor, v->cpu_affinity) )
+ if ( VCPU2OP(v)->sched_id == XEN_SCHEDULER_SEDF ||
+ !cpumask_test_cpu(v->processor, v->cpu_affinity) )
set_bit(_VPF_migrating, &v->pause_flags);
vcpu_schedule_unlock_irq(v);