The generic domain creation logic in
xen/common/domctl.c:default_vcpu0_location() attempts to do initial
placement load-balancing by placing vcpu 0 on the least-busy
non-primary hyperthread available. Unfortunately, the logic can end
up picking a pcpu that's not in the online mask. When this is passed
to a scheduler which assumes that the initial assignment is valid, it
causes a null pointer dereference when looking up the runqueue.
Furthermore, this initial placement doesn't take into account hard or
soft affinity, or any scheduler-specific knowledge (such as historic
runqueue load, as in credit2).
To solve this, when inserting a vcpu, always call the per-scheduler
"pick" function to revise the initial placement. This will
automatically take all knowledge the scheduler has into account.
csched2_cpu_pick ASSERTs that the vcpu's pcpu scheduler lock has been
taken. Grab and release the lock to minimize the time spent with irqs
disabled.
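The resulting pattern can be sketched as follows. This is not Xen
code: it is a minimal stand-in using a pthread mutex in place of the
pcpu scheduler lock, with hypothetical names (cpu_pick, runq_insert,
insert_vcpu), purely to illustrate the grab/release sequence around
the pick call described above.

```c
/* Sketch (not Xen code) of the locking pattern: the pick function
 * must run with the scheduler lock held, but the lock is dropped
 * immediately afterwards to minimize time with the lock (and, in
 * Xen, irqs) disabled, then re-taken around the runqueue insertion. */
#include <assert.h>
#include <pthread.h>

static pthread_mutex_t sched_lock = PTHREAD_MUTEX_INITIALIZER;
static int lock_held;

/* Stand-in for csched2_cpu_pick(): ASSERTs the lock is held. */
static int cpu_pick(void)
{
    assert(lock_held);  /* mirrors the ASSERT in csched2_cpu_pick */
    return 3;           /* pretend pcpu 3 is the least loaded */
}

static void runq_insert(int cpu)
{
    assert(lock_held);  /* runqueue manipulation also needs the lock */
    (void)cpu;
}

int insert_vcpu(void)
{
    int cpu;

    pthread_mutex_lock(&sched_lock);   /* vcpu_schedule_lock_irq() */
    lock_held = 1;
    cpu = cpu_pick();                  /* revise the initial placement */
    lock_held = 0;
    pthread_mutex_unlock(&sched_lock); /* spin_unlock_irq(lock) */

    pthread_mutex_lock(&sched_lock);   /* re-take for the insertion */
    lock_held = 1;
    runq_insert(cpu);
    lock_held = 0;
    pthread_mutex_unlock(&sched_lock);

    return cpu;
}
```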
Signed-off-by: George Dunlap <george.dunlap@citrix.com>
Reviewed-by: Meng Xu <mengxu@cis.upenn.edu>
Reviewed-by: Dario Faggioli <dario.faggioli@citrix.com>
BUG_ON( is_idle_vcpu(vc) );
+ /* This is safe because vc isn't yet being scheduled */
+ vc->processor = csched_cpu_pick(ops, vc);
+
lock = vcpu_schedule_lock_irq(vc);
if ( !__vcpu_on_runq(svc) && vcpu_runnable(vc) && !vc->is_running )
ASSERT(!is_idle_vcpu(vc));
ASSERT(list_empty(&svc->runq_elem));
- /* Add vcpu to runqueue of initial processor */
+ /* csched2_cpu_pick() expects the pcpu lock to be held */
+ lock = vcpu_schedule_lock_irq(vc);
+
+ vc->processor = csched2_cpu_pick(ops, vc);
+
+ spin_unlock_irq(lock);
+
lock = vcpu_schedule_lock_irq(vc);
+ /* Add vcpu to runqueue of initial processor */
runq_assign(ops, vc);
vcpu_schedule_unlock_irq(lock, vc);
BUG_ON( is_idle_vcpu(vc) );
+ /* This is safe because vc isn't yet being scheduled */
+ vc->processor = rt_cpu_pick(ops, vc);
+
lock = vcpu_schedule_lock_irq(vc);
now = NOW();