More efficient TLB-flush filtering in alloc_heap_pages().
Rather than filtering the CPU mask separately for every page in a
super-page allocation, simply remember the most recent TLB timestamp
across all allocated pages, and filter on that, just once, at the end
of the function.
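The idea can be sketched roughly as follows. This is a simplified,
standalone model, not the actual Xen code: struct page, cpu_last_flush,
filter_mask(), flush_mask_for_alloc() and the 8-CPU bitmask are all
hypothetical stand-ins, and timestamp wrap-around is ignored.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define NR_CPUS 8  /* hypothetical; small enough to fit a uint32_t mask */

/* Hypothetical stand-in for the per-page flush bookkeeping. */
struct page {
    bool     need_tlbflush;       /* page may still be cached in some TLB */
    uint32_t tlbflush_timestamp;  /* time at which the page was last used */
};

/* Hypothetical per-CPU record of the last TLB flush time. */
static uint32_t cpu_last_flush[NR_CPUS];

/* Drop from 'mask' every CPU that has flushed since 'stamp'. Cost: O(NR_CPUS). */
static uint32_t filter_mask(uint32_t mask, uint32_t stamp)
{
    for (unsigned int cpu = 0; cpu < NR_CPUS; cpu++)
        if (cpu_last_flush[cpu] > stamp)
            mask &= ~(UINT32_C(1) << cpu);
    return mask;
}

/*
 * Return the set of CPUs that must flush before the pages are reused.
 * The loop only tracks the most recent timestamp; the O(NR_CPUS) filter
 * runs once at the end rather than once per page.
 */
static uint32_t flush_mask_for_alloc(const struct page *pg, size_t nr_pages)
{
    bool need_tlbflush = false;
    uint32_t latest = 0;

    for (size_t i = 0; i < nr_pages; i++) {
        if (pg[i].need_tlbflush &&
            (!need_tlbflush || pg[i].tlbflush_timestamp > latest)) {
            need_tlbflush = true;
            latest = pg[i].tlbflush_timestamp;
        }
    }

    if (!need_tlbflush)
        return 0;

    return filter_mask((UINT32_C(1) << NR_CPUS) - 1, latest);
}
```

In this model the result matches the union of the per-page filtered
masks, because any CPU that needs a flush for some page also needs one
for the page with the latest timestamp.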
On systems with a large number of CPUs, where domain creation performs
2MB allocations, this cuts down the domain creation time *massively*.
TODO: It may make sense to move the filtering out into some callers,
such as memory.c:populate_physmap() and
memory.c:increase_reservation(), so that the filtering can be moved
outside their loops, too.
Signed-off-by: Keir Fraser <keir@xen.org>