> From: Jan Beulich [mailto:JBeulich@xxxxxxxxxx]
> >>> Mukesh Rathor <mukesh.rathor@xxxxxxxxxx> 19.12.09 05:43 >>>
> >if first ever timer interrupt comes after start_jiffies is set, dom0 boot
> >may hang if delta in timer_interrupt() is so huge that it causes jiffies
> >to wrap. It appears delta is very large when memory is more than 512GB on
> >certain boxes causing wrap around.
> >why is delta in dom0->timer_interrupt() related to memory on system?
> >Because hyp creates dom0, then page scrubs, then unpauses vcpu. so it
> >appears lot of page scrubbing results in huge delta on first tick.
> Based on prior analysis of similar problems, I'm not convinced this is the
> right solution: Kernel code should not need changing here. Instead, I'd
> recommend trying to insert a call to process_pending_timers() every so
> many pages scrubbed (just like is e.g. being done in the P2M/M2P table
> population code).
Mukesh has dug into this a lot deeper than I, but I think
process_pending_timers() is irrelevant here. When dom0
is constructed, its data space is initialized in memory
and jiffies has been initialized in the data section with
a fixed value of -300 * HZ. At this point, dom0 lives in
memory but has not executed a single instruction, so is
not capable of receiving any interrupts. I *think* Xen
also initializes a clocksource (pvclock?) here.
Then scrub_heap_pages() occurs which eats up a lot of time.
THEN dom0 is started and receives a timer interrupt and,
I guess, the clocksource code updates jiffies based on
the time elapsed and, since jiffies is unsigned, it wraps.
So (admitting I don't understand this fully), I think the
problem is that the kernel has hardcoded into it that it's
impossible for 300 seconds to expire between the time it
is put in memory and the time the first interrupt occurs.
That seems like a kernel bug to me, maybe in the pvclock
code, but still in the kernel.
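To make the arithmetic concrete, here is a rough sketch of the wrap
as I understand it. This is a toy model, not the kernel's code: the
32-bit counter width, HZ = 250, and the helper names jiffies_after()
and wrapped() are all assumptions made up for illustration. Only the
-300 * HZ starting value comes from the discussion above.

```c
#include <stdint.h>

/* Assumed tick rate; the real value depends on the kernel config. */
#define HZ 250u
#define NSEC_PER_SEC 1000000000ull

/* Toy 32-bit jiffies counter preloaded with -300 * HZ. */
#define INITIAL_JIFFIES ((uint32_t)(-300 * (int32_t)HZ))

/*
 * Hypothetical helper: counter value once delta_ns nanoseconds have
 * been accounted for in one go. Unsigned addition wraps modulo 2^32.
 */
static uint32_t jiffies_after(uint64_t delta_ns)
{
    uint64_t ticks = delta_ns / (NSEC_PER_SEC / HZ);

    return INITIAL_JIFFIES + (uint32_t)ticks;
}

/*
 * Hypothetical helper: nonzero if the counter passed through zero.
 * Only meaningful while the delta is shorter than 2^32 ticks.
 */
static int wrapped(uint64_t delta_ns)
{
    return jiffies_after(delta_ns) < INITIAL_JIFFIES;
}
```

In this model a first-tick delta of 299 seconds leaves the counter just
below the wrap point, while a delta of 300 seconds or more carries it
through zero, which is the situation a long scrub would create.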
Not to say the problem can't or shouldn't be fixed in Xen.
Keir, would bad things happen if construct_dom0 is done after
scrub_heap_pages()? Other than some time wastage because
dom0's memory would get scrubbed just before it gets
overwritten (which is admittedly a much bigger problem
when dom0_mem is not specified in the Xen boot line
on a machine with ginormous memory).