>>> Dan Magenheimer <dan.magenheimer@xxxxxxxxxx> 21.12.09 19:20 >>>
> From: Jan Beulich [mailto:JBeulich@xxxxxxxxxx]
>> Based on prior analysis of similar problems, I'm not convinced this
>> is the right solution: Kernel code should not need changing here.
>> Instead, I'd recommend trying to insert a call to
>> process_pending_timers() every so many pages scrubbed (just like is
>> e.g. being done in the P2M/M2P table population code).
>
>Mukesh has dug into this a lot deeper than I, but I think
>process_pending_timers() is irrelevant here. When dom0
Why would this be any different than a lot of time being consumed
populating large p2m/m2p tables? All this happens when Dom0 already
exists, but isn't running yet.
>is constructed, its data space is initialized in memory
>and jiffies has been initialized in the data section with
>a fixed value of -300 * HZ. At this point, dom0 lives in
>memory but has not executed a single instruction, so is
>not capable of receiving any interrupts. I *think* Xen
>also initializes a clocksource (pvclock?) here.
... and updates it each time local_time_calibration() is run, which is
the missing piece: process_pending_timers() causes time_calibration()
to run as needed, which in turn raises TIME_CALIBRATE_SOFTIRQ as
needed (that softirq gets run at the latest immediately before Dom0
gets passed control), which in turn causes local_time_calibration()
to run, updating dom0:vcpu0's system time.
>Then scrub_heap_pages() occurs which eats up a lot of time.
... and confuses Xen's own time keeping, because, depending on the
platform timer used and its wrap-around interval, a wrap may be
missed if process_pending_timers() isn't being executed frequently
enough.
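To illustrate the missed-wrap effect, here is a minimal sketch in C. It
is not Xen's actual platform timer code: the 32-bit counter width and
the names plt_clock, plt_poll() and simulate() are assumptions made up
for this example. A narrow free-running counter is extended to 64 bits
by accumulating deltas on every poll, which only works if polls happen
more often than the counter wraps.

```c
#include <stdint.h>

/* Hypothetical 32-bit free-running counter, extended to 64 bits in software. */
struct plt_clock {
    uint32_t last_stamp;   /* counter value seen at the previous poll */
    uint64_t accumulated;  /* total ticks accounted for so far */
};

/*
 * Fold the ticks elapsed since the previous poll into the 64-bit total.
 * Unsigned subtraction copes with one wrap between polls; a second wrap
 * in the same interval is invisible and those ticks are lost.
 */
static uint64_t plt_poll(struct plt_clock *c, uint32_t now)
{
    c->accumulated += (uint32_t)(now - c->last_stamp);
    c->last_stamp = now;
    return c->accumulated;
}

/* Poll every poll_interval ticks while true_ticks really elapse. */
static uint64_t simulate(uint64_t true_ticks, uint64_t poll_interval)
{
    struct plt_clock c = { 0, 0 };
    uint64_t t;

    for (t = poll_interval; t < true_ticks; t += poll_interval)
        plt_poll(&c, (uint32_t)t);

    return plt_poll(&c, (uint32_t)true_ticks);
}
```

With frequent polls the accumulated value matches the true elapsed
ticks; with a single poll after more than one wrap period, whole wrap
periods silently disappear from the total.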
But from the other mail regarding this subject I conclude that this
suggestion wasn't even tried, despite me knowing that it fixed
similar problems on 1TB systems. And be assured, I spent hours (if
not days) analyzing the problem until I finally understood that this
is entirely unrelated to the kernel.
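The shape of the suggested change can be sketched as follows. This is
only an illustration, not a patch against scrub_heap_pages(): the
function scrub_pages(), the batch parameter, the page size constant and
the counting stand-in for process_pending_timers() are all assumptions
invented for this example.

```c
#include <stddef.h>
#include <string.h>

#define SKETCH_PAGE_SIZE 4096u  /* assumed page size for the sketch */

/* Stand-in for process_pending_timers(); it only counts invocations. */
static unsigned long timer_runs;

static void fake_process_pending_timers(void)
{
    timer_runs++;
}

/*
 * Scrub nr_pages pages starting at base, giving the timer code a chance
 * to run after every 'batch' pages so that time keeping does not stall
 * for the whole duration of the scrub.
 */
static void scrub_pages(unsigned char *base, size_t nr_pages, size_t batch)
{
    size_t i;

    for (i = 0; i < nr_pages; i++) {
        memset(base + i * SKETCH_PAGE_SIZE, 0, SKETCH_PAGE_SIZE);

        if ((i + 1) % batch == 0)
            fake_process_pending_timers();
    }
}
```

The batch size would have to be chosen so that the time spent between
two calls stays well below the platform timer's wrap-around interval.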
>THEN dom0 is started and receives a timer interrupt and,
>I guess, the clocksource code updates jiffies based on
>the time elapsed and, since jiffies is unsigned, it
>wraps around.
>
>So (admitting I don't understand this fully), I think the
>problem is that the kernel has hardcoded into it that it's
>impossible for 300 seconds to expire between the time it
>is put in memory and the time the first interrupt occurs.
>That seems like a kernel bug to me, maybe in the pvclock
>code, but still in the kernel.
No, the time the kernel gets put in memory doesn't matter at all.
Counting starts when the kernel starts initializing its time
subsystem, and with timer interrupts being disabled initially I
can't even see how multiple of them could pile up.
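For reference, the jiffies wrap-around mentioned in the quoted text can
be sketched like this. The macros are modeled on Linux's
INITIAL_JIFFIES and time_after() as I recall them, narrowed to 32 bits
so the behaviour is the same on any host; HZ = 250 and the helper
jiffies_after() are assumptions for the example only.

```c
#include <stdint.h>

#define HZ 250  /* assumed tick rate, for illustration only */

/* Start 300 seconds before the 32-bit counter wraps to zero. */
#define INITIAL_JIFFIES ((uint32_t)(-300 * HZ))

/* True if a is later than b, even across a wrap (signed difference). */
#define TIME_AFTER(a, b) ((int32_t)((b) - (a)) < 0)

/* Counter value 'seconds' seconds after time keeping started. */
static uint32_t jiffies_after(uint32_t seconds)
{
    return INITIAL_JIFFIES + seconds * HZ;
}
```

The wrap 300 seconds in is intentional and harmless as long as
comparisons use the signed-difference form; a plain "<" comparison
across the wrap gives the wrong answer.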
>Not to say the problem can't or shouldn't be fixed in Xen.
>Keir, would bad things happen if construct_dom0 is done after
>scrub_heap_pages()? Other than some time wastage because
>dom0's memory would get scrubbed just before it gets
>overwritten (which is admittedly a much bigger problem
>when dom0_mem is not specified in the Xen boot line
>on a machine with ginormous memory).
Jan
_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxxxxxxxx
http://lists.xensource.com/xen-devel