

RE: [Xen-devel] [timer/ticks related] dom0 hang during boot on large 1TB system

To: Jan Beulich <JBeulich@xxxxxxxxxx>, mukesh.rathor@xxxxxxxxxx
Subject: RE: [Xen-devel] [timer/ticks related] dom0 hang during boot on large 1TB system
From: Dan Magenheimer <dan.magenheimer@xxxxxxxxxx>
Date: Mon, 21 Dec 2009 10:20:28 -0800 (PST)
Cc: kurt.hackel@xxxxxxxxxx, jeremy@xxxxxxxx, Xen-devel@xxxxxxxxxxxxxxxxxxx, Keir Fraser <keir.fraser@xxxxxxxxxxxxx>
In-reply-to: <4B2F54220200007800026F02@xxxxxxxxxxxxxxxxxx>
> From: Jan Beulich [mailto:JBeulich@xxxxxxxxxx]
> >>> Mukesh Rathor <mukesh.rathor@xxxxxxxxxx> 19.12.09 05:43 >>>
> >if first ever timer interrupt comes after start_jiffies is set, dom0 boot
> >may hang if delta in timer_interrupt() is so huge that it causes jiffies
> >to wrap. It appears delta is very large when memory is more than 512GB on
> >certain boxes causing wrap around.
> >
> >why is delta in dom0->timer_interrupt() related to memory on system?
> >Because hyp creates dom0, then page scrubs, then unpauses vcpu. so it
> >appears lot of page scrubbing results in huge delta on first tick.
>
> Based on prior analysis of similar problems, I'm not convinced this is the
> right solution: Kernel code should not need changing here. Instead, I'd
> recommend trying to insert a call to process_pending_timers() every so
> many pages scrubbed (just like is e.g. being done in the P2M/M2P table
> population code).

Mukesh has dug into this a lot deeper than I have, but I think
process_pending_timers() is irrelevant here.  When dom0
is constructed, its data space is initialized in memory
and jiffies has been initialized in the data section with
a fixed value of -300 * HZ.  At this point, dom0 lives in
memory but has not executed a single instruction, so it
cannot receive any interrupts.  I *think* Xen
also initializes a clocksource (pvclock?) here.

Then scrub_heap_pages() occurs which eats up a lot of time.

THEN dom0 is started and receives a timer interrupt and,
I guess, the clocksource code updates jiffies based on
the time elapsed and, since jiffies is unsigned, it
wraps around.

So (admitting I don't understand this fully), I think the
problem is that the kernel has hardcoded into it that it's
impossible for 300 seconds to expire between the time it
is put in memory and the time the first interrupt occurs.
That seems like a kernel bug to me, maybe in the pvclock
code, but still in the kernel.
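
To make the arithmetic concrete, here is a minimal sketch of the wrap
described above. It is not the kernel's code: the sketch_* names are made
up for illustration, HZ=250 is only an example value, and a 32-bit counter
is assumed. It just shows that a counter starting at -300 * HZ reaches
zero once 300 seconds' worth of ticks have been added in one go.

```c
#include <assert.h>
#include <stdint.h>

/* Assumption: HZ is a kernel build-time constant; 250 is just an example. */
#define SKETCH_HZ 250u

/* Hypothetical helper: the "-300 * HZ" start value on a 32-bit counter. */
static uint32_t sketch_initial_jiffies(uint32_t hz)
{
    assert(hz > 0);
    return (uint32_t)0 - 300u * hz;   /* unsigned arithmetic, modulo 2^32 */
}

/* Hypothetical helper: counter value after catching up elapsed_sec seconds. */
static uint32_t sketch_jiffies_after(uint32_t start, uint32_t hz,
                                     uint32_t elapsed_sec)
{
    return start + elapsed_sec * hz;  /* wraps past zero if large enough */
}

/* Returns 1 if the counter is now numerically below its start value,
 * i.e. it wrapped (valid for a single wrap only). */
static int sketch_wrapped(uint32_t start, uint32_t now)
{
    return now < start;
}
```

With SKETCH_HZ, sketch_initial_jiffies() gives 4294892296. Catching up
299 seconds leaves the counter just below the wrap point, while 300
seconds lands exactly on 0 and anything longer wraps past it. Whether
that wrap is really what hangs dom0 is the open question in this thread;
the sketch only illustrates the 300-second window.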

Not to say the problem can't or shouldn't be fixed in Xen.
Keir, would bad things happen if construct_dom0 were done after
scrub_heap_pages()?  The only cost I can see is some wasted time,
because dom0's memory would get scrubbed just before it gets
overwritten (which is admittedly a much bigger problem
when dom0_mem is not specified in the Xen boot line
on a machine with ginormous memory).

