Mukesh Rathor wrote:
> On Fri, 18 Dec 2009 07:02:55 +0000
> Keir Fraser <keir.fraser@xxxxxxxxxxxxx> wrote:
>
>> On 18/12/2009 04:36, "Mukesh Rathor" <mukesh.rathor@xxxxxxxxxx> wrote:
>>
>>> The other fix I thought of was to change INITIAL_JIFFIES to
>>> something sooner.
>>>
>>> Would appreciate any help, I don't understand xen time management
>>> well.
>> This isn't really Xen time code, but unchanged Linux time code. I
>> don't know which tree you quoted the code from -- 2.6.18 has similar
>> but not identical. Anyway, I suggest try using the jiffy-comparison
>> macros from <linux/jiffies.h>: time_before(), time_after(), etc.
>> These are designed to work even when jiffies wraps. Feel free to send
>> patch(es) for that, if you test that out and it works okay.
>>
>> -- Keir
>>
>
> Ok, I came up with the following patch. Jeremy, can you please take a
> look also, and comment on my fix since I noticed you've got the same
> issue in your tree. Here's a summary for your benefit:
>
> init/calibrate.c : calibrate_delay_direct():
>
>     start_jiffies = get_jiffies_64();
>     while (get_jiffies_64() <= (start_jiffies + tick_divider)) {
>             pre_start = start;
>             read_current_timer(&start);
>     }
>
Linux time code deliberately initializes the 32-bit jiffies counter so that it
wraps soon after boot; this exposes kernel code that makes bad assumptions about
jiffies wrap. In your case, I'm guessing that the scrubbing delay causes enough
timer interrupts to be delayed (queued up) that jiffies wraps earlier in the
boot path than expected.

As Keir suggests, the correct solution is probably to use the time_before() and
time_after() macros appropriately.

The proposed code avoids the problem by accessing jiffies_64 instead.
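
For illustration, here is a minimal userspace sketch of the wrap-safe comparison
idea behind time_after(). It is not the kernel's code: SKETCH_HZ, the sketch_*
names and naive_after() are made up for this example, a uint32_t stands in for
32-bit jiffies, and HZ=250 is only an assumed tick rate.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumed tick rate for this sketch only; real kernels pick HZ at build time. */
#define SKETCH_HZ 250

/* Models a 32-bit tick counter that starts 300 seconds' worth of ticks before wrap. */
#define SKETCH_INITIAL_JIFFIES ((uint32_t)(-300 * SKETCH_HZ))

/* Wrap-safe "a is after b": take the modular difference and look at its sign. */
static bool sketch_time_after(uint32_t a, uint32_t b)
{
    return (int32_t)(uint32_t)(b - a) < 0;
}

/* Naive comparison: gives the wrong answer once the deadline wraps past zero. */
static bool naive_after(uint32_t a, uint32_t b)
{
    return a > b;
}

/* Loop-exit test for a busy-wait: has 'now' moved past start + ticks? */
static bool sketch_deadline_passed(uint32_t now, uint32_t start, uint32_t ticks)
{
    /* start + ticks may wrap; the signed-difference test still orders it correctly. */
    return sketch_time_after(now, (uint32_t)(start + ticks));
}

/* Small self-check: a wait that begins one tick before the counter wraps. */
static void sketch_self_check(void)
{
    uint32_t start = UINT32_MAX;               /* last value before wrap */
    uint32_t deadline = (uint32_t)(start + 10u); /* wraps around to 9 */

    assert(naive_after(start, deadline));              /* wrongly reports "passed" */
    assert(!sketch_deadline_passed(start, start, 10)); /* wrap-safe: not yet */
    assert(sketch_deadline_passed(10u, start, 10));    /* 11 ticks later: passed */
}
```

The signed-difference trick only gives the right ordering while the two values
are less than half the counter range apart, which is comfortably true for a
wait of a few ticks.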
> if first ever timer interrupt comes after start_jiffies is set, dom0 boot
> may hang if delta in timer_interrupt() is so huge that it causes jiffies
> to wrap. It appears delta is very large when memory is more than 512GB on
> certain boxes causing wrap around.
>
> why is delta in dom0->timer_interrupt() related to memory on system?
> Because hyp creates dom0, then page scrubs, then unpauses vcpu. so it
> appears lot of page scrubbing results in huge delta on first tick.
The problem here may be that timers are running in the domain while the vcpu is
not.
Steve
_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxxxxxxxx
http://lists.xensource.com/xen-devel