xen-devel
[Xen-devel] [timer/ticks related] dom0 hang during boot on large 1TB sys
Hi,
I finally solved a hang I'd been working on: on a 1TB box, dom0 hangs
during boot on Xen 3.4.0. The hang comes from:
calibrate_delay_direct():
    ....
    for (i = 0; i < MAX_DIRECT_CALIBRATION_RETRIES; i++) {
        pre_start = 0;
        start_jiffies = jiffies;
        while (jiffies <= (start_jiffies + tick_divider)) {
            pre_start = start;
            read_current_timer(&start);
        }
        read_current_timer(&post_start);
        ...
start_jiffies is set to INITIAL_JIFFIES == 0xfffedb08. Then a timer
interrupt comes in and, finding the delta to be rather huge (thanks to
Xen scrubbing 1TB of pages), advances jiffies so far that it wraps
around. The wrapped value is far below start_jiffies + tick_divider, so
the loop condition stays true and the loop hangs; it would only resolve
after, say, several days.
delta: 940b7d68a4, jiffies:00009f8b
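
To make the stall concrete, here is a small user-space model of the loop condition. It is only a sketch: it assumes a 32-bit jiffies counter and tick_divider == 1 (neither is stated above), and the helper names are made up for illustration. The two constants are the values quoted in this mail.

```c
#include <assert.h>
#include <stdint.h>

/* Values quoted in the report; a 32-bit jiffies counter is an assumption. */
#define START_JIFFIES   0xfffedb08u  /* INITIAL_JIFFIES */
#define WRAPPED_JIFFIES 0x00009f8bu  /* jiffies seen after the huge delta */

/* Hypothetical helper: the loop condition from calibrate_delay_direct(),
 * evaluated with 32-bit unsigned arithmetic. */
static int naive_keep_looping(uint32_t now, uint32_t start,
                              uint32_t tick_divider)
{
    return now <= (uint32_t)(start + tick_divider);
}

/* Hypothetical helper: how many more ticks must elapse before the naive
 * condition becomes false. Only meaningful while the loop is still
 * spinning and start + tick_divider + 1 does not itself wrap. */
static uint32_t ticks_until_exit(uint32_t now, uint32_t start,
                                 uint32_t tick_divider)
{
    assert(naive_keep_looping(now, start, tick_divider));
    return (uint32_t)(start + tick_divider + 1u) - now;
}
```

With the values above, naive_keep_looping(WRAPPED_JIFFIES, START_JIFFIES, 1) is still true, and ticks_until_exit() returns 0xfffe3b7f, i.e. nearly a full 32-bit period of ticks before the loop can exit.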
I came up with a fix (is there a reason it doesn't use 64-bit values?):
        while (jiffies <= (start_jiffies + tick_divider)) {
            pre_start = start;
            read_current_timer(&start);
+           if (jiffies < start_jiffies)    /* jiffies wrapped */
+               start_jiffies = jiffies;
        }
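
For comparison, the kernel's time_after() family in <linux/jiffies.h> handles wraparound by comparing a signed difference instead of the raw values. The following is a user-space sketch of that idiom applied to this loop condition, not a tested kernel patch; it again assumes a 32-bit counter, and ticks_after() and safe_keep_looping() are hypothetical names.

```c
#include <stdint.h>

/* Hypothetical helper: wrap-safe "a is later than b" for a 32-bit tick
 * counter, using the signed-difference idiom. It is correct as long as
 * the two values are less than 2^31 ticks apart. The unsigned-to-signed
 * conversion is implementation-defined in C but behaves as two's
 * complement on common compilers. */
static int ticks_after(uint32_t a, uint32_t b)
{
    return (int32_t)(b - a) < 0;
}

/* Hypothetical helper: wrap-safe form of
 *   jiffies <= (start_jiffies + tick_divider)
 * Keep looping until "now" is strictly later than the limit. */
static int safe_keep_looping(uint32_t now, uint32_t start,
                             uint32_t tick_divider)
{
    return !ticks_after(now, (uint32_t)(start + tick_divider));
}
```

With start == 0xfffedb08 and tick_divider == 1, this keeps looping at 0xfffedb08 and 0xfffedb09 and exits at 0xfffedb0a, like the original condition, but it also exits for the wrapped value 0x00009f8b because that value counts as later than the limit.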
The other fix I thought of was to change INITIAL_JIFFIES to something
sooner.
I would appreciate any help; I don't understand Xen time management well.
thanks,
Mukesh
PS: I'm attaching output of 'xm debug-key t'.
skew.out
Description: Binary data
_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxxxxxxxx
http://lists.xensource.com/xen-devel