Re: [Xen-devel] Freeze with 2.6.32.19 and xen-4.0.1rc5

To: xen-devel@xxxxxxxxxxxxxxxxxxx
Subject: Re: [Xen-devel] Freeze with 2.6.32.19 and xen-4.0.1rc5
From: Claus Rosenberger <claus.rosenberger@xxxxxxxxx>
Date: Sun, 22 Aug 2010 00:08:53 +0200
On 21.08.2010 16:02, Pasi Kärkkäinen wrote:
> On Sat, Aug 21, 2010 at 03:47:57PM +0200, Claus Rosenberger wrote:
>>  Hi,
>>
>> I am having serious trouble with a Debian Lenny dom0 running the latest
>> kernel 2.6.32.19 with xen-4.0.1rc5. For some reason the system freezes
>> from time to time. I used kernel 2.6.31.9 with xen-3.4.2 before. The
>> machine doesn't write anything to the serial console, so there are no
>> error messages to go on.
>>
>> Perhaps there is something to see from the logs ...
>>
> Hello,
>
> A couple of questions:
>
>       - Do you use PCI passthru? 
I tried it, but I have disabled it for now to avoid mixing up too many issues.
>       - Is there something special happening when it freezes? 
The last time, it happened while I was creating filesystems, so perhaps it
is related to disk activity. I describe the disk problems in more detail
at the end of this mail.
>       - Does it freeze at regular intervals, at the same time/uptime, or 
> randomly? 
It is random; sometimes it happens and sometimes it doesn't.
>       - By freezing you mean it doesn't respond to anything? Or does it 
> reboot?
When it freezes I cannot do anything. I can only connect with iAMT and
reboot, nothing else.
>       - Can you try using the old 2.6.31.9 kernel with the new xen hypervisor?
Sure.
> -- Pasi
>
>
>> Configuration Grub
>>
>> title           Xen 4.0-amd64 / Debian GNU/Linux, kernel 2.6.32.19
>> root            (hd0,0)
>> kernel          /boot/xen-4.0-amd64.gz dom0_mem=524288 cpufreq=xen
>> cpuidle console=com1 com1=115200,8n1,0xf1c0,0 sync_console
> Try adding "loglvl=all guest_loglvl=all" for xen.gz.
Sure.
>> module          /boot/vmlinuz-2.6.32.19 root=/dev/md0 ro console=tty0
>> console=hvc0
>> module          /boot/initrd.img-2.6.32.19
>>
> And try adding "nomodeset" for dom0 kernel (vmlinuz).
What is that parameter for?

I swapped the disk because the previous one had an error. There is now a
brand new disk on sata2, and I see the following in my console log. I
cannot believe it is a disk problem; perhaps it is a disk controller
problem instead, or something in the kernel. I will add the parameters
and power-cycle the machine to start again from scratch.

Claus


[17392.097849] sd 1:0:0:0: [sdb] Unhandled error code
[17392.100047] BUG: soft lockup - CPU#0 stuck for 66s! [swapper:0]
[17392.100049] Modules linked in: nf_conntrack_ipv4 nf_defrag_ipv4
xt_state nf_conntrack xt_physdev iptable_filter ip_tables x_tables
xen_evtchn xenfs 8021q garp bridge stp coretemp lm85 hwmon_vid loop
evdev video output tpm_tis tpm snd_pcsp tpm_bios psmouse snd_pcm
serio_raw snd_timer snd soundcore snd_page_alloc i2c_i801 i2c_core
processor button acpi_processor ext3 jbd mbcache dm_mirror
dm_region_hash dm_log dm_snapshot dm_mod raid1 md_mod sd_mod crc_t10dif
ata_piix ehci_hcd uhci_hcd ata_generic libata usbcore nls_base scsi_mod
e1000e thermal fan thermal_sys [last unloaded: scsi_wait_scan]
[17392.100088] CPU 0:
[17392.100089] Modules linked in: nf_conntrack_ipv4 nf_defrag_ipv4
xt_state nf_conntrack xt_physdev iptable_filter ip_tables x_tables
xen_evtchn xenfs 8021q garp bridge stp coretemp lm85 hwmon_vid loop
evdev video output tpm_tis tpm snd_pcsp tpm_bios psmouse snd_pcm
serio_raw snd_timer snd soundcore snd_page_alloc i2c_i801 i2c_core
processor button acpi_processor ext3 jbd mbcache dm_mirror
dm_region_hash dm_log dm_snapshot dm_mod raid1 md_mod sd_mod crc_t10dif
ata_piix ehci_hcd uhci_hcd ata_generic libata usbcore nls_base scsi_mod
e1000e thermal fan thermal_sys [last unloaded: scsi_wait_scan]
[17392.100120] Pid: 0, comm: swapper Not tainted 2.6.32.19 #2
[17392.100122] RIP: e030:[<ffffffff8100928a>]  [<ffffffff8100928a>]
hypercall_page+0x28a/0x1001
[17392.100129] RSP: e02b:ffff880002f38df8  EFLAGS: 00000a07
[17392.100130] RAX: 0000000000000000 RBX: ffffc900081d2060 RCX:
ffffffff8100928a
[17392.100132] RDX: 0000000000000001 RSI: ffffc900081d51c0 RDI:
0000000000000001
[17392.100134] RBP: ffffc900081d2198 R08: 0000000000000000 R09:
0000000000000000
[17392.100135] R10: 0000000000015640 R11: 0000000000000a07 R12:
0000000000000003
[17392.100137] R13: 0000000000004620 R14: 0000000000000021 R15:
6db6db6db6db6db7
[17392.100142] FS:  00007f3b93f6a6e0(0000) GS:ffff880002f35000(0000)
knlGS:0000000000000000
[17392.100144] CS:  e033 DS: 0000 ES: 0000 CR0: 000000008005003b
[17392.100145] CR2: 00007f3b93f69000 CR3: 000000001efba000 CR4:
0000000000002660
[17392.100147] DR0: 0000000000000000 DR1: 0000000000000000 DR2:
0000000000000000
[17392.100149] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7:
0000000000000400
[17392.100151] Call Trace:
[17392.100153]  <IRQ>  [<ffffffff811fd74a>] ? net_tx_action+0x294/0x9be
[17392.100160]  [<ffffffff8100eadf>] ? xen_restore_fl_direct_end+0x0/0x1
[17392.100164]  [<ffffffff8109786c>] ? check_for_new_grace_period+0x9e/0xa8
[17392.100167]  [<ffffffff81052bc3>] ? tasklet_action+0x77/0xd3
[17392.100170]  [<ffffffff81054352>] ? __do_softirq+0xe0/0x1a2
[17392.100173]  [<ffffffff811ee9ff>] ? __xen_evtchn_do_upcall+0x12a/0x16c
[17392.100176]  [<ffffffff81012bec>] ? call_softirq+0x1c/0x30
[17392.100179]  [<ffffffff81014813>] ? do_softirq+0x3f/0x7c
[17392.100181]  [<ffffffff810541b3>] ? irq_exit+0x36/0x79
[17392.100184]  [<ffffffff811eeeb0>] ? xen_evtchn_do_upcall+0x35/0x42
[17392.100186]  [<ffffffff81012c3e>] ? xen_do_hypervisor_callback+0x1e/0x30
[17392.100187]  <EOI>  [<ffffffff810093aa>] ? hypercall_page+0x3aa/0x1001
[17392.100191]  [<ffffffff810093aa>] ? hypercall_page+0x3aa/0x1001
[17392.100194]  [<ffffffff8100e454>] ? xen_safe_halt+0xc/0x15
[17392.100196]  [<ffffffff8100bf15>] ? xen_idle+0x35/0x40
[17392.100199]  [<ffffffff81010c13>] ? cpu_idle+0xa3/0xdd
[17392.100203]  [<ffffffff814f3cdb>] ? start_kernel+0x3da/0x3e5
[17392.100205]  [<ffffffff814f5b83>] ? xen_start_kernel+0x5e6/0x5ea
[17392.103027] ata1.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x6
frozen
[17392.103031] ata1.00: failed command: READ DMA EXT
[17392.103035] ata1.00: cmd 25/00:00:5d:88:39/00:04:3d:00:00/e0 tag 0
dma 524288 in
[17392.103036]          res 40/00:00:00:00:00/00:00:00:00:00/00 Emask
0x4 (timeout)
[17392.103037] ata1.00: status: { DRDY }
[17392.103052] ata1.00: hard resetting link
[17392.372433] sd 1:0:0:0: [sdb] Result: hostbyte=DID_OK
driverbyte=DRIVER_TIMEOUT
[17392.376416] sd 1:0:0:0: [sdb] CDB: Write(10): 2a 00 3d 39 7f dd 00 04
00 00
[17392.380089] end_request: I/O error, dev sdb, sector 1027178461
[17392.384569] raid1: Disk failure on sdb3, disabling device.
[17392.384569] raid1: Operation continuing on 1 devices.
[17392.389710] BUG: soft lockup - CPU#1 stuck for 66s! [scsi_eh_1:538]
[17392.389710] Modules linked in:
[17392.393352] md: md2: resync done.
[17392.393348]  nf_conntrack_ipv4 nf_defrag_ipv4 xt_state nf_conntrack
xt_physdev iptable_filter ip_tables x_tables xen_evtchn xenfs 8021q garp
bridge stp coretemp lm85 hwmon_vid loop evdev video output tpm_tis tpm
snd_pcsp tpm_bios psmouse snd_pcm serio_raw snd_timer snd soundcore
snd_page_alloc i2c_i801 i2c_core processor button acpi_processor ext3
jbd mbcache dm_mirror dm_region_hash dm_log dm_snapshot dm_mod raid1
md_mod sd_mod crc_t10dif ata_piix ehci_hcd uhci_hcd ata_generic libata
usbcore nls_base scsi_mod e1000e thermal fan thermal_sys [last unloaded:
scsi_wait_scan]
[17392.420099] CPU 1:
[17392.420099] Modules linked in: nf_conntrack_ipv4 nf_defrag_ipv4
xt_state nf_conntrack xt_physdev iptable_filter ip_tables x_tables
xen_evtchn xenfs 8021q garp bridge stp coretemp lm85 hwmon_vid loop
evdev video output tpm_tis tpm snd_pcsp tpm_bios psmouse snd_pcm
serio_raw snd_timer snd soundcore snd_page_alloc i2c_i801 i2c_core
processor button acpi_processor ext3 jbd mbcache dm_mirror
dm_region_hash dm_log dm_snapshot dm_mod raid1 md_mod sd_mod crc_t10dif
ata_piix ehci_hcd uhci_hcd ata_generic libata usbcore nls_base scsi_mod
e1000e thermal fan thermal_sys [last unloaded: scsi_wait_scan]
[17392.448018] Pid: 538, comm: scsi_eh_1 Not tainted 2.6.32.19 #2
[17392.452071] RIP: e030:[<ffffffff8100922a>]  [<ffffffff8100922a>]
hypercall_page+0x22a/0x1001
[17392.456082] RSP: e02b:ffff88000205bbc8  EFLAGS: 00000246
[17392.460069] RAX: 0000000000040000 RBX: ffff880002353000 RCX:
ffffffff8100922a
[17392.464074] RDX: 000000000000d729 RSI: 0000000000000000 RDI:
0000000000000000
[17392.464074] RBP: ffff880002312000 R08: 0000000000000001 R09:
00000000000000fa
[17392.468023] R10: ffff88000206d170 R11: 0000000000000246 R12:
ffff880002338000
[17392.468023] R13: ffff880002353048 R14: ffff88001e7f0900 R15:
ffff880002698000
[17392.468023] FS:  00007f3b93f6a6e0(0000) GS:ffff880002f52000(0000)
knlGS:0000000000000000
[17392.472075] CS:  e033 DS: 0000 ES: 0000 CR0: 000000008005003b
[17392.472075] CR2: 00007f3b9356d1a4 CR3: 000000001f657000 CR4:
0000000000002660
[17392.472075] DR0: 0000000000000000 DR1: 0000000000000000 DR2:
0000000000000000
[17392.476070] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7:
0000000000000400
[17392.476070] Call Trace:
[17392.476070]  [<ffffffff8100e41d>] ? xen_force_evtchn_callback+0x9/0xa
[17392.476070]  [<ffffffff8100eaf2>] ? check_events+0x12/0x20
[17392.480079] ata1.01: hard resetting link
[17392.480099]  [<ffffffff8100ea99>] ? xen_irq_enable_direct_end+0x0/0x7
[17392.480099]  [<ffffffffa003a19c>] ? scsi_request_fn+0x3b9/0x4da
[scsi_mod]
[17392.480099]  [<ffffffff8117d7b6>] ? __blk_run_queue+0x35/0x66
[17392.484070]  [<ffffffff8117d88d>] ? blk_run_queue+0x20/0x32
[17392.484070]  [<ffffffffa0039826>] ? scsi_run_queue+0x2da/0x370 [scsi_mod]
[17392.488084]  [<ffffffff810e6294>] ? kmem_cache_free+0x71/0xa4
[17392.488084]  [<ffffffffa003a4a5>] ? scsi_next_command+0x2d/0x39
[scsi_mod]
[17392.488084]  [<ffffffffa003adfc>] ? scsi_io_completion+0x1ed/0x416
[scsi_mod]
[17392.488084]  [<ffffffffa0037a7a>] ? scsi_eh_flush_done_q+0xec/0x10d
[scsi_mod]
[17392.488084]  [<ffffffffa00a7223>] ? ata_scsi_error+0x5e9/0x681 [libata]
[17392.488084]  [<ffffffffa0038a3d>] ? scsi_error_handler+0xec/0x5a9
[scsi_mod]
[17392.496345]  [<ffffffffa0038951>] ? scsi_error_handler+0x0/0x5a9
[scsi_mod]
[17392.496345]  [<ffffffff810652ad>] ? kthread+0x75/0x7d
[17392.496345]  [<ffffffff81012aea>] ? child_rip+0xa/0x20
[17392.496345]  [<ffffffff81011ca1>] ? int_ret_from_sys_call+0x7/0x1b
[17392.500072]  [<ffffffff8101245d>] ? retint_restore_args+0x5/0x6
[17392.500072]  [<ffffffff81012ae0>] ? child_rip+0x0/0x20
[17392.956361] ata1.00: SATA link up 3.0 Gbps (SStatus 123 SControl 300)
[17392.957486] ata1.01: SATA link down (SStatus 0 SControl 300)
[17392.972769] ata1.00: configured for UDMA/133
[17392.973795] ata1.00: device reported invalid CHS sector 0
[17392.974743] ata1: EH complete
[17393.149020] md: checkpointing resync of md2.
[17393.482545] RAID1 conf printout:
[17393.483122]  --- wd:1 rd:2
[17393.483585]  disk 0, wo:0, o:1, dev:sda3
[17393.484259]  disk 1, wo:1, o:0, dev:sdb3
[17393.492056] RAID1 conf printout:
[17393.492628]  --- wd:1 rd:2
[17393.493108]  disk 0, wo:0, o:1, dev:sda3
[17393.494841] md: resync of RAID array md2
[17393.495559] md: minimum _guaranteed_  speed: 1000 KB/sec/disk.
[17393.496573] md: using maximum available idle IO bandwidth (but not
more than 200000 KB/sec) for resync.
[17393.498235] md: using 128k window, over a total of 958020096 blocks.
[17393.498466] md: resuming resync of md2 from checkpoint.
[17393.498466] md: md2: resync done.
[17393.824165] RAID1 conf printout:
[17393.825572]  --- wd:1 rd:2
[17393.826761]  disk 0, wo:0, o:1, dev:sda3
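
The two failing commands in the log above can be decoded by hand to see which sectors were involved. The following is a minimal Python sketch, not part of the original report. The function names are made up for illustration, and the field layout is an assumption: the libata "cmd" line is read as command/feature:nsect:lbal:lbam:lbah/hob_feature:hob_nsect:hob_lbal:hob_lbam:hob_lbah/device (LBA48 commands only), and the CDB is read as a standard 10-byte SCSI read/write command.

```python
def decode_ata_cmd(cmd_field):
    """Decode an LBA48 taskfile as printed on a libata 'cmd' line.

    Assumed layout (hex bytes):
    CMD/FEAT:NSECT:LBAL:LBAM:LBAH/HOB_FEAT:HOB_NSECT:HOB_LBAL:HOB_LBAM:HOB_LBAH/DEV
    Returns (command opcode, starting LBA, sector count).
    """
    cmd, low, high, _dev = cmd_field.split("/")
    _feat, nsect, lbal, lbam, lbah = (int(x, 16) for x in low.split(":"))
    _hfeat, hnsect, hlbal, hlbam, hlbah = (int(x, 16) for x in high.split(":"))
    # The "hob" (high order byte) registers carry the upper 24 bits of the LBA
    # and the upper 8 bits of the sector count.
    lba = (hlbah << 40) | (hlbam << 32) | (hlbal << 24) \
        | (lbah << 16) | (lbam << 8) | lbal
    count = (hnsect << 8) | nsect
    return int(cmd, 16), lba, count


def decode_rw10_cdb(cdb_hex):
    """Decode a 10-byte SCSI READ(10)/WRITE(10) CDB given as hex bytes.

    Bytes 2-5 hold the big-endian LBA, bytes 7-8 the transfer length.
    Returns (opcode, starting LBA, block count).
    """
    b = bytes.fromhex(cdb_hex.replace(" ", ""))
    lba = int.from_bytes(b[2:6], "big")
    blocks = int.from_bytes(b[7:9], "big")
    return b[0], lba, blocks


if __name__ == "__main__":
    # Values copied from the ata1.00 and sdb lines in the log above.
    print(decode_ata_cmd("25/00:00:5d:88:39/00:04:3d:00:00/e0"))
    print(decode_rw10_cdb("2a 00 3d 39 7f dd 00 04 00 00"))
```

Under these assumptions, the READ DMA EXT (opcode 0x25) starts at LBA 1027180637 and the Write(10) (opcode 0x2a) starts at LBA 1027178461, which is the sector reported by the end_request line. Both cover 1024 sectors, which at 512 bytes per sector is the 524288 bytes shown on the "dma 524288 in" line.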

