WARNING - OLD ARCHIVES

This is an archived copy of the Xen.org mailing list, which we have preserved to ensure that existing links to archives are not broken. The live archive, which contains the latest emails, can be found at http://lists.xen.org/
   
 
 

xen-devel

Re: [Xen-devel] Re: [PATCH] blkfront: Move blkif_interrupt into a taskle

On 08/24/2011 05:36 PM, Konrad Rzeszutek Wilk wrote:
On Wed, Aug 17, 2011 at 11:07:19AM +0200, Igor Mammedov wrote:
On 08/17/2011 04:38 AM, Konrad Rzeszutek Wilk wrote:
On Tue, Aug 16, 2011 at 04:26:55AM -0700, imammedo wrote:

Jeremy Fitzhardinge wrote:

Have you tried bisecting to see when this particular problem appeared?
It looks to me like something is accidentally re-enabling interrupts -
perhaps a stack overrun is corrupting the "flags" argument between a
spin_lock_irqsave()/restore pair.

Is it only on 32-bit kernels?
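
The hypothesis above (a clobbered "flags" word between a save/restore pair) can be illustrated with a small userspace model. This is only a sketch under stated assumptions, not kernel code: irqs_enabled, model_irq_save(), model_irq_restore(), model_start_queue() and run_scenario() are all hypothetical names invented here, and the "corruption" is simulated by overwriting a local variable. It shows only why a restore that trusts a damaged flags word can turn interrupts back on while an outer critical section still expects them off, which is the kind of state a check like the one in the trace below would complain about.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical model: one global stands in for the CPU interrupt-enable bit. */
static bool irqs_enabled = true;
static int warnings = 0;

/* Model of an irqsave: remember the current state, then disable. */
static unsigned long model_irq_save(void)
{
    unsigned long flags = irqs_enabled ? 1UL : 0UL;
    irqs_enabled = false;
    return flags;
}

/* Model of an irqrestore: blindly trusts whatever the saved word says. */
static void model_irq_restore(unsigned long flags)
{
    irqs_enabled = (flags != 0);
}

/* Model of a callee that insists on being called with interrupts off. */
static void model_start_queue(void)
{
    if (irqs_enabled) {
        warnings++;
        fprintf(stderr, "WARNING: called with interrupts enabled\n");
    }
}

/*
 * Outer save, nested save/restore, then the checked call.
 * If corrupt_inner is true, the nested flags word is overwritten
 * before the restore (simulated stack corruption).
 * Returns the number of warnings raised.
 */
static int run_scenario(bool corrupt_inner)
{
    irqs_enabled = true;
    warnings = 0;

    unsigned long outer = model_irq_save();  /* interrupts now off */
    assert(!irqs_enabled);

    unsigned long inner = model_irq_save();  /* saves "off" (0) */
    if (corrupt_inner)
        inner = 1UL;                         /* pretend the stack was clobbered */
    model_irq_restore(inner);                /* should leave interrupts off */

    model_start_queue();                     /* expects interrupts off */

    model_irq_restore(outer);                /* end of outer section */
    return warnings;
}
```

In this model, run_scenario(false) raises no warning, while run_scenario(true) raises exactly one, because the nested restore re-enables interrupts inside the outer section.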

  ------------[ cut here ]------------
[604001.659925] WARNING: at block/blk-core.c:239 blk_start_queue+0x70/0x80()
[604001.659964] Modules linked in: nfs lockd fscache auth_rpcgss nfs_acl
sunrpc ip6t_REJECT nf_conntrack_ipv6 nf_conntrack_ipv4 nf_defrag_ipv4
nf_defrag_ipv6 xt_state nf_conntrack ip6table_filter ip6_tables xen_netfront
pcspkr [last unloaded: scsi_wait_scan]
[604001.660147] Pid: 336, comm: udevd Tainted: G        W   3.0.0+ #50
[604001.660181] Call Trace:
[604001.660209]  [<c045c512>] warn_slowpath_common+0x72/0xa0
[604001.660243]  [<c06643a0>] ? blk_start_queue+0x70/0x80
[604001.660275]  [<c06643a0>] ? blk_start_queue+0x70/0x80
[604001.660310]  [<c045c562>] warn_slowpath_null+0x22/0x30
[604001.660343]  [<c06643a0>] blk_start_queue+0x70/0x80
[604001.660379]  [<c075e231>] kick_pending_request_queues+0x21/0x30
[604001.660417]  [<c075e42f>] blkif_interrupt+0x19f/0x2b0
...
  ------------[ cut here ]------------

I've debugged the blk-core warning a bit and can say:
   - Yes, it is a 32-bit PAE kernel, and so far it happens only there.
   - It affects PV Xen guests; bare-metal and KVM configurations are not affected.
   - The upstream kernel is affected as well.
   - It reproduces on Xen 4.1.1 and 3.1.2 hosts.

And the dom0 is 2.6.18, right? Is this problem absent
when you use a 3.0 dom0?

For Xen 4.1.1 testing, I've used Jeremy's 2.6.32.43 as dom0.

Jeremy pointed me to this:
https://patchwork.kernel.org/patch/1091772/
(and http://groups.google.com/group/linux.kernel/browse_thread/thread/39a397566cafc979)
which looks to have a similar backtrace.

Perhaps Peter's fix solves the issue?


I've applied patches:
sched-separate-the-scheduler-entry-for-preemption.patch
sched-move-blk_schedule_flush_plug-out-of-__schedule.patch
block-shorten-interrupt-disabled-regions.patch

Unfortunately, these patches don't help; the problem is still there.


--
Thanks,
  Igor

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxxxxxxxx
http://lists.xensource.com/xen-devel
