WARNING - OLD ARCHIVES

This is an archived copy of the Xen.org mailing list, which we have preserved to ensure that existing links to archives are not broken. The live archive, which contains the latest emails, can be found at http://lists.xen.org/
   
 
 
xen-devel

Re: [Xen-devel] Issue with pv_ops Kernel 2.6.31.6 and Xen

To: xen-devel@xxxxxxxxxxxxxxxxxxx
Subject: Re: [Xen-devel] Issue with pv_ops Kernel 2.6.31.6 and Xen
From: Marcial Rion <marcial.rion@xxxxxxxxxxxxxx>
Date: Thu, 28 Jan 2010 06:59:54 +0100
Delivery-date: Wed, 27 Jan 2010 22:04:21 -0800
Envelope-to: www-data@xxxxxxxxxxxxxxxxxxx
In-reply-to: <4B5A28CC.1090404@xxxxxxxxxxxxxx>
List-help: <mailto:xen-devel-request@lists.xensource.com?subject=help>
List-id: Xen developer discussion <xen-devel.lists.xensource.com>
List-post: <mailto:xen-devel@lists.xensource.com>
List-subscribe: <http://lists.xensource.com/mailman/listinfo/xen-devel>, <mailto:xen-devel-request@lists.xensource.com?subject=subscribe>
List-unsubscribe: <http://lists.xensource.com/mailman/listinfo/xen-devel>, <mailto:xen-devel-request@lists.xensource.com?subject=unsubscribe>
References: <4B5A28CC.1090404@xxxxxxxxxxxxxx>
Reply-to: marcial.rion@xxxxxxxxxxxxxx
Sender: xen-devel-bounces@xxxxxxxxxxxxxxxxxxx
User-agent: Thunderbird 2.0.0.22 (X11/20090731)
Sorry, this is a duplicate of
http://lists.xensource.com/archives/html/xen-devel/2010-01/msg00855.html

I thought the original mail had not reached the mailing list, so I reposted it.


Marcial Rion wrote:
> Hi
>
> First of all, I have to state that I am neither a Kernel nor a Xen
> developer. Nevertheless, while trying to use Kernel 2.6.31.6 from
> git://git.kernel.org/pub/scm/linux/kernel/git/jeremy/xen.git as a Dom0
> Kernel, I discovered an issue, and after searching the Internet for a
> long time I think I have also found the cause. However, I won't be able
> to fix it myself :-(, so I am sharing what I know with this list in the
> hope that the issue might get fixed sometime :-)...
> I will try to give you all the information that seems relevant to me;
> however, if it turns out that I have not given enough details about my
> system (configuration), log files or anything else, I will be glad to
> provide this information. Furthermore, I would also be happy to help
> test potential patches if required. I am posting to this list because
> that is what is suggested at
> http://wiki.xensource.com/xenwiki/XenParavirtOps (bottom of page). If
> this is the wrong place, please give me a short hint so I won't bother
> you any longer...
>
> Now, let's get into it...
>
> About my system:
> I am running Gentoo (10.0, server profile) on an Asus P2B-D motherboard
> (PIIX4 chipset) with two PIII 500 MHz CPUs and 1 GB of RAM. The system
> also has three PCI network interfaces based on the Realtek RTL8139
> chip (8139too Kernel driver). The network interface in use is eth0 (I
> already tried whether using another interface as eth0 would change
> anything, without success :-( ).
>
> The issue I have:
> While the Xen pv_ops Kernel 2.6.31.6 runs perfectly on bare metal, it
> fails to get network connectivity when run on top of Xen 3.4.1 (Gentoo
> default installation). Although the system seems to come up correctly
> at first sight and the network interface is available (I can ping it
> locally), access to the network fails (I cannot ping other systems on
> the network, nor can they ping this one).
>
> What I discovered so far:
> Looking through the boot messages in "dmesg", I discovered that the
> ACPI SCI handler fails to install when the Kernel runs on top of Xen,
> while this error does not occur on bare metal.
>
> With XEN:
> *********
> bio: create slab <bio-0> at 0
> ACPI: SCI (IRQ20) allocation failed
> ACPI Exception: AE_NOT_ACQUIRED, Unable to install System Control Interrupt handler 20090521 evevent-161
> ACPI: Unable to start the ACPI Interpreter
> ------------[ cut here ]------------
> WARNING: at lib/kobject.c:595 kobject_put+0x27/0x3c()
> Hardware name: System Name
> kobject: '<NULL>' (cf805ea0): is not initialized, yet kobject_put() is being called.
> Modules linked in:
> Pid: 1, comm: swapper Tainted: G        W  2.6.31.6 #14
> Call Trace:
>  [<c043a2db>] warn_slowpath_common+0x60/0x90
>  [<c043a33f>] warn_slowpath_fmt+0x24/0x27
>  [<c05588cb>] kobject_put+0x27/0x3c
>  [<c049e502>] kmem_cache_destroy+0x105/0x11b
>  [<c058adc8>] acpi_os_delete_cache+0x8/0xc
>  [<c05a6fe6>] acpi_ut_delete_caches+0xd/0x6b
>  [<c05a77f7>] acpi_ut_subsystem_shutdown+0x87/0x90
>  [<c0904837>] ? acpi_init+0x0/0x263
>  [<c05a8067>] acpi_terminate+0x8/0x14
>  [<c09049cb>] acpi_init+0x194/0x263
>  [<c05f0e66>] ? __class_create+0x44/0x5e
>  [<c09021c5>] ? fbmem_init+0x0/0x78
>  [<c0904837>] ? acpi_init+0x0/0x263
>  [<c0403051>] do_one_initcall+0x4c/0x13a
>  [<c08e030d>] kernel_init+0x12c/0x17d
>  [<c08e01e1>] ? kernel_init+0x0/0x17d
>  [<c040ad17>] kernel_thread_helper+0x7/0x10
> ---[ end trace 4eaa2a86a8e2da23 ]---
> ------------[ cut here ]------------
> WARNING: at lib/kobject.c:595 kobject_put+0x27/0x3c()
> Hardware name: System Name
> kobject: '<NULL>' (cf805f60): is not initialized, yet kobject_put() is being called.
> Modules linked in:
> Pid: 1, comm: swapper Tainted: G        W  2.6.31.6 #14
> Call Trace:
>  [<c043a2db>] warn_slowpath_common+0x60/0x90
>  [<c043a33f>] warn_slowpath_fmt+0x24/0x27
>  [<c05588cb>] kobject_put+0x27/0x3c
>  [<c049e502>] kmem_cache_destroy+0x105/0x11b
>  [<c058adc8>] acpi_os_delete_cache+0x8/0xc
>  [<c05a700e>] acpi_ut_delete_caches+0x35/0x6b
>  [<c05a77f7>] acpi_ut_subsystem_shutdown+0x87/0x90
>  [<c0904837>] ? acpi_init+0x0/0x263
>  [<c05a8067>] acpi_terminate+0x8/0x14
>  [<c09049cb>] acpi_init+0x194/0x263
>  [<c05f0e66>] ? __class_create+0x44/0x5e
>  [<c09021c5>] ? fbmem_init+0x0/0x78
>  [<c0904837>] ? acpi_init+0x0/0x263
>  [<c0403051>] do_one_initcall+0x4c/0x13a
>  [<c08e030d>] kernel_init+0x12c/0x17d
>  [<c08e01e1>] ? kernel_init+0x0/0x17d
>  [<c040ad17>] kernel_thread_helper+0x7/0x10
> ---[ end trace 4eaa2a86a8e2da24 ]---
> sync cpu 0 get result ffffffff max_id 0
> Failed to sync pcpu 0
> xenbus_probe_backend_init bus registered ok
>
>
> Without XEN:
> ************
> bio: create slab <bio-0> at 0
> ACPI: EC: Look up EC in DSDT
> ACPI: Interpreter enabled
> ACPI: (supports S0 S5)
> ACPI: Using IOAPIC for interrupt routing
> ACPI: No dock devices found.
> ACPI: PCI Root Bridge [PCI0] (0000:00)
> pci 0000:00:00.0: reg 10 32bit mmio: [0xf8000000-0xfbffffff]
> pci 0000:00:04.1: reg 20 io port: [0xb800-0xb80f]
> pci 0000:00:04.2: reg 20 io port: [0xb400-0xb41f]
> * Found PM-Timer Bug on the chipset. Due to workarounds for a bug,
> * this clock source is slow. Consider trying other clock sources
> pci 0000:00:04.3: quirk: region e400-e43f claimed by PIIX4 ACPI
> pci 0000:00:04.3: quirk: region e800-e80f claimed by PIIX4 SMB
> pci 0000:00:04.3: PIIX4 devres B PIO at 0290-0297
> pci 0000:00:09.0: reg 10 io port: [0xb000-0xb0ff]
> pci 0000:00:09.0: reg 14 32bit mmio: [0xde800000-0xde8000ff]
> pci 0000:00:09.0: reg 30 32bit mmio: [0x000000-0x00ffff]
> pci 0000:00:0a.0: reg 10 io port: [0xa800-0xa8ff]
> pci 0000:00:0a.0: reg 14 32bit mmio: [0xde000000-0xde0000ff]
> pci 0000:00:0a.0: supports D1 D2
> pci 0000:00:0a.0: PME# supported from D1 D2 D3hot
> pci 0000:00:0a.0: PME# disabled
> pci 0000:00:0b.0: reg 10 io port: [0xa400-0xa4ff]
> pci 0000:00:0b.0: reg 14 32bit mmio: [0xdd800000-0xdd8000ff]
> pci 0000:00:0b.0: supports D1 D2
> pci 0000:00:0b.0: PME# supported from D1 D2 D3hot
> pci 0000:00:0b.0: PME# disabled
> pci 0000:01:00.0: reg 10 32bit mmio: [0xe0000000-0xe3ffffff]
> pci 0000:01:00.0: reg 14 32bit mmio: [0xdf800000-0xdf87ffff]
> pci 0000:01:00.0: reg 18 io port: [0xd800-0xd8ff]
> pci 0000:01:00.0: reg 30 32bit mmio: [0xdf7e0000-0xdf7fffff]
> pci 0000:01:00.0: supports D1 D2
> pci 0000:00:01.0: bridge io port: [0xd000-0xdfff]
> pci 0000:00:01.0: bridge 32bit mmio: [0xf4000000-0xf40fffff]
> pci 0000:00:01.0: bridge 32bit mmio pref: [0xdf700000-0xe3ffffff]
> pci_bus 0000:00: on NUMA node 0
> ACPI: PCI Interrupt Routing Table [\_SB_.PCI0._PRT]
> ACPI: PCI Interrupt Link [LNKA] (IRQs 3 4 5 6 7 9 10 *11 12 14 15)
> ACPI: PCI Interrupt Link [LNKB] (IRQs 3 4 5 6 7 9 *10 11 12 14 15)
> ACPI: PCI Interrupt Link [LNKC] (IRQs 3 4 5 6 7 9 10 11 *12 14 15)
> ACPI: PCI Interrupt Link [LNKD] (IRQs 3 *4 5 6 7 9 10 11 12 14 15)
> xenbus_probe_backend_init bus registered ok
>
>
> Corresponding to this error, the /proc/interrupts tables also differ:
>
> With XEN:
> *********
>            CPU0       CPU1
>   1:        426          0  xen-pirq-ioapic-edge  i8042
>   3:          0          0  xen-pirq-ioapic-edge  uhci_hcd:usb1
>   4:          2          0  xen-pirq-ioapic-edge  serial
>   8:          2          0  xen-pirq-ioapic-edge  rtc0
>  12:          0          0  xen-pirq-ioapic-edge  eth0
>  14:       4319          0  xen-pirq-ioapic-edge  ide0
>  15:         42          0  xen-pirq-ioapic-edge  ide1
> 411:          0          0   xen-dyn-event     xenbus
> 412:          0        703   xen-dyn-ipi       callfuncsingle1
> 413:          0          0   xen-dyn-virq      debug1
> 414:          0          0   xen-dyn-ipi       callfunc1
> 415:          0      45622   xen-dyn-ipi       resched1
> 416:          0        311   xen-dyn-ipi       spinlock1
> 417:          0     153289   xen-dyn-virq      timer1
> 418:        550          0   xen-dyn-ipi       callfuncsingle0
> 419:          0          0   xen-dyn-virq      debug0
> 420:          0          0   xen-dyn-ipi       callfunc0
> 421:      18071          0   xen-dyn-ipi       resched0
> 422:        661          0   xen-dyn-ipi       spinlock0
> 423:     277476          0   xen-dyn-virq      timer0
> NMI:          0          0   Non-maskable interrupts
> LOC:          0          0   Local timer interrupts
> SPU:          0          0   Spurious interrupts
> CNT:          0          0   Performance counter interrupts
> PND:          0          0   Performance pending work
> RES:      18071      45622   Rescheduling interrupts
> CAL:        550        703   Function call interrupts
> TLB:          0          0   TLB shootdowns
> TRM:          0          0   Thermal event interrupts
> THR:          0          0   Threshold APIC interrupts
> MCE:          0          0   Machine check exceptions
> MCP:        132        132   Machine check polls
> ERR:          0
> MIS:          0
>
>
> Without XEN:
> ************
>            CPU0       CPU1
>   0:         46          0   IO-APIC-edge      timer
>   1:       2567       4239   IO-APIC-edge      i8042
>   6:          3          0   IO-APIC-edge      floppy
>   8:          1          1   IO-APIC-edge      rtc0
>  14:      28604      27089   IO-APIC-edge      ide0
>  15:          0          0   IO-APIC-edge      ide1
>  18:       1942       1978   IO-APIC-fasteoi   eth0
>  20:          0          0   IO-APIC-fasteoi   acpi
> NMI:          0          0   Non-maskable interrupts
> LOC:    1097380    1052641   Local timer interrupts
> SPU:          0          0   Spurious interrupts
> CNT:          0          0   Performance counter interrupts
> PND:          0          0   Performance pending work
> RES:     105211     107135   Rescheduling interrupts
> CAL:         16         20   Function call interrupts
> TLB:       4542       4509   TLB shootdowns
> TRM:          0          0   Thermal event interrupts
> THR:          0          0   Threshold APIC interrupts
> MCE:          0          0   Machine check exceptions
> MCP:        289        289   Machine check polls
> ERR:          0
> MIS:          0
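
To compare the two /proc/interrupts tables above mechanically rather than
by eye, a small C sketch like the following could be used. It looks up the
numbered IRQ line that lists a given device and reports the interrupt-chip
label. The names find_irq, next_token and all_digits are made up for this
illustration and are not part of any kernel or Xen interface; the parsing
assumes the column layout shown in the tables in this mail.

```c
#include <assert.h>
#include <ctype.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Return 1 if the token consists only of decimal digits. */
static int all_digits(const char *s)
{
    if (*s == '\0')
        return 0;
    for (; *s; s++)
        if (!isdigit((unsigned char)*s))
            return 0;
    return 1;
}

/* Copy the next space/tab/comma-delimited token of the current line into
 * tok and advance *pp past it; return 0 at end of line or end of text. */
static int next_token(const char **pp, char *tok, size_t tok_len)
{
    const char *p = *pp;
    size_t n = 0;

    while (*p == ' ' || *p == '\t' || *p == ',')
        p++;
    if (*p == '\0' || *p == '\n') {
        *pp = p;
        return 0;
    }
    while (*p && *p != ' ' && *p != '\t' && *p != ',' && *p != '\n') {
        if (n + 1 < tok_len)
            tok[n++] = *p;
        p++;
    }
    tok[n] = '\0';
    *pp = p;
    return 1;
}

/* Hypothetical helper: find the numbered IRQ line whose action list names
 * `device`. Returns the IRQ number and copies the chip label into `chip`,
 * or returns -1 if no numbered line mentions the device. */
int find_irq(const char *table, const char *device, char *chip, size_t chip_len)
{
    const char *p = table;
    char tok[64], chip_tok[64];

    assert(table && device && chip && chip_len > 0);
    while (*p) {
        int irq = -1;

        /* First field must look like "12:"; labels such as "NMI:" are skipped. */
        if (next_token(&p, tok, sizeof tok)) {
            size_t len = strlen(tok);
            if (len >= 2 && tok[len - 1] == ':') {
                tok[len - 1] = '\0';
                if (all_digits(tok))
                    irq = atoi(tok);
            }
        }
        if (irq >= 0) {
            /* Skip the per-CPU counters, however many CPUs there are. */
            int have = next_token(&p, tok, sizeof tok);
            while (have && all_digits(tok))
                have = next_token(&p, tok, sizeof tok);
            if (have) {
                /* First non-numeric field is the interrupt chip label. */
                snprintf(chip_tok, sizeof chip_tok, "%s", tok);
                while (next_token(&p, tok, sizeof tok)) {
                    if (strcmp(tok, device) == 0) {
                        snprintf(chip, chip_len, "%s", chip_tok);
                        return irq;
                    }
                }
            }
        }
        /* Move on to the start of the next line. */
        while (*p && *p != '\n')
            p++;
        if (*p == '\n')
            p++;
    }
    return -1;
}
```

Fed the two tables from this mail, it reports eth0 on IRQ 12
(xen-pirq-ioapic-edge) under Xen and on IRQ 18 (IO-APIC-fasteoi) on bare
metal, and it finds an "acpi" entry only in the bare-metal table.
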
>
>
> Searching the Internet, I came across several messages (e.g.
> http://www.mail-archive.com/kvm@xxxxxxxxxxxxxxx/msg26601.html)
> mentioning that on motherboards with the PIIX4 chipset the SCI is
> hardwired to IRQ 9. However, on my system it is assigned IRQ 20 on bare
> metal, and requesting IRQ 20 fails on top of Xen (see the dmesg extract
> above: "ACPI: SCI (IRQ20) allocation failed").
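
One way to see which SCI interrupt the firmware itself advertises is to
read the SCI_INT field of the ACPI FADT. The sketch below decodes that
16-bit field, which the ACPI specification places at byte offset 46 of the
FADT (table signature "FACP"). The function names are invented for this
example, the sysfs path mentioned afterwards is an assumption about the
running system, and nothing here has been run on the P2B-D board discussed
in this thread.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

#define FADT_SCI_INT_OFFSET 46  /* 16-bit SCI_INT field in the FADT */

/* Decode SCI_INT from a raw FADT image. Returns the interrupt number, or
 * -1 if the buffer is too short or does not carry the "FACP" signature. */
int fadt_sci_int(const uint8_t *fadt, size_t len)
{
    assert(fadt != NULL);
    if (len < FADT_SCI_INT_OFFSET + 2 || memcmp(fadt, "FACP", 4) != 0)
        return -1;
    /* ACPI table fields are stored little-endian. */
    return fadt[FADT_SCI_INT_OFFSET] | (fadt[FADT_SCI_INT_OFFSET + 1] << 8);
}

/* Hypothetical convenience wrapper: read the first bytes of a FADT dump
 * from `path` and decode SCI_INT. Returns -1 on any error. */
int fadt_sci_int_from_file(const char *path)
{
    uint8_t buf[64];
    size_t n;
    FILE *f = fopen(path, "rb");

    if (f == NULL)
        return -1;
    n = fread(buf, 1, sizeof buf, f);
    fclose(f);
    return fadt_sci_int(buf, n);
}
```

On Linux systems that expose the ACPI tables through sysfs, the table can
usually be read as root, for example with
fadt_sci_int_from_file("/sys/firmware/acpi/tables/FACP"). Note that this
is the raw firmware value, which may differ from the number the Kernel
ends up using.
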
>
> As I started wondering whether it would work with IRQ 9, and having no
> knowledge of ACPI and interrupt handling in the Kernel, I crudely
> patched <Kernel-DIR>/drivers/acpi/osl.c in the following manner:
>
> osl.c:391
> *********
> acpi_status
> acpi_os_install_interrupt_handler(u32 gsi, acpi_osd_handler handler,
>                                   void *context)
> {
>         unsigned int irq;
>
>         acpi_irq_stats_init();
>
>         /*
>          * Ignore the GSI from the core, and use the value in our copy of the
>          * FADT. It may not be the same if an interrupt source override exists
>          * for the SCI.
>          */
>         gsi = acpi_gbl_FADT.sci_interrupt;
>         if (acpi_gsi_to_irq(gsi, &irq) < 0) {
>                 printk(KERN_ERR PREFIX "SCI (ACPI GSI %d) not registered\n",
>                        gsi);
>                 return AE_OK;
>         }
> +       irq = 9;
>         acpi_irq_handler = handler;
>         acpi_irq_context = context;
>         if (request_irq(irq, acpi_irq, IRQF_SHARED, "acpi", acpi_irq)) {
>                 printk(KERN_ERR PREFIX "SCI (IRQ%d) allocation failed\n", irq);
>                 return AE_NOT_ACQUIRED;
>         }
>         acpi_irq_irq = irq;
>
>         return AE_OK;
> }
>
>
> As you can see, I just overwrote the IRQ number determined by the
> system with IRQ 9, recompiled the Kernel and discovered(!) that
> networking was now working, even under Xen (btw: it was still working
> on bare metal).
>
> Now I don't know why it works with the SCI mapped to IRQ 20 on bare
> metal while the SCI is supposed to be hardwired to IRQ 9, but the fact
> that it works in both cases with IRQ 9 suggests to me that there is
> something "wrong", or at least different, when pv_ops Kernel 2.6.31.6 is
> run on top of Xen. Perhaps someone could have a look at it at some
> point, because that's where my knowledge stops...
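
The comment in the patched function mentions that the SCI number can differ
from the FADT value when an interrupt source override exists. Whether that
is what happens on this board is not established in this thread, but it can
be checked by looking for an Interrupt Source Override entry (type 2) in
the MADT (table signature "APIC"). A minimal sketch, assuming the standard
ACPI layout: entries start at byte offset 44, and an override entry holds
the bus at byte 2, the source IRQ at byte 3 and the 32-bit little-endian
GSI at bytes 4 to 7. The function name madt_override_gsi is hypothetical.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define MADT_ENTRIES_OFFSET        44  /* 36-byte header + LAPIC address + flags */
#define MADT_TYPE_INT_SRC_OVERRIDE 2   /* Interrupt Source Override entry type */

/* Search a raw MADT image for an Interrupt Source Override whose source is
 * ISA IRQ `isa_irq`. Returns the target GSI, or -1 if there is no such
 * entry or the table looks malformed. */
int64_t madt_override_gsi(const uint8_t *madt, size_t len, uint8_t isa_irq)
{
    size_t off = MADT_ENTRIES_OFFSET;

    assert(madt != NULL);
    if (len < MADT_ENTRIES_OFFSET || memcmp(madt, "APIC", 4) != 0)
        return -1;

    /* Each entry starts with a one-byte type and a one-byte length. */
    while (off + 2 <= len) {
        uint8_t type = madt[off];
        uint8_t elen = madt[off + 1];

        if (elen < 2 || off + elen > len)
            break;  /* truncated or malformed entry: stop scanning */

        if (type == MADT_TYPE_INT_SRC_OVERRIDE && elen >= 10 &&
            madt[off + 3] == isa_irq) {
            uint32_t gsi = (uint32_t)madt[off + 4] |
                           ((uint32_t)madt[off + 5] << 8) |
                           ((uint32_t)madt[off + 6] << 16) |
                           ((uint32_t)madt[off + 7] << 24);
            return (int64_t)gsi;
        }
        off += elen;
    }
    return -1;
}
```

Calling madt_override_gsi(buf, len, 9) on a dump of the MADT would show
whether the firmware redirects ISA IRQ 9 to another GSI; a result of -1
means no override for that source was found in the table.
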
>
> Thanks & regards,
> Marcial
>
>
>
> _______________________________________________
> Xen-devel mailing list
> Xen-devel@xxxxxxxxxxxxxxxxxxx
> http://lists.xensource.com/xen-devel
>   

