Re: [Xen-devel] Issue with pv_ops Kernel 2.6.31.6 and Xen

Sorry, this is a duplicate of
http://lists.xensource.com/archives/html/xen-devel/2010-01/msg00855.html

I thought the original mail had not reached the mailing list, so I reposted it...


Marcial Rion wrote:
> Hi
>
> First of all I have to state that I am neither a Kernel nor a Xen
> developer. Nevertheless, while trying to use Kernel 2.6.31.6 from
> git://git.kernel.org/pub/scm/linux/kernel/git/jeremy/xen.git as a Dom0
> Kernel, I discovered an issue and, after searching the Internet for a long
> time, I probably also found the cause. However, I won't be able to fix
> it by myself :-(, so I am trying to share my knowledge with this list,
> in the hope that the issue might get fixed sometime :-)...
> I will try to give you all information that seems relevant to me;
> however, if it turns out I failed to give enough details about my system
> (configuration), log files or anything else, I will be glad to provide
> this information. Furthermore, I would also be happy to help with
> testing potential patches if this is required. I am posting to this list
> as suggested at
> http://wiki.xensource.com/xenwiki/XenParavirtOps (bottom of page). If I
> am wrong, please give me a short hint so I won't bother you any longer...
>
> Now, let's get into it...
>
> About my system:
> I am running Gentoo (10.0, server profile) on an Asus P2B-D motherboard
> (PIIX4 chipset) with two PIII 500 MHz CPUs and 1 GB of RAM. The system
> also has 3 PCI network interfaces with the Realtek RTL8139 chip
> (8139too Kernel driver). The network interface in use is eth0 (I
> already tried whether using another interface as eth0 would change
> anything - without success :-( ).
>
> The issue I have:
> While the Xen pv_ops Kernel 2.6.31.6 runs perfectly on bare metal, it fails
> to get network connectivity when run on top of Xen 3.4.1 (Gentoo default
> installation). Though the system seems to come up correctly at first
> sight and the network interface is available (I can ping it locally), access
> to the network fails (I cannot ping other systems in the network, nor can
> they ping mine).
>
> What I discovered so far:
> Consulting the boot messages in "dmesg", I discovered that the ACPI SCI
> handler fails to be installed when run on top of Xen, while this error
> does not occur on bare metal.
>
> With XEN:
> *********
> bio: create slab <bio-0> at 0
> ACPI: SCI (IRQ20) allocation failed
> ACPI Exception: AE_NOT_ACQUIRED, Unable to install System Control
> Interrupt handler 20090521 evevent-161
> ACPI: Unable to start the ACPI Interpreter
> ------------[ cut here ]------------
> WARNING: at lib/kobject.c:595 kobject_put+0x27/0x3c()
> Hardware name: System Name
> kobject: '<NULL>' (cf805ea0): is not initialized, yet kobject_put() is
> being called.
> Modules linked in:
> Pid: 1, comm: swapper Tainted: G        W  2.6.31.6 #14
> Call Trace:
>  [<c043a2db>] warn_slowpath_common+0x60/0x90
>  [<c043a33f>] warn_slowpath_fmt+0x24/0x27
>  [<c05588cb>] kobject_put+0x27/0x3c
>  [<c049e502>] kmem_cache_destroy+0x105/0x11b
>  [<c058adc8>] acpi_os_delete_cache+0x8/0xc
>  [<c05a6fe6>] acpi_ut_delete_caches+0xd/0x6b
>  [<c05a77f7>] acpi_ut_subsystem_shutdown+0x87/0x90
>  [<c0904837>] ? acpi_init+0x0/0x263
>  [<c05a8067>] acpi_terminate+0x8/0x14
>  [<c09049cb>] acpi_init+0x194/0x263
>  [<c05f0e66>] ? __class_create+0x44/0x5e
>  [<c09021c5>] ? fbmem_init+0x0/0x78
>  [<c0904837>] ? acpi_init+0x0/0x263
>  [<c0403051>] do_one_initcall+0x4c/0x13a
>  [<c08e030d>] kernel_init+0x12c/0x17d
>  [<c08e01e1>] ? kernel_init+0x0/0x17d
>  [<c040ad17>] kernel_thread_helper+0x7/0x10
> ---[ end trace 4eaa2a86a8e2da23 ]---
> ------------[ cut here ]------------
> WARNING: at lib/kobject.c:595 kobject_put+0x27/0x3c()
> Hardware name: System Name
> kobject: '<NULL>' (cf805f60): is not initialized, yet kobject_put() is
> being called.
> Modules linked in:
> Pid: 1, comm: swapper Tainted: G        W  2.6.31.6 #14
> Call Trace:
>  [<c043a2db>] warn_slowpath_common+0x60/0x90
>  [<c043a33f>] warn_slowpath_fmt+0x24/0x27
>  [<c05588cb>] kobject_put+0x27/0x3c
>  [<c049e502>] kmem_cache_destroy+0x105/0x11b
>  [<c058adc8>] acpi_os_delete_cache+0x8/0xc
>  [<c05a700e>] acpi_ut_delete_caches+0x35/0x6b
>  [<c05a77f7>] acpi_ut_subsystem_shutdown+0x87/0x90
>  [<c0904837>] ? acpi_init+0x0/0x263
>  [<c05a8067>] acpi_terminate+0x8/0x14
>  [<c09049cb>] acpi_init+0x194/0x263
>  [<c05f0e66>] ? __class_create+0x44/0x5e
>  [<c09021c5>] ? fbmem_init+0x0/0x78
>  [<c0904837>] ? acpi_init+0x0/0x263
>  [<c0403051>] do_one_initcall+0x4c/0x13a
>  [<c08e030d>] kernel_init+0x12c/0x17d
>  [<c08e01e1>] ? kernel_init+0x0/0x17d
>  [<c040ad17>] kernel_thread_helper+0x7/0x10
> ---[ end trace 4eaa2a86a8e2da24 ]---
> sync cpu 0 get result ffffffff max_id 0
> Failed to sync pcpu 0
> xenbus_probe_backend_init bus registered ok
>
>
> Without Xen:
> ************
> bio: create slab <bio-0> at 0
> ACPI: EC: Look up EC in DSDT
> ACPI: Interpreter enabled
> ACPI: (supports S0 S5)
> ACPI: Using IOAPIC for interrupt routing
> ACPI: No dock devices found.
> ACPI: PCI Root Bridge [PCI0] (0000:00)
> pci 0000:00:00.0: reg 10 32bit mmio: [0xf8000000-0xfbffffff]
> pci 0000:00:04.1: reg 20 io port: [0xb800-0xb80f]
> pci 0000:00:04.2: reg 20 io port: [0xb400-0xb41f]
> * Found PM-Timer Bug on the chipset. Due to workarounds for a bug,
> * this clock source is slow. Consider trying other clock sources
> pci 0000:00:04.3: quirk: region e400-e43f claimed by PIIX4 ACPI
> pci 0000:00:04.3: quirk: region e800-e80f claimed by PIIX4 SMB
> pci 0000:00:04.3: PIIX4 devres B PIO at 0290-0297
> pci 0000:00:09.0: reg 10 io port: [0xb000-0xb0ff]
> pci 0000:00:09.0: reg 14 32bit mmio: [0xde800000-0xde8000ff]
> pci 0000:00:09.0: reg 30 32bit mmio: [0x000000-0x00ffff]
> pci 0000:00:0a.0: reg 10 io port: [0xa800-0xa8ff]
> pci 0000:00:0a.0: reg 14 32bit mmio: [0xde000000-0xde0000ff]
> pci 0000:00:0a.0: supports D1 D2
> pci 0000:00:0a.0: PME# supported from D1 D2 D3hot
> pci 0000:00:0a.0: PME# disabled
> pci 0000:00:0b.0: reg 10 io port: [0xa400-0xa4ff]
> pci 0000:00:0b.0: reg 14 32bit mmio: [0xdd800000-0xdd8000ff]
> pci 0000:00:0b.0: supports D1 D2
> pci 0000:00:0b.0: PME# supported from D1 D2 D3hot
> pci 0000:00:0b.0: PME# disabled
> pci 0000:01:00.0: reg 10 32bit mmio: [0xe0000000-0xe3ffffff]
> pci 0000:01:00.0: reg 14 32bit mmio: [0xdf800000-0xdf87ffff]
> pci 0000:01:00.0: reg 18 io port: [0xd800-0xd8ff]
> pci 0000:01:00.0: reg 30 32bit mmio: [0xdf7e0000-0xdf7fffff]
> pci 0000:01:00.0: supports D1 D2
> pci 0000:00:01.0: bridge io port: [0xd000-0xdfff]
> pci 0000:00:01.0: bridge 32bit mmio: [0xf4000000-0xf40fffff]
> pci 0000:00:01.0: bridge 32bit mmio pref: [0xdf700000-0xe3ffffff]
> pci_bus 0000:00: on NUMA node 0
> ACPI: PCI Interrupt Routing Table [\_SB_.PCI0._PRT]
> ACPI: PCI Interrupt Link [LNKA] (IRQs 3 4 5 6 7 9 10 *11 12 14 15)
> ACPI: PCI Interrupt Link [LNKB] (IRQs 3 4 5 6 7 9 *10 11 12 14 15)
> ACPI: PCI Interrupt Link [LNKC] (IRQs 3 4 5 6 7 9 10 11 *12 14 15)
> ACPI: PCI Interrupt Link [LNKD] (IRQs 3 *4 5 6 7 9 10 11 12 14 15)
> xenbus_probe_backend_init bus registered ok
>
>
> Corresponding to the error, the /proc/interrupts tables also differed:
>
> With XEN:
> *********
>            CPU0       CPU1
>   1:        426          0  xen-pirq-ioapic-edge  i8042
>   3:          0          0  xen-pirq-ioapic-edge  uhci_hcd:usb1
>   4:          2          0  xen-pirq-ioapic-edge  serial
>   8:          2          0  xen-pirq-ioapic-edge  rtc0
>  12:          0          0  xen-pirq-ioapic-edge  eth0
>  14:       4319          0  xen-pirq-ioapic-edge  ide0
>  15:         42          0  xen-pirq-ioapic-edge  ide1
> 411:          0          0   xen-dyn-event     xenbus
> 412:          0        703   xen-dyn-ipi       callfuncsingle1
> 413:          0          0   xen-dyn-virq      debug1
> 414:          0          0   xen-dyn-ipi       callfunc1
> 415:          0      45622   xen-dyn-ipi       resched1
> 416:          0        311   xen-dyn-ipi       spinlock1
> 417:          0     153289   xen-dyn-virq      timer1
> 418:        550          0   xen-dyn-ipi       callfuncsingle0
> 419:          0          0   xen-dyn-virq      debug0
> 420:          0          0   xen-dyn-ipi       callfunc0
> 421:      18071          0   xen-dyn-ipi       resched0
> 422:        661          0   xen-dyn-ipi       spinlock0
> 423:     277476          0   xen-dyn-virq      timer0
> NMI:          0          0   Non-maskable interrupts
> LOC:          0          0   Local timer interrupts
> SPU:          0          0   Spurious interrupts
> CNT:          0          0   Performance counter interrupts
> PND:          0          0   Performance pending work
> RES:      18071      45622   Rescheduling interrupts
> CAL:        550        703   Function call interrupts
> TLB:          0          0   TLB shootdowns
> TRM:          0          0   Thermal event interrupts
> THR:          0          0   Threshold APIC interrupts
> MCE:          0          0   Machine check exceptions
> MCP:        132        132   Machine check polls
> ERR:          0
> MIS:          0
>
>
> Without XEN:
> ************
>            CPU0       CPU1
>   0:         46          0   IO-APIC-edge      timer
>   1:       2567       4239   IO-APIC-edge      i8042
>   6:          3          0   IO-APIC-edge      floppy
>   8:          1          1   IO-APIC-edge      rtc0
>  14:      28604      27089   IO-APIC-edge      ide0
>  15:          0          0   IO-APIC-edge      ide1
>  18:       1942       1978   IO-APIC-fasteoi   eth0
>  20:          0          0   IO-APIC-fasteoi   acpi
> NMI:          0          0   Non-maskable interrupts
> LOC:    1097380    1052641   Local timer interrupts
> SPU:          0          0   Spurious interrupts
> CNT:          0          0   Performance counter interrupts
> PND:          0          0   Performance pending work
> RES:     105211     107135   Rescheduling interrupts
> CAL:         16         20   Function call interrupts
> TLB:       4542       4509   TLB shootdowns
> TRM:          0          0   Thermal event interrupts
> THR:          0          0   Threshold APIC interrupts
> MCE:          0          0   Machine check exceptions
> MCP:        289        289   Machine check polls
> ERR:          0
> MIS:          0
>
>
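
For anyone who wants to compare the two /proc/interrupts tables above
mechanically, here is a minimal sketch. It only looks at the numbered IRQ
lines and reports devices whose IRQ number differs between two dumps. The
function names and the shortened sample data are my own, picked from the
tables above for illustration; this is not an existing tool.

```python
import re

def parse_interrupts(text):
    """Map device name -> IRQ number for the numbered lines of a dump."""
    devices = {}
    for line in text.splitlines():
        # Tolerate a leading "> " quote marker from a pasted email.
        line = line.lstrip("> ").rstrip()
        m = re.match(r"^(\d+):\s+(.*)$", line)
        if not m:
            continue  # header line or named counters (NMI, LOC, ...)
        irq = int(m.group(1))
        fields = m.group(2).split()
        # Drop the per-CPU counters, which are the leading integer fields.
        while fields and fields[0].isdigit():
            fields.pop(0)
        if len(fields) < 2:
            continue  # no interrupt type / device name on this line
        # fields[0] is the interrupt type, the rest are device name(s);
        # shared IRQs list several names separated by commas.
        for name in " ".join(fields[1:]).split(","):
            devices[name.strip()] = irq
    return devices

def compare(a, b):
    """Return {device: (irq_in_a, irq_in_b)} where they differ (None = absent)."""
    diff = {}
    for name in sorted(set(a) | set(b)):
        if a.get(name) != b.get(name):
            diff[name] = (a.get(name), b.get(name))
    return diff

# Shortened excerpts of the two tables above.
XEN = """\
  1:        426          0  xen-pirq-ioapic-edge  i8042
 12:          0          0  xen-pirq-ioapic-edge  eth0
 14:       4319          0  xen-pirq-ioapic-edge  ide0
"""
BARE = """\
  1:       2567       4239   IO-APIC-edge      i8042
 14:      28604      27089   IO-APIC-edge      ide0
 18:       1942       1978   IO-APIC-fasteoi   eth0
 20:          0          0   IO-APIC-fasteoi   acpi
"""

print(compare(parse_interrupts(XEN), parse_interrupts(BARE)))
# -> {'acpi': (None, 20), 'eth0': (12, 18)}
```

On the excerpts this highlights two things already visible in the tables:
the acpi line (IRQ 20) is missing under Xen, and eth0 sits on IRQ 12 under
Xen (with zero interrupts counted) but on IRQ 18 on bare metal.
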
> Searching the Internet, I ran across different messages (e.g.
> http://www.mail-archive.com/kvm@xxxxxxxxxxxxxxx/msg26601.html)
> mentioning that on motherboards with the PIIX4 chipset the SCI is
> hardwired to IRQ 9. However, on my system it is assigned IRQ 20 on bare
> metal, and the request for IRQ 20 fails on top of Xen (see the dmesg
> extract above -> ACPI: SCI (IRQ20) allocation failed).
>
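
One way to see what the firmware itself reports for the SCI is to read the
SCI_INT field straight from the raw FADT. The sketch below assumes the
table is exposed at /sys/firmware/acpi/tables/FACP (reading it normally
needs root) and that SCI_INT is the 16-bit little-endian field at byte
offset 46, as laid out in the ACPI specification. The function names are
mine. As far as I understand, the kernel's own copy of this value may
differ from the raw table when an interrupt source override exists for the
SCI, so treat the result as a hint only.

```python
import struct

# Assumption: the kernel exposes the raw FADT here (signature "FACP").
FADT_PATH = "/sys/firmware/acpi/tables/FACP"

# 36-byte table header, FIRMWARE_CTRL (4), DSDT (4), one reserved byte,
# Preferred_PM_Profile (1), then SCI_INT (2 bytes, little-endian).
SCI_INT_OFFSET = 46

def sci_int_from_fadt(raw):
    """Return the SCI_INT value stored in a raw FADT image."""
    if len(raw) < SCI_INT_OFFSET + 2:
        raise ValueError("table too short to contain SCI_INT")
    if raw[:4] != b"FACP":
        raise ValueError("not a FADT (signature %r)" % raw[:4])
    (sci_int,) = struct.unpack_from("<H", raw, SCI_INT_OFFSET)
    return sci_int

def read_sci_int(path=FADT_PATH):
    """Read the table from sysfs and return its SCI_INT field."""
    with open(path, "rb") as f:
        return sci_int_from_fadt(f.read())
```

Calling read_sci_int() on the affected machine, once on bare metal and once
in Dom0, might show whether the 20 comes from the table itself or from
later remapping. I have not been able to verify this on the hardware
described here.
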
> As I started wondering whether it would work with IRQ 9, and having no
> knowledge of ACPI and interrupt handling in the Kernel, I crudely patched
> the code of <Kernel-DIR>/drivers/acpi/osl.c in the following manner:
>
> osl.c:391
> *********
> acpi_status
> acpi_os_install_interrupt_handler(u32 gsi, acpi_osd_handler handler,
>                                   void *context)
> {
>         unsigned int irq;
>
>         acpi_irq_stats_init();
>
>         /*
>          * Ignore the GSI from the core, and use the value in our copy of the
>          * FADT. It may not be the same if an interrupt source override exists
>          * for the SCI.
>          */
>         gsi = acpi_gbl_FADT.sci_interrupt;
>         if (acpi_gsi_to_irq(gsi, &irq) < 0) {
>                 printk(KERN_ERR PREFIX "SCI (ACPI GSI %d) not registered\n",
>                        gsi);
>                 return AE_OK;
>         }
> +       irq = 9;
>         acpi_irq_handler = handler;
>         acpi_irq_context = context;
>         if (request_irq(irq, acpi_irq, IRQF_SHARED, "acpi", acpi_irq)) {
>                 printk(KERN_ERR PREFIX "SCI (IRQ%d) allocation failed\n", irq);
>                 return AE_NOT_ACQUIRED;
>         }
>         acpi_irq_irq = irq;
>
>         return AE_OK;
> }
>
>
> As you can see, I just "overwrote" the IRQ number somehow evaluated by
> the system with IRQ 9, recompiled the Kernel and discovered(!) that
> networking was now working, even within Xen (btw: it was still working
> on bare metal).
>
> Now I don't know why it is working with SCI mapped to IRQ 20 on bare
> metal while SCI is supposed to be hardwired to IRQ 9, but the fact that
> it works in both cases with IRQ 9 suggests to me that there is something
> "wrong" or at least different when pv_ops Kernel 2.6.31.6 is run on top
> of Xen. Perhaps someone could have a look at it sometime, because that's
> where my knowledge stops...
>
> Thanks & regards,
> Marcial
>
>
>
> _______________________________________________
> Xen-devel mailing list
> Xen-devel@xxxxxxxxxxxxxxxxxxx
> http://lists.xensource.com/xen-devel
>   

