Sorry, this is a duplicate of
http://lists.xensource.com/archives/html/xen-devel/2010-01/msg00855.html
I thought the original mail had not reached the mailing list, so I reposted it...
Marcial Rion wrote:
> Hi
>
> First of all, I have to state that I am neither a kernel nor a Xen
> developer. Nevertheless, while trying to use kernel 2.6.31.6 from
> git://git.kernel.org/pub/scm/linux/kernel/git/jeremy/xen.git as a Dom0
> kernel, I discovered an issue, and after searching the Internet for a
> long time I have probably also found the cause. However, I won't be
> able to fix it by myself :-(, so I am sharing what I know with this
> list in the hope that the issue might get fixed sometime :-)...
> I will try to give you all the information that seems relevant to me;
> however, if it turns out that I have not given enough details about my
> system (configuration), log files or anything else, I will be glad to
> provide this information. Furthermore, I would also be happy to help
> with "testing" of potential patches if this is required. I am posting
> to this list as this has been suggested at
> http://wiki.xensource.com/xenwiki/XenParavirtOps (bottom of page). If I
> am wrong, please give me a short hint so I won't bother you any longer...
>
> Now, let's get into it...
>
> About my system:
> I am running Gentoo (10.0, server profile) on an Asus P2B-D motherboard
> (PIIX4 chipset) with two PIII 500 MHz CPUs and 1 GB of RAM. The system
> also has 3 PCI network interfaces with the Realtek RTL8139 chip
> (8139too kernel driver). The network interface to be used is eth0 (I
> already tried whether using another interface as eth0 would change
> anything - without success :-( ).
>
> The issue I have:
> While the Xen pv_ops kernel 2.6.31.6 runs perfectly on bare metal, it
> fails to get network connectivity when run on top of Xen 3.4.1 (Gentoo
> default installation). Although the system seems to come up correctly
> at first sight and the network interface is available (I can ping it
> locally), access to the network fails (I cannot ping other systems in
> the network, nor can they ping this one).
>
> What I discovered so far:
> Consulting the boot messages in "dmesg", I discovered that installing
> the ACPI SCI (System Control Interrupt) handler fails when the kernel
> runs on top of Xen, while this error does not happen on bare metal.
>
> With XEN:
> *********
> bio: create slab <bio-0> at 0
> ACPI: SCI (IRQ20) allocation failed
> ACPI Exception: AE_NOT_ACQUIRED, Unable to install System Control Interrupt handler 20090521 evevent-161
> ACPI: Unable to start the ACPI Interpreter
> ------------[ cut here ]------------
> WARNING: at lib/kobject.c:595 kobject_put+0x27/0x3c()
> Hardware name: System Name
> kobject: '<NULL>' (cf805ea0): is not initialized, yet kobject_put() is being called.
> Modules linked in:
> Pid: 1, comm: swapper Tainted: G W 2.6.31.6 #14
> Call Trace:
> [<c043a2db>] warn_slowpath_common+0x60/0x90
> [<c043a33f>] warn_slowpath_fmt+0x24/0x27
> [<c05588cb>] kobject_put+0x27/0x3c
> [<c049e502>] kmem_cache_destroy+0x105/0x11b
> [<c058adc8>] acpi_os_delete_cache+0x8/0xc
> [<c05a6fe6>] acpi_ut_delete_caches+0xd/0x6b
> [<c05a77f7>] acpi_ut_subsystem_shutdown+0x87/0x90
> [<c0904837>] ? acpi_init+0x0/0x263
> [<c05a8067>] acpi_terminate+0x8/0x14
> [<c09049cb>] acpi_init+0x194/0x263
> [<c05f0e66>] ? __class_create+0x44/0x5e
> [<c09021c5>] ? fbmem_init+0x0/0x78
> [<c0904837>] ? acpi_init+0x0/0x263
> [<c0403051>] do_one_initcall+0x4c/0x13a
> [<c08e030d>] kernel_init+0x12c/0x17d
> [<c08e01e1>] ? kernel_init+0x0/0x17d
> [<c040ad17>] kernel_thread_helper+0x7/0x10
> ---[ end trace 4eaa2a86a8e2da23 ]---
> ------------[ cut here ]------------
> WARNING: at lib/kobject.c:595 kobject_put+0x27/0x3c()
> Hardware name: System Name
> kobject: '<NULL>' (cf805f60): is not initialized, yet kobject_put() is being called.
> Modules linked in:
> Pid: 1, comm: swapper Tainted: G W 2.6.31.6 #14
> Call Trace:
> [<c043a2db>] warn_slowpath_common+0x60/0x90
> [<c043a33f>] warn_slowpath_fmt+0x24/0x27
> [<c05588cb>] kobject_put+0x27/0x3c
> [<c049e502>] kmem_cache_destroy+0x105/0x11b
> [<c058adc8>] acpi_os_delete_cache+0x8/0xc
> [<c05a700e>] acpi_ut_delete_caches+0x35/0x6b
> [<c05a77f7>] acpi_ut_subsystem_shutdown+0x87/0x90
> [<c0904837>] ? acpi_init+0x0/0x263
> [<c05a8067>] acpi_terminate+0x8/0x14
> [<c09049cb>] acpi_init+0x194/0x263
> [<c05f0e66>] ? __class_create+0x44/0x5e
> [<c09021c5>] ? fbmem_init+0x0/0x78
> [<c0904837>] ? acpi_init+0x0/0x263
> [<c0403051>] do_one_initcall+0x4c/0x13a
> [<c08e030d>] kernel_init+0x12c/0x17d
> [<c08e01e1>] ? kernel_init+0x0/0x17d
> [<c040ad17>] kernel_thread_helper+0x7/0x10
> ---[ end trace 4eaa2a86a8e2da24 ]---
> sync cpu 0 get result ffffffff max_id 0
> Failed to sync pcpu 0
> xenbus_probe_backend_init bus registered ok
>
>
> Without Xen:
> ************
> bio: create slab <bio-0> at 0
> ACPI: EC: Look up EC in DSDT
> ACPI: Interpreter enabled
> ACPI: (supports S0 S5)
> ACPI: Using IOAPIC for interrupt routing
> ACPI: No dock devices found.
> ACPI: PCI Root Bridge [PCI0] (0000:00)
> pci 0000:00:00.0: reg 10 32bit mmio: [0xf8000000-0xfbffffff]
> pci 0000:00:04.1: reg 20 io port: [0xb800-0xb80f]
> pci 0000:00:04.2: reg 20 io port: [0xb400-0xb41f]
> * Found PM-Timer Bug on the chipset. Due to workarounds for a bug,
> * this clock source is slow. Consider trying other clock sources
> pci 0000:00:04.3: quirk: region e400-e43f claimed by PIIX4 ACPI
> pci 0000:00:04.3: quirk: region e800-e80f claimed by PIIX4 SMB
> pci 0000:00:04.3: PIIX4 devres B PIO at 0290-0297
> pci 0000:00:09.0: reg 10 io port: [0xb000-0xb0ff]
> pci 0000:00:09.0: reg 14 32bit mmio: [0xde800000-0xde8000ff]
> pci 0000:00:09.0: reg 30 32bit mmio: [0x000000-0x00ffff]
> pci 0000:00:0a.0: reg 10 io port: [0xa800-0xa8ff]
> pci 0000:00:0a.0: reg 14 32bit mmio: [0xde000000-0xde0000ff]
> pci 0000:00:0a.0: supports D1 D2
> pci 0000:00:0a.0: PME# supported from D1 D2 D3hot
> pci 0000:00:0a.0: PME# disabled
> pci 0000:00:0b.0: reg 10 io port: [0xa400-0xa4ff]
> pci 0000:00:0b.0: reg 14 32bit mmio: [0xdd800000-0xdd8000ff]
> pci 0000:00:0b.0: supports D1 D2
> pci 0000:00:0b.0: PME# supported from D1 D2 D3hot
> pci 0000:00:0b.0: PME# disabled
> pci 0000:01:00.0: reg 10 32bit mmio: [0xe0000000-0xe3ffffff]
> pci 0000:01:00.0: reg 14 32bit mmio: [0xdf800000-0xdf87ffff]
> pci 0000:01:00.0: reg 18 io port: [0xd800-0xd8ff]
> pci 0000:01:00.0: reg 30 32bit mmio: [0xdf7e0000-0xdf7fffff]
> pci 0000:01:00.0: supports D1 D2
> pci 0000:00:01.0: bridge io port: [0xd000-0xdfff]
> pci 0000:00:01.0: bridge 32bit mmio: [0xf4000000-0xf40fffff]
> pci 0000:00:01.0: bridge 32bit mmio pref: [0xdf700000-0xe3ffffff]
> pci_bus 0000:00: on NUMA node 0
> ACPI: PCI Interrupt Routing Table [\_SB_.PCI0._PRT]
> ACPI: PCI Interrupt Link [LNKA] (IRQs 3 4 5 6 7 9 10 *11 12 14 15)
> ACPI: PCI Interrupt Link [LNKB] (IRQs 3 4 5 6 7 9 *10 11 12 14 15)
> ACPI: PCI Interrupt Link [LNKC] (IRQs 3 4 5 6 7 9 10 11 *12 14 15)
> ACPI: PCI Interrupt Link [LNKD] (IRQs 3 *4 5 6 7 9 10 11 12 14 15)
> xenbus_probe_backend_init bus registered ok
>
>
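
To compare the two boot logs automatically, the failing IRQ can be
pulled out of the "SCI (IRQ20) allocation failed" line quoted above.
This is only a minimal user-space sketch in C; the function name
sci_failed_irq is made up for illustration, and only the message format
is taken from the dmesg output.

```c
#include <stdio.h>
#include <string.h>

/*
 * Return the IRQ number from a kernel log line of the form
 * "ACPI: SCI (IRQ20) allocation failed", or -1 if the line does not
 * match. Hypothetical helper: only the message format comes from the
 * dmesg output in this mail.
 */
int sci_failed_irq(const char *line)
{
    /* Skip any timestamp or other prefix in front of the message. */
    const char *p = strstr(line, "ACPI: SCI (IRQ");
    int irq = -1;
    int consumed = 0;

    if (p == NULL)
        return -1;

    /* %n is only stored if the whole message text matched. */
    if (sscanf(p, "ACPI: SCI (IRQ%d) allocation failed%n",
               &irq, &consumed) != 1)
        return -1;
    if (consumed == 0 || irq < 0)
        return -1;

    return irq;
}
```

Called on each line of dmesg, it returns 20 for the failing line in the
Xen log and -1 for every other line shown here.
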
> Corresponding to the error, the /proc/interrupts tables also differ:
>
> With XEN:
> *********
>            CPU0       CPU1
>   1:        426          0   xen-pirq-ioapic-edge  i8042
>   3:          0          0   xen-pirq-ioapic-edge  uhci_hcd:usb1
>   4:          2          0   xen-pirq-ioapic-edge  serial
>   8:          2          0   xen-pirq-ioapic-edge  rtc0
>  12:          0          0   xen-pirq-ioapic-edge  eth0
>  14:       4319          0   xen-pirq-ioapic-edge  ide0
>  15:         42          0   xen-pirq-ioapic-edge  ide1
> 411:          0          0   xen-dyn-event         xenbus
> 412:          0        703   xen-dyn-ipi           callfuncsingle1
> 413:          0          0   xen-dyn-virq          debug1
> 414:          0          0   xen-dyn-ipi           callfunc1
> 415:          0      45622   xen-dyn-ipi           resched1
> 416:          0        311   xen-dyn-ipi           spinlock1
> 417:          0     153289   xen-dyn-virq          timer1
> 418:        550          0   xen-dyn-ipi           callfuncsingle0
> 419:          0          0   xen-dyn-virq          debug0
> 420:          0          0   xen-dyn-ipi           callfunc0
> 421:      18071          0   xen-dyn-ipi           resched0
> 422:        661          0   xen-dyn-ipi           spinlock0
> 423:     277476          0   xen-dyn-virq          timer0
> NMI:          0          0   Non-maskable interrupts
> LOC:          0          0   Local timer interrupts
> SPU:          0          0   Spurious interrupts
> CNT:          0          0   Performance counter interrupts
> PND:          0          0   Performance pending work
> RES:      18071      45622   Rescheduling interrupts
> CAL:        550        703   Function call interrupts
> TLB:          0          0   TLB shootdowns
> TRM:          0          0   Thermal event interrupts
> THR:          0          0   Threshold APIC interrupts
> MCE:          0          0   Machine check exceptions
> MCP:        132        132   Machine check polls
> ERR:          0
> MIS:          0
>
>
> Without XEN:
> ************
>            CPU0       CPU1
>   0:         46          0   IO-APIC-edge     timer
>   1:       2567       4239   IO-APIC-edge     i8042
>   6:          3          0   IO-APIC-edge     floppy
>   8:          1          1   IO-APIC-edge     rtc0
>  14:      28604      27089   IO-APIC-edge     ide0
>  15:          0          0   IO-APIC-edge     ide1
>  18:       1942       1978   IO-APIC-fasteoi  eth0
>  20:          0          0   IO-APIC-fasteoi  acpi
> NMI:          0          0   Non-maskable interrupts
> LOC:    1097380    1052641   Local timer interrupts
> SPU:          0          0   Spurious interrupts
> CNT:          0          0   Performance counter interrupts
> PND:          0          0   Performance pending work
> RES:     105211     107135   Rescheduling interrupts
> CAL:         16         20   Function call interrupts
> TLB:       4542       4509   TLB shootdowns
> TRM:          0          0   Thermal event interrupts
> THR:          0          0   Threshold APIC interrupts
> MCE:          0          0   Machine check exceptions
> MCP:        289        289   Machine check polls
> ERR:          0
> MIS:          0
>
>
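
The difference between the two /proc/interrupts tables can also be
checked mechanically, for example to see which IRQ line eth0 or acpi
ended up on. The following is a minimal sketch in user-space C, not
kernel code; irq_for_device is a hypothetical helper, and it only
compares the last field of each numbered row, so rows that list several
shared handlers are not handled.

```c
#include <ctype.h>
#include <stdlib.h>
#include <string.h>

/*
 * Scan text in /proc/interrupts format and return the number of the
 * IRQ row whose last field equals `name` (e.g. "eth0" or "acpi"),
 * or -1 if no numbered row matches. Hypothetical helper.
 */
int irq_for_device(const char *table, const char *name)
{
    size_t name_len = strlen(name);
    const char *line = table;

    if (name_len == 0)
        return -1;

    while (*line != '\0') {
        const char *end = strchr(line, '\n');
        size_t len = end ? (size_t)(end - line) : strlen(line);
        char *num_end;
        long irq = strtol(line, &num_end, 10);

        /* Numbered rows look like " 18: 1942 1978 IO-APIC-fasteoi eth0". */
        if (num_end != line && (size_t)(num_end - line) < len &&
            *num_end == ':' && irq >= 0) {
            size_t stop = len;
            size_t start;

            /* Trim trailing whitespace, then find the start of the last field. */
            while (stop > 0 && isspace((unsigned char)line[stop - 1]))
                stop--;
            start = stop;
            while (start > 0 && !isspace((unsigned char)line[start - 1]))
                start--;

            if (stop - start == name_len &&
                strncmp(line + start, name, name_len) == 0)
                return (int)irq;
        }

        /* Advance to the next line, or stop at the end of the text. */
        line = end ? end + 1 : line + len;
    }
    return -1;
}
```

With the contents of the two tables as input (without the quote
prefix), it gives 12 for eth0 under Xen and 18 on bare metal, while
acpi is found on IRQ 20 only on bare metal.
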
> Searching the Internet, I ran across several messages (e.g.
> http://www.mail-archive.com/kvm@xxxxxxxxxxxxxxx/msg26601.html)
> mentioning that on motherboards with the PIIX4 chipset the SCI
> interrupt is hardwired to IRQ 9. However, on my system it is assigned
> IRQ 20 on bare metal, and the attempt to assign IRQ 20 fails on top of
> Xen (see the dmesg extract above from the run on top of Xen -> ACPI: SCI
> (IRQ20) allocation failed).
>
> As I started wondering whether it would work with IRQ 9, and having no
> knowledge of ACPI and interrupt handling in the kernel, I crudely
> patched <Kernel-DIR>/drivers/acpi/osl.c in the following manner:
>
> osl.c:391
> *********
> acpi_status
> acpi_os_install_interrupt_handler(u32 gsi, acpi_osd_handler handler,
>                                   void *context)
> {
>         unsigned int irq;
>
>         acpi_irq_stats_init();
>
>         /*
>          * Ignore the GSI from the core, and use the value in our copy of the
>          * FADT. It may not be the same if an interrupt source override exists
>          * for the SCI.
>          */
>         gsi = acpi_gbl_FADT.sci_interrupt;
>         if (acpi_gsi_to_irq(gsi, &irq) < 0) {
>                 printk(KERN_ERR PREFIX "SCI (ACPI GSI %d) not registered\n",
>                        gsi);
>                 return AE_OK;
>         }
> +       irq = 9;
>         acpi_irq_handler = handler;
>         acpi_irq_context = context;
>         if (request_irq(irq, acpi_irq, IRQF_SHARED, "acpi", acpi_irq)) {
>                 printk(KERN_ERR PREFIX "SCI (IRQ%d) allocation failed\n", irq);
>                 return AE_NOT_ACQUIRED;
>         }
>         acpi_irq_irq = irq;
>
>         return AE_OK;
> }
>
>
> As you can see, I just "overwrote" the IRQ number that the system had
> somehow determined with IRQ 9, recompiled the kernel and discovered(!)
> that networking was now working, even within Xen (btw: it was still
> working on bare metal).
>
> Now I don't know why it works with the SCI mapped to IRQ 20 on bare
> metal while the SCI is supposed to be hardwired to IRQ 9, but the fact
> that it works in both cases with IRQ 9 suggests to me that something is
> "wrong", or at least different, when the pv_ops kernel 2.6.31.6 is run
> on top of Xen. So perhaps someone could have a look at it at some
> point, because that's where my knowledge stops...
>
> Thanks & regards,
> Marcial
>
>
>
> _______________________________________________
> Xen-devel mailing list
> Xen-devel@xxxxxxxxxxxxxxxxxxx
> http://lists.xensource.com/xen-devel
>