Is this PAE or non-PAE?
Please can you try forcing emulation mode by toggling the "#if 0" in
ptwr_do_page_fault (arch/x86/mm.c)?
The other thing to try is modifying set_pte_pfn_ma to call xen_l1_update
rather than set_pte. You could try set_pte_at too.
This will help narrow down the issue.
Thanks,
Ian
> Keir, Ian,
> With PCI mmconfig option on, and with the PCI express
> enabled BIOS, the dom0 kernel reads the PCI config from
> fix-mapped PCI mmconfig space.
> The PCI mmconfig space is 256MB in size, and its access
> is implemented differently on i386 & x86_64. On x86_64 the
> whole 256MB is mapped in the kernel virtual address space. On
> i386 that would consume too much of the kernel's virtual address
> space, hence it is implemented using a single fix-mapped
> page. This page is remapped to the desired physical address for
> every PCI mmconfig access, as seen in the following code from
> mmconfig.c.
>
> static inline void pci_exp_set_dev_base(int bus, int devfn) {
>     u32 dev_base = pci_mmcfg_base_addr | (bus << 20) | (devfn << 12);
>     if (dev_base != mmcfg_last_accessed_device) {
>         mmcfg_last_accessed_device = dev_base;
>         set_fixmap_nocache(FIX_PCIE_MCFG, dev_base);
>     }
> }
>
> static int pci_mmcfg_read(unsigned int seg, unsigned int bus,
>         unsigned int devfn, int reg, int len, u32 *value) {
>     unsigned long flags;
>
>     if (!value || (bus > 255) || (devfn > 255) || (reg > 4095))
>         return -EINVAL;
>
>     spin_lock_irqsave(&pci_config_lock, flags);
>
>     pci_exp_set_dev_base(bus, devfn);
>
>     switch (len) {
>
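
The address arithmetic in pci_exp_set_dev_base() above can be checked in isolation. What follows is only a sketch: EXAMPLE_MMCFG_BASE and both function names are made up for illustration (the real base comes from the platform, not from this mail). It shows that each (bus, devfn) pair selects its own 4KB page, and that 256 buses with 256 devfn values each add up to the 256MB window mentioned above.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical MMCONFIG base, for illustration only. */
#define EXAMPLE_MMCFG_BASE 0xE0000000u

/*
 * Physical address of the 4KB config page for (bus, devfn), using the
 * same expression as pci_exp_set_dev_base(): bus selects a 1MB slice
 * (bits 20-27), devfn selects a 4KB page within it (bits 12-19).
 */
uint32_t mmcfg_dev_base(uint32_t base, unsigned int bus, unsigned int devfn)
{
    assert(bus <= 255 && devfn <= 255);
    return base | ((uint32_t)bus << 20) | ((uint32_t)devfn << 12);
}

/* Size of the whole window: 256 buses * 256 devfn values * 4KB each. */
uint32_t mmcfg_window_size(void)
{
    return 256u * 256u * 4096u;
}
```

Because only one fixmap page is available on i386, every access to a different (bus, devfn) than the previous one changes dev_base and so rewrites the FIX_PCIE_MCFG pte.
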
> At boot time the PCI mmconfig space is accessed
> thousands of times, one access after another; this causes the
> fixmap entry to be mapped and remapped continuously, at a high
> rate, for a long time. Currently the fix-mapped virtual addresses
> for the dom0 shared_info page and the PCI mmconfig page are
> adjacent in the fixed_addresses enum in fixmap.h.
>
> #ifdef CONFIG_PCI_MMCONFIG
> FIX_PCIE_MCFG,
> #endif
> FIX_SHARED_INFO,
> FIX_GNTTAB_BEGIN,
>
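
To make the adjacency concrete, here is a rough sketch of how consecutive fixmap indices turn into consecutive virtual pages. The subtraction mirrors the kernel's __fix_to_virt() arithmetic; EXAMPLE_FIXADDR_TOP, the FIX_HOLE placeholder and the helper names are assumptions for illustration, and only the order of the last three enum entries is taken from the excerpt above.

```c
#include <stdint.h>

#define PAGE_SHIFT 12

/* Hypothetical top of the fixmap area; the real value depends on the build. */
#define EXAMPLE_FIXADDR_TOP 0xFFFFF000u

/* Cut-down enum: FIX_HOLE stands in for whatever entries come first. */
enum example_fixed_addresses {
    FIX_HOLE,
    FIX_PCIE_MCFG,
    FIX_SHARED_INFO,
    FIX_GNTTAB_BEGIN
};

/* Fixmap slots grow downward from the top, one page per index. */
uint32_t example_fix_to_virt(unsigned int idx)
{
    return EXAMPLE_FIXADDR_TOP - ((uint32_t)idx << PAGE_SHIFT);
}

/*
 * Which L1 page table covers a virtual address.
 * Non-PAE i386: one L1 table maps 4MB (shift 22); PAE: 2MB (shift 21).
 */
uint32_t example_l1_table_index(uint32_t vaddr, int pae)
{
    return vaddr >> (pae ? 21 : 22);
}
```

With these assumed values the two slots are exactly one page apart and are covered by the same L1 table in both the PAE and non-PAE layouts. Adjacent slots always share an L1 table unless they happen to straddle a 4MB (or 2MB) boundary.
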
> I suspect that this is causing a race condition
> because of writable page tables. While accessing the PCI
> mmconfig space on i386, the dom0 kernel (cpu 0) is continuously
> rewriting the pte for FIX_PCIE_MCFG at a very fast rate.
> With writable page tables the updates to ptes are deferred.
> In the SMP case other CPUs are taking interrupts (timer)
> at the same time, and the interrupt handlers access the
> shared_info page to notify dom0 of events such as the timer event.
> The problem is possibly that, because of writable page
> tables, the L1 page is getting evicted during the mmconfig
> access, and the shared_info translation needed for event
> notification is in the same L1 page. All the CPUs are
> using the same page tables at this time. While the pte is being
> written, the L2 page is getting cut off from the page table. This is
> somehow causing corruption in the dom0 page tables, and we
> see the errors.
> I believe this issue does not occur on x86_64 because there
> an mmconfig access does not map/unmap a fixmap entry, so the
> race on the L2 page is not present.
> The current workaround that works for me is to disable
> PCI_MMCONFIG for
> i386 in the xen0 kernel config. Sooner or later other people
> will also notice this corruption on SMP boxes with an SMP dom0.
> I can see it once in a while on a 4-way box.
>
> Can we disable PCI_MMCONFIG for i386 in the xen0 config till
> we solve the race condition issue? Attached is the patch for
> the config.
> As I have a workaround and I am seeing issues with VMX
> guests, I am trying to fix those issues now.
>
> Thanks & Regards,
> Nitin
> --------------------------------------------------------------
> Sr Software Engineer
> Open Source Technology Center, Intel Corp
>
> -----Original Message-----
> From: Kamble, Nitin A
> Sent: Tuesday, August 30, 2005 10:06 AM
> To: Keir Fraser
> Cc: xen-devel
> Subject: RE: [Xen-devel] Re: SMP dom0 with 8 cpus of i386
>
> > Default but with smp enabled.
> Same here. I am seeing the issue intermittently on a 4-way
> box. The 8-way system does not have any issue with maxcpus=1,
> but with 8 CPUs the failure is consistent. A larger number of
> CPUs seems to cause some corruption. It always happens while
> reading/writing the PCI mmconfig space.
> I am debugging here.
>
> Thanks & Regards,
> Nitin
>
>
_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxxxxxxxx
http://lists.xensource.com/xen-devel