WARNING - OLD ARCHIVES

This is an archived copy of the Xen.org mailing list, which we have preserved to ensure that existing links to archives are not broken. The live archive, which contains the latest emails, can be found at http://lists.xen.org/
xen-devel

RE: [Xen-devel] [PATCH] Re: SMP dom0 with 8 cpus of i386

To: "Kamble, Nitin A" <nitin.a.kamble@xxxxxxxxx>, "Keir Fraser" <Keir.Fraser@xxxxxxxxxxxx>
Subject: RE: [Xen-devel] [PATCH] Re: SMP dom0 with 8 cpus of i386
From: "Ian Pratt" <m+Ian.Pratt@xxxxxxxxxxxx>
Date: Thu, 1 Sep 2005 01:15:22 +0100
Cc: xen-devel <xen-devel@xxxxxxxxxxxxxxxxxxx>
Delivery-date: Thu, 01 Sep 2005 00:13:22 +0000
Envelope-to: www-data@xxxxxxxxxxxxxxxxxxx
List-help: <mailto:xen-devel-request@lists.xensource.com?subject=help>
List-id: Xen developer discussion <xen-devel.lists.xensource.com>
List-post: <mailto:xen-devel@lists.xensource.com>
List-subscribe: <http://lists.xensource.com/cgi-bin/mailman/listinfo/xen-devel>, <mailto:xen-devel-request@lists.xensource.com?subject=subscribe>
List-unsubscribe: <http://lists.xensource.com/cgi-bin/mailman/listinfo/xen-devel>, <mailto:xen-devel-request@lists.xensource.com?subject=unsubscribe>
Sender: xen-devel-bounces@xxxxxxxxxxxxxxxxxxx
Thread-index: AcWtPJLK72L6LhUaRXe+r95yGbS2fAASAyNAAA8A6hAAMhKaAA==
Thread-topic: [Xen-devel] [PATCH] Re: SMP dom0 with 8 cpus of i386
 
Is this PAE or non-PAE?

Please can you try forcing emulation mode by toggling the "#if 0" in
arch/x86/mm.c ptwr_do_page_fault.

The other thing to try is modifying set_pte_pfn_ma to call xen_l1_update
rather than set_pte. You could try set_pte_at too.

This will help narrow down the issue.

Thanks,
Ian

> Keir, Ian,
>    With PCI mmconfig option on, and with the PCI express 
> enabled BIOS, the dom0 kernel reads the PCI config from 
> fix-mapped PCI mmconfig space.
>    The PCI mmconfig space is 256MB in size, and its access 
> is implemented differently on i386 & x86_64. On x86_64 the 
> whole 256MB is mapped into the kernel virtual address space. On 
> i386 that would consume too much of the kernel's virtual address 
> space, so it is implemented using a single fix-mapped 
> page. This page is remapped to the desired physical address for 
> every PCI mmconfig access, as seen in the following code from 
> mmconfig.c:
> 
> static inline void pci_exp_set_dev_base(int bus, int devfn) {
>     u32 dev_base = pci_mmcfg_base_addr | (bus << 20) | (devfn << 12);
>     if (dev_base != mmcfg_last_accessed_device) {
>         mmcfg_last_accessed_device = dev_base;
>         set_fixmap_nocache(FIX_PCIE_MCFG, dev_base);
>     }
> }
> 
> static int pci_mmcfg_read(unsigned int seg, unsigned int bus,
>               unsigned int devfn, int reg, int len, u32 *value) {
>     unsigned long flags;
> 
>     if (!value || (bus > 255) || (devfn > 255) || (reg > 4095))
>         return -EINVAL;
> 
>     spin_lock_irqsave(&pci_config_lock, flags);
> 
>     pci_exp_set_dev_base(bus, devfn);
> 
>     switch (len) {
> 
>    At boot time the PCI mmconfig space is accessed 
> thousands of times in quick succession, causing the fixmap 
> page to be remapped continuously at a very high rate. Currently 
> the fix-mapped virtual addresses for the shared_info page of 
> dom0 & the PCI mmconfig page are adjacent in the 
> fixed_addresses enum in fixmap.h:
> 
> #ifdef CONFIG_PCI_MMCONFIG
>     FIX_PCIE_MCFG,
> #endif
>     FIX_SHARED_INFO,
>     FIX_GNTTAB_BEGIN,
> 
>    I suspect this is causing a race condition because of 
> writable page tables. While accessing the PCI mmconfig on 
> i386, the dom0 kernel (cpu 0) is continuously rewriting the 
> pte for FIX_PCIE_MCFG at a very fast rate. With writable 
> page tables the updates to ptes are deferred. In the SMP 
> case the other CPUs are receiving interrupts (e.g. timer) at 
> the same time, and the interrupt handlers access the 
> shared_info page to notify dom0 of events such as the timer 
> event. The problem, possibly, is that because of writable 
> page tables the L1 page is being unhooked during the 
> mmconfig access, and the shared_info translation needed for 
> event notification lives in that same L1 page. All the cpus 
> are using the same page tables at this time, so while the 
> pte is being written, the L1 page is cut off from the L2 
> page table. This is somehow corrupting the dom0 page tables, 
> and we see the errors.
>    I believe this issue is not present on x86_64 because 
> there each mmconfig access does not remap the fixmap, so the 
> race on the L1 page does not arise.
>    The current workaround for me is to disable 
> PCI_MMCONFIG for i386 in the xen0 kernel config. Sooner or 
> later other people will also notice this corruption on SMP 
> boxes with SMP dom0; I can see it once in a while on a 4 way 
> box. 
> 
> Can we disable PCI_MMCONFIG for i386 in the xen0 config till 
> we solve the race condition issue? Attached is the patch for 
> the config.
>    As I have a workaround, and I am seeing issues with VMX 
> guests, I am trying to fix those issues now.
> 
> Thanks & Regards,
> Nitin
> --------------------------------------------------------------------------
> Sr Software Engineer
> Open Source Technology Center, Intel Corp
> 
> -----Original Message-----
> From: Kamble, Nitin A
> Sent: Tuesday, August 30, 2005 10:06 AM
> To: Keir Fraser
> Cc: xen-devel
> Subject: RE: [Xen-devel] Re: SMP dom0 with 8 cpus of i386
> 
> > Default but with smp enabled.
> Same here. I am seeing the issue inconsistently on a 4 way 
> box. The 8 way system does not have any issue with 
> maxcpus=1; with 8 cpus it is consistent. A higher number of 
> cpus causes some corruption. It always happens at the time 
> of reading/writing the PCI mmconfig space.
>   I am debugging it here. 
> 
> Thanks & Regards,
> Nitin
> 
> 

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxxxxxxxx
http://lists.xensource.com/xen-devel
