WARNING - OLD ARCHIVES

This is an archived copy of the Xen.org mailing list, which we have preserved to ensure that existing links to archives are not broken. The live archive, which contains the latest emails, can be found at http://lists.xen.org/
   
 
 
 
   
 

xen-devel

RE: [Xen-devel] [PATCH] iomem: Prevent Dom0 pci bus from allocating RAM

To: "Zhang, Fengzhe" <fengzhe.zhang@xxxxxxxxx>, Konrad Rzeszutek Wilk <konrad.wilk@xxxxxxxxxx>
Subject: RE: [Xen-devel] [PATCH] iomem: Prevent Dom0 pci bus from allocating RAM as I/O space
From: "Li, Xin" <xin.li@xxxxxxxxx>
Date: Mon, 21 Feb 2011 18:35:05 +0800
Accept-language: zh-CN, en-US
Cc: xen-devel <xen-devel@xxxxxxxxxxxxxxxxxxx>, "Dong, Eddie" <eddie.dong@xxxxxxxxx>
Delivery-date: Mon, 21 Feb 2011 02:46:50 -0800
Envelope-to: www-data@xxxxxxxxxxxxxxxxxxx
In-reply-to: <4D6121B6.2060109@xxxxxxxxx>
List-help: <mailto:xen-devel-request@lists.xensource.com?subject=help>
List-id: Xen developer discussion <xen-devel.lists.xensource.com>
List-post: <mailto:xen-devel@lists.xensource.com>
List-subscribe: <http://lists.xensource.com/mailman/listinfo/xen-devel>, <mailto:xen-devel-request@lists.xensource.com?subject=subscribe>
List-unsubscribe: <http://lists.xensource.com/mailman/listinfo/xen-devel>, <mailto:xen-devel-request@lists.xensource.com?subject=unsubscribe>
References: <1A42CE6F5F474C41B63392A5F80372B2335E978C@xxxxxxxxxxxxxxxxxxxxxxxxxxxx> <20110218220558.GA18213@xxxxxxxxxxxx> <4D6121B6.2060109@xxxxxxxxx>
Sender: xen-devel-bounces@xxxxxxxxxxxxxxxxxxx
Thread-index: AcvRCHk/eX0gXPs/R1OPgq+HYF83gAAdY0Vg
Thread-topic: [Xen-devel] [PATCH] iomem: Prevent Dom0 pci bus from allocating RAM as I/O space
I've been thinking about how this issue arises.

For most devices, the MMIO resources are allocated by the BIOS, so it is fine 
for dom0 to use the PFN as the MFN, because dom0 is trusted.

However, for devices like i915, whose drivers need to allocate MMIO when 
running in dom0, the issue Fengzhe is trying to fix may pop up. The dom0 kernel 
tries to allocate the MMIO resource from holes in its own e820 table (not the 
real hardware e820), and in Fengzhe's case below, where dom0 memory is limited 
to a smaller value, such a hole is actually backed by RAM.
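
To make the failure mode concrete, here is a minimal userspace sketch in C (not 
kernel code) of the hole search described above. The struct region type, the 
helper names largest_hole_below_4g() and overlaps(), and the choice to treat 
every map entry as occupied are assumptions made for illustration; the real 
path goes through pci_bus_alloc_resource() and is more involved. The address 
ranges are copied from the E820 maps quoted further down in this mail.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* One entry of a simplified, hypothetical E820-style map: [start, end). */
struct region {
    uint64_t start;
    uint64_t end;
};

#define FOUR_GB 0x100000000ULL

/* Host RAM below 4GB, from the Xen-e820 map quoted below. */
static const struct region host_ram = { 0x00100000ULL, 0xdefafe00ULL };

/* Dom0 map before patching (dom0 limited to 1GB). */
static const struct region dom0_before[] = {
    { 0x00000000ULL, 0x00097c00ULL }, { 0x00097c00ULL, 0x00100000ULL },
    { 0x00100000ULL, 0x40000000ULL }, { 0xdefafe00ULL, 0xdefb1ea0ULL },
    { 0xdefb1ea0ULL, 0xe0000000ULL }, { 0xf4000000ULL, 0xf8000000ULL },
    { 0xfec00000ULL, 0xfed40000ULL }, { 0xfed45000ULL, 0x100000000ULL },
    { 0x100000000ULL, 0x1bafaf000ULL },
};

/* Dom0 map after patching: same, plus the "unusable" entry. */
static const struct region dom0_after[] = {
    { 0x00000000ULL, 0x00097c00ULL }, { 0x00097c00ULL, 0x00100000ULL },
    { 0x00100000ULL, 0x40000000ULL }, { 0x40000000ULL, 0xdefafe00ULL },
    { 0xdefafe00ULL, 0xdefb1ea0ULL }, { 0xdefb1ea0ULL, 0xe0000000ULL },
    { 0xf4000000ULL, 0xf8000000ULL }, { 0xfec00000ULL, 0xfed40000ULL },
    { 0xfed45000ULL, 0x100000000ULL }, { 0x100000000ULL, 0x1bafaf000ULL },
};

/*
 * Return the largest hole below 4GB in a map sorted by start address.
 * Every entry counts as occupied regardless of its type, so only addresses
 * that appear in no entry can be part of a hole.
 */
static struct region largest_hole_below_4g(const struct region *map, size_t n)
{
    struct region best = { 0, 0 };
    uint64_t cursor = 0; /* end of the highest entry seen so far */

    assert(map != NULL || n == 0);

    for (size_t i = 0; i < n && cursor < FOUR_GB; i++) {
        uint64_t gap_end = map[i].start < FOUR_GB ? map[i].start : FOUR_GB;

        if (gap_end > cursor && gap_end - cursor > best.end - best.start) {
            best.start = cursor;
            best.end = gap_end;
        }
        if (map[i].end > cursor)
            cursor = map[i].end;
    }
    /* Tail hole between the last entry and 4GB, if any. */
    if (cursor < FOUR_GB && FOUR_GB - cursor > best.end - best.start) {
        best.start = cursor;
        best.end = FOUR_GB;
    }
    return best;
}

/* True if two half-open ranges share at least one address. */
static int overlaps(struct region a, struct region b)
{
    return a.start < b.end && b.start < a.end;
}
```

Applied to dom0_before, the largest hole is 0x40000000-0xdefafe00, which 
overlaps host_ram. Applied to dom0_after, it is 0xe0000000-0xf4000000, which 
does not.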

So in such cases the assumption that MFN == PFN is broken. Possible solutions 
are:
1) Use a hypercall to allocate MMIO from Xen/real hardware when dom0 allocates 
MMIO, and also add the mappings to dom0's p2m.  But this requires hacking the 
dom0 driver at the point where it programs the PFN into the device.
2) Fengzhe's solution: mark hardware RAM that dom0 does not own as unusable in 
dom0's e820 table, to avoid the conflict and make MFN == PFN true again.  No 
driver changes are required.
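
As a rough illustration of option 2, the following userspace C sketch models 
what the quoted patch does to a single RAM entry. It is a simplified model 
under stated assumptions, not the kernel code: struct entry, enum entry_type, 
LIMIT_4G and clamp_ram_entry() are hypothetical names, and the real logic 
lives in xen_memory_setup() and calls e820_add_region().

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified stand-ins for the kernel's E820 types. */
enum entry_type { ENTRY_RAM, ENTRY_UNUSABLE };

struct entry {
    uint64_t start;
    uint64_t size;
    enum entry_type type;
};

#define LIMIT_4G 0x100000000ULL

/*
 * Clip a host RAM entry at mem_end (the end of dom0's own RAM). If something
 * was clipped and the clipped tail starts below 4GB, also emit an entry that
 * marks the tail as unusable, so it can no longer look like a hole.
 * Returns the number of entries written to out (1 or 2).
 */
static size_t clamp_ram_entry(struct entry in, uint64_t mem_end,
                              struct entry out[2])
{
    uint64_t end = in.start + in.size;
    uint64_t delta = 0;
    size_t n = 0;

    assert(in.type == ENTRY_RAM);

    if (end > mem_end) {
        /* RAM off the end of dom0's allocation: may be partially included. */
        uint64_t over = end - mem_end;

        delta = in.size < over ? in.size : over;
        in.size -= delta;
        end -= delta;
    }

    out[n++] = in;
    if (delta && end < LIMIT_4G) {
        struct entry tail = { end, delta, ENTRY_UNUSABLE };

        out[n++] = tail;
    }
    return n;
}
```

With the host RAM entry 0x00100000-0xdefafe00 and a 1GB limit (mem_end = 
0x40000000), this yields a RAM entry ending at 0x40000000 plus an unusable 
entry covering 0x40000000-0xdefafe00, which matches the "after patching" map 
quoted below. A RAM entry that starts at 4GB is clipped without an unusable 
entry being added.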

I don't know whether I've missed something important here; please correct me 
if I have.

Also, are there any other proposals?

Thanks!
-Xin



> -----Original Message-----
> From: Zhang, Fengzhe
> Sent: Sunday, February 20, 2011 10:14 PM
> To: Konrad Rzeszutek Wilk
> Cc: xen-devel; Dong, Eddie; Li, Xin
> Subject: Re: [Xen-devel] [PATCH] iomem: Prevent Dom0 pci bus from allocating RAM as I/O space
> 
> On 2011/2/19 6:05, Konrad Rzeszutek Wilk wrote:
> > On Wed, Feb 16, 2011 at 10:26:20PM +0800, Zhang, Fengzhe wrote:
> >> iomem: Prevent Dom0 pci bus from allocating RAM as I/O space
> >>
> >> In Dom0, the pci bus dynamically allocates I/O address resources from the
> >> memory hole within the 4GB physical space, which can be RAM space not
> >> allocated to Dom0. This patch sets physical RAM space to be unusable in the
> >> Dom0 E820 map if it is not owned by Dom0, to prevent it from being misused
> >> as I/O address space. Dom0 is assumed to look for MMIO space only below 4GB.
> >> If this assumption is broken, additional fixes are required.
> >
> > So I am coming back to your patch trying to understand how it makes
> > the intel-agp crash go away. What this patch in effect does is
> > inhibit ioremap from inadvertently mapping System RAM, which it could do
> > before this - b/c it considered the "zapped" System RAM (so e820->size = 0)
> > as a gap.
> >
> > But looking at how the intel-agp.c code works, it seems like it wouldn't
> > really care about it initially. It figures out where the physical GTT is by
> > poking the "intel_private.registers+I810_PGETBL_CTL". Earlier we also make a
> > call in "agp_intel_probe" to do "pci_assign_resource(pdev, 0)" which,
> > regardless of this patch, would work (since the BARs don't change
> > with this patch? or do they?).
> >
> > The ioremap which is done in "intel_i9xx_setup_flush" would now potentially
> > _not_ work since the region the GTT is in might be in the RAM region, and
> > ioremap would consult the e820 table and find it is "UNUSABLE". We then would
> > continue on without the flush page with "can't ioremap flush page - no
> > chipset flushing"
> 
> intel_i9xx_setup_flush checks whether the IFP (Intel Flush Page) BAR is 0 at
> machine bootup and, if so, calls pci_bus_alloc_resource to get an I/O memory
> page for it. pci_bus_alloc_resource allocates the I/O address resource from
> the largest hole it finds in the e820 map. Before patching, the hole would be
> from the top of dom0 RAM to the ACPI space, which overlaps with real RAM.
> After patching, the hole no longer overlaps with any real RAM.
> 
> The following is the E820 map on the test machine with 4GB memory:
> (XEN) Xen-e820 RAM map:
> (XEN)  0000000000000000 - 0000000000097c00 (usable)
> (XEN)  0000000000097c00 - 00000000000a0000 (reserved)
> (XEN)  00000000000e8000 - 0000000000100000 (reserved)
> (XEN)  0000000000100000 - 00000000defafe00 (usable)
> (XEN)  00000000defafe00 - 00000000defb1ea0 (ACPI NVS)
> (XEN)  00000000defb1ea0 - 00000000e0000000 (reserved)
> (XEN)  00000000f4000000 - 00000000f8000000 (reserved)
> (XEN)  00000000fec00000 - 00000000fed40000 (reserved)
> (XEN)  00000000fed45000 - 0000000100000000 (reserved)
> (XEN)  0000000100000000 - 000000011c000000 (usable)
> 
> The following is the E820 map seen in Dom0 before patching:
> (Dom0 assigned with 1GB mem)
> [    0.000000] BIOS-provided physical RAM map:
> [    0.000000]  Xen: 0000000000000000 - 0000000000097c00 (usable)
> [    0.000000]  Xen: 0000000000097c00 - 0000000000100000 (reserved)
> [    0.000000]  Xen: 0000000000100000 - 0000000040000000 (usable)
> [    0.000000]  Xen: 00000000defafe00 - 00000000defb1ea0 (ACPI NVS)
> [    0.000000]  Xen: 00000000defb1ea0 - 00000000e0000000 (reserved)
> [    0.000000]  Xen: 00000000f4000000 - 00000000f8000000 (reserved)
> [    0.000000]  Xen: 00000000fec00000 - 00000000fed40000 (reserved)
> [    0.000000]  Xen: 00000000fed45000 - 0000000100000000 (reserved)
> [    0.000000]  Xen: 0000000100000000 - 00000001bafaf000 (usable)
> 
> The following is the E820 map seen in Dom0 after patching:
> (Dom0 assigned with 1GB mem)
> [    0.000000] BIOS-provided physical RAM map:
> [    0.000000]  Xen: 0000000000000000 - 0000000000097c00 (usable)
> [    0.000000]  Xen: 0000000000097c00 - 0000000000100000 (reserved)
> [    0.000000]  Xen: 0000000000100000 - 0000000040000000 (usable)
> [    0.000000]  Xen: 0000000040000000 - 00000000defafe00 (unusable)
> [    0.000000]  Xen: 00000000defafe00 - 00000000defb1ea0 (ACPI NVS)
> [    0.000000]  Xen: 00000000defb1ea0 - 00000000e0000000 (reserved)
> [    0.000000]  Xen: 00000000f4000000 - 00000000f8000000 (reserved)
> [    0.000000]  Xen: 00000000fec00000 - 00000000fed40000 (reserved)
> [    0.000000]  Xen: 00000000fed45000 - 0000000100000000 (reserved)
> [    0.000000]  Xen: 0000000100000000 - 00000001bafaf000 (usable)
> 
> The following is the /proc/iomem layout before patching:
> (Dom0 assigned with 1GB mem)
> 00000000-00000fff : System RAM
> 00001000-00005fff : reserved
> 00006000-00097bff : System RAM
> 00097c00-000fffff : reserved
> 00100000-7fffffff : System RAM
>    01000000-01513391 : Kernel code
>    01513392-018713d7 : Kernel data
>    018fa000-019c9483 : Kernel bss
> 40000000-401fffff : PCI Bus 0000:01
> 40200000-403fffff : PCI Bus 0000:01
> 40400000-405fffff : PCI Bus 0000:02
> 40600000-407fffff : PCI Bus 0000:02
> 40800000-408fffff : PCI Bus 0000:03
>    40800000-408fffff : 0000:03:04.0
> 40900000-40900fff : Intel Flush Page    //notice this line
> defafe00-defb1e9f : ACPI Non-volatile Storage
> defb1ea0-dfffffff : reserved
> e0000000-efffffff : 0000:00:02.0
> f0000000-f01fffff : PCI Bus 0000:03
>    f0000000-f00fffff : 0000:03:04.0
>      f0000000-f00fffff : e100
>    f0100000-f0100fff : 0000:03:04.0
>      f0100000-f0100fff : e100
> f0200000-f02fffff : 0000:00:02.0
> f0300000-f037ffff : 0000:00:02.0
> f0380000-f03fffff : 0000:00:02.1
> f0400000-f041ffff : 0000:00:19.0
>    f0400000-f041ffff : e1000e
> f0420000-f0423fff : 0000:00:1b.0
>    f0420000-f0423fff : ICH HD audio
> f0424000-f0424fff : 0000:00:03.3
> f0425000-f0425fff : 0000:00:19.0
>    f0425000-f0425fff : e1000e
> f0426000-f04267ff : 0000:00:1f.2
>    f0426000-f04267ff : ahci
> f0426800-f0426bff : 0000:00:1a.7
>    f0426800-f0426bff : ehci_hcd
> f0426c00-f0426fff : 0000:00:1d.7
>    f0426c00-f0426fff : ehci_hcd
> f0427100-f042710f : 0000:00:03.0
> f4000000-f7ffffff : PCI MMCONFIG 0 [00-3f]
>    f4000000-f7ffffff : reserved
>      f4000000-f7ffffff : pnp 00:10
> fec00000-fed3ffff : reserved
>    fec00000-fec00fff : IOAPIC 0
>    fec01000-fecfffff : pnp 00:10
>    fed00000-fed003ff : HPET 0
>    fed00400-fed3ffff : pnp 00:10
> fed45000-ffffffff : reserved
>    fed45000-ffffffff : pnp 00:10
> 100000000-17afaefff : System RAM
> 17afaf000-17bffffff : RAM buffer
> 
> The following is the /proc/iomem layout after patching:
> (Dom0 assigned with 1GB mem)
> 00000000-00000fff : System RAM
> 00001000-00005fff : reserved
> 00006000-00097bff : System RAM
> 00097c00-000fffff : reserved
> 00100000-3fffffff : System RAM
>    01000000-015134f1 : Kernel code
>    015134f2-018713d7 : Kernel data
>    018fa000-019c9483 : Kernel bss
> 40000000-defafdff : Unusable memory
> defafe00-defb1e9f : ACPI Non-volatile Storage
> defb1ea0-dfffffff : reserved
> e0000000-efffffff : 0000:00:02.0
> f0000000-f01fffff : PCI Bus 0000:03
>    f0000000-f00fffff : 0000:03:04.0
>      f0000000-f00fffff : e100
>    f0100000-f0100fff : 0000:03:04.0
>      f0100000-f0100fff : e100
> f0200000-f02fffff : 0000:00:02.0
> f0300000-f037ffff : 0000:00:02.0
> f0380000-f03fffff : 0000:00:02.1
> f0400000-f041ffff : 0000:00:19.0
>    f0400000-f041ffff : e1000e
> f0420000-f0423fff : 0000:00:1b.0
>    f0420000-f0423fff : ICH HD audio
> f0424000-f0424fff : 0000:00:03.3
> f0425000-f0425fff : 0000:00:19.0
>    f0425000-f0425fff : e1000e
> f0426000-f04267ff : 0000:00:1f.2
>    f0426000-f04267ff : ahci
> f0426800-f0426bff : 0000:00:1a.7
>    f0426800-f0426bff : ehci_hcd
> f0426c00-f0426fff : 0000:00:1d.7
>    f0426c00-f0426fff : ehci_hcd
> f0427100-f042710f : 0000:00:03.0
> f0428000-f0428fff : Intel Flush Page    //notice this line
> f0500000-f06fffff : PCI Bus 0000:01
> f0700000-f08fffff : PCI Bus 0000:01
> f0900000-f0afffff : PCI Bus 0000:02
> f0b00000-f0cfffff : PCI Bus 0000:02
> f0d00000-f0dfffff : PCI Bus 0000:03
>    f0d00000-f0dfffff : 0000:03:04.0
> f4000000-f7ffffff : PCI MMCONFIG 0 [00-3f]
>    f4000000-f7ffffff : reserved
>      f4000000-f7ffffff : pnp 00:10
> fec00000-fed3ffff : reserved
>    fec00000-fec00fff : IOAPIC 0
>    fec01000-fecfffff : pnp 00:10
>    fed00000-fed003ff : HPET 0
>    fed00400-fed3ffff : pnp 00:10
> fed45000-ffffffff : reserved
>    fed45000-ffffffff : pnp 00:10
> 100000000-1bafaefff : System RAM
> 1bafaf000-1bbffffff : RAM buffer
> 
> -Fengzhe
> 
> >
> > Is that really what this patch achieves? How is this related
> > to the igb driver?
> >
> > Can you provide a serial output of before this patch, and after this
> > patch? I am really curious to see how the intel agp functions.
> >
> >
> >>
> >> Signed-off-by: Fengzhe Zhang <fengzhe.zhang@xxxxxxxxx>
> >>
> >> diff --git a/arch/x86/xen/setup.c b/arch/x86/xen/setup.c
> >> index 1a1934a..f1a3896 100644
> >> --- a/arch/x86/xen/setup.c
> >> +++ b/arch/x86/xen/setup.c
> >> @@ -189,6 +189,16 @@ char * __init xen_memory_setup(void)
> >>                    end -= delta;
> >>
> >>                    extra_pages += PFN_DOWN(delta);
> >> +
> >> +                  /*
> >> +                   * Set RAM below 4GB that are not owned by Dom0 to be unusable.
> >> +                   * This prevents RAM-backed address space from being used as
> >> +                   * I/O address in Dom0. Dom0 is assumed to look for MMIO
> >> +                   * space only below 4GB. If this assumption is broken, additional
> >> +                   * fixes are required.
> >> +                   */
> >> +                  if (delta && end < 0x100000000UL)
> >> +                          e820_add_region(end, delta, E820_UNUSABLE);
> >>            }
> >>
> >>            if (map[i].size > 0 && end > xen_extra_mem_start)
> >
> >
> >> _______________________________________________
> >> Xen-devel mailing list
> >> Xen-devel@xxxxxxxxxxxxxxxxxxx
> >> http://lists.xensource.com/xen-devel
> >


