xen-devel

Re: [Xen-devel] [PATCH] scrub pages on guest termination

To: Ben Guthro <bguthro@xxxxxxxxxxxxxxx>
Subject: Re: [Xen-devel] [PATCH] scrub pages on guest termination
From: Keir Fraser <keir.fraser@xxxxxxxxxxxxx>
Date: Fri, 23 May 2008 18:19:25 +0100
Cc: xen-devel <xen-devel@xxxxxxxxxxxxxxxxxxx>, Robert Phillips <rphillips@xxxxxxxxxxxxxxx>
Delivery-date: Fri, 23 May 2008 10:19:44 -0700
Envelope-to: www-data@xxxxxxxxxxxxxxxxxx
In-reply-to: <4836F85C.1010609@xxxxxxxxxxxxxxx>
List-help: <mailto:xen-devel-request@lists.xensource.com?subject=help>
List-id: Xen developer discussion <xen-devel.lists.xensource.com>
List-post: <mailto:xen-devel@lists.xensource.com>
List-subscribe: <http://lists.xensource.com/cgi-bin/mailman/listinfo/xen-devel>, <mailto:xen-devel-request@lists.xensource.com?subject=subscribe>
List-unsubscribe: <http://lists.xensource.com/cgi-bin/mailman/listinfo/xen-devel>, <mailto:xen-devel-request@lists.xensource.com?subject=unsubscribe>
Sender: xen-devel-bounces@xxxxxxxxxxxxxxxxxxx
Thread-index: Aci8+SX4ZE8sZyjsEd284wAWy6hiGQ==
Thread-topic: [Xen-devel] [PATCH] scrub pages on guest termination
User-agent: Microsoft-Entourage/11.4.0.080122
On 23/5/08 18:01, "Ben Guthro" <bguthro@xxxxxxxxxxxxxxx> wrote:

> Yes, sorry - I should have removed our terminology from the description.
> Node = physical machine
> VS = HVM guest w/ pv-on-hvm drivers
> Looking back at the original bug report - it seems to indicate it was migrating from a system with 2 processors to one with 8.

It’s very surprising that lock contention would cause such a severe lack of progress on an 8-CPU system. If the lock is that hotly contended then even the usage of it in free_domheap_pages() has to be questionable.

I’m inclined to say that if we want to address this then we should do it in one or more of the following ways:
 1. Count CPUs into the scrub function with an atomic_t; beyond a limit, all other CPUs bail straight out after re-setting their timer (a rough sketch combining options 1-3 follows the list).
 2. Increase the scrub batch size to reduce the proportion of time that each loop iteration holds the lock.
 3. Turn the spin_lock() into a spin_trylock() so that the timeout check can be guaranteed to execute frequently.
 4. Eliminate the global lock by building a lock-free linked list, or by maintaining per-CPU hashed work queues with work stealing, or... etc.
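
For concreteness, here is a rough sketch of how options 1-3 could combine in the scrub softirq handler. This is purely illustrative and not the patch under discussion: the names and includes (page_scrub_softirq, page_scrub_list, page_scrub_lock, scrub_one_page, page_scrub_timer, MAX_SCRUB_CPUS, SCRUB_BATCH) are loosely modelled on the scrubber in common/page_alloc.c and should be read as assumptions.

#include <xen/list.h>
#include <xen/spinlock.h>
#include <xen/softirq.h>
#include <xen/timer.h>
#include <xen/time.h>
#include <xen/mm.h>
#include <asm/atomic.h>

/* Illustrative only: names modelled loosely on common/page_alloc.c. */
#define MAX_SCRUB_CPUS 2   /* option 1: cap on concurrent scrubbers     */
#define SCRUB_BATCH    64  /* option 2: pages detached per lock hold    */

static atomic_t scrub_cpus = ATOMIC_INIT(0);
static LIST_HEAD(page_scrub_list);                 /* dirty pages to scrub */
static DEFINE_SPINLOCK(page_scrub_lock);
static DEFINE_PER_CPU(struct timer, page_scrub_timer);

extern void scrub_one_page(struct page_info *pg);  /* clears one page */

static void page_scrub_softirq(void)
{
    LIST_HEAD(batch);
    struct page_info *pg;
    int i;

    /* Option 1: if enough CPUs are already scrubbing, just back off. */
    if ( atomic_read(&scrub_cpus) < MAX_SCRUB_CPUS )
    {
        atomic_inc(&scrub_cpus);

        /* Option 3: never spin on the global lock; retry on next timer. */
        if ( spin_trylock(&page_scrub_lock) )
        {
            /* Option 2: detach a whole batch under a single lock hold. */
            for ( i = 0; (i < SCRUB_BATCH) &&
                         !list_empty(&page_scrub_list); i++ )
                list_move_tail(page_scrub_list.next, &batch);
            spin_unlock(&page_scrub_lock);

            /* Scrub the detached pages with the lock dropped. */
            while ( !list_empty(&batch) )
            {
                pg = list_entry(batch.next, struct page_info, list);
                list_del(&pg->list);
                scrub_one_page(pg);
            }
        }

        atomic_dec(&scrub_cpus);
    }

    /* Re-arm the per-CPU timer so any remaining work gets picked up. */
    set_timer(&this_cpu(page_scrub_timer), NOW() + MILLISECS(10));
}

The point is simply that each entry into the handler holds the lock for one bounded critical section, does a bounded amount of work, and otherwise gets out of the way, so no CPU ends up stuck behind the scrubber.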

The patch as-is at least suffers from the issue that the ‘primary scrubber’ should be regularly checking for softirq work. But I’m not sure such a sizeable change to the scheduling policy for scrubbing (such as it is!) is necessary or desirable.
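
To make the softirq point concrete - this is only a sketch, assuming the patch's 'primary scrubber' sits in one long loop draining the dirty list, and using the existing softirq_pending()/smp_processor_id() interfaces:

    /* Inside the primary scrubber's main loop (illustrative): yield as
     * soon as other softirq work is pending on this CPU, leaving the
     * rest of the list for a later pass or for other CPUs. */
    if ( softirq_pending(smp_processor_id()) )
        break;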

Option 4 is on the morally highest ground but is of course the most work. :-)

 -- Keir
_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxxxxxxxx
http://lists.xensource.com/xen-devel