

[Xen-devel] Linux balloon driver stops accepting target_kb for a long time

To: xen-devel@xxxxxxxxxxxxxxxxxxx
Subject: [Xen-devel] Linux balloon driver stops accepting target_kb for a long time
From: Dan Magenheimer <dan.magenheimer@xxxxxxxxxx>
Date: Mon, 23 Aug 2010 15:45:53 -0700 (PDT)
Cc: jeremy@xxxxxxxx, Keir Fraser <Keir.Fraser@xxxxxxxxxxxxx>, JBeulich@xxxxxxxxxx
Balloon experts --

I'm seeing a strange problem in either the balloon driver
or in the Xen code that provides the support for it... still
trying to narrow down which.

The problem appears when I am running in-kernel selfballooning
code, and then only rarely. I'm not sure exactly what conditions
are required, but for a long period of time (>30 minutes), writing
to target_kb inside a PV guest has no effect at all on the
memory size of the VM (as viewed inside the guest with "free -k")!
Under most conditions, writing to target_kb changes the memory
size "immediately", but once in this state it has no effect at all.
At the end of this long period, everything is suddenly
back to normal, and there's no obvious trigger that
signals the return to normalcy.
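
For anyone trying to catch the guest in this state, here is a minimal
sketch of a probe that writes a new target and then samples MemTotal
from /proc/meminfo (the total that "free -k" reports). The sysfs path
is an assumption (the message above only names "target_kb"), and the
helper names, the 1024 kB threshold and the sampling interval are
hypothetical choices for illustration only.

```python
import re
import time

# Assumed location of the balloon target node; adjust for the kernel in use.
TARGET_KB_PATH = "/sys/devices/system/xen_memory/xen_memory0/target_kb"


def mem_total_kb(meminfo_text):
    """Return the MemTotal value (in kB) from /proc/meminfo-style text."""
    match = re.search(r"^MemTotal:\s+(\d+)\s+kB", meminfo_text, re.MULTILINE)
    if match is None:
        raise ValueError("MemTotal not found")
    return int(match.group(1))


def is_stuck(samples_kb, min_change_kb=1024):
    """True if the sampled sizes moved by no more than min_change_kb."""
    if len(samples_kb) < 2:
        return False
    return max(samples_kb) - min(samples_kb) <= min_change_kb


def probe(new_target_kb, samples=10, interval_s=1.0):
    """Write a new target (needs root in the guest) and sample MemTotal."""
    with open(TARGET_KB_PATH, "w") as f:
        f.write(str(new_target_kb))
    seen = []
    for _ in range(samples):
        with open("/proc/meminfo") as f:
            seen.append(mem_total_kb(f.read()))
        time.sleep(interval_s)
    return seen, is_stuck(seen)
```

Calling probe() with a target well away from the current size and
logging the result with a timestamp should show when the guest enters
and leaves the unresponsive period.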

Note that though the problem is observed with selfballooning,
changing target_kb manually fails as well, so I suspect the
problem exists regardless of selfballooning but only
selfballooning is exercising the balloon sizing enough to
encounter the bug.

Reviewing the code, one thing caught my attention.  In balloon_process(),
the balloon_mutex is down'ed and then, under certain conditions,
schedule() is called with the balloon_mutex still held and without
another timer set.  Any chance this could be a problem, especially
if another kernel thread invokes balloon_set_new_target()?
If so, what might finally kick the scheduled-out thread after
30 minutes to reset the balloon_timer and up the mutex?
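
To make the suspected pattern concrete, here is a user-space analogy,
not the driver code: a worker sleeps while holding a lock, and a second
caller that needs the same lock makes no progress until something wakes
the worker. The names are hypothetical stand-ins, and whether
balloon_set_new_target() actually contends for balloon_mutex is exactly
the open question, so this is only a sketch of the hypothesis.

```python
import threading


def demo(hold_seconds=0.5, try_seconds=0.1):
    lock = threading.Lock()      # stands in for balloon_mutex
    holding = threading.Event()  # set once the worker owns the lock
    wake = threading.Event()     # stands in for whatever wakes the worker

    def worker():
        with lock:
            holding.set()
            # Sleeps with the lock still held, like the schedule() call
            # described above; nothing else is armed to wake it early.
            wake.wait(hold_seconds)

    t = threading.Thread(target=worker)
    t.start()
    holding.wait()

    # A second caller that needs the same lock cannot get it.
    blocked = not lock.acquire(timeout=try_seconds)
    if not blocked:
        lock.release()

    # Once the worker is woken and drops the lock, things are normal again.
    wake.set()
    t.join()
    recovered = lock.acquire(timeout=1.0)
    if recovered:
        lock.release()
    return blocked, recovered
```

If the real driver behaved this way, the length of the stall would be
set by whatever eventually wakes the sleeping thread, which would match
the sudden, trigger-less recovery described earlier.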

If this is wrong, any other ideas what might be causing
this weird problem?


P.S. This is the Linux 2.6.18-based balloon driver (with latest
patches from xen-unstable), but I may see if I can reproduce it
on an upstream balloon driver as well.
