WARNING - OLD ARCHIVES

This is an archived copy of the Xen.org mailing list, which we have preserved to ensure that existing links to archives are not broken. The live archive, which contains the latest emails, can be found at http://lists.xen.org/
   
 
 
 

xen-devel

[Xen-devel] Re: [PATCH] Fix cpu online/offline bug: mce memory leak.

To: Haitao Shan <maillists.shan@xxxxxxxxx>
Subject: [Xen-devel] Re: [PATCH] Fix cpu online/offline bug: mce memory leak.
From: Keir Fraser <keir.xen@xxxxxxxxx>
Date: Tue, 01 Mar 2011 10:25:38 +0000
Cc: "Liu, Jinsong" <jinsong.liu@xxxxxxxxx>, "xen-devel@xxxxxxxxxxxxxxxxxxx" <xen-devel@xxxxxxxxxxxxxxxxxxx>
Delivery-date: Tue, 01 Mar 2011 02:26:19 -0800
In-reply-to: <AANLkTikbfE5aakUS0geuwGuQnZFsoUkcSwaJm+VVO6gZ@xxxxxxxxxxxxxx>
Thread-topic: [PATCH] Fix cpu online/offline bug: mce memory leak.
User-agent: Microsoft-Entourage/12.28.0.101117
Well, it is the correct fix, certainly. The new notifier mechanism we
borrowed from Linux should enable us to remove all dynamic allocations from
a CPU's early bringup path. It's just a case of finding them all, as some are
tucked away in cpu/platform-specific areas (like this one).

There are a couple of alterations for Jinsong to make, after which he should
smoke-test the patch again. When he resubmits with the alterations, I will
check the patch straight in for the next release candidate (RC7).

 K.

On 01/03/2011 10:21, "Haitao Shan" <maillists.shan@xxxxxxxxx> wrote:

> Ah, yes. I was misled by the patch name itself. :)
> Smart fix, isn't it! I guess the issue is closed. Thanks, Jinsong.
> 
> Shan Haitao
> 
> 2011/3/1 Keir Fraser <keir.xen@xxxxxxxxx>
>> On 01/03/2011 10:12, "Haitao Shan" <maillists.shan@xxxxxxxxx> wrote:
>> 
>>> Hi, Keir,
>>> 
>>> Fixing a memory leak is always good, but I don't agree that this fixes the
>>> problem I observed.
>>> Yes, I did observe the issue during a stress test of cpu offline/online,
>>> offlining all cpus except CPU0 one by one and then bringing them back. In
>>> that case, with the memory leak, xmalloc at onlining could after some time
>>> hit a heap page formerly owned by domains, since usable Xen heap memory is
>>> slowly used up. Without the memory leak, xmalloc at onlining will likely
>>> reuse the memory just freed by offlining, so it won't trigger this
>>> assertion.
>>> 
>>> But let us consider a typical usage model: you offline all LPs on a socket
>>> so that it can later be removed physically, and some time later you bring
>>> them back. Between offlining and onlining there could be an interval as
>>> long as a week, with activity such as creating and killing many guests.
>>> How can we ensure that we won't hit this assertion at that time?
>>> 
>>> It is my understanding that this memory leak triggered the issue I raised,
>>> but the issue itself can be triggered in many other ways. Please do
>>> correct me if I am wrong. Thanks!
>> 
>> The point is that the patch also moves the xmalloc() calls out of the new
>> CPU's bringup path, and into CPU_UP_PREPARE context. This means the
>> allocations will occur on a different CPU, before the new CPU begins
>> execution, and with irqs enabled. That's why it should fix your crash.
>> 
>>  -- Keir
>> 
>>> Shan Haitao
>>> 
>>> 2011/3/1 Keir Fraser <keir.xen@xxxxxxxxx>
>>>> Some comments inline below. Overall a good patch to have, but needs some
>>>> minor adjustments!
>>>> 
>>>>  -- Keir
>>>> 
>>>> On 01/03/2011 09:09, "Liu, Jinsong" <jinsong.liu@xxxxxxxxx> wrote:
>>>> 
>>>>> Fix cpu online/offline bug: mce memory leak.
>>>>> 
>>>>> The current Xen MCE logic does not free the mcabanks structures, so
>>>>> memory is leaked each time a cpu goes offline.
>>>>> When cpu online/offline is repeated many times, this leak slowly shrinks
>>>>> the Xen heap, until at some point alloc_heap_pages calls
>>>>> flush_area_mask, which does ASSERT(local_irq_is_enabled()).
>>>>> However, cpu onlining runs with irqs disabled, so this finally results
>>>>> in a Xen crash.
>>>>> 
>>>>> This patch fixes the memory leak, and has been tested OK over 110,000
>>>>> rounds of cpu online/offline.
>>>>> 
>>>>> Signed-off-by: Liu, Jinsong <jinsong.liu@xxxxxxxxx>
>>>>> 
>>>>> diff -r 1a364b17d66a xen/arch/x86/cpu/mcheck/mce_intel.c
>>>>> --- a/xen/arch/x86/cpu/mcheck/mce_intel.c Fri Feb 25 01:26:01 2011 +0800
>>>>> +++ b/xen/arch/x86/cpu/mcheck/mce_intel.c Mon Feb 28 19:19:20 2011 +0800
>>>>> @@ -1227,9 +1227,24 @@ static void intel_init_mce(void)
>>>>>      mce_uhandler_num = sizeof(intel_mce_uhandlers)/sizeof(struct
>>>>> mca_error_handler);
>>>>>  }
>>>>> 
>>>>> -static int intel_init_mca_banks(void)
>>>>> +static void cpu_mcabank_free(unsigned int cpu)
>>>>>  {
>>>>> -    struct mca_banks *mb1, *mb2, * mb3;
>>>>> +    struct mca_banks *mb1, *mb2, *mb3, *mb4;
>>>>> +
>>>>> +    mb1 = per_cpu(mce_clear_banks, cpu);
>>>>> +    mb2 = per_cpu(no_cmci_banks, cpu);
>>>>> +    mb3 = per_cpu(mce_banks_owned, cpu);
>>>>> +    mb4 = per_cpu(poll_bankmask, cpu);
>>>>> +
>>>>> +    mcabanks_free(mb1);
>>>>> +    mcabanks_free(mb2);
>>>>> +    mcabanks_free(mb3);
>>>>> +    mcabanks_free(mb4);
>>>>> +}
>>>>> +
>>>>> +static void cpu_mcabank_alloc(unsigned int cpu)
>>>>> +{
>>>>> +    struct mca_banks *mb1, *mb2, *mb3;
>>>>> 
>>>>>      mb1 = mcabanks_alloc();
>>>>>      mb2 = mcabanks_alloc();
>>>>> @@ -1237,22 +1252,23 @@ static int intel_init_mca_banks(void)
>>>>>      if (!mb1 || !mb2 || !mb3)
>>>>>          goto out;
>>>>> 
>>>>> -    __get_cpu_var(mce_clear_banks) = mb1;
>>>>> -    __get_cpu_var(no_cmci_banks) = mb2;
>>>>> -    __get_cpu_var(mce_banks_owned) = mb3;
>>>>> +    per_cpu(mce_clear_banks, cpu) = mb1;
>>>>> +    per_cpu(no_cmci_banks, cpu) = mb2;
>>>>> +    per_cpu(mce_banks_owned, cpu) = mb3;
>>>>> +    return;
>>>>> 
>>>>> -    return 0;
>>>>>  out:
>>>>>      mcabanks_free(mb1);
>>>>>      mcabanks_free(mb2);
>>>>>      mcabanks_free(mb3);
>>>>> -    return -ENOMEM;
>>>>>  }
>>>>> 
>>>>>  /* p4/p6 family have similar MCA initialization process */
>>>>>  enum mcheck_type intel_mcheck_init(struct cpuinfo_x86 *c)
>>>>>  {
>>>>> -    if (intel_init_mca_banks())
>>>>> +    if ( !this_cpu(mce_clear_banks) ||
>>>>> +         !this_cpu(no_cmci_banks)   ||
>>>>> +         !this_cpu(mce_banks_owned) )
>>>>>           return mcheck_none;
>>>>> 
>>>>>      intel_init_mca(c);
>>>>> @@ -1301,13 +1317,19 @@ static int cpu_callback(
>>>>>  static int cpu_callback(
>>>>>      struct notifier_block *nfb, unsigned long action, void *hcpu)
>>>>>  {
>>>>> +    unsigned int cpu = (unsigned long)hcpu;
>>>>> +
>>>>>      switch ( action )
>>>>>      {
>>>>> +    case CPU_UP_PREPARE:
>>>>> +        cpu_mcabank_alloc(cpu);
>>>>> +        break;
>>>>>      case CPU_DYING:
>>>>>          cpu_mcheck_disable();
>>>>>          break;
>>>>>      case CPU_DEAD:
>>>>>          cpu_mcheck_distribute_cmci();
>>>>> +        cpu_mcabank_free(cpu);
>>>>>          break;
>>>> 
>>>> You also need to cpu_mcabank_free(cpu) on CPU_UP_CANCELED. Else there's a
>>>> memory leak on failed CPU bringup.
>>>> 
>>>>>      default:
>>>>>          break;
>>>>> @@ -1322,6 +1344,8 @@ static struct notifier_block cpu_nfb = {
>>>>> 
>>>>>  static int __init intel_mce_initcall(void)
>>>>>  {
>>>>> +    void *hcpu = (void *)(long)smp_processor_id();
>>>>> +    cpu_callback(&cpu_nfb, CPU_UP_PREPARE, hcpu);
>>>>>      register_cpu_notifier(&cpu_nfb);
>>>>>      return 0;
>>>>>  }
>>>>> diff -r 1a364b17d66a xen/arch/x86/setup.c
>>>>> --- a/xen/arch/x86/setup.c Fri Feb 25 01:26:01 2011 +0800
>>>>> +++ b/xen/arch/x86/setup.c Mon Feb 28 19:19:20 2011 +0800
>>>>> @@ -1203,6 +1203,8 @@ void __init __start_xen(unsigned long mb
>>>>> 
>>>>>      arch_init_memory();
>>>>> 
>>>>> +    do_presmp_initcalls();
>>>>> +
>>>>>      identify_cpu(&boot_cpu_data);
>>>>>      if ( cpu_has_fxsr )
>>>>>          set_in_cr4(X86_CR4_OSFXSR);
>>>>> @@ -1235,8 +1237,6 @@ void __init __start_xen(unsigned long mb
>>>>>      initialize_keytable();
>>>>> 
>>>>>      console_init_postirq();
>>>>> -
>>>>> -    do_presmp_initcalls();
>>>> 
>>>> This looks risky, especially so close to 4.1 release. Instead of moving
>>>> do_presmp_initcalls(), please special-case cpu_mcabank_alloc() for CPU0 in
>>>> intel_mcheck_init(), and remove your direct call to
>>>> cpu_callback(...CPU_UP_PREPARE...) in intel_mce_initcall().
>>>> 
>>>>>      for_each_present_cpu ( i )
>>>>>      {
>>>> 
>>>> 
>>> 
>>> 
>> 
>> 
> 
> 



_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxxxxxxxx
http://lists.xensource.com/xen-devel