WARNING - OLD ARCHIVES

This is an archived copy of the Xen.org mailing list, which we have preserved to ensure that existing links to archives are not broken. The live archive, which contains the latest emails, can be found at http://lists.xen.org/
   
 
 
Xen 
 
Home Products Support Community News
 
   
 

xen-devel

Re: [Xen-devel] [PATCH] VPMU issue on Nehalem cpus

To: Jan Beulich <JBeulich@xxxxxxxxxx>
Subject: Re: [Xen-devel] [PATCH] VPMU issue on Nehalem cpus
From: Dietmar Hahn <dietmar.hahn@xxxxxxxxxxxxxx>
Date: Mon, 22 Nov 2010 08:06:42 +0100
Cc: xen-devel@xxxxxxxxxxxxxxxxxxx, Haitao Shan <haitao.shan@xxxxxxxxx>
Delivery-date: Sun, 21 Nov 2010 23:07:43 -0800
Dkim-signature: v=1; a=rsa-sha256; c=simple/simple; d=ts.fujitsu.com; i=dietmar.hahn@xxxxxxxxxxxxxx; q=dns/txt; s=s1536b; t=1290409606; x=1321945606; h=from:to:subject:date:cc:references:in-reply-to: mime-version:content-transfer-encoding:message-id; z=From:=20Dietmar=20Hahn=20<dietmar.hahn@xxxxxxxxxxxxxx> |To:=20Jan=20Beulich=20<JBeulich@xxxxxxxxxx>|Subject:=20R e:=20[Xen-devel]=20[PATCH]=20VPMU=20issue=20on=20Nehalem =20cpus|Date:=20Mon,=2022=20Nov=202010=2008:06:42=20+0100 |Cc:=20Haitao=20Shan=20<haitao.shan@xxxxxxxxx>,=0D=0A=20x en-devel@xxxxxxxxxxxxxxxxxxx|References:=20<201011191129. 07773.dietmar.hahn@xxxxxxxxxxxxxx>=20<4CE6685802000078000 23533@xxxxxxxxxxxxxxxxxx>|In-Reply-To:=20<4CE668580200007 800023533@xxxxxxxxxxxxxxxxxx>|MIME-Version:=201.0 |Content-Transfer-Encoding:=207bit|Message-Id:=20<2010112 20806.43098.dietmar.hahn@xxxxxxxxxxxxxx>; bh=wsKdu9xfFImC0Fy/eYx8OBiLm9B2IpS4+fkKNsdyYxM=; b=iHjGbPi77vruePRuqkrex8R/KYRR/1rIgLa5GeNOvrF51w2aHmLE6KhE OYGw0ZjfDDSFUdjO0/b503COp+ePcn1GjDmcnUmwqq9M7co3+vE4Tlf4R wYyiTA1Xqroq9vNJ114QwgP1H9L2aca+sVzJ6/IkFD3fUhRsZ9zhL/5Q1 IqbolXNekqz55wbKzqf76/RgE6TeVHRVmDX+cwF5v7jKPOiMoo0Y+wUlK lPaY3fgKO8X+R75C8RTuU2DWwvfBM;
Domainkey-signature: s=s1536a; d=ts.fujitsu.com; c=nofws; q=dns; h=X-SBRSScore:X-IronPort-AV:Received:X-IronPort-AV: Received:Received:From:To:Subject:Date:User-Agent:Cc: References:In-Reply-To:X-KMail-Markup:MIME-Version: Content-Type:Content-Transfer-Encoding:Message-Id; b=ZsjIkojFMlbddgwsAhgUnoj85RQPkTbCbP2Ih+VJO7Etz3g8/rq0zRjq nHhMTrlo3Cgz3yd9M07ya/QAudPm9og9CZq3e0VSCCY41WDZanRpNvPeh C4417iiKG1SxqZTzn6AT3BSX4ETSn3idrAv0ebSQs+Rvs3BVoJvS9/EtL i1s3edAu/X2a77nxNqpO3V7wTebiIDtDWxXgJYkFchxpOdSdBgJd9fcEH XtCYj+m/of1brYNBaXj/rInN1JVWQ;
Envelope-to: www-data@xxxxxxxxxxxxxxxxxxx
In-reply-to: <4CE668580200007800023533@xxxxxxxxxxxxxxxxxx>
List-help: <mailto:xen-devel-request@lists.xensource.com?subject=help>
List-id: Xen developer discussion <xen-devel.lists.xensource.com>
List-post: <mailto:xen-devel@lists.xensource.com>
List-subscribe: <http://lists.xensource.com/mailman/listinfo/xen-devel>, <mailto:xen-devel-request@lists.xensource.com?subject=subscribe>
List-unsubscribe: <http://lists.xensource.com/mailman/listinfo/xen-devel>, <mailto:xen-devel-request@lists.xensource.com?subject=unsubscribe>
References: <201011191129.07773.dietmar.hahn@xxxxxxxxxxxxxx> <4CE668580200007800023533@xxxxxxxxxxxxxxxxxx>
Sender: xen-devel-bounces@xxxxxxxxxxxxxxxxxxx
User-agent: KMail/1.13.5 (Linux/2.6.34.7-0.5-xen; KDE/4.5.3; x86_64; ; )

am 19.11.2010 schrieb "Jan Beulich <JBeulich@xxxxxxxxxx>":

> >>> On 19.11.10 at 11:29, Dietmar Hahn <dietmar.hahn@xxxxxxxxxxxxxx> wrote:

> > +/*

> > + * QUIRK to workaround an issue on Nehalem processors currently seen

> > + * on family 6 cpus E5520 (model 26) and X7542 (model 46).

> > + * The issue leads to endless NMI loops on the processor.

> > + * If a counter triggers an NMI and while the NMI handler is running another

> > + * counter overflows the second counter triggers endless new NMIs.

> > + * A solution is to read all flagged counters and if the value is 0 write

> > + * 1 into it.

> > + */

I will try to improve the comment.

>

> Two things I don't understand here: One is that I can't see how

> from the NMI handler control would get to

> core2_vpmu_do_interrupt() - afaics, this gets called only in the

> context of the (vectored) smp_pmu_apic_interrupt(). The other

> is that if nested interrupts occur, how would you prevent this

> by writing ones into zero counters? That is, in the best case I

> could see this shrinking the window within which unintended

> nested interrupts would occur. Or is it that the secondary

> interrupts only occur after the first one returned? Is this (mis-)

> behavior documented somewhere?

>

> > +static int is_nmi_quirk;

>

> bool_t __read_mostly?

Yes, would be better.

>

> > +

> > +static void check_nmi_quirk(void)

> > +{

> > + u8 family = current_cpu_data.x86;

> > + u8 cpu_model = current_cpu_data.x86_model;

> > + is_nmi_quirk = 0;

> > + if ( family == 6 )

> > + {

> > + if ( cpu_model == 46 || cpu_model == 26 )

> > + is_nmi_quirk = 1;

> > + }

> > +}

> > +

> > +static int core2_get_pmc_count(void);

> > +static void handle_nmi_quirk(u64 msr_content)

> > +{

> > + int num_gen_pmc = core2_get_pmc_count();

> > + int num_fix_pmc = 3;

> > + int i;

> > + u64 val;

> > +

> > + if ( !is_nmi_quirk )

> > + return;

> > +

> > + val = msr_content & ((1 << num_gen_pmc) - 1);

>

> What's the point of masking if the subsequent loop looks at the

> bottom so many bits only anyway?

Bits 0-31 flag the overflow of the general counters (currently max 4) and 32-63

flag the overflow of the fixed counter (currently max 3).

Yes the first mask is not necessary, maybe a comment would be better?

>

> > + for ( i = 0; i < num_gen_pmc; i++ )

> > + {

> > + if ( val & 0x1 )

> > + {

> > + u64 cnt;

> > + rdmsrl(MSR_P6_PERFCTR0 + i, cnt);

> > + if ( cnt == 0 )

> > + wrmsrl(MSR_P6_PERFCTR0 + i, 1);

> > + }

> > + val >>= 1;

> > + }

> > + val = (msr_content >> 32) & ((1 << num_fix_pmc) - 1);

>

> Same here.

>

> > + for ( i = 0; i < num_fix_pmc; i++ )

> > + {

> > + if ( val & 0x1 )

> > + {

> > + u64 cnt;

> > + rdmsrl(MSR_CORE_PERF_FIXED_CTR0 + i, cnt);

> > + if ( cnt == 0 )

> > + wrmsrl(MSR_CORE_PERF_FIXED_CTR0 + i, 1);

> > + }

> > + val >>= 1;

> > + }

> > +}

> > +

> > +#define CHECK_HANDLE_NMI_QUIRK(msr_content) \

> > + if ( is_nmi_quirk ) \

> > + handle_nmi_quirk(msr_content);

> > +

>

> Why do you need a macro here if you use it only once?

I did this only to have as few impact on the existing code as possible.

I will remove it.

Thanks.

Dietmar.

>

> > u32 core2_counters_msr[] = {

> > MSR_CORE_PERF_FIXED_CTR0,

> > MSR_CORE_PERF_FIXED_CTR1,

> > @@ -494,6 +558,9 @@

> > rdmsrl(MSR_CORE_PERF_GLOBAL_STATUS, msr_content);

> > if ( !msr_content )

> > return 0;

> > +

> > + CHECK_HANDLE_NMI_QUIRK(msr_content)

> > +

> > core2_vpmu_cxt->global_ovf_status |= msr_content;

> > msr_content = 0xC000000700000000 | ((1 << core2_get_pmc_count()) - 1);

> > wrmsrl(MSR_CORE_PERF_GLOBAL_OVF_CTRL, msr_content);

>

> Jan

--

Company details: http://ts.fujitsu.com/imprint.html

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxxxxxxxx
http://lists.xensource.com/xen-devel