WARNING - OLD ARCHIVES

This is an archived copy of the Xen.org mailing list, which we have preserved to ensure that existing links to archives are not broken. The live archive, which contains the latest emails, can be found at http://lists.xen.org/
   
 
 

xen-ia64-devel

RE: [Xen-ia64-devel] RE: ar.unat[patch] fixed this ar.uantissue.[patch]

To: "Xu, Anthony" <anthony.xu@xxxxxxxxx>, "Magenheimer, Dan \(HP Labs Fort Collins\)" <dan.magenheimer@xxxxxx>
Subject: RE: [Xen-ia64-devel] RE: ar.unat[patch] fixed this ar.uantissue.[patch] fixed ar.unat save/restore issue
From: "Tian, Kevin" <kevin.tian@xxxxxxxxx>
Date: Tue, 15 Nov 2005 11:55:49 +0800
Cc: xen-ia64-devel@xxxxxxxxxxxxxxxxxxx
Delivery-date: Tue, 15 Nov 2005 03:55:51 +0000
>From: Xu, Anthony
>Sent: November 15, 2005 11:00
>
>>What if a privileged instruction is on a NaT page and
>>Xen needs to emulate that instruction?
>>
>
>That may happen. Because a NaT Page fault has higher priority than an illegal
>operation fault, we should deliver the NaT Page fault to the guest first. As I
>see it, the Linux kernel will change the NaT page to a normal page if the page
>belongs to the corresponding application's address space, and will then
>re-execute the privileged instruction; at that point the HV can emulate it.

Agreed. We should simply inject faults in priority order. If the guest does
change the page attribute to allow the instruction to execute, the GP fault
will then be raised naturally and Xen will try to handle the privop. If not
(for example, the guest kills the process directly), nothing further reaches Xen.
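
As an illustration only (this is not the actual Xen/ia64 code), the delivery
order described above can be sketched in C. The enum names, the insn_state
struct, and both functions are hypothetical. The sketch assumes the hardware
reports only the highest-priority fault for an instruction, and it uses a
privileged-operation fault to stand in for the fault that reaches Xen once
the guest has repaired the page and re-executed the instruction.

```c
#include <stdbool.h>

/* Hypothetical fault kinds, listed from highest to lowest priority. */
enum fault {
    FAULT_NONE = 0,
    FAULT_INST_NAT_PAGE,   /* instruction was fetched from a NaT page */
    FAULT_PRIV_OP          /* privileged instruction that Xen must emulate */
};

/* Hypothetical hypervisor reactions. */
enum action {
    ACT_NONE,
    ACT_REFLECT_TO_GUEST,  /* inject the fault into the guest */
    ACT_EMULATE_PRIVOP     /* emulate the privileged instruction */
};

/* Hypothetical description of the instruction about to execute. */
struct insn_state {
    bool on_nat_page;      /* page attribute is NaT page */
    bool is_privileged;    /* instruction is a privileged operation */
};

/* Hardware model: only the highest-priority pending fault is reported. */
static enum fault pending_fault(const struct insn_state *s)
{
    if (s->on_nat_page)
        return FAULT_INST_NAT_PAGE;
    if (s->is_privileged)
        return FAULT_PRIV_OP;
    return FAULT_NONE;
}

/* Hypervisor model: react to whichever fault the hardware reported. */
static enum action handle_fault(enum fault f)
{
    switch (f) {
    case FAULT_INST_NAT_PAGE:
        return ACT_REFLECT_TO_GUEST;
    case FAULT_PRIV_OP:
        return ACT_EMULATE_PRIVOP;
    default:
        return ACT_NONE;
    }
}
```

In this model a privileged instruction on a NaT page is first reflected to
the guest; only after the guest clears on_nat_page and re-executes does the
same instruction reach the emulation path. If the guest never re-executes
it, handle_fault is simply not called again.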

Thanks,
Kevin
>
>>> So the logic in my mind is,
>>> If(register nat bit fault)
>>>     Panic();
>>> Else
>>>     Inject nat consumption fault to guest.
>>>
>We will definitely remove the panic and inject the NaT fault into the guest
>once we have fixed all ar.unat-related bugs. The panic is a temporary measure
>to facilitate debugging.
>
>
>Thanks
>-Anthony
>
>
>
>>-----Original Message-----
>>From: Magenheimer, Dan (HP Labs Fort Collins)
>[mailto:dan.magenheimer@xxxxxx]
>>Sent: November 15, 2005 6:50
>>To: Xu, Anthony
>>Cc: xen-ia64-devel@xxxxxxxxxxxxxxxxxxx
>>Subject: RE: ar.unat[patch] fixed this ar.uant issue.[patch] fixed ar.unat
>>save/restore issue
>>
>>> There should be no register NaT bit fault when running itp,
>>
>>Could you explain why this is true? (what is itp?)
>>
>>> When nat page fault happens, it is usually caused by an
>>> instruction which is accessing a page whose page attribute is
>>> nat page, so it must be ld or st instruction, it is
>>
>>What if a privileged instruction is on a NaT page and
>>Xen needs to emulate that instruction?
>>
>>Thanks,
>>Dan
>>
>>> -----Original Message-----
>>> From: Xu, Anthony [mailto:anthony.xu@xxxxxxxxx]
>>> Sent: Monday, November 14, 2005 3:37 AM
>>> To: Magenheimer, Dan (HP Labs Fort Collins)
>>> Cc: xen-ia64-devel@xxxxxxxxxxxxxxxxxxx
>>> Subject: RE: ar.unat[patch] fixed this ar.uant issue.[patch]
>>> fixed ar.unat save/restore issue
>>>
>>> Yes, this patch may allow dom0 to get through the ltp test.
>>> Your logic for handling a NaT consumption fault is:
>>> If (register NaT bit fault)
>>>     Inject NaT consumption fault to guest;
>>> Else (meaning this is a NaT page fault)
>>>     Attempt to handle as privop
>>>     If (it is a privop)
>>>             Return;
>>>     Else
>>>             Inject NaT consumption fault to guest
>>>
>>> When a NaT page fault happens, it is usually caused by an
>>> instruction accessing a page whose page attribute is
>>> NaT page, so it must be an ld or st instruction; it is
>>> definitely not a privop instruction. So it is not necessary to
>>> attempt to handle the NaT fault as a privop; we should inject it
>>> into the guest directly.
>>> There should be no register NaT bit fault when running itp,
>>> so the logic in my mind is:
>>> If(register nat bit fault)
>>>     Panic();
>>> Else
>>>     Inject nat consumption fault to guest.
>>>
>>> If it panics, there must be some place nearby where
>>> ar.unat is not handled correctly. We should take this chance
>>> to fix all ar.unat-related bugs.
>>>
>>> >I am still not sure about the use of eml_unat.  I commented
>>> >out your code (in ia64_handle_reflection) that sets it to zero
>>>
>>> Yes, you can comment out this code; it was used for debugging
>>> the ar.unat fault.
>>>
>>>
>>>
>>> Thanks
>>> -Anthony
>>>
>>>
>>>
>>>
>>>
>>>
>>> >-----Original Message-----
>>> >From: Magenheimer, Dan (HP Labs Fort Collins)
>>> [mailto:dan.magenheimer@xxxxxx]
>>> >Sent: November 12, 2005 3:30
>>> >To: Xu, Anthony
>>> >Cc: xen-ia64-devel@xxxxxxxxxxxxxxxxxxx
>>> >Subject: RE: ar.unat[patch] fixed this ar.uant issue.[patch]
>>> fixed ar.unat
>>> >save/restore issue
>>> >
>>> >Anthony --
>>> >
>>> >I just committed a fix to allow nat consumption faults to
>>> >be delivered again.  I think this is now necessary after
>>> >the region0 virtual address fixes needed for ltp-mmap09.
>>> >Without these nat fixes, ltp-getpeername01 reproducibly
>>> >goes into an infinite loop reporting NaT errors (because
>>> >the "return" in the reflection code doesn't result in
>>> >the NaT getting reflected to the guest).
>>> >
>>> >I have left the printfs so any code that results in
>>> >a inst/data page nat consumption fault (e.g. certain
>>> >situations where the zero page is accessed) will be
>>> >very chatty, but I think that's OK for now until we
>>> >are sure we have fixed all NaT problems.
>>> >
>>> >I am still not sure about the use of eml_unat.  I commented
>>> >out your code (in ia64_handle_reflection) that sets it to zero
>>> >and Tony's checker program and getpeername01 still work.
>>> >If this (setting eml_unat to zero) is handling some
>>> >special case that I am not testing for, please let me
>>> >know.
>>> >
>>> >Thanks,
>>> >Dan
>>> >
>>> >> -----Original Message-----
>>> >> From: Xu, Anthony [mailto:anthony.xu@xxxxxxxxx]
>>> >> Sent: Monday, November 07, 2005 6:30 PM
>>> >> To: Magenheimer, Dan (HP Labs Fort Collins)
>>> >> Cc: xen-ia64-devel@xxxxxxxxxxxxxxxxxxx
>>> >> Subject: RE: ar.unat[patch] fixed this ar.uant issue.[patch]
>>> >> fixed ar.unat save/restore issue
>>> >>
>>> >> See my comments,
>>> >>
>>> >> >-----Original Message-----
>>> >> >From: Magenheimer, Dan (HP Labs Fort Collins)
>>> >> [mailto:dan.magenheimer@xxxxxx]
>>> >> >Sent: November 8, 2005 2:07
>>> >> >To: Xu, Anthony
>>> >> >Cc: xen-ia64-devel@xxxxxxxxxxxxxxxxxxx
>>> >> >Subject: RE: ar.unat[patch] fixed this ar.uant issue.[patch]
>>> >> fixed ar.unat
>>> >> >save/restore issue
>>> >> >
>>> >> >Another NaT question...
>>> >> >
>>> >> >>I recall that some time ago (around the time of the merge)
>>> >> >>you submitted some patches related to fixing ar.unat saving
>>> >> >>and restoring.
>>> >> >
>>> >> >Another part of your earlier patch was a change in
>>> >> >ia64_handle_reflection.  I still periodically get the
>>> >> >message:
>>> >> >
>>> >> >   NaT fault... attempting to handle as privop
>>> >> >
>>> >> >Since your latest fix, Tony's regcheck tool no longer
>>> >> >reports ar.unat as being saved/restored incorrectly.
>>> >> >I was hoping that the above message would go away also,
>>> >> >but it has not.  I see it a couple times at boot and
>>> >> >a couple times for every linux compile (at the end so
>>> >> >it is probably the linker or some other link-related
>>> >> >tool).  I have also seen programs segfault after printing
>>> >> >this message.  So I went to look at the Xen/ia64 code where
>>> >> >this is printed.
>>> >> >
>>> >>
>>> >> I have not seen NaT consumption faults or segmentation faults for
>>> >> a long time in your build test or the ltp test. If I had, I would
>>> >> definitely have tried to fix them.
>>> >>
>>> >> >It doesn't look right to me.  There are two issues:
>>> >> >
>>> >> >1) Your patch added a "return"... I think this means that
>>> >> >   NaT faults will never get reflected to a guest (even
>>> >> >   Register NaT Consumption faults).
>>> >>
>>> >> Yes, you are right, we should inject NaT Consumption faults
>>> >> into the guest. But as far as I know there should be no NaT
>>> >> consumption faults in Linux, so I simply added a "return". I think
>>> >> the best way is to add a "panic" at this place; this will force
>>> >> us to debug the issue rather than temporarily work around it.
>>> >>
>>> >>
>>> >> >2) Since a Instruction NaTPage Consumption fault has higher
>>> >> >   priority than a Privileged Operation fault, I think the
>>> >> >   original printf/priv_emulate code was intended to catch
>>> >> >   this case and properly emulate a privileged instruction
>>> >> >   on a NaTPage.  I think it may also be necessary if a Data
>>> >> >   NaTPage Consumption fault is incurred when the privop
>>> >> >   emulation code fetches the instruction.  (The code in
>>> >> >   ia64_handle_reflection should probably check the ISR to
>>> >> >   avoid calling priv_emulate for other kinds of NaT
>>> >> >   Consumption though.)
>>> >>
>>> >> I had been curious why the emulate function is used to handle
>>> >> NaT consumption.
>>> >> Now I understand; thank you for your detailed explanation. Maybe
>>> >> we need to put more comments in confusing places like this.
>>> >>
>>> >>
>>> >>
>>> >> >You know more about NaT's than I do... could you recheck
>>> >> >this code in ia64_handle_reflection please?  Do you have
>>> >> >any test code that provokes any of these NaT faults?
>>> >> >
>>> >>
>>> >> It is very kind of you to say that. Unfortunately, I have not
>>> >> seen those issues. What I suspect is that dom0 does a bank switch
>>> >> on the shared page without taking ar.unat into account.
>>> >>
>>> >> Anyway, I'll try to provoke this fault. If I find it, I'll
>>> >> definitely fix it.
>>> >>
>>> >> >Thanks.
>>> >> >Dan
>>> >> >
>>> >> >> -----Original Message-----
>>> >> >> From: Xu, Anthony [mailto:anthony.xu@xxxxxxxxx]
>>> >> >> Sent: Friday, November 04, 2005 12:10 AM
>>> >> >> To: Magenheimer, Dan (HP Labs Fort Collins)
>>> >> >> Cc: xen-ia64-devel@xxxxxxxxxxxxxxxxxxx
>>> >> >> Subject: RE: ar.unat[patch] fixed this ar.uant issue.[patch]
>>> >> >> fixed ar.unat save/restore issue
>>> >> >>
>>> >> >> >I am curious about the use of B1NATS in the code
>>> >> >> >around this patch.  Under what circumstances does
>>> >> >> >this get set/used?
>>> >> >>
>>> >> >> 1. emulate  bsw1, bsw0
>>> >> >> 2. emulate rfi.
>>> >> >> 3. inject fault to guest.
>>> >> >>
>>> >> >> >There is similar unat code in
>>> >> >> >fast_tick (default off) and fast_reflect (default on)
>>> >> >> >and I am wondering if similar unat changes are needed
>>> >> >> >and whether it is now OK to turn on HANDLE_AR_UNAT
>>> >> >> >(which is now default off).
>>> >> >> You are right: in the above two cases you should also save
>>> >> >> ar.unat to XSI_B1NATS_OFS after spilling the guest bank1 registers
>>> >> >> to the shared page. I handled all of this in the C code. I didn't
>>> >> >> look into the fast hypercall code; it is hard for me to read
>>> >> >> because I am not good at assembly. The principle of handling
>>> >> >> ar.unat is simple: every time you spill banked registers you must
>>> >> >> save the corresponding ar.unat after the spill, and every time you
>>> >> >> fill banked registers you must restore the corresponding ar.unat
>>> >> >> before the fill.
>>> >> >>
>>> >> >> We don't need to clear all guest b0 registers and their NaT
>>> >> >> bits. Because r16~r23 are preserved regs and r24~r31 are
>>> >> >> scratch regs, we only need to restore r16~r23 rather than
>>> >> >> clear r16~r23 to 0.
>>> >> >>
>>> >> >> Next time you enable some functions like hyper_ssm_i, when
>>> >> >> you save bank1 regs you should also save bank1 unat.
>>> >> >>
>>> >> >> Below patch enables HANDLE_AR_UNAT.
>>> >> >>
>>> >> >>
>>> >> >>
>>> >> >> Signed-off-by Anthony Xu <Anthony.xu@xxxxxxxxx>
>>> >> >>
>>> >> >> Thanks,
>>> >> >> Anthony.
>>> >> >>
>>> >> >>
>>> >> >>
>>> >> >>
>>> >> >> >-----Original Message-----
>>> >> >> >From: Magenheimer, Dan (HP Labs Fort Collins)
>>> >> >> [mailto:dan.magenheimer@xxxxxx]
>>> >> >> >Sent: November 3, 2005 22:42
>>> >> >> >To: Xu, Anthony
>>> >> >> >Cc: xen-ia64-devel@xxxxxxxxxxxxxxxxxxx
>>> >> >> >Subject: RE: ar.unat[patch] fixed this ar.uant issue.
>>> >> >> >
>>> >> >> >Hi Anthony --
>>> >> >> >
>>> >> >> >I am curious about the use of B1NATS in the code
>>> >> >> >around this patch.  Under what circumstances does
>>> >> >> >this get set/used?  There is similar unat code in
>>> >> >> >fast_tick (default off) and fast_reflect (default on)
>>> >> >> >and I am wondering if similar unat changes are needed
>>> >> >> >and whether it is now OK to turn on HANDLE_AR_UNAT
>>> >> >> >(which is now default off).
>>> >> >> >
>>> >> >> >Thanks,
>>> >> >> >Dan
>>> >> >> >
>>> >> >> >> -----Original Message-----
>>> >> >> >> From: Xu, Anthony [mailto:anthony.xu@xxxxxxxxx]
>>> >> >> >> Sent: Thursday, November 03, 2005 1:08 AM
>>> >> >> >> To: Magenheimer, Dan (HP Labs Fort Collins)
>>> >> >> >> Cc: xen-ia64-devel@xxxxxxxxxxxxxxxxxxx
>>> >> >> >> Subject: RE: ar.unat[patch] fixed this ar.uant issue.
>>> >> >> >>
>>> >> >> >> Dan,
>>> >> >> >> Last time, I used the ar.unat register to restore the guest
>>> >> >> >> general registers' NaT bits in the hyper_rfi function, to
>>> >> >> >> eliminate the NaT bit consumption fault, but did not restore
>>> >> >> >> ar.unat itself.
>>> >> >> >>
>>> >> >> >> Signed-off-by Anthony Xu <Anthony.xu@xxxxxxxxx>
>>> >> >> >>
>>> >> >> >> Thanks,
>>> >> >> >> Anthony.
>>> >> >> >>
>>> >> >> >>
>>> >> >> >>
>>> >> >> >>
>>> >> >> >> >-----Original Message-----
>>> >> >> >> >From: Magenheimer, Dan (HP Labs Fort Collins)
>>> >> >> >> [mailto:dan.magenheimer@xxxxxx]
>>> >> >> >> >Sent: November 3, 2005 11:54
>>> >> >> >> >To: Xu, Anthony
>>> >> >> >> >Cc: xen-ia64-devel@xxxxxxxxxxxxxxxxxxx
>>> >> >> >> >Subject: RE: ar.unat
>>> >> >> >> >
>>> >> >> >> >> I can take a look at this; please send me the regcheck utility.
>>> >> >> >> >>
>>> >> >> >> >>
>>> >> >> >> >> Thanks
>>> >> >> >> >> Anthony
>>> >> >> >> >
>>> >> >> >> >Great, thanks!  Here's where I got Tony's regcheck tool.  If
>>> >> >> >> >it's not still there, perhaps Tony can post it.
>>> >> >> >> >
>>> >> >> >> >By the way, if anyone tries this on a domU, Matt Chapman
>>> >> >> >> >has a pending fix that resolves a FP save/restore issue.
>>> >> >> >> >
>>> >> >> >> >Thanks,
>>> >> >> >> >Dan
>>> >> >> >> >
>>> >> >> >> >> -----Original Message-----
>>> >> >> >> >> From: linux-ia64-owner@xxxxxxxxxxxxxxx
>>> >> >> >> >> [mailto:linux-ia64-owner@xxxxxxxxxxxxxxx] On Behalf Of Luck, Tony
>>> >> >> >> >> Sent: Tuesday, March 01, 2005 4:33 PM
>>> >> >> >> >> To: linux-ia64@xxxxxxxxxxxxxxx
>>> >> >> >> >> Subject: RE: [patch 2.6.11-rc3-bk4] Correctly dereference
>>> >> >> >> >> ia64_mca_data
>>> >> >> >> >>
>>> >> >> >> >> Back on February 9th, I wrote:
>>> >> >> >> >> >I wrote a test program that loads up random values into registers
>>> >> >> >> >> >(just r1-r31, a bunch of stacked registers, and f2-f127 for now)
>>> >> >> >> >> >and then checks that all the registers haven't changed value a
>>> >> >> >> >> >few thousand times, before reloading with a new set of random
>>> >> >> >> >> >values.
>>> >> >> >> >>
>>> >> >> >> >> A few people asked whether I could post the program ... it took
>>> >> >> >> >> a while to get sign-off ... but that gave me time to add "branch",
>>> >> >> >> >> "predicate" and half a dozen "application" registers to the mix,
>>> >> >> >> >> plus make it print the name of the register that was nuked (instead
>>> >> >> >> >> of a number that required manual translation).
>>> >> >> >> >>
>>> >> >> >> >> I've tested it by using a debugger to zap one of each class
>>> >> >> >> >> of register that is being monitored to check that it works.
>>> >> >> >> >>
>>> >> >> >> >> http://www.kernel.org/pub/linux/kernel/people/aegl/ia64regcheck.tgz
>>> >> >> >> >>
>>> >> >> >> >> Usage ... compile, and run a few copies.  If they all "exit(0)" (which
>>> >> >> >> >> may take a couple of days) the test passed.  Otherwise you should see
>>> >> >> >> >> the name of the register printed to stderr, and exit code 1.
>>> >> >> >> >>
>>> >> >> >> >> Apart from the MCA case, I haven't seen it report a problem
>>> >> >> >> >> yet ... but I've only run a few hours.
>>> >> >> >> >>
>>> >> >> >> >> -Tony
>>> >> >> >>
>>> >> >>
>>> >>
>>>
>
>_______________________________________________
>Xen-ia64-devel mailing list
>Xen-ia64-devel@xxxxxxxxxxxxxxxxxxx
>http://lists.xensource.com/xen-ia64-devel
