xen-devel

RE: [Xen-devel] Re: Reproducable data corruption on xen-unstable

On Sat, 5 Feb 2005, Robin Green wrote:
> On the assumption that this _is_ an FP save/restore bug,

Update: I have narrowed down this bug.

I have confirmed that there definitely IS an FP save/restore bug with this kernel/Xen combination (i.e. I have eliminated the possibility that it was just a non-floating-point-related bug). I identified it using a different test case (running wget -d in a konsole), and I have established that it is case 1 in the list of possible causes I gave, namely:

1. Something leaves the FPU in a state where it has bogus data in it,
   but it won't trap to tell the kernel to restore the old, correct data

More specifically, in this particular case, according to my printf output, what happened was:

A syscall was made (connect). Immediately before the syscall, the floating-point stack was empty; immediately after the syscall, the floating-point stack was nonempty, and the TS (Task Switched) flag in CR0 was _clear_.
(Source code and output available on request.)
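
For anyone who wants to repeat this kind of check, here is a minimal C sketch of how the two observations could be decoded. It assumes the x87 tag word has already been captured (for example with FNSTENV) and that a CR0 or machine status word value is available. The helper names x87_stack_depth and ts_is_set are hypothetical and are not taken from the test program mentioned above.

```c
#include <assert.h>
#include <stdint.h>

/* CR0.TS (Task Switched) is bit 3 of CR0; the same bit is visible in the
 * low 16 bits of CR0, the machine status word. */
#define CR0_TS (1u << 3)

/* Returns 1 if TS is set in the given CR0/MSW value, else 0. */
static int ts_is_set(uint32_t cr0)
{
    return (cr0 & CR0_TS) != 0;
}

/* The x87 tag word holds two bits per physical register; the value 3
 * (binary 11) marks a register as empty. Returns how many of the eight
 * registers are non-empty, i.e. the current depth of the FP stack. */
static int x87_stack_depth(uint16_t tag_word)
{
    int depth = 0;
    for (int i = 0; i < 8; i++) {
        unsigned tag = (tag_word >> (2 * i)) & 3u;
        if (tag != 3u)
            depth++;
    }
    assert(depth >= 0 && depth <= 8);
    return depth;
}
```

With these helpers, the situation described above corresponds to x87_stack_depth() returning 0 before the syscall and a nonzero value after it, while ts_is_set() returns 0.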

This may not cause problems immediately. But over time, as stale entries accumulate on the stack, it would tend to lead to floating-point stack overflow, which makes floating-point calculations generate bogus output.
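
To illustrate why leftover stack entries eventually corrupt results, here is a simplified C model of the eight-register x87 stack. It is only a sketch under stated assumptions: the struct and function names are hypothetical, stack underflow is not modelled, and overflow is modelled the way the FPU behaves with the invalid-operation exception masked (the pushed value is replaced by a NaN).

```c
#include <assert.h>
#include <math.h>
#include <string.h>

/* Toy model of the x87 register stack: a ring of 8 slots with a top index. */
struct x87_model {
    double reg[8];
    int used[8];     /* 1 = slot holds a value (non-empty tag) */
    int top;         /* index of the current top of stack */
    int overflowed;  /* set once a push hits a non-empty slot */
};

static void x87_init(struct x87_model *f)
{
    memset(f, 0, sizeof *f);
}

/* Push: decrement top modulo 8; if the target slot is already in use,
 * that is a stack overflow and the slot receives a NaN instead of v. */
static void x87_push(struct x87_model *f, double v)
{
    f->top = (f->top + 7) & 7;
    if (f->used[f->top]) {
        f->overflowed = 1;
        v = NAN;
    }
    f->reg[f->top] = v;
    f->used[f->top] = 1;
}

/* Pop: read the top slot, mark it empty, increment top modulo 8. */
static double x87_pop(struct x87_model *f)
{
    assert(f->used[f->top]);  /* underflow is not modelled */
    double v = f->reg[f->top];
    f->used[f->top] = 0;
    f->top = (f->top + 1) & 7;
    return v;
}

/* Starts with `leaked` stale entries on the stack, then pushes 1..n and
 * pops n values, summing them. Reports whether an overflow occurred. */
static double sum_with_leak(int leaked, int n, int *overflowed)
{
    struct x87_model f;
    assert(leaked >= 0 && leaked <= 8 && n >= 0 && n <= 8);
    x87_init(&f);
    for (int i = 0; i < leaked; i++)
        x87_push(&f, 0.0);
    for (int i = 1; i <= n; i++)
        x87_push(&f, (double)i);
    double acc = 0.0;
    for (int i = 0; i < n; i++)
        acc += x87_pop(&f);
    *overflowed = f.overflowed;
    return acc;
}
```

In this model a computation that needs six slots still works with two stale entries on the stack, but with three stale entries the sixth push overflows and the sum becomes NaN.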

So, in theory, there are two possible algorithms the kernel could be following to avoid this situation.

A. Always set TS on task switch (Seems like the logical choice!)

B. Always set TS on task switch, except when the FPU has not been used
by the switched-to process, in which case do an FINIT on task switch. (This seems pointlessly complicated and slow, so I doubt the kernel follows this approach.)
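
As a rough illustration of algorithm A, and of what goes wrong when TS is left clear, here is a toy C model of lazy FPU switching. It is not the Linux or Xen code: every name in it is hypothetical, a single double stands in for the whole FPU register state, and the set_ts parameter exists only so the suspected buggy path can be simulated.

```c
#include <assert.h>
#include <stddef.h>

struct task {
    double fpu_saved;        /* this task's saved FPU state */
};

struct cpu {
    double fpu_live;         /* state currently held in the FPU */
    int ts;                  /* models CR0.TS */
    struct task *current;    /* task currently running */
    struct task *fpu_owner;  /* task whose state is live in the FPU */
};

/* Algorithm A: every task switch sets TS. Passing set_ts = 0 simulates
 * a code path that forgets to do so. */
static void task_switch(struct cpu *c, struct task *next, int set_ts)
{
    c->current = next;
    if (set_ts)
        c->ts = 1;
}

/* Models the device-not-available (#NM) trap handler, which runs on the
 * first FP instruction executed while TS is set. */
static void handle_nm(struct cpu *c)
{
    c->ts = 0;                                     /* like CLTS */
    if (c->fpu_owner != c->current) {
        if (c->fpu_owner != NULL)
            c->fpu_owner->fpu_saved = c->fpu_live; /* save old owner */
        c->fpu_live = c->current->fpu_saved;       /* restore current */
        c->fpu_owner = c->current;
    }
}

static void fp_write(struct cpu *c, double v)
{
    if (c->ts)
        handle_nm(c);
    c->fpu_live = v;
}

static double fp_read(struct cpu *c)
{
    if (c->ts)
        handle_nm(c);
    return c->fpu_live;
}

/* Task A writes 111.0 into the FPU, then the CPU switches to task B,
 * whose own state is 0.0. Returns the value task B observes. */
static double observed_by_b(int set_ts_on_switch)
{
    struct task a = { 0.0 }, b = { 0.0 };
    struct cpu c = { 0.0, 1, &a, NULL };
    fp_write(&c, 111.0);
    task_switch(&c, &b, set_ts_on_switch);
    return fp_read(&c);
}
```

In this model observed_by_b(1) yields task B's own 0.0, while observed_by_b(0) yields task A's 111.0 with no trap taken, which is the "bogus data but no trap" situation of case 1.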

So, it looks like we are looking for a code path in which TS doesn't end
up set after a task switch. (And it might be specifically to do with
syscalls.)

I will look for one - but does anyone have any ideas for what that code path might be, or how I could efficiently debug the kernel (while in X, remember, because this doesn't seem to occur in text mode!) to find out what that code path is? I don't have a serial console.

--
Robin

