Hi Keir,
While copying a big file to a VMX domain I observed that the receive ring in
the DM was getting completely full, partly because the VMX domain driver was
not able to pull packets off the DM fast enough. In that case the DM was
dropping the packet, setting the status to 0x1000, and sending an interrupt
to notify the Linux driver that packets were lost due to lack of space
in the DM's receive ring. The pcnet driver handles this situation differently.
Based on real pcnet issues on some real hardware, the pcnet driver tries to
clear up the receive ring, assuming it is full of errors. For the emulated DM
that is not the case, and things go wrong from that point onwards. I think the
error handling part of the pcnet DM is not correct, and it causes the buffer
overwrites that result in the corruption we see.
The patch lets the DM detect the receive-ring-full condition in
advance, so that packets are not pushed to the DM in that situation. That
is better because otherwise the DM is just going to drop the packet and raise
an error.
Yes, the DM is checking for this condition in pcnet_receive(), to signal
the OS driver that packets are dropped. But by then it is too late, because
the OS driver's handling of this situation does not work properly with the DM.
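To illustrate the idea, here is a rough sketch. It is not the actual patch and
not the real pcnet code: the toy_rx_ring struct, the ring size, and all toy_*
names are made up for this example. It only shows the difference between
refusing a packet up front in a can_receive-style check and dropping it late
inside the receive function.

```c
#include <assert.h>
#include <stddef.h>

#define TOY_RING_SIZE 4  /* hypothetical; the real ring is sized by the guest driver */

/* Hypothetical stand-in for the DM's receive ring state. */
struct toy_rx_ring {
    int dev_owned[TOY_RING_SIZE]; /* 1 = descriptor is free for the device to fill */
    int next;                     /* next descriptor the device will use */
    int missed;                   /* packets dropped because the ring was full */
};

/* Guest driver side: hand every descriptor to the device. */
static void toy_ring_init(struct toy_rx_ring *r)
{
    assert(r != NULL);
    for (int i = 0; i < TOY_RING_SIZE; i++)
        r->dev_owned[i] = 1;
    r->next = 0;
    r->missed = 0;
}

/* Early check: returns 1 only if the next descriptor is free. */
static int toy_can_receive(const struct toy_rx_ring *r)
{
    assert(r != NULL);
    return r->dev_owned[r->next];
}

/* Device side: fill one descriptor, or count a miss if the ring is full.
 * The miss path is the late check that the guest driver copes with badly. */
static void toy_receive(struct toy_rx_ring *r)
{
    assert(r != NULL);
    if (!r->dev_owned[r->next]) {
        r->missed++;              /* packet is dropped and an error is flagged */
        return;
    }
    r->dev_owned[r->next] = 0;    /* descriptor now belongs to the guest */
    r->next = (r->next + 1) % TOY_RING_SIZE;
}

/* Generic layer: ask first, deliver only if the device says yes.
 * Returns 1 if the packet was delivered, 0 if it was held back. */
static int toy_deliver(struct toy_rx_ring *r)
{
    if (!toy_can_receive(r))
        return 0;                 /* sender keeps the packet; nothing is dropped */
    toy_receive(r);
    return 1;
}

/* Guest driver side: consume a packet and give the descriptor back. */
static void toy_guest_consume(struct toy_rx_ring *r, int idx)
{
    assert(r != NULL && idx >= 0 && idx < TOY_RING_SIZE);
    r->dev_owned[idx] = 1;
}
```

With the early check, the missed counter stays at zero as long as packets go
through toy_deliver(); it only increases if toy_receive() is called directly
while the ring is full.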
Thanks & Regards,
Nitin
--------------------------------------------------
Open Source Technology Center, Intel Corp
-----Original Message-----
From: Keir Fraser [mailto:Keir.Fraser@xxxxxxxxxxxx]
Sent: Sat 2/25/2006 3:23 AM
To: Kamble, Nitin A
Cc: xen-devel@xxxxxxxxxxxxxxxxxxx; Ian Pratt
Subject: Re: [PATCH] Fix for pcnet device model data corruption
On 25 Feb 2006, at 00:10, Kamble, Nitin A wrote:
> Hi Ian,
> The attached patch fixes pcnet data corruption for VMX guests as
> reported by you.
> All the packets go through the qemu generic packet interface to the
> specific device model. In this case the device model is pcnet.
> The pcnet device model receiver is registered with it like this.
> qemu_add_read_packet(nd, pcnet_can_receive, pcnet_receive, d);
> The pcnet_can_receive function is used to tell the generic qemu
> framework whether the DM can receive packets. It is supposed to block
> incoming packets in cases such as when the pcnet driver is not yet
> started by the OS, the pcnet device is suspended or stopped by the OS, or
> it is not ready to receive more packets.
> When the traffic is heavy on the DM, its receive rings can get
> filled up, and it will have to drop the received packets. This patch
> detects this situation in the pcnet_can_receive() function and avoids
> dropping packets. This mechanism works as a bandwidth
> handshake between the device model and the sender. The DM is saying: send
> me packets only up to the rate at which I can handle them.
I can see that this may avoid packet loss, but does pcnet_receive
really get confused and corrupt data if there is no spare space? It
appears to check the same status flag that you check in your patch?
-- Keir
_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxxxxxxxx
http://lists.xensource.com/xen-devel