On Tue, Jul 06, 2004 at 08:26:43AM +0100, Ian Pratt wrote:
> > No proof whether I was hitting another entropy bug or just
> > workload. (Reminder to lurkers -- I'm running 1.2 with /dev/random
> > major,minor set to 1,9 -- same as /dev/urandom.)
> If you temporarily run out of entropy, it should only ever be the
> particular user-space process that's reading from /dev/random
> that blocks. Everything else should carry on fine in the meantime.
A little hard to tell -- in our case, the hangs happen days after
startup. The hung machines still answer pings, but we can't ssh in,
they don't respond to http requests, they don't show new syslog
entries, etc. Some of these symptoms (maybe not the syslog one) might
be explained by named hanging, for instance, since it uses /dev/random
(but I haven't looked to see what named does with it -- maybe only
sortlist randomization). That, plus a dose of not knowing what to look
for at the time, could have made the hangs look like a total inability
to execute userspace code and/or write to the root filesystem.
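
One way to test the "named is stuck on /dev/random" theory would be to probe whether a read from the device would block at a given moment. The following is only a sketch under assumptions: the function name `would_block` is hypothetical, it relies on the standard non-blocking open flag, and on newer kernels /dev/random may never report that it would block.

```python
import os


def would_block(path="/dev/random", nbytes=1):
    """Probe a device without hanging the caller.

    Returns True if a read would block right now, False if data was
    available, and None if the device could not be opened.
    Note: a successful probe consumes nbytes from the pool.
    """
    try:
        fd = os.open(path, os.O_RDONLY | os.O_NONBLOCK)
    except OSError:
        return None
    try:
        os.read(fd, nbytes)
        return False
    except BlockingIOError:
        # EAGAIN: the kernel has nothing to hand out at the moment.
        return True
    finally:
        os.close(fd)


if __name__ == "__main__":
    for dev in ("/dev/random", "/dev/urandom"):
        print(dev, would_block(dev))
```

With /dev/random pointed at the urandom device numbers, as in the setup described above, both probes would be expected to report that data is available.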
Are these symptoms consistent with what you know about the NFS bug?
I now have users executing a simple controller script themselves via ssh
to reboot machines when they hang -- it might make sense to start adding
diagnostic data collection to that. Right now it only makes a short log
entry -- and what I said a couple of days ago about "no hangs in a
month" was wrong. There are recent log entries; people have just been
rebooting their own hangs and have stopped telling me.
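
As a rough sketch of what that diagnostic collection could record, the snippet below logs the kernel's entropy estimate and the device numbers of the two random devices. It assumes a Linux guest that exposes /proc/sys/kernel/random/entropy_avail; the function names are hypothetical. It would have to run periodically on the guest before a hang (from cron, say), since a hung guest can't be reached by ssh.

```python
import os
import stat
import time


def device_numbers(path):
    """Return (major, minor) for a character device, or None if unavailable."""
    try:
        st = os.stat(path)
    except OSError:
        return None
    if not stat.S_ISCHR(st.st_mode):
        return None
    return (os.major(st.st_rdev), os.minor(st.st_rdev))


def read_entropy_avail(path="/proc/sys/kernel/random/entropy_avail"):
    """Return the kernel's current entropy estimate in bits, or None."""
    try:
        with open(path) as f:
            return int(f.read().strip())
    except (OSError, ValueError):
        return None


def collect_diagnostics():
    """Gather one snapshot of entropy-related state."""
    return {
        "time": time.strftime("%Y-%m-%d %H:%M:%S"),
        "entropy_avail": read_entropy_avail(),
        "dev_random": device_numbers("/dev/random"),
        "dev_urandom": device_numbers("/dev/urandom"),
    }


def format_log_line(diag):
    """Render a snapshot as a single key=value log line."""
    return " ".join("%s=%s" % (key, diag[key]) for key in sorted(diag))


if __name__ == "__main__":
    print(format_log_line(collect_diagnostics()))
```

A run of snapshots ending just before a hang would at least show whether the entropy estimate was near zero at the time, and whether /dev/random really carried the 1,9 device numbers.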
> In the unstable tree, AFAIK all interrupt sources are correctly
> adding entropy to the kernel's entropy pool -- there are just
> fewer bits of entropy generated per second in a VM.
> If people are still finding that some heavy users of /dev/random
> are blocking unexpectedly (e.g. apache during startup) then we'll
> need to think what to do. One grim hack would be to modify the
> guest kernel to make it less conservative about its estimate of
> entropy generated. An alternative would be to have Xen handle
> entropy generation centrally (from all physical interrupt
> sources) and then have a special random driver in each guest.
The only thing that bothers me about the latter is the special driver
needed in the guests. On the upside, I bet you could generate plenty of
entropy if it were done centrally. If so, I'm wondering if it would be
mathematically safe to include non-interrupt activities of other guests
in the pool as well.
Stephen G. Traugott (KG6HDQ)
UNIX/Linux Infrastructure Architect, TerraLuna LLC
http://www.stevegt.com -- http://Infrastructures.Org
Xen-devel mailing list