xen-users
Re: [Xen-users] Xen 3.0.0 32bit-pae (testing changeset 8270) crashes(pgt
On Thursday, 2 February 2006 12:52, Ian Pratt wrote:
> > 3.0.1 seems to fix the bug I saw on my two machines, but now there is
> > another (but somehow related) problem for me in 3.0.1-pae. I don't know
> > if it's still related to the 3ware controller, but at least it only
> > appears for domains that have memory above the 32bit address space again,
> > so the first started domUs run fine. The big difference is that I don't
> > have any complete freezes of the xen machine anymore, just domUs are
> > crashing this time.
>
> Interesting. It looks like xen is running out of memory below 4GB, and
> can't service the domain's request for a new L3 PGD, causing the domain
> to bug out.
>
> Are you using dom0_mem= on the xen command line to constrain dom0's
> memory usage, or are you relying on dom0 releasing memory automatically as
> you start other domains? If the latter, I expect dom0 is hogging all the
> pages below 4GB. [Grrr, PAE is such a crock...]
No, I haven't used dom0_mem since xen3 came out, but I will set dom0_mem
again to check whether this makes any difference. In general I like the new
feature of letting xen handle the dom0 memory.
And I agree, pae is a crock, but the problem in my case is that we already
have some production systems running on xen hosts in 32bit mode, and it's
not that easy to upgrade them to 64bit (because of the downtime and so on).
At the time the servers were bought I didn't know that pci devices take up
500MB to 1GB of address space, and I thought pae was only needed for systems
with more than 4GB of physical RAM.
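
To make that arithmetic concrete, here is a rough sketch. It assumes the chipset remaps the RAM displaced by the pci hole to addresses above 4GB (not every board does this), and the function name and the two hole sizes are only illustrative, taken from the 500MB to 1GB range mentioned above.

```python
GIB = 1024 ** 3
MIB = 1024 ** 2


def ram_above_4g(installed_bytes, pci_hole_bytes):
    """Bytes of RAM that land above the 4 GiB boundary.

    Assumption: RAM that would overlap the PCI hole is remapped
    above 4 GiB by the chipset instead of being lost.
    """
    # Address space below 4 GiB that is still available for RAM.
    usable_below_4g = 4 * GIB - pci_hole_bytes
    return max(0, installed_bytes - usable_below_4g)


if __name__ == "__main__":
    for hole_mib in (512, 1024):
        above = ram_above_4g(4 * GIB, hole_mib * MIB)
        print(f"pci hole {hole_mib} MiB: {above // MIB} MiB of RAM above 4 GiB")
```

So even a machine with exactly 4GB installed can have several hundred MB of RAM that is only reachable with pae, while a 2GB machine is not affected at all.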
Using a 64bit kernel with a 32bit userspace is horrible too (at least in my
opinion), because for example iptables won't run then (if I got that right).
Mixing 64bit and 32bit userspace is possible, but I don't like that idea.
Customers want a clean system and not a somehow-working solution. But I guess
in the future we will be forced to upgrade to 64bit anyway, because that is
definitely the future.
> Given that your 3ware controller is already putting pressure on the
> bottom 4GB you'd be better off setting your initial dom0 memory at boot
> time.
So you think that this bug I see is also related to the 3ware controller and
not a general issue?
I cannot really check it, because this 3ware system is the only server with
4GB of RAM that I have available for testing.
Is this really a 3ware-specific problem?
I am asking because I want to know whether I should stop buying 3ware
controllers for xen systems. In the next month we will need new xen systems
and I don't want to buy the wrong hw then :)
> Please let me know how you get on. BTW: can you get a serial line on the
> machine? It might be interesting to see some of xen's memory usage
> diagnostics.
There is already a serial console on the machine, but it's not showing
anything interesting automatically (just the normal xen output from boot time).
I guess with some of the SysRq keys I can get the information you need, right?
I will take a look and mail you this information. That should not be a problem.
You can also have an ssh account on dom0 if you like, I just have to attach
the server to another network then.
> Ian
>
> > The domU doesn't always crash at the very same place: sometimes at the
> > beginning of the init process, sometimes when it loads modules, sometimes
> > when services get started... Sometimes this crash happens more than once
> > before the domU panics.
> >
> > here is what I see in the domU console:
> >
> > ------------[ cut here ]------------
> > kernel BUG at <bad filename>:63723!
> > invalid operand: 0000 [#1]
> > SMP
> > Modules linked in: 8250 reiserfs efs isofs vfat fat ext3 jbd evdev pci_hotplug dm_mod sd_mod 3w_xxxx e1000 jedec_probe cfi_probe gen_probe chipreg mtdcore map_funcs i2c_i801 i2c_core parport_pc parport serial_core usbhid pcmcia yenta_socket rsrc_nonstatic pcmcia_core processor genrtc sbp2 ohci1394 ieee1394 usb_storage ohci_hcd uhci_hcd 3w_9xxx scsi_mod unix
> > CPU: 0
> > EIP: 0061:[<c01182b6>] Not tainted VLI
> > EFLAGS: 00010282 (2.6.12.6-xen)
> > EIP is at pgd_ctor+0x26/0x30
> > eax: fffffff4 ebx: 00000001 ecx: f577e000 edx: 00000000
> > esi: c118fd80 edi: c12bd258 ebp: c12bd240 esp: c864dd38
> > ds: 007b es: 007b ss: 0069
> > Process rcS (pid: 1041, threadinfo=c864c000 task=c06f8a40)
> > Stack: c77ae000 00000000 00000020 c014dd51 c77ae000 c118fd80 00000001 c12bd240
> >        c77ae000 c118fd80 00000000 c014decd c118fd80 c12bd240 00000001 000000d0
> >        c118fde0 00000001 000000d0 c119d980 0000000c 000000d0 00000000 c014e0db
> > Call Trace:
> > [<c014dd51>] cache_init_objs+0x71/0x80
> > [<c014decd>] cache_grow+0x10d/0x1a0
> > [<c014e0db>] cache_alloc_refill+0x17b/0x220
> > [<c014e39f>] kmem_cache_alloc+0x7f/0x90
> > [<c011833d>] pgd_alloc+0x1d/0x310
> > [<c01216fe>] mm_init+0xce/0x100
> > [<c0121a14>] copy_mm+0xd4/0x3d0
> > [<c0121fdf>] copy_files+0x1af/0x320
> > [<c03f9d00>] parse_header+0xb0/0xe0
> > [<c03f9d04>] parse_header+0xb4/0xe0
> > [<c01225af>] copy_process+0x3df/0xd00
> > [<c0166f4f>] fd_install+0x2f/0x60
> > [<c0122fc9>] do_fork+0x69/0x18f
> > [<c0130e4a>] sys_rt_sigprocmask+0xaa/0x110
> > [<c0108f91>] sys_fork+0x31/0x40
> > [<c010a65d>] syscall_call+0x7/0xb
> > Code: 00 f3 ab 5f c3 83 ec 0c b8 20 00 00 00 89 44 24 08 31 c0 89 44 24 04 8b 44 24 10 89 04 24 e8 d2 2b 00 00 85 c0 75 04 83 c4 0c c3 <0f> 0b eb f8 8d b6 00 00 00 00 83 ec 08 b8 f8 e3 36 c0 89 5c 24
> > /etc/init.d/rcS: line 57: 1041 Segmentation fault ( trap - INT QUIT TSTP; set start; . $i )
> >
> > Is there something I can do to help resolve that?
> >
> > thx & regards,
> > -- Ralph
> >
> > > Ian
_______________________________________________
Xen-users mailing list
Xen-users@xxxxxxxxxxxxxxxxxxx
http://lists.xensource.com/xen-users