It looks like yesterday's paging problems were an artifact of something on
that particular dom0 disk image -- I haven't been able to reproduce them
on other nodes, and after I re-imaged the same node (with the same
image), the problem went away there too, so that rules out hardware.

However, something else did pop up -- while trying to break things with
"perl -e '$a="a"x100000000'", I got the messages below; do we care?
They come from mm/page_alloc.c in mainline Linux, and it's hard for me to
tell from the code whether they are outright errors or whether the
failed allocations were recoverable. Does anyone know?

DOM4: __alloc_pages: 0-order allocation failed (gfp=0xf0/0)
DOM4: __alloc_pages: 0-order allocation failed (gfp=0xf0/0)
DOM4: __alloc_pages: 0-order allocation failed (gfp=0xf0/0)
DOM4: __alloc_pages: 0-order allocation failed (gfp=0xf0/0)
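
For reference, a rough way to read the gfp value in the messages above. This is only a sketch: the bit values are the ones I recall from include/linux/mm.h in 2.4 kernels and should be checked against the tree actually in use, and decode_gfp is a hypothetical helper name, not anything from the kernel. If the values are right, 0xf0 is the combination 2.4 calls GFP_NOFS (the allocation may sleep and start I/O but must not re-enter the filesystem), and the trailing "/0" appears to be the caller's PF_MEMALLOC state.

```python
# Flag bits as recalled from Linux 2.4's include/linux/mm.h.
# ASSUMPTION: verify these against the exact kernel tree before relying on them.
GFP_BITS = {
    0x01: "__GFP_DMA",
    0x02: "__GFP_HIGHMEM",
    0x10: "__GFP_WAIT",
    0x20: "__GFP_HIGH",
    0x40: "__GFP_IO",
    0x80: "__GFP_HIGHIO",
    0x100: "__GFP_FS",
}


def decode_gfp(mask):
    """Return the names of the flag bits set in a 2.4-style gfp mask.

    Hypothetical helper for reading log lines; not part of any kernel API.
    """
    names = [name for bit, name in sorted(GFP_BITS.items()) if mask & bit]
    known = 0
    for bit in GFP_BITS:
        known |= bit
    leftover = mask & ~known
    if leftover:
        # Report any bits this table does not know about, in hex.
        names.append(hex(leftover))
    return names


if __name__ == "__main__":
    # The value from "gfp=0xf0/0" in the DOM4 messages.
    print(decode_gfp(0xF0))
```

Whether a failed 0-order allocation with this mask is recoverable depends on how the caller handles the NULL return, so the decode narrows down the kind of caller but does not settle the question by itself.
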

This was still on last Monday's version of 1.2; I'll see if I can
reproduce it on today's 1.2, which I'm deploying later tonight.

Steve

On Mon, Feb 09, 2004 at 10:34:34PM -0800, wrote:
> Before anyone burns too much time on this, hang on -- I wasn't able to
> duplicate the problem on another cluster node (both nodes were built
> from the same SystemImager image). I'm looking for the reason why, and
> will let you know as soon as I do.
>
> Steve
>
> On Mon, Feb 09, 2004 at 08:13:22PM -0800, wrote:
> > Okay, the problem still exists even when I bump the memory up to 256MB
> > so that the domain never swaps; i.e., I've found no workaround. Hasn't
> > anyone else hit anything like this?
> >
> > Steve
> >
> >
> > DOM3: xen_console_init
> > DOM3: Linux version 2.4.24-xeno (stevegt@pathfinder) (gcc version
> > 3.0.4) #16 Mon Feb 2 17:46:41 PST 2004
> > DOM3: On node 0 totalpages: 65536
> > DOM3: zone(0): 4096 pages.
> > DOM3: zone(1): 61440 pages.
> > DOM3: zone(2): 0 pages.
> > DOM3: Kernel command line:
> > ip=64.71.149.20:10.27.2.50:64.71.149.1:255.255.255.0::eth0:off
> > root=/dev/nfs nfsroot=/export//xen/fs/stevegt/tcx/root 4 DOMID=20
> > DOM3: Initializing CPU#0
> > DOM3: Xen reported: 398.780 MHz processor.
> > DOM3: Calibrating delay loop... 1592.52 BogoMIPS
> > DOM3: Memory: 257132k/262144k available (1078k kernel code, 5012k
> > reserved, 308k data, 52k init, 0k highmem)
> > DOM3: Dentry cache hash table entries: 32768 (order: 6, 262144 bytes)
> > DOM3: Inode cache hash table entries: 16384 (order: 5, 131072 bytes)
> > DOM3: Mount cache hash table entries: 512 (order: 0, 4096 bytes)
> > DOM3: Buffer cache hash table entries: 16384 (order: 4, 65536 bytes)
> > DOM3: Page-cache hash table entries: 65536 (order: 6, 262144 bytes)
> > DOM3: CPU: L1 I cache: 16K, L1 D cache: 16K
> > DOM3: CPU: L2 cache: 512K
> > DOM3: CPU: Intel Pentium II (Deschutes) stepping 01
> > DOM3: POSIX conformance testing by UNIFIX
> > DOM3: Linux NET4.0 for Linux 2.4
> > DOM3: Based upon Swansea University Computer Society NET3.039
> > DOM3: Initializing RT netlink socket
> > DOM3: Starting kswapd
> > DOM3: Journalled Block Device driver loaded
> > DOM3: Installing knfsd (copyright (C) 1996 okir@xxxxxxxxxxxx).
> > DOM3: Xeno console successfully installed
> > DOM3: Starting Xeno Balloon driver
> > DOM3: pty: 256 Unix98 ptys configured
> > DOM3: RAMDISK driver initialized: 16 RAM disks of 4096K size 1024
> > blocksize
> > DOM3: loop: loaded (max 8 devices)
> > DOM3: NET4: Linux TCP/IP 1.0 for NET4.0
> > DOM3: IP Protocols: ICMP, UDP, TCP
> > DOM3: IP: routing cache hash table of 2048 buckets, 16Kbytes
> > DOM3: TCP: Hash tables configured (established 16384 bind 16384)
> > DOM3: IP-Config: Complete:
> > DOM3: device=eth0, addr=64.71.149.20, mask=255.255.255.0,
> > gw=64.71.149.1,
> > DOM3: host=64.71.149.20, domain=, nis-domain=(none),
> > DOM3: bootserver=10.27.2.50, rootserver=10.27.2.50, rootpath=
> > DOM3: ip_conntrack version 2.1 (2048 buckets, 16384 max) - 292 bytes
> > per conntrack
> > DOM3: ip_tables: (C) 2000-2002 Netfilter core team
> > DOM3: NET4: Unix domain sockets 1.0/SMP for Linux NET4.0.
> > DOM3: Looking up port of RPC 100003/2 on 10.27.2.50
> > DOM3: Looking up port of RPC 100005/1 on 10.27.2.50
> > DOM3: VFS: Mounted root (nfs filesystem).
> > DOM3: Freeing unused kernel memory: 52k freed
> > DOM3: INIT: version 2.84 booting
> > DOM3: Activating swap.
> > DOM3: Adding Swap: 262136k swap-space (priority -1)
> > DOM3: Checking root file system...
> > DOM3: fsck 1.27 (8-Mar-2002)
> > DOM3: 10.27.2.50:/export/xen/fs/stevegt/tcx: NFS file system.
> > DOM3: System time was Tue Feb 10 02:14:36 UTC 2004.
> > DOM3: Setting the System Clock using the Hardware Clock as
> > reference...
> > DOM3: modprobe: modprobe: Can't locate module char-major-10-135
> > DOM3: modprobe: modprobe: Can't locate module char-major-4
> > DOM3: hwclock is unable to get I/O port access: the iopl(3) call
> > failed.
> > DOM3: modprobe: modprobe: Can't locate module char-major-10-135
> > DOM3: modprobe: modprobe: Can't locate module char-major-4
> > DOM3: System Clock set. System local time is now Tue Feb 10 02:14:36
> > UTC 2004.
> > DOM3: Calculating module dependencies... depmod: cannot read ELF
> > header from /lib/modules/2.4.24-xeno/modules.dep
> > DOM3: depmod: cannot read ELF header from
> > /lib/modules/2.4.24-xeno/modules.generic_string
> > DOM3: depmod: /lib/modules/2.4.24-xeno/modules.ieee1394map is not an
> > ELF file
> > DOM3: depmod: /lib/modules/2.4.24-xeno/modules.isapnpmap is not an ELF
> > file
> > DOM3: depmod: cannot read ELF header from
> > /lib/modules/2.4.24-xeno/modules.parportmap
> > DOM3: depmod: /lib/modules/2.4.24-xeno/modules.pcimap is not an ELF
> > file
> > DOM3: depmod: cannot read ELF header from
> > /lib/modules/2.4.24-xeno/modules.pnpbiosmap
> > DOM3: depmod: /lib/modules/2.4.24-xeno/modules.usbmap is not an ELF
> > file
> > DOM3: done.
> > DOM3: Loading modules:
> > DOM3: Checking all file systems...
> > DOM3: fsck 1.27 (8-Mar-2002)
> > DOM3: Setting kernel variables.
> > DOM3: Loading the saved-state of the serial devices...
> > DOM3: Mounting local filesystems...
> > DOM3: nothing was mounted
> > DOM3: Running 0dns-down to make sure resolv.conf is ok...done.
> > DOM3: Cleaning: /etc/network/ifstate.
> > DOM3: Setting up IP spoofing protection: rp_filter.
> > DOM3: Configuring network interfaces: done.
> > DOM3: Mounting remote filesystems...
> > DOM3:
> > DOM3: Setting the System Clock using the Hardware Clock as
> > reference...
> > DOM3: System Clock set. Local time: Tue Feb 10 02:14:37 UTC 2004
> > DOM3:
> > DOM3: Cleaning: /tmp /var/lock /var/run.
> > DOM3: Initializing random number generator... done.
> > DOM3: Recovering nvi editor sessions... done.
> > DOM3: INIT: Entering runlevel: 4
> > DOM3: Starting system log daemon: syslogd.
> > DOM3: Starting kernel log daemon: klogd.
> > DOM3: Starting internet superserver: inetd.
> > DOM3: Starting PCMCIA services: module directory
> > /lib/modules/2.4.24-xeno/pcmcia not found.
> > DOM3: Starting OpenBSD Secure Shell server: sshd.
> > DOM3: Starting deferred execution scheduler: atd.
> > DOM3: Starting periodic command scheduler: cron.
> > DOM3: INIT: no more processes left in this runlevel
> > DOM3: Unable to handle kernel paging request at virtual address
> > 20000001
> > DOM3: printing eip:
> > DOM3: c0007743
> > DOM3: *pde=00000000(00000000)
> > DOM3: Oops: 0000
> > DOM3: CPU: 0
> > DOM3: EIP: 0819:[<c0007743>] Not tainted
> > DOM3: EFLAGS: 00010202
> > DOM3: eax: 00000001 ebx: 20000001 ecx: c3ebde6c edx: c3ebde6c
> > DOM3: esi: c3ebc000 edi: c0114254 ebp: c46c1060 esp: c3ebdce4
> > DOM3: ds: 0821 es: 0821 ss: 0821
> > DOM3: Process cc (pid: 13793, stackpage=c3ebd000)<1>
> > DOM3: Stack: 20000001 c3ebc000 c002c728 c3ebc000 c3ebddb0 c0114254
> > ffffffb0 c3ebc000
> > DOM3: c003e789 c3ebde6c c014b5ac c003e314 c3ebde6c 00000000
> > c0114250 00000000
> > DOM3: 00000000 00000000 01082003 8d588810 00000000 00000000
> > 00000000 00000000
> > DOM3: Call Trace: [<c002c728>] [<c003e789>] [<c003e314>] [<c002ccc7>]
> > [<c002cf4a>]
> > DOM3: [<c002cf61>] [<c0090033>] [<c00914bf>]
> > DOM3:
> > DOM3: <1>Unable to handle kernel paging request at virtual address
> > 20000001
> > DOM3: printing eip:
> > DOM3: c000af0f
> > DOM3: *pde=00000000(00000000)
> > DOM3: Oops: 0002
> > DOM3: CPU: 0
> > DOM3: EIP: 0819:[<c000af0f>] Not tainted
> > DOM3: EFLAGS: 00010282
> > DOM3: eax: 20000001 ebx: c485c0a0 ecx: c3ebc264 edx: c3ebc264
> > DOM3: esi: 00000000 edi: 20000001 ebp: 0000000b esp: c3ebdbb4
> > DOM3: ds: 0821 es: 0821 ss: 0821
> > DOM3: Process cc (pid: 13793, stackpage=c3ebd000)<1>
> > DOM3: Stack: c485c0a0 00000000 c3ebc000 0000000b 0000000b c000b55f
> > 20000001 0000001f
> > DOM3: 00000000 cf4227e0 20000001 c0091a87 0000000b 00000000
> > c485c0bc c0096305
> > DOM3: c0129928 c3ebdcb0 00000000 c3ebc000 00000000 20000001
> > c46c1060 00000000
> > DOM3: Call Trace: [<c000b55f>] [<c0091a87>] [<c0096305>] [<c002eb19>]
> > [<c0018a25>]
> > DOM3: [<c0018c46>] [<c0018feb>] [<c0018ed4>] [<c006e759>]
> > [<c0091768>] [<c0007743>]
> > DOM3: [<c002c728>] [<c003e789>] [<c003e314>] [<c002ccc7>]
> > [<c002cf4a>] [<c002cf61>]
> > DOM3: [<c0090033>] [<c00914bf>]
> > DOM3:
> > DOM3: <1>Unable to handle kernel NULL pointer dereference at virtual
> > address 00000001
> > DOM3: printing eip:
> > DOM3: c000b623
> > DOM3: *pde=00000000(00000000)
> > DOM3: Oops: 0002
> > DOM3: CPU: 0
> > DOM3: EIP: 0819:[<c000b623>] Not tainted
> > DOM3: EFLAGS: 00010202
> > DOM3: eax: 00000000 ebx: 00000001 ecx: c3ebc264 edx: c3ebc264
> > DOM3: esi: 00000002 edi: c3ebc000 ebp: 0000000b esp: c3ebdaa0
> > DOM3: ds: 0821 es: 0821 ss: 0821
> > DOM3: Process cc (pid: 13793, stackpage=c3ebd000)<1>
> > DOM3: Stack: 0000001f 00000002 20000001 20000001 c0091a87 0000000b
> > 00000000 00000002
> > DOM3: c0096305 c0129928 c3ebdb80 00000002 c3ebc000 00000002
> > 20000001 0000000b
> > DOM3: 63303039 ffffffff c3ebc000 00000002 38383130 38643538
> > 00030001 64303030
> > DOM3: Call Trace: [<c0091a87>] [<c0096305>] [<c0008996>] [<c000f797>]
> > [<c000f991>]
> > DOM3: [<c0091768>] [<c000af0f>] [<c000b55f>] [<c0091a87>]
> > [<c0096305>] [<c002eb19>]
> > DOM3: [<c0018a25>] [<c0018c46>] [<c0018feb>] [<c0018ed4>]
> > [<c006e759>] [<c0091768>]
> > DOM3: [<c0007743>] [<c002c728>] [<c003e789>] [<c003e314>]
> > [<c002ccc7>] [<c002cf4a>]
> > DOM3: [<c002cf61>] [<c0090033>] [<c00914bf>]
> > DOM3:
> >
> >
> >
> >
> > On Mon, Feb 09, 2004 at 05:25:00PM -0800, wrote:
> > > Hi All,
> > >
> > > I seem to be able to reproduce a null pointer dereference and paging
> > > request errors in 1.2. Can anyone give me any pointers on tracking down
> > > what is causing them?
> > >
> > > This is with a 32MB virtual domain, running Debian woody, NFS root,
> > > 256MB swap in a local VD, while running a process which builds openldap,
> > > python2.2.3, and related packages. I'm not sure which package, if any
> > > in particular, is causing this; it could be anything that causes a
> > > similar workload. This particular set of messages appeared before the
> > > virtual domain locked up during the openldap build...
> > >
> > > Steve
> > >
> > >
> > > DOM26: Unable to handle kernel paging request at virtual address
> > > 20000001
> > > DOM26: printing eip:
> > > DOM26: c0007743
> > > DOM26: *pde=00000000(00000000)
> > > DOM26: Oops: 0000
> > > DOM26: CPU: 0
> > > DOM26: EIP: 0819:[<c0007743>] Not tainted
> > > DOM26: EFLAGS: 00010202
> > > DOM26: eax: 00000001 ebx: 20000001 ecx: c0a79e6c edx: c0a79e6c
> > > DOM26: esi: c0a78000 edi: c0114254 ebp: c1e5f580 esp: c0a79ce4
> > > DOM26: ds: 0821 es: 0821 ss: 0821
> > > DOM26: Process sh (pid: 10086, stackpage=c0a79000)<1>
> > > DOM26: Stack: 20000001 c0a78000 c002c728 c0a78000 c0a79db0 c0114254
> > > ffffffb0 c0a78000
> > > DOM26: c003e789 c0a79e6c c014b5ac c003e314 c0a79e6c 00000000
> > > c0114250 c0a79de8
> > > DOM26: c1419640 80000000 00000000 00000000 00000000 00000000
> > > 00000000 00000000
> > > DOM26: Call Trace: [<c002c728>] [<c003e789>] [<c003e314>] [<c002ccc7>]
> > > [<c002cf4a>]
> > > DOM26: [<c002cf61>] [<c0090033>] [<c00914bf>]
> > > DOM26:
> > > DOM26: <1>Unable to handle kernel paging request at virtual address
> > > 20000001
> > > DOM26: printing eip:
> > > DOM26: c000af0f
> > > DOM26: *pde=00000000(00000000)
> > > DOM26: Oops: 0002
> > > DOM26: CPU: 0
> > > DOM26: EIP: 0819:[<c000af0f>] Not tainted
> > > DOM26: EFLAGS: 00010282
> > > DOM26: eax: 20000001 ebx: c1ed5b20 ecx: c0a78264 edx: c0a78264
> > > DOM26: esi: 00000000 edi: 20000001 ebp: 0000000b esp: c0a79bb4
> > > DOM26: ds: 0821 es: 0821 ss: 0821
> > > DOM26: Process sh (pid: 10086, stackpage=c0a79000)<1>
> > > DOM26: Stack: c1ed5b20 00000000 c0a78000 0000000b 0000000b c000b55f
> > > 20000001 0000001f
> > > DOM26: 00000000 c140d6c0 20000001 c0091a87 0000000b 00000000
> > > c1ed5b3c c0096305
> > > DOM26: c0129928 c0a79cb0 00000000 c0a78000 00000000 20000001
> > > c1e5f580 00000000
> > > DOM26: Call Trace: [<c000b55f>] [<c0091a87>] [<c0096305>] [<c002eb19>]
> > > [<c0018a25>]
> > > DOM26: [<c0018c46>] [<c0018feb>] [<c0018ed4>] [<c006e759>]
> > > [<c0091768>] [<c0007743>]
> > > DOM26: [<c002c728>] [<c003e789>] [<c003e314>] [<c002ccc7>]
> > > [<c002cf4a>] [<c002cf61>]
> > > DOM26: [<c0090033>] [<c00914bf>]
> > > DOM26:
> > > DOM26: <1>Unable to handle kernel NULL pointer dereference at virtual
> > > address 00000001
> > > DOM26: printing eip:
> > > DOM26: c000b623
> > > DOM26: *pde=00000000(00000000)
> > > DOM26: Oops: 0002
> > > DOM26: CPU: 0
> > > DOM26: EIP: 0819:[<c000b623>] Not tainted
> > > DOM26: EFLAGS: 00010202
> > > DOM26: eax: 00000000 ebx: 00000001 ecx: c0a78264 edx: c0a78264
> > > DOM26: esi: 00000002 edi: c0a78000 ebp: 0000000b esp: c0a79aa0
> > > DOM26: ds: 0821 es: 0821 ss: 0821
> > > DOM26: Process sh (pid: 10086, stackpage=c0a79000)<1>
> > > DOM26: Stack: 0000001f 00000002 20000001 20000001 c0091a87 0000000b
> > > 00000000 00000002
> > > DOM26: c0096305 c0129928 c0a79b80 00000002 c0a78000 00000002
> > > 20000001 0000000b
> > > DOM26: 63303039 c101fc58 c0a78000 00000002 c101fc58 ffffffff
> > > 00030001 c001e621
> > > DOM26: Call Trace: [<c0091a87>] [<c0096305>] [<c001e621>] [<c001f6c0>]
> > > [<c0008996>]
> > > DOM26: [<c00200d7>] [<c00204e1>] [<c001464d>] [<c0014c92>]
> > > [<c0091768>] [<c000af0f>]
> > > DOM26: [<c000b55f>] [<c0091a87>] [<c0096305>] [<c002eb19>]
> > > [<c0018a25>] [<c0018c46>]
> > > DOM26: [<c0018feb>] [<c0018ed4>] [<c006e759>] [<c0091768>]
> > > [<c0007743>] [<c002c728>]
> > > DOM26: [<c003e789>] [<c003e314>] [<c002ccc7>] [<c002cf4a>]
> > > [<c002cf61>] [<c0090033>]
> > > DOM26: [<c00914bf>]
> > > DOM26:
> > >
> > >
> > >
> > > --
> > > Stephen G. Traugott (KG6HDQ)
> > > UNIX/Linux Infrastructure Architect, TerraLuna LLC
> > > stevegt@xxxxxxxxxxxxx
> > > http://www.stevegt.com -- http://Infrastructures.Org
--
Stephen G. Traugott (KG6HDQ)
UNIX/Linux Infrastructure Architect, TerraLuna LLC
stevegt@xxxxxxxxxxxxx
http://www.stevegt.com -- http://Infrastructures.Org
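
For anyone reading the quoted oops reports above: the number after "Oops:" is the i386 page-fault error code, printed in hex. A minimal sketch of decoding its low three bits follows; decode_oops_code is a hypothetical helper name, and the sketch assumes the code reaches the guest kernel unchanged under Xen.

```python
def decode_oops_code(code_text):
    """Decode the hex error code printed after 'Oops:' on i386.

    Hypothetical helper for reading logs. Bit meanings follow the x86
    page-fault error code: bit 0 = protection fault (else page not
    present), bit 1 = write (else read), bit 2 = user mode (else kernel).
    """
    code = int(code_text, 16)
    return {
        "cause": "protection fault" if code & 1 else "page not present",
        "access": "write" if code & 2 else "read",
        "mode": "user" if code & 4 else "kernel",
    }


if __name__ == "__main__":
    # The two codes that appear in the DOM3 and DOM26 reports.
    for text in ("0000", "0002"):
        print(text, decode_oops_code(text))
```

Read this way, both codes in the reports describe kernel-mode accesses to a page that was not present: a read for 0000 and a write for 0002.
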
_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxxxxxxxxxx
https://lists.sourceforge.net/lists/listinfo/xen-devel