Re: [BUG] 2.6.23-rc3-mm1 kernel BUG at mm/page

* Re: [BUG] 2.6.23-rc3-mm1 kernel BUG at mm/page_alloc.c:2876!
       [not found] <46CC9A7A.2030404@linux.vnet.ibm.com>
@ 2007-08-22 20:48 ` Andrew Morton
  2007-08-22 20:50   ` Andrew Morton
  2007-08-22 21:09   ` Christoph Lameter
  0 siblings, 2 replies; 9+ messages in thread
From: Andrew Morton @ 2007-08-22 20:48 UTC (permalink / raw)
  To: Kamalesh Babulal
  Cc: linux-kernel, Balbir Singh, Christoph Lameter, Mel Gorman, linux-mm

On Thu, 23 Aug 2007 01:50:10 +0530
Kamalesh Babulal <kamalesh@linux.vnet.ibm.com> wrote:

> Hi Andrew,
> 
> I see call trace followed by the kernel bug with the 2.6.23-rc3-mm1
> kernel and have attached the boot log and config file.
> 
> =======================================================
> SLUB: Genslabs=12, HWalign=128, Order=0-1, MinObjects=4, CPUs=4, Nodes=16
> Bad page state in process 'swapper'
> page:cf00000000015818 flags:0x0000020000000400 mapping:0000000000000000 
> mapcount:0 count:0
> Trying to fix it up, but a reboot is needed
> Backtrace:
> Call Trace:
> [c0000000005cbab0] [c000000000010344] .show_stack+0x68/0x1b4 (unreliable)
> [c0000000005cbb60] [c0000000000a6c54] .bad_page+0x84/0x138
> [c0000000005cbbf0] [c0000000000aa9e0] .free_hot_cold_page+0xdc/0x21c
> [c0000000005cbc90] [c0000000000ad7ec] .put_page+0x158/0x180
> [c0000000005cbd30] [c0000000000d4de8] .kfree+0x74/0xf0
> [c0000000005cbdb0] [c0000000000a866c] .process_zones+0x1a8/0x1f8
> [c0000000005cbe60] [c0000000004b5160] .setup_per_cpu_pageset+0x24/0x48
> [c0000000005cbee0] [c0000000004978d8] .start_kernel+0x304/0x3f4
> [c0000000005cbf90] [c0000000003bef10] .start_here_common+0x54/0x58
> Hexdump:
> 000: cf 00 00 00 00 01 57 d0 00 00 02 00 00 00 04 00
> 010: 00 00 00 01 ff ff ff ff 00 00 00 00 00 00 00 00
> 020: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
> 030: cf 00 00 00 00 01 58 08 cf 00 00 00 00 01 58 08
> 040: 00 00 02 00 00 00 04 00 00 00 00 00 ff ff ff ff
> 050: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
> 060: 00 00 00 00 00 00 00 00 cf 00 00 00 00 01 58 40
> 070: cf 00 00 00 00 01 58 40 00 00 02 00 00 00 04 00
> 080: 00 00 00 01 ff ff ff ff 00 00 00 00 00 00 00 00
> 090: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
> 0a0: cf 00 00 00 00 01 58 78 cf 00 00 00 00 01 58 78
> 0b0: 00 00 02 00 00 00 04 00 00 00 00 01 ff ff ff ff
> ------------[ cut here ]------------
> kernel BUG at mm/page_alloc.c:2876!
> cpu 0x0: Vector: 700 (Program Check) at [c0000000005cbbe0]
>     pc: c0000000004b5160: .setup_per_cpu_pageset+0x24/0x48
>     lr: c0000000004b5160: .setup_per_cpu_pageset+0x24/0x48
>     sp: c0000000005cbe60
>    msr: 8000000000029032
>   current = 0xc0000000004fd1b0
>   paca    = 0xc0000000004fdd80
>     pid   = 0, comm = swapper
> kernel BUG at mm/page_alloc.c:2876!
> 

Looks like process_zones() got a kmalloc_node() failure and then crashed in
the recovery code.

This:

--- a/mm/page_alloc.c~a
+++ a/mm/page_alloc.c
@@ -2814,6 +2814,8 @@ static int __cpuinit process_zones(int c
 	return 0;
 bad:
 	for_each_zone(dzone) {
+		if (!populated_zone(zone))
+			continue;		
 		if (dzone == zone)
 			break;
 		kfree(zone_pcp(dzone, cpu));
_

might help avoid the crash, but why did kmalloc_node() fail?


--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

* Re: [BUG] 2.6.23-rc3-mm1 kernel BUG at mm/page_alloc.c:2876!
  2007-08-22 20:48 ` [BUG] 2.6.23-rc3-mm1 kernel BUG at mm/page_alloc.c:2876! Andrew Morton
@ 2007-08-22 20:50   ` Andrew Morton
  2007-08-23 13:07     ` Mel Gorman
  2007-08-22 21:09   ` Christoph Lameter
  1 sibling, 1 reply; 9+ messages in thread
From: Andrew Morton @ 2007-08-22 20:50 UTC (permalink / raw)
  To: Kamalesh Babulal, linux-kernel, Balbir Singh, Christoph Lameter,
	Mel Gorman, linux-mm

On Wed, 22 Aug 2007 13:48:00 -0700
Andrew Morton <akpm@linux-foundation.org> wrote:

> This:
> 
> --- a/mm/page_alloc.c~a
> +++ a/mm/page_alloc.c
> @@ -2814,6 +2814,8 @@ static int __cpuinit process_zones(int c
>  	return 0;
>  bad:
>  	for_each_zone(dzone) {
> +		if (!populated_zone(zone))
> +			continue;		
>  		if (dzone == zone)
>  			break;
>  		kfree(zone_pcp(dzone, cpu));
> _
> 
> might help avoid the crash

err, make that

--- a/mm/page_alloc.c~a
+++ a/mm/page_alloc.c
@@ -2814,6 +2814,8 @@ static int __cpuinit process_zones(int c
 	return 0;
 bad:
 	for_each_zone(dzone) {
+		if (!populated_zone(dzone))
+			continue;
 		if (dzone == zone)
 			break;
 		kfree(zone_pcp(dzone, cpu));
_
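
A user-space sketch of the unwind loop may make the intent of the corrected
hunk easier to see. It is only a model, not kernel code: struct zone_model,
boot_pageset_model, model_alloc() and process_zones_model() are hypothetical
names. It also assumes that an unpopulated zone still holds a pointer that
never came from the allocator, so freeing it is invalid (which is what the
kfree -> put_page -> bad_page backtrace appears to show).

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical stand-in for struct zone; only what this model needs. */
struct zone_model {
	bool populated;	/* models populated_zone() */
	void *pcp;	/* models zone_pcp(zone, cpu) */
};

/* Placeholder that was never malloc()ed; passing it to free() would be a bug. */
static char boot_pageset_model[64];

/* Allocation that can be forced to fail for one zone index. */
static void *model_alloc(int idx, int fail_at)
{
	return idx == fail_at ? NULL : malloc(64);
}

/*
 * Allocate a pcp for every populated zone. On failure, free only what was
 * allocated before the failing zone. Returns 0 on success, -1 on failure;
 * *freed counts the free() calls made while unwinding.
 */
static int process_zones_model(struct zone_model *zones, int n,
			       int fail_at, int *freed)
{
	int i, j;

	assert(zones && freed && n >= 0);
	*freed = 0;
	for (i = 0; i < n; i++) {
		if (!zones[i].populated)
			continue;	/* pcp keeps pointing at the placeholder */
		zones[i].pcp = model_alloc(i, fail_at);
		if (!zones[i].pcp)
			goto bad;
	}
	return 0;
bad:
	for (j = 0; j < n; j++) {
		if (!zones[j].populated)
			continue;	/* the added check: nothing to free here */
		if (j == i)
			break;		/* reached the zone whose allocation failed */
		free(zones[j].pcp);
		zones[j].pcp = NULL;
		(*freed)++;
	}
	return -1;
}
```

With four zones whose pcp initially points at boot_pageset_model, zone 1
unpopulated, and the allocation for zone 3 forced to fail,
process_zones_model(zones, 4, 3, &freed) returns -1 and frees only zones 0
and 2. The check only keeps the recovery path from freeing the placeholder;
it does nothing about the allocation failure itself.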



* Re: [BUG] 2.6.23-rc3-mm1 kernel BUG at mm/page_alloc.c:2876!
  2007-08-22 20:50   ` Andrew Morton
@ 2007-08-23 13:07     ` Mel Gorman
  2007-08-23 17:17       ` Kamalesh Babulal
  0 siblings, 1 reply; 9+ messages in thread
From: Mel Gorman @ 2007-08-23 13:07 UTC (permalink / raw)
  To: Andrew Morton
  Cc: Kamalesh Babulal, linux-kernel, Balbir Singh, Christoph Lameter,
	linux-mm

On (22/08/07 13:50), Andrew Morton didst pronounce:
> On Wed, 22 Aug 2007 13:48:00 -0700
> Andrew Morton <akpm@linux-foundation.org> wrote:
> 
> > This:
> > 
> > --- a/mm/page_alloc.c~a
> > +++ a/mm/page_alloc.c
> > @@ -2814,6 +2814,8 @@ static int __cpuinit process_zones(int c
> >  	return 0;
> >  bad:
> >  	for_each_zone(dzone) {
> > +		if (!populated_zone(zone))
> > +			continue;		
> >  		if (dzone == zone)
> >  			break;
> >  		kfree(zone_pcp(dzone, cpu));
> > _
> > 
> > might help avoid the crash
> 
> err, make that
> 

We're already in the error path at this point and it's going to blow up.
The real problem is kmalloc_node() returning NULL for whatever reason.

> --- a/mm/page_alloc.c~a
> +++ a/mm/page_alloc.c
> @@ -2814,6 +2814,8 @@ static int __cpuinit process_zones(int c
>  	return 0;
>  bad:
>  	for_each_zone(dzone) {
> +		if (!populated_zone(dzone))
> +			continue;
>  		if (dzone == zone)
>  			break;
>  		kfree(zone_pcp(dzone, cpu));
> _
> 
> 

-- 
Mel Gorman
Part-time Phd Student                          Linux Technology Center
University of Limerick                         IBM Dublin Software Lab


* Re: [BUG] 2.6.23-rc3-mm1 kernel BUG at mm/page_alloc.c:2876!
  2007-08-23 13:07     ` Mel Gorman
@ 2007-08-23 17:17       ` Kamalesh Babulal
  2007-08-23 20:05         ` Christoph Lameter
  0 siblings, 1 reply; 9+ messages in thread
From: Kamalesh Babulal @ 2007-08-23 17:17 UTC (permalink / raw)
  To: Andrew Morton
  Cc: Mel Gorman, linux-kernel, Balbir Singh, Christoph Lameter, linux-mm

Mel Gorman wrote:
> On (22/08/07 13:50), Andrew Morton didst pronounce:
>   
>> On Wed, 22 Aug 2007 13:48:00 -0700
>> Andrew Morton <akpm@linux-foundation.org> wrote:
>>
>>     
>>> This:
>>>
>>> --- a/mm/page_alloc.c~a
>>> +++ a/mm/page_alloc.c
>>> @@ -2814,6 +2814,8 @@ static int __cpuinit process_zones(int c
>>>  	return 0;
>>>  bad:
>>>  	for_each_zone(dzone) {
>>> +		if (!populated_zone(zone))
>>> +			continue;		
>>>  		if (dzone == zone)
>>>  			break;
>>>  		kfree(zone_pcp(dzone, cpu));
>>> _
>>>
>>> might help avoid the crash
>>>       
>> err, make that
>>
>>     
>
> We're already in the error path at this point and it's going to blow up.
> The real problem is kmalloc_node() returning NULL for whatever reason.
>
>   
>> --- a/mm/page_alloc.c~a
>> +++ a/mm/page_alloc.c
>> @@ -2814,6 +2814,8 @@ static int __cpuinit process_zones(int c
>>  	return 0;
>>  bad:
>>  	for_each_zone(dzone) {
>> +		if (!populated_zone(dzone))
>> +			continue;
>>  		if (dzone == zone)
>>  			break;
>>  		kfree(zone_pcp(dzone, cpu));
>> _
>>
>>
>>     
>
>   
After applying the patch, the call trace is gone, but the kernel BUG
is still hit:


Memory: 4105840k/4194304k available (4964k kernel code, 88464k reserved, 
948k data, 571k bss, 264k init)
SLUB: Genslabs=12, HWalign=128, Order=0-1, MinObjects=4, CPUs=4, Nodes=16
------------[ cut here ]------------
kernel BUG at mm/page_alloc.c:2878!
cpu 0x0: Vector: 700 (Program Check) at [c0000000005cbbe0]
pc: c0000000004b5160: .setup_per_cpu_pageset+0x24/0x48
lr: c0000000004b5160: .setup_per_cpu_pageset+0x24/0x48
sp: c0000000005cbe60
msr: 8000000000029032
current = 0xc0000000004fd1b0
paca = 0xc0000000004fdd80
pid = 0, comm = swapper
kernel BUG at mm/page_alloc.c:2878!
enter ? for help
[c0000000005cbee0] c0000000004978d8 .start_kernel+0x304/0x3f4
[c0000000005cbf90] c0000000003bef1c .start_here_common+0x54/0x58

-
Kamalesh Babulal





* Re: [BUG] 2.6.23-rc3-mm1 kernel BUG at mm/page_alloc.c:2876!
  2007-08-23 17:17       ` Kamalesh Babulal
@ 2007-08-23 20:05         ` Christoph Lameter
  2007-08-24  6:15           ` Kamalesh Babulal
  0 siblings, 1 reply; 9+ messages in thread
From: Christoph Lameter @ 2007-08-23 20:05 UTC (permalink / raw)
  To: Kamalesh Babulal
  Cc: Andrew Morton, Mel Gorman, linux-kernel, Balbir Singh, linux-mm

On Thu, 23 Aug 2007, Kamalesh Babulal wrote:

> After applying the patch, the call trace is gone but the kernel bug
> is still hit

Yes, that is what we expected. We need more information to figure out why 
kmalloc_node() fails there. It should walk through all nodes to find 
memory.

I see that you have 4 cpus and 16 nodes. How are the cpus assigned to 
nodes? If a cpu were assigned to a nonexistent node, this could be 
the result.

Could you post the full boot log?


* Re: [BUG] 2.6.23-rc3-mm1 kernel BUG at mm/page_alloc.c:2876!
  2007-08-23 20:05         ` Christoph Lameter
@ 2007-08-24  6:15           ` Kamalesh Babulal
  2007-08-24  8:58             ` Mel Gorman
  2007-08-24 16:54             ` Christoph Lameter
  0 siblings, 2 replies; 9+ messages in thread
From: Kamalesh Babulal @ 2007-08-24  6:15 UTC (permalink / raw)
  To: Christoph Lameter
  Cc: Andrew Morton, Mel Gorman, linux-kernel, Balbir Singh, linux-mm

Christoph Lameter wrote:
> On Thu, 23 Aug 2007, Kamalesh Babulal wrote:
>
>   
>> After applying the patch, the call trace is gone but the kernel bug
>> is still hit
>>     
>
> Yes that is what we expected. We need more information to figure out why 
> the kmalloc_node fails there. It should walk through all nodes to find 
> memory.
>
> I see that you have 4 cpus and 16 nodes. How are the cpus assigned to 
> nodes? If a cpu would be assigned to a nonexisting node then this could be 
> the result.
>
> Could you post the full boot log?
>
>   
Boot log with Andrew's patch applied:

Welcome to yaboot version 1.3.13
Enter "help" to get some basic usage information
boot: autobench
Please wait, loading kernel...
Elf64 kernel loaded...
Loading ramdisk...
ramdisk loaded at 02400000, size: 1191 Kbytes
OF stdout device is: /vdevice/vty@30000000
Hypertas detected, assuming LPAR !
command line: ro console=hvc0 autobench_args: root=/dev/sda6 
ABAT:1187885681
memory layout at init:
alloc_bottom : 000000000252a000
alloc_top : 0000000008000000
alloc_top_hi : 0000000100000000
rmo_top : 0000000008000000
ram_top : 0000000100000000
Looking for displays
instantiating rtas at 0x00000000077d9000 ... done
0000000000000000 : boot cpu 0000000000000000
0000000000000002 : starting cpu hw idx 0000000000000002... done
copying OF device tree ...
Building dt strings...
Building dt structure...
Device tree strings 0x000000000262b000 -> 0x000000000262c1d3
Device tree struct 0x000000000262d000 -> 0x0000000002635000
Calling quiesce ...
returning from prom_init
Partition configured for 4 cpus.


Starting Linux PPC64 #1 SMP Thu Aug 23 11:54:44 EDT 2007
-----------------------------------------------------
ppc64_pft_size = 0x1a
physicalMemorySize = 0x100000000
ppc64_caches.dcache_line_size = 0x80
ppc64_caches.icache_line_size = 0x80
htab_address = 0x0000000000000000
htab_hash_mask = 0x7ffff
-----------------------------------------------------
Linux version 2.6.23-rc3-mm1-autokern1 
(root@gekko-lp3.ltc.austin.ibm.com) (gcc version 3.4.6 20060404 (Red Hat 
3.4.6-3)) #1 SMP Thu Aug 23 11:54:44 EDT 2007
[boot]0012 Setup Arch
vmemmap cf00000000000000 allocated at c000000001000000, physical 
0000000001000000.
vmemmap cf00000001000000 allocated at c000000004000000, physical 
0000000004000000.
vmemmap cf00000002000000 allocated at c000000005000000, physical 
0000000005000000.
vmemmap cf00000003000000 allocated at c000000006000000, physical 
0000000006000000.
EEH: PCI Enhanced I/O Error Handling Enabled
PPC64 nvram contains 7168 bytes
Zone PFN ranges:
DMA 0 -> 1048576
Normal 1048576 -> 1048576
Movable zone start PFN for each node
early_node_map[1] active PFN ranges
2: 0 -> 1048576
Could not find start_pfn for node 0
[boot]0015 Setup Done
Built 2 zonelists in Node order, mobility grouping off. Total pages: 0
Policy zone: DMA
Kernel command line: ro console=hvc0 autobench_args: root=/dev/sda6 
ABAT:1187885681
[boot]0020 XICS Init
[boot]0021 XICS Done
PID hash table entries: 4096 (order: 12, 32768 bytes)
Console: colour dummy device 80x25
console handover: boot [udbg0] -> real [hvc0]
Dentry cache hash table entries: 524288 (order: 10, 4194304 bytes)
Inode-cache hash table entries: 262144 (order: 9, 2097152 bytes)
freeing bootmem node 2
Memory: 4105840k/4194304k available (4964k kernel code, 88464k reserved, 
948k data, 571k bss, 264k init)
SLUB: Genslabs=12, HWalign=128, Order=0-1, MinObjects=4, CPUs=4, Nodes=16
------------[ cut here ]------------
kernel BUG at mm/page_alloc.c:2878!
cpu 0x0: Vector: 700 (Program Check) at [c0000000005cbbe0]
pc: c0000000004b5160: .setup_per_cpu_pageset+0x24/0x48
lr: c0000000004b5160: .setup_per_cpu_pageset+0x24/0x48
sp: c0000000005cbe60
msr: 8000000000029032
current = 0xc0000000004fd1b0
paca = 0xc0000000004fdd80
pid = 0, comm = swapper
kernel BUG at mm/page_alloc.c:2878!
enter ? for help
[c0000000005cbee0] c0000000004978d8 .start_kernel+0x304/0x3f4
[c0000000005cbf90] c0000000003bef1c .start_here_common+0x54/0x58

-
Kamalesh Babulal.




* Re: [BUG] 2.6.23-rc3-mm1 kernel BUG at mm/page_alloc.c:2876!
  2007-08-24  6:15           ` Kamalesh Babulal
@ 2007-08-24  8:58             ` Mel Gorman
  2007-08-24 16:54             ` Christoph Lameter
  1 sibling, 0 replies; 9+ messages in thread
From: Mel Gorman @ 2007-08-24  8:58 UTC (permalink / raw)
  To: Kamalesh Babulal
  Cc: Christoph Lameter, Andrew Morton, linux-kernel, Balbir Singh, linux-mm

On (24/08/07 11:45), Kamalesh Babulal didst pronounce:
> Christoph Lameter wrote:
> >On Thu, 23 Aug 2007, Kamalesh Babulal wrote:
> >
> >  
> >>After applying the patch, the call trace is gone but the kernel bug
> >>is still hit
> >>    
> >
> >Yes that is what we expected. We need more information to figure out why 
> >the kmalloc_node fails there. It should walk through all nodes to find 
> >memory.
> >
> >I see that you have 4 cpus and 16 nodes. How are the cpus assigned to 
> >nodes? If a cpu would be assigned to a nonexisting node then this could be 
> >the result.
> >
> >Could you post the full boot log?
> >
> >  
> Boot log with Andrew's patch applied:
> 
> Welcome to yaboot version 1.3.13
> Enter "help" to get some basic usage information
> boot: autobench
> Please wait, loading kernel...
> Elf64 kernel loaded...
> Loading ramdisk...
> ramdisk loaded at 02400000, size: 1191 Kbytes
> OF stdout device is: /vdevice/vty@30000000
> Hypertas detected, assuming LPAR !
> command line: ro console=hvc0 autobench_args: root=/dev/sda6 
> ABAT:1187885681
> memory layout at init:
> alloc_bottom : 000000000252a000
> alloc_top : 0000000008000000
> alloc_top_hi : 0000000100000000
> rmo_top : 0000000008000000
> ram_top : 0000000100000000
> Looking for displays
> instantiating rtas at 0x00000000077d9000 ... done
> 0000000000000000 : boot cpu 0000000000000000
> 0000000000000002 : starting cpu hw idx 0000000000000002... done
> copying OF device tree ...
> Building dt strings...
> Building dt structure...
> Device tree strings 0x000000000262b000 -> 0x000000000262c1d3
> Device tree struct 0x000000000262d000 -> 0x0000000002635000
> Calling quiesce ...
> returning from prom_init
> Partition configured for 4 cpus.
> 
> 
> Starting Linux PPC64 #1 SMP Thu Aug 23 11:54:44 EDT 2007
> -----------------------------------------------------
> ppc64_pft_size = 0x1a
> physicalMemorySize = 0x100000000
> ppc64_caches.dcache_line_size = 0x80
> ppc64_caches.icache_line_size = 0x80
> htab_address = 0x0000000000000000
> htab_hash_mask = 0x7ffff
> -----------------------------------------------------
> Linux version 2.6.23-rc3-mm1-autokern1 
> (root@gekko-lp3.ltc.austin.ibm.com) (gcc version 3.4.6 20060404 (Red Hat 
> 3.4.6-3)) #1 SMP Thu Aug 23 11:54:44 EDT 2007
> [boot]0012 Setup Arch
> vmemmap cf00000000000000 allocated at c000000001000000, physical 
> 0000000001000000.
> vmemmap cf00000001000000 allocated at c000000004000000, physical 
> 0000000004000000.
> vmemmap cf00000002000000 allocated at c000000005000000, physical 
> 0000000005000000.
> vmemmap cf00000003000000 allocated at c000000006000000, physical 
> 0000000006000000.
> EEH: PCI Enhanced I/O Error Handling Enabled
> PPC64 nvram contains 7168 bytes
> Zone PFN ranges:
> DMA 0 -> 1048576
> Normal 1048576 -> 1048576
> Movable zone start PFN for each node
> early_node_map[1] active PFN ranges
> 2: 0 -> 1048576
> Could not find start_pfn for node 0
> [boot]0015 Setup Done
> Built 2 zonelists in Node order, mobility grouping off. Total pages: 0

This indicates to me that the zonelists are trashed. All memory is on
node 2 according to early_node_map[], and the CPU is most likely part of
node 0, which doesn't have a proper fallback list.

> Policy zone: DMA
> Kernel command line: ro console=hvc0 autobench_args: root=/dev/sda6 
> ABAT:1187885681
> [boot]0020 XICS Init
> [boot]0021 XICS Done
> PID hash table entries: 4096 (order: 12, 32768 bytes)
> Console: colour dummy device 80x25
> console handover: boot [udbg0] -> real [hvc0]
> Dentry cache hash table entries: 524288 (order: 10, 4194304 bytes)
> Inode-cache hash table entries: 262144 (order: 9, 2097152 bytes)
> freeing bootmem node 2
> Memory: 4105840k/4194304k available (4964k kernel code, 88464k reserved, 
> 948k data, 571k bss, 264k init)
> SLUB: Genslabs=12, HWalign=128, Order=0-1, MinObjects=4, CPUs=4, Nodes=16
> ------------[ cut here ]------------
> kernel BUG at mm/page_alloc.c:2878!
> cpu 0x0: Vector: 700 (Program Check) at [c0000000005cbbe0]
> pc: c0000000004b5160: .setup_per_cpu_pageset+0x24/0x48
> lr: c0000000004b5160: .setup_per_cpu_pageset+0x24/0x48
> sp: c0000000005cbe60
> msr: 8000000000029032
> current = 0xc0000000004fd1b0
> paca = 0xc0000000004fdd80
> pid = 0, comm = swapper
> kernel BUG at mm/page_alloc.c:2878!
> enter ? for help
> [c0000000005cbee0] c0000000004978d8 .start_kernel+0x304/0x3f4
> [c0000000005cbf90] c0000000003bef1c .start_here_common+0x54/0x58
> 
> -
> Kamalesh Babulal.
> 
> 
> 

-- 
Mel Gorman
Part-time Phd Student                          Linux Technology Center
University of Limerick                         IBM Dublin Software Lab


* Re: [BUG] 2.6.23-rc3-mm1 kernel BUG at mm/page_alloc.c:2876!
  2007-08-24  6:15           ` Kamalesh Babulal
  2007-08-24  8:58             ` Mel Gorman
@ 2007-08-24 16:54             ` Christoph Lameter
  1 sibling, 0 replies; 9+ messages in thread
From: Christoph Lameter @ 2007-08-24 16:54 UTC (permalink / raw)
  To: Kamalesh Babulal
  Cc: Andrew Morton, Mel Gorman, linux-kernel, Balbir Singh, linux-mm

On Fri, 24 Aug 2007, Kamalesh Babulal wrote:

> Starting Linux PPC64 #1 SMP Thu Aug 23 11:54:44 EDT 2007

Argh. PPC64. The typical thing that we break on all major NUMA
changes.

> EEH: PCI Enhanced I/O Error Handling Enabled
> PPC64 nvram contains 7168 bytes
> Zone PFN ranges:
> DMA 0 -> 1048576
> Normal 1048576 -> 1048576
> Movable zone start PFN for each node
> early_node_map[1] active PFN ranges
> 2: 0 -> 1048576
> Could not find start_pfn for node 0
> [boot]0015 Setup Done
> Built 2 zonelists in Node order, mobility grouping off. Total pages: 0
> Policy zone: DMA

Uhhh huh. So we have node 0 and 2 that got zonelists. What happened to 
node 1?

> Dentry cache hash table entries: 524288 (order: 10, 4194304 bytes)
> Inode-cache hash table entries: 262144 (order: 9, 2097152 bytes)
> freeing bootmem node 2

Hmmm... The boot occurs on node 2??

There could be something wrong with zonelist generation since various 
people worked on it. Could you add some printks to show how the zonelists 
are generated?
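
As a rough idea of what such output could show, here is a user-space model
that builds a per-node fallback list and prints it. Everything in it is an
assumption for illustration: struct node_model, model_distance(),
build_zonelist() and dump_zonelist() are hypothetical names, the distance
metric is invented, and the real zonelist code in this -mm tree may order
nodes differently.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

#define MAX_NODES 4

/* Hypothetical stand-in for a node and its fallback list. */
struct node_model {
	long present_pages;		/* 0 models a memoryless node */
	int zonelist[MAX_NODES];	/* node ids in fallback order */
	int zonelist_len;
};

/* Invented distance metric; real systems take this from firmware tables. */
static int model_distance(int a, int b)
{
	return abs(a - b);
}

/* Build nid's list: every node with memory, nearest first. */
static void build_zonelist(struct node_model *nodes, int nid)
{
	int used[MAX_NODES] = { 0 };
	struct node_model *self;

	assert(nid >= 0 && nid < MAX_NODES);
	self = &nodes[nid];
	self->zonelist_len = 0;
	for (;;) {
		int best = -1;

		for (int n = 0; n < MAX_NODES; n++) {
			if (used[n] || nodes[n].present_pages == 0)
				continue;
			if (best < 0 ||
			    model_distance(nid, n) < model_distance(nid, best))
				best = n;
		}
		if (best < 0)
			break;		/* no more nodes with memory */
		used[best] = 1;
		self->zonelist[self->zonelist_len++] = best;
	}
}

/* The kind of line a debugging printk could emit for each node. */
static void dump_zonelist(const struct node_model *nodes, int nid)
{
	printf("node %d zonelist:", nid);
	for (int i = 0; i < nodes[nid].zonelist_len; i++)
		printf(" %d", nodes[nid].zonelist[i]);
	printf("%s\n", nodes[nid].zonelist_len ? "" : " (empty)");
}
```

In this model, if only node 2 has memory, every node's list should contain
just node 2. A list that comes out empty for the node the boot cpu sits on
would be the thing to look for.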


* Re: [BUG] 2.6.23-rc3-mm1 kernel BUG at mm/page_alloc.c:2876!
  2007-08-22 20:48 ` [BUG] 2.6.23-rc3-mm1 kernel BUG at mm/page_alloc.c:2876! Andrew Morton
  2007-08-22 20:50   ` Andrew Morton
@ 2007-08-22 21:09   ` Christoph Lameter
  1 sibling, 0 replies; 9+ messages in thread
From: Christoph Lameter @ 2007-08-22 21:09 UTC (permalink / raw)
  To: Andrew Morton
  Cc: Kamalesh Babulal, linux-kernel, Balbir Singh, Mel Gorman, linux-mm

On Wed, 22 Aug 2007, Andrew Morton wrote:

> On Thu, 23 Aug 2007 01:50:10 +0530
> Kamalesh Babulal <kamalesh@linux.vnet.ibm.com> wrote:
> 
> > Hi Andrew,
> > 
> > I see call trace followed by the kernel bug with the 2.6.23-rc3-mm1
> > kernel and have attached the boot log and config file.

> > =======================================================
> > SLUB: Genslabs=12, HWalign=128, Order=0-1, MinObjects=4, CPUs=4, Nodes=16

16 nodes and 4 cpus? Can I see the zones map that is displayed on 
boot? How are the cpus mapped to the nodes?

kmalloc_node walks the zonelists from the node that was specified.
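
To illustrate why the cpu-to-node mapping matters, here is a small
user-space model of that walk. It is a sketch under stated assumptions, not
the kernel's allocator: struct node_model and alloc_page_on_node() are
hypothetical names, and a zonelist is reduced to an array of node ids.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_NODES 4

/* Hypothetical stand-in for a node and its fallback list. */
struct node_model {
	long free_pages;		/* pages available on this node */
	int zonelist[MAX_NODES];	/* node ids in fallback order */
	int zonelist_len;		/* 0 models a missing or trashed list */
};

/*
 * Walk the zonelist of the requested node and take a page from the first
 * node that has one. Returns the node id used, or -1 if the walk finds
 * nothing, even when other nodes still have memory.
 */
static int alloc_page_on_node(struct node_model *nodes, int nid)
{
	const struct node_model *start;

	assert(nodes != NULL && nid >= 0 && nid < MAX_NODES);
	start = &nodes[nid];
	for (int i = 0; i < start->zonelist_len; i++) {
		struct node_model *cand = &nodes[start->zonelist[i]];

		if (cand->free_pages > 0) {
			cand->free_pages--;
			return start->zonelist[i];
		}
	}
	return -1;
}
```

If all memory sits on node 2 and the request names a node whose list is
empty, the model fails the allocation although node 2 has plenty of pages.
Once that node's list includes node 2, the same request succeeds.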

