Bad page state (was Re: Linux 2.6.31-rc7)

* Bad page state (was Re: Linux 2.6.31-rc7)
       [not found] ` <200908212248.40987.gene.heskett@verizon.net>
@ 2009-08-22  4:17   ` Linus Torvalds
  2009-08-23  7:22     ` Wu Fengguang
  2009-08-24 13:55     ` Mel Gorman
  0 siblings, 2 replies; 7+ messages in thread
From: Linus Torvalds @ 2009-08-22  4:17 UTC (permalink / raw)
  To: Gene Heskett
  Cc: Wu Fengguang, Andrew Morton, Hugh Dickins, Mel Gorman, linux-mm

On Fri, 21 Aug 2009, Gene Heskett wrote:
> 
> From messages, I already have a problem with lzma too:

And for this too, can you tell what the last working kernel was?

Does the problem happen consistently? (And btw, it's probably not so much 
lzma, but something random that released a page without clearing some of 
the page flags or something).

Wu - I'm not seeing a lot of changes to compound page handling except for 
commit 20a0307c0396c2edb651401d2f2db193dda2f3c9 ("mm: introduce PageHuge() 
for testing huge/gigantic pages").

That one removed the

	set_compound_page_dtor(page, free_compound_page);

thing from prep_compound_gigantic_page(), which looks a bit odd and 
suspicious (the commit message only talks about _moving_ it). But I don't 
know the hugetlb code.

But that commit went into -rc1 already.  Gene, I know you sent me email 
about a later -rc release, but maybe you didn't test it on that machine or 
with that config?

> Aug 21 22:37:47 coyote kernel: [ 1030.152737] BUG: Bad page state in process lzma  pfn:a1093
> Aug 21 22:37:47 coyote kernel: [ 1030.152743] page:c28fc260 flags:80004000 count:0 mapcount:0 mapping:(null) index:0
> Aug 21 22:37:47 coyote kernel: [ 1030.152747] Pid: 17927, comm: lzma Not tainted 2.6.31-rc7 #1
> Aug 21 22:37:47 coyote kernel: [ 1030.152750] Call Trace:
> Aug 21 22:37:47 coyote kernel: [ 1030.152758]  [<c130e363>] ? printk+0x23/0x40
> Aug 21 22:37:47 coyote kernel: [ 1030.152763]  [<c108404f>] bad_page+0xcf/0x150
> Aug 21 22:37:47 coyote kernel: [ 1030.152767]  [<c10850ed>] get_page_from_freelist+0x37d/0x480
> Aug 21 22:37:47 coyote kernel: [ 1030.152771]  [<c10853cf>] __alloc_pages_nodemask+0xdf/0x520
> Aug 21 22:37:47 coyote kernel: [ 1030.152775]  [<c1096b19>] handle_mm_fault+0x4a9/0x9f0
> Aug 21 22:37:47 coyote kernel: [ 1030.152780]  [<c1020d61>] do_page_fault+0x141/0x290
> Aug 21 22:37:47 coyote kernel: [ 1030.152784]  [<c1020c20>] ? do_page_fault+0x0/0x290
> Aug 21 22:37:47 coyote kernel: [ 1030.152787]  [<c1311bcb>] error_code+0x73/0x78
> Aug 21 22:37:47 coyote kernel: [ 1030.152789] Disabling lock debugging due to kernel taint

It looks like 'flags' is the one that causes this problem at allocation 
time (count, mapcount, mapping and index all look nicely zeroed).

In particular, it's the 0x4000 bit (the high bit, which is also set, is 
the upper field bits for page section/node/zone numbers etc), which is 
either PG_head or PG_compound depending on CONFIG_PAGEFLAGS_EXTENDED.

And in your case, since you have CONFIG_PAGEFLAGS_EXTENDED=y, it would be 
PG_head.
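
As a minimal sketch of that decoding: only the 0x4000 value and its meaning come from the message above. The mask names and helper functions are hypothetical, not kernel identifiers, and treating just the top bit as the "upper field" is an assumption made for illustration.

```c
#include <stdbool.h>
#include <stdint.h>

/* Bit named in the message above: PG_head when CONFIG_PAGEFLAGS_EXTENDED=y,
 * PG_compound otherwise. */
#define HEAD_OR_COMPOUND_MASK 0x00004000u

/* Assumption for illustration: only the top bit is treated as part of the
 * section/node/zone field. The real field width depends on the config. */
#define UPPER_FIELD_MASK 0x80000000u

/* True if the head/compound bit is set in a reported 32-bit flags word. */
static bool has_head_bit(uint32_t flags)
{
    return (flags & HEAD_OR_COMPOUND_MASK) != 0;
}

/* The flags word with the assumed upper field bit stripped off. */
static uint32_t low_flag_bits(uint32_t flags)
{
    return flags & ~UPPER_FIELD_MASK;
}
```

For the value in the report, has_head_bit(0x80004000) is true and low_flag_bits(0x80004000) is 0x4000, so the head/compound bit is the only unexpected flag.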

Btw guys, why don't we check PG_head etc at free time when we add the page 
to the free list? Now we get that annoying error only when it is way too 
late, and have no way to know who screwed up..

			Linus

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>


* Re: Bad page state (was Re: Linux 2.6.31-rc7)
  2009-08-22  4:17   ` Bad page state (was Re: Linux 2.6.31-rc7) Linus Torvalds
@ 2009-08-23  7:22     ` Wu Fengguang
  2009-08-23  8:20       ` Gene Heskett
  2009-08-24 13:55     ` Mel Gorman
  1 sibling, 1 reply; 7+ messages in thread
From: Wu Fengguang @ 2009-08-23  7:22 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Gene Heskett, Andrew Morton, Hugh Dickins, Mel Gorman, linux-mm

On Sat, Aug 22, 2009 at 12:17:48PM +0800, Linus Torvalds wrote:
> 
> 
> On Fri, 21 Aug 2009, Gene Heskett wrote:
> > 
> > From messages, I already have a problem with lzma too:
> 
> And for this too, can you tell what the last working kernel was?
> 
> Does the problem happen consistently? (And btw, it's probably not so much 
> lzma, but something random that released a page without clearing some of 
> the page flags or something).
> 
> Wu - I'm not seeing a lot of changes to compound page handling except for 
> commit 20a0307c0396c2edb651401d2f2db193dda2f3c9 ("mm: introduce PageHuge() 
> for testing huge/gigantic pages").
> 
> That one removed the
> 
> 	set_compound_page_dtor(page, free_compound_page);
> 
> thing from prep_compound_gigantic_page(), which looks a bit odd and 
> suspicious (the commit message only talks about _moving_ it). But I don't 
> know the hugetlb code.

Sorry for not describing the removal in the changelog.  Removing that line
was proposed by Mel, and I think it changed nothing in behavior,
because the only possible call chain is:

        gather_bootmem_prealloc()
                prep_compound_huge_page()
                        prep_compound_gigantic_page()
==>                             set_compound_page_dtor(page, free_compound_page);
                prep_new_huge_page()
==>                     set_compound_page_dtor(page, free_huge_page);

So obviously the first set_compound_page_dtor() call is redundant.

> But that commit went into -rc1 already.  Gene, I know you sent me email 
> about a later -rc release, but maybe you didn't test it on that machine or 
> with that config?
> 
> > Aug 21 22:37:47 coyote kernel: [ 1030.152737] BUG: Bad page state in process lzma  pfn:a1093
> > Aug 21 22:37:47 coyote kernel: [ 1030.152743] page:c28fc260 flags:80004000 count:0 mapcount:0 mapping:(null) index:0
> > Aug 21 22:37:47 coyote kernel: [ 1030.152747] Pid: 17927, comm: lzma Not tainted 2.6.31-rc7 #1
> > Aug 21 22:37:47 coyote kernel: [ 1030.152750] Call Trace:
> > Aug 21 22:37:47 coyote kernel: [ 1030.152758]  [<c130e363>] ? printk+0x23/0x40
> > Aug 21 22:37:47 coyote kernel: [ 1030.152763]  [<c108404f>] bad_page+0xcf/0x150
> > Aug 21 22:37:47 coyote kernel: [ 1030.152767]  [<c10850ed>] get_page_from_freelist+0x37d/0x480
> > Aug 21 22:37:47 coyote kernel: [ 1030.152771]  [<c10853cf>] __alloc_pages_nodemask+0xdf/0x520
> > Aug 21 22:37:47 coyote kernel: [ 1030.152775]  [<c1096b19>] handle_mm_fault+0x4a9/0x9f0
> > Aug 21 22:37:47 coyote kernel: [ 1030.152780]  [<c1020d61>] do_page_fault+0x141/0x290
> > Aug 21 22:37:47 coyote kernel: [ 1030.152784]  [<c1020c20>] ? do_page_fault+0x0/0x290
> > Aug 21 22:37:47 coyote kernel: [ 1030.152787]  [<c1311bcb>] error_code+0x73/0x78
> > Aug 21 22:37:47 coyote kernel: [ 1030.152789] Disabling lock debugging due to kernel taint
> 
> It looks like 'flags' is the one that causes this problem at allocation 
> time (count, mapcount, mapping and index all look nicely zeroed).
> 
> In particular, it's the 0x4000 bit (the high bit, which is also set, is 
> the upper field bits for page section/node/zone numbers etc), which is 
> either PG_head or PG_compound depending on CONFIG_PAGEFLAGS_EXTENDED.
> 
> And in your case, since you have CONFIG_PAGEFLAGS_EXTENDED=y, it would be 
> PG_head.

Right. btw it takes time to reverse engineer the page flag names each
time it oops. Does it make sense to print a more readable form, eg.

        flags:80004000 (MOVABLE,head)

?

> Btw guys, why don't we check PG_head etc at free time when we add the page 
> to the free list? Now we get that annoying error only when it is way too 
> late, and have no way to know who screwed up..

And what puzzled me is that PG_head should have been cleared by
free_pages_check():

        if (page->flags & PAGE_FLAGS_CHECK_AT_PREP)
                page->flags &= ~PAGE_FLAGS_CHECK_AT_PREP;

Thanks,
Fengguang


* Re: Bad page state (was Re: Linux 2.6.31-rc7)
  2009-08-23  7:22     ` Wu Fengguang
@ 2009-08-23  8:20       ` Gene Heskett
  2009-08-23 16:44         ` Linus Torvalds
  0 siblings, 1 reply; 7+ messages in thread
From: Gene Heskett @ 2009-08-23  8:20 UTC (permalink / raw)
  To: Wu Fengguang
  Cc: Linus Torvalds, Andrew Morton, Hugh Dickins, Mel Gorman, linux-mm

On Sunday 23 August 2009, Wu Fengguang wrote:
>On Sat, Aug 22, 2009 at 12:17:48PM +0800, Linus Torvalds wrote:
>> On Fri, 21 Aug 2009, Gene Heskett wrote:
>> > From messages, I already have a problem with lzma too:
>>
>> And for this too, can you tell what the last working kernel was?
>>
>> Does the problem happen consistently? (And btw, it's probably not so much
>> lzma, but something random that released a page without clearing some of
>> the page flags or something).
>>
>> Wu - I'm not seeing a lot of changes to compound page handling except for
>> commit 20a0307c0396c2edb651401d2f2db193dda2f3c9 ("mm: introduce
>> PageHuge() for testing huge/gigantic pages").
>>
>> That one removed the
>>
>> 	set_compound_page_dtor(page, free_compound_page);
>>
>> thing from prep_compound_gigantic_page(), which looks a bit odd and
>> suspicious (the commit message only talks about _moving_ it). But I don't
>> know the hugetlb code.
>
>Sorry for not describing the removal in the changelog.  Removing that line
>was proposed by Mel, and I think it changed nothing in behavior,
>because the only possible call chain is:
>
>        gather_bootmem_prealloc()
>                prep_compound_huge_page()
>                        prep_compound_gigantic_page()
>==>                             set_compound_page_dtor(page, free_compound_page);
>                prep_new_huge_page()
>==>                     set_compound_page_dtor(page, free_huge_page);
>
>So obviously the first set_compound_page_dtor() call is redundant.
>
>> But that commit went into -rc1 already.  Gene, I know you sent me email
>> about a later -rc release, but maybe you didn't test it on that machine
>> or with that config?
>>
>> > Aug 21 22:37:47 coyote kernel: [ 1030.152737] BUG: Bad page state in process lzma  pfn:a1093
>> > Aug 21 22:37:47 coyote kernel: [ 1030.152743] page:c28fc260 flags:80004000 count:0 mapcount:0 mapping:(null) index:0
>> > Aug 21 22:37:47 coyote kernel: [ 1030.152747] Pid: 17927, comm: lzma Not tainted 2.6.31-rc7 #1
>> > Aug 21 22:37:47 coyote kernel: [ 1030.152750] Call Trace:
>> > Aug 21 22:37:47 coyote kernel: [ 1030.152758]  [<c130e363>] ? printk+0x23/0x40
>> > Aug 21 22:37:47 coyote kernel: [ 1030.152763]  [<c108404f>] bad_page+0xcf/0x150
>> > Aug 21 22:37:47 coyote kernel: [ 1030.152767]  [<c10850ed>] get_page_from_freelist+0x37d/0x480
>> > Aug 21 22:37:47 coyote kernel: [ 1030.152771]  [<c10853cf>] __alloc_pages_nodemask+0xdf/0x520
>> > Aug 21 22:37:47 coyote kernel: [ 1030.152775]  [<c1096b19>] handle_mm_fault+0x4a9/0x9f0
>> > Aug 21 22:37:47 coyote kernel: [ 1030.152780]  [<c1020d61>] do_page_fault+0x141/0x290
>> > Aug 21 22:37:47 coyote kernel: [ 1030.152784]  [<c1020c20>] ? do_page_fault+0x0/0x290
>> > Aug 21 22:37:47 coyote kernel: [ 1030.152787]  [<c1311bcb>] error_code+0x73/0x78
>> > Aug 21 22:37:47 coyote kernel: [ 1030.152789] Disabling lock debugging due to kernel taint
>>
>> It looks like 'flags' is the one that causes this problem at allocation
>> time (count, mapcount, mapping and index all look nicely zeroed).
>>
>> In particular, it's the 0x4000 bit (the high bit, which is also set, is
>> the upper field bits for page section/node/zone numbers etc), which is
>> either PG_head or PG_compound depending on CONFIG_PAGEFLAGS_EXTENDED.
>>
>> And in your case, since you have CONFIG_PAGEFLAGS_EXTENDED=y, it would be
>> PG_head.
>
>Right. btw it takes time to reverse engineer the page flag names each
>time it oops. Does it make sense to print a more readable form, eg.
>
>        flags:80004000 (MOVABLE,head)
>
>?
>
>> Btw guys, why don't we check PG_head etc at free time when we add the
>> page to the free list? Now we get that annoying error only when it is way
>> too late, and have no way to know who screwed up..
>
>And what puzzled me is that PG_head should have been cleared by
>free_pages_check():
>
>        if (page->flags & PAGE_FLAGS_CHECK_AT_PREP)
>                page->flags &= ~PAGE_FLAGS_CHECK_AT_PREP;
>
>Thanks,
>Fengguang

I changed the vmlinuz compression to gzip and rebooted to it last night, and 
got this shortly after the bootup to -rc7 with the kernel cli argument that 
makes sensors work on an asus board again:

Aug 22 22:29:07 coyote kernel: [ 2449.053652] BUG: Bad page state in process python  pfn:a0e93                                            
Aug 22 22:29:07 coyote kernel: [ 2449.053658] page:c28fc260 flags:80004000 count:0 mapcount:0 mapping:(null) index:0                      
Aug 22 22:29:07 coyote kernel: [ 2449.053662] Pid: 4818, comm: python Not tainted 2.6.31-rc7 #3                                           
Aug 22 22:29:07 coyote kernel: [ 2449.053664] Call Trace:                                                                                 
Aug 22 22:29:07 coyote kernel: [ 2449.053672]  [<c130fb33>] ? printk+0x23/0x40                                                            
Aug 22 22:29:07 coyote kernel: [ 2449.053678]  [<c108352f>] bad_page+0xcf/0x150                                                           
Aug 22 22:29:07 coyote kernel: [ 2449.053682]  [<c10845cd>] get_page_from_freelist+0x37d/0x480                                            
Aug 22 22:29:07 coyote kernel: [ 2449.053686]  [<c10848af>] __alloc_pages_nodemask+0xdf/0x520                                             
Aug 22 22:29:07 coyote kernel: [ 2449.053691]  [<c1095ff9>] handle_mm_fault+0x4a9/0x9f0                                                   
Aug 22 22:29:07 coyote kernel: [ 2449.053695]  [<c105ca83>] ? tick_dev_program_event+0x43/0xf0                                            
Aug 22 22:29:07 coyote kernel: [ 2449.053699]  [<c105cbd6>] ? tick_program_event+0x36/0x60                                                
Aug 22 22:29:07 coyote kernel: [ 2449.053703]  [<c1020d61>] do_page_fault+0x141/0x290                                                     
Aug 22 22:29:07 coyote kernel: [ 2449.053707]  [<c1020c20>] ? do_page_fault+0x0/0x290                                                     
Aug 22 22:29:07 coyote kernel: [ 2449.053710]  [<c131339b>] error_code+0x73/0x78                                                          
Aug 22 22:29:07 coyote kernel: [ 2449.053712] Disabling lock debugging due to kernel taint

This doesn't look exactly like the previous one but the result is similar.

-- 
Cheers, Gene
"There are four boxes to be used in defense of liberty:
 soap, ballot, jury, and ammo. Please use in that order."
-Ed Howdershelt (Author)
The NRA is offering FREE Associate memberships to anyone who wants them.
<https://www.nrahq.org/nrabonus/accept-membership.asp>

Now I am depressed ...


* Re: Bad page state (was Re: Linux 2.6.31-rc7)
  2009-08-23  8:20       ` Gene Heskett
@ 2009-08-23 16:44         ` Linus Torvalds
  2009-08-23 17:04           ` Gene Heskett
  2009-08-24  2:22           ` Gene Heskett
  0 siblings, 2 replies; 7+ messages in thread
From: Linus Torvalds @ 2009-08-23 16:44 UTC (permalink / raw)
  To: Gene Heskett
  Cc: Wu Fengguang, Andrew Morton, Hugh Dickins, Mel Gorman, linux-mm

Gene - good news and bad news.

The good news is that this is almost certainly not a kernel bug.

The bad news is that your machine is almost certainly buggy and you'll 
need to replace your RAM (although it's possible that just removing it 
and re-seating it could fix things). See below for details.

On Sun, 23 Aug 2009, Gene Heskett wrote:
> 
> I changed the vmlinuz compression to gzip and rebooted to it last night, and 
> got this shortly after the bootup to -rc7 with the kernel cli argument that 
> makes sensors work on an asus board again:
> 
> Aug 22 22:29:07 coyote kernel: [ 2449.053652] BUG: Bad page state in process python  pfn:a0e93                                            
> Aug 22 22:29:07 coyote kernel: [ 2449.053658] page:c28fc260 flags:80004000 count:0 mapcount:0 mapping:(null) index:0                      
> Aug 22 22:29:07 coyote kernel: [ 2449.053662] Pid: 4818, comm: python Not tainted 2.6.31-rc7 #3                                           
> Aug 22 22:29:07 coyote kernel: [ 2449.053664] Call Trace:                                                                                 
> Aug 22 22:29:07 coyote kernel: [ 2449.053672]  [<c130fb33>] ? printk+0x23/0x40                                                            
> Aug 22 22:29:07 coyote kernel: [ 2449.053678]  [<c108352f>] bad_page+0xcf/0x150                                                           
> Aug 22 22:29:07 coyote kernel: [ 2449.053682]  [<c10845cd>] get_page_from_freelist+0x37d/0x480                                            
> Aug 22 22:29:07 coyote kernel: [ 2449.053686]  [<c10848af>] __alloc_pages_nodemask+0xdf/0x520                                             
> Aug 22 22:29:07 coyote kernel: [ 2449.053691]  [<c1095ff9>] handle_mm_fault+0x4a9/0x9f0                                                   
> Aug 22 22:29:07 coyote kernel: [ 2449.053695]  [<c105ca83>] ? tick_dev_program_event+0x43/0xf0                                            
> Aug 22 22:29:07 coyote kernel: [ 2449.053699]  [<c105cbd6>] ? tick_program_event+0x36/0x60                                                
> Aug 22 22:29:07 coyote kernel: [ 2449.053703]  [<c1020d61>] do_page_fault+0x141/0x290                                                     
> Aug 22 22:29:07 coyote kernel: [ 2449.053707]  [<c1020c20>] ? do_page_fault+0x0/0x290                                                     
> Aug 22 22:29:07 coyote kernel: [ 2449.053710]  [<c131339b>] error_code+0x73/0x78                                                          
> Aug 22 22:29:07 coyote kernel: [ 2449.053712] Disabling lock debugging due to kernel taint
> 
> This doesn't look exactly like the previous one but the result is similar.

Actually, it looks _too_ much like the previous one in one very specific 
regard: that 'page' pointer is identical. And that is where the 'flags' 
came from.

Look here:

> Aug 21 22:37:47 coyote kernel: [ 1030.152737] BUG: Bad page state in process lzma  pfn:a1093
> Aug 21 22:37:47 coyote kernel: [ 1030.152743] page:c28fc260 flags:80004000 count:0 mapcount:0 mapping:(null) index:0

> Aug 22 22:29:07 coyote kernel: [ 2449.053652] BUG: Bad page state in process python  pfn:a0e93
> Aug 22 22:29:07 coyote kernel: [ 2449.053658] page:c28fc260 flags:80004000 count:0 mapcount:0 mapping:(null) index:0

and notice how "page:c28fc260" is the same, even though 'pfn' is not. 

Gene - I can almost guarantee that you have bad memory. Why? 

 - 'pfn' is the Linux kernel "page index" - so when the two 'pfn' numbers 
   are different, that means that we're talking about different 
   physical pages, and different indexes into the 'struct page[]' array.

 - but because the page array was allocated at different addresses
   (probably because of slightly different configurations and timings
   during boot), the actual physical memory location that describes those 
   different pages happens to be the same.

 - and I can almost guarantee that you have a bit that is stuck to 1 in 
   that RAM location. The 'flags' field is the first one in 'struct page', 
   and so it's the memory location at kernel virtual address c28fc260 that 
   is corrupt - and the way the kernel mappings work on x86, that's 
   physical address 28fc260 (at around the 40MB mark).
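
The arithmetic behind those three points can be sketched as follows. This is only an illustration: PAGE_OFFSET of 0xC0000000 is the usual 3G/1G split on 32-bit x86, while the 32-byte 'struct page' and the flat mem_map layout are assumptions that the reports do not confirm. The macro and function names are hypothetical.

```c
#include <stdint.h>

/* Assumption: default 3G/1G split on 32-bit x86, lowmem mapped at PAGE_OFFSET. */
#define PAGE_OFFSET_X86_32 0xC0000000u

/* Assumption: 32-byte struct page in one flat mem_map array.
 * Neither is stated in the thread. */
#define ASSUMED_STRUCT_PAGE_SIZE 32u

/* Physical address behind a lowmem kernel virtual address. */
static uint32_t lowmem_virt_to_phys(uint32_t vaddr)
{
    return vaddr - PAGE_OFFSET_X86_32;
}

/* Where mem_map would have to start for 'page_ptr' to describe 'pfn',
 * given page_ptr == mem_map_base + pfn * sizeof(struct page). */
static uint32_t implied_mem_map_base(uint32_t page_ptr, uint32_t pfn)
{
    return page_ptr - pfn * ASSUMED_STRUCT_PAGE_SIZE;
}
```

For the pointer in both reports, lowmem_virt_to_phys(0xc28fc260) gives 0x028fc260, about 41 MiB into RAM. Under the stated assumptions, implied_mem_map_base() gives 0xc14db000 for pfn a1093 and 0xc14df000 for pfn a0e93: two different array bases, one shared 'struct page' slot.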

There is almost certainly no way that this is a kernel bug - that memory 
location is smack dab in the middle of that 'struct page[]' array, and 
there is absolutely no reason why two different kernels with clearly 
different allocations would set the same incorrect bit. I mean - it 
_could_ happen, and maybe there's some really subtle idiotic thing going 
on, but it's really unlikely.

The address is just so random, and so non-special - and yet it's exactly 
the same physical address in both cases, even though it actually describes 
different things as far as the kernel is concerned. That's an almost 100% 
sure sign of a hard-error in your memory.

And depending on kernel config options, that bad RAM location will be used 
for different things. In your two cases, it's been used for the 'struct 
page[]' array both times, but in other cases it could have been used for 
something else - and maybe resulted in random crashes or other odd things, 
rather than happen to get noticed by a debug test.

The good news about hard memory errors is that if you boot into a memory 
tester like memtest86, it's going to find it. So we're not going to have 
to guess about whether I'm right or not - I would suggest you go download 
memtest86+ from www.memtest.org and run it. I'd just get the bootable ISO 
image of memtest86+ v2.11 and burn it to a CD, and boot it, but there are 
other ways to run that thing.

It's even possible that depending on which distro you have, you may 
already have a "memtest" entry in your LILO or grub setup. I think SuSE 
installs memtest as one of the bootable options, for example.

			Linus


* Re: Bad page state (was Re: Linux 2.6.31-rc7)
  2009-08-23 16:44         ` Linus Torvalds
@ 2009-08-23 17:04           ` Gene Heskett
  2009-08-24  2:22           ` Gene Heskett
  1 sibling, 0 replies; 7+ messages in thread
From: Gene Heskett @ 2009-08-23 17:04 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Wu Fengguang, Andrew Morton, Hugh Dickins, Mel Gorman, linux-mm

On Sunday 23 August 2009, Linus Torvalds wrote:
>Gene - good news and bad news.
>
>The good news is that this is almost certainly not a kernel bug.
>
>The bad news is that your machine is almost certainly buggy and you'll
>need to replace your RAM (although it's possible that just removing it
>and re-seating it could fix things). See below for details.
>
>On Sun, 23 Aug 2009, Gene Heskett wrote:
>> I changed the vmlinuz compression to gzip and rebooted to it last night,
>> and got this shortly after the bootup to -rc7 with the kernel cli
>> argument that makes sensors work on an asus board again:
>>
>> Aug 22 22:29:07 coyote kernel: [ 2449.053652] BUG: Bad page state in process python  pfn:a0e93
>> Aug 22 22:29:07 coyote kernel: [ 2449.053658] page:c28fc260 flags:80004000 count:0 mapcount:0 mapping:(null) index:0
>> Aug 22 22:29:07 coyote kernel: [ 2449.053662] Pid: 4818, comm: python Not tainted 2.6.31-rc7 #3
>> Aug 22 22:29:07 coyote kernel: [ 2449.053664] Call Trace:
>> Aug 22 22:29:07 coyote kernel: [ 2449.053672]  [<c130fb33>] ? printk+0x23/0x40
>> Aug 22 22:29:07 coyote kernel: [ 2449.053678]  [<c108352f>] bad_page+0xcf/0x150
>> Aug 22 22:29:07 coyote kernel: [ 2449.053682]  [<c10845cd>] get_page_from_freelist+0x37d/0x480
>> Aug 22 22:29:07 coyote kernel: [ 2449.053686]  [<c10848af>] __alloc_pages_nodemask+0xdf/0x520
>> Aug 22 22:29:07 coyote kernel: [ 2449.053691]  [<c1095ff9>] handle_mm_fault+0x4a9/0x9f0
>> Aug 22 22:29:07 coyote kernel: [ 2449.053695]  [<c105ca83>] ? tick_dev_program_event+0x43/0xf0
>> Aug 22 22:29:07 coyote kernel: [ 2449.053699]  [<c105cbd6>] ? tick_program_event+0x36/0x60
>> Aug 22 22:29:07 coyote kernel: [ 2449.053703]  [<c1020d61>] do_page_fault+0x141/0x290
>> Aug 22 22:29:07 coyote kernel: [ 2449.053707]  [<c1020c20>] ? do_page_fault+0x0/0x290
>> Aug 22 22:29:07 coyote kernel: [ 2449.053710]  [<c131339b>] error_code+0x73/0x78
>> Aug 22 22:29:07 coyote kernel: [ 2449.053712] Disabling lock debugging due to kernel taint
>>
>> This doesn't look exactly like the previous one but the result is
>> similar.
>
>Actually, it looks _too_ much like the previous one in one very specific
>regard: that 'page' pointer is identical. And that is where the 'flags'
>came from.
>
>Look here:
>> Aug 21 22:37:47 coyote kernel: [ 1030.152737] BUG: Bad page state in process lzma  pfn:a1093
>> Aug 21 22:37:47 coyote kernel: [ 1030.152743] page:c28fc260 flags:80004000 count:0 mapcount:0 mapping:(null) index:0
>>
>> Aug 22 22:29:07 coyote kernel: [ 2449.053652] BUG: Bad page state in process python  pfn:a0e93
>> Aug 22 22:29:07 coyote kernel: [ 2449.053658] page:c28fc260 flags:80004000 count:0 mapcount:0 mapping:(null) index:0
>
>and notice how "page:c28fc260" is the same, even though 'pfn' is not.
>
>Gene - I can almost guarantee that you have bad memory. Why?
>
> - 'pfn' is the Linux kernel "page index" - so when the two 'pfn' numbers
>    are different, that means that we're talking about different
>    physical pages, and indexes into the 'struct page[]' array.
>
> - but because the page array was allocated at different addresses
>   (probably because of slightly different configurations and timings
>   during boot), the actual physical memory location that describes those
>   different pages happens to be the same.
>
> - and I can almost guarantee that you have a bit that is stuck to 1 in
>   that RAM location. The 'flags' field is the first one in 'struct page',
>   and so it's the memory location at kernel virtual address c28fc260 that
>   is corrupt - and the way the kernel mappings work on x86, that's
>   physical address 28fc260 (at around the 40MB mark).
>
>There is almost certainly no way that this is a kernel bug - that memory
>location is smack dab in the middle of that 'struct page[]' array, and
>there is absolutely no reason why two different kernels with clearly
>different allocations would set the same incorrect bit. I mean - it
>_could_ happen, and maybe there's some really subtle idiotic thing going
>on, but it's really unlikely.
>
>The address is just so random, and so non-special - and yet it's exactly
>the same physical address in both cases, even though it actually describes
>different things as far as the kernel is concerned. That's an almost 100%
>sure sign of a hard-error in your memory.
>
>And depending on kernel config options, that bad RAM location will be used
>for different things. In your two cases, it's been used for the 'struct
>page[]' array both times, but in other cases it could have been used for
>something else - and maybe resulted in random crashes or other odd things,
>rather than happen to get noticed by a debug test.
>
>The good news about hard memory errors is that if you boot into a memory
>tester like memtest86, it's going to find it. So we're not going to have
>to guess about whether I'm right or not - I would suggest you go download
>memtest86+ from www.memtest.org and run it. I'd just get the bootable ISO
>image of memtest86+ v2.11 and burn it to a CD, and boot it, but there are
>other ways to run that thing.
>
I have several copies of it already since I'm always checking out old boxes 
for use with emc.  Since it's been almost a year since I checked it when I 
built the machine, I'll give it a few loops and see what falls out, thanks.

>It's even possible that depending on which distro you have, you may
>already have a "memtest" entry in your LILO or grub setup. I think SuSE
>installs memtest as one of the bootable options, for example.
>
>			Linus


-- 
Cheers, Gene


* Re: Bad page state (was Re: Linux 2.6.31-rc7)
  2009-08-23 16:44         ` Linus Torvalds
  2009-08-23 17:04           ` Gene Heskett
@ 2009-08-24  2:22           ` Gene Heskett
  1 sibling, 0 replies; 7+ messages in thread
From: Gene Heskett @ 2009-08-24  2:22 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Wu Fengguang, Andrew Morton, Hugh Dickins, Mel Gorman, linux-mm

On Sunday 23 August 2009, Linus Torvalds wrote:
>Gene - good news and bad news.
>
>The good news is that this is almost certainly not a kernel bug.
>
>The bad news is that your machine is almost certainly buggy and you'll
>need to replace your RAM (although it's possible that just removing it
>and re-seating it could fix things). See below for details.
>
Spot on!

When I had a chance today, memtest ran and found a stuck bit (00004000) at 
2 separate addresses fairly early in the testing.  I let it run for another 
6 hours while I painted some shutters, and it ran without incrementing the 
error counts for those 2 addresses.  So after dinner I stripped out the 
cards, as all slots are full, and laid the mobo out the right side of the 
beast, then pulled all 4 1GB modules, putting the front one in the back 
slot, sorta rotating the tires.  Then I let memtest run for about 45 minutes 
with no further errors.  That faint knocking sound?  Me, rapping on my head, 
cuz I should have grokked that.  I'll catch up on my email & let memtest run 
for a few hours again tomorrow.

More...

>On Sun, 23 Aug 2009, Gene Heskett wrote:
>> I changed the vmlinuz compression to gzip and rebooted to it last night,
>> and got this shortly after the bootup to -rc7 with the kernel cli
>> argument that makes sensors work on an asus board again:
>>
>> Aug 22 22:29:07 coyote kernel: [ 2449.053652] BUG: Bad page state in process python  pfn:a0e93
>> Aug 22 22:29:07 coyote kernel: [ 2449.053658] page:c28fc260 flags:80004000 count:0 mapcount:0 mapping:(null) index:0
>> Aug 22 22:29:07 coyote kernel: [ 2449.053662] Pid: 4818, comm: python Not tainted 2.6.31-rc7 #3
>> Aug 22 22:29:07 coyote kernel: [ 2449.053664] Call Trace:
>> Aug 22 22:29:07 coyote kernel: [ 2449.053672]  [<c130fb33>] ? printk+0x23/0x40
>> Aug 22 22:29:07 coyote kernel: [ 2449.053678]  [<c108352f>] bad_page+0xcf/0x150
>> Aug 22 22:29:07 coyote kernel: [ 2449.053682]  [<c10845cd>] get_page_from_freelist+0x37d/0x480
>> Aug 22 22:29:07 coyote kernel: [ 2449.053686]  [<c10848af>] __alloc_pages_nodemask+0xdf/0x520
>> Aug 22 22:29:07 coyote kernel: [ 2449.053691]  [<c1095ff9>] handle_mm_fault+0x4a9/0x9f0
>> Aug 22 22:29:07 coyote kernel: [ 2449.053695]  [<c105ca83>] ? tick_dev_program_event+0x43/0xf0
>> Aug 22 22:29:07 coyote kernel: [ 2449.053699]  [<c105cbd6>] ? tick_program_event+0x36/0x60
>> Aug 22 22:29:07 coyote kernel: [ 2449.053703]  [<c1020d61>] do_page_fault+0x141/0x290
>> Aug 22 22:29:07 coyote kernel: [ 2449.053707]  [<c1020c20>] ? do_page_fault+0x0/0x290
>> Aug 22 22:29:07 coyote kernel: [ 2449.053710]  [<c131339b>] error_code+0x73/0x78
>> Aug 22 22:29:07 coyote kernel: [ 2449.053712] Disabling lock debugging due to kernel taint
>>
>> This doesn't look exactly like the previous one but the result is
>> similar.
>
>Actually, it looks _too_ much like the previous one in one very specific
>regard: that 'page' pointer is identical. And that is where the 'flags'
>came from.
>
>Look here:
>> Aug 21 22:37:47 coyote kernel: [ 1030.152737] BUG: Bad page state in process lzma  pfn:a1093
>> Aug 21 22:37:47 coyote kernel: [ 1030.152743] page:c28fc260 flags:80004000 count:0 mapcount:0 mapping:(null) index:0
>>
>> Aug 22 22:29:07 coyote kernel: [ 2449.053652] BUG: Bad page state in process python  pfn:a0e93
>> Aug 22 22:29:07 coyote kernel: [ 2449.053658] page:c28fc260 flags:80004000 count:0 mapcount:0 mapping:(null) index:0
>
>and notice how "page:c28fc260" is the same, even though 'pfn' is not.

Yes, and that is, I believe, the same address that memtest triggered on for 
the 1st and 3rd errors; there was another, at an address about 1/2 meg higher, 
in between.  I shot a pic of the memtest screen just for the record.

It's PNY memory, and possibly still in warranty; I'll find out tomorrow for sure.

>Gene - I can almost guarantee that you have bad memory. Why?
>
> - 'pfn' is the Linux kernel "page index" - so when the two 'pfn' numbers
>    are different, that means that we're talking about different
>    physical pages, and indexes into the 'struct page[]' array.
>
> - but because the page array was allocated at different addresses
>   (probably because of slightly different configurations and timings
>   during boot), the actual physical memory location that describes those
>   different pages happens to be the same.
>
> - and I can almost guarantee that you have a bit that is stuck to 1 in
>   that RAM location. The 'flags' field is the first one in 'struct page',
>   and so it's the memory location at kernel virtual address c28fc260 that
>   is corrupt - and the way the kernel mappings work on x86, that's
>   physical address 28fc260 (at around the 40MB mark).

40.7, and 41.3 according to memtest. :)

>There is almost certainly no way that this is a kernel bug - that memory
>location is smack dab in the middle of that 'struct page[]' array, and
>there is absolutely no reason why two different kernels with clearly
>different allocations would set the same incorrect bit. I mean - it
>_could_ happen, and maybe there's some really subtle idiotic thing going
>on, but it's really unlikely.
>
>The address is just so random, and so non-special - and yet it's exactly
>the same physical address in both cases, even though it actually describes
>different things as far as the kernel is concerned. That's an almost 100%
>sure sign of a hard-error in your memory.
>
>And depending on kernel config options, that bad RAM location will be used
>for different things. In your two cases, it's been used for the 'struct
>page[]' array both times, but in other cases it could have been used for
>something else - and maybe resulted in random crashes or other odd things,
>rather than happen to get noticed by a debug test.
>
>The good news about hard memory errors is that if you boot into a memory
>tester like memtest86, it's going to find it. So we're not going to have
>to guess about whether I'm right or not - I would suggest you go download
>memtest86+ from www.memtest.org and run it. I'd just get the bootable ISO
>image of memtest86+ v2.11 and burn it to a CD, and boot it, but there are
>other ways to run that thing.
>
>It's even possible that depending on which distro you have, you may
>already have a "memtest" entry in your LILO or grub setup. I think SuSE
>installs memtest as one of the bootable options, for example.

I had an entry for it from a Kubuntu install that has since committed 
suicide.  I'm about up to my armpits in Fedora, so I have a Mandriva 2009.1 
install on another drive that may well become the main OS.  Not nearly as much 
trouble with codecs with an offshore version of this.  All I need to do is get 
all my scripts and a 7GB email corpus moved.  Unfortunately, by the time I get 
that done the next Mandriva will be out. So many things to do, and relatively 
little time to do them.  Too many other hobbies and honeydo's. :)

Thanks for not spanking me on this one Linus, I blew it, badly.

>			Linus

-- 
Cheers, Gene
"There are four boxes to be used in defense of liberty:
 soap, ballot, jury, and ammo. Please use in that order."
-Ed Howdershelt (Author)
The NRA is offering FREE Associate memberships to anyone who wants them.
<https://www.nrahq.org/nrabonus/accept-membership.asp>

  I marvel at the strength of human weakness.

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>


* Re: Bad page state (was Re: Linux 2.6.31-rc7)
  2009-08-22  4:17   ` Bad page state (was Re: Linux 2.6.31-rc7) Linus Torvalds
  2009-08-23  7:22     ` Wu Fengguang
@ 2009-08-24 13:55     ` Mel Gorman
  1 sibling, 0 replies; 7+ messages in thread
From: Mel Gorman @ 2009-08-24 13:55 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Gene Heskett, Wu Fengguang, Andrew Morton, Hugh Dickins, linux-mm

Sorry about the slow response. I was on holidays for a few days and
didn't have a laptop with me to keep track of mail.

On Fri, Aug 21, 2009 at 09:17:48PM -0700, Linus Torvalds wrote:
> 
> On Fri, 21 Aug 2009, Gene Heskett wrote:
> > 
> > From messages, I already have a problem with lzma too:
> 
> And for this too, can you tell what the last working kernel was?
> 
> Does the problem happen consistently? (And btw, it's probably not so much 
> lzma, but something random that released a page without clearing some of 
> the page flags or something).
> 

This has already been isolated as a memory problem so I won't say
anything more about that.

> Wu - I'm not seeing a lot of changes to compound page handling except for 
> commit 20a0307c0396c2edb651401d2f2db193dda2f3c9 ("mm: introduce PageHuge() 
> for testing huge/gigantic pages").
> 
> That one removed the
> 
> 	set_compound_page_dtor(page, free_compound_page);
> 
> thing from prep_compound_gigantic_page(), which looks a bit odd and 
> suspicious (the commit message only talks about _moving_ it). But I don't 
> know the hugetlb code.
> 

The set_compound_page_dtor() that was there was spurious, as the destructor is
set properly later in the initialisation path for gigantic pages.  Wu was going
to fix it up but it made more sense to just delete it. The discussion at
the time is here http://lkml.org/lkml/2009/5/17/83

> But that commit went into -rc1 already.  Gene, I know you sent me email 
> about a later -rc release, but maybe you didn't test it on that machine or 
> with that config?
> 
> > Aug 21 22:37:47 coyote kernel: [ 1030.152737] BUG: Bad page state in process lzma  pfn:a1093
> > Aug 21 22:37:47 coyote kernel: [ 1030.152743] page:c28fc260 flags:80004000 count:0 mapcount:0 mapping:(null) index:0
> > Aug 21 22:37:47 coyote kernel: [ 1030.152747] Pid: 17927, comm: lzma Not tainted 2.6.31-rc7 #1
> > Aug 21 22:37:47 coyote kernel: [ 1030.152750] Call Trace:
> > Aug 21 22:37:47 coyote kernel: [ 1030.152758]  [<c130e363>] ? printk+0x23/0x40
> > Aug 21 22:37:47 coyote kernel: [ 1030.152763]  [<c108404f>] bad_page+0xcf/0x150
> > Aug 21 22:37:47 coyote kernel: [ 1030.152767]  [<c10850ed>] get_page_from_freelist+0x37d/0x480
> > Aug 21 22:37:47 coyote kernel: [ 1030.152771]  [<c10853cf>] __alloc_pages_nodemask+0xdf/0x520
> > Aug 21 22:37:47 coyote kernel: [ 1030.152775]  [<c1096b19>] handle_mm_fault+0x4a9/0x9f0
> > Aug 21 22:37:47 coyote kernel: [ 1030.152780]  [<c1020d61>] do_page_fault+0x141/0x290
> > Aug 21 22:37:47 coyote kernel: [ 1030.152784]  [<c1020c20>] ? do_page_fault+0x0/0x290
> > Aug 21 22:37:47 coyote kernel: [ 1030.152787]  [<c1311bcb>] error_code+0x73/0x78
> > Aug 21 22:37:47 coyote kernel: [ 1030.152789] Disabling lock debugging due to kernel taint
> 
> It looks like 'flags' is the one that causes this problem at allocation 
> time (count, mapcount, mapping and index all look nicely zeroed).
> 
> In particular, it's the 0x4000 bit (the high bit, which is also set, is 
> the upper field bits for page section/node/zone numbers etc), which is 
> either PG_head or PG_compound depending on CONFIG_PAGEFLAGS_EXTENDED.
> 
> And in your case, since you have CONFIG_PAGEFLAGS_EXTENDED=y, it would be 
> PG_head.
> 
> Btw guys, why don't we check PG_head etc at free time when we add the page 
> to the free list? Now we get that annoying error only when it is way too 
> late, and have no way to know who screwed up..
> 

Minimally because the checks for bad bits set in flags can happen before
the compound page is destroyed. Take the most common compound page, which
will have its destructor set to free_compound_page(). One free path then
looks like

put_page()
  -> put_compound_page()
    -> free_compound_page()
      -> __free_pages_ok()
         Check every head and tail page with free_pages_check()
        -> __free_one_page()
          -> destroy_compound_page()

So a check for the compound bits here would trigger for all compound pages
freed via put_page().

As the compound bits are always being checked in the free path and handled
accordingly, it's ordinarily considered "impossible" for pages to end up
on the free list with the bits intact which is why we don't have additional
checks for it.

-- 
Mel Gorman
Part-time Phd Student                          Linux Technology Center
University of Limerick                         IBM Dublin Software Lab


end of thread, other threads:[~2009-08-26 11:31 UTC | newest]

Thread overview: 7+ messages
     [not found] <alpine.LFD.2.01.0908211810390.3158@localhost.localdomain>
     [not found] ` <200908212248.40987.gene.heskett@verizon.net>
2009-08-22  4:17   ` Bad page state (was Re: Linux 2.6.31-rc7) Linus Torvalds
2009-08-23  7:22     ` Wu Fengguang
2009-08-23  8:20       ` Gene Heskett
2009-08-23 16:44         ` Linus Torvalds
2009-08-23 17:04           ` Gene Heskett
2009-08-24  2:22           ` Gene Heskett
2009-08-24 13:55     ` Mel Gorman
