* abnormal OOM killer message
From: 우충기 @ 2009-08-19 1:41 UTC
To: linux-kernel, linux-mm
Cc: fengguang.wu, riel, akpm, kosaki.motohiro, minchan.kim
Hi all~
I got the OOM log message below, and I don't understand why it happened.
When the direct reclaim routine (try_to_free_pages) called from
__alloc_pages fails, one last chance is given to allocate memory before
the OOM routine runs.
At that point, the allocator uses ALLOC_WMARK_HIGH as the watermark limit.
Then zone_watermark_ok() tests that watermark against the current memory
state and decides whether the allocation can proceed.
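For reference, zone_watermark_ok() in 2.6.18 looks roughly like this sketch
(paraphrased from memory, so treat details as approximate); with
ALLOC_WMARK_HIGH the mark argument is the zone's pages_high value.
---------------------------------------------------------------------------
/*
 * Rough sketch of 2.6.18's zone_watermark_ok() -- paraphrased, not verbatim.
 * Returns 1 if the allocation may proceed, 0 otherwise.
 */
int zone_watermark_ok(struct zone *z, int order, unsigned long mark,
		      int classzone_idx, int alloc_flags)
{
	long min = mark, free_pages = z->free_pages - (1 << order) + 1;
	int o;

	if (alloc_flags & ALLOC_HIGH)
		min -= min / 2;
	if (alloc_flags & ALLOC_HARDER)
		min -= min / 4;

	/* the zone must be above min plus any lowmem reserve */
	if (free_pages <= min + z->lowmem_reserve[classzone_idx])
		return 0;
	for (o = 0; o < order; o++) {
		/* at order o+1, pages of order <= o are unusable */
		free_pages -= z->free_area[o].nr_free << o;
		/* require fewer higher-order pages to be free */
		min >>= 1;
		if (free_pages <= min)
			return 0;
	}
	return 1;
}
---------------------------------------------------------------------------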
Below is the corresponding code in the __alloc_pages function.
The kernel version is 2.6.18 on ARM11, memory size is 32MB, and I use
compcache 0.5.2.
-------------------------------------------------------------------------------------------------------------------------------------------------------------
...
	did_some_progress = try_to_free_pages(zonelist->zones, gfp_mask);
						<== direct page reclaim
	p->reclaim_state = NULL;
	p->flags &= ~PF_MEMALLOC;

	cond_resched();

	if (likely(did_some_progress)) {
		page = get_page_from_freelist(gfp_mask, order,
						zonelist, alloc_flags);
		if (page)
			goto got_pg;
	} else if ((gfp_mask & __GFP_FS) && !(gfp_mask & __GFP_NORETRY)) {
						<== when reclaim fails
		/*
		 * Go through the zonelist yet one more time, keep
		 * very high watermark here, this is only to catch
		 * a parallel oom killing, we must fail if we're still
		 * under heavy pressure.
		 */
		page = get_page_from_freelist(gfp_mask|__GFP_HARDWALL, order,
				zonelist, ALLOC_WMARK_HIGH|ALLOC_CPUSET);
					<== last chance, uses ALLOC_WMARK_HIGH
		if (page)
			goto got_pg;

		out_of_memory(zonelist, gfp_mask, order);
		goto restart;
	}
...
-------------------------------------------------------------------------------------------------------------------------------------------------------------
In my case, you can see in the OOM message below that free pages (6804kB)
are far above the high watermark (1084kB).
The allocation order is also zero (order=0).
In the buddy system there are 867 free 4kB pages.
So I think the OOM should not have happened.
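To make that concrete, here is a small stand-alone sketch of mine (using the
numbers from the report below) of the order-0 case of the watermark test that
the last-chance allocation performs:
---------------------------------------------------------------------------
/* Hypothetical stand-alone check of the reported numbers. */
#include <stdio.h>

int main(void)
{
	long free_pages = 6804 / 4;	/* DMA free:6804kB -> 1701 pages */
	long high_mark  = 1084 / 4;	/* DMA high:1084kB ->  271 pages */
	long lowmem_reserve = 0;	/* lowmem_reserve[]: 0 0 0 0     */

	/* mirrors zone_watermark_ok(): fail if free <= mark + reserve */
	free_pages = free_pages - (1 << 0) + 1;
	if (free_pages > high_mark + lowmem_reserve)
		printf("ok: %ld free pages > %ld high watermark\n",
		       free_pages, high_mark);
	else
		printf("would fail\n");
	return 0;
}
---------------------------------------------------------------------------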
What do you think about this?
Is this a side effect of compcache?
Could you please explain?
Thanks.
This is the OOM message.
-------------------------------------------------------------------------------------------------------------------------------------------------------------
oom-killer: gfp_mask=0x201d2, order=0 (==> __GFP_HIGHMEM,
__GFP_WAIT, __GFP_IO, __GFP_FS, __GFP_COLD, __GFP_HARDWALL)
[<c00246c0>] (dump_stack+0x0/0x14) from [<c006ba68>] (out_of_memory+0x38/0x1d0)
[<c006ba30>] (out_of_memory+0x0/0x1d0) from [<c006d4cc>]
(__alloc_pages+0x244/0x2c4)
[<c006d288>] (__alloc_pages+0x0/0x2c4) from [<c006f054>]
(__do_page_cache_readahead+0x12c/0x2d4)
[<c006ef28>] (__do_page_cache_readahead+0x0/0x2d4) from [<c006f594>]
(do_page_cache_readahead+0x60/0x64)
[<c006f534>] (do_page_cache_readahead+0x0/0x64) from [<c006ac24>]
(filemap_nopage+0x1b4/0x438)
r7 = C0D8C320 r6 = C1422000 r5 = 00000001 r4 = 00000000
[<c006aa70>] (filemap_nopage+0x0/0x438) from [<c0075684>]
(__handle_mm_fault+0x398/0xb84)
[<c00752ec>] (__handle_mm_fault+0x0/0xb84) from [<c0027614>]
(do_page_fault+0xe8/0x224)
[<c002752c>] (do_page_fault+0x0/0x224) from [<c0027900>]
(do_DataAbort+0x3c/0xa0)
[<c00278c4>] (do_DataAbort+0x0/0xa0) from [<c001fde0>]
(ret_from_exception+0x0/0x10)
r8 = BE9894B8 r7 = 00000078 r6 = 00000130 r5 = 00000000
r4 = FFFFFFFF
Mem-info:
DMA per-cpu:
cpu 0 hot: high 6, batch 1 used:0
cpu 0 cold: high 2, batch 1 used:1
DMA32 per-cpu: empty
Normal per-cpu: empty
HighMem per-cpu: empty
Free pages: 6804kB (0kB HighMem)
Active:101 inactive:1527 dirty:0 writeback:0 unstable:0 free:1701
slab:936 mapped:972 pagetables:379
DMA free:6804kB min:724kB low:904kB high:1084kB active:404kB
inactive:6108kB present:32768kB pages_scanned:0 all_unreclaimable? no
lowmem_reserve[]: 0 0 0 0
DMA32 free:0kB min:0kB low:0kB high:0kB active:0kB inactive:0kB
present:0kB pages_scanned:0 all_unreclaimable? no
lowmem_reserve[]: 0 0 0 0
Normal free:0kB min:0kB low:0kB high:0kB active:0kB inactive:0kB
present:0kB pages_scanned:0 all_unreclaimable? no
lowmem_reserve[]: 0 0 0 0
HighMem free:0kB min:128kB low:128kB high:128kB active:0kB
inactive:0kB present:0kB pages_scanned:0 all_unreclaimable? no
lowmem_reserve[]: 0 0 0 0
DMA: 867*4kB 273*8kB 36*16kB 2*32kB 0*64kB 0*128kB 0*256kB 1*512kB
0*1024kB 0*2048kB 0*4096kB = 6804kB
DMA32: empty
Normal: empty
HighMem: empty
Swap cache: add 4597, delete 4488, find 159/299, race 0+0
Free swap = 67480kB
Total swap = 81916kB
Free swap: 67480kB
8192 pages of RAM
1960 free pages
978 reserved pages
936 slab pages
1201 pages shared
109 pages swap cached
Out of Memory: Kill process 47 (rc.local) score 849737 and children.
Out of memory: Killed process 49 (CTaskManager).
Killed
SW image is stopped..
script in BOOT is stopped...
Starting pid 348, console /dev/ttyS1: '/bin/sh'
-sh: id: not found
#
-------------------------------------------------------------------------------------------------------------------------------------------------------------
* Re: abnormal OOM killer message
From: Minchan Kim @ 2009-08-19 2:44 UTC
To: 우충기, Nitin Gupta
Cc: linux-kernel, linux-mm, fengguang.wu, riel, akpm,
kosaki.motohiro, minchan.kim, Mel Gorman
On Wed, 19 Aug 2009 10:41:51 +0900
우충기 <chungki.woo@gmail.com> wrote:
> Hi all~
> I got the OOM log message below, and I don't understand why it happened.
> When the direct reclaim routine (try_to_free_pages) called from
> __alloc_pages fails, one last chance is given to allocate memory before
> the OOM routine runs.
> At that point, the allocator uses ALLOC_WMARK_HIGH as the watermark limit.
> Then zone_watermark_ok() tests that watermark against the current memory
> state and decides whether the allocation can proceed.
>
> Below is the corresponding code in the __alloc_pages function.
> The kernel version is 2.6.18 on ARM11, memory size is 32MB, and I use
> compcache 0.5.2.
> <snip>
>
> In my case, you can see in the OOM message below that free pages (6804kB)
> are far above the high watermark (1084kB).
> The allocation order is also zero (order=0).
> In the buddy system there are 867 free 4kB pages.
> So I think the OOM should not have happened.
>
Yes, I think so.
In that case, we should even be able to avoid the zone defensive algorithm.
> What do you think about this?
> Is this a side effect of compcache?
I don't know compcache well.
But I suspect it. Let me Cc Nitin.
> Could you please explain?
> Thanks.
>
> This is the OOM message.
> -------------------------------------------------------------------------------------------------------------------------------------------------------------
> oom-killer: gfp_mask=0x201d2, order=0 (==> __GFP_HIGHMEM,
> __GFP_WAIT, __GFP_IO, __GFP_FS, __GFP_COLD, __GFP_HARDWALL)
> [<c00246c0>] (dump_stack+0x0/0x14) from [<c006ba68>] (out_of_memory+0x38/0x1d0)
> [<c006ba30>] (out_of_memory+0x0/0x1d0) from [<c006d4cc>]
> (__alloc_pages+0x244/0x2c4)
> [<c006d288>] (__alloc_pages+0x0/0x2c4) from [<c006f054>]
> (__do_page_cache_readahead+0x12c/0x2d4)
> [<c006ef28>] (__do_page_cache_readahead+0x0/0x2d4) from [<c006f594>]
> (do_page_cache_readahead+0x60/0x64)
> [<c006f534>] (do_page_cache_readahead+0x0/0x64) from [<c006ac24>]
> (filemap_nopage+0x1b4/0x438)
> r7 = C0D8C320 r6 = C1422000 r5 = 00000001 r4 = 00000000
> [<c006aa70>] (filemap_nopage+0x0/0x438) from [<c0075684>]
> (__handle_mm_fault+0x398/0xb84)
> [<c00752ec>] (__handle_mm_fault+0x0/0xb84) from [<c0027614>]
> (do_page_fault+0xe8/0x224)
> [<c002752c>] (do_page_fault+0x0/0x224) from [<c0027900>]
> (do_DataAbort+0x3c/0xa0)
> [<c00278c4>] (do_DataAbort+0x0/0xa0) from [<c001fde0>]
> (ret_from_exception+0x0/0x10)
> r8 = BE9894B8 r7 = 00000078 r6 = 00000130 r5 = 00000000
> r4 = FFFFFFFF
> Mem-info:
> DMA per-cpu:
> cpu 0 hot: high 6, batch 1 used:0
> cpu 0 cold: high 2, batch 1 used:1
> DMA32 per-cpu: empty
> Normal per-cpu: empty
> HighMem per-cpu: empty
> Free pages: 6804kB (0kB HighMem)
> Active:101 inactive:1527 dirty:0 writeback:0 unstable:0 free:1701
> slab:936 mapped:972 pagetables:379
> DMA free:6804kB min:724kB low:904kB high:1084kB active:404kB
> inactive:6108kB present:32768kB pages_scanned:0 all_unreclaimable? no
> lowmem_reserve[]: 0 0 0 0
> DMA32 free:0kB min:0kB low:0kB high:0kB active:0kB inactive:0kB
> present:0kB pages_scanned:0 all_unreclaimable? no
> lowmem_reserve[]: 0 0 0 0
> Normal free:0kB min:0kB low:0kB high:0kB active:0kB inactive:0kB
> present:0kB pages_scanned:0 all_unreclaimable? no
> lowmem_reserve[]: 0 0 0 0
> HighMem free:0kB min:128kB low:128kB high:128kB active:0kB
> inactive:0kB present:0kB pages_scanned:0 all_unreclaimable? no
> lowmem_reserve[]: 0 0 0 0
> DMA: 867*4kB 273*8kB 36*16kB 2*32kB 0*64kB 0*128kB 0*256kB 1*512kB
> 0*1024kB 0*2048kB 0*4096kB = 6804kB
> DMA32: empty
> Normal: empty
> HighMem: empty
> Swap cache: add 4597, delete 4488, find 159/299, race 0+0
> Free swap = 67480kB
> Total swap = 81916kB
In addition, total swap: 79M??
> Free swap: 67480kB
> 8192 pages of RAM
> 1960 free pages
> 978 reserved pages
> 936 slab pages
> 1201 pages shared
> 109 pages swap cached
free pages: 6M
page tables + slab + reserved: 8M
active + inactive: 6M
Where is the other 12M?
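As a rough stand-alone tally of those Mem-info counters (my own sketch; where
the gap actually hides depends on counters the report does not show, e.g.
compcache's own allocations):
---------------------------------------------------------------------------
/* Rough tally of the Mem-info counters above, in 4kB pages (hypothetical). */
#include <stdio.h>

int main(void)
{
	long total = 8192;			/* "8192 pages of RAM" = 32M */
	long free = 1701, slab = 936;		/* from the Mem-info lines   */
	long pagetables = 379, reserved = 978;
	long active = 101, inactive = 1527;
	long accounted = free + slab + pagetables + reserved
			 + active + inactive;

	printf("accounted: %ldkB, unaccounted: %ldkB\n",
	       accounted * 4, (total - accounted) * 4);	/* ~22M vs ~10M */
	return 0;
}
---------------------------------------------------------------------------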
> Out of Memory: Kill process 47 (rc.local) score 849737 and children.
> Out of memory: Killed process 49 (CTaskManager).
> Killed
> SW image is stopped..
> script in BOOT is stopped...
> Starting pid 348, console /dev/ttyS1: '/bin/sh'
> -sh: id: not found
> #
> -------------------------------------------------------------------------------------------------------------------------------------------------------------
As you mentioned, your memory size is 32M and you use compcache.
How can the swap size be bigger than your memory size?
Is it the result of compressing the swap pages?
Nitin, could you answer that question?
I can't imagine why an order-0 allocation failed although there are
many pages in the buddy allocator.
What do you mm guys think about this problem?
--
Kind regards,
Minchan Kim
* Re: abnormal OOM killer message
From: Nitin Gupta @ 2009-08-19 3:44 UTC
To: Minchan Kim
Cc: 우충기,
linux-kernel, linux-mm, fengguang.wu, riel, akpm,
kosaki.motohiro, Mel Gorman
On 08/19/2009 08:14 AM, Minchan Kim wrote:
> On Wed, 19 Aug 2009 10:41:51 +0900
> 우충기 <chungki.woo@gmail.com> wrote:
>
>> Hi all~
>> I got the OOM log message below, and I don't understand why it happened.
>> When the direct reclaim routine (try_to_free_pages) called from
>> __alloc_pages fails, one last chance is given to allocate memory before
>> the OOM routine runs.
>> At that point, the allocator uses ALLOC_WMARK_HIGH as the watermark limit.
>> Then zone_watermark_ok() tests that watermark against the current memory
>> state and decides whether the allocation can proceed.
>>
>> Below is the corresponding code in the __alloc_pages function.
>> The kernel version is 2.6.18 on ARM11, memory size is 32MB, and I use
>> compcache 0.5.2.
<snip>
>>
>> In my case, you can see in the OOM message below that free pages (6804kB)
>> are far above the high watermark (1084kB).
>> The allocation order is also zero (order=0).
>> In the buddy system there are 867 free 4kB pages.
>> So I think the OOM should not have happened.
>>
>
> Yes, I think so.
>
> In that case, we should even be able to avoid the zone defensive algorithm.
>
>> What do you think about this?
>> Is this a side effect of compcache?
>
compcache can be storing a lot of stale data, and this memory cannot be
reclaimed (unless overwritten by some other swap data). This is because
compcache does not know when a swap slot has been freed and hence does not
know when it's safe to free the corresponding memory. You can check the
current memory usage with /proc/ramzswap (see MemUsedTotal).
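If it helps, a minimal user-space sketch to pull that statistic out (assuming
/proc/ramzswap is plain text on your build; exact field names may vary by
compcache version):
---------------------------------------------------------------------------
/* Minimal sketch: grep /proc/ramzswap for MemUsedTotal (field name assumed). */
#include <stdio.h>
#include <string.h>

int main(void)
{
	char line[256];
	FILE *f = fopen("/proc/ramzswap", "r");

	if (!f) {
		perror("/proc/ramzswap");
		return 1;
	}
	while (fgets(line, sizeof(line), f))
		if (strstr(line, "MemUsedTotal"))
			fputs(line, stdout);
	fclose(f);
	return 0;
}
---------------------------------------------------------------------------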
BTW, with compcache-0.6 there is an experimental kernel patch that gets rid of
all this stale data:
http://patchwork.kernel.org/patch/41083/
However, this compcache version needs at least kernel 2.6.28. This version also
fixes all known problems on ARM. compcache-0.5.3 or earlier is known to crash on
ARM (see: http://code.google.com/p/compcache/issues/detail?id=33).
Thanks,
Nitin
* Re: abnormal OOM killer message
From: Minchan Kim @ 2009-08-19 4:51 UTC
To: ngupta
Cc: Minchan Kim, 우충기,
linux-kernel, linux-mm, fengguang.wu, riel, akpm,
kosaki.motohiro, Mel Gorman
On Wed, 19 Aug 2009 09:14:08 +0530
Nitin Gupta <ngupta@vflare.org> wrote:
> On 08/19/2009 08:14 AM, Minchan Kim wrote:
> > On Wed, 19 Aug 2009 10:41:51 +0900
> > 우충기 <chungki.woo@gmail.com> wrote:
> >
> > <snip>
> >> What do you think about this?
> >> Is this a side effect of compcache?
> >
>
> compcache can be storing a lot of stale data, and this memory cannot be
> reclaimed (unless overwritten by some other swap data). This is because
Stale data... that seems related to ARMv6.
I think Chungki's CPU is ARMv6.
> compcache does not know when a swap slot has been freed and hence does not
> know when it's safe to free the corresponding memory. You can check the
> current memory usage with /proc/ramzswap (see MemUsedTotal).
>
Let me ask a question.
The system now has 79M of total swap.
That's bigger than the system memory size.
Is that possible with compcache?
Can we trust the number?
> BTW, with compcache-0.6 there is an experimental kernel patch that gets rid of
> all this stale data:
> http://patchwork.kernel.org/patch/41083/
>
> However, this compcache version needs at least kernel 2.6.28. This version also
> fixes all known problems on ARM. compcache-0.5.3 or earlier is known to crash on
> ARM (see: http://code.google.com/p/compcache/issues/detail?id=33).
>
Chungki, is it easily reproducible?
Could you try it with compcache-0.6?
As Nitin said, it seems to solve the cache aliasing problem.
> Thanks,
> Nitin
--
Kind regards,
Minchan Kim
* Re: abnormal OOM killer message
From: 우충기 @ 2009-08-19 6:24 UTC
To: Minchan Kim
Cc: ngupta, linux-kernel, linux-mm, fengguang.wu, riel, akpm,
kosaki.motohiro, Mel Gorman
Thank you very much for the replies.
But I don't think this is related to the stale data problem in compcache.
My question was why the last-chance memory allocation failed.
When the OOM killer ran, the memory state was not one that should trigger it.
In particular, there are plenty of order-0 pages, and the allocation order
is zero.
I think that last allocation attempt should have succeeded.
That's my worry.
-----------------------------------------------------------------------------------------------------------------------------------------------
	page = get_page_from_freelist(gfp_mask|__GFP_HARDWALL, order,
			zonelist, ALLOC_WMARK_HIGH|ALLOC_CPUSET);
					<== last chance, uses ALLOC_WMARK_HIGH
	if (page)
		goto got_pg;

	out_of_memory(zonelist, gfp_mask, order);
	goto restart;
-----------------------------------------------------------------------------------------------------------------------------------------------
> Let me ask a question.
> The system now has 79M of total swap.
> That's bigger than the system memory size.
> Is that possible with compcache?
> Can we trust the number?
Yeah, it's possible. 79MB is the amount of data that can be swapped out.
It's not the compressed size; it's the original, uncompressed data size.
Thanks,
Minchan, Nitin
* Re: abnormal OOM killer message
From: Minchan Kim @ 2009-08-19 6:49 UTC
To: 우충기, Mel Gorman
Cc: Minchan Kim, ngupta, linux-kernel, linux-mm, fengguang.wu, riel,
akpm, kosaki.motohiro
On Wed, 19 Aug 2009 15:24:54 +0900
우충기 <chungki.woo@gmail.com> wrote:
> Thank you very much for the replies.
>
> But I don't think this is related to the stale data problem in compcache.
> My question was why the last-chance memory allocation failed.
> When the OOM killer ran, the memory state was not one that should
> trigger it.
> In particular, there are plenty of order-0 pages, and the allocation
> order is zero.
> I think that last allocation attempt should have succeeded.
> That's my worry.
Yes, I agree with you.
Mel, could you comment on this situation?
Is it possible for an order-0 allocation to fail
even when there are many pages in the buddy allocator?
>
> <snip>
>
> > Let me ask a question.
> > The system now has 79M of total swap.
> > That's bigger than the system memory size.
> > Is that possible with compcache?
> > Can we trust the number?
>
> Yeah, it's possible. 79MB is the amount of data that can be swapped out.
> It's not the compressed size; it's the original, uncompressed data size.
You mean 79M worth of your pages are swapped out into compcache's reserved
memory?
>
> Thanks,
> Minchan, Nitin
--
Kind regards,
Minchan Kim
* Re: abnormal OOM killer message
From: Chungki woo @ 2009-08-19 7:14 UTC
To: Minchan Kim
Cc: Mel Gorman, ngupta, linux-kernel, linux-mm, fengguang.wu, riel,
akpm, kosaki.motohiro
> You mean 79M worth of your pages are swapped out into compcache's reserved
> memory?
Compcache doesn't have reserved memory.
It allocates memory only when it needs it.
Thanks.
* Re: abnormal OOM killer message
From: Minchan Kim @ 2009-08-19 7:29 UTC
To: Chungki woo
Cc: Mel Gorman, ngupta, linux-kernel, linux-mm, fengguang.wu, riel,
akpm, kosaki.motohiro
On Wed, Aug 19, 2009 at 4:14 PM, Chungki woo <chungki.woo@gmail.com> wrote:
>> You mean 79M worth of your pages are swapped out into compcache's reserved
>> memory?
>
> Compcache doesn't have reserved memory.
> It allocates memory only when it needs it.
Okay, 'reserved' is not the important part. :)
My point was: are 79M worth of pages really swapped out to the compcache
swap device?
Is the number real?
Can we believe it?
>
> Thanks.
>
--
Kind regards,
Minchan Kim
* Re: abnormal OOM killer message
From: Nitin Gupta @ 2009-08-19 8:25 UTC
To: Minchan Kim
Cc: Chungki woo, Mel Gorman, linux-kernel, linux-mm, fengguang.wu,
riel, akpm, kosaki.motohiro
On Wed, Aug 19, 2009 at 12:59 PM, Minchan Kim <minchan.kim@gmail.com> wrote:
> On Wed, Aug 19, 2009 at 4:14 PM, Chungki woo <chungki.woo@gmail.com> wrote:
>>> You mean 79M worth of your pages are swapped out into compcache's
>>> reserved memory?
>>
>> Compcache doesn't have reserved memory.
>> It allocates memory only when it needs it.
>
> Okay, 'reserved' is not the important part. :)
> My point was: are 79M worth of pages really swapped out to the compcache
> swap device?
> Is the number real?
> Can we believe it?
>
I would suggest moving the compcache-related discussion over to
linux-mm-cc AT laptop DOT org
as it might not be of such general interest. I would be glad to
discuss your doubts there in detail.
See you over there.
Thanks,
Nitin
* Re: abnormal OOM killer message
From: Minchan Kim @ 2009-08-19 8:42 UTC
To: Nitin Gupta
Cc: Minchan Kim, Chungki woo, Mel Gorman, linux-kernel, linux-mm,
fengguang.wu, riel, akpm, kosaki.motohiro
On Wed, 19 Aug 2009 13:55:44 +0530
Nitin Gupta <ngupta@vflare.org> wrote:
> On Wed, Aug 19, 2009 at 12:59 PM, Minchan Kim <minchan.kim@gmail.com> wrote:
> > On Wed, Aug 19, 2009 at 4:14 PM, Chungki woo <chungki.woo@gmail.com> wrote:
> >>> You mean 79M worth of your pages are swapped out into compcache's
> >>> reserved memory?
> >>
> >> Compcache doesn't have reserved memory.
> >> It allocates memory only when it needs it.
> >
> > Okay, 'reserved' is not the important part. :)
> > My point was: are 79M worth of pages really swapped out to the compcache
> > swap device?
> > Is the number real?
> > Can we believe it?
> >
>
>
> I would suggest moving the compcache-related discussion over to
> linux-mm-cc AT laptop DOT org
> as it might not be of such general interest. I would be glad to
> discuss your doubts there in detail.
Thanks, but we still haven't found the exact cause.
I am not sure whether this is a compcache problem or a buddy allocator
problem.
First of all, we have to determine that. :)
> See you over there.
>
> Thanks,
> Nitin
--
Kind regards,
Minchan Kim
* Re: abnormal OOM killer message
From: Minchan Kim @ 2009-08-19 10:18 UTC
To: 우충기, Nitin Gupta
Cc: linux-kernel, linux-mm, fengguang.wu, riel, akpm,
kosaki.motohiro, minchan.kim, Mel Gorman
On Wed, Aug 19, 2009 at 11:44 AM, Minchan Kim <minchan.kim@gmail.com> wrote:
> <snip>
> I can't imagine why an order-0 allocation failed although there are
> many pages in the buddy allocator.
>
> What do you mm guys think about this problem?
I can only think that the zonelists were set up wrongly or a free list
got corrupted.
Could you print your zonelist for __GFP_HIGHMEM?
--
Kind regards,
Minchan Kim
* Re: abnormal OOM killer message
From: Mel Gorman @ 2009-08-19 10:36 UTC
To: Minchan Kim
Cc: 우충기,
ngupta, linux-kernel, linux-mm, fengguang.wu, riel, akpm,
kosaki.motohiro
On Wed, Aug 19, 2009 at 03:49:58PM +0900, Minchan Kim wrote:
> On Wed, 19 Aug 2009 15:24:54 +0900
> 우충기 <chungki.woo@gmail.com> wrote:
>
> > Thank you very much for the replies.
> >
> > But I don't think this is related to the stale data problem in compcache.
> > My question was why the last-chance memory allocation failed.
> > When the OOM killer ran, the memory state was not one that should
> > trigger it.
> > In particular, there are plenty of order-0 pages, and the allocation
> > order is zero.
> > I think that last allocation attempt should have succeeded.
> > That's my worry.
>
> Yes, I agree with you.
> Mel, could you comment on this situation?
> Is it possible for an order-0 allocation to fail
> even when there are many pages in the buddy allocator?
>
Not ordinarily. If it happens, I tend to suspect that the free list data
is corrupted and would put a check in __rmqueue() that looked like
BUG_ON(list_empty(&area->free_list) && area->nr_free);
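For illustration, the check would sit roughly here in 2.6.18's __rmqueue()
(a sketch from memory, not a verbatim diff):
---------------------------------------------------------------------------
static struct page *__rmqueue(struct zone *zone, unsigned int order)
{
	struct free_area *area;
	unsigned int current_order;
	struct page *page;

	for (current_order = order; current_order < MAX_ORDER;
	     ++current_order) {
		area = zone->free_area + current_order;

		/* catch a corrupted free list: empty list, nonzero count */
		BUG_ON(list_empty(&area->free_list) && area->nr_free);

		if (list_empty(&area->free_list))
			continue;

		page = list_entry(area->free_list.next, struct page, lru);
		list_del(&page->lru);
		rmv_page_order(page);
		area->nr_free--;
		zone->free_pages -= 1UL << order;
		expand(zone, page, order, current_order, area);
		return page;
	}

	return NULL;
}
---------------------------------------------------------------------------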
The second question is, why are we in direct reclaim this far above the
watermark? It should only be kswapd that is doing any reclaim at that
point. That makes me wonder again whether the free lists are corrupted.
The other possibility is that the zonelist used for allocation in the
troubled path contains no populated zones. I would put a BUG_ON check in
get_page_from_freelist() to check if the first zone in the zonelist has no
pages. If that bug triggers, it might explain why OOMs are triggering for
no good reason.
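A sketch of that check, for illustration (placement from memory;
populated_zone() simply tests zone->present_pages):
---------------------------------------------------------------------------
	/* at the top of get_page_from_freelist(), 2.6.18-style (sketch) */
	struct zone **z = zonelist->zones;

	/* an empty or unpopulated zonelist makes every allocation "fail" */
	BUG_ON(*z == NULL || !populated_zone(*z));
---------------------------------------------------------------------------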
I consider both of those possibilities abnormal though.
--
Mel Gorman
Part-time Phd Student Linux Technology Center
University of Limerick IBM Dublin Software Lab
* Re: abnormal OOM killer message
From: Minchan Kim @ 2009-08-19 10:52 UTC
To: Mel Gorman
Cc: Minchan Kim, 우충기,
ngupta, linux-kernel, linux-mm, fengguang.wu, riel, akpm,
kosaki.motohiro
Thanks for the good comments, Mel.
On Wed, 19 Aug 2009 11:36:11 +0100
Mel Gorman <mel@csn.ul.ie> wrote:
> On Wed, Aug 19, 2009 at 03:49:58PM +0900, Minchan Kim wrote:
> > On Wed, 19 Aug 2009 15:24:54 +0900
> > 우충기 <chungki.woo@gmail.com> wrote:
> >
> > > Thank you very much for the replies.
> > >
> > > But I don't think this is related to the stale data problem in
> > > compcache.
> > > My question was why the last-chance memory allocation failed.
> > > When the OOM killer ran, the memory state was not one that should
> > > trigger it.
> > > In particular, there are plenty of order-0 pages, and the allocation
> > > order is zero.
> > > I think that last allocation attempt should have succeeded.
> > > That's my worry.
> >
> > Yes, I agree with you.
> > Mel, could you comment on this situation?
> > Is it possible for an order-0 allocation to fail
> > even when there are many pages in the buddy allocator?
> >
>
> Not ordinarily. If it happens, I tend to suspect that the free list data
> is corrupted and would put a check in __rmqueue() that looked like
>
> BUG_ON(list_empty(&area->free_list) && area->nr_free);
If memory is corrupted, both conditions might not be satisfied together.
It would be better to OR the conditions:
BUG_ON(list_empty(&area->free_list) || area->nr_free);
> The second question is, why are we in direct reclaim this far above the
> watermark? It should only be kswapd that is doing any reclaim at that
> point. That makes me wonder again whether the free lists are corrupted.
It does make sense!
> The other possibility is that the zonelist used for allocation in the
> troubled path contains no populated zones. I would put a BUG_ON check in
> get_page_from_freelist() to check if the first zone in the zonelist has no
> pages. If that bug triggers, it might explain why OOMs are triggering for
> no good reason.
Yes. Chungki, could you put both BUG_ONs in those functions and
try to reproduce the problem?
--
Kind regards,
Minchan Kim
* Re: abnormal OOM killer message
From: Mel Gorman @ 2009-08-19 10:58 UTC
To: Minchan Kim
Cc: 우충기,
ngupta, linux-kernel, linux-mm, fengguang.wu, riel, akpm,
kosaki.motohiro
On Wed, Aug 19, 2009 at 07:52:42PM +0900, Minchan Kim wrote:
> Thanks for the good comments, Mel.
>
> On Wed, 19 Aug 2009 11:36:11 +0100
> Mel Gorman <mel@csn.ul.ie> wrote:
>
> > On Wed, Aug 19, 2009 at 03:49:58PM +0900, Minchan Kim wrote:
> > > On Wed, 19 Aug 2009 15:24:54 +0900
> > > 우충기 <chungki.woo@gmail.com> wrote:
> > >
> > > > Thank you very much for the replies.
> > > >
> > > > But I don't think this is related to the stale data problem in
> > > > compcache.
> > > > My question was why the last-chance memory allocation failed.
> > > > When the OOM killer ran, the memory state was not one that should
> > > > trigger it.
> > > > In particular, there are plenty of order-0 pages, and the allocation
> > > > order is zero.
> > > > I think that last allocation attempt should have succeeded.
> > > > That's my worry.
> > >
> > > Yes, I agree with you.
> > > Mel, could you comment on this situation?
> > > Is it possible for an order-0 allocation to fail
> > > even when there are many pages in the buddy allocator?
> > >
> >
> > Not ordinarily. If it happens, I tend to suspect that the free list data
> > is corrupted and would put a check in __rmqueue() that looked like
> >
> > BUG_ON(list_empty(&area->free_list) && area->nr_free);
>
> If memory is corrupted, both conditions might not be satisfied together.
> It would be better to OR the conditions:
>
> BUG_ON(list_empty(&area->free_list) || area->nr_free);
>
But it's perfectly reasonable for nr_free to have a positive value. The
point of the check is to ensure the counters make sense. If nr_free > 0 and
the list is empty, it means the accounting is all messed up and the values
reported for "free" in the OOM message are fiction.
--
Mel Gorman
Part-time Phd Student Linux Technology Center
University of Limerick IBM Dublin Software Lab
* Re: abnormal OOM killer message
From: Minchan Kim @ 2009-08-19 11:01 UTC
To: Mel Gorman
Cc: 우충기,
ngupta, linux-kernel, linux-mm, fengguang.wu, riel, akpm,
kosaki.motohiro
On Wed, Aug 19, 2009 at 7:58 PM, Mel Gorman <mel@csn.ul.ie> wrote:
> On Wed, Aug 19, 2009 at 07:52:42PM +0900, Minchan Kim wrote:
>> <snip>
>> If memory is corrupted, both conditions might not be satisfied together.
>> It would be better to OR the conditions:
>>
>> BUG_ON(list_empty(&area->free_list) || area->nr_free);
>>
>
> But it's perfectly reasonable for nr_free to have a positive value. The
> point of the check is to ensure the counters make sense. If nr_free > 0 and
> the list is empty, it means the accounting is all messed up and the values
> reported for "free" in the OOM message are fiction.
Huh. My mistake.
I confused it with !area->nr_free.
Sorry for confusing you.
Thanks again.
--
Kind regards,
Minchan Kim
* Re: abnormal OOM killer message
From: Chungki woo @ 2009-08-19 12:06 UTC
To: Mel Gorman
Cc: Minchan Kim, ngupta, linux-kernel, linux-mm, fengguang.wu, riel,
akpm, kosaki.motohiro
On Wed, Aug 19, 2009 at 7:58 PM, Mel Gorman <mel@csn.ul.ie> wrote:
> On Wed, Aug 19, 2009 at 07:52:42PM +0900, Minchan Kim wrote:
>> <snip>
>> > Not ordinarily. If it happens, I tend to suspect that the free list data
>> > is corrupted and would put a check in __rmqueue() that looked like
>> >
>> > BUG_ON(list_empty(&area->free_list) && area->nr_free);
>>
>> If memory is corrupted, both conditions might not be satisfied together.
>> It would be better to OR the conditions:
>>
>> BUG_ON(list_empty(&area->free_list) || area->nr_free);
>>
>
> But it's perfectly reasonable for nr_free to have a positive value. The
> point of the check is to ensure the counters make sense. If nr_free > 0 and
> the list is empty, it means the accounting is all messed up and the values
> reported for "free" in the OOM message are fiction.
>
>> > The second question is, why are we in direct reclaim this far above the
>> > watermark? It should only be kswapd that is doing any reclaim at that
>> > point. That makes me wonder again whether the free lists are corrupted.
>>
>> It does make sense!
>
'Corrupted free list' makes sense. Thank you very much.
Inserting the BUG_ON code is also a good way to check for free list
corruption.
I have one more question.
As you know, cond_resched() is called before and after the direct reclaim
routine (try_to_free_pages), so the task can be scheduled out at that point.
Isn't it possible that kswapd, or try_to_free_pages in another context,
runs at that time?
I think that might also explain the gap between the watermark and the
free memory.
What do you think?
But I know this can't explain why the last-chance allocation failed.
I think your idea makes sense.
Anyway, I will test again with the following BUG_ON:
BUG_ON(list_empty(&area->free_list) && area->nr_free);
Thanks,
Mel, Minchan
Thread overview: 16 messages
2009-08-19 1:41 abnormal OOM killer message 우충기
2009-08-19 2:44 ` Minchan Kim
2009-08-19 3:44 ` Nitin Gupta
2009-08-19 4:51 ` Minchan Kim
2009-08-19 6:24 ` 우충기
2009-08-19 6:49 ` Minchan Kim
2009-08-19 7:14 ` Chungki woo
2009-08-19 7:29 ` Minchan Kim
2009-08-19 8:25 ` Nitin Gupta
2009-08-19 8:42 ` Minchan Kim
2009-08-19 10:36 ` Mel Gorman
2009-08-19 10:52 ` Minchan Kim
2009-08-19 10:58 ` Mel Gorman
2009-08-19 11:01 ` Minchan Kim
2009-08-19 12:06 ` Chungki woo
2009-08-19 10:18 ` Minchan Kim