[RFC][PATCH] prevent incorrect oom under split_lru

linux-mm.kvack.org archive mirror

* [RFC][PATCH] prevent incorrect oom under split_lru
@ 2008-06-24  8:31 KOSAKI Motohiro
  2008-06-24 13:28 ` Rik van Riel
  0 siblings, 1 reply; 16+ messages in thread
From: KOSAKI Motohiro @ 2008-06-24  8:31 UTC (permalink / raw)
  To: linux-mm, LKML, Lee Schermerhorn, Rik van Riel; +Cc: kosaki.motohiro

Hi Rik,

I encountered a strange OOM while running a stress workload.
The oom-killer was invoked even though many swappable pages still existed.

I guess this is a split_lru related bug.
What do you think about the patch below?

-------------
page01 invoked oom-killer: gfp_mask=0x1201d2, order=0, oomkilladj=0

Call Trace:
 [<a0000001000175e0>] show_stack+0x80/0xa0
                                sp=e00001600e1ffae0 bsp=e00001600e1f1598
 [<a000000100017630>] dump_stack+0x30/0x60
                                sp=e00001600e1ffcb0 bsp=e00001600e1f1580
 [<a000000100133f10>] oom_kill_process+0x250/0x4c0
                                sp=e00001600e1ffcb0 bsp=e00001600e1f1518
 [<a000000100134db0>] out_of_memory+0x3f0/0x520
                                sp=e00001600e1ffcc0 bsp=e00001600e1f14b8
 [<a00000010013f650>] __alloc_pages_internal+0x6b0/0x860
                                sp=e00001600e1ffd60 bsp=e00001600e1f13e8
 [<a00000010018ae80>] alloc_pages_current+0x120/0x1c0
                                sp=e00001600e1ffd70 bsp=e00001600e1f13b0
 [<a00000010012cad0>] __page_cache_alloc+0x130/0x160
                                sp=e00001600e1ffd70 bsp=e00001600e1f1390
 [<a000000100144270>] __do_page_cache_readahead+0x150/0x580
                                sp=e00001600e1ffd70 bsp=e00001600e1f12f8
 [<a0000001001451d0>] do_page_cache_readahead+0xf0/0x120
                                sp=e00001600e1ffd80 bsp=e00001600e1f12c0
 [<a000000100132250>] filemap_fault+0x430/0x8e0
                                sp=e00001600e1ffd80 bsp=e00001600e1f1208
 [<a000000100158900>] __do_fault+0xa0/0xc80
                                sp=e00001600e1ffd80 bsp=e00001600e1f1178
 [<a00000010015d740>] handle_mm_fault+0x260/0x1240
                                sp=e00001600e1ffda0 bsp=e00001600e1f10f0
 [<a0000001007aaab0>] ia64_do_page_fault+0x6f0/0xb00
                                sp=e00001600e1ffda0 bsp=e00001600e1f1090
 [<a00000010000c4e0>] ia64_native_leave_kernel+0x0/0x270
                                sp=e00001600e1ffe30 bsp=e00001600e1f1090
Node 2 DMA per-cpu:
CPU    0: hi:    6, btch:   1 usd:   5
CPU    1: hi:    6, btch:   1 usd:   5
CPU    2: hi:    6, btch:   1 usd:   5
CPU    3: hi:    6, btch:   1 usd:   5
CPU    4: hi:    6, btch:   1 usd:   5
CPU    5: hi:    6, btch:   1 usd:   5
CPU    6: hi:    6, btch:   1 usd:   5
CPU    7: hi:    6, btch:   1 usd:   5
Node 2 Normal per-cpu:
CPU    0: hi:    6, btch:   1 usd:   5
CPU    1: hi:    6, btch:   1 usd:   5
CPU    2: hi:    6, btch:   1 usd:   5
CPU    3: hi:    6, btch:   1 usd:   5
CPU    4: hi:    6, btch:   1 usd:   5
CPU    5: hi:    6, btch:   1 usd:   5
CPU    6: hi:    6, btch:   1 usd:   5
CPU    7: hi:    6, btch:   1 usd:   5
Node 3 Normal per-cpu:
CPU    0: hi:    6, btch:   1 usd:   5
CPU    1: hi:    6, btch:   1 usd:   5
CPU    2: hi:    6, btch:   1 usd:   2
CPU    3: hi:    6, btch:   1 usd:   2
CPU    4: hi:    6, btch:   1 usd:   4
CPU    5: hi:    6, btch:   1 usd:   5
CPU    6: hi:    6, btch:   1 usd:   4
CPU    7: hi:    6, btch:   1 usd:   5
Active_anon:53395 active_file:141 inactive_anon18042
 inactive_file:544 unevictable:12288 dirty:494 writeback:365 unstable:0
 free:288 slab:28313 mapped:83 pagetables:663 bounce:0
Node 2 DMA free:8128kB min:2624kB low:3264kB high:3904kB active_anon:753536kB inactive_anon:587840kB active_file:2176kB inactive_file:8064kB unevictable:0kB present:1863168kB pages_scanned:3934 all_unreclaimable? no
lowmem_reserve[]: 0 110 110
Node 2 Normal free:6464kB min:2560kB low:3200kB high:3840kB active_anon:198400kB inactive_anon:73472kB active_file:1664kB inactive_file:18816kB unevictable:0kB present:1802240kB pages_scanned:64 all_unreclaimable? no
lowmem_reserve[]: 0 0 0
Node 3 Normal free:3840kB min:5888kB low:7360kB high:8832kB active_anon:2465344kB inactive_anon:493376kB active_file:5184kB inactive_file:7936kB unevictable:786432kB present:4124224kB pages_scanned:872 all_unreclaimable? no
lowmem_reserve[]: 0 0 0
Node 2 DMA: 61*64kB 1*128kB 1*256kB 0*512kB 0*1024kB 1*2048kB 0*4096kB 0*8192kB 0*16384kB 0*32768kB 0*65536kB 0*131072kB 0*262144kB 0*524288kB 0*1048576kB 0*2097152kB 0*4194304kB = 6336kB
Node 2 Normal: 0*64kB 0*128kB 0*256kB 0*512kB 0*1024kB 0*2048kB 0*4096kB 0*8192kB 0*16384kB 0*32768kB 0*65536kB 0*131072kB 0*262144kB 0*524288kB 0*1048576kB 0*2097152kB 0*4194304kB = 0kB
Node 3 Normal: 57*64kB 6*128kB 0*256kB 0*512kB 0*1024kB 0*2048kB 0*4096kB 0*8192kB 0*16384kB 0*32768kB 0*65536kB 0*131072kB 0*262144kB 0*524288kB 0*1048576kB 0*2097152kB 0*4194304kB = 4416kB
1158 total pagecache pages
Swap cache: add 525, delete 283, find 0/0
Free swap  = 1997888kB
Total swap = 2031488kB
Out of memory: kill process 56203 (usex) score 2837 or a child
Killed process 56309 (usex)



----------------------
If the zone->recent_scanned parameters become imbalanced between anon and file,
the OOM killer can be invoked even though swappable pages exist.

So, if priority==0, we should try to reclaim all pages to prevent OOM.


Signed-off-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>

---
 mm/vmscan.c |    6 ++++--
 1 file changed, 4 insertions(+), 2 deletions(-)

Index: b/mm/vmscan.c
===================================================================
--- a/mm/vmscan.c
+++ b/mm/vmscan.c
@@ -1464,8 +1464,10 @@ static unsigned long shrink_zone(int pri
 			 * kernel will slowly sift through each list.
 			 */
 			scan = zone_page_state(zone, NR_LRU_BASE + l);
-			scan >>= priority;
-			scan = (scan * percent[file]) / 100;
+			if (priority) {
+				scan >>= priority;
+				scan = (scan * percent[file]) / 100;
+			}
 			zone->lru[l].nr_scan += scan + 1;
 			nr[l] = zone->lru[l].nr_scan;
 			if (nr[l] >= sc->swap_cluster_max)
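
The effect of the hunk above on the scan target can be sketched in plain C. This is a simplified user-space model, not kernel code: the function names scan_target_old() and scan_target_new() are made up for illustration, the "+ 1" added to nr_scan is left out, and the 1% ratio used in the note below is a hypothetical value for a badly skewed percent[file].

```c
#include <assert.h>

/*
 * Simplified model of the per-list scan target in shrink_zone().
 * lru_size, priority and percent mirror the values used in the patch;
 * the function names are hypothetical.
 */

/* Before the patch: always shift by priority and weight by percent. */
unsigned long scan_target_old(unsigned long lru_size, int priority,
                              unsigned long percent)
{
    assert(priority >= 0 && percent <= 100);
    unsigned long scan = lru_size >> priority;
    return (scan * percent) / 100;
}

/* After the patch: at priority 0 the whole list is eligible for scanning. */
unsigned long scan_target_new(unsigned long lru_size, int priority,
                              unsigned long percent)
{
    assert(priority >= 0 && percent <= 100);
    unsigned long scan = lru_size;
    if (priority) {
        scan >>= priority;
        scan = (scan * percent) / 100;
    }
    return scan;
}
```

With the 18042 inactive anon pages from the log and an assumed ratio of 1%, the old calculation asks for only 180 pages at priority 0, while the patched one asks for all 18042. For any non-zero priority both versions return the same value.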


--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>


* Re: [RFC][PATCH] prevent incorrect oom under split_lru
  2008-06-24  8:31 [RFC][PATCH] prevent incorrect oom under split_lru KOSAKI Motohiro
@ 2008-06-24 13:28 ` Rik van Riel
  2008-06-25  5:59   ` MinChan Kim
  0 siblings, 1 reply; 16+ messages in thread
From: Rik van Riel @ 2008-06-24 13:28 UTC (permalink / raw)
  To: KOSAKI Motohiro; +Cc: linux-mm, LKML, Lee Schermerhorn, akpm

On Tue, 24 Jun 2008 17:31:54 +0900
KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com> wrote:

> if zone->recent_scanned parameter become inbalanceing anon and file,
> OOM killer can happened although swappable page exist.
> 
> So, if priority==0, We should try to reclaim all page for prevent OOM.

You are absolutely right.  Good catch.

> Signed-off-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>

Acked-by: Rik van Riel <riel@redhat.com>

> ---
>  mm/vmscan.c |    6 ++++--
>  1 file changed, 4 insertions(+), 2 deletions(-)
> 
> Index: b/mm/vmscan.c
> ===================================================================
> --- a/mm/vmscan.c
> +++ b/mm/vmscan.c
> @@ -1464,8 +1464,10 @@ static unsigned long shrink_zone(int pri
>  			 * kernel will slowly sift through each list.
>  			 */
>  			scan = zone_page_state(zone, NR_LRU_BASE + l);
> -			scan >>= priority;
> -			scan = (scan * percent[file]) / 100;
> +			if (priority) {
> +				scan >>= priority;
> +				scan = (scan * percent[file]) / 100;
> +			}
>  			zone->lru[l].nr_scan += scan + 1;
>  			nr[l] = zone->lru[l].nr_scan;
>  			if (nr[l] >= sc->swap_cluster_max)
> 


-- 
All rights reversed.


* Re: [RFC][PATCH] prevent incorrect oom under split_lru
  2008-06-24 13:28 ` Rik van Riel
@ 2008-06-25  5:59   ` MinChan Kim
  2008-06-25  6:08     ` KOSAKI Motohiro
  0 siblings, 1 reply; 16+ messages in thread
From: MinChan Kim @ 2008-06-25  5:59 UTC (permalink / raw)
  To: Rik van Riel, KOSAKI Motohiro
  Cc: linux-mm, LKML, Lee Schermerhorn, akpm, Takenori Nagano

On Tue, Jun 24, 2008 at 10:28 PM, Rik van Riel <riel@redhat.com> wrote:
> On Tue, 24 Jun 2008 17:31:54 +0900
> KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com> wrote:
>
>> if zone->recent_scanned parameter become inbalanceing anon and file,
>> OOM killer can happened although swappable page exist.
>>
>> So, if priority==0, We should try to reclaim all page for prevent OOM.
>
> You are absolutely right.  Good catch.

I have a concern about application latency.
If the lru list has many pages, it takes a very long time to scan them.
The more RAM a system has, the longer the scan takes.

Of course, I know this is a trade-off between memory efficiency and latency.
But in embedded systems, some applications consider latency more important
than memory efficiency.
We need some mechanism to cut off scanning time.


I think Takenori Nagano's "memory reclaim more efficiently patch" is
a proper way to reduce application latency in this case, if we modify some
of its code.

What do you think about it?

>> Signed-off-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
>
> Acked-by: Rik van Riel <riel@redhat.com>
>
>> ---
>>  mm/vmscan.c |    6 ++++--
>>  1 file changed, 4 insertions(+), 2 deletions(-)
>>
>> Index: b/mm/vmscan.c
>> ===================================================================
>> --- a/mm/vmscan.c
>> +++ b/mm/vmscan.c
>> @@ -1464,8 +1464,10 @@ static unsigned long shrink_zone(int pri
>>                        * kernel will slowly sift through each list.
>>                        */
>>                       scan = zone_page_state(zone, NR_LRU_BASE + l);
>> -                     scan >>= priority;
>> -                     scan = (scan * percent[file]) / 100;
>> +                     if (priority) {
>> +                             scan >>= priority;
>> +                             scan = (scan * percent[file]) / 100;
>> +                     }
>>                       zone->lru[l].nr_scan += scan + 1;
>>                       nr[l] = zone->lru[l].nr_scan;
>>                       if (nr[l] >= sc->swap_cluster_max)
>>



-- 
Kind regards,
MinChan Kim


* Re: [RFC][PATCH] prevent incorrect oom under split_lru
  2008-06-25  5:59   ` MinChan Kim
@ 2008-06-25  6:08     ` KOSAKI Motohiro
  2008-06-25  6:56       ` MinChan Kim
  0 siblings, 1 reply; 16+ messages in thread
From: KOSAKI Motohiro @ 2008-06-25  6:08 UTC (permalink / raw)
  To: MinChan Kim
  Cc: kosaki.motohiro, Rik van Riel, linux-mm, LKML, Lee Schermerhorn,
	akpm, Takenori Nagano

Hi Kim-san,

> >> So, if priority==0, We should try to reclaim all page for prevent OOM.
> >
> > You are absolutely right.  Good catch.
> 
> I have a concern about application latency.
> If lru list have many pages, it take a very long time to scan pages.
> More system have many ram, More many time to scan pages.

No problem.

priority==0 indicates an emergency.
It doesn't happen on typical workloads.


> Of course I know this is trade-off between memory efficiency VS latency.
> But In embedded, some application think latency is more important
> thing than memory efficiency.
> We need some mechanism to cut off scanning time.
> 
> I think Takenori Nagano's "memory reclaim more efficiently patch" is
> proper to reduce application latency in this case If we modify some
> code.

I think this is off-topic.

But yes,
both my page reclaim throttle and nagano-san's patch provide
a reclaim cut-off mechanism.


And, even more off-topic,
nagano-san's patch improves only the priority==12 case.
So typical embedded systems don't improve much, because
embedded systems don't have that much memory.
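
One way to read the point about priority==12 and small memory is to look at how many pages a list contributes at each priority. The sketch below is a user-space model under stated assumptions: DEF_PRIORITY is 12, the cut-off batch is 32 pages (standing in for swap_cluster_max), the percent weighting is ignored, and both function names are hypothetical.

```c
#include <assert.h>

#define DEF_PRIORITY 12   /* assumed starting reclaim priority */

/* Pages considered on one LRU list at a given priority (percent ignored). */
unsigned long scan_window(unsigned long lru_pages, int priority)
{
    assert(priority >= 0 && priority <= DEF_PRIORITY);
    return lru_pages >> priority;
}

/*
 * Walk from DEF_PRIORITY down to 0 and return the first priority at which
 * the scan window reaches `batch` pages, or -1 if it never does.
 */
int first_priority_reaching(unsigned long lru_pages, unsigned long batch)
{
    for (int prio = DEF_PRIORITY; prio >= 0; prio--)
        if (scan_window(lru_pages, prio) >= batch)
            return prio;
    return -1;
}
```

Assuming 4 KB pages, a 64 MB list (16384 pages) yields a window of only 4 pages at priority 12 and does not reach a 32-page batch until priority 9, so a cut-off that triggers at priority 12 has almost nothing to cut. A 4 GB list (1048576 pages) already reaches the batch size at priority 12.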




* Re: [RFC][PATCH] prevent incorrect oom under split_lru
  2008-06-25  6:08     ` KOSAKI Motohiro
@ 2008-06-25  6:56       ` MinChan Kim
  2008-06-25  6:58         ` MinChan Kim
  2008-06-25 12:11         ` Peter Zijlstra
  0 siblings, 2 replies; 16+ messages in thread
From: MinChan Kim @ 2008-06-25  6:56 UTC (permalink / raw)
  To: KOSAKI Motohiro
  Cc: Rik van Riel, linux-mm, LKML, Lee Schermerhorn, akpm, Takenori Nagano

On Wed, Jun 25, 2008 at 3:08 PM, KOSAKI Motohiro
<kosaki.motohiro@jp.fujitsu.com> wrote:
> Hi Kim-san,
>
>> >> So, if priority==0, We should try to reclaim all page for prevent OOM.
>> >
>> > You are absolutely right.  Good catch.
>>
>> I have a concern about application latency.
>> If lru list have many pages, it take a very long time to scan pages.
>> More system have many ram, More many time to scan pages.
>
> No problem.
>
> priority==0 indicate emergency.
> it doesn't happend on typical workload.
>

I see :)

But if such an emergency happens in an embedded system, applications can't
run for some time.
I am not sure how long it takes.
But for some applications, the scheduling period matters more than memory
reclaim latency.

Now, with your patch, when such an emergency happens, reclaim continues
until it has scanned every page of the lru list.
It

>> Of course I know this is trade-off between memory efficiency VS latency.
>> But In embedded, some application think latency is more important
>> thing than memory efficiency.
>> We need some mechanism to cut off scanning time.
>>
>> I think Takenori Nagano's "memory reclaim more efficiently patch" is
>> proper to reduce application latency in this case If we modify some
>> code.
>
> I think this is off-topic.
>
> but Yes.
> both my page reclaim throttle and nagano-san's patch provide
> reclaim cut off mechanism.
>
>
> and more off-topic,
> nagano-san's patch improve only priority==12.
> So, typical embedded doesn't improve so big because
> embedded system does't have so large memory.
>
>
>
>



-- 
Kind regards,
MinChan Kim


* Re: [RFC][PATCH] prevent incorrect oom under split_lru
  2008-06-25  6:56       ` MinChan Kim
@ 2008-06-25  6:58         ` MinChan Kim
  2008-06-25  7:29           ` KOSAKI Motohiro
  2008-06-25 12:11         ` Peter Zijlstra
  1 sibling, 1 reply; 16+ messages in thread
From: MinChan Kim @ 2008-06-25  6:58 UTC (permalink / raw)
  To: KOSAKI Motohiro
  Cc: Rik van Riel, linux-mm, LKML, Lee Schermerhorn, akpm, Takenori Nagano

On Wed, Jun 25, 2008 at 3:56 PM, MinChan Kim <minchan.kim@gmail.com> wrote:
> On Wed, Jun 25, 2008 at 3:08 PM, KOSAKI Motohiro
> <kosaki.motohiro@jp.fujitsu.com> wrote:
>> Hi Kim-san,
>>
>>> >> So, if priority==0, We should try to reclaim all page for prevent OOM.
>>> >
>>> > You are absolutely right.  Good catch.
>>>
>>> I have a concern about application latency.
>>> If lru list have many pages, it take a very long time to scan pages.
>>> More system have many ram, More many time to scan pages.
>>
>> No problem.
>>
>> priority==0 indicate emergency.
>> it doesn't happend on typical workload.
>>
>
> I see :)
>
> But if such emergency happen in embedded system, application can't be
> executed for some time.
> I am not sure how long time it take.
> But In some application, schedule period is very important than memory
> reclaim latency.
>
> Now, In your patch, when such emergency happen, it continue to reclaim
> page until it will scan entire page of lru list.
> It

By mistake, I omitted the following part of my message. :(

So, we need a cut-off mechanism to reduce application latency.
In my opinion, if we modify some of the code in Takenori's patch, we can
apply his idea to prevent the latency problem.

>>> Of course I know this is trade-off between memory efficiency VS latency.
>>> But In embedded, some application think latency is more important
>>> thing than memory efficiency.
>>> We need some mechanism to cut off scanning time.
>>>
>>> I think Takenori Nagano's "memory reclaim more efficiently patch" is
>>> proper to reduce application latency in this case If we modify some
>>> code.
>>
>> I think this is off-topic.
>>
>> but Yes.
>> both my page reclaim throttle and nagano-san's patch provide
>> reclaim cut off mechanism.
>>
>>
>> and more off-topic,
>> nagano-san's patch improve only priority==12.
>> So, typical embedded doesn't improve so big because
>> embedded system does't have so large memory.
>>
>>
>>
>>
>
>
>
> --
> Kinds regards,
> MinChan Kim
>



-- 
Kind regards,
MinChan Kim


* Re: [RFC][PATCH] prevent incorrect oom under split_lru
  2008-06-25  6:58         ` MinChan Kim
@ 2008-06-25  7:29           ` KOSAKI Motohiro
  2008-06-25  7:37             ` MinChan Kim
  0 siblings, 1 reply; 16+ messages in thread
From: KOSAKI Motohiro @ 2008-06-25  7:29 UTC (permalink / raw)
  To: MinChan Kim
  Cc: kosaki.motohiro, Rik van Riel, linux-mm, LKML, Lee Schermerhorn,
	akpm, Takenori Nagano

> > But if such emergency happen in embedded system, application can't be
> > executed for some time.
> > I am not sure how long time it take.
> > But In some application, schedule period is very important than memory
> > reclaim latency.
> >
> > Now, In your patch, when such emergency happen, it continue to reclaim
> > page until it will scan entire page of lru list.
> > It
> 
> with my mistake, I omit following message. :(
> 
> So, we need cut-off mechanism to reduce application latency.
> So In my opinion, If we modify some code of Takenori's patch, we can
> apply his idea to prevent latency probelm.

Yup.
Agreed, latency is as important as throughput.

If anyone can show with benchmark results that the patch reduces latency
without a throughput regression,
I have no objection, of course.

Can you post any performance results?





* Re: [RFC][PATCH] prevent incorrect oom under split_lru
  2008-06-25  7:29           ` KOSAKI Motohiro
@ 2008-06-25  7:37             ` MinChan Kim
  0 siblings, 0 replies; 16+ messages in thread
From: MinChan Kim @ 2008-06-25  7:37 UTC (permalink / raw)
  To: KOSAKI Motohiro
  Cc: Rik van Riel, linux-mm, LKML, Lee Schermerhorn, akpm, Takenori Nagano

On Wed, Jun 25, 2008 at 4:29 PM, KOSAKI Motohiro
<kosaki.motohiro@jp.fujitsu.com> wrote:
>> > But if such emergency happen in embedded system, application can't be
>> > executed for some time.
>> > I am not sure how long time it take.
>> > But In some application, schedule period is very important than memory
>> > reclaim latency.
>> >
>> > Now, In your patch, when such emergency happen, it continue to reclaim
>> > page until it will scan entire page of lru list.
>> > It
>>
>> with my mistake, I omit following message. :(
>>
>> So, we need cut-off mechanism to reduce application latency.
>> So In my opinion, If we modify some code of Takenori's patch, we can
>> apply his idea to prevent latency probelm.
>
> Yup.
> Agreed with latency is as important as throughput.
>
> if anyone explain that patch have reduce some latency and
> no throughput degression by benchmark result,
> I have no objection, Of cource.
>
> Can you post any performance result?
>
>

Hm.. I am not sure when I can post benchmark results.

Of course, if I get them, I will post them :)

Thanks, Kosaki-san

-- 
Kind regards,
MinChan Kim


* Re: [RFC][PATCH] prevent incorrect oom under split_lru
  2008-06-25  6:56       ` MinChan Kim
  2008-06-25  6:58         ` MinChan Kim
@ 2008-06-25 12:11         ` Peter Zijlstra
  2008-06-25 13:05           ` MinChan Kim
  2008-06-26  0:36           ` KOSAKI Motohiro
  1 sibling, 2 replies; 16+ messages in thread
From: Peter Zijlstra @ 2008-06-25 12:11 UTC (permalink / raw)
  To: MinChan Kim
  Cc: KOSAKI Motohiro, Rik van Riel, linux-mm, LKML, Lee Schermerhorn,
	akpm, Takenori Nagano

On Wed, 2008-06-25 at 15:56 +0900, MinChan Kim wrote:
> On Wed, Jun 25, 2008 at 3:08 PM, KOSAKI Motohiro
> <kosaki.motohiro@jp.fujitsu.com> wrote:
> > Hi Kim-san,
> >
> >> >> So, if priority==0, We should try to reclaim all page for prevent OOM.
> >> >
> >> > You are absolutely right.  Good catch.
> >>
> >> I have a concern about application latency.
> >> If lru list have many pages, it take a very long time to scan pages.
> >> More system have many ram, More many time to scan pages.
> >
> > No problem.
> >
> > priority==0 indicate emergency.
> > it doesn't happend on typical workload.
> >
> 
> I see :)
> 
> But if such emergency happen in embedded system, application can't be
> executed for some time.
> I am not sure how long time it take.
> But In some application, schedule period is very important than memory
> reclaim latency.
> 
> Now, In your patch, when such emergency happen, it continue to reclaim
> page until it will scan entire page of lru list.
> It

IMHO embedded real-time apps should mlockall() and not do anything that
can result in memory allocations in their fast (deterministic) paths.

The much more important case is desktop usage - that is where we run non
real-time code, but do expect 'low' latency due to user-interaction.

From hitting swap on my 512M laptop (a rather frequent occurrence) I know
we can do better here.
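
For reference, the mlockall() suggestion for real-time applications can be sketched as below. This is a minimal example, assuming a POSIX system that provides mlockall() in <sys/mman.h>; the wrapper name lock_all_memory() is hypothetical. The call can fail when RLIMIT_MEMLOCK is too low or the process lacks the needed privilege, so the result has to be checked.

```c
#include <errno.h>
#include <stdio.h>
#include <string.h>
#include <sys/mman.h>

/*
 * Pin all current and future pages of the calling process in RAM so that
 * the deterministic path never waits on page reclaim or swap-in.
 * Returns 0 on success, -1 on failure.
 */
int lock_all_memory(void)
{
    if (mlockall(MCL_CURRENT | MCL_FUTURE) != 0) {
        fprintf(stderr, "mlockall failed: %s\n", strerror(errno));
        return -1;
    }
    return 0;
}
```

A real-time application would typically call this once during start-up, before entering its time-critical loop, and allocate everything the loop needs beforehand.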


* Re: [RFC][PATCH] prevent incorrect oom under split_lru
  2008-06-25 12:11         ` Peter Zijlstra
@ 2008-06-25 13:05           ` MinChan Kim
  2008-06-26  1:49             ` Takenori Nagano
  2008-06-26  0:36           ` KOSAKI Motohiro
  1 sibling, 1 reply; 16+ messages in thread
From: MinChan Kim @ 2008-06-25 13:05 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: KOSAKI Motohiro, Rik van Riel, linux-mm, LKML, Lee Schermerhorn,
	akpm, Takenori Nagano

On Wed, Jun 25, 2008 at 9:11 PM, Peter Zijlstra <peterz@infradead.org> wrote:
> On Wed, 2008-06-25 at 15:56 +0900, MinChan Kim wrote:
>> On Wed, Jun 25, 2008 at 3:08 PM, KOSAKI Motohiro
>> <kosaki.motohiro@jp.fujitsu.com> wrote:
>> > Hi Kim-san,
>> >
>> >> >> So, if priority==0, We should try to reclaim all page for prevent OOM.
>> >> >
>> >> > You are absolutely right.  Good catch.
>> >>
>> >> I have a concern about application latency.
>> >> If lru list have many pages, it take a very long time to scan pages.
>> >> More system have many ram, More many time to scan pages.
>> >
>> > No problem.
>> >
>> > priority==0 indicate emergency.
>> > it doesn't happend on typical workload.
>> >
>>
>> I see :)
>>
>> But if such emergency happen in embedded system, application can't be
>> executed for some time.
>> I am not sure how long time it take.
>> But In some application, schedule period is very important than memory
>> reclaim latency.
>>
>> Now, In your patch, when such emergency happen, it continue to reclaim
>> page until it will scan entire page of lru list.
>> It
>
> IMHO embedded real-time apps shoud mlockall() and not do anything that
> can result in memory allocations in their fast (deterministic) paths.

Hi Peter,

I agree with you. But if the application's virtual address space is big,
mlockall() becomes a hard problem, since the resulting memory pressure can be
large.
Of course, that is an RT application design problem.

> The much more important case is desktop usage - that is where we run non
> real-time code, but do expect 'low' latency due to user-interaction.
>
> >From hitting swap on my 512M laptop (rather frequent occurance) I know
> we can do better here,..
>

Absolutely. That is another example. So, I suggest the following patch.
It's based on the idea of Takenori Nagano's "memory reclaim more efficiently" patch.

I expect it will reduce application latency without causing a regression.
What do you think?

Signed-off-by: MinChan Kim <minchan.kim@gmail.com>
---
 mm/vmscan.c |   10 ++++++++--
 1 files changed, 8 insertions(+), 2 deletions(-)

diff --git a/mm/vmscan.c b/mm/vmscan.c
index 9a5e423..07477cc 100644
--- a/mm/vmscan.c
+++ b/mm/vmscan.c
@@ -1460,9 +1460,12 @@ static unsigned long shrink_zone(int priority, struct zone *zone,
                         * kernel will slowly sift through each list.
                         */
                        scan = zone_page_state(zone, NR_LRU_BASE + l);
-                       scan >>= priority;
-                       scan = (scan * percent[file]) / 100;
+                       if (priority) {
+                               scan >>= priority;
+                               scan = (scan * percent[file]) / 100;
+                       }
                        zone->lru[l].nr_scan += scan + 1;
+
                        nr[l] = zone->lru[l].nr_scan;
                        if (nr[l] >= sc->swap_cluster_max)
                                zone->lru[l].nr_scan = 0;
@@ -1489,6 +1492,9 @@ static unsigned long shrink_zone(int priority, struct zone *zone,

                                nr_reclaimed += shrink_list(l, nr_to_scan,
                                                        zone, sc, priority);
+                               if (priority == 0 && !current_is_kswapd() &&
+                                       nr_reclaimed >= sc->swap_cluster_max)
+                                       break;
                        }
                }
        }
-- 
1.5.4.3
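
The cut-off added in the second hunk can be illustrated with a toy user-space model. The sketch below is not kernel code: pages_scanned() is a hypothetical function, SWAP_CLUSTER_MAX is assumed to be 32 and stands in for sc->swap_cluster_max, and every scanned page is assumed to be reclaimable, which is optimistic but keeps the arithmetic visible.

```c
#include <assert.h>
#include <stdbool.h>

#define SWAP_CLUSTER_MAX 32UL   /* assumed batch size */

/*
 * Toy model of the inner loop of shrink_zone(): scan one list in batches
 * and count reclaimed pages. Returns the number of pages scanned before
 * the loop stops.
 */
unsigned long pages_scanned(unsigned long nr_to_scan_total, int priority,
                            bool is_kswapd, bool use_cutoff)
{
    unsigned long scanned = 0, reclaimed = 0;

    assert(priority >= 0);
    while (scanned < nr_to_scan_total) {
        unsigned long batch = nr_to_scan_total - scanned;
        if (batch > SWAP_CLUSTER_MAX)
            batch = SWAP_CLUSTER_MAX;

        scanned += batch;
        reclaimed += batch;   /* stand-in for shrink_list() */

        /* Proposed cut-off: direct reclaim only, and only at priority 0. */
        if (use_cutoff && priority == 0 && !is_kswapd &&
            reclaimed >= SWAP_CLUSTER_MAX)
            break;
    }
    return scanned;
}
```

In this model a direct reclaimer at priority 0 stops after one 32-page batch instead of walking all 18042 pages, while kswapd and all non-zero priorities behave as before.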




-- 
Kind regards,
MinChan Kim


* Re: [RFC][PATCH] prevent incorrect oom under split_lru
  2008-06-25 12:11         ` Peter Zijlstra
  2008-06-25 13:05           ` MinChan Kim
@ 2008-06-26  0:36           ` KOSAKI Motohiro
  1 sibling, 0 replies; 16+ messages in thread
From: KOSAKI Motohiro @ 2008-06-26  0:36 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: kosaki.motohiro, MinChan Kim, Rik van Riel, linux-mm, LKML,
	Lee Schermerhorn, akpm, Takenori Nagano

> > But if such emergency happen in embedded system, application can't be
> > executed for some time.
> > I am not sure how long time it take.
> > But In some application, schedule period is very important than memory
> > reclaim latency.
> > 
> > Now, In your patch, when such emergency happen, it continue to reclaim
> > page until it will scan entire page of lru list.
> > It
> 
> IMHO embedded real-time apps shoud mlockall() and not do anything that
> can result in memory allocations in their fast (deterministic) paths.

Indeed.

> The much more important case is desktop usage - that is where we run non
> real-time code, but do expect 'low' latency due to user-interaction.
> 
> >From hitting swap on my 512M laptop (rather frequent occurance) I know
> we can do better here,..

nice suggestion.
thanks.



* Re: [RFC][PATCH] prevent incorrect oom under split_lru
  2008-06-25 13:05           ` MinChan Kim
@ 2008-06-26  1:49             ` Takenori Nagano
  2008-06-26  4:37               ` MinChan Kim
  0 siblings, 1 reply; 16+ messages in thread
From: Takenori Nagano @ 2008-06-26  1:49 UTC (permalink / raw)
  To: MinChan Kim
  Cc: Peter Zijlstra, KOSAKI Motohiro, Rik van Riel, linux-mm, LKML,
	Lee Schermerhorn, akpm

MinChan Kim wrote:
> Hi peter,
> 
> I agree with you.  but if application's virtual address space is big,
> we have a hard problem with mlockall since memory pressure might be a
> big.
> Of course, It will be a RT application design problem.
> 
>> The much more important case is desktop usage - that is where we run non
>> real-time code, but do expect 'low' latency due to user-interaction.
>>
>> >From hitting swap on my 512M laptop (rather frequent occurance) I know
>> we can do better here,..
>>
> 
> Absolutely. It is another example. So, I suggest following patch.
> It's based on idea of Takenori Nagano's memory reclaim more efficiently.

Hi Kim-san,

Thank you for agreeing with me.

I have one question.
My patch doesn't take priority into account. Why do you need "priority == 0"?

Thanks,
  Takenori


* Re: [RFC][PATCH] prevent incorrect oom under split_lru
  2008-06-26  1:49             ` Takenori Nagano
@ 2008-06-26  4:37               ` MinChan Kim
  2008-06-26  5:24                 ` Takenori Nagano
  0 siblings, 1 reply; 16+ messages in thread
From: MinChan Kim @ 2008-06-26  4:37 UTC (permalink / raw)
  To: Takenori Nagano
  Cc: Peter Zijlstra, KOSAKI Motohiro, Rik van Riel, linux-mm, LKML,
	Lee Schermerhorn, akpm

On Thu, Jun 26, 2008 at 10:49 AM, Takenori Nagano
<t-nagano@ah.jp.nec.com> wrote:
> MinChan Kim wrote:
>> Hi peter,
>>
>> I agree with you.  but if application's virtual address space is big,
>> we have a hard problem with mlockall since memory pressure might be a
>> big.
>> Of course, It will be a RT application design problem.
>>
>>> The much more important case is desktop usage - that is where we run non
>>> real-time code, but do expect 'low' latency due to user-interaction.
>>>
>>> >From hitting swap on my 512M laptop (rather frequent occurance) I know
>>> we can do better here,..
>>>
>>
>> Absolutely. It is another example. So, I suggest following patch.
>> It's based on idea of Takenori Nagano's memory reclaim more efficiently.
>
> Hi Kim-san,
>
> Thank you for agreeing with me.
>
> I have one question.
> My patch don't mind priority. Why do you need "priority == 0"?

Hi, Takenori-san.

As it stands, Kosaki-san's patch doesn't consider application latency.
That patch scans all lru[x] pages when memory pressure is very high
(i.e., priority == 0).
That makes application latency high, as Peter and I noticed.
We need an idea that prevents the big scanning overhead.
I modified your idea so that it prevents the big scanning overhead only
when memory pressure is very high.
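
As a rough illustration of why priority == 0 is the expensive case: in
kernels of this era each reclaim pass targets roughly lru_size >> priority
pages per LRU list, starting at DEF_PRIORITY (12) and decrementing while
reclaim keeps failing. The sketch below is a userspace model of that
arithmetic only, not the vmscan.c code; scan_target() is a hypothetical
helper name.

```c
#include <assert.h>

#define DEF_PRIORITY 12 /* starting reclaim priority in kernels of this era */

/*
 * Hypothetical helper: upper bound on pages one reclaim pass scans on an
 * LRU list holding lru_size pages. Lower priority means more pressure,
 * and priority 0 means the whole list.
 */
static unsigned long scan_target(unsigned long lru_size, int priority)
{
	assert(priority >= 0 && priority <= DEF_PRIORITY);
	return lru_size >> priority;
}
```

For a list of 131072 pages (512MB of 4KB pages, an assumed example size),
scan_target(131072, DEF_PRIORITY) is 32 pages, while scan_target(131072, 0)
is all 131072 pages.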


> Thanks,
>  Takenori
>



-- 
Kind regards,
MinChan Kim

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [RFC][PATCH] prevent incorrect oom under split_lru
  2008-06-26  4:37               ` MinChan Kim
@ 2008-06-26  5:24                 ` Takenori Nagano
  2008-06-26  6:37                   ` MinChan Kim
  0 siblings, 1 reply; 16+ messages in thread
From: Takenori Nagano @ 2008-06-26  5:24 UTC (permalink / raw)
  To: MinChan Kim
  Cc: Peter Zijlstra, KOSAKI Motohiro, Rik van Riel, linux-mm, LKML,
	Lee Schermerhorn, akpm

MinChan Kim wrote:
> On Thu, Jun 26, 2008 at 10:49 AM, Takenori Nagano
> <t-nagano@ah.jp.nec.com> wrote:
>> MinChan Kim wrote:
>>> Hi peter,
>>>
>>> I agree with you.  but if application's virtual address space is big,
>>> we have a hard problem with mlockall since memory pressure might be a
>>> big.
>>> Of course, It will be a RT application design problem.
>>>
>>>> The much more important case is desktop usage - that is where we run non
>>>> real-time code, but do expect 'low' latency due to user-interaction.
>>>>
>>>> From hitting swap on my 512M laptop (rather frequent occurrence) I know
>>>> we can do better here,..
>>>>
>>> Absolutely. It is another example. So, I suggest following patch.
>>> It's based on idea of Takenori Nagano's memory reclaim more efficiently.
>> Hi Kim-san,
>>
>> Thank you for agreeing with me.
>>
>> I have one question.
>> My patch does not take priority into account. Why do you need "priority == 0"?
> 
> Hi, Takenori-san.
> 
> As it stands, Kosaki-san's patch doesn't consider application latency.
> That patch scans all lru[x] pages when memory pressure is very high
> (i.e., priority == 0).
> That makes application latency high, as Peter and I noticed.
> We need an idea that prevents the big scanning overhead.
> I modified your idea so that it prevents the big scanning overhead only
> when memory pressure is very high.

Hi, Kim-san.

Thank you for your explanation.
I understand your opinion.

But... your patch is not enough for me. :-(
Our Xeon box has 128GB of memory, so application latency will be very large
if priority drops to zero.
So I would like to apply the "cut off" at every priority.

I would like to remove the "priority == 0" condition. Can you do that?
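
To make the two positions concrete, here is a toy userspace model, under
loudly stated assumptions: the whole 128GB sits on one LRU list of 4KB
pages (33554432 pages), pages are scanned in batches of SWAP_CLUSTER_MAX
(32), a fixed one-in-N share of scanned pages is reclaimable, and the
"cut off" means stopping once SWAP_CLUSTER_MAX pages have been reclaimed.
pages_scanned() and its parameters are hypothetical names, not taken from
either patch.

```c
#include <assert.h>
#include <stdbool.h>

#define SWAP_CLUSTER_MAX 32UL

/*
 * Toy model of one reclaim pass. Returns how many pages get scanned.
 * cutoff_only_at_zero == true models "cut off only when priority == 0";
 * false models "cut off at every priority".
 */
static unsigned long pages_scanned(unsigned long lru_size, int priority,
				   bool cutoff_only_at_zero,
				   unsigned long reclaim_one_in)
{
	unsigned long budget = lru_size >> priority; /* scan target */
	bool cutoff = !cutoff_only_at_zero || priority == 0;
	unsigned long scanned = 0, reclaimed = 0;

	assert(reclaim_one_in >= 1);
	while (scanned < budget) {
		unsigned long batch = budget - scanned;

		if (batch > SWAP_CLUSTER_MAX)
			batch = SWAP_CLUSTER_MAX;
		scanned += batch;
		reclaimed += batch / reclaim_one_in;
		/* Stop early once enough pages have been freed. */
		if (cutoff && reclaimed >= SWAP_CLUSTER_MAX)
			break;
	}
	return scanned;
}
```

In this model, with one page in four reclaimable, a pass at priority 1
scans 16777216 pages when the cut off applies only at priority 0, but
only 128 pages when it applies at every priority. At priority 0 both
variants stop after 128 pages.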

Thanks,
  Takenori

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [RFC][PATCH] prevent incorrect oom under split_lru
  2008-06-26  5:24                 ` Takenori Nagano
@ 2008-06-26  6:37                   ` MinChan Kim
  2008-06-26  8:05                     ` Takenori Nagano
  0 siblings, 1 reply; 16+ messages in thread
From: MinChan Kim @ 2008-06-26  6:37 UTC (permalink / raw)
  To: Takenori Nagano
  Cc: Peter Zijlstra, KOSAKI Motohiro, Rik van Riel, linux-mm, LKML,
	Lee Schermerhorn, akpm

On Thu, Jun 26, 2008 at 2:24 PM, Takenori Nagano <t-nagano@ah.jp.nec.com> wrote:
> MinChan Kim wrote:
>> On Thu, Jun 26, 2008 at 10:49 AM, Takenori Nagano
>> <t-nagano@ah.jp.nec.com> wrote:
>>> MinChan Kim wrote:
>>>> Hi peter,
>>>>
>>>> I agree with you.  but if application's virtual address space is big,
>>>> we have a hard problem with mlockall since memory pressure might be a
>>>> big.
>>>> Of course, It will be a RT application design problem.
>>>>
>>>>> The much more important case is desktop usage - that is where we run non
>>>>> real-time code, but do expect 'low' latency due to user-interaction.
>>>>>
>>>>> From hitting swap on my 512M laptop (rather frequent occurrence) I know
>>>>> we can do better here,..
>>>>>
>>>> Absolutely. It is another example. So, I suggest following patch.
>>>> It's based on idea of Takenori Nagano's memory reclaim more efficiently.
>>> Hi Kim-san,
>>>
>>> Thank you for agreeing with me.
>>>
>>> I have one question.
>>> My patch does not take priority into account. Why do you need "priority == 0"?
>>
>> Hi, Takenori-san.
>>
>> As it stands, Kosaki-san's patch doesn't consider application latency.
>> That patch scans all lru[x] pages when memory pressure is very high
>> (i.e., priority == 0).
>> That makes application latency high, as Peter and I noticed.
>> We need an idea that prevents the big scanning overhead.
>> I modified your idea so that it prevents the big scanning overhead only
>> when memory pressure is very high.
>
> Hi, Kim-san.
>
> Thank you for your explanation.
> I understand your opinion.
>
> But... your patch is not enough for me. :-(
> Our Xeon box has 128GB of memory, so application latency will be very large
> if priority drops to zero.
> So I would like to apply the "cut off" at every priority.

I am not sure whether that would cause a regression.
We don't have enough data yet.

My intention is just to cover the corner case in Kosaki-san's patch.

> I would like to remove the "priority == 0" condition. Can you do that?
>
> Thanks,
>  Takenori
>



-- 
Kind regards,
MinChan Kim

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [RFC][PATCH] prevent incorrect oom under split_lru
  2008-06-26  6:37                   ` MinChan Kim
@ 2008-06-26  8:05                     ` Takenori Nagano
  0 siblings, 0 replies; 16+ messages in thread
From: Takenori Nagano @ 2008-06-26  8:05 UTC (permalink / raw)
  To: MinChan Kim
  Cc: Peter Zijlstra, KOSAKI Motohiro, Rik van Riel, linux-mm, LKML,
	Lee Schermerhorn, akpm

MinChan Kim wrote:
> On Thu, Jun 26, 2008 at 2:24 PM, Takenori Nagano <t-nagano@ah.jp.nec.com> wrote:
>> MinChan Kim wrote:
>>> On Thu, Jun 26, 2008 at 10:49 AM, Takenori Nagano
>>> <t-nagano@ah.jp.nec.com> wrote:
>>>> MinChan Kim wrote:
>>>>> Hi peter,
>>>>>
>>>>> I agree with you.  but if application's virtual address space is big,
>>>>> we have a hard problem with mlockall since memory pressure might be a
>>>>> big.
>>>>> Of course, It will be a RT application design problem.
>>>>>
>>>>>> The much more important case is desktop usage - that is where we run non
>>>>>> real-time code, but do expect 'low' latency due to user-interaction.
>>>>>>
>>>>>> From hitting swap on my 512M laptop (rather frequent occurrence) I know
>>>>>> we can do better here,..
>>>>>>
>>>>> Absolutely. It is another example. So, I suggest following patch.
>>>>> It's based on idea of Takenori Nagano's memory reclaim more efficiently.
>>>> Hi Kim-san,
>>>>
>>>> Thank you for agreeing with me.
>>>>
>>>> I have one question.
>>>> My patch does not take priority into account. Why do you need "priority == 0"?
>>> Hi, Takenori-san.
>>>
>>> As it stands, Kosaki-san's patch doesn't consider application latency.
>>> That patch scans all lru[x] pages when memory pressure is very high
>>> (i.e., priority == 0).
>>> That makes application latency high, as Peter and I noticed.
>>> We need an idea that prevents the big scanning overhead.
>>> I modified your idea so that it prevents the big scanning overhead only
>>> when memory pressure is very high.
>> Hi, Kim-san.
>>
>> Thank you for your explanation.
>> I understand your opinion.
>>
>> But... your patch is not enough for me. :-(
>> Our Xeon box has 128GB of memory, so application latency will be very large
>> if priority drops to zero.
>> So I would like to apply the "cut off" at every priority.
> 
> I am not sure whether that would cause a regression.
> We don't have enough data yet.
> 
> My intention is just to cover the corner case in Kosaki-san's patch.

OK.
I'll run some tests to gather enough data. :-)

Thanks,
  Takenori

^ permalink raw reply	[flat|nested] 16+ messages in thread

end of thread, other threads:[~2008-06-26  8:05 UTC | newest]

Thread overview: 16+ messages (download: mbox.gz / follow: Atom feed)
2008-06-24  8:31 [RFC][PATCH] prevent incorrect oom under split_lru KOSAKI Motohiro
2008-06-24 13:28 ` Rik van Riel
2008-06-25  5:59   ` MinChan Kim
2008-06-25  6:08     ` KOSAKI Motohiro
2008-06-25  6:56       ` MinChan Kim
2008-06-25  6:58         ` MinChan Kim
2008-06-25  7:29           ` KOSAKI Motohiro
2008-06-25  7:37             ` MinChan Kim
2008-06-25 12:11         ` Peter Zijlstra
2008-06-25 13:05           ` MinChan Kim
2008-06-26  1:49             ` Takenori Nagano
2008-06-26  4:37               ` MinChan Kim
2008-06-26  5:24                 ` Takenori Nagano
2008-06-26  6:37                   ` MinChan Kim
2008-06-26  8:05                     ` Takenori Nagano
2008-06-26  0:36           ` KOSAKI Motohiro
