* [PATCH] ignore referenced pages on reclaim when OOM
@ 2004-11-08 18:18 Marcelo Tosatti
2004-11-08 21:48 ` Nikita Danilov
0 siblings, 1 reply; 9+ messages in thread
From: Marcelo Tosatti @ 2004-11-08 18:18 UTC (permalink / raw)
To: akpm, riel; +Cc: linux-mm
Andrew,
Can you please apply Rik's patch?
Ignore referenced bit when priority reaches 0. Get out of such
OOM situation as fast as possible, instead of running around
trying to find eligible pages for reclaim.
Speeds up extreme load performance on Rik's tests.
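The first hunk of the patch can be read as a small predicate. The following is a minimal sketch, not the kernel code: `scan_control_sketch` and `keep_page_active` are made-up names, and the two booleans stand in for the results of page_referenced() and page_mapping_inuse(). It assumes priority counts down from DEF_PRIORITY (12) to 0.

```c
#include <stdbool.h>

/* Hypothetical stand-in for the kernel's struct scan_control; only the
 * field the patch reads is modelled here. */
struct scan_control_sketch {
	int priority;	/* counts down from 12; 0 is the last, most desperate pass */
};

/*
 * Should the page be sent back to the active list instead of being
 * reclaimed?  Before the patch the test was "referenced && mapping_in_use".
 * With the patch, the referenced bit stops protecting the page once
 * priority has dropped to 0.
 */
static bool keep_page_active(const struct scan_control_sketch *sc,
			     bool referenced, bool mapping_in_use)
{
	return referenced && sc->priority != 0 && mapping_in_use;
}
```

With priority at 12 a referenced, in-use page is kept; with priority at 0 the same page becomes a reclaim candidate. The second hunk applies the same idea to mapped pages on the active list.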
----- Forwarded message from Rik van Riel <riel@redhat.com> -----
From: Rik van Riel <riel@redhat.com>
Date: Fri, 5 Nov 2004 16:56:17 -0500 (EST)
To: Marcelo Tosatti <marcelo.tosatti@cyclades.com>
Subject: [PATCH] fix OOM problem
X-X-Sender: riel@chimarrao.boston.redhat.com
X-MIMETrack: Itemize by SMTP Server on USMail/Cyclades(Release 6.5.1|January 21, 2004) at
11/05/2004 13:58:40
===== mm/vmscan.c 1.231 vs edited =====
--- 1.231/mm/vmscan.c Sun Oct 17 01:07:24 2004
+++ edited/mm/vmscan.c Mon Oct 25 17:38:56 2004
@@ -379,7 +379,7 @@

 		referenced = page_referenced(page, 1);
 		/* In active use or really unfreeable?  Activate it. */
-		if (referenced && page_mapping_inuse(page))
+		if (referenced && sc->priority && page_mapping_inuse(page))
 			goto activate_locked;

 #ifdef CONFIG_SWAP
@@ -715,7 +715,7 @@
 		if (page_mapped(page)) {
 			if (!reclaim_mapped ||
 			    (total_swap_pages == 0 && PageAnon(page)) ||
-			    page_referenced(page, 0)) {
+			    (page_referenced(page, 0) && sc->priority)) {
 				list_add(&page->lru, &l_active);
 				continue;
 			}
--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org. For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"aart@kvack.org"> aart@kvack.org </a>
* Re: [PATCH] ignore referenced pages on reclaim when OOM
2004-11-08 18:18 [PATCH] ignore referenced pages on reclaim when OOM Marcelo Tosatti
@ 2004-11-08 21:48 ` Nikita Danilov
2004-11-08 21:56 ` Rik van Riel
0 siblings, 1 reply; 9+ messages in thread
From: Nikita Danilov @ 2004-11-08 21:48 UTC (permalink / raw)
To: Marcelo Tosatti; +Cc: akpm, riel, linux-mm

Marcelo Tosatti writes:
> Andrew,
>
> Can you please apply Rik's patch?
>
> Ignore referenced bit when priority reaches 0. Get out of such
> OOM situation as fast as possible, instead of running around
> trying to find eligible pages for reclaim.
>
> Speeds up extreme load performance on Rik's tests.

I recently tested a quite similar thing, the only difference being that in
my case the referenced bit started being ignored when scanning priority
reached 2 rather than 0.

I found that it _degrades_ performance under loads where there is a lot
of file system write-back going on from the tail of the inactive list (like
dirtying a huge file through mmap in a loop).

Nikita.
* Re: [PATCH] ignore referenced pages on reclaim when OOM
2004-11-08 21:48 ` Nikita Danilov
@ 2004-11-08 21:56 ` Rik van Riel
2004-11-08 18:48 ` Marcelo Tosatti
2004-11-08 22:28 ` Andrew Morton
0 siblings, 2 replies; 9+ messages in thread
From: Rik van Riel @ 2004-11-08 21:56 UTC (permalink / raw)
To: Nikita Danilov; +Cc: Marcelo Tosatti, akpm, linux-mm

On Tue, 9 Nov 2004, Nikita Danilov wrote:

> > Speeds up extreme load performance on Rik's tests.
>
> I recently tested a quite similar thing, the only difference being that in
> my case the referenced bit started being ignored when scanning priority
> reached 2 rather than 0.
>
> I found that it _degrades_ performance under loads where there is a lot
> of file system write-back going on from the tail of the inactive list (like
> dirtying a huge file through mmap in a loop).

Well yeah, when you reach priority 2, you've only scanned
1/4 of memory. On the other hand, when you reach priority
0, you've already scanned all pages once - beyond that point
the referenced bit really doesn't buy you much any more.

--
"Debugging is twice as hard as writing the code in the first place.
Therefore, if you write the code as cleverly as possible, you are,
by definition, not smart enough to debug it." - Brian W. Kernighan
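Rik's fractions follow if each pass at a given priority targets roughly (list size >> priority) pages. That relationship is an assumption of this sketch, chosen because it reproduces the "1/4 of memory" and "all pages once" figures above; `pages_targeted_in_pass` is a hypothetical helper, not a kernel function.

```c
/*
 * Pages targeted by one scanning pass over a list of lru_pages pages,
 * assuming the per-pass target is the list size shifted right by the
 * current priority (hypothetical helper for illustration only).
 */
static unsigned long pages_targeted_in_pass(unsigned long lru_pages,
					    int priority)
{
	return lru_pages >> priority;
}
```

For a list of 4096 pages this gives 1024 pages (a quarter) at priority 2 and all 4096 pages at priority 0, which is why ignoring the referenced bit at priority 2 is a much more aggressive change than ignoring it at priority 0.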
* Re: [PATCH] ignore referenced pages on reclaim when OOM
2004-11-08 21:56 ` Rik van Riel
@ 2004-11-08 18:48 ` Marcelo Tosatti
2004-11-08 22:28 ` Andrew Morton
1 sibling, 0 replies; 9+ messages in thread
From: Marcelo Tosatti @ 2004-11-08 18:48 UTC (permalink / raw)
To: Rik van Riel; +Cc: Nikita Danilov, akpm, linux-mm

On Mon, Nov 08, 2004 at 04:56:25PM -0500, Rik van Riel wrote:
> On Tue, 9 Nov 2004, Nikita Danilov wrote:
>
> > > Speeds up extreme load performance on Rik's tests.
> >
> > I recently tested a quite similar thing, the only difference being that in
> > my case the referenced bit started being ignored when scanning priority
> > reached 2 rather than 0.
> >
> > I found that it _degrades_ performance under loads where there is a lot
> > of file system write-back going on from the tail of the inactive list (like
> > dirtying a huge file through mmap in a loop).
>
> Well yeah, when you reach priority 2, you've only scanned
> 1/4 of memory. On the other hand, when you reach priority
> 0, you've already scanned all pages once - beyond that point
> the referenced bit really doesn't buy you much any more.

Nikita,

Can you please rerun your tests with priority=0 instead of priority=2?
* Re: [PATCH] ignore referenced pages on reclaim when OOM
2004-11-08 21:56 ` Rik van Riel
2004-11-08 18:48 ` Marcelo Tosatti
@ 2004-11-08 22:28 ` Andrew Morton
2004-11-10 18:41 ` Marcelo Tosatti
1 sibling, 1 reply; 9+ messages in thread
From: Andrew Morton @ 2004-11-08 22:28 UTC (permalink / raw)
To: Rik van Riel; +Cc: nikita, marcelo.tosatti, linux-mm

Rik van Riel <riel@redhat.com> wrote:
>
> On Tue, 9 Nov 2004, Nikita Danilov wrote:
>
> > > Speeds up extreme load performance on Rik's tests.
> >
> > I recently tested a quite similar thing, the only difference being that in
> > my case the referenced bit started being ignored when scanning priority
> > reached 2 rather than 0.
> >
> > I found that it _degrades_ performance under loads where there is a lot
> > of file system write-back going on from the tail of the inactive list (like
> > dirtying a huge file through mmap in a loop).
>
> Well yeah, when you reach priority 2, you've only scanned
> 1/4 of memory. On the other hand, when you reach priority
> 0, you've already scanned all pages once - beyond that point
> the referenced bit really doesn't buy you much any more.
>

But we have to scan active, referenced pages two times to move them onto
the inactive list. A bit more, really, because nowadays
refill_inactive_zone() doesn't even run page_referenced() until it starts
to reach higher scanning priorities.

So it could be that we're just not scanning enough.
* Re: [PATCH] ignore referenced pages on reclaim when OOM
2004-11-08 22:28 ` Andrew Morton
@ 2004-11-10 18:41 ` Marcelo Tosatti
2004-11-10 22:29 ` Andrew Morton
0 siblings, 1 reply; 9+ messages in thread
From: Marcelo Tosatti @ 2004-11-10 18:41 UTC (permalink / raw)
To: Andrew Morton; +Cc: Rik van Riel, nikita, linux-mm, Nick Piggin

On Mon, Nov 08, 2004 at 02:28:37PM -0800, Andrew Morton wrote:
> Rik van Riel <riel@redhat.com> wrote:
> >
> > Well yeah, when you reach priority 2, you've only scanned
> > 1/4 of memory. On the other hand, when you reach priority
> > 0, you've already scanned all pages once - beyond that point
> > the referenced bit really doesn't buy you much any more.
> >
>
> But we have to scan active, referenced pages two times to move them onto
> the inactive list. A bit more, really, because nowadays
> refill_inactive_zone() doesn't even run page_referenced() until it starts
> to reach higher scanning priorities.
>
> So it could be that we're just not scanning enough.

You know, all_unreclaimable has drawbacks. It's hard to know whether you
have "scanned enough to consider the box OOM and trigger the OOM killer"
when all_unreclaimable prevents the system from "scanning enough".

I'm trying to improve the OOM-kill-from-kswapd patch, but
z->all_unreclaimable is currently the biggest "rock in the shoe" - we need
some way to detect that the zones have been scanned enough to be able to
say "OK, I have scanned enough and no freeable pages appear, it's time to
trigger the OOM killer".

So the z->all_unreclaimable logic and "OOM detection" are conflicting goals.

There must be some way to combine both effectively.

This is my current patch - it avoids spurious OOM kills but obviously fails
to set "worked_dma"/"worked_norm" due to the all_unreclaimable logic,
resulting in livelock when swap space is exhausted.

Ideas are welcome.

--- vmscan.c.orig	2004-11-09 16:38:04.000000000 -0200
+++ vmscan.c	2004-11-10 18:59:43.098090736 -0200
@@ -878,6 +878,8 @@
 		shrink_zone(zone, sc);
 	}
 }
+
+int task_looping_oom = 0;

 /*
  * This is the main entry point to direct page reclaim.
@@ -952,8 +954,8 @@
 		if (sc.nr_scanned && priority < DEF_PRIORITY - 2)
 			blk_congestion_wait(WRITE, HZ/10);
 	}
-	if ((gfp_mask & __GFP_FS) && !(gfp_mask & __GFP_NORETRY))
-		out_of_memory(gfp_mask);
+	if ((gfp_mask & __GFP_FS) && !(gfp_mask & __GFP_NORETRY))
+		task_looping_oom = 1;
 out:
 	for (i = 0; zones[i] != 0; i++) {
 		struct zone *zone = zones[i];
@@ -963,6 +965,8 @@

 		zone->prev_priority = zone->temp_priority;
 	}
+	if (ret)
+		task_looping_oom = 0;
 	return ret;
 }

@@ -997,13 +1001,17 @@
 	int all_zones_ok;
 	int priority;
 	int i;
-	int total_scanned, total_reclaimed;
+	int total_scanned, total_reclaimed, low_reclaimed;
+	int worked_norm, worked_dma;
 	struct reclaim_state *reclaim_state = current->reclaim_state;
 	struct scan_control sc;

+
 loop_again:
 	total_scanned = 0;
 	total_reclaimed = 0;
+	low_reclaimed = 0;
+	worked_norm = worked_dma = 0;
 	sc.gfp_mask = GFP_KERNEL;
 	sc.may_writepage = 0;
 	sc.nr_mapped = read_page_state(nr_mapped);
@@ -1072,6 +1080,17 @@
 			if (zone->all_unreclaimable && priority != DEF_PRIORITY)
 				continue;

+			/* if we're scanning dma or normal, and priority
+			 * reached zero, set "worked_dma" or "worked_norm"
+			 * accordingly.
+			 */
+			if (i <= 1 && priority == 0) {
+				if (!i)
+					worked_dma = 1;
+				else
+					worked_norm = 1;
+			}
+
 			if (nr_pages == 0) {	/* Not software suspend */
 				if (!zone_watermark_ok(zone, order,
 						zone->pages_high, end_zone, 0, 0))
@@ -1088,6 +1107,10 @@
 			shrink_slab(sc.nr_scanned, GFP_KERNEL, lru_pages);
 			sc.nr_reclaimed += reclaim_state->reclaimed_slab;
 			total_reclaimed += sc.nr_reclaimed;
+
+			if (i <= 1)
+				low_reclaimed += sc.nr_reclaimed;
+
 			if (zone->all_unreclaimable)
 				continue;
 			if (zone->pages_scanned >= (zone->nr_active +
@@ -1128,6 +1151,29 @@

 		zone->prev_priority = zone->temp_priority;
 	}
+
+
+	if (!low_reclaimed && worked_dma && worked_norm && task_looping_oom) {
+
+		printk(KERN_ERR "kswp: pri:%d tot_recl:%d wrkd_dma:%d"
+			"wrkd_norm:%d tsk_loop_oom:%d\n",
+			priority, total_reclaimed, worked_dma, worked_norm,
+			task_looping_oom);
+
+		/*
+		 * Only kill if ZONE_NORMAL/ZONE_DMA are both below
+		 * pages_min
+		 */
+		for (i = pgdat->nr_zones - 2; i >= 0; i--) {
+			struct zone *zone = pgdat->node_zones + i;
+
+			if (zone->free_pages > zone->pages_min)
+				return 0;
+		}
+		out_of_memory(GFP_KERNEL);
+		task_looping_oom = 0;
+	}
+
 	if (!all_zones_ok) {
 		cond_resched();
 		goto loop_again;
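The kill decision added at the end of balance_pgdat() in the patch above can be isolated as a pure function. This is only a sketch under stated assumptions: `zone_sketch`, `kswapd_pass` and `should_oom_kill` are hypothetical names, and the struct fields copy the variables the patch introduces rather than any real kernel structure.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, reduced view of a zone: only the two counters the
 * patch compares. */
struct zone_sketch {
	unsigned long free_pages;
	unsigned long pages_min;
};

/* State collected during one balance_pgdat() run in the patch. */
struct kswapd_pass {
	int low_reclaimed;	/* pages freed from the DMA/NORMAL zones */
	bool worked_dma;	/* DMA zone was scanned at priority 0 */
	bool worked_norm;	/* NORMAL zone was scanned at priority 0 */
	bool task_looping_oom;	/* a direct reclaimer gave up empty-handed */
};

/*
 * Kill only if nothing was reclaimed from the low zones, both low zones
 * were scanned at priority 0, some task is looping in direct reclaim,
 * and every low zone is at or below pages_min.
 */
static bool should_oom_kill(const struct kswapd_pass *p,
			    const struct zone_sketch *low_zones,
			    size_t nr_low_zones)
{
	size_t i;

	if (p->low_reclaimed || !p->worked_dma || !p->worked_norm ||
	    !p->task_looping_oom)
		return false;

	for (i = 0; i < nr_low_zones; i++)
		if (low_zones[i].free_pages > low_zones[i].pages_min)
			return false;

	return true;
}
```

Written this way the failure mode described in the message is easy to see: if a zone is skipped before priority 0 is reached, `worked_dma` or `worked_norm` stays false and the function can never return true.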
* Re: [PATCH] ignore referenced pages on reclaim when OOM
2004-11-10 18:41 ` Marcelo Tosatti
@ 2004-11-10 22:29 ` Andrew Morton
2004-11-10 20:09 ` Marcelo Tosatti
2004-11-12 16:10 ` Rik van Riel
0 siblings, 2 replies; 9+ messages in thread
From: Andrew Morton @ 2004-11-10 22:29 UTC (permalink / raw)
To: Marcelo Tosatti; +Cc: riel, nikita, linux-mm, piggin

Marcelo Tosatti <marcelo.tosatti@cyclades.com> wrote:
>
> So z->all_unreclaimable logic and "OOM detection" are conflicting goals.

Only in a single case: where a zone is all_unreclaimable and some pages
have recently become reclaimable but we don't know about it yet.

Certainly it can happen, but it sounds really unlikely to me. So I suspect
that if you were to fix that problem by some means, it wouldn't help
anything.

But maybe I'm wrong, or maybe the all_unreclaimable logic has rotted.

Have you tried simply disabling it?
* Re: [PATCH] ignore referenced pages on reclaim when OOM
2004-11-10 22:29 ` Andrew Morton
@ 2004-11-10 20:09 ` Marcelo Tosatti
2004-11-12 16:10 ` Rik van Riel
1 sibling, 0 replies; 9+ messages in thread
From: Marcelo Tosatti @ 2004-11-10 20:09 UTC (permalink / raw)
To: Andrew Morton; +Cc: riel, nikita, linux-mm, piggin

On Wed, Nov 10, 2004 at 02:29:00PM -0800, Andrew Morton wrote:
> Marcelo Tosatti <marcelo.tosatti@cyclades.com> wrote:
> >
> > So z->all_unreclaimable logic and "OOM detection" are conflicting goals.
>
> Only in a single case: where a zone is all_unreclaimable and some pages
> have recently become reclaimable but we don't know about it yet.

The thing is - if you don't scan the zones "enough" you have no way of
reliably knowing it is OOM. But on the other hand, scanning wastes CPU
time - what you call "mad scanning".

They are two extremes; I feel we need a balance between them.

> Certainly it can happen, but it sounds really unlikely to me. So I suspect
> that if you were to fix that problem by some means, it wouldn't help
> anything.
>
> But maybe I'm wrong, or maybe the all_unreclaimable logic has rotted.

I don't think the all_unreclaimable logic has rotted - it does what it is
expected to do. At least that's how I see things; maybe I'm wrong and it
has indeed rotted.

> Have you tried simply disabling it?

Tried now - if I disable it then balance_pgdat() detects the OOM situation
by noticing it's not successfully freeing pages (thus setting worked_dma
and worked_norm, see patch), and kills the memory hog.

Side note: the memory hog runs _much_ faster without the all_unreclaimable
logic.

I'll continue hacking on this tomorrow. As always, thanks for the input :D
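The interaction reported here can be illustrated from the single context line quoted in the earlier patch, `if (zone->all_unreclaimable && priority != DEF_PRIORITY) continue;`. The helpers below are hypothetical and model only that one check, assuming priority runs from DEF_PRIORITY (12) down to 0.

```c
#include <stdbool.h>

#define DEF_PRIORITY 12	/* starting scan priority */

/* Mirrors the skip test: a zone flagged all_unreclaimable is only
 * looked at on the very first (DEF_PRIORITY) pass. */
static bool zone_is_scanned(bool all_unreclaimable, int priority)
{
	return !(all_unreclaimable && priority != DEF_PRIORITY);
}

/* Count the priority levels, DEF_PRIORITY down to 0, at which the
 * zone would actually be scanned. */
static int passes_that_scan_zone(bool all_unreclaimable)
{
	int priority;
	int n = 0;

	for (priority = DEF_PRIORITY; priority >= 0; priority--)
		if (zone_is_scanned(all_unreclaimable, priority))
			n++;
	return n;
}
```

In this model an unflagged zone is scanned at all 13 priority levels, while a flagged zone is scanned at one and never at priority 0. That matches the observation that worked_dma and worked_norm only get set once the all_unreclaimable check is disabled.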
* Re: [PATCH] ignore referenced pages on reclaim when OOM
2004-11-10 22:29 ` Andrew Morton
2004-11-10 20:09 ` Marcelo Tosatti
@ 2004-11-12 16:10 ` Rik van Riel
1 sibling, 0 replies; 9+ messages in thread
From: Rik van Riel @ 2004-11-12 16:10 UTC (permalink / raw)
To: Andrew Morton; +Cc: Marcelo Tosatti, nikita, linux-mm, piggin

On Wed, 10 Nov 2004, Andrew Morton wrote:

> Only in a single case: where a zone is all_unreclaimable and some pages
> have recently become reclaimable but we don't know about it yet.
>
> Certainly it can happen, but it sounds really unlikely to me.

The swap token logic can make it appear like this is the case,
unless you ignore the referenced bit when you reach priority 0.

--
"Debugging is twice as hard as writing the code in the first place.
Therefore, if you write the code as cleverly as possible, you are,
by definition, not smart enough to debug it." - Brian W. Kernighan