linux-mm.kvack.org archive mirror
 help / color / mirror / Atom feed
From: Andrew Morton <akpm@linux-foundation.org>
To: Rik van Riel <riel@redhat.com>
Cc: linux-mm@kvack.org, linux-kernel@vger.kernel.org, mel@csn.ul.ie,
	kosaki.motohiro@jp.fujitsu.com
Subject: Re: [PATCH] vmscan: bail out of page reclaim after swap_cluster_max pages
Date: Mon, 24 Nov 2008 12:53:35 -0800	[thread overview]
Message-ID: <20081124125335.556c2a60.akpm@linux-foundation.org> (raw)
In-Reply-To: <20081124145057.4211bd46@bree.surriel.com>

On Mon, 24 Nov 2008 14:50:57 -0500
Rik van Riel <riel@redhat.com> wrote:

> Sometimes the VM spends the first few priority rounds rotating back
> referenced pages and submitting IO.  Once we get to a lower priority,
> sometimes the VM ends up freeing way too many pages.

It would help (a lot) if we had a much more specific and detailed
description of the problem which is being fixed.  Nobody has noticed it
in half a decade, so it can't be very serious?

> The fix is relatively simple: in shrink_zone() we can check how many
> pages we have already freed, direct reclaim tasks break out of the
> scanning loop if they have already freed enough pages and have reached
> a lower priority level.

So in the common scenario where there's a lot of dirty highmem and
little dirty lowmem, the kernel will start reclaiming highmem at a
vastly higher rate than lowmem.  iirc, this was the reason why this
change was tried then reverted.

Please demonstrate that this regression is not worse than the problem
which is being fixed!

> However, in order to do this we do need to know how many pages we already
> freed, so move nr_reclaimed into scan_control.

Thus carrying the state across the *entire* scanning pass: all zones.

So as soon as sc.nr_reclaimed exceeds swap_cluster_max, the scanner
will fall into a different mode for the remaining zones wherein it will
scan only swap_cluster_max pages from them, then will bale.

This will heavily bias scanning onto the zones at the start of the zone
list.  In fact it probably means that the zone at the head of the
zonelist gets thrashed and the remaining zones will just sit there
doing almost nothing.  Where's the sense in that?

Has any testing been done to demonstrate and quantify this effect?

> Signed-off-by: Rik van Riel <riel@redhat.com>
> ---
> Kosaki, this should address the zone scanning pressure issue.

What is the "zone scanning pressure issue"?

Please don't put "should" in a vmscan changelog :( Either it does, or
it does not?


This should look familiar:

	commit e468e46a9bea3297011d5918663ce6d19094cf87
	Author: akpm <akpm>
	Date:   Thu Jun 24 15:53:52 2004 +0000

	[PATCH] vmscan.c: dont reclaim too many pages
	    
	    The shrink_zone() logic can, under some circumstances, cause far too many
	    pages to be reclaimed.  Say, we're scanning at high priority and suddenly hit
	    a large number of reclaimable pages on the LRU.                                                                        
	    Change things so we bale out when SWAP_CLUSTER_MAX pages have been reclaimed.
	    
	    Signed-off-by: Andrew Morton <akpm@osdl.org>
	    Signed-off-by: Linus Torvalds <torvalds@osdl.org>
	    
	    BKrev: 40daf910sac4yN_aUhhJF2U8Upx1ww


And here is where it was reverted.  Note that this was nearly two years
later!  It takes that long for these things to be discovered, analysed
and fixed.



	commit 210fe530305ee50cd889fe9250168228b2994f32
	Author: Andrew Morton <akpm@osdl.org>
	Date:   Fri Jan 6 00:11:14 2006 -0800

	    [PATCH] vmscan: balancing fix
	    
	    Revert a patch which went into 2.6.8-rc1.  The changelog for that patch was:
	    
	      The shrink_zone() logic can, under some circumstances, cause far too many
	      pages to be reclaimed.  Say, we're scanning at high priority and suddenly
	      hit a large number of reclaimable pages on the LRU.
	    
	      Change things so we bale out when SWAP_CLUSTER_MAX pages have been
	      reclaimed.
	    
	    Problem is, this change caused significant imbalance in inter-zone scan
	    balancing by truncating scans of larger zones.
	    
	    Suppose, for example, ZONE_HIGHMEM is 10x the size of ZONE_NORMAL.  The zone
	    balancing algorithm would require that if we're scanning 100 pages of
	    ZONE_HIGHMEM, we should scan 10 pages of ZONE_NORMAL.  But this logic will
	    cause the scanning of ZONE_HIGHMEM to bale out after only 32 pages are
	    reclaimed.  Thus effectively causing smaller zones to be scanned relatively
	    harder than large ones.
	    
	    Now I need to remember what the workload was which caused me to write this
	    patch originally, then fix it up in a different way...
	    
	    Signed-off-by: Andrew Morton <akpm@osdl.org>
	    Signed-off-by: Linus Torvalds <torvalds@osdl.org>

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

  reply	other threads:[~2008-11-24 20:53 UTC|newest]

Thread overview: 23+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2008-11-24 19:50 Rik van Riel
2008-11-24 20:53 ` Andrew Morton [this message]
2008-11-25 11:35 ` KOSAKI Motohiro
2008-11-25 13:32   ` Rik van Riel
2008-11-25 14:30     ` KOSAKI Motohiro
2008-11-28  7:02   ` KOSAKI Motohiro
2008-11-28 11:03     ` Rik van Riel
2008-11-29 10:53       ` KOSAKI Motohiro
2008-11-29 16:24         ` Rik van Riel
2008-11-30  6:30           ` KOSAKI Motohiro
2008-12-03  5:26             ` [PATCH] vmscan: improve reclaim throuput to bail out patch KOSAKI Motohiro
2008-12-03 13:46               ` Rik van Riel
2008-12-03 15:12                 ` KOSAKI Motohiro
2008-12-04  1:28                   ` [PATCH] vmscan: improve reclaim throuput to bail out patch take2 KOSAKI Motohiro
2008-12-04  4:20                     ` MinChan Kim
2008-12-04  5:04                       ` KOSAKI Motohiro
2008-12-07  3:28                     ` Andrew Morton
2008-12-08  2:49                       ` KOSAKI Motohiro
2008-12-01 13:40           ` [PATCH] vmscan: bail out of page reclaim after swap_cluster_max pages Christoph Lameter
2008-11-26  2:24 ` KOSAKI Motohiro
2008-11-27 17:36 ` [rfc] vmscan: serialize aggressive reclaimers Johannes Weiner
2008-11-29  7:46   ` KOSAKI Motohiro
2008-11-29 15:39     ` Johannes Weiner

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20081124125335.556c2a60.akpm@linux-foundation.org \
    --to=akpm@linux-foundation.org \
    --cc=kosaki.motohiro@jp.fujitsu.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-mm@kvack.org \
    --cc=mel@csn.ul.ie \
    --cc=riel@redhat.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox