Re: [PATCH -mm] vmscan: bail out of page reclaim after swap_cluster_max pages

From: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
To: Rik van Riel <riel@redhat.com>
Cc: kosaki.motohiro@jp.fujitsu.com, linux-kernel@vger.kernel.org,
	linux-mm@kvack.org
Subject: Re: [PATCH -mm] vmscan: bail out of page reclaim after swap_cluster_max pages
Date: Fri, 14 Nov 2008 09:51:06 +0900 (JST)
Message-ID: <20081114093301.03BC.KOSAKI.MOTOHIRO@jp.fujitsu.com>
In-Reply-To: <20081113171208.6985638e@bree.surriel.com>

Hi

> Sometimes the VM spends the first few priority rounds rotating back
> referenced pages and submitting IO.  Once we get to a lower priority,
> sometimes the VM ends up freeing way too many pages.
> 
> The fix is relatively simple: in shrink_zone() we can check how many
> pages we have already freed and break out of the loop.
> 
> However, in order to do this we do need to know how many pages we already
> freed, so move nr_reclaimed into scan_control.
> 
> 
> Signed-off-by: Rik van Riel <riel@redhat.com>

Wow!
Honestly, I prepared a similar patch recently.




> ---
>  mm/vmscan.c |   60 ++++++++++++++++++++++++++++++------------------------------
>  1 file changed, 30 insertions(+), 30 deletions(-)
> 
> Index: linux-2.6.28-rc2-mm1/mm/vmscan.c
> ===================================================================
> --- linux-2.6.28-rc2-mm1.orig/mm/vmscan.c	2008-10-30 15:20:06.000000000 -0400
> +++ linux-2.6.28-rc2-mm1/mm/vmscan.c	2008-11-13 17:08:35.000000000 -0500
> @@ -53,6 +53,9 @@ struct scan_control {
>  	/* Incremented by the number of inactive pages that were scanned */
>  	unsigned long nr_scanned;
>  
> +	/* Number of pages that were freed */
> +	unsigned long nr_reclaimed;
> +
>  	/* This context's GFP mask */
>  	gfp_t gfp_mask;
>  
> @@ -1408,16 +1411,14 @@ static void get_scan_ratio(struct zone *
>  	percent[1] = 100 - percent[0];
>  }
>  
> -
>  /*
>   * This is a basic per-zone page freer.  Used by both kswapd and direct reclaim.
>   */
> -static unsigned long shrink_zone(int priority, struct zone *zone,
> +static void shrink_zone(int priority, struct zone *zone,
>  				struct scan_control *sc)
>  {
>  	unsigned long nr[NR_LRU_LISTS];
>  	unsigned long nr_to_scan;
> -	unsigned long nr_reclaimed = 0;
>  	unsigned long percent[2];	/* anon @ 0; file @ 1 */
>  	enum lru_list l;
>  
> @@ -1458,10 +1459,18 @@ static unsigned long shrink_zone(int pri
>  					(unsigned long)sc->swap_cluster_max);
>  				nr[l] -= nr_to_scan;
>  
> -				nr_reclaimed += shrink_list(l, nr_to_scan,
> +				sc->nr_reclaimed += shrink_list(l, nr_to_scan,
>  							zone, sc, priority);
>  			}
>  		}
> +		/*
> +		 * On large memory systems, scan >> priority can become
> +		 * really large.  This is OK if we need to scan through
> +		 * that many pages (referenced, dirty, etc), but make
> +		 * sure to stop if we already freed enough.
> +		 */
> +		if (sc->nr_reclaimed > sc->swap_cluster_max)
> +			break;
>  	}
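
To make the effect of the new check concrete, here is a rough userspace
model of the loop. It is only a sketch, not the kernel code: NR_LISTS,
struct scan_model, model_shrink_list() and model_shrink_zone() are
hypothetical names, the loop condition is simplified to "any list still
has pages to scan", and it assumes every scanned page gets freed.

```c
#include <stdbool.h>

#define NR_LISTS 4	/* hypothetical stand-in for NR_LRU_LISTS */

/* Only the two scan_control fields that matter for this sketch. */
struct scan_model {
	unsigned long swap_cluster_max;
	unsigned long nr_reclaimed;
};

/* Assumption: every scanned page is freed (the best case for reclaim). */
static unsigned long model_shrink_list(unsigned long nr_to_scan)
{
	return nr_to_scan;
}

/*
 * Scan the lists in batches of at most swap_cluster_max pages.
 * With bail_out set, stop once more than swap_cluster_max pages
 * have been freed, as the patch does.
 */
static void model_shrink_zone(unsigned long nr[NR_LISTS],
			      struct scan_model *sc, bool bail_out)
{
	for (;;) {
		bool work_left = false;
		int l;

		for (l = 0; l < NR_LISTS; l++)
			if (nr[l])
				work_left = true;
		if (!work_left)
			break;

		for (l = 0; l < NR_LISTS; l++) {
			if (nr[l]) {
				unsigned long nr_to_scan =
					nr[l] < sc->swap_cluster_max ?
					nr[l] : sc->swap_cluster_max;

				nr[l] -= nr_to_scan;
				sc->nr_reclaimed +=
					model_shrink_list(nr_to_scan);
			}
		}

		if (bail_out && sc->nr_reclaimed > sc->swap_cluster_max)
			break;
	}
}
```

In this model, with four lists of 1000 pages each and swap_cluster_max
set to 32, the loop frees 128 pages with the check and 4000 without it.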

There is one risk.
__alloc_pages_internal() has the following code:

        pages_reclaimed += did_some_progress;
        do_retry = 0;
        if (!(gfp_mask & __GFP_NORETRY)) {
                if (order <= PAGE_ALLOC_COSTLY_ORDER) {
                        do_retry = 1;
                } else {
                        if (gfp_mask & __GFP_REPEAT &&
                                pages_reclaimed < (1 << order))
                                        do_retry = 1;
                }
                if (gfp_mask & __GFP_NOFAIL)
                        do_retry = 1;
        }


So, shortcutting reclaim can increase the chance that a high-order page
allocation retries endlessly.

Yes, this is a theoretical issue.
But we should test it to avoid a regression.
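
As a starting point for such a test, the retry decision can be modelled
in userspace. This is only a sketch under stated assumptions: the GFP_*
macros are hypothetical stand-ins for the real __GFP_* bits,
should_retry() and count_retries() are made-up names, every allocation
attempt is assumed to fail, and every reclaim pass is assumed to free the
same number of pages.

```c
#include <stdbool.h>

#define PAGE_ALLOC_COSTLY_ORDER	3

/* Hypothetical stand-ins for the __GFP_* bits; values are arbitrary. */
#define GFP_NORETRY	0x1u
#define GFP_REPEAT	0x2u
#define GFP_NOFAIL	0x4u

/* Same decision as the __alloc_pages_internal() fragment quoted above. */
static bool should_retry(unsigned int gfp_mask, unsigned int order,
			 unsigned long pages_reclaimed)
{
	bool do_retry = false;

	if (!(gfp_mask & GFP_NORETRY)) {
		if (order <= PAGE_ALLOC_COSTLY_ORDER) {
			do_retry = true;
		} else {
			if ((gfp_mask & GFP_REPEAT) &&
			    pages_reclaimed < (1UL << order))
				do_retry = true;
		}
		if (gfp_mask & GFP_NOFAIL)
			do_retry = true;
	}
	return do_retry;
}

/*
 * Count how often the allocator would go around again, assuming each
 * reclaim pass frees per_pass pages and the allocation never succeeds.
 * max_loops caps the cases that would otherwise never terminate.
 */
static unsigned long count_retries(unsigned int gfp_mask, unsigned int order,
				   unsigned long per_pass,
				   unsigned long max_loops)
{
	unsigned long pages_reclaimed = 0;
	unsigned long loops = 0;

	while (loops < max_loops) {
		pages_reclaimed += per_pass;
		if (!should_retry(gfp_mask, order, pages_reclaimed))
			break;
		loops++;
	}
	return loops;
}
```

For an order-9 allocation with only the REPEAT bit set, this model
retries 15 times when each pass frees 32 pages, and not at all when a
single pass frees 512 pages. The 32 is just an assumed per-pass figure
for the shortcut case.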


Otherwise, looks good to me.

	Reviewed-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>


