From: Andrew Morton <akpm@linux-foundation.org>
To: Rik van Riel <riel@redhat.com>
Cc: linux-mm@kvack.org, linux-kernel@vger.kernel.org,
KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Subject: Re: [PATCH] vmscan: bail out of direct reclaim after swap_cluster_max pages
Date: Fri, 28 Nov 2008 23:24:36 -0800 [thread overview]
Message-ID: <20081128232436.f9b92685.akpm@linux-foundation.org> (raw)
In-Reply-To: <20081128062358.7a2e091f@bree.surriel.com>
On Fri, 28 Nov 2008 06:23:58 -0500 Rik van Riel <riel@redhat.com> wrote:
> When the VM is under pressure, it can happen that several direct reclaim
> processes are in the pageout code simultaneously. It also happens that
> the reclaiming processes run into mostly referenced, mapped and dirty
> pages in the first round.
>
> This results in multiple direct reclaim processes having a lower
> pageout priority, which corresponds to a higher target of pages to
> scan.
>
> This in turn can result in each direct reclaim process freeing
> many pages. Together, they can end up freeing way too many pages.
>
> This kicks useful data out of memory (in some cases more than half
> of all memory is swapped out). It also impacts performance by
> keeping tasks stuck in the pageout code for too long.
>
> A 30% improvement in hackbench has been observed with this patch.
>
> The fix is relatively simple: in shrink_zone() we can check how many
> pages we have already freed; direct reclaim tasks then break out of
> the scanning loop once they have freed enough pages and have dropped
> to a lower priority level.
>
> We do not break out of shrink_zone() when priority == DEF_PRIORITY,
> to ensure that equal pressure is applied to every zone in the common
> case.
>
> However, in order to do this we do need to know how many pages we already
> freed, so move nr_reclaimed into scan_control.
>
Again, it's just awful to make a change which has already been tried
and rejected. Especially as we don't really fully understand why it was
rejected (do we?). The information we seek may well be in the mailing
list archives somewhere.
> + /*
> + * On large memory systems, scan >> priority can become
> + * really large. This is fine for the starting priority;
> + * we want to put equal scanning pressure on each zone.
> + * However, if the VM has a harder time of freeing pages,
> + * with multiple processes reclaiming pages, the total
> + * freeing target can get unreasonably large.
> + */
> + if (sc->nr_reclaimed > sc->swap_cluster_max &&
> + priority < DEF_PRIORITY && !current_is_kswapd())
> + break;
Fingers crossed, it might be that the `priority < DEF_PRIORITY' here
will save our bacon from <whatever it was>. But it sure would be good
to know.
--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org. For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: email@kvack.org
Thread overview: 2 messages
2008-11-28 11:23 [PATCH] vmscan: bail out of direct reclaim after swap_cluster_max pages -- Rik van Riel
2008-11-29  7:24 ` Andrew Morton [this message]