From: Johannes Weiner <hannes@cmpxchg.org>
To: Mel Gorman <mgorman@suse.de>
Cc: Andrew Morton <akpm@linux-foundation.org>,
	Linux Kernel <linux-kernel@vger.kernel.org>,
	Linux-MM <linux-mm@kvack.org>,
	Linux-FSDevel <linux-fsdevel@vger.kernel.org>
Subject: Re: [PATCH 3/5] mm: vmscan: Do not reclaim from lower zones if they are balanced
Date: Fri, 27 Jun 2014 13:26:57 -0400
Message-ID: <20140627172657.GU7331@cmpxchg.org>
In-Reply-To: <1403856880-12597-4-git-send-email-mgorman@suse.de>

On Fri, Jun 27, 2014 at 09:14:38AM +0100, Mel Gorman wrote:
> Historically kswapd scanned from DMA->Movable in the opposite direction
> to the page allocator to avoid allocating behind kswapd's direction of
> progress. The fair zone allocation policy altered this in a non-obvious
> manner.
> 
> Traditionally, the page allocator preferred to use the highest eligible zone
> until the watermark was depleted, then woke kswapd and moved on to the next zone.

That's not quite right: the page allocator tries all zones in the
zonelist, then wakes up kswapd, then tries again from the beginning.
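
A minimal sketch of that ordering, as a toy model rather than kernel
code. The names toy_zone, toy_try_zonelist() and toy_alloc_page() are
hypothetical, and checking a lower min_wmark on the retry pass is an
assumption of the sketch, not something stated in this thread:

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical toy zone; not the kernel's struct zone. */
struct toy_zone {
	const char *name;
	long free_pages;
	long low_wmark;
	long min_wmark;
};

/*
 * Walk the whole zonelist in order (highest eligible zone first) and
 * take one page from the first zone whose free count is above the
 * chosen watermark. Returns NULL if every zone fails the check.
 */
struct toy_zone *toy_try_zonelist(struct toy_zone *zl, size_t nr,
				  bool use_min_wmark)
{
	for (size_t i = 0; i < nr; i++) {
		long mark = use_min_wmark ? zl[i].min_wmark : zl[i].low_wmark;

		if (zl[i].free_pages > mark) {
			zl[i].free_pages--;
			return &zl[i];
		}
	}
	return NULL;
}

/*
 * Toy allocation: one pass over all zones first. Only if every zone
 * fails is kswapd "woken", and the walk then restarts from the first
 * zone. Using min_wmark on the retry is an assumption of this sketch.
 */
struct toy_zone *toy_alloc_page(struct toy_zone *zl, size_t nr,
				bool *woke_kswapd)
{
	struct toy_zone *zone = toy_try_zonelist(zl, nr, false);

	*woke_kswapd = false;
	if (zone)
		return zone;

	*woke_kswapd = true;	/* stands in for waking kswapd */
	return toy_try_zonelist(zl, nr, true);
}
```

With Normal at its low watermark and DMA32 still above its own, the
first pass falls through to DMA32 and kswapd is not woken at all; the
wakeup only happens once every zone in the list has failed.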

> kswapd scans zones in the opposite direction so the scanning lists on
> 64-bit look like this;
> 
> Page alloc		Kswapd
> ----------              ------
> Movable			DMA
> Normal			DMA32
> DMA32			Normal
> DMA			Movable
> 
> If kswapd scanned in the same direction as the page allocator then it is
> possible that kswapd would proportionally reclaim the lower zones that
> were never used as the page allocator was always allocating behind the
> reclaim. This would work as follows
> 
> 	pgalloc hits Normal low wmark
> 					kswapd reclaims Normal
> 					kswapd reclaims DMA32
> 	pgalloc hits Normal low wmark
> 					kswapd reclaims Normal
> 					kswapd reclaims DMA32
> 
> The introduction of the fair zone allocation policy fundamentally altered
> this problem by interleaving between zones until the low watermark is
> reached. There are at least two issues with this:
> 
> o The page allocator can allocate behind kswapd's progress (kswapd
>   scans/reclaims a lower zone and the fair zone allocation policy then
>   uses those pages)
> o When the low watermark of the high zone is reached there may be recently
>   allocated pages in the lower zone, but as kswapd scans dma->highmem up
>   to the highest zone needing balancing it'll reclaim the lower zone even
>   if it was balanced.
> 
> Let N = high_wmark(Normal) + high_wmark(DMA32). Of the last N allocations,
> some percentage will be allocated from Normal and some from DMA32. The
> percentage depends on the ratio of the zone sizes and when their watermarks
> were hit. If Normal is unbalanced, DMA32 will be shrunk by kswapd. If DMA32
> is unbalanced only DMA32 will be shrunk. This leads to a difference of
> ages between DMA32 and Normal. Relatively young pages are then continually
> rotated and reclaimed from DMA32 due to the higher zone being unbalanced.
> Some of these pages may be recently read-ahead pages requiring that the page
> be re-read from disk and impacting overall performance.
> 
> The problem is fundamental to the fact we have per-zone LRU and allocation
> policies and ideally we would only have per-node allocation and LRU lists.
> This would avoid the need for the fair zone allocation policy but the
> low-memory-starvation issue would have to be addressed again from scratch.
> 
> This patch will only scan/reclaim from lower zones if they have not
> reached their watermark. This should not break the normal page aging
> as the proportional allocations due to the fair zone allocation policy
> should compensate.

That's already the case: kswapd_shrink_zone() checks whether the zone
is balanced before scanning it, so something in this analysis is off -
but I have to admit that I have trouble following it.

The only difference in the two checks is that the outer one you add
does not enforce the balance gap, which means that we stop reclaiming
zones a little earlier than before.  I guess this is where the
throughput improvements come from, but there is a chance it will
regress latency for bursty allocations.
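
To make the difference between the two checks concrete, here is a toy
sketch, not kernel code. It assumes the balance gap is the smaller of
the zone's low watermark and 1/100th of its managed pages, which is my
recollection of kernels from this period rather than something taken
from this thread. The watermark test is simplified to a plain
comparison (no allocation order, no lowmem reserves), and all names
are hypothetical:

```c
#include <stdbool.h>

/* Assumed ratio: the gap is capped at 1/100th of the zone's pages. */
#define TOY_BALANCE_GAP_RATIO 100

/* Hypothetical toy zone; not the kernel's struct zone. */
struct gap_zone {
	long free_pages;
	long low_wmark;
	long high_wmark;
	long managed_pages;
};

/* min(low watermark, managed_pages / ratio rounded up) */
long toy_balance_gap(const struct gap_zone *z)
{
	long scaled = (z->managed_pages + TOY_BALANCE_GAP_RATIO - 1) /
		      TOY_BALANCE_GAP_RATIO;

	return z->low_wmark < scaled ? z->low_wmark : scaled;
}

/* Inner-style check: balanced only above high watermark plus the gap. */
bool toy_balanced_with_gap(const struct gap_zone *z)
{
	return z->free_pages > z->high_wmark + toy_balance_gap(z);
}

/* Outer-style check: balanced as soon as the high watermark is cleared. */
bool toy_balanced_no_gap(const struct gap_zone *z)
{
	return z->free_pages > z->high_wmark;
}
```

A zone whose free page count sits between high_wmark and high_wmark
plus the gap passes the second check but fails the first. Under the
stated assumptions, that window is the set of zones that would have
been reclaimed before and would be skipped with an outer check that
does not enforce the gap.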

