From: Mel Gorman <mgorman@suse.de>
To: Johannes Weiner <hannes@cmpxchg.org>
Cc: Vlastimil Babka <vbabka@suse.cz>, Joonsoo Kim <js1304@gmail.com>,
	linux-mm@kvack.org, linux-kernel@vger.kernel.org,
	kernel-team@fb.com
Subject: Re: Regression in mobility grouping?
Date: Wed, 28 Sep 2016 11:26:09 +0100
Message-ID: <20160928102609.GA3840@suse.de>
In-Reply-To: <20160928014148.GA21007@cmpxchg.org>

On Tue, Sep 27, 2016 at 09:41:48PM -0400, Johannes Weiner wrote:
> Hi guys,
> 
> we noticed what looks like a regression in page mobility grouping
> during an upgrade from 3.10 to 4.0. Identical machines, workloads, and
> uptime, but /proc/pagetypeinfo on 3.10 looks like this:
> 
> Number of blocks type     Unmovable  Reclaimable      Movable      Reserve      Isolate 
> Node 1, zone   Normal          815          433        31518            2            0 
> 
> and on 4.0 like this:
> 
> Number of blocks type     Unmovable  Reclaimable      Movable      Reserve          CMA      Isolate 
> Node 1, zone   Normal         3880         3530        25356            2            0            0 
> 

The number of unmovable pageblocks is not necessarily related to the
number of unmovable pages in the system, although it is obviously a
concern. There are two usual approaches to investigating this: paying
close attention to the extfrag tracepoint and analysing high-order
allocation failures.
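
For comparing machines, the block tables quoted above can be reduced to
a single share of non-movable pageblocks. The following is a minimal
sketch, assuming the "Number of blocks type" layout shown in the quoted
/proc/pagetypeinfo output; the function names are made up for
illustration.

```python
import re


def parse_block_counts(text):
    """Parse the 'Number of blocks type' table of /proc/pagetypeinfo.

    Returns {(node, zone): {migratetype_name: pageblock_count}}.
    """
    result = {}
    headers = None
    for line in text.splitlines():
        if line.startswith("Number of blocks type"):
            headers = line.split()[4:]  # names after the 4-word label
            continue
        m = re.match(r"Node\s+(\d+),\s+zone\s+(\S+)\s+(.*)", line)
        if headers and m:
            counts = [int(x) for x in m.group(3).split()]
            if len(counts) == len(headers):
                key = (int(m.group(1)), m.group(2))
                result[key] = dict(zip(headers, counts))
    return result


def non_movable_share(counts):
    """Fraction of pageblocks tagged Unmovable or Reclaimable."""
    total = sum(counts.values())
    return (counts["Unmovable"] + counts["Reclaimable"]) / total


# The two tables quoted in this thread.
OLD = """\
Number of blocks type     Unmovable  Reclaimable      Movable      Reserve      Isolate
Node 1, zone   Normal          815          433        31518            2            0
"""
NEW = """\
Number of blocks type     Unmovable  Reclaimable      Movable      Reserve          CMA      Isolate
Node 1, zone   Normal         3880         3530        25356            2            0            0
"""

for label, text in (("3.10", OLD), ("4.0", NEW)):
    counts = parse_block_counts(text)[(1, "Normal")]
    print(f"{label}: {non_movable_share(counts):.1%} of "
          f"{sum(counts.values())} pageblocks are non-movable")
```

Both tables describe the same 32768 pageblocks, so the shares are
directly comparable: roughly 3.8% on 3.10 against 22.6% on 4.0. This
says nothing about how many unmovable pages actually sit in those
blocks.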

It's drastic, but when migration grouping was first implemented it was
necessary to use a variation of PAGE_OWNER to walk the movable pageblocks
identifying unmovable allocations in there. I also used to have a
debugging patch that would print out the owner of all pages that failed
to migrate within an unmovable block. Unfortunately I don't have these
patches any more and they wouldn't apply anyway but it'd be easier to
implement today than it was 7-8 years ago.
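
A userspace approximation of that walk is possible on kernels built
with CONFIG_PAGE_OWNER and booted with page_owner=on. The sketch below
assumes each record in the page_owner dump carries a line of the form
"PFN <n> type <T> Block <n> type <T>"; that format is an assumption and
should be checked against the kernel in use. The sample input is made
up, not real output.

```python
import re
from collections import defaultdict

# Assumed record line; verify against your kernel's page_owner output.
PFN_LINE = re.compile(r"PFN (\d+) type (\w+) Block (\d+) type (\w+)")


def misplaced_pages(dump, block_type="Movable"):
    """Map block number -> [(pfn, page_type)] for pages whose allocation
    type differs from the type of a pageblock tagged block_type."""
    found = defaultdict(list)
    for line in dump.splitlines():
        m = PFN_LINE.search(line)
        if not m:
            continue
        pfn, page_type, block, blk_type = m.groups()
        if blk_type == block_type and page_type != block_type:
            found[int(block)].append((int(pfn), page_type))
    return dict(found)


# Hypothetical sample records, for illustration only.
SAMPLE = """\
Page allocated via order 0, mask 0x2
PFN 1049088 type Unmovable Block 2049 type Movable Flags 0x0
PFN 1049089 type Movable Block 2049 type Movable Flags 0x0
PFN 1050000 type Unmovable Block 2050 type Unmovable Flags 0x0
"""

for block, pages in misplaced_pages(SAMPLE).items():
    print(f"block {block}: {len(pages)} non-movable page(s): {pages}")
```

The stack trace that follows each record in a real dump is what
identifies the owner, so in practice the matching records would be
printed whole rather than reduced to PFNs.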

> 4.0 is either polluting pageblocks more aggressively at allocation, or
> is not able to make pageblocks movable again when the reclaimable and
> unmovable allocations are released. Invoking compaction manually
> (/proc/sys/vm/compact_memory) is not bringing them back, either.
> 
> The problem we are debugging is that these machines have a very high
> rate of order-3 allocations (fdtable during fork, network rx), and
> after the upgrade allocstalls have increased dramatically. I'm not
> entirely sure this is the same issue, since even order-0 allocations
> are struggling, but the mobility grouping in itself looks problematic.
> 

Network RX is likely to be atomic allocations. Another potential place
to focus on is the use of HighAtomic pageblocks and either increasing
them in size or protecting them more aggressively.

> I'm still going through the changes relevant to mobility grouping in
> that timeframe, but if this rings a bell for anyone, it would help. I
> hate blaming random patches, but these caught my eye:
> 
> 9c0415e mm: more aggressive page stealing for UNMOVABLE allocations
> 3a1086f mm: always steal split buddies in fallback allocations
> 99592d5 mm: when stealing freepages, also take pages created by splitting buddy page
> 
> The changelog states that by aggressively stealing split buddy pages
> during a fallback allocation we avoid subsequent stealing. But since
> there are generally more movable/reclaimable pages available, and so
> less falling back and stealing freepages on behalf of movable, won't
> this mean that we could expect exactly that result - growing numbers
> of unmovable blocks, while rarely stealing them back in movable alloc
> fallbacks? And the expansion of !MOVABLE blocks would over time make
> compaction less and less effective too, seeing as it doesn't consider
> anything !MOVABLE suitable migration targets?
> 

It's a solid theory. There has been a lot of activity to weaken
fragmentation avoidance protection in order to reduce latency.
Unfortunately, external fragmentation remains very difficult to pin down
precisely, because whether it is important or not is itself a matter of
definition.
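
One way to check the theory is to tally extfrag events by which
migratetype requested the page and which migratetype's free list it was
taken from. The sketch below assumes the trace was captured with the
kmem:mm_page_alloc_extfrag event enabled and that each event prints
key=value fields named alloc_migratetype, fallback_migratetype and
fragmenting. The sample lines are made up, and the numeric migratetype
values depend on the kernel version, so they are left as numbers here.

```python
import re
from collections import Counter

FIELD = re.compile(r"(\w+)=(\S+)")
EVENT = "mm_page_alloc_extfrag:"


def tally_fallbacks(trace_text, fragmenting_only=True):
    """Count extfrag events per (alloc_migratetype, fallback_migratetype).

    With fragmenting_only set, events with fragmenting=0 (a whole
    pageblock or more was taken) are skipped.
    """
    tally = Counter()
    for line in trace_text.splitlines():
        if EVENT not in line:
            continue
        fields = dict(FIELD.findall(line.split(EVENT, 1)[1]))
        if "alloc_migratetype" not in fields:
            continue
        if fragmenting_only and fields.get("fragmenting") != "1":
            continue
        key = (int(fields["alloc_migratetype"]),
               int(fields["fallback_migratetype"]))
        tally[key] += 1
    return tally


# Hypothetical trace lines, for illustration only.
SAMPLE = """\
 a-1 [003] .... 512.100001: mm_page_alloc_extfrag: page=ffffea0001000000 pfn=262144 alloc_order=3 fallback_order=4 pageblock_order=9 alloc_migratetype=0 fallback_migratetype=2 fragmenting=1 change_ownership=0
 a-1 [003] .... 512.100090: mm_page_alloc_extfrag: page=ffffea0001008000 pfn=262656 alloc_order=0 fallback_order=9 pageblock_order=9 alloc_migratetype=0 fallback_migratetype=2 fragmenting=0 change_ownership=1
 b-2 [001] .... 512.200000: mm_page_alloc_extfrag: page=ffffea0002000000 pfn=524288 alloc_order=0 fallback_order=2 pageblock_order=9 alloc_migratetype=2 fallback_migratetype=0 fragmenting=1 change_ownership=0
"""

for (alloc_mt, fallback_mt), n in sorted(tally_fallbacks(SAMPLE).items()):
    print(f"alloc_migratetype={alloc_mt} took from "
          f"fallback_migratetype={fallback_mt}: {n} fragmenting event(s)")
```

A tally that is heavily skewed towards unmovable and reclaimable
requests taking from movable blocks, with few events in the other
direction, would be consistent with the theory.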

Another avenue worth considering is that compaction used to scan unmovable
pageblocks and migrate movable pages out of them, but that was weakened
over time in an effort to allocate THPs from direct allocation context
quickly enough. I'm not exactly sure what we do there at the moment and
whether kcompactd cleans unmovable pageblocks or not. Cleaning them takes
time but it also reduces unmovable pageblock steals over time (or at least
it did a few years ago when I last investigated this in depth).

Unfortunately I do not have any suggestions offhand on how it could be
easily improved without going back to first principles and identifying
which pages end up in awkward positions, why, and whether the cost of
"cleaning" unmovable pageblocks during compaction for a high-order
allocation is justified or not.

-- 
Mel Gorman
SUSE Labs
