From: Mel Gorman <mgorman@suse.de>
To: Joonsoo Kim <js1304@gmail.com>
Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>,
	Andrew Morton <akpm@linux-foundation.org>,
	LKML <linux-kernel@vger.kernel.org>,
	Linux Memory Management List <linux-mm@kvack.org>,
	Vlastimil Babka <vbabka@suse.cz>, Rik van Riel <riel@redhat.com>,
	David Rientjes <rientjes@google.com>,
	Minchan Kim <minchan@kernel.org>
Subject: Re: [RFC PATCH 00/10] redesign compaction algorithm
Date: Thu, 25 Jun 2015 19:41:35 +0100
Message-ID: <20150625184135.GB26927@suse.de>
In-Reply-To: <CAAmzW4PMWOaAa0bd7xVr5Jz=xVgqMw8G=UFOwhUGuyLL9EFbHA@mail.gmail.com>

On Fri, Jun 26, 2015 at 03:14:39AM +0900, Joonsoo Kim wrote:
> > It could though. Reclaim/compaction is entered for orders higher than
> > PAGE_ALLOC_COSTLY_ORDER and when scan priority is sufficiently high.
> > That could be adjusted if you have a viable case where orders <
> > PAGE_ALLOC_COSTLY_ORDER must succeed and currently require excessive
> > reclaim instead of relying on compaction.
> 
> Yes. I saw this problem in a real situation. On ARM, an order-2 allocation
> is requested in fork(), so it must succeed. But there are not enough order-2
> free pages, so reclaim/compaction begins. Compaction fails repeatedly,
> although I didn't check the exact reason.

That should be identified and repaired prior to reimplementing
compaction because it's important.

> >> >> 3) Compaction capability is highly dependent on the migratetype of
> >> >> memory, because the freepage scanner doesn't scan unmovable pageblocks.
> >> >>
> >> >
> >> > For a very good reason. Unmovable allocation requests that fallback to
> >> > other pageblocks are the worst in terms of fragmentation avoidance. The
> >> > more of these events there are, the more the system will decay. If there
> >> > are many of these events then a compaction benchmark may start with high
> >> > success rates but decay over time.
> >> >
> >> > Very broadly speaking, the more the mm_page_alloc_extfrag tracepoint
> >> > triggers with alloc_migratetype == MIGRATE_UNMOVABLE, the faster the
> >> > system is decaying. Having the freepage scanner select unmovable
> >> > pageblocks will trigger this event more frequently.
> >> >
> >> > The unfortunate impact is that selecting unmovable blocks from the free
> >> > scanner will improve compaction success rates for high-order kernel
> >> > allocations early in the lifetime of the system but later fail high-order
> >> > allocation requests as more pageblocks get converted to unmovable. It
> >> > might be ok for kernel allocations but THP will eventually have a 100%
> >> > failure rate.
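
One rough way to watch for the decay described above is to count
mm_page_alloc_extfrag events by alloc_migratetype in captured trace output.
The following is only a minimal sketch. It assumes the event line carries an
alloc_migratetype=<n> field and that MIGRATE_UNMOVABLE has the value 0; both
should be checked against the running kernel. The sample lines are invented
for illustration and are not real trace output.

```python
import re
from collections import Counter

# Assumption: MIGRATE_UNMOVABLE is 0 in the kernel being traced.
MIGRATE_UNMOVABLE = 0

# Match only the field we care about so the sketch does not depend on
# the exact set or order of the other fields in the event.
_EXTFRAG = re.compile(r"mm_page_alloc_extfrag:.*?\balloc_migratetype=(\d+)")


def count_extfrag(lines):
    """Return a Counter mapping alloc_migratetype -> number of extfrag events."""
    counts = Counter()
    for line in lines:
        match = _EXTFRAG.search(line)
        if match:
            counts[int(match.group(1))] += 1
    return counts


# Invented sample lines (hypothetical field values, not captured from a system).
SAMPLE = [
    "a-1 [000] 1.0: mm_page_alloc_extfrag: pfn=4096 alloc_migratetype=0 fallback_migratetype=2",
    "a-1 [000] 2.0: mm_page_alloc_extfrag: pfn=8192 alloc_migratetype=2 fallback_migratetype=0",
    "b-2 [001] 3.0: mm_page_alloc_extfrag: pfn=9000 alloc_migratetype=0 fallback_migratetype=1",
    "b-2 [001] 4.0: mm_page_alloc: pfn=9001 order=0 migratetype=0",
]

if __name__ == "__main__":
    counts = count_extfrag(SAMPLE)
    print("unmovable fallbacks:", counts[MIGRATE_UNMOVABLE])
    print("all fallbacks:", sum(counts.values()))
```

Sampling the unmovable count at intervals and comparing the deltas gives a
crude view of whether the rate of such fallbacks is rising over time.
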
> >>
> >> I wrote the rationale in the patch itself. We already use non-movable
> >> pageblocks for the migration scanner. It empties non-movable pageblocks,
> >> so the number of free pages in non-movable pageblocks will increase.
> >> Using non-movable pageblocks for the freepage scanner negates this effect,
> >> so the number of free pages in non-movable pageblocks will stay balanced.
> >> Could you tell me in detail how having the freepage scanner select
> >> unmovable pageblocks will cause more fragmentation? Possibly I don't
> >> understand the effect of this patch correctly and need to investigate. :)
> >>
> >
> > The long-term success rate of fragmentation avoidance depends on
> > minimising the number of UNMOVABLE allocation requests that use a
> > pageblock belonging to another migratetype. Once such a fallback occurs,
> > that pageblock potentially can never be used for a THP allocation again.
> >
> > Let's say there is an unmovable pageblock with 500 free pages in it. If
> > the freepage scanner uses that pageblock and allocates all 500 free
> > pages then the next unmovable allocation request needs a new pageblock.
> > If none is completely free then it will fall back to using a
> > RECLAIMABLE or MOVABLE pageblock, forever contaminating it.
> 
> Yes, I can imagine that situation. But, as I said above, we already use
> non-movable pageblocks for the migration scanner. While an unmovable
> pageblock with 500 free pages fills, some other unmovable pageblock
> holding some movable pages will be emptied. The number of free pages
> in non-movable pageblocks would be maintained, so fallback doesn't happen.
> 
> Anyway, it is better to investigate this effect. I will do so and attach
> the results to the next submission.
> 

Let's say we have X unmovable pageblocks and Y pageblocks overall. If the
migration scanner takes movable pages from the X pageblocks then there is
more space for unmovable allocations without having to increase X -- this is
good. If the free scanner uses the X pageblocks as targets then they can
fill. The next unmovable allocation then falls back to another pageblock and
we either have X+1 unmovable pageblocks (full steal) or a mixed pageblock
(partial steal) that cannot be used for THP. Do this enough times and
X == Y, at which point all THP allocations fail.
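
The X/Y argument can be played out in a toy model. This is only a sketch
under stated assumptions, not kernel code: the Pageblock class, the 512-page
block size, the 16-page allocation size and the "scanner fills every
unmovable block each round" policy are all invented for illustration. It
models only the worst case described above and deliberately leaves out the
migration scanner freeing space in unmovable pageblocks.

```python
from dataclasses import dataclass

PAGES_PER_BLOCK = 512  # assumption: pageblock size in pages for this toy model


@dataclass
class Pageblock:
    migratetype: str     # "unmovable" or "movable"
    free: int            # free pages remaining in the block
    mixed: bool = False  # movable block that now also holds unmovable pages


def alloc_unmovable(blocks, npages):
    """Place an unmovable allocation and report how it was satisfied."""
    # Preferred: an unmovable pageblock that still has room.
    for b in blocks:
        if b.migratetype == "unmovable" and b.free >= npages:
            b.free -= npages
            return "native"
    # Fallback 1: take over a completely free movable block (X becomes X+1).
    for b in blocks:
        if b.migratetype == "movable" and b.free == PAGES_PER_BLOCK:
            b.migratetype = "unmovable"
            b.free -= npages
            return "full_steal"
    # Fallback 2: pollute a partially used movable block (mixed pageblock).
    for b in blocks:
        if b.migratetype == "movable" and b.free >= npages:
            b.free -= npages
            b.mixed = True
            return "partial_steal"
    return "fail"


def simulate(total_blocks, unmovable_blocks, rounds, scanner_uses_unmovable):
    """Return (X, THP-capable blocks, list of allocation outcomes)."""
    blocks = [Pageblock("unmovable", 500) for _ in range(unmovable_blocks)]
    blocks += [Pageblock("movable", PAGES_PER_BLOCK)
               for _ in range(total_blocks - unmovable_blocks)]
    events = []
    for _ in range(rounds):
        if scanner_uses_unmovable:
            # Free scanner uses unmovable blocks as migration targets
            # and, in this worst case, fills them completely.
            for b in blocks:
                if b.migratetype == "unmovable":
                    b.free = 0
        events.append(alloc_unmovable(blocks, 16))
    x = sum(b.migratetype == "unmovable" for b in blocks)
    thp = sum(b.migratetype == "movable" and b.free == PAGES_PER_BLOCK
              for b in blocks)
    return x, thp, events


if __name__ == "__main__":
    for uses in (False, True):
        x, thp, events = simulate(8, 2, 6, uses)
        print(f"scanner_uses_unmovable={uses}: X={x}, THP-capable={thp}, {events}")
```

With Y=8 and X=2, six rounds leave X at 2 when the free scanner avoids
unmovable blocks. When it fills them first, every round forces a full steal
and the model ends with X == Y and no block left for a THP allocation.
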

-- 
Mel Gorman
SUSE Labs

