From: Vlastimil Babka <vbabka@suse.cz>
To: Michal Hocko <mhocko@kernel.org>
Cc: linux-mm@kvack.org, Andrew Morton <akpm@linux-foundation.org>,
Joonsoo Kim <iamjoonsoo.kim@lge.com>,
Rik van Riel <riel@redhat.com>,
David Rientjes <rientjes@google.com>,
Mel Gorman <mgorman@techsingularity.net>,
Johannes Weiner <hannes@cmpxchg.org>,
Tetsuo Handa <penguin-kernel@i-love.sakura.ne.jp>,
linux-kernel@vger.kernel.org,
Linus Torvalds <torvalds@linux-foundation.org>
Subject: Re: [RFC 13/13] mm, compaction: fix and improve watermark handling
Date: Mon, 16 May 2016 11:50:22 +0200 [thread overview]
Message-ID: <573997DE.6010109@suse.cz> (raw)
In-Reply-To: <20160516092505.GE23146@dhcp22.suse.cz>
On 05/16/2016 11:25 AM, Michal Hocko wrote:
> On Tue 10-05-16 09:36:03, Vlastimil Babka wrote:
>> Compaction has been using watermark checks when deciding whether it was
>> successful, and whether compaction is at all suitable. There are few problems
>> with these checks.
>>
>> - __compact_finished() uses low watermark in a check that has to pass if
>> the direct compaction is to finish and allocation should succeed. This is
>> too pessimistic, as the allocation will typically use min watermark. It
>> may happen that during compaction, we drop below the low watermark (due to
>> parallel activity), but still form the target high-order page. By checking
>> against low watermark, we might needlessly continue compaction. After this
>> patch, the check uses direct compactor's alloc_flags to determine the
>> watermark, which is effectively the min watermark.
>
> OK, this makes some sense. It would be great if we could have at least
> some clarification why the low wmark has been used previously. Probably
> Mel can remember?
>
>> - __compaction_suitable has the same issue in the check whether the allocation
>> is already supposed to succeed and we don't need to compact. Fix it the same
>> way.
>>
>> - __compaction_suitable() then checks the low watermark plus a (2 << order) gap
>> to decide if there's enough free memory to perform compaction. This check
>
> And this was a real head scratcher when I started looking into the
> compaction recently. Why do we need to be above low watermark to even
> start compaction.
Hmm, above you said you're fine with low wmark (maybe after
clarification). I don't know why it was used, can only guess.
> Compaction uses additional memory only for a short
> period of time and then releases the already migrated pages.
As for the 2 << order gap. I can imagine that e.g. order-5 compaction
(32 pages) isolates 20 pages for migration and starts looking for free
pages. It collects 19 free pages and then reaches an order-4 free page.
Splitting that page to collect it would result in 19+16=35 pages
isolated, thus exceed the 1 << order gap, and fail. With 2 << order gap,
chances of this happening are reduced.
>> uses direct compactor's alloc_flags, but that's wrong. If alloc_flags doesn't
>> include ALLOC_CMA, we might fail the check, even though the freepage
>> isolation isn't restricted outside of CMA pageblocks. On the other hand,
>> alloc_flags may indicate access to memory reserves, making compaction proceed
>> and then fail watermark check during freepage isolation, which doesn't pass
>> alloc_flags. The fix here is to use fixed ALLOC_CMA flags in the
>> __compaction_suitable() check.
>
> This makes my head hurt. Whut?
I'll try to explain better next time.
>> - __isolate_free_page uses low watermark check to decide if free page can be
>> isolated. It also doesn't use ALLOC_CMA, so add it for the same reasons.
>
> Why do we check the watermark at all? What would happen if this obscure
> if (!is_migrate_isolate(mt)) was gone? I remember I put some tracing
> there and it never hit for me even when I was testing close to OOM
> conditions. Maybe an earlier check bailed out but this code path looks
> really obscure so it should either deserve a large fat comment or to
> die.
The check is there so that compaction doesn't exhaust memory below
reserves during its work, just like any other non-privileged allocation.
>> - The use of low watermark checks in __compaction_suitable() and
>> __isolate_free_page does perhaps make sense for high-order allocations where
>> more freepages increase the chance of success, and we can typically fail
>> with some order-0 fallback when the system is struggling. But for low-order
>> allocation, forming the page should not be that hard. So using low watermark
>> here might just prevent compaction from even trying, and eventually lead to
>> OOM killer even if we are above min watermarks. So after this patch, we use
>> min watermark for non-costly orders in these checks, by passing the
>> alloc_flags parameter to split_page() and __isolate_free_page().
>
> OK, so if IIUC costly high order requests even shouldn't try when we are
> below watermark (unless they are __GFP_REPEAT which would get us to a
> stronger compaction mode/priority) and that would reclaim us over low
> wmark and go on. Is that what you are saying? This makes some sense but
> then let's have a _single_ place to check the watermak please. This
> checks at few different levels is just subtle as hell and error prone
> likewise.
What single place then? The situation might change dynamically so
passing the initial __compaction_suitable() check doesn't guarantee that
enough free pages are still available when it comes to isolating
freepages. Your testing that never hit it shows that this is rare, but
do we want to risk compaction making an OOM situation worse?
>> To sum up, after this patch, the kernel should in some situations finish
>> successful direct compaction sooner, prevent compaction from starting when it's
>> not needed, proceed with compaction when free memory is in CMA pageblocks, and
>> for non-costly orders, prevent OOM killing or excessive reclaim when free
>> memory is between the min and low watermarks.
>
> Could you please split this patch into three(?) parts. One to remove as many
> wmark checks as possible, move low wmark to min for !costly high orders
> and finally the cma part which I fail to understand...
Sure, although I'm not yet convinced we can remove any checks.
> Thanks!
>
--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org. For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>
next prev parent reply other threads:[~2016-05-16 9:50 UTC|newest]
Thread overview: 61+ messages / expand[flat|nested] mbox.gz Atom feed top
2016-05-10 7:35 [RFC 00/13] make direct compaction more deterministic Vlastimil Babka
2016-05-10 7:35 ` [RFC 01/13] mm, compaction: don't isolate PageWriteback pages in MIGRATE_SYNC_LIGHT mode Vlastimil Babka
2016-05-11 12:40 ` Michal Hocko
2016-05-10 7:35 ` [RFC 02/13] mm, page_alloc: set alloc_flags only once in slowpath Vlastimil Babka
2016-05-10 11:28 ` Tetsuo Handa
2016-05-10 12:30 ` Vlastimil Babka
2016-05-12 12:41 ` Michal Hocko
2016-05-31 6:20 ` Joonsoo Kim
2016-05-31 7:59 ` Vlastimil Babka
2016-06-02 1:50 ` Joonsoo Kim
2016-05-10 7:35 ` [RFC 03/13] mm, page_alloc: don't retry initial attempt " Vlastimil Babka
2016-05-12 12:48 ` Michal Hocko
2016-05-31 6:25 ` Joonsoo Kim
2016-05-31 12:03 ` Vlastimil Babka
2016-05-10 7:35 ` [RFC 04/13] mm, page_alloc: restructure direct compaction handling " Vlastimil Babka
2016-05-12 13:29 ` Michal Hocko
2016-05-13 8:10 ` Vlastimil Babka
2016-05-13 8:31 ` Michal Hocko
2016-05-10 7:35 ` [RFC 05/13] mm, page_alloc: make THP-specific decisions more generic Vlastimil Babka
2016-05-12 13:43 ` Michal Hocko
2016-05-10 7:35 ` [RFC 06/13] mm, thp: remove __GFP_NORETRY from khugepaged and madvised allocations Vlastimil Babka
2016-05-12 16:20 ` Michal Hocko
2016-05-13 8:23 ` Vlastimil Babka
2016-05-13 12:05 ` Michal Hocko
2016-05-18 11:59 ` Vlastimil Babka
2016-05-18 15:24 ` Michal Hocko
2016-05-20 13:57 ` Vlastimil Babka
2016-05-23 8:39 ` Michal Hocko
2016-05-10 7:35 ` [RFC 07/13] mm, compaction: introduce direct compaction priority Vlastimil Babka
2016-05-13 12:37 ` Michal Hocko
2016-05-10 7:35 ` [RFC 08/13] mm, compaction: simplify contended compaction handling Vlastimil Babka
2016-05-13 13:09 ` Michal Hocko
2016-05-16 7:10 ` Vlastimil Babka
2016-05-10 7:35 ` [RFC 09/13] mm, compaction: make whole_zone flag ignore cached scanner positions Vlastimil Babka
2016-05-13 13:23 ` Michal Hocko
2016-05-10 7:36 ` [RFC 10/13] mm, compaction: cleanup unused functions Vlastimil Babka
2016-05-10 7:36 ` [RFC 11/13] mm, compaction: add the ultimate direct compaction priority Vlastimil Babka
2016-05-13 13:38 ` Michal Hocko
2016-05-16 7:17 ` Vlastimil Babka
2016-05-16 8:11 ` Michal Hocko
2016-05-18 12:46 ` Vlastimil Babka
2016-05-10 7:36 ` [RFC 12/13] mm, compaction: more reliably increase " Vlastimil Babka
2016-05-10 12:55 ` Vlastimil Babka
2016-05-13 14:15 ` Michal Hocko
2016-05-16 7:31 ` Vlastimil Babka
2016-05-16 8:14 ` Michal Hocko
2016-05-16 9:27 ` Vlastimil Babka
2016-05-16 9:52 ` Michal Hocko
2016-05-31 6:37 ` Joonsoo Kim
2016-05-31 12:07 ` Vlastimil Babka
2016-05-31 12:29 ` Vlastimil Babka
2016-06-02 2:50 ` Joonsoo Kim
2016-05-10 7:36 ` [RFC 13/13] mm, compaction: fix and improve watermark handling Vlastimil Babka
2016-05-16 9:25 ` Michal Hocko
2016-05-16 9:50 ` Vlastimil Babka [this message]
2016-05-16 12:30 ` Michal Hocko
2016-05-18 13:50 ` Mel Gorman
2016-05-18 14:27 ` Michal Hocko
2016-05-18 14:40 ` Mel Gorman
2016-05-17 20:01 ` [RFC 00/13] make direct compaction more deterministic Michal Hocko
2016-05-18 7:19 ` Vlastimil Babka
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=573997DE.6010109@suse.cz \
--to=vbabka@suse.cz \
--cc=akpm@linux-foundation.org \
--cc=hannes@cmpxchg.org \
--cc=iamjoonsoo.kim@lge.com \
--cc=linux-kernel@vger.kernel.org \
--cc=linux-mm@kvack.org \
--cc=mgorman@techsingularity.net \
--cc=mhocko@kernel.org \
--cc=penguin-kernel@i-love.sakura.ne.jp \
--cc=riel@redhat.com \
--cc=rientjes@google.com \
--cc=torvalds@linux-foundation.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox