From: Johannes Weiner <hannes@cmpxchg.org>
To: Vlastimil Babka <vbabka@suse.cz>
Cc: Andrew Morton <akpm@linux-foundation.org>,
Suren Baghdasaryan <surenb@google.com>,
Michal Hocko <mhocko@suse.com>,
Brendan Jackman <jackmanb@google.com>, Zi Yan <ziy@nvidia.com>,
David Rientjes <rientjes@google.com>,
David Hildenbrand <david@kernel.org>,
Lorenzo Stoakes <lorenzo.stoakes@oracle.com>,
"Liam R. Howlett" <Liam.Howlett@oracle.com>,
Mike Rapoport <rppt@kernel.org>,
Joshua Hahn <joshua.hahnjy@gmail.com>,
Pedro Falcato <pfalcato@suse.de>,
linux-mm@kvack.org, linux-kernel@vger.kernel.org
Subject: Re: [PATCH RFC 2/2] mm, page_alloc: fail costly __GFP_NORETRY allocations faster
Date: Wed, 17 Dec 2025 11:35:12 -0500
Message-ID: <aULbwOHkRvWwy6zg@cmpxchg.org>
In-Reply-To: <9881b540-7e22-404b-aeaa-282dc5eeb5d5@suse.cz>

On Wed, Dec 17, 2025 at 09:46:34AM +0100, Vlastimil Babka wrote:
> On 12/16/25 21:32, Johannes Weiner wrote:
> > On Tue, Dec 16, 2025 at 04:54:22PM +0100, Vlastimil Babka wrote:
> >> It might make therefore more sense to just fail unconditionally after
> >> the initial compaction attempt, so do that instead. Costly allocations
> >> that do want the reclaim/compaction to happen at least once can omit
> >> __GFP_NORETRY, or even specify __GFP_RETRY_MAYFAIL for more than one
> >> attempt.
> >>
> >> There is a slight potential unfairness in that costly __GFP_NORETRY
> >> allocations that can't perform direct compaction (i.e. lack __GFP_IO)
> >> will still be allowed to direct reclaim, while those that can direct
> >> compact will now never attempt direct reclaim. However, in cases of
> >> memory pressure causing compaction to be skipped due to insufficient
> >> base pages, direct reclaim was already not done before, so there should
> >> be no functional regressions from this change.
> >
> > Hm, kind of. There could be enough basepages for compaction_suitable()
> > but compaction odds are still higher with more free pages. So there
> > might be cases it regresses.
> >
> > __GFP_NORETRY semantics say it'll try reclaim at least once. We should
> > be able to keep that and still simplify, no?
> >
> >>  	if (costly_order && (gfp_mask & __GFP_NORETRY)) {
> >> -		if (gfp_mask & __GFP_THISNODE)
> >> -			goto nopage;
> >> +		goto nopage;
> >
> > IOW, maybe directly select for the NUMA-THP special case here?
> >
> > 	/* Optimistic node-local huge page - only compact once */
> > 	if (costly_order &&
> > 	    ((gfp_mask & (__GFP_NORETRY|__GFP_THISNODE)) ==
> > 	     (__GFP_NORETRY|__GFP_THISNODE)))
> > 		goto nopage;
> >
> > and then let other __GFP_NORETRY fall through.
>
> I did consider it as an alternative when realizing the potential unfairness
> mentioned above, but then went with the simpler code option.
>
> With your suggestion we keep the THP-specific check but at least remove the
> arguably illogical compaction feedback.

Yes, I'm in favor of removing those either way.

Reclaim makes its own decisions around costly orders. For example, it
targets a higher number of free pages through compaction_ready() than
where compaction would return SKIPPED, to account for concurrency. I
don't think the allocator should have conflicting opinions.
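
To illustrate the disagreement window, here is a toy user-space model.
Everything in it is made up for illustration (the struct, the field
names, the thresholds); it does not reproduce the kernel's actual
watermark math, only the idea that reclaim wants headroom on top of the
point where compaction stops reporting SKIPPED:

```c
#include <stdbool.h>

/* Hypothetical per-zone numbers; NOT the kernel's real watermark math. */
struct toy_zone {
	unsigned long free_pages;
	unsigned long compact_min;	/* below this, compaction reports SKIPPED */
	unsigned long reclaim_slack;	/* extra headroom reclaim wants for concurrency */
};

/* Allocator-side view: too little free memory for compaction to even start? */
static bool toy_compaction_would_skip(const struct toy_zone *z)
{
	return z->free_pages < z->compact_min;
}

/* Reclaim-side view: only stop reclaiming once there is headroom on top. */
static bool toy_reclaim_considers_ready(const struct toy_zone *z)
{
	return z->free_pages >= z->compact_min + z->reclaim_slack;
}

/* The window in which the two views give conflicting answers. */
static bool toy_views_disagree(const struct toy_zone *z)
{
	return !toy_compaction_would_skip(z) && !toy_reclaim_considers_ready(z);
}
```

With compact_min = 1000 and reclaim_slack = 500, a zone with 1100 free
pages sits in that window: compaction would run, but reclaim would
still want to free more first.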

Regarding __GFP_NORETRY: I think it would just be a chance to simplify
the mental model around it again. If somebody does a NORETRY request
when memory is full of stale page cache, I think it's reasonable to
expect at least one shot at dropping some cache to make it happen.

Shortcutting directly to compaction is a good optimization when we
suspect it could succeed without requiring reclaim. But I'm not sure
it's reasonable to ONLY do that and give up.
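
A minimal sketch of the distinction, as standalone C. The flag values
and the toy_ names are placeholders for illustration, not the kernel's
definitions. The point is only that the early bail-out would require
BOTH flags, so a plain NORETRY request falls through and still gets its
one reclaim attempt:

```c
#include <stdbool.h>

typedef unsigned int toy_gfp_t;

/* Placeholder bit values for illustration; the kernel defines the real ones. */
#define TOY_GFP_NORETRY		(1u << 0)
#define TOY_GFP_THISNODE	(1u << 1)

/* True only when both flags are set, not when either one is. */
static bool toy_gfp_thisnode_noretry(toy_gfp_t gfp_mask)
{
	const toy_gfp_t both = TOY_GFP_NORETRY | TOY_GFP_THISNODE;

	return (gfp_mask & both) == both;
}

/* Should a request give up right after the first compaction attempt? */
static bool toy_fail_after_first_compaction(toy_gfp_t gfp_mask, bool costly_order)
{
	return costly_order && toy_gfp_thisnode_noretry(gfp_mask);
}
```

Testing with (gfp_mask & both) alone would be true for either flag,
which is why the comparison against the full mask matters here.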

Btw, I do wonder why that up-front compaction run is so explicit, when
we have

	__alloc_pages_direct_reclaim()
	__alloc_pages_direct_compact()

calls following below. Couldn't we check for conditions upfront and
set a flag to skip reclaim initially? Then handle priority adjustments
in the retry conditions? IOW, something like:

	unsigned long did_some_progress = 0;
	bool skip_reclaim = false;

	/* Cases where compaction alone might succeed: try it first */
	if (can_compact && costly_order)
		skip_reclaim = true;
	if (can_compact && order > 0 && ac->migratetype != MIGRATE_MOVABLE)
		skip_reclaim = true;
	if (gfp_thisnode_noretry(gfp_mask))
		skip_reclaim = true;

retry:
	page = get_page_from_freelist(..., alloc_flags, ...);
	if (page)
		goto got_pg;

	if (!skip_reclaim) {
		page = __alloc_pages_direct_reclaim(..., &did_some_progress);
		if (page)
			goto got_pg;
	}

	page = __alloc_pages_direct_compact(...);
	if (page)
		goto got_pg;

	if (should_loop()) {
		/* The compaction-only shortcut failed: allow reclaim from now on */
		skip_reclaim = false;
		compact_priority = ...;
		goto retry;
	}

That would naturally get rid of the gfp_pfmemalloc_allowed() branch
for the upfront check as well, because the ALLOC_NO_WATERMARKS attempt
happens before we do the reclaim/compaction calls.
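
For anyone who wants to poke at the control flow, the loop above can be
modeled in plain user-space C. This is only a toy under stated
assumptions: struct toy_state, the toy_ helpers and the max_loops
counter (standing in for should_loop()) are all invented here, and the
"blocks" bookkeeping is a crude stand-in for real freelist, reclaim and
compaction behavior.

```c
#include <stdbool.h>

/* Hypothetical stand-in for allocator state; nothing here is kernel API. */
struct toy_state {
	bool can_compact;
	bool costly_order;
	int free_blocks;	/* suitable blocks already on the freelist */
	int compactable;	/* blocks compaction could assemble right now */
	int reclaimable;	/* blocks' worth of cache that reclaim could drop */
	int reclaim_calls;
	int compact_calls;
};

static bool toy_freelist(struct toy_state *s)
{
	if (s->free_blocks > 0) {
		s->free_blocks--;
		return true;
	}
	return false;
}

static bool toy_direct_reclaim(struct toy_state *s)
{
	s->reclaim_calls++;
	/* Dropped cache becomes free memory that compaction can work with. */
	s->compactable += s->reclaimable;
	s->reclaimable = 0;
	return toy_freelist(s);
}

static bool toy_direct_compact(struct toy_state *s)
{
	s->compact_calls++;
	if (s->compactable > 0) {
		s->compactable--;
		return true;
	}
	return false;
}

/* Returns true if the allocation succeeds within max_loops retries. */
static bool toy_slowpath(struct toy_state *s, int max_loops)
{
	/* Compaction-first shortcut: skip reclaim on the initial pass. */
	bool skip_reclaim = s->can_compact && s->costly_order;
	int loops = 0;

	for (;;) {
		if (toy_freelist(s))
			return true;
		if (!skip_reclaim && toy_direct_reclaim(s))
			return true;
		if (toy_direct_compact(s))
			return true;
		if (loops++ >= max_loops)
			return false;
		/* The shortcut failed once: allow reclaim on later passes. */
		skip_reclaim = false;
	}
}
```

With compactable = 0 and reclaimable = 1, toy_slowpath(&s, 0) fails
without ever calling reclaim, which is the "only compact and give up"
behavior, while toy_slowpath(&s, 1) succeeds on the second pass after
one reclaim call.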