From: Vlastimil Babka <vbabka@suse.cz>
To: Andrew Morton <akpm@linux-foundation.org>,
Suren Baghdasaryan <surenb@google.com>,
Michal Hocko <mhocko@suse.com>,
Brendan Jackman <jackmanb@google.com>,
Johannes Weiner <hannes@cmpxchg.org>, Zi Yan <ziy@nvidia.com>,
David Rientjes <rientjes@google.com>,
David Hildenbrand <david@kernel.org>,
Lorenzo Stoakes <lorenzo.stoakes@oracle.com>,
"Liam R. Howlett" <Liam.Howlett@oracle.com>,
Mike Rapoport <rppt@kernel.org>,
Joshua Hahn <joshua.hahnjy@gmail.com>,
Pedro Falcato <pfalcato@suse.de>
Cc: linux-mm@kvack.org, linux-kernel@vger.kernel.org,
Vlastimil Babka <vbabka@suse.cz>
Subject: [PATCH RFC v2 3/3] mm/page_alloc: simplify __alloc_pages_slowpath() flow
Date: Fri, 19 Dec 2025 18:38:53 +0100
Message-ID: <20251219-thp-thisnode-tweak-v2-3-0c01f231fd1c@suse.cz>
In-Reply-To: <20251219-thp-thisnode-tweak-v2-0-0c01f231fd1c@suse.cz>
The actions done before entering the main retry loop include waking up
kswapds and an allocation attempt with the precise alloc_flags.
Then in the loop we keep waking up kswapds, and we retry the allocation
with flags potentially further adjusted by being allowed to use reserves
(due to e.g. becoming an oom victim).
We can adjust the retry loop to keep only one instance of waking up
kswapds and allocation attempt. Introduce a can_retry_reserves variable
for retrying once when we become eligible for reserves. It is still
useful not to evaluate reserve_flags immediately for the first
allocation attempt, because it's better to first try to succeed in a
non-preferred zone above the min watermark before allocating immediately
from the preferred zone below the min watermark.
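For illustration only, the one-shot retry can be modelled in userspace
roughly as below. This is a sketch, not the kernel code: struct
alloc_model, try_alloc() and slowpath_model() are hypothetical names, and
the two counters free_above_wmark and free_reserves are an assumed
stand-in for the real zone and watermark logic.

```c
#include <assert.h>
#include <stdbool.h>

/* Userspace model only: every name here is hypothetical, not a kernel API. */
struct alloc_model {
	int free_above_wmark;  /* pages available without touching reserves */
	int free_reserves;     /* pages available only below the min watermark */
	bool reserves_allowed; /* models reserve_flags becoming non-zero */
	int kswapd_wakeups;    /* how often the model "woke kswapd" */
	int attempts;          /* how many allocation attempts were made */
};

/* One allocation attempt; reserves are touched only when permitted. */
static bool try_alloc(struct alloc_model *m, bool use_reserves)
{
	m->attempts++;
	if (m->free_above_wmark > 0) {
		m->free_above_wmark--;
		return true;
	}
	if (use_reserves && m->free_reserves > 0) {
		m->free_reserves--;
		return true;
	}
	return false;
}

static bool slowpath_model(struct alloc_model *m)
{
	bool use_reserves = false;       /* first attempt stays optimistic */
	bool can_retry_reserves = true;  /* one-shot flag, as in the patch */
	int passes = 0;

retry:
	passes++;
	m->kswapd_wakeups++;             /* single wakeup site inside the loop */
	if (try_alloc(m, use_reserves))
		return true;

	if (m->reserves_allowed) {
		use_reserves = true;
		/* Retry immediately, but only the first time flags change. */
		if (can_retry_reserves) {
			can_retry_reserves = false;
			goto retry;
		}
	}

	/* The one-shot flag bounds this model to two passes. */
	assert(passes <= 2);
	return false; /* the real slowpath would continue to reclaim/compaction */
}
```

In this model a caller that can still be served above the watermark never
touches the reserves, and a caller that becomes eligible for reserves gets
exactly one extra attempt before falling through.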
Additionally, move the cpuset update checks introduced by e05741fb10c3
("mm/page_alloc.c: avoid infinite retries caused by cpuset race")
further down in the retry loop. It's enough to check those only before
reaching any potentially infinite 'goto retry;' loop.
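As a rough userspace illustration of why the check only needs to precede
the unbounded 'goto retry;', consider the model below. It is a sketch
under assumptions: struct race_model, the seq counter and alloc_model_run()
are hypothetical, and a plain integer comparison stands in for
check_retry_cpuset() and check_retry_zonelist().

```c
#include <stdbool.h>

/* Userspace model only: all names are hypothetical, not kernel APIs. */
struct race_model {
	unsigned int seq;    /* bumped by a modelled cpuset/zonelist update */
	int updates_pending; /* concurrent updates the model will inject */
	int restarts;        /* times the model took "goto restart" */
	int retries;         /* times the model took "goto retry" */
};

static bool cookie_changed(const struct race_model *s, unsigned int cookie)
{
	return s->seq != cookie;
}

/* Every allocation attempt fails in this model; returns -1 when giving up. */
static int alloc_model_run(struct race_model *s, int max_retries)
{
	unsigned int cookie;

restart:
	cookie = s->seq;             /* sample the cookie, like the real restart */

retry:
	/* Inject a racing update during the (failing) attempt. */
	if (s->updates_pending > 0) {
		s->updates_pending--;
		s->seq++;
	}

	/*
	 * The check sits directly before the only unbounded "goto retry",
	 * so a stale cookie is always noticed before looping again.
	 */
	if (cookie_changed(s, cookie)) {
		s->restarts++;
		goto restart;
	}

	if (s->retries < max_retries) {
		s->retries++;
		goto retry;
	}
	return -1;
}
```

With two injected updates the model restarts twice and only then starts
consuming its retry budget, which is the property the moved check has to
preserve.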
Signed-off-by: Vlastimil Babka <vbabka@suse.cz>
---
mm/page_alloc.c | 41 +++++++++++++++++++++++------------------
1 file changed, 23 insertions(+), 18 deletions(-)
diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index cb8965fd5e20..4a68adb383b2 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -4683,6 +4683,7 @@ __alloc_pages_slowpath(gfp_t gfp_mask, unsigned int order,
unsigned int zonelist_iter_cookie;
int reserve_flags;
bool compact_first = false;
+ bool can_retry_reserves = true;
if (unlikely(nofail)) {
/*
@@ -4750,6 +4751,8 @@ __alloc_pages_slowpath(gfp_t gfp_mask, unsigned int order,
goto nopage;
}
+retry:
+ /* Ensure kswapd doesn't accidentally go to sleep as long as we loop */
if (alloc_flags & ALLOC_KSWAPD)
wake_all_kswapds(order, gfp_mask, ac);
@@ -4761,19 +4764,6 @@ __alloc_pages_slowpath(gfp_t gfp_mask, unsigned int order,
if (page)
goto got_pg;
-retry:
- /*
- * Deal with possible cpuset update races or zonelist updates to avoid
- * infinite retries.
- */
- if (check_retry_cpuset(cpuset_mems_cookie, ac) ||
- check_retry_zonelist(zonelist_iter_cookie))
- goto restart;
-
- /* Ensure kswapd doesn't accidentally go to sleep as long as we loop */
- if (alloc_flags & ALLOC_KSWAPD)
- wake_all_kswapds(order, gfp_mask, ac);
-
reserve_flags = __gfp_pfmemalloc_flags(gfp_mask);
if (reserve_flags)
alloc_flags = gfp_to_alloc_flags_cma(gfp_mask, reserve_flags) |
@@ -4788,12 +4778,18 @@ __alloc_pages_slowpath(gfp_t gfp_mask, unsigned int order,
ac->nodemask = NULL;
ac->preferred_zoneref = first_zones_zonelist(ac->zonelist,
ac->highest_zoneidx, ac->nodemask);
- }
- /* Attempt with potentially adjusted zonelist and alloc_flags */
- page = get_page_from_freelist(gfp_mask, order, alloc_flags, ac);
- if (page)
- goto got_pg;
+ /*
+ * The first time we adjust anything due to being allowed to
+ * ignore memory policies or watermarks, retry immediately. This
+ * allows us to keep the first allocation attempt optimistic so
+ * it can succeed in a zone that is still above watermarks.
+ */
+ if (can_retry_reserves) {
+ can_retry_reserves = false;
+ goto retry;
+ }
+ }
/* Caller is not willing to reclaim, we can't balance anything */
if (!can_direct_reclaim)
@@ -4857,6 +4853,15 @@ __alloc_pages_slowpath(gfp_t gfp_mask, unsigned int order,
!(gfp_mask & __GFP_RETRY_MAYFAIL)))
goto nopage;
+ /*
+ * Deal with possible cpuset update races or zonelist updates to avoid
+ * infinite retries. No "goto retry;" can go above this check unless
+ * it can execute just once.
+ */
+ if (check_retry_cpuset(cpuset_mems_cookie, ac) ||
+ check_retry_zonelist(zonelist_iter_cookie))
+ goto restart;
+
if (should_reclaim_retry(gfp_mask, order, ac, alloc_flags,
did_some_progress > 0, &no_progress_loops))
goto retry;
--
2.52.0
Thread overview:
2025-12-19 17:38 [PATCH RFC v2 0/3] tweaks for __alloc_pages_slowpath() Vlastimil Babka
2025-12-19 17:38 ` [PATCH RFC v2 1/3] mm/page_alloc: ignore the exact initial compaction result Vlastimil Babka
2025-12-19 17:38 ` [PATCH RFC v2 2/3] mm/page_alloc: refactor the initial compaction handling Vlastimil Babka
2025-12-22 7:40 ` Joshua Hahn
2025-12-19 17:38 ` Vlastimil Babka [this message]