linux-mm.kvack.org archive mirror
 help / color / mirror / Atom feed
From: Johannes Weiner <hannes@cmpxchg.org>
To: Brendan Jackman <jackmanb@google.com>
Cc: Andrew Morton <akpm@linux-foundation.org>,
	Vlastimil Babka <vbabka@suse.cz>,
	Mel Gorman <mgorman@techsingularity.net>, Zi Yan <ziy@nvidia.com>,
	linux-mm@kvack.org, linux-kernel@vger.kernel.org
Subject: Re: [PATCH 3/5] mm: page_alloc: defrag_mode
Date: Sat, 22 Mar 2025 20:58:23 -0400	[thread overview]
Message-ID: <20250323005823.GB1894930@cmpxchg.org> (raw)
In-Reply-To: <D8MVZ8L12HJN.1LN4G4H0ESLY6@google.com>

On Sat, Mar 22, 2025 at 04:05:52PM +0100, Brendan Jackman wrote:
> On Thu Mar 13, 2025 at 10:05 PM CET, Johannes Weiner wrote:
> > +	/* Reclaim/compaction failed to prevent the fallback */
> > +	if (defrag_mode) {
> > +		alloc_flags &= ALLOC_NOFRAGMENT;
> > +		goto retry;
> > +	}
> 
> I can't see where ALLOC_NOFRAGMENT gets cleared, is it supposed to be
> here (i.e. should this be ~ALLOC_NOFRAGMENT)?

Yes, it should be. Thanks for catching that.

Note that this happens right before OOM, and __alloc_pages_may_oom()
does another allocation attempt without the flag set. In fact, I was
briefly debating whether I need the explicit retry here at all, but
then decided it's clearer and more future proof than quietly relying
on that OOM attempt, which is really only there to check for racing
frees. But this is most likely what hid this during testing.

What might be more of an issue is retrying without ALLOC_CPUSET and
then potentially violating cgroup placement rules too readily -
e.g. OOM only does that for __GFP_NOFAIL.

---

From e81c2086ee8e4b9f2750b821e104d3b5174b81f2 Mon Sep 17 00:00:00 2001
From: Johannes Weiner <hannes@cmpxchg.org>
Date: Sat, 22 Mar 2025 19:21:45 -0400
Subject: [PATCH] mm: page_alloc: fix defrag_mode's last allocation before OOM

Brendan points out that defrag_mode doesn't properly clear
ALLOC_NOFRAGMENT on its last-ditch attempt to allocate.

This is not necessarily a practical issue because it's followed by
__alloc_pages_may_oom(), which does its own attempt at the freelist
without ALLOC_NOFRAGMENT set. However, this is restricted to the high
watermark instead of the usual min mark (since it's merely to check
for racing frees). While this usually works - we just ran a full set
of reclaim/compaction, after all, and likely failed due to a lack of
pageblocks rather than watermarks - it's not as reliable as intended.

A more practical implication is retrying with the other flags cleared,
which means ALLOC_CPUSET is cleared, which can violate placement rules
defined by cgroup policy - OOM usually only does this for GFP_NOFAIL.

Reported-by: Brendan Jackman <jackmanb@google.com>
Signed-off-by: Johannes Weiner <hannes@cmpxchg.org>
---
 mm/page_alloc.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index 0c01998cb3a0..b9ee0c00eea5 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -4544,7 +4544,7 @@ __alloc_pages_slowpath(gfp_t gfp_mask, unsigned int order,
 
 	/* Reclaim/compaction failed to prevent the fallback */
 	if (defrag_mode) {
-		alloc_flags &= ALLOC_NOFRAGMENT;
+		alloc_flags &= ~ALLOC_NOFRAGMENT;
 		goto retry;
 	}
 
-- 
2.49.0



  reply	other threads:[~2025-03-23  0:58 UTC|newest]

Thread overview: 32+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2025-03-13 21:05 [PATCH 0/5] mm: reliable huge page allocator Johannes Weiner
2025-03-13 21:05 ` [PATCH 1/5] mm: compaction: push watermark into compaction_suitable() callers Johannes Weiner
2025-03-14 15:08   ` Zi Yan
2025-03-16  4:28   ` Hugh Dickins
2025-03-17 18:18     ` Johannes Weiner
2025-03-21  6:21   ` kernel test robot
2025-03-21 13:55     ` Johannes Weiner
2025-04-10 15:19   ` Vlastimil Babka
2025-04-10 20:17     ` Johannes Weiner
2025-04-11  7:32       ` Vlastimil Babka
2025-03-13 21:05 ` [PATCH 2/5] mm: page_alloc: trace type pollution from compaction capturing Johannes Weiner
2025-03-14 18:36   ` Zi Yan
2025-03-13 21:05 ` [PATCH 3/5] mm: page_alloc: defrag_mode Johannes Weiner
2025-03-14 18:54   ` Zi Yan
2025-03-14 20:50     ` Johannes Weiner
2025-03-14 22:54       ` Zi Yan
2025-03-22 15:05   ` Brendan Jackman
2025-03-23  0:58     ` Johannes Weiner [this message]
2025-03-23  1:34       ` Johannes Weiner
2025-03-23  3:46         ` Johannes Weiner
2025-03-23 18:04           ` Brendan Jackman
2025-03-31 15:55             ` Johannes Weiner
2025-03-13 21:05 ` [PATCH 4/5] mm: page_alloc: defrag_mode kswapd/kcompactd assistance Johannes Weiner
2025-03-13 21:05 ` [PATCH 5/5] mm: page_alloc: defrag_mode kswapd/kcompactd watermarks Johannes Weiner
2025-03-14 21:05   ` Johannes Weiner
2025-04-11  8:19   ` Vlastimil Babka
2025-04-11 15:39     ` Johannes Weiner
2025-04-11 16:51       ` Vlastimil Babka
2025-04-11 18:21         ` Johannes Weiner
2025-04-13  2:20           ` Johannes Weiner
2025-04-15  7:31             ` Vlastimil Babka
2025-04-15  7:44             ` Vlastimil Babka

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20250323005823.GB1894930@cmpxchg.org \
    --to=hannes@cmpxchg.org \
    --cc=akpm@linux-foundation.org \
    --cc=jackmanb@google.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-mm@kvack.org \
    --cc=mgorman@techsingularity.net \
    --cc=vbabka@suse.cz \
    --cc=ziy@nvidia.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox