From: Brendan Jackman <jackmanb@google.com>
To: Johannes Weiner <hannes@cmpxchg.org>
Cc: Andrew Morton <akpm@linux-foundation.org>,
	Vlastimil Babka <vbabka@suse.cz>,
	 Mel Gorman <mgorman@techsingularity.net>,
	Zi Yan <ziy@nvidia.com>, <linux-mm@kvack.org>,
	 <linux-kernel@vger.kernel.org>
Subject: Re: [PATCH 3/5] mm: page_alloc: defrag_mode
Date: Sun, 23 Mar 2025 19:04:29 +0100
Message-ID: <D8NUEJHT150J.17YZMGLU54JG7@google.com>
In-Reply-To: <20250323034657.GD1894930@cmpxchg.org>

On Sun Mar 23, 2025 at 4:46 AM CET, Johannes Weiner wrote:
> On Sat, Mar 22, 2025 at 09:34:09PM -0400, Johannes Weiner wrote:
> > On Sat, Mar 22, 2025 at 08:58:27PM -0400, Johannes Weiner wrote:
> > > On Sat, Mar 22, 2025 at 04:05:52PM +0100, Brendan Jackman wrote:
> > > > On Thu Mar 13, 2025 at 10:05 PM CET, Johannes Weiner wrote:
> > > > > +	/* Reclaim/compaction failed to prevent the fallback */
> > > > > +	if (defrag_mode) {
> > > > > +		alloc_flags &= ALLOC_NOFRAGMENT;
> > > > > +		goto retry;
> > > > > +	}
> > > > 
> > > > I can't see where ALLOC_NOFRAGMENT gets cleared, is it supposed to be
> > > > here (i.e. should this be ~ALLOC_NOFRAGMENT)?
> > 
> > Please ignore my previous email, this is actually a much more severe
> > issue than I thought at first. The screwed up clearing is bad, but
> > this will also not check the flag before retrying, which means the
> > thread will retry reclaim/compaction and never reach OOM.
> > 
> > This code has weeks of load testing, with workloads fine-tuned to
> > *avoid* OOM. A blatant OOM test shows this problem immediately.
> > 
> > A simple fix, but I'll put it through the wringer before sending it.
>
> Ok, here is the patch. I verified this with intentional OOMing 100
> times in a loop; this would previously lock up on first try in
> defrag_mode, but kills and recovers reliably with this applied.
>
> I also re-ran the full THP benchmarks, to verify that erroneous
> looping here did not accidentally contribute to fragmentation
> avoidance and thus THP success & latency rates. They were in fact not;
> the improvements claimed for defrag_mode are unchanged with this fix:

Sounds good :)

Off topic, but could you share some details about the
tests/benchmarks you're running here? Do you have any links e.g. to
the scripts you're using to run them?
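
For readers following the quoted discussion, here is a minimal
user-space sketch of the two problems described above: the missing
`~` in the flag clearing, and the retry that does not check the flag
first. It is a toy model, not the kernel code and not the actual fix.
The flag values, ALLOC_OTHER, and the helpers clear_buggy(),
clear_fixed() and attempts_until_oom() are all made up for
illustration.

```c
#include <assert.h>
#include <stdbool.h>

/* Illustrative values only; not taken from the kernel headers. */
#define ALLOC_NOFRAGMENT 0x100u
#define ALLOC_OTHER      0x001u	/* hypothetical stand-in for any other flag */

/* Buggy form: keeps only ALLOC_NOFRAGMENT and drops every other flag. */
static unsigned int clear_buggy(unsigned int flags)
{
	return flags & ALLOC_NOFRAGMENT;
}

/* Intended form: clears ALLOC_NOFRAGMENT and preserves everything else. */
static unsigned int clear_fixed(unsigned int flags)
{
	return flags & ~ALLOC_NOFRAGMENT;
}

/*
 * Toy model of the retry. Every allocation attempt is assumed to fail,
 * so the only way out is to stop retrying. Returns the number of
 * attempts made, or -1 if the loop was still retrying after
 * max_attempts (which models the lockup: OOM is never reached).
 */
static int attempts_until_oom(bool defrag_mode, bool check_flag,
			      int max_attempts)
{
	unsigned int alloc_flags = ALLOC_NOFRAGMENT | ALLOC_OTHER;
	int attempts = 0;

	assert(max_attempts > 0);
retry:
	if (++attempts > max_attempts)
		return -1;

	/* Reclaim/compaction failed to prevent the fallback */
	if (defrag_mode &&
	    (!check_flag || (alloc_flags & ALLOC_NOFRAGMENT))) {
		alloc_flags = check_flag ? clear_fixed(alloc_flags)
					 : clear_buggy(alloc_flags);
		goto retry;
	}

	return attempts;	/* would fall through to the OOM path */
}
```

With check_flag set, the model retries exactly once with the flag
cleared and then gives up; without the check it keeps retrying until
the attempt cap is hit, which matches the lockup described in the
quoted text.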



Thread overview: 32+ messages
2025-03-13 21:05 [PATCH 0/5] mm: reliable huge page allocator Johannes Weiner
2025-03-13 21:05 ` [PATCH 1/5] mm: compaction: push watermark into compaction_suitable() callers Johannes Weiner
2025-03-14 15:08   ` Zi Yan
2025-03-16  4:28   ` Hugh Dickins
2025-03-17 18:18     ` Johannes Weiner
2025-03-21  6:21   ` kernel test robot
2025-03-21 13:55     ` Johannes Weiner
2025-04-10 15:19   ` Vlastimil Babka
2025-04-10 20:17     ` Johannes Weiner
2025-04-11  7:32       ` Vlastimil Babka
2025-03-13 21:05 ` [PATCH 2/5] mm: page_alloc: trace type pollution from compaction capturing Johannes Weiner
2025-03-14 18:36   ` Zi Yan
2025-03-13 21:05 ` [PATCH 3/5] mm: page_alloc: defrag_mode Johannes Weiner
2025-03-14 18:54   ` Zi Yan
2025-03-14 20:50     ` Johannes Weiner
2025-03-14 22:54       ` Zi Yan
2025-03-22 15:05   ` Brendan Jackman
2025-03-23  0:58     ` Johannes Weiner
2025-03-23  1:34       ` Johannes Weiner
2025-03-23  3:46         ` Johannes Weiner
2025-03-23 18:04           ` Brendan Jackman [this message]
2025-03-31 15:55             ` Johannes Weiner
2025-03-13 21:05 ` [PATCH 4/5] mm: page_alloc: defrag_mode kswapd/kcompactd assistance Johannes Weiner
2025-03-13 21:05 ` [PATCH 5/5] mm: page_alloc: defrag_mode kswapd/kcompactd watermarks Johannes Weiner
2025-03-14 21:05   ` Johannes Weiner
2025-04-11  8:19   ` Vlastimil Babka
2025-04-11 15:39     ` Johannes Weiner
2025-04-11 16:51       ` Vlastimil Babka
2025-04-11 18:21         ` Johannes Weiner
2025-04-13  2:20           ` Johannes Weiner
2025-04-15  7:31             ` Vlastimil Babka
2025-04-15  7:44             ` Vlastimil Babka
