From: Harry Yoo <harry.yoo@oracle.com>
To: Chris Bainbridge <chris.bainbridge@gmail.com>
Cc: vbabka@suse.cz, surenb@google.com, hao.li@linux.dev,
leitao@debian.org, Liam.Howlett@oracle.com, zhao1.liu@intel.com,
linux-kernel@vger.kernel.org, linux-mm@kvack.org,
linux-btrfs@vger.kernel.org, regressions@lists.linux.dev
Subject: Re: [REGRESSION] kswapd0: page allocation failure (bisected to "slab: add sheaves to most caches")
Date: Mon, 23 Feb 2026 20:59:30 +0900
Message-ID: <aZxBIpE8R8DxO4eJ@hyeyoo>
In-Reply-To: <aZw2LyOjxMc-c3dl@debian.local>

On Mon, Feb 23, 2026 at 11:12:47AM +0000, Chris Bainbridge wrote:
> On Mon, Feb 23, 2026 at 05:41:17PM +0900, Harry Yoo wrote:
> > On Sun, Feb 22, 2026 at 09:36:58PM +0000, Chris Bainbridge wrote:
> > > Hi,
> > >
> > > The latest mainline kernel (v6.19-11831-ga95f71ad3e2e) has page
> > > allocation failures when doing things like compiling a kernel. I can
> > > also reproduce this with a stress test like
> > > `stress-ng --vm 2 --vm-bytes 110% --verify -v`
> >
> > Hi, thanks for the report!
> >
> > > [ 104.032925] kswapd0: page allocation failure: order:0, mode:0xc0c40(GFP_NOFS|__GFP_COMP|__GFP_NOMEMALLOC), nodemask=(null),cpuset=/,mems_allowed=0
> > > [ 104.033307] CPU: 4 UID: 0 PID: 156 Comm: kswapd0 Not tainted 6.19.0-rc5-00027-g40fd0acc45d0 #435 PREEMPT(voluntary)
> > > [ 104.033312] Hardware name: HP HP Pavilion Aero Laptop 13-be0xxx/8916, BIOS F.17 12/18/2024
> > > [ 104.033314] Call Trace:
> > > [ 104.033316] <TASK>
> > > [ 104.033319] dump_stack_lvl+0x6a/0x90
> > > [ 104.033328] warn_alloc.cold+0x95/0x1af
> > > [ 104.033334] ? zone_watermark_ok+0x80/0x80
> > > [ 104.033350] __alloc_frozen_pages_noprof+0xec3/0x2470
> > > [ 104.033353] ? __lock_acquire+0x489/0x2600
> > > [ 104.033359] ? stack_access_ok+0x1c0/0x1c0
> > > [ 104.033367] ? warn_alloc+0x1d0/0x1d0
> > > [ 104.033371] ? __lock_acquire+0x489/0x2600
> > > [ 104.033375] ? _raw_spin_unlock_irqrestore+0x48/0x60
> > > [ 104.033379] ? _raw_spin_unlock_irqrestore+0x48/0x60
> > > [ 104.033382] ? lockdep_hardirqs_on+0x78/0x100
> > > [ 104.033394] allocate_slab+0x2b7/0x510
> > > [ 104.033399] refill_objects+0x25d/0x380
> > > [ 104.033407] __pcs_replace_empty_main+0x193/0x5f0
> > > [ 104.033412] kmem_cache_alloc_noprof+0x5b6/0x6f0
> > > [ 104.033415] ? alloc_extent_state+0x1b/0x210 [btrfs]
> > > [ 104.033479] alloc_extent_state+0x1b/0x210 [btrfs]
> > > [ 104.033527] btrfs_clear_extent_bit_changeset+0x2be/0x9c0 [btrfs]
> >
> > Hmm, while the bisect points to commit e47c897a2949 ("slab: add
> > sheaves to most caches") as the first bad commit,
> >
> > I think the caller is supposed to specify __GFP_NOWARN if it doesn't
> > care about allocation failures?
> >
> > btrfs_clear_extent_bit_changeset() says:
> > > if (!prealloc) {
> > > /*
> > > * Don't care for allocation failure here because we might end
> > > * up not needing the pre-allocated extent state at all, which
> > > * is the case if we only have in the tree extent states that
> > > cover our input range and don't cover any other range.
> > > * If we end up needing a new extent state we allocate it later.
> > > */
> > > prealloc = alloc_extent_state(mask);
> > > }
> >
> > Oh wait, I see what's going on. The bisection pointed to that commit
> > because slab tries to refill sheaves with __GFP_NOMEMALLOC (and then
> > falls back to the slowpath if that fails).
> >
> > Since failing to refill sheaves doesn't mean the allocation will fail,
> > slab should specify __GFP_NOWARN along with __GFP_NOMEMALLOC as long
> > as there's a fallback method.
> >
> > But for __prefill_sheaf_pfmemalloc(), it should specify __GFP_NOWARN on
> > the first attempt only when gfp_pfmemalloc_allowed() returns true.
>
> Is this fix sufficient to do the right thing? I tested it, and it does
> appear to prevent logging of the allocation failures for my test case.

I think we should do both: 1) setting __GFP_NOWARN from the btrfs side,
and 2) making slab try to refill sheaves with __GFP_NOWARN when
there's a fallback path.

I'm writing a fix for 2) and I'll send it soon.
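
Roughly, the direction for 2) is the sketch below (untested and
illustrative only: the helper names are made up, the real change will
be against the refill paths like refill_objects(), and only
gfp_pfmemalloc_allowed() is an existing kernel function):

static gfp_t sheaf_refill_gfp(gfp_t gfp)
{
	/*
	 * Failing to refill a sheaf is not fatal: we just fall back
	 * to the regular slab slowpath. So don't dip into memory
	 * reserves for the refill, and don't warn when it fails.
	 */
	return gfp | __GFP_NOMEMALLOC | __GFP_NOWARN;
}

static gfp_t prefill_first_attempt_gfp(gfp_t gfp)
{
	/*
	 * For __prefill_sheaf_pfmemalloc(), the first attempt has a
	 * fallback (the second, reserve-using attempt) only when
	 * gfp_pfmemalloc_allowed() returns true, so keep the failure
	 * silent only in that case.
	 */
	if (gfp_pfmemalloc_allowed(gfp))
		return gfp | __GFP_NOMEMALLOC | __GFP_NOWARN;
	return gfp;
}
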
> diff --git a/fs/btrfs/extent-io-tree.c b/fs/btrfs/extent-io-tree.c
> index d0dd50f7d279..d2e1083848e8 100644
> --- a/fs/btrfs/extent-io-tree.c
> +++ b/fs/btrfs/extent-io-tree.c
> @@ -641,7 +641,7 @@ int btrfs_clear_extent_bit_changeset(struct extent_io_tree *tree, u64 start, u64
> * cover our input range and don't cover too any other range.
> * If we end up needing a new extent state we allocate it later.
> */
> - prealloc = alloc_extent_state(mask);
> + prealloc = alloc_extent_state(mask | __GFP_NOWARN);

This seems like the right thing to do to me, but as I'm not familiar
with btrfs, I'll let the btrfs folks comment on it :)

> }
>
> spin_lock(&tree->lock);
--
Cheers,
Harry / Hyeonggon