From: Joshua Hahn <joshua.hahnjy@gmail.com>
To: Andrew Morton <akpm@linux-foundation.org>
Cc: Daniel Palmer <daniel@0x0f.com>,
Matthew Wilcox <willy@infradead.org>,
Brendan Jackman <jackmanb@google.com>,
Johannes Weiner <hannes@cmpxchg.org>,
Michal Hocko <mhocko@suse.com>,
Suren Baghdasaryan <surenb@google.com>,
Vlastimil Babka <vbabka@suse.cz>, Zi Yan <ziy@nvidia.com>,
linux-kernel@vger.kernel.org, linux-mm@kvack.org,
kernel-team@meta.com
Subject: Re: [PATCH v4] mm/page_alloc: prevent reporting pcp->batch = 0
Date: Tue, 16 Dec 2025 21:18:02 -0800 [thread overview]
Message-ID: <20251217051802.86144-1-joshua.hahnjy@gmail.com> (raw)
In-Reply-To: <20251216102205.1d008d1432446956b079bfff@linux-foundation.org>
On Tue, 16 Dec 2025 10:22:05 -0800 Andrew Morton <akpm@linux-foundation.org> wrote:
> On Tue, 16 Dec 2025 06:48:11 -0800 Joshua Hahn <joshua.hahnjy@gmail.com> wrote:
>
> > zone_batchsize returns the appropriate value that should be used for
> > pcp->batch. If it finds a zone with less than 4096 pages or PAGE_SIZE >
> > 1M, however, it leads to some incorrect math.
> >
> > In the case above, we will get an intermediary value of 1, which is then
> > rounded down to the nearest power of two, and 1 is subtracted from it.
> > Since 1 is already a power of two, we will get batch = 1-1 = 0:
> >
> > batch = rounddown_pow_of_two(batch + batch/2) - 1;
> >
> > A pcp->batch value of 0 is nonsensical, for MMU systems. If this were
> > actually set, then functions like drain_zone_pages would become no-ops,
> > since they would free 0 pages at a time.
> >
> > Of the two callers of zone_batchsize, the one that is actually used to
> > set pcp->batch works around this by setting pcp->batch to the maximum
> > of 1 and zone_batchsize. However, the other caller, zone_pcp_init,
> > incorrectly prints out the batch size of the zone to be 0.
> >
> > This is probably rare in a typical zone, but the DMA zone can often have
> > less than 4096 pages, which means it will print out "LIFO batch:0".
> >
> > Before: [ 0.001216] DMA zone: 3998 pages, LIFO batch:0
> > After: [ 0.001210] DMA zone: 3998 pages, LIFO batch:1
> >
> > With all of this said, NOMMU differs in two ways. Semantically, it
> > should report that pcp->batch is 0. At the same time, it can never
> > really have a pcp->batch size of 0 since it will reach a deadlock in
> > pcp freeing functions. For this reason, zone_batchsize should still
> > report 0 for NOMMU, but zone_set_pageset_high_and_batch should still
> > interpret it as 1, meaning we cannot get rid of max(1, zone_batchsize())
> > in zone_set_pageset_high_and_batch.
> >
> > Suggested-by: Daniel Palmer <daniel@0x0f.com>
> > Signed-off-by: Joshua Hahn <joshua.hahnjy@gmail.com>
> > ---
> > Reviewers' note:
> >
> > This patch was originally a part of the 6.19-rc1 pr, but Daniel Palmer
> > kindly reported that this patch causes an issue on NOMMU systems [1].
> > Thank you, Daniel! I wasn't sure how to credit here since it was a
> > report on an unmerged commit so I went with suggested-by. If this is
> > problematic please let me know and I will change the tag.
> >
> > [1] https://lore.kernel.org/all/CAFr9PX=_HaM3_xPtTiBn5Gw5-0xcRpawpJ02NStfdr0khF2k7g@mail.gmail.com/
> >
> > Reviewer's note (to Andrew):
> >
> > This replaces commit 2/2 of the series titled "mm/page_alloc: pcp->batch
> > cleanups" [2].
>
> That series is in mainline. 2783088ef24e ("mm/page_alloc: prevent
> reporting pcp->batch = 0").
Hello Andrew,
Sorry again, this mix-up was also avoidable. I'll be more careful in the future.
> > --- a/mm/page_alloc.c
> > +++ b/mm/page_alloc.c
> > @@ -5888,8 +5888,8 @@ static int zone_batchsize(struct zone *zone)
> > * and zone lock contention.
> > */
> > batch = min(zone_managed_pages(zone) >> 12, SZ_256K / PAGE_SIZE);
> > - if (batch < 1)
> > - batch = 1;
> > + if (batch <= 1)
> > + return 1;
> >
> > /*
> > * Clamp the batch to a 2^n - 1 value. Having a power
> >
>
> So this doesn't work. Please send along a fix for current -linus and include
>
> Fixes: 2783088ef24e ("mm/page_alloc: prevent reporting pcp->batch = 0")
>
> and Reported-by:Daniel and the appropriate Closes:
>
> Thanks.
Will do, thank you for your patience. I hope you have a great day!
Joshua
prev parent reply other threads:[~2025-12-17 5:18 UTC|newest]
Thread overview: 3+ messages / expand[flat|nested] mbox.gz Atom feed top
2025-12-16 14:48 Joshua Hahn
2025-12-16 18:22 ` Andrew Morton
2025-12-17 5:18 ` Joshua Hahn [this message]
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20251217051802.86144-1-joshua.hahnjy@gmail.com \
--to=joshua.hahnjy@gmail.com \
--cc=akpm@linux-foundation.org \
--cc=daniel@0x0f.com \
--cc=hannes@cmpxchg.org \
--cc=jackmanb@google.com \
--cc=kernel-team@meta.com \
--cc=linux-kernel@vger.kernel.org \
--cc=linux-mm@kvack.org \
--cc=mhocko@suse.com \
--cc=surenb@google.com \
--cc=vbabka@suse.cz \
--cc=willy@infradead.org \
--cc=ziy@nvidia.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox