From: Joonwon Kang <joonwonkang@google.com>
To: dennis@kernel.org
Cc: akpm@linux-foundation.org, cl@gentwo.org, dodam@google.com,
joonwonkang@google.com, linux-kernel@vger.kernel.org,
linux-mm@kvack.org, tj@kernel.org
Subject: Re: [PATCH] percpu: Fix hint invariant breakage
Date: Mon, 20 Apr 2026 12:35:48 +0000
Message-ID: <20260420123548.2116177-1-joonwonkang@google.com>
In-Reply-To: <adfrXdWJeuqDLWFM@palisades.local>
> Hello,
>
> Sorry for the delay, I've been a bit sick.
>
> On Mon, Mar 23, 2026 at 02:05:14PM +0000, Joonwon Kang wrote:
> > > Hello,
> > >
> > > On Fri, Mar 20, 2026 at 11:52:14AM +0000, Joonwon Kang wrote:
> > > > The invariant "scan_hint_start > contig_hint_start if and only if
> > > > scan_hint == contig_hint" should be kept for hint management. However,
> > > > it could be broken in some cases:
> > > >
> > >
> > > First I'd just like to apologize. I spent an hour yesterday trying to
> > > remember why the invariant exists and the reality is this code is more
> > > clever than it needs to be.
> >
> > Thanks for taking time for this and sharing more context. While you are at
> > it, I have a fundamental question on the invariant. I had deliberation and
> > discussion on what benefits the invariant gets to the percpu allocator by
> > its existence. My understanding is that if we put contig_hint before
> > scan_hint when they are the same, it is more likely that contig_hint is
> > broken by a future allocation, which leads to a linear scan after the
> > scan_hint for hints update, although we could save scanning up to scan_hint
> > when contig_hint is not broken. On the other hand, if we put scan_hint
> > before contig_hint instead, it is more likely that scan_hint is broken
> > while keeping contig_hint, which does not lead to the linear scan for
> > hints update, although we could not save the scanning that could be saved
> > in the other case.
> >
> > In other words, if contig_hint breaking allocations occur a lot in general
> > with the current invariant, the performance may suffer more than without
> > the invariant. I also think that there is no strict reason for having
> > the invariant.
> >
>
> I think the original premise is that percpu memory is quite expensive, 1
> allocation costs nr_cpus * sizeof(allocation). So we do our best to bin
> pack at the cost of faster allocations. We could always just break the
> contig_hint but then over time we could cause more fragmentation.
>
> The case that triggered this was netdev needing 8 byte objects with 16
> byte alignment [1].
>

Thank you for sharing the points about bin packing. I do not yet fully
understand how breaking the contig_hint relates to the fragmentation trend, so
the case you mentioned would be a helpful reference. It looks like the link
for [1] is missing. Could you share it if you intended to include one?

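To make the cost argument above concrete for myself, here is a minimal sketch
of the arithmetic only. It is not the allocator's code: align_up(),
packed_span() and percpu_cost() are hypothetical helpers, and the model simply
places each object at the next aligned offset.

```c
#include <assert.h>
#include <stddef.h>

/* Round x up to the next multiple of align (align must be a power of two). */
static size_t align_up(size_t x, size_t align)
{
	assert(align != 0 && (align & (align - 1)) == 0);
	return (x + align - 1) & ~(align - 1);
}

/*
 * Bytes spanned in one CPU's copy when n objects of obj_size bytes are
 * placed one after another, each starting at an align-byte boundary.
 * Hypothetical model for illustration only.
 */
static size_t packed_span(size_t n, size_t obj_size, size_t align)
{
	size_t off = 0;
	size_t i;

	for (i = 0; i < n; i++)
		off = align_up(off, align) + obj_size;
	return off;
}

/* Total memory: every per-CPU byte is replicated once per CPU. */
static size_t percpu_cost(size_t per_cpu_bytes, unsigned int nr_cpus)
{
	return per_cpu_bytes * nr_cpus;
}
```

In this model, four 8 byte objects with 16 byte alignment span 56 bytes per
CPU for 32 bytes of payload, and the 8 byte holes between them stay usable
only if later small allocations are packed into them.
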
> > So, could you clarify the necessity of the invariant? If there is no strict
> > reason for it, I could post a spin-off patch that removes the invariant
> > entirely so that we can simplify the code and experiment with the result.
> > What do you think?
> >
>
> I can't really recall the exact reasoning for the invariant, but it was
> probably along the lines of wanting to not lose information if possible.
>
> Say an earlier area becomes free that is the same size as the
> contig_hint but with better alignment, we want to use that as the
> contig_hint but then we either have to lose the scan_hint or keep it
> with the invariant. Given the premise above, I believe we want to
> continue bin packing, I think the general idea of scanning next time
> around isn't the worst thing.
>
> Sadly because it's already there, and has worked for quite some time,
> it's kind of on us today to provide data / reasoning to delete it. I'd
> wager that some upcoming work is going to change how percpu gives out
> objects, perhaps through some sort of slab caching, and we can revisit
> this more in that context.
>

Understood, and thank you for the detailed explanation. I will keep the
invariant as-is unless I have clear data that justifies changing it. I recently
sent v3 of the patch set with this in mind. Please help review it ;)

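For reference, the invariant as quoted at the top of this thread can be
written as a small check. This is only a sketch: struct hint_md is a
hypothetical, trimmed-down stand-in for the real block metadata, and treating
scan_hint == 0 as "no scan hint" is my assumption here.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the per-block hint metadata. */
struct hint_md {
	int contig_hint;       /* size of the largest known free area */
	int contig_hint_start; /* offset where that area starts */
	int scan_hint;         /* size of a secondary free area; 0 = none (assumption) */
	int scan_hint_start;   /* offset where the secondary area starts */
};

/*
 * Returns true when
 *   scan_hint_start > contig_hint_start  <=>  scan_hint == contig_hint
 * holds for the given hints.
 */
static bool hint_invariant_holds(const struct hint_md *md)
{
	bool scan_after, same_size;

	if (md->scan_hint == 0) /* nothing to compare without a scan hint */
		return true;

	scan_after = md->scan_hint_start > md->contig_hint_start;
	same_size = md->scan_hint == md->contig_hint;
	return scan_after == same_size;
}
```

With these definitions, an equal-sized scan_hint placed before the
contig_hint fails the check, and so does a smaller scan_hint placed after it.
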
Thanks,
Joonwon Kang
Thread overview: 7+ messages
2026-03-20 11:52 Joonwon Kang
2026-03-20 19:08 ` Andrew Morton
2026-03-23 12:02 ` Joonwon Kang
2026-03-21 17:09 ` Dennis Zhou
2026-03-23 14:05 ` Joonwon Kang
2026-04-09 18:09 ` Dennis Zhou
2026-04-20 12:35 ` Joonwon Kang [this message]