From: Mateusz Guzik <mjguzik@gmail.com>
To: Dennis Zhou <dennis@kernel.org>
Cc: linux-kernel@vger.kernel.org, tj@kernel.org, cl@linux.com,
akpm@linux-foundation.org, shakeelb@google.com,
linux-mm@kvack.org
Subject: Re: [PATCH 0/2] execve scalability issues, part 1
Date: Mon, 21 Aug 2023 23:39:51 +0200 [thread overview]
Message-ID: <20230821213951.bx3yyqh7omdvpyae@f> (raw)
In-Reply-To: <ZOPSEJTzrow8YFix@snowbird>
On Mon, Aug 21, 2023 at 02:07:28PM -0700, Dennis Zhou wrote:
> On Mon, Aug 21, 2023 at 10:28:27PM +0200, Mateusz Guzik wrote:
> > With this out of the way I'll be looking at some form of caching to
> > eliminate these allocs as a problem.
> >
>
> I'm not against caching, this is just my first thought. Caching will
> have an impact on the backing pages of percpu. All it takes is 1
> allocation on a page for the current allocator to pin n pages of memory.
> A few years ago percpu depopulation was implemented so that limits the
> amount of resident backing pages.
>
I'm painfully aware.
> Maybe the right thing to do is preallocate pools of common sized
> allocations so that way they can be recycled such that we don't have to
> think too hard about fragmentation that can occur if we populate these
> pools over time?
>
This is what I was going to suggest :)
FreeBSD has a per-CPU allocator which pretends to be the same as the
slab allocator, except it hands out per-CPU buffers. So far it supports
sizes of 4, 8, 16, 32 and 64 bytes, and you can allocate as if you were
mallocing at that size. It scales perfectly fine, of course, since it
caches objects per CPU, but there is some waste and I have no idea how
it compares to what Linux is doing on that front.
I stress, though, that even if you were to carve out certain sizes, a
global lock handling the operations will still kill scalability.
Perhaps a granularity finer than global, but coarser than per-CPU,
would be the sweet spot for scalability vs memory waste.
That said...
> Also as you've pointed out, it wasn't just the percpu allocation being
> the bottleneck, but percpu_counter's global lock too for hotplug
> support. I'm hazarding a guess most use cases of percpu might have
> additional locking requirements too such as percpu_counter.
>
True Fix(tm) is a longer story.
Maybe let's sort out this patchset first, whichever way. :)
> Thanks,
> Dennis
>
> > Thoughts?
> >
> > Mateusz Guzik (2):
> > pcpcntr: add group allocation/free
> > fork: group allocation of per-cpu counters for mm struct
> >
> > include/linux/percpu_counter.h | 19 ++++++++---
> > kernel/fork.c | 13 ++------
> > lib/percpu_counter.c | 61 ++++++++++++++++++++++++----------
> > 3 files changed, 60 insertions(+), 33 deletions(-)
> >
> > --
> > 2.39.2
> >