From: "Darrick J. Wong" <djwong@kernel.org>
To: Kees Cook <kees@kernel.org>
Cc: Michal Hocko <mhocko@suse.com>,
	Shakeel Butt <shakeel.butt@linux.dev>,
	Dave Chinner <david@fromorbit.com>,
	Yafang Shao <laoar.shao@gmail.com>,
	Harry Yoo <harry.yoo@oracle.com>,
	joel.granados@kernel.org, linux-fsdevel@vger.kernel.org,
	linux-kernel@vger.kernel.org, Josef Bacik <josef@toxicpanda.com>,
	linux-mm@kvack.org, Vlastimil Babka <vbabka@suse.cz>
Subject: Re: [PATCH] mm: kvmalloc: make kmalloc fast path real fast path
Date: Fri, 4 Apr 2025 08:33:30 -0700
Message-ID: <20250404153330.GA6266@frogsfrogsfrogs>
In-Reply-To: <202504030920.EB65CCA2@keescook>
On Thu, Apr 03, 2025 at 09:21:50AM -0700, Kees Cook wrote:
> On Thu, Apr 03, 2025 at 09:43:39AM +0200, Michal Hocko wrote:
> > There are users like xfs which need larger allocations with NOFAIL
> > semantics. They are not using kvmalloc currently because the current
> > implementation tries too hard to allocate through the kmalloc path
> > which causes a lot of direct reclaim and compaction and that hurts
> > performance a lot (see 8dc9384b7d75 ("xfs: reduce kvmalloc overhead for
> > CIL shadow buffers") for more details).
> >
> > kvmalloc does support __GFP_RETRY_MAYFAIL semantics to express that
> > kmalloc (physically contiguous) allocation is preferred and we should be
> > more aggressive to make it happen. There is currently no way to express
> > that kmalloc should be very lightweight and as it has been argued [1]
> > this mode should be default to support kvmalloc(NOFAIL) with a
> > lightweight kmalloc path which is currently impossible to express as
> > __GFP_NOFAIL cannot be combined with any other reclaim modifiers.
> >
> > This patch makes all kmalloc allocations GFP_NOWAIT unless
> > __GFP_RETRY_MAYFAIL is provided to kvmalloc. This makes it possible to
> > support both fail fast and retry hard on physically contiguous memory
> > with vmalloc fallback.
> >
> > There is a potential downside that relatively small allocations (smaller
> > than PAGE_ALLOC_COSTLY_ORDER) could fallback to vmalloc too easily and
> > cause page block fragmentation. We cannot really rule that out but it
> > seems that xlog_cil_kvmalloc use doesn't indicate this to be happening.
> >
> > [1] https://lore.kernel.org/all/Z-3i1wATGh6vI8x8@dread.disaster.area/T/#u
> > Signed-off-by: Michal Hocko <mhocko@suse.com>
>
> Thanks for finding a solution for this! It makes way more sense to me to
> kick over to vmap by default for kvmalloc users.

Are 32-bit kernels still constrained by a small(ish) vmalloc space?
It's all fine for xlog_kvmalloc, which will continue looping until
something makes progress, but tuning for those platforms isn't a
priority for most xfs developers AFAIK.
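
For anyone not familiar with the "continue looping" behaviour, here is a
rough userspace model of it. This is only a sketch: try_contiguous(),
try_virtual(), alloc_retry_forever(), fail_budget and attempts are all
made-up names standing in for the kmalloc attempt, the vmalloc fallback
and their failure modes, and none of it is the actual xfs code.

```c
#include <stddef.h>
#include <stdlib.h>

/*
 * Hypothetical stand-ins for the two allocation paths. Neither is a
 * kernel API; they only model "cheap contiguous attempt" and
 * "virtually contiguous fallback".
 */
static int fail_budget;	/* how many fallback attempts should fail */
static int attempts;	/* how many fallback attempts were made */

static void *try_contiguous(size_t size)
{
	(void)size;
	return NULL;	/* model: no contiguous memory without reclaim */
}

static void *try_virtual(size_t size)
{
	attempts++;
	if (fail_budget > 0) {
		fail_budget--;
		return NULL;	/* model: mapping space exhausted for now */
	}
	return malloc(size);
}

/*
 * Keep trying both paths until one of them succeeds. The caller never
 * sees NULL, which is why a small fallback address space costs time
 * here rather than correctness.
 */
static void *alloc_retry_forever(size_t size)
{
	void *p;

	do {
		p = try_contiguous(size);
		if (!p)
			p = try_virtual(size);
	} while (!p);
	return p;
}
```

With fail_budget set to 3, a call to alloc_retry_forever() goes around
the loop four times before it returns a buffer, so an exhausted fallback
shows up as latency instead of as an allocation failure.
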
--D
> > ---
> > mm/slub.c | 8 +++++---
> > 1 file changed, 5 insertions(+), 3 deletions(-)
> >
> > diff --git a/mm/slub.c b/mm/slub.c
> > index b46f87662e71..2da40c2f6478 100644
> > --- a/mm/slub.c
> > +++ b/mm/slub.c
> > @@ -4972,14 +4972,16 @@ static gfp_t kmalloc_gfp_adjust(gfp_t flags, size_t size)
> > * We want to attempt a large physically contiguous block first because
> > * it is less likely to fragment multiple larger blocks and therefore
> > * contribute to a long term fragmentation less than vmalloc fallback.
> > - * However make sure that larger requests are not too disruptive - no
> > - * OOM killer and no allocation failure warnings as we have a fallback.
> > + * However make sure that larger requests are not too disruptive - i.e.
> > + * do not direct reclaim unless physically continuous memory is preferred
> > + * (__GFP_RETRY_MAYFAIL mode). We still kick in kswapd/kcompactd to start
> > + * working in the background but the allocation itself.
>
> I think a word is missing here? "...but do the allocation..." or
> "...allocation itself happens" ?
>
> --
> Kees Cook
>
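
The flag handling described in the quoted changelog and comment can be
modelled in userspace roughly as below. This is a sketch under stated
assumptions, not the hunk from the patch (the diff quoted above only
shows the comment): the MODEL_* bit values are invented, the
"larger than one page" threshold is assumed from the "larger requests"
wording, and dropping the no-fail bit on the contiguous attempt is
assumed on the basis that the fallback is what provides that guarantee.

```c
#include <stddef.h>

typedef unsigned int gfp_t;

/* Invented bit values for illustration; the kernel's real ones differ. */
#define MODEL_DIRECT_RECLAIM	0x01u	/* caller may reclaim synchronously */
#define MODEL_KSWAPD_RECLAIM	0x02u	/* may wake background reclaim */
#define MODEL_RETRY_MAYFAIL	0x04u	/* contiguous memory is preferred */
#define MODEL_NOFAIL		0x08u	/* allocation must not fail */
#define MODEL_NOWARN		0x10u	/* no failure warning */
#define MODEL_PAGE_SIZE		4096u	/* assumed threshold */

/*
 * Adjust the flags used for the physically contiguous attempt only.
 * The fallback path would still see the caller's original flags.
 */
static gfp_t model_kmalloc_gfp_adjust(gfp_t flags, size_t size)
{
	if (size <= MODEL_PAGE_SIZE)
		return flags;	/* small requests are left alone */

	/* A fallback exists, so a failure here is not worth a warning. */
	flags |= MODEL_NOWARN;

	/*
	 * Fail fast by default: no direct reclaim, but background
	 * reclaim may still be woken. Callers that prefer contiguous
	 * memory opt back in with MODEL_RETRY_MAYFAIL.
	 */
	if (!(flags & MODEL_RETRY_MAYFAIL))
		flags &= ~MODEL_DIRECT_RECLAIM;

	/* Assumption: the no-fail guarantee comes from the fallback. */
	flags &= ~MODEL_NOFAIL;

	return flags;
}
```

In this model a default caller loses only the direct-reclaim bit for a
multi-page request and keeps background reclaim, while a caller passing
MODEL_RETRY_MAYFAIL keeps both.
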