From: Lorenzo Stoakes <lorenzo.stoakes@oracle.com>
To: Kent Overstreet <kent.overstreet@linux.dev>
Cc: linux-bcachefs@vger.kernel.org, linux-mm@kvack.org,
Vlastimil Babka <vbabka@suse.cz>,
Andrew Morton <akpm@linux-foundation.org>,
Linus Torvalds <torvalds@linux-foundation.org>,
Uladzislau Rezki <urezki@gmail.com>,
Christoph Hellwig <hch@infradead.org>
Subject: Re: [PATCH] mm: Drop INT_MAX limit from kvmalloc()
Date: Sun, 20 Oct 2024 17:44:37 +0100
Message-ID: <ebe58a08-2371-403c-a0d7-6280160b6848@lucifer.local>
In-Reply-To: <ucbcmk4kr7tbvj6uqnjke5662kerawbr4ghtlbw4zim7mdm3fg@vjawfne5rdkp>
+cc Linus, vmalloc reviewers
On Sun, Oct 20, 2024 at 09:00:07AM -0400, Kent Overstreet wrote:
> On Sun, Oct 20, 2024 at 12:45:33PM +0100, Lorenzo Stoakes wrote:
> > On Sat, Oct 19, 2024 at 05:00:37PM -0400, Kent Overstreet wrote:
> > > A user with a 75 TB filesystem reported the following journal replay
> > > error:
> > > https://github.com/koverstreet/bcachefs/issues/769
> > >
> > > In journal replay we have to sort and dedup all the keys from the
> > > journal, which means we need a large contiguous allocation. Given that
> > > the user has 128GB of ram, the 2GB limit on allocation size has become
> > > far too small.
> > >
> > > Cc: Vlastimil Babka <vbabka@suse.cz>
> > > Cc: Andrew Morton <akpm@linux-foundation.org>
> > > Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
> > > ---
> > > mm/util.c | 6 ------
> > > 1 file changed, 6 deletions(-)
> > >
> > > diff --git a/mm/util.c b/mm/util.c
> > > index 4f1275023eb7..c60df7723096 100644
> > > --- a/mm/util.c
> > > +++ b/mm/util.c
> > > @@ -665,12 +665,6 @@ void *__kvmalloc_node_noprof(DECL_BUCKET_PARAMS(size, b), gfp_t flags, int node)
> > > if (!gfpflags_allow_blocking(flags))
> > > return NULL;
> > >
> > > - /* Don't even allow crazy sizes */
> > > - if (unlikely(size > INT_MAX)) {
> > > - WARN_ON_ONCE(!(flags & __GFP_NOWARN));
> > > - return NULL;
> > > - }
> > > -
> >
> > Err, and not replace it with _any_ limit? That seems very unwise.
>
> large allocations will go to either the page allocator or vmalloc, and
> they have their own limits.
Ah, actually I misread this. I see the allocation gets sent off to
__kmalloc_node_noprof() first, which applies its own limits, and the
INT_MAX check only runs afterwards, just before the vmalloc call.
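To make that ordering concrete, here is a rough userspace model of the
control flow, not the kernel code itself. The enum, the function name and
the boolean parameters are made up for illustration, and 4 KiB pages are
assumed:

```c
#include <limits.h>
#include <stdbool.h>
#include <stdint.h>

#define MODEL_PAGE_SIZE 4096ULL /* assumption: 4 KiB pages */

/* Where a request ends up in this toy model (hypothetical names). */
enum kv_path {
	KV_PATH_KMALLOC,  /* the kmalloc attempt succeeded */
	KV_PATH_VMALLOC,  /* fell back to vmalloc */
	KV_PATH_NULL,     /* NULL returned, no fallback possible */
	KV_PATH_REJECTED, /* refused by the INT_MAX check */
};

/*
 * kmalloc_ok stands in for the result of the kmalloc attempt, can_block
 * for gfpflags_allow_blocking(flags), and int_max_check selects whether
 * the check removed by the patch is still present.
 */
static enum kv_path kv_model_path(uint64_t size, bool kmalloc_ok,
				  bool can_block, bool int_max_check)
{
	/* The slab/page allocator is always tried first. */
	if (kmalloc_ok)
		return KV_PATH_KMALLOC;

	/* No vmalloc fallback for requests of a page or less. */
	if (size <= MODEL_PAGE_SIZE)
		return KV_PATH_NULL;

	/* vmalloc cannot serve non-blocking allocations. */
	if (!can_block)
		return KV_PATH_NULL;

	/* The check under discussion sits here, after the kmalloc attempt. */
	if (int_max_check && size > (uint64_t)INT_MAX)
		return KV_PATH_REJECTED;

	return KV_PATH_VMALLOC;
}
```

In this model a 3 GiB request whose kmalloc attempt fails is rejected with
the check in place and reaches vmalloc without it, which is the behaviour
change the patch is after.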
We actually do have a basic check in __vmalloc_node_range_noprof() that
prevents _totally_ insane requests, checking that size >> PAGE_SHIFT <=
totalram_pages(), so we shouldn't get anything too stupid here (I am
thinking especially of ptr + size overflow type situations).
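As a minimal sketch of the two conditions mentioned above (the RAM-based
size check and the ptr + size wraparound), again in plain userspace C. The
helper names are hypothetical, totalram_pages is passed in as a parameter
rather than read from the kernel, and 4 KiB pages are assumed:

```c
#include <stdbool.h>
#include <stdint.h>

#define MODEL_PAGE_SHIFT 12 /* assumption: 4 KiB pages */

/*
 * Mirrors the shape of the check described above: a request is only
 * considered if it needs no more pages than the machine has in total.
 */
static bool vmalloc_size_plausible(uint64_t size, uint64_t totalram_pages)
{
	return (size >> MODEL_PAGE_SHIFT) <= totalram_pages;
}

/* True if start + size wraps around the 64-bit address space. */
static bool range_wraps(uint64_t start, uint64_t size)
{
	return start + size < start;
}
```

With 128GB of RAM, (128ULL << 30) >> 12 gives 33554432 pages, so a 3 GiB
request passes the first check while a 256 GiB request does not.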
But Linus explicitly introduced this INT_MAX check in commit 7661809d493b
("mm: don't allow oversized kvmalloc() calls"), presumably for a reason, so
I have cc'd him here in case he objects to this, which amounts to a revert
of that patch.
Assuming Linus doesn't object, I don't see how this is really doing
anything different than just invoking __vmalloc_node_range_noprof()
directly, which we do in quite a few places anyway?
I guess let's wait and see what he says or if Vlastimil/the vmalloc
reviewers have any thoughts on this.
But looks sane to me otherwise.
>
> although I should have a look at that, and make sure we're not
> triggering the > MAX_ORDER warning in the page allocator unnecessarily
> when we could just call vmalloc().
OK, I guess that would be a check prior to invoking __kmalloc_node_noprof()
-> ... -> __alloc_pages_noprof() and WARN_ON_ONCE_GFP(order >
MAX_PAGE_ORDER, gfp)?
Sounds sensible.
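One way such a pre-check could look, sketched as userspace arithmetic
rather than as a proposed kernel change. The helper names are hypothetical,
and both the 4 KiB page size and MAX_PAGE_ORDER == 10 are assumptions that
vary by architecture and config:

```c
#include <stdbool.h>
#include <stdint.h>

#define MODEL_PAGE_SHIFT 12     /* assumption: 4 KiB pages */
#define MODEL_MAX_PAGE_ORDER 10 /* assumption: a common default */

/* Smallest order such that (1 << order) pages cover size bytes. */
static unsigned int model_get_order(uint64_t size)
{
	/* Round up to whole pages without overflowing on huge sizes. */
	uint64_t pages = (size >> MODEL_PAGE_SHIFT) +
			 ((size & ((1ULL << MODEL_PAGE_SHIFT) - 1)) != 0);
	unsigned int order = 0;

	while ((1ULL << order) < pages)
		order++;
	return order;
}

/*
 * True if the page allocator could never satisfy the request, so the
 * kmalloc attempt could be skipped in favour of vmalloc.
 */
static bool model_skip_kmalloc(uint64_t size)
{
	return model_get_order(size) > MODEL_MAX_PAGE_ORDER;
}
```

Under these assumptions the cutoff is 4 MiB: anything larger maps to
order 11 or above and would go straight to the vmalloc path.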