linux-mm.kvack.org archive mirror
 help / color / mirror / Atom feed
From: Matthew Wilcox <willy@infradead.org>
To: Michal Hocko <mhocko@suse.com>
Cc: Dave Chinner <david@fromorbit.com>,
	Yafang Shao <laoar.shao@gmail.com>,
	Harry Yoo <harry.yoo@oracle.com>, Kees Cook <kees@kernel.org>,
	joel.granados@kernel.org, linux-fsdevel@vger.kernel.org,
	linux-kernel@vger.kernel.org, Josef Bacik <josef@toxicpanda.com>,
	linux-mm@kvack.org, Vlastimil Babka <vbabka@suse.cz>
Subject: Re: [PATCH] proc: Avoid costly high-order page allocations when reading proc files
Date: Wed, 2 Apr 2025 18:24:10 +0100	[thread overview]
Message-ID: <Z-1yui_QgubgRAmL@casper.infradead.org> (raw)
In-Reply-To: <Z-0sjd8SEtldbxB1@tiehlicka>

On Wed, Apr 02, 2025 at 02:24:45PM +0200, Michal Hocko wrote:
> On Wed 02-04-25 22:32:14, Dave Chinner wrote:
> > > > > >+    /*
> > > > > >+     * Use vmalloc if the count is too large to avoid costly high-order page
> > > > > >+     * allocations.
> > > > > >+     */
> > > > > >+    if (count < (PAGE_SIZE << PAGE_ALLOC_COSTLY_ORDER))
> > > > > >+            kbuf = kvzalloc(count + 1, GFP_KERNEL);
> > > > >
> > > > > Why not move this check into kvmalloc family?
> > > >
> > > > Hmm should this check really be in kvmalloc family?
> > > 
> > > Modifying the existing kvmalloc functions risks performance regressions.
> > > Could we instead introduce a new variant like vkmalloc() (favoring
> > > vmalloc over kmalloc) or kvmalloc_costless()?
> > 
> > We should fix kvmalloc() instead of continuing to force
> > subsystems to work around the limitations of kvmalloc().
> 
> Agreed!
> 
> > Have a look at xlog_kvmalloc() in XFS. It implements a basic
> > fast-fail, no retry high order kmalloc before it falls back to
> > vmalloc by turning off direct reclaim for the kmalloc() call.
> > Hence if the there isn't a high-order page on the free lists ready
> > to allocate, it falls back to vmalloc() immediately.

... but if vmalloc fails, it goes around again!  This is exactly why
we don't want filesystems implementing workarounds for MM problems.
What a mess.

>  	if (size > PAGE_SIZE) {
>  		flags |= __GFP_NOWARN;
>  
>  		if (!(flags & __GFP_RETRY_MAYFAIL))
>  			flags |= __GFP_NORETRY;
> +		else
> +			flags &= ~__GFP_DIRECT_RECLAIM;

I think it might be better to do this:

		flags |= __GFP_NOWARN;

		if (!(flags & __GFP_RETRY_MAYFAIL))
			flags |= __GFP_NORETRY;
+		else if (size > (PAGE_SIZE << PAGE_ALLOC_COSTLY_ORDER))
+			flags &= ~__GFP_DIRECT_RECLAIM;

I think it's entirely appropriate for a call to kvmalloc() to do
direct reclaim if it's asking for, say, 16KiB and we don't have any of
those available.  Better than exacerbating the fragmentation problem by
allocating 4x4KiB pages, each from different groupings.


  reply	other threads:[~2025-04-02 17:24 UTC|newest]

Thread overview: 34+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
     [not found] <20250401073046.51121-1-laoar.shao@gmail.com>
2025-04-01 14:01 ` Kees Cook
2025-04-01 14:50   ` Yafang Shao
2025-04-02  4:15   ` Harry Yoo
2025-04-02  8:42     ` Yafang Shao
2025-04-02  9:25       ` Vlastimil Babka
2025-04-02 12:17         ` Michal Hocko
2025-04-02 18:25         ` Shakeel Butt
2025-04-02 11:32       ` Dave Chinner
2025-04-02 12:24         ` Michal Hocko
2025-04-02 17:24           ` Matthew Wilcox [this message]
2025-04-02 18:30             ` Shakeel Butt
2025-04-02 22:38             ` Dave Chinner
2025-04-02 21:16           ` Dave Chinner
2025-04-02 23:10             ` Shakeel Butt
2025-04-03  1:22               ` Dave Chinner
2025-04-03  3:32                 ` Yafang Shao
2025-04-03  5:05                 ` Shakeel Butt
2025-04-03  7:20                   ` Michal Hocko
2025-04-03  4:37           ` Shakeel Butt
2025-04-03  7:22             ` Michal Hocko
2025-04-03  7:43               ` [PATCH] mm: kvmalloc: make kmalloc fast path real fast path Michal Hocko
2025-04-03  8:24                 ` Vlastimil Babka
2025-04-03  8:59                   ` Michal Hocko
2025-04-03 16:21                 ` Kees Cook
2025-04-03 19:49                   ` Michal Hocko
2025-04-04 15:33                   ` Darrick J. Wong
2025-04-03 18:30                 ` Shakeel Butt
2025-04-03 19:51                 ` Michal Hocko
2025-04-09  1:10                   ` Dave Chinner
2025-06-04 18:42                     ` Matthew Wilcox
2025-04-09  7:35                   ` Michal Hocko
2025-04-09  9:11                     ` Vlastimil Babka
2025-04-09 12:20                       ` Michal Hocko
2025-04-09 12:23                         ` Vlastimil Babka

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=Z-1yui_QgubgRAmL@casper.infradead.org \
    --to=willy@infradead.org \
    --cc=david@fromorbit.com \
    --cc=harry.yoo@oracle.com \
    --cc=joel.granados@kernel.org \
    --cc=josef@toxicpanda.com \
    --cc=kees@kernel.org \
    --cc=laoar.shao@gmail.com \
    --cc=linux-fsdevel@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-mm@kvack.org \
    --cc=mhocko@suse.com \
    --cc=vbabka@suse.cz \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox