From: Michal Hocko <mhocko@suse.com>
To: Matthew Wilcox <willy@infradead.org>
Cc: Shakeel Butt <shakeel.butt@linux.dev>,
	Vlastimil Babka <vbabka@suse.cz>,
	libaokun@huaweicloud.com, linux-mm@kvack.org,
	akpm@linux-foundation.org, surenb@google.com,
	jackmanb@google.com, hannes@cmpxchg.org, ziy@nvidia.com,
	jack@suse.cz, yi.zhang@huawei.com, yangerkun@huawei.com,
	libaokun1@huawei.com
Subject: Re: [PATCH RFC] mm: allow __GFP_NOFAIL allocation up to BLK_MAX_BLOCK_SIZE to support LBS
Date: Mon, 3 Nov 2025 08:55:16 +0100
Message-ID: <aQhf5LJJMlvT-rrE@tiehlicka>
In-Reply-To: <aQTqELGGKCN3JTIm@casper.infradead.org>

On Fri 31-10-25 16:55:44, Matthew Wilcox wrote:
> On Fri, Oct 31, 2025 at 09:46:17AM -0700, Shakeel Butt wrote:
> > Now for the interface to allow NOFS+NOFAIL+higher_order, I think a new
> > (FS specific) gfp is fine but will require some maintenance to avoid
> > abuse.
> 
> I don't think a new GFP flag is the answer.  GFP_TRUST_ME_BRO just
> doesn't feel right.

Yeah, as usual, a new gfp flag seems convenient, except history has taught
us that this rarely works.

> > I am more interested in how to codify "you can reclaim one I've already
> > allocated". I have a different scenario where network stack keep
> > stealing memory from direct reclaimers and keeping them in reclaim for
> > long time. If we have some mechanism to allow reclaimers to get the
> > memory they have reclaimed (at least for some cases), I think that can
> > be used in both cases.
> 
> The only thing that comes to mind is putting pages freed by reclaim on
> a list in task_struct instead of sending them back to the allocator.
> Then the task can allocate from there and free up anything else it's
> reclaimed at some later point.  I don't think this is a good idea,
> but it's the only idea that comes to mind.

I played with that idea years ago, mostly to deal with direct reclaim
unfairness, where some reclaimers were doing a lot of work on behalf of
everybody else. IIRC I ran into different problems, like reclaim
throttling and over-reclaim.
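
To make the idea concrete, here is a toy userspace sketch in C of a
per-task list of reclaimed pages. It is not kernel code: the toy_* names
and the singly linked lists are made up for illustration, and it models
none of the throttling or over-reclaim problems mentioned above.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a page; only the list linkage is modelled. */
struct toy_page {
	struct toy_page *next;
};

/* Hypothetical stand-in for the per-task state the idea would need. */
struct toy_task {
	struct toy_page *reclaimed;	/* pages this task freed via reclaim */
	size_t nr_reclaimed;
};

/* Stand-in for the allocator's shared free list. */
static struct toy_page *toy_global_free;

static void toy_push(struct toy_page **head, struct toy_page *page)
{
	page->next = *head;
	*head = page;
}

static struct toy_page *toy_pop(struct toy_page **head)
{
	struct toy_page *page = *head;

	if (page) {
		*head = page->next;
		page->next = NULL;
	}
	return page;
}

/* Reclaim hands the page to the reclaiming task, not to the shared pool. */
static void toy_reclaim_to_task(struct toy_task *task, struct toy_page *page)
{
	assert(task && page);
	toy_push(&task->reclaimed, page);
	task->nr_reclaimed++;
}

/* Allocation prefers the task's own reclaimed pages, then the shared pool. */
static struct toy_page *toy_alloc(struct toy_task *task)
{
	struct toy_page *page = toy_pop(&task->reclaimed);

	if (page) {
		task->nr_reclaimed--;
		return page;
	}
	return toy_pop(&toy_global_free);
}

/* "At some later point": give everything still held back to the pool. */
static void toy_release_reclaimed(struct toy_task *task)
{
	struct toy_page *page;

	while ((page = toy_pop(&task->reclaimed)) != NULL)
		toy_push(&toy_global_free, page);
	task->nr_reclaimed = 0;
}
```

In this model a task that reclaims two pages gets one of them back from
toy_alloc() without competing with anybody else, and
toy_release_reclaimed() returns the rest to the shared pool.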

Anyway, the page allocator does respect __GFP_NOFAIL even for high-order
requests. The OOM killer will be disabled for order-4, but as these will
likely be GFP_NOFS anyway, the order doesn't make much of a difference.
So these requests could take a really long time to succeed, but I guess
this will be generally understood. As the vmalloc fallback doesn't seem
to be a feasible option in the short (maybe even mid) term, this is the
only choice we have other than failing allocations and seeing a lot of
fs failures.
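
As a rough illustration of the paragraph above, the following userspace
C sketch models when a request may end up invoking the OOM killer. The
SKETCH_* constants and helper names are hypothetical stand-ins, the
threshold of 3 is inferred from order-4 being the first order with the
OOM killer disabled, and the real allocator slow path has many more
cases than this.

```c
#include <assert.h>
#include <stdbool.h>

/* Illustrative values only; these are not the kernel's definitions. */
#define SKETCH_COSTLY_ORDER	3u	/* assumed: OOM killer is skipped above this */
#define SKETCH_GFP_FS		0x1u	/* request may recurse into the filesystem */
#define SKETCH_GFP_NOFAIL	0x2u	/* request must not fail */

struct sketch_request {
	unsigned int order;	/* allocation covers 2^order pages */
	unsigned int gfp;	/* combination of SKETCH_GFP_* bits */
};

/* True if, in this model, the request may trigger the OOM killer. */
static bool sketch_may_oom_kill(const struct sketch_request *rq)
{
	assert(rq);

	/* Killing tasks is assumed not to help high-order requests. */
	if (rq->order > SKETCH_COSTLY_ORDER)
		return false;

	/* NOFS requests are assumed never to invoke the OOM killer. */
	if (!(rq->gfp & SKETCH_GFP_FS))
		return false;

	return true;
}

/* True if the allocator keeps retrying instead of returning failure. */
static bool sketch_retries_until_success(const struct sketch_request *rq)
{
	assert(rq);
	return (rq->gfp & SKETCH_GFP_NOFAIL) != 0;
}
```

Under these assumptions a NOFS nofail request gets no help from the OOM
killer at any order, so it simply keeps retrying, which is why the order
makes little difference for such callers.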

That being said, I would much rather drop the order warning than try to
invent some fine tuning based on use case. We might need to invent some
OOM protection for order-3 nofail requests, as the OOM killer could do
too much harm by killing tasks without much chance of defragmenting
memory. Let's deal with that once we see it happening.

-- 
Michal Hocko
SUSE Labs


