From: Linus Torvalds <torvalds@linux-foundation.org>
To: David Hildenbrand <david@redhat.com>
Cc: Michal Hocko <mhocko@suse.com>, Barry Song <21cnbao@gmail.com>,
Yafang Shao <laoar.shao@gmail.com>,
akpm@linux-foundation.org, linux-mm@kvack.org,
42.hyeyoo@gmail.com, cl@linux.com, hailong.liu@oppo.com,
hch@infradead.org, iamjoonsoo.kim@lge.com, penberg@kernel.org,
rientjes@google.com, roman.gushchin@linux.dev, urezki@gmail.com,
v-songbaohua@oppo.com, vbabka@suse.cz,
virtualization@lists.linux.dev
Subject: Re: [PATCH v3 0/4] mm: clarify nofail memory allocation
Date: Thu, 22 Aug 2024 17:08:15 +0800 [thread overview]
Message-ID: <CAHk-=wjTTAKRgVD=StNbV1FtPz+a1DHYBbrJuP=kmd_vm8ssFg@mail.gmail.com> (raw)
In-Reply-To: <bc952d89-1219-41e9-9a8b-cfe2a758d585@redhat.com>
On Thu, 22 Aug 2024 at 16:39, David Hildenbrand <david@redhat.com> wrote:
>
> Linus has a point that "retry forever" can also be nasty. I think the
> important part here is, though, that we report sufficient information
> (stacktrace), such that the problem can be debugged reasonably well, and
> not just having a locked-up system.
Unless I missed some case, I *think* most NOFAIL cases are actually
fairly small.
In fact, I suspect many of them are so small that we already
effectively give that guarantee:
> But then again, sizeof(struct resource) is probably so small that it
> likely would never fail.
Iirc, we had the policy of never failing unrestricted kernel
allocations that are smaller than a page (where "unrestricted" means
that it's a regular GFP_KERNEL, not some NOFS or similar allocation).
In fact, I think we practically speaking still do. We really *really*
tend to try very hard to retry small allocations.
That was one of the things that GFP_USER does - it's identical to
GFP_KERNEL, but it basically tells the MM that it should not try so
hard because an allocation failure was fine.
(I say "was", because the semantics have changed over time.
Originally, GFP_KERNEL had the "GFP_HIGH" bit set that said "access
emergency pools for this allocation", and GFP_USER did not have that
bit set. Now the only difference seems to be GFP_HARDWALL, so a user
allocation the cgroup limits are hard limits etc. I think there's been
other versions of this kind of logic over the years as people try to
make it all work out well in practice).
In fact, kernel allocations try so hard that we have those "opposite
flags" of ___GFP_NORETRY and ___GFP_RETRY_MAYFAIL because we often try
*TOO* hard, and reasonably many code-paths have that whole "let's
optimistically ask for a big allocation, but not try very hard and not
warn if it fails, because we can fall back on a smaller one".
So it's _really_ hard to fail a small GFP_KERNEL allocation. It used
to be practically impossible, and in fact I think GFP_NOFAIL was
originally added long ago when the MM code was going through big
upheavals and one of the things that was mucked around with was the
whole "how hard to retry".
Linus
next prev parent reply other threads:[~2024-08-22 9:08 UTC|newest]
Thread overview: 101+ messages / expand[flat|nested] mbox.gz Atom feed top
2024-08-17 6:24 Barry Song
2024-08-17 6:24 ` [PATCH v3 1/4] vduse: avoid using __GFP_NOFAIL Barry Song
2024-08-17 6:24 ` [PATCH v3 2/4] mm: document __GFP_NOFAIL must be blockable Barry Song
2024-08-17 6:24 ` [PATCH v3 3/4] mm: BUG_ON to avoid NULL deference while __GFP_NOFAIL fails Barry Song
2024-08-19 9:43 ` David Hildenbrand
2024-08-19 9:47 ` Barry Song
2024-08-19 9:55 ` David Hildenbrand
2024-08-19 10:02 ` Barry Song
2024-08-19 12:33 ` David Hildenbrand
2024-08-19 12:48 ` Barry Song
2024-08-19 12:49 ` David Hildenbrand
2024-08-19 17:12 ` Michal Hocko
2024-08-19 17:17 ` Linus Torvalds
2024-08-19 20:24 ` David Hildenbrand
2024-08-19 20:35 ` Linus Torvalds
2024-08-19 21:57 ` David Hildenbrand
2024-08-19 22:13 ` Linus Torvalds
2024-08-20 6:17 ` Michal Hocko
2024-08-19 12:49 ` Christoph Hellwig
2024-08-19 12:51 ` David Hildenbrand
2024-08-19 12:53 ` Christoph Hellwig
2024-08-19 13:14 ` David Hildenbrand
2024-08-19 13:05 ` Barry Song
2024-08-19 13:10 ` David Hildenbrand
2024-08-19 13:19 ` Barry Song
2024-08-19 13:22 ` David Hildenbrand
2024-08-17 6:24 ` [PATCH v3 4/4] mm: prohibit NULL deference exposed for unsupported non-blockable __GFP_NOFAIL Barry Song
2024-08-18 2:55 ` Yafang Shao
2024-08-18 3:48 ` Barry Song
2024-08-18 5:51 ` Yafang Shao
2024-08-18 6:27 ` Barry Song
2024-08-18 6:45 ` Barry Song
2024-08-18 7:07 ` Yafang Shao
2024-08-18 7:25 ` Barry Song
2024-08-19 7:51 ` Michal Hocko
2024-08-19 7:50 ` Michal Hocko
2024-08-19 9:25 ` Yafang Shao
2024-08-19 9:39 ` Barry Song
2024-08-19 9:45 ` Yafang Shao
2024-08-19 10:10 ` Barry Song
2024-08-19 11:56 ` Yafang Shao
2024-08-19 12:09 ` Michal Hocko
2024-08-19 12:17 ` Yafang Shao
2024-08-19 14:01 ` Michal Hocko
2024-08-19 10:17 ` Michal Hocko
2024-08-19 11:56 ` Yafang Shao
2024-08-19 12:04 ` Michal Hocko
2024-08-19 9:44 ` David Hildenbrand
2024-08-19 10:19 ` Michal Hocko
2024-08-19 12:48 ` David Hildenbrand
2024-08-19 13:02 ` [PATCH v3 0/4] mm: clarify nofail memory allocation David Hildenbrand
2024-08-19 16:05 ` Linus Torvalds
2024-08-19 19:23 ` Barry Song
2024-08-19 19:33 ` Linus Torvalds
2024-08-19 21:48 ` Barry Song
2024-08-20 6:24 ` Michal Hocko
2024-08-21 12:40 ` Yafang Shao
2024-08-21 22:59 ` Linus Torvalds
2024-08-22 6:21 ` Michal Hocko
2024-08-22 6:40 ` Linus Torvalds
2024-08-22 6:56 ` Linus Torvalds
2024-08-22 7:47 ` Michal Hocko
2024-08-22 7:57 ` Barry Song
2024-08-22 8:24 ` Michal Hocko
2024-08-22 8:39 ` David Hildenbrand
2024-08-22 9:08 ` Linus Torvalds [this message]
2024-08-22 9:16 ` Michal Hocko
2024-08-22 9:24 ` Linus Torvalds
2024-08-22 9:11 ` Michal Hocko
2024-08-22 9:18 ` Linus Torvalds
2024-08-22 9:33 ` Michal Hocko
2024-08-22 9:44 ` Linus Torvalds
2024-08-22 9:59 ` Michal Hocko
2024-08-22 10:30 ` Linus Torvalds
2024-08-22 10:46 ` Michal Hocko
2024-08-22 9:27 ` David Hildenbrand
2024-08-22 9:34 ` Linus Torvalds
2024-08-22 9:43 ` David Hildenbrand
2024-08-22 9:53 ` Linus Torvalds
2024-08-22 11:58 ` Johannes Weiner
2024-08-26 12:10 ` Vlastimil Babka
2024-08-27 6:57 ` Linus Torvalds
2024-08-27 7:15 ` Barry Song
2024-08-27 7:38 ` Vlastimil Babka
2024-08-27 7:50 ` Barry Song
2024-08-29 10:24 ` Vlastimil Babka
2024-08-29 11:53 ` Barry Song
2024-08-29 13:20 ` Michal Hocko
2024-08-29 21:27 ` Barry Song
2024-08-29 22:31 ` Barry Song
2024-08-30 7:24 ` Michal Hocko
2024-08-30 7:37 ` Vlastimil Babka
2024-08-22 9:41 ` Michal Hocko
2024-08-22 9:42 ` David Hildenbrand
2024-08-22 7:01 ` Gao Xiang
2024-08-22 7:54 ` Michal Hocko
2024-08-22 8:04 ` Gao Xiang
2024-08-22 14:35 ` Yafang Shao
2024-08-22 15:02 ` Gao Xiang
2024-08-22 6:37 ` Barry Song
2024-08-22 14:22 ` Yafang Shao
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to='CAHk-=wjTTAKRgVD=StNbV1FtPz+a1DHYBbrJuP=kmd_vm8ssFg@mail.gmail.com' \
--to=torvalds@linux-foundation.org \
--cc=21cnbao@gmail.com \
--cc=42.hyeyoo@gmail.com \
--cc=akpm@linux-foundation.org \
--cc=cl@linux.com \
--cc=david@redhat.com \
--cc=hailong.liu@oppo.com \
--cc=hch@infradead.org \
--cc=iamjoonsoo.kim@lge.com \
--cc=laoar.shao@gmail.com \
--cc=linux-mm@kvack.org \
--cc=mhocko@suse.com \
--cc=penberg@kernel.org \
--cc=rientjes@google.com \
--cc=roman.gushchin@linux.dev \
--cc=urezki@gmail.com \
--cc=v-songbaohua@oppo.com \
--cc=vbabka@suse.cz \
--cc=virtualization@lists.linux.dev \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox