From: Vlastimil Babka <vbabka@suse.cz>
To: Michal Hocko <mhocko@suse.com>
Cc: Matthew Wilcox <willy@infradead.org>,
	Shakeel Butt <shakeel.butt@linux.dev>,
	libaokun@huaweicloud.com, linux-mm@kvack.org,
	akpm@linux-foundation.org, surenb@google.com,
	jackmanb@google.com, hannes@cmpxchg.org, ziy@nvidia.com,
	jack@suse.cz, yi.zhang@huawei.com, yangerkun@huawei.com,
	libaokun1@huawei.com
Subject: Re: [PATCH RFC] mm: allow __GFP_NOFAIL allocation up to BLK_MAX_BLOCK_SIZE to support LBS
Date: Tue, 4 Nov 2025 13:32:52 +0100
Message-ID: <a0015508-e72b-4b1c-9993-7e66076a0a19@suse.cz>
In-Reply-To: <aQnV73KorhS0AWH5@tiehlicka>

On 11/4/25 11:31 AM, Michal Hocko wrote:
> On Mon 03-11-25 10:25:40, Michal Hocko wrote:
>> On Mon 03-11-25 10:01:54, Vlastimil Babka wrote:
>>> Maybe we could keep the warning for >=PMD_ORDER as that would still mean
>>> someone made an error?
>>
>> I am not sure TBH. For those large requests (anything that is costly
>> order) it is essentially a loop around allocator inside the allocator.
>> I would be really much more worried about order-3 which still triggers
>> the oom killer and could kill half of the system without much progress.
>> For order-2 you at least have task_struct which spans 2 pages but I do
>> not think we have any guaranteed order-3 page for each task to guarantee
>> anything when killing those.
>
> Essentially something like this
> diff --git a/mm/oom_kill.c b/mm/oom_kill.c
> index 25923cfec9c6..2df477d97cee 100644
> --- a/mm/oom_kill.c
> +++ b/mm/oom_kill.c
> @@ -1142,6 +1142,14 @@ bool out_of_memory(struct oom_control *oc)
>  	if (!(oc->gfp_mask & __GFP_FS) && !is_memcg_oom(oc))
>  		return true;
>
> +	/*
> +	 * unlike for other !costly requests killing a task is not
> +	 * really guaranteed to free any order-3 pages. Warn about
> +	 * that to see whether that happens often enough to special
> +	 * case.
> +	 */
> +	WARN_ON(oc->order == 3 && (oc->gfp_mask & __GFP_NOFAIL));

OK, it might not create an order-3 page immediately. But I'd expect it
allows compaction to make progress thanks to making more free memory
available? We do retry reclaim/compaction after OOM killing one process,
and don't just kill until we succeed allocating, right?

> +
>  	/*
>  	 * Check if there were limitations on the allocation (only relevant for
>  	 * NUMA and memcg) that may require different handling.
> diff --git a/mm/page_alloc.c b/mm/page_alloc.c
> index d1d037f97c5f..ca8795156b14 100644
> --- a/mm/page_alloc.c
> +++ b/mm/page_alloc.c
> @@ -3993,6 +3993,7 @@ __alloc_pages_may_oom(gfp_t gfp_mask, unsigned int order,
>  	/* Coredumps can quickly deplete all memory reserves */
>  	if (current->flags & PF_DUMPCORE)
>  		goto out;
> +
>  	/* The OOM killer will not help higher order allocs */
>  	if (order > PAGE_ALLOC_COSTLY_ORDER)
>  		goto out;
> @@ -4612,11 +4613,6 @@ __alloc_pages_slowpath(gfp_t gfp_mask, unsigned int order,
>  	int reserve_flags;
>
>  	if (unlikely(nofail)) {
> -		/*
> -		 * We most definitely don't want callers attempting to
> -		 * allocate greater than order-1 page units with __GFP_NOFAIL.
> -		 */
> -		WARN_ON_ONCE(order > 1);
>  		/*
>  		 * Also we don't support __GFP_NOFAIL without __GFP_DIRECT_RECLAIM,
>  		 * otherwise, we may result in lockup.