From: Michal Hocko <mhocko@suse.com>
To: Baolin Wang <baolin.wang@linux.alibaba.com>
Cc: akpm@linux-foundation.org, muchun.song@linux.dev,
	osalvador@suse.de, david@redhat.com, linux-mm@kvack.org,
	linux-kernel@vger.kernel.org
Subject: Re: [RFC PATCH] mm: hugetlb: remove __GFP_THISNODE flag when dissolving the old hugetlb
Date: Thu, 1 Feb 2024 16:27:44 +0100
Message-ID: <Zbu4cD1XLFLfKan8@tiehlicka>
In-Reply-To: <6f26ce22d2fcd523418a085f2c588fe0776d46e7.1706794035.git.baolin.wang@linux.alibaba.com>

On Thu 01-02-24 21:31:13, Baolin Wang wrote:
> Since commit 369fa227c219 ("mm: make alloc_contig_range handle free
> hugetlb pages"), alloc_contig_range() can handle free hugetlb pages
> by allocating a fresh hugepage and replacing the old one in the
> free hugepage pool.
> 
> However, our customers can still see alloc_contig_range() fail when it
> encounters a free hugetlb page. The reason is that there is little free
> memory on the old hugetlb page's node, so a fresh hugetlb page cannot be
> allocated on that node in isolate_or_dissolve_huge_page() with the
> __GFP_THISNODE flag set. This makes sense to some degree.
> 
> Later, commit ae37c7ff79f1 ("mm: make alloc_contig_range handle
> in-use hugetlb pages") handled in-use hugetlb pages by isolating them
> and migrating them in __alloc_contig_migrate_range(), but it allows
> falling back to other NUMA nodes when allocating a new hugetlb page in
> alloc_migration_target().
> 
> This introduces an inconsistency between the handling of free and in-use
> hugetlb pages. Considering that CMA allocation and memory hotplug, which
> rely on alloc_contig_range(), are important in some scenarios, and to keep
> hugetlb handling consistent, we should remove the __GFP_THISNODE flag
> in isolate_or_dissolve_huge_page() to allow falling back to other NUMA
> nodes, which solves the alloc_contig_range() failure in our case.

I do agree that the inconsistency is not good, but I am not sure
dropping __GFP_THISNODE is the right way forward. Breaking pre-allocated
per-node pools might result in unexpected failures when node-bound
workloads don't get what is assumed to be available. Keep in mind that our
user APIs allow per-node pools to be pre-allocated separately.
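
To make the per-node pools concrete: they are exposed through sysfs under
/sys/devices/system/node/nodeN/hugepages/hugepages-<size>kB/. The following
is a minimal sketch, assuming that standard layout and a 2048 kB default
huge page size, that reads the configured and free page counts for each
node. The helper names pool_path() and read_per_node_pools() are
hypothetical and exist only for this illustration.

```python
import os
import re

# Standard sysfs location of the per-NUMA-node directories.
NODE_ROOT = "/sys/devices/system/node"


def pool_path(node, size_kb=2048, attr="nr_hugepages", root=NODE_ROOT):
    """Build the sysfs path of one per-node hugetlb pool attribute."""
    return os.path.join(
        root, f"node{node}", "hugepages", f"hugepages-{size_kb}kB", attr
    )


def read_per_node_pools(size_kb=2048, root=NODE_ROOT):
    """Return {node: {"nr_hugepages": int|None, "free_hugepages": int|None}}.

    A value of None means the attribute could not be read (for example,
    the huge page size is not supported on this machine).
    """
    pools = {}
    if not os.path.isdir(root):
        return pools
    for entry in os.listdir(root):
        match = re.fullmatch(r"node(\d+)", entry)
        if not match:
            continue  # skip files such as "online" or "possible"
        node = int(match.group(1))
        counts = {}
        for attr in ("nr_hugepages", "free_hugepages"):
            try:
                with open(pool_path(node, size_kb, attr, root)) as f:
                    counts[attr] = int(f.read().strip())
            except (OSError, ValueError):
                counts[attr] = None
        pools[node] = counts
    return pools


if __name__ == "__main__":
    for node, counts in sorted(read_per_node_pools().items()):
        print(f"node{node}: total={counts['nr_hugepages']} "
              f"free={counts['free_hugepages']}")
```

An administrator who wrote a count into one node's nr_hugepages expects
that many pages to stay on that node, which is the expectation at stake
here.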

The in-use hugetlb is a very similar case. While having a temporarily
misplaced page doesn't really look terrible, once that hugetlb page is
released back into the pool we are back to the case above. Either we
make sure that the node affinity is restored later on or it shouldn't be
migrated to a different node at all.
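
The concern can be illustrated with a toy model. This is only a sketch
under simplifying assumptions, not kernel code: pools and free memory are
plain per-node counters measured in huge pages, and replace_free_page() is
a hypothetical stand-in for replacing a free hugetlb page that sits in the
range being claimed.

```python
def replace_free_page(pools, free_mem, node, thisnode):
    """Toy model: replace one free hugetlb page that lives on `node`.

    pools    -- {node: number of hugetlb pages in that node's pool}
    free_mem -- {node: free memory, counted in huge pages}
    thisnode -- True models __GFP_THISNODE (no fallback to other nodes)

    Returns the node the replacement page landed on, or None on failure.
    The old page's memory is assumed to be taken by the contiguous
    allocation, so it is not returned to free_mem.
    """
    candidates = [node]
    if not thisnode:
        candidates += [n for n in sorted(pools) if n != node]
    for target in candidates:
        if free_mem[target] > 0:
            free_mem[target] -= 1
            pools[target] += 1   # fresh page joins the target node's pool
            pools[node] -= 1     # old page leaves the original node's pool
            return target
    return None


if __name__ == "__main__":
    # Node 0 has no free memory left; node 1 has room for two huge pages.
    strict = {0: 4, 1: 4}
    print(replace_free_page(strict, {0: 0, 1: 2}, node=0, thisnode=True), strict)

    relaxed = {0: 4, 1: 4}
    print(replace_free_page(relaxed, {0: 0, 1: 2}, node=0, thisnode=False), relaxed)
```

In the strict case the replacement fails and both pools keep four pages.
In the relaxed case it succeeds, but node 0's pool shrinks to three pages
while node 1's grows to five, so a workload bound to node 0 that counted
on four pages no longer finds them. A migrated in-use page ends up in the
same state once it is freed into the pool of the node it was moved to.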

-- 
Michal Hocko
SUSE Labs

