From: Baolin Wang <baolin.wang@linux.alibaba.com>
To: Oscar Salvador <osalvador@suse.de>
Cc: akpm@linux-foundation.org, muchun.song@linux.dev,
david@redhat.com, linmiaohe@huawei.com, naoya.horiguchi@nec.com,
mhocko@kernel.org, linux-mm@kvack.org,
linux-kernel@vger.kernel.org
Subject: Re: [RFC PATCH 2/3] mm: hugetlb: make the hugetlb migration strategy consistent
Date: Mon, 26 Feb 2024 17:59:37 +0800
Message-ID: <e3aac0fc-458f-4453-86a6-1bf92dc5fbd0@linux.alibaba.com>
In-Reply-To: <ZdxXLTDZn8fD3pEn@localhost.localdomain>
On 2024/2/26 17:17, Oscar Salvador wrote:
> On Mon, Feb 26, 2024 at 11:34:51AM +0800, Baolin Wang wrote:
>> IMO, I'm not sure whether it's appropriate to decouple
>> dequeue_hugetlb_folio_nodemask() from alloc_hugetlb_folio_nodemask() into
>> two separate functions for the users to call, because these details should
>> be hidden within the hugetlb core implementation.
>>
>> Instead, I can move the gfp_mask fiddling into a new helper, and move the
>> helper into alloc_migrate_hugetlb_folio(). Letting temporary hugetlb
>> allocations have their own gfp strategy seems reasonable to me.
>
> An alternative would be to do the following, which does not further carry
> the "reason" argument into hugetlb code.
> (Not even compile tested, just a PoC)
>
> diff --git a/include/linux/hugetlb.h b/include/linux/hugetlb.h
> index c1ee640d87b1..8a89a1007dcb 100644
> --- a/include/linux/hugetlb.h
> +++ b/include/linux/hugetlb.h
> @@ -970,6 +970,24 @@ static inline gfp_t htlb_modify_alloc_mask(struct hstate *h, gfp_t gfp_mask)
> return modified_mask;
> }
>
> +static inline bool htlb_allow_fallback(int reason)
> +{
> + bool allowed_fallback = false;
> +
> + switch (reason) {
> + case MR_MEMORY_HOTPLUG:
> + case MR_MEMORY_FAILURE:
> + case MR_SYSCALL:
> + case MR_MEMPOLICY_MBIND:
> + allowed_fallback = true;
> + break;
> + default:
> + break;
> + }
> +
> + return allowed_fallback;
> +}
> +
Thanks for providing an alternative implementation. However, I still
prefer to hide these details inside the hugetlb core, since callers
should not need to care about hugetlb internals. So something like:
diff --git a/mm/hugetlb.c b/mm/hugetlb.c
index 418d66953224..e8eb08bbc688 100644
--- a/mm/hugetlb.c
+++ b/mm/hugetlb.c
@@ -2567,13 +2567,38 @@ static struct folio *alloc_surplus_hugetlb_folio(struct hstate *h,
}
static struct folio *alloc_migrate_hugetlb_folio(struct hstate *h,
gfp_t gfp_mask,
- int nid, nodemask_t *nmask)
+ int nid, nodemask_t *nmask, int reason)
{
struct folio *folio;
+ bool allowed_fallback = false;
if (hstate_is_gigantic(h))
return NULL;
+ if (gfp_mask & __GFP_THISNODE)
+ goto alloc_new;
+
+ /*
+ * Note: memory offline, memory failure and the migration syscalls are
+ * allowed to fall back to other nodes due to lack of a better choice,
+ * which might break the per-node hugetlb pool. All other cases set
+ * __GFP_THISNODE to avoid breaking the per-node hugetlb pool.
+ */
+ switch (reason) {
+ case MR_MEMORY_HOTPLUG:
+ case MR_MEMORY_FAILURE:
+ case MR_SYSCALL:
+ case MR_MEMPOLICY_MBIND:
+ allowed_fallback = true;
+ break;
+ default:
+ break;
+ }
+
+ if (!allowed_fallback)
+ gfp_mask |= __GFP_THISNODE;
+
+alloc_new:
folio = alloc_fresh_hugetlb_folio(h, gfp_mask, nid, nmask, NULL);
if (!folio)
return NULL;
@@ -2621,7 +2646,7 @@ struct folio *alloc_buddy_hugetlb_folio_with_mpol(struct hstate *h,
/* folio migration callback function */
struct folio *alloc_hugetlb_folio_nodemask(struct hstate *h, int preferred_nid,
- nodemask_t *nmask, gfp_t gfp_mask)
+ nodemask_t *nmask, gfp_t gfp_mask, int reason)
{
spin_lock_irq(&hugetlb_lock);
if (available_huge_pages(h)) {
@@ -2636,7 +2661,7 @@ struct folio *alloc_hugetlb_folio_nodemask(struct hstate *h, int preferred_nid,
}
spin_unlock_irq(&hugetlb_lock);
- return alloc_migrate_hugetlb_folio(h, gfp_mask, preferred_nid, nmask);
+ return alloc_migrate_hugetlb_folio(h, gfp_mask, preferred_nid, nmask, reason);
}
/*
@@ -6653,7 +6678,13 @@ static struct folio *alloc_hugetlb_folio_vma(struct hstate *h,
gfp_mask = htlb_alloc_mask(h);
node = huge_node(vma, address, gfp_mask, &mpol, &nodemask);
- folio = alloc_hugetlb_folio_nodemask(h, node, nodemask, gfp_mask);
+ /*
+ * This is used to allocate a temporary hugetlb to hold the copied
+ * content, which will then be copied again to the final hugetlb
+ * consuming a reservation. Set the migrate reason to -1 to indicate
+ * that breaking the per-node hugetlb pool is not allowed in this case.
+ */
+ folio = alloc_hugetlb_folio_nodemask(h, node, nodemask, gfp_mask, -1);
mpol_cond_put(mpol);
return folio;
What do you think? Thanks.
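
To make the intended policy easier to review, here is a minimal userspace
sketch of the gfp_mask decision that the patch above adds to
alloc_migrate_hugetlb_folio(). It is not kernel code: the SKETCH_* names,
the gfp_t typedef and the flag value are hypothetical stand-ins chosen for
illustration, and the enum is only a subset of the real migrate reasons.

```c
#include <assert.h>
#include <stdbool.h>

/* Stand-ins for illustration only; the kernel defines the real types/values. */
typedef unsigned int gfp_t;
#define SKETCH_GFP_THISNODE 0x1u	/* hypothetical stand-in for __GFP_THISNODE */
#define SKETCH_GFP_OTHER    0x4u	/* hypothetical unrelated gfp bit */

enum sketch_migrate_reason {		/* hypothetical subset of migrate reasons */
	SKETCH_MR_COMPACTION,
	SKETCH_MR_MEMORY_FAILURE,
	SKETCH_MR_MEMORY_HOTPLUG,
	SKETCH_MR_SYSCALL,
	SKETCH_MR_MEMPOLICY_MBIND,
	SKETCH_MR_NUMA_MISPLACED,
};

/* Only these reasons may fall back to other nodes. */
static bool sketch_allow_fallback(int reason)
{
	switch (reason) {
	case SKETCH_MR_MEMORY_HOTPLUG:
	case SKETCH_MR_MEMORY_FAILURE:
	case SKETCH_MR_SYSCALL:
	case SKETCH_MR_MEMPOLICY_MBIND:
		return true;
	default:
		return false;
	}
}

/* Mirrors the mask handling placed before the alloc_new label in the patch. */
static gfp_t sketch_migrate_gfp_mask(gfp_t gfp_mask, int reason)
{
	/* The caller already pinned the allocation to one node. */
	if (gfp_mask & SKETCH_GFP_THISNODE)
		return gfp_mask;

	/* Otherwise pin it unless the reason permits fallback. */
	if (!sketch_allow_fallback(reason))
		gfp_mask |= SKETCH_GFP_THISNODE;

	return gfp_mask;
}

/* Small self-check of the two interesting cases. */
void sketch_selftest(void)
{
	assert(sketch_migrate_gfp_mask(0, SKETCH_MR_SYSCALL) == 0);
	assert(sketch_migrate_gfp_mask(0, -1) == SKETCH_GFP_THISNODE);
}
```

In this sketch a reason of -1, as passed from alloc_hugetlb_folio_vma() in
the patch, falls into the default case, so the temporary allocation stays
on the requested node. Unrelated gfp bits are passed through untouched.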
Thread overview: 10+ messages
2024-02-21 9:27 [RFC PATCH 0/3] " Baolin Wang
2024-02-21 9:27 ` [RFC PATCH 1/3] mm: record the migration reason for struct migration_target_control Baolin Wang
2024-02-21 9:27 ` [RFC PATCH 2/3] mm: hugetlb: make the hugetlb migration strategy consistent Baolin Wang
2024-02-22 22:15 ` Oscar Salvador
2024-02-23 2:56 ` Baolin Wang
2024-02-23 14:19 ` Oscar Salvador
2024-02-26 3:34 ` Baolin Wang
2024-02-26 9:17 ` Oscar Salvador
2024-02-26 9:59 ` Baolin Wang [this message]
2024-02-21 9:27 ` [RFC PATCH 3/3] docs: hugetlbpage.rst: add hugetlb migration description Baolin Wang