linux-mm.kvack.org archive mirror
 help / color / mirror / Atom feed
From: Baolin Wang <baolin.wang@linux.alibaba.com>
To: Oscar Salvador <osalvador@suse.de>
Cc: akpm@linux-foundation.org, muchun.song@linux.dev,
	david@redhat.com, linmiaohe@huawei.com, naoya.horiguchi@nec.com,
	mhocko@kernel.org, linux-mm@kvack.org,
	linux-kernel@vger.kernel.org
Subject: Re: [RFC PATCH 2/3] mm: hugetlb: make the hugetlb migration strategy consistent
Date: Mon, 26 Feb 2024 17:59:37 +0800	[thread overview]
Message-ID: <e3aac0fc-458f-4453-86a6-1bf92dc5fbd0@linux.alibaba.com> (raw)
In-Reply-To: <ZdxXLTDZn8fD3pEn@localhost.localdomain>



On 2024/2/26 17:17, Oscar Salvador wrote:
> On Mon, Feb 26, 2024 at 11:34:51AM +0800, Baolin Wang wrote:
>> IMO, I'm not sure whether it's appropriate to decouple
>> dequeue_hugetlb_folio_nodemask() from alloc_hugetlb_folio_nodemask() into
>> two separate functions for the users to call, because these details should
>> be hidden within the hugetlb core implementation.
>>
>> Instead, I can move the gfp_mask fiddling into a new helper, and move the
>> helper into alloc_migrate_hugetlb_folio(). Temporary hugetlb allocation has
>> its own gfp strategy seems reasonable to me.
> 
> An alternative would be to do the following, which does not futher carry
> the "reason" argument into hugetlb code.
> (Not even compile tested, just a PoC)
> 
> diff --git a/include/linux/hugetlb.h b/include/linux/hugetlb.h
> index c1ee640d87b1..8a89a1007dcb 100644
> --- a/include/linux/hugetlb.h
> +++ b/include/linux/hugetlb.h
> @@ -970,6 +970,24 @@ static inline gfp_t htlb_modify_alloc_mask(struct hstate *h, gfp_t gfp_mask)
>   	return modified_mask;
>   }
> 
> +static inline bool htlb_allow_fallback(int reason)
> +{
> +	bool allowed_fallback = false;
> +
> +	switch (reason) {
> +	case MR_MEMORY_HOTPLUG:
> +	case MR_MEMORY_FAILURE:
> +	case MR_SYSCALL:
> +	case MR_MEMPOLICY_MBIND:
> +		allowed_fallback = true;
> +		break;
> +	default:
> +	        break;
> +	}
> +
> +	return allowed_fallback;
> +}
> +

Thanks for providing an alternative implementation. However, I still 
prefer to hide these details into hugetlb core, since users do not need 
to pay excessive attention to these hugetlb details. So something like:

diff --git a/mm/hugetlb.c b/mm/hugetlb.c
index 418d66953224..e8eb08bbc688 100644
--- a/mm/hugetlb.c
+++ b/mm/hugetlb.c
@@ -2567,13 +2567,38 @@ static struct folio 
*alloc_surplus_hugetlb_folio(struct hstate *h,
  }

  static struct folio *alloc_migrate_hugetlb_folio(struct hstate *h, 
gfp_t gfp_mask,
-                                    int nid, nodemask_t *nmask)
+                                    int nid, nodemask_t *nmask, int reason)
  {
         struct folio *folio;
+       bool allowed_fallback = false;

         if (hstate_is_gigantic(h))
                 return NULL;

+       if (gfp_mask & __GFP_THISNODE)
+               goto alloc_new;
+
+       /*
+        * Note: the memory offline, memory failure and migration 
syscalls will
+        * be allowed to fallback to other nodes due to lack of a better 
chioce,
+        * that might break the per-node hugetlb pool. While other cases 
will
+        * set the __GFP_THISNODE to avoid breaking the per-node hugetlb 
pool.
+        */
+       switch (reason) {
+       case MR_MEMORY_HOTPLUG:
+       case MR_MEMORY_FAILURE:
+       case MR_MEMORY_FAILURE:
+       case MR_SYSCALL:
+       case MR_MEMPOLICY_MBIND:
+               allowed_fallback = true;
+               break;
+       default:
+               break;
+       }
+
+       if (!allowed_fallback)
+               gfp_mask |= __GFP_THISNODE;
+
+alloc_new:
         folio = alloc_fresh_hugetlb_folio(h, gfp_mask, nid, nmask, NULL);
         if (!folio)
                 return NULL;
@@ -2621,7 +2646,7 @@ struct folio 
*alloc_buddy_hugetlb_folio_with_mpol(struct hstate *h,

  /* folio migration callback function */
  struct folio *alloc_hugetlb_folio_nodemask(struct hstate *h, int 
preferred_nid,
-               nodemask_t *nmask, gfp_t gfp_mask)
                 return NULL;
@@ -2621,7 +2646,7 @@ struct folio 
*alloc_buddy_hugetlb_folio_with_mpol(struct hstate *h,

  /* folio migration callback function */
  struct folio *alloc_hugetlb_folio_nodemask(struct hstate *h, int 
preferred_nid,
-               nodemask_t *nmask, gfp_t gfp_mask)
+               nodemask_t *nmask, gfp_t gfp_mask, int reason)
  {
         spin_lock_irq(&hugetlb_lock);
         if (available_huge_pages(h)) {
@@ -2636,7 +2661,7 @@ struct folio *alloc_hugetlb_folio_nodemask(struct 
hstate *h, int preferred_nid,
         }
         spin_unlock_irq(&hugetlb_lock);

-       return alloc_migrate_hugetlb_folio(h, gfp_mask, preferred_nid, 
nmask);
+       return alloc_migrate_hugetlb_folio(h, gfp_mask, preferred_nid, 
nmask, reason);
  }

  /*
@@ -6653,7 +6678,13 @@ static struct folio 
*alloc_hugetlb_folio_vma(struct hstate *h,

         gfp_mask = htlb_alloc_mask(h);
         node = huge_node(vma, address, gfp_mask, &mpol, &nodemask);
-       folio = alloc_hugetlb_folio_nodemask(h, node, nodemask, gfp_mask);
+       /*
+        * This is used to allocate a temporary hugetlb to hold the copied
+        * content, which will then be copied again to the final hugetlb
+        * consuming a reservation. Set the migrate reason to -1 to indicate
+        * that breaking the per-node hugetlb pool is not allowed in 
this case.
+        */
+       folio = alloc_hugetlb_folio_nodemask(h, node, nodemask, 
gfp_mask, -1);
         mpol_cond_put(mpol);

         return folio;


What do you think? Thanks.


  reply	other threads:[~2024-02-26  9:59 UTC|newest]

Thread overview: 10+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2024-02-21  9:27 [RFC PATCH 0/3] " Baolin Wang
2024-02-21  9:27 ` [RFC PATCH 1/3] mm: record the migration reason for struct migration_target_control Baolin Wang
2024-02-21  9:27 ` [RFC PATCH 2/3] mm: hugetlb: make the hugetlb migration strategy consistent Baolin Wang
2024-02-22 22:15   ` Oscar Salvador
2024-02-23  2:56     ` Baolin Wang
2024-02-23 14:19       ` Oscar Salvador
2024-02-26  3:34         ` Baolin Wang
2024-02-26  9:17           ` Oscar Salvador
2024-02-26  9:59             ` Baolin Wang [this message]
2024-02-21  9:27 ` [RFC PATCH 3/3] docs: hugetlbpage.rst: add hugetlb migration description Baolin Wang

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=e3aac0fc-458f-4453-86a6-1bf92dc5fbd0@linux.alibaba.com \
    --to=baolin.wang@linux.alibaba.com \
    --cc=akpm@linux-foundation.org \
    --cc=david@redhat.com \
    --cc=linmiaohe@huawei.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-mm@kvack.org \
    --cc=mhocko@kernel.org \
    --cc=muchun.song@linux.dev \
    --cc=naoya.horiguchi@nec.com \
    --cc=osalvador@suse.de \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox