linux-mm.kvack.org archive mirror
From: Usama Arif <usama.arif@bytedance.com>
To: Mike Kravetz <mike.kravetz@oracle.com>
Cc: linux-mm@kvack.org, muchun.song@linux.dev, rppt@kernel.org,
	linux-kernel@vger.kernel.org, fam.zheng@bytedance.com,
	liangma@liangbit.com, simon.evans@bytedance.com,
	punit.agrawal@bytedance.com,
	Muchun Song <songmuchun@bytedance.com>
Subject: Re: [External] Re: [v2 1/6] mm: hugetlb: Skip prep of tail pages when HVO is enabled
Date: Wed, 2 Aug 2023 11:05:35 +0100	[thread overview]
Message-ID: <540a0f2e-31e0-6396-be14-d9baec608b87@bytedance.com> (raw)
In-Reply-To: <20230731231841.GA39768@monkey>



On 01/08/2023 00:18, Mike Kravetz wrote:
> On 07/30/23 16:16, Usama Arif wrote:
>> When vmemmap is optimizable, it will free all the duplicated tail
>> pages in hugetlb_vmemmap_optimize while preparing the new hugepage.
>> Hence, there is no need to prepare them.
>>
>> For 1G x86 hugepages, it avoids preparing
>> 262144 - 64 = 262080 struct pages per hugepage.
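
(Where those numbers come from -- my own arithmetic, assuming 4K base pages
and a 64-byte struct page on x86-64, with HUGETLB_VMEMMAP_RESERVE_SIZE ==
PAGE_SIZE as in hugetlb_vmemmap.h:)

	/*
	 * 1G hugepage:  (1 << 30) / 4096                        = 262144 base pages
	 * kept by HVO:  HUGETLB_VMEMMAP_RESERVE_SIZE / sizeof(struct page)
	 *            =  4096 / 64                               = 64
	 * tail struct pages whose prep is skipped: 262144 - 64  = 262080
	 */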
>>
>> The indirection of using __prep_compound_gigantic_folio is also removed,
>> as it just creates extra functions to indicate demote which can be done
>> with the argument.
>>
>> Signed-off-by: Usama Arif <usama.arif@bytedance.com>
>> ---
>>   mm/hugetlb.c         | 32 ++++++++++++++------------------
>>   mm/hugetlb_vmemmap.c |  2 +-
>>   mm/hugetlb_vmemmap.h | 15 +++++++++++----
>>   3 files changed, 26 insertions(+), 23 deletions(-)
> 
> Thanks,
> 
> I just started looking at this series.  Adding Muchun on Cc:
> 
>>
>> diff --git a/mm/hugetlb.c b/mm/hugetlb.c
>> index 64a3239b6407..541c07b6d60f 100644
>> --- a/mm/hugetlb.c
>> +++ b/mm/hugetlb.c
>> @@ -1942,14 +1942,23 @@ static void prep_new_hugetlb_folio(struct hstate *h, struct folio *folio, int ni
>>   	spin_unlock_irq(&hugetlb_lock);
>>   }
>>   
>> -static bool __prep_compound_gigantic_folio(struct folio *folio,
>> -					unsigned int order, bool demote)
>> +static bool prep_compound_gigantic_folio(struct folio *folio, struct hstate *h, bool demote)
>>   {
>>   	int i, j;
>> +	int order = huge_page_order(h);
>>   	int nr_pages = 1 << order;
>>   	struct page *p;
>>   
>>   	__folio_clear_reserved(folio);
>> +
>> +	/*
>> +	 * No need to prep pages that will be freed later by hugetlb_vmemmap_optimize.
>> +	 * Hence, reduce nr_pages to the pages that will be kept.
>> +	 */
>> +	if (IS_ENABLED(CONFIG_HUGETLB_PAGE_OPTIMIZE_VMEMMAP) &&
>> +			vmemmap_should_optimize(h, &folio->page))
> 
> IIUC, vmemmap_optimize_enabled (checked in vmemmap_should_optimize) can be
> modified at runtime via sysctl.  If so, what prevents it from being changed
> between this check and the later call to hugetlb_vmemmap_optimize()?

Hi,

Thanks for the review.

Yes, that's a good catch. The solution for this issue would be to turn 
hugetlb_free_vmemmap into a core_param with a callback, and to take a lock 
around the write as well as around its use in gather_bootmem_prealloc(), etc.
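
(To spell out the window being described -- an illustrative interleaving of
the v2 code as posted, not quoted from anywhere:)

	/*
	 *   CPU A: allocate gigantic folio          CPU B: sysctl write
	 *   ------------------------------          -------------------------------
	 *   prep_compound_gigantic_folio()
	 *     vmemmap_should_optimize() -> true
	 *     -> prep of tail pages skipped
	 *                                           vmemmap_optimize_enabled = false
	 *   hugetlb_vmemmap_optimize()
	 *     vmemmap_should_optimize() -> false
	 *     -> returns without freeing the vmemmap
	 *
	 *   Result: a hugepage whose tail struct pages were never initialised
	 *   but whose vmemmap was also never freed.
	 */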

But the bigger issue at runtime is the one Muchun pointed out: with prep 
skipped, the struct page refcounts of the tail pages are never frozen to 0.
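
(My reading of that concern, sketched against the existing prep loop rather
than quoted from Muchun's reply:)

	/*
	 * The prep loop freezes every tail page with page_ref_freeze(p, 1)
	 * precisely so that an unexpected or speculative reference is caught
	 * before the page becomes a compound tail.  Skipping prep also skips
	 * that freeze, so at runtime a stray reference could be taken on a
	 * tail page whose vmemmap is about to be freed, and nothing would
	 * notice.
	 */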

My main use case (and maybe for others as well?) is reserving these 
gigantic pages at boot time. I thought the runtime improvement might come 
for free with it, but it doesn't look like it. Both issues can be solved by 
limiting this to boot time: vmemmap_optimize_enabled cannot be changed 
during boot, and nothing else can take a reference to those pages either 
(they aren't even initialized by memblock after patch 6). So I will include 
the diff below to solve both issues in the next revision.


diff --git a/mm/hugetlb.c b/mm/hugetlb.c
index 8434100f60ae..790842a6f978 100644
--- a/mm/hugetlb.c
+++ b/mm/hugetlb.c
@@ -1956,7 +1956,8 @@ static bool prep_compound_gigantic_folio(struct folio *folio, struct hstate *h,
 	 * Hence, reduce nr_pages to the pages that will be kept.
 	 */
 	if (IS_ENABLED(CONFIG_HUGETLB_PAGE_OPTIMIZE_VMEMMAP) &&
-			vmemmap_should_optimize(h, &folio->page))
+			vmemmap_should_optimize(h, &folio->page) &&
+			system_state == SYSTEM_BOOTING)
 		nr_pages = HUGETLB_VMEMMAP_RESERVE_SIZE / sizeof(struct page);
 	for (i = 0; i < nr_pages; i++) {
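
(For completeness -- my understanding of why the extra check is sufficient,
not something spelled out explicitly in the thread:)

	/*
	 * system_state is still SYSTEM_BOOTING when mm_core_init() runs
	 * (where patch 5 moves the gigantic hstate allocation); it is only
	 * advanced later in init/main.c.  At that point no userspace exists,
	 * so the vmemmap_optimize_enabled sysctl cannot have been flipped,
	 * and nothing else can hold a reference to the struct pages of the
	 * new folio.
	 */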

Thanks,
Usama


Thread overview: 17+ messages
2023-07-30 15:16 [v2 0/6] mm/memblock: Skip prep and initialization of struct pages freed later by HVO Usama Arif
2023-07-30 15:16 ` [v2 1/6] mm: hugetlb: Skip prep of tail pages when HVO is enabled Usama Arif
2023-07-31 23:18   ` Mike Kravetz
2023-08-02 10:05     ` Usama Arif [this message]
2023-08-01  2:04   ` Muchun Song
2023-08-02 10:06     ` [External] " Usama Arif
2023-07-30 15:16 ` [v2 2/6] mm: hugetlb_vmemmap: Use nid of the head page to reallocate it Usama Arif
2023-07-30 15:16 ` [v2 3/6] memblock: pass memblock_type to memblock_setclr_flag Usama Arif
2023-07-30 15:16 ` [v2 4/6] memblock: introduce MEMBLOCK_RSRV_NOINIT flag Usama Arif
2023-07-30 15:16 ` [v2 5/6] mm: move allocation of gigantic hstates to the start of mm_core_init Usama Arif
2023-07-30 16:49   ` kernel test robot
2023-08-01  3:07   ` Muchun Song
2023-07-30 15:16 ` [v2 6/6] mm: hugetlb: Skip initialization of struct pages freed later by HVO Usama Arif
2023-07-30 19:33   ` kernel test robot
2023-07-31  0:11   ` kernel test robot
2023-07-31  6:46   ` kernel test robot
2023-07-30 22:28 ` [v2 0/6] mm/memblock: Skip prep and " Usama Arif
