From: Mike Kravetz <mike.kravetz@oracle.com>
To: Muchun Song <muchun.song@linux.dev>
Cc: Usama Arif <usama.arif@bytedance.com>,
Linux-MM <linux-mm@kvack.org>, Mike Rapoport <rppt@kernel.org>,
LKML <linux-kernel@vger.kernel.org>,
Muchun Song <songmuchun@bytedance.com>,
fam.zheng@bytedance.com, liangma@liangbit.com,
punit.agrawal@bytedance.com
Subject: Re: [v3 4/4] mm: hugetlb: Skip initialization of gigantic tail struct pages if freed by HVO
Date: Mon, 28 Aug 2023 20:47:01 -0700
Message-ID: <20230829034701.GG3290@monkey>
In-Reply-To: <A9D058DC-699B-4B6D-90EC-D81ADD32C6DD@linux.dev>
On 08/29/23 11:33, Muchun Song wrote:
>
>
> > On Aug 29, 2023, at 05:04, Mike Kravetz <mike.kravetz@oracle.com> wrote:
> >
> > On 08/28/23 19:33, Muchun Song wrote:
> >>
> >>
> >>> On Aug 25, 2023, at 19:18, Usama Arif <usama.arif@bytedance.com> wrote:
> >>>
> >>> The new boot flow when it comes to initialization of gigantic pages
> >>> is as follows:
> >>> - At boot time, for a gigantic page during __alloc_bootmem_hugepage,
> >>> the region after the first struct page is marked as noinit.
> > >>> - This results in only the first struct page being
> >>> initialized in reserve_bootmem_region. As the tail struct pages are
> >>> not initialized at this point, there can be a significant saving
> >>> in boot time if HVO succeeds later on.
> > >>> - Later on in the boot, HVO is attempted. If it is successful, only the first
> >>> HUGETLB_VMEMMAP_RESERVE_SIZE / sizeof(struct page) - 1 tail struct pages
> >>> after the head struct page are initialized. If it is not successful,
> >>> then all of the tail struct pages are initialized.
> >>>
> >>> Signed-off-by: Usama Arif <usama.arif@bytedance.com>
> >>
> > >> This edition is simpler than ever before, thanks for your work.
> >>
> > >> There is a premise that other subsystems do not access vmemmap pages
> > >> before the initialization of vmemmap pages associated with the HugeTLB
> > >> pages allocated from bootmem for your optimization. However, IIUC, the
> >> compacting path could access arbitrary struct page when memory fails
> >> to be allocated via buddy allocator. So we should make sure that
> >> those struct pages are not referenced in this routine. And I know
> >> if CONFIG_DEFERRED_STRUCT_PAGE_INIT is enabled, it will encounter
> >> the same issue, but I don't find any code to prevent this from
> >> happening. I need more time to confirm this, if someone already knows,
> > >> please let me know, thanks. So I think HugeTLB should adopt a similar
> > >> way to prevent this.
> >
> > In this patch, the call to hugetlb_vmemmap_optimize() is moved BEFORE
> > __prep_new_hugetlb_folio or prep_new_hugetlb_folio in all code paths.
> > The prep_new_hugetlb_folio routine(s) are what set the destructor (soon
> > to be a flag) that identifies the set of pages as a hugetlb page. So,
> > there is now a window where a set of pages not identified as hugetlb
> > will not have vmemmap pages.
>
> Thanks for pointing it out.
>
> Seems this issue is not related to this change? hugetlb_vmemmap_optimize()
> is called before the setting of destructor since the initial commit
> f41f2ed43ca5. Right?
>
Thanks Muchun!
Yes, this issue exists today. It was the further separation of the calls in
this patch which pointed out the issue to me.
I overlooked the fact that the issue already exists. :(
> >
> > Recently, I closed the same window in the hugetlb freeing code paths with
> > commit 32c877191e02 'hugetlb: do not clear hugetlb dtor until allocating'.
>
> Yes, I saw it.
>
> > This patch needs to be reworked so that this window is not opened in the
> > allocation paths.
>
> So I think the fix should be a separate series.
>
Right. I can fix that up separately.
--
Mike Kravetz
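
As an illustration of the ordering problem discussed in this message, here
is a toy C model. It is only a sketch: struct toy_folio, the two flags and
the helper names are hypothetical and are not kernel types or functions.
It assumes that an observer decides whether a set of pages is hugetlb
purely from the marker set by the prep step, and it shows that optimizing
the vmemmap before setting that marker passes through a state where the
pages lack vmemmap but are not yet identified as hugetlb. It does not
represent the eventual fix.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the state of a gigantic folio; not a kernel type. */
struct toy_folio {
	bool marked_hugetlb;    /* set by the "prep" step (destructor/flag) */
	bool vmemmap_optimized; /* tail vmemmap pages already freed by HVO */
};

/* Steps of the toy model; names are illustrative only. */
enum toy_step { TOY_OPTIMIZE, TOY_PREP };

/*
 * True when the pages have lost their vmemmap but are not yet
 * identified as hugetlb, i.e. the window described in the thread.
 */
static bool in_unsafe_window(const struct toy_folio *f)
{
	return f->vmemmap_optimized && !f->marked_hugetlb;
}

/*
 * Apply the two steps in the given order and report whether the
 * state after either step falls inside the unsafe window.
 */
static bool order_opens_window(enum toy_step first, enum toy_step second)
{
	struct toy_folio f = { false, false };
	enum toy_step steps[2] = { first, second };
	bool unsafe = false;

	assert(first != second); /* the model expects each step exactly once */

	for (int i = 0; i < 2; i++) {
		if (steps[i] == TOY_OPTIMIZE)
			f.vmemmap_optimized = true;
		else
			f.marked_hugetlb = true;
		unsafe = unsafe || in_unsafe_window(&f);
	}
	return unsafe;
}
```

In this model, order_opens_window(TOY_OPTIMIZE, TOY_PREP) reports the
window, while order_opens_window(TOY_PREP, TOY_OPTIMIZE) does not.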
Thread overview: 25+ messages
2023-08-25 11:18 [v3 0/4] " Usama Arif
2023-08-25 11:18 ` [v3 1/4] mm: hugetlb_vmemmap: Use nid of the head page to reallocate it Usama Arif
2023-08-28 7:15 ` Muchun Song
2023-08-28 18:25 ` Mike Kravetz
2023-08-25 11:18 ` [v3 2/4] memblock: pass memblock_type to memblock_setclr_flag Usama Arif
2023-08-28 7:16 ` Muchun Song
2023-08-28 7:37 ` Mike Rapoport
2023-08-28 18:39 ` Mike Kravetz
2023-08-25 11:18 ` [v3 3/4] memblock: introduce MEMBLOCK_RSRV_NOINIT_VMEMMAP flag Usama Arif
2023-08-28 7:26 ` Muchun Song
2023-08-28 7:47 ` Mike Rapoport
2023-08-28 8:52 ` Muchun Song
2023-08-28 9:09 ` Mike Rapoport
2023-08-28 9:18 ` Muchun Song
2023-08-25 11:18 ` [v3 4/4] mm: hugetlb: Skip initialization of gigantic tail struct pages if freed by HVO Usama Arif
2023-08-28 11:33 ` Muchun Song
2023-08-28 21:04 ` Mike Kravetz
2023-08-29 3:33 ` Muchun Song
2023-08-29 3:47 ` Mike Kravetz [this message]
2023-08-30 10:27 ` [External] " Usama Arif
2023-08-31 6:21 ` [External] " Muchun Song
2023-08-31 9:58 ` Mel Gorman
2023-08-31 10:01 ` Muchun Song
2023-08-31 10:28 ` Mel Gorman
2023-08-31 7:33 ` [External] " Mike Rapoport
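
The tail struct page count in the quoted commit message,
HUGETLB_VMEMMAP_RESERVE_SIZE / sizeof(struct page) - 1, can be worked
through with a small C sketch. All values below are assumptions chosen for
illustration and are not stated in the thread: 4 KiB base pages, a 64-byte
struct page, a 1 GiB gigantic page, and a reserve size of one base page.
The TOY_* names and helper functions are hypothetical.

```c
#include <assert.h>

/*
 * Assumed values, not taken from the patch: 4 KiB base pages, a 64-byte
 * struct page, a 1 GiB gigantic page, and a reserve of one base page.
 */
#define TOY_PAGE_SIZE        4096UL
#define TOY_STRUCT_PAGE_SIZE 64UL
#define TOY_GIGANTIC_SIZE    (1UL << 30)
#define TOY_VMEMMAP_RESERVE  TOY_PAGE_SIZE

/* Number of struct pages (head plus tails) describing one gigantic page. */
static unsigned long toy_struct_pages_per_gigantic(void)
{
	return TOY_GIGANTIC_SIZE / TOY_PAGE_SIZE;
}

/* Tail struct pages initialized when HVO succeeds (formula from the thread). */
static unsigned long toy_tails_init_hvo_success(void)
{
	/* the reserve is expected to hold a whole number of struct pages */
	assert(TOY_VMEMMAP_RESERVE % TOY_STRUCT_PAGE_SIZE == 0);
	return TOY_VMEMMAP_RESERVE / TOY_STRUCT_PAGE_SIZE - 1;
}

/* Tail struct pages initialized when HVO fails: every tail. */
static unsigned long toy_tails_init_hvo_failure(void)
{
	return toy_struct_pages_per_gigantic() - 1;
}

/* Tail initializations avoided per gigantic page when HVO succeeds. */
static unsigned long toy_tails_skipped(void)
{
	return toy_tails_init_hvo_failure() - toy_tails_init_hvo_success();
}
```

Under these assumptions, 63 tail struct pages are initialized when HVO
succeeds, against 262143 when it fails, so 262080 initializations are
skipped per gigantic page.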