linux-mm.kvack.org archive mirror
From: Mel Gorman <mgorman@techsingularity.net>
To: Muchun Song <muchun.song@linux.dev>
Cc: Linux-MM <linux-mm@kvack.org>,
	Mike Kravetz <mike.kravetz@oracle.com>,
	Mike Rapoport <rppt@kernel.org>,
	LKML <linux-kernel@vger.kernel.org>,
	Muchun Song <songmuchun@bytedance.com>,
	fam.zheng@bytedance.com, liangma@liangbit.com,
	punit.agrawal@bytedance.com,
	Andrew Morton <akpm@linux-foundation.org>,
	Usama Arif <usama.arif@bytedance.com>
Subject: Re: [External] [v3 4/4] mm: hugetlb: Skip initialization of gigantic tail struct pages if freed by HVO
Date: Thu, 31 Aug 2023 10:58:01 +0100
Message-ID: <20230831095801.76rtpgdsvdijbw5t@techsingularity.net>
In-Reply-To: <A5CD653A-DAA6-481F-963E-AB04D2170088@linux.dev>

On Thu, Aug 31, 2023 at 02:21:06PM +0800, Muchun Song wrote:
> 
> 
> > On Aug 30, 2023, at 18:27, Usama Arif <usama.arif@bytedance.com> wrote:
> > On 28/08/2023 12:33, Muchun Song wrote:
> >>> On Aug 25, 2023, at 19:18, Usama Arif <usama.arif@bytedance.com> wrote:
> >>> 
> >>> The new boot flow when it comes to initialization of gigantic pages
> >>> is as follows:
> >>> - At boot time, for a gigantic page during __alloc_bootmem_hugepage,
> >>> the region after the first struct page is marked as noinit.
> >>> - This results in only the first struct page being
> >>> initialized in reserve_bootmem_region. As the tail struct pages are
> >>> not initialized at this point, there can be a significant saving
> >>> in boot time if HVO succeeds later on.
> >>> - Later on in the boot, HVO is attempted. If it's successful, only the first
> >>> HUGETLB_VMEMMAP_RESERVE_SIZE / sizeof(struct page) - 1 tail struct pages
> >>> after the head struct page are initialized. If it is not successful,
> >>> then all of the tail struct pages are initialized.
> >>> 
> >>> Signed-off-by: Usama Arif <usama.arif@bytedance.com>
> >> This edition is simpler than ever before, thanks for your work.
> >> There is a premise that other subsystems do not access vmemmap pages
> >> before the initialization of vmemmap pages associated with the HugeTLB
> >> pages allocated from bootmem for your optimization. However, IIUC, the
> >> compaction path could access an arbitrary struct page when memory fails
> >> to be allocated via the buddy allocator. So we should make sure that
> >> those struct pages are not referenced in this routine. And I know
> >> if CONFIG_DEFERRED_STRUCT_PAGE_INIT is enabled, it will encounter
> >> the same issue, but I don't find any code to prevent this from
> >> happening. I need more time to confirm this; if someone already knows,
> >> please let me know, thanks. So I think HugeTLB should adopt a similar
> >> approach to prevent this.
> >> Thanks.
> > 
> > Thanks for the reviews.
> > 
> > So if I understand it correctly, the uninitialized pages due to the optimization in this patch and due to DEFERRED_STRUCT_PAGE_INIT should be treated in the same way during compaction. I see that in isolate_freepages during compaction there is a check to see if the PageBuddy flag is set, and there are also calls like __pageblock_pfn_to_page to check if the pageblock is valid.
> > 
> > But if the struct pages are uninitialized, then they would contain random data, and these checks could pass if certain bits were set?
> > 
> > Compaction is done on the free list. I think the uninitialized struct pages, at least those from DEFERRED_STRUCT_PAGE_INIT, would be part of the free list, so I think their pfns would be considered for compaction.
> > 
> > Could someone more familiar with DEFERRED_STRUCT_PAGE_INIT and compaction confirm how the uninitialized struct pages are handled when compaction happens? Thanks!
> 
> Hi Mel,
> 
> Could you help us answer this question? I think you must be the expert on
> CONFIG_DEFERRED_STRUCT_PAGE_INIT. I summarize the context here. As we all know,
> some struct pages are uninitialized when CONFIG_DEFERRED_STRUCT_PAGE_INIT is
> enabled. If someone requests a higher-order allocation (e.g. order 4) via the
> buddy allocator and the allocation fails, then we will go into the compaction
> routine, which will traverse all pfns and use pfn_to_page to access their struct
> pages. However, those struct pages may be uninitialized (so they hold arbitrary data).
> Our question is how to prevent the compaction routine from accessing those
> uninitialized struct pages? We would appreciate it if you know the answer.
> 

I didn't check the code but IIRC, the struct pages should be at least
valid and not contain arbitrary data once page_alloc_init_late finishes.

-- 
Mel Gorman
SUSE Labs
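
The tail-page counts in the boot flow quoted from the commit message can be made concrete with a small sketch. This is not code from the patch. It assumes a 4 KiB base page, a 64-byte struct page, a 1 GiB gigantic page, and that HUGETLB_VMEMMAP_RESERVE_SIZE equals one base page; none of these values is stated in this thread. The helper and macro names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Assumed values for illustration only; none is stated in this thread. */
#define EXAMPLE_PAGE_SIZE         4096UL            /* base page size (assumed 4 KiB) */
#define EXAMPLE_STRUCT_PAGE_SIZE  64UL              /* sizeof(struct page) (assumed) */
#define EXAMPLE_VMEMMAP_RESERVE   EXAMPLE_PAGE_SIZE /* stand-in for HUGETLB_VMEMMAP_RESERVE_SIZE (assumed) */
#define EXAMPLE_GIGANTIC_SIZE     (1UL << 30)       /* 1 GiB gigantic page (assumed) */

/* Number of struct pages that describe one huge page. */
static unsigned long struct_pages_per_hugepage(unsigned long huge_size,
                                               unsigned long page_size)
{
    assert(page_size != 0);
    return huge_size / page_size;
}

/*
 * Tail struct pages initialized when HVO succeeds, following the formula
 * in the commit message: RESERVE_SIZE / sizeof(struct page) - 1.
 */
static unsigned long tails_initialized_with_hvo(unsigned long reserve_size,
                                                unsigned long struct_page_size)
{
    assert(struct_page_size != 0 && reserve_size >= struct_page_size);
    return reserve_size / struct_page_size - 1;
}

/* Tail struct pages initialized when HVO fails: every one except the head. */
static unsigned long tails_initialized_without_hvo(unsigned long huge_size,
                                                   unsigned long page_size)
{
    return struct_pages_per_hugepage(huge_size, page_size) - 1;
}
```

Under these assumptions, a successful HVO leaves 63 tail struct pages to initialize per gigantic page, against 262143 when HVO fails, which is where the boot-time saving described in the commit message would come from.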

