Re: [RFC PATCH] mm: avoid clearing user movable page twice with init_on_alloc=1

From: Zi Yan <ziy@nvidia.com>
To: David Hildenbrand <david@redhat.com>
Cc: linux-mm@kvack.org, Alexander Potapenko <glider@google.com>,
	Kees Cook <keescook@chromium.org>,
	Andrew Morton <akpm@linux-foundation.org>,
	"Matthew Wilcox (Oracle)" <willy@infradead.org>,
	Miaohe Lin <linmiaohe@huawei.com>,
	Kefeng Wang <wangkefeng.wang@huawei.com>,
	John Hubbard <jhubbard@nvidia.com>,
	"Huang, Ying" <ying.huang@intel.com>,
	linux-kernel@vger.kernel.org
Subject: Re: [RFC PATCH] mm: avoid clearing user movable page twice with init_on_alloc=1
Date: Tue, 08 Oct 2024 07:52:43 -0400
Message-ID: <84D24C40-AC10-4FF7-B5F6-63FADD523297@nvidia.com>
In-Reply-To: <9e4e3094-00a2-43bc-996f-af15c3168e3a@redhat.com>

On 8 Oct 2024, at 4:26, David Hildenbrand wrote:

> On 07.10.24 20:23, Zi Yan wrote:
> >> Commit 6471384af2a6 ("mm: security: introduce init_on_alloc=1 and
> >> init_on_free=1 boot options") forces allocated pages to be cleared in
> >> post_alloc_hook() when init_on_alloc=1.
> >>
> >> For non-PMD folios, if the arch does not define
> >> vma_alloc_zeroed_movable_folio(), the default implementation clears
> >> the page returned from the buddy allocator again, so the page is
> >> cleared twice. Fix it by passing __GFP_ZERO instead.
> >> At the moment, s390, arm64, x86, alpha and m68k are not impacted since
> >> they define their own vma_alloc_zeroed_movable_folio().
> >>
> >> For PMD folios, folio_zero_user() is called to clear the folio again.
> >> Fix it by calling folio_zero_user() only if init_on_alloc is not set.
> >> All architectures are impacted.
>>
>> Signed-off-by: Zi Yan <ziy@nvidia.com>
>> ---
>>   include/linux/highmem.h | 14 ++------------
>>   mm/huge_memory.c        |  4 +++-
>>   2 files changed, 5 insertions(+), 13 deletions(-)
>>
>> diff --git a/include/linux/highmem.h b/include/linux/highmem.h
>> index 930a591b9b61..4b15224842e1 100644
>> --- a/include/linux/highmem.h
>> +++ b/include/linux/highmem.h
>> @@ -220,18 +220,8 @@ static inline void clear_user_highpage(struct page *page, unsigned long vaddr)
>>    * Return: A folio containing one allocated and zeroed page or NULL if
>>    * we are out of memory.
>>    */
>> -static inline
>> -struct folio *vma_alloc_zeroed_movable_folio(struct vm_area_struct *vma,
>> -				   unsigned long vaddr)
>> -{
>> -	struct folio *folio;
>> -
>> -	folio = vma_alloc_folio(GFP_HIGHUSER_MOVABLE, 0, vma, vaddr, false);
>> -	if (folio)
>> -		clear_user_highpage(&folio->page, vaddr);
>> -
>> -	return folio;
>> -}
>> +#define vma_alloc_zeroed_movable_folio(vma, vaddr) \
>> +	vma_alloc_folio(GFP_HIGHUSER_MOVABLE | __GFP_ZERO, 0, vma, vaddr, false)
>>   #endif
> >>
> >>   static inline void clear_highpage(struct page *page)
>> diff --git a/mm/huge_memory.c b/mm/huge_memory.c
>> index a7b05f4c2a5e..ff746151896f 100644
>> --- a/mm/huge_memory.c
>> +++ b/mm/huge_memory.c
>> @@ -1177,7 +1177,9 @@ static vm_fault_t __do_huge_pmd_anonymous_page(struct vm_fault *vmf,
>>   		goto release;
>>   	}
>>  -	folio_zero_user(folio, vmf->address);
>> +	if (!static_branch_maybe(CONFIG_INIT_ON_ALLOC_DEFAULT_ON,
>> +				&init_on_alloc))
>> +		folio_zero_user(folio, vmf->address);
>>   	/*
>>   	 * The memory barrier inside __folio_mark_uptodate makes sure that
>>   	 * folio_zero_user writes become visible before the set_pmd_at()
>
> I remember we discussed that in the past and that we do *not* want to sprinkle these CONFIG_INIT_ON_ALLOC_DEFAULT_ON checks all over the kernel.
>
> Ideally, we'd use GFP_ZERO and have the buddy just do that for us? There is the slight chance that we zero-out when we're not going to use the allocated folio, but ... that can happen either way even with the current code?

I agree that putting CONFIG_INIT_ON_ALLOC_DEFAULT_ON here is not ideal, but
folio_zero_user() uses vmf->address to improve cache performance by changing
the subpage clearing order so that the faulting subpage is cleared last. See
commit c79b57e462b5 ("mm: hugetlb: clear target sub-page last when clearing
huge page"). If we use __GFP_ZERO, we lose this optimization. To keep it,
vmf->address would need to be passed to the allocation code. Maybe that is
acceptable?
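
To make the ordering concrete, here is a rough userspace sketch of the
"target subpage last" idea. It only records the order in which subpage
indices would be cleared; it does not touch memory and it is not the
kernel code. clear_order() and its parameters are hypothetical names made
up for this illustration, and the exact sequence is an assumption based on
the behaviour that commit describes.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Userspace model only: fill order[] with the sequence in which the
 * nr subpages of a huge page would be cleared, so that the faulting
 * (target) subpage comes last and its cache lines are still hot when
 * the fault returns. clear_order() is a hypothetical helper, not a
 * kernel function.
 */
void clear_order(long nr, long target, long *order)
{
	long i, base, l, pos = 0;

	assert(nr > 0 && target >= 0 && target < nr);

	if (2 * target <= nr) {
		/* Target in the first half: handle the far tail first. */
		base = 0;
		l = target;
		for (i = nr - 1; i >= 2 * target; i--)
			order[pos++] = i;
	} else {
		/* Target in the second half: handle the far head first. */
		base = nr - 2 * (nr - target);
		l = nr - target;
		for (i = 0; i < base; i++)
			order[pos++] = i;
	}

	/* Converge on the target from both sides, left then right. */
	for (i = 0; i < l; i++) {
		order[pos++] = base + i;
		order[pos++] = base + 2 * l - 1 - i;
	}

	/* Every subpage index has been emitted exactly once. */
	assert(pos == nr);
}
```

With nr = 8 and target = 2 this yields 7 6 5 4 0 3 1 2. A plain __GFP_ZERO
clear in the allocator has no address hint, so it cannot pick such an
order unless the hint is passed down.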

Best Regards,
Yan, Zi
