* [PATCH] mm/memfd: clear hugetlb pages on allocation
From: Deepanshu Kartikey @ 2025-11-12 3:16 UTC
  To: hughd, baolin.wang, akpm
  Cc: linux-mm, linux-kernel, Deepanshu Kartikey, syzbot+f64019ba229e3a5c411b

When allocating hugetlb pages for memfd, the pages are not zeroed,
which leads to uninitialized kernel memory being exposed to userspace
through read() or mmap() operations.

The issue arises because hugetlb_reserve_pages() can allocate pages
through the surplus allocation path without the __GFP_ZERO flag. These
pages are added to the reservation pool and later returned by
alloc_hugetlb_folio_reserve() without being cleared, resulting in
uninitialized memory being accessible to userspace.

This is a security vulnerability as it allows information disclosure of
potentially sensitive kernel data. Fix it by explicitly zeroing the
folio after allocation using folio_zero_range().

This is particularly important for udmabuf use cases where these pages
are pinned and directly accessed by userspace via DMA buffers.

Reproducer:
- Create memfd with MFD_HUGETLB flag
- Use UDMABUF_CREATE ioctl to pin the hugetlb pages
- Read from the memfd using preadv()
- KMSAN detects uninitialized memory being copied to userspace

Reported-by: syzbot+f64019ba229e3a5c411b@syzkaller.appspotmail.com
Closes: https://syzkaller.appspot.com/bug?extid=f64019ba229e3a5c411b
Tested-by: syzbot+f64019ba229e3a5c411b@syzkaller.appspotmail.com
Signed-off-by: Deepanshu Kartikey <kartikey406@gmail.com>
---
 mm/memfd.c | 6 ++++++
 1 file changed, 6 insertions(+)

diff --git a/mm/memfd.c b/mm/memfd.c
index 1d109c1acf21..f8cfc2909507 100644
--- a/mm/memfd.c
+++ b/mm/memfd.c
@@ -96,6 +96,12 @@ struct folio *memfd_alloc_folio(struct file *memfd, pgoff_t idx)
 						    NULL,
 						    gfp_mask);
 		if (folio) {
+			/*
+			 * Zero the folio to prevent information leaks to userspace.
+			 * The folio may have been allocated during hugetlb_reserve_pages()
+			 * without __GFP_ZERO, so explicitly clear it here.
+			 */
+			folio_zero_range(folio, 0, folio_size(folio));
 			err = hugetlb_add_to_page_cache(folio,
 							memfd->f_mapping,
 							idx);
--
2.43.0

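[Editor's sketch] A minimal userspace illustration of the reproducer steps listed in the
patch description. This is not the syzbot-generated program: it assumes 2 MiB default
hugepages are available, a glibc recent enough to provide memfd_create()/MFD_HUGETLB,
and the standard <linux/udmabuf.h> ABI; error handling is trimmed.

#define _GNU_SOURCE
#include <fcntl.h>
#include <stdio.h>
#include <sys/ioctl.h>
#include <sys/mman.h>
#include <sys/uio.h>
#include <unistd.h>
#include <linux/udmabuf.h>

int main(void)
{
	size_t size = 2UL << 20;			/* one 2 MiB hugetlb page */
	unsigned char buf[64] = { 0 };
	struct iovec iov = { .iov_base = buf, .iov_len = sizeof(buf) };
	struct udmabuf_create create = { 0 };
	int memfd, devfd, dmabuf;

	memfd = memfd_create("hugetlb-leak", MFD_HUGETLB | MFD_ALLOW_SEALING);
	ftruncate(memfd, size);
	/* udmabuf refuses memfds that could still shrink underneath it. */
	fcntl(memfd, F_ADD_SEALS, F_SEAL_SHRINK);

	/*
	 * Pinning through udmabuf allocates the hugetlb folio via
	 * memfd_alloc_folio() -- the path that skipped zeroing.
	 */
	devfd = open("/dev/udmabuf", O_RDWR);
	create.memfd  = memfd;
	create.offset = 0;
	create.size   = size;
	dmabuf = ioctl(devfd, UDMABUF_CREATE, &create);

	/* Read the never-initialized page back through the memfd. */
	preadv(memfd, &iov, 1, 0);
	printf("dmabuf fd %d, first byte 0x%02x\n", dmabuf, buf[0]);
	return 0;
}
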
* Re: [PATCH] mm/memfd: clear hugetlb pages on allocation
From: Hugh Dickins @ 2025-11-12 6:55 UTC
  To: Muchun Song, Oscar Salvador, David Hildenbrand
  Cc: Deepanshu Kartikey, Vivek Kasireddy, hughd, baolin.wang, akpm,
      linux-mm, linux-kernel, syzbot+f64019ba229e3a5c411b

On Wed, 12 Nov 2025, Deepanshu Kartikey wrote:

> When allocating hugetlb pages for memfd, the pages are not zeroed,
> which leads to uninitialized kernel memory being exposed to userspace
> through read() or mmap() operations.
>
> The issue arises because hugetlb_reserve_pages() can allocate pages
> through the surplus allocation path without the __GFP_ZERO flag. These
> pages are added to the reservation pool and later returned by
> alloc_hugetlb_folio_reserve() without being cleared, resulting in
> uninitialized memory being accessible to userspace.
>
> This is a security vulnerability as it allows information disclosure of
> potentially sensitive kernel data. Fix it by explicitly zeroing the
> folio after allocation using folio_zero_range().
>
> This is particularly important for udmabuf use cases where these pages
> are pinned and directly accessed by userspace via DMA buffers.
>
> Reproducer:
> - Create memfd with MFD_HUGETLB flag
> - Use UDMABUF_CREATE ioctl to pin the hugetlb pages
> - Read from the memfd using preadv()
> - KMSAN detects uninitialized memory being copied to userspace
>
> Reported-by: syzbot+f64019ba229e3a5c411b@syzkaller.appspotmail.com
> Closes: https://syzkaller.appspot.com/bug?extid=f64019ba229e3a5c411b
> Tested-by: syzbot+f64019ba229e3a5c411b@syzkaller.appspotmail.com
> Signed-off-by: Deepanshu Kartikey <kartikey406@gmail.com>

Thanks a lot, Deepanshu and syzbot: this sounds horrid, and important
to fix very soon; and it will need a Fixes tag (with stable Cc'ed when
the fix goes into mm.git). I presume it's

Fixes: 89c1905d9c14 ("mm/gup: introduce memfd_pin_folios() for pinning memfd folios")

But although my name appears against mm/memfd.c, the truth is I know
little of hugetlb (maintainers now addressed), and when its folios
are supposed to get zeroed (would a __GFP_ZERO somewhere be better?).

I was puzzled by how udmabuf came into the picture, since hugetlbfs
has always supported the read (not write) system call: but see now
that there is this surprising backdoor into the hugetlb subsystem,
via memfd and GUP pinning.

And where does that folio get marked uptodate, or is "uptodate"
irrelevant on hugetlbfs? Are the right locks taken, or could
there be races when adding to hugetlbfs cache in this way?

Muchun, Oscar, David, I think this needs your eyes please!

I sense that there could easily be other bugs hereabouts, but perhaps
the lack of zeroing needs to be addressed before worrying further.

Thanks,
Hugh

> ---
>  mm/memfd.c | 6 ++++++
>  1 file changed, 6 insertions(+)
>
> diff --git a/mm/memfd.c b/mm/memfd.c
> index 1d109c1acf21..f8cfc2909507 100644
> --- a/mm/memfd.c
> +++ b/mm/memfd.c
> @@ -96,6 +96,12 @@ struct folio *memfd_alloc_folio(struct file *memfd, pgoff_t idx)
>  						    NULL,
>  						    gfp_mask);
>  		if (folio) {
> +			/*
> +			 * Zero the folio to prevent information leaks to userspace.
> +			 * The folio may have been allocated during hugetlb_reserve_pages()
> +			 * without __GFP_ZERO, so explicitly clear it here.
> +			 */
> +			folio_zero_range(folio, 0, folio_size(folio));
>  			err = hugetlb_add_to_page_cache(folio,
>  							memfd->f_mapping,
>  							idx);
> --
> 2.43.0

* Re: [PATCH] mm/memfd: clear hugetlb pages on allocation
From: Deepanshu Kartikey @ 2025-11-12 7:28 UTC
  To: Hugh Dickins
  Cc: Muchun Song, Oscar Salvador, David Hildenbrand, Vivek Kasireddy,
      baolin.wang, akpm, linux-mm, linux-kernel, syzbot+f64019ba229e3a5c411b

Hi Hugh,

Thank you for the quick review and for looping in the hugetlb maintainers!

You raise good points about the approach. I chose explicit zeroing in
memfd_alloc_folio() because hugetlb_reserve_pages() can allocate pages
without seeing the __GFP_ZERO flag, but I'm happy to revise if the
hugetlb maintainers prefer a different approach.

I'll add the Fixes: 89c1905d9c14 tag and Cc: stable in v2.

Should I send v2 now with just the tag added, or wait for feedback from
Muchun/Oscar/David on the overall approach first?

Thanks,
Deepanshu

* Re: [PATCH] mm/memfd: clear hugetlb pages on allocation
From: Hugh Dickins @ 2025-11-12 7:55 UTC
  To: Deepanshu Kartikey
  Cc: Hugh Dickins, Muchun Song, Oscar Salvador, David Hildenbrand,
      Vivek Kasireddy, baolin.wang, akpm, linux-mm, linux-kernel,
      syzbot+f64019ba229e3a5c411b

On Wed, 12 Nov 2025, Deepanshu Kartikey wrote:
> Hi Hugh,
>
> Thank you for the quick review and for looping in the hugetlb maintainers!
>
> You raise good points about the approach. I chose explicit zeroing in
> memfd_alloc_folio() because hugetlb_reserve_pages() can allocate pages
> without seeing the __GFP_ZERO flag, but I'm happy to revise if the
> hugetlb maintainers prefer a different approach.
>
> I'll add the Fixes: 89c1905d9c14 tag and Cc: stable in v2.
>
> Should I send v2 now with just the tag added, or wait for feedback from
> Muchun/Oscar/David on the overall approach first?

No need for a v2 at this stage - Andrew is very much more than capable
of adding in that Fixes tag and Cc stable if he's inclined to grab your
patch for mm.git in the interim, but let's wait to hear from hugetlb
folks before finalizing (I expect they'll say __GFP_ZERO is no good).

Hugh

* Re: [PATCH] mm/memfd: clear hugetlb pages on allocation
From: Oscar Salvador @ 2025-11-12 9:13 UTC
  To: Hugh Dickins
  Cc: Muchun Song, David Hildenbrand, Deepanshu Kartikey, Vivek Kasireddy,
      baolin.wang, akpm, linux-mm, linux-kernel, syzbot+f64019ba229e3a5c411b

On Tue, Nov 11, 2025 at 10:55:03PM -0800, Hugh Dickins wrote:
> Thanks a lot, Deepanshu and syzbot: this sounds horrid, and important
> to fix very soon; and it will need a Fixes tag (with stable Cc'ed when
> the fix goes into mm.git). I presume it's
>
> Fixes: 89c1905d9c14 ("mm/gup: introduce memfd_pin_folios() for pinning memfd folios")
>
> But although my name appears against mm/memfd.c, the truth is I know
> little of hugetlb (maintainers now addressed), and when its folios
> are supposed to get zeroed (would a __GFP_ZERO somewhere be better?).
>
> I was puzzled by how udmabuf came into the picture, since hugetlbfs
> has always supported the read (not write) system call: but see now
> that there is this surprising backdoor into the hugetlb subsystem,
> via memfd and GUP pinning.
>
> And where does that folio get marked uptodate, or is "uptodate"
> irrelevant on hugetlbfs? Are the right locks taken, or could
> there be races when adding to hugetlbfs cache in this way?

Thanks Hugh for raising this up.

memfd_alloc_folio() seems to try to recreate what hugetlb_no_page()
would do (slightly different though).

The thing is that, as far as I know, we should grab the hugetlb mutex
before trying to add a new page to the page cache, per the comment in
hugetlb_fault():

"
/*
 * Serialize hugepage allocation and instantiation, so that we don't
 * get spurious allocation failures if two CPUs race to instantiate
 * the same page in the page cache.
 */
"

and at least that is what all callers of hugetlb_add_to_page_cache() do
at this moment, all except memfd_alloc_folio(), so I guess this one
needs fixing.

Regarding the uptodate question, I do not see what is special about this
situation that we would not need it.
We seem to be marking the folio uptodate every time we allocate a folio
__and__ before adding it into the page cache (which is expected, right?).

Now, for the GFP_ZERO question.
This one is nasty.
hugetlb_reserve_pages() will allocate surplus folios without zeroing, but
those will be zeroed in the faulting path before being mapped into
userspace page tables (see folio_zero_user() in hugetlb_no_page()).
So unless I am missing something, we need to zero them in this case as well.

-- 
Oscar Salvador
SUSE Labs

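[Editor's sketch] The serialization Oscar refers to looks roughly like the following in
the existing callers of hugetlb_add_to_page_cache(). This is a simplified illustration
of the pattern, not a verbatim excerpt of hugetlb_fault()/hugetlb_no_page(); the real
fault path also handles racing lookups, reservations and truncation.

	struct address_space *mapping = ...;	/* file's mapping */
	pgoff_t idx = ...;			/* hugepage-sized index */
	u32 hash;

	/* Hash (mapping, idx) onto one of the fault mutexes. */
	hash = hugetlb_fault_mutex_hash(mapping, idx);
	mutex_lock(&hugetlb_fault_mutex_table[hash]);

	/* ... allocate the folio, zero it, mark it uptodate ... */

	err = hugetlb_add_to_page_cache(folio, mapping, idx);

	mutex_unlock(&hugetlb_fault_mutex_table[hash]);
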
* Re: [PATCH] mm/memfd: clear hugetlb pages on allocation
From: Deepanshu Kartikey @ 2025-11-12 9:26 UTC
  To: Oscar Salvador
  Cc: Hugh Dickins, Muchun Song, David Hildenbrand, Vivek Kasireddy,
      baolin.wang, akpm, linux-mm, linux-kernel, syzbot+f64019ba229e3a5c411b

Hi Oscar,

Thank you for catching these issues! I have a question about scope:

Should I fix all three issues (zeroing, locking, uptodate) in a single
patch for v2, or would you prefer:
1. My current patch for just the zeroing (security fix), and
2. A separate follow-up patch for the locking and uptodate issues?

I'm happy to do either - just want to make sure I'm following the
preferred approach for the mm subsystem.

Thanks,
Deepanshu

* Re: [PATCH] mm/memfd: clear hugetlb pages on allocation
From: David Hildenbrand (Red Hat) @ 2025-11-12 10:09 UTC
  To: Oscar Salvador, Hugh Dickins
  Cc: Muchun Song, Deepanshu Kartikey, Vivek Kasireddy, baolin.wang, akpm,
      linux-mm, linux-kernel, syzbot+f64019ba229e3a5c411b

On 12.11.25 10:13, Oscar Salvador wrote:
> On Tue, Nov 11, 2025 at 10:55:03PM -0800, Hugh Dickins wrote:
>> Thanks a lot, Deepanshu and syzbot: this sounds horrid, and important
>> to fix very soon; and it will need a Fixes tag (with stable Cc'ed when
>> the fix goes into mm.git). I presume it's
>>
>> Fixes: 89c1905d9c14 ("mm/gup: introduce memfd_pin_folios() for pinning memfd folios")
>>
>> But although my name appears against mm/memfd.c, the truth is I know
>> little of hugetlb (maintainers now addressed), and when its folios
>> are supposed to get zeroed (would a __GFP_ZERO somewhere be better?).
>>
>> I was puzzled by how udmabuf came into the picture, since hugetlbfs
>> has always supported the read (not write) system call: but see now
>> that there is this surprising backdoor into the hugetlb subsystem,
>> via memfd and GUP pinning.
>>
>> And where does that folio get marked uptodate, or is "uptodate"
>> irrelevant on hugetlbfs? Are the right locks taken, or could
>> there be races when adding to hugetlbfs cache in this way?
>
> Thanks Hugh for raising this up.
>
> memfd_alloc_folio() seems to try to recreate what hugetlb_no_page()
> would do (slightly different though).

Can we factor that out to merge both paths?

> The thing is that, as far as I know, we should grab the hugetlb mutex
> before trying to add a new page to the page cache, per the comment in
> hugetlb_fault():
>
> "
> /*
>  * Serialize hugepage allocation and instantiation, so that we don't
>  * get spurious allocation failures if two CPUs race to instantiate
>  * the same page in the page cache.
>  */
> "
>
> and at least that is what all callers of hugetlb_add_to_page_cache() do
> at this moment, all except memfd_alloc_folio(), so I guess this one
> needs fixing.
>
> Regarding the uptodate question, I do not see what is special about this
> situation that we would not need it.
> We seem to be marking the folio uptodate every time we allocate a folio
> __and__ before adding it into the page cache (which is expected, right?).

Right, at least filemap.c heavily depends on it being set (I don't think
hugetlb itself needs it).

> Now, for the GFP_ZERO question.
> This one is nasty.
> hugetlb_reserve_pages() will allocate surplus folios without zeroing, but
> those will be zeroed in the faulting path before being mapped into
> userspace page tables (see folio_zero_user() in hugetlb_no_page()).
> So unless I am missing something, we need to zero them in this case as well.

I assume we want to avoid GFP_ZERO and use folio_zero_user(), which is
optimized for zeroing huge/gigantic pages.

-- 
Cheers

David

* Re: [PATCH] mm/memfd: clear hugetlb pages on allocation
From: Oscar Salvador @ 2025-11-12 11:56 UTC
  To: David Hildenbrand (Red Hat)
  Cc: Hugh Dickins, Muchun Song, Deepanshu Kartikey, Vivek Kasireddy,
      baolin.wang, akpm, linux-mm, linux-kernel, syzbot+f64019ba229e3a5c411b

On Wed, Nov 12, 2025 at 11:09:51AM +0100, David Hildenbrand (Red Hat) wrote:
> On 12.11.25 10:13, Oscar Salvador wrote:
> > memfd_alloc_folio() seems to try to recreate what hugetlb_no_page()
> > would do (slightly different though).
>
> Can we factor that out to merge both paths?

I guess it is worth looking into it; I shall fiddle with it.

> > Regarding the uptodate question, I do not see what is special about this
> > situation that we would not need it.
> > We seem to be marking the folio uptodate every time we allocate a folio
> > __and__ before adding it into the page cache (which is expected, right?).
>
> Right, at least filemap.c heavily depends on it being set (I don't think
> hugetlb itself needs it).

Yes, you are probably right.

> > Now, for the GFP_ZERO question.
> > This one is nasty.
> > hugetlb_reserve_pages() will allocate surplus folios without zeroing, but
> > those will be zeroed in the faulting path before being mapped into
> > userspace page tables (see folio_zero_user() in hugetlb_no_page()).
> > So unless I am missing something, we need to zero them in this case as well.
>
> I assume we want to avoid GFP_ZERO and use folio_zero_user(), which is
> optimized for zeroing huge/gigantic pages.

Yes, I would go with folio_zero_user() as well, to match what we do in
all paths.

Maybe if we can factor it out, we can simplify it, as right now it seems a
small duplication of hugetlb_no_page() (and more so once we add what is
missing: mutex, uptodate and folio_zero_user).

-- 
Oscar Salvador
SUSE Labs

* Re: [PATCH] mm/memfd: clear hugetlb pages on allocation
From: Deepanshu Kartikey @ 2025-11-12 12:06 UTC
  To: Oscar Salvador
  Cc: David Hildenbrand (Red Hat), Hugh Dickins, Muchun Song, Vivek Kasireddy,
      baolin.wang, akpm, linux-mm, linux-kernel, syzbot+f64019ba229e3a5c411b

Hi Oscar and David,

Thanks for the guidance!

> I guess it is worth looking into it; I shall fiddle with it.

Great, I'll focus on fixing the immediate bugs in v2 and you can handle
the refactoring in a follow-up. This keeps my patch focused on the
security fix + the missing initialization steps.

> Yes, I would go with folio_zero_user() as well, to match what we do in
> all paths.

Understood. I'll use folio_zero_user() in v2.

So for v2, I'll add:
1. folio_zero_user() instead of folio_zero_range()
2. folio_mark_uptodate()
3. hugetlb_fault_mutex locking around hugetlb_add_to_page_cache()

This will match the pattern in hugetlb_no_page() and fix the information
leak, missing uptodate flag, and locking issue.

I'll send v2 shortly after testing.

Thanks,
Deepanshu

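[Editor's sketch] A rough, untested sketch of what that v2 shape could look like inside
memfd_alloc_folio(), mirroring the hugetlb_no_page() pattern. Variable names follow the
current code; the error handling and the choice of folio_mark_uptodate() over
__folio_mark_uptodate() are assumptions here, and this is not the submitted v2 patch
(see the lore link in the next message for the real thing).

		folio = alloc_hugetlb_folio_reserve(h,
						    numa_node_id(),
						    NULL,
						    gfp_mask);
		if (folio) {
			u32 hash = hugetlb_fault_mutex_hash(memfd->f_mapping, idx);

			/*
			 * Serialize with the fault path, like the other
			 * hugetlb_add_to_page_cache() callers.
			 */
			mutex_lock(&hugetlb_fault_mutex_table[hash]);

			/*
			 * Reserve-pool folios may never have been cleared;
			 * there is no faulting address to use as a hint.
			 */
			folio_zero_user(folio, 0);
			folio_mark_uptodate(folio);

			err = hugetlb_add_to_page_cache(folio,
							memfd->f_mapping,
							idx);
			mutex_unlock(&hugetlb_fault_mutex_table[hash]);
			/* error handling continues as in the current code */
		}
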
* Re: [PATCH] mm/memfd: clear hugetlb pages on allocation
From: Deepanshu Kartikey @ 2025-11-12 14:54 UTC
  To: Oscar Salvador
  Cc: David Hildenbrand (Red Hat), Hugh Dickins, Muchun Song, Vivek Kasireddy,
      baolin.wang, akpm, linux-mm, linux-kernel, syzbot+f64019ba229e3a5c411b

v2 sent: https://lore.kernel.org/all/20251112145034.2320452-1-kartikey406@gmail.com/T/

Changes in v2:
- Used folio_zero_user() as suggested by Oscar and David
- Added folio_mark_uptodate()
- Added proper hugetlb_fault_mutex locking

Thanks for the reviews!