* [PATCH] mm/memfd: clear hugetlb pages on allocation
From: Deepanshu Kartikey @ 2025-11-12 3:16 UTC
  To: hughd, baolin.wang, akpm
  Cc: linux-mm, linux-kernel, Deepanshu Kartikey, syzbot+f64019ba229e3a5c411b

When allocating hugetlb pages for memfd, the pages are not zeroed,
which leads to uninitialized kernel memory being exposed to userspace
through read() or mmap() operations.

The issue arises because hugetlb_reserve_pages() can allocate pages
through the surplus allocation path without the __GFP_ZERO flag. These
pages are added to the reservation pool and later returned by
alloc_hugetlb_folio_reserve() without being cleared, resulting in
uninitialized memory being accessible to userspace.

This is a security vulnerability as it allows information disclosure of
potentially sensitive kernel data. Fix it by explicitly zeroing the
folio after allocation using folio_zero_range().

This is particularly important for udmabuf use cases where these pages
are pinned and directly accessed by userspace via DMA buffers.

Reproducer:
- Create memfd with MFD_HUGETLB flag
- Use UDMABUF_CREATE ioctl to pin the hugetlb pages
- Read from the memfd using preadv()
- KMSAN detects uninitialized memory being copied to userspace

Reported-by: syzbot+f64019ba229e3a5c411b@syzkaller.appspotmail.com
Closes: https://syzkaller.appspot.com/bug?extid=f64019ba229e3a5c411b
Tested-by: syzbot+f64019ba229e3a5c411b@syzkaller.appspotmail.com
Signed-off-by: Deepanshu Kartikey <kartikey406@gmail.com>
---
 mm/memfd.c | 6 ++++++
 1 file changed, 6 insertions(+)

diff --git a/mm/memfd.c b/mm/memfd.c
index 1d109c1acf21..f8cfc2909507 100644
--- a/mm/memfd.c
+++ b/mm/memfd.c
@@ -96,6 +96,12 @@ struct folio *memfd_alloc_folio(struct file *memfd, pgoff_t idx)
 						    NULL,
 						    gfp_mask);
 		if (folio) {
+			/*
+			 * Zero the folio to prevent information leaks to userspace.
+			 * The folio may have been allocated during hugetlb_reserve_pages()
+			 * without __GFP_ZERO, so explicitly clear it here.
+			 */
+			folio_zero_range(folio, 0, folio_size(folio));
 			err = hugetlb_add_to_page_cache(folio,
 							memfd->f_mapping,
 							idx);
--
2.43.0

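[Editor's sketch] A minimal userspace illustration of the reproducer steps listed in the
patch description. This is not the syzbot-generated program: it assumes 2 MiB default
hugepages are available, a glibc recent enough to provide memfd_create()/MFD_HUGETLB,
and the standard <linux/udmabuf.h> ABI; error handling is trimmed.

#define _GNU_SOURCE
#include <fcntl.h>
#include <stdio.h>
#include <sys/ioctl.h>
#include <sys/mman.h>
#include <sys/uio.h>
#include <unistd.h>
#include <linux/udmabuf.h>

int main(void)
{
	size_t size = 2UL << 20;			/* one 2 MiB hugetlb page */
	unsigned char buf[64] = { 0 };
	struct iovec iov = { .iov_base = buf, .iov_len = sizeof(buf) };
	struct udmabuf_create create = { 0 };
	int memfd, devfd, dmabuf;

	memfd = memfd_create("hugetlb-leak", MFD_HUGETLB | MFD_ALLOW_SEALING);
	ftruncate(memfd, size);
	/* udmabuf refuses memfds that could still shrink underneath it. */
	fcntl(memfd, F_ADD_SEALS, F_SEAL_SHRINK);

	/*
	 * Pinning through udmabuf allocates the hugetlb folio via
	 * memfd_alloc_folio() -- the path that skipped zeroing.
	 */
	devfd = open("/dev/udmabuf", O_RDWR);
	create.memfd  = memfd;
	create.offset = 0;
	create.size   = size;
	dmabuf = ioctl(devfd, UDMABUF_CREATE, &create);

	/* Read the never-initialized page back through the memfd. */
	preadv(memfd, &iov, 1, 0);
	printf("dmabuf fd %d, first byte 0x%02x\n", dmabuf, buf[0]);
	return 0;
}
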
* Re: [PATCH] mm/memfd: clear hugetlb pages on allocation
From: Hugh Dickins @ 2025-11-12 6:55 UTC
  To: Muchun Song, Oscar Salvador, David Hildenbrand
  Cc: Deepanshu Kartikey, Vivek Kasireddy, hughd, baolin.wang, akpm,
      linux-mm, linux-kernel, syzbot+f64019ba229e3a5c411b

On Wed, 12 Nov 2025, Deepanshu Kartikey wrote:

> When allocating hugetlb pages for memfd, the pages are not zeroed,
> which leads to uninitialized kernel memory being exposed to userspace
> through read() or mmap() operations.
>
> The issue arises because hugetlb_reserve_pages() can allocate pages
> through the surplus allocation path without the __GFP_ZERO flag. These
> pages are added to the reservation pool and later returned by
> alloc_hugetlb_folio_reserve() without being cleared, resulting in
> uninitialized memory being accessible to userspace.
>
> This is a security vulnerability as it allows information disclosure of
> potentially sensitive kernel data. Fix it by explicitly zeroing the
> folio after allocation using folio_zero_range().
>
> This is particularly important for udmabuf use cases where these pages
> are pinned and directly accessed by userspace via DMA buffers.
>
> Reproducer:
> - Create memfd with MFD_HUGETLB flag
> - Use UDMABUF_CREATE ioctl to pin the hugetlb pages
> - Read from the memfd using preadv()
> - KMSAN detects uninitialized memory being copied to userspace
>
> Reported-by: syzbot+f64019ba229e3a5c411b@syzkaller.appspotmail.com
> Closes: https://syzkaller.appspot.com/bug?extid=f64019ba229e3a5c411b
> Tested-by: syzbot+f64019ba229e3a5c411b@syzkaller.appspotmail.com
> Signed-off-by: Deepanshu Kartikey <kartikey406@gmail.com>

Thanks a lot, Deepanshu and syzbot: this sounds horrid, and important
to fix very soon; and it will need a Fixes tag (with stable Cc'ed when
the fix goes into mm.git). I presume it's

Fixes: 89c1905d9c14 ("mm/gup: introduce memfd_pin_folios() for pinning memfd folios")

But although my name appears against mm/memfd.c, the truth is I know
little of hugetlb (maintainers now addressed), and when its folios
are supposed to get zeroed (would a __GFP_ZERO somewhere be better?).

I was puzzled by how udmabuf came into the picture, since hugetlbfs
has always supported the read (not write) system call: but see now
that there is this surprising backdoor into the hugetlb subsystem,
via memfd and GUP pinning.

And where does that folio get marked uptodate, or is "uptodate"
irrelevant on hugetlbfs? Are the right locks taken, or could
there be races when adding to hugetlbfs cache in this way?

Muchun, Oscar, David, I think this needs your eyes please!

I sense that there could easily be other bugs hereabouts, but perhaps
the lack of zeroing needs to be addressed before worrying further.

Thanks,
Hugh

> ---
>  mm/memfd.c | 6 ++++++
>  1 file changed, 6 insertions(+)
>
> diff --git a/mm/memfd.c b/mm/memfd.c
> index 1d109c1acf21..f8cfc2909507 100644
> --- a/mm/memfd.c
> +++ b/mm/memfd.c
> @@ -96,6 +96,12 @@ struct folio *memfd_alloc_folio(struct file *memfd, pgoff_t idx)
>  						    NULL,
>  						    gfp_mask);
>  		if (folio) {
> +			/*
> +			 * Zero the folio to prevent information leaks to userspace.
> +			 * The folio may have been allocated during hugetlb_reserve_pages()
> +			 * without __GFP_ZERO, so explicitly clear it here.
> +			 */
> +			folio_zero_range(folio, 0, folio_size(folio));
>  			err = hugetlb_add_to_page_cache(folio,
>  							memfd->f_mapping,
>  							idx);
> --
> 2.43.0

* Re: [PATCH] mm/memfd: clear hugetlb pages on allocation
From: Deepanshu Kartikey @ 2025-11-12 7:28 UTC
  To: Hugh Dickins
  Cc: Muchun Song, Oscar Salvador, David Hildenbrand, Vivek Kasireddy,
      baolin.wang, akpm, linux-mm, linux-kernel, syzbot+f64019ba229e3a5c411b

Hi Hugh,

Thank you for the quick review and for looping in the hugetlb maintainers!

You raise good points about the approach. I chose explicit zeroing in
memfd_alloc_folio() because hugetlb_reserve_pages() can allocate pages
without seeing the __GFP_ZERO flag, but I'm happy to revise if the
hugetlb maintainers prefer a different approach.

I'll add the Fixes: 89c1905d9c14 tag and Cc: stable in v2.

Should I send v2 now with just the tag added, or wait for feedback from
Muchun/Oscar/David on the overall approach first?

Thanks,
Deepanshu

* Re: [PATCH] mm/memfd: clear hugetlb pages on allocation
From: Hugh Dickins @ 2025-11-12 7:55 UTC
  To: Deepanshu Kartikey
  Cc: Hugh Dickins, Muchun Song, Oscar Salvador, David Hildenbrand,
      Vivek Kasireddy, baolin.wang, akpm, linux-mm, linux-kernel,
      syzbot+f64019ba229e3a5c411b

On Wed, 12 Nov 2025, Deepanshu Kartikey wrote:
> Hi Hugh,
>
> Thank you for the quick review and for looping in the hugetlb maintainers!
>
> You raise good points about the approach. I chose explicit zeroing in
> memfd_alloc_folio() because hugetlb_reserve_pages() can allocate pages
> without seeing the __GFP_ZERO flag, but I'm happy to revise if the
> hugetlb maintainers prefer a different approach.
>
> I'll add the Fixes: 89c1905d9c14 tag and Cc: stable in v2.
>
> Should I send v2 now with just the tag added, or wait for feedback from
> Muchun/Oscar/David on the overall approach first?

No need for a v2 at this stage - Andrew is very much more than capable
of adding in that Fixes tag and Cc stable if he's inclined to grab your
patch for mm.git in the interim, but let's wait to hear from hugetlb
folks before finalizing (I expect they'll say __GFP_ZERO is no good).

Hugh

* Re: [PATCH] mm/memfd: clear hugetlb pages on allocation
From: Oscar Salvador @ 2025-11-12 9:13 UTC
  To: Hugh Dickins
  Cc: Muchun Song, David Hildenbrand, Deepanshu Kartikey, Vivek Kasireddy,
      baolin.wang, akpm, linux-mm, linux-kernel, syzbot+f64019ba229e3a5c411b

On Tue, Nov 11, 2025 at 10:55:03PM -0800, Hugh Dickins wrote:
> Thanks a lot, Deepanshu and syzbot: this sounds horrid, and important
> to fix very soon; and it will need a Fixes tag (with stable Cc'ed when
> the fix goes into mm.git). I presume it's
>
> Fixes: 89c1905d9c14 ("mm/gup: introduce memfd_pin_folios() for pinning memfd folios")
>
> But although my name appears against mm/memfd.c, the truth is I know
> little of hugetlb (maintainers now addressed), and when its folios
> are supposed to get zeroed (would a __GFP_ZERO somewhere be better?).
>
> I was puzzled by how udmabuf came into the picture, since hugetlbfs
> has always supported the read (not write) system call: but see now
> that there is this surprising backdoor into the hugetlb subsystem,
> via memfd and GUP pinning.
>
> And where does that folio get marked uptodate, or is "uptodate"
> irrelevant on hugetlbfs? Are the right locks taken, or could
> there be races when adding to hugetlbfs cache in this way?

Thanks Hugh for raising this up.

memfd_alloc_folio() seems to try to recreate what hugetlb_no_page()
would do (slightly different though).

The thing is that, as far as I know, we should grab the hugetlb mutex
before trying to add a new page to the page cache, per the comment in
hugetlb_fault():

"
/*
 * Serialize hugepage allocation and instantiation, so that we don't
 * get spurious allocation failures if two CPUs race to instantiate
 * the same page in the page cache.
 */
"

and at least that is what all callers of hugetlb_add_to_page_cache() do
at this moment, all except memfd_alloc_folio(), so I guess this one
needs fixing.

Regarding the uptodate question, I do not see what is special about this
situation that we would not need it.
We seem to be marking the folio uptodate every time we allocate a folio
__and__ before adding it into the page cache (which is expected, right?).

Now, for the GFP_ZERO question.
This one is nasty.
hugetlb_reserve_pages() will allocate surplus folios without zeroing, but
those will be zeroed in the faulting path before being mapped into
userspace page tables (see folio_zero_user() in hugetlb_no_page()).
So unless I am missing something, we need to zero them in this case as well.

-- 
Oscar Salvador
SUSE Labs

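[Editor's sketch] The serialization Oscar refers to looks roughly like the following in
the existing callers of hugetlb_add_to_page_cache(). This is a simplified illustration
of the pattern, not a verbatim excerpt of hugetlb_fault()/hugetlb_no_page(); the real
fault path also handles racing lookups, reservations and truncation.

	struct address_space *mapping = ...;	/* file's mapping */
	pgoff_t idx = ...;			/* hugepage-sized index */
	u32 hash;

	/* Hash (mapping, idx) onto one of the fault mutexes. */
	hash = hugetlb_fault_mutex_hash(mapping, idx);
	mutex_lock(&hugetlb_fault_mutex_table[hash]);

	/* ... allocate the folio, zero it, mark it uptodate ... */

	err = hugetlb_add_to_page_cache(folio, mapping, idx);

	mutex_unlock(&hugetlb_fault_mutex_table[hash]);
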
* Re: [PATCH] mm/memfd: clear hugetlb pages on allocation
From: Deepanshu Kartikey @ 2025-11-12 9:26 UTC
  To: Oscar Salvador
  Cc: Hugh Dickins, Muchun Song, David Hildenbrand, Vivek Kasireddy,
      baolin.wang, akpm, linux-mm, linux-kernel, syzbot+f64019ba229e3a5c411b

Hi Oscar,

Thank you for catching these issues! I have a question about scope:

Should I fix all three issues (zeroing, locking, uptodate) in a single
patch for v2, or would you prefer:
1. My current patch for just the zeroing (security fix), and
2. A separate follow-up patch for the locking and uptodate issues?

I'm happy to do either - just want to make sure I'm following the
preferred approach for the mm subsystem.

Thanks,
Deepanshu

* Re: [PATCH] mm/memfd: clear hugetlb pages on allocation
From: David Hildenbrand (Red Hat) @ 2025-11-12 10:09 UTC
  To: Oscar Salvador, Hugh Dickins
  Cc: Muchun Song, Deepanshu Kartikey, Vivek Kasireddy, baolin.wang, akpm,
      linux-mm, linux-kernel, syzbot+f64019ba229e3a5c411b

On 12.11.25 10:13, Oscar Salvador wrote:
> On Tue, Nov 11, 2025 at 10:55:03PM -0800, Hugh Dickins wrote:
>> Thanks a lot, Deepanshu and syzbot: this sounds horrid, and important
>> to fix very soon; and it will need a Fixes tag (with stable Cc'ed when
>> the fix goes into mm.git). I presume it's
>>
>> Fixes: 89c1905d9c14 ("mm/gup: introduce memfd_pin_folios() for pinning memfd folios")
>>
>> But although my name appears against mm/memfd.c, the truth is I know
>> little of hugetlb (maintainers now addressed), and when its folios
>> are supposed to get zeroed (would a __GFP_ZERO somewhere be better?).
>>
>> I was puzzled by how udmabuf came into the picture, since hugetlbfs
>> has always supported the read (not write) system call: but see now
>> that there is this surprising backdoor into the hugetlb subsystem,
>> via memfd and GUP pinning.
>>
>> And where does that folio get marked uptodate, or is "uptodate"
>> irrelevant on hugetlbfs? Are the right locks taken, or could
>> there be races when adding to hugetlbfs cache in this way?
>
> Thanks Hugh for raising this up.
>
> memfd_alloc_folio() seems to try to recreate what hugetlb_no_page()
> would do (slightly different though).

Can we factor that out to merge both paths?

> The thing is that, as far as I know, we should grab the hugetlb mutex
> before trying to add a new page to the page cache, per the comment in
> hugetlb_fault():
>
> "
> /*
>  * Serialize hugepage allocation and instantiation, so that we don't
>  * get spurious allocation failures if two CPUs race to instantiate
>  * the same page in the page cache.
>  */
> "
>
> and at least that is what all callers of hugetlb_add_to_page_cache() do
> at this moment, all except memfd_alloc_folio(), so I guess this one
> needs fixing.
>
> Regarding the uptodate question, I do not see what is special about this
> situation that we would not need it.
> We seem to be marking the folio uptodate every time we allocate a folio
> __and__ before adding it into the page cache (which is expected, right?).

Right, at least filemap.c heavily depends on it being set (I don't think
hugetlb itself needs it).

> Now, for the GFP_ZERO question.
> This one is nasty.
> hugetlb_reserve_pages() will allocate surplus folios without zeroing, but
> those will be zeroed in the faulting path before being mapped into
> userspace page tables (see folio_zero_user() in hugetlb_no_page()).
> So unless I am missing something, we need to zero them in this case as well.

I assume we want to avoid GFP_ZERO and use folio_zero_user(), which is
optimized for zeroing huge/gigantic pages.

-- 
Cheers

David

* Re: [PATCH] mm/memfd: clear hugetlb pages on allocation
From: Oscar Salvador @ 2025-11-12 11:56 UTC
  To: David Hildenbrand (Red Hat)
  Cc: Hugh Dickins, Muchun Song, Deepanshu Kartikey, Vivek Kasireddy,
      baolin.wang, akpm, linux-mm, linux-kernel, syzbot+f64019ba229e3a5c411b

On Wed, Nov 12, 2025 at 11:09:51AM +0100, David Hildenbrand (Red Hat) wrote:
> On 12.11.25 10:13, Oscar Salvador wrote:
> > memfd_alloc_folio() seems to try to recreate what hugetlb_no_page()
> > would do (slightly different though).
>
> Can we factor that out to merge both paths?

I guess it is worth looking into it; I shall fiddle with it.

> > Regarding the uptodate question, I do not see what is special about this
> > situation that we would not need it.
> > We seem to be marking the folio uptodate every time we allocate a folio
> > __and__ before adding it into the page cache (which is expected, right?).
>
> Right, at least filemap.c heavily depends on it being set (I don't think
> hugetlb itself needs it).

Yes, you are probably right.

> > Now, for the GFP_ZERO question.
> > This one is nasty.
> > hugetlb_reserve_pages() will allocate surplus folios without zeroing, but
> > those will be zeroed in the faulting path before being mapped into
> > userspace page tables (see folio_zero_user() in hugetlb_no_page()).
> > So unless I am missing something, we need to zero them in this case as well.
>
> I assume we want to avoid GFP_ZERO and use folio_zero_user(), which is
> optimized for zeroing huge/gigantic pages.

Yes, I would go with folio_zero_user() as well, to match what we do in
all paths.

Maybe if we can factor it out, we can simplify it, as right now it seems a
small duplication of hugetlb_no_page() (and more so once we add what is
missing: mutex, uptodate and folio_zero_user).

-- 
Oscar Salvador
SUSE Labs

* Re: [PATCH] mm/memfd: clear hugetlb pages on allocation
From: Deepanshu Kartikey @ 2025-11-12 12:06 UTC
  To: Oscar Salvador
  Cc: David Hildenbrand (Red Hat), Hugh Dickins, Muchun Song, Vivek Kasireddy,
      baolin.wang, akpm, linux-mm, linux-kernel, syzbot+f64019ba229e3a5c411b

Hi Oscar and David,

Thanks for the guidance!

> I guess it is worth looking into it; I shall fiddle with it.

Great, I'll focus on fixing the immediate bugs in v2 and you can handle
the refactoring in a follow-up. This keeps my patch focused on the
security fix + the missing initialization steps.

> Yes, I would go with folio_zero_user() as well, to match what we do in
> all paths.

Understood. I'll use folio_zero_user() in v2.

So for v2, I'll add:
1. folio_zero_user() instead of folio_zero_range()
2. folio_mark_uptodate()
3. hugetlb_fault_mutex locking around hugetlb_add_to_page_cache()

This will match the pattern in hugetlb_no_page() and fix the information
leak, missing uptodate flag, and locking issue.

I'll send v2 shortly after testing.

Thanks,
Deepanshu

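[Editor's sketch] A rough, untested sketch of what that v2 shape could look like inside
memfd_alloc_folio(), mirroring the hugetlb_no_page() pattern. Variable names follow the
current code; the error handling and the choice of folio_mark_uptodate() over
__folio_mark_uptodate() are assumptions here, and this is not the submitted v2 patch
(see the lore link in the next message for the real thing).

		folio = alloc_hugetlb_folio_reserve(h,
						    numa_node_id(),
						    NULL,
						    gfp_mask);
		if (folio) {
			u32 hash = hugetlb_fault_mutex_hash(memfd->f_mapping, idx);

			/*
			 * Serialize with the fault path, like the other
			 * hugetlb_add_to_page_cache() callers.
			 */
			mutex_lock(&hugetlb_fault_mutex_table[hash]);

			/*
			 * Reserve-pool folios may never have been cleared;
			 * there is no faulting address to use as a hint.
			 */
			folio_zero_user(folio, 0);
			folio_mark_uptodate(folio);

			err = hugetlb_add_to_page_cache(folio,
							memfd->f_mapping,
							idx);
			mutex_unlock(&hugetlb_fault_mutex_table[hash]);
			/* error handling continues as in the current code */
		}
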
* Re: [PATCH] mm/memfd: clear hugetlb pages on allocation
From: Deepanshu Kartikey @ 2025-11-12 14:54 UTC
  To: Oscar Salvador
  Cc: David Hildenbrand (Red Hat), Hugh Dickins, Muchun Song, Vivek Kasireddy,
      baolin.wang, akpm, linux-mm, linux-kernel, syzbot+f64019ba229e3a5c411b

v2 sent: https://lore.kernel.org/all/20251112145034.2320452-1-kartikey406@gmail.com/T/

Changes in v2:
- Used folio_zero_user() as suggested by Oscar and David
- Added folio_mark_uptodate()
- Added proper hugetlb_fault_mutex locking

Thanks for the reviews!