linux-mm.kvack.org archive mirror
 help / color / mirror / Atom feed
* [PATCH mm-new 1/1] mm/khugepaged: guard is_zero_pfn() calls with pte_present()
@ 2025-10-16  3:36 Lance Yang
  2025-10-16  5:34 ` Dev Jain
                   ` (3 more replies)
  0 siblings, 4 replies; 15+ messages in thread
From: Lance Yang @ 2025-10-16  3:36 UTC (permalink / raw)
  To: akpm, david, lorenzo.stoakes
  Cc: ziy, baolin.wang, Liam.Howlett, npache, ryan.roberts, dev.jain,
	baohua, ioworker0, linux-kernel, linux-mm, Lance Yang

From: Lance Yang <lance.yang@linux.dev>

A non-present entry, like a swap PTE, contains completely different data
(swap type and offset). pte_pfn() doesn't know this, so if we feed it a
non-present entry, it will spit out a junk PFN.

What if that junk PFN happens to match the zeropage's PFN by sheer
chance? While really unlikely, this would be really bad if it did.

So, let's fix this potential bug by ensuring all calls to is_zero_pfn()
in khugepaged.c are properly guarded by a pte_present() check.

Suggested-by: Lorenzo Stoakes <lorenzo.stoakes@oracle.com>
Signed-off-by: Lance Yang <lance.yang@linux.dev>
---
 mm/khugepaged.c | 13 ++++++++-----
 1 file changed, 8 insertions(+), 5 deletions(-)

diff --git a/mm/khugepaged.c b/mm/khugepaged.c
index d635d821f611..0341c3d13e9e 100644
--- a/mm/khugepaged.c
+++ b/mm/khugepaged.c
@@ -516,7 +516,7 @@ static void release_pte_pages(pte_t *pte, pte_t *_pte,
 		pte_t pteval = ptep_get(_pte);
 		unsigned long pfn;
 
-		if (pte_none(pteval))
+		if (!pte_present(pteval))
 			continue;
 		pfn = pte_pfn(pteval);
 		if (is_zero_pfn(pfn))
@@ -690,9 +690,10 @@ static void __collapse_huge_page_copy_succeeded(pte_t *pte,
 	     address += nr_ptes * PAGE_SIZE) {
 		nr_ptes = 1;
 		pteval = ptep_get(_pte);
-		if (pte_none(pteval) || is_zero_pfn(pte_pfn(pteval))) {
+		if (pte_none(pteval) ||
+		    (pte_present(pteval) && is_zero_pfn(pte_pfn(pteval)))) {
 			add_mm_counter(vma->vm_mm, MM_ANONPAGES, 1);
-			if (is_zero_pfn(pte_pfn(pteval))) {
+			if (!pte_none(pteval)) {
 				/*
 				 * ptl mostly unnecessary.
 				 */
@@ -794,7 +795,8 @@ static int __collapse_huge_page_copy(pte_t *pte, struct folio *folio,
 		unsigned long src_addr = address + i * PAGE_SIZE;
 		struct page *src_page;
 
-		if (pte_none(pteval) || is_zero_pfn(pte_pfn(pteval))) {
+		if (pte_none(pteval) ||
+		    (pte_present(pteval) && is_zero_pfn(pte_pfn(pteval)))) {
 			clear_user_highpage(page, src_addr);
 			continue;
 		}
@@ -1294,7 +1296,8 @@ static int hpage_collapse_scan_pmd(struct mm_struct *mm,
 				goto out_unmap;
 			}
 		}
-		if (pte_none(pteval) || is_zero_pfn(pte_pfn(pteval))) {
+		if (pte_none(pteval) ||
+		    (pte_present(pteval) && is_zero_pfn(pte_pfn(pteval)))) {
 			++none_or_zero;
 			if (!userfaultfd_armed(vma) &&
 			    (!cc->is_khugepaged ||
-- 
2.49.0



^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH mm-new 1/1] mm/khugepaged: guard is_zero_pfn() calls with pte_present()
  2025-10-16  3:36 [PATCH mm-new 1/1] mm/khugepaged: guard is_zero_pfn() calls with pte_present() Lance Yang
@ 2025-10-16  5:34 ` Dev Jain
  2025-10-16  5:59   ` Lance Yang
  2025-10-16  6:17 ` Dev Jain
                   ` (2 subsequent siblings)
  3 siblings, 1 reply; 15+ messages in thread
From: Dev Jain @ 2025-10-16  5:34 UTC (permalink / raw)
  To: Lance Yang, akpm, david, lorenzo.stoakes
  Cc: ziy, baolin.wang, Liam.Howlett, npache, ryan.roberts, baohua,
	ioworker0, linux-kernel, linux-mm


On 16/10/25 9:06 am, Lance Yang wrote:
> From: Lance Yang <lance.yang@linux.dev>
>
> A non-present entry, like a swap PTE, contains completely different data
> (swap type and offset). pte_pfn() doesn't know this, so if we feed it a
> non-present entry, it will spit out a junk PFN.
>
> What if that junk PFN happens to match the zeropage's PFN by sheer
> chance? While really unlikely, this would be really bad if it did.
>
> So, let's fix this potential bug by ensuring all calls to is_zero_pfn()
> in khugepaged.c are properly guarded by a pte_present() check.
>
> Suggested-by: Lorenzo Stoakes <lorenzo.stoakes@oracle.com>
> Signed-off-by: Lance Yang <lance.yang@linux.dev>
> ---

Thanks, I missed this.

>   mm/khugepaged.c | 13 ++++++++-----
>   1 file changed, 8 insertions(+), 5 deletions(-)
>
> diff --git a/mm/khugepaged.c b/mm/khugepaged.c
> index d635d821f611..0341c3d13e9e 100644
> --- a/mm/khugepaged.c
> +++ b/mm/khugepaged.c
> @@ -516,7 +516,7 @@ static void release_pte_pages(pte_t *pte, pte_t *_pte,
>   		pte_t pteval = ptep_get(_pte);
>   		unsigned long pfn;
>   
> -		if (pte_none(pteval))
> +		if (!pte_present(pteval))

There should be no chance that we end up with a pteval which is not none *and*
not present, if you look at the callers of release_pte_pages. So perhaps we
should keep this either the same, or, after "if(pte_none(pteval))", do a
WARN_ON_ONCE(!pte_present(pteval))?

>   			continue;
>   		pfn = pte_pfn(pteval);
>   		if (is_zero_pfn(pfn))
> @@ -690,9 +690,10 @@ static void __collapse_huge_page_copy_succeeded(pte_t *pte,
>   	     address += nr_ptes * PAGE_SIZE) {
>   		nr_ptes = 1;
>   		pteval = ptep_get(_pte);
> -		if (pte_none(pteval) || is_zero_pfn(pte_pfn(pteval))) {
> +		if (pte_none(pteval) ||
> +		    (pte_present(pteval) && is_zero_pfn(pte_pfn(pteval)))) {
>   			add_mm_counter(vma->vm_mm, MM_ANONPAGES, 1);
> -			if (is_zero_pfn(pte_pfn(pteval))) {
> +			if (!pte_none(pteval)) {
>   				/*
>   				 * ptl mostly unnecessary.
>   				 */
> @@ -794,7 +795,8 @@ static int __collapse_huge_page_copy(pte_t *pte, struct folio *folio,
>   		unsigned long src_addr = address + i * PAGE_SIZE;
>   		struct page *src_page;
>   
> -		if (pte_none(pteval) || is_zero_pfn(pte_pfn(pteval))) {
> +		if (pte_none(pteval) ||
> +		    (pte_present(pteval) && is_zero_pfn(pte_pfn(pteval)))) {
>   			clear_user_highpage(page, src_addr);
>   			continue;
>   		}
> @@ -1294,7 +1296,8 @@ static int hpage_collapse_scan_pmd(struct mm_struct *mm,
>   				goto out_unmap;
>   			}
>   		}
> -		if (pte_none(pteval) || is_zero_pfn(pte_pfn(pteval))) {
> +		if (pte_none(pteval) ||
> +		    (pte_present(pteval) && is_zero_pfn(pte_pfn(pteval)))) {
>   			++none_or_zero;
>   			if (!userfaultfd_armed(vma) &&
>   			    (!cc->is_khugepaged ||


^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH mm-new 1/1] mm/khugepaged: guard is_zero_pfn() calls with pte_present()
  2025-10-16  5:34 ` Dev Jain
@ 2025-10-16  5:59   ` Lance Yang
  2025-10-16  6:15     ` Dev Jain
  0 siblings, 1 reply; 15+ messages in thread
From: Lance Yang @ 2025-10-16  5:59 UTC (permalink / raw)
  To: Dev Jain
  Cc: ziy, baolin.wang, Liam.Howlett, npache, ryan.roberts, baohua,
	ioworker0, linux-kernel, linux-mm, lorenzo.stoakes, akpm, david



On 2025/10/16 13:34, Dev Jain wrote:
> 
> On 16/10/25 9:06 am, Lance Yang wrote:
>> From: Lance Yang <lance.yang@linux.dev>
>>
>> A non-present entry, like a swap PTE, contains completely different data
>> (swap type and offset). pte_pfn() doesn't know this, so if we feed it a
>> non-present entry, it will spit out a junk PFN.
>>
>> What if that junk PFN happens to match the zeropage's PFN by sheer
>> chance? While really unlikely, this would be really bad if it did.
>>
>> So, let's fix this potential bug by ensuring all calls to is_zero_pfn()
>> in khugepaged.c are properly guarded by a pte_present() check.
>>
>> Suggested-by: Lorenzo Stoakes <lorenzo.stoakes@oracle.com>
>> Signed-off-by: Lance Yang <lance.yang@linux.dev>
>> ---
> 
> Thanks, I missed this.

Me too ...

> 
>>   mm/khugepaged.c | 13 ++++++++-----
>>   1 file changed, 8 insertions(+), 5 deletions(-)
>>
>> diff --git a/mm/khugepaged.c b/mm/khugepaged.c
>> index d635d821f611..0341c3d13e9e 100644
>> --- a/mm/khugepaged.c
>> +++ b/mm/khugepaged.c
>> @@ -516,7 +516,7 @@ static void release_pte_pages(pte_t *pte, pte_t 
>> *_pte,
>>           pte_t pteval = ptep_get(_pte);
>>           unsigned long pfn;
>> -        if (pte_none(pteval))
>> +        if (!pte_present(pteval))
> 
> There should be no chance that we end up with a pteval which is not none 
> *and*
> not present, if you look at the callers of release_pte_pages. So perhaps we
> should keep this either the same, or, after "if(pte_none(pteval))", do a
> WARN_ON_ONCE(!pte_present(pteval))?

Good catch! Yeah, but I'd rather not rely on the callers ...

Wouldn't it just be simpler and safer to always have is_zero_pfn() guarded
by pte_present()?

I don't have a strong opinon here, though ;p

Dev, Thanks!


^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH mm-new 1/1] mm/khugepaged: guard is_zero_pfn() calls with pte_present()
  2025-10-16  5:59   ` Lance Yang
@ 2025-10-16  6:15     ` Dev Jain
  0 siblings, 0 replies; 15+ messages in thread
From: Dev Jain @ 2025-10-16  6:15 UTC (permalink / raw)
  To: Lance Yang
  Cc: ziy, baolin.wang, Liam.Howlett, npache, ryan.roberts, baohua,
	ioworker0, linux-kernel, linux-mm, lorenzo.stoakes, akpm, david


On 16/10/25 11:29 am, Lance Yang wrote:
>
>
> On 2025/10/16 13:34, Dev Jain wrote:
>>
>> On 16/10/25 9:06 am, Lance Yang wrote:
>>> From: Lance Yang <lance.yang@linux.dev>
>>>
>>> A non-present entry, like a swap PTE, contains completely different 
>>> data
>>> (swap type and offset). pte_pfn() doesn't know this, so if we feed it a
>>> non-present entry, it will spit out a junk PFN.
>>>
>>> What if that junk PFN happens to match the zeropage's PFN by sheer
>>> chance? While really unlikely, this would be really bad if it did.
>>>
>>> So, let's fix this potential bug by ensuring all calls to is_zero_pfn()
>>> in khugepaged.c are properly guarded by a pte_present() check.
>>>
>>> Suggested-by: Lorenzo Stoakes <lorenzo.stoakes@oracle.com>
>>> Signed-off-by: Lance Yang <lance.yang@linux.dev>
>>> ---
>>
>> Thanks, I missed this.
>
> Me too ...
>
>>
>>>   mm/khugepaged.c | 13 ++++++++-----
>>>   1 file changed, 8 insertions(+), 5 deletions(-)
>>>
>>> diff --git a/mm/khugepaged.c b/mm/khugepaged.c
>>> index d635d821f611..0341c3d13e9e 100644
>>> --- a/mm/khugepaged.c
>>> +++ b/mm/khugepaged.c
>>> @@ -516,7 +516,7 @@ static void release_pte_pages(pte_t *pte, pte_t 
>>> *_pte,
>>>           pte_t pteval = ptep_get(_pte);
>>>           unsigned long pfn;
>>> -        if (pte_none(pteval))
>>> +        if (!pte_present(pteval))
>>
>> There should be no chance that we end up with a pteval which is not 
>> none *and*
>> not present, if you look at the callers of release_pte_pages. So 
>> perhaps we
>> should keep this either the same, or, after "if(pte_none(pteval))", do a
>> WARN_ON_ONCE(!pte_present(pteval))?
>
> Good catch! Yeah, but I'd rather not rely on the callers ...
>
> Wouldn't it just be simpler and safer to always have is_zero_pfn() 
> guarded
> by pte_present()?
>
> I don't have a strong opinon here, though ;p

Yeah same, I think we can leave it to what you have done.


>
> Dev, Thanks!
>


^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH mm-new 1/1] mm/khugepaged: guard is_zero_pfn() calls with pte_present()
  2025-10-16  3:36 [PATCH mm-new 1/1] mm/khugepaged: guard is_zero_pfn() calls with pte_present() Lance Yang
  2025-10-16  5:34 ` Dev Jain
@ 2025-10-16  6:17 ` Dev Jain
  2025-10-16  6:26   ` Lance Yang
  2025-10-17  1:27   ` Wei Yang
  2025-10-16  9:33 ` Wei Yang
  2025-10-17  8:10 ` Baolin Wang
  3 siblings, 2 replies; 15+ messages in thread
From: Dev Jain @ 2025-10-16  6:17 UTC (permalink / raw)
  To: Lance Yang, akpm, david, lorenzo.stoakes
  Cc: ziy, baolin.wang, Liam.Howlett, npache, ryan.roberts, baohua,
	ioworker0, linux-kernel, linux-mm


On 16/10/25 9:06 am, Lance Yang wrote:
> From: Lance Yang <lance.yang@linux.dev>
>
> A non-present entry, like a swap PTE, contains completely different data
> (swap type and offset). pte_pfn() doesn't know this, so if we feed it a
> non-present entry, it will spit out a junk PFN.
>
> What if that junk PFN happens to match the zeropage's PFN by sheer
> chance? While really unlikely, this would be really bad if it did.
>
> So, let's fix this potential bug by ensuring all calls to is_zero_pfn()
> in khugepaged.c are properly guarded by a pte_present() check.
>
> Suggested-by: Lorenzo Stoakes <lorenzo.stoakes@oracle.com>
> Signed-off-by: Lance Yang <lance.yang@linux.dev>
> ---
>   mm/khugepaged.c | 13 ++++++++-----
>   1 file changed, 8 insertions(+), 5 deletions(-)
>
> diff --git a/mm/khugepaged.c b/mm/khugepaged.c
> index d635d821f611..0341c3d13e9e 100644
> --- a/mm/khugepaged.c
> +++ b/mm/khugepaged.c
> @@ -516,7 +516,7 @@ static void release_pte_pages(pte_t *pte, pte_t *_pte,
>   		pte_t pteval = ptep_get(_pte);
>   		unsigned long pfn;
>   
> -		if (pte_none(pteval))
> +		if (!pte_present(pteval))
>   			continue;
>   		pfn = pte_pfn(pteval);
>   		if (is_zero_pfn(pfn))
> @@ -690,9 +690,10 @@ static void __collapse_huge_page_copy_succeeded(pte_t *pte,
>   	     address += nr_ptes * PAGE_SIZE) {
>   		nr_ptes = 1;
>   		pteval = ptep_get(_pte);
> -		if (pte_none(pteval) || is_zero_pfn(pte_pfn(pteval))) {
> +		if (pte_none(pteval) ||
> +		    (pte_present(pteval) && is_zero_pfn(pte_pfn(pteval)))) {
>   			add_mm_counter(vma->vm_mm, MM_ANONPAGES, 1);
> -			if (is_zero_pfn(pte_pfn(pteval))) {
> +			if (!pte_none(pteval)) {

Could save a level of indentation by saying
if (pte_none(pteval))
     continue;

That would make it crystal clear that we do nothing when pte is none,
and we do something when pte is pointing to zero pfn.

Reviewed-by: Dev Jain <dev.jain@arm.com>

>   				/*
>   				 * ptl mostly unnecessary.
>   				 */
> @@ -794,7 +795,8 @@ static int __collapse_huge_page_copy(pte_t *pte, struct folio *folio,
>   		unsigned long src_addr = address + i * PAGE_SIZE;
>   		struct page *src_page;
>   
> -		if (pte_none(pteval) || is_zero_pfn(pte_pfn(pteval))) {
> +		if (pte_none(pteval) ||
> +		    (pte_present(pteval) && is_zero_pfn(pte_pfn(pteval)))) {
>   			clear_user_highpage(page, src_addr);
>   			continue;
>   		}
> @@ -1294,7 +1296,8 @@ static int hpage_collapse_scan_pmd(struct mm_struct *mm,
>   				goto out_unmap;
>   			}
>   		}
> -		if (pte_none(pteval) || is_zero_pfn(pte_pfn(pteval))) {
> +		if (pte_none(pteval) ||
> +		    (pte_present(pteval) && is_zero_pfn(pte_pfn(pteval)))) {
>   			++none_or_zero;
>   			if (!userfaultfd_armed(vma) &&
>   			    (!cc->is_khugepaged ||


^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH mm-new 1/1] mm/khugepaged: guard is_zero_pfn() calls with pte_present()
  2025-10-16  6:17 ` Dev Jain
@ 2025-10-16  6:26   ` Lance Yang
  2025-10-17  1:27   ` Wei Yang
  1 sibling, 0 replies; 15+ messages in thread
From: Lance Yang @ 2025-10-16  6:26 UTC (permalink / raw)
  To: Dev Jain
  Cc: ziy, baolin.wang, Liam.Howlett, npache, ryan.roberts, baohua,
	ioworker0, linux-kernel, linux-mm, akpm, lorenzo.stoakes, david



On 2025/10/16 14:17, Dev Jain wrote:
> 
> On 16/10/25 9:06 am, Lance Yang wrote:
>> From: Lance Yang <lance.yang@linux.dev>
>>
>> A non-present entry, like a swap PTE, contains completely different data
>> (swap type and offset). pte_pfn() doesn't know this, so if we feed it a
>> non-present entry, it will spit out a junk PFN.
>>
>> What if that junk PFN happens to match the zeropage's PFN by sheer
>> chance? While really unlikely, this would be really bad if it did.
>>
>> So, let's fix this potential bug by ensuring all calls to is_zero_pfn()
>> in khugepaged.c are properly guarded by a pte_present() check.
>>
>> Suggested-by: Lorenzo Stoakes <lorenzo.stoakes@oracle.com>
>> Signed-off-by: Lance Yang <lance.yang@linux.dev>
>> ---
>>   mm/khugepaged.c | 13 ++++++++-----
>>   1 file changed, 8 insertions(+), 5 deletions(-)
>>
>> diff --git a/mm/khugepaged.c b/mm/khugepaged.c
>> index d635d821f611..0341c3d13e9e 100644
>> --- a/mm/khugepaged.c
>> +++ b/mm/khugepaged.c
>> @@ -516,7 +516,7 @@ static void release_pte_pages(pte_t *pte, pte_t 
>> *_pte,
>>           pte_t pteval = ptep_get(_pte);
>>           unsigned long pfn;
>> -        if (pte_none(pteval))
>> +        if (!pte_present(pteval))
>>               continue;
>>           pfn = pte_pfn(pteval);
>>           if (is_zero_pfn(pfn))
>> @@ -690,9 +690,10 @@ static void 
>> __collapse_huge_page_copy_succeeded(pte_t *pte,
>>            address += nr_ptes * PAGE_SIZE) {
>>           nr_ptes = 1;
>>           pteval = ptep_get(_pte);
>> -        if (pte_none(pteval) || is_zero_pfn(pte_pfn(pteval))) {
>> +        if (pte_none(pteval) ||
>> +            (pte_present(pteval) && is_zero_pfn(pte_pfn(pteval)))) {
>>               add_mm_counter(vma->vm_mm, MM_ANONPAGES, 1);
>> -            if (is_zero_pfn(pte_pfn(pteval))) {
>> +            if (!pte_none(pteval)) {
> 
> Could save a level of indentation by saying
> if (pte_none(pteval))
>      continue;
> 
> That would make it crystal clear that we do nothing when pte is none,
> and we do something when pte is pointing to zero pfn.

Yes! That does look clearer ;)

I'll pick that up in the next spin, if one is needed.

> 
> Reviewed-by: Dev Jain <dev.jain@arm.com>

Cheers!

> 
>>                   /*
>>                    * ptl mostly unnecessary.
>>                    */
>> @@ -794,7 +795,8 @@ static int __collapse_huge_page_copy(pte_t *pte, 
>> struct folio *folio,
>>           unsigned long src_addr = address + i * PAGE_SIZE;
>>           struct page *src_page;
>> -        if (pte_none(pteval) || is_zero_pfn(pte_pfn(pteval))) {
>> +        if (pte_none(pteval) ||
>> +            (pte_present(pteval) && is_zero_pfn(pte_pfn(pteval)))) {
>>               clear_user_highpage(page, src_addr);
>>               continue;
>>           }
>> @@ -1294,7 +1296,8 @@ static int hpage_collapse_scan_pmd(struct 
>> mm_struct *mm,
>>                   goto out_unmap;
>>               }
>>           }
>> -        if (pte_none(pteval) || is_zero_pfn(pte_pfn(pteval))) {
>> +        if (pte_none(pteval) ||
>> +            (pte_present(pteval) && is_zero_pfn(pte_pfn(pteval)))) {
>>               ++none_or_zero;
>>               if (!userfaultfd_armed(vma) &&
>>                   (!cc->is_khugepaged ||



^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH mm-new 1/1] mm/khugepaged: guard is_zero_pfn() calls with pte_present()
  2025-10-16  3:36 [PATCH mm-new 1/1] mm/khugepaged: guard is_zero_pfn() calls with pte_present() Lance Yang
  2025-10-16  5:34 ` Dev Jain
  2025-10-16  6:17 ` Dev Jain
@ 2025-10-16  9:33 ` Wei Yang
  2025-10-16 10:51   ` Lance Yang
  2025-10-17  8:10 ` Baolin Wang
  3 siblings, 1 reply; 15+ messages in thread
From: Wei Yang @ 2025-10-16  9:33 UTC (permalink / raw)
  To: Lance Yang
  Cc: akpm, david, lorenzo.stoakes, ziy, baolin.wang, Liam.Howlett,
	npache, ryan.roberts, dev.jain, baohua, ioworker0, linux-kernel,
	linux-mm

On Thu, Oct 16, 2025 at 11:36:43AM +0800, Lance Yang wrote:
>From: Lance Yang <lance.yang@linux.dev>
>
>A non-present entry, like a swap PTE, contains completely different data
>(swap type and offset). pte_pfn() doesn't know this, so if we feed it a
>non-present entry, it will spit out a junk PFN.
>
>What if that junk PFN happens to match the zeropage's PFN by sheer
>chance? While really unlikely, this would be really bad if it did.
>
>So, let's fix this potential bug by ensuring all calls to is_zero_pfn()
>in khugepaged.c are properly guarded by a pte_present() check.
>

Does it more like to guard pte_pfn() with pte_present()?

-- 
Wei Yang
Help you, Help me


^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH mm-new 1/1] mm/khugepaged: guard is_zero_pfn() calls with pte_present()
  2025-10-16  9:33 ` Wei Yang
@ 2025-10-16 10:51   ` Lance Yang
  0 siblings, 0 replies; 15+ messages in thread
From: Lance Yang @ 2025-10-16 10:51 UTC (permalink / raw)
  To: Wei Yang
  Cc: akpm, david, lorenzo.stoakes, ziy, baolin.wang, Liam.Howlett,
	npache, ryan.roberts, dev.jain, baohua, ioworker0, linux-kernel,
	linux-mm



On 2025/10/16 17:33, Wei Yang wrote:
> On Thu, Oct 16, 2025 at 11:36:43AM +0800, Lance Yang wrote:
>> From: Lance Yang <lance.yang@linux.dev>
>>
>> A non-present entry, like a swap PTE, contains completely different data
>> (swap type and offset). pte_pfn() doesn't know this, so if we feed it a
>> non-present entry, it will spit out a junk PFN.
>>
>> What if that junk PFN happens to match the zeropage's PFN by sheer
>> chance? While really unlikely, this would be really bad if it did.
>>
>> So, let's fix this potential bug by ensuring all calls to is_zero_pfn()
>> in khugepaged.c are properly guarded by a pte_present() check.
>>
> 
> Does it more like to guard pte_pfn() with pte_present()?

Exactly! My thinking was that by guarding pte_pfn(), we're ultimately
protecting is_zero_pfn() from acting on a junk PFN.

So we're on the same page — I just described the end goal ;p

Thanks!


^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH mm-new 1/1] mm/khugepaged: guard is_zero_pfn() calls with pte_present()
  2025-10-16  6:17 ` Dev Jain
  2025-10-16  6:26   ` Lance Yang
@ 2025-10-17  1:27   ` Wei Yang
  2025-10-17  8:11     ` David Hildenbrand
  1 sibling, 1 reply; 15+ messages in thread
From: Wei Yang @ 2025-10-17  1:27 UTC (permalink / raw)
  To: Dev Jain
  Cc: Lance Yang, akpm, david, lorenzo.stoakes, ziy, baolin.wang,
	Liam.Howlett, npache, ryan.roberts, baohua, ioworker0,
	linux-kernel, linux-mm

On Thu, Oct 16, 2025 at 11:47:06AM +0530, Dev Jain wrote:
>
>On 16/10/25 9:06 am, Lance Yang wrote:
>> From: Lance Yang <lance.yang@linux.dev>
>> 
>> A non-present entry, like a swap PTE, contains completely different data
>> (swap type and offset). pte_pfn() doesn't know this, so if we feed it a
>> non-present entry, it will spit out a junk PFN.
>> 
>> What if that junk PFN happens to match the zeropage's PFN by sheer
>> chance? While really unlikely, this would be really bad if it did.
>> 
>> So, let's fix this potential bug by ensuring all calls to is_zero_pfn()
>> in khugepaged.c are properly guarded by a pte_present() check.
>> 
>> Suggested-by: Lorenzo Stoakes <lorenzo.stoakes@oracle.com>
>> Signed-off-by: Lance Yang <lance.yang@linux.dev>
>> ---
>>   mm/khugepaged.c | 13 ++++++++-----
>>   1 file changed, 8 insertions(+), 5 deletions(-)
>> 
>> diff --git a/mm/khugepaged.c b/mm/khugepaged.c
>> index d635d821f611..0341c3d13e9e 100644
>> --- a/mm/khugepaged.c
>> +++ b/mm/khugepaged.c
>> @@ -516,7 +516,7 @@ static void release_pte_pages(pte_t *pte, pte_t *_pte,
>>   		pte_t pteval = ptep_get(_pte);
>>   		unsigned long pfn;
>> -		if (pte_none(pteval))
>> +		if (!pte_present(pteval))
>>   			continue;
>>   		pfn = pte_pfn(pteval);
>>   		if (is_zero_pfn(pfn))
>> @@ -690,9 +690,10 @@ static void __collapse_huge_page_copy_succeeded(pte_t *pte,
>>   	     address += nr_ptes * PAGE_SIZE) {
>>   		nr_ptes = 1;
>>   		pteval = ptep_get(_pte);
>> -		if (pte_none(pteval) || is_zero_pfn(pte_pfn(pteval))) {
>> +		if (pte_none(pteval) ||
>> +		    (pte_present(pteval) && is_zero_pfn(pte_pfn(pteval)))) {
>>   			add_mm_counter(vma->vm_mm, MM_ANONPAGES, 1);
>> -			if (is_zero_pfn(pte_pfn(pteval))) {
>> +			if (!pte_none(pteval)) {
>
>Could save a level of indentation by saying
>if (pte_none(pteval))
>    continue;
>

Vote for this :-)

No other comment.

Reviewed-by: Wei Yang <richard.weiyang@gmail.com>

-- 
Wei Yang
Help you, Help me


^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH mm-new 1/1] mm/khugepaged: guard is_zero_pfn() calls with pte_present()
  2025-10-16  3:36 [PATCH mm-new 1/1] mm/khugepaged: guard is_zero_pfn() calls with pte_present() Lance Yang
                   ` (2 preceding siblings ...)
  2025-10-16  9:33 ` Wei Yang
@ 2025-10-17  8:10 ` Baolin Wang
  3 siblings, 0 replies; 15+ messages in thread
From: Baolin Wang @ 2025-10-17  8:10 UTC (permalink / raw)
  To: Lance Yang, akpm, david, lorenzo.stoakes
  Cc: ziy, Liam.Howlett, npache, ryan.roberts, dev.jain, baohua,
	ioworker0, linux-kernel, linux-mm



On 2025/10/16 11:36, Lance Yang wrote:
> From: Lance Yang <lance.yang@linux.dev>
> 
> A non-present entry, like a swap PTE, contains completely different data
> (swap type and offset). pte_pfn() doesn't know this, so if we feed it a
> non-present entry, it will spit out a junk PFN.
> 
> What if that junk PFN happens to match the zeropage's PFN by sheer
> chance? While really unlikely, this would be really bad if it did.
> 
> So, let's fix this potential bug by ensuring all calls to is_zero_pfn()
> in khugepaged.c are properly guarded by a pte_present() check.
> 
> Suggested-by: Lorenzo Stoakes <lorenzo.stoakes@oracle.com>
> Signed-off-by: Lance Yang <lance.yang@linux.dev>
> ---

LGTM.
Reviewed-by: Baolin Wang <baolin.wang@linux.alibaba.com>



^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH mm-new 1/1] mm/khugepaged: guard is_zero_pfn() calls with pte_present()
  2025-10-17  1:27   ` Wei Yang
@ 2025-10-17  8:11     ` David Hildenbrand
  2025-10-17  8:37       ` Lance Yang
  0 siblings, 1 reply; 15+ messages in thread
From: David Hildenbrand @ 2025-10-17  8:11 UTC (permalink / raw)
  To: Wei Yang, Dev Jain
  Cc: Lance Yang, akpm, lorenzo.stoakes, ziy, baolin.wang,
	Liam.Howlett, npache, ryan.roberts, baohua, ioworker0,
	linux-kernel, linux-mm

On 17.10.25 03:27, Wei Yang wrote:
> On Thu, Oct 16, 2025 at 11:47:06AM +0530, Dev Jain wrote:
>>
>> On 16/10/25 9:06 am, Lance Yang wrote:
>>> From: Lance Yang <lance.yang@linux.dev>
>>>
>>> A non-present entry, like a swap PTE, contains completely different data
>>> (swap type and offset). pte_pfn() doesn't know this, so if we feed it a
>>> non-present entry, it will spit out a junk PFN.
>>>
>>> What if that junk PFN happens to match the zeropage's PFN by sheer
>>> chance? While really unlikely, this would be really bad if it did.
>>>
>>> So, let's fix this potential bug by ensuring all calls to is_zero_pfn()
>>> in khugepaged.c are properly guarded by a pte_present() check.
>>>
>>> Suggested-by: Lorenzo Stoakes <lorenzo.stoakes@oracle.com>
>>> Signed-off-by: Lance Yang <lance.yang@linux.dev>
>>> ---
>>>    mm/khugepaged.c | 13 ++++++++-----
>>>    1 file changed, 8 insertions(+), 5 deletions(-)
>>>
>>> diff --git a/mm/khugepaged.c b/mm/khugepaged.c
>>> index d635d821f611..0341c3d13e9e 100644
>>> --- a/mm/khugepaged.c
>>> +++ b/mm/khugepaged.c
>>> @@ -516,7 +516,7 @@ static void release_pte_pages(pte_t *pte, pte_t *_pte,
>>>    		pte_t pteval = ptep_get(_pte);
>>>    		unsigned long pfn;
>>> -		if (pte_none(pteval))
>>> +		if (!pte_present(pteval))
>>>    			continue;
>>>    		pfn = pte_pfn(pteval);
>>>    		if (is_zero_pfn(pfn))
>>> @@ -690,9 +690,10 @@ static void __collapse_huge_page_copy_succeeded(pte_t *pte,
>>>    	     address += nr_ptes * PAGE_SIZE) {
>>>    		nr_ptes = 1;
>>>    		pteval = ptep_get(_pte);
>>> -		if (pte_none(pteval) || is_zero_pfn(pte_pfn(pteval))) {
>>> +		if (pte_none(pteval) ||
>>> +		    (pte_present(pteval) && is_zero_pfn(pte_pfn(pteval)))) {
>>>    			add_mm_counter(vma->vm_mm, MM_ANONPAGES, 1);
>>> -			if (is_zero_pfn(pte_pfn(pteval))) {
>>> +			if (!pte_none(pteval)) {
>>
>> Could save a level of indentation by saying
>> if (pte_none(pteval))
>>     continue;
>>
> 
> Vote for this :-)

I suspect there will be a v2, correct?

-- 
Cheers

David / dhildenb



^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH mm-new 1/1] mm/khugepaged: guard is_zero_pfn() calls with pte_present()
  2025-10-17  8:11     ` David Hildenbrand
@ 2025-10-17  8:37       ` Lance Yang
  2025-10-17  8:43         ` David Hildenbrand
  0 siblings, 1 reply; 15+ messages in thread
From: Lance Yang @ 2025-10-17  8:37 UTC (permalink / raw)
  To: David Hildenbrand, Wei Yang, Dev Jain
  Cc: akpm, lorenzo.stoakes, ziy, baolin.wang, Liam.Howlett, npache,
	ryan.roberts, baohua, ioworker0, linux-kernel, linux-mm



On 2025/10/17 16:11, David Hildenbrand wrote:
> On 17.10.25 03:27, Wei Yang wrote:
>> On Thu, Oct 16, 2025 at 11:47:06AM +0530, Dev Jain wrote:
>>>
>>> On 16/10/25 9:06 am, Lance Yang wrote:
>>>> From: Lance Yang <lance.yang@linux.dev>
>>>>
>>>> A non-present entry, like a swap PTE, contains completely different 
>>>> data
>>>> (swap type and offset). pte_pfn() doesn't know this, so if we feed it a
>>>> non-present entry, it will spit out a junk PFN.
>>>>
>>>> What if that junk PFN happens to match the zeropage's PFN by sheer
>>>> chance? While really unlikely, this would be really bad if it did.
>>>>
>>>> So, let's fix this potential bug by ensuring all calls to is_zero_pfn()
>>>> in khugepaged.c are properly guarded by a pte_present() check.
>>>>
>>>> Suggested-by: Lorenzo Stoakes <lorenzo.stoakes@oracle.com>
>>>> Signed-off-by: Lance Yang <lance.yang@linux.dev>
>>>> ---
>>>>    mm/khugepaged.c | 13 ++++++++-----
>>>>    1 file changed, 8 insertions(+), 5 deletions(-)
>>>>
>>>> diff --git a/mm/khugepaged.c b/mm/khugepaged.c
>>>> index d635d821f611..0341c3d13e9e 100644
>>>> --- a/mm/khugepaged.c
>>>> +++ b/mm/khugepaged.c
>>>> @@ -516,7 +516,7 @@ static void release_pte_pages(pte_t *pte, pte_t 
>>>> *_pte,
>>>>            pte_t pteval = ptep_get(_pte);
>>>>            unsigned long pfn;
>>>> -        if (pte_none(pteval))
>>>> +        if (!pte_present(pteval))
>>>>                continue;
>>>>            pfn = pte_pfn(pteval);
>>>>            if (is_zero_pfn(pfn))
>>>> @@ -690,9 +690,10 @@ static void 
>>>> __collapse_huge_page_copy_succeeded(pte_t *pte,
>>>>             address += nr_ptes * PAGE_SIZE) {
>>>>            nr_ptes = 1;
>>>>            pteval = ptep_get(_pte);
>>>> -        if (pte_none(pteval) || is_zero_pfn(pte_pfn(pteval))) {
>>>> +        if (pte_none(pteval) ||
>>>> +            (pte_present(pteval) && is_zero_pfn(pte_pfn(pteval)))) {
>>>>                add_mm_counter(vma->vm_mm, MM_ANONPAGES, 1);
>>>> -            if (is_zero_pfn(pte_pfn(pteval))) {
>>>> +            if (!pte_none(pteval)) {
>>>
>>> Could save a level of indentation by saying
>>> if (pte_none(pteval))
>>>     continue;
>>>
>>
>> Vote for this :-)
> 
> I suspect there will be a v2, correct?

I was hoping a v2 wouldn't be necessary for this ;p

Of course, if we'd prefer a v2, I'm happy to send one out.

Cheers!


^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH mm-new 1/1] mm/khugepaged: guard is_zero_pfn() calls with pte_present()
  2025-10-17  8:37       ` Lance Yang
@ 2025-10-17  8:43         ` David Hildenbrand
  2025-10-17  8:47           ` Lance Yang
  0 siblings, 1 reply; 15+ messages in thread
From: David Hildenbrand @ 2025-10-17  8:43 UTC (permalink / raw)
  To: Lance Yang, Wei Yang, Dev Jain
  Cc: akpm, lorenzo.stoakes, ziy, baolin.wang, Liam.Howlett, npache,
	ryan.roberts, baohua, ioworker0, linux-kernel, linux-mm

On 17.10.25 10:37, Lance Yang wrote:
> 
> 
> On 2025/10/17 16:11, David Hildenbrand wrote:
>> On 17.10.25 03:27, Wei Yang wrote:
>>> On Thu, Oct 16, 2025 at 11:47:06AM +0530, Dev Jain wrote:
>>>>
>>>> On 16/10/25 9:06 am, Lance Yang wrote:
>>>>> From: Lance Yang <lance.yang@linux.dev>
>>>>>
>>>>> A non-present entry, like a swap PTE, contains completely different
>>>>> data
>>>>> (swap type and offset). pte_pfn() doesn't know this, so if we feed it a
>>>>> non-present entry, it will spit out a junk PFN.
>>>>>
>>>>> What if that junk PFN happens to match the zeropage's PFN by sheer
>>>>> chance? While really unlikely, this would be really bad if it did.
>>>>>
>>>>> So, let's fix this potential bug by ensuring all calls to is_zero_pfn()
>>>>> in khugepaged.c are properly guarded by a pte_present() check.
>>>>>
>>>>> Suggested-by: Lorenzo Stoakes <lorenzo.stoakes@oracle.com>
>>>>> Signed-off-by: Lance Yang <lance.yang@linux.dev>
>>>>> ---
>>>>>     mm/khugepaged.c | 13 ++++++++-----
>>>>>     1 file changed, 8 insertions(+), 5 deletions(-)
>>>>>
>>>>> diff --git a/mm/khugepaged.c b/mm/khugepaged.c
>>>>> index d635d821f611..0341c3d13e9e 100644
>>>>> --- a/mm/khugepaged.c
>>>>> +++ b/mm/khugepaged.c
>>>>> @@ -516,7 +516,7 @@ static void release_pte_pages(pte_t *pte, pte_t
>>>>> *_pte,
>>>>>             pte_t pteval = ptep_get(_pte);
>>>>>             unsigned long pfn;
>>>>> -        if (pte_none(pteval))
>>>>> +        if (!pte_present(pteval))
>>>>>                 continue;
>>>>>             pfn = pte_pfn(pteval);
>>>>>             if (is_zero_pfn(pfn))
>>>>> @@ -690,9 +690,10 @@ static void
>>>>> __collapse_huge_page_copy_succeeded(pte_t *pte,
>>>>>              address += nr_ptes * PAGE_SIZE) {
>>>>>             nr_ptes = 1;
>>>>>             pteval = ptep_get(_pte);
>>>>> -        if (pte_none(pteval) || is_zero_pfn(pte_pfn(pteval))) {
>>>>> +        if (pte_none(pteval) ||
>>>>> +            (pte_present(pteval) && is_zero_pfn(pte_pfn(pteval)))) {
>>>>>                 add_mm_counter(vma->vm_mm, MM_ANONPAGES, 1);
>>>>> -            if (is_zero_pfn(pte_pfn(pteval))) {
>>>>> +            if (!pte_none(pteval)) {
>>>>
>>>> Could save a level of indentation by saying
>>>> if (pte_none(pteval))
>>>>      continue;
>>>>
>>>
>>> Vote for this :-)
>>
>> I suspect there will be a v2, correct?
> 
> I was hoping a v2 wouldn't be necessary for this ;p
> 
> Of course, if we'd prefer a v2, I'm happy to send one out.

I lost track of what the result will be, so a v2 would be nice at least 
for me :)

-- 
Cheers

David / dhildenb



^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH mm-new 1/1] mm/khugepaged: guard is_zero_pfn() calls with pte_present()
  2025-10-17  8:43         ` David Hildenbrand
@ 2025-10-17  8:47           ` Lance Yang
  2025-10-17  9:35             ` Lorenzo Stoakes
  0 siblings, 1 reply; 15+ messages in thread
From: Lance Yang @ 2025-10-17  8:47 UTC (permalink / raw)
  To: David Hildenbrand, Wei Yang, Dev Jain
  Cc: akpm, lorenzo.stoakes, ziy, baolin.wang, Liam.Howlett, npache,
	ryan.roberts, baohua, ioworker0, linux-kernel, linux-mm



On 2025/10/17 16:43, David Hildenbrand wrote:
> On 17.10.25 10:37, Lance Yang wrote:
>>
>>
>> On 2025/10/17 16:11, David Hildenbrand wrote:
>>> On 17.10.25 03:27, Wei Yang wrote:
>>>> On Thu, Oct 16, 2025 at 11:47:06AM +0530, Dev Jain wrote:
>>>>>
>>>>> On 16/10/25 9:06 am, Lance Yang wrote:
>>>>>> From: Lance Yang <lance.yang@linux.dev>
>>>>>>
>>>>>> A non-present entry, like a swap PTE, contains completely different
>>>>>> data
>>>>>> (swap type and offset). pte_pfn() doesn't know this, so if we feed 
>>>>>> it a
>>>>>> non-present entry, it will spit out a junk PFN.
>>>>>>
>>>>>> What if that junk PFN happens to match the zeropage's PFN by sheer
>>>>>> chance? While really unlikely, this would be really bad if it did.
>>>>>>
>>>>>> So, let's fix this potential bug by ensuring all calls to 
>>>>>> is_zero_pfn()
>>>>>> in khugepaged.c are properly guarded by a pte_present() check.
>>>>>>
>>>>>> Suggested-by: Lorenzo Stoakes <lorenzo.stoakes@oracle.com>
>>>>>> Signed-off-by: Lance Yang <lance.yang@linux.dev>
>>>>>> ---
>>>>>>     mm/khugepaged.c | 13 ++++++++-----
>>>>>>     1 file changed, 8 insertions(+), 5 deletions(-)
>>>>>>
>>>>>> diff --git a/mm/khugepaged.c b/mm/khugepaged.c
>>>>>> index d635d821f611..0341c3d13e9e 100644
>>>>>> --- a/mm/khugepaged.c
>>>>>> +++ b/mm/khugepaged.c
>>>>>> @@ -516,7 +516,7 @@ static void release_pte_pages(pte_t *pte, pte_t
>>>>>> *_pte,
>>>>>>             pte_t pteval = ptep_get(_pte);
>>>>>>             unsigned long pfn;
>>>>>> -        if (pte_none(pteval))
>>>>>> +        if (!pte_present(pteval))
>>>>>>                 continue;
>>>>>>             pfn = pte_pfn(pteval);
>>>>>>             if (is_zero_pfn(pfn))
>>>>>> @@ -690,9 +690,10 @@ static void
>>>>>> __collapse_huge_page_copy_succeeded(pte_t *pte,
>>>>>>              address += nr_ptes * PAGE_SIZE) {
>>>>>>             nr_ptes = 1;
>>>>>>             pteval = ptep_get(_pte);
>>>>>> -        if (pte_none(pteval) || is_zero_pfn(pte_pfn(pteval))) {
>>>>>> +        if (pte_none(pteval) ||
>>>>>> +            (pte_present(pteval) && is_zero_pfn(pte_pfn(pteval)))) {
>>>>>>                 add_mm_counter(vma->vm_mm, MM_ANONPAGES, 1);
>>>>>> -            if (is_zero_pfn(pte_pfn(pteval))) {
>>>>>> +            if (!pte_none(pteval)) {
>>>>>
>>>>> Could save a level of indentation by saying
>>>>> if (pte_none(pteval))
>>>>>      continue;
>>>>>
>>>>
>>>> Vote for this :-)
>>>
>>> I suspect there will be a v2, correct?
>>
>> I was hoping a v2 wouldn't be necessary for this ;p
>>
>> Of course, if we'd prefer a v2, I'm happy to send one out.
> 
> I lost track of what the result will be, so a v2 would be nice at least 
> for me :)

Sure. V2 on the way ;)



^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH mm-new 1/1] mm/khugepaged: guard is_zero_pfn() calls with pte_present()
  2025-10-17  8:47           ` Lance Yang
@ 2025-10-17  9:35             ` Lorenzo Stoakes
  0 siblings, 0 replies; 15+ messages in thread
From: Lorenzo Stoakes @ 2025-10-17  9:35 UTC (permalink / raw)
  To: Lance Yang
  Cc: David Hildenbrand, Wei Yang, Dev Jain, akpm, ziy, baolin.wang,
	Liam.Howlett, npache, ryan.roberts, baohua, ioworker0,
	linux-kernel, linux-mm

On Fri, Oct 17, 2025 at 04:47:20PM +0800, Lance Yang wrote:
> Sure. V2 on the way ;)

Will wait for that before review then :)


^ permalink raw reply	[flat|nested] 15+ messages in thread

end of thread, other threads:[~2025-10-17  9:35 UTC | newest]

Thread overview: 15+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2025-10-16  3:36 [PATCH mm-new 1/1] mm/khugepaged: guard is_zero_pfn() calls with pte_present() Lance Yang
2025-10-16  5:34 ` Dev Jain
2025-10-16  5:59   ` Lance Yang
2025-10-16  6:15     ` Dev Jain
2025-10-16  6:17 ` Dev Jain
2025-10-16  6:26   ` Lance Yang
2025-10-17  1:27   ` Wei Yang
2025-10-17  8:11     ` David Hildenbrand
2025-10-17  8:37       ` Lance Yang
2025-10-17  8:43         ` David Hildenbrand
2025-10-17  8:47           ` Lance Yang
2025-10-17  9:35             ` Lorenzo Stoakes
2025-10-16  9:33 ` Wei Yang
2025-10-16 10:51   ` Lance Yang
2025-10-17  8:10 ` Baolin Wang

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox