* [PATCH v1] mm: use ptep_get() instead of directly dereferencing pte_t*
@ 2025-03-10 14:04 Ryan Roberts
2025-03-10 14:10 ` Lorenzo Stoakes
` (3 more replies)
0 siblings, 4 replies; 7+ messages in thread
From: Ryan Roberts @ 2025-03-10 14:04 UTC (permalink / raw)
To: Andrew Morton, Lorenzo Stoakes; +Cc: Ryan Roberts, linux-mm, linux-kernel
It is best practice for all pte accesses to go via the arch helpers, to
ensure non-torn values and to allow the arch to intervene where needed
(contpte for arm64 for example). While in this case it was probably safe
to directly dereference, let's tidy it up for consistency.
Signed-off-by: Ryan Roberts <ryan.roberts@arm.com>
---
mm/migrate.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/mm/migrate.c b/mm/migrate.c
index 22e270f727ed..33a22c2d6b20 100644
--- a/mm/migrate.c
+++ b/mm/migrate.c
@@ -202,7 +202,7 @@ static bool try_to_map_unused_to_zeropage(struct page_vma_mapped_walk *pvmw,
return false;
VM_BUG_ON_PAGE(!PageAnon(page), page);
VM_BUG_ON_PAGE(!PageLocked(page), page);
- VM_BUG_ON_PAGE(pte_present(*pvmw->pte), page);
+ VM_BUG_ON_PAGE(pte_present(ptep_get(pvmw->pte)), page);
if (folio_test_mlocked(folio) || (pvmw->vma->vm_flags & VM_LOCKED) ||
mm_forbids_zeropage(pvmw->vma->vm_mm))
--
2.43.0
* Re: [PATCH v1] mm: use ptep_get() instead of directly dereferencing pte_t*
2025-03-10 14:04 [PATCH v1] mm: use ptep_get() instead of directly dereferencing pte_t* Ryan Roberts
@ 2025-03-10 14:10 ` Lorenzo Stoakes
2025-03-10 15:51 ` Qi Zheng
` (2 subsequent siblings)
3 siblings, 0 replies; 7+ messages in thread
From: Lorenzo Stoakes @ 2025-03-10 14:10 UTC (permalink / raw)
To: Ryan Roberts; +Cc: Andrew Morton, linux-mm, linux-kernel
On Mon, Mar 10, 2025 at 02:04:17PM +0000, Ryan Roberts wrote:
> It is best practice for all pte accesses to go via the arch helpers, to
> ensure non-torn values and to allow the arch to intervene where needed
> (contpte for arm64 for example). While in this case it was probably safe
> to directly dereference, let's tidy it up for consistency.
>
> Signed-off-by: Ryan Roberts <ryan.roberts@arm.com>
LGTM, so:
Reviewed-by: Lorenzo Stoakes <lorenzo.stoakes@oracle.com>
> ---
> mm/migrate.c | 2 +-
> 1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/mm/migrate.c b/mm/migrate.c
> index 22e270f727ed..33a22c2d6b20 100644
> --- a/mm/migrate.c
> +++ b/mm/migrate.c
> @@ -202,7 +202,7 @@ static bool try_to_map_unused_to_zeropage(struct page_vma_mapped_walk *pvmw,
> return false;
> VM_BUG_ON_PAGE(!PageAnon(page), page);
> VM_BUG_ON_PAGE(!PageLocked(page), page);
> - VM_BUG_ON_PAGE(pte_present(*pvmw->pte), page);
> + VM_BUG_ON_PAGE(pte_present(ptep_get(pvmw->pte)), page);
>
> if (folio_test_mlocked(folio) || (pvmw->vma->vm_flags & VM_LOCKED) ||
> mm_forbids_zeropage(pvmw->vma->vm_mm))
> --
> 2.43.0
>
* Re: [PATCH v1] mm: use ptep_get() instead of directly dereferencing pte_t*
2025-03-10 14:04 [PATCH v1] mm: use ptep_get() instead of directly dereferencing pte_t* Ryan Roberts
2025-03-10 14:10 ` Lorenzo Stoakes
@ 2025-03-10 15:51 ` Qi Zheng
2025-03-10 16:31 ` Ryan Roberts
2025-03-11 5:28 ` Anshuman Khandual
2025-03-11 6:03 ` Dev Jain
3 siblings, 1 reply; 7+ messages in thread
From: Qi Zheng @ 2025-03-10 15:51 UTC (permalink / raw)
To: Ryan Roberts, Andrew Morton, Lorenzo Stoakes; +Cc: linux-mm, linux-kernel
Hi Ryan,
On 3/10/25 10:04 PM, Ryan Roberts wrote:
> It is best practice for all pte accesses to go via the arch helpers, to
> ensure non-torn values and to allow the arch to intervene where needed
> (contpte for arm64 for example). While in this case it was probably safe
> to directly dereference, let's tidy it up for consistency.
>
> Signed-off-by: Ryan Roberts <ryan.roberts@arm.com>
> ---
> mm/migrate.c | 2 +-
> 1 file changed, 1 insertion(+), 1 deletion(-)
This looks good to me. So
Reviewed-by: Qi Zheng <zhengqi.arch@bytedance.com>
BTW, there are many other places in the kernel that directly
dereference pmd_t* and pud_t*, etc.
For example:
root@debian:~# grep "*vmf->pmd" . -rwn
./mm/memory.c:5113: if (pmd_none(*vmf->pmd) && !vmf->prealloc_pte) {
./mm/memory.c:5207: if (unlikely(!pmd_none(*vmf->pmd)))
./mm/memory.c:5339: if (pmd_none(*vmf->pmd)) {
./mm/memory.c:5490: if (pmd_none(*vmf->pmd)) {
./mm/memory.c:5996: if (unlikely(pmd_none(*vmf->pmd))) {
./mm/filemap.c:3612: if (pmd_trans_huge(*vmf->pmd)) {
./mm/filemap.c:3618: if (pmd_none(*vmf->pmd) &&
folio_test_pmd_mappable(folio)) {
./mm/filemap.c:3628: if (pmd_none(*vmf->pmd) && vmf->prealloc_pte)
./mm/huge_memory.c:1237: if (unlikely(!pmd_none(*vmf->pmd))) {
./mm/huge_memory.c:1352: if (pmd_none(*vmf->pmd)) {
./mm/huge_memory.c:1496: if (pmd_none(*vmf->pmd)) {
./mm/huge_memory.c:1882: if (unlikely(!pmd_same(*vmf->pmd, vmf->orig_pmd)))
./mm/huge_memory.c:1947: if (unlikely(!pmd_same(*vmf->pmd, orig_pmd))) {
./mm/huge_memory.c:1965: if (unlikely(!pmd_same(*vmf->pmd, orig_pmd))) {
./fs/dax.c:1935: if (pmd_trans_huge(*vmf->pmd) || pmd_devmap(*vmf->pmd)) {
./fs/dax.c:2058: if (!pmd_none(*vmf->pmd) && !pmd_trans_huge(*vmf->pmd) &&
./fs/dax.c:2059: !pmd_devmap(*vmf->pmd)) {
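A conversion would presumably just switch these over to the existing
pmdp_get() helper, something like this (sketch only, taking one of the
mm/memory.c hits above as an example):

-	if (pmd_none(*vmf->pmd)) {
+	if (pmd_none(pmdp_get(vmf->pmd))) {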
Would it be best to clean them up as well?
Thanks,
Qi
>
> diff --git a/mm/migrate.c b/mm/migrate.c
> index 22e270f727ed..33a22c2d6b20 100644
> --- a/mm/migrate.c
> +++ b/mm/migrate.c
> @@ -202,7 +202,7 @@ static bool try_to_map_unused_to_zeropage(struct page_vma_mapped_walk *pvmw,
> return false;
> VM_BUG_ON_PAGE(!PageAnon(page), page);
> VM_BUG_ON_PAGE(!PageLocked(page), page);
> - VM_BUG_ON_PAGE(pte_present(*pvmw->pte), page);
> + VM_BUG_ON_PAGE(pte_present(ptep_get(pvmw->pte)), page);
>
> if (folio_test_mlocked(folio) || (pvmw->vma->vm_flags & VM_LOCKED) ||
> mm_forbids_zeropage(pvmw->vma->vm_mm))
> --
> 2.43.0
>
* Re: [PATCH v1] mm: use ptep_get() instead of directly dereferencing pte_t*
2025-03-10 15:51 ` Qi Zheng
@ 2025-03-10 16:31 ` Ryan Roberts
2025-03-11 3:14 ` Qi Zheng
0 siblings, 1 reply; 7+ messages in thread
From: Ryan Roberts @ 2025-03-10 16:31 UTC (permalink / raw)
To: Qi Zheng, Andrew Morton, Lorenzo Stoakes, Anshuman Khandual
Cc: linux-mm, linux-kernel
On 10/03/2025 15:51, Qi Zheng wrote:
> Hi Ryan,
>
> On 3/10/25 10:04 PM, Ryan Roberts wrote:
>> It is best practice for all pte accesses to go via the arch helpers, to
>> ensure non-torn values and to allow the arch to intervene where needed
>> (contpte for arm64 for example). While in this case it was probably safe
>> to directly dereference, let's tidy it up for consistency.
>>
>> Signed-off-by: Ryan Roberts <ryan.roberts@arm.com>
>> ---
>> mm/migrate.c | 2 +-
>> 1 file changed, 1 insertion(+), 1 deletion(-)
>
> This looks good to me. So
>
> Reviewed-by: Qi Zheng <zhengqi.arch@bytedance.com>
Thanks!
>
> BTW, there are many other places in the kernel that directly
> dereference pmd_t* and pud_t*, etc.
It's all a little bit murky. For now, from arm64's perspective at least, there
is only a hard requirement for ptes to be accessed through the arch helper. This
is because arm64's contpte layer may apply a transform when reading the pte.
In general there are also potential issues with tearing, if you don't at least
read with READ_ONCE(). But often the consumer of the value is tolerant of tearing
(e.g. pmd_none(), etc.). Also, in practice on arm64 the compiler will emit
instructions that ensure single-copy-atomicity for direct dereferences, so it
all works out.
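For context, the generic ptep_get() fallback is essentially just a READ_ONCE()
wrapper, roughly along these lines (sketch, not quoting the exact upstream
definition); arm64 overrides it to fold in the contpte transform:

	#ifndef ptep_get
	static inline pte_t ptep_get(pte_t *ptep)
	{
		/* A plain dereference would allow a torn, non-atomic read. */
		return READ_ONCE(*ptep);
	}
	#endif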
That said, Anshuman (cc'ed) has been looking at supporting FEAT_D128 (128 bit
page table descriptors) on arm64. The compiler does not emit single-copy-atomic
loads for direct dereferences of 128 bit data, so he has been working on
converting the other levels to use the accessors for that reason.
But that has some potentially problematic interactions with level folding that
need to be solved. Some arches rely on the compiler optimizing away the direct
dereferences when folded. But it can't do that for a READ_ONCE().
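Roughly the shape of the folding problem (illustrative only, not real arch code):

	pud_t *pud = pud_offset(p4d, addr);

	/*
	 * With the pud level folded, pud_none() is a compile-time 0, so the
	 * compiler is free to discard this plain dereference entirely:
	 */
	if (pud_none(*pud))
		return 0;

	/*
	 * ...but wrap the read in READ_ONCE() and the (volatile) load must
	 * still be emitted, even though the branch is dead:
	 */
	if (pud_none(READ_ONCE(*pud)))
		return 0;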
I believe Anshuman is aiming to post a series to do this at some point in the
future.
Thanks,
Ryan
>
> For example:
>
> root@debian:~# grep "*vmf->pmd" . -rwn
> ./mm/memory.c:5113: if (pmd_none(*vmf->pmd) && !vmf->prealloc_pte) {
> ./mm/memory.c:5207: if (unlikely(!pmd_none(*vmf->pmd)))
> ./mm/memory.c:5339: if (pmd_none(*vmf->pmd)) {
> ./mm/memory.c:5490: if (pmd_none(*vmf->pmd)) {
> ./mm/memory.c:5996: if (unlikely(pmd_none(*vmf->pmd))) {
> ./mm/filemap.c:3612: if (pmd_trans_huge(*vmf->pmd)) {
> ./mm/filemap.c:3618: if (pmd_none(*vmf->pmd) &&
> folio_test_pmd_mappable(folio)) {
> ./mm/filemap.c:3628: if (pmd_none(*vmf->pmd) && vmf->prealloc_pte)
> ./mm/huge_memory.c:1237: if (unlikely(!pmd_none(*vmf->pmd))) {
> ./mm/huge_memory.c:1352: if (pmd_none(*vmf->pmd)) {
> ./mm/huge_memory.c:1496: if (pmd_none(*vmf->pmd)) {
> ./mm/huge_memory.c:1882: if (unlikely(!pmd_same(*vmf->pmd, vmf->orig_pmd)))
> ./mm/huge_memory.c:1947: if (unlikely(!pmd_same(*vmf->pmd, orig_pmd))) {
> ./mm/huge_memory.c:1965: if (unlikely(!pmd_same(*vmf->pmd, orig_pmd))) {
> ./fs/dax.c:1935: if (pmd_trans_huge(*vmf->pmd) || pmd_devmap(*vmf->pmd)) {
> ./fs/dax.c:2058: if (!pmd_none(*vmf->pmd) && !pmd_trans_huge(*vmf->pmd) &&
> ./fs/dax.c:2059: !pmd_devmap(*vmf->pmd)) {
>
> Would it be best to clean them up as well?
>
> Thanks,
> Qi
>
>>
>> diff --git a/mm/migrate.c b/mm/migrate.c
>> index 22e270f727ed..33a22c2d6b20 100644
>> --- a/mm/migrate.c
>> +++ b/mm/migrate.c
>> @@ -202,7 +202,7 @@ static bool try_to_map_unused_to_zeropage(struct page_vma_mapped_walk *pvmw,
>> return false;
>> VM_BUG_ON_PAGE(!PageAnon(page), page);
>> VM_BUG_ON_PAGE(!PageLocked(page), page);
>> - VM_BUG_ON_PAGE(pte_present(*pvmw->pte), page);
>> + VM_BUG_ON_PAGE(pte_present(ptep_get(pvmw->pte)), page);
>>
>> if (folio_test_mlocked(folio) || (pvmw->vma->vm_flags & VM_LOCKED) ||
>> mm_forbids_zeropage(pvmw->vma->vm_mm))
>> --
>> 2.43.0
>>
>
* Re: [PATCH v1] mm: use ptep_get() instead of directly dereferencing pte_t*
2025-03-10 16:31 ` Ryan Roberts
@ 2025-03-11 3:14 ` Qi Zheng
0 siblings, 0 replies; 7+ messages in thread
From: Qi Zheng @ 2025-03-11 3:14 UTC (permalink / raw)
To: Ryan Roberts, Andrew Morton, Lorenzo Stoakes, Anshuman Khandual
Cc: linux-mm, linux-kernel
On 3/11/25 12:31 AM, Ryan Roberts wrote:
> On 10/03/2025 15:51, Qi Zheng wrote:
>> Hi Ryan,
>>
>> On 3/10/25 10:04 PM, Ryan Roberts wrote:
>>> It is best practice for all pte accesses to go via the arch helpers, to
>>> ensure non-torn values and to allow the arch to intervene where needed
>>> (contpte for arm64 for example). While in this case it was probably safe
>>> to directly dereference, let's tidy it up for consistency.
>>>
>>> Signed-off-by: Ryan Roberts <ryan.roberts@arm.com>
>>> ---
>>> mm/migrate.c | 2 +-
>>> 1 file changed, 1 insertion(+), 1 deletion(-)
>>
>> This looks good to me. So
>>
>> Reviewed-by: Qi Zheng <zhengqi.arch@bytedance.com>
>
> Thanks!
>
>>
>> BTW, there are many other places in the kernel that directly
>> dereference pmd_t* and pud_t*, etc.
>
> It's all a little bit murky. For now, from arm64's perspective at least, there
> is only a hard requirement for ptes to be accessed through the arch helper. This
> is because arm64's contpte layer may apply a transform when reading the pte.
>
> In general there are also potential issues with tearing, if you don't at least
> read with READ_ONCE(). But often the consumer of the value is tolerant of tearing
> (e.g. pmd_none(), etc.). Also, in practice on arm64 the compiler will emit
> instructions that ensure single-copy-atomicity for direct dereferences, so it
> all works out.
>
> That said, Anshuman (cc'ed) has been looking at supporting FEAT_D128 (128 bit
> page table descriptors) on arm64. The compiler does not emit single-copy-atomic
> loads for direct dereferences of 128 bit data, so he has been working on
> converting the other levels to use the accessors for that reason.
>
> But that has some potentially problematic interactions with level folding that
> need to be solved. Some arches rely on the compiler optimizing away the direct
> dereferences when folded. But it can't do that for a READ_ONCE().
>
> I believe Anshuman is aiming to post a series to do this at some point in the
> future.
Got it. Thanks!
* Re: [PATCH v1] mm: use ptep_get() instead of directly dereferencing pte_t*
2025-03-10 14:04 [PATCH v1] mm: use ptep_get() instead of directly dereferencing pte_t* Ryan Roberts
2025-03-10 14:10 ` Lorenzo Stoakes
2025-03-10 15:51 ` Qi Zheng
@ 2025-03-11 5:28 ` Anshuman Khandual
2025-03-11 6:03 ` Dev Jain
3 siblings, 0 replies; 7+ messages in thread
From: Anshuman Khandual @ 2025-03-11 5:28 UTC (permalink / raw)
To: Ryan Roberts, Andrew Morton, Lorenzo Stoakes; +Cc: linux-mm, linux-kernel
On 3/10/25 19:34, Ryan Roberts wrote:
> It is best practice for all pte accesses to go via the arch helpers, to
> ensure non-torn values and to allow the arch to intervene where needed
> (contpte for arm64 for example). While in this case it was probably safe
> to directly dereference, let's tidy it up for consistency.
>
> Signed-off-by: Ryan Roberts <ryan.roberts@arm.com>
> ---
> mm/migrate.c | 2 +-
> 1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/mm/migrate.c b/mm/migrate.c
> index 22e270f727ed..33a22c2d6b20 100644
> --- a/mm/migrate.c
> +++ b/mm/migrate.c
> @@ -202,7 +202,7 @@ static bool try_to_map_unused_to_zeropage(struct page_vma_mapped_walk *pvmw,
> return false;
> VM_BUG_ON_PAGE(!PageAnon(page), page);
> VM_BUG_ON_PAGE(!PageLocked(page), page);
> - VM_BUG_ON_PAGE(pte_present(*pvmw->pte), page);
> + VM_BUG_ON_PAGE(pte_present(ptep_get(pvmw->pte)), page);
>
> if (folio_test_mlocked(folio) || (pvmw->vma->vm_flags & VM_LOCKED) ||
> mm_forbids_zeropage(pvmw->vma->vm_mm))
> --
> 2.43.0
>
>
LGTM.
Reviewed-by: Anshuman Khandual <anshuman.khandual@arm.com>
* Re: [PATCH v1] mm: use ptep_get() instead of directly dereferencing pte_t*
2025-03-10 14:04 [PATCH v1] mm: use ptep_get() instead of directly dereferencing pte_t* Ryan Roberts
` (2 preceding siblings ...)
2025-03-11 5:28 ` Anshuman Khandual
@ 2025-03-11 6:03 ` Dev Jain
3 siblings, 0 replies; 7+ messages in thread
From: Dev Jain @ 2025-03-11 6:03 UTC (permalink / raw)
To: Ryan Roberts, Andrew Morton, Lorenzo Stoakes; +Cc: linux-mm, linux-kernel
On 10/03/25 7:34 pm, Ryan Roberts wrote:
> It is best practice for all pte accesses to go via the arch helpers, to
> ensure non-torn values and to allow the arch to intervene where needed
> (contpte for arm64 for example). While in this case it was probably safe
> to directly dereference, let's tidy it up for consistency.
>
> Signed-off-by: Ryan Roberts <ryan.roberts@arm.com>
Reviewed-by: Dev Jain <dev.jain@arm.com>
Thread overview: 7+ messages
2025-03-10 14:04 [PATCH v1] mm: use ptep_get() instead of directly dereferencing pte_t* Ryan Roberts
2025-03-10 14:10 ` Lorenzo Stoakes
2025-03-10 15:51 ` Qi Zheng
2025-03-10 16:31 ` Ryan Roberts
2025-03-11 3:14 ` Qi Zheng
2025-03-11 5:28 ` Anshuman Khandual
2025-03-11 6:03 ` Dev Jain