linux-mm.kvack.org archive mirror
 help / color / mirror / Atom feed
* [PATCH RFC] mm: userfaultfd: correct dirty flags set for both present and swap pte
@ 2025-05-08  9:07 Barry Song
  2025-05-08  9:24 ` David Hildenbrand
  2025-05-08 15:23 ` Peter Xu
  0 siblings, 2 replies; 7+ messages in thread
From: Barry Song @ 2025-05-08  9:07 UTC (permalink / raw)
  To: akpm, linux-mm
  Cc: linux-kernel, Barry Song, David Hildenbrand, Peter Xu,
	Suren Baghdasaryan, Lokesh Gidra

From: Barry Song <v-songbaohua@oppo.com>

As David pointed out, what truly matters for mremap and userfaultfd
move operations is the soft dirty bit. The current comment and
implementation—which always sets the dirty bit for present PTEs
and fails to set the soft dirty bit for swap PTEs—are incorrect.
This patch updates the behavior to correctly set the soft dirty bit
for both present and swap PTEs in accordance with mremap.

Reported-by: David Hildenbrand <david@redhat.com>
Closes: https://lore.kernel.org/linux-mm/02f14ee1-923f-47e3-a994-4950afb9afcc@redhat.com/
Cc: Peter Xu <peterx@redhat.com>
Cc: Suren Baghdasaryan <surenb@google.com>
Cc: Lokesh Gidra <lokeshgidra@google.com>
Signed-off-by: Barry Song <v-songbaohua@oppo.com>
---
 mm/userfaultfd.c | 12 ++++++++++--
 1 file changed, 10 insertions(+), 2 deletions(-)

diff --git a/mm/userfaultfd.c b/mm/userfaultfd.c
index e8ce92dc105f..bc473ad21202 100644
--- a/mm/userfaultfd.c
+++ b/mm/userfaultfd.c
@@ -1064,8 +1064,13 @@ static int move_present_pte(struct mm_struct *mm,
 	src_folio->index = linear_page_index(dst_vma, dst_addr);
 
 	orig_dst_pte = folio_mk_pte(src_folio, dst_vma->vm_page_prot);
-	/* Follow mremap() behavior and treat the entry dirty after the move */
-	orig_dst_pte = pte_mkwrite(pte_mkdirty(orig_dst_pte), dst_vma);
+	/* Set soft dirty bit so userspace can notice the pte was moved */
+#ifdef CONFIG_MEM_SOFT_DIRTY
+	orig_dst_pte = pte_mksoft_dirty(orig_dst_pte);
+#endif
+	if (pte_dirty(orig_src_pte))
+		orig_dst_pte = pte_mkdirty(orig_dst_pte);
+	orig_dst_pte = pte_mkwrite(orig_dst_pte, dst_vma);
 
 	set_pte_at(mm, dst_addr, dst_pte, orig_dst_pte);
 out:
@@ -1100,6 +1105,9 @@ static int move_swap_pte(struct mm_struct *mm, struct vm_area_struct *dst_vma,
 	}
 
 	orig_src_pte = ptep_get_and_clear(mm, src_addr, src_pte);
+#ifdef CONFIG_MEM_SOFT_DIRTY
+	orig_src_pte = pte_swp_mksoft_dirty(orig_src_pte);
+#endif
 	set_pte_at(mm, dst_addr, dst_pte, orig_src_pte);
 	double_pt_unlock(dst_ptl, src_ptl);
 
-- 
2.39.3 (Apple Git-146)



^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH RFC] mm: userfaultfd: correct dirty flags set for both present and swap pte
  2025-05-08  9:07 [PATCH RFC] mm: userfaultfd: correct dirty flags set for both present and swap pte Barry Song
@ 2025-05-08  9:24 ` David Hildenbrand
  2025-05-08 15:23 ` Peter Xu
  1 sibling, 0 replies; 7+ messages in thread
From: David Hildenbrand @ 2025-05-08  9:24 UTC (permalink / raw)
  To: Barry Song, akpm, linux-mm
  Cc: linux-kernel, Barry Song, Peter Xu, Suren Baghdasaryan, Lokesh Gidra

On 08.05.25 11:07, Barry Song wrote:
> From: Barry Song <v-songbaohua@oppo.com>
> 
> As David pointed out, what truly matters for mremap and userfaultfd
> move operations is the soft dirty bit. The current comment and
> implementation—which always sets the dirty bit for present PTEs
> and fails to set the soft dirty bit for swap PTEs—are incorrect.
> This patch updates the behavior to correctly set the soft dirty bit
> for both present and swap PTEs in accordance with mremap.
> 
> Reported-by: David Hildenbrand <david@redhat.com>
> Closes: https://lore.kernel.org/linux-mm/02f14ee1-923f-47e3-a994-4950afb9afcc@redhat.com/
> Cc: Peter Xu <peterx@redhat.com>
> Cc: Suren Baghdasaryan <surenb@google.com>
> Cc: Lokesh Gidra <lokeshgidra@google.com>
> Signed-off-by: Barry Song <v-songbaohua@oppo.com>
> ---
>   mm/userfaultfd.c | 12 ++++++++++--
>   1 file changed, 10 insertions(+), 2 deletions(-)
> 
> diff --git a/mm/userfaultfd.c b/mm/userfaultfd.c
> index e8ce92dc105f..bc473ad21202 100644
> --- a/mm/userfaultfd.c
> +++ b/mm/userfaultfd.c
> @@ -1064,8 +1064,13 @@ static int move_present_pte(struct mm_struct *mm,
>   	src_folio->index = linear_page_index(dst_vma, dst_addr);
>   
>   	orig_dst_pte = folio_mk_pte(src_folio, dst_vma->vm_page_prot);
> -	/* Follow mremap() behavior and treat the entry dirty after the move */
> -	orig_dst_pte = pte_mkwrite(pte_mkdirty(orig_dst_pte), dst_vma);
> +	/* Set soft dirty bit so userspace can notice the pte was moved */
> +#ifdef CONFIG_MEM_SOFT_DIRTY
> +	orig_dst_pte = pte_mksoft_dirty(orig_dst_pte);
> +#endif
> +	if (pte_dirty(orig_src_pte))
> +		orig_dst_pte = pte_mkdirty(orig_dst_pte);
> +	orig_dst_pte = pte_mkwrite(orig_dst_pte, dst_vma);
>   
>   	set_pte_at(mm, dst_addr, dst_pte, orig_dst_pte);
>   out:
> @@ -1100,6 +1105,9 @@ static int move_swap_pte(struct mm_struct *mm, struct vm_area_struct *dst_vma,
>   	}
>   
>   	orig_src_pte = ptep_get_and_clear(mm, src_addr, src_pte);
> +#ifdef CONFIG_MEM_SOFT_DIRTY
> +	orig_src_pte = pte_swp_mksoft_dirty(orig_src_pte);
> +#endif
>   	set_pte_at(mm, dst_addr, dst_pte, orig_src_pte);
>   	double_pt_unlock(dst_ptl, src_ptl);

Yeah, I think that should be the right thing to do.

-- 
Cheers,

David / dhildenb



^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH RFC] mm: userfaultfd: correct dirty flags set for both present and swap pte
  2025-05-08  9:07 [PATCH RFC] mm: userfaultfd: correct dirty flags set for both present and swap pte Barry Song
  2025-05-08  9:24 ` David Hildenbrand
@ 2025-05-08 15:23 ` Peter Xu
  2025-05-08 15:27   ` Lokesh Gidra
  2025-05-08 15:33   ` Suren Baghdasaryan
  1 sibling, 2 replies; 7+ messages in thread
From: Peter Xu @ 2025-05-08 15:23 UTC (permalink / raw)
  To: Barry Song
  Cc: akpm, linux-mm, linux-kernel, Barry Song, David Hildenbrand,
	Suren Baghdasaryan, Lokesh Gidra

On Thu, May 08, 2025 at 09:07:35PM +1200, Barry Song wrote:
> From: Barry Song <v-songbaohua@oppo.com>
> 
> As David pointed out, what truly matters for mremap and userfaultfd
> move operations is the soft dirty bit. The current comment and
> implementation—which always sets the dirty bit for present PTEs
> and fails to set the soft dirty bit for swap PTEs—are incorrect.
> This patch updates the behavior to correctly set the soft dirty bit
> for both present and swap PTEs in accordance with mremap.
> 
> Reported-by: David Hildenbrand <david@redhat.com>
> Closes: https://lore.kernel.org/linux-mm/02f14ee1-923f-47e3-a994-4950afb9afcc@redhat.com/
> Cc: Peter Xu <peterx@redhat.com>
> Cc: Suren Baghdasaryan <surenb@google.com>
> Cc: Lokesh Gidra <lokeshgidra@google.com>
> Signed-off-by: Barry Song <v-songbaohua@oppo.com>

Acked-by: Peter Xu <peterx@redhat.com>

-- 
Peter Xu



^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH RFC] mm: userfaultfd: correct dirty flags set for both present and swap pte
  2025-05-08 15:23 ` Peter Xu
@ 2025-05-08 15:27   ` Lokesh Gidra
  2025-05-08 21:36     ` Barry Song
  2025-05-08 15:33   ` Suren Baghdasaryan
  1 sibling, 1 reply; 7+ messages in thread
From: Lokesh Gidra @ 2025-05-08 15:27 UTC (permalink / raw)
  To: Peter Xu
  Cc: Barry Song, akpm, linux-mm, linux-kernel, Barry Song,
	David Hildenbrand, Suren Baghdasaryan, Kalesh Singh

Thanks Barry for fixing this.

On Thu, May 8, 2025 at 8:24 AM Peter Xu <peterx@redhat.com> wrote:
>
> On Thu, May 08, 2025 at 09:07:35PM +1200, Barry Song wrote:
> > From: Barry Song <v-songbaohua@oppo.com>
> >
> > As David pointed out, what truly matters for mremap and userfaultfd
> > move operations is the soft dirty bit. The current comment and
> > implementation—which always sets the dirty bit for present PTEs
> > and fails to set the soft dirty bit for swap PTEs—are incorrect.

Can you please briefly describe the consequences of not setting the
soft-dirty bit? I'm wondering if it needs to be backported as a fix?

> > This patch updates the behavior to correctly set the soft dirty bit
> > for both present and swap PTEs in accordance with mremap.
> >
> > Reported-by: David Hildenbrand <david@redhat.com>
> > Closes: https://lore.kernel.org/linux-mm/02f14ee1-923f-47e3-a994-4950afb9afcc@redhat.com/
> > Cc: Peter Xu <peterx@redhat.com>
> > Cc: Suren Baghdasaryan <surenb@google.com>
> > Cc: Lokesh Gidra <lokeshgidra@google.com>
> > Signed-off-by: Barry Song <v-songbaohua@oppo.com>
>
> Acked-by: Peter Xu <peterx@redhat.com>
>
> --
> Peter Xu
>


^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH RFC] mm: userfaultfd: correct dirty flags set for both present and swap pte
  2025-05-08 15:23 ` Peter Xu
  2025-05-08 15:27   ` Lokesh Gidra
@ 2025-05-08 15:33   ` Suren Baghdasaryan
  1 sibling, 0 replies; 7+ messages in thread
From: Suren Baghdasaryan @ 2025-05-08 15:33 UTC (permalink / raw)
  To: Peter Xu
  Cc: Barry Song, akpm, linux-mm, linux-kernel, Barry Song,
	David Hildenbrand, Lokesh Gidra

On Thu, May 8, 2025 at 8:23 AM Peter Xu <peterx@redhat.com> wrote:
>
> On Thu, May 08, 2025 at 09:07:35PM +1200, Barry Song wrote:
> > From: Barry Song <v-songbaohua@oppo.com>
> >
> > As David pointed out, what truly matters for mremap and userfaultfd
> > move operations is the soft dirty bit. The current comment and
> > implementation—which always sets the dirty bit for present PTEs
> > and fails to set the soft dirty bit for swap PTEs—are incorrect.
> > This patch updates the behavior to correctly set the soft dirty bit
> > for both present and swap PTEs in accordance with mremap.
> >
> > Reported-by: David Hildenbrand <david@redhat.com>
> > Closes: https://lore.kernel.org/linux-mm/02f14ee1-923f-47e3-a994-4950afb9afcc@redhat.com/
> > Cc: Peter Xu <peterx@redhat.com>
> > Cc: Suren Baghdasaryan <surenb@google.com>
> > Cc: Lokesh Gidra <lokeshgidra@google.com>
> > Signed-off-by: Barry Song <v-songbaohua@oppo.com>
>
> Acked-by: Peter Xu <peterx@redhat.com>

Reviewed-by: Suren Baghdasaryan <surenb@google.com>

The function was introduced in 6.8. We should probably CC stable as well.

>
> --
> Peter Xu
>


^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH RFC] mm: userfaultfd: correct dirty flags set for both present and swap pte
  2025-05-08 15:27   ` Lokesh Gidra
@ 2025-05-08 21:36     ` Barry Song
  2025-05-08 22:00       ` Lokesh Gidra
  0 siblings, 1 reply; 7+ messages in thread
From: Barry Song @ 2025-05-08 21:36 UTC (permalink / raw)
  To: Lokesh Gidra
  Cc: Peter Xu, akpm, linux-mm, linux-kernel, Barry Song,
	David Hildenbrand, Suren Baghdasaryan, Kalesh Singh

On Fri, May 9, 2025 at 3:27 AM Lokesh Gidra <lokeshgidra@google.com> wrote:
>
> Thanks Barry for fixing this.
>
> On Thu, May 8, 2025 at 8:24 AM Peter Xu <peterx@redhat.com> wrote:
> >
> > On Thu, May 08, 2025 at 09:07:35PM +1200, Barry Song wrote:
> > > From: Barry Song <v-songbaohua@oppo.com>
> > >
> > > As David pointed out, what truly matters for mremap and userfaultfd
> > > move operations is the soft dirty bit. The current comment and
> > > implementation—which always sets the dirty bit for present PTEs
> > > and fails to set the soft dirty bit for swap PTEs—are incorrect.
>
> Can you please briefly describe the consequences of not setting the
> soft-dirty bit? I'm wondering if it needs to be backported as a fix?

As I understand it, this could break features like Checkpoint-Restore
in Userspace (CRIU), which relies on tracking memory changes to create
incremental dumps. While Android may not currently have a real-world
use case for this, it would still be beneficial to backport the fix in
a general way.

>
> > > This patch updates the behavior to correctly set the soft dirty bit
> > > for both present and swap PTEs in accordance with mremap.
> > >
> > > Reported-by: David Hildenbrand <david@redhat.com>
> > > Closes: https://lore.kernel.org/linux-mm/02f14ee1-923f-47e3-a994-4950afb9afcc@redhat.com/
> > > Cc: Peter Xu <peterx@redhat.com>
> > > Cc: Suren Baghdasaryan <surenb@google.com>
> > > Cc: Lokesh Gidra <lokeshgidra@google.com>
> > > Signed-off-by: Barry Song <v-songbaohua@oppo.com>
> >
> > Acked-by: Peter Xu <peterx@redhat.com>
> >
> > --
> > Peter Xu

Thanks
Barry


^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH RFC] mm: userfaultfd: correct dirty flags set for both present and swap pte
  2025-05-08 21:36     ` Barry Song
@ 2025-05-08 22:00       ` Lokesh Gidra
  0 siblings, 0 replies; 7+ messages in thread
From: Lokesh Gidra @ 2025-05-08 22:00 UTC (permalink / raw)
  To: Barry Song
  Cc: Peter Xu, akpm, linux-mm, linux-kernel, Barry Song,
	David Hildenbrand, Suren Baghdasaryan, Kalesh Singh

On Thu, May 8, 2025 at 2:36 PM Barry Song <21cnbao@gmail.com> wrote:
>
> On Fri, May 9, 2025 at 3:27 AM Lokesh Gidra <lokeshgidra@google.com> wrote:
> >
> > Thanks Barry for fixing this.
> >
> > On Thu, May 8, 2025 at 8:24 AM Peter Xu <peterx@redhat.com> wrote:
> > >
> > > On Thu, May 08, 2025 at 09:07:35PM +1200, Barry Song wrote:
> > > > From: Barry Song <v-songbaohua@oppo.com>
> > > >
> > > > As David pointed out, what truly matters for mremap and userfaultfd
> > > > move operations is the soft dirty bit. The current comment and
> > > > implementation—which always sets the dirty bit for present PTEs
> > > > and fails to set the soft dirty bit for swap PTEs—are incorrect.
> >
> > Can you please briefly describe the consequences of not setting the
> > soft-dirty bit? I'm wondering if it needs to be backported as a fix?
>
> As I understand it, this could break features like Checkpoint-Restore
> in Userspace (CRIU), which relies on tracking memory changes to create
> incremental dumps. While Android may not currently have a real-world
> use case for this, it would still be beneficial to backport the fix in
> a general way.
>
Makes sense. Thanks for clarifying.
> >
> > > > This patch updates the behavior to correctly set the soft dirty bit
> > > > for both present and swap PTEs in accordance with mremap.
> > > >
> > > > Reported-by: David Hildenbrand <david@redhat.com>
> > > > Closes: https://lore.kernel.org/linux-mm/02f14ee1-923f-47e3-a994-4950afb9afcc@redhat.com/
> > > > Cc: Peter Xu <peterx@redhat.com>
> > > > Cc: Suren Baghdasaryan <surenb@google.com>
> > > > Cc: Lokesh Gidra <lokeshgidra@google.com>
> > > > Signed-off-by: Barry Song <v-songbaohua@oppo.com>
> > >
> > > Acked-by: Peter Xu <peterx@redhat.com>
> > >
> > > --
> > > Peter Xu
>
> Thanks
> Barry


^ permalink raw reply	[flat|nested] 7+ messages in thread

end of thread, other threads:[~2025-05-08 22:00 UTC | newest]

Thread overview: 7+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2025-05-08  9:07 [PATCH RFC] mm: userfaultfd: correct dirty flags set for both present and swap pte Barry Song
2025-05-08  9:24 ` David Hildenbrand
2025-05-08 15:23 ` Peter Xu
2025-05-08 15:27   ` Lokesh Gidra
2025-05-08 21:36     ` Barry Song
2025-05-08 22:00       ` Lokesh Gidra
2025-05-08 15:33   ` Suren Baghdasaryan

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox