linux-mm.kvack.org archive mirror
* [PATCH 0/2] make VM_SOFTDIRTY a sticky VMA flag
@ 2025-11-14 17:53 Lorenzo Stoakes
  2025-11-14 17:53 ` [PATCH 1/2] mm: propagate VM_SOFTDIRTY on merge Lorenzo Stoakes
                   ` (3 more replies)
  0 siblings, 4 replies; 21+ messages in thread
From: Lorenzo Stoakes @ 2025-11-14 17:53 UTC (permalink / raw)
  To: Andrew Morton
  Cc: David Hildenbrand, Liam R . Howlett, Vlastimil Babka,
	Mike Rapoport, Suren Baghdasaryan, Michal Hocko, Jann Horn,
	Pedro Falcato, linux-mm, linux-kernel

Currently we set VM_SOFTDIRTY when a new mapping is set up (whether by
establishing a new VMA, or via merge) as implemented in __mmap_complete()
and do_brk_flags().

However, when performing a merge of existing mappings such as when
performing mprotect(), we may lose the VM_SOFTDIRTY flag.


Lorenzo Stoakes (2):
  mm: propagate VM_SOFTDIRTY on merge
  testing/selftests/mm: add soft-dirty merge self-test

 include/linux/mm.h                      | 23 ++++++-----
 tools/testing/selftests/mm/soft-dirty.c | 51 ++++++++++++++++++++++++-
 tools/testing/vma/vma_internal.h        | 23 ++++++-----
 3 files changed, 72 insertions(+), 25 deletions(-)

--
2.51.0



* [PATCH 1/2] mm: propagate VM_SOFTDIRTY on merge
  2025-11-14 17:53 [PATCH 0/2] make VM_SOFTDIRTY a sticky VMA flag Lorenzo Stoakes
@ 2025-11-14 17:53 ` Lorenzo Stoakes
  2025-11-17  4:39   ` Anshuman Khandual
                     ` (3 more replies)
  2025-11-14 17:53 ` [PATCH 2/2] testing/selftests/mm: add soft-dirty merge self-test Lorenzo Stoakes
                   ` (2 subsequent siblings)
  3 siblings, 4 replies; 21+ messages in thread
From: Lorenzo Stoakes @ 2025-11-14 17:53 UTC (permalink / raw)
  To: Andrew Morton
  Cc: David Hildenbrand, Liam R . Howlett, Vlastimil Babka,
	Mike Rapoport, Suren Baghdasaryan, Michal Hocko, Jann Horn,
	Pedro Falcato, linux-mm, linux-kernel

Currently we set VM_SOFTDIRTY when a new mapping is set up (whether by
establishing a new VMA, or via merge) as implemented in __mmap_complete()
and do_brk_flags().

However, when performing a merge of existing mappings such as when
performing mprotect(), we may lose the VM_SOFTDIRTY flag.

This is because currently we simply ignore VM_SOFTDIRTY for the purposes of
merge, so one VMA may possess the flag and another not, and whichever
happens to be the target VMA will be the one upon which the merge is
performed which may or may not have VM_SOFTDIRTY set.

Now we have the concept of 'sticky' VMA flags, let's make VM_SOFTDIRTY one
which solves this issue.

Additionally update VMA userland tests to propagate changes.

Suggested-by: Vlastimil Babka <vbabka@suse.cz>
Signed-off-by: Lorenzo Stoakes <lorenzo.stoakes@oracle.com>
---
 include/linux/mm.h               | 23 +++++++++++------------
 tools/testing/vma/vma_internal.h | 23 +++++++++++------------
 2 files changed, 22 insertions(+), 24 deletions(-)

diff --git a/include/linux/mm.h b/include/linux/mm.h
index 43eec43da66a..fd9eeff07eb5 100644
--- a/include/linux/mm.h
+++ b/include/linux/mm.h
@@ -532,29 +532,28 @@ extern unsigned int kobjsize(const void *objp);
  * possesses it but the other does not, the merged VMA should nonetheless have
  * applied to it:
  *
+ *   VM_SOFTDIRTY - if a VMA is marked soft-dirty, that is has not had its
+ *                  references cleared via /proc/$pid/clear_refs, any merged VMA
+ *                  should be considered soft-dirty also as it operates at a VMA
+ *                  granularity.
+ *
  * VM_MAYBE_GUARD - If a VMA may have guard regions in place it implies that
  *                  mapped page tables may contain metadata not described by the
  *                  VMA and thus any merged VMA may also contain this metadata,
  *                  and thus we must make this flag sticky.
  */
-#define VM_STICKY VM_MAYBE_GUARD
+#define VM_STICKY (VM_SOFTDIRTY | VM_MAYBE_GUARD)
 
 /*
  * VMA flags we ignore for the purposes of merge, i.e. one VMA possessing one
  * of these flags and the other not does not preclude a merge.
  *
- * VM_SOFTDIRTY - Should not prevent from VMA merging, if we match the flags but
- *                dirty bit -- the caller should mark merged VMA as dirty. If
- *                dirty bit won't be excluded from comparison, we increase
- *                pressure on the memory system forcing the kernel to generate
- *                new VMAs when old one could be extended instead.
- *
- *    VM_STICKY - If one VMA has flags which most be 'sticky', that is ones
- *                which should propagate to all VMAs, but the other does not,
- *                the merge should still proceed with the merge logic applying
- *                sticky flags to the final VMA.
+ * VM_STICKY - If one VMA has flags which must be 'sticky', that is ones
+ *             which should propagate to all VMAs, but the other does not,
+ *             the merge should still proceed with the merge logic applying
+ *             sticky flags to the final VMA.
  */
-#define VM_IGNORE_MERGE (VM_SOFTDIRTY | VM_STICKY)
+#define VM_IGNORE_MERGE VM_STICKY
 
 /*
  * Flags which should result in page tables being copied on fork. These are
diff --git a/tools/testing/vma/vma_internal.h b/tools/testing/vma/vma_internal.h
index bd6352a5f24d..10f46a95a73a 100644
--- a/tools/testing/vma/vma_internal.h
+++ b/tools/testing/vma/vma_internal.h
@@ -122,29 +122,28 @@ extern unsigned long dac_mmap_min_addr;
  * possesses it but the other does not, the merged VMA should nonetheless have
  * applied to it:
  *
+ *   VM_SOFTDIRTY - if a VMA is marked soft-dirty, that is has not had its
+ *                  references cleared via /proc/$pid/clear_refs, any merged VMA
+ *                  should be considered soft-dirty also as it operates at a VMA
+ *                  granularity.
+ *
  * VM_MAYBE_GUARD - If a VMA may have guard regions in place it implies that
  *                  mapped page tables may contain metadata not described by the
  *                  VMA and thus any merged VMA may also contain this metadata,
  *                  and thus we must make this flag sticky.
  */
-#define VM_STICKY VM_MAYBE_GUARD
+#define VM_STICKY (VM_SOFTDIRTY | VM_MAYBE_GUARD)
 
 /*
  * VMA flags we ignore for the purposes of merge, i.e. one VMA possessing one
  * of these flags and the other not does not preclude a merge.
  *
- * VM_SOFTDIRTY - Should not prevent from VMA merging, if we match the flags but
- *                dirty bit -- the caller should mark merged VMA as dirty. If
- *                dirty bit won't be excluded from comparison, we increase
- *                pressure on the memory system forcing the kernel to generate
- *                new VMAs when old one could be extended instead.
- *
- *    VM_STICKY - If one VMA has flags which most be 'sticky', that is ones
- *                which should propagate to all VMAs, but the other does not,
- *                the merge should still proceed with the merge logic applying
- *                sticky flags to the final VMA.
+ * VM_STICKY - If one VMA has flags which must be 'sticky', that is ones
+ *             which should propagate to all VMAs, but the other does not,
+ *             the merge should still proceed with the merge logic applying
+ *             sticky flags to the final VMA.
  */
-#define VM_IGNORE_MERGE (VM_SOFTDIRTY | VM_STICKY)
+#define VM_IGNORE_MERGE VM_STICKY
 
 /*
  * Flags which should result in page tables being copied on fork. These are
-- 
2.51.0
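
As a rough illustration of the semantics this patch describes, here is a
minimal userland sketch. It is not the kernel implementation: the EX_-prefixed
names and the bit values are placeholders invented for the example. Only the
shape of the logic (sticky flags are ignored when comparing two VMAs, then
OR'd into the merged result) is taken from the description above.

```c
#include <stdbool.h>

/* Placeholder bit values for illustration only; NOT the kernel's values. */
#define EX_VM_READ        0x1UL
#define EX_VM_WRITE       0x2UL
#define EX_VM_SOFTDIRTY   0x4UL
#define EX_VM_MAYBE_GUARD 0x8UL

/* Mirrors the shape of the definitions in the patch. */
#define EX_VM_STICKY       (EX_VM_SOFTDIRTY | EX_VM_MAYBE_GUARD)
#define EX_VM_IGNORE_MERGE EX_VM_STICKY

/*
 * Two sets of flags are compatible for merge if they differ only in
 * flags we have agreed to ignore.
 */
static inline bool ex_flags_mergeable(unsigned long a, unsigned long b)
{
	return ((a ^ b) & ~EX_VM_IGNORE_MERGE) == 0;
}

/*
 * The merged VMA keeps the target's flags and additionally inherits any
 * sticky flag that the other VMA possessed.
 */
static inline unsigned long ex_merged_flags(unsigned long target,
					    unsigned long other)
{
	return target | (other & EX_VM_STICKY);
}
```

With this model, merging a target that lacks EX_VM_SOFTDIRTY with a neighbour
that has it yields a soft-dirty result, regardless of which VMA is the target.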




* [PATCH 2/2] testing/selftests/mm: add soft-dirty merge self-test
  2025-11-14 17:53 [PATCH 0/2] make VM_SOFTDIRTY a sticky VMA flag Lorenzo Stoakes
  2025-11-14 17:53 ` [PATCH 1/2] mm: propagate VM_SOFTDIRTY on merge Lorenzo Stoakes
@ 2025-11-14 17:53 ` Lorenzo Stoakes
  2025-11-17 14:44   ` David Hildenbrand (Red Hat)
  2025-11-14 21:53 ` [PATCH 0/2] make VM_SOFTDIRTY a sticky VMA flag Andrew Morton
  2025-11-17  0:53 ` Andrei Vagin
  3 siblings, 1 reply; 21+ messages in thread
From: Lorenzo Stoakes @ 2025-11-14 17:53 UTC (permalink / raw)
  To: Andrew Morton
  Cc: David Hildenbrand, Liam R . Howlett, Vlastimil Babka,
	Mike Rapoport, Suren Baghdasaryan, Michal Hocko, Jann Horn,
	Pedro Falcato, linux-mm, linux-kernel

Assert that we correctly merge VMAs containing VM_SOFTDIRTY flags now that
we correctly handle these as sticky.

In order to do so, we have to account for the fact the pagemap interface
checks soft dirty PTEs and additionally that newly merged VMAs are marked
VM_SOFTDIRTY.

To account for this we use unfaulted anon VMAs, mapping one VMA in and
clearing soft-dirty, then another separate from the first which will be
marked soft-dirty which we then mremap() into place.

Signed-off-by: Lorenzo Stoakes <lorenzo.stoakes@oracle.com>
---
 tools/testing/selftests/mm/soft-dirty.c | 51 ++++++++++++++++++++++++-
 1 file changed, 50 insertions(+), 1 deletion(-)

diff --git a/tools/testing/selftests/mm/soft-dirty.c b/tools/testing/selftests/mm/soft-dirty.c
index 4ee4db3750c1..bb29edb1e2a3 100644
--- a/tools/testing/selftests/mm/soft-dirty.c
+++ b/tools/testing/selftests/mm/soft-dirty.c
@@ -184,6 +184,54 @@ static void test_mprotect(int pagemap_fd, int pagesize, bool anon)
 		close(test_fd);
 }
 
+static void test_merge(int pagemap_fd, int pagesize)
+{
+	char *reserved, *map, *map2;
+
+	/* Reserve space. */
+	reserved = mmap(NULL, 4 * pagesize, PROT_NONE,
+			MAP_ANON | MAP_PRIVATE, -1, 0);
+	if (reserved == MAP_FAILED)
+		ksft_exit_fail_msg("mmap failed\n");
+	munmap(reserved, 4 * pagesize);
+
+	/* Map a page. */
+	map = mmap(&reserved[pagesize], pagesize, PROT_READ | PROT_WRITE,
+		   MAP_ANON | MAP_PRIVATE | MAP_FIXED, -1, 0);
+	if (map == MAP_FAILED)
+		ksft_exit_fail_msg("mmap failed\n");
+
+	/* This will clear VM_SOFTDIRTY too. */
+	clear_softdirty();
+
+	/*
+	 * Now place a new mapping which will be marked VM_SOFTDIRTY. Away from
+	 * map.
+	 */
+	map2 = mmap(&reserved[3 * pagesize], pagesize, PROT_READ | PROT_WRITE,
+		    MAP_ANON | MAP_PRIVATE | MAP_FIXED, -1, 0);
+	if (map2 == MAP_FAILED)
+		ksft_exit_fail_msg("mmap failed\n");
+
+	/*
+	 * Now remap it immediately adjacent to map, if the merge correctly
+	 * propagates VM_SOFTDIRTY, we should then observe the VMA as a whole
+	 * being marked soft-dirty.
+	 */
+	map2 = mremap(map2, pagesize, pagesize, MREMAP_FIXED | MREMAP_MAYMOVE,
+		      &reserved[2 * pagesize]);
+	if (map2 == MAP_FAILED)
+		ksft_exit_fail_msg("mremap failed\n");
+	ksft_test_result(pagemap_is_softdirty(pagemap_fd, map) == 1,
+			 "Test %s-anon soft-dirty after merge 1st pg\n",
+			 __func__);
+	ksft_test_result(pagemap_is_softdirty(pagemap_fd, map2) == 1,
+			 "Test %s-anon soft-dirty after merge 2nd pg\n",
+			 __func__);
+
+	munmap(map, 2 * pagesize);
+}
+
 static void test_mprotect_anon(int pagemap_fd, int pagesize)
 {
 	test_mprotect(pagemap_fd, pagesize, true);
@@ -204,7 +252,7 @@ int main(int argc, char **argv)
 	if (!softdirty_supported())
 		ksft_exit_skip("soft-dirty is not support\n");
 
-	ksft_set_plan(15);
+	ksft_set_plan(17);
 	pagemap_fd = open(PAGEMAP_FILE_PATH, O_RDONLY);
 	if (pagemap_fd < 0)
 		ksft_exit_fail_msg("Failed to open %s\n", PAGEMAP_FILE_PATH);
@@ -216,6 +264,7 @@ int main(int argc, char **argv)
 	test_hugepage(pagemap_fd, pagesize);
 	test_mprotect_anon(pagemap_fd, pagesize);
 	test_mprotect_file(pagemap_fd, pagesize);
+	test_merge(pagemap_fd, pagesize);
 
 	close(pagemap_fd);
 
-- 
2.51.0
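
For readers unfamiliar with how the test observes soft-dirty state, the
following is a hedged sketch of the underlying procfs interface. It is not the
selftest's own helper code: the ex_-prefixed functions are hypothetical names
for this example. It relies on the documented pagemap layout (bit 55 of each
64-bit entry is the soft-dirty bit) and on writing "4" to clear_refs to clear
soft-dirty state.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE
#endif
#include <fcntl.h>
#include <stdint.h>
#include <sys/types.h>
#include <unistd.h>

/* Bit 55 of a pagemap entry reports the soft-dirty state. */
#define EX_PM_SOFT_DIRTY (1ULL << 55)

/* Pure helper: does this raw pagemap entry have the soft-dirty bit set? */
static inline int ex_entry_is_softdirty(uint64_t entry)
{
	return (entry & EX_PM_SOFT_DIRTY) != 0;
}

/*
 * Read the pagemap entry covering addr from an fd opened on
 * /proc/self/pagemap. Returns 1 or 0, or -1 if the read fails.
 */
static inline int ex_addr_is_softdirty(int pagemap_fd, const void *addr,
				       long pagesize)
{
	uint64_t entry;
	off_t offset = (off_t)((uintptr_t)addr / (uintptr_t)pagesize) *
		       (off_t)sizeof(entry);

	if (pread(pagemap_fd, &entry, sizeof(entry), offset) !=
	    (ssize_t)sizeof(entry))
		return -1;
	return ex_entry_is_softdirty(entry);
}

/* Clear soft-dirty state for the calling process. Returns 0 on success. */
static inline int ex_clear_softdirty(void)
{
	int fd = open("/proc/self/clear_refs", O_WRONLY);
	ssize_t ret;

	if (fd < 0)
		return -1;
	ret = write(fd, "4", 1);
	close(fd);
	return ret == 1 ? 0 : -1;
}
```

Note that the value reported through pagemap also reflects VM_SOFTDIRTY on the
containing VMA, which is why the test above uses unfaulted pages: any
soft-dirty result it observes comes from the VMA flag rather than from a PTE.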




* Re: [PATCH 0/2] make VM_SOFTDIRTY a sticky VMA flag
  2025-11-14 17:53 [PATCH 0/2] make VM_SOFTDIRTY a sticky VMA flag Lorenzo Stoakes
  2025-11-14 17:53 ` [PATCH 1/2] mm: propagate VM_SOFTDIRTY on merge Lorenzo Stoakes
  2025-11-14 17:53 ` [PATCH 2/2] testing/selftests/mm: add soft-dirty merge self-test Lorenzo Stoakes
@ 2025-11-14 21:53 ` Andrew Morton
  2025-11-17 11:41   ` Lorenzo Stoakes
  2025-11-17  0:53 ` Andrei Vagin
  3 siblings, 1 reply; 21+ messages in thread
From: Andrew Morton @ 2025-11-14 21:53 UTC (permalink / raw)
  To: Lorenzo Stoakes
  Cc: David Hildenbrand, Liam R . Howlett, Vlastimil Babka,
	Mike Rapoport, Suren Baghdasaryan, Michal Hocko, Jann Horn,
	Pedro Falcato, linux-mm, linux-kernel

On Fri, 14 Nov 2025 17:53:17 +0000 Lorenzo Stoakes <lorenzo.stoakes@oracle.com> wrote:

> Currently we set VM_SOFTDIRTY when a new mapping is set up (whether by
> establishing a new VMA, or via merge) as implemented in __mmap_complete()
> and do_brk_flags().
> 
> However, when performing a merge of existing mappings such as when
> performing mprotect(), we may lose the VM_SOFTDIRTY flag.
> 

userspace-visible effects?

Documentation/admin-guide/mm/soft-dirty.rst tells me that this can
already happen in other circumstances so I guess it isn't very serious.
CRIU inefficiency, perhaps?

Please review Documentation/admin-guide/mm/soft-dirty.rst, check that
it is complete and accurate?



* Re: [PATCH 0/2] make VM_SOFTDIRTY a sticky VMA flag
  2025-11-14 17:53 [PATCH 0/2] make VM_SOFTDIRTY a sticky VMA flag Lorenzo Stoakes
                   ` (2 preceding siblings ...)
  2025-11-14 21:53 ` [PATCH 0/2] make VM_SOFTDIRTY a sticky VMA flag Andrew Morton
@ 2025-11-17  0:53 ` Andrei Vagin
  2025-11-17  4:37   ` Anshuman Khandual
  2025-11-17 11:32   ` Lorenzo Stoakes
  3 siblings, 2 replies; 21+ messages in thread
From: Andrei Vagin @ 2025-11-17  0:53 UTC (permalink / raw)
  To: Lorenzo Stoakes
  Cc: Andrew Morton, David Hildenbrand, Liam R . Howlett,
	Vlastimil Babka, Mike Rapoport, Suren Baghdasaryan, Michal Hocko,
	Jann Horn, Pedro Falcato, linux-mm, linux-kernel, criu

On Fri, Nov 14, 2025 at 9:59 AM Lorenzo Stoakes
<lorenzo.stoakes@oracle.com> wrote:
>
> Currently we set VM_SOFTDIRTY when a new mapping is set up (whether by
> establishing a new VMA, or via merge) as implemented in __mmap_complete()
> and do_brk_flags().
>
> However, when performing a merge of existing mappings such as when
> performing mprotect(), we may lose the VM_SOFTDIRTY flag.

Losing VM_SOFTDIRTY is definitely a bug, thank you for fixing it.

A separate concern is whether merging two VMAs should be permitted when
one has the VM_SOFTDIRTY flag set and another does not. I think the
merging operation should be disallowed. The issue is that
PAGE_IS_SOFT_DIRTY will be reported for every page in the resulting VMA.
Consider a scenario where a large VMA has only a small number of pages
marked SOFT_DIRTY. If we merge it with a smaller VMA that does have
VM_SOFTDIRTY, all pages in the originally large VMA will subsequently be
reported as SOFT_DIRTY. As a result, CRIU will needlessly dump all of
these pages again, even though the vast majority of them were unchanged
since the prior checkpoint iteration.

Thanks,
Andrei

>
>
> Lorenzo Stoakes (2):
>   mm: propagate VM_SOFTDIRTY on merge
>   testing/selftests/mm: add soft-dirty merge self-test
>
>  include/linux/mm.h                      | 23 ++++++-----
>  tools/testing/selftests/mm/soft-dirty.c | 51 ++++++++++++++++++++++++-
>  tools/testing/vma/vma_internal.h        | 23 ++++++-----
>  3 files changed, 72 insertions(+), 25 deletions(-)
>
> --
> 2.51.0
>



* Re: [PATCH 0/2] make VM_SOFTDIRTY a sticky VMA flag
  2025-11-17  0:53 ` Andrei Vagin
@ 2025-11-17  4:37   ` Anshuman Khandual
  2025-11-17 11:32   ` Lorenzo Stoakes
  1 sibling, 0 replies; 21+ messages in thread
From: Anshuman Khandual @ 2025-11-17  4:37 UTC (permalink / raw)
  To: Andrei Vagin, Lorenzo Stoakes
  Cc: Andrew Morton, David Hildenbrand, Liam R . Howlett,
	Vlastimil Babka, Mike Rapoport, Suren Baghdasaryan, Michal Hocko,
	Jann Horn, Pedro Falcato, linux-mm, linux-kernel, criu

On 17/11/25 6:23 AM, Andrei Vagin wrote:
> On Fri, Nov 14, 2025 at 9:59 AM Lorenzo Stoakes
> <lorenzo.stoakes@oracle.com> wrote:
>>
>> Currently we set VM_SOFTDIRTY when a new mapping is set up (whether by
>> establishing a new VMA, or via merge) as implemented in __mmap_complete()
>> and do_brk_flags().
>>
>> However, when performing a merge of existing mappings such as when
>> performing mprotect(), we may lose the VM_SOFTDIRTY flag.
> 
> Losing VM_SOFTDIRTY is definitely a bug, thank you for fixing it.
> 
> A separate concern is whether merging two VMAs should be permitted when
> one has the VM_SOFTDIRTY flag set and another does not. I think the
> merging operation should be disallowed. The issue is that

If merging VM_SOFTDIRTY and non-VM_SOFTDIRTY VMAs were not allowed, then
what would be the point of making VM_SOFTDIRTY a VM_STICKY flag?

> PAGE_IS_SOFT_DIRTY will be reported for every page in the resulting VMA.
> Consider a scenario where a large VMA has only a small number of pages
> marked SOFT_DIRTY. If we merge it with a smaller VMA that does have
> VM_SOFTDIRTY, all pages in the originally large VMA will subsequently be
> reported as SOFT_DIRTY. As a result, CRIU will needlessly dump all of
> these pages again, even though the vast majority of them were unchanged
> since the prior checkpoint iteration.
> 
> Thanks,
> Andrei
> 
>>
>>
>> Lorenzo Stoakes (2):
>>   mm: propagate VM_SOFTDIRTY on merge
>>   testing/selftests/mm: add soft-dirty merge self-test
>>
>>  include/linux/mm.h                      | 23 ++++++-----
>>  tools/testing/selftests/mm/soft-dirty.c | 51 ++++++++++++++++++++++++-
>>  tools/testing/vma/vma_internal.h        | 23 ++++++-----
>>  3 files changed, 72 insertions(+), 25 deletions(-)
>>
>> --
>> 2.51.0
>>
> 




* Re: [PATCH 1/2] mm: propagate VM_SOFTDIRTY on merge
  2025-11-14 17:53 ` [PATCH 1/2] mm: propagate VM_SOFTDIRTY on merge Lorenzo Stoakes
@ 2025-11-17  4:39   ` Anshuman Khandual
  2025-11-17 11:34     ` Lorenzo Stoakes
  2025-11-17 14:25   ` David Hildenbrand (Red Hat)
                     ` (2 subsequent siblings)
  3 siblings, 1 reply; 21+ messages in thread
From: Anshuman Khandual @ 2025-11-17  4:39 UTC (permalink / raw)
  To: Lorenzo Stoakes, Andrew Morton
  Cc: David Hildenbrand, Liam R . Howlett, Vlastimil Babka,
	Mike Rapoport, Suren Baghdasaryan, Michal Hocko, Jann Horn,
	Pedro Falcato, linux-mm, linux-kernel



On 14/11/25 11:23 PM, Lorenzo Stoakes wrote:
> Currently we set VM_SOFTDIRTY when a new mapping is set up (whether by
> establishing a new VMA, or via merge) as implemented in __mmap_complete()
> and do_brk_flags().
> 
> However, when performing a merge of existing mappings such as when
> performing mprotect(), we may lose the VM_SOFTDIRTY flag.
> 
> This is because currently we simply ignore VM_SOFTDIRTY for the purposes of
> merge, so one VMA may possess the flag and another not, and whichever
> happens to be the target VMA will be the one upon which the merge is
> performed which may or may not have VM_SOFTDIRTY set.
> 
> Now we have the concept of 'sticky' VMA flags, let's make VM_SOFTDIRTY one
> which solves this issue.
> 
> Additionally update VMA userland tests to propagate changes.
> 
> Suggested-by: Vlastimil Babka <vbabka@suse.cz>
> Signed-off-by: Lorenzo Stoakes <lorenzo.stoakes@oracle.com>
> ---
>  include/linux/mm.h               | 23 +++++++++++------------
>  tools/testing/vma/vma_internal.h | 23 +++++++++++------------
>  2 files changed, 22 insertions(+), 24 deletions(-)
> 
> diff --git a/include/linux/mm.h b/include/linux/mm.h
> index 43eec43da66a..fd9eeff07eb5 100644
> --- a/include/linux/mm.h
> +++ b/include/linux/mm.h
> @@ -532,29 +532,28 @@ extern unsigned int kobjsize(const void *objp);
>   * possesses it but the other does not, the merged VMA should nonetheless have
>   * applied to it:
>   *
> + *   VM_SOFTDIRTY - if a VMA is marked soft-dirty, that is has not had its
> + *                  references cleared via /proc/$pid/clear_refs, any merged VMA
> + *                  should be considered soft-dirty also as it operates at a VMA
> + *                  granularity.
> + *
>   * VM_MAYBE_GUARD - If a VMA may have guard regions in place it implies that
>   *                  mapped page tables may contain metadata not described by the
>   *                  VMA and thus any merged VMA may also contain this metadata,
>   *                  and thus we must make this flag sticky.
>   */
> -#define VM_STICKY VM_MAYBE_GUARD
> +#define VM_STICKY (VM_SOFTDIRTY | VM_MAYBE_GUARD)
>  
>  /*
>   * VMA flags we ignore for the purposes of merge, i.e. one VMA possessing one
>   * of these flags and the other not does not preclude a merge.
>   *
> - * VM_SOFTDIRTY - Should not prevent from VMA merging, if we match the flags but
> - *                dirty bit -- the caller should mark merged VMA as dirty. If
> - *                dirty bit won't be excluded from comparison, we increase
> - *                pressure on the memory system forcing the kernel to generate
> - *                new VMAs when old one could be extended instead.
> - *
> - *    VM_STICKY - If one VMA has flags which most be 'sticky', that is ones
> - *                which should propagate to all VMAs, but the other does not,
> - *                the merge should still proceed with the merge logic applying
> - *                sticky flags to the final VMA.
> + * VM_STICKY - If one VMA has flags which must be 'sticky', that is ones
> + *             which should propagate to all VMAs, but the other does not,
> + *             the merge should still proceed with the merge logic applying
> + *             sticky flags to the final VMA.
>   */
> -#define VM_IGNORE_MERGE (VM_SOFTDIRTY | VM_STICKY)
> +#define VM_IGNORE_MERGE VM_STICKY

Logically VM_STICKY should be the only flag qualifying for VM_IGNORE_MERGE. In that
case, should VM_IGNORE_MERGE not be dropped altogether?



* Re: [PATCH 0/2] make VM_SOFTDIRTY a sticky VMA flag
  2025-11-17  0:53 ` Andrei Vagin
  2025-11-17  4:37   ` Anshuman Khandual
@ 2025-11-17 11:32   ` Lorenzo Stoakes
  2025-11-17 18:26     ` Andrei Vagin
  1 sibling, 1 reply; 21+ messages in thread
From: Lorenzo Stoakes @ 2025-11-17 11:32 UTC (permalink / raw)
  To: Andrei Vagin
  Cc: Andrew Morton, David Hildenbrand, Liam R . Howlett,
	Vlastimil Babka, Mike Rapoport, Suren Baghdasaryan, Michal Hocko,
	Jann Horn, Pedro Falcato, linux-mm, linux-kernel, criu

On Sun, Nov 16, 2025 at 04:53:36PM -0800, Andrei Vagin wrote:
> On Fri, Nov 14, 2025 at 9:59 AM Lorenzo Stoakes
> <lorenzo.stoakes@oracle.com> wrote:
> >
> > Currently we set VM_SOFTDIRTY when a new mapping is set up (whether by
> > establishing a new VMA, or via merge) as implemented in __mmap_complete()
> > and do_brk_flags().
> >
> > However, when performing a merge of existing mappings such as when
> > performing mprotect(), we may lose the VM_SOFTDIRTY flag.
>
> Losing VM_SOFTDIRTY is definitely a bug, thank you for fixing it.
>
> A separate concern is whether merging two VMAs should be permitted when
> one has the VM_SOFTDIRTY flag set and another does not. I think the
> merging operation should be disallowed. The issue is that


This patch doesn't change anything in terms of merging, it only _correctly_
marks VMAs as soft-dirty where certain, very specific, circumstances might
result in a merged VMA being incorrectly indicated to not be soft-dirty
when it in fact contains pages which are.

Since VMA fragmentation is an issue that impacts non-soft-dirty users, I'm
afraid we cannot split on this parameter.

It'd also be a user-visible change that could cause breaking issues
(mremap(), for instance, in _most_ cases requires that it operates on a
single VMA).

So this isn't possible.


> PAGE_IS_SOFT_DIRTY will be reported for every page in the resulting VMA.
> Consider a scenario where a large VMA has only a small number of pages
> marked SOFT_DIRTY. If we merge it with a smaller VMA that does have
> VM_SOFTDIRTY, all pages in the originally large VMA will subsequently be
> reported as SOFT_DIRTY. As a result, CRIU will needlessly dump all of
> these pages again, even though the vast majority of them were unchanged
> since the prior checkpoint iteration.

I think there's some confusion about what is possible here.

Currently if you don't invoke /proc/$pid/clear_refs, all VMAs will have
soft-dirty set until you do.

So this is a situation that _already exists_.

And intentionally so - we default all VMAs to soft-dirty so users can
detect new mappings in order not to perceive e.g. mmap()'ing over an
existing range as being no change.

OK so what if you clear references? Considering:

1. Map large VMA
2. Clear references
3. Dirty several pages (VM_SOFTDIRTY clear)
4. Map new VMA immediately _after_ it (VM_SOFTDIRTY set)
5. Merge - Before this patch: VM_SOFTDIRTY bit cleared on merge BUT SET
   AGAIN due to it being an mmap() invocation. After this patch:
   VM_SOFTDIRTY bit retained on merge but also set again due to it being an
   mmap() invocation.

So this kind of merge has no change in behaviour.

And again, it's correct - the user needs to be able to identify what's
changed.

This change fixes this behaviour to be consistent for other types of merge,
when previously it was not.

In the past, you'd get soft-dirty set/not set _depending on the type of
merge_. So if the target VMA had the flag set, you'd have it marked
soft-dirty, otherwise not.

Since it's unacceptable to fragment VMAs on the basis of soft-dirty, we're
_only_ improving correctness here, and this patch is a net good no matter
what.


>
> Thanks,
> Andrei
>
> >
> >
> > Lorenzo Stoakes (2):
> >   mm: propagate VM_SOFTDIRTY on merge
> >   testing/selftests/mm: add soft-dirty merge self-test
> >
> >  include/linux/mm.h                      | 23 ++++++-----
> >  tools/testing/selftests/mm/soft-dirty.c | 51 ++++++++++++++++++++++++-
> >  tools/testing/vma/vma_internal.h        | 23 ++++++-----
> >  3 files changed, 72 insertions(+), 25 deletions(-)
> >
> > --
> > 2.51.0
> >

Cheers, Lorenzo
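
The before/after behaviour discussed in this reply can be modelled with a small
userland sketch. This is an illustration only, not kernel code: the ex_ names
and the bit value are made up for the example, and the pre-patch behaviour is
simplified to "the result keeps whatever the target VMA had", as described
above.

```c
/* Placeholder bit value for illustration only; NOT the kernel's value. */
#define EX_VM_SOFTDIRTY 0x4UL

/* Pre-patch model: the merged VMA simply keeps the target's flags. */
static inline unsigned long ex_merge_before(unsigned long target,
					    unsigned long other)
{
	(void)other;
	return target;
}

/* Post-patch model: soft-dirty is sticky, so it propagates from either side. */
static inline unsigned long ex_merge_after(unsigned long target,
					   unsigned long other)
{
	return target | (other & EX_VM_SOFTDIRTY);
}

/* Model of the new-mapping path, which marks the result soft-dirty anyway. */
static inline unsigned long ex_new_mapping_fixup(unsigned long flags)
{
	return flags | EX_VM_SOFTDIRTY;
}
```

Under this model a merge triggered by mmap() ends up soft-dirty both before and
after the patch, because the fixup is applied either way. Only merges that do
not go through the new-mapping path (mprotect(), for example) change outcome.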



* Re: [PATCH 1/2] mm: propagate VM_SOFTDIRTY on merge
  2025-11-17  4:39   ` Anshuman Khandual
@ 2025-11-17 11:34     ` Lorenzo Stoakes
  0 siblings, 0 replies; 21+ messages in thread
From: Lorenzo Stoakes @ 2025-11-17 11:34 UTC (permalink / raw)
  To: Anshuman Khandual
  Cc: Andrew Morton, David Hildenbrand, Liam R . Howlett,
	Vlastimil Babka, Mike Rapoport, Suren Baghdasaryan, Michal Hocko,
	Jann Horn, Pedro Falcato, linux-mm, linux-kernel

On Mon, Nov 17, 2025 at 10:09:37AM +0530, Anshuman Khandual wrote:
>
>
> On 14/11/25 11:23 PM, Lorenzo Stoakes wrote:
> > Currently we set VM_SOFTDIRTY when a new mapping is set up (whether by
> > establishing a new VMA, or via merge) as implemented in __mmap_complete()
> > and do_brk_flags().
> >
> > However, when performing a merge of existing mappings such as when
> > performing mprotect(), we may lose the VM_SOFTDIRTY flag.
> >
> > This is because currently we simply ignore VM_SOFTDIRTY for the purposes of
> > merge, so one VMA may possess the flag and another not, and whichever
> > happens to be the target VMA will be the one upon which the merge is
> > performed which may or may not have VM_SOFTDIRTY set.
> >
> > Now we have the concept of 'sticky' VMA flags, let's make VM_SOFTDIRTY one
> > which solves this issue.
> >
> > Additionally update VMA userland tests to propagate changes.
> >
> > Suggested-by: Vlastimil Babka <vbabka@suse.cz>
> > Signed-off-by: Lorenzo Stoakes <lorenzo.stoakes@oracle.com>
> > ---
> >  include/linux/mm.h               | 23 +++++++++++------------
> >  tools/testing/vma/vma_internal.h | 23 +++++++++++------------
> >  2 files changed, 22 insertions(+), 24 deletions(-)
> >
> > diff --git a/include/linux/mm.h b/include/linux/mm.h
> > index 43eec43da66a..fd9eeff07eb5 100644
> > --- a/include/linux/mm.h
> > +++ b/include/linux/mm.h
> > @@ -532,29 +532,28 @@ extern unsigned int kobjsize(const void *objp);
> >   * possesses it but the other does not, the merged VMA should nonetheless have
> >   * applied to it:
> >   *
> > + *   VM_SOFTDIRTY - if a VMA is marked soft-dirty, that is has not had its
> > + *                  references cleared via /proc/$pid/clear_refs, any merged VMA
> > + *                  should be considered soft-dirty also as it operates at a VMA
> > + *                  granularity.
> > + *
> >   * VM_MAYBE_GUARD - If a VMA may have guard regions in place it implies that
> >   *                  mapped page tables may contain metadata not described by the
> >   *                  VMA and thus any merged VMA may also contain this metadata,
> >   *                  and thus we must make this flag sticky.
> >   */
> > -#define VM_STICKY VM_MAYBE_GUARD
> > +#define VM_STICKY (VM_SOFTDIRTY | VM_MAYBE_GUARD)
> >
> >  /*
> >   * VMA flags we ignore for the purposes of merge, i.e. one VMA possessing one
> >   * of these flags and the other not does not preclude a merge.
> >   *
> > - * VM_SOFTDIRTY - Should not prevent from VMA merging, if we match the flags but
> > - *                dirty bit -- the caller should mark merged VMA as dirty. If
> > - *                dirty bit won't be excluded from comparison, we increase
> > - *                pressure on the memory system forcing the kernel to generate
> > - *                new VMAs when old one could be extended instead.
> > - *
> > - *    VM_STICKY - If one VMA has flags which most be 'sticky', that is ones
> > - *                which should propagate to all VMAs, but the other does not,
> > - *                the merge should still proceed with the merge logic applying
> > - *                sticky flags to the final VMA.
> > + * VM_STICKY - If one VMA has flags which must be 'sticky', that is ones
> > + *             which should propagate to all VMAs, but the other does not,
> > + *             the merge should still proceed with the merge logic applying
> > + *             sticky flags to the final VMA.
> >   */
> > -#define VM_IGNORE_MERGE (VM_SOFTDIRTY | VM_STICKY)
> > +#define VM_IGNORE_MERGE VM_STICKY
>
> Logically VM_STICKY should be the only flag qualifying for VM_IGNORE_MERGE. In that
> case, should VM_IGNORE_MERGE not be dropped altogether?

I intentionally kept it to be explicit. This is self-documenting as-is.



* Re: [PATCH 0/2] make VM_SOFTDIRTY a sticky VMA flag
  2025-11-14 21:53 ` [PATCH 0/2] make VM_SOFTDIRTY a sticky VMA flag Andrew Morton
@ 2025-11-17 11:41   ` Lorenzo Stoakes
  0 siblings, 0 replies; 21+ messages in thread
From: Lorenzo Stoakes @ 2025-11-17 11:41 UTC (permalink / raw)
  To: Andrew Morton
  Cc: David Hildenbrand, Liam R . Howlett, Vlastimil Babka,
	Mike Rapoport, Suren Baghdasaryan, Michal Hocko, Jann Horn,
	Pedro Falcato, linux-mm, linux-kernel

On Fri, Nov 14, 2025 at 01:53:03PM -0800, Andrew Morton wrote:
> On Fri, 14 Nov 2025 17:53:17 +0000 Lorenzo Stoakes <lorenzo.stoakes@oracle.com> wrote:
>
> > Currently we set VM_SOFTDIRTY when a new mapping is set up (whether by
> > establishing a new VMA, or via merge) as implemented in __mmap_complete()
> > and do_brk_flags().
> >
> > However, when performing a merge of existing mappings such as when
> > performing mprotect(), we may lose the VM_SOFTDIRTY flag.
> >
>
> userspace-visible effects?

Simply more correct accounting of soft-dirty :)

>
> Documentation/admin-guide/mm/soft-dirty.rst tells me that this can
> already happen in other circumstances so I guess it isn't very serious.
> CRIU inefficiency, perhaps?

I don't think it should cause inefficiency other than us already _accidentally_
being more efficient, see the discussion in thread :)

>
> Please review Documentation/admin-guide/mm/soft-dirty.rst, check that
> it is complete and accurate?

It LGTM, we are changing a very specific internal implementation detail here
which I don't think is worth mentioning there (effectively - 'we used to be wrong
sometimes, now not so much') :)

(Staring at this I realise I _probably_ need to change something very specific
in the original sticky implementation. The VMA implementation is so
finicky... will send follow up fixpatch/respin on that w/details)

Cheers, Lorenzo
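
For readers wondering what "accounting of soft-dirty" looks like from
userspace, here is a minimal sketch. It assumes the interface described in
Documentation/admin-guide/mm/soft-dirty.rst: writing "4" to
/proc/PID/clear_refs clears soft-dirty state, and bit 55 of each 64-bit
/proc/PID/pagemap entry reports it. The helper names are hypothetical and are
not taken from the patch or the selftest.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Bit 55 of a pagemap entry is the soft-dirty bit (per soft-dirty.rst). */
#define PM_SOFT_DIRTY_BIT 55

/*
 * Hypothetical helper: byte offset into /proc/<pid>/pagemap of the entry
 * describing vaddr. Each entry is 8 bytes, indexed by virtual page number.
 */
static uint64_t pagemap_offset(uintptr_t vaddr, uint64_t pagesize)
{
	return (vaddr / pagesize) * sizeof(uint64_t);
}

/* Hypothetical helper: does this pagemap entry report soft-dirty? */
static bool pagemap_entry_soft_dirty(uint64_t entry)
{
	return (entry >> PM_SOFT_DIRTY_BIT) & 1;
}
```

A caller would pread() 8 bytes at pagemap_offset() and pass the value to
pagemap_entry_soft_dirty(). A lost VM_SOFTDIRTY would show up as a clear bit
on a range that should have been reported dirty.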



* Re: [PATCH 1/2] mm: propagate VM_SOFTDIRTY on merge
  2025-11-14 17:53 ` [PATCH 1/2] mm: propagate VM_SOFTDIRTY on merge Lorenzo Stoakes
  2025-11-17  4:39   ` Anshuman Khandual
@ 2025-11-17 14:25   ` David Hildenbrand (Red Hat)
  2025-11-17 15:35     ` Lorenzo Stoakes
  2025-11-17 15:47   ` Pedro Falcato
  2025-11-17 16:05   ` Liam R. Howlett
  3 siblings, 1 reply; 21+ messages in thread
From: David Hildenbrand (Red Hat) @ 2025-11-17 14:25 UTC (permalink / raw)
  To: Lorenzo Stoakes, Andrew Morton
  Cc: Liam R . Howlett, Vlastimil Babka, Mike Rapoport,
	Suren Baghdasaryan, Michal Hocko, Jann Horn, Pedro Falcato,
	linux-mm, linux-kernel

On 14.11.25 18:53, Lorenzo Stoakes wrote:
> Currently we set VM_SOFTDIRTY when a new mapping is set up (whether by
> establishing a new VMA, or via merge) as implemented in __mmap_complete()
> and do_brk_flags().
> 
> However, when performing a merge of existing mappings such as when
> performing mprotect(), we may lose the VM_SOFTDIRTY flag.
> 
> This is because currently we simply ignore VM_SOFTDIRTY for the purposes of
> merge, so one VMA may possess the flag and another not, and whichever
> happens to be the target VMA will be the one upon which the merge is
> performed which may or may not have VM_SOFTDIRTY set.
> 
> Now we have the concept of 'sticky' VMA flags, let's make VM_SOFTDIRTY one
> which solves this issue.
> 
> Additionally update VMA userland tests to propagate changes.
> 
> Suggested-by: Vlastimil Babka <vbabka@suse.cz>
> Signed-off-by: Lorenzo Stoakes <lorenzo.stoakes@oracle.com>
> ---

Looks reasonable to me. I thought that we had that behavior in the past 
... but I also remember scenarios where we would have imprecise 
soft-dirty handling. So I assume this was semi-broken for a while 
(soft-broken :) )

Acked-by: David Hildenbrand (Red Hat) <david@kernel.org>

-- 
Cheers

David



* Re: [PATCH 2/2] testing/selftests/mm: add soft-dirty merge self-test
  2025-11-14 17:53 ` [PATCH 2/2] testing/selftests/mm: add soft-dirty merge self-test Lorenzo Stoakes
@ 2025-11-17 14:44   ` David Hildenbrand (Red Hat)
  2025-11-17 15:15     ` Lorenzo Stoakes
  0 siblings, 1 reply; 21+ messages in thread
From: David Hildenbrand (Red Hat) @ 2025-11-17 14:44 UTC (permalink / raw)
  To: Lorenzo Stoakes, Andrew Morton
  Cc: Liam R . Howlett, Vlastimil Babka, Mike Rapoport,
	Suren Baghdasaryan, Michal Hocko, Jann Horn, Pedro Falcato,
	linux-mm, linux-kernel

On 14.11.25 18:53, Lorenzo Stoakes wrote:
> Assert that we correctly merge VMAs containing VM_SOFTDIRTY flags now that
> we correctly handle these as sticky.
> 
> In order to do so, we have to account for the fact the pagemap interface
> checks soft dirty PTEs and additionally that newly merged VMAs are marked
> VM_SOFTDIRTY.
> 
> To account for this we use unfaulted anon VMAs, mapping one VMA in and
> clearing soft-dirty, then another separate from the first which will be
> marked soft-dirty which we then mremap() into place.
> 
> Signed-off-by: Lorenzo Stoakes <lorenzo.stoakes@oracle.com>
> ---
>   tools/testing/selftests/mm/soft-dirty.c | 51 ++++++++++++++++++++++++-
>   1 file changed, 50 insertions(+), 1 deletion(-)
> 
> diff --git a/tools/testing/selftests/mm/soft-dirty.c b/tools/testing/selftests/mm/soft-dirty.c
> index 4ee4db3750c1..bb29edb1e2a3 100644
> --- a/tools/testing/selftests/mm/soft-dirty.c
> +++ b/tools/testing/selftests/mm/soft-dirty.c
> @@ -184,6 +184,54 @@ static void test_mprotect(int pagemap_fd, int pagesize, bool anon)
>   		close(test_fd);
>   }
>   
> +static void test_merge(int pagemap_fd, int pagesize)
> +{
> +	char *reserved, *map, *map2;
> +
> +	/* Reserve space. */

It took me a while to figure out why you are using 4 pages. I guess you 
want to make sure that we don't end up merging to the left (or the 
right). A diagram would have helped me.

> +	reserved = mmap(NULL, 4 * pagesize, PROT_NONE,
> +			MAP_ANON | MAP_PRIVATE, -1, 0);
> +	if (reserved == MAP_FAILED)
> +		ksft_exit_fail_msg("mmap failed\n");
> +	munmap(reserved, 4 * pagesize);
> +
> +	/* Map a page. */

Note that we are not actually "mapping a page". "Create a new page-sized 
VMA" or sth like that.

> +	map = mmap(&reserved[pagesize], pagesize, PROT_READ | PROT_WRITE,
> +		   MAP_ANON | MAP_PRIVATE | MAP_FIXED, -1, 0);
> +	if (map == MAP_FAILED)
> +		ksft_exit_fail_msg("mmap failed\n");
> +
> +	/* This will clear VM_SOFTDIRTY too. */
> +	clear_softdirty();
> +
> +	/*
> +	 * Now place a new mapping which will be marked VM_SOFTDIRTY. Away from
> +	 * map.

Could we have something "to the right" of this new VMA that we might be 
merging with (that might interfere?) and if so, do we care?

Just wondering if we would actually want a reserved area that spans 5 
page sizes to rule out these cases.

mmap1

[ empty ][ VMA1  ][ empty                   ]

mmap2

[ empty ][ VMA1  ][ empty ][ VMA2  ][ empty ]

mremap

[ empty ][ VMA1  ][ VMA2  ][ empty          ]

which is after the merge

[ empty ][ VMA (SD)       ][                ]


-- 
Cheers

David



* Re: [PATCH 2/2] testing/selftests/mm: add soft-dirty merge self-test
  2025-11-17 14:44   ` David Hildenbrand (Red Hat)
@ 2025-11-17 15:15     ` Lorenzo Stoakes
  0 siblings, 0 replies; 21+ messages in thread
From: Lorenzo Stoakes @ 2025-11-17 15:15 UTC (permalink / raw)
  To: David Hildenbrand (Red Hat)
  Cc: Andrew Morton, Liam R . Howlett, Vlastimil Babka, Mike Rapoport,
	Suren Baghdasaryan, Michal Hocko, Jann Horn, Pedro Falcato,
	linux-mm, linux-kernel

On Mon, Nov 17, 2025 at 03:44:46PM +0100, David Hildenbrand (Red Hat) wrote:
> On 14.11.25 18:53, Lorenzo Stoakes wrote:
> > Assert that we correctly merge VMAs containing VM_SOFTDIRTY flags now that
> > we correctly handle these as sticky.
> >
> > In order to do so, we have to account for the fact the pagemap interface
> > checks soft dirty PTEs and additionally that newly merged VMAs are marked
> > VM_SOFTDIRTY.
> >
> > To account for this we use unfaulted anon VMAs, mapping one VMA in and
> > clearing soft-dirty, then another separate from the first which will be
> > marked soft-dirty which we then mremap() into place.
> >
> > Signed-off-by: Lorenzo Stoakes <lorenzo.stoakes@oracle.com>
> > ---
> >   tools/testing/selftests/mm/soft-dirty.c | 51 ++++++++++++++++++++++++-
> >   1 file changed, 50 insertions(+), 1 deletion(-)
> >
> > diff --git a/tools/testing/selftests/mm/soft-dirty.c b/tools/testing/selftests/mm/soft-dirty.c
> > index 4ee4db3750c1..bb29edb1e2a3 100644
> > --- a/tools/testing/selftests/mm/soft-dirty.c
> > +++ b/tools/testing/selftests/mm/soft-dirty.c
> > @@ -184,6 +184,54 @@ static void test_mprotect(int pagemap_fd, int pagesize, bool anon)
> >   		close(test_fd);
> >   }
> > +static void test_merge(int pagemap_fd, int pagesize)
> > +{
> > +	char *reserved, *map, *map2;
> > +
> > +	/* Reserve space. */
>
> It took me a while to figure out why you are using 4 pages. I guess you want
> to make sure that we don't end up merging to the left (or the right). A
> diagram would have helped me.

I have spoiled everybody too much with my ASCII diagrams ;)

But _just for you_ I will make one again ;))

>
> > +	reserved = mmap(NULL, 4 * pagesize, PROT_NONE,
> > +			MAP_ANON | MAP_PRIVATE, -1, 0);
> > +	if (reserved == MAP_FAILED)
> > +		ksft_exit_fail_msg("mmap failed\n");
> > +	munmap(reserved, 4 * pagesize);
> > +
> > +	/* Map a page. */
>
> Note that we are not actually "mapping a page". "Create a new page-sized
> VMA" or sth like that.

Lol depends on your definition I suppose. Not in the sense of page table
mappings. I guess this is fairly redundant anyway so I can drop it.

>
> > +	map = mmap(&reserved[pagesize], pagesize, PROT_READ | PROT_WRITE,
> > +		   MAP_ANON | MAP_PRIVATE | MAP_FIXED, -1, 0);
> > +	if (map == MAP_FAILED)
> > +		ksft_exit_fail_msg("mmap failed\n");
> > +
> > +	/* This will clear VM_SOFTDIRTY too. */
> > +	clear_softdirty();
> > +
> > +	/*
> > +	 * Now place a new mapping which will be marked VM_SOFTDIRTY. Away from
> > +	 * map.
>
> Could we have something "to the right" of this new VMA that we might be
> merging with (that might interfere?) and if so, do we care?
>
> Just wondering if we would actually want a reserved area that spans 5 page
> sizes to rule out these cases.
>

You're right! This is an oversight, will fix.

> mmap1
>
> [ empty ][ VMA1  ][ empty                   ]
>
> mmap2
>
> [ empty ][ VMA1  ][ empty ][ VMA2  ][ empty ]
>
> mremap
>
> [ empty ][ VMA1  ][ VMA2  ][ empty          ]
>
> which is after the merge
>
> [ empty ][ VMA (SD)       ][                ]

Yeah this is the purpose of adding space around.

I also want to add an additional test anyway so can do both things at once
:)

>
>
> --
> Cheers
>
> David

Cheers, Lorenzo
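
For reference, a minimal sketch of the 5-page layout from David's diagram. It
is not the selftest itself: build_merge_layout() is a hypothetical helper, the
mremap() flags are an assumption (the hunk quoted above stops before the
mremap() call), and the clear_softdirty() step is only marked by a comment.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <stddef.h>
#include <sys/mman.h>
#include <unistd.h>

/*
 * Hypothetical helper: build [ empty ][ VMA1 ][ VMA2 ][ empty ][ empty ]
 * inside a 5-page window. Returns 0 on success, -1 on failure.
 */
static int build_merge_layout(char **vma1, char **vma2)
{
	const size_t ps = (size_t)sysconf(_SC_PAGESIZE);
	char *reserved, *map, *map2;

	/* Reserve 5 pages, then release them so the window is known to be free. */
	reserved = mmap(NULL, 5 * ps, PROT_NONE, MAP_ANON | MAP_PRIVATE, -1, 0);
	if (reserved == MAP_FAILED)
		return -1;
	munmap(reserved, 5 * ps);

	/* VMA1 at page 1, with an empty page to its left. */
	map = mmap(&reserved[ps], ps, PROT_READ | PROT_WRITE,
		   MAP_ANON | MAP_PRIVATE | MAP_FIXED, -1, 0);
	if (map == MAP_FAILED)
		return -1;

	/* The real test clears soft-dirty here, before creating VMA2. */

	/* VMA2 at page 3, with empty pages on both sides so it cannot merge yet. */
	map2 = mmap(&reserved[3 * ps], ps, PROT_READ | PROT_WRITE,
		    MAP_ANON | MAP_PRIVATE | MAP_FIXED, -1, 0);
	if (map2 == MAP_FAILED)
		return -1;

	/* Move VMA2 to page 2, directly after VMA1, where it may merge. */
	map2 = mremap(map2, ps, ps, MREMAP_MAYMOVE | MREMAP_FIXED,
		      &reserved[2 * ps]);
	if (map2 == MAP_FAILED)
		return -1;

	*vma1 = map;
	*vma2 = map2;
	return 0;
}
```

The empty page to the right of VMA2's temporary position is what the 5th page
buys: without it, VMA2 could merge with an unrelated neighbouring mapping
before the mremap().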



* Re: [PATCH 1/2] mm: propagate VM_SOFTDIRTY on merge
  2025-11-17 14:25   ` David Hildenbrand (Red Hat)
@ 2025-11-17 15:35     ` Lorenzo Stoakes
  0 siblings, 0 replies; 21+ messages in thread
From: Lorenzo Stoakes @ 2025-11-17 15:35 UTC (permalink / raw)
  To: David Hildenbrand (Red Hat)
  Cc: Andrew Morton, Liam R . Howlett, Vlastimil Babka, Mike Rapoport,
	Suren Baghdasaryan, Michal Hocko, Jann Horn, Pedro Falcato,
	linux-mm, linux-kernel

On Mon, Nov 17, 2025 at 03:25:05PM +0100, David Hildenbrand (Red Hat) wrote:
> On 14.11.25 18:53, Lorenzo Stoakes wrote:
> > Currently we set VM_SOFTDIRTY when a new mapping is set up (whether by
> > establishing a new VMA, or via merge) as implemented in __mmap_complete()
> > and do_brk_flags().
> >
> > However, when performing a merge of existing mappings such as when
> > performing mprotect(), we may lose the VM_SOFTDIRTY flag.
> >
> > This is because currently we simply ignore VM_SOFTDIRTY for the purposes of
> > merge, so one VMA may possess the flag and another not, and whichever
> > happens to be the target VMA will be the one upon which the merge is
> > performed which may or may not have VM_SOFTDIRTY set.
> >
> > Now we have the concept of 'sticky' VMA flags, let's make VM_SOFTDIRTY one
> > which solves this issue.
> >
> > Additionally update VMA userland tests to propagate changes.
> >
> > Suggested-by: Vlastimil Babka <vbabka@suse.cz>
> > Signed-off-by: Lorenzo Stoakes <lorenzo.stoakes@oracle.com>
> > ---
>
> Looks reasonable to me. I thought that we had that behavior in the past ...
> but I also remember scenarios where we would have imprecise soft-dirty
> handling. So I assume this was semi-broken for a while (soft-broken :) )

:))

Yeah it's only specific merge scenarios, and only when you e.g. already cleared
refs then mapped a new VMA and it happened to merge.

Nicer thing about this change is we stop treating VM_SOFTDIRTY like an exception
on merge :)

>
> Acked-by: David Hildenbrand (Red Hat) <david@kernel.org>

Cheers!

>
> --
> Cheers
>
> David



* Re: [PATCH 1/2] mm: propagate VM_SOFTDIRTY on merge
  2025-11-14 17:53 ` [PATCH 1/2] mm: propagate VM_SOFTDIRTY on merge Lorenzo Stoakes
  2025-11-17  4:39   ` Anshuman Khandual
  2025-11-17 14:25   ` David Hildenbrand (Red Hat)
@ 2025-11-17 15:47   ` Pedro Falcato
  2025-11-17 15:53     ` Lorenzo Stoakes
  2025-11-17 16:05   ` Liam R. Howlett
  3 siblings, 1 reply; 21+ messages in thread
From: Pedro Falcato @ 2025-11-17 15:47 UTC (permalink / raw)
  To: Lorenzo Stoakes
  Cc: Andrew Morton, David Hildenbrand, Liam R . Howlett,
	Vlastimil Babka, Mike Rapoport, Suren Baghdasaryan, Michal Hocko,
	Jann Horn, linux-mm, linux-kernel

On Fri, Nov 14, 2025 at 05:53:18PM +0000, Lorenzo Stoakes wrote:
> Currently we set VM_SOFTDIRTY when a new mapping is set up (whether by
> establishing a new VMA, or via merge) as implemented in __mmap_complete()
> and do_brk_flags().
> 
> However, when performing a merge of existing mappings such as when
> performing mprotect(), we may lose the VM_SOFTDIRTY flag.

Does it make sense to backport this to stable? A more minimal version, that is.

> 
> This is because currently we simply ignore VM_SOFTDIRTY for the purposes of
> merge, so one VMA may possess the flag and another not, and whichever
> happens to be the target VMA will be the one upon which the merge is
> performed which may or may not have VM_SOFTDIRTY set.
> 
> Now we have the concept of 'sticky' VMA flags, let's make VM_SOFTDIRTY one
> which solves this issue.
> 
> Additionally update VMA userland tests to propagate changes.
> 
> Suggested-by: Vlastimil Babka <vbabka@suse.cz>
> Signed-off-by: Lorenzo Stoakes <lorenzo.stoakes@oracle.com>

Reviewed-by: Pedro Falcato <pfalcato@suse.de>

-- 
Pedro



* Re: [PATCH 1/2] mm: propagate VM_SOFTDIRTY on merge
  2025-11-17 15:47   ` Pedro Falcato
@ 2025-11-17 15:53     ` Lorenzo Stoakes
  0 siblings, 0 replies; 21+ messages in thread
From: Lorenzo Stoakes @ 2025-11-17 15:53 UTC (permalink / raw)
  To: Pedro Falcato
  Cc: Andrew Morton, David Hildenbrand, Liam R . Howlett,
	Vlastimil Babka, Mike Rapoport, Suren Baghdasaryan, Michal Hocko,
	Jann Horn, linux-mm, linux-kernel

On Mon, Nov 17, 2025 at 03:47:51PM +0000, Pedro Falcato wrote:
> On Fri, Nov 14, 2025 at 05:53:18PM +0000, Lorenzo Stoakes wrote:
> > Currently we set VM_SOFTDIRTY when a new mapping is set up (whether by
> > establishing a new VMA, or via merge) as implemented in __mmap_complete()
> > and do_brk_flags().
> >
> > However, when performing a merge of existing mappings such as when
> > performing mprotect(), we may lose the VM_SOFTDIRTY flag.
>
> Does it make sense to backport this to stable? A more minimal version, that is.

No :) This has been subtly broken since forever. I don't think it warrants that
and it'd require significant and risky changes to older kernels to even make it
possible.

It's more a byproduct of features added so let's fix this going forward.

>
> >
> > This is because currently we simply ignore VM_SOFTDIRTY for the purposes of
> > merge, so one VMA may possess the flag and another not, and whichever
> > happens to be the target VMA will be the one upon which the merge is
> > performed which may or may not have VM_SOFTDIRTY set.
> >
> > Now we have the concept of 'sticky' VMA flags, let's make VM_SOFTDIRTY one
> > which solves this issue.
> >
> > Additionally update VMA userland tests to propagate changes.
> >
> > Suggested-by: Vlastimil Babka <vbabka@suse.cz>
> > Signed-off-by: Lorenzo Stoakes <lorenzo.stoakes@oracle.com>
>
> Reviewed-by: Pedro Falcato <pfalcato@suse.de>

Thanks!

>
> --
> Pedro

Cheers, Lorenzo



* Re: [PATCH 1/2] mm: propagate VM_SOFTDIRTY on merge
  2025-11-14 17:53 ` [PATCH 1/2] mm: propagate VM_SOFTDIRTY on merge Lorenzo Stoakes
                     ` (2 preceding siblings ...)
  2025-11-17 15:47   ` Pedro Falcato
@ 2025-11-17 16:05   ` Liam R. Howlett
  3 siblings, 0 replies; 21+ messages in thread
From: Liam R. Howlett @ 2025-11-17 16:05 UTC (permalink / raw)
  To: Lorenzo Stoakes
  Cc: Andrew Morton, David Hildenbrand, Vlastimil Babka, Mike Rapoport,
	Suren Baghdasaryan, Michal Hocko, Jann Horn, Pedro Falcato,
	linux-mm, linux-kernel

* Lorenzo Stoakes <lorenzo.stoakes@oracle.com> [251114 12:53]:
> Currently we set VM_SOFTDIRTY when a new mapping is set up (whether by
> establishing a new VMA, or via merge) as implemented in __mmap_complete()
> and do_brk_flags().
> 
> However, when performing a merge of existing mappings such as when
> performing mprotect(), we may lose the VM_SOFTDIRTY flag.
> 
> This is because currently we simply ignore VM_SOFTDIRTY for the purposes of
> merge, so one VMA may possess the flag and another not, and whichever
> happens to be the target VMA will be the one upon which the merge is
> performed which may or may not have VM_SOFTDIRTY set.
> 
> Now we have the concept of 'sticky' VMA flags, let's make VM_SOFTDIRTY one
> which solves this issue.
> 
> Additionally update VMA userland tests to propagate changes.
> 

Nit below on the comment, but looks good.

Reviewed-by: Liam R. Howlett <Liam.Howlett@oracle.com>

> Suggested-by: Vlastimil Babka <vbabka@suse.cz>
> Signed-off-by: Lorenzo Stoakes <lorenzo.stoakes@oracle.com>
> ---
>  include/linux/mm.h               | 23 +++++++++++------------
>  tools/testing/vma/vma_internal.h | 23 +++++++++++------------
>  2 files changed, 22 insertions(+), 24 deletions(-)
> 
> diff --git a/include/linux/mm.h b/include/linux/mm.h
> index 43eec43da66a..fd9eeff07eb5 100644
> --- a/include/linux/mm.h
> +++ b/include/linux/mm.h
> @@ -532,29 +532,28 @@ extern unsigned int kobjsize(const void *objp);
>   * possesses it but the other does not, the merged VMA should nonetheless have
>   * applied to it:
>   *
> + *   VM_SOFTDIRTY - if a VMA is marked soft-dirty, that is has not had its
> + *                  references cleared via /proc/$pid/clear_refs, any merged VMA
> + *                  should be considered soft-dirty also as it operates at a VMA
> + *                  granularity.
> + *
>   * VM_MAYBE_GUARD - If a VMA may have guard regions in place it implies that
>   *                  mapped page tables may contain metadata not described by the
>   *                  VMA and thus any merged VMA may also contain this metadata,
>   *                  and thus we must make this flag sticky.
>   */
> -#define VM_STICKY VM_MAYBE_GUARD
> +#define VM_STICKY (VM_SOFTDIRTY | VM_MAYBE_GUARD)
>  
>  /*
>   * VMA flags we ignore for the purposes of merge, i.e. one VMA possessing one
>   * of these flags and the other not does not preclude a merge.
>   *
> - * VM_SOFTDIRTY - Should not prevent from VMA merging, if we match the flags but
> - *                dirty bit -- the caller should mark merged VMA as dirty. If
> - *                dirty bit won't be excluded from comparison, we increase
> - *                pressure on the memory system forcing the kernel to generate
> - *                new VMAs when old one could be extended instead.
> - *
> - *    VM_STICKY - If one VMA has flags which most be 'sticky', that is ones
> - *                which should propagate to all VMAs, but the other does not,
> - *                the merge should still proceed with the merge logic applying
> - *                sticky flags to the final VMA.
> + * VM_STICKY - If one VMA has flags which most be 'sticky', that is ones
> + *             which should propagate to all VMAs, but the other does not,
> + *             the merge should still proceed with the merge logic applying
> + *             sticky flags to the final VMA.

Nit, could we also fix the wording of this comment when you are fixing
the spacing?

>   */
> -#define VM_IGNORE_MERGE (VM_SOFTDIRTY | VM_STICKY)
> +#define VM_IGNORE_MERGE VM_STICKY

I also like keeping this for self-documenting code.

Thanks,
Liam
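
As a rough illustration of the two macros discussed above, the sketch below
models the flag arithmetic only. The bit values are made up for the example
and do not match the kernel's, and flags_mergeable() and merged_flags() are
hypothetical helpers, not the functions used in mm/vma.c.

```c
#include <assert.h>
#include <stdbool.h>

typedef unsigned long vm_flags_t;

/* Illustrative bit values only; the kernel's real values differ. */
#define VM_READ		0x1UL
#define VM_WRITE	0x2UL
#define VM_SOFTDIRTY	0x100UL
#define VM_MAYBE_GUARD	0x200UL

#define VM_STICKY	(VM_SOFTDIRTY | VM_MAYBE_GUARD)
#define VM_IGNORE_MERGE	VM_STICKY

/* Hypothetical: two VMAs may merge if they differ only in ignored flags. */
static bool flags_mergeable(vm_flags_t a, vm_flags_t b)
{
	return ((a ^ b) & ~VM_IGNORE_MERGE) == 0;
}

/* Hypothetical: the merged VMA keeps the target's flags plus any sticky ones. */
static vm_flags_t merged_flags(vm_flags_t target, vm_flags_t other)
{
	return target | (other & VM_STICKY);
}
```

With VM_SOFTDIRTY in VM_STICKY, the result no longer depends on which of the
two VMAs happens to be the merge target: the flag survives either way.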




* Re: [PATCH 0/2] make VM_SOFTDIRTY a sticky VMA flag
  2025-11-17 11:32   ` Lorenzo Stoakes
@ 2025-11-17 18:26     ` Andrei Vagin
  2025-11-17 19:57       ` Lorenzo Stoakes
  0 siblings, 1 reply; 21+ messages in thread
From: Andrei Vagin @ 2025-11-17 18:26 UTC (permalink / raw)
  To: Lorenzo Stoakes
  Cc: Andrew Morton, David Hildenbrand, Liam R . Howlett,
	Vlastimil Babka, Mike Rapoport, Suren Baghdasaryan, Michal Hocko,
	Jann Horn, Pedro Falcato, linux-mm, linux-kernel, criu,
	Cyrill Gorcunov

On Mon, Nov 17, 2025 at 3:33 AM Lorenzo Stoakes
<lorenzo.stoakes@oracle.com> wrote:
>
> On Sun, Nov 16, 2025 at 04:53:36PM -0800, Andrei Vagin wrote:
> > On Fri, Nov 14, 2025 at 9:59 AM Lorenzo Stoakes
> > <lorenzo.stoakes@oracle.com> wrote:
> > >
> > > Currently we set VM_SOFTDIRTY when a new mapping is set up (whether by
> > > establishing a new VMA, or via merge) as implemented in __mmap_complete()
> > > and do_brk_flags().
> > >
> > > However, when performing a merge of existing mappings such as when
> > > performing mprotect(), we may lose the VM_SOFTDIRTY flag.
> >
> > Losing VM_SOFTDIRTY is definitely a bug, thank you for fixing it.
> >
> > A separate concern is whether merging two VMAs should be permitted when
> > one has the VM_SOFTDIRTY flag set and another does not. I think the
> > merging operation should be disallowed. The issue is that
>
>
> This patch doesn't change anything in terms of merging, it only _correctly_
> marks VMAs as soft-dirty where certain, very specific, circumstances might
> result in a merged VMA being incorrectly indicated to not be soft-dirty
> when it in fact contains pages which are.

As I mentioned in the previous message, this patch is correct, and I
appreciate your effort to solve this issue. My comment was about whether
we should allow merging VMAs if one has VM_SOFTDIRTY and the other does
not. You are right, this is a separate question unrelated to this patch.

If I recall correctly, merging vma-s with different VM_SOFTDIRTY bit values
was initially not allowed. It was a bit surprising that
this behavior was changed by Cyrill in 34228d473efe.  Cyrill was an
active CRIU contributor at the time, so we can't even blame anyone for
breaking CRIU :).

Thanks,
Andrei



* Re: [PATCH 0/2] make VM_SOFTDIRTY a sticky VMA flag
  2025-11-17 18:26     ` Andrei Vagin
@ 2025-11-17 19:57       ` Lorenzo Stoakes
  2025-11-19 13:09         ` Cyrill Gorcunov
  0 siblings, 1 reply; 21+ messages in thread
From: Lorenzo Stoakes @ 2025-11-17 19:57 UTC (permalink / raw)
  To: Andrei Vagin
  Cc: Andrew Morton, David Hildenbrand, Liam R . Howlett,
	Vlastimil Babka, Mike Rapoport, Suren Baghdasaryan, Michal Hocko,
	Jann Horn, Pedro Falcato, linux-mm, linux-kernel, criu,
	Cyrill Gorcunov

On Mon, Nov 17, 2025 at 10:26:34AM -0800, Andrei Vagin wrote:
> On Mon, Nov 17, 2025 at 3:33 AM Lorenzo Stoakes
> <lorenzo.stoakes@oracle.com> wrote:
> >
> > On Sun, Nov 16, 2025 at 04:53:36PM -0800, Andrei Vagin wrote:
> > > On Fri, Nov 14, 2025 at 9:59 AM Lorenzo Stoakes
> > > <lorenzo.stoakes@oracle.com> wrote:
> > > >
> > > > Currently we set VM_SOFTDIRTY when a new mapping is set up (whether by
> > > > establishing a new VMA, or via merge) as implemented in __mmap_complete()
> > > > and do_brk_flags().
> > > >
> > > > However, when performing a merge of existing mappings such as when
> > > > performing mprotect(), we may lose the VM_SOFTDIRTY flag.
> > >
> > > Losing VM_SOFTDIRTY is definitely a bug, thank you for fixing it.
> > >
> > > A separate concern is whether merging two VMAs should be permitted when
> > > one has the VM_SOFTDIRTY flag set and another does not. I think the
> > > merging operation should be disallowed. The issue is that
> >
> >
> > This patch doesn't change anything in terms of merging, it only _correctly_
> > marks VMAs as soft-dirty where certain, very specific, circumstances might
> > result in a merged VMA being incorrectly indicated to not be soft-dirty
> > when it in fact contains pages which are.
>
> As I mentioned in the previous message, this patch is correct, and I
> appreciate your effort to solve this issue. My comment was about whether
> we should allow merging VMAs if one has VM_SOFTDIRTY and the other does
> not. You are right, this is a separate question unrelated to this patch.

Thanks :)

>
> If I recall correctly, merging vma-s with different VM_SOFTDIRTY bit values
> was initially not allowed. It was a bit surprising that
> this behavior was changed by Cyrill in 34228d473efe.  Cyrill was an
> active CRIU contributor at the time, so we can't even blame anyone for
> breaking CRIU :).

Well I think Cyrill is in the right here :) the problem described there -
that of hitting the max_map_count simply due to failed VM_SOFTDIRTY merges
- is very serious and clearly highlights the issue that arises from not
merging these - that is VMA fragmentation.

>
> Thanks,
> Andrei

Cheers, Lorenzo
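
To make the fragmentation argument concrete, here is a toy model. It assumes
a run of adjacent, otherwise identical mappings whose only difference is
whether soft-dirty is set; count_vmas() is a hypothetical helper and the flag
value is illustrative, not the kernel's.

```c
#include <assert.h>
#include <stddef.h>

/* Illustrative bit value only. */
#define VM_SOFTDIRTY 0x100UL

/*
 * Hypothetical model: count how many VMAs remain when adjacent regions are
 * merged whenever their flags match outside of the 'ignore' mask.
 */
static size_t count_vmas(const unsigned long *flags, size_t n,
			 unsigned long ignore)
{
	size_t count = 0;

	for (size_t i = 0; i < n; i++) {
		/* A new VMA starts wherever the non-ignored flags change. */
		if (i == 0 || ((flags[i] ^ flags[i - 1]) & ~ignore))
			count++;
	}
	return count;
}
```

If soft-dirty alone blocks a merge, every alternation costs a VMA, which is
how a process can run into max_map_count. Ignoring the flag for merge
eligibility collapses the same run into a single VMA.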



* Re: [PATCH 0/2] make VM_SOFTDIRTY a sticky VMA flag
  2025-11-17 19:57       ` Lorenzo Stoakes
@ 2025-11-19 13:09         ` Cyrill Gorcunov
  2025-11-19 17:21           ` Lorenzo Stoakes
  0 siblings, 1 reply; 21+ messages in thread
From: Cyrill Gorcunov @ 2025-11-19 13:09 UTC (permalink / raw)
  To: Lorenzo Stoakes
  Cc: Andrei Vagin, Andrew Morton, David Hildenbrand, Liam R . Howlett,
	Vlastimil Babka, Mike Rapoport, Suren Baghdasaryan, Michal Hocko,
	Jann Horn, Pedro Falcato, linux-mm, linux-kernel, criu

On Mon, Nov 17, 2025 at 07:57:30PM +0000, Lorenzo Stoakes wrote:
...
> > If I recall correctly, merging vma-s with different VM_SOFTDIRTY bit values
> > was initially not allowed. It was a bit surprising that
> > this behavior was changed by Cyrill in 34228d473efe.  Cyrill was an
> > active CRIU contributor at the time, so we can't even blame anyone for
> > breaking CRIU :).
> 
> Well I think Cyrill is in the right here :) the problem described there -
> that of hitting the max_map_count simply due to failed VM_SOFTDIRTY merges
> - is very serious and clearly highlights the issue that arises from not
> merging these - that is VMA fragmentation.

Hi guys! I happened to miss this thread due to high message traffic, thanks
for CC'ing me ;) Yeah, the inability to merge VMAs due to the softdirty bit has
been a serious issue, so it is better for criu to dump more (redundant) memory
than for apps to break. As to this patch series, I think it is good, thanks
a lot Lorenzo!

Acked-by: Cyrill Gorcunov <gorcunov@gmail.com>



* Re: [PATCH 0/2] make VM_SOFTDIRTY a sticky VMA flag
  2025-11-19 13:09         ` Cyrill Gorcunov
@ 2025-11-19 17:21           ` Lorenzo Stoakes
  0 siblings, 0 replies; 21+ messages in thread
From: Lorenzo Stoakes @ 2025-11-19 17:21 UTC (permalink / raw)
  To: Cyrill Gorcunov
  Cc: Andrei Vagin, Andrew Morton, David Hildenbrand, Liam R . Howlett,
	Vlastimil Babka, Mike Rapoport, Suren Baghdasaryan, Michal Hocko,
	Jann Horn, Pedro Falcato, linux-mm, linux-kernel, criu

On Wed, Nov 19, 2025 at 04:09:49PM +0300, Cyrill Gorcunov wrote:
> On Mon, Nov 17, 2025 at 07:57:30PM +0000, Lorenzo Stoakes wrote:
> ...
> > > If I recall correctly, merging vma-s with different VM_SOFTDIRTY bit values
> > > was initially not allowed. It was a bit surprising that
> > > this behavior was changed by Cyrill in 34228d473efe.  Cyrill was an
> > > active CRIU contributor at the time, so we can't even blame anyone for
> > > breaking CRIU :).
> >
> > Well I think Cyrill is in the right here :) the problem described there -
> > that of hitting the max_map_count simply due to failed VM_SOFTDIRTY merges
> > - is very serious and clearly highlights the issue that arises from not
> > merging these - that is VMA fragmentation.
>
> Hi guys! I happened to miss this thread due to high message traffic, thanks
> for CC'ing me ;) Yeah, the inability to merge VMAs due to the softdirty bit has
> been a serious issue, so it is better for criu to dump more (redundant) memory
> than for apps to break. As to this patch series, I think it is good, thanks
> a lot Lorenzo!
>
> Acked-by: Cyrill Gorcunov <gorcunov@gmail.com>

Thanks :)



Thread overview: 21+ messages
2025-11-14 17:53 [PATCH 0/2] make VM_SOFTDIRTY a sticky VMA flag Lorenzo Stoakes
2025-11-14 17:53 ` [PATCH 1/2] mm: propagate VM_SOFTDIRTY on merge Lorenzo Stoakes
2025-11-17  4:39   ` Anshuman Khandual
2025-11-17 11:34     ` Lorenzo Stoakes
2025-11-17 14:25   ` David Hildenbrand (Red Hat)
2025-11-17 15:35     ` Lorenzo Stoakes
2025-11-17 15:47   ` Pedro Falcato
2025-11-17 15:53     ` Lorenzo Stoakes
2025-11-17 16:05   ` Liam R. Howlett
2025-11-14 17:53 ` [PATCH 2/2] testing/selftests/mm: add soft-dirty merge self-test Lorenzo Stoakes
2025-11-17 14:44   ` David Hildenbrand (Red Hat)
2025-11-17 15:15     ` Lorenzo Stoakes
2025-11-14 21:53 ` [PATCH 0/2] make VM_SOFTDIRTY a sticky VMA flag Andrew Morton
2025-11-17 11:41   ` Lorenzo Stoakes
2025-11-17  0:53 ` Andrei Vagin
2025-11-17  4:37   ` Anshuman Khandual
2025-11-17 11:32   ` Lorenzo Stoakes
2025-11-17 18:26     ` Andrei Vagin
2025-11-17 19:57       ` Lorenzo Stoakes
2025-11-19 13:09         ` Cyrill Gorcunov
2025-11-19 17:21           ` Lorenzo Stoakes
