linux-mm.kvack.org archive mirror
 help / color / mirror / Atom feed
From: David Hildenbrand <david@redhat.com>
To: Peter Xu <peterx@redhat.com>,
	linux-kernel@vger.kernel.org, linux-mm@kvack.org
Cc: Andrea Arcangeli <aarcange@redhat.com>,
	Nadav Amit <nadav.amit@gmail.com>,
	Andrew Morton <akpm@linux-foundation.org>
Subject: Re: [PATCH v2] mm/mprotect: Fix soft-dirty check in can_change_pte_writable()
Date: Thu, 21 Jul 2022 09:28:47 +0200	[thread overview]
Message-ID: <f0d193c2-9b23-6f3a-9208-2a615febe0c2@redhat.com> (raw)
In-Reply-To: <20220720220324.88538-1-peterx@redhat.com>

On 21.07.22 00:03, Peter Xu wrote:
> The check wanted to make sure when soft-dirty tracking is enabled we won't
> grant write bit by accident, as a page fault is needed for dirty tracking.
> The intention is correct but we didn't check it right because VM_SOFTDIRTY
> set actually means soft-dirty tracking disabled.  Fix it.

Thanks for digging into this and writing the reproducer. The softdirty
logic was rather confusing for me.

> 
> It wasn't a bug for a long time because we used to only optimize the write
> bit settings in change_pte_range() for page caches, and since we've got a
> higher level check in vma_wants_writenotify(), we will never set the bit
> MM_CP_TRY_CHANGE_WRITABLE for soft-dirty enabled page caches, hence even if
> we checked with the wrong value of VM_SOFTDIRTY in change_pte_range() it'll
> just be an no-op.  Functionally it was still correct, even if cpu cycles
> wasted.

I don't quite follow that explanation and most probably I am missing
something.

Modifying your test to map page from a file MAP_SHARED gives me under
5.18.11-100.fc35.x86_64:


53,54d52
<       FILE *file = fopen("tmpfile", "w+");
<       int file_fd;
56d53
<       assert(file);
59,61d55
<
<       file_fd = fileno(file);
<       ftruncate(file_fd, psize);
63c57
<                   MAP_SHARED, file_fd, 0);
---
>                   MAP_ANONYMOUS|MAP_PRIVATE, -1, 0);


t480s: ~  $ sudo ./tmp
ERROR: Wrote page again, soft-dirty=0 (expect: 1



IMHO, while the check in vma_wants_writenotify() is correct and makes
sure that the pages are kept R/O by the generic machinery. We get
vma_wants_writenotify(), so we activate MM_CP_TRY_CHANGE_WRITABLE. The
wrong logic in can_change_pte_writable(), however, maps the page
writable again without caring about softdirty.

At least that would be my explanation for the failure. But maybe I
messes up something else :)



> 
> However after the recent work of anonymous page optimization on exclusive
> pages we'll start to make it wrong because anonymous page does not require
> the check in vma_wants_writenotify() hence it'll suffer from the wrong
> check here in can_change_pte_writable().
> 
> We can easily verify this with any exclusive anonymous page, like program
> below:
> 
> =======8<======
>   #include <stdio.h>
>   #include <unistd.h>
>   #include <stdlib.h>
>   #include <assert.h>
>   #include <inttypes.h>
>   #include <stdint.h>
>   #include <sys/types.h>
>   #include <sys/mman.h>
>   #include <sys/types.h>
>   #include <sys/stat.h>
>   #include <unistd.h>
>   #include <fcntl.h>
>   #include <stdbool.h>
> 
>   #define BIT_ULL(nr)                   (1ULL << (nr))
>   #define PM_SOFT_DIRTY                 BIT_ULL(55)
> 
>   unsigned int psize;
>   char *page;
> 
>   uint64_t pagemap_read_vaddr(int fd, void *vaddr)
>   {
>       uint64_t value;
>       int ret;
> 
>       ret = pread(fd, &value, sizeof(uint64_t),
>                   ((uint64_t)vaddr >> 12) * sizeof(uint64_t));
>       assert(ret == sizeof(uint64_t));
> 
>       return value;
>   }
> 
>   void clear_refs_write(void)
>   {
>       int fd = open("/proc/self/clear_refs", O_RDWR);
> 
>       assert(fd >= 0);
>       write(fd, "4", 2);
>       close(fd);
>   }
> 
>   #define  check_soft_dirty(str, expect)  do {                            \
>           bool dirty = pagemap_read_vaddr(fd, page) & PM_SOFT_DIRTY;      \
>           if (dirty != expect) {                                          \
>               printf("ERROR: %s, soft-dirty=%d (expect: %d)\n", str, dirty, expect); \
>               exit(-1);                                                   \
>           }                                                               \
>   } while (0)
> 
>   int main(void)
>   {
>       int fd = open("/proc/self/pagemap", O_RDONLY);
> 
>       assert(fd >= 0);
>       psize = getpagesize();
>       page = mmap(NULL, psize, PROT_READ|PROT_WRITE,
>                   MAP_ANONYMOUS|MAP_PRIVATE, -1, 0);
>       assert(page != MAP_FAILED);
> 
>       *page = 1;
>       check_soft_dirty("Just faulted in page", 1);
>       clear_refs_write();
>       check_soft_dirty("Clear_refs written", 0);
>       mprotect(page, psize, PROT_READ);
>       check_soft_dirty("Marked RO", 0);
>       mprotect(page, psize, PROT_READ|PROT_WRITE);
>       check_soft_dirty("Marked RW", 0);
>       *page = 2;
>       check_soft_dirty("Wrote page again", 1);
> 
>       munmap(page, psize);
>       close(fd);
>       printf("Test passed.\n");
> 
>       return 0;
>   }

Can we turn that into a vm selftest in
tools/testing/selftests/vm/soft-dirty.c, and also extend it by
MAP_SHARED froma  file as above?

> =======8<======
> 
> So even if commit 64fe24a3e05e kept the old behavior and didn't attempt to
> change the behavior here, the bug will only be able to be triggered after
> commit 64fe24a3e05e because only anonymous page will suffer from it.
> 
> Fixes: 64fe24a3e05e ("mm/mprotect: try avoiding write faults for exclusive anonymous pages when changing protection")
> Signed-off-by: Peter Xu <peterx@redhat.com>
> ---
>  mm/mprotect.c | 7 +++++--
>  1 file changed, 5 insertions(+), 2 deletions(-)
> 
> diff --git a/mm/mprotect.c b/mm/mprotect.c
> index 0420c3ed936c..804807ab14e6 100644
> --- a/mm/mprotect.c
> +++ b/mm/mprotect.c
> @@ -48,8 +48,11 @@ static inline bool can_change_pte_writable(struct vm_area_struct *vma,
>  	if (pte_protnone(pte) || !pte_dirty(pte))
>  		return false;
>  
> -	/* Do we need write faults for softdirty tracking? */
> -	if ((vma->vm_flags & VM_SOFTDIRTY) && !pte_soft_dirty(pte))
> +	/*
> +	 * Do we need write faults for softdirty tracking?  Note,
> +	 * soft-dirty is enabled when !VM_SOFTDIRTY.
> +	 */
> +	if (!(vma->vm_flags & VM_SOFTDIRTY) && !pte_soft_dirty(pte))
>  		return false;

I wonder if we now want, just as in vma_wants_writenotify(), an early
check for IS_ENABLED(CONFIG_MEM_SOFT_DIRTY), to optimize this out
completely.

>  
>  	/* Do we need write faults for uffd-wp tracking? */


-- 
Thanks,

David / dhildenb



  reply	other threads:[~2022-07-21 11:07 UTC|newest]

Thread overview: 3+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2022-07-20 22:03 Peter Xu
2022-07-21  7:28 ` David Hildenbrand [this message]
2022-07-21 16:57   ` Peter Xu

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=f0d193c2-9b23-6f3a-9208-2a615febe0c2@redhat.com \
    --to=david@redhat.com \
    --cc=aarcange@redhat.com \
    --cc=akpm@linux-foundation.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-mm@kvack.org \
    --cc=nadav.amit@gmail.com \
    --cc=peterx@redhat.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox