linux-mm.kvack.org archive mirror
 help / color / mirror / Atom feed
From: Suren Baghdasaryan <surenb@google.com>
To: David Hildenbrand <david@redhat.com>
Cc: akpm@linux-foundation.org, jirislaby@kernel.org,
	jacobly.alt@gmail.com,  holger@applied-asynchrony.com,
	michel@lespinasse.org, jglisse@google.com,  mhocko@suse.com,
	vbabka@suse.cz, hannes@cmpxchg.org,  mgorman@techsingularity.net,
	dave@stgolabs.net, willy@infradead.org,  liam.howlett@oracle.com,
	peterz@infradead.org, ldufour@linux.ibm.com,  paulmck@kernel.org,
	mingo@redhat.com, will@kernel.org, luto@kernel.org,
	 songliubraving@fb.com, peterx@redhat.com, dhowells@redhat.com,
	 hughd@google.com, bigeasy@linutronix.de,
	kent.overstreet@linux.dev,  punit.agrawal@bytedance.com,
	lstoakes@gmail.com, peterjung1337@gmail.com,
	 rientjes@google.com, chriscli@google.com,
	axelrasmussen@google.com,  joelaf@google.com, minchan@google.com,
	rppt@kernel.org, jannh@google.com,  shakeelb@google.com,
	tatashin@google.com, edumazet@google.com,  gthelen@google.com,
	linux-mm@kvack.org
Subject: Re: [PATCH 1/1] mm: disable CONFIG_PER_VMA_LOCK by default until its fixed
Date: Tue, 4 Jul 2023 05:39:31 +0000	[thread overview]
Message-ID: <CAJuCfpHoCqLTWikajYS_tPCyjTmLfuS1nt3ed8qi7p53XHoWgQ@mail.gmail.com> (raw)
In-Reply-To: <7e3f35cc-59b9-bf12-b8b1-4ed78223844a@redhat.com>

On Mon, Jul 3, 2023 at 8:30 PM David Hildenbrand <david@redhat.com> wrote:
>
> On 03.07.23 20:21, Suren Baghdasaryan wrote:
> > A memory corruption was reported in [1] with bisection pointing to the
> > patch [2] enabling per-VMA locks for x86.
> > Disable per-VMA locks config to prevent this issue while the problem is
> > being investigated. This is expected to be a temporary measure.
> >
> > [1] https://bugzilla.kernel.org/show_bug.cgi?id=217624
> > [2] https://lore.kernel.org/all/20230227173632.3292573-30-surenb@google.com
> >
> > Reported-by: Jiri Slaby <jirislaby@kernel.org>
> > Reported-by: Jacob Young <jacobly.alt@gmail.com>
> > Fixes: 0bff0aaea03e ("x86/mm: try VMA lock-based page fault handling first")
> > Signed-off-by: Suren Baghdasaryan <surenb@google.com>
> > ---
> >   mm/Kconfig | 2 +-
> >   1 file changed, 1 insertion(+), 1 deletion(-)
> >
> > diff --git a/mm/Kconfig b/mm/Kconfig
> > index 09130434e30d..de94b2497600 100644
> > --- a/mm/Kconfig
> > +++ b/mm/Kconfig
> > @@ -1224,7 +1224,7 @@ config ARCH_SUPPORTS_PER_VMA_LOCK
> >          def_bool n
> >
> >   config PER_VMA_LOCK
> > -     def_bool y
> > +     bool "Enable per-vma locking during page fault handling."
> >       depends on ARCH_SUPPORTS_PER_VMA_LOCK && MMU && SMP
> >       help
> >         Allow per-vma locking during page fault handling.
>
> As raised at LSF/MM, I was "surprised" that we can now handle page faults
> concurrent to fork() and was expecting something to be broken already.
>
> What probably happens is that we wr-protected the page in the parent process and
> COW-shared an anon page with the child using copy_present_pte().
>
> But we only flush the parent MM tlb before we drop the parent MM lock in
> dup_mmap().
>
>
> If we get a write-fault before that TLB flush in the parent, and we end up
> replacing that anon page in the parent process in do_wp_page() [because, COW-shared with the child],
> this might be problematic: some stale writable TLB entries can target the wrong (old) page.

Hi David,
Thanks for the detailed explanation. Let me check if this is indeed
what's happening here. If that's indeed the cause, I think we can
write-lock the VMAs being dup'ed until the TLB is flushed and
mmap_write_unlock(oldmm) unlocks them all and lets page faults to
proceed. If that works we at least will know the reason for the memory
corruption.
Thanks,
Suren.

>
>
> We had similar issues in the past with userfaultfd, see the comment at the beginning of do_wp_page():
>
>
>         if (likely(!unshare)) {
>                 if (userfaultfd_pte_wp(vma, *vmf->pte)) {
>                         pte_unmap_unlock(vmf->pte, vmf->ptl);
>                         return handle_userfault(vmf, VM_UFFD_WP);
>                 }
>
>                 /*
>                  * Userfaultfd write-protect can defer flushes. Ensure the TLB
>                  * is flushed in this case before copying.
>                  */
>                 if (unlikely(userfaultfd_wp(vmf->vma) &&
>                              mm_tlb_flush_pending(vmf->vma->vm_mm)))
>                         flush_tlb_page(vmf->vma, vmf->address);
>         }
>
>
> We really should not allow page faults concurrent to fork() without further investigation.
>
> --
> Cheers,
>
> David / dhildenb
>


  reply	other threads:[~2023-07-04  5:39 UTC|newest]

Thread overview: 25+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2023-07-03 18:21 Suren Baghdasaryan
2023-07-03 20:07 ` David Rientjes
2023-07-03 20:30 ` David Hildenbrand
2023-07-04  5:39   ` Suren Baghdasaryan [this message]
2023-07-04  6:50     ` Suren Baghdasaryan
2023-07-04  7:18       ` David Hildenbrand
2023-07-04  7:34         ` Suren Baghdasaryan
2023-07-04  8:03           ` David Hildenbrand
2023-07-04 18:01           ` David Hildenbrand
2023-07-04 13:07         ` Matthew Wilcox
2023-07-04 17:21           ` Suren Baghdasaryan
2023-07-04 17:36             ` David Hildenbrand
2023-07-04 17:56               ` Suren Baghdasaryan
2023-07-04 18:05                 ` David Hildenbrand
2023-07-04 19:11                   ` Suren Baghdasaryan
2023-07-04 20:10                     ` Suren Baghdasaryan
     [not found]                       ` <7d6ba07b-ee60-8920-b91c-04c826eb4690@applied-asynchrony.com>
2023-07-04 22:03                         ` Suren Baghdasaryan
2023-07-04 22:42                         ` Matthew Wilcox
     [not found]                           ` <a7149847-4b53-8ff0-d570-042631a1ce20@applied-asynchrony.com>
2023-07-05  6:46                             ` Suren Baghdasaryan
2023-07-04 17:55             ` Matthew Wilcox
2023-07-04 17:58               ` Suren Baghdasaryan
2023-07-04  8:12 ` Linux regression tracking (Thorsten Leemhuis)
2023-07-04  8:30   ` Hans de Goede
2023-07-04  8:18 ` Hans de Goede
2023-07-04 15:24   ` Suren Baghdasaryan

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=CAJuCfpHoCqLTWikajYS_tPCyjTmLfuS1nt3ed8qi7p53XHoWgQ@mail.gmail.com \
    --to=surenb@google.com \
    --cc=akpm@linux-foundation.org \
    --cc=axelrasmussen@google.com \
    --cc=bigeasy@linutronix.de \
    --cc=chriscli@google.com \
    --cc=dave@stgolabs.net \
    --cc=david@redhat.com \
    --cc=dhowells@redhat.com \
    --cc=edumazet@google.com \
    --cc=gthelen@google.com \
    --cc=hannes@cmpxchg.org \
    --cc=holger@applied-asynchrony.com \
    --cc=hughd@google.com \
    --cc=jacobly.alt@gmail.com \
    --cc=jannh@google.com \
    --cc=jglisse@google.com \
    --cc=jirislaby@kernel.org \
    --cc=joelaf@google.com \
    --cc=kent.overstreet@linux.dev \
    --cc=ldufour@linux.ibm.com \
    --cc=liam.howlett@oracle.com \
    --cc=linux-mm@kvack.org \
    --cc=lstoakes@gmail.com \
    --cc=luto@kernel.org \
    --cc=mgorman@techsingularity.net \
    --cc=mhocko@suse.com \
    --cc=michel@lespinasse.org \
    --cc=minchan@google.com \
    --cc=mingo@redhat.com \
    --cc=paulmck@kernel.org \
    --cc=peterjung1337@gmail.com \
    --cc=peterx@redhat.com \
    --cc=peterz@infradead.org \
    --cc=punit.agrawal@bytedance.com \
    --cc=rientjes@google.com \
    --cc=rppt@kernel.org \
    --cc=shakeelb@google.com \
    --cc=songliubraving@fb.com \
    --cc=tatashin@google.com \
    --cc=vbabka@suse.cz \
    --cc=will@kernel.org \
    --cc=willy@infradead.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox