linux-mm.kvack.org archive mirror
 help / color / mirror / Atom feed
From: "Yajun Deng" <yajun.deng@linux.dev>
To: "Liam R. Howlett" <Liam.Howlett@oracle.com>
Cc: "David Hildenbrand" <david@redhat.com>,
	akpm@linux-foundation.org, linux-mm@kvack.org,
	linux-kernel@vger.kernel.org
Subject: Re: [PATCH] mm/rmap: remove unnecessary page_table_lock
Date: Wed, 24 Apr 2024 03:04:51 +0000	[thread overview]
Message-ID: <e9c442bef63af1fab9f7630b41803aebf3c2b079@linux.dev> (raw)
In-Reply-To: <3v6xmyzmnw6go45riwlu7qv4c4phiexpqxldnlbgwsjhppe4oi@xdcqm4xupl4k>

April 24, 2024 at 1:11 AM, "Liam R. Howlett" <Liam.Howlett@oracle.com> wrote:



> 
> * Yajun Deng <yajun.deng@linux.dev> [240423 04:35]:
> 
> > 
> > April 23, 2024 at 4:18 PM, "David Hildenbrand" <david@redhat.com> wrote:
> > 
> >  
> > 
> >  
> > 
> >  
> > 
> >  
> > 
> >  On 23.04.24 09:53, Yajun Deng wrote:
> > 
> >  
> > 
> >  > 
> > 
> >  > April 22, 2024 at 7:24 PM, "David Hildenbrand" <david@redhat.com> wrote:
> > 
> >  > 
> > 
> >  > > >>
> > 
> >  > 
> > 
> >  > > 
> > 
> >  > > On 22.04.24 12:52, Yajun Deng wrote:
> > 
> >  > > 
> > 
> >  > 
> > 
> >  > page_table_lock is a lock that for page table, we won't change page
> > 
> >  > 
> > 
> >  > table in __anon_vma_prepare(). As we can see, it works well in
> > 
> >  > 
> > 
> >  > anon_vma_clone(). They do the same operation.
> > 
> >  > 
> > 
> >  > > 
> > 
> >  > > We are reusing mm->page_table_lock to serialize, not the *actual* low-level page table locks that really protect PTEs.
> > 
> >  > > 
> > 
> >  > > With that locking gone, there would be nothing protection vma->anon_vma.
> > 
> >  > > 
> > 
> >  > > Note that anon_vma_clone() is likely called with the mmap_lock held in write mode, which is not the case for __anon_vma_prepare() ...
> > 
> >  > > 
> > 
> >  > 
> > 
> >  > Yes, anon_vma_clone() is called with the mmap_lock held. I added mmap_assert_write_locked(dst->vm_mm) to prove it.
> > 
> >  > 
> > 
> >  > I added mmap_assert_write_locked(vma->vm_mm) in __anon_vma_prepare() at the same time, it shows __anon_vma_prepare()
> > 
> >  > 
> > 
> >  > is also called with the mmap_lock held too.
> > 
> >  > 
> > 
> >  
> > 
> >  Make sure you actually have lockdep built in and enabled.
> > 
> >  
> > 
> >  
> > 
> >  This is my config.
> > 
> >  CONFIG_LOCKDEP=n
> > 
> >  CONFIG_DEBUG_VM=y
> > 
> >  
> > 
> >  I did another test.
> > 
> >  I put mmap_assert_write_locked(mm) before 'set_bit(MMF_OOM_SKIP, &mm->flags)' in mmap.c, it's outside the lock.
> > 
> >  It will crash when on boot. I think mmap_assert_write_locked() works.
> > 
> 
> If you are changing locks, then please test with lockdep on.
> 

It's my fault. It shows a warning with lockdep on.

> > 
> > __anon_vma_prepare() is for example called from do_anonymous_page() where we might only hold the mmap_lock in read mode (or not at all IIRC with VMA in read mode).
> > 
> 
> Consider two concurrent readers getting to this function with the same
> 
> vma. There is no mergeable anon vma, so both create a new anon_vma.
> 
> You take the anon_vma lock to parallelize the linking to the vma - but
> 
> they are different locks because they are both new anon_vma structs
> 
> allocated by concurrent readers.
> 
> You now need a lock that you know cannot allow this to happen. Looking
> 
> at the top of mm/rmap.c and see which one works. The next in line is
> 
> the page table lock, so that one is used here.
> 
> What if we reverse the locks? We can deadlock.
> 
> What if we don't take the anon_vma lock? We can have two writers to the
> 
> anon_vma.
> 
> Thanks,
> 
> Liam
>


      reply	other threads:[~2024-04-24  3:04 UTC|newest]

Thread overview: 7+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2024-04-22 10:52 Yajun Deng
2024-04-22 11:24 ` Qi Zheng
     [not found] ` <b848c431-deca-42e4-925c-673b3fa1f251@redhat.com>
2024-04-23  7:53   ` Yajun Deng
2024-04-23  8:18     ` David Hildenbrand
2024-04-23  8:35       ` Yajun Deng
2024-04-23 17:11         ` Liam R. Howlett
2024-04-24  3:04           ` Yajun Deng [this message]

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=e9c442bef63af1fab9f7630b41803aebf3c2b079@linux.dev \
    --to=yajun.deng@linux.dev \
    --cc=Liam.Howlett@oracle.com \
    --cc=akpm@linux-foundation.org \
    --cc=david@redhat.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-mm@kvack.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox