linux-mm.kvack.org archive mirror
 help / color / mirror / Atom feed
From: "Liam R. Howlett" <Liam.Howlett@oracle.com>
To: Suren Baghdasaryan <surenb@google.com>
Cc: Peter Zijlstra <peterz@infradead.org>,
	akpm@linux-foundation.org, willy@infradead.org,
	lorenzo.stoakes@oracle.com, mhocko@suse.com, vbabka@suse.cz,
	hannes@cmpxchg.org, mjguzik@gmail.com, oliver.sang@intel.com,
	mgorman@techsingularity.net, david@redhat.com, peterx@redhat.com,
	oleg@redhat.com, dave@stgolabs.net, paulmck@kernel.org,
	brauner@kernel.org, dhowells@redhat.com, hdanton@sina.com,
	hughd@google.com, lokeshgidra@google.com, minchan@google.com,
	jannh@google.com, shakeel.butt@linux.dev, souravpanda@google.com,
	pasha.tatashin@soleen.com, klarasmodin@gmail.com, corbet@lwn.net,
	linux-doc@vger.kernel.org, linux-mm@kvack.org,
	linux-kernel@vger.kernel.org, kernel-team@android.com
Subject: Re: [PATCH v6 10/16] mm: replace vm_lock and detached flag with a reference count
Date: Wed, 18 Dec 2024 15:38:29 -0500	[thread overview]
Message-ID: <ulbspoec633hfm54f3jzvoqs6ilskxou3qykk2u727pbaltvfl@lb53vjcaxnuf> (raw)
In-Reply-To: <CAJuCfpFSD98fw=844AJPy+LT5y=zREQGtSEVj3_FCXiZ5cFR_A@mail.gmail.com>

* Suren Baghdasaryan <surenb@google.com> [241218 15:01]:
> On Wed, Dec 18, 2024 at 11:38 AM 'Liam R. Howlett' via kernel-team
> <kernel-team@android.com> wrote:
> >
> > * Suren Baghdasaryan <surenb@google.com> [241218 14:29]:
> > > On Wed, Dec 18, 2024 at 11:07 AM Suren Baghdasaryan <surenb@google.com> wrote:
> > > >
> > > > On Wed, Dec 18, 2024 at 11:00 AM 'Liam R. Howlett' via kernel-team
> > > > <kernel-team@android.com> wrote:
> > > > >
> > > > > * Suren Baghdasaryan <surenb@google.com> [241218 12:58]:
> > > > > > On Wed, Dec 18, 2024 at 9:44 AM Peter Zijlstra <peterz@infradead.org> wrote:
> > > > > > >
> > > > > > > On Wed, Dec 18, 2024 at 09:36:42AM -0800, Suren Baghdasaryan wrote:
> > > > > > >
> > > > > > > > > You will not. vms_complete_munmap_vmas() will call remove_vma() to
> > > > > > > > > remove PTEs IIRC, and if you do start_write() and detach() before
> > > > > > > > > dropping mmap_lock_write, you should be good.
> > > > > > > >
> > > > > > > > Ok, I think we will have to move mmap_write_downgrade() inside
> > > > > > > > vms_complete_munmap_vmas() to be called after remove_vma().
> > > > > > > > vms_clear_ptes() is using vmas, so we can't move remove_vma() before
> > > > > > > > mmap_write_downgrade().
> > > > > > >
> > > > > > > Why ?!
> > > > > > >
> > > > > > > vms_clear_ptes() and remove_vma() are fine where they are -- there is no
> > > > > > > concurrency left at this point.
> > > > > > >
> > > > > > > Note that by doing vma_start_write() inside vms_complete_munmap_vmas(),
> > > > > > > which is *after* the vmas have been unhooked from the mm, you wait for
> > > > > > > any concurrent user to go away.
> > > > > > >
> > > > > > > And since they're unhooked, there can't be any new users.
> > > > > > >
> > > > > > > So you're the one and only user left, and code is fine the way it is.
> > > > > >
> > > > > > Ok, let me make sure I understand this part of your proposal. From
> > > > > > your earlier email:
> > > > > >
> > > > > > @@ -1173,6 +1173,11 @@ static void vms_complete_munmap_vmas(struct
> > > > > > vma_munmap_struct *vms,
> > > > > >         struct vm_area_struct *vma;
> > > > > >         struct mm_struct *mm;
> > > > > >
> > > > > > +       mas_for_each(mas_detach, vma, ULONG_MAX) {
> > > > > > +               vma_start_write(next);
> > > > > > +               vma_mark_detached(next, true);
> > > > > > +       }
> > > > > > +
> > > > > >         mm = current->mm;
> > > > > >         mm->map_count -= vms->vma_count;
> > > > > >         mm->locked_vm -= vms->locked_vm;
> > > > > >
> > > > > > This would mean:
> > > > > >
> > > > > > vms_complete_munmap_vmas
> > > > > >            vma_start_write
> > > > > >            vma_mark_detached
> > > > > >            mmap_write_downgrade
> > > > > >            vms_clear_ptes
> > > > > >            remove_vma
> > > > > >
> > > > > > And remove_vma will be just freeing the vmas. Is that correct?
> > > > > > I'm a bit confused because the original thinking was that
> > > > > > vma_mark_detached() would drop the last refcnt and if it's 0 we would
> > > > > > free the vma right there. If that's still what we want to do then I
> > > > > > think the above sequence should look like this:
> > > > > >
> > > > > > vms_complete_munmap_vmas
> > > > > >            vms_clear_ptes
> > > > > >            remove_vma
> > > > > >                vma_start_write
> > > > > >                vma_mark_detached
> > > > > >            mmap_write_downgrade
> > > > > >
> > > > > > because vma_start_write+vma_mark_detached should be done under  mmap_write_lock.
> > > > > > Please let me know which way you want to move forward.
> > > > > >
> > > > >
> > > > > Are we sure we're not causing issues with the MAP_FIXED path here?
> > > > >
> > > > > With the above change, we'd be freeing the PTEs before marking the vmas
> > > > > as detached or vma_start_write().
> > > >
> > > > IIUC when we call vms_complete_munmap_vmas() all vmas inside
> > > > mas_detach have been already write-locked, no?
> >
> > That's the way it is today - but I thought you were moving the lock to
> > the complete stage, not adding a new one? (why add a new one otherwise?)
> 
> Is my understanding correct that mas_detach is populated by
> vms_gather_munmap_vmas() only with vmas that went through
> __split_vma() (and were write-locked there)? I don't see any path that
> would add any other vma into mas_detach but maybe I'm missing
> something?

No, that is not correct.

vms_gather_munmap_vmas() calls split on the first vma, then adds all
vmas that are within the range of the munmap() call.  Potentially
splitting the last vma and adding that in the
"if (next->vm_end > vms->end)" block.

Sometimes this is a single vma that gets split twice, sometimes no
splits happen and entire vmas are unmapped, sometimes it's just one vma
that isn't split.

My observation is the common case is a single vma, but besides that we
see 3, and sometimes 7 at a time, but it could be any number of vmas and
not all of them are split.

There is a loop for_each_vma_range() that does:

vma_start_write(next);
mas_set(mas_detach, vms->mas_count++);
mas_store_gfp(mas_detach, next, GFP_KERNEL);


> 
> >
> > >
> > > Yeah, I think we can simply do this:
> > >
> > > vms_complete_munmap_vmas
> > >            vms_clear_ptes
> > >            remove_vma
> > >                vma_mark_detached
> > >            mmap_write_downgrade
> > >
> > > If my assumption is incorrect, assertion inside vma_mark_detached()
> > > should trigger. I tried a quick test and so far nothing exploded.
> > >
> >
> > If they are write locked, then the page faults are not a concern.  There
> > is also the rmap race that Jann found in mmap_region() [1].  This is
> > probably also fine since you are keeping the write lock in place earlier
> > on in the gather stage.  Note the ptes will already be cleared by the
> > time vms_complete_munmap_vmas() is called in this case.
> >
> > [1] https://lore.kernel.org/all/CAG48ez0ZpGzxi=-5O_uGQ0xKXOmbjeQ0LjZsRJ1Qtf2X5eOr1w@mail.gmail.com/
> >
> > Thanks,
> > Liam
> >
> > To unsubscribe from this group and stop receiving emails from it, send an email to kernel-team+unsubscribe@android.com.
> >


  reply	other threads:[~2024-12-18 20:38 UTC|newest]

Thread overview: 74+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2024-12-16 19:24 [PATCH v6 00/16] move per-vma lock into vm_area_struct Suren Baghdasaryan
2024-12-16 19:24 ` [PATCH v6 01/16] mm: introduce vma_start_read_locked{_nested} helpers Suren Baghdasaryan
2024-12-16 19:24 ` [PATCH v6 02/16] mm: move per-vma lock into vm_area_struct Suren Baghdasaryan
2024-12-16 19:24 ` [PATCH v6 03/16] mm: mark vma as detached until it's added into vma tree Suren Baghdasaryan
2024-12-16 19:24 ` [PATCH v6 04/16] mm/nommu: fix the last places where vma is not locked before being attached Suren Baghdasaryan
2024-12-16 19:24 ` [PATCH v6 05/16] types: move struct rcuwait into types.h Suren Baghdasaryan
2024-12-16 19:24 ` [PATCH v6 06/16] mm: allow vma_start_read_locked/vma_start_read_locked_nested to fail Suren Baghdasaryan
2024-12-17 11:31   ` Lokesh Gidra
2024-12-17 15:51     ` Suren Baghdasaryan
2024-12-16 19:24 ` [PATCH v6 07/16] mm: move mmap_init_lock() out of the header file Suren Baghdasaryan
2024-12-16 19:24 ` [PATCH v6 08/16] mm: uninline the main body of vma_start_write() Suren Baghdasaryan
2024-12-16 19:24 ` [PATCH v6 09/16] refcount: introduce __refcount_{add|inc}_not_zero_limited Suren Baghdasaryan
2024-12-16 19:24 ` [PATCH v6 10/16] mm: replace vm_lock and detached flag with a reference count Suren Baghdasaryan
2024-12-16 20:42   ` Peter Zijlstra
2024-12-16 20:53     ` Suren Baghdasaryan
2024-12-16 21:15   ` Peter Zijlstra
2024-12-16 21:53     ` Suren Baghdasaryan
2024-12-16 22:00       ` Peter Zijlstra
2024-12-16 21:37   ` Peter Zijlstra
2024-12-16 21:44     ` Suren Baghdasaryan
2024-12-17 10:30       ` Peter Zijlstra
2024-12-17 16:27         ` Suren Baghdasaryan
2024-12-18  9:41           ` Peter Zijlstra
2024-12-18 10:06             ` Peter Zijlstra
2024-12-18 15:37               ` Liam R. Howlett
2024-12-18 15:50                 ` Suren Baghdasaryan
2024-12-18 16:18                   ` Peter Zijlstra
2024-12-18 17:36                     ` Suren Baghdasaryan
2024-12-18 17:44                       ` Peter Zijlstra
2024-12-18 17:58                         ` Suren Baghdasaryan
2024-12-18 19:00                           ` Liam R. Howlett
2024-12-18 19:07                             ` Suren Baghdasaryan
2024-12-18 19:29                               ` Suren Baghdasaryan
2024-12-18 19:38                                 ` Liam R. Howlett
2024-12-18 20:00                                   ` Suren Baghdasaryan
2024-12-18 20:38                                     ` Liam R. Howlett [this message]
2024-12-18 21:53                                       ` Suren Baghdasaryan
2024-12-18 21:55                                         ` Suren Baghdasaryan
2024-12-19  0:35                                         ` Andrew Morton
2024-12-19  0:47                                           ` Suren Baghdasaryan
2024-12-19  9:13                                         ` Peter Zijlstra
2024-12-19 11:20                                           ` Peter Zijlstra
2024-12-19 16:17                                             ` Suren Baghdasaryan
2024-12-19 17:16                                             ` Liam R. Howlett
2024-12-19 17:42                                               ` Peter Zijlstra
2024-12-19 18:18                                                 ` Liam R. Howlett
2024-12-19 18:46                                                   ` Peter Zijlstra
2024-12-19 18:55                                                     ` Liam R. Howlett
2024-12-20 15:22                                                       ` Suren Baghdasaryan
2024-12-23  3:03                                                       ` Suren Baghdasaryan
2024-12-26 17:12                                                         ` Suren Baghdasaryan
2024-12-19 16:14                                           ` Suren Baghdasaryan
2024-12-19 17:23                                             ` Peter Zijlstra
2024-12-19  8:55                                 ` Peter Zijlstra
2024-12-19 16:08                                   ` Suren Baghdasaryan
2024-12-19  8:53                           ` Peter Zijlstra
2024-12-19 16:08                             ` Suren Baghdasaryan
2024-12-18 15:57                 ` Suren Baghdasaryan
2024-12-18 16:13                 ` Peter Zijlstra
2024-12-18 15:42             ` Suren Baghdasaryan
2024-12-16 19:24 ` [PATCH v6 11/16] mm: enforce vma to be in detached state before freeing Suren Baghdasaryan
2024-12-16 21:16   ` Peter Zijlstra
2024-12-16 21:18     ` Peter Zijlstra
2024-12-16 21:57       ` Suren Baghdasaryan
2024-12-16 19:24 ` [PATCH v6 12/16] mm: remove extra vma_numab_state_init() call Suren Baghdasaryan
2024-12-16 19:24 ` [PATCH v6 13/16] mm: introduce vma_ensure_detached() Suren Baghdasaryan
2024-12-17 10:26   ` Peter Zijlstra
2024-12-17 15:58     ` Suren Baghdasaryan
2024-12-16 19:24 ` [PATCH v6 14/16] mm: prepare lock_vma_under_rcu() for vma reuse possibility Suren Baghdasaryan
2024-12-16 19:24 ` [PATCH v6 15/16] mm: make vma cache SLAB_TYPESAFE_BY_RCU Suren Baghdasaryan
2024-12-16 19:24 ` [PATCH v6 16/16] docs/mm: document latest changes to vm_lock Suren Baghdasaryan
2024-12-16 19:39 ` [PATCH v6 00/16] move per-vma lock into vm_area_struct Suren Baghdasaryan
2024-12-17 18:42   ` Andrew Morton
2024-12-17 18:49     ` Suren Baghdasaryan

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=ulbspoec633hfm54f3jzvoqs6ilskxou3qykk2u727pbaltvfl@lb53vjcaxnuf \
    --to=liam.howlett@oracle.com \
    --cc=akpm@linux-foundation.org \
    --cc=brauner@kernel.org \
    --cc=corbet@lwn.net \
    --cc=dave@stgolabs.net \
    --cc=david@redhat.com \
    --cc=dhowells@redhat.com \
    --cc=hannes@cmpxchg.org \
    --cc=hdanton@sina.com \
    --cc=hughd@google.com \
    --cc=jannh@google.com \
    --cc=kernel-team@android.com \
    --cc=klarasmodin@gmail.com \
    --cc=linux-doc@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-mm@kvack.org \
    --cc=lokeshgidra@google.com \
    --cc=lorenzo.stoakes@oracle.com \
    --cc=mgorman@techsingularity.net \
    --cc=mhocko@suse.com \
    --cc=minchan@google.com \
    --cc=mjguzik@gmail.com \
    --cc=oleg@redhat.com \
    --cc=oliver.sang@intel.com \
    --cc=pasha.tatashin@soleen.com \
    --cc=paulmck@kernel.org \
    --cc=peterx@redhat.com \
    --cc=peterz@infradead.org \
    --cc=shakeel.butt@linux.dev \
    --cc=souravpanda@google.com \
    --cc=surenb@google.com \
    --cc=vbabka@suse.cz \
    --cc=willy@infradead.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox