From: Suren Baghdasaryan <surenb@google.com>
To: Matthew Wilcox <willy@infradead.org>
Cc: Lorenzo Stoakes <lorenzo.stoakes@oracle.com>,
Andrew Morton <akpm@linux-foundation.org>,
linux-mm@kvack.org, Chris Li <chriscli@google.com>,
"Liam R. Howlett" <Liam.Howlett@oracle.com>,
Vlastimil Babka <vbabka@suse.cz>,
Shakeel Butt <shakeel.butt@linux.dev>,
Jann Horn <jannh@google.com>, Pedro Falcato <pfalcato@suse.de>
Subject: Re: [PATCH v2 1/2] mm: Add vma_start_write_killable()
Date: Mon, 17 Nov 2025 11:50:06 -0800 [thread overview]
Message-ID: <CAJuCfpG32mQT45KqEkO=5TNecES9M6AccPLw-Ju=N2cbto++=Q@mail.gmail.com> (raw)
In-Reply-To: <aRXt9JXI3rW1kP3k@casper.infradead.org>
On Thu, Nov 13, 2025 at 6:40 AM Matthew Wilcox <willy@infradead.org> wrote:
>
> On Thu, Nov 13, 2025 at 01:20:14PM +0000, Lorenzo Stoakes wrote:
> > > +/**
> > > + * vma_start_write_killable - Begin writing to a VMA.
> > > + * @vma: The VMA we are going to modify.
> > > + *
> > > + * Exclude concurrent readers under the per-VMA lock until the currently
> > > + * write-locked mmap_lock is dropped or downgraded.
> > > + *
> > > + * Context: May sleep while waiting for readers to drop the vma read lock.
> > > + * Caller must already hold the mmap_lock for write.
> > > + *
> > > + * Return: 0 for a successful acquisition. -EINTR if a fatal signal was
> > > + * received.
> > > + */
> >
> > Not relevant to this series but probably we should write kdoc's for all of
> > these...
>
> Yes. I was mildly miffed that I couldn't crib from an existing one ;-)
> Definitely worth a followup patch at some point to add kdoc for all of
> them.
>
> > > +static inline __must_check
> > > +int vma_start_write_killable(struct vm_area_struct *vma)
> > > +{
> > > + unsigned int mm_lock_seq;
> > > +
> >
> > Wonder if an mmap write lock assert is appropriate here. But we can defer
> > that as we don't do that with vma_start_write() either...
>
> Seems like a reasonable addition in a later patch. As you say, it
> wasn't there in vma_start_write(), so I didn't like to add it.
>
> > > +int __vma_start_write(struct vm_area_struct *vma, unsigned int mm_lock_seq,
> > > + int state)
> > > {
> > > - bool locked;
> > > + int locked;
> > >
> > > /*
> > > * __vma_enter_locked() returns false immediately if the vma is not
> > > * attached, otherwise it waits until refcnt is indicating that vma
> > > * is attached with no readers.
> >
> > This comment seems to need updating?
>
> Honestly, I'd rather remove it. It's weird to document what the
> function you're calling does. If there's anything worth saving,
> put it in the comment before __vma_enter_locked().
Either removing or moving it above __vma_enter_locked() sounds good to me.
>
> > > */
> > > - locked = __vma_enter_locked(vma, false);
> > > + locked = __vma_enter_locked(vma, false, state);
> >
> > NIT, but while we're here maybe we could make this:
> >
> > locked = __vma_enter_locked(vma, /*detaching=*/false, state);
>
> Ah yes, the canonical 'don't pass bool arguments to functions' problem.
> #define VMA_DETACHING true
> #define VMA_NOT_DETACHING false
>
> > > @@ -118,7 +134,7 @@ void vma_mark_detached(struct vm_area_struct *vma)
> > > */
> > > if (unlikely(!refcount_dec_and_test(&vma->vm_refcnt))) {
> > > /* Wait until vma is detached with no readers. */
> > > - if (__vma_enter_locked(vma, true)) {
> > > + if (__vma_enter_locked(vma, true, TASK_UNINTERRUPTIBLE)) {
> >
> > No issues with us waiting uninterruptibly here?
> >
> > I guess rare condition and we require this to happen so makes sense.
>
> I'd defer to you on that analysis. If we want to add a
> vma_mark_detached_killable(), we can do that. If we want to make all
> callers of vma_mark_detached() check its return value, we can do that
> too (see earlier patch attempts you were cc'd on off-list). It really
> requires someone to do the analysis of whether it's worthwhile.
I analyzed this when Matthew proposed his first patch and I think
waiting here uninterruptibly is fine. The only time the writer might
be blocked here is when a reader temporarily increments vm_refcnt to
check vm_lock_seq, realize the vma is locked and drop back the
vm_refcnt. That is an unlikely case and when it happens the reader
should drop the refcount and unblock the writer very quickly.
>
next prev parent reply other threads:[~2025-11-17 19:50 UTC|newest]
Thread overview: 10+ messages / expand[flat|nested] mbox.gz Atom feed top
2025-11-10 20:32 [PATCH v2 0/2] vma_start_write_killable Matthew Wilcox (Oracle)
2025-11-10 20:32 ` [PATCH v2 1/2] mm: Add vma_start_write_killable() Matthew Wilcox (Oracle)
2025-11-11 8:58 ` Vlastimil Babka
2025-11-13 13:20 ` Lorenzo Stoakes
2025-11-13 14:40 ` Matthew Wilcox
2025-11-17 19:50 ` Suren Baghdasaryan [this message]
2025-11-10 20:32 ` [PATCH v2 2/2] mm: Use vma_start_write_killable() in dup_mmap() Matthew Wilcox (Oracle)
2025-11-11 9:16 ` Vlastimil Babka
2025-11-13 13:25 ` Lorenzo Stoakes
2025-11-13 14:30 ` Matthew Wilcox
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to='CAJuCfpG32mQT45KqEkO=5TNecES9M6AccPLw-Ju=N2cbto++=Q@mail.gmail.com' \
--to=surenb@google.com \
--cc=Liam.Howlett@oracle.com \
--cc=akpm@linux-foundation.org \
--cc=chriscli@google.com \
--cc=jannh@google.com \
--cc=linux-mm@kvack.org \
--cc=lorenzo.stoakes@oracle.com \
--cc=pfalcato@suse.de \
--cc=shakeel.butt@linux.dev \
--cc=vbabka@suse.cz \
--cc=willy@infradead.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox