linux-mm.kvack.org archive mirror
 help / color / mirror / Atom feed
From: Mateusz Guzik <mjguzik@gmail.com>
To: Suren Baghdasaryan <surenb@google.com>
Cc: Andrii Nakryiko <andrii.nakryiko@gmail.com>,
	 Andrii Nakryiko <andrii@kernel.org>,
	linux-trace-kernel@vger.kernel.org, peterz@infradead.org,
	 oleg@redhat.com, rostedt@goodmis.org, mhiramat@kernel.org,
	bpf@vger.kernel.org,  linux-kernel@vger.kernel.org,
	jolsa@kernel.org, paulmck@kernel.org, willy@infradead.org,
	 akpm@linux-foundation.org, linux-mm@kvack.org,
	Jann Horn <jannh@google.com>
Subject: Re: [PATCH RFC v3 13/13] uprobes: add speculative lockless VMA to inode resolution
Date: Thu, 15 Aug 2024 20:24:15 +0200	[thread overview]
Message-ID: <guxwr4wzs5yt5ajrpwwpjdv6lbjf4dhgmjh7edrbc7lvevnh2o@joquw2jf6s4i> (raw)
In-Reply-To: <CAJuCfpGZT+ci0eDfTuLvo-3=jtEfMLYswnDJ0CQHfittou0GZQ@mail.gmail.com>

On Thu, Aug 15, 2024 at 10:45:45AM -0700, Suren Baghdasaryan wrote:
> >From all the above, my understanding of your objection is that
> checking mmap_lock during our speculation is too coarse-grained and
> you would prefer to use the VMA seq counter to check that the VMA we
> are working on is unchanged. I agree, that would be ideal. I had a
> quick chat with Jann about this and the conclusion we came to is that
> we would need to add an additional smp_wmb() barrier inside
> vma_start_write() and a smp_rmb() in the speculation code:
> 
> static inline void vma_start_write(struct vm_area_struct *vma)
> {
>         int mm_lock_seq;
> 
>         if (__is_vma_write_locked(vma, &mm_lock_seq))
>                 return;
> 
>         down_write(&vma->vm_lock->lock);
>         /*
>          * We should use WRITE_ONCE() here because we can have concurrent reads
>          * from the early lockless pessimistic check in vma_start_read().
>          * We don't really care about the correctness of that early check, but
>          * we should use WRITE_ONCE() for cleanliness and to keep KCSAN happy.
>          */
>         WRITE_ONCE(vma->vm_lock_seq, mm_lock_seq);
> +        smp_wmb();
>         up_write(&vma->vm_lock->lock);
> }
> 
> Note: up_write(&vma->vm_lock->lock) in the vma_start_write() is not
> enough because it's one-way permeable (it's a "RELEASE operation") and
> later vma->vm_file store (or any other VMA modification) can move
> before our vma->vm_lock_seq store.
> 
> This makes vma_start_write() heavier but again, it's write-locking, so
> should not be considered a fast path.
> With this change we can use the code suggested by Andrii in
> https://lore.kernel.org/all/CAEf4BzZeLg0WsYw2M7KFy0+APrPaPVBY7FbawB9vjcA2+6k69Q@mail.gmail.com/
> with an additional smp_rmb():
> 
> rcu_read_lock()
> vma = find_vma(...)
> if (!vma) /* bail */
> 
> vm_lock_seq = smp_load_acquire(&vma->vm_lock_seq);
> mm_lock_seq = smp_load_acquire(&vma->mm->mm_lock_seq);
> /* I think vm_lock has to be acquired first to avoid the race */
> if (mm_lock_seq == vm_lock_seq)
>         /* bail, vma is write-locked */
> ... perform uprobe lookup logic based on vma->vm_file->f_inode ...
> smp_rmb();
> if (vma->vm_lock_seq != vm_lock_seq)
>         /* bail, VMA might have changed */
> 
> The smp_rmb() is needed so that vma->vm_lock_seq load does not get
> reordered and moved up before speculation.
> 
> I'm CC'ing Jann since he understands memory barriers way better than
> me and will keep me honest.
> 

So I briefly noted that maybe down_read on the vma would do it, but per
Andrii parallel lookups on the same vma on multiple CPUs are expected,
which whacks that out.

When I initially mentioned per-vma sequence counters I blindly assumed
they worked the usual way. I don't believe any fancy rework here is
warranted especially given that the per-mm counter thing is expected to
have other uses.

However, chances are decent this can still be worked out with per-vma
granualarity all while avoiding any stores on lookup and without
invasive (or complicated) changes. The lockless uprobe code claims to
guarantee only false negatives and the miss always falls back to the
mmap semaphore lookup. There may be something here, I'm going to chew on
it.

That said, thank you both for writeup so far.


  reply	other threads:[~2024-08-15 18:24 UTC|newest]

Thread overview: 40+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2024-08-13  4:29 [PATCH v3 00/13] uprobes: RCU-protected hot path optimizations Andrii Nakryiko
2024-08-13  4:29 ` [PATCH v3 01/13] uprobes: revamp uprobe refcounting and lifetime management Andrii Nakryiko
2024-08-13  4:29 ` [PATCH v3 02/13] uprobes: protected uprobe lifetime with SRCU Andrii Nakryiko
2024-08-13  4:29 ` [PATCH v3 03/13] uprobes: get rid of enum uprobe_filter_ctx in uprobe filter callbacks Andrii Nakryiko
2024-08-13  4:29 ` [PATCH v3 04/13] uprobes: travers uprobe's consumer list locklessly under SRCU protection Andrii Nakryiko
2024-08-22 14:22   ` Jiri Olsa
2024-08-22 16:59     ` Andrii Nakryiko
2024-08-22 17:35       ` Jiri Olsa
2024-08-22 17:51         ` Andrii Nakryiko
2024-08-13  4:29 ` [PATCH v3 05/13] perf/uprobe: split uprobe_unregister() Andrii Nakryiko
2024-08-13  4:29 ` [PATCH v3 06/13] rbtree: provide rb_find_rcu() / rb_find_add_rcu() Andrii Nakryiko
2024-08-13  4:29 ` [PATCH v3 07/13] uprobes: perform lockless SRCU-protected uprobes_tree lookup Andrii Nakryiko
2024-08-13  4:29 ` [PATCH v3 08/13] uprobes: switch to RCU Tasks Trace flavor for better performance Andrii Nakryiko
2024-08-13  4:29 ` [PATCH RFC v3 09/13] uprobes: SRCU-protect uretprobe lifetime (with timeout) Andrii Nakryiko
2024-08-19 13:41   ` Oleg Nesterov
2024-08-19 20:34     ` Andrii Nakryiko
2024-08-20 15:05       ` Oleg Nesterov
2024-08-20 18:01         ` Andrii Nakryiko
2024-08-13  4:29 ` [PATCH RFC v3 10/13] uprobes: implement SRCU-protected lifetime for single-stepped uprobe Andrii Nakryiko
2024-08-13  4:29 ` [PATCH RFC v3 11/13] mm: introduce mmap_lock_speculation_{start|end} Andrii Nakryiko
2024-08-13  4:29 ` [PATCH RFC v3 12/13] mm: add SLAB_TYPESAFE_BY_RCU to files_cache Andrii Nakryiko
2024-08-13  6:07   ` Mateusz Guzik
2024-08-13 14:49     ` Suren Baghdasaryan
2024-08-13 18:15       ` Andrii Nakryiko
2024-08-13  4:29 ` [PATCH RFC v3 13/13] uprobes: add speculative lockless VMA to inode resolution Andrii Nakryiko
2024-08-13  6:17   ` Mateusz Guzik
2024-08-13 15:36     ` Suren Baghdasaryan
2024-08-15 13:44       ` Mateusz Guzik
2024-08-15 16:47         ` Andrii Nakryiko
2024-08-15 17:45           ` Suren Baghdasaryan
2024-08-15 18:24             ` Mateusz Guzik [this message]
2024-08-15 18:58             ` Jann Horn
2024-08-15 19:07               ` Mateusz Guzik
2024-08-15 19:17                 ` Arnaldo Carvalho de Melo
2024-08-15 19:18                   ` Arnaldo Carvalho de Melo
2024-08-15 19:44               ` Suren Baghdasaryan
2024-08-15 20:17               ` Andrii Nakryiko
2024-08-15 13:24 ` [PATCH v3 00/13] uprobes: RCU-protected hot path optimizations Oleg Nesterov
2024-08-15 16:49   ` Andrii Nakryiko
2024-08-21 16:41     ` Andrii Nakryiko

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=guxwr4wzs5yt5ajrpwwpjdv6lbjf4dhgmjh7edrbc7lvevnh2o@joquw2jf6s4i \
    --to=mjguzik@gmail.com \
    --cc=akpm@linux-foundation.org \
    --cc=andrii.nakryiko@gmail.com \
    --cc=andrii@kernel.org \
    --cc=bpf@vger.kernel.org \
    --cc=jannh@google.com \
    --cc=jolsa@kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-mm@kvack.org \
    --cc=linux-trace-kernel@vger.kernel.org \
    --cc=mhiramat@kernel.org \
    --cc=oleg@redhat.com \
    --cc=paulmck@kernel.org \
    --cc=peterz@infradead.org \
    --cc=rostedt@goodmis.org \
    --cc=surenb@google.com \
    --cc=willy@infradead.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox