linux-mm.kvack.org archive mirror
 help / color / mirror / Atom feed
From: Andrii Nakryiko <andrii.nakryiko@gmail.com>
To: Oleg Nesterov <oleg@redhat.com>
Cc: Andrii Nakryiko <andrii@kernel.org>,
	linux-trace-kernel@vger.kernel.org,  peterz@infradead.org,
	rostedt@goodmis.org, mhiramat@kernel.org,  bpf@vger.kernel.org,
	linux-kernel@vger.kernel.org, jolsa@kernel.org,
	 paulmck@kernel.org, willy@infradead.org, surenb@google.com,
	 akpm@linux-foundation.org, linux-mm@kvack.org
Subject: Re: [PATCH RFC v3 09/13] uprobes: SRCU-protect uretprobe lifetime (with timeout)
Date: Tue, 20 Aug 2024 11:01:13 -0700	[thread overview]
Message-ID: <CAEf4Bzb7gQEKk26B4-KXS_ht8LyCRA7SdThc7ZS05gOEuNZjrQ@mail.gmail.com> (raw)
In-Reply-To: <20240820150534.GD12400@redhat.com>

On Tue, Aug 20, 2024 at 8:06 AM Oleg Nesterov <oleg@redhat.com> wrote:
>
> On 08/19, Andrii Nakryiko wrote:
> >
> > On Mon, Aug 19, 2024 at 6:41 AM Oleg Nesterov <oleg@redhat.com> wrote:
> > >
> > > On 08/12, Andrii Nakryiko wrote:
> > > >
> > > > Avoid taking refcount on uprobe in prepare_uretprobe(), instead take
> > > > uretprobe-specific SRCU lock and keep it active as kernel transfers
> > > > control back to user space.
> > > ...
> > > >  include/linux/uprobes.h |  49 ++++++-
> > > >  kernel/events/uprobes.c | 294 ++++++++++++++++++++++++++++++++++------
> > > >  2 files changed, 301 insertions(+), 42 deletions(-)
> > >
> > > Oh. To be honest I don't like this patch.
> > >
> > > I would like to know what other reviewers think, but to me it adds too many
> > > complications that I can't even fully understand...
> >
> > Which parts? The atomic xchg() and cmpxchg() parts? What exactly do
> > you feel like you don't fully understand?
>
> Heh, everything looks too complex for me ;)

Well, the best code is no code. But I'm not doing this just for fun,
so I'm happy with the simplest solution *that works*.

>
> Say, hprobe_expire(). It is also called by ri_timer() in softirq context,
> right? And it does
>
>         /* We lost the race, undo our refcount bump. It can drop to zero. */
>         put_uprobe(uprobe);
>
> How so? If the refcount goes to zero, put_uprobe() does mutex_lock(),
> but it must not sleep in softirq context.
>

Now we are talking about specific issues, thank you! It's hard to
discuss "I don't like".

Yes, I think you are right and it is certainly a problem with
put_uprobe() that it can't be called from softirq (just having to
remember that is error-prone, as is evidenced by me forgetting about
this issue).

It's easy enough to solve. We can either schedule work from timer
thread (and that will solve this particular issue only), or we can
teach put_uprobe() to schedule work item if it drops refcount to zero
from softirq and other restricted contexts.

I vote for making put_uprobe() flexible in this regard, add
work_struct to uprobe, and schedule all this to be done in the work
queue callback. WDYT?

>
> Or, prepare_uretprobe() which does
>
>         rcu_assign_pointer(utask->return_instances, ri);
>
>         if (!timer_pending(&utask->ri_timer))
>                 mod_timer(&utask->ri_timer, ...);
>
> Suppose that the timer was pending and it was fired right before
> rcu_assign_pointer(). What guarantees that prepare_uretprobe() will see
> timer_pending() == false?
>
> rcu_assign_pointer()->smp_store_release() is a one-way barrier. This
> timer_pending() check may appear to happen before rcu_assign_pointer()
> completes.
>
> In this (yes, theoretical) case ri_timer() can miss the new return_instance,
> while prepare_uretprobe() can miss the necessary mod_timer(). I think this
> needs another mb() in between.
>

Ok, that's fair. I felt like this pattern might be a bit problematic,
but I also felt like it's good to have to ensure that we do
occasionally have a race between timer callback and uretprobe, even if
uretprobe returns very quickly. (and I did confirm we get those races
and they seem to be handled fine, i.e., I saw uprobes being "expired"
into refcounted ones from ri_timer)

But the really simple way to solve this is to do unconditional
mod_timer(), so I can do just that to keep this less tricky. Would you
be ok with that?

>
> And I can't convince myself hprobe_expire() is correct... OK, I don't
> fully understand the logic, but why data_race(READ_ONCE(hprobe->leased)) ?
> READ_ONCE() should be enough in this case?

you mean why data_race() annotation? To appease KCSAN, given that we
modify hprobe->leased with xchg/cmpxchg, but read here with
READ_ONCE(). Maybe I'm overthinking it, not sure. There is a reason
why this is an RFC ;)

>
>
> > > As I have already mentioned in the previous discussions, we can simply kill
> > > utask->active_uprobe. And utask->auprobe.
> >
> > I don't have anything against that, in principle, but let's benchmark
> > and test that thoroughly. I'm a bit uneasy about the possibility that
> > some arch-specific code will do container_of() on this arch_uprobe in
> > order to get to uprobe,
>
> Well, struct uprobe is not "exported", the arch-specific code can't do this.
>

Ah, good point, that's great. But as I said, uretprobe is actually
*the common* use case, not single-stepped uprobe. Still not very happy
about that memcpy() and assumption that it's cheap, but that's minor.

But no matter what we do for single-stepped one, uretprobe needs some
solution. (and if that solution works for uretprobe, why wouldn't it
work for single-step?...)

> Oleg.
>


  reply	other threads:[~2024-08-20 18:01 UTC|newest]

Thread overview: 40+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2024-08-13  4:29 [PATCH v3 00/13] uprobes: RCU-protected hot path optimizations Andrii Nakryiko
2024-08-13  4:29 ` [PATCH v3 01/13] uprobes: revamp uprobe refcounting and lifetime management Andrii Nakryiko
2024-08-13  4:29 ` [PATCH v3 02/13] uprobes: protected uprobe lifetime with SRCU Andrii Nakryiko
2024-08-13  4:29 ` [PATCH v3 03/13] uprobes: get rid of enum uprobe_filter_ctx in uprobe filter callbacks Andrii Nakryiko
2024-08-13  4:29 ` [PATCH v3 04/13] uprobes: travers uprobe's consumer list locklessly under SRCU protection Andrii Nakryiko
2024-08-22 14:22   ` Jiri Olsa
2024-08-22 16:59     ` Andrii Nakryiko
2024-08-22 17:35       ` Jiri Olsa
2024-08-22 17:51         ` Andrii Nakryiko
2024-08-13  4:29 ` [PATCH v3 05/13] perf/uprobe: split uprobe_unregister() Andrii Nakryiko
2024-08-13  4:29 ` [PATCH v3 06/13] rbtree: provide rb_find_rcu() / rb_find_add_rcu() Andrii Nakryiko
2024-08-13  4:29 ` [PATCH v3 07/13] uprobes: perform lockless SRCU-protected uprobes_tree lookup Andrii Nakryiko
2024-08-13  4:29 ` [PATCH v3 08/13] uprobes: switch to RCU Tasks Trace flavor for better performance Andrii Nakryiko
2024-08-13  4:29 ` [PATCH RFC v3 09/13] uprobes: SRCU-protect uretprobe lifetime (with timeout) Andrii Nakryiko
2024-08-19 13:41   ` Oleg Nesterov
2024-08-19 20:34     ` Andrii Nakryiko
2024-08-20 15:05       ` Oleg Nesterov
2024-08-20 18:01         ` Andrii Nakryiko [this message]
2024-08-13  4:29 ` [PATCH RFC v3 10/13] uprobes: implement SRCU-protected lifetime for single-stepped uprobe Andrii Nakryiko
2024-08-13  4:29 ` [PATCH RFC v3 11/13] mm: introduce mmap_lock_speculation_{start|end} Andrii Nakryiko
2024-08-13  4:29 ` [PATCH RFC v3 12/13] mm: add SLAB_TYPESAFE_BY_RCU to files_cache Andrii Nakryiko
2024-08-13  6:07   ` Mateusz Guzik
2024-08-13 14:49     ` Suren Baghdasaryan
2024-08-13 18:15       ` Andrii Nakryiko
2024-08-13  4:29 ` [PATCH RFC v3 13/13] uprobes: add speculative lockless VMA to inode resolution Andrii Nakryiko
2024-08-13  6:17   ` Mateusz Guzik
2024-08-13 15:36     ` Suren Baghdasaryan
2024-08-15 13:44       ` Mateusz Guzik
2024-08-15 16:47         ` Andrii Nakryiko
2024-08-15 17:45           ` Suren Baghdasaryan
2024-08-15 18:24             ` Mateusz Guzik
2024-08-15 18:58             ` Jann Horn
2024-08-15 19:07               ` Mateusz Guzik
2024-08-15 19:17                 ` Arnaldo Carvalho de Melo
2024-08-15 19:18                   ` Arnaldo Carvalho de Melo
2024-08-15 19:44               ` Suren Baghdasaryan
2024-08-15 20:17               ` Andrii Nakryiko
2024-08-15 13:24 ` [PATCH v3 00/13] uprobes: RCU-protected hot path optimizations Oleg Nesterov
2024-08-15 16:49   ` Andrii Nakryiko
2024-08-21 16:41     ` Andrii Nakryiko

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=CAEf4Bzb7gQEKk26B4-KXS_ht8LyCRA7SdThc7ZS05gOEuNZjrQ@mail.gmail.com \
    --to=andrii.nakryiko@gmail.com \
    --cc=akpm@linux-foundation.org \
    --cc=andrii@kernel.org \
    --cc=bpf@vger.kernel.org \
    --cc=jolsa@kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-mm@kvack.org \
    --cc=linux-trace-kernel@vger.kernel.org \
    --cc=mhiramat@kernel.org \
    --cc=oleg@redhat.com \
    --cc=paulmck@kernel.org \
    --cc=peterz@infradead.org \
    --cc=rostedt@goodmis.org \
    --cc=surenb@google.com \
    --cc=willy@infradead.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox