From: Tejun Heo <tj@kernel.org>
To: Qi Zheng <zhengqi.arch@bytedance.com>
Cc: dennis@kernel.org, cl@linux.com, akpm@linux-foundation.org,
	linux-mm@kvack.org, linux-kernel@vger.kernel.org,
	zhouchengming@bytedance.com, songmuchun@bytedance.com
Subject: Re: [PATCH] percpu_ref: call wake_up_all() after percpu_ref_put() completes
Date: Fri, 8 Apr 2022 07:41:05 -1000
Message-ID: <YlBzsakUloG4nS7W@slm.duckdns.org>
In-Reply-To: <20220407103335.36885-1-zhengqi.arch@bytedance.com>

Hello,

On Thu, Apr 07, 2022 at 06:33:35PM +0800, Qi Zheng wrote:
> In the percpu_ref_call_confirm_rcu(), we call the wake_up_all()
> before calling percpu_ref_put(), which will cause the value of
> percpu_ref to be unstable when percpu_ref_switch_to_atomic_sync()
> returns.
>
>         CPU0                            CPU1
>
> percpu_ref_switch_to_atomic_sync(&ref)
> --> percpu_ref_switch_to_atomic(&ref)
>     --> percpu_ref_get(ref);    /* put after confirmation */
>         call_rcu(&ref->data->rcu, percpu_ref_switch_to_atomic_rcu);
>
>                                         percpu_ref_switch_to_atomic_rcu
>                                         --> percpu_ref_call_confirm_rcu
>                                             --> data->confirm_switch = NULL;
>                                                 wake_up_all(&percpu_ref_switch_waitq);
>
>     /* here waiting to wake up */
>     wait_event(percpu_ref_switch_waitq, !ref->data->confirm_switch);
>                                                 (A)percpu_ref_put(ref);
> /* The value of &ref is unstable! */
> percpu_ref_is_zero(&ref)
>                                                 (B)percpu_ref_put(ref);
>
> As shown above, assuming that the counts on each cpu add up to 0 before
> calling percpu_ref_switch_to_atomic_sync(), we expect that after switching
> to atomic mode, percpu_ref_is_zero() can return true. But actually it will
> return different values in the two cases of A and B, which is not what
> we expected.
>
> Maybe the original purpose of percpu_ref_switch_to_atomic_sync() is
> just to ensure that the conversion to atomic mode is completed, but it
> should not return with an extra reference count.
>
> Calling wake_up_all() after percpu_ref_put() ensures that the value of
> percpu_ref is stable after percpu_ref_switch_to_atomic_sync() returns.
> So just do it.
>
> Signed-off-by: Qi Zheng <zhengqi.arch@bytedance.com>
> ---
> lib/percpu-refcount.c | 3 ++-
> 1 file changed, 2 insertions(+), 1 deletion(-)
>
> diff --git a/lib/percpu-refcount.c b/lib/percpu-refcount.c
> index af9302141bcf..b11b4152c8cd 100644
> --- a/lib/percpu-refcount.c
> +++ b/lib/percpu-refcount.c
> @@ -154,13 +154,14 @@ static void percpu_ref_call_confirm_rcu(struct rcu_head *rcu)
>
>  	data->confirm_switch(ref);
>  	data->confirm_switch = NULL;
> -	wake_up_all(&percpu_ref_switch_waitq);
>
>  	if (!data->allow_reinit)
>  		__percpu_ref_exit(ref);
>
>  	/* drop ref from percpu_ref_switch_to_atomic() */
>  	percpu_ref_put(ref);
> +
> +	wake_up_all(&percpu_ref_switch_waitq);

The interface, at least originally, doesn't give any guarantee over whether
there's gonna be a residual reference on it or not. There's nothing
necessarily wrong with guaranteeing that but it's rather unusual and given
that putting the base ref in a percpu_ref is a special "kill" operation and
a ref in percpu mode always returns %false on is_zero(), I'm not quite sure
how such semantics would be useful. Do you care to explain the use case with
concrete examples?
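
To make the is_zero() point above concrete, here is a minimal user-space
sketch. It is a model under stated assumptions, not the kernel code:
struct model_ref, its fields, and the model_ref_*() helpers are hypothetical
stand-ins for struct percpu_ref and its accessors. It only illustrates that a
ref in percpu mode never reports zero, and that in atomic mode the answer
depends on whether every reference, including any temporary one, has been put.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for struct percpu_ref; not the kernel layout. */
struct model_ref {
	bool percpu_mode;	/* true while counts live in per-CPU counters */
	long atomic_count;	/* only meaningful once in atomic mode */
};

/* Models the rule from the thread: percpu mode never reports zero. */
static bool model_ref_is_zero(const struct model_ref *ref)
{
	if (ref->percpu_mode)
		return false;
	return ref->atomic_count == 0;
}

/* Models dropping one reference while in atomic mode. */
static void model_ref_put(struct model_ref *ref)
{
	assert(!ref->percpu_mode && ref->atomic_count > 0);
	ref->atomic_count--;
}
```

In this model, a caller that inspects model_ref_is_zero() while one reference
is still held in atomic mode gets false, and gets true only after the
matching model_ref_put().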

Also, the proposed patch is racy. There's nothing preventing
percpu_ref_switch_to_atomic_sync() from waking up early between
confirm_switch clearing and the wake_up_all, so the above change doesn't
guarantee what it tries to guarantee. For that, you'd have to move
confirm_switch clearing *after* percpu_ref_put() but then, you'd be
accessing the ref after its final ref is put which can lead to
use-after-free.
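
The window described above can be stepped through in a small deterministic
user-space model. This is only a sketch: struct model_ref, enum step,
callback_step() and waiter_may_return() are hypothetical names, and the model
assumes the waiter is allowed to proceed whenever its wait condition
(!confirm_switch) evaluates true, which wait_event() checks before sleeping
and again on any wakeup of the waitqueue.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model state; the names mirror the thread, not kernel code. */
struct model_ref {
	long count;		/* atomic count, incl. the switch's temporary ref */
	bool confirm_pending;	/* stands in for data->confirm_switch != NULL */
};

/* Steps of the callback as reordered by the proposed patch. */
enum step { STEP_CLEAR_CONFIRM, STEP_PUT, STEP_WAKE_UP, STEP_DONE };

/* Runs one step of the modeled callback and returns the next step. */
static enum step callback_step(struct model_ref *ref, enum step s)
{
	switch (s) {
	case STEP_CLEAR_CONFIRM:
		ref->confirm_pending = false;
		return STEP_PUT;
	case STEP_PUT:
		assert(ref->count > 0);
		ref->count--;
		return STEP_WAKE_UP;
	case STEP_WAKE_UP:
		/* The wake-up only nudges sleepers; it changes no state here. */
		return STEP_DONE;
	default:
		return STEP_DONE;
	}
}

/* The waiter's condition: it may return as soon as this is true. */
static bool waiter_may_return(const struct model_ref *ref)
{
	return !ref->confirm_pending;
}
```

Running one step from STEP_CLEAR_CONFIRM leaves waiter_may_return() true
while count still includes the temporary reference, which is the state the
patch was trying to rule out. Moving the wake-up later does not change that.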

In fact, the whole premise seems wrong. The switching needs a reference to
the percpu_ref because it is accessing it asynchronously. The switching side
doesn't know when the ref is gonna go away once it puts its reference and
thus can't signal that they're done after putting their reference.

We *can* make that work by putting the whole thing in its own critical
section so that we can make confirm_switch clearing atomic with the possibly
final put, but that's gonna add some complexity and begs the question why
we'd need such a thing.

Andrew, I don't think the patch as proposed makes much sense. Maybe it'd be
better to keep it out of the tree for the time being?

Thanks.

--
tejun