From: Hillf Danton <hdanton@sina.com>
To: Paolo Bonzini <pbonzini@redhat.com>
Cc: Hillf Danton <hdanton@sina.com>,
Thomas Gleixner <tglx@linutronix.de>,
Sebastian Andrzej Siewior <bigeasy@linutronix.de>,
"Michael S. Tsirkin" <mst@redhat.com>,
linux-mm@kvack.org, LKML <linux-kernel@vger.kernel.org>,
Al Viro <viro@zeniv.linux.org.uk>
Subject: Re: 5.13-rt1 + KVM = WARNING: at fs/eventfd.c:74 eventfd_signal()
Date: Fri, 23 Jul 2021 10:23:56 +0800
Message-ID: <20210723022356.1301-1-hdanton@sina.com>
In-Reply-To: <2b4aea8d-a038-e347-7f6f-10476d771b7e@redhat.com>
On Wed, 21 Jul 2021 12:59:39 +0200 Paolo Bonzini wrote:
>On 21/07/21 12:11, Hillf Danton wrote:
>> On Wed, 21 Jul 2021 09:25:32 +0200 Thomas Gleixner wrote:
>>> On Wed, Jul 21 2021 at 15:04, Hillf Danton wrote:
>>>>
>>>> But the preempting waker can not make sense without the waiter who is bloody
>>>> special. Why is it so in the first place? Or it is not at all but the race
>>>> existing from Monday to Friday.
>>>
>>> See the large comment in eventfd_poll().
>>
>> Is it likely for a reader to make eventfd_poll() return 0?
>>
>>   read      *     poll                               write
>>   ----      *     -----------------                  ------------
>>             *     count = ctx->count (INVALID!)
>>             *                                        lock ctx->qwh.lock
>>             *                                        ctx->count += n
>>             *                                        **waitqueue_active is false**
>>             *                                        **no wake_up_locked_poll!**
>>             *                                        unlock ctx->qwh.lock
>>
>>   lock ctx->qwh.lock
>>   *cnt = (ctx->flags & EFD_SEMAPHORE) ? 1 : ctx->count;
>>   ctx->count -= *cnt;
>>   **waitqueue_active is false**
>>   unlock ctx->qwh.lock
>>
>>             *     lock ctx->wqh.lock (in poll_wait)
>>             *     __add_wait_queue
>>             *     unlock ctx->wqh.lock
>>             *     eventfd_poll returns 0
>>             */
>>   count = READ_ONCE(ctx->count);
>>
>
>No, it's simply impossible. The same comment explains why: "count =
>ctx->count" cannot move above poll_wait's locking of ctx->wqh.lock.

Detect a concurrent reader or writer by reading the event counter before
and after poll_wait(), and choose the returned event mask with the case of
an unstable counter taken into account.

Cut the big comment, as the added barriers speak for themselves.

+++ x/fs/eventfd.c
@@ -131,49 +131,20 @@ static __poll_t eventfd_poll(struct file
 {
 	struct eventfd_ctx *ctx = file->private_data;
 	__poll_t events = 0;
-	u64 count;
+	u64 c0, count;
+
+	c0 = ctx->count;
+	smp_rmb();
 	poll_wait(file, &ctx->wqh, wait);
-	/*
-	 * All writes to ctx->count occur within ctx->wqh.lock.  This read
-	 * can be done outside ctx->wqh.lock because we know that poll_wait
-	 * takes that lock (through add_wait_queue) if our caller will sleep.
-	 *
-	 * The read _can_ therefore seep into add_wait_queue's critical
-	 * section, but cannot move above it!  add_wait_queue's spin_lock acts
-	 * as an acquire barrier and ensures that the read be ordered properly
-	 * against the writes.  The following CAN happen and is safe:
-	 *
-	 *     poll                               write
-	 *     -----------------                  ------------
-	 *     lock ctx->wqh.lock (in poll_wait)
-	 *     count = ctx->count
-	 *     __add_wait_queue
-	 *     unlock ctx->wqh.lock
-	 *                                        lock ctx->qwh.lock
-	 *                                        ctx->count += n
-	 *                                        if (waitqueue_active)
-	 *                                          wake_up_locked_poll
-	 *                                        unlock ctx->qwh.lock
-	 *     eventfd_poll returns 0
-	 *
-	 * but the following, which would miss a wakeup, cannot happen:
-	 *
-	 *     poll                               write
-	 *     -----------------                  ------------
-	 *     count = ctx->count (INVALID!)
-	 *                                        lock ctx->qwh.lock
-	 *                                        ctx->count += n
-	 *                                        **waitqueue_active is false**
-	 *                                        **no wake_up_locked_poll!**
-	 *                                        unlock ctx->qwh.lock
-	 *     lock ctx->wqh.lock (in poll_wait)
-	 *     __add_wait_queue
-	 *     unlock ctx->wqh.lock
-	 *     eventfd_poll returns 0
-	 */
-	count = READ_ONCE(ctx->count);
+	smp_rmb();
+	count = ctx->count;
+
+	if (c0 < count)
+		return EPOLLIN;
+	if (c0 > count)
+		return EPOLLOUT;
 	if (count > 0)
 		events |= EPOLLIN;
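
For illustration only, the decision made on the two counter samples in the
hunk above can be modelled in plain userspace C. This is a rough sketch, not
kernel code: classify_samples(), sketch_self_check() and the SKETCH_EPOLL*
constants are hypothetical names invented here, and the model covers only
the checks visible in the hunk (the rest of eventfd_poll() is not shown in
this message, so it is left out).

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-ins for the kernel's EPOLLIN/EPOLLOUT bits. */
#define SKETCH_EPOLLIN  0x001u
#define SKETCH_EPOLLOUT 0x004u

/*
 * c0:    counter value sampled before poll_wait()
 * count: counter value sampled after poll_wait()
 *
 * Returns the event mask chosen by the checks shown in the hunk.
 */
unsigned int classify_samples(uint64_t c0, uint64_t count)
{
	unsigned int events = 0;

	if (c0 < count)		/* counter grew between the two samples */
		return SKETCH_EPOLLIN;
	if (c0 > count)		/* counter shrank between the two samples */
		return SKETCH_EPOLLOUT;

	/* Both samples agree: fall through to the existing level check. */
	if (count > 0)
		events |= SKETCH_EPOLLIN;
	return events;
}

/* Small self-check covering the three branches above. */
void sketch_self_check(void)
{
	assert(classify_samples(0, 3) == SKETCH_EPOLLIN);  /* grew */
	assert(classify_samples(5, 0) == SKETCH_EPOLLOUT); /* shrank */
	assert(classify_samples(2, 2) == SKETCH_EPOLLIN);  /* stable, nonzero */
	assert(classify_samples(0, 0) == 0);               /* stable, zero */
}
```

The model says nothing about memory ordering; it only makes the three-way
comparison explicit, under the assumption that both samples are plain u64
values.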