From: Petr Mladek <pmladek@suse.com>
To: Andrew Morton <akpm@linux-foundation.org>,
Oleg Nesterov <oleg@redhat.com>, Tejun Heo <tj@kernel.org>,
Ingo Molnar <mingo@redhat.com>,
Peter Zijlstra <peterz@infradead.org>
Cc: Steven Rostedt <rostedt@goodmis.org>,
"Paul E. McKenney" <paulmck@linux.vnet.ibm.com>,
Josh Triplett <josh@joshtriplett.org>,
Thomas Gleixner <tglx@linutronix.de>,
Linus Torvalds <torvalds@linux-foundation.org>,
Jiri Kosina <jkosina@suse.cz>, Borislav Petkov <bp@suse.de>,
Michal Hocko <mhocko@suse.cz>,
linux-mm@kvack.org, Vlastimil Babka <vbabka@suse.cz>,
live-patching@vger.kernel.org, linux-api@vger.kernel.org,
linux-kernel@vger.kernel.org, Petr Mladek <pmladek@suse.com>
Subject: [RFC v2 10/18] ring_buffer: Do not complete benchmark reader too early
Date: Mon, 21 Sep 2015 15:03:51 +0200
Message-ID: <1442840639-6963-11-git-send-email-pmladek@suse.com>
In-Reply-To: <1442840639-6963-1-git-send-email-pmladek@suse.com>
It seems that complete(&read_done) might be called too early
in some situations.
1st scenario:
-------------

CPU0                                       CPU1

ring_buffer_producer_thread()
  wake_up_process(consumer);
  wait_for_completion(&read_start);

                                           ring_buffer_consumer_thread()
                                             complete(&read_start);

  ring_buffer_producer()
    # producing data in
    # the do-while cycle

                                             ring_buffer_consumer();
                                               # reading data
                                               # got error
                                               # set kill_test = 1;
                                               set_current_state(
                                                   TASK_INTERRUPTIBLE);
                                               if (reader_finish)  # false
                                               schedule();

    # producer still in the middle of
    # do-while cycle
    if (consumer && !(cnt % wakeup_interval))
          wake_up_process(consumer);

                                               # spurious wakeup
                                               while (!reader_finish &&
                                                      !kill_test)
                                               # leaving because
                                               # kill_test == 1
                                               reader_finish = 0;
                                               complete(&read_done);

1st BANG: We might access an uninitialized "read_done" if this is
          the first round.

    # producer finally leaving
    # the do-while cycle because kill_test == 1;

    if (consumer) {
      reader_finish = 1;
      wake_up_process(consumer);
      wait_for_completion(&read_done);

2nd BANG: This will never complete because the consumer already did
          the completion.
2nd scenario:
-------------

CPU0                                       CPU1

ring_buffer_producer_thread()
  wake_up_process(consumer);
  wait_for_completion(&read_start);

                                           ring_buffer_consumer_thread()
                                             complete(&read_start);

  ring_buffer_producer()

    # CPU3 removes the module      <--- difference from
    # and stops producer           <--- the 1st scenario

    if (kthread_should_stop())
      kill_test = 1;

                                             ring_buffer_consumer();
                                               while (!reader_finish &&
                                                      !kill_test)
                                               # kill_test == 1 => we never go
                                               # into the top level while()
                                               reader_finish = 0;
                                               complete(&read_done);

    # producer still in the middle of
    # do-while cycle
    if (consumer && !(cnt % wakeup_interval))
          wake_up_process(consumer);

                                               # spurious wakeup
                                               while (!reader_finish &&
                                                      !kill_test)
                                               # leaving because kill_test == 1
                                               reader_finish = 0;
                                               complete(&read_done);

BANG: We are in the same "bang" situations as in the 1st scenario.
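
For reference, both scenarios above walk through roughly this shape of
the current (pre-patch) consumer loop. It is only a simplified sketch
reconstructed from the lines removed by the diff below;
read_all_cpus_once() is a made-up stand-in for the per-CPU reading code:

static void ring_buffer_consumer(void)
{
	while (!reader_finish && !kill_test) {
		int found;

		do {
			/* hypothetical helper standing in for the per-CPU
			 * reading code; returns 1 when an event was found
			 * and sets kill_test = 1 on an error
			 */
			found = read_all_cpus_once();
		} while (found && !kill_test);

		set_current_state(TASK_INTERRUPTIBLE);
		if (reader_finish)
			break;

		schedule();
	}
	reader_finish = 0;
	complete(&read_done);
}

The important detail is that complete(&read_done) is reached whenever
the outer while() exits, or even when it never runs at all, not only
when the producer has set "reader_finish" and is waiting in
wait_for_completion(&read_done). This is what goes wrong in both
scenarios.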
Root of the problem:
--------------------
ring_buffer_consumer() must complete "read_done" only when the
"reader_finish" variable is set. The completion must not be skipped
because of any other condition.

Note that we still must keep the check for "reader_finish" in a loop
because there might be spurious wakeups, as described in the scenarios
above.
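
For context, this is the producer-side part of the handshake, as quoted
in the scenarios above; the consumer may call complete(&read_done) only
once this code has set "reader_finish":

	/* producer side, after it leaves the producing do-while cycle */
	if (consumer) {
		reader_finish = 1;
		wake_up_process(consumer);
		wait_for_completion(&read_done);
	}

Any complete(&read_done) issued before this point leads to one of the
two BANGs above.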
Solution:
----------
The top level cycle in ring_buffer_consumer() now finishes only when
"reader_finish" is set. The data are read in a "while-do" cycle
(instead of the original "do-while") so that they are not read after
an error (kill_test == 1) or a spurious wake up.

In addition, "reader_finish" is manipulated by the producer thread.
Therefore we add READ_ONCE() to make sure that a fresh value is read in
each iteration. Also we add the corresponding barrier to synchronize
the sleep check.

Next, we set the state back to TASK_RUNNING for the situation where we
did not sleep.
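
Putting the hunks of the diff below together, the resulting consumer
loop has roughly this shape (again only a sketch, with the per-CPU
reading abbreviated to the same made-up read_all_cpus_once() helper):

	while (!READ_ONCE(reader_finish)) {
		int found = 1;

		while (found && !kill_test)
			found = read_all_cpus_once();	/* hypothetical helper */

		/* Wait till the producer wakes us up when there is more
		 * data available or when it wants us to finish reading. */
		set_current_state(TASK_INTERRUPTIBLE);
		if (reader_finish)
			break;

		schedule();
	}
	__set_current_state(TASK_RUNNING);
	reader_finish = 0;
	complete(&read_done);

Both exits from the outer cycle, the READ_ONCE() check at the top and
the break after set_current_state(), require that the producer has set
"reader_finish"; kill_test now only stops the reading, not the
handshake.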
Just to be on the safe side, we also initialize both completions
statically. This is safer in case there are other races that we are not
aware of.

As a side effect, we can remove the memory barrier from
ring_buffer_producer_thread(). IMHO, this initialization was the reason
for the barrier; ring_buffer_reset() uses spin locks that should
provide the needed memory barrier for using the buffer.
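
For completeness, the static initialization used below is the standard
<linux/completion.h> idiom; DECLARE_COMPLETION() defines the variable
already initialized, so no init_completion() call and no ordering
between the initializing and the using thread is needed:

#include <linux/completion.h>

/* Defined and initialized at compile time; equivalent to defining
 * "struct completion read_done;" and calling init_completion() on it
 * before any thread touches it.
 */
static DECLARE_COMPLETION(read_start);
static DECLARE_COMPLETION(read_done);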
Signed-off-by: Petr Mladek <pmladek@suse.com>
---
kernel/trace/ring_buffer_benchmark.c | 25 ++++++++++++++++---------
1 file changed, 16 insertions(+), 9 deletions(-)
diff --git a/kernel/trace/ring_buffer_benchmark.c b/kernel/trace/ring_buffer_benchmark.c
index a1503a027ee2..9ea7949366b3 100644
--- a/kernel/trace/ring_buffer_benchmark.c
+++ b/kernel/trace/ring_buffer_benchmark.c
@@ -24,8 +24,8 @@ struct rb_page {
 static int wakeup_interval = 100;
 
 static int reader_finish;
-static struct completion read_start;
-static struct completion read_done;
+static DECLARE_COMPLETION(read_start);
+static DECLARE_COMPLETION(read_done);
 
 static struct ring_buffer *buffer;
 static struct task_struct *producer;
@@ -178,10 +178,14 @@ static void ring_buffer_consumer(void)
 	read_events ^= 1;
 
 	read = 0;
-	while (!reader_finish && !kill_test) {
-		int found;
+	/*
+	 * Continue running until the producer specifically asks to stop
+	 * and is ready for the completion.
+	 */
+	while (!READ_ONCE(reader_finish)) {
+		int found = 1;
 
-		do {
+		while (found && !kill_test) {
 			int cpu;
 
 			found = 0;
@@ -195,17 +199,23 @@ static void ring_buffer_consumer(void)
 
 				if (kill_test)
 					break;
+
 				if (stat == EVENT_FOUND)
 					found = 1;
+
 			}
-		} while (found && !kill_test);
+		}
 
+		/* Wait till the producer wakes us up when there is more data
+		 * available or when the producer wants us to finish reading.
+		 */
 		set_current_state(TASK_INTERRUPTIBLE);
 		if (reader_finish)
 			break;
 
 		schedule();
 	}
+	__set_current_state(TASK_RUNNING);
 	reader_finish = 0;
 	complete(&read_done);
 }
@@ -389,13 +399,10 @@ static int ring_buffer_consumer_thread(void *arg)
 
 static int ring_buffer_producer_thread(void *arg)
 {
-	init_completion(&read_start);
-
 	while (!kthread_should_stop() && !kill_test) {
 		ring_buffer_reset(buffer);
 
 		if (consumer) {
-			smp_wmb();
 			wake_up_process(consumer);
 			wait_for_completion(&read_start);
 		}
--
1.8.5.6
Thread overview: 44+ messages
2015-09-21 13:03 [RFC v2 00/18] kthread: Use kthread worker API more widely Petr Mladek
2015-09-21 13:03 ` [RFC v2 01/18] kthread: Allow to call __kthread_create_on_node() with va_list args Petr Mladek
2015-09-21 13:03 ` [RFC v2 02/18] kthread: Add create_kthread_worker*() Petr Mladek
2015-09-22 18:20 ` Tejun Heo
2015-09-21 13:03 ` [RFC v2 03/18] kthread: Add drain_kthread_worker() Petr Mladek
2015-09-22 18:26 ` Tejun Heo
2015-09-21 13:03 ` [RFC v2 04/18] kthread: Add destroy_kthread_worker() Petr Mladek
2015-09-22 18:30 ` Tejun Heo
2015-09-21 13:03 ` [RFC v2 05/18] kthread: Add pending flag to kthread work Petr Mladek
2015-09-21 13:03 ` [RFC v2 06/18] kthread: Initial support for delayed " Petr Mladek
2015-09-21 13:03 ` [RFC v2 07/18] kthread: Allow to cancel " Petr Mladek
2015-09-22 19:35 ` Tejun Heo
2015-09-25 11:26 ` Petr Mladek
2015-09-28 17:03 ` Tejun Heo
2015-10-02 15:43 ` Petr Mladek
2015-10-02 19:24 ` Tejun Heo
2015-10-05 10:07 ` Petr Mladek
2015-10-05 11:09 ` Petr Mladek
2015-10-07 9:21 ` Petr Mladek
2015-10-07 14:24 ` Tejun Heo
2015-10-14 10:20 ` Petr Mladek
2015-10-14 17:30 ` Tejun Heo
2015-09-21 13:03 ` [RFC v2 08/18] kthread: Allow to modify delayed " Petr Mladek
2015-09-21 13:03 ` [RFC v2 09/18] mm/huge_page: Convert khugepaged() into kthread worker API Petr Mladek
2015-09-22 20:26 ` Tejun Heo
2015-09-23 9:50 ` Petr Mladek
2015-09-21 13:03 ` Petr Mladek [this message]
2015-09-21 13:03 ` [RFC v2 11/18] ring_buffer: Fix more races when terminating the producer in the benchmark Petr Mladek
2015-09-21 13:03 ` [RFC v2 12/18] ring_buffer: Convert benchmark kthreads into kthread worker API Petr Mladek
2015-09-21 13:03 ` [RFC v2 13/18] rcu: Finish folding ->fqs_state into ->gp_state Petr Mladek
2015-09-21 13:03 ` [RFC v2 14/18] rcu: Store first_gp_fqs into struct rcu_state Petr Mladek
2015-09-21 13:03 ` [RFC v2 15/18] rcu: Clean up timeouts for forcing the quiescent state Petr Mladek
2015-09-21 13:03 ` [RFC v2 16/18] rcu: Check actual RCU_GP_FLAG_FQS when handling " Petr Mladek
2015-09-21 13:03 ` [RFC v2 17/18] rcu: Convert RCU gp kthreads into kthread worker API Petr Mladek
2015-09-28 17:14 ` Paul E. McKenney
2015-10-01 15:43 ` Petr Mladek
2015-10-01 16:33 ` Paul E. McKenney
2015-09-21 13:03 ` [RFC v2 18/18] kthread: Better support freezable kthread workers Petr Mladek
2015-09-22 20:32 ` [RFC v2 00/18] kthread: Use kthread worker API more widely Tejun Heo
2015-09-30 5:08 ` Paul E. McKenney
2015-10-01 15:59 ` Petr Mladek
2015-10-01 17:00 ` Paul E. McKenney
2015-10-02 12:00 ` Petr Mladek
2015-10-02 13:59 ` Paul E. McKenney