From: Lai Jiangshan <laijs@cn.fujitsu.com>
To: Christoph Lameter <cl@gentwo.org>
Cc: akpm@linux-foundation.org, Gilad Ben-Yossef <gilad@benyossef.com>,
Thomas Gleixner <tglx@linutronix.de>, Tejun Heo <tj@kernel.org>,
John Stultz <johnstul@us.ibm.com>,
Mike Frysinger <vapier@gentoo.org>,
Minchan Kim <minchan.kim@gmail.com>,
Hakan Akkan <hakanakkan@gmail.com>,
Max Krasnyansky <maxk@qualcomm.com>,
Frederic Weisbecker <fweisbec@gmail.com>,
"Paul E. McKenney" <paulmck@linux.vnet.ibm.com>,
linux-kernel@vger.kernel.org, linux-mm@kvack.org,
hughd@google.com, viresh.kumar@linaro.org, hpa@zytor.com,
mingo@kernel.org, peterz@infradead.org
Subject: Re: vmstat: On demand vmstat workers V8
Date: Thu, 31 Jul 2014 08:52:22 +0800 [thread overview]
Message-ID: <53D99346.2080001@cn.fujitsu.com> (raw)
In-Reply-To: <alpine.DEB.2.11.1407300934410.4608@gentwo.org>
On 07/30/2014 10:45 PM, Christoph Lameter wrote:
> On Wed, 30 Jul 2014, Lai Jiangshan wrote:
>
>> I think the bug is here, it re-queues the per_cpu(vmstat_work, cpu) which is offline
>> (after vmstat_cpuup_callback(CPU_DOWN_PREPARE). And cpu_stat_off is accessed without
>> proper lock.
>
> Ok. I guess we need to make the preemption check output more information
> so that it tells us that an operation was performed on a processor that is
> down?
If the cpu_allows of the percpu-kworker is changed, the specific processor of the kworker
should have been down if workqueue is implemented correctly.
(the preemption check checks the cpu_allows)
>
>> I suggest to use get_cpu_online() or a new cpu_stat_off_mutex to protect it.
>
> If a processor is downed then cpu_stat_off bit should be cleared but also
> the worker thread should not run.
The kworker need to run for some reasons after the processor is down.
Peter and TJ were just discussing it.
The root cause (TO ME only) here is vmstat queues work to offline (or offlining) CPU,
so the kworker has to run it. We may add some check for queuing on offline CPU,
but we can't check for higher level user guarantees. (Example, vmstat can't queue
work to a CPU which is still online but after vmstat_cpuup_callback(CPU_DOWN_PREPARE)).
>
>>> case CPU_DOWN_PREPARE:
>>> case CPU_DOWN_PREPARE_FROZEN:
>>> - cancel_delayed_work_sync(&per_cpu(vmstat_work, cpu));
>>> - per_cpu(vmstat_work, cpu).work.func = NULL;
>>> + if (!cpumask_test_and_set_cpu(cpu, cpu_stat_off))
>>> + cancel_delayed_work_sync(&per_cpu(vmstat_work, cpu));
>>
>> It is suggest that cancel_delayed_work_sync(&per_cpu(vmstat_work, cpu)) should
>> be called unconditionally. And the cpu should be cleared from cpu_stat_off.
>> (you set it, it is BUG according to vmstat_shepherd() and the semantics of the
>> cpu_stat_off).
>
> True.
>
> Subject: vmstat ondemand: Fix online/offline races
>
> Do not allow onlining/offlining while the shepherd task is checking
> for vmstat threads.
>
> On offlining a processor do the right thing cancelling the vmstat
> worker thread if it exista and also exclude it from the shepherd
> process checks.
>
> Signed-off-by: Christoph Lameter <cl@linux.com>
>
> Index: linux/mm/vmstat.c
> ===================================================================
> --- linux.orig/mm/vmstat.c 2014-07-30 09:35:54.602662306 -0500
> +++ linux/mm/vmstat.c 2014-07-30 09:43:07.109037043 -0500
> @@ -1317,6 +1317,7 @@ static void vmstat_shepherd(struct work_
> {
> int cpu;
>
> + get_online_cpus();
> /* Check processors whose vmstat worker threads have been disabled */
> for_each_cpu(cpu, cpu_stat_off)
> if (need_update(cpu) &&
> @@ -1325,6 +1326,7 @@ static void vmstat_shepherd(struct work_
> schedule_delayed_work_on(cpu, &per_cpu(vmstat_work, cpu),
> __round_jiffies_relative(sysctl_stat_interval, cpu));
>
> + put_online_cpus();
>
> schedule_delayed_work(&shepherd,
> round_jiffies_relative(sysctl_stat_interval));
> @@ -1380,8 +1382,8 @@ static int vmstat_cpuup_callback(struct
> break;
> case CPU_DOWN_PREPARE:
> case CPU_DOWN_PREPARE_FROZEN:
> - if (!cpumask_test_and_set_cpu(cpu, cpu_stat_off))
> - cancel_delayed_work_sync(&per_cpu(vmstat_work, cpu));
> + cancel_delayed_work_sync(&per_cpu(vmstat_work, cpu));
> + cpumask_clear_cpu(cpu, cpu_stat_off);
Sasha Levin's test result?
> break;
> case CPU_DOWN_FAILED:
> case CPU_DOWN_FAILED_FROZEN:
> .
>
--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org. For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>
prev parent reply other threads:[~2014-07-31 0:51 UTC|newest]
Thread overview: 37+ messages / expand[flat|nested] mbox.gz Atom feed top
2014-07-10 14:04 Christoph Lameter
2014-07-11 13:20 ` Frederic Weisbecker
2014-07-11 13:56 ` Christoph Lameter
2014-07-11 13:58 ` Frederic Weisbecker
2014-07-11 15:17 ` Christoph Lameter
2014-07-11 15:19 ` Frederic Weisbecker
2014-07-11 15:22 ` Christoph Lameter
2014-07-14 20:10 ` Hugh Dickins
2014-07-14 20:51 ` Christoph Lameter
2014-07-30 3:04 ` Lai Jiangshan
2014-07-26 2:22 ` Sasha Levin
2014-07-28 18:55 ` Christoph Lameter
2014-07-28 21:54 ` Andrew Morton
2014-07-28 22:00 ` Sasha Levin
2014-07-29 15:17 ` Christoph Lameter
2014-07-29 7:56 ` Peter Zijlstra
2014-07-29 12:05 ` Tejun Heo
2014-07-29 12:23 ` Peter Zijlstra
2014-07-29 13:12 ` Tejun Heo
2014-07-29 15:10 ` Christoph Lameter
2014-07-29 15:14 ` Tejun Heo
2014-07-29 15:26 ` Christoph Lameter
2014-07-29 15:39 ` Christoph Lameter
2014-07-29 15:47 ` Sasha Levin
2014-07-29 15:59 ` Christoph Lameter
2014-07-30 3:11 ` Lai Jiangshan
2014-07-30 14:34 ` Christoph Lameter
2014-07-29 15:22 ` Christoph Lameter
2014-07-29 15:43 ` Sasha Levin
2014-08-04 21:37 ` Sasha Levin
2014-08-05 14:51 ` Christoph Lameter
2014-08-05 22:25 ` Sasha Levin
2014-08-06 14:12 ` Christoph Lameter
2014-08-07 1:50 ` Sasha Levin
2014-07-30 2:57 ` Lai Jiangshan
2014-07-30 14:45 ` Christoph Lameter
2014-07-31 0:52 ` Lai Jiangshan [this message]
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=53D99346.2080001@cn.fujitsu.com \
--to=laijs@cn.fujitsu.com \
--cc=akpm@linux-foundation.org \
--cc=cl@gentwo.org \
--cc=fweisbec@gmail.com \
--cc=gilad@benyossef.com \
--cc=hakanakkan@gmail.com \
--cc=hpa@zytor.com \
--cc=hughd@google.com \
--cc=johnstul@us.ibm.com \
--cc=linux-kernel@vger.kernel.org \
--cc=linux-mm@kvack.org \
--cc=maxk@qualcomm.com \
--cc=minchan.kim@gmail.com \
--cc=mingo@kernel.org \
--cc=paulmck@linux.vnet.ibm.com \
--cc=peterz@infradead.org \
--cc=tglx@linutronix.de \
--cc=tj@kernel.org \
--cc=vapier@gentoo.org \
--cc=viresh.kumar@linaro.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox