From: "yebin (H)" <yebin10@huawei.com>
To: Yury Norov <yury.norov@gmail.com>, Ye Bin <yebin@huaweicloud.com>
Cc: <dennis@kernel.org>, <tj@kernel.org>, <cl@linux.com>,
<linux-mm@kvack.org>, <andriy.shevchenko@linux.intel.com>,
<linux@rasmusvillemoes.dk>, <linux-kernel@vger.kernel.org>,
<dchinner@redhat.com>
Subject: Re: [PATCH 2/2] lib/percpu_counter: fix dying cpu compare race
Date: Tue, 4 Apr 2023 14:54:25 +0800
Message-ID: <642BC9A1.4040802@huawei.com>
In-Reply-To: <ZCuQhDLkRhJy081W@yury-laptop>
On 2023/4/4 10:50, Yury Norov wrote:
> On Tue, Apr 04, 2023 at 09:42:06AM +0800, Ye Bin wrote:
>> From: Ye Bin <yebin10@huawei.com>
>>
>> In commit 8b57b11cca88 ("pcpcntrs: fix dying cpu summation race") a race
>> condition between a cpu dying and percpu_counter_sum() iterating online CPUs
>> was identified.
>> Actually, the same race condition exists between a CPU dying and
>> __percpu_counter_compare(), which uses 'num_online_cpus()' for its quick check.
>> However, 'num_online_cpus()' is decremented before 'percpu_counter_cpu_dead()'
>> is called, so the quick check may return an incorrect result.
>> To solve this, the dying CPUs also need to be counted in the quick check
>> in __percpu_counter_compare().
> Not sure I completely understood the race you are describing. All CPU
> accounting is protected with percpu_counters_lock. Is it a real race
> that you've faced, or hypothetical? If it's real, can you share stack
> traces?
>
>> Signed-off-by: Ye Bin <yebin10@huawei.com>
>> ---
>> lib/percpu_counter.c | 11 ++++++++++-
>> 1 file changed, 10 insertions(+), 1 deletion(-)
>>
>> diff --git a/lib/percpu_counter.c b/lib/percpu_counter.c
>> index 5004463c4f9f..399840cb0012 100644
>> --- a/lib/percpu_counter.c
>> +++ b/lib/percpu_counter.c
>> @@ -227,6 +227,15 @@ static int percpu_counter_cpu_dead(unsigned int cpu)
>> return 0;
>> }
>>
>> +static __always_inline unsigned int num_count_cpus(void)
> This doesn't look like a good name. Maybe num_offline_cpus?
>
>> +{
>> +#ifdef CONFIG_HOTPLUG_CPU
>> + return (num_online_cpus() + num_dying_cpus());
> ^ ^
> 'return' is not a function. Braces are not needed
>
> Generally speaking, a sequence of atomic operations is not an atomic
> operation, so the above doesn't look correct. I don't think that it
> would be possible to implement raceless accounting based on 2 separate
> counters.
Yes, there is indeed a concurrency issue with doing so here. However, from
what I can see, the hotplug path first sets the CPU in dying_mask and only
then decrements the number of online CPUs. The sum may therefore be larger
than the actual value, in which case the comparison may fall back to the
slow path. But this won't cause any problems.
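
To illustrate the point above, here is a minimal userspace sketch of the quick check, not the kernel code itself. The names rough_compare() and NEED_PRECISE_SUM are hypothetical and exist only for this example. It assumes the fast path has the same shape as in the quoted diff: the rough count decides the comparison only when it differs from rhs by more than batch times the CPU count.

```c
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical marker: the rough count cannot decide, take the slow path. */
#define NEED_PRECISE_SUM 2

/*
 * Hypothetical userspace model of the quick check in
 * __percpu_counter_compare(). Returns 1 or -1 when the rough count alone
 * decides the comparison, NEED_PRECISE_SUM otherwise.
 */
static int rough_compare(int64_t count, int64_t rhs, int32_t batch,
			 unsigned int ncpus)
{
	/* Worst-case error of the rough count: one batch per counted CPU. */
	int64_t max_error = (int64_t)batch * ncpus;

	if (llabs(count - rhs) > max_error)
		return count > rhs ? 1 : -1;

	return NEED_PRECISE_SUM;
}
```

With batch = 32, a difference of 100 is decided on the fast path when 3 CPUs are counted (threshold 96), but not when 4 are counted (threshold 128). In this model, overestimating the CPU count only widens the threshold, so the cost is an extra trip through the slow path.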
>
> Most probably, you'd have to use the same approach as in 8b57b11cca88:
>
> lock();
> for_each_cpu_or(cpu, cpu_online_mask, cpu_dying_mask)
> cnt++;
> unlock();
>
> And if so, I'd suggest to implement cpumask_weight_or() for that.
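
For reference, a rough userspace sketch of what a cpumask_weight_or() helper would compute. This is an assumption-laden model, not a kernel implementation: the mask is modelled as a single 64-bit word (one bit per CPU), cpumask_model_t and mask_weight_or() are hypothetical names, and __builtin_popcountll() is a GCC/Clang builtin rather than a kernel API. Locking is left out entirely.

```c
#include <stdint.h>

/* Hypothetical model: a CPU mask as one 64-bit word, one bit per CPU. */
typedef uint64_t cpumask_model_t;

/*
 * Number of CPUs set in either mask. A CPU present in both masks is
 * counted once, unlike adding the two individual weights.
 */
static unsigned int mask_weight_or(cpumask_model_t a, cpumask_model_t b)
{
	return (unsigned int)__builtin_popcountll(a | b);
}
```

If CPU 3 is set in both the online and the dying mask (0xF and 0x8), the union has weight 4, whereas adding the two separate counts would give 5.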
>
>> +#else
>> + return num_online_cpus();
>> +#endif
>> +}
>> +
>> /*
>> * Compare counter against given value.
>> * Return 1 if greater, 0 if equal and -1 if less
>> @@ -237,7 +246,7 @@ int __percpu_counter_compare(struct percpu_counter *fbc, s64 rhs, s32 batch)
>>
>> count = percpu_counter_read(fbc);
>> /* Check to see if rough count will be sufficient for comparison */
>> - if (abs(count - rhs) > (batch * num_online_cpus())) {
>> + if (abs(count - rhs) > (batch * num_count_cpus())) {
>> if (count > rhs)
>> return 1;
>> else
>> --
>> 2.31.1
> .
>
Thread overview (11+ messages):
2023-04-04 1:42 [PATCH 0/2] " Ye Bin
2023-04-04 1:42 ` [PATCH 1/2] cpu/hotplug: introduce 'num_dying_cpus' to get dying CPUs count Ye Bin
2023-04-04 2:24 ` Yury Norov
2023-04-04 1:42 ` [PATCH 2/2] lib/percpu_counter: fix dying cpu compare race Ye Bin
2023-04-04 2:50 ` Yury Norov
2023-04-04 6:54 ` yebin (H) [this message]
2023-04-10 17:38 ` Yury Norov
2023-04-04 7:06 ` yebin (H)
2023-04-04 6:01 ` Dave Chinner
2023-04-04 6:40 ` yebin (H)
2023-04-04 2:11 ` [PATCH 0/2] " Yury Norov