Subject: Re: [PATCH 2/2] lib/percpu_counter: fix dying cpu compare race
To: Yury Norov, Ye Bin
References: <20230404014206.3752945-1-yebin@huaweicloud.com> <20230404014206.3752945-3-yebin@huaweicloud.com>
From: "yebin (H)"
Message-ID: <642BC9A1.4040802@huawei.com>
Date: Tue, 4 Apr 2023 14:54:25 +0800

On 2023/4/4 10:50, Yury Norov wrote:
> On Tue, Apr 04, 2023 at 09:42:06AM +0800, Ye Bin wrote:
>> From: Ye Bin
>>
>> In commit 8b57b11cca88 ("pcpcntrs: fix dying cpu summation race") a race
>> condition between a cpu dying and percpu_counter_sum() iterating online CPUs
>> was identified.
>> Actually, there's the same race condition between a cpu dying and
>> __percpu_counter_compare(). Here, use 'num_online_cpus()' for quick judgment.
>> But 'num_online_cpus()' is decreased before 'percpu_counter_cpu_dead()' is
>> called, so the comparison may return an incorrect result.
>> To solve the above issue, the dying CPUs count also needs to be added when
>> doing the quick judgment in __percpu_counter_compare().
> Not sure I completely understood the race you are describing. All CPU
> accounting is protected with percpu_counters_lock. Is it a real race
> that you've faced, or hypothetical? If it's real, can you share stack
> traces?
>
>> Signed-off-by: Ye Bin
>> ---
>>  lib/percpu_counter.c | 11 ++++++++++-
>>  1 file changed, 10 insertions(+), 1 deletion(-)
>>
>> diff --git a/lib/percpu_counter.c b/lib/percpu_counter.c
>> index 5004463c4f9f..399840cb0012 100644
>> --- a/lib/percpu_counter.c
>> +++ b/lib/percpu_counter.c
>> @@ -227,6 +227,15 @@ static int percpu_counter_cpu_dead(unsigned int cpu)
>>  	return 0;
>>  }
>>
>> +static __always_inline unsigned int num_count_cpus(void)
> This doesn't look like a good name. Maybe num_offline_cpus?
>
>> +{
>> +#ifdef CONFIG_HOTPLUG_CPU
>> +	return (num_online_cpus() + num_dying_cpus());
>              ^                                    ^
> 'return' is not a function. Braces are not needed
>
> Generally speaking, a sequence of atomic operations is not an atomic
> operation, so the above doesn't look correct. I don't think that it
> would be possible to implement raceless accounting based on 2 separate
> counters.
Yes, there is indeed a concurrency issue with doing so here. But from what I
saw, the hotplug process first sets the CPU in dying_mask and only then reduces
the number of online CPUs. So the total may be larger than the actual value, and
the comparison may fall back to the slow path, but this won't cause any problems.
>
> Most probably, you'd have to use the same approach as in 8b57b11cca88:
>
> 	lock();
> 	for_each_cpu_or(cpu, cpu_online_mask, cpu_dying_mask)
> 		cnt++;
> 	unlock();
>
> And if so, I'd suggest to implement cpumask_weight_or() for that.
>
>> +#else
>> +	return num_online_cpus();
>> +#endif
>> +}
>> +
>>  /*
>>   * Compare counter against given value.
>>   * Return 1 if greater, 0 if equal and -1 if less
>> @@ -237,7 +246,7 @@ int __percpu_counter_compare(struct percpu_counter *fbc, s64 rhs, s32 batch)
>>
>>  	count = percpu_counter_read(fbc);
>>  	/* Check to see if rough count will be sufficient for comparison */
>> -	if (abs(count - rhs) > (batch * num_online_cpus())) {
>> +	if (abs(count - rhs) > (batch * num_count_cpus())) {
>>  		if (count > rhs)
>>  			return 1;
>>  		else
>> --
>> 2.31.1
> .
>
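
To make the over-counting argument above concrete, here is a minimal userspace
sketch. It is not kernel code and not part of the patch: model_counter,
model_sum(), model_fast_path() and model_compare() are hypothetical names that
only mimic the shape of __percpu_counter_compare(), and the CPU bound is passed
in by hand instead of being read from the hotplug masks. It assumes each
per-CPU delta stays below the batch size.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define MODEL_MAX_CPUS 8

/* Hypothetical userspace stand-in for struct percpu_counter. */
struct model_counter {
	int64_t count;                   /* global, approximate value */
	int32_t pending[MODEL_MAX_CPUS]; /* per-CPU deltas not yet folded in */
	unsigned int nr_cpus;            /* CPUs that may hold a pending delta */
};

/* Slow path: fold in every per-CPU delta to get the exact value. */
static int64_t model_sum(const struct model_counter *c)
{
	int64_t sum = c->count;

	assert(c->nr_cpus <= MODEL_MAX_CPUS);
	for (unsigned int cpu = 0; cpu < c->nr_cpus; cpu++)
		sum += c->pending[cpu];
	return sum;
}

/* True if the rough count alone is trusted to decide the comparison. */
static bool model_fast_path(int64_t count, int64_t rhs, int32_t batch,
			    unsigned int cpu_bound)
{
	int64_t diff = count > rhs ? count - rhs : rhs - count;

	return diff > (int64_t)batch * cpu_bound;
}

/* Return 1 if greater, 0 if equal and -1 if less. */
static int model_compare(const struct model_counter *c, int64_t rhs,
			 int32_t batch, unsigned int cpu_bound)
{
	int64_t count = c->count;

	if (!model_fast_path(count, rhs, batch, cpu_bound))
		count = model_sum(c);	/* fall back to the precise sum */

	if (count > rhs)
		return 1;
	return count < rhs ? -1 : 0;
}
```

In this model, take a global count of 100, four CPUs each holding a pending
delta of 30, a batch of 32 and rhs = 200. A bound of 4 or 5 CPUs takes the slow
path and returns 1, which matches the exact sum of 220. A bound of 3 (one CPU
already dropped from the online count while its delta is still pending) takes
the fast path and returns -1. So an over-estimate only costs a slow-path sum,
while an under-estimate can give a wrong answer.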