Subject: Re: [PATCH 3/4] sched: fix sched_numa_find_nth_cpu() in CPU-less case
From: Yicong Yang
To: Yury Norov
CC: Ingo Molnar, Peter Zijlstra, Andrew Morton, Ben Segall,
 Daniel Bristot de Oliveira, Dietmar Eggemann, Jacob Keller, Jakub Kicinski,
 Juri Lelli, Mel Gorman, Peter Lafreniere, Steven Rostedt, Tariq Toukan,
 Valentin Schneider, Vincent Guittot, Andy Shevchenko, Rasmus Villemoes,
 Guenter Roeck
Date: Mon, 14 Aug 2023 16:43:14 +0800
Message-ID: <7c24e857-fe86-4c2a-68bc-58152bac1f39@huawei.com>
In-Reply-To: <20230810162442.9863-4-yury.norov@gmail.com>
References: <20230810162442.9863-1-yury.norov@gmail.com> <20230810162442.9863-4-yury.norov@gmail.com>

Hi Yury,

On
2023/8/11 0:24, Yury Norov wrote:
> When the node provided by user is CPU-less, corresponding record in
> sched_domains_numa_masks is not set. Trying to dereference it in the
> following code leads to kernel crash.
>
> To avoid it, start searching from the nearest node with CPUs.
>
> Fixes: cd7f55359c90 ("sched: add sched_numa_find_nth_cpu()")
> Reported-by: Yicong Yang
> Closes: https://lore.kernel.org/lkml/CAAH8bW8C5humYnfpW3y5ypwx0E-09A3QxFE1JFzR66v+mO4XfA@mail.gmail.com/T/
> Reported-by: Guenter Roeck
> Closes: https://lore.kernel.org/lkml/ZMHSNQfv39HN068m@yury-ThinkPad/T/#mf6431cb0b7f6f05193c41adeee444bc95bf2b1c4
> Signed-off-by: Yury Norov
> ---
>
> This has been discovered and fixed by Yicong Yang:
>
> https://lore.kernel.org/lkml/CAAH8bW8C5humYnfpW3y5ypwx0E-09A3QxFE1JFzR66v+mO4XfA@mail.gmail.com/T/
>
> When discovering Guenter's failure report for sparc64, I found it's due to
> the same problem. And while fixing, I found an opportunity to generalize
> nearest NUMA node search and avoid code duplication.
>
> Yicong, if you like this approach, please feel free to add your co-developed-by
> or any appropriate tags.
>

Looks fine to me. One nit below.

Reviewed-by: Yicong Yang

>  kernel/sched/topology.c | 6 +++++-
>  1 file changed, 5 insertions(+), 1 deletion(-)
>
> diff --git a/kernel/sched/topology.c b/kernel/sched/topology.c
> index d3a3b2646ec4..66b387172b6f 100644
> --- a/kernel/sched/topology.c
> +++ b/kernel/sched/topology.c
> @@ -2113,10 +2113,14 @@ static int hop_cmp(const void *a, const void *b)
>   */
>  int sched_numa_find_nth_cpu(const struct cpumask *cpus, int cpu, int node)
>  {
> -	struct __cmp_key k = { .cpus = cpus, .node = node, .cpu = cpu };
> +	struct __cmp_key k = { .cpus = cpus, .cpu = cpu };
>  	struct cpumask ***hop_masks;
>  	int hop, ret = nr_cpu_ids;
>
> +	/* CPU-less node entries are uninitialized in sched_domains_numa_masks */
> +	node = numa_nearest_node(node, N_CPU);
> +	k.node = node;
> +

We may also have a problem if node == NUMA_NO_NODE. Is it better to mention
this in the function comment, or to check for it before going on? Currently
this function is only used in cpumask_local_spread(), and that caller has
already checked it, but since this is an exported function somebody may call
it directly.

I am also wondering whether we should put this block under the protection of
rcu_read_lock() to guard against issues like hotplug. Is it possible for
@node to become CPU-less afterwards?

>  	rcu_read_lock();
>
>  	k.masks = rcu_dereference(sched_domains_numa_masks);
>
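
To make the NUMA_NO_NODE question concrete, here is a rough userspace sketch
of the kind of guard I have in mind. It is only a model, not kernel code:
NR_NODES, the node_has_cpu[] and node_distance[][] tables and
nearest_node_with_cpus() are hypothetical stand-ins invented for this
example, and the real numa_nearest_node() may well behave differently.

```c
#include <limits.h>
#include <stdbool.h>

#define NUMA_NO_NODE	(-1)
#define NR_NODES	4	/* hypothetical node count for this model */

/* Hypothetical topology: nodes 1 and 3 are CPU-less. */
static const bool node_has_cpu[NR_NODES] = { true, false, true, false };

/* Hypothetical symmetric distance table between nodes. */
static const int node_distance[NR_NODES][NR_NODES] = {
	{ 10, 20, 30, 40 },
	{ 20, 10, 40, 15 },
	{ 30, 40, 10, 20 },
	{ 40, 15, 20, 10 },
};

/*
 * Return @node if it has CPUs, otherwise the closest node that does.
 * An out-of-range node, including NUMA_NO_NODE, is rejected up front so
 * that it is never used as an array index.
 */
static int nearest_node_with_cpus(int node)
{
	int best = NUMA_NO_NODE;
	int best_dist = INT_MAX;

	if (node < 0 || node >= NR_NODES)
		return NUMA_NO_NODE;

	if (node_has_cpu[node])
		return node;

	for (int n = 0; n < NR_NODES; n++) {
		if (node_has_cpu[n] && node_distance[node][n] < best_dist) {
			best_dist = node_distance[node][n];
			best = n;
		}
	}

	return best;
}
```

With these made-up tables, a lookup for CPU-less node 1 falls back to node 0
and one for node 3 falls back to node 2, while NUMA_NO_NODE is handed back
unchanged for the caller to deal with. Whether the exported function should
check this itself or only document it is the open question above.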