Date: Tue, 21 Oct 2025 09:46:29 +0800
From: Chen Ridong
Subject: Re: [PATCH 12/33] sched/isolation: Convert housekeeping cpumasks to rcu pointers
To: Frederic Weisbecker, LKML
Cc: Michal Koutný, Andrew Morton, Bjorn Helgaas, Catalin Marinas,
 Danilo Krummrich, "David S . Miller", Eric Dumazet, Gabriele Monaco,
 Greg Kroah-Hartman, Ingo Molnar, Jakub Kicinski, Jens Axboe,
 Johannes Weiner, Lai Jiangshan, Marco Crivellari, Michal Hocko,
 Muchun Song, Paolo Abeni, Peter Zijlstra, Phil Auld,
 "Rafael J . Wysocki", Roman Gushchin, Shakeel Butt, Simon Horman,
 Tejun Heo, Thomas Gleixner, Vlastimil Babka, Waiman Long, Will Deacon,
 cgroups@vger.kernel.org, linux-arm-kernel@lists.infradead.org,
 linux-block@vger.kernel.org, linux-mm@kvack.org,
 linux-pci@vger.kernel.org, netdev@vger.kernel.org
References: <20251013203146.10162-1-frederic@kernel.org> <20251013203146.10162-13-frederic@kernel.org>
In-Reply-To: <20251013203146.10162-13-frederic@kernel.org>

On 2025/10/14 4:31, Frederic Weisbecker wrote:
> HK_TYPE_DOMAIN's cpumask will soon be made modifyable by cpuset.
> A synchronization mechanism is then needed to synchronize the updates
> with the housekeeping cpumask readers.
>
> Turn the housekeeping cpumasks into RCU pointers. Once a housekeeping
> cpumask will be modified, the update side will wait for an RCU grace
> period and propagate the change to interested subsystem when deemed
> necessary.
>
> Signed-off-by: Frederic Weisbecker
> ---
>  kernel/sched/isolation.c | 58 +++++++++++++++++++++++++---------------
>  kernel/sched/sched.h     |  1 +
>  2 files changed, 37 insertions(+), 22 deletions(-)
>
> diff --git a/kernel/sched/isolation.c b/kernel/sched/isolation.c
> index 8690fb705089..b46c20b5437f 100644
> --- a/kernel/sched/isolation.c
> +++ b/kernel/sched/isolation.c
> @@ -21,7 +21,7 @@ DEFINE_STATIC_KEY_FALSE(housekeeping_overridden);
>  EXPORT_SYMBOL_GPL(housekeeping_overridden);
>
>  struct housekeeping {
> -	cpumask_var_t cpumasks[HK_TYPE_MAX];
> +	struct cpumask __rcu *cpumasks[HK_TYPE_MAX];
>  	unsigned long flags;
>  };
>
> @@ -33,17 +33,28 @@ bool housekeeping_enabled(enum hk_type type)
>  }
>  EXPORT_SYMBOL_GPL(housekeeping_enabled);
>
> +const struct cpumask *housekeeping_cpumask(enum hk_type type)
> +{
> +	if (static_branch_unlikely(&housekeeping_overridden)) {
> +		if (housekeeping.flags & BIT(type)) {
> +			return rcu_dereference_check(housekeeping.cpumasks[type], 1);
> +		}
> +	}
> +	return cpu_possible_mask;
> +}
> +EXPORT_SYMBOL_GPL(housekeeping_cpumask);
> +
>  int housekeeping_any_cpu(enum hk_type type)
>  {
>  	int cpu;
>
>  	if (static_branch_unlikely(&housekeeping_overridden)) {
>  		if (housekeeping.flags & BIT(type)) {
> -			cpu = sched_numa_find_closest(housekeeping.cpumasks[type], smp_processor_id());
> +			cpu = sched_numa_find_closest(housekeeping_cpumask(type), smp_processor_id());
>  			if (cpu < nr_cpu_ids)
>  				return cpu;
>
> -			cpu = cpumask_any_and_distribute(housekeeping.cpumasks[type], cpu_online_mask);
> +			cpu = cpumask_any_and_distribute(housekeeping_cpumask(type), cpu_online_mask);
>  			if (likely(cpu < nr_cpu_ids))
>  				return cpu;
>  			/*
> @@ -59,28 +70,18 @@ int housekeeping_any_cpu(enum hk_type type)
>  }
>  EXPORT_SYMBOL_GPL(housekeeping_any_cpu);
>
> -const struct cpumask *housekeeping_cpumask(enum hk_type type)
> -{
> -	if (static_branch_unlikely(&housekeeping_overridden))
> -		if (housekeeping.flags & BIT(type))
> -			return housekeeping.cpumasks[type];
> -	return cpu_possible_mask;
> -}
> -EXPORT_SYMBOL_GPL(housekeeping_cpumask);
> -
>  void housekeeping_affine(struct task_struct *t, enum hk_type type)
>  {
>  	if (static_branch_unlikely(&housekeeping_overridden))
>  		if (housekeeping.flags & BIT(type))
> -			set_cpus_allowed_ptr(t, housekeeping.cpumasks[type]);
> +			set_cpus_allowed_ptr(t, housekeeping_cpumask(type));
>  }
>  EXPORT_SYMBOL_GPL(housekeeping_affine);
>
>  bool housekeeping_test_cpu(int cpu, enum hk_type type)
>  {
> -	if (static_branch_unlikely(&housekeeping_overridden))
> -		if (housekeeping.flags & BIT(type))
> -			return cpumask_test_cpu(cpu, housekeeping.cpumasks[type]);
> +	if (housekeeping.flags & BIT(type))
> +		return cpumask_test_cpu(cpu, housekeeping_cpumask(type));
>  	return true;
>  }
>  EXPORT_SYMBOL_GPL(housekeeping_test_cpu);
> @@ -96,20 +97,33 @@ void __init housekeeping_init(void)
>
>  	if (housekeeping.flags & HK_FLAG_KERNEL_NOISE)
>  		sched_tick_offload_init();
> -
> +	/*
> +	 * Realloc with a proper allocator so that any cpumask update
> +	 * can indifferently free the old version with kfree().
> +	 */
>  	for_each_set_bit(type, &housekeeping.flags, HK_TYPE_MAX) {
> +		struct cpumask *omask, *nmask = kmalloc(cpumask_size(), GFP_KERNEL);
> +
> +		if (WARN_ON_ONCE(!nmask))
> +			return;
> +
> +		omask = rcu_dereference(housekeeping.cpumasks[type]);
> +
>  		/* We need at least one CPU to handle housekeeping work */
> -		WARN_ON_ONCE(cpumask_empty(housekeeping.cpumasks[type]));
> +		WARN_ON_ONCE(cpumask_empty(omask));
> +		cpumask_copy(nmask, omask);
> +		RCU_INIT_POINTER(housekeeping.cpumasks[type], nmask);
> +		memblock_free(omask, cpumask_size());
>  	}
>  }
>
>  static void __init housekeeping_setup_type(enum hk_type type,
>  					   cpumask_var_t housekeeping_staging)
>  {
> +	struct cpumask *mask = memblock_alloc_or_panic(cpumask_size(), SMP_CACHE_BYTES);
>
> -	alloc_bootmem_cpumask_var(&housekeeping.cpumasks[type]);
> -	cpumask_copy(housekeeping.cpumasks[type],
> -		     housekeeping_staging);
> +	cpumask_copy(mask, housekeeping_staging);
> +	RCU_INIT_POINTER(housekeeping.cpumasks[type], mask);
>  }
>
>  static int __init housekeeping_setup(char *str, unsigned long flags)
> @@ -162,7 +176,7 @@ static int __init housekeeping_setup(char *str, unsigned long flags)
>
>  	for_each_set_bit(type, &iter_flags, HK_TYPE_MAX) {
>  		if (!cpumask_equal(housekeeping_staging,
> -				   housekeeping.cpumasks[type])) {
> +				   housekeeping_cpumask(type))) {
>  			pr_warn("Housekeeping: nohz_full= must match isolcpus=\n");
>  			goto free_housekeeping_staging;
>  		}
> diff --git a/kernel/sched/sched.h b/kernel/sched/sched.h
> index 1f5d07067f60..0c0ef8999fd6 100644
> --- a/kernel/sched/sched.h
> +++ b/kernel/sched/sched.h
> @@ -42,6 +42,7 @@
>  #include
>  #include
>  #include
> +#include
>  #include
>  #include
>  #include

A warning was detected:

=============================
WARNING: suspicious RCU usage
6.17.0-next-20251009-00033-g4444da88969b #808 Not tainted
-----------------------------
kernel/sched/isolation.c:60 suspicious rcu_dereference_check() usage!

other info that might help us debug this:

rcu_scheduler_active = 2, debug_locks = 1
1 lock held by swapper/0/1:
 #0: ffff888100600ce0 (&type->i_mutex_dir_key#3){++++}-{4:4}, at: walk_compone

stack backtrace:
CPU: 3 UID: 0 PID: 1 Comm: swapper/0 Not tainted 6.17.0-next-20251009-00033-g4
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.16.0-0-gd239
Call Trace:
 dump_stack_lvl+0x68/0xa0
 lockdep_rcu_suspicious+0x148/0x1b0
 housekeeping_cpumask+0xaa/0xb0
 housekeeping_test_cpu+0x25/0x40
 find_get_block_common+0x41/0x3e0
 bdev_getblk+0x28/0xa0
 ext4_getblk+0xba/0x2d0
 ext4_bread_batch+0x56/0x170
 __ext4_find_entry+0x17c/0x410
 ? lock_release+0xc6/0x290
 ext4_lookup+0x7a/0x1d0
 __lookup_slow+0xf9/0x1b0
 walk_component+0xe0/0x150
 link_path_walk+0x201/0x3e0
 path_openat+0xb1/0xb30
 ? stack_depot_save_flags+0x41e/0xa00
 do_filp_open+0xbc/0x170
 ? _raw_spin_unlock_irqrestore+0x2c/0x50
 ? __create_object+0x59/0x80
 ? trace_kmem_cache_alloc+0x1d/0xa0
 ? vprintk_emit+0x2b2/0x360
 do_open_execat+0x56/0x100
 alloc_bprm+0x1a/0x200
 ? __pfx_kernel_init+0x10/0x10
 kernel_execve+0x4b/0x160
 kernel_init+0xe5/0x1c0
 ret_from_fork+0x185/0x1d0
 ? __pfx_kernel_init+0x10/0x10
 ret_from_fork_asm+0x1a/0x30
random: crng init done

-- 
Best regards,
Ridong