Date: Tue, 4 Mar 2025 06:55:33 -0800
From: "Paul E. McKenney"
To: Joel Fernandes
Cc: "Uladzislau Rezki (Sony)", linux-mm@kvack.org, Andrew Morton, Vlastimil Babka, RCU, LKML, Christoph Lameter, Pekka Enberg, David Rientjes, Joonsoo Kim, Roman Gushchin, Hyeonggon Yoo <42.hyeyoo@gmail.com>, Oleksiy Avramchenko, stable@vger.kernel.org, Greg Kroah-Hartman, Keith Busch
Subject: Re: [PATCH v1 2/2] mm/slab/kvfree_rcu: Switch to WQ_MEM_RECLAIM wq
Message-ID: <14b61981-35ae-4f87-8341-b8d484123e56@paulmck-laptop>
Reply-To: paulmck@kernel.org
References: <20250228121356.336871-1-urezki@gmail.com> <20250228121356.336871-2-urezki@gmail.com> <20250303160824.GA22541@joelnvbox>
In-Reply-To: <20250303160824.GA22541@joelnvbox>

On Mon, Mar 03, 2025 at 11:08:24AM -0500, Joel Fernandes wrote:
> On Fri, Feb 28, 2025 at 01:13:56PM +0100, Uladzislau Rezki (Sony) wrote:
> > Currently kvfree_rcu() APIs use a system workqueue which is
> > "system_unbound_wq" to driver RCU machinery to reclaim a memory.
> >
> > Recently, it has been noted that the following kernel warning can
> > be observed:
> >
> > workqueue: WQ_MEM_RECLAIM nvme-wq:nvme_scan_work is flushing !WQ_MEM_RECLAIM events_unbound:kfree_rcu_work
> > WARNING: CPU: 21 PID: 330 at kernel/workqueue.c:3719 check_flush_dependency+0x112/0x120
> > Modules linked in: intel_uncore_frequency(E) intel_uncore_frequency_common(E) skx_edac(E) ...
> > CPU: 21 UID: 0 PID: 330 Comm: kworker/u144:6 Tainted: G E 6.13.2-0_g925d379822da #1
> > Hardware name: Wiwynn Twin Lakes MP/Twin Lakes Passive MP, BIOS YMM20 02/01/2023
> > Workqueue: nvme-wq nvme_scan_work
> > RIP: 0010:check_flush_dependency+0x112/0x120
> > Code: 05 9a 40 14 02 01 48 81 c6 c0 00 00 00 48 8b 50 18 48 81 c7 c0 00 00 00 48 89 f9 48 ...
> > RSP: 0018:ffffc90000df7bd8 EFLAGS: 00010082
> > RAX: 000000000000006a RBX: ffffffff81622390 RCX: 0000000000000027
> > RDX: 00000000fffeffff RSI: 000000000057ffa8 RDI: ffff88907f960c88
> > RBP: 0000000000000000 R08: ffffffff83068e50 R09: 000000000002fffd
> > R10: 0000000000000004 R11: 0000000000000000 R12: ffff8881001a4400
> > R13: 0000000000000000 R14: ffff88907f420fb8 R15: 0000000000000000
> > FS: 0000000000000000(0000) GS:ffff88907f940000(0000) knlGS:0000000000000000
> > CR2: 00007f60c3001000 CR3: 000000107d010005 CR4: 00000000007726f0
> > PKRU: 55555554
> > Call Trace:
> > ? __warn+0xa4/0x140
> > ? check_flush_dependency+0x112/0x120
> > ? report_bug+0xe1/0x140
> > ? check_flush_dependency+0x112/0x120
> > ? handle_bug+0x5e/0x90
> > ? exc_invalid_op+0x16/0x40
> > ? asm_exc_invalid_op+0x16/0x20
> > ? timer_recalc_next_expiry+0x190/0x190
> > ? check_flush_dependency+0x112/0x120
> > ? check_flush_dependency+0x112/0x120
> > __flush_work.llvm.1643880146586177030+0x174/0x2c0
> > flush_rcu_work+0x28/0x30
> > kvfree_rcu_barrier+0x12f/0x160
> > kmem_cache_destroy+0x18/0x120
> > bioset_exit+0x10c/0x150
> > disk_release.llvm.6740012984264378178+0x61/0xd0
> > device_release+0x4f/0x90
> > kobject_put+0x95/0x180
> > nvme_put_ns+0x23/0xc0
> > nvme_remove_invalid_namespaces+0xb3/0xd0
> > nvme_scan_work+0x342/0x490
> > process_scheduled_works+0x1a2/0x370
> > worker_thread+0x2ff/0x390
> > ? pwq_release_workfn+0x1e0/0x1e0
> > kthread+0xb1/0xe0
> > ? __kthread_parkme+0x70/0x70
> > ret_from_fork+0x30/0x40
> > ? __kthread_parkme+0x70/0x70
> > ret_from_fork_asm+0x11/0x20
> > ---[ end trace 0000000000000000 ]---
> >
> > To address this switch to use of independent WQ_MEM_RECLAIM
> > workqueue, so the rules are not violated from workqueue framework
> > point of view.
> >
> > Apart of that, since kvfree_rcu() does reclaim memory it is worth
> > to go with WQ_MEM_RECLAIM type of wq because it is designed for
> > this purpose.
> >
> > Cc:
> > Cc: Greg Kroah-Hartman
> > Cc: Keith Busch
> > Closes: https://www.spinics.net/lists/kernel/msg5563270.html
> > Fixes: 6c6c47b063b5 ("mm, slab: call kvfree_rcu_barrier() from kmem_cache_destroy()"),
> > Reported-by: Keith Busch
> > Signed-off-by: Uladzislau Rezki (Sony)
>
> BTW, there is a path in RCU-tasks that involves queuing work on system_wq
> which is !WQ_RECLAIM. While I don't anticipate an issue such as the one fixed
> by this patch, I am wondering if we should move these to their own WQ_RECLAIM
> queues for added robustness since otherwise that will result in CB invocation
> (And thus memory freeing delays). Paul?

For RCU Tasks, the memory traffic has been much lower. But maybe someday
someone will drop a million trampolines all at once. But let's see that
problem before we fix some random problem that we believe will happen,
but which proves to be only slightly related to the problem that actually
does happen. ;-)

Thanx, Paul

> kernel/rcu/tasks.h: queue_work_on(cpuwq, system_wq, &rtpcp_next->rtp_work);
> kernel/rcu/tasks.h: queue_work_on(cpuwq, system_wq, &rtpcp_next->rtp_work);
>
> For this patch:
> Reviewed-by: Joel Fernandes
>
> thanks,
>
> - Joel
>
>
> > ---
> >  mm/slab_common.c | 14 ++++++++++----
> >  1 file changed, 10 insertions(+), 4 deletions(-)
> >
> > diff --git a/mm/slab_common.c b/mm/slab_common.c
> > index 4030907b6b7d..4c9f0a87f733 100644
> > --- a/mm/slab_common.c
> > +++ b/mm/slab_common.c
> > @@ -1304,6 +1304,8 @@ module_param(rcu_min_cached_objs, int, 0444);
> >  static int rcu_delay_page_cache_fill_msec = 5000;
> >  module_param(rcu_delay_page_cache_fill_msec, int, 0444);
> >
> > +static struct workqueue_struct *rcu_reclaim_wq;
> > +
> >  /* Maximum number of jiffies to wait before draining a batch. */
> >  #define KFREE_DRAIN_JIFFIES (5 * HZ)
> >  #define KFREE_N_BATCHES 2
> > @@ -1632,10 +1634,10 @@ __schedule_delayed_monitor_work(struct kfree_rcu_cpu *krcp)
> >  	if (delayed_work_pending(&krcp->monitor_work)) {
> >  		delay_left = krcp->monitor_work.timer.expires - jiffies;
> >  		if (delay < delay_left)
> > -			mod_delayed_work(system_unbound_wq, &krcp->monitor_work, delay);
> > +			mod_delayed_work(rcu_reclaim_wq, &krcp->monitor_work, delay);
> >  		return;
> >  	}
> > -	queue_delayed_work(system_unbound_wq, &krcp->monitor_work, delay);
> > +	queue_delayed_work(rcu_reclaim_wq, &krcp->monitor_work, delay);
> >  }
> >
> >  static void
> > @@ -1733,7 +1735,7 @@ kvfree_rcu_queue_batch(struct kfree_rcu_cpu *krcp)
> >  			// "free channels", the batch can handle. Break
> >  			// the loop since it is done with this CPU thus
> >  			// queuing an RCU work is _always_ success here.
> > -			queued = queue_rcu_work(system_unbound_wq, &krwp->rcu_work);
> > +			queued = queue_rcu_work(rcu_reclaim_wq, &krwp->rcu_work);
> >  			WARN_ON_ONCE(!queued);
> >  			break;
> >  		}
> > @@ -1883,7 +1885,7 @@ run_page_cache_worker(struct kfree_rcu_cpu *krcp)
> >  	if (rcu_scheduler_active == RCU_SCHEDULER_RUNNING &&
> >  			!atomic_xchg(&krcp->work_in_progress, 1)) {
> >  		if (atomic_read(&krcp->backoff_page_cache_fill)) {
> > -			queue_delayed_work(system_unbound_wq,
> > +			queue_delayed_work(rcu_reclaim_wq,
> >  				&krcp->page_cache_work,
> >  					msecs_to_jiffies(rcu_delay_page_cache_fill_msec));
> >  		} else {
> > @@ -2120,6 +2122,10 @@ void __init kvfree_rcu_init(void)
> >  	int i, j;
> >  	struct shrinker *kfree_rcu_shrinker;
> >
> > +	rcu_reclaim_wq = alloc_workqueue("kvfree_rcu_reclaim",
> > +			WQ_UNBOUND | WQ_MEM_RECLAIM, 0);
> > +	WARN_ON(!rcu_reclaim_wq);
> > +
> >  /* Clamp it to [0:100] seconds interval. */
> >  	if (rcu_delay_page_cache_fill_msec < 0 ||
> >  		rcu_delay_page_cache_fill_msec > 100 * MSEC_PER_SEC) {
> > --
> > 2.39.5
> >
>
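
For readers following the thread, the rule behind the quoted warning
("WQ_MEM_RECLAIM nvme-wq:nvme_scan_work is flushing !WQ_MEM_RECLAIM
events_unbound:kfree_rcu_work") can be modeled in user space. What follows
is only a minimal sketch and not the kernel's code: struct model_wq,
MODEL_WQ_MEM_RECLAIM, model_flush_would_warn() and model_example() are
hypothetical names made up for illustration, and the real check in
kernel/workqueue.c may apply further conditions that this model omits.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical flag value for this model; the kernel's real value differs. */
#define MODEL_WQ_MEM_RECLAIM (1u << 0)

/* Minimal stand-in for a workqueue: just a name and its flags. */
struct model_wq {
	const char *name;
	unsigned int flags;
};

/*
 * Returns true when a flush would trip the dependency warning in this model.
 * flusher: workqueue whose worker issues the flush, or NULL when the caller
 *          is not a workqueue worker at all.
 * target:  workqueue that owns the work item being flushed.
 */
bool model_flush_would_warn(const struct model_wq *flusher,
			    const struct model_wq *target)
{
	/* Waiting on a WQ_MEM_RECLAIM target is always acceptable. */
	if (target->flags & MODEL_WQ_MEM_RECLAIM)
		return false;

	/* A reclaim-capable flusher must not wait on a !WQ_MEM_RECLAIM target. */
	return flusher != NULL && (flusher->flags & MODEL_WQ_MEM_RECLAIM) != 0;
}

/* Replays the situation from the warning, before and after the patch. */
void model_example(void)
{
	struct model_wq nvme_wq = { "nvme-wq", MODEL_WQ_MEM_RECLAIM };
	struct model_wq events_unbound = { "events_unbound", 0 };
	struct model_wq kvfree_rcu_reclaim = { "kvfree_rcu_reclaim",
					       MODEL_WQ_MEM_RECLAIM };

	/* Before: nvme_scan_work flushes kfree_rcu_work on events_unbound. */
	assert(model_flush_would_warn(&nvme_wq, &events_unbound));

	/* After: the flushed work lives on a dedicated WQ_MEM_RECLAIM wq. */
	assert(!model_flush_would_warn(&nvme_wq, &kvfree_rcu_reclaim));
}
```

In this model the same reasoning covers the RCU Tasks question raised
above: work queued on system_wq would only trip the check if some
WQ_MEM_RECLAIM worker ended up flushing it.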