Message-ID: <0b6763fa-6225-436a-a24a-c6b4029c0d68@arm.com>
Date: Thu, 9 Jan 2025 10:46:08 +0530
Subject: Re: [PATCH v3] vmstat: disable vmstat_work on vmstat_cpu_down_prep()
To: Koichiro Den, linux-mm@kvack.org
Cc: akpm@linux-foundation.org, lorenzo.stoakes@oracle.com,
 chenhuacai@kernel.org, linux-kernel@vger.kernel.org
References: <20250108042807.3429745-1-koichiro.den@canonical.com>
From: Anshuman Khandual
In-Reply-To: <20250108042807.3429745-1-koichiro.den@canonical.com>

On 1/8/25 09:58, Koichiro Den wrote:
> The upstream commit adcfb264c3ed ("vmstat: disable vmstat_work on
> vmstat_cpu_down_prep()") introduced another warning during the boot phase

I did observe this warning during a boot on arm64 and a quick git bisect
also pointed to the above mentioned commit. But seems like you are already
on this problem.

[    0.092532] ------------[ cut here ]------------
[    0.092534] workqueue: work disable count underflowed
[    0.092540] WARNING: CPU: 1 PID: 21 at kernel/workqueue.c:4313 enable_work+0xe8/0x100
[    0.092550] Modules linked in:
[    0.092554] CPU: 1 UID: 0 PID: 21 Comm: cpuhp/1 Not tainted 6.13.0-rc4-00018-gadcfb264c3ed #11
[    0.092558] Hardware name: linux,dummy-virt (DT)
[    0.092560] pstate: 600000c5 (nZCv daIF -PAN -UAO -TCO -DIT -SSBS BTYPE=--)
[    0.092564] pc : enable_work+0xe8/0x100
[    0.092566] lr : enable_work+0xe8/0x100
[    0.092569] sp : ffff800082de3d50
[    0.092570] x29: ffff800082de3d50 x28: 0000000000000000 x27: 0000000000000000
[    0.092574] x26: 0000000000000000 x25: 0000000000000000 x24: ffff00008001a1d0
[    0.092578] x23: 0000000000000001 x22: 0000000000000001 x21: ffff0001ff2de110
[    0.092582] x20: ffff8000827506a0 x19: ffff0001ff2e4d28 x18: 0000000000000038
[    0.092585] x17: 000000040044ffff x16: 00500074b5503510 x15: fffffffffffe1698
[    0.092589] x14: ffff8000827547d8 x13: 00000000000000ff x12: 0000000000000055
[    0.092592] x11: fffffffffffe1698 x10: ffff8000827ac7d8 x9 : 00000000fffff000
[    0.092596] x8 : ffff8000827547d8 x7 : ffff8000827ac7d8 x6 : 0000000000000000
[    0.092599] x5 : 80000000fffff000 x4 : 000000000000aff5 x3 : 0000000000000000
[    0.092603] x2 : 0000000000000000 x1 : 0000000000000000 x0 : ffff000080270000
[    0.092607] Call trace:
[    0.092608]  enable_work+0xe8/0x100 (P)
[    0.092612]  enable_delayed_work+0x10/0x1c
[    0.092615]  vmstat_cpu_online+0x84/0xb8
[    0.092621]  cpuhp_invoke_callback+0x104/0x20c
[    0.092626]  cpuhp_thread_fun+0xa4/0x17c
[    0.092629]  smpboot_thread_fn+0x220/0x24c
[    0.092635]  kthread+0x118/0x11c
[    0.092640]  ret_from_fork+0x10/0x20
[    0.092645] ---[ end trace 0000000000000000 ]---

> so was soon reverted on upstream by commit cd6313beaeae ("Revert "vmstat:
> disable vmstat_work on vmstat_cpu_down_prep()""). This commit resolves it
> and reattempts the original fix.
>
> Even after mm/vmstat:online teardown, shepherd may still queue work for
> the dying cpu until the cpu is removed from online mask. While it's quite
> rare, this means that after unbind_workers() unbinds a per-cpu kworker, it
> potentially runs vmstat_update for the dying CPU on an irrelevant cpu
> before entering atomic AP states. When CONFIG_DEBUG_PREEMPT=y, it results
> in the following error with the backtrace.
>
>   BUG: using smp_processor_id() in preemptible [00000000] code: \
>                                                kworker/7:3/1702
>   caller is refresh_cpu_vm_stats+0x235/0x5f0
>   CPU: 0 UID: 0 PID: 1702 Comm: kworker/7:3 Tainted: G
>   Tainted: [N]=TEST
>   Workqueue: mm_percpu_wq vmstat_update
>   Call Trace:
>    <TASK>
>    dump_stack_lvl+0x8d/0xb0
>    check_preemption_disabled+0xce/0xe0
>    refresh_cpu_vm_stats+0x235/0x5f0
>    vmstat_update+0x17/0xa0
>    process_one_work+0x869/0x1aa0
>    worker_thread+0x5e5/0x1100
>    kthread+0x29e/0x380
>    ret_from_fork+0x2d/0x70
>    ret_from_fork_asm+0x1a/0x30
>    </TASK>
>
> So, for mm/vmstat:online, disable vmstat_work reliably on teardown and
> symmetrically enable it on startup.
>
> For secondary CPUs during CPU hotplug scenarios, ensure the delayed work
> is disabled immediately after the initialization. These CPUs are not yet
> online when start_shepherd_timer() runs on boot CPU. vmstat_cpu_online()
> will enable the work for them.
>
> Suggested-by: Huacai Chen
> Signed-off-by: Huacai Chen
> Signed-off-by: Koichiro Den
> ---
> v2: https://lore.kernel.org/all/20241221033321.4154409-1-koichiro.den@canonical.com/
> v1: https://lore.kernel.org/all/20241220134234.3809621-1-koichiro.den@canonical.com/
> ---
>  mm/vmstat.c | 15 +++++++++++++--
>  1 file changed, 13 insertions(+), 2 deletions(-)
>
> diff --git a/mm/vmstat.c b/mm/vmstat.c
> index 4d016314a56c..16bfe1c694dd 100644
> --- a/mm/vmstat.c
> +++ b/mm/vmstat.c
> @@ -2122,10 +2122,20 @@ static void __init start_shepherd_timer(void)
>  {
>  	int cpu;
>
> -	for_each_possible_cpu(cpu)
> +	for_each_possible_cpu(cpu) {
>  		INIT_DEFERRABLE_WORK(per_cpu_ptr(&vmstat_work, cpu),
>  			vmstat_update);
>
> +		/*
> +		 * For secondary CPUs during CPU hotplug scenarios,
> +		 * vmstat_cpu_online() will enable the work.
> +		 * mm/vmstat:online enables and disables vmstat_work
> +		 * symmetrically during CPU hotplug events.
> +		 */
> +		if (!cpu_online(cpu))
> +			disable_delayed_work_sync(&per_cpu(vmstat_work, cpu));
> +	}
> +
>  	schedule_delayed_work(&shepherd,
>  		round_jiffies_relative(sysctl_stat_interval));
>  }
> @@ -2148,13 +2158,14 @@ static int vmstat_cpu_online(unsigned int cpu)
>  	if (!node_state(cpu_to_node(cpu), N_CPU)) {
>  		node_set_state(cpu_to_node(cpu), N_CPU);
>  	}
> +	enable_delayed_work(&per_cpu(vmstat_work, cpu));
>
>  	return 0;
>  }
>
>  static int vmstat_cpu_down_prep(unsigned int cpu)
>  {
> -	cancel_delayed_work_sync(&per_cpu(vmstat_work, cpu));
> +	disable_delayed_work_sync(&per_cpu(vmstat_work, cpu));
>  	return 0;
>  }
>
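
For anyone following along, here is a rough userspace sketch of why the
enable in vmstat_cpu_online() needs a matching disable at init time. It is
only a toy model of the disable count that the warning above refers to, not
the workqueue code itself: struct toy_work, toy_disable(), toy_enable() and
bring_up() are all made-up names, and the real accounting in
kernel/workqueue.c is more involved.

```c
#include <stdbool.h>

/* Hypothetical stand-in for a work item; only tracks the disable depth. */
struct toy_work {
	int disable_depth;	/* 0 means the work may be queued */
};

/* Models a disable call: each one must later be paired with an enable. */
static void toy_disable(struct toy_work *w)
{
	w->disable_depth++;
}

/*
 * Models an enable call. Returns false when there is no outstanding
 * disable to undo, which is the "disable count underflowed" situation.
 */
static bool toy_enable(struct toy_work *w)
{
	if (w->disable_depth == 0)
		return false;
	w->disable_depth--;
	return true;
}

/*
 * Models one secondary CPU coming online after init.
 * disabled_at_init == true corresponds to the v3 change that disables
 * the work for CPUs that are not yet online in start_shepherd_timer().
 */
static bool bring_up(bool disabled_at_init)
{
	struct toy_work w = { .disable_depth = 0 };

	if (disabled_at_init)
		toy_disable(&w);

	/* Stands in for the enable added to vmstat_cpu_online(). */
	return toy_enable(&w);
}
```

In this model bring_up(false) returns false (an enable with nothing to
undo, as in the trace above), while bring_up(true) returns true because the
init-time disable balances the later enable.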