Date: Wed, 2 Aug 2023 15:40:19 -0700
From: "Paul E. McKenney"
To: thunder.leizhen@huaweicloud.com
Cc: Christoph Lameter, Pekka Enberg, David Rientjes, Joonsoo Kim,
	Andrew Morton, Vlastimil Babka, Roman Gushchin,
	Hyeonggon Yoo <42.hyeyoo@gmail.com>, linux-mm@kvack.org,
	Frederic Weisbecker, Neeraj Upadhyay, Joel Fernandes, Josh Triplett,
	Boqun Feng, Steven Rostedt, Mathieu Desnoyers, Lai Jiangshan, Zqiang,
	rcu@vger.kernel.org, linux-kernel@vger.kernel.org, Zhen Lei
Subject: Re: [PATCH v4 2/2] rcu: Dump memory object info if callback function is invalid
Message-ID: <06731ba9-0746-453e-bd1f-b857bd253543@paulmck-laptop>
Reply-To: paulmck@kernel.org
References: <20230802130918.1132-1-thunder.leizhen@huaweicloud.com>
	<20230802130918.1132-3-thunder.leizhen@huaweicloud.com>
In-Reply-To: <20230802130918.1132-3-thunder.leizhen@huaweicloud.com>

On Wed, Aug 02, 2023 at 09:09:18PM +0800, thunder.leizhen@huaweicloud.com wrote:
> From: Zhen Lei
>
> When a structure containing an RCU callback rhp is (incorrectly) freed
> and reallocated after rhp is passed to call_rcu(), it is not unusual for
> rhp->func to be set to NULL. This defeats the debugging prints used by
> __call_rcu_common() in kernels built with CONFIG_DEBUG_OBJECTS_RCU_HEAD=y,
> which expect to identify the offending code using the identity of this
> function.
>
> And in kernels built without CONFIG_DEBUG_OBJECTS_RCU_HEAD=y, things
> are even worse, as can be seen from this splat:
>
> Unable to handle kernel NULL pointer dereference at virtual address 0
> ... ...
> PC is at 0x0
> LR is at rcu_do_batch+0x1c0/0x3b8
> ... ...
> (rcu_do_batch) from (rcu_core+0x1d4/0x284)
> (rcu_core) from (__do_softirq+0x24c/0x344)
> (__do_softirq) from (__irq_exit_rcu+0x64/0x108)
> (__irq_exit_rcu) from (irq_exit+0x8/0x10)
> (irq_exit) from (__handle_domain_irq+0x74/0x9c)
> (__handle_domain_irq) from (gic_handle_irq+0x8c/0x98)
> (gic_handle_irq) from (__irq_svc+0x5c/0x94)
> (__irq_svc) from (arch_cpu_idle+0x20/0x3c)
> (arch_cpu_idle) from (default_idle_call+0x4c/0x78)
> (default_idle_call) from (do_idle+0xf8/0x150)
> (do_idle) from (cpu_startup_entry+0x18/0x20)
> (cpu_startup_entry) from (0xc01530)
>
> This commit therefore adds calls to mem_dump_obj(rhp) to output some
> information, for example:
>
> slab kmalloc-256 start ffff410c45019900 pointer offset 0 size 256
>
> This provides the rough size of the memory block and the offset of the
> rcu_head structure, which at least provides a few clues to help
> locate the problem. If the problem is reproducible, additional slab
> debugging can be enabled, for example, CONFIG_DEBUG_SLAB=y, which can
> provide significantly more information.
>
> Signed-off-by: Zhen Lei

Looks plausible, thank you!

What did you do to test this? One option is the object_debug module
parameter to rcutorture, which is described here:

https://paulmck.livejournal.com/61432.html

> Signed-off-by: Paul E. McKenney

Not a big problem, but not a good habit to get into... I add my own
Signed-off-by when I pull patches into my tree. Or if you are thinking
in terms of sending this to mainline using some other path, when I am
good with it, I would give you a tag to use.

So were you looking for me to take these two patches?

							Thanx, Paul

> ---
>  kernel/rcu/rcu.h      | 7 +++++++
>  kernel/rcu/srcutiny.c | 1 +
>  kernel/rcu/srcutree.c | 1 +
>  kernel/rcu/tasks.h    | 1 +
>  kernel/rcu/tiny.c     | 1 +
>  kernel/rcu/tree.c     | 1 +
>  6 files changed, 12 insertions(+)
>
> diff --git a/kernel/rcu/rcu.h b/kernel/rcu/rcu.h
> index d1dcb09750efbd6..bc81582238b9846 100644
> --- a/kernel/rcu/rcu.h
> +++ b/kernel/rcu/rcu.h
> @@ -10,6 +10,7 @@
>  #ifndef __LINUX_RCU_H
>  #define __LINUX_RCU_H
>
> +#include
>  #include
>
>  /*
> @@ -248,6 +249,12 @@ static inline void debug_rcu_head_unqueue(struct rcu_head *head)
>  }
>  #endif /* #else !CONFIG_DEBUG_OBJECTS_RCU_HEAD */
>
> +static inline void debug_rcu_head_callback(struct rcu_head *rhp)
> +{
> +	if (unlikely(!rhp->func))
> +		kmem_dump_obj(rhp);
> +}
> +
>  extern int rcu_cpu_stall_suppress_at_boot;
>
>  static inline bool rcu_stall_is_suppressed_at_boot(void)
> diff --git a/kernel/rcu/srcutiny.c b/kernel/rcu/srcutiny.c
> index 336af24e0fe358a..c38e5933a5d6937 100644
> --- a/kernel/rcu/srcutiny.c
> +++ b/kernel/rcu/srcutiny.c
> @@ -138,6 +138,7 @@ void srcu_drive_gp(struct work_struct *wp)
>  	while (lh) {
>  		rhp = lh;
>  		lh = lh->next;
> +		debug_rcu_head_callback(rhp);
>  		local_bh_disable();
>  		rhp->func(rhp);
>  		local_bh_enable();
> diff --git a/kernel/rcu/srcutree.c b/kernel/rcu/srcutree.c
> index f1a905200fc2f79..833a8f848a90ae6 100644
> --- a/kernel/rcu/srcutree.c
> +++ b/kernel/rcu/srcutree.c
> @@ -1710,6 +1710,7 @@ static void srcu_invoke_callbacks(struct work_struct *work)
>  	rhp = rcu_cblist_dequeue(&ready_cbs);
>  	for (; rhp != NULL; rhp = rcu_cblist_dequeue(&ready_cbs)) {
>  		debug_rcu_head_unqueue(rhp);
> +		debug_rcu_head_callback(rhp);
>  		local_bh_disable();
>  		rhp->func(rhp);
>  		local_bh_enable();
> diff --git a/kernel/rcu/tasks.h b/kernel/rcu/tasks.h
> index 7294be62727b12c..148ac6a464bfb12 100644
> --- a/kernel/rcu/tasks.h
> +++ b/kernel/rcu/tasks.h
> @@ -538,6 +538,7 @@ static void rcu_tasks_invoke_cbs(struct rcu_tasks *rtp, struct rcu_tasks_percpu
>  	raw_spin_unlock_irqrestore_rcu_node(rtpcp, flags);
>  	len = rcl.len;
>  	for (rhp = rcu_cblist_dequeue(&rcl); rhp; rhp = rcu_cblist_dequeue(&rcl)) {
> +		debug_rcu_head_callback(rhp);
>  		local_bh_disable();
>  		rhp->func(rhp);
>  		local_bh_enable();
> diff --git a/kernel/rcu/tiny.c b/kernel/rcu/tiny.c
> index 42f7589e51e09e7..fec804b7908032d 100644
> --- a/kernel/rcu/tiny.c
> +++ b/kernel/rcu/tiny.c
> @@ -97,6 +97,7 @@ static inline bool rcu_reclaim_tiny(struct rcu_head *head)
>
>  	trace_rcu_invoke_callback("", head);
>  	f = head->func;
> +	debug_rcu_head_callback(head);
>  	WRITE_ONCE(head->func, (rcu_callback_t)0L);
>  	f(head);
>  	rcu_lock_release(&rcu_callback_map);
> diff --git a/kernel/rcu/tree.c b/kernel/rcu/tree.c
> index 7c79480bfaa04e4..927c5ba0ae42269 100644
> --- a/kernel/rcu/tree.c
> +++ b/kernel/rcu/tree.c
> @@ -2135,6 +2135,7 @@ static void rcu_do_batch(struct rcu_data *rdp)
>  		trace_rcu_invoke_callback(rcu_state.name, rhp);
>
>  		f = rhp->func;
> +		debug_rcu_head_callback(rhp);
>  		WRITE_ONCE(rhp->func, (rcu_callback_t)0L);
>  		f(rhp);
>
> --
> 2.34.1
>
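
The check that the quoted patch places in front of each callback
invocation can be tried outside the kernel. What follows is only a
userspace sketch under stated assumptions, not the kernel code: the
struct rcu_head here is a two-field stand-in, dump_obj_stub() is a
hypothetical placeholder for the kmem_dump_obj() call in the patch (it
prints only the address, with no slab information), and the nr_dumped
and nr_invoked counters exist purely for illustration. Unlike the
patch, which reports and then still invokes rhp->func, this sketch
skips a NULL callback so that the example does not crash.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Userspace stand-in for the kernel's struct rcu_head. */
struct rcu_head {
	struct rcu_head *next;
	void (*func)(struct rcu_head *head);
};

static int nr_dumped;	/* heads reported as having a NULL callback */
static int nr_invoked;	/* callbacks actually run */

/* Hypothetical placeholder for kmem_dump_obj(): prints the address only. */
static void dump_obj_stub(const void *obj)
{
	fprintf(stderr, "rcu_head %p has a NULL callback\n", (void *)obj);
	nr_dumped++;
}

/* Same shape as the helper in the patch: report if ->func is NULL. */
static void debug_rcu_head_callback(struct rcu_head *rhp)
{
	assert(rhp != NULL);
	if (!rhp->func)
		dump_obj_stub(rhp);
}

/* Example callback that just counts its invocations. */
static void count_cb(struct rcu_head *rhp)
{
	(void)rhp;
	nr_invoked++;
}

/*
 * Walk a singly linked callback list in the style of the srcutiny.c
 * loop.  Assumption of this sketch: a NULL ->func is skipped after
 * being reported, whereas the kernel would go on to call it and oops.
 */
static void invoke_callbacks(struct rcu_head *lh)
{
	struct rcu_head *rhp;

	while (lh) {
		rhp = lh;
		lh = lh->next;
		debug_rcu_head_callback(rhp);
		if (rhp->func)
			rhp->func(rhp);
	}
}
```

Passing invoke_callbacks() a two-element list whose first head has
func == NULL and whose second uses count_cb should produce one report
on stderr and one invocation. In the kernel, the report would instead
be the slab line shown in the commit log, followed by the NULL pointer
dereference.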