Subject: Re: [PATCH] mm/slub: fix panic in slab_alloc_node()
To: Michal Hocko
Cc: linux-mm@kvack.org, linux-kernel@vger.kernel.org,
 nathanl@linux.ibm.com, cheloha@linux.ibm.com, Christoph Lameter,
 Pekka Enberg, David Rientjes, Joonsoo Kim, Andrew Morton,
 stable@vger.kernel.org, Vlastimil Babka
References: <20201027140926.276-1-ldufour@linux.ibm.com>
 <20201027142421.GW20500@dhcp22.suse.cz>
From: Laurent Dufour
Message-ID: <11bdd295-3ef8-fbeb-2c76-2a109fa26f19@linux.ibm.com>
Date: Tue, 27 Oct 2020 15:39:46 +0100
In-Reply-To: <20201027142421.GW20500@dhcp22.suse.cz>

On 27/10/2020 at 15:24, Michal Hocko wrote:
> [Cc Vlastimil]
>
> On Tue 27-10-20 15:09:26, Laurent Dufour wrote:
>> While doing memory hot-unplug operation on a PowerPC VM running 1024 CPUs
>> with 11TB of ram, I hit the following panic:
>>
>> BUG: Kernel NULL pointer dereference on read at 0x00000007
>> Faulting instruction address: 0xc000000000456048
>> Oops: Kernel access of bad area, sig: 11 [#2]
>> LE PAGE_SIZE=64K MMU=Hash SMP NR_CPUS=2048 NUMA pSeries
>> Modules linked in: rpadlpar_io rpaphp
>> CPU: 160 PID: 1 Comm: systemd Tainted: G D 5.9.0 #1
>> NIP: c000000000456048 LR: c000000000455fd4 CTR: c00000000047b350
>> REGS: c00006028d1b77a0 TRAP: 0300 Tainted: G D (5.9.0)
>> MSR: 8000000000009033 CR: 24004228 XER: 00000000
>> CFAR: c00000000000f1b0 DAR: 0000000000000007 DSISR: 40000000 IRQMASK: 0
>> GPR00: c000000000455fd4 c00006028d1b7a30 c000000001bec800 0000000000000000
>> GPR04: 0000000000000dc0 0000000000000000 00000000000374ef c00007c53df99320
>> GPR08: 000007c53c980000 0000000000000000 000007c53c980000 0000000000000000
>> GPR12: 0000000000004400 c00000001e8e4400 0000000000000000 0000000000000f6a
>> GPR16: 0000000000000000 c000000001c25930 c000000001d62528 00000000000000c1
>> GPR20: c000000001d62538 c00006be469e9000 0000000fffffffe0 c0000000003c0ff8
>> GPR24: 0000000000000018 0000000000000000 0000000000000dc0 0000000000000000
>> GPR28: c00007c513755700 c000000001c236a4 c00007bc4001f800 0000000000000001
>> NIP [c000000000456048] __kmalloc_node+0x108/0x790
>> LR [c000000000455fd4] __kmalloc_node+0x94/0x790
>> Call Trace:
>> [c00006028d1b7a30] [c00007c51af92000] 0xc00007c51af92000 (unreliable)
>> [c00006028d1b7aa0] [c0000000003c0ff8] kvmalloc_node+0x58/0x110
>> [c00006028d1b7ae0] [c00000000047b45c] mem_cgroup_css_online+0x10c/0x270
>> [c00006028d1b7b30] [c000000000241fd8] online_css+0x48/0xd0
>> [c00006028d1b7b60] [c00000000024af14] cgroup_apply_control_enable+0x2c4/0x470
>> [c00006028d1b7c40] [c00000000024e838] cgroup_mkdir+0x408/0x5f0
>> [c00006028d1b7cb0] [c0000000005a4ef0] kernfs_iop_mkdir+0x90/0x100
>> [c00006028d1b7cf0] [c0000000004b8168] vfs_mkdir+0x138/0x250
>> [c00006028d1b7d40] [c0000000004baf04] do_mkdirat+0x154/0x1c0
>> [c00006028d1b7dc0] [c000000000032b38] system_call_exception+0xf8/0x200
>> [c00006028d1b7e20] [c00000000000c740] system_call_common+0xf0/0x27c
>> Instruction dump:
>> e93e0000 e90d0030 39290008 7cc9402a e94d0030 e93e0000 7ce95214 7f89502a
>> 2fbc0000 419e0018 41920230 e9270010 <89290007> 7f994800 419e0220 7ee6bb78
>>
>> This pointing to the following code:
>>
>> mm/slub.c:2851
>>         if (unlikely(!object || !node_match(page, node))) {
>> c000000000456038:  00 00 bc 2f  cmpdi  cr7,r28,0
>> c00000000045603c:  18 00 9e 41  beq    cr7,c000000000456054 <__kmalloc_node+0x114>
>> node_match():
>> mm/slub.c:2491
>>         if (node != NUMA_NO_NODE && page_to_nid(page) != node)
>> c000000000456040:  30 02 92 41  beq    cr4,c000000000456270 <__kmalloc_node+0x330>
>> page_to_nid():
>> include/linux/mm.h:1294
>> c000000000456044:  10 00 27 e9  ld     r9,16(r7)
>> c000000000456048:  07 00 29 89  lbz    r9,7(r9)    <<<< r9 = NULL
>> node_match():
>> mm/slub.c:2491
>> c00000000045604c:  00 48 99 7f  cmpw   cr7,r25,r9
>> c000000000456050:  20 02 9e 41  beq    cr7,c000000000456270 <__kmalloc_node+0x330>
>>
>> The panic occurred in slab_alloc_node() when checking for the page's node:
>>         object = c->freelist;
>>         page = c->page;
>>         if (unlikely(!object || !node_match(page, node))) {
>>                 object = __slab_alloc(s, gfpflags, node, addr, c);
>>                 stat(s, ALLOC_SLOWPATH);
>>
>> The issue is that object is not NULL while page is NULL which is odd but
>> may happen if the cache flush happened after loading object but before
>> loading page. Thus checking for the page pointer is required too.
>
> Could you be more specific? I am especially confused how the memory
> hotplug is involved here. What kind of flush are we talking about?

This happens when flush_cpu_slab() is called while a memory block is about
to be offlined; see slab_mem_going_offline_callback(), called from the
MEM_GOING_OFFLINE callback triggered by offline_pages().

>
>> In commit 6159d0f5c03e ("mm/slub.c: page is always non-NULL in
>> node_match()") check on the page pointer has been removed assuming that
>> page is always valid when it is called. It happens that this is not true in
>> that particular case, so check for page before calling node_match() here.
>>
>> Fixes: 6159d0f5c03e ("mm/slub.c: page is always non-NULL in node_match()")
>> Signed-off-by: Laurent Dufour
>> Cc: Christoph Lameter
>> Cc: Pekka Enberg
>> Cc: David Rientjes
>> Cc: Joonsoo Kim
>> Cc: Andrew Morton
>> Cc: stable@vger.kernel.org
>> ---
>>  mm/slub.c | 2 +-
>>  1 file changed, 1 insertion(+), 1 deletion(-)
>>
>> diff --git a/mm/slub.c b/mm/slub.c
>> index 8f66de8a5ab3..7dc5c6aaf4b7 100644
>> --- a/mm/slub.c
>> +++ b/mm/slub.c
>> @@ -2852,7 +2852,7 @@ static __always_inline void *slab_alloc_node(struct kmem_cache *s,
>>
>>  	object = c->freelist;
>>  	page = c->page;
>> -	if (unlikely(!object || !node_match(page, node))) {
>> +	if (unlikely(!object || !page || !node_match(page, node))) {
>>  		object = __slab_alloc(s, gfpflags, node, addr, c);
>>  	} else {
>>  		void *next_object = get_freepointer_safe(s, object);
>> --
>> 2.29.1
>>
>