Date: Mon, 16 Oct 2023 15:34:51 -0700
From: Andrew Morton
To: Charan Teja Kalla
Cc: David Hildenbrand
Subject: Re: [PATCH] mm/sparsemem: fix race in accessing memory_section->usage
Message-Id: <20231016153451.09f3677496bd6cc8b1f95daa@linux-foundation.org>
In-Reply-To: <994410bb-89aa-d987-1f50-f514903c55aa@quicinc.com>
References: <1697202267-23600-1-git-send-email-quic_charante@quicinc.com> <20231014152532.5f3dca7838c2567a1a9ca9c6@linux-foundation.org> <994410bb-89aa-d987-1f50-f514903c55aa@quicinc.com>
Sender: owner-linux-mm@kvack.org

On Mon, 16 Oct 2023 19:08:00 +0530 Charan Teja Kalla wrote:

> > From the description, it's not quite clear to me if this was actually
> > hit -- usually people include the dmesg bug/crash info.
>
> On Snapdragon SoC, with the mentioned memory configuration of PFNs as
> [ZONE_NORMAL ZONE_DEVICE ZONE_NORMAL], we are able to see a bunch of
> issues daily while testing on a device farm.
>
> I note that from next time onwards I will send the dmesg bug/crash info
> for these types of issues. For this particular issue, below is the log.
> Though the log below does not point directly to
> pfn_section_valid(){ ms->usage;}, it does point there when we load this
> dump in the Lauterbach T32 tool.
>
> [ 540.578056] Unable to handle kernel NULL pointer dereference at virtual address 0000000000000000
> [ 540.578068] Mem abort info:
> [ 540.578070] ESR = 0x0000000096000005
> [ 540.578073] EC = 0x25: DABT (current EL), IL = 32 bits
> [ 540.578077] SET = 0, FnV = 0
> [ 540.578080] EA = 0, S1PTW = 0
> [ 540.578082] FSC = 0x05: level 1 translation fault
> [ 540.578085] Data abort info:
> [ 540.578086] ISV = 0, ISS = 0x00000005
> [ 540.578088] CM = 0, WnR = 0
> [ 540.579431] pstate: 82400005 (Nzcv daif +PAN -UAO +TCO -DIT -SSBS BTYPE=--)
> [ 540.579436] pc : __pageblock_pfn_to_page+0x6c/0x14c
> [ 540.579454] lr : compact_zone+0x994/0x1058
> [ 540.579460] sp : ffffffc03579b510
> [ 540.579463] x29: ffffffc03579b510 x28: 0000000000235800 x27: 000000000000000c
> [ 540.579470] x26: 0000000000235c00 x25: 0000000000000068 x24: ffffffc03579b640
> [ 540.579477] x23: 0000000000000001 x22: ffffffc03579b660 x21: 0000000000000000
> [ 540.579483] x20: 0000000000235bff x19: ffffffdebf7e3940 x18: ffffffdebf66d140
> [ 540.579489] x17: 00000000739ba063 x16: 00000000739ba063 x15: 00000000009f4bff
> [ 540.579495] x14: 0000008000000000 x13: 0000000000000000 x12: 0000000000000001
> [ 540.579501] x11: 0000000000000000 x10: 0000000000000000 x9 : ffffff897d2cd440
> [ 540.579507] x8 : 0000000000000000 x7 : 0000000000000000 x6 : ffffffc03579b5b4
> [ 540.579512] x5 : 0000000000027f25 x4 : ffffffc03579b5b8 x3 : 0000000000000001
> [ 540.579518] x2 : ffffffdebf7e3940 x1 : 0000000000235c00 x0 : 0000000000235800
> [ 540.579524] Call trace:
> [ 540.579527] __pageblock_pfn_to_page+0x6c/0x14c
> [ 540.579533] compact_zone+0x994/0x1058
> [ 540.579536] try_to_compact_pages+0x128/0x378
> [ 540.579540] __alloc_pages_direct_compact+0x80/0x2b0
> [ 540.579544] __alloc_pages_slowpath+0x5c0/0xe10
> [ 540.579547] __alloc_pages+0x250/0x2d0
> [ 540.579550] __iommu_dma_alloc_noncontiguous+0x13c/0x3fc
> [ 540.579561] iommu_dma_alloc+0xa0/0x320
> [ 540.579565] dma_alloc_attrs+0xd4/0x108

Thanks. I added the above info to the changelog, added a cc:stable, and
added a note-to-myself that a new version of the fix may be forthcoming.
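
As a cross-check on the "Mem abort info" and "Data abort info" lines in the
quoted log, here is a minimal sketch that splits the reported ESR value into
its fields. The bit positions are taken from the Arm architecture's ESR_ELx
layout for data aborts, not from this thread, and the function name
decode_data_abort_esr is made up for illustration.

```python
def decode_data_abort_esr(esr: int) -> dict:
    """Split an ESR_ELx value for a data abort into its named fields.

    Assumption: bit positions follow the Arm ESR_ELx layout for data
    aborts; nothing here comes from the kernel sources or this thread.
    """
    iss = esr & 0x1FFFFFF              # ISS: bits [24:0]
    return {
        "EC": (esr >> 26) & 0x3F,      # exception class, bits [31:26]
        "IL": (esr >> 25) & 0x1,       # 1 means a 32-bit instruction
        "ISS": iss,
        "ISV": (iss >> 24) & 0x1,      # syndrome valid
        "SET": (iss >> 11) & 0x3,      # synchronous error type
        "FnV": (iss >> 10) & 0x1,      # FAR not valid
        "EA": (iss >> 9) & 0x1,        # external abort
        "CM": (iss >> 8) & 0x1,        # cache maintenance
        "S1PTW": (iss >> 7) & 0x1,     # stage 1 page table walk
        "WnR": (iss >> 6) & 0x1,       # 0 = read, 1 = write
        "FSC": iss & 0x3F,             # fault status code, bits [5:0]
    }


# ESR value copied from the quoted log.
fields = decode_data_abort_esr(0x0000000096000005)
for name, value in fields.items():
    print(f"{name} = {value:#x}")
```

The decoded values agree with what the kernel printed: EC = 0x25 (data abort
at the current EL), FSC = 0x05 (level 1 translation fault) and WnR = 0, i.e.
the fault was taken on a read, which fits a load through a NULL pointer.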