From: Baolin Wang
Date: Tue, 11 Jun 2024 19:39:37 +0800
Subject: Re: LTP: fork13: kernel panic on rk3399-rock-pi-4 running mainline 6.10.rc3
To: David Hildenbrand, Naresh Kamboju, open list, linux-mm, lkft-triage@lists.linaro.org, Linux Regressions
Cc: Andrew Morton, willy@infradead.org, Kefeng Wang, Barry Song, Ryan Roberts, "Russell King (Oracle)", Dan Carpenter, Arnd Bergmann, Anders Roxell
Message-ID: <80a05784-21dd-4f20-b441-1e2a2be0e0ff@linux.alibaba.com>

On 2024/6/11 18:32, David Hildenbrand wrote:
> On 11.06.24 12:14, Naresh Kamboju wrote:
>> The kernel panic was noticed while running LTP syscalls fork13 (long
>> running) on the mainline master 6.10.rc3 kernel on arm64
>> rk3399-rock-pi-4 device.
>>
>> Please find detailed logs in the links,
>>
>> As you know fork13 is a stress test case trying to generate a maximum
>> number of PID's in a 100,000 loop.
>>
>> This device is running via NFS mounted filesystem.
>>
>> I have tried to reproduce this problem in a loop but failed to
>> reproduce the crash.
>>
>>
>> Crash flow:
>> ------
>> fork13 run started
>> BUG: Bad page map in process fork13
>> BUG: Bad rss-counter state mm:
>> Unable to handle kernel paging request at virtual address
>> Internal error: Oops: 0000000096000046
>> run for 800 secs ( 13 minutes) and more.
>> fork14 run started and completed
>>
>> fpathconf01 run started and completed
>> sugov:
>>
>> Unable to handle kernel NULL pointer dereference at virtual address
>>
>> Insufficient stack space to handle exception!
>> end Kernel panic - not syncing: kernel stack overflow
>>
>> I have tried to decode stack dump by not being useful [1].
>> [1] https://people.linaro.org/~naresh.kamboju/output-rk3399.txt
>>
>> Test log :
>> --------
>> tst_test.c:1733: TINFO: LTP version: 20240524
>> tst_test.c:1617: TINFO: Timeout per run is 0h 15m 00s
>> [  904.280569] BUG: Bad page map in process fork13  pte:2000000019ffc3 pmd:80000000df55003
>> [  904.281397] page: refcount:1 mapcount:-1 mapping:0000000000000000 index:0x0 pfn:0x19f
>
> Mapcount underflow on a small folio (head: not printed).
>
> [...]
>
>> [  904.294564] BUG: Bad page map in process fork13  pte:200000002e4fc3 pmd:80000000df55003
>> [  904.295275] page: refcount:2 mapcount:-1 mapping:000000007885152f index:0x6 pfn:0x2e4
>
> Another mapcount underflow on a small folio (head: not printed).
>
>
>> [  904.309309] BUG: Bad page map in process fork13  pte:20000000cc6fc3 pmd:80000000df55003
>> [  904.310031] page: refcount:1 mapcount:-1 mapping:0000000000000000 index:0x6 pfn:0xcc6
>> [  904.310728] head: order:3 mapcount:-1 entire_mapcount:0 nr_pages_mapped:8388607 pincount:0
>
> Mapcount underflow on a large folio.
>
> ...
>
>> [  904.326666] BUG: Bad page map in process fork13  pte:20000000268fc3 pmd:80000000df55003
>> [  904.327390] page: refcount:1 mapcount:-1 mapping:00000000f0624181 index:0x1b pfn:0x268
>
> Another mapcount underflow on a small folio (head: not printed).
>
>> [  904.328094] memcg:ffff0000016b4000
>> [  904.328401] aops:nfs_file_aops ino:8526e6 dentry name:"libgpg-error.so.0.36.0"
>> [  904.329051] flags: 0x3fffe000000002c(referenced|uptodate|lru|node=0|zone=0|lastcpupid=0x1ffff)
>> [  904.329878] raw: 03fffe000000002c fffffdffc0009a48 fffffdffc022f3c8 ffff00000688bd60
>> [  904.330561] raw: 000000000000001b 0000000000000000 00000001fffffffe ffff0000016b4000
>> [  904.331240] page dumped because: bad pte
>> [  904.331590] addr:0000aaaad9afe000 vm_flags:00000075 anon_vma:0000000000000000 mapping:ffff0000300d4188 index:2e
>> [  904.332476] file:fork13 fault:filemap_fault mmap:nfs_file_mmap read_folio:nfs_read_folio
>> [  904.333245] CPU: 5 PID: 22685 Comm: fork13 Tainted: G    B
>
>
> Are these maybe side-effects due to
>
> https://lkml.kernel.org/r/20240607103241.1298388-1-wangkefeng.wang@huawei.com

IIUC, the rk3399-rock-pi-4b device has no NUMA nodes (6 arm64 cores), so
I don't think NUMA balancing is causing this issue. Anyway, I will run
the fork13 test case on my arm64 server to try to reproduce it.
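
For anyone who wants to double-check the NUMA assumption on the board itself, here is a minimal sketch. It assumes the usual sysfs/procfs locations (/sys/devices/system/node and /proc/sys/kernel/numa_balancing); the function names are only illustrative, and both paths may be absent depending on the kernel config, which the code treats as "no nodes" and "not available" respectively.

```python
import os
import re


def count_numa_nodes(sysfs_node_dir="/sys/devices/system/node"):
    """Count nodeN directories; return 0 if the directory is missing.

    Assumption: NUMA nodes are exposed as node0, node1, ... under this path.
    """
    try:
        entries = os.listdir(sysfs_node_dir)
    except OSError:
        # No such directory (e.g. kernel built without NUMA support).
        return 0
    return sum(1 for name in entries if re.fullmatch(r"node\d+", name))


def numa_balancing_setting(path="/proc/sys/kernel/numa_balancing"):
    """Return the raw sysctl value as a string, or None if it is not exposed."""
    try:
        with open(path) as f:
            return f.read().strip()
    except OSError:
        return None


if __name__ == "__main__":
    print(f"NUMA nodes: {count_numa_nodes()}")
    print(f"numa_balancing: {numa_balancing_setting()}")
```

A node count of 0 or 1 would be consistent with the reasoning above, since there is then no second node for NUMA balancing to migrate pages to.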