From: Miaohe Lin
To: Yang Shi
Subject: Re: [PATCH] mm/huge_memory: mark huge_zero_folio reserved
Date: Tue, 14 May 2024 11:05:57 +0800
Message-ID: <94ddebf2-8cfa-b1bb-2241-49f672186946@huawei.com>
References: <20240511032801.1295023-1-linmiaohe@huawei.com>

On 2024/5/13 23:34, Yang Shi wrote:
> On Fri, May 10, 2024 at 9:31 PM Miaohe Lin wrote:
>>
>> When I did memory failure tests recently, below panic occurs:
>>
>> kernel BUG at include/linux/mm.h:1135!
>> invalid opcode: 0000 [#1] PREEMPT SMP NOPTI
>> CPU: 9 PID: 137 Comm: kswapd1 Not tainted 6.9.0-rc4-00491-gd5ce28f156fe-dirty #14
>> RIP: 0010:shrink_huge_zero_page_scan+0x168/0x1a0
>> RSP: 0018:ffff9933c6c57bd0 EFLAGS: 00000246
>> RAX: 000000000000003e RBX: 0000000000000000 RCX: ffff88f61fc5c9c8
>> RDX: 0000000000000000 RSI: 0000000000000027 RDI: ffff88f61fc5c9c0
>> RBP: ffffcd7c446b0000 R08: ffffffff9a9405f0 R09: 0000000000005492
>> R10: 00000000000030ea R11: ffffffff9a9405f0 R12: 0000000000000000
>> R13: 0000000000000000 R14: 0000000000000000 R15: ffff88e703c4ac00
>> FS: 0000000000000000(0000) GS:ffff88f61fc40000(0000) knlGS:0000000000000000
>> CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
>> CR2: 000055f4da6e9878 CR3: 0000000c71048000 CR4: 00000000000006f0
>> Call Trace:
>>  <TASK>
>>  do_shrink_slab+0x14f/0x6a0
>>  shrink_slab+0xca/0x8c0
>>  shrink_node+0x2d0/0x7d0
>>  balance_pgdat+0x33a/0x720
>>  kswapd+0x1f3/0x410
>>  kthread+0xd5/0x100
>>  ret_from_fork+0x2f/0x50
>>  ret_from_fork_asm+0x1a/0x30
>>  </TASK>
>> Modules linked in: mce_inject hwpoison_inject
>> ---[ end trace 0000000000000000 ]---
>> RIP: 0010:shrink_huge_zero_page_scan+0x168/0x1a0
>> RSP: 0018:ffff9933c6c57bd0 EFLAGS: 00000246
>> RAX: 000000000000003e RBX: 0000000000000000 RCX: ffff88f61fc5c9c8
>> RDX: 0000000000000000 RSI: 0000000000000027 RDI: ffff88f61fc5c9c0
>> RBP: ffffcd7c446b0000 R08: ffffffff9a9405f0 R09: 0000000000005492
>> R10: 00000000000030ea R11: ffffffff9a9405f0 R12: 0000000000000000
>> R13: 0000000000000000 R14: 0000000000000000 R15: ffff88e703c4ac00
>> FS: 0000000000000000(0000) GS:ffff88f61fc40000(0000) knlGS:0000000000000000
>> CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
>> CR2: 000055f4da6e9878 CR3: 0000000c71048000 CR4: 00000000000006f0
>>
>> The root cause is that HWPoison flag will be set for huge_zero_folio
>> without increasing the folio refcnt. But then unpoison_memory() will
>> decrease the folio refcnt unexpectedly as it appears like a successfully
>> hwpoisoned folio leading to VM_BUG_ON_PAGE(page_ref_count(page) == 0)
>> when releasing huge_zero_folio.
>>
>> Fix this issue by marking huge_zero_folio reserved. So unpoison_memory()
>> will skip this page. This will make it consistent with ZERO_PAGE case too.
>
> If I read the code correctly, unpoison_memory() should not dec
> refcount for huge zero page by calling put_page_testzero(). The huge
> zero page's real refcount is actually maintained separately by
> huge_zero_refcount. It is different from the regular refcount in struct
> folio, see get_huge_zero_page().

Sure. The huge zero folio should be skipped in unpoison_memory(); unpoisoning
it is not supported anyway. I marked huge_zero_folio reserved so that
unpoison_memory() would skip it via the folio_test_reserved(folio) check.
But as David points out, the use of PG_reserved is limited, so I will find
another way to fix the issue.

Thanks.