Subject: Re: [PATCH v2] mm/memory-failure: fix VM_BUG_ON_PAGE(PagePoisoned(page)) when unpoison memory
From: Miaohe Lin
To: David Hildenbrand, Dan Williams
Date: Tue, 24 Dec 2024 15:29:08 +0800
Message-ID: <1d137347-23c9-36a5-e4f0-97e572ff6097@huawei.com>
In-Reply-To: <44de3a6c-6761-4ad6-a4dd-b9002a42c437@redhat.com>
References: <20241219115209.574065-1-linmiaohe@huawei.com> <06a45f8a-0981-40a2-a12a-5964fcdace13@redhat.com> <1af81f0e-6500-9719-20be-505851673b58@huawei.com> <44de3a6c-6761-4ad6-a4dd-b9002a42c437@redhat.com>

On 2024/12/23 18:42, David Hildenbrand wrote:
> On 23.12.24 03:55, Miaohe Lin wrote:
>> On 2024/12/20 16:50, David Hildenbrand wrote:
>>> On 20.12.24 03:35, Miaohe Lin wrote:
>>>> On 2024/12/19 20:18, David Hildenbrand wrote:
>>>>> On 19.12.24 12:52, Miaohe Lin wrote:
>>>>>> When I did memory failure tests recently, below panic occurs:
>>>>>>
>>>>>> page dumped because: VM_BUG_ON_PAGE(PagePoisoned(page))
>>>>>> kernel BUG at include/linux/page-flags.h:616!
>>>>>> Oops: invalid opcode: 0000 [#1] PREEMPT SMP NOPTI
>>>>>> CPU: 3 PID: 720 Comm: bash Not tainted 6.10.0-rc1-00195-g148743902568 #40
>>>>>> RIP: 0010:unpoison_memory+0x2f3/0x590
>>>>>> RSP: 0018:ffffa57fc8787d60 EFLAGS: 00000246
>>>>>> RAX: 0000000000000037 RBX: 0000000000000009 RCX: ffff9be25fcdc9c8
>>>>>> RDX: 0000000000000000 RSI: 0000000000000027 RDI: ffff9be25fcdc9c0
>>>>>> RBP: 0000000000300000 R08: ffffffffb4956f88 R09: 0000000000009ffb
>>>>>> R10: 0000000000000284 R11: ffffffffb4926fa0 R12: ffffe6b00c000000
>>>>>> R13: ffff9bdb453dfd00 R14: 0000000000000000 R15: fffffffffffffffe
>>>>>> FS:  00007f08f04e4740(0000) GS:ffff9be25fcc0000(0000) knlGS:0000000000000000
>>>>>> CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
>>>>>> CR2: 0000564787a30410 CR3: 000000010d4e2000 CR4: 00000000000006f0
>>>>>> Call Trace:
>>>>>>     <TASK>
>>>>>>     unpoison_memory+0x2f3/0x590
>>>>>>     simple_attr_write_xsigned.constprop.0.isra.0+0xb3/0x110
>>>>>>     debugfs_attr_write+0x42/0x60
>>>>>>     full_proxy_write+0x5b/0x80
>>>>>>     vfs_write+0xd5/0x540
>>>>>>     ksys_write+0x64/0xe0
>>>>>>     do_syscall_64+0xb9/0x1d0
>>>>>>     entry_SYSCALL_64_after_hwframe+0x77/0x7f
>>>>>> RIP: 0033:0x7f08f0314887
>>>>>> RSP: 002b:00007ffece710078 EFLAGS: 00000246 ORIG_RAX: 0000000000000001
>>>>>> RAX: ffffffffffffffda RBX: 0000000000000009 RCX: 00007f08f0314887
>>>>>> RDX: 0000000000000009 RSI: 0000564787a30410 RDI: 0000000000000001
>>>>>> RBP: 0000564787a30410 R08: 000000000000fefe R09: 000000007fffffff
>>>>>> R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000000009
>>>>>> R13: 00007f08f041b780 R14: 00007f08f0417600 R15: 00007f08f0416a00
>>>>>>     </TASK>
>>>>>> Modules linked in: hwpoison_inject
>>>>>> ---[ end trace 0000000000000000 ]---
>>>>>> RIP: 0010:unpoison_memory+0x2f3/0x590
>>>>>> RSP: 0018:ffffa57fc8787d60 EFLAGS: 00000246
>>>>>> RAX: 0000000000000037 RBX: 0000000000000009 RCX: ffff9be25fcdc9c8
>>>>>> RDX: 0000000000000000 RSI: 0000000000000027 RDI: ffff9be25fcdc9c0
>>>>>> RBP: 0000000000300000 R08: ffffffffb4956f88 R09: 0000000000009ffb
>>>>>> R10: 0000000000000284 R11: ffffffffb4926fa0 R12: ffffe6b00c000000
>>>>>> R13: ffff9bdb453dfd00 R14: 0000000000000000 R15: fffffffffffffffe
>>>>>> FS:  00007f08f04e4740(0000) GS:ffff9be25fcc0000(0000) knlGS:0000000000000000
>>>>>> CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
>>>>>> CR2: 0000564787a30410 CR3: 000000010d4e2000 CR4: 00000000000006f0
>>>>>> Kernel panic - not syncing: Fatal exception
>>>>>> Kernel Offset: 0x31c00000 from 0xffffffff81000000 (relocation range: 0xffffffff80000000-0xffffffffbfffffff)
>>>>>> ---[ end Kernel panic - not syncing: Fatal exception ]---
>>>>>>
>>>>>> The root cause is that unpoison_memory() tries to check the PG_HWPoison
>>>>>> flags of an uninitialized page. So VM_BUG_ON_PAGE(PagePoisoned(page)) is
>>>>>> triggered. This can be reproduced by below steps:
>>>>>> 1.Offline memory block:
>>>>>>     echo offline > /sys/devices/system/memory/memory12/state
>>>>>> 2.Get offlined memory pfn:
>>>>>>     page-types -b n -rlN
>>>>>> 3.Write pfn to unpoison-pfn
>>>>>>     echo <pfn> > /sys/kernel/debug/hwpoison/unpoison-pfn
>>>>>>
>>>>>> Signed-off-by: Miaohe Lin
>>>>>> ---
>>>>>> v2: Use pfn_to_online_page per David. Thanks.
>>>>>> ---
>>>>>>     mm/memory-failure.c | 14 +++++++++++---
>>>>>>     1 file changed, 11 insertions(+), 3 deletions(-)
>>>>>>
>>>>>> diff --git a/mm/memory-failure.c b/mm/memory-failure.c
>>>>>> index a7b8ccd29b6f..02be0596ce67 100644
>>>>>> --- a/mm/memory-failure.c
>>>>>> +++ b/mm/memory-failure.c
>>>>>> @@ -2556,10 +2556,18 @@ int unpoison_memory(unsigned long pfn)
>>>>>>         static DEFINE_RATELIMIT_STATE(unpoison_rs, DEFAULT_RATELIMIT_INTERVAL,
>>>>>>                         DEFAULT_RATELIMIT_BURST);
>>>>>>
>>>>>> -    if (!pfn_valid(pfn))
>>>>>> -        return -ENXIO;
>>>>>> +    p = pfn_to_online_page(pfn);
>>>>>> +    if (!p) {
>>>>>> +        struct dev_pagemap *pgmap;
>>>>>>
>>>>>> -    p = pfn_to_page(pfn);
>>>>>> +        if (!pfn_valid(pfn))
>>>>>> +            return -ENXIO;
>>>>>> +        pgmap = get_dev_pagemap(pfn, NULL);
>>>>>> +        if (!pgmap)
>>>>>> +            return -ENXIO;
>>>>>> +        put_dev_pagemap(pgmap);
>>>>>> +        p = pfn_to_page(pfn);
>>>>>> +    }
>>>>>
>>>>> Hm, I wonder if we can do anything reasonable with ZONE_DEVICE pages here?
>>>>
>>>> All I can see in unpoison_memory() is folio_test_clear_hwpoison() for ZONE_DEVICE pages.
>>>
>>> IIRC, it can only be triggered via debugfs in special kernel configs. So chances are this was never ever actually run against a ZONE_DEVICE page.
>>
>> If ZONE_DEVICE pages are never expected, we can simply filter them out.
>
> Looking into some details, I think we should just ignore ZONE_DEVICE for now, I'm pretty sure that it's not handled correctly.
>
> So I suggest to fail if pfn_to_online_page() == NULL, just like soft_offline_page() would.

I think the current code doesn't take ZONE_DEVICE pages into account, so I tend to ignore them too. But let's wait some time for input from Dan.

Thanks.
.
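
For illustration only, here is a minimal userspace sketch of the simplified check suggested above: fail whenever the pfn has no online page, without any ZONE_DEVICE fallback. It is not kernel code. Every name prefixed with model_ or MODEL_ is a hypothetical stand-in invented for this sketch, the four-entry memmap is made up, and the -ENXIO and -EBUSY return values are assumptions (the former carried over from the v2 patch) rather than a statement of what a later revision should return.

```c
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical userspace stand-in for the kernel's struct page. */
struct page {
	bool online;	/* memory block is online, so the memmap entry is initialized */
	bool hwpoison;	/* models the PG_hwpoison flag */
};

#define MODEL_NR_PAGES 4UL

/* Made-up memmap: pfns 0-1 are online, pfns 2-3 belong to an offlined block. */
static struct page model_memmap[MODEL_NR_PAGES] = {
	{ .online = true,  .hwpoison = true  },
	{ .online = true,  .hwpoison = false },
	{ .online = false, .hwpoison = false },
	{ .online = false, .hwpoison = false },
};

/* Models pfn_to_online_page(): NULL for an out-of-range or offline pfn. */
static struct page *model_pfn_to_online_page(unsigned long pfn)
{
	if (pfn >= MODEL_NR_PAGES || !model_memmap[pfn].online)
		return NULL;
	return &model_memmap[pfn];
}

/* Models the suggested entry check of unpoison_memory(). */
static int model_unpoison_memory(unsigned long pfn)
{
	struct page *p = model_pfn_to_online_page(pfn);

	/* No online page: bail out before touching any page flags. */
	if (!p)
		return -ENXIO;	/* assumed error code, taken from the v2 patch */

	/* Nothing to unpoison; the error code is a choice of this sketch. */
	if (!p->hwpoison)
		return -EBUSY;

	p->hwpoison = false;
	return 0;
}
```

In this model, calling model_unpoison_memory() on pfn 2 or 3 returns -ENXIO without reading the page's flags, which is the property that avoids the VM_BUG_ON_PAGE(PagePoisoned(page)) in the report.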