Date: Wed, 11 Jun 2025 17:00:56 +0800
Subject: Re: [PATCH] mm/vmscan: fix hwpoisoned large folio handling in shrink_folio_list
To: David Hildenbrand
From: Jinjiang Tu
In-Reply-To: <1f0c7d73-b7e2-4ee9-8050-f23c05e75e8b@redhat.com>
References: <20250611074643.250837-1-tujinjiang@huawei.com> <1f0c7d73-b7e2-4ee9-8050-f23c05e75e8b@redhat.com>

On 2025/6/11 16:35, David Hildenbrand wrote:
> On 11.06.25 10:29, Jinjiang Tu wrote:
>>
>> On 2025/6/11 15:59, David Hildenbrand wrote:
>>> On 11.06.25 09:46, Jinjiang Tu wrote:
>>>> In shrink_folio_list(), the hwpoisoned folio may be a large folio,
>>>> which can't be handled by unmap_poisoned_folio().
>>>>
>>>> Since UCE is rare in the real world, and a race with reclaim is even
>>>> rarer, just skipping the hwpoisoned large folio is enough.
>>>> memory_failure() will handle it if the UCE is triggered again.
>>>>
>>>> Fixes: 1b0449544c64 ("mm/vmscan: don't try to reclaim hwpoison folio")
>>>
>>> Please also add
>>>
>>> Closes:
>>>
>>> with a link to the report
>> Thanks, I will add it.
>>>
>>>> Reported-by: syzbot+3b220254df55d8ca8a61@syzkaller.appspotmail.com
>>>> Signed-off-by: Jinjiang Tu
>>>> ---
>>>>    mm/vmscan.c | 8 ++++++++
>>>>    1 file changed, 8 insertions(+)
>>>>
>>>> diff --git a/mm/vmscan.c b/mm/vmscan.c
>>>> index b6f4db6c240f..3a4e8d7419ae 100644
>>>> --- a/mm/vmscan.c
>>>> +++ b/mm/vmscan.c
>>>> @@ -1131,6 +1131,14 @@ static unsigned int shrink_folio_list(struct list_head *folio_list,
>>>>                goto keep;
>>>>
>>>>              if (folio_contain_hwpoisoned_page(folio)) {
>>>> +            /*
>>>> +             * unmap_poisoned_folio() can't handle large
>>>> +             * folio, just skip it. memory_failure() will
>>>> +             * handle it if the UCE is triggered again.
>>>> +             */
>>>> +            if (folio_test_large(folio))
>>>> +                goto keep_locked;
>>>> +
>>>>                unmap_poisoned_folio(folio, folio_pfn(folio), false);
>>>>                folio_unlock(folio);
>>>>                folio_put(folio);
>>>
>>> Why not handle that in unmap_poisoned_folio() to make that limitation
>>> clear and avoid?
>> I tried to put the check in unmap_poisoned_folio(), but there are still
>> other issues.
>
>
>> The calltrace in the v6.6 kernel:
>>
>> Unable to handle kernel paging request at virtual address fbd5200000000024
>> KASAN: maybe wild-memory-access in range [0xdead000000000120-0xdead000000000127]
>> pc : __list_add_valid_or_report+0x50/0x158 lib/list_debug.c:32
>> lr : __list_add_valid include/linux/list.h:88 [inline]
>> lr : __list_add include/linux/list.h:150 [inline]
>> lr : list_add_tail include/linux/list.h:183 [inline]
>> lr : lru_add_page_tail.constprop.0+0x4ac/0x640 mm/huge_memory.c:3187
>> Call trace:
>>   __list_add_valid_or_report+0x50/0x158 lib/list_debug.c:32
>>   __list_add_valid include/linux/list.h:88 [inline]
>>   __list_add include/linux/list.h:150 [inline]
>>   list_add_tail include/linux/list.h:183 [inline]
>>   lru_add_page_tail.constprop.0+0x4ac/0x640 mm/huge_memory.c:3187
>>   __split_huge_page_tail.isra.0+0x344/0x508 mm/huge_memory.c:3286
>>   __split_huge_page+0x244/0x1270 mm/huge_memory.c:3317
>>   split_huge_page_to_list_to_order+0x1038/0x1620 mm/huge_memory.c:3625
>>   split_folio_to_list_to_order include/linux/huge_mm.h:638 [inline]
>>   split_folio_to_order include/linux/huge_mm.h:643 [inline]
>>   deferred_split_scan+0x5f8/0xb70 mm/huge_memory.c:3778
>>   do_shrink_slab+0x2a0/0x828 mm/vmscan.c:927
>>   shrink_slab_memcg+0x2c0/0x558 mm/vmscan.c:996
>>   shrink_slab+0x228/0x250 mm/vmscan.c:1075
>>   shrink_node_memcgs+0x34c/0x6a0 mm/vmscan.c:6630
>>   shrink_node+0x21c/0x1378 mm/vmscan.c:6664
>>   shrink_zones.constprop.0+0x24c/0xab0 mm/vmscan.c:6906
>>   do_try_to_free_pages+0x150/0x880 mm/vmscan.c:6968
>>
>>
>> The folio is deleted from the LRU and folio->lru can't be accessed. If
>> the folio is split later, lru_add_split_folio() assumes the folio is on
>> the LRU.
>
> Not sure if something like the following would be appropriate:
>
> diff --git a/mm/memory-failure.c b/mm/memory-failure.c
> index b91a33fb6c694..fdd58c8ba5254 100644
> --- a/mm/memory-failure.c
> +++ b/mm/memory-failure.c
> @@ -1566,6 +1566,9 @@ int unmap_poisoned_folio(struct folio *folio, unsigned long pfn, bool must_kill)
>         enum ttu_flags ttu = TTU_IGNORE_MLOCK | TTU_SYNC | TTU_HWPOISON;
>         struct address_space *mapping;
>
> +       if (folio_test_large(folio) && !folio_test_hugetlb(folio))
> +               return -EBUSY;
> +
>         if (folio_test_swapcache(folio)) {
>                 pr_err("%#lx: keeping poisoned page in swap cache\n", pfn);
>                 ttu &= ~TTU_HWPOISON;
> diff --git a/mm/vmscan.c b/mm/vmscan.c
> index f8dfd2864bbf4..6a3426bc9e9d7 100644
> --- a/mm/vmscan.c
> +++ b/mm/vmscan.c
> @@ -1138,7 +1138,8 @@ static unsigned int shrink_folio_list(struct list_head *folio_list,
>                         goto keep;
>
>                 if (folio_contain_hwpoisoned_page(folio)) {
> -                       unmap_poisoned_folio(folio, folio_pfn(folio), false);
> +                       if (unmap_poisoned_folio(folio, folio_pfn(folio), false)) {
> +                               list_add(&folio->lru, &ret_folios);
>                         folio_unlock(folio);
>                         folio_put(folio);
>                         continue;

Is the expected behaviour to keep the folio on the LRU if unmap_poisoned_folio()
fails? If so, we should do:

+                       if (unmap_poisoned_folio(folio, folio_pfn(folio), false))
+                               goto keep_locked;

Otherwise, folio_put() is called twice for the single reference grabbed when the
folio was isolated from the LRU.
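To make that direction concrete, here is a rough combined sketch (untested; the
hunk context and the comment wording are only illustrative, not a final patch):
unmap_poisoned_folio() refuses large non-hugetlb folios with -EBUSY, and
shrink_folio_list() keeps such folios on the LRU via keep_locked:

--- a/mm/memory-failure.c
+++ b/mm/memory-failure.c
@@ ... @@ int unmap_poisoned_folio(struct folio *folio, unsigned long pfn, bool must_kill)
 	enum ttu_flags ttu = TTU_IGNORE_MLOCK | TTU_SYNC | TTU_HWPOISON;
 	struct address_space *mapping;
 
+	/* Large folios other than hugetlb are not handled here, bail out. */
+	if (folio_test_large(folio) && !folio_test_hugetlb(folio))
+		return -EBUSY;
+
 	if (folio_test_swapcache(folio)) {
 		pr_err("%#lx: keeping poisoned page in swap cache\n", pfn);
 		ttu &= ~TTU_HWPOISON;
--- a/mm/vmscan.c
+++ b/mm/vmscan.c
@@ ... @@ static unsigned int shrink_folio_list(struct list_head *folio_list,
 		if (folio_contain_hwpoisoned_page(folio)) {
-			unmap_poisoned_folio(folio, folio_pfn(folio), false);
+			/*
+			 * unmap_poisoned_folio() can't handle large folios.
+			 * keep_locked just unlocks the folio and leaves it on
+			 * the return list, so the isolation reference is
+			 * dropped only once when the folio goes back to the
+			 * LRU.
+			 */
+			if (unmap_poisoned_folio(folio, folio_pfn(folio), false))
+				goto keep_locked;
+
 			folio_unlock(folio);
 			folio_put(folio);
 			continue;

Compared with adding the folio to ret_folios directly, going through keep_locked
avoids the second folio_put() on the reference taken at isolation time.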