From: jane.chu@oracle.com
To: Zi Yan <ziy@nvidia.com>
Cc: syzbot <syzbot+e6367ea2fdab6ed46056@syzkaller.appspotmail.com>,
syzkaller-bugs@googlegroups.com, akpm@linux-foundation.org,
david@redhat.com, kernel@pankajraghav.com, linmiaohe@huawei.com,
linux-kernel@vger.kernel.org, linux-mm@kvack.org,
mcgrof@kernel.org, nao.horiguchi@gmail.com
Subject: Re: [syzbot] [mm?] WARNING in memory_failure
Date: Thu, 2 Oct 2025 21:02:48 -0700
Message-ID: <7db593b7-6f95-4346-bffd-041fc89ee3f3@oracle.com>
In-Reply-To: <32A2C338-EFCC-470A-B5ED-E53C38395E51@nvidia.com>
On 10/2/2025 11:45 AM, Zi Yan wrote:
> On 2 Oct 2025, at 13:54, jane.chu@oracle.com wrote:
>
>> On 10/2/2025 6:54 AM, Zi Yan wrote:
>>> On 2 Oct 2025, at 1:23, jane.chu@oracle.com wrote:
>>>
>>>> On 10/1/2025 7:04 PM, Zi Yan wrote:
>>>>> On 1 Oct 2025, at 20:38, Zi Yan wrote:
>>>>>
>>>>>> On 1 Oct 2025, at 19:58, jane.chu@oracle.com wrote:
>>>>>>
>>>>>>> Hi, Zi Yan,
>>>>>>>
>>>>>>> On 9/30/2025 9:51 PM, syzbot wrote:
>>>>>>>> Hello,
>>>>>>>>
>>>>>>>> syzbot has tested the proposed patch but the reproducer is still triggering an issue:
>>>>>>>> lost connection to test machine
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>> Tested on:
>>>>>>>>
>>>>>>>> commit: d8795075 mm/huge_memory: do not change split_huge_page..
>>>>>>>> git tree: https://github.com/x-y-z/linux-dev.git fix_split_page_min_order-for-kernelci
>>>>>>>> console output: https://syzkaller.appspot.com/x/log.txt?x=17ce96e2580000
>>>>>>>> kernel config: https://syzkaller.appspot.com/x/.config?x=714d45b6135c308e
>>>>>>>> dashboard link: https://syzkaller.appspot.com/bug?extid=e6367ea2fdab6ed46056
>>>>>>>> compiler: Debian clang version 20.1.8 (++20250708063551+0c9f909b7976-1~exp1~20250708183702.136), Debian LLD 20.1.8
>>>>>>>> userspace arch: arm64
>>>>>>>>
>>>>>>>> Note: no patches were applied.
>>>>>>>>
>>>>>>>
>>>>>>
>>>>>> Thank you for looking into this.
>>>>>>
>>>>>>> My hunch is that
>>>>>>> https://github.com/x-y-z/linux-dev.git fix_split_page_min_order-for-kernelci
>>>>>>> alone is not enough. Perhaps on ARM64, the page cache pages of /dev/nullb0 in
>>>>>> Yes, it only has the first patch, which fails a split if it cannot be
>>>>>> split to the intended order (order-0 in this case).
>>>>>>
>>>>>>
>>>>>>> the test case are probably with min_order > 0, therefore THP split fails, as the console message show:
>>>>>>> [ 200.378989][T18221] Memory failure: 0x124d30: recovery action for unsplit thp: Failed
>>>>>>>
>>>>>>> With lots of poisoned THP pages stuck in the page cache, OOM could trigger too soon.
>>>>>>
>>>>>> That is my understanding too. Thanks for the confirmation.
>>>>>>
>>>>>>>
>>>>>>> I think it's worth to try add the additional changes I suggested earlier -
>>>>>>> https://lore.kernel.org/lkml/7577871f-06be-492d-b6d7-8404d7a045e0@oracle.com/
>>>>>>>
>>>>>>> So that in the madvise HWPOISON cases, large huge pages are splitted to smaller huge pages, and most of them remain usable in the page cache.
>>>>>>
>>>>>> Yep, I am going to incorporate your suggestion as the second patch and make
>>>>>> syzbot check it again.
>>>>>
>>>>>
>>>>> #syz test: https://github.com/x-y-z/linux-dev.git fix_split_page_min_order_and_opt_memory_failure-for-kernelci
>>>>>
>>>>
>>>> There is a bug here,
>>>>
>>>> if (try_to_split_thp_page(p, new_order, false) || new_order) {
>>>> res = -EHWPOISON;
>>>> kill_procs_now(p, pfn, flags, folio); <---
>>>>
>>>> If try_to_split_thp_page() succeeded on min_order, 'folio' should be retaken: folio = page_folio(page) before moving on to kill_procs_now().
>>>
>>> Thank you for pointing it out. Let me fix it and let syzbot test it again.
>>
>> Forgot to ask, even with your current patch, after splitting at min_order, the old 'folio' should be at min_order as well, just not necessarily the one where the raw hwpoisoned sub-page resides, right?
>
> Yes.
>
>> If yes, then 1) I am wondering about the value of the min_order? 2) perhaps
>
> I think min_order depends on the filesystem config. It can be like 2 (16KB) or 4 (64KB). Based on the reproducer[1], it seems that block size is set to 64KB
> (see ioctl$BLKBSZSET arg).
>
> [1] https://syzkaller.appspot.com/text?tag=ReproC&x=1361627c580000
>
>> the syzbot test need to reduce the number of fork()'ing,
>> as with each MADV_HWPOISON inject, one page cache page will be lost and stuck in the page cache, the difference is the size of the page cache page and the number of pages.
>
> Right. the lost page size is amplified by min_order.
>
> BTW, I do not see fork or loop in the above reproducer, I wonder why the test
> went OOM.
You're right, the test itself doesn't fork. I saw copy_process() in the
oom-kill call trace and spoke too soon.

The test appears to be running in a tight loop; I can't tell the number of
iterations or the duration. The console has logged "5039 pages hwpoisoned",
and likely a 64K folio is lost with each MADV_HWPOISON injection. So whether
that means 5039 * 64K or just 5039 base pages, that's a lot of memory lost
and made unusable, until zone normal dipped below the min watermark.
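
To put rough numbers on the two readings of "5039 pages hwpoisoned", here is
a minimal sketch in C. It assumes 4K base pages (consistent with order 4 being
64KB, as mentioned earlier in the thread); folio_bytes() and lost_bytes() are
hypothetical helper names for illustration, not kernel functions.

```c
#include <assert.h>

/* Assumption: 4 KiB base pages, so an order-4 folio covers 64 KiB. */
#define BASE_PAGE_SHIFT 12

/* Bytes covered by one folio of the given order. */
static unsigned long long folio_bytes(unsigned int order)
{
	/* Guard against shifting past the width of the result type. */
	assert(BASE_PAGE_SHIFT + order < 64);
	return 1ULL << (BASE_PAGE_SHIFT + order);
}

/*
 * Total bytes made unusable if each of nr_poisoned injections costs
 * one whole folio of the given order.
 */
static unsigned long long lost_bytes(unsigned long long nr_poisoned,
				     unsigned int order)
{
	return nr_poisoned * folio_bytes(order);
}
```

Under these assumptions, lost_bytes(5039, 0) is 20639744 bytes (about 19.7 MiB)
and lost_bytes(5039, 4) is 330235904 bytes (about 315 MiB), so the 64K reading
costs 16 times as much memory for the same number of injections.
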
I think the test might need to be adjusted.

Option 1: reduce the number of test runs accordingly, e.g., if the block size
is 4K, maybe allow more test runs.
Option 2: add an unpoison operation after each poison.

Not sure how we should go about that. What do others think?

thanks,
-jane
>
> Best Regards,
> Yan, Zi
>