From: David Hildenbrand <david@redhat.com>
To: Zi Yan <ziy@nvidia.com>, Pankaj Raghav <kernel@pankajraghav.com>
Cc: Matthew Wilcox <willy@infradead.org>,
Luis Chamberlain <mcgrof@kernel.org>,
Jinjiang Tu <tujinjiang@huawei.com>,
Oscar Salvador <osalvador@suse.de>,
akpm@linux-foundation.org, linmiaohe@huawei.com,
mhocko@kernel.org, linux-mm@kvack.org,
wangkefeng.wang@huawei.com
Subject: Re: [PATCH v2 2/2] mm/memory_hotplug: fix hwpoisoned large folio handling in do_migrate_range
Date: Mon, 14 Jul 2025 16:24:22 +0200
Message-ID: <3702f6b0-27a9-4ca1-adbd-fb1e2985b2d3@redhat.com>
In-Reply-To: <641F5B0B-2B48-46FA-AC58-3A8A4BEB1448@nvidia.com>
On 14.07.25 16:20, Zi Yan wrote:
> On 14 Jul 2025, at 9:53, Pankaj Raghav wrote:
>
>> Hi Zi Yan,
>>
>>>>
>>>> Probably the WARN_ON can indeed trigger now.
>>>>
>>>>
>>>> @Zi Yan, on a related note ...
>>>>
>>>> in memory_failure(), we call try_to_split_thp_page(). If it works,
>>>> we assume that we have a small folio.
>>>>
>>>> But that is not the case if split_huge_page() cannot split it to
>>>> order-0 ... min_order_for_split().
>>>
>>> Right. memory failure needs to learn about this. Either poison every
>>> subpage or write back if necessary and drop the page cache folio.
>>>
>>>>
>>>> I'm afraid we have more such code that does not expect that if split_huge_page()
>>>> succeeds that we still have a large folio ...
>>>
>>> I did some search, here are the users of split_huge_page*():
>>>
>>> 1. ksm: it is anonymous only, so split always goes to order-0;
>>> 2. userfaultfd: it is also anonymous only;
>>> 3. madvise cold or pageout: a large pagecache folio will be split if it is partially
>>> mapped. And it will retry. It might cause a deadlock if the folio has a min order.
>>> 4. shmem: split always goes to order-0;
>>> 5. memory-failure: see above.
>>>
>>> So we will need to take care of madvise cold or pageout case?
>>>
>>> Hi Matthew, Pankaj, and Luis,
>>>
>>> Is it possible to partially map a min-order folio in a fs with LBS? Based on my
>>
>> Typically, FSs match the min order with the blocksize of the filesystem.
>> As a filesystem block is the smallest unit of data that the filesystem uses
>> to store file data on the disk, we cannot partially map them.
>>
>> So if I understand your question correctly, the answer is no.
I'm confused. Shouldn't this be trivially possible?
E.g., just mmap() a single page of such a file? Who would make that fail?
And if it doesn't fail, who would stop us from munmap()'ing everything
except a single page?
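
To make the scenario concrete, here is a minimal userspace sketch of the
second sequence: map several pages of a file, then munmap() everything
except the first page. The function name keep_single_page_mapped() and the
scratch path are hypothetical, chosen only for illustration. Whether the
page cache actually backs the range with a large (min-order) folio depends
on the filesystem and kernel configuration, which this sketch does not
control or verify; it only shows that the syscall sequence itself is
ordinary.

```c
#include <assert.h>
#include <fcntl.h>
#include <string.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper: create a scratch file of `npages` pages, map all of
 * it, touch every page, then munmap() everything except the first page.
 * Returns 0 if the remaining single page is still mapped and readable,
 * -1 on any error.
 */
int keep_single_page_mapped(const char *path, size_t npages)
{
	long psz = sysconf(_SC_PAGESIZE);
	size_t len;
	char *map;
	int fd, ret = -1;

	assert(psz > 0);
	if (npages < 2)
		return -1;
	len = npages * (size_t)psz;

	fd = open(path, O_RDWR | O_CREAT | O_TRUNC, 0600);
	if (fd < 0)
		return -1;
	if (ftruncate(fd, (off_t)len) < 0)
		goto out_close;

	map = mmap(NULL, len, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
	if (map == MAP_FAILED)
		goto out_close;

	/* Fault in the whole range so the file pages are populated. */
	memset(map, 0xaa, len);

	/* Drop every page of the mapping except the first one. */
	if (munmap(map + psz, len - (size_t)psz) < 0) {
		munmap(map, len);
		goto out_close;
	}

	/* The first page must still be accessible through the mapping. */
	if ((unsigned char)map[0] == 0xaa)
		ret = 0;

	munmap(map, (size_t)psz);
out_close:
	close(fd);
	unlink(path);
	return ret;
}
```

Calling keep_single_page_mapped("/tmp/scratch.bin", 16) on a filesystem
with a min folio order > 0 would be one way to probe whether anything
rejects the partial unmap; the sketch makes no claim about the result there.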
--
Cheers,
David / dhildenb
Thread overview: 36+ messages
2025-06-27 12:57 [PATCH v2 0/2] fix two calls of unmap_poisoned_folio() for large folio Jinjiang Tu
2025-06-27 12:57 ` [PATCH v2 1/2] mm/vmscan: fix hwpoisoned large folio handling in shrink_folio_list Jinjiang Tu
2025-06-27 17:10 ` David Hildenbrand
2025-06-27 22:00 ` Andrew Morton
2025-06-28 2:38 ` Jinjiang Tu
2025-06-28 3:13 ` Miaohe Lin
2025-07-01 14:13 ` Oscar Salvador
2025-07-03 7:30 ` Jinjiang Tu
2025-06-27 12:57 ` [PATCH v2 2/2] mm/memory_hotplug: fix hwpoisoned large folio handling in do_migrate_range Jinjiang Tu
2025-07-01 14:21 ` Oscar Salvador
2025-07-03 7:46 ` Jinjiang Tu
2025-07-03 7:57 ` David Hildenbrand
2025-07-03 8:24 ` Jinjiang Tu
2025-07-03 9:06 ` David Hildenbrand
2025-07-07 11:51 ` Jinjiang Tu
2025-07-07 12:37 ` David Hildenbrand
2025-07-08 1:15 ` Jinjiang Tu
2025-07-08 9:54 ` David Hildenbrand
2025-07-09 16:27 ` Zi Yan
2025-07-14 13:53 ` Pankaj Raghav
2025-07-14 14:20 ` Zi Yan
2025-07-14 14:24 ` David Hildenbrand [this message]
2025-07-14 15:09 ` Pankaj Raghav (Samsung)
2025-07-14 15:14 ` David Hildenbrand
2025-07-14 15:25 ` Zi Yan
2025-07-14 15:28 ` Zi Yan
2025-07-14 15:33 ` David Hildenbrand
2025-07-14 15:44 ` Zi Yan
2025-07-14 15:52 ` David Hildenbrand
2025-07-20 2:23 ` Andrew Morton
2025-07-22 15:30 ` David Hildenbrand
2025-08-21 5:02 ` Andrew Morton
2025-08-21 22:07 ` David Hildenbrand
2025-08-22 17:24 ` Zi Yan
2025-08-25 2:05 ` Miaohe Lin
2025-07-03 7:53 ` David Hildenbrand