Hi Yan,

For the record, it was my colleague Adam Bratschi-Kaye (CCed) who wrote the Rust-based thp-madv-remove-test.

I don't have an issue if you include a C version as a kernel selftest. In fact, it's a good idea to prevent this from regressing in the future.

Cheers,

Bas


On Mon, 2 Mar 2026, 17:37 Zi Yan, <ziy@nvidia.com> wrote:
On 2 Mar 2026, at 10:11, Lance Yang wrote:

> On 2026/3/2 22:28, David Hildenbrand (Arm) wrote:
>> On 2/28/26 04:10, Lance Yang wrote:
>>>
>>>
>>> On 2026/2/28 09:06, Zi Yan wrote:
>>>> During a pagecache folio split, the values in the related xarray should
>>>> not be changed from the original folio at xarray split time until all
>>>> after-split folios are well formed and stored in the xarray. The current
>>>> use of xas_try_split() in __split_unmapped_folio() lets some after-split
>>>> folios show up at wrong indices in the xarray. When these misplaced
>>>> after-split folios are unfrozen before the correct folios are stored via
>>>> __xa_store(), and are grabbed by folio_try_get(), they are returned to
>>>> userspace at wrong file indices, causing data corruption.
>>>>
>>>> Fix it by using the original folio in xas_try_split() calls, so that
>>>> folio_try_get() can get the right after-split folios after the original
>>>> folio is unfrozen.
>>>>
>>>> Uniform split, split_huge_page*(), is not affected, since it uses
>>>> xas_split_alloc() and xas_split() only once and stores the original folio
>>>> in the xarray.
>>>>
>>>> The Fixes tag below points to the commit that introduced the code, but
>>>> folio_split() is used in a later commit 7460b470a131f ("mm/truncate: use
>>>> folio_split() in truncate operation").
>>>>
>>>> Fixes: 00527733d0dc8 ("mm/huge_memory: add two new (not yet used) functions for folio_split()")
>>>> Reported-by: Bas van Dijk <bas@dfinity.org>
>>>> Closes: https://lore.kernel.org/all/CAKNNEtw5_kZomhkugedKMPOG-sxs5Q5OLumWJdiWXv+C9Yct0w@mail.gmail.com/
>>>> Signed-off-by: Zi Yan <ziy@nvidia.com>
>>>> Cc: <stable@vger.kernel.org>
>>>> ---
>>>
>>> Thanks for the fix!
>>>
>>> I also made a C reproducer and tested this patch - the corruption
>>> disappeared.
>>
>> Should we link that reproducer somehow from the patch description?
>
> Yes, the original reproducer provided by Bas is available here[1].
>
> Regarding the C reproducer, Zi plans to add it to selftests in a
> follow-up patch (as we discussed off-list).
>
> [1] https://github.com/dfinity/thp-madv-remove-test

Sure. I will add the reproducer link to the commit log.


Hi Bas,

I used Cursor to convert your Rust-based thp-madv-remove-test to C.
Do you have any concerns if I add it to the kernel’s selftests to check
for this race condition?

Thanks.


Best Regards,
Yan, Zi