From: Gavin Guo <gavinguo@igalia.com>
To: Zi Yan <ziy@nvidia.com>
Cc: David Hildenbrand <david@redhat.com>,
Hugh Dickins <hughd@google.com>,
linux-mm@kvack.org, akpm@linux-foundation.org,
willy@infradead.org, linmiaohe@huawei.com, revest@google.com,
kernel-dev@igalia.com, linux-kernel@vger.kernel.org
Subject: Re: [PATCH] mm/huge_memory: fix dereferencing invalid pmd migration entry
Date: Thu, 17 Apr 2025 20:38:30 +0800
Message-ID: <492b58a8-ff4a-4afe-b317-6fd1bafc874e@igalia.com>
In-Reply-To: <CD959F2D-FD0B-42C3-B451-ABCE254485E7@nvidia.com>
On 4/17/25 20:10, Zi Yan wrote:
> On 17 Apr 2025, at 8:02, Gavin Guo wrote:
>
>> On 4/17/25 19:32, Zi Yan wrote:
>>> On 17 Apr 2025, at 7:21, Gavin Guo wrote:
>>>
>>>> On 4/17/25 17:04, David Hildenbrand wrote:
>>>>> On 17.04.25 10:55, Hugh Dickins wrote:
>>>>>> On Thu, 17 Apr 2025, David Hildenbrand wrote:
>>>>>>> On 17.04.25 09:18, David Hildenbrand wrote:
>>>>>>>> On 17.04.25 07:36, Hugh Dickins wrote:
>>>>>>>>> On Wed, 16 Apr 2025, David Hildenbrand wrote:
>>>>>>>>>>
>>>>>>>>>> Why not something like
>>>>>>>>>>
>>>>>>>>>> struct folio *entry_folio;
>>>>>>>>>>
>>>>>>>>>> if (folio) {
>>>>>>>>>> 	if (is_pmd_migration_entry(*pmd))
>>>>>>>>>> 		entry_folio = pfn_swap_entry_folio(pmd_to_swp_entry(*pmd));
>>>>>>>>>> 	else
>>>>>>>>>> 		entry_folio = pmd_folio(*pmd);
>>>>>>>>>>
>>>>>>>>>> 	if (folio != entry_folio)
>>>>>>>>>> 		return;
>>>>>>>>>> }
>>>>>>>>>
>>>>>>>>> My own preference is to not add unnecessary code:
>>>>>>>>> if folio and pmd_migration entry, we're not interested in entry_folio.
>>>>>>>>> But yes it could be written in lots of other ways.
>>>>>>>>
>>>>>>>> While I don't disagree about "not adding unnecessary code" in general,
>>>>>>>> in this particular case just looking the folio up properly might be the
>>>>>>>> better alternative to reasoning about locking rules with conditional
>>>>>>>> input parameters :)
>>>>>>>>
>>>>>>>
>>>>>>> FWIW, I was wondering if we can rework that code, letting the caller do the
>>>>>>> checking and getting rid of the folio parameter. Something like this
>>>>>>> (incomplete, just to discuss if we could move the TTU_SPLIT_HUGE_PMD
>>>>>>> handling).
>>>>>>
>>>>>> Yes, I too dislike the folio parameter used for a single case, and agree
>>>>>> it's better for the caller who chose pmd to check that *pmd fits the folio.
>>>>>>
>>>>>> I haven't checked your code below, but it looks like a much better way
>>>>>> to proceed, using the page_vma_mapped_walk() to get pmd lock and check;
>>>>>> and cutting out two or more layers of split_huge_pmd obscurity.
>>>>>>
>>>>>> Way to go. However... what we want right now is a fix that can easily
>>>>>> go to stable: the rearrangements here in 6.15-rc mean, I think, that
>>>>>> whatever goes into the current tree will have to be placed differently
>>>>>> for stable, no seamless backports; but Gavin's patch (reworked if you
>>>>>> insist) can be adapted to stable (differently for different releases)
>>>>>> more easily than the future direction you're proposing here.
>>>>>
>>>>> I'm fine with going with the current patch and looking into cleaning it up properly (if possible).
>>>>>
>>>>> So for this patch
>>>>>
>>>>> Acked-by: David Hildenbrand <david@redhat.com>
>>>>>
>>>>> @Gavin, can you look into cleaning that up?
>>>>
>>>> Thank you for your review. Before I begin the cleanup, could you please
>>>> confirm the following action items:
>>>>
>>>> Zi Yan's suggestions for the patch are:
>>>> 1. Replace the page fault with an invalid address access in the commit
>>>> description.
>>>>
>>>> 2. Simplify the nested if-statements into a single if-statement to
>>>> reduce indentation.
>>>
>>> 3. Can you please add Hugh’s explanation below in the commit log?
>>> That clarifies the issue. Thank you for the fix.
>>
>> Sure, will send out another patch for your review. Thank you for the review.
>>
> Thanks. Do you mind sharing the syzkaller reproducer if that is
> possible and easy? I am trying to understand more about the issue.

Sure, this is the reproducer:
https://drive.google.com/file/d/1eDBV6VfIzyqD9SeYGQBah-BJXO32Piy8/view

Reproducing steps:

1). gcc -o repro -lpthread -static ./repro.c
2). ./repro
3). Find the group number in the output of the following command, and
    use it in place of 2539 in step 4:
    sudo cat /sys/kernel/debug/shrinker/thp-deferred_split-12/count
4). Run the following command in multiple sessions:
    for i in $(seq 10000); do echo "2539 0 100" | sudo tee /sys/kernel/debug/shrinker/thp-deferred_split-12/scan ; done

Generally, the bug will be triggered within 5 minutes.

>
>>>
>>> “
>>> an anon_vma lookup points to a
>>> location which may contain the folio of interest, but might instead
>>> contain another folio: and weeding out those other folios is precisely
>>> what the "folio != pmd_folio(*pmd)" check (and the "risk of replacing
>>> the wrong folio" comment a few lines above it) is for.
>>> ”
>>>
>>> With that, Acked-by: Zi Yan <ziy@nvidia.com>
>>>
>>>>
>>>> David, based on your comment, I understand that you are recommending the
>>>> entry_folio implementation. Also, from your discussion with Hugh, it
>>>> appears you agreed with my original approach of returning early when
>>>> encountering a PMD migration entry, thereby avoiding unnecessary checks.
>>>> Is that correct? If so, I will keep the current logic. Do you have any
>>>> additional cleanup suggestions?
>>>>
>>>> I will start the cleanup work after confirmation.
>>>>
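
For reference, the early-return logic described above could be modeled in
userspace roughly as follows. This is only a toy sketch, not the patch
itself: toy_pmd, toy_folio and toy_should_split() are made-up names, not
kernel APIs. It assumes, following the folio-lock reasoning in this thread,
that when a folio is passed in, a PMD migration entry cannot be for that
folio, so no folio lookup is needed in that case.

```c
/* Toy userspace model, NOT kernel code: every name here is hypothetical. */
#include <stdbool.h>
#include <stddef.h>

struct toy_folio {
	int id;
};

enum toy_pmd_kind {
	TOY_PMD_NONE,		/* nothing huge mapped here */
	TOY_PMD_HUGE,		/* present huge mapping of a folio */
	TOY_PMD_MIGRATION,	/* PMD migration entry */
};

struct toy_pmd {
	enum toy_pmd_kind kind;
	struct toy_folio *folio;	/* only meaningful for TOY_PMD_HUGE */
};

/* Returns true if the split would go ahead, false on an early return. */
static bool toy_should_split(const struct toy_pmd *pmd,
			     const struct toy_folio *folio)
{
	bool migration = pmd->kind == TOY_PMD_MIGRATION;

	if (pmd->kind != TOY_PMD_HUGE && !migration)
		return false;

	/*
	 * Single check instead of nested ifs: when the caller passed a
	 * folio, bail out on a migration entry before pmd->folio is ever
	 * read, and bail out if the mapped folio is a different one.
	 */
	if (folio && (migration || folio != pmd->folio))
		return false;

	return true;
}
```

Because of the short-circuit evaluation of "||", pmd->folio is never read
for a migration entry when a folio is given, which is the point of the
single if-statement.
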
>>>>>
>>>>>>
>>>>>> (Hmm, that may be another reason for preferring the reasoning by
>>>>>> folio lock: forgive me if I'm misremembering, but didn't those
>>>>>> page migration swapops get renamed, some time around 5.11?)
>>>>>
>>>>> I remember that we did something to PTE handling stuff in the context of PTE markers. But things keep changing all of the time .. :)
>>>>>
>>>
>>>
>>> Best Regards,
>>> Yan, Zi
>
>
> Best Regards,
> Yan, Zi