From: David Hildenbrand <david@redhat.com>
To: Peter Xu <peterx@redhat.com>
Cc: syzbot <syzbot+bf2c35fa302ebe3c7471@syzkaller.appspotmail.com>,
akpm@linux-foundation.org, bp@alien8.de,
dave.hansen@linux.intel.com, hpa@zytor.com, jgg@ziepe.ca,
leitao@debian.org, linux-kernel@vger.kernel.org,
linux-mm@kvack.org, mingo@redhat.com, rppt@kernel.org,
syzkaller-bugs@googlegroups.com, tglx@linutronix.de,
x86@kernel.org
Subject: Re: [syzbot] [mm?] WARNING in copy_huge_pmd
Date: Thu, 26 Sep 2024 17:25:07 +0200
Message-ID: <72ea5f41-6186-4383-90e4-cdd04fac6a3c@redhat.com>
In-Reply-To: <ZvVkG4aSxS9rNoRd@x1n>
On 26.09.24 15:39, Peter Xu wrote:
> On Thu, Sep 26, 2024 at 12:48:19PM +0200, David Hildenbrand wrote:
>> On 25.09.24 18:59, Peter Xu wrote:
>>> On Tue, Sep 24, 2024 at 04:45:00PM +0200, David Hildenbrand wrote:
>>>> On 23.09.24 14:18, syzbot wrote:
>>>>> Hello,
>>>>>
>>>>> syzbot found the following issue on:
>>>>>
>>>>> HEAD commit: 88264981f208 Merge tag 'sched_ext-for-6.12' of git://git.k..
>>>>> git tree: upstream
>>>>> console+strace: https://syzkaller.appspot.com/x/log.txt?x=16c36c27980000
>>>>> kernel config: https://syzkaller.appspot.com/x/.config?x=e851828834875d6f
>>>>> dashboard link: https://syzkaller.appspot.com/bug?extid=bf2c35fa302ebe3c7471
>>>>> compiler: Debian clang version 15.0.6, GNU ld (GNU Binutils for Debian) 2.40
>>>>> syz repro: https://syzkaller.appspot.com/x/repro.syz?x=12773080580000
>>>>> C reproducer: https://syzkaller.appspot.com/x/repro.c?x=16ed5e9f980000
>>>>>
>>>>> Downloadable assets:
>>>>> disk image: https://storage.googleapis.com/syzbot-assets/0e011ac37c93/disk-88264981.raw.xz
>>>>> vmlinux: https://storage.googleapis.com/syzbot-assets/f5c65577e19e/vmlinux-88264981.xz
>>>>> kernel image: https://storage.googleapis.com/syzbot-assets/984d963c8ea1/bzImage-88264981.xz
>>>>>
>>>>> The issue was bisected to:
>>>>>
>>>>> commit 75182022a0439788415b2dd1db3086e07aa506f7
>>>>> Author: Peter Xu <peterx@redhat.com>
>>>>> Date: Mon Aug 26 20:43:51 2024 +0000
>>>>>
>>>>> mm/x86: support large pfn mappings
>>>>>
>>>>> bisection log: https://syzkaller.appspot.com/x/bisect.txt?x=17df9c27980000
>>>>> final oops: https://syzkaller.appspot.com/x/report.txt?x=143f9c27980000
>>>>> console output: https://syzkaller.appspot.com/x/log.txt?x=103f9c27980000
>>>>>
>>>>> IMPORTANT: if you fix the issue, please add the following tag to the commit:
>>>>> Reported-by: syzbot+bf2c35fa302ebe3c7471@syzkaller.appspotmail.com
>>>>> Fixes: 75182022a043 ("mm/x86: support large pfn mappings")
>>>>>
>>>>> ------------[ cut here ]------------
>>>>> WARNING: CPU: 1 PID: 5508 at mm/huge_memory.c:1602 copy_huge_pmd+0x102c/0x1c60 mm/huge_memory.c:1602
>>>>
>>>> This is the
>>>>
>>>> VM_WARN_ON_ONCE(is_cow_mapping(src_vma->vm_flags) && pmd_write(pmd))
>>>>
>>>> So we have a special-marked PMD in a COW mapping.
>>>>
>>>> The reproducer seems to involve fuse, but not sure if that makes a
>>>> difference here.
>>>
>>> That chunk of code seems to be there only to make sure the test won't get
>>> blocked due to any FUSE-based fs being stuck, via writing to the "abort"
>>> file:
>>>
>>>     snprintf(abort, sizeof(abort), "/sys/fs/fuse/connections/%s/abort",
>>>              ent->d_name);
>>>     int fd = open(abort, O_WRONLY);
>>>     if (fd == -1) {
>>>       continue;
>>>     }
>>>     if (write(fd, abort, 1) < 0) {
>>>     }
>>>     close(fd);
>>>
>>> So far looks not relevant to this issue indeed.
>>>
>>> Unfortunately I cannot reproduce it even with the reproducer. So this one
>>> is a bit tricky..
>>>
>>> What confuses me yet is how that special bit is set, if it's only used so
>>> far with vfio-pci, and this test doesn't seem to have it involved.
>>>
>>> The test keeps invoking processes, then threads, doing concurrent accesses
>>> over a few stuff (madvise, mremap, migrate_pages, munmap, etc.) on the
>>> pre-mapped areas, but none of them seem to create new memory that can
>>> provide hint on how special bit can start to occur.
>>>
>>> I wonder if some of these operations can race in a way that mm can wrongly
>>> create the special bit (along with it being writable).. and then it could
>>> be a historical bug, only captured by this patchset due to the newly added
>>> WARN_ON_ONCE somehow, then it could mean that it's not the WRITE bit that
>>> is not intended, but the SPECIAL bit altogether.
>>
>> I assume you are missing a check for present/non-swap pmds. Assume you have
>> a migration entry and end up using the special bit -- which is perfectly
>> fine -- your code would assume it's a present PMD with the special bit set.
>>
>> Maybe for the time being something like:
>>
>> diff --git a/mm/huge_memory.c b/mm/huge_memory.c
>> index 0580ac9e47b9..e55efcad1e6c 100644
>> --- a/mm/huge_memory.c
>> +++ b/mm/huge_memory.c
>> @@ -1586,7 +1586,7 @@ int copy_huge_pmd(struct mm_struct *dst_mm, struct mm_struct *src_mm,
>>         int ret = -ENOMEM;
>>
>>         pmd = pmdp_get_lockless(src_pmd);
>> -       if (unlikely(pmd_special(pmd))) {
>> +       if (unlikely(pmd_present(pmd) && pmd_special(pmd))) {
>>                 dst_ptl = pmd_lock(dst_mm, dst_pmd);
>>                 src_ptl = pmd_lockptr(src_mm, src_pmd);
>>                 spin_lock_nested(src_ptl, SINGLE_DEPTH_NESTING);
>
> Good catch!
>
> I definitely overlooked it, and I did check that the config has THP_MIGRATION
> set indeed. So it's very possibly relevant.
>
> Do you want to send a formal patch? You can also push a branch with "#syz
> test", looks like syzbot can constantly reproduce.
Yes, let me send out a patch real quick.
--
Cheers,
David / dhildenb
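
For readers following along, the reasoning behind the proposed check can be
illustrated with a small userspace model. This is only a sketch under stated
assumptions: the toy_* names and the bit positions below are hypothetical and
do not reflect the real x86 PMD layout or the kernel's helpers. It only shows
why a raw "special" bit test can misfire on a non-present (e.g. migration)
entry, and why testing "present" first avoids that.

```c
#include <stdbool.h>
#include <stdint.h>

/* Toy PMD value. The layout is hypothetical, NOT the real x86 encoding. */
typedef uint64_t toy_pmd_t;

#define TOY_PMD_PRESENT (UINT64_C(1) << 0) /* entry maps memory right now */
#define TOY_PMD_SPECIAL (UINT64_C(1) << 9) /* software "special" marker   */

/* Only a present entry's remaining bits may be read as flags. */
static bool toy_pmd_present(toy_pmd_t pmd)
{
	return (pmd & TOY_PMD_PRESENT) != 0;
}

/* Raw bit test: it does not know whether the entry is present. */
static bool toy_pmd_special(toy_pmd_t pmd)
{
	return (pmd & TOY_PMD_SPECIAL) != 0;
}

/*
 * Encode a non-present entry (think: migration entry). The payload is
 * shifted so the present bit stays clear; every other bit, including the
 * one that doubles as "special" in present entries, is free for payload.
 */
static toy_pmd_t toy_make_nonpresent_pmd(uint64_t payload)
{
	return payload << 1;
}

/* Models the check before the proposed change: special bit alone. */
static bool toy_takes_special_path_unchecked(toy_pmd_t pmd)
{
	return toy_pmd_special(pmd);
}

/* Models the proposed check: the entry must be present AND special. */
static bool toy_takes_special_path_checked(toy_pmd_t pmd)
{
	return toy_pmd_present(pmd) && toy_pmd_special(pmd);
}
```

In this model, toy_make_nonpresent_pmd(0x100) happens to set the bit that the
toy layout uses for "special": the unchecked variant takes the special path
for it, while the checked variant does not. A present entry carrying the
special bit takes the special path in both variants.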
Thread overview: 8+ messages
2024-09-23 12:18 syzbot
2024-09-24 14:45 ` David Hildenbrand
2024-09-25 16:59 ` Peter Xu
2024-09-26 10:48 ` David Hildenbrand
2024-09-26 13:39 ` Peter Xu
2024-09-26 15:25 ` David Hildenbrand [this message]
2024-09-26 15:45 ` David Hildenbrand
2024-09-27 4:20 ` syzbot