From: Felix Kuehling <felix.kuehling@amd.com>
To: David Hildenbrand <david@redhat.com>,
Alistair Popple <apopple@nvidia.com>,
Matthew Wilcox <willy@infradead.org>
Cc: Sebastian Andrzej Siewior <bigeasy@linutronix.de>,
linux-mm@kvack.org, Andrew Morton <akpm@linux-foundation.org>,
Thomas Gleixner <tglx@linutronix.de>
Subject: Re: BUG_ON() in pfn_swap_entry_to_page()
Date: Fri, 26 Apr 2024 10:56:29 -0400
Message-ID: <aaeb84f8-f4ad-43de-8f3a-5bc682041bc8@amd.com>
In-Reply-To: <b2138961-33c7-4713-9f29-26c769533780@redhat.com>

On 2024-04-26 4:49, David Hildenbrand wrote:
> On 25.04.24 16:33, Felix Kuehling wrote:
>>
>>
>> On 2024-04-25 5:32, David Hildenbrand wrote:
>>> On 24.04.24 21:45, Felix Kuehling wrote:
>>>> Sorry for top-posting. I'm resurrecting an old thread here because I
>>>> think I ran into the same problem with this assertion failing on Linux
>>>> 6.7:
>>>>
>>>> static inline struct page *pfn_swap_entry_to_page(swp_entry_t entry)
>>>> {
>>>>         struct page *p = pfn_to_page(swp_offset_pfn(entry));
>>>>
>>>>         /*
>>>>          * Any use of migration entries may only occur while the
>>>>          * corresponding page is locked
>>>>          */
>>>> -->     BUG_ON(is_migration_entry(entry) && !PageLocked(p));
>>>>
>>>>         return p;
>>>> }
>>>>
>>>> It looks like this thread just fizzled two years ago. Did anything
>>>> ever come of this?
>>>>
>>>> Maybe I should add that I saw this in a pre-silicon test environment.
>>>> I've never seen this on real hardware. Maybe something
>>>> timing-sensitive.
>>>
>>> In the past, it indicated swap PTE corruption that would, e.g., mess up
>>> the stored PFN or the swap entry type.
>>>
>>> On which call chain do you see that?
>>>
>>
>> This is the backtrace; it's coming from hmm_range_fault. It looks like
>> the swap entries are from migrated DEVICE_PRIVATE pages.
>
> Thanks, on which kernel version can you reproduce this?

This is on a branch based on v6.7:

$ git describe HEAD
v6.7-2677-g065851796b25

The branch mostly changes code in drivers. No changes in kernel/ or mm/.
A few changes in include/linux, but nothing that looks related to core
memory management.

>
>>
>> [Apr 3 20:11] ------------[ cut here ]------------
>> [ +0.000041] kernel BUG at include/linux/swapops.h:466!
>> [ +0.000691] invalid opcode: 0000 [#1] PREEMPT SMP NOPTI
>> [ +0.000342] CPU: 2 PID: 49 Comm: kworker/2:1 Not tainted 6.7.0-kfd-compute-rocm-npi-186 #1
>> [ +0.000556] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS rel-1.14.0-0-g155821a1990b-prebuilt.qemu.org 04/01/2014
>> [ +0.000703] Workqueue: events amdgpu_irq_handle_ih_soft [amdgpu]
>> [ +0.000501] RIP: 0010:migration_entry_wait_on_locked+0x26b/0x2b0
>> [ +0.000389] Code: fe ff ff 48 8d 7c 24 07 e8 02 7e f0 ff e9 58 fe ff ff 48 8b 43 08 a8 01 75 3f 66 90 48 89 d8 48 8b 00 a8 01 0f 85 f1 fd ff ff <0f> 0b 48 8d 58 ff e9 f7 fd ff ff 48 89 d8 f7 c3 ff 0f 00 00 75 df
>> [ +0.001161] RSP: 0018:ffffb211c01bb788 EFLAGS: 00010246
>> [ +0.000339] RAX: 017fff8000080018 RBX: fffff682c40ce8c0 RCX: 0000000000000001
>> [ +0.000463] RDX: 0000000000000000 RSI: ffff977a45034840 RDI: 000000000000001a
>> [ +0.000454] RBP: ffff977a45034840 R08: 68000000001033a3 R09: 0000000000000030
>> [ +0.000451] R10: ffffb211c01bb6a8 R11: 0000000000000001 R12: ffff977a46bd1318
>> [ +0.000461] R13: 0000000000000003 R14: 4000000000000000 R15: ffffb211c01bb9b8
>> [ +0.000454] FS: 0000000000000000(0000) GS:ffff977dafd00000(0000) knlGS:0000000000000000
>> [ +0.000518] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
>> [ +0.000372] CR2: 00007fa2d1cba000 CR3: 00000001030d2004 CR4: 0000000000770ef0
>> [ +0.000453] PKRU: 55555554
>> [ +0.000182] Call Trace:
>> [ +0.000171] <TASK>
>> [ +0.000147] ? die+0x37/0x90
>> [ +0.000211] ? do_trap+0xe0/0x110
>> [ +0.000221] ? migration_entry_wait_on_locked+0x26b/0x2b0
>> [ +0.000351] ? do_error_trap+0x98/0x120
>> [ +0.000252] ? migration_entry_wait_on_locked+0x26b/0x2b0
>> [ +0.000346] ? migration_entry_wait_on_locked+0x26b/0x2b0
>> [ +0.000355] ? exc_invalid_op+0x52/0x70
>> [ +0.000254] ? migration_entry_wait_on_locked+0x26b/0x2b0
>> [ +0.000345] ? asm_exc_invalid_op+0x1a/0x20
>> [ +0.000274] ? migration_entry_wait_on_locked+0x26b/0x2b0
>> [ +0.000361] ? migration_entry_wait+0x4e/0x160
>> [ +0.000293] ? lock_release+0x119/0x260
>> [ +0.000255] migration_entry_wait+0x105/0x160
>> [ +0.000290] hmm_vma_walk_pmd+0x822/0x8a0
>> [ +0.000263] walk_pgd_range+0x40b/0x900
>> [ +0.000268] __walk_page_range+0x205/0x220
>
> I wonder if that is coming from pmd_migration_entry_wait() or
> migration_entry_wait() -- the "?" above adds uncertainty :)

This is weird. I only see a call to pmd_migration_entry_wait in
hmm_vma_walk_pmd.

>
> Likely it's from migration_entry_wait().
>
> I was first concerned about the lack of PTL in this function, but
> migration_entry_wait() will take the PTL and re-read the PTE.
>
> So when we call into migration_entry_wait_on_locked(), we are holding
> the PTL and we verified that we indeed have a migration entry.
>
> So if we fail in
> migration_entry_wait_on_locked()->pfn_swap_entry_folio(), we verified
> under PTL and still have a migration entry.
>
> The referenced folio is indeed not locked then.

I must admit, I'm not familiar with this code at all, so my observations
and questions are probably naive. So is the BUG_ON bad, or is
migration_entry_wait_on_locked missing some page locking?

I see that migration_entry_wait_on_locked does a
folio_trylock_flag(folio, PG_locked, wait), but _after_ getting the
folio with page_folio(pfn_swap_entry_to_page(entry)).
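
The ordering described above can be sketched as a toy user-space model.
This is only an illustration under stated assumptions: every model_*
name below is hypothetical, none of it is kernel code, and it ignores
the PTL, reference counting and actual sleeping. It only shows that when
the lookup asserts "migration entry implies locked page" before the
waiter tests the lock bit, an unlocked page behind a migration entry
hits the assertion instead of reaching the "nothing to wait for" path.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-ins for illustration; not the kernel's definitions. */
enum model_swp_type { MODEL_SWP_OTHER = 0, MODEL_SWP_MIGRATION = 1 };

struct model_page { bool locked; };

struct model_swp_entry {
	enum model_swp_type type;
	uint64_t pfn;
};

#define MODEL_NR_PAGES 4
static struct model_page model_mem_map[MODEL_NR_PAGES];

static struct model_swp_entry model_make_entry(enum model_swp_type type,
					       uint64_t pfn)
{
	struct model_swp_entry e = { type, pfn };

	assert(pfn < MODEL_NR_PAGES);	/* keep the toy lookup in bounds */
	return e;
}

static bool model_is_migration_entry(struct model_swp_entry e)
{
	return e.type == MODEL_SWP_MIGRATION;
}

/*
 * Same condition as the quoted BUG_ON(), but returned as a value
 * instead of crashing: true means the assertion would have fired.
 */
static bool model_lookup_would_bug(struct model_swp_entry e)
{
	const struct model_page *p = &model_mem_map[e.pfn];

	return model_is_migration_entry(e) && !p->locked;
}

enum model_wait_result { MODEL_WAIT_BUG, MODEL_WAIT_SLEEP, MODEL_WAIT_NONE };

/* Lookup (with its assertion) runs first; the lock-bit test comes second. */
static enum model_wait_result model_wait_on_locked(struct model_swp_entry e)
{
	if (model_lookup_would_bug(e))
		return MODEL_WAIT_BUG;
	if (model_mem_map[e.pfn].locked)
		return MODEL_WAIT_SLEEP;	/* a real waiter would sleep here */
	return MODEL_WAIT_NONE;
}
```

In this model, a migration entry can only ever produce MODEL_WAIT_SLEEP
or MODEL_WAIT_BUG; MODEL_WAIT_NONE is reachable only for non-migration
entries. Whether the real code is supposed to tolerate an unlocked page
at that point is exactly the open question.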

Maybe as a workaround for the team stumbling over this, I'll suggest
disabling THP.

Regards,
  Felix