From: Peter Xu <peterx@redhat.com>
To: Oscar Salvador <osalvador@suse.de>
Cc: Andrew Morton <akpm@linux-foundation.org>,
linux-kernel@vger.kernel.org, linux-mm@kvack.org,
Miaohe Lin <linmiaohe@huawei.com>,
David Hildenbrand <david@redhat.com>,
stable@vger.kernel.org, Tony Luck <tony.luck@intel.com>,
Naoya Horiguchi <naoya.horiguchi@nec.com>
Subject: Re: [PATCH] mm,swapops: Update check in is_pfn_swap_entry for hwpoison entries
Date: Sun, 7 Apr 2024 15:19:57 -0400
Message-ID: <ZhLx3fwzQNPDbBei@x1n>
In-Reply-To: <ZhKmAecilbb2oSD9@localhost.localdomain>
On Sun, Apr 07, 2024 at 03:56:17PM +0200, Oscar Salvador wrote:
> On Sun, Apr 07, 2024 at 03:05:37PM +0200, Oscar Salvador wrote:
> > Tony reported that the Machine check recovery was broken in v6.9-rc1,
> > as he was hitting a VM_BUG_ON when injecting uncorrectable memory errors
> > to DRAM.
> > After some more digging and debugging on his side, he realized that this
> > went back to v6.1, with the introduction of 'commit 0d206b5d2e0d ("mm/swap: add
> > swp_offset_pfn() to fetch PFN from swap entry")'.
> > That commit, among other things, introduced swp_offset_pfn(), replacing
> > hwpoison_entry_to_pfn() in its favour.
> >
> > The patch also introduced a VM_BUG_ON() check for is_pfn_swap_entry(),
> > but is_pfn_swap_entry() never got updated to cover hwpoison entries, which
> > means that we would hit the VM_BUG_ON whenever we would call
> > swp_offset_pfn() for such entries on environments with CONFIG_DEBUG_VM set.
> > Fix this by updating the check to cover hwpoison entries as well, and update
> > the comment while we are it.
> >
> > Reported-by: Tony Luck <tony.luck@intel.com>
> > Closes: https://lore.kernel.org/all/Zg8kLSl2yAlA3o5D@agluck-desk3/
> > Tested-by: Tony Luck <tony.luck@intel.com>
> > Fixes: 0d206b5d2e0d ("mm/swap: add swp_offset_pfn() to fetch PFN from swap entry")
Totally unexpected, as this commit even removed hwpoison_entry_to_pfn().
Until now I had simply assumed that hwpoison entries were counted as pfn swap
entries, but they were just missing from the check.
Since this commit didn't really change is_pfn_swap_entry() itself, I wondered
whether an older Fixes tag would apply, but then I noticed that the code
before it should indeed work fine even with hwpoison entries missing from
the check. For example, whether a hwpoisoned page should be accounted in
smaps is a grey area. So I think the Fixes tag is correct, and thanks for
fixing this.
Reviewed-by: Peter Xu <peterx@redhat.com>
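To make the failure mode concrete, here is a minimal userspace sketch of the check before and after the fix. It is only a model: every model_* name is made up for illustration, the entry encoding is nothing like the kernel's real swp_entry_t, and assert() merely stands in for VM_BUG_ON() on a CONFIG_DEBUG_VM kernel.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical, simplified entry types; the kernel's real encoding differs. */
enum model_swp_type {
	MODEL_SWP_SWAPFILE,		/* ordinary swap slot, offset is not a PFN */
	MODEL_SWP_MIGRATION,
	MODEL_SWP_DEVICE_PRIVATE,
	MODEL_SWP_DEVICE_EXCLUSIVE,
	MODEL_SWP_HWPOISON,
};

struct model_swp_entry {
	enum model_swp_type type;
	unsigned long offset;		/* holds a PFN for PFN-carrying types */
};

/* The check as described before the fix: hwpoison is not listed. */
static bool model_is_pfn_swap_entry_old(struct model_swp_entry e)
{
	return e.type == MODEL_SWP_MIGRATION ||
	       e.type == MODEL_SWP_DEVICE_PRIVATE ||
	       e.type == MODEL_SWP_DEVICE_EXCLUSIVE;
}

/* The check with hwpoison entries covered, as the patch describes. */
static bool model_is_pfn_swap_entry(struct model_swp_entry e)
{
	return model_is_pfn_swap_entry_old(e) || e.type == MODEL_SWP_HWPOISON;
}

/* Stand-in for swp_offset_pfn(); assert() plays the role of VM_BUG_ON(). */
static unsigned long model_swp_offset_pfn(struct model_swp_entry e)
{
	assert(model_is_pfn_swap_entry(e));
	return e.offset;
}
```

If model_swp_offset_pfn() asserted on the old check instead, any hwpoison entry would trip it, which is the situation Tony hit.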
> > Cc: <stable@vger.kernel.org> # 6.1.x
>
> I think I need to clarify why the stable.
>
> It is my understanding that some distros ship their kernel with
> CONFIG_DEBUG_VM set by default (I think Fedora comes to my mind?).
> I am fine with backing down if people think that this is an
> overreaction.
Fedora stopped enabling DEBUG_VM some time ago, but I'm not sure whether it
was still enabled in the 6.1 trees. Cc'ing stable still looks reasonable in
that regard.
A side note: while looking at this, I went back to check why in some cases we
need to keep the pfn in a poisoned entry at all. The only user I found is
check_hwpoisoned_entry(), which wants to do fast kills in some contexts and
double-checks the pfn stored in a poisoned entry.
Afaict this path is just too rarely used, and it is buggy.
A few things may need fixing; maybe someone in the loop has time to take a
look:
  - check_hwpoisoned_entry()
    - the pte_none check is missing
    - all the other swap entry types are missing (e.g., we want to kill the
      proc too if the page is under migration)
  - check_hwpoisoned_pmd_entry()
    - needs similar care as above (pmd_none is covered, but not the others)
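A rough userspace sketch of what the pte-level items above could look like, to show the intent only. This is not a proposed patch: the model_* types and helpers are hypothetical and do not match the kernel's pte_t or swap entry encoding, and which entry types should really count is exactly what would need review.

```c
#include <stdbool.h>

/* Hypothetical model of the three states a PTE slot can be in. */
enum model_pte_kind {
	MODEL_PTE_NONE,			/* empty slot, nothing mapped */
	MODEL_PTE_PRESENT,		/* maps a page directly */
	MODEL_PTE_SWAP,			/* non-present, holds a swap-style entry */
};

enum model_entry_type {
	MODEL_ENTRY_SWAPFILE,		/* value is a swap offset, not a PFN */
	MODEL_ENTRY_MIGRATION,
	MODEL_ENTRY_HWPOISON,
};

struct model_pte {
	enum model_pte_kind kind;
	enum model_entry_type type;	/* only meaningful for MODEL_PTE_SWAP */
	unsigned long value;		/* PFN, or swap offset for SWAPFILE */
};

/* Entry types assumed here to carry a PFN in their offset. */
static bool model_entry_has_pfn(enum model_entry_type type)
{
	return type == MODEL_ENTRY_MIGRATION || type == MODEL_ENTRY_HWPOISON;
}

/* Returns true when the slot refers to poisoned_pfn. */
static bool model_pte_hits_poison(struct model_pte pte,
				  unsigned long poisoned_pfn)
{
	/* Bail out on empty slots before interpreting any bits. */
	if (pte.kind == MODEL_PTE_NONE)
		return false;

	/* Non-present entries only count when they really encode a PFN. */
	if (pte.kind == MODEL_PTE_SWAP && !model_entry_has_pfn(pte.type))
		return false;

	return pte.value == poisoned_pfn;
}
```

In this model a migration entry pointing at the poisoned pfn is a hit, while an empty slot or an ordinary swap entry whose raw value happens to equal the pfn is not.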
I copied Naoya too.
Thanks,
--
Peter Xu