From: Andrew Morton <akpm@linux-foundation.org>
To: akpm@linux-foundation.org, hughd@google.com,
kirill.shutemov@linux.intel.com, linux-mm@kvack.org,
mm-commits@vger.kernel.org, naoya.horiguchi@nec.com,
osalvador@suse.de, peterx@redhat.com, shy828301@gmail.com,
stable@vger.kernel.org, torvalds@linux-foundation.org,
willy@infradead.org
Subject: [patch 03/11] mm: filemap: check if THP has hwpoisoned subpage for PMD page fault
Date: Thu, 28 Oct 2021 14:36:11 -0700 [thread overview]
Message-ID: <20211028213611.-fbyoks-F%akpm@linux-foundation.org> (raw)
In-Reply-To: <20211028143506.5f5d5e2cd1f768a1da864844@linux-foundation.org>
From: Yang Shi <shy828301@gmail.com>
Subject: mm: filemap: check if THP has hwpoisoned subpage for PMD page fault
When handling shmem page fault the THP with corrupted subpage could be PMD
mapped if certain conditions are satisfied. But kernel is supposed to
send SIGBUS when trying to map hwpoisoned page.
There are two paths which may do PMD map: fault around and regular fault.
Before commit f9ce0be71d1f ("mm: Cleanup faultaround and finish_fault()
codepaths") the thing was even worse in fault around path. The THP could
be PMD mapped as long as the VMA fits regardless what subpage is accessed
and corrupted. After this commit as long as head page is not corrupted
the THP could be PMD mapped.
In the regular fault path the THP could be PMD mapped as long as the
corrupted page is not accessed and the VMA fits.
This loophole could be fixed by iterating every subpage to check if any of
them is hwpoisoned or not, but it is somewhat costly in page fault path.
So introduce a new page flag called HasHWPoisoned on the first tail page.
It indicates the THP has hwpoisoned subpage(s). It is set if any subpage
of THP is found hwpoisoned by memory failure and after the refcount is
bumped successfully, then cleared when the THP is freed or split.
The soft offline path doesn't need this since soft offline handler just
marks a subpage hwpoisoned when the subpage is migrated successfully. But
shmem THP didn't get split then migrated at all.
Link: https://lkml.kernel.org/r/20211020210755.23964-3-shy828301@gmail.com
Fixes: 800d8c63b2e9 ("shmem: add huge pages support")
Signed-off-by: Yang Shi <shy828301@gmail.com>
Reviewed-by: Naoya Horiguchi <naoya.horiguchi@nec.com>
Suggested-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Cc: Hugh Dickins <hughd@google.com>
Cc: Matthew Wilcox <willy@infradead.org>
Cc: Oscar Salvador <osalvador@suse.de>
Cc: Peter Xu <peterx@redhat.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---
include/linux/page-flags.h | 23 +++++++++++++++++++++++
mm/huge_memory.c | 2 ++
mm/memory-failure.c | 14 ++++++++++++++
mm/memory.c | 9 +++++++++
mm/page_alloc.c | 4 +++-
5 files changed, 51 insertions(+), 1 deletion(-)
--- a/include/linux/page-flags.h~mm-filemap-check-if-thp-has-hwpoisoned-subpage-for-pmd-page-fault
+++ a/include/linux/page-flags.h
@@ -171,6 +171,15 @@ enum pageflags {
/* Compound pages. Stored in first tail page's flags */
PG_double_map = PG_workingset,
+#ifdef CONFIG_MEMORY_FAILURE
+ /*
+ * Compound pages. Stored in first tail page's flags.
+ * Indicates that at least one subpage is hwpoisoned in the
+ * THP.
+ */
+ PG_has_hwpoisoned = PG_mappedtodisk,
+#endif
+
/* non-lru isolated movable page */
PG_isolated = PG_reclaim,
@@ -668,6 +677,20 @@ PAGEFLAG_FALSE(DoubleMap)
TESTSCFLAG_FALSE(DoubleMap)
#endif
+#if defined(CONFIG_MEMORY_FAILURE) && defined(CONFIG_TRANSPARENT_HUGEPAGE)
+/*
+ * PageHasHWPoisoned indicates that at least one subpage is hwpoisoned in the
+ * compound page.
+ *
+ * This flag is set by hwpoison handler. Cleared by THP split or free page.
+ */
+PAGEFLAG(HasHWPoisoned, has_hwpoisoned, PF_SECOND)
+ TESTSCFLAG(HasHWPoisoned, has_hwpoisoned, PF_SECOND)
+#else
+PAGEFLAG_FALSE(HasHWPoisoned)
+ TESTSCFLAG_FALSE(HasHWPoisoned)
+#endif
+
/*
* Check if a page is currently marked HWPoisoned. Note that this check is
* best effort only and inherently racy: there is no way to synchronize with
--- a/mm/huge_memory.c~mm-filemap-check-if-thp-has-hwpoisoned-subpage-for-pmd-page-fault
+++ a/mm/huge_memory.c
@@ -2426,6 +2426,8 @@ static void __split_huge_page(struct pag
/* lock lru list/PageCompound, ref frozen by page_ref_freeze */
lruvec = lock_page_lruvec(head);
+ ClearPageHasHWPoisoned(head);
+
for (i = nr - 1; i >= 1; i--) {
__split_huge_page_tail(head, i, lruvec, list);
/* Some pages can be beyond EOF: drop them from page cache */
--- a/mm/memory.c~mm-filemap-check-if-thp-has-hwpoisoned-subpage-for-pmd-page-fault
+++ a/mm/memory.c
@@ -3907,6 +3907,15 @@ vm_fault_t do_set_pmd(struct vm_fault *v
return ret;
/*
+ * Just backoff if any subpage of a THP is corrupted otherwise
+ * the corrupted page may mapped by PMD silently to escape the
+ * check. This kind of THP just can be PTE mapped. Access to
+ * the corrupted subpage should trigger SIGBUS as expected.
+ */
+ if (unlikely(PageHasHWPoisoned(page)))
+ return ret;
+
+ /*
* Archs like ppc64 need additional space to store information
* related to pte entry. Use the preallocated table for that.
*/
--- a/mm/memory-failure.c~mm-filemap-check-if-thp-has-hwpoisoned-subpage-for-pmd-page-fault
+++ a/mm/memory-failure.c
@@ -1694,6 +1694,20 @@ try_again:
}
if (PageTransHuge(hpage)) {
+ /*
+ * The flag must be set after the refcount is bumped
+ * otherwise it may race with THP split.
+ * And the flag can't be set in get_hwpoison_page() since
+ * it is called by soft offline too and it is just called
+ * for !MF_COUNT_INCREASE. So here seems to be the best
+ * place.
+ *
+ * Don't need care about the above error handling paths for
+ * get_hwpoison_page() since they handle either free page
+ * or unhandlable page. The refcount is bumped iff the
+ * page is a valid handlable page.
+ */
+ SetPageHasHWPoisoned(hpage);
if (try_to_split_thp_page(p, "Memory Failure") < 0) {
action_result(pfn, MF_MSG_UNSPLIT_THP, MF_IGNORED);
res = -EBUSY;
--- a/mm/page_alloc.c~mm-filemap-check-if-thp-has-hwpoisoned-subpage-for-pmd-page-fault
+++ a/mm/page_alloc.c
@@ -1312,8 +1312,10 @@ static __always_inline bool free_pages_p
VM_BUG_ON_PAGE(compound && compound_order(page) != order, page);
- if (compound)
+ if (compound) {
ClearPageDoubleMap(page);
+ ClearPageHasHWPoisoned(page);
+ }
for (i = 1; i < (1 << order); i++) {
if (compound)
bad += free_tail_pages_check(page, page + i);
_
next prev parent reply other threads:[~2021-10-28 21:36 UTC|newest]
Thread overview: 12+ messages / expand[flat|nested] mbox.gz Atom feed top
2021-10-28 21:35 incoming Andrew Morton
2021-10-28 21:36 ` [patch 01/11] memcg: page_alloc: skip bulk allocator for __GFP_ACCOUNT Andrew Morton
2021-10-28 21:36 ` [patch 02/11] mm: hwpoison: remove the unnecessary THP check Andrew Morton
2021-10-28 21:36 ` Andrew Morton [this message]
2021-10-28 21:36 ` [patch 04/11] mm/oom_kill.c: prevent a race between process_mrelease and exit_mmap Andrew Morton
2021-10-28 21:36 ` [patch 05/11] ocfs2: fix race between searching chunks and release journal_head from buffer_head Andrew Morton
2021-10-28 21:36 ` [patch 06/11] mm/secretmem: avoid letting secretmem_users drop to zero Andrew Morton
2021-10-28 21:36 ` [patch 07/11] mm/vmalloc: fix numa spreading for large hash tables Andrew Morton
2021-10-28 21:36 ` [patch 08/11] mm, thp: bail out early in collapse_file for writeback page Andrew Morton
2021-10-28 21:36 ` [patch 09/11] mm: khugepaged: skip huge page collapse for special files Andrew Morton
2021-10-28 21:36 ` [patch 10/11] mm/damon/core-test: fix wrong expectations for 'damon_split_regions_of()' Andrew Morton
2021-10-28 21:36 ` [patch 11/11] tools/testing/selftests/vm/split_huge_page_test.c: fix application of sizeof to pointer Andrew Morton
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20211028213611.-fbyoks-F%akpm@linux-foundation.org \
--to=akpm@linux-foundation.org \
--cc=hughd@google.com \
--cc=kirill.shutemov@linux.intel.com \
--cc=linux-mm@kvack.org \
--cc=mm-commits@vger.kernel.org \
--cc=naoya.horiguchi@nec.com \
--cc=osalvador@suse.de \
--cc=peterx@redhat.com \
--cc=shy828301@gmail.com \
--cc=stable@vger.kernel.org \
--cc=torvalds@linux-foundation.org \
--cc=willy@infradead.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox