* [PATCH v2 1/5] mm/memory-failure: unlock_page before put_page
2015-07-31 6:46 [PATCH v2 0/5] hwpoison: fixes on v4.2-rc4 Naoya Horiguchi
@ 2015-07-31 6:46 ` Naoya Horiguchi
2015-07-31 20:54 ` David Rientjes
2015-07-31 6:46 ` [PATCH v2 2/5] mm/memory-failure: fix race in counting num_poisoned_pages Naoya Horiguchi
` (3 subsequent siblings)
4 siblings, 1 reply; 7+ messages in thread
From: Naoya Horiguchi @ 2015-07-31 6:46 UTC (permalink / raw)
To: Andrew Morton
Cc: Andi Kleen, Dean Nelson, Tony Luck, Kirill A. Shutemov,
Hugh Dickins, linux-mm, linux-kernel, Naoya Horiguchi,
Naoya Horiguchi
In "just unpoisoned" path, we do put_page and then unlock_page, which is a
wrong order and causes "freeing locked page" bug. So let's fix it.
Signed-off-by: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>
---
mm/memory-failure.c | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)
diff --git v4.2-rc4.orig/mm/memory-failure.c v4.2-rc4/mm/memory-failure.c
index c53543d89282..04d677048af7 100644
--- v4.2-rc4.orig/mm/memory-failure.c
+++ v4.2-rc4/mm/memory-failure.c
@@ -1209,9 +1209,9 @@ int memory_failure(unsigned long pfn, int trapno, int flags)
if (!PageHWPoison(p)) {
printk(KERN_ERR "MCE %#lx: just unpoisoned\n", pfn);
atomic_long_sub(nr_pages, &num_poisoned_pages);
+ unlock_page(hpage);
put_page(hpage);
- res = 0;
- goto out;
+ return 0;
}
if (hwpoison_filter(p)) {
if (TestClearPageHWPoison(p))
--
2.4.3
--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org. For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>
^ permalink raw reply [flat|nested] 7+ messages in thread* Re: [PATCH v2 1/5] mm/memory-failure: unlock_page before put_page
2015-07-31 6:46 ` [PATCH v2 1/5] mm/memory-failure: unlock_page before put_page Naoya Horiguchi
@ 2015-07-31 20:54 ` David Rientjes
0 siblings, 0 replies; 7+ messages in thread
From: David Rientjes @ 2015-07-31 20:54 UTC (permalink / raw)
To: Naoya Horiguchi
Cc: Andrew Morton, Andi Kleen, Dean Nelson, Tony Luck,
Kirill A. Shutemov, Hugh Dickins, linux-mm, linux-kernel,
Naoya Horiguchi
On Fri, 31 Jul 2015, Naoya Horiguchi wrote:
> In "just unpoisoned" path, we do put_page and then unlock_page, which is a
> wrong order and causes "freeing locked page" bug. So let's fix it.
>
> Signed-off-by: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>
> ---
> mm/memory-failure.c | 4 ++--
> 1 file changed, 2 insertions(+), 2 deletions(-)
>
> diff --git v4.2-rc4.orig/mm/memory-failure.c v4.2-rc4/mm/memory-failure.c
> index c53543d89282..04d677048af7 100644
> --- v4.2-rc4.orig/mm/memory-failure.c
> +++ v4.2-rc4/mm/memory-failure.c
> @@ -1209,9 +1209,9 @@ int memory_failure(unsigned long pfn, int trapno, int flags)
> if (!PageHWPoison(p)) {
> printk(KERN_ERR "MCE %#lx: just unpoisoned\n", pfn);
> atomic_long_sub(nr_pages, &num_poisoned_pages);
> + unlock_page(hpage);
> put_page(hpage);
> - res = 0;
> - goto out;
> + return 0;
> }
> if (hwpoison_filter(p)) {
> if (TestClearPageHWPoison(p))
Looks like you could do the unlock_page() before either the printk or
atomic_long_sub(), but probably not important.
Acked-by: David Rientjes <rientjes@google.com>
--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org. For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>
^ permalink raw reply [flat|nested] 7+ messages in thread
* [PATCH v2 2/5] mm/memory-failure: fix race in counting num_poisoned_pages
2015-07-31 6:46 [PATCH v2 0/5] hwpoison: fixes on v4.2-rc4 Naoya Horiguchi
2015-07-31 6:46 ` [PATCH v2 1/5] mm/memory-failure: unlock_page before put_page Naoya Horiguchi
@ 2015-07-31 6:46 ` Naoya Horiguchi
2015-07-31 6:46 ` [PATCH v2 4/5] mm: check __PG_HWPOISON separately from PAGE_FLAGS_CHECK_AT_* Naoya Horiguchi
` (2 subsequent siblings)
4 siblings, 0 replies; 7+ messages in thread
From: Naoya Horiguchi @ 2015-07-31 6:46 UTC (permalink / raw)
To: Andrew Morton
Cc: Andi Kleen, Dean Nelson, Tony Luck, Kirill A. Shutemov,
Hugh Dickins, linux-mm, linux-kernel, Naoya Horiguchi,
Naoya Horiguchi
When memory_failure() is called on a page which are just freed after page
migration from soft offlining, the counter num_poisoned_pages is raised twice.
So let's fix it with using TestSetPageHWPoison.
Signed-off-by: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>
---
mm/memory-failure.c | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)
diff --git v4.2-rc4.orig/mm/memory-failure.c v4.2-rc4/mm/memory-failure.c
index 04d677048af7..f72d2fad0b90 100644
--- v4.2-rc4.orig/mm/memory-failure.c
+++ v4.2-rc4/mm/memory-failure.c
@@ -1671,8 +1671,8 @@ static int __soft_offline_page(struct page *page, int flags)
if (ret > 0)
ret = -EIO;
} else {
- SetPageHWPoison(page);
- atomic_long_inc(&num_poisoned_pages);
+ if (!TestSetPageHWPoison(page))
+ atomic_long_inc(&num_poisoned_pages);
}
} else {
pr_info("soft offline: %#lx: isolation failed: %d, page count %d, type %lx\n",
--
2.4.3
--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org. For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>
^ permalink raw reply [flat|nested] 7+ messages in thread* [PATCH v2 4/5] mm: check __PG_HWPOISON separately from PAGE_FLAGS_CHECK_AT_*
2015-07-31 6:46 [PATCH v2 0/5] hwpoison: fixes on v4.2-rc4 Naoya Horiguchi
2015-07-31 6:46 ` [PATCH v2 1/5] mm/memory-failure: unlock_page before put_page Naoya Horiguchi
2015-07-31 6:46 ` [PATCH v2 2/5] mm/memory-failure: fix race in counting num_poisoned_pages Naoya Horiguchi
@ 2015-07-31 6:46 ` Naoya Horiguchi
2015-07-31 6:46 ` [PATCH v2 3/5] mm/memory-failure: give up error handling for non-tail-refcounted thp Naoya Horiguchi
2015-07-31 6:46 ` [PATCH v2 5/5] mm/memory-failure: set PageHWPoison before migrate_pages() Naoya Horiguchi
4 siblings, 0 replies; 7+ messages in thread
From: Naoya Horiguchi @ 2015-07-31 6:46 UTC (permalink / raw)
To: Andrew Morton
Cc: Andi Kleen, Dean Nelson, Tony Luck, Kirill A. Shutemov,
Hugh Dickins, linux-mm, linux-kernel, Naoya Horiguchi,
Naoya Horiguchi
The race condition addressed in commit add05cecef80 ("mm: soft-offline: don't
free target page in successful page migration") was not closed completely,
because that can happen not only for soft-offline, but also for hard-offline.
Consider that a slab page is about to be freed into buddy pool, and then an
uncorrected memory error hits the page just after entering __free_one_page(),
then VM_BUG_ON_PAGE(page->flags & PAGE_FLAGS_CHECK_AT_PREP) is triggered,
despite the fact that it's not necessary because the data on the affected
page is not consumed.
To solve it, this patch drops __PG_HWPOISON from page flag checks at
allocation/free time. I think it's justified because __PG_HWPOISON flags is
defined to prevent the page from being reused, and setting it outside the
page's alloc-free cycle is a designed behavior (not a bug.)
For recent months, I was annoyed about BUG_ON when soft-offlined page remains
on lru cache list for a while, which is avoided by calling put_page() instead
of putback_lru_page() in page migration's success path. This means that this
patch reverts a major change from commit add05cecef80 about the new refcounting
rule of soft-offlined pages, so "reuse window" revives. This will be closed
by a subsequent patch.
Signed-off-by: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>
---
v1 -> v2:
- call put_page() when migration succeeded in reason == MR_MEMORY_FAILURE
- refrain from reviving "reuse prevention" by MIGRATE_ISOLATE flag
---
include/linux/page-flags.h | 10 +++++++---
mm/huge_memory.c | 7 +------
mm/migrate.c | 5 ++++-
mm/page_alloc.c | 4 ++++
4 files changed, 16 insertions(+), 10 deletions(-)
diff --git v4.2-rc4.orig/include/linux/page-flags.h v4.2-rc4/include/linux/page-flags.h
index f34e040b34e9..41c93844fb1d 100644
--- v4.2-rc4.orig/include/linux/page-flags.h
+++ v4.2-rc4/include/linux/page-flags.h
@@ -631,15 +631,19 @@ static inline void ClearPageSlabPfmemalloc(struct page *page)
1 << PG_private | 1 << PG_private_2 | \
1 << PG_writeback | 1 << PG_reserved | \
1 << PG_slab | 1 << PG_swapcache | 1 << PG_active | \
- 1 << PG_unevictable | __PG_MLOCKED | __PG_HWPOISON | \
+ 1 << PG_unevictable | __PG_MLOCKED | \
__PG_COMPOUND_LOCK)
/*
* Flags checked when a page is prepped for return by the page allocator.
- * Pages being prepped should not have any flags set. It they are set,
+ * Pages being prepped should not have these flags set. It they are set,
* there has been a kernel bug or struct page corruption.
+ *
+ * __PG_HWPOISON is exceptional because it needs to be kept beyond page's
+ * alloc-free cycle to prevent from reusing the page.
*/
-#define PAGE_FLAGS_CHECK_AT_PREP ((1 << NR_PAGEFLAGS) - 1)
+#define PAGE_FLAGS_CHECK_AT_PREP \
+ (((1 << NR_PAGEFLAGS) - 1) & ~__PG_HWPOISON)
#define PAGE_FLAGS_PRIVATE \
(1 << PG_private | 1 << PG_private_2)
diff --git v4.2-rc4.orig/mm/huge_memory.c v4.2-rc4/mm/huge_memory.c
index c107094f79ba..097c7a4bfbd9 100644
--- v4.2-rc4.orig/mm/huge_memory.c
+++ v4.2-rc4/mm/huge_memory.c
@@ -1676,12 +1676,7 @@ static void __split_huge_page_refcount(struct page *page,
/* after clearing PageTail the gup refcount can be released */
smp_mb__after_atomic();
- /*
- * retain hwpoison flag of the poisoned tail page:
- * fix for the unsuitable process killed on Guest Machine(KVM)
- * by the memory-failure.
- */
- page_tail->flags &= ~PAGE_FLAGS_CHECK_AT_PREP | __PG_HWPOISON;
+ page_tail->flags &= ~PAGE_FLAGS_CHECK_AT_PREP;
page_tail->flags |= (page->flags &
((1L << PG_referenced) |
(1L << PG_swapbacked) |
diff --git v4.2-rc4.orig/mm/migrate.c v4.2-rc4/mm/migrate.c
index ee401e4e5ef1..f2415be7d93b 100644
--- v4.2-rc4.orig/mm/migrate.c
+++ v4.2-rc4/mm/migrate.c
@@ -950,7 +950,10 @@ static ICE_noinline int unmap_and_move(new_page_t get_new_page,
list_del(&page->lru);
dec_zone_page_state(page, NR_ISOLATED_ANON +
page_is_file_cache(page));
- if (reason != MR_MEMORY_FAILURE)
+ /* Soft-offlined page shouldn't go through lru cache list */
+ if (reason == MR_MEMORY_FAILURE)
+ put_page(page);
+ else
putback_lru_page(page);
}
diff --git v4.2-rc4.orig/mm/page_alloc.c v4.2-rc4/mm/page_alloc.c
index ef19f22b2b7d..775c254648f7 100644
--- v4.2-rc4.orig/mm/page_alloc.c
+++ v4.2-rc4/mm/page_alloc.c
@@ -1285,6 +1285,10 @@ static inline int check_new_page(struct page *page)
bad_reason = "non-NULL mapping";
if (unlikely(atomic_read(&page->_count) != 0))
bad_reason = "nonzero _count";
+ if (unlikely(page->flags & __PG_HWPOISON)) {
+ bad_reason = "HWPoisoned (hardware-corrupted)";
+ bad_flags = __PG_HWPOISON;
+ }
if (unlikely(page->flags & PAGE_FLAGS_CHECK_AT_PREP)) {
bad_reason = "PAGE_FLAGS_CHECK_AT_PREP flag set";
bad_flags = PAGE_FLAGS_CHECK_AT_PREP;
--
2.4.3
--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org. For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>
^ permalink raw reply [flat|nested] 7+ messages in thread* [PATCH v2 3/5] mm/memory-failure: give up error handling for non-tail-refcounted thp
2015-07-31 6:46 [PATCH v2 0/5] hwpoison: fixes on v4.2-rc4 Naoya Horiguchi
` (2 preceding siblings ...)
2015-07-31 6:46 ` [PATCH v2 4/5] mm: check __PG_HWPOISON separately from PAGE_FLAGS_CHECK_AT_* Naoya Horiguchi
@ 2015-07-31 6:46 ` Naoya Horiguchi
2015-07-31 6:46 ` [PATCH v2 5/5] mm/memory-failure: set PageHWPoison before migrate_pages() Naoya Horiguchi
4 siblings, 0 replies; 7+ messages in thread
From: Naoya Horiguchi @ 2015-07-31 6:46 UTC (permalink / raw)
To: Andrew Morton
Cc: Andi Kleen, Dean Nelson, Tony Luck, Kirill A. Shutemov,
Hugh Dickins, linux-mm, linux-kernel, Naoya Horiguchi,
Naoya Horiguchi
"non anonymous thp" case is still racy with freeing thp, which causes panic
due to put_page() for refcount-0 page. It seems that closing up this race
might be hard (and/or not worth doing,) so let's give up the error handling
for this case.
Signed-off-by: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>
---
v1 -> v2:
- keep pr_err() message
---
mm/memory-failure.c | 21 ++++++++++++---------
1 file changed, 12 insertions(+), 9 deletions(-)
diff --git v4.2-rc4.orig/mm/memory-failure.c v4.2-rc4/mm/memory-failure.c
index f72d2fad0b90..cd985530f102 100644
--- v4.2-rc4.orig/mm/memory-failure.c
+++ v4.2-rc4/mm/memory-failure.c
@@ -909,6 +909,18 @@ int get_hwpoison_page(struct page *page)
* directly for tail pages.
*/
if (PageTransHuge(head)) {
+ /*
+ * Non anonymous thp exists only in allocation/free time. We
+ * can't handle such a case correctly, so let's give it up.
+ * This should be better than triggering BUG_ON when kernel
+ * tries to touch the "partially handled" page.
+ */
+ if (!PageAnon(head)) {
+ pr_err("MCE: %#lx: non anonymous thp\n",
+ page_to_pfn(page));
+ return 0;
+ }
+
if (get_page_unless_zero(head)) {
if (PageTail(page))
get_page(page);
@@ -1134,15 +1146,6 @@ int memory_failure(unsigned long pfn, int trapno, int flags)
}
if (!PageHuge(p) && PageTransHuge(hpage)) {
- if (!PageAnon(hpage)) {
- pr_err("MCE: %#lx: non anonymous thp\n", pfn);
- if (TestClearPageHWPoison(p))
- atomic_long_sub(nr_pages, &num_poisoned_pages);
- put_page(p);
- if (p != hpage)
- put_page(hpage);
- return -EBUSY;
- }
if (unlikely(split_huge_page(hpage))) {
pr_err("MCE: %#lx: thp split failed\n", pfn);
if (TestClearPageHWPoison(p))
--
2.4.3
--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org. For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>
^ permalink raw reply [flat|nested] 7+ messages in thread* [PATCH v2 5/5] mm/memory-failure: set PageHWPoison before migrate_pages()
2015-07-31 6:46 [PATCH v2 0/5] hwpoison: fixes on v4.2-rc4 Naoya Horiguchi
` (3 preceding siblings ...)
2015-07-31 6:46 ` [PATCH v2 3/5] mm/memory-failure: give up error handling for non-tail-refcounted thp Naoya Horiguchi
@ 2015-07-31 6:46 ` Naoya Horiguchi
4 siblings, 0 replies; 7+ messages in thread
From: Naoya Horiguchi @ 2015-07-31 6:46 UTC (permalink / raw)
To: Andrew Morton
Cc: Andi Kleen, Dean Nelson, Tony Luck, Kirill A. Shutemov,
Hugh Dickins, linux-mm, linux-kernel, Naoya Horiguchi,
Naoya Horiguchi
Now page freeing code doesn't consider PageHWPoison as a bad page, so by
setting it before completing the page containment, we can prevent the error
page from being reused just after successful page migration.
I added TTU_IGNORE_HWPOISON for try_to_unmap() to make sure that the page
table entry is transformed into migration entry, not to hwpoison entry.
Signed-off-by: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>
---
mm/memory-failure.c | 7 ++++---
mm/migrate.c | 3 ++-
2 files changed, 6 insertions(+), 4 deletions(-)
diff --git v4.2-rc4.orig/mm/memory-failure.c v4.2-rc4/mm/memory-failure.c
index cd985530f102..ea5a93659488 100644
--- v4.2-rc4.orig/mm/memory-failure.c
+++ v4.2-rc4/mm/memory-failure.c
@@ -1659,6 +1659,8 @@ static int __soft_offline_page(struct page *page, int flags)
inc_zone_page_state(page, NR_ISOLATED_ANON +
page_is_file_cache(page));
list_add(&page->lru, &pagelist);
+ if (!TestSetPageHWPoison(page))
+ atomic_long_inc(&num_poisoned_pages);
ret = migrate_pages(&pagelist, new_page, NULL, MPOL_MF_MOVE_ALL,
MIGRATE_SYNC, MR_MEMORY_FAILURE);
if (ret) {
@@ -1673,9 +1675,8 @@ static int __soft_offline_page(struct page *page, int flags)
pfn, ret, page->flags);
if (ret > 0)
ret = -EIO;
- } else {
- if (!TestSetPageHWPoison(page))
- atomic_long_inc(&num_poisoned_pages);
+ if (TestClearPageHWPoison(page))
+ atomic_long_dec(&num_poisoned_pages);
}
} else {
pr_info("soft offline: %#lx: isolation failed: %d, page count %d, type %lx\n",
diff --git v4.2-rc4.orig/mm/migrate.c v4.2-rc4/mm/migrate.c
index f2415be7d93b..eb4267107d1f 100644
--- v4.2-rc4.orig/mm/migrate.c
+++ v4.2-rc4/mm/migrate.c
@@ -880,7 +880,8 @@ static int __unmap_and_move(struct page *page, struct page *newpage,
/* Establish migration ptes or remove ptes */
if (page_mapped(page)) {
try_to_unmap(page,
- TTU_MIGRATION|TTU_IGNORE_MLOCK|TTU_IGNORE_ACCESS);
+ TTU_MIGRATION|TTU_IGNORE_MLOCK|TTU_IGNORE_ACCESS|
+ TTU_IGNORE_HWPOISON);
page_was_mapped = 1;
}
--
2.4.3
--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org. For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>
^ permalink raw reply [flat|nested] 7+ messages in thread