* [PATCH 0/7] HWPOISON for hugepage backed KVM guest
@ 2011-01-21 6:28 Naoya Horiguchi
2011-01-21 6:28 ` [PATCH 1/7] hugetlb: check swap entry in follow_hugetlb_page() Naoya Horiguchi
` (6 more replies)
0 siblings, 7 replies; 10+ messages in thread
From: Naoya Horiguchi @ 2011-01-21 6:28 UTC (permalink / raw)
To: Andi Kleen
Cc: Andrew Morton, Wu Fengguang, Mel Gorman, Christoph Lameter,
Huang Ying, Fernando Luis Vazquez Cao, tony.luck, LKML, linux-mm
Hi,
I wrote the "HWPOISON for hugepage" patchset last year, but it didn't
cover hugepages used by a KVM guest, because follow_hugetlb_page(),
which is called in the guest page fault code path, didn't know about
pmd entries in swap entry format.
This patchset fixes that and makes both soft and hard offline available
for hugepage-backed KVM guests.
I would appreciate any comments and reviews.
Thanks,
Naoya Horiguchi
Summary:
[PATCH 1/7] hugetlb: check swap entry in follow_hugetlb_page()
[PATCH 2/7] check hugepage swap entry in get_user_pages_fast()
[PATCH 3/7] remove putback_lru_pages() in hugepage migration context
[PATCH 4/7] hugetlb, migration: add migration_hugepage_entry_wait()
[PATCH 5/7] hugetlb: fix race condition between hugepage soft offline and page fault
[PATCH 6/7] HWPOISON: pass order to set/clear_page_hwpoison_huge_page()
[PATCH 7/7] HWPOISON, hugetlb: fix hard offline for hugepage backed KVM guest
arch/x86/mm/gup.c | 9 +++++++++
include/linux/swapops.h | 20 ++++++++++++++++++++
mm/hugetlb.c | 39 +++++++++++++++++++++++++++++----------
mm/memory-failure.c | 24 +++++++++++++-----------
mm/migrate.c | 33 +++++++++++++++++++++++++++++++++
5 files changed, 104 insertions(+), 21 deletions(-)
--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org. For more info on Linux MM,
see: http://www.linux-mm.org/ .
Fight unfair telecom policy in Canada: sign http://dissolvethecrtc.ca/
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>
^ permalink raw reply [flat|nested] 10+ messages in thread
* [PATCH 1/7] hugetlb: check swap entry in follow_hugetlb_page()
2011-01-21 6:28 [PATCH 0/7] HWPOISON for hugepage backed KVM guest Naoya Horiguchi
@ 2011-01-21 6:28 ` Naoya Horiguchi
2011-01-21 6:28 ` [PATCH 2/7] check hugepage swap entry in get_user_pages_fast() Naoya Horiguchi
` (5 subsequent siblings)
6 siblings, 0 replies; 10+ messages in thread
From: Naoya Horiguchi @ 2011-01-21 6:28 UTC (permalink / raw)
To: Andi Kleen
Cc: Andrew Morton, Wu Fengguang, Mel Gorman, Christoph Lameter,
Huang Ying, Fernando Luis Vazquez Cao, tony.luck, LKML, linux-mm
The KVM host calls follow_hugetlb_page() during HVA-to-PFN translation
(through get_user_pages()), so it needs to handle swap entries
in order to detect HWPOISONed or migrating hugepages.
Signed-off-by: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>
---
mm/hugetlb.c | 4 +++-
1 files changed, 3 insertions(+), 1 deletions(-)
diff --git v2.6.38-rc1/mm/hugetlb.c v2.6.38-rc1/mm/hugetlb.c
index bb0b7c1..97c7471 100644
--- v2.6.38-rc1/mm/hugetlb.c
+++ v2.6.38-rc1/mm/hugetlb.c
@@ -2731,6 +2731,7 @@ int follow_hugetlb_page(struct mm_struct *mm, struct vm_area_struct *vma,
while (vaddr < vma->vm_end && remainder) {
pte_t *pte;
int absent;
+ int swap;
struct page *page;
/*
@@ -2740,6 +2741,7 @@ int follow_hugetlb_page(struct mm_struct *mm, struct vm_area_struct *vma,
*/
pte = huge_pte_offset(mm, vaddr & huge_page_mask(h));
absent = !pte || huge_pte_none(huge_ptep_get(pte));
+ swap = !absent && !pte_present(*pte);
/*
* When coredumping, it suits get_dump_page if we just return
@@ -2754,7 +2756,7 @@ int follow_hugetlb_page(struct mm_struct *mm, struct vm_area_struct *vma,
break;
}
- if (absent ||
+ if (absent || swap ||
((flags & FOLL_WRITE) && !pte_write(huge_ptep_get(pte)))) {
int ret;
--
1.7.3.4
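The condition changed by this patch can be modelled in user space as
follows. This is only a sketch under stated assumptions: entry_t, the
ENTRY_* bits and needs_fault() are hypothetical names and do not match
the real x86 page table format. It shows that an entry which is
non-zero but not present (swap format) is now sent to the fault path,
just like an absent entry.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical entry layout for illustration only (not the x86 format). */
#define ENTRY_PRESENT 0x1u	/* entry maps a page */
#define ENTRY_WRITE   0x2u	/* mapping is writable */

typedef uint64_t entry_t;

/* An all-zero entry means nothing has been mapped yet. */
static bool entry_none(entry_t e)    { return e == 0; }
static bool entry_present(entry_t e) { return (e & ENTRY_PRESENT) != 0; }
static bool entry_write(entry_t e)   { return (e & ENTRY_WRITE) != 0; }

/*
 * Mirrors the patched condition: take the fault path when the entry is
 * absent, swap-formatted (non-zero but not present), or read-only while
 * write access was requested.
 */
static bool needs_fault(const entry_t *e, bool want_write)
{
	bool absent = !e || entry_none(*e);
	bool swap = !absent && !entry_present(*e);

	if (absent || swap)
		return true;
	return want_write && !entry_write(*e);
}
```

In this model the write-permission bit is only examined once the entry
is known to be present, so the bits of a swap-format entry are never
interpreted as permissions.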
* [PATCH 2/7] check hugepage swap entry in get_user_pages_fast()
2011-01-21 6:28 [PATCH 0/7] HWPOISON for hugepage backed KVM guest Naoya Horiguchi
2011-01-21 6:28 ` [PATCH 1/7] hugetlb: check swap entry in follow_hugetlb_page() Naoya Horiguchi
@ 2011-01-21 6:28 ` Naoya Horiguchi
2011-01-21 6:28 ` [PATCH 3/7] remove putback_lru_pages() in hugepage migration context Naoya Horiguchi
` (4 subsequent siblings)
6 siblings, 0 replies; 10+ messages in thread
From: Naoya Horiguchi @ 2011-01-21 6:28 UTC (permalink / raw)
To: Andi Kleen
Cc: Andrew Morton, Wu Fengguang, Mel Gorman, Christoph Lameter,
Huang Ying, Fernando Luis Vazquez Cao, tony.luck, LKML, linux-mm
When the hugepage associated with a given address is HWPOISONed
or under page migration, get_user_pages_fast() needs to fall back
to the slow path in order to make the page fault fail (when HWPOISONed)
or to wait for migration to complete (when under migration).
Signed-off-by: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>
---
arch/x86/mm/gup.c | 9 +++++++++
1 files changed, 9 insertions(+), 0 deletions(-)
diff --git v2.6.38-rc1/arch/x86/mm/gup.c v2.6.38-rc1/arch/x86/mm/gup.c
index dbe34b9..93b74dd 100644
--- v2.6.38-rc1/arch/x86/mm/gup.c
+++ v2.6.38-rc1/arch/x86/mm/gup.c
@@ -176,6 +176,15 @@ static int gup_pmd_range(pud_t pud, unsigned long addr, unsigned long end,
*/
if (pmd_none(pmd) || pmd_trans_splitting(pmd))
return 0;
+ /*
+ * PMD can be in swap entry style when the hugepage
+ * pointed to by it is hwpoisoned or under migration.
+ * Because the swap entry format has no flag showing
+ * the page size, pmd_large() cannot detect it.
+ * So then we just fall back to the slow path.
+ */
+ if (unlikely(!pmd_present(pmd)))
+ return 0;
if (unlikely(pmd_large(pmd))) {
if (!gup_huge_pmd(pmd, addr, next, write, pages, nr))
return 0;
--
1.7.3.4
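The ordering of the checks matters here, which a small user-space model
can illustrate. This is a sketch, not the kernel code: gup_pmd_step()
and the enum are hypothetical, and the two bit values are merely chosen
to resemble the x86 present and PSE bits. The point is that the
"not present" test runs before the "large" test, so the payload bits of
a swap-format entry are never interpreted as a page-size flag.

```c
#include <assert.h>
#include <stdint.h>

/* Bit values chosen to resemble x86; treat them as illustrative. */
#define PMD_PRESENT 0x01ull
#define PMD_LARGE   0x80ull

/* Hypothetical outcome of one step of the lockless walk. */
enum gup_result {
	GUP_FALLBACK,	/* give up, let the slow path handle it */
	GUP_HUGE,	/* present huge mapping, fast path may pin it */
	GUP_WALK_PTES	/* present page table, descend to PTEs */
};

static enum gup_result gup_pmd_step(uint64_t pmd)
{
	if (pmd == 0)
		return GUP_FALLBACK;
	/* Swap-format entry (hwpoison or migration): leave it to the slow path. */
	if (!(pmd & PMD_PRESENT))
		return GUP_FALLBACK;
	if (pmd & PMD_LARGE)
		return GUP_HUGE;
	return GUP_WALK_PTES;
}
```

With this ordering, a non-present entry falls back regardless of
whether its remaining bits happen to include PMD_LARGE.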
* [PATCH 3/7] remove putback_lru_pages() in hugepage migration context
2011-01-21 6:28 [PATCH 0/7] HWPOISON for hugepage backed KVM guest Naoya Horiguchi
2011-01-21 6:28 ` [PATCH 1/7] hugetlb: check swap entry in follow_hugetlb_page() Naoya Horiguchi
2011-01-21 6:28 ` [PATCH 2/7] check hugepage swap entry in get_user_pages_fast() Naoya Horiguchi
@ 2011-01-21 6:28 ` Naoya Horiguchi
2011-01-21 6:40 ` Minchan Kim
2011-01-21 6:28 ` [PATCH 4/7] hugetlb, migration: add migration_hugepage_entry_wait() Naoya Horiguchi
` (3 subsequent siblings)
6 siblings, 1 reply; 10+ messages in thread
From: Naoya Horiguchi @ 2011-01-21 6:28 UTC (permalink / raw)
To: Andi Kleen
Cc: Andrew Morton, Wu Fengguang, Mel Gorman, Christoph Lameter,
Huang Ying, Fernando Luis Vazquez Cao, tony.luck, LKML, linux-mm,
Minchan Kim
This putback_lru_pages() call was added by commit cf608ac19c so that
memory compaction can count the pages that failed to migrate.
But we should not do it for a hugepage, because page->lru of a hugepage
is used differently from that of a normal page:
in-use hugepage : page->lru is unlinked,
free hugepage : page->lru is linked to the free hugepage list,
so putting hugepages back onto the LRU lists breaks this rule.
We just drop the call (without any impact on memory compaction).
Signed-off-by: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>
Cc: Minchan Kim <minchan.kim@gmail.com>
---
mm/memory-failure.c | 1 -
1 files changed, 0 insertions(+), 1 deletions(-)
diff --git v2.6.38-rc1/mm/memory-failure.c v2.6.38-rc1/mm/memory-failure.c
index 548fbd7..b4910e8 100644
--- v2.6.38-rc1/mm/memory-failure.c
+++ v2.6.38-rc1/mm/memory-failure.c
@@ -1295,7 +1295,6 @@ static int soft_offline_huge_page(struct page *page, int flags)
ret = migrate_huge_pages(&pagelist, new_page, MPOL_MF_MOVE_ALL, 0,
true);
if (ret) {
- putback_lru_pages(&pagelist);
pr_debug("soft offline: %#lx: migration failed %d, type %lx\n",
pfn, ret, page->flags);
if (ret > 0)
--
1.7.3.4
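The page->lru rule described in this patch can be written down as an
invariant. The following user-space sketch assumes a hand-rolled list
and hypothetical struct hpage / struct pool types; none of these names
come from the kernel. It checks that an in-use hugepage has an unlinked
lru, and that linking an in-use hugepage onto some other list violates
the rule.

```c
#include <assert.h>
#include <stdbool.h>

/* Minimal circular doubly linked list, modelled on the usual idiom. */
struct list_head { struct list_head *next, *prev; };

static void list_init(struct list_head *h) { h->next = h; h->prev = h; }
static bool list_empty(const struct list_head *h) { return h->next == h; }

static void list_add(struct list_head *n, struct list_head *head)
{
	n->next = head->next;
	n->prev = head;
	head->next->prev = n;
	head->next = n;
}

static void list_del_init(struct list_head *n)
{
	n->prev->next = n->next;
	n->next->prev = n->prev;
	list_init(n);
}

/* Hypothetical stand-ins for a hugepage and its pool. */
struct hpage { struct list_head lru; bool in_use; };
struct pool { struct list_head free_list; };

static void pool_init(struct pool *p) { list_init(&p->free_list); }

/* Freeing links the page onto the free list. */
static void hpage_free(struct pool *p, struct hpage *hp)
{
	assert(hp->in_use);
	hp->in_use = false;
	list_add(&hp->lru, &p->free_list);
}

/* Allocating unlinks it again. */
static void hpage_alloc(struct hpage *hp)
{
	list_del_init(&hp->lru);
	hp->in_use = true;
}

/* The rule: lru is unlinked exactly when the page is in use. */
static bool hpage_invariant_ok(const struct hpage *hp)
{
	return hp->in_use == list_empty(&hp->lru);
}
```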
* [PATCH 4/7] hugetlb, migration: add migration_hugepage_entry_wait()
2011-01-21 6:28 [PATCH 0/7] HWPOISON for hugepage backed KVM guest Naoya Horiguchi
` (2 preceding siblings ...)
2011-01-21 6:28 ` [PATCH 3/7] remove putback_lru_pages() in hugepage migration context Naoya Horiguchi
@ 2011-01-21 6:28 ` Naoya Horiguchi
2011-01-21 6:28 ` [PATCH 5/7] hugetlb: fix race condition between hugepage soft offline and page fault Naoya Horiguchi
` (2 subsequent siblings)
6 siblings, 0 replies; 10+ messages in thread
From: Naoya Horiguchi @ 2011-01-21 6:28 UTC (permalink / raw)
To: Andi Kleen
Cc: Andrew Morton, Wu Fengguang, Mel Gorman, Christoph Lameter,
Huang Ying, Fernando Luis Vazquez Cao, tony.luck, LKML, linux-mm
migration_entry_wait() doesn't work for hugepages, because page->ptl
is currently not used for hugepages (the hugepage variant below takes
mm->page_table_lock instead). So this patch introduces a hugepage
variant of this function.
Signed-off-by: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>
---
include/linux/swapops.h | 8 ++++++++
mm/hugetlb.c | 3 ++-
mm/migrate.c | 33 +++++++++++++++++++++++++++++++++
3 files changed, 43 insertions(+), 1 deletions(-)
diff --git v2.6.38-rc1/include/linux/swapops.h v2.6.38-rc1/include/linux/swapops.h
index cd42e30..a220ef5 100644
--- v2.6.38-rc1/include/linux/swapops.h
+++ v2.6.38-rc1/include/linux/swapops.h
@@ -169,3 +169,11 @@ static inline int non_swap_entry(swp_entry_t entry)
return 0;
}
#endif
+
+#if defined(CONFIG_HUGETLB_PAGE) || defined(CONFIG_MIGRATION)
+extern void migration_hugepage_entry_wait(struct mm_struct *mm, pmd_t *pmd,
+ unsigned long address);
+#else
+static inline void migration_hugepage_entry_wait(struct mm_struct *mm,
+ pmd_t *pmd, unsigned long address) { }
+#endif
diff --git v2.6.38-rc1/mm/hugetlb.c v2.6.38-rc1/mm/hugetlb.c
index 97c7471..d3b856a 100644
--- v2.6.38-rc1/mm/hugetlb.c
+++ v2.6.38-rc1/mm/hugetlb.c
@@ -2618,7 +2618,8 @@ int hugetlb_fault(struct mm_struct *mm, struct vm_area_struct *vma,
if (ptep) {
entry = huge_ptep_get(ptep);
if (unlikely(is_hugetlb_entry_migration(entry))) {
- migration_entry_wait(mm, (pmd_t *)ptep, address);
+ migration_hugepage_entry_wait(mm, (pmd_t *)ptep,
+ address);
return 0;
} else if (unlikely(is_hugetlb_entry_hwpoisoned(entry)))
return VM_FAULT_HWPOISON_LARGE |
diff --git v2.6.38-rc1/mm/migrate.c v2.6.38-rc1/mm/migrate.c
index 46fe8cc..363685f 100644
--- v2.6.38-rc1/mm/migrate.c
+++ v2.6.38-rc1/mm/migrate.c
@@ -220,6 +220,39 @@ out:
pte_unmap_unlock(ptep, ptl);
}
+void migration_hugepage_entry_wait(struct mm_struct *mm, pmd_t *pmd,
+ unsigned long address)
+{
+ pte_t *ptep, pte;
+ spinlock_t *ptl;
+ swp_entry_t entry;
+ struct page *page;
+
+ ptep = (pte_t *)pmd;
+ ptl = &(mm)->page_table_lock;
+ spin_lock(ptl);
+ pte = *ptep;
+ if (!is_swap_pte(pte))
+ goto out;
+
+ entry = pte_to_swp_entry(pte);
+ if (!is_migration_entry(entry))
+ goto out;
+
+ page = migration_entry_to_page(entry);
+
+ if (!get_page_unless_zero(page))
+ goto out;
+ spin_unlock(ptl);
+ pte_unmap(ptep);
+ wait_on_page_locked(page);
+ put_page(page);
+ return;
+out:
+ spin_unlock(ptl);
+ pte_unmap(ptep);
+}
+
/*
* Replace the page in the mapping.
*
--
1.7.3.4
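The waiter above only sleeps on the page after get_page_unless_zero()
has succeeded. The "take a reference unless the count is already zero"
step can be sketched in user space with C11 atomics. This is an
illustration of the general technique, not the kernel implementation;
ref_get_unless_zero() and ref_put() are hypothetical names.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/*
 * Take a reference only if the object is still alive (count != 0).
 * Returns true when the count was incremented.
 */
static bool ref_get_unless_zero(atomic_int *ref)
{
	int old = atomic_load(ref);

	while (old != 0) {
		/* On failure, 'old' is reloaded with the current value. */
		if (atomic_compare_exchange_weak(ref, &old, old + 1))
			return true;
	}
	return false;
}

/* Drop a reference; returns true if this was the last one. */
static bool ref_put(atomic_int *ref)
{
	int old = atomic_fetch_sub(ref, 1);

	assert(old > 0);
	return old == 1;
}
```

A caller would follow the same shape as the function in the patch: look
up the object under a lock, call ref_get_unless_zero(), drop the lock,
wait, then call ref_put().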
* [PATCH 5/7] hugetlb: fix race condition between hugepage soft offline and page fault
2011-01-21 6:28 [PATCH 0/7] HWPOISON for hugepage backed KVM guest Naoya Horiguchi
` (3 preceding siblings ...)
2011-01-21 6:28 ` [PATCH 4/7] hugetlb, migration: add migration_hugepage_entry_wait() Naoya Horiguchi
@ 2011-01-21 6:28 ` Naoya Horiguchi
2011-01-21 6:28 ` [PATCH 6/7] HWPOISON: pass order to set/clear_page_hwpoison_huge_page() Naoya Horiguchi
2011-01-21 6:29 ` [PATCH 7/7] HWPOISON, hugetlb: fix hard offline for hugepage backed KVM guest Naoya Horiguchi
6 siblings, 0 replies; 10+ messages in thread
From: Naoya Horiguchi @ 2011-01-21 6:28 UTC (permalink / raw)
To: Andi Kleen
Cc: Andrew Morton, Wu Fengguang, Mel Gorman, Christoph Lameter,
Huang Ying, Fernando Luis Vazquez Cao, tony.luck, LKML, linux-mm
When hugepage soft offline succeeds, the old hugepage is expected
to be temporarily enqueued on the free hugepage list, and then dequeued
as a HWPOISONed hugepage.
But there is a race window which breaks reference counting.
See the following sequence:
  soft offline                          page fault
  soft_offline_huge_page
    migrate_huge_pages
      unmap_and_move_huge_page
        lock_page
        try_to_unmap
        move_to_new_page
          migrate_page
            migrate_page_copy
                                        hugetlb_fault
                                          migration_hugepage_entry_wait
                                            get_page_unless_zero
                                            wait_on_page_locked
        remove_migration_ptes
        unlock_page
  -------------------------------------------------------------------
        put_page                        put_page
  dequeue_hwpoisoned_huge_page
The two put_page() calls below the horizontal line race with each other.
If the put_page() from soft offline comes first, the HWPOISONed hugepage
remains on the free hugepage list, which leads to incorrect results.
It is hard to fix this problem with locking, because we cannot hold off
the page fault path with the page lock.
So this patch just adds a HWPOISON check to free_huge_page(),
which ensures that the last user of the old hugepage dequeues it
from the free hugepage list.
Signed-off-by: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>
---
mm/hugetlb.c | 28 +++++++++++++++++++++-------
1 files changed, 21 insertions(+), 7 deletions(-)
diff --git v2.6.38-rc1/mm/hugetlb.c v2.6.38-rc1/mm/hugetlb.c
index d3b856a..b777c81 100644
--- v2.6.38-rc1/mm/hugetlb.c
+++ v2.6.38-rc1/mm/hugetlb.c
@@ -524,6 +524,8 @@ struct hstate *size_to_hstate(unsigned long size)
return NULL;
}
+static int __dequeue_hwpoisoned_huge_page(struct page *hpage, struct hstate *h);
+
static void free_huge_page(struct page *page)
{
/*
@@ -548,6 +550,8 @@ static void free_huge_page(struct page *page)
h->surplus_huge_pages_node[nid]--;
} else {
enqueue_huge_page(h, page);
+ if (unlikely(PageHWPoison(page)))
+ __dequeue_hwpoisoned_huge_page(page, h);
}
spin_unlock(&hugetlb_lock);
if (mapping)
@@ -2932,17 +2936,11 @@ static int is_hugepage_on_freelist(struct page *hpage)
return 0;
}
-/*
- * This function is called from memory failure code.
- * Assume the caller holds page lock of the head page.
- */
-int dequeue_hwpoisoned_huge_page(struct page *hpage)
+static int __dequeue_hwpoisoned_huge_page(struct page *hpage, struct hstate *h)
{
- struct hstate *h = page_hstate(hpage);
int nid = page_to_nid(hpage);
int ret = -EBUSY;
- spin_lock(&hugetlb_lock);
if (is_hugepage_on_freelist(hpage)) {
list_del(&hpage->lru);
set_page_refcounted(hpage);
@@ -2950,6 +2948,22 @@ int dequeue_hwpoisoned_huge_page(struct page *hpage)
h->free_huge_pages_node[nid]--;
ret = 0;
}
+ return ret;
+}
+
+/*
+ * This function is called from memory failure code.
+ * Assume the caller holds page lock of the head page.
+ */
+int dequeue_hwpoisoned_huge_page(struct page *hpage)
+{
+ struct hstate *h = page_hstate(hpage);
+ int ret;
+
+ if (!h)
+ return 0;
+ spin_lock(&hugetlb_lock);
+ ret = __dequeue_hwpoisoned_huge_page(hpage, h);
spin_unlock(&hugetlb_lock);
return ret;
}
--
1.7.3.4
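The effect of the HWPOISON check in free_huge_page() can be illustrated
with a single-threaded model that replays both orderings of the two
put_page() calls. This is a sketch under simplifying assumptions: it
ignores hugetlb_lock and the surplus-page branch, and struct hpage,
struct pool and simulate() are hypothetical names.

```c
#include <assert.h>
#include <stdbool.h>

struct hpage { int refcount; bool hwpoison; bool on_freelist; };
struct pool { int free_pages; };

/* Take a poisoned page off the free list; fails if it is not there. */
static bool dequeue_hwpoisoned(struct pool *p, struct hpage *hp)
{
	if (!hp->on_freelist)
		return false;
	hp->on_freelist = false;
	hp->refcount = 1;	/* keep an elevated count for the bad page */
	p->free_pages--;
	return true;
}

/* Patched behaviour: the last user dequeues a poisoned page itself. */
static void free_hpage(struct pool *p, struct hpage *hp)
{
	hp->on_freelist = true;
	p->free_pages++;
	if (hp->hwpoison)
		dequeue_hwpoisoned(p, hp);
}

static void put_hpage(struct pool *p, struct hpage *hp)
{
	assert(hp->refcount > 0);
	if (--hp->refcount == 0)
		free_hpage(p, hp);
}

/* Replay one ordering of the two racing put operations. */
static struct hpage simulate(struct pool *p, bool fault_put_first)
{
	struct hpage hp = { .refcount = 2, .hwpoison = false,
			    .on_freelist = false };

	if (fault_put_first)
		put_hpage(p, &hp);	/* page fault drops its reference */
	put_hpage(p, &hp);		/* soft offline drops its reference */
	hp.hwpoison = true;		/* soft offline marks the page */
	dequeue_hwpoisoned(p, &hp);	/* fails if not yet on the free list */
	if (!fault_put_first)
		put_hpage(p, &hp);	/* late put from the page fault path */
	return hp;
}
```

In this model both orderings end with the poisoned page off the free
list, holding one reference, and with no free pages counted in the pool.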
* [PATCH 6/7] HWPOISON: pass order to set/clear_page_hwpoison_huge_page()
2011-01-21 6:28 [PATCH 0/7] HWPOISON for hugepage backed KVM guest Naoya Horiguchi
` (4 preceding siblings ...)
2011-01-21 6:28 ` [PATCH 5/7] hugetlb: fix race condition between hugepage soft offline and page fault Naoya Horiguchi
@ 2011-01-21 6:28 ` Naoya Horiguchi
2011-01-21 6:29 ` [PATCH 7/7] HWPOISON, hugetlb: fix hard offline for hugepage backed KVM guest Naoya Horiguchi
6 siblings, 0 replies; 10+ messages in thread
From: Naoya Horiguchi @ 2011-01-21 6:28 UTC (permalink / raw)
To: Andi Kleen
Cc: Andrew Morton, Wu Fengguang, Mel Gorman, Christoph Lameter,
Huang Ying, Fernando Luis Vazquez Cao, tony.luck, LKML, linux-mm
When a surplus hugepage is soft-offlined, the old hugepage will
be freed directly into the buddy allocator. After that we no longer
have access to its hstate, so we need to pass the page order to the
PG_HWPoison set/clear functions.
Signed-off-by: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>
---
mm/memory-failure.c | 21 ++++++++++++---------
1 files changed, 12 insertions(+), 9 deletions(-)
diff --git v2.6.38-rc1/mm/memory-failure.c v2.6.38-rc1/mm/memory-failure.c
index b4910e8..eed1846 100644
--- v2.6.38-rc1/mm/memory-failure.c
+++ v2.6.38-rc1/mm/memory-failure.c
@@ -927,18 +927,18 @@ static int hwpoison_user_mappings(struct page *p, unsigned long pfn,
return ret;
}
-static void set_page_hwpoison_huge_page(struct page *hpage)
+static void set_page_hwpoison_huge_page(struct page *hpage, int order)
{
int i;
- int nr_pages = 1 << compound_trans_order(hpage);
+ int nr_pages = 1 << order;
for (i = 0; i < nr_pages; i++)
SetPageHWPoison(hpage + i);
}
-static void clear_page_hwpoison_huge_page(struct page *hpage)
+static void clear_page_hwpoison_huge_page(struct page *hpage, int order)
{
int i;
- int nr_pages = 1 << compound_trans_order(hpage);
+ int nr_pages = 1 << order;
for (i = 0; i < nr_pages; i++)
ClearPageHWPoison(hpage + i);
}
@@ -1002,7 +1002,8 @@ int __memory_failure(unsigned long pfn, int trapno, int flags)
atomic_long_sub(nr_pages, &mce_bad_pages);
return 0;
}
- set_page_hwpoison_huge_page(hpage);
+ set_page_hwpoison_huge_page(hpage,
+ compound_order(hpage));
res = dequeue_hwpoisoned_huge_page(hpage);
action_result(pfn, "free huge",
res ? IGNORED : DELAYED);
@@ -1078,7 +1079,7 @@ int __memory_failure(unsigned long pfn, int trapno, int flags)
* page lock held, we can safely set PG_hwpoison bits on tail pages.
*/
if (PageHuge(p))
- set_page_hwpoison_huge_page(hpage);
+ set_page_hwpoison_huge_page(hpage, compound_order(hpage));
wait_on_page_writeback(p);
@@ -1197,7 +1198,8 @@ int unpoison_memory(unsigned long pfn)
atomic_long_sub(nr_pages, &mce_bad_pages);
freeit = 1;
if (PageHuge(page))
- clear_page_hwpoison_huge_page(page);
+ clear_page_hwpoison_huge_page(page,
+ compound_order(page));
}
unlock_page(page);
@@ -1275,6 +1277,7 @@ static int soft_offline_huge_page(struct page *page, int flags)
int ret;
unsigned long pfn = page_to_pfn(page);
struct page *hpage = compound_head(page);
+ int order = compound_order(hpage);
LIST_HEAD(pagelist);
ret = get_any_page(page, pfn, flags);
@@ -1303,8 +1306,8 @@ static int soft_offline_huge_page(struct page *page, int flags)
}
done:
if (!PageHWPoison(hpage))
- atomic_long_add(1 << compound_trans_order(hpage), &mce_bad_pages);
- set_page_hwpoison_huge_page(hpage);
+ atomic_long_add(1 << order, &mce_bad_pages);
+ set_page_hwpoison_huge_page(hpage, order);
dequeue_hwpoisoned_huge_page(hpage);
/* keep elevated page count for bad page */
return ret;
--
1.7.3.4
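The order-based helpers boil down to marking 1 << order consecutive
base pages. A minimal user-space sketch follows, with a plain bool
array standing in for the per-page flags; pages_for_order(),
set_poison_range() and clear_poison_range() are hypothetical names. For
example, order 9 covers 512 base pages, which corresponds to a 2 MB
hugepage with 4 KB base pages.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Number of base pages covered by a compound page of the given order. */
static size_t pages_for_order(unsigned int order)
{
	assert(order < sizeof(size_t) * 8);
	return (size_t)1 << order;
}

/* Mark every base page of the range as poisoned. */
static void set_poison_range(bool *flags, unsigned int order)
{
	size_t i, nr_pages = pages_for_order(order);

	for (i = 0; i < nr_pages; i++)
		flags[i] = true;
}

/* Clear the poison mark on every base page of the range. */
static void clear_poison_range(bool *flags, unsigned int order)
{
	size_t i, nr_pages = pages_for_order(order);

	for (i = 0; i < nr_pages; i++)
		flags[i] = false;
}
```

Because the helpers take the order as an argument, a caller can read it
once while the page is still a hugepage and reuse the saved value later,
which is what soft_offline_huge_page() does in the patch.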
* [PATCH 7/7] HWPOISON, hugetlb: fix hard offline for hugepage backed KVM guest
2011-01-21 6:28 [PATCH 0/7] HWPOISON for hugepage backed KVM guest Naoya Horiguchi
` (5 preceding siblings ...)
2011-01-21 6:28 ` [PATCH 6/7] HWPOISON: pass order to set/clear_page_hwpoison_huge_page() Naoya Horiguchi
@ 2011-01-21 6:29 ` Naoya Horiguchi
6 siblings, 0 replies; 10+ messages in thread
From: Naoya Horiguchi @ 2011-01-21 6:29 UTC (permalink / raw)
To: Andi Kleen
Cc: Andrew Morton, Wu Fengguang, Mel Gorman, Christoph Lameter,
Huang Ying, Fernando Luis Vazquez Cao, tony.luck, LKML, linux-mm
When a qemu-kvm process touches HWPOISONed pages,
we expect a SIGBUS signal to cause an MCE in the guest OS.
But currently this doesn't work for a hugepage-backed KVM guest,
because is_hwpoison_address() can't detect the HWPOISON entry
in the PMD, and the guest repeats the page fault indefinitely.
This patch fixes it.
Signed-off-by: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>
Cc: Huang Ying <ying.huang@intel.com>
---
include/linux/swapops.h | 12 ++++++++++++
mm/hugetlb.c | 4 +++-
mm/memory-failure.c | 2 +-
3 files changed, 16 insertions(+), 2 deletions(-)
diff --git v2.6.38-rc1/include/linux/swapops.h v2.6.38-rc1/include/linux/swapops.h
index a220ef5..2c1a942 100644
--- v2.6.38-rc1/include/linux/swapops.h
+++ v2.6.38-rc1/include/linux/swapops.h
@@ -177,3 +177,15 @@ extern void migration_hugepage_entry_wait(struct mm_struct *mm, pmd_t *pmd,
static inline void migration_hugepage_entry_wait(struct mm_struct *mm,
pmd_t *pmd, unsigned long address) { }
#endif
+
+#if defined(CONFIG_MEMORY_FAILURE) && defined(CONFIG_HUGETLB_PAGE)
+extern int is_hugetlb_entry_hwpoisoned(pte_t pte);
+#else
+static inline int is_hugetlb_entry_hwpoisoned(pte_t pte)
+{
+ return 0;
+}
+#endif
+
+
+
diff --git v2.6.38-rc1/mm/hugetlb.c v2.6.38-rc1/mm/hugetlb.c
index b777c81..c65922e 100644
--- v2.6.38-rc1/mm/hugetlb.c
+++ v2.6.38-rc1/mm/hugetlb.c
@@ -2185,7 +2185,8 @@ static int is_hugetlb_entry_migration(pte_t pte)
return 0;
}
-static int is_hugetlb_entry_hwpoisoned(pte_t pte)
+#ifdef CONFIG_MEMORY_FAILURE
+int is_hugetlb_entry_hwpoisoned(pte_t pte)
{
swp_entry_t swp;
@@ -2197,6 +2198,7 @@ static int is_hugetlb_entry_hwpoisoned(pte_t pte)
} else
return 0;
}
+#endif
void __unmap_hugepage_range(struct vm_area_struct *vma, unsigned long start,
unsigned long end, struct page *ref_page)
diff --git v2.6.38-rc1/mm/memory-failure.c v2.6.38-rc1/mm/memory-failure.c
index eed1846..8ee5038 100644
--- v2.6.38-rc1/mm/memory-failure.c
+++ v2.6.38-rc1/mm/memory-failure.c
@@ -1461,7 +1461,7 @@ int is_hwpoison_address(unsigned long addr)
pmdp = pmd_offset(pudp, addr);
pmd = *pmdp;
if (!pmd_present(pmd) || pmd_large(pmd))
- return 0;
+ return is_hugetlb_entry_hwpoisoned(*(pte_t *)pmdp);
ptep = pte_offset_map(pmdp, addr);
pte = *ptep;
pte_unmap(ptep);
--
1.7.3.4
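The kind of test that a helper like is_hugetlb_entry_hwpoisoned() makes
can be sketched in user space as "non-empty, not present, and carrying
the hwpoison type". The bit layout and the type numbers below are
hypothetical and chosen only for illustration; they are not the
kernel's swap entry encoding.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical layout: bit 0 = present, bits 1..5 = entry type,
 * bits 6 and up = offset (for example a page frame number).
 */
#define E_PRESENT      0x1ull
#define E_TYPE_SHIFT   1
#define E_TYPE_MASK    0x1full
#define E_OFFSET_SHIFT 6

/* Hypothetical type numbers for the two special entries. */
enum { TYPE_MIGRATION = 30, TYPE_HWPOISON = 31 };

/* Build a non-present entry of the given type. */
static uint64_t make_special_entry(unsigned int type, uint64_t pfn)
{
	assert(type <= E_TYPE_MASK);
	return ((uint64_t)type << E_TYPE_SHIFT) | (pfn << E_OFFSET_SHIFT);
}

static unsigned int entry_type(uint64_t e)
{
	return (unsigned int)((e >> E_TYPE_SHIFT) & E_TYPE_MASK);
}

/* True only for a non-empty, non-present entry of the hwpoison type. */
static bool entry_is_hwpoisoned(uint64_t e)
{
	if (e == 0 || (e & E_PRESENT))
		return false;
	return entry_type(e) == TYPE_HWPOISON;
}
```

A migration entry is also non-present, so in this model the type field,
not the present bit alone, is what separates "wait for migration" from
"report hwpoison".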
* Re: [PATCH 3/7] remove putback_lru_pages() in hugepage migration context
2011-01-21 6:28 ` [PATCH 3/7] remove putback_lru_pages() in hugepage migration context Naoya Horiguchi
@ 2011-01-21 6:40 ` Minchan Kim
2011-01-21 10:00 ` Naoya Horiguchi
0 siblings, 1 reply; 10+ messages in thread
From: Minchan Kim @ 2011-01-21 6:40 UTC (permalink / raw)
To: Naoya Horiguchi
Cc: Andi Kleen, Andrew Morton, Wu Fengguang, Mel Gorman,
Christoph Lameter, Huang Ying, Fernando Luis Vazquez Cao,
tony.luck, LKML, linux-mm
Hello,
On Fri, Jan 21, 2011 at 3:28 PM, Naoya Horiguchi
<n-horiguchi@ah.jp.nec.com> wrote:
> This putback_lru_pages() is inserted at cf608ac19c to allow
> memory compaction to count the number of migration failed pages.
>
> But we should not do it for a hugepage because page->lru of a hugepage
> is used differently from that of a normal page:
>
> in-use hugepage : page->lru is unlinked,
> free hugepage : page->lru is linked to the free hugepage list,
>
> so putting back hugepages to LRU lists collapses this rule.
> We just drop this change (without any impact on memory compaction.)
>
> Signed-off-by: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>
> Cc: Minchan Kim <minchan.kim@gmail.com>
As I said previously, it seems to be a mistake made during the patch merge.
I didn't add it in my original patch. You can see my final patch here:
https://lkml.org/lkml/2010/8/24/248
Anyway, I noticed it recently, so I sent a patch to Andrew.
Could you take a look at this one?
https://lkml.org/lkml/2011/1/20/241
Thanks for letting me know.
--
Kind regards,
Minchan Kim
* Re: [PATCH 3/7] remove putback_lru_pages() in hugepage migration context
2011-01-21 6:40 ` Minchan Kim
@ 2011-01-21 10:00 ` Naoya Horiguchi
0 siblings, 0 replies; 10+ messages in thread
From: Naoya Horiguchi @ 2011-01-21 10:00 UTC (permalink / raw)
To: Minchan Kim
Cc: Andrew Morton, Wu Fengguang, Mel Gorman, Christoph Lameter,
Huang Ying, Fernando Luis Vazquez Cao, tony.luck, LKML, linux-mm,
Andi Kleen
Hi,
On Fri, Jan 21, 2011 at 03:40:35PM +0900, Minchan Kim wrote:
> Hello,
>
> On Fri, Jan 21, 2011 at 3:28 PM, Naoya Horiguchi
> <n-horiguchi@ah.jp.nec.com> wrote:
> > This putback_lru_pages() is inserted at cf608ac19c to allow
> > memory compaction to count the number of migration failed pages.
> >
> > But we should not do it for a hugepage because page->lru of a hugepage
> > is used differently from that of a normal page:
> >
> > in-use hugepage : page->lru is unlinked,
> > free hugepage : page->lru is linked to the free hugepage list,
> >
> > so putting back hugepages to LRU lists collapses this rule.
> > We just drop this change (without any impact on memory compaction.)
> >
> > Signed-off-by: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>
> > Cc: Minchan Kim <minchan.kim@gmail.com>
>
> As I said previously, It seems mistake during patch merge.
> I didn't add it in my original patch. You can see my final patch.
> https://lkml.org/lkml/2010/8/24/248
OK.
> Anyway, I realized it recently so I sent the patch to Andrew.
> Could you see this one?
> https://lkml.org/lkml/2011/1/20/241
This patch does not seem to change the behavior of hugepage soft offline,
so I have no objection.
-- Naoya Horiguchi
end of thread, other threads:[~2011-01-21 10:04 UTC | newest]
Thread overview: 10+ messages
2011-01-21 6:28 [PATCH 0/7] HWPOISON for hugepage backed KVM guest Naoya Horiguchi
2011-01-21 6:28 ` [PATCH 1/7] hugetlb: check swap entry in follow_hugetlb_page() Naoya Horiguchi
2011-01-21 6:28 ` [PATCH 2/7] check hugepage swap entry in get_user_pages_fast() Naoya Horiguchi
2011-01-21 6:28 ` [PATCH 3/7] remove putback_lru_pages() in hugepage migration context Naoya Horiguchi
2011-01-21 6:40 ` Minchan Kim
2011-01-21 10:00 ` Naoya Horiguchi
2011-01-21 6:28 ` [PATCH 4/7] hugetlb, migration: add migration_hugepage_entry_wait() Naoya Horiguchi
2011-01-21 6:28 ` [PATCH 5/7] hugetlb: fix race condition between hugepage soft offline and page fault Naoya Horiguchi
2011-01-21 6:28 ` [PATCH 6/7] HWPOISON: pass order to set/clear_page_hwpoison_huge_page() Naoya Horiguchi
2011-01-21 6:29 ` [PATCH 7/7] HWPOISON, hugetlb: fix hard offline for hugepage backed KVM guest Naoya Horiguchi