* [PATCH 01/24] page-types: add standard GPL license head
2009-12-02 3:12 [PATCH 00/24] hwpoison fixes and stress testing filters Wu Fengguang
@ 2009-12-02 3:12 ` Wu Fengguang
2009-12-02 13:08 ` Andi Kleen
2009-12-02 3:12 ` [PATCH 02/24] migrate: page could be locked by hwpoison, dont BUG() Wu Fengguang
` (22 subsequent siblings)
23 siblings, 1 reply; 61+ messages in thread
From: Wu Fengguang @ 2009-12-02 3:12 UTC (permalink / raw)
To: Andi Kleen; +Cc: Andrew Morton, Wu Fengguang, Nick Piggin, linux-mm, LKML
[-- Attachment #1: page-types-gpl.patch --]
[-- Type: text/plain, Size: 1517 bytes --]
CC: Andi Kleen <andi@firstfloor.org>
Signed-off-by: Wu Fengguang <fengguang.wu@intel.com>
---
Documentation/vm/page-types.c | 15 +++++++++++++--
1 file changed, 13 insertions(+), 2 deletions(-)
--- linux-mm.orig/Documentation/vm/page-types.c 2009-11-07 19:28:51.000000000 +0800
+++ linux-mm/Documentation/vm/page-types.c 2009-11-08 22:04:04.000000000 +0800
@@ -1,11 +1,22 @@
/*
* page-types: Tool for querying page flags
*
+ * This program is free software; you can redistribute it and/or modify it
+ * under the terms of the GNU General Public License as published by the Free
+ * Software Foundation; version 2.
+ *
+ * This program is distributed in the hope that it will be useful, but WITHOUT
+ * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or
+ * FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for
+ * more details.
+ *
+ * You should find a copy of v2 of the GNU General Public License somewhere on
+ * your Linux system; if not, write to the Free Software Foundation, Inc., 59
+ * Temple Place, Suite 330, Boston, MA 02111-1307 USA.
+ *
* Copyright (C) 2009 Intel corporation
*
* Authors: Wu Fengguang <fengguang.wu@intel.com>
- *
- * Released under the General Public License (GPL).
*/
#define _LARGEFILE64_SOURCE
--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org. For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>
^ permalink raw reply [flat|nested] 61+ messages in thread
* Re: [PATCH 01/24] page-types: add standard GPL license head
2009-12-02 3:12 ` [PATCH 01/24] page-types: add standard GPL license head Wu Fengguang
@ 2009-12-02 13:08 ` Andi Kleen
0 siblings, 0 replies; 61+ messages in thread
From: Andi Kleen @ 2009-12-02 13:08 UTC (permalink / raw)
To: Wu Fengguang; +Cc: Andi Kleen, Andrew Morton, Nick Piggin, linux-mm, LKML
On Wed, Dec 02, 2009 at 11:12:32AM +0800, Wu Fengguang wrote:
> CC: Andi Kleen <andi@firstfloor.org>
> Signed-off-by: Wu Fengguang <fengguang.wu@intel.com>
> ---
> Documentation/vm/page-types.c | 15 +++++++++++++--
> 1 file changed, 13 insertions(+), 2 deletions(-)
>
> --- linux-mm.orig/Documentation/vm/page-types.c 2009-11-07 19:28:51.000000000 +0800
> +++ linux-mm/Documentation/vm/page-types.c 2009-11-08 22:04:04.000000000 +0800
> @@ -1,11 +1,22 @@
> /*
> * page-types: Tool for querying page flags
> *
> + * This program is free software; you can redistribute it and/or modify it
> + * under the terms of the GNU General Public License as published by the Free
> + * Software Foundation; version 2.
I guess it's not fully hwpoison department, but I'll just include it
because it's so simple.
-Andi
^ permalink raw reply [flat|nested] 61+ messages in thread
* [PATCH 02/24] migrate: page could be locked by hwpoison, dont BUG()
2009-12-02 3:12 [PATCH 00/24] hwpoison fixes and stress testing filters Wu Fengguang
2009-12-02 3:12 ` [PATCH 01/24] page-types: add standard GPL license head Wu Fengguang
@ 2009-12-02 3:12 ` Wu Fengguang
2009-12-02 13:09 ` Andi Kleen
2009-12-02 14:50 ` Christoph Lameter
2009-12-02 3:12 ` [PATCH 03/24] HWPOISON: remove the anonymous entry Wu Fengguang
` (21 subsequent siblings)
23 siblings, 2 replies; 61+ messages in thread
From: Wu Fengguang @ 2009-12-02 3:12 UTC (permalink / raw)
To: Andi Kleen
Cc: Andrew Morton, Nick Piggin, Christoph Lameter, KAMEZAWA Hiroyuki,
Wu Fengguang, linux-mm, LKML
[-- Attachment #1: hwpoison-migrate-trylock-fix.patch --]
[-- Type: text/plain, Size: 1006 bytes --]
The new page could be taken by hwpoison, in which case
return EAGAIN to allocate a new page and retry.
CC: Nick Piggin <npiggin@suse.de>
CC: Christoph Lameter <cl@linux-foundation.org>
CC: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Signed-off-by: Wu Fengguang <fengguang.wu@intel.com>
---
mm/migrate.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
--- linux-mm.orig/mm/migrate.c 2009-11-02 10:18:45.000000000 +0800
+++ linux-mm/mm/migrate.c 2009-11-02 10:26:16.000000000 +0800
@@ -556,7 +556,7 @@ static int move_to_new_page(struct page
* holding a reference to the new page at this point.
*/
if (!trylock_page(newpage))
- BUG();
+ return -EAGAIN; /* got by hwpoison */
/* Prepare mapping for the new page.*/
newpage->index = page->index;
^ permalink raw reply [flat|nested] 61+ messages in thread
* Re: [PATCH 02/24] migrate: page could be locked by hwpoison, dont BUG()
2009-12-02 3:12 ` [PATCH 02/24] migrate: page could be locked by hwpoison, dont BUG() Wu Fengguang
@ 2009-12-02 13:09 ` Andi Kleen
2009-12-02 14:50 ` Christoph Lameter
1 sibling, 0 replies; 61+ messages in thread
From: Andi Kleen @ 2009-12-02 13:09 UTC (permalink / raw)
To: Wu Fengguang
Cc: Andi Kleen, Andrew Morton, Nick Piggin, Christoph Lameter,
KAMEZAWA Hiroyuki, linux-mm, LKML
On Wed, Dec 02, 2009 at 11:12:33AM +0800, Wu Fengguang wrote:
> The new page could be taken by hwpoison, in which case
> return EAGAIN to allocate a new page and retry.
Previously there were some complaints about this patch, but I guess
it doesn't hurt, so I'll add it.
-Andi
^ permalink raw reply [flat|nested] 61+ messages in thread
* Re: [PATCH 02/24] migrate: page could be locked by hwpoison, dont BUG()
2009-12-02 3:12 ` [PATCH 02/24] migrate: page could be locked by hwpoison, dont BUG() Wu Fengguang
2009-12-02 13:09 ` Andi Kleen
@ 2009-12-02 14:50 ` Christoph Lameter
2009-12-03 1:34 ` Wu Fengguang
1 sibling, 1 reply; 61+ messages in thread
From: Christoph Lameter @ 2009-12-02 14:50 UTC (permalink / raw)
To: Wu Fengguang
Cc: Andi Kleen, Andrew Morton, Nick Piggin, KAMEZAWA Hiroyuki,
linux-mm, LKML
On Wed, 2 Dec 2009, Wu Fengguang wrote:
> mm/migrate.c | 2 +-
> 1 file changed, 1 insertion(+), 1 deletion(-)
>
> --- linux-mm.orig/mm/migrate.c 2009-11-02 10:18:45.000000000 +0800
> +++ linux-mm/mm/migrate.c 2009-11-02 10:26:16.000000000 +0800
> @@ -556,7 +556,7 @@ static int move_to_new_page(struct page
> * holding a reference to the new page at this point.
> */
> if (!trylock_page(newpage))
> - BUG();
> + return -EAGAIN; /* got by hwpoison */
>
> /* Prepare mapping for the new page.*/
> newpage->index = page->index;
The error handling code in unmap_and_move() assumes that the page is
locked upon return from move_to_new_page() even if it failed.
If you return EAGAIN then it may try to unlock a page that is not
locked.
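
The hazard described above can be sketched in user space. This is a toy
model only, not the code in mm/migrate.c: struct toy_page, toy_trylock(),
toy_unlock(), toy_move_to_new_page() and toy_caller() are hypothetical names
invented for illustration. It shows one possible way an early -EAGAIN return
could stay safe, assuming the caller is told whether the lock was taken.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Toy stand-in for a page and its lock bit; not the kernel's struct page. */
struct toy_page {
    bool locked;
};

static bool toy_trylock(struct toy_page *p)
{
    if (p->locked)
        return false;
    p->locked = true;
    return true;
}

static void toy_unlock(struct toy_page *p)
{
    /* Unlocking a lock we do not hold is the hazard being illustrated. */
    assert(p->locked);
    p->locked = false;
}

/* On -EAGAIN the lock was never taken, which *we_locked reports. */
static int toy_move_to_new_page(struct toy_page *newpage, bool *we_locked)
{
    *we_locked = toy_trylock(newpage);
    return *we_locked ? 0 : -EAGAIN;
}

/* The error path unlocks only what this call chain actually locked. */
static int toy_caller(struct toy_page *newpage)
{
    bool we_locked;
    int rc = toy_move_to_new_page(newpage, &we_locked);

    if (we_locked)
        toy_unlock(newpage);
    return rc;
}
```

A caller that unlocked unconditionally would trip the assert in toy_unlock()
or, worse, release a lock held by someone else.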
^ permalink raw reply [flat|nested] 61+ messages in thread
* Re: [PATCH 02/24] migrate: page could be locked by hwpoison, dont BUG()
2009-12-02 14:50 ` Christoph Lameter
@ 2009-12-03 1:34 ` Wu Fengguang
0 siblings, 0 replies; 61+ messages in thread
From: Wu Fengguang @ 2009-12-03 1:34 UTC (permalink / raw)
To: Christoph Lameter
Cc: Andi Kleen, Andrew Morton, Nick Piggin, KAMEZAWA Hiroyuki,
linux-mm, LKML
On Wed, Dec 02, 2009 at 10:50:10PM +0800, Christoph Lameter wrote:
> On Wed, 2 Dec 2009, Wu Fengguang wrote:
>
> > mm/migrate.c | 2 +-
> > 1 file changed, 1 insertion(+), 1 deletion(-)
> >
> > --- linux-mm.orig/mm/migrate.c 2009-11-02 10:18:45.000000000 +0800
> > +++ linux-mm/mm/migrate.c 2009-11-02 10:26:16.000000000 +0800
> > @@ -556,7 +556,7 @@ static int move_to_new_page(struct page
> > * holding a reference to the new page at this point.
> > */
> > if (!trylock_page(newpage))
> > - BUG();
> > + return -EAGAIN; /* got by hwpoison */
> >
> > /* Prepare mapping for the new page.*/
> > newpage->index = page->index;
>
> The error handling code in unmap_and_move() assumes that the page is
> locked upon return from move_to_new_page() even if it failed.
>
> If you return EAGAIN then it may try to unlock a page that is not
> locked.
Ah yes, thanks! We could fix it with more changes, however it seems
better to just drop this patch.
Thanks,
Fengguang
^ permalink raw reply [flat|nested] 61+ messages in thread
* [PATCH 03/24] HWPOISON: remove the anonymous entry
2009-12-02 3:12 [PATCH 00/24] hwpoison fixes and stress testing filters Wu Fengguang
2009-12-02 3:12 ` [PATCH 01/24] page-types: add standard GPL license head Wu Fengguang
2009-12-02 3:12 ` [PATCH 02/24] migrate: page could be locked by hwpoison, dont BUG() Wu Fengguang
@ 2009-12-02 3:12 ` Wu Fengguang
2009-12-02 3:12 ` [PATCH 04/24] HWPOISON: return ENXIO on invalid pfn Wu Fengguang
` (20 subsequent siblings)
23 siblings, 0 replies; 61+ messages in thread
From: Wu Fengguang @ 2009-12-02 3:12 UTC (permalink / raw)
To: Andi Kleen; +Cc: Andrew Morton, Wu Fengguang, Nick Piggin, linux-mm, LKML
[-- Attachment #1: hwpoison-remove-anon-entry.patch --]
[-- Type: text/plain, Size: 883 bytes --]
(PG_swapbacked && !PG_lru) pages are ridiculous.
Better to treat them as unknown pages.
CC: Andi Kleen <andi@firstfloor.org>
Signed-off-by: Wu Fengguang <fengguang.wu@intel.com>
---
mm/memory-failure.c | 1 -
1 file changed, 1 deletion(-)
--- linux-mm.orig/mm/memory-failure.c 2009-11-02 10:18:45.000000000 +0800
+++ linux-mm/mm/memory-failure.c 2009-11-02 10:26:17.000000000 +0800
@@ -589,7 +589,6 @@ static struct page_state {
{ lru|dirty, lru|dirty, "LRU", me_pagecache_dirty },
{ lru|dirty, lru, "clean LRU", me_pagecache_clean },
- { swapbacked, swapbacked, "anonymous", me_pagecache_clean },
/*
* Catchall entry: must be at end.
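
The table touched by this patch is matched first-hit with a mask/result
pair per row, so dropping a row sends those pages to the catch-all. A
minimal user-space sketch of that lookup follows. The flag values, the
struct toy_state layout and toy_classify() are assumptions made for
illustration; they are not the kernel's PG_* bits or error_states[] table.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical flag bits for illustration; not the kernel's PG_* values. */
enum {
    TOY_LRU        = 1u << 0,
    TOY_DIRTY      = 1u << 1,
    TOY_SWAPBACKED = 1u << 2,
};

struct toy_state {
    unsigned int mask;  /* which bits to look at */
    unsigned int res;   /* required value of those bits */
    const char *msg;
};

/* First match wins; the { 0, 0 } catch-all must stay last. */
static const struct toy_state toy_states[] = {
    { TOY_LRU | TOY_DIRTY, TOY_LRU | TOY_DIRTY, "LRU" },
    { TOY_LRU | TOY_DIRTY, TOY_LRU,             "clean LRU" },
    { 0,                   0,                   "unknown page state" },
};

#define TOY_NR_STATES (sizeof(toy_states) / sizeof(toy_states[0]))

static const char *toy_classify(unsigned int flags)
{
    size_t i;

    for (i = 0; ; i++) {
        /* The catch-all guarantees a match before the table ends. */
        assert(i < TOY_NR_STATES);
        if ((flags & toy_states[i].mask) == toy_states[i].res)
            return toy_states[i].msg;
    }
}
```

With no swapbacked-only row in the table, a page carrying only
TOY_SWAPBACKED matches nothing specific and lands on the last entry.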
^ permalink raw reply [flat|nested] 61+ messages in thread
* [PATCH 04/24] HWPOISON: return ENXIO on invalid pfn
2009-12-02 3:12 [PATCH 00/24] hwpoison fixes and stress testing filters Wu Fengguang
` (2 preceding siblings ...)
2009-12-02 3:12 ` [PATCH 03/24] HWPOISON: remove the anonymous entry Wu Fengguang
@ 2009-12-02 3:12 ` Wu Fengguang
2009-12-02 3:12 ` [PATCH 05/24] HWPOISON: avoid grabbing page for two times Wu Fengguang
` (19 subsequent siblings)
23 siblings, 0 replies; 61+ messages in thread
From: Wu Fengguang @ 2009-12-02 3:12 UTC (permalink / raw)
To: Andi Kleen; +Cc: Andrew Morton, Wu Fengguang, Nick Piggin, linux-mm, LKML
[-- Attachment #1: hwpoison-action_result-valid-pfn.patch --]
[-- Type: text/plain, Size: 1466 bytes --]
Return ENXIO to indicate "No such device or address".
This also avoids calling action_result() with invalid pfn.
CC: Andi Kleen <andi@firstfloor.org>
Signed-off-by: Wu Fengguang <fengguang.wu@intel.com>
---
mm/memory-failure.c | 12 ++++++------
1 file changed, 6 insertions(+), 6 deletions(-)
--- linux-mm.orig/mm/memory-failure.c 2009-11-02 10:26:17.000000000 +0800
+++ linux-mm/mm/memory-failure.c 2009-11-02 10:26:17.000000000 +0800
@@ -598,13 +598,11 @@ static struct page_state {
static void action_result(unsigned long pfn, char *msg, int result)
{
- struct page *page = NULL;
- if (pfn_valid(pfn))
- page = pfn_to_page(pfn);
+ struct page *page = pfn_to_page(pfn);
printk(KERN_ERR "MCE %#lx: %s%s page recovery: %s\n",
pfn,
- page && PageDirty(page) ? "dirty " : "",
+ PageDirty(page) ? "dirty " : "",
msg, action_name[result]);
}
@@ -730,8 +728,10 @@ int __memory_failure(unsigned long pfn,
panic("Memory failure from trap %d on page %lx", trapno, pfn);
if (!pfn_valid(pfn)) {
- action_result(pfn, "memory outside kernel control", IGNORED);
- return -EIO;
+ printk(KERN_ERR
+ "MCE %#lx: memory outside kernel control\n",
+ pfn);
+ return -ENXIO;
}
p = pfn_to_page(pfn);
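
The ordering this patch relies on (validate the pfn first, translate it to
a page only afterwards) can be sketched as a toy user-space model. The
names toy_mem_map, toy_pfn_valid(), toy_pfn_to_page() and
toy_memory_failure() are hypothetical and only mimic the shape of the
kernel helpers.

```c
#include <assert.h>
#include <errno.h>

#define TOY_NR_PAGES 4UL

/* Toy page descriptor and a tiny "memory map"; purely illustrative. */
struct toy_page {
    int dirty;
};

static struct toy_page toy_mem_map[TOY_NR_PAGES];

static int toy_pfn_valid(unsigned long pfn)
{
    return pfn < TOY_NR_PAGES;
}

static struct toy_page *toy_pfn_to_page(unsigned long pfn)
{
    /* Only meaningful for valid pfns, hence the check in the caller. */
    assert(toy_pfn_valid(pfn));
    return &toy_mem_map[pfn];
}

static int toy_memory_failure(unsigned long pfn)
{
    struct toy_page *p;

    /* Reject bad pfns before anything dereferences a page for them. */
    if (!toy_pfn_valid(pfn))
        return -ENXIO;

    p = toy_pfn_to_page(pfn);
    return p->dirty ? -EBUSY : 0;
}
```

Because the invalid case returns early, helpers further down can assume a
valid pfn and need no NULL-page handling of their own.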
^ permalink raw reply [flat|nested] 61+ messages in thread
* [PATCH 05/24] HWPOISON: avoid grabbing page for two times
2009-12-02 3:12 [PATCH 00/24] hwpoison fixes and stress testing filters Wu Fengguang
` (3 preceding siblings ...)
2009-12-02 3:12 ` [PATCH 04/24] HWPOISON: return ENXIO on invalid pfn Wu Fengguang
@ 2009-12-02 3:12 ` Wu Fengguang
2009-12-02 3:12 ` [PATCH 06/24] HWPOISON: abort on failed unmap Wu Fengguang
` (18 subsequent siblings)
23 siblings, 0 replies; 61+ messages in thread
From: Wu Fengguang @ 2009-12-02 3:12 UTC (permalink / raw)
To: Andi Kleen; +Cc: Andrew Morton, Wu Fengguang, Nick Piggin, linux-mm, LKML
[-- Attachment #1: hwpoison-no-double-ref.patch --]
[-- Type: text/plain, Size: 2321 bytes --]
If page is double referenced in madvise_hwpoison() and __memory_failure(),
remove_mapping() will fail because it expects page_count=2. Fix it by
not grabbing extra page count in __memory_failure().
CC: Andi Kleen <andi@firstfloor.org>
Signed-off-by: Wu Fengguang <fengguang.wu@intel.com>
---
mm/madvise.c | 1 -
mm/memory-failure.c | 8 ++++----
2 files changed, 4 insertions(+), 5 deletions(-)
--- linux-mm.orig/mm/madvise.c 2009-11-02 11:12:02.000000000 +0800
+++ linux-mm/mm/madvise.c 2009-11-02 12:31:52.000000000 +0800
@@ -238,7 +238,6 @@ static int madvise_hwpoison(unsigned lon
page_to_pfn(p), start);
/* Ignore return value for now */
__memory_failure(page_to_pfn(p), 0, 1);
- put_page(p);
}
return ret;
}
--- linux-mm.orig/mm/memory-failure.c 2009-11-02 12:31:49.000000000 +0800
+++ linux-mm/mm/memory-failure.c 2009-11-02 13:53:41.000000000 +0800
@@ -607,7 +607,7 @@ static void action_result(unsigned long
}
static int page_action(struct page_state *ps, struct page *p,
- unsigned long pfn, int ref)
+ unsigned long pfn)
{
int result;
int count;
@@ -615,7 +615,7 @@ static int page_action(struct page_state
result = ps->action(p, pfn);
action_result(pfn, ps->msg, result);
- count = page_count(p) - 1 - ref;
+ count = page_count(p) - 1;
if (count != 0)
printk(KERN_ERR
"MCE %#lx: %s page still referenced by %d users\n",
@@ -753,7 +753,7 @@ int __memory_failure(unsigned long pfn,
* In fact it's dangerous to directly bump up page count from 0,
* that may make page_freeze_refs()/page_unfreeze_refs() mismatch.
*/
- if (!get_page_unless_zero(compound_head(p))) {
+ if (!ref && !get_page_unless_zero(compound_head(p))) {
action_result(pfn, "free or high order kernel", IGNORED);
return PageBuddy(compound_head(p)) ? 0 : -EBUSY;
}
@@ -801,7 +801,7 @@ int __memory_failure(unsigned long pfn,
res = -EBUSY;
for (ps = error_states;; ps++) {
if (((p->flags | lru_flag)& ps->mask) == ps->res) {
- res = page_action(ps, p, pfn, ref);
+ res = page_action(ps, p, pfn);
break;
}
}
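
The reference counting problem from the commit message can be modelled in
a few lines of user-space C. This is a sketch under stated assumptions:
struct toy_page, toy_remove_mapping() and toy_memory_failure() are
hypothetical, and the expected count of 2 is taken from the commit message
rather than verified against remove_mapping() itself.

```c
#include <assert.h>
#include <errno.h>

/* Toy page with nothing but a reference count. */
struct toy_page {
    int count;
};

/* Succeeds only with exactly two references, per the commit message. */
static int toy_remove_mapping(const struct toy_page *p)
{
    return p->count == 2;
}

/*
 * ref != 0 means the caller already holds a reference on the page, so
 * taking another one here would leave the count one too high.
 */
static int toy_memory_failure(struct toy_page *p, int ref)
{
    assert(p->count > 0);
    if (!ref)
        p->count++;
    return toy_remove_mapping(p) ? 0 : -EBUSY;
}
```

Calling toy_memory_failure() with ref == 0 on a page the caller has already
pinned reproduces the double reference: the count reaches 3 and the removal
fails.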
^ permalink raw reply [flat|nested] 61+ messages in thread
* [PATCH 06/24] HWPOISON: abort on failed unmap
2009-12-02 3:12 [PATCH 00/24] hwpoison fixes and stress testing filters Wu Fengguang
` (4 preceding siblings ...)
2009-12-02 3:12 ` [PATCH 05/24] HWPOISON: avoid grabbing page for two times Wu Fengguang
@ 2009-12-02 3:12 ` Wu Fengguang
2009-12-02 13:11 ` Andi Kleen
2009-12-02 3:12 ` [PATCH 07/24] HWPOISON: comment the possible set_page_dirty() race Wu Fengguang
` (17 subsequent siblings)
23 siblings, 1 reply; 61+ messages in thread
From: Wu Fengguang @ 2009-12-02 3:12 UTC (permalink / raw)
To: Andi Kleen; +Cc: Andrew Morton, Wu Fengguang, Nick Piggin, linux-mm, LKML
[-- Attachment #1: hwpoison-abort-on-failed-unmap.patch --]
[-- Type: text/plain, Size: 2268 bytes --]
Don't try to isolate a still mapped page. Otherwise we will hit the
BUG_ON(page_mapped(page)) in __remove_from_page_cache().
CC: Andi Kleen <andi@firstfloor.org>
Signed-off-by: Wu Fengguang <fengguang.wu@intel.com>
---
mm/memory-failure.c | 19 ++++++++++++++-----
1 file changed, 14 insertions(+), 5 deletions(-)
--- linux-mm.orig/mm/memory-failure.c 2009-11-30 10:35:38.000000000 +0800
+++ linux-mm/mm/memory-failure.c 2009-11-30 11:11:25.000000000 +0800
@@ -635,7 +635,7 @@ static int page_action(struct page_state
* Do all that is necessary to remove user space mappings. Unmap
* the pages and send SIGBUS to the processes if the data was dirty.
*/
-static void hwpoison_user_mappings(struct page *p, unsigned long pfn,
+static int hwpoison_user_mappings(struct page *p, unsigned long pfn,
int trapno)
{
enum ttu_flags ttu = TTU_UNMAP | TTU_IGNORE_MLOCK | TTU_IGNORE_ACCESS;
@@ -645,15 +645,18 @@ static void hwpoison_user_mappings(struc
int i;
int kill = 1;
- if (PageReserved(p) || PageCompound(p) || PageSlab(p) || PageKsm(p))
- return;
+ if (PageReserved(p) || PageSlab(p))
+ return SWAP_SUCCESS;
/*
* This check implies we don't kill processes if their pages
* are in the swap cache early. Those are always late kills.
*/
if (!page_mapped(p))
- return;
+ return SWAP_SUCCESS;
+
+ if (PageCompound(p) || PageKsm(p))
+ return SWAP_FAIL;
if (PageSwapCache(p)) {
printk(KERN_ERR
@@ -715,6 +718,8 @@ static void hwpoison_user_mappings(struc
*/
kill_procs_ao(&tokill, !!PageDirty(p), trapno,
ret != SWAP_SUCCESS, pfn);
+
+ return ret;
}
int __memory_failure(unsigned long pfn, int trapno, int ref)
@@ -786,8 +791,12 @@ int __memory_failure(unsigned long pfn,
/*
* Now take care of user space mappings.
+ * Abort on fail: __remove_from_page_cache() assumes unmapped page.
*/
- hwpoison_user_mappings(p, pfn, trapno);
+ if (hwpoison_user_mappings(p, pfn, trapno) != SWAP_SUCCESS) {
+ res = -EBUSY;
+ goto out;
+ }
/*
* Torn down by someone else?
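
The control flow introduced here (report the unmap status and stop before
the step that requires an unmapped page) can be sketched as a toy model.
All names are hypothetical stand-ins, and the diagnostic message is an
assumption of this sketch, not text from the patch.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stdio.h>

enum toy_swap { TOY_SWAP_SUCCESS, TOY_SWAP_FAIL };

/* Toy page: number of user mappings plus a "cannot unmap" marker. */
struct toy_page {
    int mapcount;
    bool compound;
};

static enum toy_swap toy_unmap(struct toy_page *p)
{
    if (p->mapcount == 0)
        return TOY_SWAP_SUCCESS;    /* nothing to do */
    if (p->compound)
        return TOY_SWAP_FAIL;       /* mapped, but not handled here */
    p->mapcount = 0;
    return TOY_SWAP_SUCCESS;
}

static void toy_remove_from_cache(struct toy_page *p)
{
    /* Mirrors the assumption that the page is no longer mapped. */
    assert(p->mapcount == 0);
}

static int toy_memory_failure(struct toy_page *p)
{
    if (toy_unmap(p) != TOY_SWAP_SUCCESS) {
        fprintf(stderr, "toy: unmap failed, page left in place\n");
        return -EBUSY;
    }
    toy_remove_from_cache(p);
    return 0;
}
```

Returning early keeps the still-mapped page away from the removal step,
whose assertion would otherwise fire.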
^ permalink raw reply [flat|nested] 61+ messages in thread
* Re: [PATCH 06/24] HWPOISON: abort on failed unmap
2009-12-02 3:12 ` [PATCH 06/24] HWPOISON: abort on failed unmap Wu Fengguang
@ 2009-12-02 13:11 ` Andi Kleen
2009-12-02 13:28 ` Wu Fengguang
0 siblings, 1 reply; 61+ messages in thread
From: Andi Kleen @ 2009-12-02 13:11 UTC (permalink / raw)
To: Wu Fengguang; +Cc: Andi Kleen, Andrew Morton, Nick Piggin, linux-mm, LKML
> * Now take care of user space mappings.
> + * Abort on fail: __remove_from_page_cache() assumes unmapped page.
> */
> - hwpoison_user_mappings(p, pfn, trapno);
> + if (hwpoison_user_mappings(p, pfn, trapno) != SWAP_SUCCESS) {
> + res = -EBUSY;
> + goto out;
It would be good to print something in this case.
Did you actually see it during testing?
Or maybe loop forever in the unmapper.
-Andi
--
ak@linux.intel.com -- Speaking for myself only.
^ permalink raw reply [flat|nested] 61+ messages in thread
* Re: [PATCH 06/24] HWPOISON: abort on failed unmap
2009-12-02 13:11 ` Andi Kleen
@ 2009-12-02 13:28 ` Wu Fengguang
2009-12-02 13:44 ` Andi Kleen
0 siblings, 1 reply; 61+ messages in thread
From: Wu Fengguang @ 2009-12-02 13:28 UTC (permalink / raw)
To: Andi Kleen; +Cc: Andrew Morton, Nick Piggin, linux-mm, LKML
On Wed, Dec 02, 2009 at 09:11:50PM +0800, Andi Kleen wrote:
> > * Now take care of user space mappings.
> > + * Abort on fail: __remove_from_page_cache() assumes unmapped page.
> > */
> > - hwpoison_user_mappings(p, pfn, trapno);
> > + if (hwpoison_user_mappings(p, pfn, trapno) != SWAP_SUCCESS) {
> > + res = -EBUSY;
> > + goto out;
>
> It would be good to print something in this case.
OK.
> Did you actually see it during testing?
Perhaps not.
> Or maybe loop forever in the unmapper.
!SWAP_SUCCESS should be rare, so not necessary to loop forever?
Thanks,
Fengguang
^ permalink raw reply [flat|nested] 61+ messages in thread
* Re: [PATCH 06/24] HWPOISON: abort on failed unmap
2009-12-02 13:28 ` Wu Fengguang
@ 2009-12-02 13:44 ` Andi Kleen
0 siblings, 0 replies; 61+ messages in thread
From: Andi Kleen @ 2009-12-02 13:44 UTC (permalink / raw)
To: Wu Fengguang; +Cc: Andi Kleen, Andrew Morton, Nick Piggin, linux-mm, LKML
On Wed, Dec 02, 2009 at 09:28:19PM +0800, Wu Fengguang wrote:
> On Wed, Dec 02, 2009 at 09:11:50PM +0800, Andi Kleen wrote:
> > > * Now take care of user space mappings.
> > > + * Abort on fail: __remove_from_page_cache() assumes unmapped page.
> > > */
> > > - hwpoison_user_mappings(p, pfn, trapno);
> > > + if (hwpoison_user_mappings(p, pfn, trapno) != SWAP_SUCCESS) {
> > > + res = -EBUSY;
> > > + goto out;
> >
> > It would be good to print something in this case.
>
> OK.
I'll add it.
>
> > Did you actually see it during testing?
>
> Perhaps not.
>
> > Or maybe loop forever in the unmapper.
>
> !SWAP_SUCCESS should be rare, so not necessary to loop forever?
I think the loop I originally added was overcautious and could
possibly even be removed now. It probably needs some more analysis of how
likely unmapping failures really are.
-Andi
--
ak@linux.intel.com -- Speaking for myself only.
^ permalink raw reply [flat|nested] 61+ messages in thread
* [PATCH 07/24] HWPOISON: comment the possible set_page_dirty() race
2009-12-02 3:12 [PATCH 00/24] hwpoison fixes and stress testing filters Wu Fengguang
` (5 preceding siblings ...)
2009-12-02 3:12 ` [PATCH 06/24] HWPOISON: abort on failed unmap Wu Fengguang
@ 2009-12-02 3:12 ` Wu Fengguang
2009-12-02 3:12 ` [PATCH 08/24] HWPOISON: comment dirty swapcache pages Wu Fengguang
` (16 subsequent siblings)
23 siblings, 0 replies; 61+ messages in thread
From: Wu Fengguang @ 2009-12-02 3:12 UTC (permalink / raw)
To: Andi Kleen; +Cc: Andrew Morton, Wu Fengguang, Nick Piggin, linux-mm, LKML
[-- Attachment #1: hwpoison-comment-dirty.patch --]
[-- Type: text/plain, Size: 984 bytes --]
CC: Andi Kleen <andi@firstfloor.org>
Signed-off-by: Wu Fengguang <fengguang.wu@intel.com>
---
mm/memory-failure.c | 2 ++
1 file changed, 2 insertions(+)
--- linux-mm.orig/mm/memory-failure.c 2009-11-30 11:11:25.000000000 +0800
+++ linux-mm/mm/memory-failure.c 2009-11-30 11:12:41.000000000 +0800
@@ -667,6 +667,8 @@ static int hwpoison_user_mappings(struct
/*
* Propagate the dirty bit from PTEs to struct page first, because we
* need this to decide if we should kill or just drop the page.
+ * XXX: the dirty test could be racy: set_page_dirty() may not always
+ * be called inside page lock (it's recommended but not enforced).
*/
mapping = page_mapping(p);
if (!PageDirty(p) && mapping && mapping_cap_writeback_dirty(mapping)) {
^ permalink raw reply [flat|nested] 61+ messages in thread
* [PATCH 08/24] HWPOISON: comment dirty swapcache pages
2009-12-02 3:12 [PATCH 00/24] hwpoison fixes and stress testing filters Wu Fengguang
` (6 preceding siblings ...)
2009-12-02 3:12 ` [PATCH 07/24] HWPOISON: comment the possible set_page_dirty() race Wu Fengguang
@ 2009-12-02 3:12 ` Wu Fengguang
2009-12-02 3:12 ` [PATCH 09/24] HWPOISON: introduce delete_from_lru_cache() Wu Fengguang
` (15 subsequent siblings)
23 siblings, 0 replies; 61+ messages in thread
From: Wu Fengguang @ 2009-12-02 3:12 UTC (permalink / raw)
To: Andi Kleen; +Cc: Andrew Morton, Wu Fengguang, Nick Piggin, linux-mm, LKML
[-- Attachment #1: hwpoison-comment-dirty-swapcache.patch --]
[-- Type: text/plain, Size: 893 bytes --]
CC: Andi Kleen <andi@firstfloor.org>
Signed-off-by: Wu Fengguang <fengguang.wu@intel.com>
---
mm/memory.c | 4 ++++
1 file changed, 4 insertions(+)
--- linux-mm.orig/mm/memory.c 2009-11-24 16:50:44.000000000 +0800
+++ linux-mm/mm/memory.c 2009-11-30 10:35:39.000000000 +0800
@@ -2540,6 +2540,10 @@ static int do_swap_page(struct mm_struct
ret = VM_FAULT_MAJOR;
count_vm_event(PGMAJFAULT);
} else if (PageHWPoison(page)) {
+ /*
+ * hwpoisoned dirty swapcache pages are kept for killing
+ * owner processes (which may be unknown at hwpoison time)
+ */
ret = VM_FAULT_HWPOISON;
delayacct_clear_flag(DELAYACCT_PF_SWAPIN);
goto out_release;
^ permalink raw reply [flat|nested] 61+ messages in thread
* [PATCH 09/24] HWPOISON: introduce delete_from_lru_cache()
2009-12-02 3:12 [PATCH 00/24] hwpoison fixes and stress testing filters Wu Fengguang
` (7 preceding siblings ...)
2009-12-02 3:12 ` [PATCH 08/24] HWPOISON: comment dirty swapcache pages Wu Fengguang
@ 2009-12-02 3:12 ` Wu Fengguang
2009-12-02 3:12 ` [PATCH 10/24] HWPOISON: remove the free buddy page handler Wu Fengguang
` (14 subsequent siblings)
23 siblings, 0 replies; 61+ messages in thread
From: Wu Fengguang @ 2009-12-02 3:12 UTC (permalink / raw)
To: Andi Kleen; +Cc: Andrew Morton, Wu Fengguang, Nick Piggin, linux-mm, LKML
[-- Attachment #1: lru-flags.patch --]
[-- Type: text/plain, Size: 3526 bytes --]
Introduce delete_from_lru_cache() to
- clear PG_active, PG_unevictable to avoid complaints at unpoison time
- move the isolate_lru_page() call back to the handlers instead of the
entrance of __memory_failure(); this is more hwpoison filter friendly
CC: Andi Kleen <andi@firstfloor.org>
Signed-off-by: Wu Fengguang <fengguang.wu@intel.com>
---
mm/memory-failure.c | 45 ++++++++++++++++++++++++++++++++++--------
1 file changed, 37 insertions(+), 8 deletions(-)
--- linux-mm.orig/mm/memory-failure.c 2009-11-30 11:12:41.000000000 +0800
+++ linux-mm/mm/memory-failure.c 2009-11-30 20:04:43.000000000 +0800
@@ -328,6 +328,30 @@ static const char *action_name[] = {
};
/*
+ * XXX: It is possible that a page is isolated from LRU cache,
+ * and then kept in swap cache or failed to remove from page cache.
+ * The page count will stop it from being freed by unpoison.
+ * Stress tests should be aware of this memory leak problem.
+ */
+static int delete_from_lru_cache(struct page *p)
+{
+ if (!isolate_lru_page(p)) {
+ /*
+ * Clear sensible page flags, so that the buddy system won't
+ * complain when the page is unpoison-and-freed.
+ */
+ ClearPageActive(p);
+ ClearPageUnevictable(p);
+ /*
+ * drop the page count elevated by isolate_lru_page()
+ */
+ page_cache_release(p);
+ return 0;
+ }
+ return -EIO;
+}
+
+/*
* Error hit kernel page.
* Do nothing, try to be lucky and not touch this instead. For a few cases we
* could be more sophisticated.
@@ -371,6 +395,8 @@ static int me_pagecache_clean(struct pag
int ret = FAILED;
struct address_space *mapping;
+ delete_from_lru_cache(p);
+
/*
* For anonymous pages we're done the only reference left
* should be the one m_f() holds.
@@ -500,14 +526,20 @@ static int me_swapcache_dirty(struct pag
/* Trigger EIO in shmem: */
ClearPageUptodate(p);
- return DELAYED;
+ if (!delete_from_lru_cache(p))
+ return DELAYED;
+ else
+ return FAILED;
}
static int me_swapcache_clean(struct page *p, unsigned long pfn)
{
delete_from_swap_cache(p);
- return RECOVERED;
+ if (!delete_from_lru_cache(p))
+ return RECOVERED;
+ else
+ return FAILED;
}
/*
@@ -726,7 +758,6 @@ static int hwpoison_user_mappings(struct
int __memory_failure(unsigned long pfn, int trapno, int ref)
{
- unsigned long lru_flag;
struct page_state *ps;
struct page *p;
int res;
@@ -775,13 +806,11 @@ int __memory_failure(unsigned long pfn,
*/
if (!PageLRU(p))
lru_add_drain_all();
- lru_flag = p->flags & lru;
- if (isolate_lru_page(p)) {
+ if (!PageLRU(p)) {
action_result(pfn, "non LRU", IGNORED);
put_page(p);
return -EBUSY;
}
- page_cache_release(p);
/*
* Lock the page and wait for writeback to finish.
@@ -803,7 +832,7 @@ int __memory_failure(unsigned long pfn,
/*
* Torn down by someone else?
*/
- if ((lru_flag & lru) && !PageSwapCache(p) && p->mapping == NULL) {
+ if (PageLRU(p) && !PageSwapCache(p) && p->mapping == NULL) {
action_result(pfn, "already truncated LRU", IGNORED);
res = 0;
goto out;
@@ -811,7 +840,7 @@ int __memory_failure(unsigned long pfn,
res = -EBUSY;
for (ps = error_states;; ps++) {
- if (((p->flags | lru_flag)& ps->mask) == ps->res) {
+ if ((p->flags & ps->mask) == ps->res) {
res = page_action(ps, p, pfn);
break;
}
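
The helper added by this patch follows an isolate, clear flags, drop
reference pattern. A rough user-space sketch of that pattern is given here;
struct toy_page, toy_isolate_lru() and toy_delete_from_lru() are
hypothetical and only imitate the shape of the kernel functions named in
the diff.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Toy page: LRU membership, two state flags and a reference count. */
struct toy_page {
    bool on_lru;
    bool active;
    bool unevictable;
    int count;
};

/* Returns 0 and takes a reference on success, like the real helper's shape. */
static int toy_isolate_lru(struct toy_page *p)
{
    if (!p->on_lru)
        return -EBUSY;
    p->on_lru = false;
    p->count++;
    return 0;
}

static int toy_delete_from_lru(struct toy_page *p)
{
    if (!toy_isolate_lru(p)) {
        /* Clear state flags so a later free finds a clean page. */
        p->active = false;
        p->unevictable = false;
        /* Drop the reference taken by the isolation step. */
        assert(p->count > 0);
        p->count--;
        return 0;
    }
    return -EIO;
}
```

The reference count is unchanged on both paths, so handlers can call the
helper without adjusting their own accounting; a page that is not on the
LRU is left untouched and reported with -EIO.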
^ permalink raw reply [flat|nested] 61+ messages in thread
* [PATCH 10/24] HWPOISON: remove the free buddy page handler
2009-12-02 3:12 [PATCH 00/24] hwpoison fixes and stress testing filters Wu Fengguang
` (8 preceding siblings ...)
2009-12-02 3:12 ` [PATCH 09/24] HWPOISON: introduce delete_from_lru_cache() Wu Fengguang
@ 2009-12-02 3:12 ` Wu Fengguang
2009-12-02 13:13 ` Andi Kleen
2009-12-02 3:12 ` [PATCH 11/24] HWPOISON: detect free buddy pages explicitly Wu Fengguang
` (13 subsequent siblings)
23 siblings, 1 reply; 61+ messages in thread
From: Wu Fengguang @ 2009-12-02 3:12 UTC (permalink / raw)
To: Andi Kleen; +Cc: Andrew Morton, Wu Fengguang, Nick Piggin, linux-mm, LKML
[-- Attachment #1: hwpoison-remove-free-handler.patch --]
[-- Type: text/plain, Size: 1206 bytes --]
The buddy page has already been handled in the very beginning.
So remove redundant code.
CC: Andi Kleen <andi@firstfloor.org>
Signed-off-by: Wu Fengguang <fengguang.wu@intel.com>
---
mm/memory-failure.c | 9 ---------
1 file changed, 9 deletions(-)
--- linux-mm.orig/mm/memory-failure.c 2009-11-09 10:57:50.000000000 +0800
+++ linux-mm/mm/memory-failure.c 2009-11-09 10:59:26.000000000 +0800
@@ -379,14 +379,6 @@ static int me_unknown(struct page *p, un
}
/*
- * Free memory
- */
-static int me_free(struct page *p, unsigned long pfn)
-{
- return DELAYED;
-}
-
-/*
* Clean (or cleaned) page cache page.
*/
static int me_pagecache_clean(struct page *p, unsigned long pfn)
@@ -592,7 +584,6 @@ static struct page_state {
int (*action)(struct page *p, unsigned long pfn);
} error_states[] = {
{ reserved, reserved, "reserved kernel", me_ignore },
- { buddy, buddy, "free kernel", me_free },
/*
* Could in theory check if slab page is free or if we can drop
* Re: [PATCH 10/24] HWPOISON: remove the free buddy page handler
2009-12-02 3:12 ` [PATCH 10/24] HWPOISON: remove the free buddy page handler Wu Fengguang
@ 2009-12-02 13:13 ` Andi Kleen
2009-12-02 13:31 ` Wu Fengguang
0 siblings, 1 reply; 61+ messages in thread
From: Andi Kleen @ 2009-12-02 13:13 UTC (permalink / raw)
To: Wu Fengguang; +Cc: Andi Kleen, Andrew Morton, Nick Piggin, linux-mm, LKML
On Wed, Dec 02, 2009 at 11:12:41AM +0800, Wu Fengguang wrote:
> The buddy page has already be handled in the very beginning.
> So remove redundant code.
I think I prefer the table to be complete, even if some of the
cases might not happen currently. A BUG() would be reasonable though.
-Andi
* Re: [PATCH 10/24] HWPOISON: remove the free buddy page handler
2009-12-02 13:13 ` Andi Kleen
@ 2009-12-02 13:31 ` Wu Fengguang
0 siblings, 0 replies; 61+ messages in thread
From: Wu Fengguang @ 2009-12-02 13:31 UTC (permalink / raw)
To: Andi Kleen; +Cc: Andrew Morton, Nick Piggin, linux-mm, LKML
On Wed, Dec 02, 2009 at 09:13:30PM +0800, Andi Kleen wrote:
> On Wed, Dec 02, 2009 at 11:12:41AM +0800, Wu Fengguang wrote:
> > The buddy page has already be handled in the very beginning.
> > So remove redundant code.
>
> I think I prefer the table to be complete, even if some of the
> cases might not happen currently. A BUG() would be reasonable though.
I'd prefer not to carry around useless bytes in the kernel.
What if we replace it with a comment line in the table?
Thanks,
Fengguang
* [PATCH 11/24] HWPOISON: detect free buddy pages explicitly
2009-12-02 3:12 [PATCH 00/24] hwpoison fixes and stress testing filters Wu Fengguang
` (9 preceding siblings ...)
2009-12-02 3:12 ` [PATCH 10/24] HWPOISON: remove the free buddy page handler Wu Fengguang
@ 2009-12-02 3:12 ` Wu Fengguang
2009-12-02 3:12 ` [PATCH 12/24] HWPOISON: make it possible to unpoison pages Wu Fengguang
` (12 subsequent siblings)
23 siblings, 0 replies; 61+ messages in thread
From: Wu Fengguang @ 2009-12-02 3:12 UTC (permalink / raw)
To: Andi Kleen
Cc: Andrew Morton, Nick Piggin, Mel Gorman, Wu Fengguang, linux-mm, LKML
[-- Attachment #1: hwpoison-is-free-page.patch --]
[-- Type: text/plain, Size: 2544 bytes --]
Most free pages in the buddy system have no PG_buddy set.
Introduce is_free_buddy_page() for detecting them reliably.
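The head-page walk can be tried out in userspace. This is a sketch under stated assumptions, not the kernel implementation: struct toy_page, toy_mem, toy_free_block() and toy_is_free_buddy_page() are hypothetical stand-ins, MAX_ORDER is assumed to be 11, and the zone->lock taken by the real function is left out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MAX_ORDER 11                  /* assumed value for this sketch */
#define NPAGES    ((size_t)1 << 12)   /* size of the toy page array */

/* Hypothetical stand-in for struct page: only what the check needs. */
struct toy_page {
	bool buddy;          /* models PageBuddy(): set on a free block's head only */
	unsigned int order;  /* models page_order(): valid only when buddy is set */
};

static struct toy_page toy_mem[NPAGES];

/* Mark the naturally aligned block of 2^order pages at head_pfn as free. */
static void toy_free_block(size_t head_pfn, unsigned int order)
{
	assert(order < MAX_ORDER);
	assert((head_pfn & (((size_t)1 << order) - 1)) == 0);
	assert(head_pfn + ((size_t)1 << order) <= NPAGES);
	toy_mem[head_pfn].buddy = true;
	toy_mem[head_pfn].order = order;
}

/* Try each possible head: the page is free if some head covers it. */
static bool toy_is_free_buddy_page(size_t pfn)
{
	unsigned int order;

	assert(pfn < NPAGES);
	for (order = 0; order < MAX_ORDER; order++) {
		size_t head = pfn & ~(((size_t)1 << order) - 1);

		if (toy_mem[head].buddy && toy_mem[head].order >= order)
			return true;
	}
	return false;
}
```

After toy_free_block(8, 3), only pfn 8 carries the buddy mark, yet pfns 8 through 15 are all reported free. That is the case a bare PageBuddy() test on the raw page misses.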
CC: Andi Kleen <andi@firstfloor.org>
CC: Nick Piggin <npiggin@suse.de>
CC: Mel Gorman <mel@linux.vnet.ibm.com>
Signed-off-by: Wu Fengguang <fengguang.wu@intel.com>
---
mm/internal.h | 3 +++
mm/memory-failure.c | 9 +++++++--
mm/page_alloc.c | 21 +++++++++++++++++++++
3 files changed, 31 insertions(+), 2 deletions(-)
--- linux-mm.orig/mm/memory-failure.c 2009-11-30 20:04:51.000000000 +0800
+++ linux-mm/mm/memory-failure.c 2009-11-30 20:06:00.000000000 +0800
@@ -783,8 +783,13 @@ int __memory_failure(unsigned long pfn,
* that may make page_freeze_refs()/page_unfreeze_refs() mismatch.
*/
if (!ref && !get_page_unless_zero(compound_head(p))) {
- action_result(pfn, "free or high order kernel", IGNORED);
- return PageBuddy(compound_head(p)) ? 0 : -EBUSY;
+ if (is_free_buddy_page(p)) {
+ action_result(pfn, "free buddy", DELAYED);
+ return 0;
+ } else {
+ action_result(pfn, "high order kernel", IGNORED);
+ return -EBUSY;
+ }
}
/*
--- linux-mm.orig/mm/internal.h 2009-11-30 11:08:34.000000000 +0800
+++ linux-mm/mm/internal.h 2009-11-30 20:06:01.000000000 +0800
@@ -50,6 +50,9 @@ extern void putback_lru_page(struct page
*/
extern void __free_pages_bootmem(struct page *page, unsigned int order);
extern void prep_compound_page(struct page *page, unsigned long order);
+#ifdef CONFIG_MEMORY_FAILURE
+extern bool is_free_buddy_page(struct page *page);
+#endif
/*
--- linux-mm.orig/mm/page_alloc.c 2009-11-30 11:08:34.000000000 +0800
+++ linux-mm/mm/page_alloc.c 2009-11-30 20:06:01.000000000 +0800
@@ -5085,3 +5085,24 @@ __offline_isolated_pages(unsigned long s
spin_unlock_irqrestore(&zone->lock, flags);
}
#endif
+
+#ifdef CONFIG_MEMORY_FAILURE
+bool is_free_buddy_page(struct page *page)
+{
+ struct zone *zone = page_zone(page);
+ unsigned long pfn = page_to_pfn(page);
+ unsigned long flags;
+ int order;
+
+ spin_lock_irqsave(&zone->lock, flags);
+ for (order = 0; order < MAX_ORDER; order++) {
+ struct page *page_head = page - (pfn & ((1 << order) - 1));
+
+ if (PageBuddy(page_head) && page_order(page_head) >= order)
+ break;
+ }
+ spin_unlock_irqrestore(&zone->lock, flags);
+
+ return order < MAX_ORDER;
+}
+#endif
* [PATCH 12/24] HWPOISON: make it possible to unpoison pages
2009-12-02 3:12 [PATCH 00/24] hwpoison fixes and stress testing filters Wu Fengguang
` (10 preceding siblings ...)
2009-12-02 3:12 ` [PATCH 11/24] HWPOISON: detect free buddy pages explicitly Wu Fengguang
@ 2009-12-02 3:12 ` Wu Fengguang
2009-12-02 13:15 ` Andi Kleen
2009-12-02 3:12 ` [PATCH 13/24] HWPOISON: introduce struct hwpoison_control Wu Fengguang
` (11 subsequent siblings)
23 siblings, 1 reply; 61+ messages in thread
From: Wu Fengguang @ 2009-12-02 3:12 UTC (permalink / raw)
To: Andi Kleen; +Cc: Andrew Morton, Wu Fengguang, Nick Piggin, linux-mm, LKML
[-- Attachment #1: hwpoison-free-poisoned-memory.patch --]
[-- Type: text/plain, Size: 4913 bytes --]
The unpoisoning interface can be useful for
- stress testing tools to reclaim poisoned pages (to prevent OOM)
- system admins to instruct the kernel to forget transient memory errors
Note that it may leak pages silently: those that have been removed from the
LRU cache, but not isolated from the page cache/swap cache at hwpoison time.
In particular, a stress test of dirty swap cache pages should reboot the
system before exhausting memory.
CC: Andi Kleen <andi@firstfloor.org>
Signed-off-by: Wu Fengguang <fengguang.wu@intel.com>
---
include/linux/mm.h | 1
include/linux/page-flags.h | 2 -
mm/hwpoison-inject.c | 31 ++++++++++++++++----
mm/memory-failure.c | 52 +++++++++++++++++++++++++++++++++++
4 files changed, 79 insertions(+), 7 deletions(-)
--- linux-mm.orig/mm/hwpoison-inject.c 2009-11-30 11:08:34.000000000 +0800
+++ linux-mm/mm/hwpoison-inject.c 2009-11-30 20:30:55.000000000 +0800
@@ -4,7 +4,7 @@
#include <linux/kernel.h>
#include <linux/mm.h>
-static struct dentry *hwpoison_dir, *corrupt_pfn;
+static struct dentry *hwpoison_dir;
static int hwpoison_inject(void *data, u64 val)
{
@@ -14,7 +14,16 @@ static int hwpoison_inject(void *data, u
return __memory_failure(val, 18, 0);
}
+static int hwpoison_forget(void *data, u64 val)
+{
+ if (!capable(CAP_SYS_ADMIN))
+ return -EPERM;
+
+ return forget_memory_failure(val);
+}
+
DEFINE_SIMPLE_ATTRIBUTE(hwpoison_fops, NULL, hwpoison_inject, "%lli\n");
+DEFINE_SIMPLE_ATTRIBUTE(unpoison_fops, NULL, hwpoison_forget, "%lli\n");
static void pfn_inject_exit(void)
{
@@ -24,16 +33,26 @@ static void pfn_inject_exit(void)
static int pfn_inject_init(void)
{
+ struct dentry *dentry;
+
hwpoison_dir = debugfs_create_dir("hwpoison", NULL);
if (hwpoison_dir == NULL)
return -ENOMEM;
- corrupt_pfn = debugfs_create_file("corrupt-pfn", 0600, hwpoison_dir,
+
+ dentry = debugfs_create_file("corrupt-pfn", 0600, hwpoison_dir,
NULL, &hwpoison_fops);
- if (corrupt_pfn == NULL) {
- pfn_inject_exit();
- return -ENOMEM;
- }
+ if (!dentry)
+ goto fail;
+
+ dentry = debugfs_create_file("renew-pfn", 0600, hwpoison_dir,
+ NULL, &unpoison_fops);
+ if (!dentry)
+ goto fail;
+
return 0;
+fail:
+ pfn_inject_exit();
+ return -ENOMEM;
}
module_init(pfn_inject_init);
--- linux-mm.orig/include/linux/mm.h 2009-11-30 11:08:34.000000000 +0800
+++ linux-mm/include/linux/mm.h 2009-11-30 20:08:10.000000000 +0800
@@ -1318,6 +1318,7 @@ extern void refund_locked_memory(struct
extern void memory_failure(unsigned long pfn, int trapno);
extern int __memory_failure(unsigned long pfn, int trapno, int ref);
+extern int forget_memory_failure(unsigned long pfn);
extern int sysctl_memory_failure_early_kill;
extern int sysctl_memory_failure_recovery;
extern atomic_long_t mce_bad_pages;
--- linux-mm.orig/mm/memory-failure.c 2009-11-30 20:06:00.000000000 +0800
+++ linux-mm/mm/memory-failure.c 2009-11-30 20:33:58.000000000 +0800
@@ -814,6 +814,16 @@ int __memory_failure(unsigned long pfn,
* and in many cases impossible, so we just avoid it here.
*/
lock_page_nosync(p);
+
+ /*
+ * unpoison always clear PG_hwpoison inside page lock
+ */
+ if (!PageHWPoison(p)) {
+ action_result(pfn, "unpoisoned", IGNORED);
+ res = 0;
+ goto out;
+ }
+
wait_on_page_writeback(p);
/*
@@ -868,3 +878,45 @@ void memory_failure(unsigned long pfn, i
{
__memory_failure(pfn, trapno, 0);
}
+
+int forget_memory_failure(unsigned long pfn)
+{
+ struct page *page;
+ struct page *p;
+ int freeit = 0;
+
+ if (!pfn_valid(pfn))
+ return -ENXIO;
+
+ p = pfn_to_page(pfn);
+ page = compound_head(p);
+
+ if (!PageHWPoison(p))
+ return 0;
+
+ if (!get_page_unless_zero(page)) {
+ if (TestClearPageHWPoison(p))
+ atomic_long_dec(&mce_bad_pages);
+ return 0;
+ }
+
+ lock_page_nosync(page);
+ /*
+ * This test is racy because PG_hwpoison is set outside of page lock.
+ * That's acceptable because that won't trigger kernel panic. Instead,
+ * the PG_hwpoison page will be caught and isolated on the entrance to
+ * the free buddy page pool.
+ */
+ if (TestClearPageHWPoison(p)) {
+ atomic_long_dec(&mce_bad_pages);
+ freeit = 1;
+ }
+ unlock_page(page);
+
+ put_page(page);
+ if (freeit)
+ put_page(page);
+
+ return 0;
+}
+EXPORT_SYMBOL(forget_memory_failure);
--- linux-mm.orig/include/linux/page-flags.h 2009-11-30 11:08:34.000000000 +0800
+++ linux-mm/include/linux/page-flags.h 2009-11-30 20:08:10.000000000 +0800
@@ -277,7 +277,7 @@ PAGEFLAG_FALSE(Uncached)
#ifdef CONFIG_MEMORY_FAILURE
PAGEFLAG(HWPoison, hwpoison)
-TESTSETFLAG(HWPoison, hwpoison)
+TESTSCFLAG(HWPoison, hwpoison)
#define __PG_HWPOISON (1UL << PG_hwpoison)
#else
PAGEFLAG_FALSE(HWPoison)
* Re: [PATCH 12/24] HWPOISON: make it possible to unpoison pages
2009-12-02 3:12 ` [PATCH 12/24] HWPOISON: make it possible to unpoison pages Wu Fengguang
@ 2009-12-02 13:15 ` Andi Kleen
2009-12-02 13:31 ` Wu Fengguang
2009-12-02 13:46 ` Wu Fengguang
0 siblings, 2 replies; 61+ messages in thread
From: Andi Kleen @ 2009-12-02 13:15 UTC (permalink / raw)
To: Wu Fengguang; +Cc: Andi Kleen, Andrew Morton, Nick Piggin, linux-mm, LKML
> Note that it may leak pages silently - those who have been removed from
> LRU cache, but not isolated from page cache/swap cache at hwpoison time.
It would be better if we could detect that somehow and at least warn.
> }
>
> +static int hwpoison_forget(void *data, u64 val)
> +{
> + if (!capable(CAP_SYS_ADMIN))
> + return -EPERM;
> +
> + return forget_memory_failure(val);
> +}
> +
> DEFINE_SIMPLE_ATTRIBUTE(hwpoison_fops, NULL, hwpoison_inject, "%lli\n");
> +DEFINE_SIMPLE_ATTRIBUTE(unpoison_fops, NULL, hwpoison_forget, "%lli\n");
I'll rename it to unpoison, not forget. I think that's a more clear
name.
-Andi
--
ak@linux.intel.com -- Speaking for myself only.
* Re: [PATCH 12/24] HWPOISON: make it possible to unpoison pages
2009-12-02 13:15 ` Andi Kleen
@ 2009-12-02 13:31 ` Wu Fengguang
2009-12-02 13:46 ` Wu Fengguang
1 sibling, 0 replies; 61+ messages in thread
From: Wu Fengguang @ 2009-12-02 13:31 UTC (permalink / raw)
To: Andi Kleen; +Cc: Andrew Morton, Nick Piggin, linux-mm, LKML
On Wed, Dec 02, 2009 at 09:15:30PM +0800, Andi Kleen wrote:
> > Note that it may leak pages silently - those who have been removed from
> > LRU cache, but not isolated from page cache/swap cache at hwpoison time.
>
> It would be better if we could detect that somehow and at least warn.
>
> > }
> >
> > +static int hwpoison_forget(void *data, u64 val)
> > +{
> > + if (!capable(CAP_SYS_ADMIN))
> > + return -EPERM;
> > +
> > + return forget_memory_failure(val);
> > +}
> > +
> > DEFINE_SIMPLE_ATTRIBUTE(hwpoison_fops, NULL, hwpoison_inject, "%lli\n");
> > +DEFINE_SIMPLE_ATTRIBUTE(unpoison_fops, NULL, hwpoison_forget, "%lli\n");
>
> I'll rename it to unpoison, not forget. I think that's a more clear
> name.
OK. Will repost the whole updated patchset.
Thanks,
Fengguang
* Re: [PATCH 12/24] HWPOISON: make it possible to unpoison pages
2009-12-02 13:15 ` Andi Kleen
2009-12-02 13:31 ` Wu Fengguang
@ 2009-12-02 13:46 ` Wu Fengguang
2009-12-02 14:03 ` Andi Kleen
1 sibling, 1 reply; 61+ messages in thread
From: Wu Fengguang @ 2009-12-02 13:46 UTC (permalink / raw)
To: Andi Kleen; +Cc: Andrew Morton, Nick Piggin, linux-mm, LKML
On Wed, Dec 02, 2009 at 09:15:30PM +0800, Andi Kleen wrote:
> > Note that it may leak pages silently - those who have been removed from
> > LRU cache, but not isolated from page cache/swap cache at hwpoison time.
>
> It would be better if we could detect that somehow and at least warn.
We warn when a page cannot be isolated (but don't mention that it may
lead to a memory leak).
We export the hwpoison counter in /proc/meminfo. The memory leak is
mainly a problem for stress testing, and the test cases can use
that counter for sanity checking.
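One way a test case might do such a check is to parse the counter out of a /proc/meminfo snapshot before and after a run. The sketch below assumes the field is named "HardwareCorrupted:" and is reported in kB; that name is not stated in this thread, and hwpoison_kb_from_text() is a hypothetical helper, not an existing API.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Return the hwpoison counter (in kB) found in meminfo-formatted text,
 * or -1 if the field is absent. The "HardwareCorrupted:" field name is
 * an assumption of this sketch.
 */
static long hwpoison_kb_from_text(const char *text)
{
	const char *key = "HardwareCorrupted:";
	const char *line = text;

	assert(text != NULL);
	while (line && *line) {
		long kb;

		/* Only match the key at the start of a line. */
		if (strncmp(line, key, strlen(key)) == 0 &&
		    sscanf(line + strlen(key), "%ld", &kb) == 1)
			return kb;

		line = strchr(line, '\n');
		if (line)
			line++;         /* step past the newline */
	}
	return -1;
}
```

A stress test could read /proc/meminfo into a buffer, unpoison everything it injected, and then check that the value has returned to its starting point; a residue would hint at leaked pages.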
> > }
> >
> > +static int hwpoison_forget(void *data, u64 val)
> > +{
> > + if (!capable(CAP_SYS_ADMIN))
> > + return -EPERM;
> > +
> > + return forget_memory_failure(val);
> > +}
> > +
> > DEFINE_SIMPLE_ATTRIBUTE(hwpoison_fops, NULL, hwpoison_inject, "%lli\n");
> > +DEFINE_SIMPLE_ATTRIBUTE(unpoison_fops, NULL, hwpoison_forget, "%lli\n");
>
> I'll rename it to unpoison, not forget. I think that's a more clear
> name.
btw, do you feel comfortable with the interface name "renew-pfn"?
(versus "unpoison-pfn")
* Re: [PATCH 12/24] HWPOISON: make it possible to unpoison pages
2009-12-02 13:46 ` Wu Fengguang
@ 2009-12-02 14:03 ` Andi Kleen
2009-12-03 1:45 ` Wu Fengguang
0 siblings, 1 reply; 61+ messages in thread
From: Andi Kleen @ 2009-12-02 14:03 UTC (permalink / raw)
To: Wu Fengguang; +Cc: Andi Kleen, Andrew Morton, Nick Piggin, linux-mm, LKML
> btw, do you feel comfortable with the interface name "renew-pfn"?
> (versus "unpoison-pfn")
I prefer unpoison, that makes it clear what it is.
Maybe even call it "software_unpoison_pfn", because it won't unpoison on the
hardware level (this really should be documented somewhere too)
-Andi
--
ak@linux.intel.com -- Speaking for myself only.
* Re: [PATCH 12/24] HWPOISON: make it possible to unpoison pages
2009-12-02 14:03 ` Andi Kleen
@ 2009-12-03 1:45 ` Wu Fengguang
0 siblings, 0 replies; 61+ messages in thread
From: Wu Fengguang @ 2009-12-03 1:45 UTC (permalink / raw)
To: Andi Kleen; +Cc: Andrew Morton, Nick Piggin, linux-mm, LKML
On Wed, Dec 02, 2009 at 10:03:05PM +0800, Andi Kleen wrote:
> > btw, do you feel comfortable with the interface name "renew-pfn"?
> > (versus "unpoison-pfn")
>
> I prefer unpoison, that makes it clear what it is.
OK.
> Maybe even call it "software_unpoison_pfn", because it won't unpoison on the
> hardware level (this really should be documented somewhere too)
Yes we can document it in Documentation/vm/hwpoison.txt.
Does that mean we may introduce a "hardware_unpoison_pfn" in future?
(a superset of software_unpoison_pfn)
And "software_unpoison_pfn" may make the other "corrupt-pfn" a bit
confusing ;)
Thanks,
Fengguang
* [PATCH 13/24] HWPOISON: introduce struct hwpoison_control
2009-12-02 3:12 [PATCH 00/24] hwpoison fixes and stress testing filters Wu Fengguang
` (11 preceding siblings ...)
2009-12-02 3:12 ` [PATCH 12/24] HWPOISON: make it possible to unpoison pages Wu Fengguang
@ 2009-12-02 3:12 ` Wu Fengguang
2009-12-02 13:15 ` Andi Kleen
2009-12-02 3:12 ` [PATCH 14/24] HWPOISON: return 0 if page is assured to be isolated Wu Fengguang
` (10 subsequent siblings)
23 siblings, 1 reply; 61+ messages in thread
From: Wu Fengguang @ 2009-12-02 3:12 UTC (permalink / raw)
To: Andi Kleen; +Cc: Andrew Morton, Wu Fengguang, Nick Piggin, linux-mm, LKML
[-- Attachment #1: hwpoison-control.patch --]
[-- Type: text/plain, Size: 10152 bytes --]
This allows passing around more parameters and states.
No behavior change.
CC: Andi Kleen <andi@firstfloor.org>
Signed-off-by: Wu Fengguang <fengguang.wu@intel.com>
---
mm/memory-failure.c | 108 +++++++++++++++++++++++++-----------------
1 file changed, 65 insertions(+), 43 deletions(-)
--- linux-mm.orig/mm/memory-failure.c 2009-11-30 20:33:58.000000000 +0800
+++ linux-mm/mm/memory-failure.c 2009-11-30 20:35:49.000000000 +0800
@@ -313,20 +313,27 @@ static void collect_procs(struct page *p
* Error handlers for various types of pages.
*/
-enum outcome {
+enum hwpoison_result {
FAILED, /* Error handling failed */
DELAYED, /* Will be handled later */
IGNORED, /* Error safely ignored */
RECOVERED, /* Successfully recovered */
};
-static const char *action_name[] = {
+static const char *hwpoison_result_name[] = {
[FAILED] = "Failed",
[DELAYED] = "Delayed",
[IGNORED] = "Ignored",
[RECOVERED] = "Recovered",
};
+struct hwpoison_control {
+ unsigned long pfn;
+ struct page *p; /* raw corrupted page */
+ struct page *page; /* compound page head */
+ int result;
+};
+
/*
* XXX: It is possible that a page is isolated from LRU cache,
* and then kept in swap cache or failed to remove from page cache.
@@ -356,7 +363,7 @@ static int delete_from_lru_cache(struct
* Do nothing, try to be lucky and not touch this instead. For a few cases we
* could be more sophisticated.
*/
-static int me_kernel(struct page *p, unsigned long pfn)
+static int me_kernel(struct hwpoison_control *hpc)
{
return DELAYED;
}
@@ -364,28 +371,30 @@ static int me_kernel(struct page *p, uns
/*
* Already poisoned page.
*/
-static int me_ignore(struct page *p, unsigned long pfn)
+static int me_ignore(struct hwpoison_control *hpc)
{
+ printk(KERN_ERR "MCE %#lx: Unknown page state\n", hpc->pfn);
return IGNORED;
}
/*
* Page in unknown state. Do nothing.
*/
-static int me_unknown(struct page *p, unsigned long pfn)
+static int me_unknown(struct hwpoison_control *hpc)
{
- printk(KERN_ERR "MCE %#lx: Unknown page state\n", pfn);
+ printk(KERN_ERR "MCE %#lx: Unknown page state\n", hpc->pfn);
return FAILED;
}
/*
* Clean (or cleaned) page cache page.
*/
-static int me_pagecache_clean(struct page *p, unsigned long pfn)
+static int me_pagecache_clean(struct hwpoison_control *hpc)
{
int err;
int ret = FAILED;
struct address_space *mapping;
+ struct page *p = hpc->page;
delete_from_lru_cache(p);
@@ -420,10 +429,11 @@ static int me_pagecache_clean(struct pag
err = mapping->a_ops->error_remove_page(mapping, p);
if (err != 0) {
printk(KERN_INFO "MCE %#lx: Failed to punch page: %d\n",
- pfn, err);
+ hpc->pfn, err);
} else if (page_has_private(p) &&
!try_to_release_page(p, GFP_NOIO)) {
- pr_debug("MCE %#lx: failed to release buffers\n", pfn);
+ pr_debug("MCE %#lx: failed to release buffers\n",
+ hpc->pfn);
} else {
ret = RECOVERED;
}
@@ -436,7 +446,7 @@ static int me_pagecache_clean(struct pag
ret = RECOVERED;
else
printk(KERN_INFO "MCE %#lx: Failed to invalidate\n",
- pfn);
+ hpc->pfn);
}
return ret;
}
@@ -446,11 +456,11 @@ static int me_pagecache_clean(struct pag
* Issues: when the error hit a hole page the error is not properly
* propagated.
*/
-static int me_pagecache_dirty(struct page *p, unsigned long pfn)
+static int me_pagecache_dirty(struct hwpoison_control *hpc)
{
- struct address_space *mapping = page_mapping(p);
+ struct address_space *mapping = page_mapping(hpc->page);
- SetPageError(p);
+ SetPageError(hpc->page);
/* TBD: print more information about the file. */
if (mapping) {
/*
@@ -490,7 +500,7 @@ static int me_pagecache_dirty(struct pag
mapping_set_error(mapping, EIO);
}
- return me_pagecache_clean(p, pfn);
+ return me_pagecache_clean(hpc);
}
/*
@@ -512,8 +522,9 @@ static int me_pagecache_dirty(struct pag
* Clean swap cache pages can be directly isolated. A later page fault will
* bring in the known good data from disk.
*/
-static int me_swapcache_dirty(struct page *p, unsigned long pfn)
+static int me_swapcache_dirty(struct hwpoison_control *hpc)
{
+ struct page *p = hpc->page;
ClearPageDirty(p);
/* Trigger EIO in shmem: */
ClearPageUptodate(p);
@@ -524,8 +535,10 @@ static int me_swapcache_dirty(struct pag
return FAILED;
}
-static int me_swapcache_clean(struct page *p, unsigned long pfn)
+static int me_swapcache_clean(struct hwpoison_control *hpc)
{
+ struct page *p = hpc->page;
+
delete_from_swap_cache(p);
if (!delete_from_lru_cache(p))
@@ -545,7 +558,7 @@ static int me_swapcache_clean(struct pag
* Should handle free huge pages and dequeue them too, but this needs to
* handle huge page accounting correctly.
*/
-static int me_huge_page(struct page *p, unsigned long pfn)
+static int me_huge_page(struct hwpoison_control *hpc)
{
return FAILED;
}
@@ -581,7 +594,7 @@ static struct page_state {
unsigned long mask;
unsigned long res;
char *msg;
- int (*action)(struct page *p, unsigned long pfn);
+ int (*action)(struct hwpoison_control *hpc);
} error_states[] = {
{ reserved, reserved, "reserved kernel", me_ignore },
@@ -619,30 +632,29 @@ static struct page_state {
{ 0, 0, "unknown page state", me_unknown },
};
-static void action_result(unsigned long pfn, char *msg, int result)
+static void action_result(struct hwpoison_control *hpc, char *msg, int result)
{
- struct page *page = pfn_to_page(pfn);
-
+ hpc->result = result;
printk(KERN_ERR "MCE %#lx: %s%s page recovery: %s\n",
- pfn,
- PageDirty(page) ? "dirty " : "",
- msg, action_name[result]);
+ hpc->pfn,
+ PageDirty(hpc->page) ? "dirty " : "",
+ msg, hwpoison_result_name[result]);
}
-static int page_action(struct page_state *ps, struct page *p,
- unsigned long pfn)
+static int page_action(struct page_state *ps,
+ struct hwpoison_control *hpc)
{
int result;
int count;
- result = ps->action(p, pfn);
- action_result(pfn, ps->msg, result);
+ result = ps->action(hpc);
+ action_result(hpc, ps->msg, result);
- count = page_count(p) - 1;
+ count = page_count(hpc->page) - 1;
if (count != 0)
printk(KERN_ERR
"MCE %#lx: %s page still referenced by %d users\n",
- pfn, ps->msg, count);
+ hpc->pfn, ps->msg, count);
/* Could do more checks here if page looks ok */
/*
@@ -658,11 +670,12 @@ static int page_action(struct page_state
* Do all that is necessary to remove user space mappings. Unmap
* the pages and send SIGBUS to the processes if the data was dirty.
*/
-static int hwpoison_user_mappings(struct page *p, unsigned long pfn,
- int trapno)
+static int hwpoison_user_mappings(struct hwpoison_control *hpc, int trapno)
{
enum ttu_flags ttu = TTU_UNMAP | TTU_IGNORE_MLOCK | TTU_IGNORE_ACCESS;
struct address_space *mapping;
+ struct page *p = hpc->page;
+ unsigned long pfn = hpc->pfn;
LIST_HEAD(tokill);
int ret;
int i;
@@ -725,7 +738,8 @@ static int hwpoison_user_mappings(struct
ret = try_to_unmap(p, ttu);
if (ret == SWAP_SUCCESS)
break;
- pr_debug("MCE %#lx: try_to_unmap retry needed %d\n", pfn, ret);
+ pr_debug("MCE %#lx: try_to_unmap retry needed %d\n",
+ pfn, ret);
}
if (ret != SWAP_SUCCESS)
@@ -749,8 +763,10 @@ static int hwpoison_user_mappings(struct
int __memory_failure(unsigned long pfn, int trapno, int ref)
{
+ struct hwpoison_control hpc;
struct page_state *ps;
struct page *p;
+ struct page *page;
int res;
if (!sysctl_memory_failure_recovery)
@@ -763,9 +779,15 @@ int __memory_failure(unsigned long pfn,
return -ENXIO;
}
- p = pfn_to_page(pfn);
+ p = pfn_to_page(pfn);
+ page = compound_head(p);
+
+ hpc.pfn = pfn;
+ hpc.p = p;
+ hpc.page = page;
+
if (TestSetPageHWPoison(p)) {
- action_result(pfn, "already hardware poisoned", IGNORED);
+ action_result(&hpc, "already hardware poisoned", IGNORED);
return 0;
}
@@ -782,12 +804,12 @@ int __memory_failure(unsigned long pfn,
* In fact it's dangerous to directly bump up page count from 0,
* that may make page_freeze_refs()/page_unfreeze_refs() mismatch.
*/
- if (!ref && !get_page_unless_zero(compound_head(p))) {
+ if (!ref && !get_page_unless_zero(page)) {
if (is_free_buddy_page(p)) {
- action_result(pfn, "free buddy", DELAYED);
+ action_result(&hpc, "free buddy", DELAYED);
return 0;
} else {
- action_result(pfn, "high order kernel", IGNORED);
+ action_result(&hpc, "high order kernel", IGNORED);
return -EBUSY;
}
}
@@ -803,7 +825,7 @@ int __memory_failure(unsigned long pfn,
if (!PageLRU(p))
lru_add_drain_all();
if (!PageLRU(p)) {
- action_result(pfn, "non LRU", IGNORED);
+ action_result(&hpc, "non LRU", IGNORED);
put_page(p);
return -EBUSY;
}
@@ -819,7 +841,7 @@ int __memory_failure(unsigned long pfn,
* unpoison always clear PG_hwpoison inside page lock
*/
if (!PageHWPoison(p)) {
- action_result(pfn, "unpoisoned", IGNORED);
+ action_result(&hpc, "unpoisoned", IGNORED);
res = 0;
goto out;
}
@@ -830,7 +852,7 @@ int __memory_failure(unsigned long pfn,
* Now take care of user space mappings.
* Abort on fail: __remove_from_page_cache() assumes unmapped page.
*/
- if (hwpoison_user_mappings(p, pfn, trapno) != SWAP_SUCCESS) {
+ if (hwpoison_user_mappings(&hpc, trapno) != SWAP_SUCCESS) {
res = -EBUSY;
goto out;
}
@@ -839,7 +861,7 @@ int __memory_failure(unsigned long pfn,
* Torn down by someone else?
*/
if (PageLRU(p) && !PageSwapCache(p) && p->mapping == NULL) {
- action_result(pfn, "already truncated LRU", IGNORED);
+ action_result(&hpc, "already truncated LRU", IGNORED);
res = 0;
goto out;
}
@@ -847,7 +869,7 @@ int __memory_failure(unsigned long pfn,
res = -EBUSY;
for (ps = error_states;; ps++) {
if ((p->flags & ps->mask) == ps->res) {
- res = page_action(ps, p, pfn);
+ res = page_action(ps, &hpc);
break;
}
}
* Re: [PATCH 13/24] HWPOISON: introduce struct hwpoison_control
2009-12-02 3:12 ` [PATCH 13/24] HWPOISON: introduce struct hwpoison_control Wu Fengguang
@ 2009-12-02 13:15 ` Andi Kleen
0 siblings, 0 replies; 61+ messages in thread
From: Andi Kleen @ 2009-12-02 13:15 UTC (permalink / raw)
To: Wu Fengguang; +Cc: Andi Kleen, Andrew Morton, Nick Piggin, linux-mm, LKML
On Wed, Dec 02, 2009 at 11:12:44AM +0800, Wu Fengguang wrote:
> This allows passing around more parameters and states.
> No behavior change.
As mentioned earlier I'll skip this patch for now.
-Andi
* [PATCH 14/24] HWPOISON: return 0 if page is assured to be isolated
2009-12-02 3:12 [PATCH 00/24] hwpoison fixes and stress testing filters Wu Fengguang
` (12 preceding siblings ...)
2009-12-02 3:12 ` [PATCH 13/24] HWPOISON: introduce struct hwpoison_control Wu Fengguang
@ 2009-12-02 3:12 ` Wu Fengguang
2009-12-02 12:47 ` Andi Kleen
2009-12-02 3:12 ` [PATCH 15/24] HWPOISON: add fs/device filters Wu Fengguang
` (9 subsequent siblings)
23 siblings, 1 reply; 61+ messages in thread
From: Wu Fengguang @ 2009-12-02 3:12 UTC (permalink / raw)
To: Andi Kleen; +Cc: Andrew Morton, Wu Fengguang, Nick Piggin, linux-mm, LKML
[-- Attachment #1: hwpoison-isolated.patch --]
[-- Type: text/plain, Size: 3689 bytes --]
Introduce hpc.page_isolated to record whether the page is assured to be
isolated, i.e. it won't be accessed in normal kernel code paths
and therefore won't trigger another MCE event.
__memory_failure() will now return 0 to indicate that the page is
really isolated. Note that the action result RECOVERED, used
originally, is not a reliable criterion.
Note that we no longer risk returning 0 for the
rare unpoison/truncated cases.
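The return-value rule can be modelled in a few lines of userspace C. This is an illustrative sketch only: struct toy_outcome and toy_memory_failure_ret() are hypothetical names, and the page count and map count are passed in as plain integers rather than read from a struct page.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

enum hwpoison_result { FAILED, DELAYED, IGNORED, RECOVERED };

/* Toy summary of what the handler and the page looked like afterwards. */
struct toy_outcome {
	enum hwpoison_result result; /* value returned by the me_*() handler */
	bool handler_isolated;       /* handler set page_isolated itself */
	int extra_refs;              /* models page_count() - 1 */
	int mapcount;                /* models page_mapcount() */
};

/* Return 0 only when the page is assured to be isolated, else -EBUSY. */
static int toy_memory_failure_ret(const struct toy_outcome *o)
{
	bool isolated = o->handler_isolated;

	if (o->result == RECOVERED)
		isolated = true;
	/* Remaining users or mappings veto the "isolated" verdict. */
	if (o->extra_refs || o->mapcount)
		isolated = false;

	return isolated ? 0 : -EBUSY;
}
```

In this model a RECOVERED page that still has an extra reference yields -EBUSY, while a DELAYED dirty swap cache page that was taken off the LRU yields 0, which is the distinction the old "result == RECOVERED" test could not express.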
CC: Andi Kleen <andi@firstfloor.org>
Signed-off-by: Wu Fengguang <fengguang.wu@intel.com>
---
mm/memory-failure.c | 28 ++++++++++++++--------------
1 file changed, 14 insertions(+), 14 deletions(-)
--- linux-mm.orig/mm/memory-failure.c 2009-11-30 20:35:49.000000000 +0800
+++ linux-mm/mm/memory-failure.c 2009-11-30 20:40:56.000000000 +0800
@@ -332,6 +332,7 @@ struct hwpoison_control {
struct page *p; /* raw corrupted page */
struct page *page; /* compound page head */
int result;
+ unsigned page_isolated:1;
};
/*
@@ -529,9 +530,10 @@ static int me_swapcache_dirty(struct hwp
/* Trigger EIO in shmem: */
ClearPageUptodate(p);
- if (!delete_from_lru_cache(p))
+ if (!delete_from_lru_cache(p)) {
+ hpc->page_isolated = 1;
return DELAYED;
- else
+ } else
return FAILED;
}
@@ -641,7 +643,7 @@ static void action_result(struct hwpoiso
msg, hwpoison_result_name[result]);
}
-static int page_action(struct page_state *ps,
+static void page_action(struct page_state *ps,
struct hwpoison_control *hpc)
{
int result;
@@ -656,12 +658,15 @@ static int page_action(struct page_state
"MCE %#lx: %s page still referenced by %d users\n",
hpc->pfn, ps->msg, count);
+ if (result == RECOVERED)
+ hpc->page_isolated = 1;
+ if (count || page_mapcount(hpc->page))
+ hpc->page_isolated = 0;
+
/* Could do more checks here if page looks ok */
/*
* Could adjust zone counters here to correct for the missing page.
*/
-
- return result == RECOVERED ? 0 : -EBUSY;
}
#define N_UNMAP_TRIES 5
@@ -767,7 +772,6 @@ int __memory_failure(unsigned long pfn,
struct page_state *ps;
struct page *p;
struct page *page;
- int res;
if (!sysctl_memory_failure_recovery)
panic("Memory failure from trap %d on page %lx", trapno, pfn);
@@ -785,6 +789,7 @@ int __memory_failure(unsigned long pfn,
hpc.pfn = pfn;
hpc.p = p;
hpc.page = page;
+ hpc.page_isolated = 0;
if (TestSetPageHWPoison(p)) {
action_result(&hpc, "already hardware poisoned", IGNORED);
@@ -842,7 +847,6 @@ int __memory_failure(unsigned long pfn,
*/
if (!PageHWPoison(p)) {
action_result(&hpc, "unpoisoned", IGNORED);
- res = 0;
goto out;
}
@@ -852,30 +856,26 @@ int __memory_failure(unsigned long pfn,
* Now take care of user space mappings.
* Abort on fail: __remove_from_page_cache() assumes unmapped page.
*/
- if (hwpoison_user_mappings(&hpc, trapno) != SWAP_SUCCESS) {
- res = -EBUSY;
+ if (hwpoison_user_mappings(&hpc, trapno) != SWAP_SUCCESS)
goto out;
- }
/*
* Torn down by someone else?
*/
if (PageLRU(p) && !PageSwapCache(p) && p->mapping == NULL) {
action_result(&hpc, "already truncated LRU", IGNORED);
- res = 0;
goto out;
}
- res = -EBUSY;
for (ps = error_states;; ps++) {
if ((p->flags & ps->mask) == ps->res) {
- res = page_action(ps, &hpc);
+ page_action(ps, &hpc);
break;
}
}
out:
unlock_page(p);
- return res;
+ return hpc.page_isolated ? 0 : -EBUSY;
}
EXPORT_SYMBOL_GPL(__memory_failure);
^ permalink raw reply [flat|nested] 61+ messages in thread
* Re: [PATCH 14/24] HWPOISON: return 0 if page is assured to be isolated
2009-12-02 3:12 ` [PATCH 14/24] HWPOISON: return 0 if page is assured to be isolated Wu Fengguang
@ 2009-12-02 12:47 ` Andi Kleen
2009-12-02 13:15 ` Wu Fengguang
0 siblings, 1 reply; 61+ messages in thread
From: Andi Kleen @ 2009-12-02 12:47 UTC (permalink / raw)
To: Wu Fengguang; +Cc: Andi Kleen, Andrew Morton, Nick Piggin, linux-mm, LKML
On Wed, Dec 02, 2009 at 11:12:45AM +0800, Wu Fengguang wrote:
> Introduce hpc.page_isolated to record if page is assured to be
> isolated, ie. it won't be accessed in normal kernel code paths
> and therefore won't trigger another MCE event.
>
> __memory_failure() will now return 0 to indicate that page is
> really isolated. Note that the original used action result
> RECOVERED is not a reliable criterion.
>
> Note that we now don't bother to risk returning 0 for the
> rare unpoison/truncated cases.
That's the only user of the new hwpoison_control structure, right?
For that single bit, I think I'd prefer to extend the return values
and keep the arguments around. Structures are not nice to read.
I'll change the code.
-Andi
^ permalink raw reply [flat|nested] 61+ messages in thread
* Re: [PATCH 14/24] HWPOISON: return 0 if page is assured to be isolated
2009-12-02 12:47 ` Andi Kleen
@ 2009-12-02 13:15 ` Wu Fengguang
0 siblings, 0 replies; 61+ messages in thread
From: Wu Fengguang @ 2009-12-02 13:15 UTC (permalink / raw)
To: Andi Kleen; +Cc: Andrew Morton, Nick Piggin, linux-mm, LKML
On Wed, Dec 02, 2009 at 08:47:30PM +0800, Andi Kleen wrote:
> On Wed, Dec 02, 2009 at 11:12:45AM +0800, Wu Fengguang wrote:
> > Introduce hpc.page_isolated to record if page is assured to be
> > isolated, ie. it won't be accessed in normal kernel code paths
> > and therefore won't trigger another MCE event.
> >
> > __memory_failure() will now return 0 to indicate that page is
> > really isolated. Note that the original used action result
> > RECOVERED is not a reliable criterion.
> >
> > Note that we now don't bother to risk returning 0 for the
> > rare unpoison/truncated cases.
>
> That's the only user of the new hwpoison_control structure right?
> I think I prefer for that single bit to extend the return values
> and keep the arguments around. structures are not nice to read.
Easier to read but harder to extend. I saw Haicheng add some debug
bits to hwpoison_control to collect debug info ;)
> I'll change the code.
I originally introduced "struct hwpoison_control" to collect more info
(like data_recoverable) and to dump it via uevent. Then we decided to
drop that unless there is explicit user demand.
In its current form, it does seem cleaner to do without hwpoison_control.
Thanks,
Fengguang
^ permalink raw reply [flat|nested] 61+ messages in thread
* [PATCH 15/24] HWPOISON: add fs/device filters
2009-12-02 3:12 [PATCH 00/24] hwpoison fixes and stress testing filters Wu Fengguang
` (13 preceding siblings ...)
2009-12-02 3:12 ` [PATCH 14/24] HWPOISON: return 0 if page is assured to be isolated Wu Fengguang
@ 2009-12-02 3:12 ` Wu Fengguang
2009-12-02 3:12 ` [PATCH 16/24] HWPOISON: limit hwpoison injector to known page types Wu Fengguang
` (8 subsequent siblings)
23 siblings, 0 replies; 61+ messages in thread
From: Wu Fengguang @ 2009-12-02 3:12 UTC (permalink / raw)
To: Andi Kleen
Cc: Andrew Morton, Haicheng Li, Nick Piggin, Wu Fengguang, linux-mm, LKML
[-- Attachment #1: hwpoison-filter-fs.patch --]
[-- Type: text/plain, Size: 3744 bytes --]
Filesystem data/metadata pages are the trickiest to isolate.
Getting them right requires careful code review and stress testing.
The fs/device filter helps target the stress tests at the pages of a
specific filesystem. The filter condition is the block device's
major/minor numbers:
	- corrupt-filter-dev-major
	- corrupt-filter-dev-minor
When specified (not -1), only page cache pages that belong to that
device will be poisoned.
The filters are checked reliably on the locked and refcounted page.
Haicheng: clear PG_hwpoison and drop the bad page count if the filter does not match
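The matching rule can be sketched in user space as follows. This is a minimal model, not the kernel code: struct page_model, FILTER_ANY and model_filter_dev() are hypothetical names, and the has_host flag stands in for the "mapping == NULL || mapping->host == NULL" test in the diff below.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define FILTER_ANY (~0U)	/* reads back as -1: this field is not filtered */

/* Hypothetical stand-in for a page and the device behind its mapping. */
struct page_model {
	bool has_host;		/* false: no mapping or no host inode */
	uint32_t major;
	uint32_t minor;
};

static uint32_t filter_dev_major = FILTER_ANY;
static uint32_t filter_dev_minor = FILTER_ANY;

/* Returns 0 if the page may be poisoned, -EINVAL if it is filtered out. */
static int model_filter_dev(const struct page_model *p)
{
	assert(p != NULL);

	/* With both fields unset the filter accepts every page. */
	if (filter_dev_major == FILTER_ANY && filter_dev_minor == FILTER_ANY)
		return 0;

	/* Once a filter is set, pages without a backing device never match. */
	if (!p->has_host)
		return -EINVAL;

	if (filter_dev_major != FILTER_ANY && filter_dev_major != p->major)
		return -EINVAL;
	if (filter_dev_minor != FILTER_ANY && filter_dev_minor != p->minor)
		return -EINVAL;

	return 0;
}
```

Setting only the major number selects every device with that major; setting both narrows the tests to a single block device.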
CC: Haicheng Li <haicheng.li@intel.com>
CC: Nick Piggin <npiggin@suse.de>
CC: Andi Kleen <andi@firstfloor.org>
Signed-off-by: Wu Fengguang <fengguang.wu@intel.com>
---
mm/hwpoison-inject.c | 11 +++++++++
mm/internal.h | 3 ++
mm/memory-failure.c | 48 +++++++++++++++++++++++++++++++++++++++++
3 files changed, 62 insertions(+)
--- linux-mm.orig/mm/memory-failure.c 2009-11-30 20:44:31.000000000 +0800
+++ linux-mm/mm/memory-failure.c 2009-11-30 20:51:22.000000000 +0800
@@ -48,6 +48,47 @@ int sysctl_memory_failure_recovery __rea
atomic_long_t mce_bad_pages __read_mostly = ATOMIC_LONG_INIT(0);
+u32 hwpoison_filter_dev_major = ~0U;
+u32 hwpoison_filter_dev_minor = ~0U;
+
+static int hwpoison_filter_dev(struct page *p)
+{
+	struct address_space *mapping;
+	dev_t dev;
+
+	if (hwpoison_filter_dev_major == ~0U &&
+	    hwpoison_filter_dev_minor == ~0U)
+		return 0;
+
+	/*
+	 * page_mapping() does not accept slab page
+	 */
+	if (PageSlab(p))
+		return -EINVAL;
+
+	mapping = page_mapping(p);
+	if (mapping == NULL || mapping->host == NULL)
+		return -EINVAL;
+
+	dev = mapping->host->i_sb->s_dev;
+	if (hwpoison_filter_dev_major != ~0U &&
+	    hwpoison_filter_dev_major != MAJOR(dev))
+		return -EINVAL;
+	if (hwpoison_filter_dev_minor != ~0U &&
+	    hwpoison_filter_dev_minor != MINOR(dev))
+		return -EINVAL;
+
+	return 0;
+}
+
+int hwpoison_filter(struct page *p)
+{
+	if (hwpoison_filter_dev(p))
+		return -EINVAL;
+
+	return 0;
+}
+
/*
* Send all the processes who have the page mapped an ``action optional''
* signal.
@@ -849,6 +890,13 @@ int __memory_failure(unsigned long pfn,
action_result(&hpc, "unpoisoned", IGNORED);
goto out;
}
+ if (hwpoison_filter(p)) {
+ if (TestClearPageHWPoison(p))
+ atomic_long_dec(&mce_bad_pages);
+ unlock_page(p);
+ put_page(p);
+ return 0;
+ }
wait_on_page_writeback(p);
--- linux-mm.orig/mm/hwpoison-inject.c 2009-11-30 20:30:55.000000000 +0800
+++ linux-mm/mm/hwpoison-inject.c 2009-11-30 20:44:41.000000000 +0800
@@ -3,6 +3,7 @@
#include <linux/debugfs.h>
#include <linux/kernel.h>
#include <linux/mm.h>
+#include "internal.h"
static struct dentry *hwpoison_dir;
@@ -49,6 +50,16 @@ static int pfn_inject_init(void)
if (!dentry)
goto fail;
+ dentry = debugfs_create_u32("corrupt-filter-dev-major", 0600,
+ hwpoison_dir, &hwpoison_filter_dev_major);
+ if (!dentry)
+ goto fail;
+
+ dentry = debugfs_create_u32("corrupt-filter-dev-minor", 0600,
+ hwpoison_dir, &hwpoison_filter_dev_minor);
+ if (!dentry)
+ goto fail;
+
return 0;
fail:
pfn_inject_exit();
--- linux-mm.orig/mm/internal.h 2009-11-30 20:06:01.000000000 +0800
+++ linux-mm/mm/internal.h 2009-11-30 20:44:41.000000000 +0800
@@ -263,3 +263,6 @@ int __get_user_pages(struct task_struct
#define ZONE_RECLAIM_SOME 0
#define ZONE_RECLAIM_SUCCESS 1
#endif
+
+extern u32 hwpoison_filter_dev_major;
+extern u32 hwpoison_filter_dev_minor;
^ permalink raw reply [flat|nested] 61+ messages in thread
* [PATCH 16/24] HWPOISON: limit hwpoison injector to known page types
2009-12-02 3:12 [PATCH 00/24] hwpoison fixes and stress testing filters Wu Fengguang
` (14 preceding siblings ...)
2009-12-02 3:12 ` [PATCH 15/24] HWPOISON: add fs/device filters Wu Fengguang
@ 2009-12-02 3:12 ` Wu Fengguang
2009-12-02 8:11 ` Ingo Molnar
2009-12-02 3:12 ` [PATCH 17/24] mm: export stable page flags Wu Fengguang
` (7 subsequent siblings)
23 siblings, 1 reply; 61+ messages in thread
From: Wu Fengguang @ 2009-12-02 3:12 UTC (permalink / raw)
To: Andi Kleen
Cc: Andrew Morton, Haicheng Li, Wu Fengguang, Nick Piggin, linux-mm, LKML
[-- Attachment #1: hwpoison-filter-limit-scope.patch --]
[-- Type: text/plain, Size: 2678 bytes --]
__memory_failure()'s workflow is
	set PG_hwpoison
	//...
	unset PG_hwpoison if the page didn't pass the hwpoison filter
That could kill an unrelated process if it happens to page fault on the
page while PG_hwpoison is (temporarily) set. The race window should be
big enough to show up in stress tests.
Fix it by grabbing the page and checking the filter at inject time. This
also avoids the very noisy "Injecting memory failure..." messages.
- We don't touch madvise() based injection, because the filters are
  generally not necessary for it.
- If we want to apply the filters to h/w aided injection, we had better
  rearrange the logic in __memory_failure() instead of using this patch.
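The ordering this fix relies on (take a reference, check the filter, and only then set the flag) can be sketched with C11 atomics. Everything here is a user-space model: struct page_model, model_get_page_unless_zero() and model_inject() are hypothetical names, and the wanted field simply stands in for the result of hwpoison_filter(). Note one deliberate difference: the sketch drops its reference on the filtered-out path, whereas the posted hunk returns there without a visible put_page().

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

struct page_model {
	atomic_int refcount;	/* 0 models a free (buddy) page */
	bool wanted;		/* what the filter would say about this page */
	bool poisoned;		/* stands in for PG_hwpoison */
};

/* Take a reference unless the count is zero, like get_page_unless_zero(). */
static bool model_get_page_unless_zero(struct page_model *p)
{
	int old = atomic_load(&p->refcount);

	while (old != 0) {
		/* On failure, old is refreshed with the current count. */
		if (atomic_compare_exchange_weak(&p->refcount, &old, old + 1))
			return true;
	}
	return false;
}

/* Returns true if the page was handed on for poisoning. */
static bool model_inject(struct page_model *p)
{
	assert(p != NULL);

	if (!model_get_page_unless_zero(p))
		return false;		/* free page: not supported */

	if (!p->wanted) {
		/* Filtered out before the flag is ever set. */
		atomic_fetch_sub(&p->refcount, 1);	/* sketch only, see above */
		return false;
	}

	p->poisoned = true;		/* only now would __memory_failure() run */
	return true;
}
```

Because the flag is never set on a page that fails the filter, an unrelated task faulting on that page cannot observe a temporary PG_hwpoison.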
CC: Haicheng Li <haicheng.li@intel.com>
CC: Andi Kleen <andi@firstfloor.org>
Signed-off-by: Wu Fengguang <fengguang.wu@intel.com>
---
mm/hwpoison-inject.c | 27 ++++++++++++++++++++++++++-
mm/internal.h | 2 ++
2 files changed, 28 insertions(+), 1 deletion(-)
--- linux-mm.orig/mm/hwpoison-inject.c 2009-11-30 20:44:41.000000000 +0800
+++ linux-mm/mm/hwpoison-inject.c 2009-11-30 20:58:20.000000000 +0800
@@ -3,16 +3,41 @@
#include <linux/debugfs.h>
#include <linux/kernel.h>
#include <linux/mm.h>
+#include <linux/swap.h>
#include "internal.h"
static struct dentry *hwpoison_dir;
 static int hwpoison_inject(void *data, u64 val)
 {
+	unsigned long pfn = val;
+	struct page *p;
+
 	if (!capable(CAP_SYS_ADMIN))
 		return -EPERM;
+
+	if (!pfn_valid(pfn))
+		return -ENXIO;
+
+	/*
+	 * This implies unable to support free buddy pages.
+	 */
+	p = pfn_to_page(pfn);
+	if (!get_page_unless_zero(p))
+		return 0;
+
+	if (!PageLRU(p))
+		lru_add_drain_all();
+	/*
+	 * do a racy check with elevated page count, to make sure PG_hwpoison
+	 * will only be set for the targeted owner (or on a free page).
+	 * __memory_failure() will redo the check reliably inside page lock.
+	 */
+	if (hwpoison_filter(p))
+		return 0;
+
 	printk(KERN_INFO "Injecting memory failure at pfn %Lx\n", val);
-	return __memory_failure(val, 18, 0);
+	return __memory_failure(val, 18, 1);
 }
static int hwpoison_forget(void *data, u64 val)
--- linux-mm.orig/mm/internal.h 2009-11-30 20:44:41.000000000 +0800
+++ linux-mm/mm/internal.h 2009-11-30 20:52:11.000000000 +0800
@@ -264,5 +264,7 @@ int __get_user_pages(struct task_struct
#define ZONE_RECLAIM_SUCCESS 1
#endif
+extern int hwpoison_filter(struct page *p);
+
extern u32 hwpoison_filter_dev_major;
extern u32 hwpoison_filter_dev_minor;
^ permalink raw reply [flat|nested] 61+ messages in thread
* Re: [PATCH 16/24] HWPOISON: limit hwpoison injector to known page types
2009-12-02 3:12 ` [PATCH 16/24] HWPOISON: limit hwpoison injector to known page types Wu Fengguang
@ 2009-12-02 8:11 ` Ingo Molnar
0 siblings, 0 replies; 61+ messages in thread
From: Ingo Molnar @ 2009-12-02 8:11 UTC (permalink / raw)
To: Wu Fengguang
Cc: Andi Kleen, Andrew Morton, Haicheng Li, Nick Piggin, linux-mm, LKML
* Wu Fengguang <fengguang.wu@intel.com> wrote:
> --- linux-mm.orig/mm/hwpoison-inject.c 2009-11-30 20:44:41.000000000 +0800
> +++ linux-mm/mm/hwpoison-inject.c 2009-11-30 20:58:20.000000000 +0800
> @@ -3,16 +3,41 @@
> #include <linux/debugfs.h>
> #include <linux/kernel.h>
> #include <linux/mm.h>
> +#include <linux/swap.h>
> #include "internal.h"
>
> static struct dentry *hwpoison_dir;
>
> static int hwpoison_inject(void *data, u64 val)
> {
i'd like to raise a continuing conceptual objection against the ad-hoc
and specialistic nature of the event injection in the
mm/memory-failure*.c code. It should probably be using a standardized
interface by integrating with perf events - as i outlined it before.
Where needed perf events should be extended - we can help with that.
There's no point in having scattered pieces of incompatible (and
user-ABI affecting) infrastructure all around the kernel.
Thanks,
Ingo
^ permalink raw reply [flat|nested] 61+ messages in thread
* [PATCH 17/24] mm: export stable page flags
2009-12-02 3:12 [PATCH 00/24] hwpoison fixes and stress testing filters Wu Fengguang
` (15 preceding siblings ...)
2009-12-02 3:12 ` [PATCH 16/24] HWPOISON: limit hwpoison injector to known page types Wu Fengguang
@ 2009-12-02 3:12 ` Wu Fengguang
2009-12-02 4:42 ` Wu Fengguang
2009-12-02 3:12 ` [PATCH 18/24] HWPOISON: add page flags filter Wu Fengguang
` (6 subsequent siblings)
23 siblings, 1 reply; 61+ messages in thread
From: Wu Fengguang @ 2009-12-02 3:12 UTC (permalink / raw)
To: Andi Kleen
Cc: Andrew Morton, Matt Mackall, Nick Piggin, Christoph Lameter,
Wu Fengguang, linux-mm, LKML
[-- Attachment #1: kpageflags-export-page_uflags.patch --]
[-- Type: text/plain, Size: 4336 bytes --]
Rename get_uflags() to stable_page_flags() and make it a global function
for use in the hwpoison page flags filter, which needs to compare the
stable page flags of a page with the value provided by user space.
Also move KPF_* to kernel-page-flags.h for use by user space tools.
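How internal flag bits map onto the stable KPF_* layout can be sketched in user space. kpf_copy_bit() uses the same expression as the helper in fs/proc/page.c shown in the diff below; the MODEL_PG_* bit positions and model_stable_flags() are hypothetical, since the real PG_* positions depend on the kernel configuration.

```c
#include <assert.h>
#include <stdint.h>

#define KPF_DIRTY	4	/* exported bit numbers, as in the patch */
#define KPF_LRU		5

/* Hypothetical internal bit positions, for illustration only. */
#define MODEL_PG_LRU	2
#define MODEL_PG_DIRTY	7

/* Copy internal bit kbit of kflags to exported bit ubit. */
static inline uint64_t kpf_copy_bit(uint64_t kflags, int ubit, int kbit)
{
	return ((kflags >> kbit) & 1) << ubit;
}

/* Translate a model of page->flags into the stable, exported layout. */
static uint64_t model_stable_flags(uint64_t kflags)
{
	const uint64_t known = (1ULL << MODEL_PG_LRU) | (1ULL << MODEL_PG_DIRTY);
	uint64_t u = 0;

	/* The model only defines the two internal bits above. */
	assert((kflags & ~known) == 0);

	u |= kpf_copy_bit(kflags, KPF_LRU, MODEL_PG_LRU);
	u |= kpf_copy_bit(kflags, KPF_DIRTY, MODEL_PG_DIRTY);
	return u;
}
```

The point of the indirection is that user space only ever sees the KPF_* numbering, so internal bits can move without breaking tools.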
CC: Matt Mackall <mpm@selenic.com>
CC: Nick Piggin <npiggin@suse.de>
CC: Christoph Lameter <clameter@sgi.com>
CC: Andi Kleen <andi@firstfloor.org>
Signed-off-by: Wu Fengguang <fengguang.wu@intel.com>
---
fs/proc/page.c | 45 +--------------------------
include/linux/kernel-page-flags.h | 46 ++++++++++++++++++++++++++++
include/linux/page-flags.h | 2 +
3 files changed, 51 insertions(+), 42 deletions(-)
--- linux-mm.orig/fs/proc/page.c 2009-11-07 20:23:59.000000000 +0800
+++ linux-mm/fs/proc/page.c 2009-11-07 20:37:31.000000000 +0800
@@ -8,6 +8,7 @@
#include <linux/proc_fs.h>
#include <linux/seq_file.h>
#include <linux/hugetlb.h>
+#include <linux/kernel-page-flags.h>
#include <asm/uaccess.h>
#include "internal.h"
@@ -71,52 +72,12 @@ static const struct file_operations proc
* physical page flags.
*/
-/* These macros are used to decouple internal flags from exported ones */
-
-#define KPF_LOCKED 0
-#define KPF_ERROR 1
-#define KPF_REFERENCED 2
-#define KPF_UPTODATE 3
-#define KPF_DIRTY 4
-#define KPF_LRU 5
-#define KPF_ACTIVE 6
-#define KPF_SLAB 7
-#define KPF_WRITEBACK 8
-#define KPF_RECLAIM 9
-#define KPF_BUDDY 10
-
-/* 11-20: new additions in 2.6.31 */
-#define KPF_MMAP 11
-#define KPF_ANON 12
-#define KPF_SWAPCACHE 13
-#define KPF_SWAPBACKED 14
-#define KPF_COMPOUND_HEAD 15
-#define KPF_COMPOUND_TAIL 16
-#define KPF_HUGE 17
-#define KPF_UNEVICTABLE 18
-#define KPF_HWPOISON 19
-#define KPF_NOPAGE 20
-
-#define KPF_KSM 21
-
-/* kernel hacking assistances
- * WARNING: subject to change, never rely on them!
- */
-#define KPF_RESERVED 32
-#define KPF_MLOCKED 33
-#define KPF_MAPPEDTODISK 34
-#define KPF_PRIVATE 35
-#define KPF_PRIVATE_2 36
-#define KPF_OWNER_PRIVATE 37
-#define KPF_ARCH 38
-#define KPF_UNCACHED 39
-
static inline u64 kpf_copy_bit(u64 kflags, int ubit, int kbit)
{
return ((kflags >> kbit) & 1) << ubit;
}
-static u64 get_uflags(struct page *page)
+u64 stable_page_flags(struct page *page)
{
u64 k;
u64 u;
@@ -219,7 +180,7 @@ static ssize_t kpageflags_read(struct fi
else
ppage = NULL;
- if (put_user(get_uflags(ppage), out)) {
+ if (put_user(stable_page_flags(ppage), out)) {
ret = -EFAULT;
break;
}
--- linux-mm.orig/include/linux/page-flags.h 2009-11-07 20:37:27.000000000 +0800
+++ linux-mm/include/linux/page-flags.h 2009-11-07 20:37:31.000000000 +0800
@@ -284,6 +284,8 @@ PAGEFLAG_FALSE(HWPoison)
#define __PG_HWPOISON 0
#endif
+u64 stable_page_flags(struct page *page);
+
static inline int PageUptodate(struct page *page)
{
int ret = test_bit(PG_uptodate, &(page)->flags);
--- /dev/null 1970-01-01 00:00:00.000000000 +0000
+++ linux-mm/include/linux/kernel-page-flags.h 2009-11-07 20:37:31.000000000 +0800
@@ -0,0 +1,46 @@
+#ifndef LINUX_KERNEL_PAGE_FLAGS_H
+#define LINUX_KERNEL_PAGE_FLAGS_H
+
+/*
+ * Stable page flag bits exported to user space
+ */
+
+#define KPF_LOCKED 0
+#define KPF_ERROR 1
+#define KPF_REFERENCED 2
+#define KPF_UPTODATE 3
+#define KPF_DIRTY 4
+#define KPF_LRU 5
+#define KPF_ACTIVE 6
+#define KPF_SLAB 7
+#define KPF_WRITEBACK 8
+#define KPF_RECLAIM 9
+#define KPF_BUDDY 10
+
+/* 11-20: new additions in 2.6.31 */
+#define KPF_MMAP 11
+#define KPF_ANON 12
+#define KPF_SWAPCACHE 13
+#define KPF_SWAPBACKED 14
+#define KPF_COMPOUND_HEAD 15
+#define KPF_COMPOUND_TAIL 16
+#define KPF_HUGE 17
+#define KPF_UNEVICTABLE 18
+#define KPF_HWPOISON 19
+#define KPF_NOPAGE 20
+
+#define KPF_KSM 21
+
+/* kernel hacking assistances
+ * WARNING: subject to change, never rely on them!
+ */
+#define KPF_RESERVED 32
+#define KPF_MLOCKED 33
+#define KPF_MAPPEDTODISK 34
+#define KPF_PRIVATE 35
+#define KPF_PRIVATE_2 36
+#define KPF_OWNER_PRIVATE 37
+#define KPF_ARCH 38
+#define KPF_UNCACHED 39
+
+#endif /* LINUX_KERNEL_PAGE_FLAGS_H */
^ permalink raw reply [flat|nested] 61+ messages in thread
* Re: [PATCH 17/24] mm: export stable page flags
2009-12-02 3:12 ` [PATCH 17/24] mm: export stable page flags Wu Fengguang
@ 2009-12-02 4:42 ` Wu Fengguang
0 siblings, 0 replies; 61+ messages in thread
From: Wu Fengguang @ 2009-12-02 4:42 UTC (permalink / raw)
To: Andi Kleen
Cc: Christoph Lameter, Andrew Morton, Matt Mackall, Nick Piggin,
linux-mm, LKML
[corrected CC to Christoph Lameter <cl@linux-foundation.org>]
On Wed, Dec 02, 2009 at 11:12:48AM +0800, Wu, Fengguang wrote:
> Rename get_uflags() to stable_page_flags() and make it a global function
> for use in the hwpoison page flags filter, which need to compare user
> page flags with the value provided by user space.
>
> Also move KPF_* to kernel-page-flags.h for use by user space tools.
>
> CC: Matt Mackall <mpm@selenic.com>
> CC: Nick Piggin <npiggin@suse.de>
> CC: Christoph Lameter <clameter@sgi.com>
> CC: Andi Kleen <andi@firstfloor.org>
> Signed-off-by: Wu Fengguang <fengguang.wu@intel.com>
> ---
> fs/proc/page.c | 45 +--------------------------
> include/linux/kernel-page-flags.h | 46 ++++++++++++++++++++++++++++
> include/linux/page-flags.h | 2 +
> 3 files changed, 51 insertions(+), 42 deletions(-)
>
> --- linux-mm.orig/fs/proc/page.c 2009-11-07 20:23:59.000000000 +0800
> +++ linux-mm/fs/proc/page.c 2009-11-07 20:37:31.000000000 +0800
> @@ -8,6 +8,7 @@
> #include <linux/proc_fs.h>
> #include <linux/seq_file.h>
> #include <linux/hugetlb.h>
> +#include <linux/kernel-page-flags.h>
> #include <asm/uaccess.h>
> #include "internal.h"
>
> @@ -71,52 +72,12 @@ static const struct file_operations proc
> * physical page flags.
> */
>
> -/* These macros are used to decouple internal flags from exported ones */
> -
> -#define KPF_LOCKED 0
> -#define KPF_ERROR 1
> -#define KPF_REFERENCED 2
> -#define KPF_UPTODATE 3
> -#define KPF_DIRTY 4
> -#define KPF_LRU 5
> -#define KPF_ACTIVE 6
> -#define KPF_SLAB 7
> -#define KPF_WRITEBACK 8
> -#define KPF_RECLAIM 9
> -#define KPF_BUDDY 10
> -
> -/* 11-20: new additions in 2.6.31 */
> -#define KPF_MMAP 11
> -#define KPF_ANON 12
> -#define KPF_SWAPCACHE 13
> -#define KPF_SWAPBACKED 14
> -#define KPF_COMPOUND_HEAD 15
> -#define KPF_COMPOUND_TAIL 16
> -#define KPF_HUGE 17
> -#define KPF_UNEVICTABLE 18
> -#define KPF_HWPOISON 19
> -#define KPF_NOPAGE 20
> -
> -#define KPF_KSM 21
> -
> -/* kernel hacking assistances
> - * WARNING: subject to change, never rely on them!
> - */
> -#define KPF_RESERVED 32
> -#define KPF_MLOCKED 33
> -#define KPF_MAPPEDTODISK 34
> -#define KPF_PRIVATE 35
> -#define KPF_PRIVATE_2 36
> -#define KPF_OWNER_PRIVATE 37
> -#define KPF_ARCH 38
> -#define KPF_UNCACHED 39
> -
> static inline u64 kpf_copy_bit(u64 kflags, int ubit, int kbit)
> {
> return ((kflags >> kbit) & 1) << ubit;
> }
>
> -static u64 get_uflags(struct page *page)
> +u64 stable_page_flags(struct page *page)
> {
> u64 k;
> u64 u;
> @@ -219,7 +180,7 @@ static ssize_t kpageflags_read(struct fi
> else
> ppage = NULL;
>
> - if (put_user(get_uflags(ppage), out)) {
> + if (put_user(stable_page_flags(ppage), out)) {
> ret = -EFAULT;
> break;
> }
> --- linux-mm.orig/include/linux/page-flags.h 2009-11-07 20:37:27.000000000 +0800
> +++ linux-mm/include/linux/page-flags.h 2009-11-07 20:37:31.000000000 +0800
> @@ -284,6 +284,8 @@ PAGEFLAG_FALSE(HWPoison)
> #define __PG_HWPOISON 0
> #endif
>
> +u64 stable_page_flags(struct page *page);
> +
> static inline int PageUptodate(struct page *page)
> {
> int ret = test_bit(PG_uptodate, &(page)->flags);
> --- /dev/null 1970-01-01 00:00:00.000000000 +0000
> +++ linux-mm/include/linux/kernel-page-flags.h 2009-11-07 20:37:31.000000000 +0800
> @@ -0,0 +1,46 @@
> +#ifndef LINUX_KERNEL_PAGE_FLAGS_H
> +#define LINUX_KERNEL_PAGE_FLAGS_H
> +
> +/*
> + * Stable page flag bits exported to user space
> + */
> +
> +#define KPF_LOCKED 0
> +#define KPF_ERROR 1
> +#define KPF_REFERENCED 2
> +#define KPF_UPTODATE 3
> +#define KPF_DIRTY 4
> +#define KPF_LRU 5
> +#define KPF_ACTIVE 6
> +#define KPF_SLAB 7
> +#define KPF_WRITEBACK 8
> +#define KPF_RECLAIM 9
> +#define KPF_BUDDY 10
> +
> +/* 11-20: new additions in 2.6.31 */
> +#define KPF_MMAP 11
> +#define KPF_ANON 12
> +#define KPF_SWAPCACHE 13
> +#define KPF_SWAPBACKED 14
> +#define KPF_COMPOUND_HEAD 15
> +#define KPF_COMPOUND_TAIL 16
> +#define KPF_HUGE 17
> +#define KPF_UNEVICTABLE 18
> +#define KPF_HWPOISON 19
> +#define KPF_NOPAGE 20
> +
> +#define KPF_KSM 21
> +
> +/* kernel hacking assistances
> + * WARNING: subject to change, never rely on them!
> + */
> +#define KPF_RESERVED 32
> +#define KPF_MLOCKED 33
> +#define KPF_MAPPEDTODISK 34
> +#define KPF_PRIVATE 35
> +#define KPF_PRIVATE_2 36
> +#define KPF_OWNER_PRIVATE 37
> +#define KPF_ARCH 38
> +#define KPF_UNCACHED 39
> +
> +#endif /* LINUX_KERNEL_PAGE_FLAGS_H */
>
^ permalink raw reply [flat|nested] 61+ messages in thread
* [PATCH 18/24] HWPOISON: add page flags filter
2009-12-02 3:12 [PATCH 00/24] hwpoison fixes and stress testing filters Wu Fengguang
` (16 preceding siblings ...)
2009-12-02 3:12 ` [PATCH 17/24] mm: export stable page flags Wu Fengguang
@ 2009-12-02 3:12 ` Wu Fengguang
2009-12-02 3:12 ` [PATCH 19/24] memcg: rename and export try_get_mem_cgroup_from_page() Wu Fengguang
` (5 subsequent siblings)
23 siblings, 0 replies; 61+ messages in thread
From: Wu Fengguang @ 2009-12-02 3:12 UTC (permalink / raw)
To: Andi Kleen; +Cc: Andrew Morton, Nick Piggin, Wu Fengguang, linux-mm, LKML
[-- Attachment #1: hwpoison-filter-pgflags.patch --]
[-- Type: text/plain, Size: 3084 bytes --]
When specified, only poison pages if ((page_flags & mask) == value).
	- corrupt-filter-flags-mask
	- corrupt-filter-flags-value
This allows stress testing of many kinds of pages.
Strictly speaking, handling buddy pages requires taking the zone lock, to
avoid setting PG_hwpoison on a "was buddy but now allocated to someone"
page. However, we can get away without it because we set PG_locked at the
beginning, which prevents the page allocator from allocating the page to
someone. (It will BUG() on the unexpected PG_locked, which is fine for
hwpoison testing.)
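A small user-space sketch of the mask/value test, assuming the KPF_LRU and KPF_DIRTY bit numbers from kernel-page-flags.h; KPF_BIT(), model_filter_flags() and model_select_clean_lru() are hypothetical helpers, and the flags argument stands in for the result of stable_page_flags().

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>

#define KPF_DIRTY	4
#define KPF_LRU		5
#define KPF_BIT(n)	(1ULL << (n))	/* hypothetical convenience macro */

static uint64_t filter_flags_mask;	/* 0 means: filter disabled */
static uint64_t filter_flags_value;

/* Returns 0 if the page may be poisoned, -EINVAL if it is filtered out. */
static int model_filter_flags(uint64_t stable_flags)
{
	if (!filter_flags_mask)
		return 0;

	if ((stable_flags & filter_flags_mask) == filter_flags_value)
		return 0;

	return -EINVAL;
}

/* Example setting: select pages that are on the LRU and not dirty. */
static void model_select_clean_lru(void)
{
	filter_flags_mask = KPF_BIT(KPF_LRU) | KPF_BIT(KPF_DIRTY);
	filter_flags_value = KPF_BIT(KPF_LRU);

	/* A value bit outside the mask could never match any page. */
	assert((filter_flags_value & ~filter_flags_mask) == 0);
}
```

Bits in the mask that are clear in the value act as "must be unset" conditions, which is how the example excludes dirty pages.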
CC: Nick Piggin <npiggin@suse.de>
CC: Andi Kleen <andi@firstfloor.org>
Signed-off-by: Wu Fengguang <fengguang.wu@intel.com>
---
mm/hwpoison-inject.c | 10 ++++++++++
mm/internal.h | 2 ++
mm/memory-failure.c | 18 ++++++++++++++++++
3 files changed, 30 insertions(+)
--- linux-mm.orig/mm/hwpoison-inject.c 2009-12-01 09:56:00.000000000 +0800
+++ linux-mm/mm/hwpoison-inject.c 2009-12-01 09:56:06.000000000 +0800
@@ -85,6 +85,16 @@ static int pfn_inject_init(void)
if (!dentry)
goto fail;
+ dentry = debugfs_create_u64("corrupt-filter-flags-mask", 0600,
+ hwpoison_dir, &hwpoison_filter_flags_mask);
+ if (!dentry)
+ goto fail;
+
+ dentry = debugfs_create_u64("corrupt-filter-flags-value", 0600,
+ hwpoison_dir, &hwpoison_filter_flags_value);
+ if (!dentry)
+ goto fail;
+
return 0;
fail:
pfn_inject_exit();
--- linux-mm.orig/mm/internal.h 2009-12-01 09:56:00.000000000 +0800
+++ linux-mm/mm/internal.h 2009-12-01 09:56:06.000000000 +0800
@@ -268,3 +268,5 @@ extern int hwpoison_filter(struct page *
extern u32 hwpoison_filter_dev_major;
extern u32 hwpoison_filter_dev_minor;
+extern u64 hwpoison_filter_flags_mask;
+extern u64 hwpoison_filter_flags_value;
--- linux-mm.orig/mm/memory-failure.c 2009-11-30 20:51:22.000000000 +0800
+++ linux-mm/mm/memory-failure.c 2009-12-01 09:56:06.000000000 +0800
@@ -34,6 +34,7 @@
#include <linux/kernel.h>
#include <linux/mm.h>
#include <linux/page-flags.h>
+#include <linux/kernel-page-flags.h>
#include <linux/sched.h>
#include <linux/ksm.h>
#include <linux/rmap.h>
@@ -50,6 +51,8 @@ atomic_long_t mce_bad_pages __read_mostl
u32 hwpoison_filter_dev_major = ~0U;
u32 hwpoison_filter_dev_minor = ~0U;
+u64 hwpoison_filter_flags_mask;
+u64 hwpoison_filter_flags_value;
static int hwpoison_filter_dev(struct page *p)
{
@@ -81,11 +84,26 @@ static int hwpoison_filter_dev(struct pa
return 0;
}
+static int hwpoison_filter_flags(struct page *p)
+{
+	if (!hwpoison_filter_flags_mask)
+		return 0;
+
+	if ((stable_page_flags(p) & hwpoison_filter_flags_mask) ==
+				    hwpoison_filter_flags_value)
+		return 0;
+	else
+		return -EINVAL;
+}
+
int hwpoison_filter(struct page *p)
{
if (hwpoison_filter_dev(p))
return -EINVAL;
+ if (hwpoison_filter_flags(p))
+ return -EINVAL;
+
return 0;
}
^ permalink raw reply [flat|nested] 61+ messages in thread
* [PATCH 19/24] memcg: rename and export try_get_mem_cgroup_from_page()
2009-12-02 3:12 [PATCH 00/24] hwpoison fixes and stress testing filters Wu Fengguang
` (17 preceding siblings ...)
2009-12-02 3:12 ` [PATCH 18/24] HWPOISON: add page flags filter Wu Fengguang
@ 2009-12-02 3:12 ` Wu Fengguang
2009-12-03 1:58 ` Balbir Singh
2009-12-02 3:12 ` [PATCH 20/24] memcg: add accessor to mem_cgroup.css Wu Fengguang
` (4 subsequent siblings)
23 siblings, 1 reply; 61+ messages in thread
From: Wu Fengguang @ 2009-12-02 3:12 UTC (permalink / raw)
To: Andi Kleen
Cc: Andrew Morton, KOSAKI Motohiro, Hugh Dickins, Daisuke Nishimura,
Balbir Singh, KAMEZAWA Hiroyuki, Wu Fengguang, Nick Piggin,
linux-mm, LKML
[-- Attachment #1: memcg-try_get_mem_cgroup_from_page.patch --]
[-- Type: text/plain, Size: 2791 bytes --]
So that the hwpoison injector can get the mem_cgroup for an arbitrary page
and thus know whether it is owned by some mem_cgroup task(s).
CC: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
CC: Hugh Dickins <hugh.dickins@tiscali.co.uk>
CC: Daisuke Nishimura <nishimura@mxp.nes.nec.co.jp>
CC: Balbir Singh <balbir@linux.vnet.ibm.com>
Acked-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Signed-off-by: Wu Fengguang <fengguang.wu@intel.com>
---
include/linux/memcontrol.h | 6 ++++++
mm/memcontrol.c | 11 ++++-------
2 files changed, 10 insertions(+), 7 deletions(-)
--- linux-mm.orig/mm/memcontrol.c 2009-11-02 10:18:42.000000000 +0800
+++ linux-mm/mm/memcontrol.c 2009-11-02 10:26:21.000000000 +0800
@@ -1379,25 +1379,22 @@ static struct mem_cgroup *mem_cgroup_loo
return container_of(css, struct mem_cgroup, css);
}
-static struct mem_cgroup *try_get_mem_cgroup_from_swapcache(struct page *page)
+struct mem_cgroup *try_get_mem_cgroup_from_page(struct page *page)
{
- struct mem_cgroup *mem;
+ struct mem_cgroup *mem = NULL;
struct page_cgroup *pc;
unsigned short id;
swp_entry_t ent;
VM_BUG_ON(!PageLocked(page));
- if (!PageSwapCache(page))
- return NULL;
-
pc = lookup_page_cgroup(page);
lock_page_cgroup(pc);
if (PageCgroupUsed(pc)) {
mem = pc->mem_cgroup;
if (mem && !css_tryget(&mem->css))
mem = NULL;
- } else {
+ } else if (PageSwapCache(page)) {
ent.val = page_private(page);
id = lookup_swap_cgroup(ent);
rcu_read_lock();
@@ -1742,7 +1739,7 @@ int mem_cgroup_try_charge_swapin(struct
*/
if (!PageSwapCache(page))
return 0;
- mem = try_get_mem_cgroup_from_swapcache(page);
+ mem = try_get_mem_cgroup_from_page(page);
if (!mem)
goto charge_cur_mm;
*ptr = mem;
--- linux-mm.orig/include/linux/memcontrol.h 2009-11-02 10:18:42.000000000 +0800
+++ linux-mm/include/linux/memcontrol.h 2009-11-02 10:26:21.000000000 +0800
@@ -68,6 +68,7 @@ extern unsigned long mem_cgroup_isolate_
extern void mem_cgroup_out_of_memory(struct mem_cgroup *mem, gfp_t gfp_mask);
int task_in_mem_cgroup(struct task_struct *task, const struct mem_cgroup *mem);
+extern struct mem_cgroup *try_get_mem_cgroup_from_page(struct page *page);
extern struct mem_cgroup *mem_cgroup_from_task(struct task_struct *p);
static inline
@@ -189,6 +190,11 @@ mem_cgroup_move_lists(struct page *page,
{
}
+static inline struct mem_cgroup *try_get_mem_cgroup_from_page(struct page *page)
+{
+ return NULL;
+}
+
static inline int mm_match_cgroup(struct mm_struct *mm, struct mem_cgroup *mem)
{
return 1;
^ permalink raw reply [flat|nested] 61+ messages in thread
* Re: [PATCH 19/24] memcg: rename and export try_get_mem_cgroup_from_page()
2009-12-02 3:12 ` [PATCH 19/24] memcg: rename and export try_get_mem_cgroup_from_page() Wu Fengguang
@ 2009-12-03 1:58 ` Balbir Singh
0 siblings, 0 replies; 61+ messages in thread
From: Balbir Singh @ 2009-12-03 1:58 UTC (permalink / raw)
To: Wu Fengguang
Cc: Andi Kleen, Andrew Morton, KOSAKI Motohiro, Hugh Dickins,
Daisuke Nishimura, KAMEZAWA Hiroyuki, Nick Piggin, linux-mm,
LKML
* Wu Fengguang <fengguang.wu@intel.com> [2009-12-02 11:12:50]:
> So that the hwpoison injector can get mem_cgroup for arbitrary page
> and thus know whether it is owned by some mem_cgroup task(s).
>
> CC: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
> CC: Hugh Dickins <hugh.dickins@tiscali.co.uk>
> CC: Daisuke Nishimura <nishimura@mxp.nes.nec.co.jp>
> CC: Balbir Singh <balbir@linux.vnet.ibm.com>
> Acked-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
> Signed-off-by: Wu Fengguang <fengguang.wu@intel.com>
Sorry for the delay in reviewing, I am attending a conference this
week. I'll try and get to them soon.
--
Balbir
* [PATCH 20/24] memcg: add accessor to mem_cgroup.css
2009-12-02 3:12 [PATCH 00/24] hwpoison fixes and stress testing filters Wu Fengguang
` (18 preceding siblings ...)
2009-12-02 3:12 ` [PATCH 19/24] memcg: rename and export try_get_mem_cgroup_from_page() Wu Fengguang
@ 2009-12-02 3:12 ` Wu Fengguang
2009-12-02 3:12 ` [PATCH 21/24] cgroup: define empty css_put() when !CONFIG_CGROUPS Wu Fengguang
` (3 subsequent siblings)
23 siblings, 0 replies; 61+ messages in thread
From: Wu Fengguang @ 2009-12-02 3:12 UTC (permalink / raw)
To: Andi Kleen
Cc: Andrew Morton, KOSAKI Motohiro, Hugh Dickins, Daisuke Nishimura,
Balbir Singh, KAMEZAWA Hiroyuki, Wu Fengguang, Nick Piggin,
linux-mm, LKML
[-- Attachment #1: memcg-mem_cgroup_css.patch --]
[-- Type: text/plain, Size: 1944 bytes --]
So that an outside user can free the reference count grabbed by
try_get_mem_cgroup_from_page().
CC: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
CC: Hugh Dickins <hugh.dickins@tiscali.co.uk>
CC: Daisuke Nishimura <nishimura@mxp.nes.nec.co.jp>
CC: Balbir Singh <balbir@linux.vnet.ibm.com>
Acked-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Signed-off-by: Wu Fengguang <fengguang.wu@intel.com>
---
include/linux/memcontrol.h | 7 +++++++
mm/memcontrol.c | 5 +++++
2 files changed, 12 insertions(+)
--- linux-mm.orig/include/linux/memcontrol.h 2009-11-02 10:26:21.000000000 +0800
+++ linux-mm/include/linux/memcontrol.h 2009-11-02 10:26:21.000000000 +0800
@@ -81,6 +81,8 @@ int mm_match_cgroup(const struct mm_stru
return cgroup == mem;
}
+extern struct cgroup_subsys_state *mem_cgroup_css(struct mem_cgroup *mem);
+
extern int
mem_cgroup_prepare_migration(struct page *page, struct mem_cgroup **ptr);
extern void mem_cgroup_end_migration(struct mem_cgroup *mem,
@@ -206,6 +208,11 @@ static inline int task_in_mem_cgroup(str
return 1;
}
+static inline struct cgroup_subsys_state *mem_cgroup_css(struct mem_cgroup *mem)
+{
+ return NULL;
+}
+
static inline int
mem_cgroup_prepare_migration(struct page *page, struct mem_cgroup **ptr)
{
--- linux-mm.orig/mm/memcontrol.c 2009-11-02 10:26:21.000000000 +0800
+++ linux-mm/mm/memcontrol.c 2009-11-02 10:26:21.000000000 +0800
@@ -282,6 +282,11 @@ mem_cgroup_zoneinfo(struct mem_cgroup *m
return &mem->info.nodeinfo[nid]->zoneinfo[zid];
}
+struct cgroup_subsys_state *mem_cgroup_css(struct mem_cgroup *mem)
+{
+ return &mem->css;
+}
+
static struct mem_cgroup_per_zone *
page_cgroup_zoneinfo(struct page_cgroup *pc)
{
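To illustrate the pairing this accessor enables: a caller outside
memcontrol.c takes a reference via try_get_mem_cgroup_from_page() and
drops it again with css_put(mem_cgroup_css(mem)). Below is a minimal
userspace sketch, not kernel code. The struct layouts, the page->owner
field, the plain-int refcount and the page_is_charged() helper are
assumptions made up for the illustration; only the other function names
come from the patches.

```c
#include <stddef.h>

/* Userspace stand-ins: the names mirror the kernel ones, the layouts do not. */
struct cgroup_subsys_state { int refcnt; };
struct mem_cgroup { struct cgroup_subsys_state css; };
struct page { struct mem_cgroup *owner; };	/* hypothetical field */

static void css_get(struct cgroup_subsys_state *css) { css->refcnt++; }
static void css_put(struct cgroup_subsys_state *css) { css->refcnt--; }

/* Returns the owning memcg with a reference held, or NULL if uncharged. */
static struct mem_cgroup *try_get_mem_cgroup_from_page(struct page *page)
{
	struct mem_cgroup *mem = page->owner;

	if (mem)
		css_get(&mem->css);
	return mem;
}

/* The accessor: lets outside code reach the css without knowing the layout. */
static struct cgroup_subsys_state *mem_cgroup_css(struct mem_cgroup *mem)
{
	return &mem->css;
}

/* Hypothetical caller: 1 if the page is charged, 0 if not; refcount balanced. */
static int page_is_charged(struct page *page)
{
	struct mem_cgroup *mem = try_get_mem_cgroup_from_page(page);

	if (!mem)
		return 0;
	css_put(mem_cgroup_css(mem));
	return 1;
}
```

The point of the sketch is only that every successful "try_get" is
matched by exactly one put, so the count is unchanged after the query.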
* [PATCH 21/24] cgroup: define empty css_put() when !CONFIG_CGROUPS
2009-12-02 3:12 [PATCH 00/24] hwpoison fixes and stress testing filters Wu Fengguang
` (19 preceding siblings ...)
2009-12-02 3:12 ` [PATCH 20/24] memcg: add accessor to mem_cgroup.css Wu Fengguang
@ 2009-12-02 3:12 ` Wu Fengguang
2009-12-02 22:48 ` Paul Menage
2009-12-02 3:12 ` [PATCH 22/24] HWPOISON: add memory cgroup filter Wu Fengguang
` (2 subsequent siblings)
23 siblings, 1 reply; 61+ messages in thread
From: Wu Fengguang @ 2009-12-02 3:12 UTC (permalink / raw)
To: Andi Kleen
Cc: Andrew Morton, Balbir Singh, KAMEZAWA Hiroyuki, Li Zefan,
Paul Menage, Wu Fengguang, Nick Piggin, linux-mm, LKML
[-- Attachment #1: memcg-css_put.patch --]
[-- Type: text/plain, Size: 1024 bytes --]
It will be used by the hwpoison inject code for releasing the
css grabbed by try_get_mem_cgroup_from_page().
CC: Balbir Singh <balbir@linux.vnet.ibm.com>
CC: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
CC: Li Zefan <lizf@cn.fujitsu.com>
CC: Paul Menage <menage@google.com>
Signed-off-by: Wu Fengguang <fengguang.wu@intel.com>
---
include/linux/cgroup.h | 3 +++
1 file changed, 3 insertions(+)
--- linux-mm.orig/include/linux/cgroup.h 2009-11-02 10:18:41.000000000 +0800
+++ linux-mm/include/linux/cgroup.h 2009-11-02 10:26:22.000000000 +0800
@@ -581,6 +581,9 @@ static inline int cgroupstats_build(stru
return -EINVAL;
}
+struct cgroup_subsys_state;
+static inline void css_put(struct cgroup_subsys_state *css) {}
+
#endif /* !CONFIG_CGROUPS */
#endif /* _LINUX_CGROUP_H */
* Re: [PATCH 21/24] cgroup: define empty css_put() when !CONFIG_CGROUPS
2009-12-02 3:12 ` [PATCH 21/24] cgroup: define empty css_put() when !CONFIG_CGROUPS Wu Fengguang
@ 2009-12-02 22:48 ` Paul Menage
2009-12-02 22:52 ` Andi Kleen
0 siblings, 1 reply; 61+ messages in thread
From: Paul Menage @ 2009-12-02 22:48 UTC (permalink / raw)
To: Wu Fengguang
Cc: Andi Kleen, Andrew Morton, Balbir Singh, KAMEZAWA Hiroyuki,
Li Zefan, Nick Piggin, linux-mm, LKML
On Tue, Dec 1, 2009 at 7:12 PM, Wu Fengguang <fengguang.wu@intel.com> wrote:
> --- linux-mm.orig/include/linux/cgroup.h 2009-11-02 10:18:41.000000000 +0800
> +++ linux-mm/include/linux/cgroup.h 2009-11-02 10:26:22.000000000 +0800
> @@ -581,6 +581,9 @@ static inline int cgroupstats_build(stru
> return -EINVAL;
> }
>
> +struct cgroup_subsys_state;
> +static inline void css_put(struct cgroup_subsys_state *css) {}
> +
> #endif /* !CONFIG_CGROUPS */
This doesn't sound like the right thing to do - if !CONFIG_CGROUPS,
then the code shouldn't be able to see a css structure to pass to this
function.
Paul
* Re: [PATCH 21/24] cgroup: define empty css_put() when !CONFIG_CGROUPS
2009-12-02 22:48 ` Paul Menage
@ 2009-12-02 22:52 ` Andi Kleen
2009-12-03 1:53 ` Wu Fengguang
0 siblings, 1 reply; 61+ messages in thread
From: Andi Kleen @ 2009-12-02 22:52 UTC (permalink / raw)
To: Paul Menage
Cc: Wu Fengguang, Andi Kleen, Andrew Morton, Balbir Singh,
KAMEZAWA Hiroyuki, Li Zefan, Nick Piggin, linux-mm, LKML
On Wed, Dec 02, 2009 at 02:48:03PM -0800, Paul Menage wrote:
> On Tue, Dec 1, 2009 at 7:12 PM, Wu Fengguang <fengguang.wu@intel.com> wrote:
> > --- linux-mm.orig/include/linux/cgroup.h 2009-11-02 10:18:41.000000000 +0800
> > +++ linux-mm/include/linux/cgroup.h 2009-11-02 10:26:22.000000000 +0800
> > @@ -581,6 +581,9 @@ static inline int cgroupstats_build(stru
> > return -EINVAL;
> > }
> >
> > +struct cgroup_subsys_state;
> > +static inline void css_put(struct cgroup_subsys_state *css) {}
> > +
> > #endif /* !CONFIG_CGROUPS */
>
> This doesn't sound like the right thing to do - if !CONFIG_CGROUPS,
> then the code shouldn't be able to see a css structure to pass to this
> function.
I agree. The high level code should be ifdefed.
-Andi
--
ak@linux.intel.com -- Speaking for myself only.
* Re: [PATCH 21/24] cgroup: define empty css_put() when !CONFIG_CGROUPS
2009-12-02 22:52 ` Andi Kleen
@ 2009-12-03 1:53 ` Wu Fengguang
0 siblings, 0 replies; 61+ messages in thread
From: Wu Fengguang @ 2009-12-03 1:53 UTC (permalink / raw)
To: Andi Kleen
Cc: Paul Menage, Andrew Morton, Balbir Singh, KAMEZAWA Hiroyuki,
Li Zefan, Nick Piggin, linux-mm, LKML
On Thu, Dec 03, 2009 at 06:52:43AM +0800, Andi Kleen wrote:
> On Wed, Dec 02, 2009 at 02:48:03PM -0800, Paul Menage wrote:
> > On Tue, Dec 1, 2009 at 7:12 PM, Wu Fengguang <fengguang.wu@intel.com> wrote:
> > > --- linux-mm.orig/include/linux/cgroup.h 2009-11-02 10:18:41.000000000 +0800
> > > +++ linux-mm/include/linux/cgroup.h 2009-11-02 10:26:22.000000000 +0800
> > > @@ -581,6 +581,9 @@ static inline int cgroupstats_build(stru
> > > return -EINVAL;
> > > }
> > >
> > > +struct cgroup_subsys_state;
> > > +static inline void css_put(struct cgroup_subsys_state *css) {}
> > > +
> > > #endif /* !CONFIG_CGROUPS */
> >
> > This doesn't sound like the right thing to do - if !CONFIG_CGROUPS,
> > then the code shouldn't be able to see a css structure to pass to this
> > function.
>
> I agree. The high level code should be ifdefed.
Right. Following your suggestion to ifdef the memcg user
hwpoison_filter_task(), this patch can be dropped.
Thanks,
Fengguang
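The direction agreed on here (ifdef the high-level user rather than
stub out css_put()) could look roughly like the sketch below. It is a
userspace illustration only: CONFIG_MEMCG_DEMO and the page->charged
field are hypothetical stand-ins, not real kernel symbols. The
compiled-out variant still has to return a value, since callers test
the result.

```c
/* Hypothetical stand-in for struct page; only what the sketch needs. */
struct page { int charged; };

#ifdef CONFIG_MEMCG_DEMO	/* made-up option, stands in for the memcg config */
/* Real filter: reject pages that are not charged to any memcg. */
static int hwpoison_filter_task(struct page *p)
{
	return p->charged ? 0 : -1;
}
#else
/* memcg compiled out: nothing to filter on, so accept every page. */
static int hwpoison_filter_task(struct page *p)
{
	(void)p;
	return 0;
}
#endif
```

With this shape no css type or css_put() is visible outside the ifdef,
so the lower-level header needs no empty stub.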
* [PATCH 22/24] HWPOISON: add memory cgroup filter
2009-12-02 3:12 [PATCH 00/24] hwpoison fixes and stress testing filters Wu Fengguang
` (20 preceding siblings ...)
2009-12-02 3:12 ` [PATCH 21/24] cgroup: define empty css_put() when !CONFIG_CGROUPS Wu Fengguang
@ 2009-12-02 3:12 ` Wu Fengguang
2009-12-02 12:44 ` Andi Kleen
2009-12-02 3:12 ` [PATCH 23/24] HWPOISON: add an interface to switch off/on all the page filters Wu Fengguang
2009-12-02 3:12 ` [PATCH 24/24] HWPOISON: show corrupted file info Wu Fengguang
23 siblings, 1 reply; 61+ messages in thread
From: Wu Fengguang @ 2009-12-02 3:12 UTC (permalink / raw)
To: Andi Kleen
Cc: Andrew Morton, KOSAKI Motohiro, Hugh Dickins, Daisuke Nishimura,
Balbir Singh, KAMEZAWA Hiroyuki, Li Zefan, Paul Menage,
Nick Piggin, Wu Fengguang, linux-mm, LKML
[-- Attachment #1: hwpoison-filter-memcg.patch --]
[-- Type: text/plain, Size: 4340 bytes --]
The hwpoison test suite needs to inject hwpoison into a collection of
selected task pages, and must not touch pages not owned by them, which
would kill important system processes such as init. (But it's OK to
mis-hwpoison free/unowned pages as well as shared clean pages.
Mis-hwpoison of shared dirty pages will kill all tasks mapping them, so
the test suite will target all or none of such tasks in the first place.)

The memory cgroup serves this purpose well. We can put the target
processes under the control of a memory cgroup, and tell the hwpoison
injection code to only kill pages associated with some active memory
cgroup.

The prerequisite for doing hwpoison stress tests with mem_cgroup is
that the mem_cgroup code tracks task pages _accurately_ (unless the
page is locked), which we believe is/should be true.

The benefits are a simpler hwpoison injector, and that the mem_cgroup
code will automatically be tested by hwpoison test cases.

The alternative interfaces pin-pfn/unpin-pfn can also delegate the
(process and page flags) filtering functions reliably to user space.
However, a prototype implementation shows that this scheme adds more
complexity than we wanted.

CC: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
CC: Hugh Dickins <hugh.dickins@tiscali.co.uk>
CC: Daisuke Nishimura <nishimura@mxp.nes.nec.co.jp>
CC: Balbir Singh <balbir@linux.vnet.ibm.com>
CC: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
CC: Li Zefan <lizf@cn.fujitsu.com>
CC: Paul Menage <menage@google.com>
CC: Nick Piggin <npiggin@suse.de>
CC: Andi Kleen <andi@firstfloor.org>
Signed-off-by: Wu Fengguang <fengguang.wu@intel.com>
---
mm/Kconfig | 3 ++-
mm/hwpoison-inject.c | 5 +++++
mm/internal.h | 1 +
mm/memory-failure.c | 24 ++++++++++++++++++++++++
4 files changed, 32 insertions(+), 1 deletion(-)
--- linux-mm.orig/mm/memory-failure.c 2009-12-01 09:56:06.000000000 +0800
+++ linux-mm/mm/memory-failure.c 2009-12-01 09:56:18.000000000 +0800
@@ -53,6 +53,7 @@ u32 hwpoison_filter_dev_major = ~0U;
u32 hwpoison_filter_dev_minor = ~0U;
u64 hwpoison_filter_flags_mask;
u64 hwpoison_filter_flags_value;
+u32 hwpoison_filter_memcg;
static int hwpoison_filter_dev(struct page *p)
{
@@ -96,6 +97,26 @@ static int hwpoison_filter_flags(struct
return -EINVAL;
}
+static int hwpoison_filter_task(struct page *p)
+{
+ struct mem_cgroup *mem;
+ struct cgroup_subsys_state *css;
+
+ if (!hwpoison_filter_memcg)
+ return 0;
+
+ mem = try_get_mem_cgroup_from_page(p);
+ if (!mem)
+ return -EINVAL;
+
+ css = mem_cgroup_css(mem);
+ if (!css)
+ return -EINVAL;
+
+ css_put(css);
+ return 0;
+}
+
int hwpoison_filter(struct page *p)
{
if (hwpoison_filter_dev(p))
@@ -104,6 +125,9 @@ int hwpoison_filter(struct page *p)
if (hwpoison_filter_flags(p))
return -EINVAL;
+ if (hwpoison_filter_task(p))
+ return -EINVAL;
+
return 0;
}
--- linux-mm.orig/mm/internal.h 2009-12-01 09:56:06.000000000 +0800
+++ linux-mm/mm/internal.h 2009-12-01 09:56:18.000000000 +0800
@@ -270,3 +270,4 @@ extern u32 hwpoison_filter_dev_major;
extern u32 hwpoison_filter_dev_minor;
extern u64 hwpoison_filter_flags_mask;
extern u64 hwpoison_filter_flags_value;
+extern u32 hwpoison_filter_memcg;
--- linux-mm.orig/mm/hwpoison-inject.c 2009-12-01 09:56:06.000000000 +0800
+++ linux-mm/mm/hwpoison-inject.c 2009-12-01 09:56:18.000000000 +0800
@@ -95,6 +95,11 @@ static int pfn_inject_init(void)
if (!dentry)
goto fail;
+ dentry = debugfs_create_u32("corrupt-filter-memcg", 0600,
+ hwpoison_dir, &hwpoison_filter_memcg);
+ if (!dentry)
+ goto fail;
+
return 0;
fail:
pfn_inject_exit();
--- linux-mm.orig/mm/Kconfig 2009-11-30 11:08:30.000000000 +0800
+++ linux-mm/mm/Kconfig 2009-12-01 09:56:18.000000000 +0800
@@ -257,8 +257,9 @@ config MEMORY_FAILURE
special hardware support and typically ECC memory.
config HWPOISON_INJECT
- tristate "Poison pages injector"
+ tristate "HWPoison pages injector"
depends on MEMORY_FAILURE && DEBUG_KERNEL
+ depends on CGROUP_MEM_RES_CTLR_SWAP
config NOMMU_INITIAL_TRIM_EXCESS
int "Turn on mmap() excess space trimming before booting"
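From user space the new knob would be flipped by writing a number into
the debugfs file. A minimal sketch, assuming debugfs is mounted at
/sys/kernel/debug and the injector directory is named hwpoison (neither
is shown in this patch); write_knob() is a hypothetical helper, not part
of any existing test suite.

```c
#include <stdio.h>

/*
 * Write an unsigned value to a debugfs-style knob file.
 * Returns 0 on success, -1 if the file cannot be opened or written.
 */
static int write_knob(const char *path, unsigned int value)
{
	FILE *f = fopen(path, "w");
	int ok;

	if (!f)
		return -1;
	ok = fprintf(f, "%u\n", value) > 0;
	if (fclose(f) != 0)
		ok = 0;
	return ok ? 0 : -1;
}
```

For example, write_knob("/sys/kernel/debug/hwpoison/corrupt-filter-memcg", 1)
before injecting, and the same call with 0 to switch the filter off
again (the patch treats a zero value as "filter disabled").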
* Re: [PATCH 22/24] HWPOISON: add memory cgroup filter
2009-12-02 3:12 ` [PATCH 22/24] HWPOISON: add memory cgroup filter Wu Fengguang
@ 2009-12-02 12:44 ` Andi Kleen
2009-12-02 12:58 ` Wu Fengguang
0 siblings, 1 reply; 61+ messages in thread
From: Andi Kleen @ 2009-12-02 12:44 UTC (permalink / raw)
To: Wu Fengguang
Cc: Andi Kleen, Andrew Morton, KOSAKI Motohiro, Hugh Dickins,
Daisuke Nishimura, Balbir Singh, KAMEZAWA Hiroyuki, Li Zefan,
Paul Menage, Nick Piggin, linux-mm, LKML
>
> +static int hwpoison_filter_task(struct page *p)
> +{
Can we make that an ifdef instead of a "depends on"?
-Andi
> config HWPOISON_INJECT
> - tristate "Poison pages injector"
> + tristate "HWPoison pages injector"
> depends on MEMORY_FAILURE && DEBUG_KERNEL
> + depends on CGROUP_MEM_RES_CTLR_SWAP
--
ak@linux.intel.com -- Speaking for myself only.
* Re: [PATCH 22/24] HWPOISON: add memory cgroup filter
2009-12-02 12:44 ` Andi Kleen
@ 2009-12-02 12:58 ` Wu Fengguang
2009-12-03 1:52 ` KAMEZAWA Hiroyuki
2009-12-03 2:15 ` Li Zefan
0 siblings, 2 replies; 61+ messages in thread
From: Wu Fengguang @ 2009-12-02 12:58 UTC (permalink / raw)
To: Andi Kleen
Cc: Andrew Morton, KOSAKI Motohiro, Hugh Dickins, Daisuke Nishimura,
Balbir Singh, KAMEZAWA Hiroyuki, Li Zefan, Paul Menage,
Nick Piggin, linux-mm, LKML
On Wed, Dec 02, 2009 at 08:44:46PM +0800, Andi Kleen wrote:
> >
> > +static int hwpoison_filter_task(struct page *p)
> > +{
>
> Can we make that ifdef instead of depends on ?
Sure. Here is the updated patch.
---
HWPOISON: add memory cgroup filter
The hwpoison test suite needs to inject hwpoison into a collection of
selected task pages, and must not touch pages not owned by them, which
would kill important system processes such as init. (But it's OK to
mis-hwpoison free/unowned pages as well as shared clean pages.
Mis-hwpoison of shared dirty pages will kill all tasks mapping them, so
the test suite will target all or none of such tasks in the first place.)

The memory cgroup serves this purpose well. We can put the target
processes under the control of a memory cgroup, and tell the hwpoison
injection code to only kill pages associated with some active memory
cgroup.

The prerequisite for doing hwpoison stress tests with mem_cgroup is
that the mem_cgroup code tracks task pages _accurately_ (unless the
page is locked), which we believe is/should be true.

The benefits are a simpler hwpoison injector, and that the mem_cgroup
code will automatically be tested by hwpoison test cases.

The alternative interfaces pin-pfn/unpin-pfn can also delegate the
(process and page flags) filtering functions reliably to user space.
However, a prototype implementation shows that this scheme adds more
complexity than we wanted.

CC: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
CC: Hugh Dickins <hugh.dickins@tiscali.co.uk>
CC: Daisuke Nishimura <nishimura@mxp.nes.nec.co.jp>
CC: Balbir Singh <balbir@linux.vnet.ibm.com>
CC: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
CC: Li Zefan <lizf@cn.fujitsu.com>
CC: Paul Menage <menage@google.com>
CC: Nick Piggin <npiggin@suse.de>
CC: Andi Kleen <andi@firstfloor.org>
Signed-off-by: Wu Fengguang <fengguang.wu@intel.com>
---
mm/Kconfig | 2 +-
mm/hwpoison-inject.c | 7 +++++++
mm/internal.h | 1 +
mm/memory-failure.c | 28 ++++++++++++++++++++++++++++
4 files changed, 37 insertions(+), 1 deletion(-)
--- linux-mm.orig/mm/memory-failure.c 2009-12-01 09:56:06.000000000 +0800
+++ linux-mm/mm/memory-failure.c 2009-12-02 20:56:55.000000000 +0800
@@ -96,6 +96,31 @@ static int hwpoison_filter_flags(struct
return -EINVAL;
}
+#ifdef CONFIG_CGROUP_MEM_RES_CTLR_SWAP
+u32 hwpoison_filter_memcg;
+static int hwpoison_filter_task(struct page *p)
+{
+ struct mem_cgroup *mem;
+ struct cgroup_subsys_state *css;
+
+ if (!hwpoison_filter_memcg)
+ return 0;
+
+ mem = try_get_mem_cgroup_from_page(p);
+ if (!mem)
+ return -EINVAL;
+
+ css = mem_cgroup_css(mem);
+ if (!css)
+ return -EINVAL;
+
+ css_put(css);
+ return 0;
+}
+#else
+static int hwpoison_filter_task(struct page *p) { return 0; }
+#endif
+
int hwpoison_filter(struct page *p)
{
if (hwpoison_filter_dev(p))
@@ -104,6 +129,9 @@ int hwpoison_filter(struct page *p)
if (hwpoison_filter_flags(p))
return -EINVAL;
+ if (hwpoison_filter_task(p))
+ return -EINVAL;
+
return 0;
}
--- linux-mm.orig/mm/internal.h 2009-12-01 09:56:06.000000000 +0800
+++ linux-mm/mm/internal.h 2009-12-02 20:54:53.000000000 +0800
@@ -270,3 +270,4 @@ extern u32 hwpoison_filter_dev_major;
extern u32 hwpoison_filter_dev_minor;
extern u64 hwpoison_filter_flags_mask;
extern u64 hwpoison_filter_flags_value;
+extern u32 hwpoison_filter_memcg;
--- linux-mm.orig/mm/hwpoison-inject.c 2009-12-01 09:56:06.000000000 +0800
+++ linux-mm/mm/hwpoison-inject.c 2009-12-02 20:55:49.000000000 +0800
@@ -95,6 +95,13 @@ static int pfn_inject_init(void)
if (!dentry)
goto fail;
+#ifdef CONFIG_CGROUP_MEM_RES_CTLR_SWAP
+ dentry = debugfs_create_u32("corrupt-filter-memcg", 0600,
+ hwpoison_dir, &hwpoison_filter_memcg);
+ if (!dentry)
+ goto fail;
+#endif
+
return 0;
fail:
pfn_inject_exit();
--- linux-mm.orig/mm/Kconfig 2009-11-30 11:08:30.000000000 +0800
+++ linux-mm/mm/Kconfig 2009-12-02 20:55:15.000000000 +0800
@@ -257,7 +257,7 @@ config MEMORY_FAILURE
special hardware support and typically ECC memory.
config HWPOISON_INJECT
- tristate "Poison pages injector"
+ tristate "HWPoison pages injector"
depends on MEMORY_FAILURE && DEBUG_KERNEL
config NOMMU_INITIAL_TRIM_EXCESS
* Re: [PATCH 22/24] HWPOISON: add memory cgroup filter
2009-12-02 12:58 ` Wu Fengguang
@ 2009-12-03 1:52 ` KAMEZAWA Hiroyuki
2009-12-03 2:19 ` Wu Fengguang
2009-12-03 2:15 ` Li Zefan
1 sibling, 1 reply; 61+ messages in thread
From: KAMEZAWA Hiroyuki @ 2009-12-03 1:52 UTC (permalink / raw)
To: Wu Fengguang
Cc: Andi Kleen, Andrew Morton, KOSAKI Motohiro, Hugh Dickins,
Daisuke Nishimura, Balbir Singh, Li Zefan, Paul Menage,
Nick Piggin, linux-mm, LKML
On Wed, 2 Dec 2009 20:58:42 +0800
Wu Fengguang <fengguang.wu@intel.com> wrote:
> On Wed, Dec 02, 2009 at 08:44:46PM +0800, Andi Kleen wrote:
> > >
> > > +static int hwpoison_filter_task(struct page *p)
> > > +{
> >
> > Can we make that ifdef instead of depends on ?
>
> Sure. Here is the updated patch.
>
> ---
> HWPOISON: add memory cgroup filter
>
> The hwpoison test suite needs to inject hwpoison into a collection of
> selected task pages, and must not touch pages not owned by them, which
> would kill important system processes such as init. (But it's OK to
> mis-hwpoison free/unowned pages as well as shared clean pages.
> Mis-hwpoison of shared dirty pages will kill all tasks mapping them, so
> the test suite will target all or none of such tasks in the first place.)
>
> The memory cgroup serves this purpose well. We can put the target
> processes under the control of a memory cgroup, and tell the hwpoison
> injection code to only kill pages associated with some active memory
> cgroup.
>
> The prerequisite for doing hwpoison stress tests with mem_cgroup is
> that the mem_cgroup code tracks task pages _accurately_ (unless the
> page is locked), which we believe is/should be true.
>
> The benefits are a simpler hwpoison injector, and that the mem_cgroup
> code will automatically be tested by hwpoison test cases.
>
> The alternative interfaces pin-pfn/unpin-pfn can also delegate the
> (process and page flags) filtering functions reliably to user space.
> However, a prototype implementation shows that this scheme adds more
> complexity than we wanted.
>
> CC: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
> CC: Hugh Dickins <hugh.dickins@tiscali.co.uk>
> CC: Daisuke Nishimura <nishimura@mxp.nes.nec.co.jp>
> CC: Balbir Singh <balbir@linux.vnet.ibm.com>
> CC: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
> CC: Li Zefan <lizf@cn.fujitsu.com>
> CC: Paul Menage <menage@google.com>
> CC: Nick Piggin <npiggin@suse.de>
> CC: Andi Kleen <andi@firstfloor.org>
> Signed-off-by: Wu Fengguang <fengguang.wu@intel.com>
> ---
> mm/Kconfig | 2 +-
> mm/hwpoison-inject.c | 7 +++++++
> mm/internal.h | 1 +
> mm/memory-failure.c | 28 ++++++++++++++++++++++++++++
> 4 files changed, 37 insertions(+), 1 deletion(-)
>
> --- linux-mm.orig/mm/memory-failure.c 2009-12-01 09:56:06.000000000 +0800
> +++ linux-mm/mm/memory-failure.c 2009-12-02 20:56:55.000000000 +0800
> @@ -96,6 +96,31 @@ static int hwpoison_filter_flags(struct
> return -EINVAL;
> }
>
> +#ifdef CONFIG_CGROUP_MEM_RES_CTLR_SWAP
> +u32 hwpoison_filter_memcg;
> +static int hwpoison_filter_task(struct page *p)
> +{
> + struct mem_cgroup *mem;
> + struct cgroup_subsys_state *css;
> +
> + if (!hwpoison_filter_memcg)
> + return 0;
> +
> + mem = try_get_mem_cgroup_from_page(p);
> + if (!mem)
> + return -EINVAL;
> +
> + css = mem_cgroup_css(mem);
> + if (!css)
> + return -EINVAL;
> +
> + css_put(css);
> + return 0;
> +}
Hmm.. can you add a comment? What is this function for?
Is this more meaningful than PageLRU(page) etc.?
Thanks,
-Kame
* Re: [PATCH 22/24] HWPOISON: add memory cgroup filter
2009-12-03 1:52 ` KAMEZAWA Hiroyuki
@ 2009-12-03 2:19 ` Wu Fengguang
2009-12-03 2:28 ` KAMEZAWA Hiroyuki
0 siblings, 1 reply; 61+ messages in thread
From: Wu Fengguang @ 2009-12-03 2:19 UTC (permalink / raw)
To: KAMEZAWA Hiroyuki
Cc: Andi Kleen, Andrew Morton, KOSAKI Motohiro, Hugh Dickins,
Daisuke Nishimura, Balbir Singh, Li Zefan, Paul Menage,
Nick Piggin, linux-mm, LKML
On Thu, Dec 03, 2009 at 09:52:29AM +0800, KAMEZAWA Hiroyuki wrote:
> On Wed, 2 Dec 2009 20:58:42 +0800
> Wu Fengguang <fengguang.wu@intel.com> wrote:
>
> > On Wed, Dec 02, 2009 at 08:44:46PM +0800, Andi Kleen wrote:
> > > >
> > > > +static int hwpoison_filter_task(struct page *p)
> > > > +{
> > >
> > > Can we make that ifdef instead of depends on ?
> >
> > Sure. Here is the updated patch.
> >
> > ---
> > HWPOISON: add memory cgroup filter
> >
> > The hwpoison test suite needs to inject hwpoison into a collection of
> > selected task pages, and must not touch pages not owned by them, which
> > would kill important system processes such as init. (But it's OK to
> > mis-hwpoison free/unowned pages as well as shared clean pages.
> > Mis-hwpoison of shared dirty pages will kill all tasks mapping them, so
> > the test suite will target all or none of such tasks in the first place.)
> >
> > The memory cgroup serves this purpose well. We can put the target
> > processes under the control of a memory cgroup, and tell the hwpoison
> > injection code to only kill pages associated with some active memory
> > cgroup.
> >
> > The prerequisite for doing hwpoison stress tests with mem_cgroup is
> > that the mem_cgroup code tracks task pages _accurately_ (unless the
> > page is locked), which we believe is/should be true.
> >
> > The benefits are a simpler hwpoison injector, and that the mem_cgroup
> > code will automatically be tested by hwpoison test cases.
> >
> > The alternative interfaces pin-pfn/unpin-pfn can also delegate the
> > (process and page flags) filtering functions reliably to user space.
> > However, a prototype implementation shows that this scheme adds more
> > complexity than we wanted.
> >
> > CC: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
> > CC: Hugh Dickins <hugh.dickins@tiscali.co.uk>
> > CC: Daisuke Nishimura <nishimura@mxp.nes.nec.co.jp>
> > CC: Balbir Singh <balbir@linux.vnet.ibm.com>
> > CC: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
> > CC: Li Zefan <lizf@cn.fujitsu.com>
> > CC: Paul Menage <menage@google.com>
> > CC: Nick Piggin <npiggin@suse.de>
> > CC: Andi Kleen <andi@firstfloor.org>
> > Signed-off-by: Wu Fengguang <fengguang.wu@intel.com>
> > ---
> > mm/Kconfig | 2 +-
> > mm/hwpoison-inject.c | 7 +++++++
> > mm/internal.h | 1 +
> > mm/memory-failure.c | 28 ++++++++++++++++++++++++++++
> > 4 files changed, 37 insertions(+), 1 deletion(-)
> >
> > --- linux-mm.orig/mm/memory-failure.c 2009-12-01 09:56:06.000000000 +0800
> > +++ linux-mm/mm/memory-failure.c 2009-12-02 20:56:55.000000000 +0800
> > @@ -96,6 +96,31 @@ static int hwpoison_filter_flags(struct
> > return -EINVAL;
> > }
> >
> > +#ifdef CONFIG_CGROUP_MEM_RES_CTLR_SWAP
> > +u32 hwpoison_filter_memcg;
> > +static int hwpoison_filter_task(struct page *p)
> > +{
> > + struct mem_cgroup *mem;
> > + struct cgroup_subsys_state *css;
> > +
> > + if (!hwpoison_filter_memcg)
> > + return 0;
> > +
> > + mem = try_get_mem_cgroup_from_page(p);
> > + if (!mem)
> > + return -EINVAL;
> > +
> > + css = mem_cgroup_css(mem);
> > + if (!css)
> > + return -EINVAL;
>
> > +
> > + css_put(css);
> > + return 0;
> > +}
>
>
> Hmm.. can you add a comment? What is this function for?
Good idea. How about this one?
/*
* This allows stress tests to limit test scope to a collection of tasks
* by putting them under some memcg. This prevents killing unrelated/important
* processes such as /sbin/init. Note that the target task may share clean
* pages with init (eg. libc text), which is harmless. If the target task
* shares _dirty_ pages with another task B, the test scheme must make sure B
* is also included in the memcg. Finally, due to race conditions this filter
* can only guarantee that the page either belongs to the memcg tasks, or is
* a freed page.
*/
> Is this more meaningful than PageLRU(page) etc..?
It's mainly for stress testing (randomly killing pages of many tasks
until all of them get killed, and seeing whether that impacts the health
of the whole system).
It could be used in combination with the page flags filter to do more
targeted tests.
A task may map some non-LRU page - eg. the vdso page.
Thanks,
Fengguang
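How the memcg filter composes with the page flags filter can be
modelled in userspace as below. This is a sketch under assumptions, not
the kernel implementation: the struct page fields are made up, and the
mask/value comparison in filter_flags() is an assumed reading of
hwpoison_filter_flags(), whose body is not shown in this thread.

```c
#include <errno.h>
#include <stddef.h>

/* Userspace stand-ins; field names are hypothetical. */
struct mem_cgroup { int refcnt; };
struct page {
	unsigned long flags;		/* stand-in for the page flag bits */
	struct mem_cgroup *memcg;	/* NULL when not charged to any memcg */
};

/* Knobs, modelled after the debugfs variables in the patch. */
static unsigned long filter_flags_mask;
static unsigned long filter_flags_value;
static unsigned int filter_memcg;

/* Assumed semantics: a zero mask disables the flags filter. */
static int filter_flags(const struct page *p)
{
	if (!filter_flags_mask)
		return 0;
	if ((p->flags & filter_flags_mask) == filter_flags_value)
		return 0;
	return -EINVAL;
}

/* As in the patch: disabled when the knob is 0, else require a memcg. */
static int filter_task(const struct page *p)
{
	if (!filter_memcg)
		return 0;
	return p->memcg ? 0 : -EINVAL;
}

/* A page is injected only if every enabled filter accepts it. */
static int hwpoison_filter(const struct page *p)
{
	if (filter_flags(p))
		return -EINVAL;
	if (filter_task(p))
		return -EINVAL;
	return 0;
}
```

Each filter is a no-op until its knob is set, so enabling the memcg
filter alone gives the broad stress test, and adding a flags mask on
top narrows it to pages of a given type inside the same set of tasks.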
* Re: [PATCH 22/24] HWPOISON: add memory cgroup filter
2009-12-03 2:19 ` Wu Fengguang
@ 2009-12-03 2:28 ` KAMEZAWA Hiroyuki
2009-12-03 2:47 ` Wu Fengguang
0 siblings, 1 reply; 61+ messages in thread
From: KAMEZAWA Hiroyuki @ 2009-12-03 2:28 UTC (permalink / raw)
To: Wu Fengguang
Cc: Andi Kleen, Andrew Morton, KOSAKI Motohiro, Hugh Dickins,
Daisuke Nishimura, Balbir Singh, Li Zefan, Paul Menage,
Nick Piggin, linux-mm, LKML
On Thu, 3 Dec 2009 10:19:15 +0800
Wu Fengguang <fengguang.wu@intel.com> wrote:
> On Thu, Dec 03, 2009 at 09:52:29AM +0800, KAMEZAWA Hiroyuki wrote:
> > On Wed, 2 Dec 2009 20:58:42 +0800
> > Wu Fengguang <fengguang.wu@intel.com> wrote:
> >
> > > On Wed, Dec 02, 2009 at 08:44:46PM +0800, Andi Kleen wrote:
> > > > >
> > > > > +static int hwpoison_filter_task(struct page *p)
> > > > > +{
> > > >
> > > > Can we make that ifdef instead of depends on ?
> > >
> > > Sure. Here is the updated patch.
> > >
> > > ---
> > > HWPOISON: add memory cgroup filter
> > >
> > > The hwpoison test suite needs to inject hwpoison into a collection of
> > > selected task pages, and must not touch pages not owned by them, which
> > > would kill important system processes such as init. (But it's OK to
> > > mis-hwpoison free/unowned pages as well as shared clean pages.
> > > Mis-hwpoison of shared dirty pages will kill all tasks mapping them, so
> > > the test suite will target all or none of such tasks in the first place.)
> > >
> > > The memory cgroup serves this purpose well. We can put the target
> > > processes under the control of a memory cgroup, and tell the hwpoison
> > > injection code to only kill pages associated with some active memory
> > > cgroup.
> > >
> > > The prerequisite for doing hwpoison stress tests with mem_cgroup is
> > > that the mem_cgroup code tracks task pages _accurately_ (unless the
> > > page is locked), which we believe is/should be true.
> > >
> > > The benefits are a simpler hwpoison injector, and that the mem_cgroup
> > > code will automatically be tested by hwpoison test cases.
> > >
> > > The alternative interfaces pin-pfn/unpin-pfn can also delegate the
> > > (process and page flags) filtering functions reliably to user space.
> > > However, a prototype implementation shows that this scheme adds more
> > > complexity than we wanted.
> > >
> > > CC: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
> > > CC: Hugh Dickins <hugh.dickins@tiscali.co.uk>
> > > CC: Daisuke Nishimura <nishimura@mxp.nes.nec.co.jp>
> > > CC: Balbir Singh <balbir@linux.vnet.ibm.com>
> > > CC: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
> > > CC: Li Zefan <lizf@cn.fujitsu.com>
> > > CC: Paul Menage <menage@google.com>
> > > CC: Nick Piggin <npiggin@suse.de>
> > > CC: Andi Kleen <andi@firstfloor.org>
> > > Signed-off-by: Wu Fengguang <fengguang.wu@intel.com>
> > > ---
> > > mm/Kconfig | 2 +-
> > > mm/hwpoison-inject.c | 7 +++++++
> > > mm/internal.h | 1 +
> > > mm/memory-failure.c | 28 ++++++++++++++++++++++++++++
> > > 4 files changed, 37 insertions(+), 1 deletion(-)
> > >
> > > --- linux-mm.orig/mm/memory-failure.c 2009-12-01 09:56:06.000000000 +0800
> > > +++ linux-mm/mm/memory-failure.c 2009-12-02 20:56:55.000000000 +0800
> > > @@ -96,6 +96,31 @@ static int hwpoison_filter_flags(struct
> > > return -EINVAL;
> > > }
> > >
> > > +#ifdef CONFIG_CGROUP_MEM_RES_CTLR_SWAP
> > > +u32 hwpoison_filter_memcg;
> > > +static int hwpoison_filter_task(struct page *p)
> > > +{
> > > + struct mem_cgroup *mem;
> > > + struct cgroup_subsys_state *css;
> > > +
> > > + if (!hwpoison_filter_memcg)
> > > + return 0;
> > > +
> > > + mem = try_get_mem_cgroup_from_page(p);
> > > + if (!mem)
> > > + return -EINVAL;
> > > +
> > > + css = mem_cgroup_css(mem);
> > > + if (!css)
> > > + return -EINVAL;
> >
> > > +
> > > + css_put(css);
> > > + return 0;
> > > +}
> >
> >
> > Hmm..can you adds comment ? What does this function is for ?
>
> Good idea. How about this one?
>
> /*
> * This allows stress tests to limit test scope to a collection of tasks
> * by putting them under some memcg. This prevents killing unrelated/important
> * processes such as /sbin/init. Note that the target task may share clean
> * pages with init (eg. libc text), which is harmless. If the target task
> * share _dirty_ pages with another task B, the test scheme must make sure B
> * is also included in the memcg. At last, due to race conditions this filter
> * can only guarantee that the page either belongs to the memcg tasks, or is
> * a freed page.
> */
>
Hmm, seems good, but by what means is "avoiding killing /sbin/init" achieved?
All processes are under some memcg.
If you have more patches that make use of the function above,
I recommend posting this together with some real-use patches, step by step.
Patches 19 and 20 are OK with me.
Thanks,
-Kame
* Re: [PATCH 22/24] HWPOISON: add memory cgroup filter
2009-12-03 2:28 ` KAMEZAWA Hiroyuki
@ 2009-12-03 2:47 ` Wu Fengguang
2009-12-03 2:58 ` KAMEZAWA Hiroyuki
0 siblings, 1 reply; 61+ messages in thread
From: Wu Fengguang @ 2009-12-03 2:47 UTC (permalink / raw)
To: KAMEZAWA Hiroyuki
Cc: Andi Kleen, Andrew Morton, KOSAKI Motohiro, Hugh Dickins,
Daisuke Nishimura, Balbir Singh, Li Zefan, Paul Menage,
Nick Piggin, linux-mm, LKML
On Thu, Dec 03, 2009 at 10:28:22AM +0800, KAMEZAWA Hiroyuki wrote:
> On Thu, 3 Dec 2009 10:19:15 +0800
> Wu Fengguang <fengguang.wu@intel.com> wrote:
>
> > On Thu, Dec 03, 2009 at 09:52:29AM +0800, KAMEZAWA Hiroyuki wrote:
> > > On Wed, 2 Dec 2009 20:58:42 +0800
> > > Wu Fengguang <fengguang.wu@intel.com> wrote:
> > >
> > > > On Wed, Dec 02, 2009 at 08:44:46PM +0800, Andi Kleen wrote:
> > > > > >
> > > > > > +static int hwpoison_filter_task(struct page *p)
> > > > > > +{
> > > > >
> > > > > Can we make that ifdef instead of depends on ?
> > > >
> > > > Sure. Here is the updated patch.
> > > >
> > > > ---
> > > > HWPOISON: add memory cgroup filter
> > > >
> > > > The hwpoison test suite need to inject hwpoison to a collection of
> > > > selected task pages, and must not touch pages not owned by them and
> > > > thus kill important system processes such as init. (But it's OK to
> > > > mis-hwpoison free/unowned pages as well as shared clean pages.
> > > > Mis-hwpoison of shared dirty pages will kill all tasks, so the test
> > > > suite will target all or none of such tasks in the first place.)
> > > >
> > > > The memory cgroup serves this purpose well. We can put the target
> > > > processes under the control of a memory cgroup, and tell the hwpoison
> > > > injection code to only kill pages associated with some active memory
> > > > cgroup.
> > > >
> > > > The prerequisite for doing hwpoison stress tests with mem_cgroup is,
> > > > the mem_cgroup code tracks task pages _accurately_ (unless page is
> > > > locked). Which we believe is/should be true.
> > > >
> > > > The benefits are simplification of hwpoison injector code. Also the
> > > > mem_cgroup code will automatically be tested by hwpoison test cases.
> > > >
> > > > The alternative interfaces pin-pfn/unpin-pfn can also delegate the
> > > > (process and page flags) filtering functions reliably to user space.
> > > > However prototype implementation shows that this scheme adds more
> > > > complexity than we wanted.
> > > >
> > > > CC: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
> > > > CC: Hugh Dickins <hugh.dickins@tiscali.co.uk>
> > > > CC: Daisuke Nishimura <nishimura@mxp.nes.nec.co.jp>
> > > > CC: Balbir Singh <balbir@linux.vnet.ibm.com>
> > > > CC: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
> > > > CC: Li Zefan <lizf@cn.fujitsu.com>
> > > > CC: Paul Menage <menage@google.com>
> > > > CC: Nick Piggin <npiggin@suse.de>
> > > > CC: Andi Kleen <andi@firstfloor.org>
> > > > Signed-off-by: Wu Fengguang <fengguang.wu@intel.com>
> > > > ---
> > > > mm/Kconfig | 2 +-
> > > > mm/hwpoison-inject.c | 7 +++++++
> > > > mm/internal.h | 1 +
> > > > mm/memory-failure.c | 28 ++++++++++++++++++++++++++++
> > > > 4 files changed, 37 insertions(+), 1 deletion(-)
> > > >
> > > > --- linux-mm.orig/mm/memory-failure.c 2009-12-01 09:56:06.000000000 +0800
> > > > +++ linux-mm/mm/memory-failure.c 2009-12-02 20:56:55.000000000 +0800
> > > > @@ -96,6 +96,31 @@ static int hwpoison_filter_flags(struct
> > > > return -EINVAL;
> > > > }
> > > >
> > > > +#ifdef CONFIG_CGROUP_MEM_RES_CTLR_SWAP
> > > > +u32 hwpoison_filter_memcg;
> > > > +static int hwpoison_filter_task(struct page *p)
> > > > +{
> > > > + struct mem_cgroup *mem;
> > > > + struct cgroup_subsys_state *css;
> > > > +
> > > > + if (!hwpoison_filter_memcg)
> > > > + return 0;
> > > > +
> > > > + mem = try_get_mem_cgroup_from_page(p);
> > > > + if (!mem)
> > > > + return -EINVAL;
> > > > +
> > > > + css = mem_cgroup_css(mem);
> > > > + if (!css)
> > > > + return -EINVAL;
> > >
> > > > +
> > > > + css_put(css);
> > > > + return 0;
> > > > +}
> > >
> > >
> > > Hmm..can you adds comment ? What does this function is for ?
> >
> > Good idea. How about this one?
> >
> > /*
> > * This allows stress tests to limit test scope to a collection of tasks
> > * by putting them under some memcg. This prevents killing unrelated/important
> > * processes such as /sbin/init. Note that the target task may share clean
> > * pages with init (eg. libc text), which is harmless. If the target task
> > * share _dirty_ pages with another task B, the test scheme must make sure B
> > * is also included in the memcg. At last, due to race conditions this filter
> > * can only guarantee that the page either belongs to the memcg tasks, or is
> > * a freed page.
> > */
> >
> Hmm. seems good but..by what means "avoiding killing /sbin/init" is done ?
> All process are under some memcg..
Ah, please forgive my memcg ignorance. Then how about bringing back the
old css_id() based scheme (old patch follows)?
> If you have more patches to be usable the function above,
> I recommend you to post this with some real-use patches, in step by step.
Do you mean a user space test case? Here is a simple one:

#!/bin/sh

TEST_PROG=usemem
TEST_PARM="-m 100 -s 100"

test -d /cgroup/hwpoison && rmdir /cgroup/hwpoison
mkdir /cgroup/hwpoison

$TEST_PROG $TEST_PARM &
echo `pidof $TEST_PROG` > /cgroup/hwpoison/tasks

memcg_id=$(cat /cgroup/hwpoison/memory.id)
echo $memcg_id > /debug/hwpoison/corrupt-filter-memcg

./corrupt-all-pfn

> patch 19,20 is ok for me.
Thanks,
Fengguang
---
memcg: show memory.id in cgroupfs
The hwpoison test suite needs to selectively inject hwpoison into some
targeted task pages, and must not kill important system processes
such as init.
The memory cgroup serves this purpose well. We can put the target
processes under the control of a memory cgroup, tell the hwpoison
injection code the id of that memory cgroup so that it will only
poison pages associated with it.
Signed-off-by: Wu Fengguang <fengguang.wu@intel.com>
---
mm/memcontrol.c | 13 +++++++++++++
1 file changed, 13 insertions(+)
--- linux-mm.orig/mm/memcontrol.c 2009-09-07 16:01:02.000000000 +0800
+++ linux-mm/mm/memcontrol.c 2009-09-11 18:20:55.000000000 +0800
@@ -2510,6 +2510,13 @@ mem_cgroup_get_recursive_idx_stat(struct
*val = d.val;
}
+#ifdef CONFIG_HWPOISON_INJECT
+static u64 mem_cgroup_id_read(struct cgroup *cont, struct cftype *cft)
+{
+ return css_id(cgroup_subsys_state(cont, mem_cgroup_subsys_id));
+}
+#endif
+
static u64 mem_cgroup_read(struct cgroup *cont, struct cftype *cft)
{
struct mem_cgroup *mem = mem_cgroup_from_cont(cont);
@@ -2841,6 +2848,12 @@ static int mem_cgroup_swappiness_write(s
static struct cftype mem_cgroup_files[] = {
+#ifdef CONFIG_HWPOISON_INJECT /* for now, only user is hwpoison testing */
+ {
+ .name = "id",
+ .read_u64 = mem_cgroup_id_read,
+ },
+#endif
{
.name = "usage_in_bytes",
.private = MEMFILE_PRIVATE(_MEM, RES_USAGE),
* Re: [PATCH 22/24] HWPOISON: add memory cgroup filter
2009-12-03 2:47 ` Wu Fengguang
@ 2009-12-03 2:58 ` KAMEZAWA Hiroyuki
2009-12-03 15:03 ` Wu Fengguang
0 siblings, 1 reply; 61+ messages in thread
From: KAMEZAWA Hiroyuki @ 2009-12-03 2:58 UTC (permalink / raw)
To: Wu Fengguang
Cc: Andi Kleen, Andrew Morton, KOSAKI Motohiro, Hugh Dickins,
Daisuke Nishimura, Balbir Singh, Li Zefan, Paul Menage,
Nick Piggin, linux-mm, LKML
On Thu, 3 Dec 2009 10:47:39 +0800
Wu Fengguang <fengguang.wu@intel.com> wrote:
> On Thu, Dec 03, 2009 at 10:28:22AM +0800, KAMEZAWA Hiroyuki wrote:
> Ah please forgive my memcg ignorance.. Then how about bring back the
> old css_id() based scheme (old patch follows)?
>
Maybe that is enough, but please take care of the fact that a css ID can be
"reused" once freed.
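
The reuse hazard can be illustrated with a small userspace model. This is only a sketch: the lowest-free-slot allocator and every name in it (`model_id_alloc`, `model_id_free`, `MODEL_MAX_IDS`) are assumptions made for illustration and are not the kernel's css ID code.

```c
#include <stdbool.h>

#define MODEL_MAX_IDS 8	/* hypothetical limit, for the model only */

/* Index 0 is never handed out, so 0 can mean "no filter configured". */
static bool id_in_use[MODEL_MAX_IDS + 1];

/* Hand out the lowest free ID, or 0 if none is left. */
static int model_id_alloc(void)
{
	for (int id = 1; id <= MODEL_MAX_IDS; id++) {
		if (!id_in_use[id]) {
			id_in_use[id] = true;
			return id;
		}
	}
	return 0;
}

/* Release an ID so that a later allocation may return it again. */
static void model_id_free(int id)
{
	if (id >= 1 && id <= MODEL_MAX_IDS)
		id_in_use[id] = false;
}
```

In this model, a filter value saved before `model_id_free()` would silently match whichever group receives the same number next, so a test script would need to write the filter value only after its cgroup is created and clear it before the cgroup is removed.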
> > If you have more patches to be usable the function above,
> > I recommend you to post this with some real-use patches, in step by step.
>
> Do you mean user space test case? Here is a simple one:
>
> #!/bin/sh
>
> TEST_PROG=usemem
> TEST_PARM="-m 100 -s 100"
>
> test -d /cgroup/hwpoison && rmdir /cgroup/hwpoison
> mkdir /cgroup/hwpoison
>
> $TEST_PROG $TEST_PARM &
> echo `pidof $TEST_PROG` > /cgroup/hwpoison/tasks
>
> memcg_id=$(</cgroup/hwpoison/memory.id)
> echo $memcg_id > /debug/hwpoison/corrupt-filter-memcg
>
> ./corrupt-all-pfn
>
Ah, this would be nice to put into the changelog or some documentation.
> > patch 19,20 is ok for me.
>
> Thanks,
> Fengguang
> ---
> memcg: show memory.id in cgroupfs
>
> The hwpoison test suite need to selectively inject hwpoison to some
> targeted task pages, and must not kill important system processes
> such as init.
>
> The memory cgroup serves this purpose well. We can put the target
> processes under the control of a memory cgroup, tell the hwpoison
> injection code the id of that memory cgroup so that it will only
> poison pages associated with it.
>
> Signed-off-by: Wu Fengguang <fengguang.wu@intel.com>
No objections from me. Please use the "id" check, or add a new flag to
struct mem_cgroup, as you like.
The style I prefer is
==
struct mem_cgroup {
....
bool hwpoison_test_enabled;
};
+#ifdef CONFIG_HWPOISON_INJECT /* for now, only user is hwpoison testing */
+ {
+ .name = "hwpoison_test_enable",
+ .read_u64 = ....
+ },
+#endif
and

	mem = try_get_mem_cgroup_from_page(p);
	if (mem_cgroup_is_under_poison_test(mem))
		ret = true;
	mem_cgroup_put(mem); /* calls css_put() */

Maybe not difficult, and this is the usual way. But it's OK if you don't want to
scatter HWPOISON things into other subsystems' files. This is a test operation.
So, "including real use case and patches" is my only request for this time.
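
For comparison, the flag-based style can be modeled in plain userspace C. This is a sketch under stated assumptions: `struct flag_memcg` and `filter_task_by_flag()` are hypothetical stand-ins, and the reference counting done by try_get_mem_cgroup_from_page() is left out.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for struct mem_cgroup; only the proposed flag is modeled. */
struct flag_memcg {
	bool hwpoison_test_enabled;
};

/*
 * Return 0 when the page may be poisoned, -1 when it must be skipped.
 * `owner` is NULL for a page that is not charged to any memcg.
 */
static int filter_task_by_flag(const struct flag_memcg *owner)
{
	if (!owner)
		return -1;
	return owner->hwpoison_test_enabled ? 0 : -1;
}
```

With a per-group flag there is no global filter value to compare against, so a recycled ID cannot cause a false match. The cost is one more field and one more control file in memcg.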
Thanks,
-Kame
> ---
> mm/memcontrol.c | 13 +++++++++++++
> 1 file changed, 13 insertions(+)
>
> --- linux-mm.orig/mm/memcontrol.c 2009-09-07 16:01:02.000000000 +0800
> +++ linux-mm/mm/memcontrol.c 2009-09-11 18:20:55.000000000 +0800
> @@ -2510,6 +2510,13 @@ mem_cgroup_get_recursive_idx_stat(struct
> *val = d.val;
> }
>
> +#ifdef CONFIG_HWPOISON_INJECT
> +static u64 mem_cgroup_id_read(struct cgroup *cont, struct cftype *cft)
> +{
> + return css_id(cgroup_subsys_state(cont, mem_cgroup_subsys_id));
> +}
> +#endif
> +
> static u64 mem_cgroup_read(struct cgroup *cont, struct cftype *cft)
> {
> struct mem_cgroup *mem = mem_cgroup_from_cont(cont);
> @@ -2841,6 +2848,12 @@ static int mem_cgroup_swappiness_write(s
>
>
> static struct cftype mem_cgroup_files[] = {
> +#ifdef CONFIG_HWPOISON_INJECT /* for now, only user is hwpoison testing */
> + {
> + .name = "id",
> + .read_u64 = mem_cgroup_id_read,
> + },
> +#endif
> {
> .name = "usage_in_bytes",
> .private = MEMFILE_PRIVATE(_MEM, RES_USAGE),
>
* Re: [PATCH 22/24] HWPOISON: add memory cgroup filter
2009-12-03 2:58 ` KAMEZAWA Hiroyuki
@ 2009-12-03 15:03 ` Wu Fengguang
0 siblings, 0 replies; 61+ messages in thread
From: Wu Fengguang @ 2009-12-03 15:03 UTC (permalink / raw)
To: KAMEZAWA Hiroyuki
Cc: Andi Kleen, Andrew Morton, KOSAKI Motohiro, Hugh Dickins,
Daisuke Nishimura, Balbir Singh, Li Zefan, Paul Menage,
Nick Piggin, linux-mm, LKML
On Thu, Dec 03, 2009 at 10:58:40AM +0800, KAMEZAWA Hiroyuki wrote:
> On Thu, 3 Dec 2009 10:47:39 +0800
> Wu Fengguang <fengguang.wu@intel.com> wrote:
>
> > On Thu, Dec 03, 2009 at 10:28:22AM +0800, KAMEZAWA Hiroyuki wrote:
> > Ah please forgive my memcg ignorance.. Then how about bring back the
> > old css_id() based scheme (old patch follows)?
> >
> maybe enough. but please take care of the fact that css is can be "reused"
> once freed.
OK.
> > > If you have more patches to be usable the function above,
> > > I recommend you to post this with some real-use patches, in step by step.
> >
> > Do you mean user space test case? Here is a simple one:
> >
> > #!/bin/sh
> >
> > TEST_PROG=usemem
> > TEST_PARM="-m 100 -s 100"
> >
> > test -d /cgroup/hwpoison && rmdir /cgroup/hwpoison
> > mkdir /cgroup/hwpoison
> >
> > $TEST_PROG $TEST_PARM &
> > echo `pidof $TEST_PROG` > /cgroup/hwpoison/tasks
> >
> > memcg_id=$(</cgroup/hwpoison/memory.id)
> > echo $memcg_id > /debug/hwpoison/corrupt-filter-memcg
> >
> > ./corrupt-all-pfn
> >
> Ah, this is nice to be put into changelog or some documentation.
Good idea, I'll add it.
> > > patch 19,20 is ok for me.
> >
> > Thanks,
> > Fengguang
> > ---
> > memcg: show memory.id in cgroupfs
> >
> > The hwpoison test suite need to selectively inject hwpoison to some
> > targeted task pages, and must not kill important system processes
> > such as init.
> >
> > The memory cgroup serves this purpose well. We can put the target
> > processes under the control of a memory cgroup, tell the hwpoison
> > injection code the id of that memory cgroup so that it will only
> > poison pages associated with it.
> >
> > Signed-off-by: Wu Fengguang <fengguang.wu@intel.com>
>
> No objections from me. please use "id" check. or adds new flag to
> struct mem_cgroup, as you like.
There's a 3rd option: inode number in cgroupfs.

test case:

	memcg_ino=$(ls -id /cgroup/hwpoison | cut -f1 -d' ')
	echo $memcg_ino > /debug/hwpoison/corrupt-filter-memcg

kernel code:

	hwpoison_filter_memcg ==
		memcg->css->cgroup->dentry->d_inode->i_ino

It's a pretty long chain, but performance is not a big concern for test
purposes :) as long as the inode number remains accessible and unique
in the long term.

This avoids adding extra interfaces to memcg. What do you think?
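
To make the comparison concrete, here is a userspace model of the inode-number check. It is only a sketch: the `model_*` structs are hypothetical stand-ins that keep just the pointers in the chain (the css step is folded into the memcg), and no locking or reference counting is shown.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins for the kernel objects in the chain. */
struct model_inode  { unsigned long i_ino; };
struct model_dentry { struct model_inode *d_inode; };
struct model_cgroup { struct model_dentry *dentry; };
struct model_memcg  { struct model_cgroup *cgroup; };

/* 0 means "no memcg filter configured", as with hwpoison_filter_memcg. */
static uint32_t filter_memcg_ino;

/*
 * Return 0 if the page owner passes the filter, -1 otherwise.
 * `owner` is NULL when the page is not charged to any memcg.
 */
static int filter_task_by_ino(const struct model_memcg *owner)
{
	if (!filter_memcg_ino)
		return 0;
	if (!owner || !owner->cgroup || !owner->cgroup->dentry ||
	    !owner->cgroup->dentry->d_inode)
		return -1;
	return owner->cgroup->dentry->d_inode->i_ino == filter_memcg_ino ? 0 : -1;
}
```

The value written to the debugfs file would be the number printed by `ls -id` on the cgroup directory, so user space needs no new memcg file to find it.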
> The style I prefer is
> ==
> struct mem_cgroup {
> ....
> bool hwpoison_test_enabled;
> };
>
> +#ifdef CONFIG_HWPOISON_INJECT /* for now, only user is hwpoison testing */
> + {
> + .name = "hwpoison_test_enable",
> + .read_u64 = ....
> + },
> +#endif
>
> and.
> mem = try_get_mem_cgroup_from_page(p);
> if (mem_cgroup_is_under_poison_test(mem))
> ret = true;
> mem_cgroup_put(mem); /* calls css_put() */
It seems mem_cgroup_put() does atomic_dec_and_test(&mem->refcnt).
Is that changed to css_put() recently?
> Maybe not difficult. and this is an usual way. But it's ok if you don't want to
> scatter HWPOISON things to other function's files. This is test operation.
>
> So, "including real use case and patches" is only my request, for this time.
OK, thanks for the review!
Regards,
Fengguang
* Re: [PATCH 22/24] HWPOISON: add memory cgroup filter
2009-12-02 12:58 ` Wu Fengguang
2009-12-03 1:52 ` KAMEZAWA Hiroyuki
@ 2009-12-03 2:15 ` Li Zefan
2009-12-03 2:20 ` Wu Fengguang
2009-12-03 2:28 ` Wu Fengguang
1 sibling, 2 replies; 61+ messages in thread
From: Li Zefan @ 2009-12-03 2:15 UTC (permalink / raw)
To: Wu Fengguang
Cc: Andi Kleen, Andrew Morton, KOSAKI Motohiro, Hugh Dickins,
Daisuke Nishimura, Balbir Singh, KAMEZAWA Hiroyuki, Paul Menage,
Nick Piggin, linux-mm, LKML
> +#ifdef CONFIG_CGROUP_MEM_RES_CTLR_SWAP
> +u32 hwpoison_filter_memcg;
> +static int hwpoison_filter_task(struct page *p)
> +{
> + struct mem_cgroup *mem;
> + struct cgroup_subsys_state *css;
> +
> + if (!hwpoison_filter_memcg)
> + return 0;
> +
> + mem = try_get_mem_cgroup_from_page(p);
> + if (!mem)
> + return -EINVAL;
> +
> + css = mem_cgroup_css(mem);
> + if (!css)
> + return -EINVAL;
> +
Here, if mem != NULL, then css won't be NULL.
> + css_put(css);
> + return 0;
> +}
* Re: [PATCH 22/24] HWPOISON: add memory cgroup filter
2009-12-03 2:15 ` Li Zefan
@ 2009-12-03 2:20 ` Wu Fengguang
2009-12-03 2:28 ` Wu Fengguang
1 sibling, 0 replies; 61+ messages in thread
From: Wu Fengguang @ 2009-12-03 2:20 UTC (permalink / raw)
To: Li Zefan
Cc: Andi Kleen, Andrew Morton, KOSAKI Motohiro, Hugh Dickins,
Daisuke Nishimura, Balbir Singh, KAMEZAWA Hiroyuki, Paul Menage,
Nick Piggin, linux-mm, LKML
On Thu, Dec 03, 2009 at 10:15:17AM +0800, Li Zefan wrote:
> > +#ifdef CONFIG_CGROUP_MEM_RES_CTLR_SWAP
> > +u32 hwpoison_filter_memcg;
> > +static int hwpoison_filter_task(struct page *p)
> > +{
> > + struct mem_cgroup *mem;
> > + struct cgroup_subsys_state *css;
> > +
> > + if (!hwpoison_filter_memcg)
> > + return 0;
> > +
> > + mem = try_get_mem_cgroup_from_page(p);
> > + if (!mem)
> > + return -EINVAL;
> > +
> > + css = mem_cgroup_css(mem);
> > + if (!css)
> > + return -EINVAL;
> > +
>
> Here, if mem != NULL, then css won't be NULL.
Good catch, thank you!
> > + css_put(css);
> > + return 0;
> > +}
* Re: [PATCH 22/24] HWPOISON: add memory cgroup filter
2009-12-03 2:15 ` Li Zefan
2009-12-03 2:20 ` Wu Fengguang
@ 2009-12-03 2:28 ` Wu Fengguang
1 sibling, 0 replies; 61+ messages in thread
From: Wu Fengguang @ 2009-12-03 2:28 UTC (permalink / raw)
To: Li Zefan
Cc: Andi Kleen, Andrew Morton, KOSAKI Motohiro, Hugh Dickins,
Daisuke Nishimura, Balbir Singh, KAMEZAWA Hiroyuki, Paul Menage,
Nick Piggin, linux-mm, LKML
After integrating Andi, Kame and Zefan's review comments:
---
HWPOISON: add memory cgroup filter
The hwpoison test suite needs to inject hwpoison into a collection of
selected task pages, and must not touch pages not owned by them, which
could kill important system processes such as init. (But it's OK to
mis-hwpoison free/unowned pages as well as shared clean pages.
Mis-hwpoison of shared dirty pages will kill all tasks sharing them, so the
test suite will target all or none of such tasks in the first place.)
The memory cgroup serves this purpose well. We can put the target
processes under the control of a memory cgroup, and tell the hwpoison
injection code to only kill pages associated with some active memory
cgroup.
The prerequisite for doing hwpoison stress tests with mem_cgroup is,
the mem_cgroup code tracks task pages _accurately_ (unless page is
locked). Which we believe is/should be true.
The benefits are simplification of hwpoison injector code. Also the
mem_cgroup code will automatically be tested by hwpoison test cases.
The alternative interfaces pin-pfn/unpin-pfn can also delegate the
(process and page flags) filtering functions reliably to user space.
However prototype implementation shows that this scheme adds more
complexity than we wanted.
CC: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
CC: Hugh Dickins <hugh.dickins@tiscali.co.uk>
CC: Daisuke Nishimura <nishimura@mxp.nes.nec.co.jp>
CC: Balbir Singh <balbir@linux.vnet.ibm.com>
CC: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
CC: Li Zefan <lizf@cn.fujitsu.com>
CC: Paul Menage <menage@google.com>
CC: Nick Piggin <npiggin@suse.de>
CC: Andi Kleen <andi@firstfloor.org>
Signed-off-by: Wu Fengguang <fengguang.wu@intel.com>
---
mm/Kconfig | 2 +-
mm/hwpoison-inject.c | 7 +++++++
mm/internal.h | 1 +
mm/memory-failure.c | 35 +++++++++++++++++++++++++++++++++++
4 files changed, 44 insertions(+), 1 deletion(-)
--- linux-mm.orig/mm/memory-failure.c 2009-12-03 09:51:46.000000000 +0800
+++ linux-mm/mm/memory-failure.c 2009-12-03 10:24:14.000000000 +0800
@@ -96,6 +96,38 @@ static int hwpoison_filter_flags(struct
return -EINVAL;
}
+/*
+ * This allows stress tests to limit test scope to a collection of tasks
+ * by putting them under some memcg. This prevents killing unrelated/important
+ * processes such as /sbin/init. Note that the target task may share clean
+ * pages with init (e.g. libc text), which is harmless. If the target task
+ * shares _dirty_ pages with another task B, the test scheme must make sure B
+ * is also included in the memcg. Finally, due to race conditions this filter
+ * can only guarantee that the page either belongs to the memcg tasks, or is
+ * a freed page.
+ */
+#ifdef CONFIG_CGROUP_MEM_RES_CTLR_SWAP
+u32 hwpoison_filter_memcg;
+static int hwpoison_filter_task(struct page *p)
+{
+ struct mem_cgroup *mem;
+ struct cgroup_subsys_state *css;
+
+ if (!hwpoison_filter_memcg)
+ return 0;
+
+ mem = try_get_mem_cgroup_from_page(p);
+ if (!mem)
+ return -EINVAL;
+
+ css = mem_cgroup_css(mem);
+ css_put(css);
+ return 0;
+}
+#else
+static int hwpoison_filter_task(struct page *p) { return 0; }
+#endif
+
int hwpoison_filter(struct page *p)
{
if (hwpoison_filter_dev(p))
@@ -104,6 +136,9 @@ int hwpoison_filter(struct page *p)
if (hwpoison_filter_flags(p))
return -EINVAL;
+ if (hwpoison_filter_task(p))
+ return -EINVAL;
+
return 0;
}
--- linux-mm.orig/mm/internal.h 2009-12-03 09:51:46.000000000 +0800
+++ linux-mm/mm/internal.h 2009-12-03 09:51:54.000000000 +0800
@@ -270,3 +270,4 @@ extern u32 hwpoison_filter_dev_major;
extern u32 hwpoison_filter_dev_minor;
extern u64 hwpoison_filter_flags_mask;
extern u64 hwpoison_filter_flags_value;
+extern u32 hwpoison_filter_memcg;
--- linux-mm.orig/mm/hwpoison-inject.c 2009-12-03 09:51:46.000000000 +0800
+++ linux-mm/mm/hwpoison-inject.c 2009-12-03 09:51:54.000000000 +0800
@@ -95,6 +95,13 @@ static int pfn_inject_init(void)
if (!dentry)
goto fail;
+#ifdef CONFIG_CGROUP_MEM_RES_CTLR_SWAP
+ dentry = debugfs_create_u32("corrupt-filter-memcg", 0600,
+ hwpoison_dir, &hwpoison_filter_memcg);
+ if (!dentry)
+ goto fail;
+#endif
+
return 0;
fail:
pfn_inject_exit();
--- linux-mm.orig/mm/Kconfig 2009-12-03 09:51:46.000000000 +0800
+++ linux-mm/mm/Kconfig 2009-12-03 09:51:54.000000000 +0800
@@ -257,7 +257,7 @@ config MEMORY_FAILURE
special hardware support and typically ECC memory.
config HWPOISON_INJECT
- tristate "Poison pages injector"
+ tristate "HWPoison pages injector"
depends on MEMORY_FAILURE && DEBUG_KERNEL
config NOMMU_INITIAL_TRIM_EXCESS
* [PATCH 23/24] HWPOISON: add an interface to switch off/on all the page filters
2009-12-02 3:12 [PATCH 00/24] hwpoison fixes and stress testing filters Wu Fengguang
` (21 preceding siblings ...)
2009-12-02 3:12 ` [PATCH 22/24] HWPOISON: add memory cgroup filter Wu Fengguang
@ 2009-12-02 3:12 ` Wu Fengguang
2009-12-02 3:12 ` [PATCH 24/24] HWPOISON: show corrupted file info Wu Fengguang
23 siblings, 0 replies; 61+ messages in thread
From: Wu Fengguang @ 2009-12-02 3:12 UTC (permalink / raw)
To: Andi Kleen
Cc: Andrew Morton, Haicheng Li, Wu Fengguang, Nick Piggin, linux-mm, LKML
[-- Attachment #1: hwpoison-filter-enable.patch --]
[-- Type: text/plain, Size: 2401 bytes --]
From: Haicheng Li <haicheng.li@linux.intel.com>
In some use cases, the user doesn't need extra filtering. E.g. a user program
can inject errors into its own pages through the madvise syscall, but it
might not know exactly what the page state is or which inode the page
belongs to.
So introduce a one-off interface "corrupt-filter-enable".
Echo 0 to switch off page filters, and echo 1 to switch on the filters.
Its default value is 1, i.e. all page filters are in effect.
Signed-off-by: Haicheng Li <haicheng.li@linux.intel.com>
Signed-off-by: Wu Fengguang <fengguang.wu@intel.com>
---
mm/hwpoison-inject.c | 5 +++++
mm/internal.h | 1 +
mm/memory-failure.c | 4 ++++
3 files changed, 10 insertions(+)
--- linux-mm.orig/mm/hwpoison-inject.c 2009-12-01 09:56:18.000000000 +0800
+++ linux-mm/mm/hwpoison-inject.c 2009-12-01 09:56:21.000000000 +0800
@@ -75,6 +75,11 @@ static int pfn_inject_init(void)
if (!dentry)
goto fail;
+ dentry = debugfs_create_u32("corrupt-filter-enable", 0600,
+ hwpoison_dir, &hwpoison_filter_enable);
+ if (!dentry)
+ goto fail;
+
dentry = debugfs_create_u32("corrupt-filter-dev-major", 0600,
hwpoison_dir, &hwpoison_filter_dev_major);
if (!dentry)
--- linux-mm.orig/mm/internal.h 2009-12-01 09:56:18.000000000 +0800
+++ linux-mm/mm/internal.h 2009-12-01 09:56:21.000000000 +0800
@@ -271,3 +271,4 @@ extern u32 hwpoison_filter_dev_minor;
extern u64 hwpoison_filter_flags_mask;
extern u64 hwpoison_filter_flags_value;
extern u32 hwpoison_filter_memcg;
+extern u32 hwpoison_filter_enable;
--- linux-mm.orig/mm/memory-failure.c 2009-12-01 09:56:18.000000000 +0800
+++ linux-mm/mm/memory-failure.c 2009-12-01 09:56:21.000000000 +0800
@@ -49,6 +49,7 @@ int sysctl_memory_failure_recovery __rea
atomic_long_t mce_bad_pages __read_mostly = ATOMIC_LONG_INIT(0);
+u32 hwpoison_filter_enable = 1;
u32 hwpoison_filter_dev_major = ~0U;
u32 hwpoison_filter_dev_minor = ~0U;
u64 hwpoison_filter_flags_mask;
@@ -119,6 +120,9 @@ static int hwpoison_filter_task(struct p
int hwpoison_filter(struct page *p)
{
+ if (!hwpoison_filter_enable)
+ return 0;
+
if (hwpoison_filter_dev(p))
return -EINVAL;
* [PATCH 24/24] HWPOISON: show corrupted file info
2009-12-02 3:12 [PATCH 00/24] hwpoison fixes and stress testing filters Wu Fengguang
` (22 preceding siblings ...)
2009-12-02 3:12 ` [PATCH 23/24] HWPOISON: add an interface to switch off/on all the page filters Wu Fengguang
@ 2009-12-02 3:12 ` Wu Fengguang
2009-12-02 13:20 ` Andi Kleen
23 siblings, 1 reply; 61+ messages in thread
From: Wu Fengguang @ 2009-12-02 3:12 UTC (permalink / raw)
To: Andi Kleen; +Cc: Andrew Morton, Wu Fengguang, Nick Piggin, linux-mm, LKML
[-- Attachment #1: hwpoison-describe-page-file.patch --]
[-- Type: text/plain, Size: 1851 bytes --]
If file data is corrupted, the user may want to know which file
is corrupted.
CC: Andi Kleen <andi@firstfloor.org>
Signed-off-by: Wu Fengguang <fengguang.wu@intel.com>
---
mm/memory-failure.c | 43 ++++++++++++++++++++++++++++++++++++++++++
1 file changed, 43 insertions(+)
--- linux-mm.orig/mm/memory-failure.c 2009-12-01 09:56:21.000000000 +0800
+++ linux-mm/mm/memory-failure.c 2009-12-01 09:56:23.000000000 +0800
@@ -56,6 +56,47 @@ u64 hwpoison_filter_flags_mask;
u64 hwpoison_filter_flags_value;
u32 hwpoison_filter_memcg;
+static void describe_page_file(struct page *page)
+{
+ char *name = "?";
+ struct inode *inode;
+ struct dentry *dentry;
+
+ if (PageAnon(page))
+ return;
+
+ if (!page->mapping)
+ return;
+
+ inode = igrab(page->mapping->host);
+ if (!inode)
+ return;
+
+ dentry = d_find_alias(inode);
+
+ if (dentry) {
+ spin_lock(&dentry->d_lock);
+ name = dentry->d_name.name;
+ }
+
+ printk(KERN_ERR
+ "MCE %#lx: dev %d:%d inode %lu(%s) pgoff %lu%s\n",
+ page_to_pfn(page),
+ MAJOR(inode->i_sb->s_dev),
+ MINOR(inode->i_sb->s_dev),
+ inode->i_ino,
+ name,
+ page->index,
+ PageDirty(page) ? " corrupted" : "");
+
+ if (dentry) {
+ spin_unlock(&dentry->d_lock);
+ dput(dentry);
+ }
+
+ iput(inode);
+}
+
static int hwpoison_filter_dev(struct page *p)
{
struct address_space *mapping;
@@ -525,6 +566,8 @@ static int me_pagecache_clean(struct hwp
} else {
ret = RECOVERED;
}
+ if (PageDirty(p) && !PageSwapBacked(p))
+ describe_page_file(p);
} else {
/*
* If the file system doesn't support it just invalidate
* Re: [PATCH 24/24] HWPOISON: show corrupted file info
2009-12-02 3:12 ` [PATCH 24/24] HWPOISON: show corrupted file info Wu Fengguang
@ 2009-12-02 13:20 ` Andi Kleen
2009-12-02 13:37 ` Wu Fengguang
0 siblings, 1 reply; 61+ messages in thread
From: Andi Kleen @ 2009-12-02 13:20 UTC (permalink / raw)
To: Wu Fengguang; +Cc: Andi Kleen, Andrew Morton, Nick Piggin, linux-mm, LKML
> + dentry = d_find_alias(inode);
> +
> + if (dentry) {
> + spin_lock(&dentry->d_lock);
> + name = dentry->d_name.name;
> + }
The standard way to do that is d_path()
But the paths are somewhat meaningless without the root.
Better to not print path names for now.
And pgoff should be just a byte offset with a range
I'll skip this one for now.
-Andi
* Re: [PATCH 24/24] HWPOISON: show corrupted file info
2009-12-02 13:20 ` Andi Kleen
@ 2009-12-02 13:37 ` Wu Fengguang
0 siblings, 0 replies; 61+ messages in thread
From: Wu Fengguang @ 2009-12-02 13:37 UTC (permalink / raw)
To: Andi Kleen; +Cc: Andrew Morton, Nick Piggin, linux-mm, LKML
On Wed, Dec 02, 2009 at 09:20:48PM +0800, Andi Kleen wrote:
> > + dentry = d_find_alias(inode);
> > +
> > + if (dentry) {
> > + spin_lock(&dentry->d_lock);
> > + name = dentry->d_name.name;
> > + }
>
> The standard way to do that is d_path()
Good idea.
> But the paths are somewhat meaningless without the root.
It would still be more helpful :)
> Better to not print path names for now.
OK.
> And pgoff should be just a byte offset with a range
Makes sense.
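
One way to read that suggestion is sketched below: report the first and last byte of the file covered by the page instead of the raw page index. `MODEL_PAGE_SHIFT`, `struct byte_range` and `pgoff_to_byte_range()` are hypothetical names, and the 4 KiB page size is an assumption.

```c
#include <stdint.h>

#define MODEL_PAGE_SHIFT 12	/* assumption: 4 KiB pages */

/* Inclusive byte range within the file. */
struct byte_range {
	uint64_t start;
	uint64_t end;
};

/* Convert a page offset (page->index) into the byte range it covers. */
static struct byte_range pgoff_to_byte_range(uint64_t pgoff)
{
	struct byte_range r;

	r.start = pgoff << MODEL_PAGE_SHIFT;
	r.end = r.start + ((uint64_t)1 << MODEL_PAGE_SHIFT) - 1;
	return r;
}
```

For example, under the 4 KiB assumption page offset 3 covers bytes 12288 through 16383 of the file.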
btw, I have a patch (maybe out of date) to allow calling d_path()
without a known root:
Subject: vfs: enable __d_path() to handle NULL vfsmnt
Enable __d_path() to handle vfsmnt==NULL case.
This can happen when the caller only has a struct inode/dentry instead of a
struct file, and still wants to print the (partial) path within the filesystem.
Signed-off-by: Wu Fengguang <wfg@mail.ustc.edu.cn>
---
---
fs/dcache.c | 5 ++++-
1 file changed, 4 insertions(+), 1 deletion(-)
--- linux-2.6.orig/fs/dcache.c
+++ linux-2.6/fs/dcache.c
@@ -1943,7 +1943,10 @@ char *__d_path(const struct path *path,
if (dentry == root->dentry && vfsmnt == root->mnt)
break;
- if (dentry == vfsmnt->mnt_root || IS_ROOT(dentry)) {
+ if (unlikely(!vfsmnt)) {
+ if (IS_ROOT(dentry))
+ break;
+ } else if (dentry == vfsmnt->mnt_root || IS_ROOT(dentry)) {
/* Global root? */
if (vfsmnt->mnt_parent == vfsmnt) {
goto global_root;