linux-mm.kvack.org archive mirror
 help / color / mirror / Atom feed
* [PATCH RESEND -mm 1/2] mm: add !pte_present() check on existing hugetlb_entry callbacks
@ 2014-03-19  2:29 Naoya Horiguchi
  2014-03-19  2:29 ` [PATCH RESEND -mm 2/2] mm/mempolicy.c: add comment in queue_pages_hugetlb() Naoya Horiguchi
  0 siblings, 1 reply; 6+ messages in thread
From: Naoya Horiguchi @ 2014-03-19  2:29 UTC (permalink / raw)
  To: Andrew Morton; +Cc: Sasha Levin, Rik van Riel, linux-mm, linux-kernel

Page table walker doesn't check non-present hugetlb entry in common path,
so hugetlb_entry() callbacks must check it. The reason for this behavior
is that some callers want to handle it in its own way.

However, some callers don't check it now, which causes unpredictable result,
for example when we have a race between migrating hugepage and reading
/proc/pid/numa_maps. This patch fixes it by adding !pte_present checks on
buggy callbacks.

This bug exists for years and got visible by introducing hugepage migration.

ChangeLog v2:
- fix if condition (check !pte_present() instead of pte_present())

Reported-by: Sasha Levin <sasha.levin@oracle.com>
Signed-off-by: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>
Cc: stable@vger.kernel.org # 3.12+
---
 fs/proc/task_mmu.c | 3 +++
 mm/mempolicy.c     | 6 +++++-
 2 files changed, 8 insertions(+), 1 deletion(-)

diff --git v3.14-rc7-mmotm-2014-03-18-16-37.orig/fs/proc/task_mmu.c v3.14-rc7-mmotm-2014-03-18-16-37/fs/proc/task_mmu.c
index d9d9d4f41544..f75ce811d430 100644
--- v3.14-rc7-mmotm-2014-03-18-16-37.orig/fs/proc/task_mmu.c
+++ v3.14-rc7-mmotm-2014-03-18-16-37/fs/proc/task_mmu.c
@@ -1300,6 +1300,9 @@ static int gather_hugetlb_stats(pte_t *pte, unsigned long addr,
 	if (pte_none(*pte))
 		return 0;
 
+	if (!pte_present(*pte))
+		return 0;
+
 	page = pte_page(*pte);
 	if (!page)
 		return 0;
diff --git v3.14-rc7-mmotm-2014-03-18-16-37.orig/mm/mempolicy.c v3.14-rc7-mmotm-2014-03-18-16-37/mm/mempolicy.c
index af635c458dee..9d2ef4111a4c 100644
--- v3.14-rc7-mmotm-2014-03-18-16-37.orig/mm/mempolicy.c
+++ v3.14-rc7-mmotm-2014-03-18-16-37/mm/mempolicy.c
@@ -524,8 +524,12 @@ static int queue_pages_hugetlb(pte_t *pte, unsigned long addr,
 	unsigned long flags = qp->flags;
 	int nid;
 	struct page *page;
+	pte_t entry;
 
-	page = pte_page(huge_ptep_get(pte));
+	entry = huge_ptep_get(pte);
+	if (!pte_present(entry))
+		return 0;
+	page = pte_page(entry);
 	nid = page_to_nid(page);
 	if (node_isset(nid, *qp->nmask) == !!(flags & MPOL_MF_INVERT))
 		return 0;
-- 
1.8.5.3

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 6+ messages in thread

* [PATCH RESEND -mm 2/2] mm/mempolicy.c: add comment in queue_pages_hugetlb()
  2014-03-19  2:29 [PATCH RESEND -mm 1/2] mm: add !pte_present() check on existing hugetlb_entry callbacks Naoya Horiguchi
@ 2014-03-19  2:29 ` Naoya Horiguchi
  2014-03-19 16:24   ` Sasha Levin
  2014-03-19 20:33   ` Andrew Morton
  0 siblings, 2 replies; 6+ messages in thread
From: Naoya Horiguchi @ 2014-03-19  2:29 UTC (permalink / raw)
  To: Andrew Morton; +Cc: Sasha Levin, Rik van Riel, linux-mm, linux-kernel

We have a race where we try to migrate an invalid page, resulting in
hitting VM_BUG_ON_PAGE in isolate_huge_page().
queue_pages_hugetlb() is OK to fail, so let's check !PageHeadHuge to keep
invalid hugepage from queuing.

Reported-by: Sasha Levin <sasha.levin@oracle.com>
Signed-off-by: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>
---
 mm/mempolicy.c | 11 +++++++++++
 1 file changed, 11 insertions(+)

diff --git v3.14-rc7-mmotm-2014-03-18-16-37.orig/mm/mempolicy.c v3.14-rc7-mmotm-2014-03-18-16-37/mm/mempolicy.c
index 9d2ef4111a4c..ae6e2d9dc855 100644
--- v3.14-rc7-mmotm-2014-03-18-16-37.orig/mm/mempolicy.c
+++ v3.14-rc7-mmotm-2014-03-18-16-37/mm/mempolicy.c
@@ -530,6 +530,17 @@ static int queue_pages_hugetlb(pte_t *pte, unsigned long addr,
 	if (!pte_present(entry))
 		return 0;
 	page = pte_page(entry);
+
+	/*
+	 * Trinity found that page could be a non-hugepage. This is an
+	 * unexpected behavior, but it's not clear how this problem happens.
+	 * So let's simply skip such corner case. Page migration can often
+	 * fail for various reasons, so it's ok to just skip the address
+	 * unsuitable to hugepage migration.
+	 */
+	if (!PageHeadHuge(page))
+		return 0;
+
 	nid = page_to_nid(page);
 	if (node_isset(nid, *qp->nmask) == !!(flags & MPOL_MF_INVERT))
 		return 0;
-- 
1.8.5.3

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [PATCH RESEND -mm 2/2] mm/mempolicy.c: add comment in queue_pages_hugetlb()
  2014-03-19  2:29 ` [PATCH RESEND -mm 2/2] mm/mempolicy.c: add comment in queue_pages_hugetlb() Naoya Horiguchi
@ 2014-03-19 16:24   ` Sasha Levin
  2014-03-19 20:49     ` Naoya Horiguchi
  2014-03-19 20:33   ` Andrew Morton
  1 sibling, 1 reply; 6+ messages in thread
From: Sasha Levin @ 2014-03-19 16:24 UTC (permalink / raw)
  To: Naoya Horiguchi, Andrew Morton; +Cc: Rik van Riel, linux-mm, linux-kernel

On 03/18/2014 10:29 PM, Naoya Horiguchi wrote:
> We have a race where we try to migrate an invalid page, resulting in
> hitting VM_BUG_ON_PAGE in isolate_huge_page().
> queue_pages_hugetlb() is OK to fail, so let's check !PageHeadHuge to keep
> invalid hugepage from queuing.
>
> Reported-by: Sasha Levin <sasha.levin@oracle.com>
> Signed-off-by: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>
> ---
>   mm/mempolicy.c | 11 +++++++++++
>   1 file changed, 11 insertions(+)
>
> diff --git v3.14-rc7-mmotm-2014-03-18-16-37.orig/mm/mempolicy.c v3.14-rc7-mmotm-2014-03-18-16-37/mm/mempolicy.c
> index 9d2ef4111a4c..ae6e2d9dc855 100644
> --- v3.14-rc7-mmotm-2014-03-18-16-37.orig/mm/mempolicy.c
> +++ v3.14-rc7-mmotm-2014-03-18-16-37/mm/mempolicy.c
> @@ -530,6 +530,17 @@ static int queue_pages_hugetlb(pte_t *pte, unsigned long addr,
>   	if (!pte_present(entry))
>   		return 0;
>   	page = pte_page(entry);
> +
> +	/*
> +	 * Trinity found that page could be a non-hugepage. This is an
> +	 * unexpected behavior, but it's not clear how this problem happens.
> +	 * So let's simply skip such corner case. Page migration can often
> +	 * fail for various reasons, so it's ok to just skip the address
> +	 * unsuitable to hugepage migration.
> +	 */
> +	if (!PageHeadHuge(page))
> +		return 0;
> +
>   	nid = page_to_nid(page);
>   	if (node_isset(nid, *qp->nmask) == !!(flags & MPOL_MF_INVERT))
>   		return 0;
>

I have to say that I really dislike this method of solving the issue.

I think it's something fine to do for testing, but this will just hide this issue
and will let it sneak upstream. I'm really not sure if the trace I've reported is
the only codepath that would trigger it, so if we let it sneak upstream we're risking
of someone hitting it some other way.


Thanks,
Sasha

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [PATCH RESEND -mm 2/2] mm/mempolicy.c: add comment in queue_pages_hugetlb()
  2014-03-19  2:29 ` [PATCH RESEND -mm 2/2] mm/mempolicy.c: add comment in queue_pages_hugetlb() Naoya Horiguchi
  2014-03-19 16:24   ` Sasha Levin
@ 2014-03-19 20:33   ` Andrew Morton
  2014-03-19 20:55     ` Naoya Horiguchi
  1 sibling, 1 reply; 6+ messages in thread
From: Andrew Morton @ 2014-03-19 20:33 UTC (permalink / raw)
  To: Naoya Horiguchi; +Cc: Sasha Levin, Rik van Riel, linux-mm, linux-kernel

On Tue, 18 Mar 2014 22:29:39 -0400 Naoya Horiguchi <n-horiguchi@ah.jp.nec.com> wrote:

> We have a race where we try to migrate an invalid page, resulting in
> hitting VM_BUG_ON_PAGE in isolate_huge_page().
> queue_pages_hugetlb() is OK to fail, so let's check !PageHeadHuge to keep
> invalid hugepage from queuing.
> 
> ..
>
> --- v3.14-rc7-mmotm-2014-03-18-16-37.orig/mm/mempolicy.c
> +++ v3.14-rc7-mmotm-2014-03-18-16-37/mm/mempolicy.c
> @@ -530,6 +530,17 @@ static int queue_pages_hugetlb(pte_t *pte, unsigned long addr,
>  	if (!pte_present(entry))
>  		return 0;
>  	page = pte_page(entry);
> +
> +	/*
> +	 * Trinity found that page could be a non-hugepage. This is an
> +	 * unexpected behavior, but it's not clear how this problem happens.
> +	 * So let's simply skip such corner case. Page migration can often
> +	 * fail for various reasons, so it's ok to just skip the address
> +	 * unsuitable to hugepage migration.
> +	 */
> +	if (!PageHeadHuge(page))
> +		return 0;
> +

Whoa, we won't be doing this thanks.  The day we resort to this sort of
thing is the day we revert to the 2.2.26 VM.

I suppose I'd be OK with putting

	if (WARN_ON(!PageHeadHuge(page)))
		return 0;

in there as a temporary be-kind-to-testers thing, but we must get a
full understanding of what's happening in there.

Was this problem caused by or exposed by the pagetable walker patches?

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [PATCH RESEND -mm 2/2] mm/mempolicy.c: add comment in queue_pages_hugetlb()
  2014-03-19 16:24   ` Sasha Levin
@ 2014-03-19 20:49     ` Naoya Horiguchi
  0 siblings, 0 replies; 6+ messages in thread
From: Naoya Horiguchi @ 2014-03-19 20:49 UTC (permalink / raw)
  To: sasha.levin; +Cc: akpm, riel, linux-mm, linux-kernel

On Wed, Mar 19, 2014 at 12:24:44PM -0400, Sasha Levin wrote:
> On 03/18/2014 10:29 PM, Naoya Horiguchi wrote:
> >We have a race where we try to migrate an invalid page, resulting in
> >hitting VM_BUG_ON_PAGE in isolate_huge_page().
> >queue_pages_hugetlb() is OK to fail, so let's check !PageHeadHuge to keep
> >invalid hugepage from queuing.
> >
> >Reported-by: Sasha Levin <sasha.levin@oracle.com>
> >Signed-off-by: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>
> >---
> >  mm/mempolicy.c | 11 +++++++++++
> >  1 file changed, 11 insertions(+)
> >
> >diff --git v3.14-rc7-mmotm-2014-03-18-16-37.orig/mm/mempolicy.c v3.14-rc7-mmotm-2014-03-18-16-37/mm/mempolicy.c
> >index 9d2ef4111a4c..ae6e2d9dc855 100644
> >--- v3.14-rc7-mmotm-2014-03-18-16-37.orig/mm/mempolicy.c
> >+++ v3.14-rc7-mmotm-2014-03-18-16-37/mm/mempolicy.c
> >@@ -530,6 +530,17 @@ static int queue_pages_hugetlb(pte_t *pte, unsigned long addr,
> >  	if (!pte_present(entry))
> >  		return 0;
> >  	page = pte_page(entry);
> >+
> >+	/*
> >+	 * Trinity found that page could be a non-hugepage. This is an
> >+	 * unexpected behavior, but it's not clear how this problem happens.
> >+	 * So let's simply skip such corner case. Page migration can often
> >+	 * fail for various reasons, so it's ok to just skip the address
> >+	 * unsuitable to hugepage migration.
> >+	 */
> >+	if (!PageHeadHuge(page))
> >+		return 0;
> >+
> >  	nid = page_to_nid(page);
> >  	if (node_isset(nid, *qp->nmask) == !!(flags & MPOL_MF_INVERT))
> >  		return 0;
> >
> 
> I have to say that I really dislike this method of solving the issue.

Yes, I understand that this is not the best solution.

> I think it's something fine to do for testing, but this will just hide this issue
> and will let it sneak upstream. I'm really not sure if the trace I've reported is
> the only codepath that would trigger it, so if we let it sneak upstream we're risking
> of someone hitting it some other way.

Unfortunately, I didn't have a reliable reproducer focusing on this problem
(trinity hits other errors rather than this in my trials, so it gave me no
crucial hint for my detailed analysis.)
I think that if reproduced differently this could give us another information
about how the problem happens.
What I'm suggesting here is not a final-form fix, but kind of "needinfo".
I must (and will) try to work on this more after LSFMM summit.

Thanks,
Naoya

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [PATCH RESEND -mm 2/2] mm/mempolicy.c: add comment in queue_pages_hugetlb()
  2014-03-19 20:33   ` Andrew Morton
@ 2014-03-19 20:55     ` Naoya Horiguchi
  0 siblings, 0 replies; 6+ messages in thread
From: Naoya Horiguchi @ 2014-03-19 20:55 UTC (permalink / raw)
  To: akpm; +Cc: sasha.levin, riel, linux-mm, linux-kernel

On Wed, Mar 19, 2014 at 01:33:05PM -0700, Andrew Morton wrote:
> On Tue, 18 Mar 2014 22:29:39 -0400 Naoya Horiguchi <n-horiguchi@ah.jp.nec.com> wrote:
> 
> > We have a race where we try to migrate an invalid page, resulting in
> > hitting VM_BUG_ON_PAGE in isolate_huge_page().
> > queue_pages_hugetlb() is OK to fail, so let's check !PageHeadHuge to keep
> > invalid hugepage from queuing.
> > 
> > ..
> >
> > --- v3.14-rc7-mmotm-2014-03-18-16-37.orig/mm/mempolicy.c
> > +++ v3.14-rc7-mmotm-2014-03-18-16-37/mm/mempolicy.c
> > @@ -530,6 +530,17 @@ static int queue_pages_hugetlb(pte_t *pte, unsigned long addr,
> >  	if (!pte_present(entry))
> >  		return 0;
> >  	page = pte_page(entry);
> > +
> > +	/*
> > +	 * Trinity found that page could be a non-hugepage. This is an
> > +	 * unexpected behavior, but it's not clear how this problem happens.
> > +	 * So let's simply skip such corner case. Page migration can often
> > +	 * fail for various reasons, so it's ok to just skip the address
> > +	 * unsuitable to hugepage migration.
> > +	 */
> > +	if (!PageHeadHuge(page))
> > +		return 0;
> > +
> 
> Whoa, we won't be doing this thanks.  The day we resort to this sort of
> thing is the day we revert to the 2.2.26 VM.
> 
> I suppose I'd be OK with putting
> 
> 	if (WARN_ON(!PageHeadHuge(page)))
> 		return 0;
> 
> in there as a temporary be-kind-to-testers thing, but we must get a
> full understanding of what's happening in there.

I agree.

> Was this problem caused by or exposed by the pagetable walker patches?

I'm not sure, but at least I'm the last person who touched around
error point, so I feel responsible for this.

Naoya

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 6+ messages in thread

end of thread, other threads:[~2014-03-19 20:55 UTC | newest]

Thread overview: 6+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2014-03-19  2:29 [PATCH RESEND -mm 1/2] mm: add !pte_present() check on existing hugetlb_entry callbacks Naoya Horiguchi
2014-03-19  2:29 ` [PATCH RESEND -mm 2/2] mm/mempolicy.c: add comment in queue_pages_hugetlb() Naoya Horiguchi
2014-03-19 16:24   ` Sasha Levin
2014-03-19 20:49     ` Naoya Horiguchi
2014-03-19 20:33   ` Andrew Morton
2014-03-19 20:55     ` Naoya Horiguchi

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox