* [REPOST] [PATCH 2/2] mm: Fix potentially scheduling in GFP_ATOMIC allocations.
@ 2015-08-23 7:23 Tetsuo Handa
2015-08-24 8:54 ` Michal Hocko
2015-09-01 22:21 ` David Rientjes
0 siblings, 2 replies; 5+ messages in thread
From: Tetsuo Handa @ 2015-08-23 7:23 UTC (permalink / raw)
To: mhocko, rientjes, hannes; +Cc: linux-mm
>From 08a638e04351386ab03cd1223988ac7940d4d3aa Mon Sep 17 00:00:00 2001
From: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
Date: Sat, 1 Aug 2015 22:46:12 +0900
Subject: [PATCH 2/2] mm: Fix potentially scheduling in GFP_ATOMIC
allocations.
Currently, if somebody does GFP_ATOMIC | __GFP_NOFAIL allocation,
wait_iff_congested() might be called via __alloc_pages_high_priority()
before reaching
if (!wait) {
WARN_ON_ONCE(gfp_mask & __GFP_NOFAIL);
goto nopage;
}
because gfp_to_alloc_flags() includes ALLOC_NO_WATERMARKS if TIF_MEMDIE
was set.
We need to check for __GFP_WAIT flag at __alloc_pages_high_priority()
in order to make sure that we won't schedule.
Signed-off-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
---
mm/page_alloc.c | 13 ++++++-------
1 file changed, 6 insertions(+), 7 deletions(-)
diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index 37a0390..f9f09fa 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -2917,16 +2917,15 @@ __alloc_pages_high_priority(gfp_t gfp_mask, unsigned int order,
{
struct page *page;
- do {
+ for (;;) {
page = get_page_from_freelist(gfp_mask, order,
ALLOC_NO_WATERMARKS, ac);
- if (!page && gfp_mask & __GFP_NOFAIL)
- wait_iff_congested(ac->preferred_zone, BLK_RW_ASYNC,
- HZ/50);
- } while (!page && (gfp_mask & __GFP_NOFAIL));
-
- return page;
+ if (page || (gfp_mask & (__GFP_NOFAIL | __GFP_WAIT)) !=
+ (__GFP_NOFAIL | __GFP_WAIT))
+ return page;
+ wait_iff_congested(ac->preferred_zone, BLK_RW_ASYNC, HZ/50);
+ }
}
static void wake_all_kswapds(unsigned int order, const struct alloc_context *ac)
--
1.8.3.1
--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org. For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>
^ permalink raw reply [flat|nested] 5+ messages in thread* Re: [REPOST] [PATCH 2/2] mm: Fix potentially scheduling in GFP_ATOMIC allocations.
2015-08-23 7:23 [REPOST] [PATCH 2/2] mm: Fix potentially scheduling in GFP_ATOMIC allocations Tetsuo Handa
@ 2015-08-24 8:54 ` Michal Hocko
2015-09-01 22:21 ` David Rientjes
1 sibling, 0 replies; 5+ messages in thread
From: Michal Hocko @ 2015-08-24 8:54 UTC (permalink / raw)
To: Tetsuo Handa; +Cc: rientjes, hannes, linux-mm
On Sun 23-08-15 16:23:37, Tetsuo Handa wrote:
> >From 08a638e04351386ab03cd1223988ac7940d4d3aa Mon Sep 17 00:00:00 2001
> From: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
> Date: Sat, 1 Aug 2015 22:46:12 +0900
> Subject: [PATCH 2/2] mm: Fix potentially scheduling in GFP_ATOMIC
> allocations.
>
> Currently, if somebody does GFP_ATOMIC | __GFP_NOFAIL allocation,
This combination of flags is broken by definition and I fail to see it
being used anywhere in the kernel.
> wait_iff_congested() might be called via __alloc_pages_high_priority()
> before reaching
>
> if (!wait) {
> WARN_ON_ONCE(gfp_mask & __GFP_NOFAIL);
> goto nopage;
> }
>
> because gfp_to_alloc_flags() includes ALLOC_NO_WATERMARKS if TIF_MEMDIE
> was set.
>
> We need to check for __GFP_WAIT flag at __alloc_pages_high_priority()
> in order to make sure that we won't schedule.
I do not think this is an improvement. It is true we are already failing
__GFP_NOFAIL & ~__GFP_WAIT but I believe it doesn't make much sense
to replace one buggy behavior (sleeping in atomic context) by another
(failing __GFP_NOFAIL). It is the caller which should be fixed here. We
should get "scheduling while atomic:" and the trace with the current
code so we are not loosing any debugging options.
> Signed-off-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
> ---
> mm/page_alloc.c | 13 ++++++-------
> 1 file changed, 6 insertions(+), 7 deletions(-)
>
> diff --git a/mm/page_alloc.c b/mm/page_alloc.c
> index 37a0390..f9f09fa 100644
> --- a/mm/page_alloc.c
> +++ b/mm/page_alloc.c
> @@ -2917,16 +2917,15 @@ __alloc_pages_high_priority(gfp_t gfp_mask, unsigned int order,
> {
> struct page *page;
>
> - do {
> + for (;;) {
> page = get_page_from_freelist(gfp_mask, order,
> ALLOC_NO_WATERMARKS, ac);
>
> - if (!page && gfp_mask & __GFP_NOFAIL)
> - wait_iff_congested(ac->preferred_zone, BLK_RW_ASYNC,
> - HZ/50);
> - } while (!page && (gfp_mask & __GFP_NOFAIL));
> -
> - return page;
> + if (page || (gfp_mask & (__GFP_NOFAIL | __GFP_WAIT)) !=
> + (__GFP_NOFAIL | __GFP_WAIT))
> + return page;
> + wait_iff_congested(ac->preferred_zone, BLK_RW_ASYNC, HZ/50);
> + }
> }
>
> static void wake_all_kswapds(unsigned int order, const struct alloc_context *ac)
> --
> 1.8.3.1
--
Michal Hocko
SUSE Labs
--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org. For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>
^ permalink raw reply [flat|nested] 5+ messages in thread* Re: [REPOST] [PATCH 2/2] mm: Fix potentially scheduling in GFP_ATOMIC allocations.
2015-08-23 7:23 [REPOST] [PATCH 2/2] mm: Fix potentially scheduling in GFP_ATOMIC allocations Tetsuo Handa
2015-08-24 8:54 ` Michal Hocko
@ 2015-09-01 22:21 ` David Rientjes
2015-09-03 8:55 ` Tetsuo Handa
1 sibling, 1 reply; 5+ messages in thread
From: David Rientjes @ 2015-09-01 22:21 UTC (permalink / raw)
To: Tetsuo Handa; +Cc: mhocko, hannes, linux-mm
On Sun, 23 Aug 2015, Tetsuo Handa wrote:
> >From 08a638e04351386ab03cd1223988ac7940d4d3aa Mon Sep 17 00:00:00 2001
> From: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
> Date: Sat, 1 Aug 2015 22:46:12 +0900
> Subject: [PATCH 2/2] mm: Fix potentially scheduling in GFP_ATOMIC
> allocations.
>
> Currently, if somebody does GFP_ATOMIC | __GFP_NOFAIL allocation,
> wait_iff_congested() might be called via __alloc_pages_high_priority()
> before reaching
>
> if (!wait) {
> WARN_ON_ONCE(gfp_mask & __GFP_NOFAIL);
> goto nopage;
> }
>
> because gfp_to_alloc_flags() includes ALLOC_NO_WATERMARKS if TIF_MEMDIE
> was set.
>
> We need to check for __GFP_WAIT flag at __alloc_pages_high_priority()
> in order to make sure that we won't schedule.
>
I've brought the GFP_ATOMIC | __GFP_NOFAIL combination up before, which
resulted in the WARN_ON_ONCE() that you cited. We don't support such a
combination. Fixing up the documentation in any places you feel it is
deficient would be the best.
--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org. For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>
^ permalink raw reply [flat|nested] 5+ messages in thread* Re: [REPOST] [PATCH 2/2] mm: Fix potentially scheduling in GFP_ATOMIC allocations.
2015-09-01 22:21 ` David Rientjes
@ 2015-09-03 8:55 ` Tetsuo Handa
2015-09-09 22:23 ` David Rientjes
0 siblings, 1 reply; 5+ messages in thread
From: Tetsuo Handa @ 2015-09-03 8:55 UTC (permalink / raw)
To: rientjes; +Cc: mhocko, hannes, linux-mm
David Rientjes wrote:
> On Sun, 23 Aug 2015, Tetsuo Handa wrote:
>
> > >From 08a638e04351386ab03cd1223988ac7940d4d3aa Mon Sep 17 00:00:00 2001
> > From: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
> > Date: Sat, 1 Aug 2015 22:46:12 +0900
> > Subject: [PATCH 2/2] mm: Fix potentially scheduling in GFP_ATOMIC
> > allocations.
> >
> > Currently, if somebody does GFP_ATOMIC | __GFP_NOFAIL allocation,
> > wait_iff_congested() might be called via __alloc_pages_high_priority()
> > before reaching
> >
> > if (!wait) {
> > WARN_ON_ONCE(gfp_mask & __GFP_NOFAIL);
> > goto nopage;
> > }
> >
> > because gfp_to_alloc_flags() includes ALLOC_NO_WATERMARKS if TIF_MEMDIE
> > was set.
> >
> > We need to check for __GFP_WAIT flag at __alloc_pages_high_priority()
> > in order to make sure that we won't schedule.
> >
>
> I've brought the GFP_ATOMIC | __GFP_NOFAIL combination up before, which
> resulted in the WARN_ON_ONCE() that you cited. We don't support such a
> combination. Fixing up the documentation in any places you feel it is
> deficient would be the best.
>
The purpose of this check is to warn about unassured combination, isn't it?
Then, I think this check should be done like
----------
diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index 5b5240b..7358225 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -3046,15 +3046,8 @@ retry:
}
/* Atomic allocations - we can't balance anything */
- if (!wait) {
- /*
- * All existing users of the deprecated __GFP_NOFAIL are
- * blockable, so warn of any new users that actually allow this
- * type of allocation to fail.
- */
- WARN_ON_ONCE(gfp_mask & __GFP_NOFAIL);
+ if (!wait)
goto nopage;
- }
/* Avoid recursion of direct reclaim */
if (current->flags & PF_MEMALLOC)
@@ -3183,6 +3176,12 @@ __alloc_pages_nodemask(gfp_t gfp_mask, unsigned int order,
lockdep_trace_alloc(gfp_mask);
+ /*
+ * All existing users of the __GFP_NOFAIL have __GFP_WAIT.
+ * __GFP_NOFAIL allocations without __GFP_WAIT is unassured.
+ */
+ WARN_ON_ONCE((gfp_mask & (__GFP_NOFAIL | __GFP_WAIT)) == __GFP_NOFAIL);
+
might_sleep_if(gfp_mask & __GFP_WAIT);
if (should_fail_alloc_page(gfp_mask, order))
----------
because such allocation requests can succeed at fast path or at
/* This is the last chance, in general, before the goto nopage. */
. If unconditional WARN_ON_ONCE() is too wasteful, maybe we can do like
#ifdef CONFIG_DEBUG_SOMETHING
WARN_ON_ONCE((gfp_mask & (__GFP_NOFAIL | __GFP_WAIT)) == __GFP_NOFAIL);
#endif
.
--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org. For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>
^ permalink raw reply [flat|nested] 5+ messages in thread* Re: [REPOST] [PATCH 2/2] mm: Fix potentially scheduling in GFP_ATOMIC allocations.
2015-09-03 8:55 ` Tetsuo Handa
@ 2015-09-09 22:23 ` David Rientjes
0 siblings, 0 replies; 5+ messages in thread
From: David Rientjes @ 2015-09-09 22:23 UTC (permalink / raw)
To: Tetsuo Handa; +Cc: mhocko, hannes, linux-mm
On Thu, 3 Sep 2015, Tetsuo Handa wrote:
> diff --git a/mm/page_alloc.c b/mm/page_alloc.c
> index 5b5240b..7358225 100644
> --- a/mm/page_alloc.c
> +++ b/mm/page_alloc.c
> @@ -3046,15 +3046,8 @@ retry:
> }
>
> /* Atomic allocations - we can't balance anything */
> - if (!wait) {
> - /*
> - * All existing users of the deprecated __GFP_NOFAIL are
> - * blockable, so warn of any new users that actually allow this
> - * type of allocation to fail.
> - */
> - WARN_ON_ONCE(gfp_mask & __GFP_NOFAIL);
> + if (!wait)
> goto nopage;
> - }
>
> /* Avoid recursion of direct reclaim */
> if (current->flags & PF_MEMALLOC)
> @@ -3183,6 +3176,12 @@ __alloc_pages_nodemask(gfp_t gfp_mask, unsigned int order,
>
> lockdep_trace_alloc(gfp_mask);
>
> + /*
> + * All existing users of the __GFP_NOFAIL have __GFP_WAIT.
> + * __GFP_NOFAIL allocations without __GFP_WAIT is unassured.
> + */
> + WARN_ON_ONCE((gfp_mask & (__GFP_NOFAIL | __GFP_WAIT)) == __GFP_NOFAIL);
> +
> might_sleep_if(gfp_mask & __GFP_WAIT);
>
> if (should_fail_alloc_page(gfp_mask, order))
This is correct, but since there are no GFP_ATOMIC | __GFP_NOFAIL callers
in the tree, this would needlessly add the check to the fastpath and never
trigger. That's why it currently exists only in the slowpath. It's more
for documentation than actually triggering, although bug reports would
always be welcome to report new callers. Documentation can always be
improved, however.
--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org. For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>
^ permalink raw reply [flat|nested] 5+ messages in thread
end of thread, other threads:[~2015-09-09 22:23 UTC | newest]
Thread overview: 5+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2015-08-23 7:23 [REPOST] [PATCH 2/2] mm: Fix potentially scheduling in GFP_ATOMIC allocations Tetsuo Handa
2015-08-24 8:54 ` Michal Hocko
2015-09-01 22:21 ` David Rientjes
2015-09-03 8:55 ` Tetsuo Handa
2015-09-09 22:23 ` David Rientjes
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox