From: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
To: linux-mm@kvack.org
Subject: [PATCH 4/5] mm: Drop __GFP_WAIT flag when allocating from shrinker functions.
Date: Sun, 23 Nov 2014 13:52:48 +0900
Message-ID: <201411231352.IFC13048.LOOJQMFtFVSHFO@I-love.SAKURA.ne.jp>
In-Reply-To: <201411231349.CAG78628.VFQFOtOSFJMOLH@I-love.SAKURA.ne.jp>
From b248c31988ea582d2d4f4093fb8b649be91174bb Mon Sep 17 00:00:00 2001
From: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
Date: Sun, 23 Nov 2014 13:40:47 +0900
Subject: [PATCH 4/5] mm: Drop __GFP_WAIT flag when allocating from shrinker functions.

Memory allocations from shrinker functions are complicated.
If unexpected flags are stored in "struct shrink_control"->gfp_mask and
used inside shrinker functions, they can cause difficult-to-trigger bugs
like https://bugzilla.kernel.org/show_bug.cgi?id=87891 .

Also, __alloc_pages_nodemask() uses a lot of stack. If we allow unlimited
recursive __alloc_pages_nodemask() calls, the kernel stack could overflow
under extreme memory pressure.

Some shrinker functions use sleepable locks, which could make kswapd
sleep for an unpredictable duration. If kswapd is unexpectedly blocked
inside shrinker functions and somebody is expecting that kswapd is running
and reclaiming memory (e.g.

    while (unlikely(too_many_isolated(zone, file, sc))) {
            congestion_wait(BLK_RW_ASYNC, HZ/10);

            /* We are about to die and free our memory. Return now. */
            if (fatal_signal_pending(current))
                    return SWAP_CLUSTER_MAX;
    }

in shrink_inactive_list()), it is a memory allocation deadlock.

This patch drops the __GFP_WAIT flag when allocating from shrinker functions
so that recursive __alloc_pages_nodemask() calls will not cause trouble such
as recursive locking and/or unpredictable sleeps. The comments in this patch
advise authors of shrinker functions to avoid sleepable locks and memory
allocations inside shrinker functions, as with the TTM driver's shrinker
functions.

Signed-off-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
---
mm/page_alloc.c | 35 +++++++++++++++++++++++++++++++++++
1 file changed, 35 insertions(+)
diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index 11cc37d..c77418e 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -2801,6 +2801,41 @@ __alloc_pages_nodemask(gfp_t gfp_mask, unsigned int order,
*/
current->gfp_start = jiffies;
current->gfp_flags = gfp_mask;
+ } else {
+ /*
+ * When this function is called from interrupt context,
+ * the caller must not include __GFP_WAIT flag.
+ *
+ * When this function is called by recursive
+ * __alloc_pages_nodemask() calls from shrinker functions,
+ * the context might allow __GFP_WAIT flag. But since this
+ * function consumes a lot of kernel stack, kernel stack
+ * could overflow under extreme memory pressure if we
+ * unlimitedly allow recursive __alloc_pages_nodemask() calls.
+ * Also, if kswapd is unexpectedly blocked for unpredictable
+ * duration inside shrinker functions, and somebody is
+ * expecting that kswapd is running for reclaiming memory,
+ * it is a memory allocation deadlock.
+ *
+ * If current->gfp_flags != 0 here, it means that this function
+ * is called from either interrupt context or shrinker
+ * functions. Thus, it should be safe to drop __GFP_WAIT flag.
+ *
+ * Moreover, we don't need to check for current->gfp_flags != 0
+ * here because omit_timestamp == true is equivalent to
+ * (gfp_mask & __GFP_WAIT) == 0 and/or current->gfp_flags != 0.
+ * Dropping __GFP_WAIT flag when (gfp_mask & __GFP_WAIT) == 0
+ * is a no-op.
+ *
+ * By dropping __GFP_WAIT flag, kswapd will no longer be blocked
+ * by recursive __alloc_pages_nodemask() calls from shrinker
+ * functions. Note that kswapd could still be blocked for
+ * unpredictable duration if sleepable locks are used inside
+ * shrinker functions. Therefore, please try to avoid use of
+ * sleepable locks and memory allocations from shrinker
+ * functions.
+ */
+ gfp_mask &= ~__GFP_WAIT;
}
gfp_mask &= gfp_allowed_mask;
--
1.8.3.1