From: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
To: mhocko@kernel.org, akpm@linux-foundation.org
Cc: rientjes@google.com, hannes@cmpxchg.org, guro@fb.com,
linux-mm@kvack.org, linux-kernel@vger.kernel.org,
mhocko@suse.com
Subject: Re: [PATCH 1/2] mm, oom: do not rely on TIF_MEMDIE for memory reserves access
Date: Wed, 2 Aug 2017 00:30:33 +0900 [thread overview]
Message-ID: <201708020030.ACB04683.JLHMFVOSFFOtOQ@I-love.SAKURA.ne.jp> (raw)
In-Reply-To: <20170727090357.3205-2-mhocko@kernel.org>
Michal Hocko wrote:
> CONFIG_MMU=n doesn't have oom reaper so let's stick to the original
> ALLOC_NO_WATERMARKS approach but be careful because they still might
> deplete all the memory reserves so keep the semantic as close to the
> original implementation as possible and give them access to memory
> reserves only up to exit_mm (when tsk->mm is cleared) rather than while
> tsk_is_oom_victim which is until signal struct is gone.
Currently memory allocations from __mmput() can use memory reserves but
this patch changes __mmput() not to use memory reserves. You say "keep
the semantic as close to the original implementation as possible" but
this change is not guaranteed to be safe.
> @@ -2943,10 +2943,19 @@ bool __zone_watermark_ok(struct zone *z, unsigned int order, unsigned long mark,
> * the high-atomic reserves. This will over-estimate the size of the
> * atomic reserve but it avoids a search.
> */
> - if (likely(!alloc_harder))
> + if (likely(!alloc_harder)) {
> free_pages -= z->nr_reserved_highatomic;
> - else
> - min -= min / 4;
> + } else {
> + /*
> + * OOM victims can try even harder than normal ALLOC_HARDER
> + * users
> + */
> + if (alloc_flags & ALLOC_OOM)
ALLOC_OOM is ALLOC_NO_WATERMARKS if CONFIG_MMU=n.
I wonder this test makes sense for ALLOC_NO_WATERMARKS.
> + min -= min / 2;
> + else
> + min -= min / 4;
> + }
> +
>
> #ifdef CONFIG_CMA
> /* If allocation can't use CMA areas don't use free CMA pages */
> @@ -3603,6 +3612,22 @@ gfp_to_alloc_flags(gfp_t gfp_mask)
> return alloc_flags;
> }
>
> +static bool oom_reserves_allowed(struct task_struct *tsk)
> +{
> + if (!tsk_is_oom_victim(tsk))
> + return false;
> +
> + /*
> + * !MMU doesn't have oom reaper so we shouldn't risk the memory reserves
> + * depletion and shouldn't give access to memory reserves passed the
> + * exit_mm
> + */
> + if (!IS_ENABLED(CONFIG_MMU) && !tsk->mm)
> + return false;
Branching based on CONFIG_MMU is ugly. I suggest timeout based next OOM
victim selection if CONFIG_MMU=n. Then, we no longer need to worry about
memory reserves depletion and we can treat equally.
> +
> + return true;
> +}
> +
> bool gfp_pfmemalloc_allowed(gfp_t gfp_mask)
> {
> if (unlikely(gfp_mask & __GFP_NOMEMALLOC))
> @@ -3770,6 +3795,7 @@ __alloc_pages_slowpath(gfp_t gfp_mask, unsigned int order,
> unsigned long alloc_start = jiffies;
> unsigned int stall_timeout = 10 * HZ;
> unsigned int cpuset_mems_cookie;
> + bool reserves;
>
> /*
> * In the slowpath, we sanity check order to avoid ever trying to
> @@ -3875,15 +3901,24 @@ __alloc_pages_slowpath(gfp_t gfp_mask, unsigned int order,
> if (gfp_mask & __GFP_KSWAPD_RECLAIM)
> wake_all_kswapds(order, ac);
>
> - if (gfp_pfmemalloc_allowed(gfp_mask))
> - alloc_flags = ALLOC_NO_WATERMARKS;
> + /*
> + * Distinguish requests which really need access to whole memory
> + * reserves from oom victims which can live with their own reserve
> + */
> + reserves = gfp_pfmemalloc_allowed(gfp_mask);
> + if (reserves) {
> + if (tsk_is_oom_victim(current))
> + alloc_flags = ALLOC_OOM;
If reserves == true due to reasons other than tsk_is_oom_victim(current) == true
(e.g. __GFP_MEMALLOC), why dare to reduce it?
> + else
> + alloc_flags = ALLOC_NO_WATERMARKS;
> + }
If CONFIG_MMU=n, doing this test is silly.
if (tsk_is_oom_victim(current))
alloc_flags = ALLOC_NO_WATERMARKS;
else
alloc_flags = ALLOC_NO_WATERMARKS;
>
> /*
> * Reset the zonelist iterators if memory policies can be ignored.
> * These allocations are high priority and system rather than user
> * orientated.
> */
> - if (!(alloc_flags & ALLOC_CPUSET) || (alloc_flags & ALLOC_NO_WATERMARKS)) {
> + if (!(alloc_flags & ALLOC_CPUSET) || reserves) {
> ac->zonelist = node_zonelist(numa_node_id(), gfp_mask);
> ac->preferred_zoneref = first_zones_zonelist(ac->zonelist,
> ac->high_zoneidx, ac->nodemask);
> @@ -3960,7 +3995,7 @@ __alloc_pages_slowpath(gfp_t gfp_mask, unsigned int order,
> goto got_pg;
>
> /* Avoid allocations with no watermarks from looping endlessly */
> - if (test_thread_flag(TIF_MEMDIE) &&
> + if (tsk_is_oom_victim(current) &&
> (alloc_flags == ALLOC_NO_WATERMARKS ||
> (gfp_mask & __GFP_NOMEMALLOC)))
> goto nopage;
And you are silently changing to "!costly __GFP_DIRECT_RECLAIM allocations never fail
(even selected for OOM victims)" (i.e. updating the too small to fail memory allocation
rule) by doing alloc_flags == ALLOC_NO_WATERMARKS if CONFIG_MMU=y.
Applying this change might disturb memory allocation behavior. I don't like this patch.
--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org. For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>
next prev parent reply other threads:[~2017-08-01 15:30 UTC|newest]
Thread overview: 28+ messages / expand[flat|nested] mbox.gz Atom feed top
2017-07-27 9:03 [PATCH 0/2] mm, oom: do not grant oom victims full " Michal Hocko
2017-07-27 9:03 ` [PATCH 1/2] mm, oom: do not rely on TIF_MEMDIE for " Michal Hocko
2017-08-01 15:30 ` Tetsuo Handa [this message]
2017-08-01 16:52 ` Michal Hocko
2017-08-02 6:10 ` Michal Hocko
2017-08-03 1:39 ` Tetsuo Handa
2017-08-03 7:06 ` Michal Hocko
2017-08-03 8:03 ` Tetsuo Handa
2017-08-03 8:21 ` Michal Hocko
2017-08-02 8:29 ` [PATCH v2 " Michal Hocko
2017-08-03 9:37 ` Mel Gorman
2017-08-03 11:00 ` Michal Hocko
2017-08-03 12:22 ` Mel Gorman
2017-07-27 9:03 ` [PATCH 2/2] mm: replace TIF_MEMDIE checks by tsk_is_oom_victim Michal Hocko
2017-07-27 14:01 ` Tetsuo Handa
2017-07-27 14:08 ` Tetsuo Handa
2017-07-27 14:18 ` Michal Hocko
2017-07-27 14:45 ` Michal Hocko
2017-07-27 14:55 ` Roman Gushchin
2017-07-29 8:33 ` kbuild test robot
2017-07-31 6:46 ` Michal Hocko
2017-08-01 12:16 ` [PATCH 0/2] mm, oom: do not grant oom victims full memory reserves access Michal Hocko
2017-08-01 12:23 ` Roman Gushchin
2017-08-01 12:29 ` Michal Hocko
2017-08-01 12:42 ` Roman Gushchin
2017-08-01 12:54 ` Michal Hocko
2017-08-07 14:21 ` Michal Hocko
2017-08-10 7:50 [PATCH v2 " Michal Hocko
2017-08-10 7:50 ` [PATCH 1/2] mm, oom: do not rely on TIF_MEMDIE for " Michal Hocko
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=201708020030.ACB04683.JLHMFVOSFFOtOQ@I-love.SAKURA.ne.jp \
--to=penguin-kernel@i-love.sakura.ne.jp \
--cc=akpm@linux-foundation.org \
--cc=guro@fb.com \
--cc=hannes@cmpxchg.org \
--cc=linux-kernel@vger.kernel.org \
--cc=linux-mm@kvack.org \
--cc=mhocko@kernel.org \
--cc=mhocko@suse.com \
--cc=rientjes@google.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox