From: David Rientjes <rientjes@google.com>
To: Mel Gorman <mgorman@suse.de>
Cc: Andrew Morton <akpm@linux-foundation.org>,
KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>,
Rik van Riel <riel@redhat.com>, Minchan Kim <minchan@kernel.org>,
linux-kernel@vger.kernel.org, linux-mm@kvack.org
Subject: Re: [patch] mm, thp: abort compaction if migration page cannot be charged to memcg
Date: Thu, 21 Jun 2012 02:57:32 -0700 (PDT) [thread overview]
Message-ID: <alpine.DEB.2.00.1206210247440.15747@chino.kir.corp.google.com> (raw)
In-Reply-To: <20120621093220.GL4011@suse.de>
On Thu, 21 Jun 2012, Mel Gorman wrote:
> I take it this happens in a memcg that is full and the temporary page
> is enough to push it over the edge. Is that correct? If so, it should
> be included in the changelog because that would make this a -stable
> candidate on the grounds that it is a corner case where compaction can
> consume excessive CPU when using memcgs leading to apparant stalls.
>
Yes, the charge against the page under migration causes the oom. It's
really a nasty side-effect of memcg page migration that we have to charge
a temporary page that I wish we could address there and certainly we can
try to do that in the future. This issue has just been causing us a lot
of pain, especially for systems with a low number of very large memcgs.
I agree with your assessment that it should be added to stable and ask
that Andrew replace the old changelog with the following:
===SNIP===
mm, thp: abort compaction if migration page cannot be charged to memcg
If page migration cannot charge the temporary page to the memcg,
migrate_pages() will return -ENOMEM. This isn't considered in memory
compaction however, and the loop continues to iterate over all pageblocks
trying to isolate and migrate pages. If a small number of very large
memcgs happen to be oom, however, these attempts will mostly be futile
leading to an enormous amout of cpu consumption due to the page migration
failures.
This patch will short circuit and fail memory compaction if
migrate_pages() returns -ENOMEM. COMPACT_PARTIAL is returned in case some
migrations were successful so that the page allocator will retry.
Cc: stable@vger.kernel.org
Acked-by: Mel Gorman <mgorman@suse.de>
Signed-off-by: David Rientjes <rientjes@google.com>
===SNIP===
> However, here is a possible extention to your patch that should work while
> preserving THP success rates but needs a more messing. At the place of
> your patch do something like this in compact_zone
>
> arbitrary_mem_group = NULL
>
> ...
>
> /*
> * Break out if memcg has "unmovable" pages that disable compaction in
> * this zone
> */
> if err == -ENOMEM
> foreach page in cc->migratepages
> cgroup = page_cgroup(page)
> if cgroup
> mem_group = cgroup->mem_cgroup
> if mem_cgroup->disabled_compaction == true
> goto out
> else
> arbitrary_cgroup = mem_cgroup
>
> i.e. add a new boolean to mem_cgroup that is set to true if this memcg
> has impaired compaction. If a cgroup is not disabled_compaction then
> remember that.
>
> Next is when to set disabled_compaction. At the end of compact_zones,
> do
>
> if ret == COMPACT_COMPLETE && cc->order != -1 && arbitrary_cgroup
> arbitrary_cgroup->disabled_compaction = true
>
> i.e. if we are in direct compaction and there was a full compaction
> cycle that failed due to a cgroup getting in the way then tag that
> cgroup is "disabled_compaction". On subsequent compaction attempts if
> that cgroup is encountered again then abort compaction faster.
>
> This will mitigate a small full memcg disabling compaction for the entire
> zone at least until such time as the memcg has polluted every movable
> pageblock.
>
Interesting approach, I'll look to do something like this as a follow-up
to this patch since we have usecases that reproduce this easily.
Thanks for looking at it and the detailed analysis, Mel.
--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org. For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>
next prev parent reply other threads:[~2012-06-21 9:57 UTC|newest]
Thread overview: 11+ messages / expand[flat|nested] mbox.gz Atom feed top
2012-06-21 6:52 David Rientjes
2012-06-21 8:11 ` Minchan Kim
2012-06-21 8:30 ` David Rientjes
2012-06-21 9:01 ` David Rientjes
2012-06-21 10:05 ` Kamezawa Hiroyuki
2012-06-21 10:16 ` David Rientjes
2012-06-21 9:32 ` Mel Gorman
2012-06-21 9:57 ` David Rientjes [this message]
2012-06-25 20:40 ` Rik van Riel
2012-06-26 0:32 ` David Rientjes
2012-06-26 3:11 ` Kamezawa Hiroyuki
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=alpine.DEB.2.00.1206210247440.15747@chino.kir.corp.google.com \
--to=rientjes@google.com \
--cc=akpm@linux-foundation.org \
--cc=kamezawa.hiroyu@jp.fujitsu.com \
--cc=linux-kernel@vger.kernel.org \
--cc=linux-mm@kvack.org \
--cc=mgorman@suse.de \
--cc=minchan@kernel.org \
--cc=riel@redhat.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox