From: Ying Han <yinghan@google.com>
To: Michal Hocko <mhocko@suse.cz>
Cc: Johannes Weiner <hannes@cmpxchg.org>, Mel Gorman <mel@csn.ul.ie>,
KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>,
Rik van Riel <riel@redhat.com>, Hillf Danton <dhillf@gmail.com>,
Hugh Dickins <hughd@google.com>,
Dan Magenheimer <dan.magenheimer@oracle.com>,
Andrew Morton <akpm@linux-foundation.org>,
linux-mm@kvack.org
Subject: Re: [PATCH V3 0/2] memcg softlimit reclaim rework
Date: Thu, 19 Apr 2012 10:47:27 -0700
Message-ID: <CALWz4iw156qErZn0gGUUatUTisy_6uF_5mrY0kXt1W89hvVjRw@mail.gmail.com>
In-Reply-To: <20120419170434.GE15634@tiehlicka.suse.cz>
On Thu, Apr 19, 2012 at 10:04 AM, Michal Hocko <mhocko@suse.cz> wrote:
> On Wed 18-04-12 11:00:40, Ying Han wrote:
>> On Wed, Apr 18, 2012 at 5:24 AM, Johannes Weiner <hannes@cmpxchg.org> wrote:
>> > On Tue, Apr 17, 2012 at 09:37:46AM -0700, Ying Han wrote:
>> >> The "soft_limit" was introduced in memcg to support over-committing the
>> >> memory resource on the host. Each cgroup configures its "hard_limit" where
>> >> it will be throttled or OOM killed by going over the limit. However, the
>> >> cgroup can go above the "soft_limit" as long as there is no system-wide
>> >> memory contention. So, the "soft_limit" is the kernel mechanism for
>> >> re-distributing system spare memory among cgroups.
>> >>
>> >> This patch reworks the softlimit reclaim by hooking it into the new global
>> >> reclaim scheme. So the global reclaim path including direct reclaim and
>> >> background reclaim will respect the memcg softlimit.
>> >>
>> >> v3..v2:
>> >> 1. rebase the patch on 3.4-rc3
>> >> 2. squash the commits of replacing the old implementation with new
>> >> implementation into one commit. This is to make sure to leave the tree
>> >> in stable state between each commit.
>> >> 3. removed the commit which changes the nr_to_reclaim for global reclaim
>> >> case. The need of that patch is not obvious now.
>> >>
>> >> Note:
>> >> 1. the new implementation of softlimit reclaim is rather simple and first
>> >> step for further optimizations. there is no memory pressure balancing between
>> >> memcgs for each zone, and that is something we would like to add as follow-ups.
>> >>
>> >> 2. this patch is slightly different from the last one posted from Johannes
>> >> http://comments.gmane.org/gmane.linux.kernel.mm/72382
>> >> where his patch is closer to the reverted implementation by doing hierarchical
>> >> reclaim for each selected memcg. However, that is not expected behavior from
>> >> user perspective. Considering the following example:
>> >>
>> >> root (32G capacity)
>> >> --> A (hard limit 20G, soft limit 15G, usage 16G)
>> >>    --> A1 (soft limit 5G, usage 4G)
>> >>    --> A2 (soft limit 10G, usage 12G)
>> >> --> B (hard limit 20G, soft limit 10G, usage 16G)
>> >>
>> >> Under global reclaim, we shouldn't add pressure on A1 although its parent(A)
>> >> exceeds softlimit. This is what admin expects by setting softlimit to the
>> >> actual working set size and only reclaim pages under softlimit if system has
>> >> trouble to reclaim.
>> >
>> > Actually, this is exactly what the admin expects when creating a
>> > hierarchy, because she defines that A1 is a child of A and is
>> > responsible for the memory situation in its parent.
>
> Hmm, I guess that both approaches have cons and pros.
> * Hierarchical soft limit reclaim - reclaim the whole subtree of the over
> soft limit memcg
> + it is consistent with the hard limit reclaim
I'm not sure why we want them to be consistent. The soft limit serves
a different purpose, and one of its main purposes is to preserve the
working set of the cgroup.
> + easier for top to bottom configuration - especially when you allow
> subgroups to create deeper hierarchies. Does anybody do that?
As far as I have heard, most (if not all) users run a flat configuration
where every cgroup sits directly under root.
> - harder to set up if soft limit should act as a guarantee - might lead
> to an unexpected reclaim.
>
> * Targeted soft limit reclaim - only reclaim LRUs of over limit memcgs
> + easier to set up for the working set guarantee because admin can focus
> on the working set of a single group and not the whole hierarchy
This is true.
> - easier to construct soft unreclaimable hierarchies - whole subtree
> contributes but nobody wants to take the responsibility when we reach
> the limit.
>
> Both approaches don't play very well with the default 0 limit because we
> either reclaim unless we set up the whole hierarchy properly or we just
> burn cycles by trying to reclaim groups with no or only few pages.
Setting the default to 0 is a good optimization: it makes every cgroup
eligible for reclaim if the admin doesn't configure anything.
In reality, if the admin wants to preserve the working set of a cgroup,
he/she has to set the softlimit anyway. When doing that, it is easier to
focus only on the cgroup itself without looking up its ancestors.
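As a rough illustration of the default-0 behavior (plain Python, not the
kernel code from the patch), the sketch below applies a per-cgroup check
to a few made-up groups. The class name, field names, group names, and
sizes are all hypothetical; the only thing taken from this discussion is
that an unconfigured soft limit is assumed to be 0.

```python
from dataclasses import dataclass

GIB = 1 << 30


@dataclass
class Memcg:
    """Toy stand-in for a memory cgroup; not a kernel structure."""
    name: str
    usage: int           # bytes currently charged to the group
    soft_limit: int = 0  # assumption: an unconfigured soft limit is 0


def over_soft_limit(cg: Memcg) -> bool:
    """Look only at the cgroup itself, never at its ancestors."""
    return cg.usage > cg.soft_limit


groups = [
    Memcg("unconfigured", usage=2 * GIB),                   # default limit
    Memcg("protected", usage=4 * GIB, soft_limit=5 * GIB),  # admin set it
    Memcg("empty", usage=0),                                # nothing charged
]

# Groups that soft limit reclaim would consider under global pressure.
eligible = [cg.name for cg in groups if over_soft_limit(cg)]
print(eligible)
```

With these numbers only the unconfigured group is picked: the protected
group stays below the limit its admin chose, and the empty group has
nothing charged above 0.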
> The second approach leads to more expected results though because we do
> not touch "leaf" groups unless they are over limit.
> I have to think about that some more but it seems that the second approach
> is much easier to implement and matches the "guarantee" expectations
> more.
Agree.
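To make the difference between the two approaches concrete, here is a
small sketch (plain Python, not kernel code) that applies both
eligibility rules to the example hierarchy from the cover letter. The
class and function names are invented for illustration, root is left out
because the example gives it no soft limit, and sizes are in GB exactly
as quoted above.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Memcg:
    """Toy stand-in for a memory cgroup; not a kernel structure."""
    name: str
    usage_gb: int
    soft_limit_gb: int
    parent: Optional["Memcg"] = None


def over_soft_limit(cg: Memcg) -> bool:
    return cg.usage_gb > cg.soft_limit_gb


def eligible_targeted(cg: Memcg) -> bool:
    """Targeted: reclaim only from groups that exceed their own limit."""
    return over_soft_limit(cg)


def eligible_hierarchical(cg: Memcg) -> bool:
    """Hierarchical: a group is eligible if it or any ancestor is over."""
    node: Optional[Memcg] = cg
    while node is not None:
        if over_soft_limit(node):
            return True
        node = node.parent
    return False


# Example hierarchy from the cover letter (root omitted, see above).
a = Memcg("A", usage_gb=16, soft_limit_gb=15)
a1 = Memcg("A1", usage_gb=4, soft_limit_gb=5, parent=a)
a2 = Memcg("A2", usage_gb=12, soft_limit_gb=10, parent=a)
b = Memcg("B", usage_gb=16, soft_limit_gb=10)
groups = [a, a1, a2, b]

targeted = [g.name for g in groups if eligible_targeted(g)]
hierarchical = [g.name for g in groups if eligible_hierarchical(g)]
print("targeted:    ", targeted)
print("hierarchical:", hierarchical)
```

The only difference is A1: it stays under its own 5G soft limit, so the
targeted rule leaves it alone, while the hierarchical rule reclaims from
it because its parent A is over 15G.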
> I guess we could converge both approaches if we could reclaim from the
> leaf groups upwards to the root but I didn't think about this very much.
That is what the current patch does; it only considers the softlimit
under global pressure :)
--Ying
>
> [...]
> --
> Michal Hocko
> SUSE Labs
> SUSE LINUX s.r.o.
> Lihovarska 1060/12
> 190 00 Praha 9
> Czech Republic