From: Johannes Weiner <hannes@cmpxchg.org>
To: Roman Gushchin <guro@fb.com>
Cc: linux-mm@kvack.org, Andrew Morton <akpm@linux-foundation.org>,
    Michal Hocko <mhocko@kernel.org>,
    Vladimir Davydov <vdavydov.dev@gmail.com>,
    Tejun Heo <tj@kernel.org>,
    kernel-team@fb.com, cgroups@vger.kernel.org,
    linux-kernel@vger.kernel.org
Subject: Re: [PATCH v3 2/4] mm: memory.low hierarchical behavior
Date: Thu, 5 Apr 2018 15:36:40 -0400
Message-ID: <20180405193640.GB27918@cmpxchg.org>
In-Reply-To: <20180405185921.4942-2-guro@fb.com>
On Thu, Apr 05, 2018 at 07:59:19PM +0100, Roman Gushchin wrote:
> This patch aims to address an issue in current memory.low semantics,
> which makes it hard to use it in a hierarchy, where some leaf memory
> cgroups are more valuable than others.
>
> For example, there are memcgs A, A/B, A/C, A/D and A/E:
>
>       A      A/memory.low = 2G, A/memory.current = 6G
>      //\\
>     BC  DE   B/memory.low = 3G  B/memory.current = 2G
>              C/memory.low = 1G  C/memory.current = 2G
>              D/memory.low = 0   D/memory.current = 2G
>              E/memory.low = 10G E/memory.current = 0
>
> If we apply memory pressure, B, C and D are reclaimed at
> the same pace while A's usage exceeds 2G.
> This is obviously wrong, as B's usage is fully below B's memory.low,
> and C has 1G of protection as well.
> Also, A is pushed down to a size below its 2G memory.low,
> which is wrong as well.
>
> A simple bash script (provided below) can be used to reproduce
> the problem. Current results are:
> A: 1430097920
> A/B: 711929856
> A/C: 717426688
> A/D: 741376
> A/E: 0
>
> To address the issue, a concept of effective memory.low is introduced.
> Effective memory.low is always equal to or less than the original
> memory.low. When there is no memory.low overcommitment (and also for
> top-level cgroups), these two values are equal.
> Otherwise it is a share of the parent's effective memory.low, proportional
> to the cgroup's memory.low usage divided by the sum of its siblings'
> memory.low usages (by memory.low usage I mean the size of actually
> protected memory: memory.current if memory.current < memory.low,
> 0 otherwise).
> It's necessary to track the actual usage, because otherwise an empty
> cgroup with memory.low set (A/E in my example) will affect actual
> memory distribution, which makes no sense. To avoid traversing
> the cgroup tree twice, page_counters code is reused.
>
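
For readers following along, the proportional rule described above can be
sketched in a few lines of POSIX shell. This is only an illustration of the
arithmetic, not the kernel code: the function names low_usage and
effective_low are hypothetical, values are in MiB, the sum is assumed to
include the cgroup itself, the clamp to the cgroup's own memory.low is
inferred from "always equal to or less than", and the usage snapshot
(B at 2048 MiB, C at 768 MiB) is made up for the example.

```shell
#!/bin/sh
# Illustrative sketch only; names and the sample snapshot are hypothetical.
# All values are in MiB so the integer arithmetic stays far from overflow.

# Protected usage as defined in the commit message:
# memory.current if memory.current < memory.low, 0 otherwise.
low_usage() {
    # $1 = memory.current, $2 = memory.low
    if [ "$1" -lt "$2" ]; then
        echo "$1"
    else
        echo 0
    fi
}

# Effective memory.low of a child when the parent is overcommitted.
# $1 = child's memory.low
# $2 = child's low_usage
# $3 = parent's effective memory.low
# $4 = sum of low_usage over all children (assumed to include this child)
effective_low() {
    if [ "$4" -eq 0 ]; then
        # Nothing is protected, so there is nothing to divide.
        # Returning 0 is a placeholder choice of this sketch.
        echo 0
        return
    fi
    share=$(( $3 * $2 / $4 ))
    # Assumed clamp: never exceed the child's own memory.low.
    if [ "$share" -lt "$1" ]; then
        echo "$share"
    else
        echo "$1"
    fi
}

# Hypothetical snapshot of A's children (current, low), in MiB.
b=$(low_usage 2048 3072)    # B is below its low: counts fully
c=$(low_usage 768 1024)     # C is below its low: counts fully
d=$(low_usage 2048 0)       # D has no protection
e=$(low_usage 0 10240)      # E is empty, so it contributes nothing
sum=$(( b + c + d + e ))

echo "B: $(effective_low 3072 "$b" 2048 "$sum")"
echo "C: $(effective_low 1024 "$c" 2048 "$sum")"
echo "E: $(effective_low 10240 "$e" 2048 "$sum")"
```

With A's effective memory.low at 2048 MiB this prints 1489 for B, 558 for C
and 0 for E, which shows why tracking actual usage keeps the empty A/E from
claiming a share despite its 10G setting.
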
> Calculating effective memory.low can be done in the reclaim path,
> as we conveniently traverse the cgroup tree from top to bottom and
> check memory.low on each level. So it's a perfect place to calculate
> effective memory.low and save it for use by the children cgroups.
>
> This also eliminates the need to traverse the cgroup tree from bottom
> to top each time to check whether a parent's guarantee is exceeded.
>
> Setting/resetting effective memory.low is intentionally racy, but
> it's fine and shouldn't lead to any significant differences in
> actual memory distribution.
>
> With this patch applied, the results match expectations:
> A: 2147930112
> A/B: 1428721664
> A/C: 718393344
> A/D: 815104
> A/E: 0
>
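
As a quick sanity check on the two result sets quoted above, the distance of
A's usage from its 2G memory.low can be computed directly. The helper name
margin is hypothetical; the byte counts are the ones reported in the commit
message.

```shell
#!/bin/sh
# A/memory.low = 2G, expressed in bytes.
LOW_A=$(( 2 * 1024 * 1024 * 1024 ))

# Print how far a usage value (in bytes) sits above (+) or below (-)
# A's memory.low.
margin() {
    echo $(( $1 - $LOW_A ))
}

margin 1430097920    # A before the patch
margin 2147930112    # A with the patch applied
```

This prints -717385728 and 446464: before the patch A is reclaimed roughly
684 MiB below its protection, while with the patch it settles just above 2G.
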
> Test script:
> #!/bin/bash
>
> CGPATH="/sys/fs/cgroup"
>
> truncate /file1 --size 2G
> truncate /file2 --size 2G
> truncate /file3 --size 2G
> truncate /file4 --size 50G
>
> mkdir "${CGPATH}/A"
> echo "+memory" > "${CGPATH}/A/cgroup.subtree_control"
> mkdir "${CGPATH}/A/B" "${CGPATH}/A/C" "${CGPATH}/A/D" "${CGPATH}/A/E"
>
> echo 2G > "${CGPATH}/A/memory.low"
> echo 3G > "${CGPATH}/A/B/memory.low"
> echo 1G > "${CGPATH}/A/C/memory.low"
> echo 0 > "${CGPATH}/A/D/memory.low"
> echo 10G > "${CGPATH}/A/E/memory.low"
>
> echo $$ > "${CGPATH}/A/B/cgroup.procs" && vmtouch -qt /file1
> echo $$ > "${CGPATH}/A/C/cgroup.procs" && vmtouch -qt /file2
> echo $$ > "${CGPATH}/A/D/cgroup.procs" && vmtouch -qt /file3
> echo $$ > "${CGPATH}/cgroup.procs" && vmtouch -qt /file4
>
> echo "A: " `cat "${CGPATH}/A/memory.current"`
> echo "A/B: " `cat "${CGPATH}/A/B/memory.current"`
> echo "A/C: " `cat "${CGPATH}/A/C/memory.current"`
> echo "A/D: " `cat "${CGPATH}/A/D/memory.current"`
> echo "A/E: " `cat "${CGPATH}/A/E/memory.current"`
>
> rmdir "${CGPATH}/A/B" "${CGPATH}/A/C" "${CGPATH}/A/D" "${CGPATH}/A/E"
> rmdir "${CGPATH}/A"
> rm /file1 /file2 /file3 /file4
>
> Signed-off-by: Roman Gushchin <guro@fb.com>
> Cc: Andrew Morton <akpm@linux-foundation.org>
> Cc: Johannes Weiner <hannes@cmpxchg.org>
> Cc: Michal Hocko <mhocko@kernel.org>
> Cc: Vladimir Davydov <vdavydov.dev@gmail.com>
> Cc: Tejun Heo <tj@kernel.org>
> Cc: kernel-team@fb.com
> Cc: linux-mm@kvack.org
> Cc: cgroups@vger.kernel.org
> Cc: linux-kernel@vger.kernel.org
Acked-by: Johannes Weiner <hannes@cmpxchg.org>