From: "Christopherson, Sean J" <sean.j.christopherson@intel.com>
To: "mhocko@kernel.org" <mhocko@kernel.org>
Cc: "hannes@cmpxchg.org" <hannes@cmpxchg.org>,
"vdavydov.dev@gmail.com" <vdavydov.dev@gmail.com>,
"cgroups@vger.kernel.org" <cgroups@vger.kernel.org>,
"linux-mm@kvack.org" <linux-mm@kvack.org>
Subject: RE: [PATCH] mm/memcontrol: exclude @root from checks in mem_cgroup_low
Date: Fri, 2 Jun 2017 20:22:50 +0000 [thread overview]
Message-ID: <37306EFA9975BE469F115FDE982C075BC6112095@ORSMSX108.amr.corp.intel.com> (raw)
In-Reply-To: <1496434412-21005-1-git-send-email-sean.j.christopherson@intel.com>
Christopherson, Sean J <sean.j.christopherson@intel.com> wrote:
> Make @root exclusive in mem_cgroup_low; it is never considered low
> when looked at directly and is not checked when traversing the tree.
> In effect, @root is handled identically to how root_mem_cgroup was
> previously handled by mem_cgroup_low.
>
> If @root is not excluded from the checks, a cgroup underneath @root
> will never be considered low during targeted reclaim of @root, e.g.
> due to memory.current > memory.high, unless @root is misconfigured
> to have memory.low > memory.high.
>
> Excluding @root enables using memory.low to prioritize memory usage
> between cgroups within a subtree of the hierarchy that is limited by
> memory.high or memory.max, e.g. when ROOT owns @root's controls but
> delegates the @root directory to a USER so that USER can create and
> administer children of @root.
>
> For example, given cgroup A with children B and C:
>
> A
> / \
> B C
>
> and
>
> 1. A/memory.current > A/memory.high
> 2. A/B/memory.current < A/B/memory.low
> 3. A/C/memory.current >= A/C/memory.low
>
> As 'A' is high, i.e. triggers reclaim from 'A', and 'B' is low, we
> should reclaim from 'C' until 'A' is no longer high or until we can
> no longer reclaim from 'C'. If 'A', i.e. @root, isn't excluded by
> mem_cgroup_low when reclaming from 'A', then 'B' won't be considered
> low and we will reclaim indiscriminately from both 'B' and 'C'.
>
> Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com>
Here is the test I used to confirm the bug and the patch.
20:00:55@sjchrist-vm ? ~ $ cat ~/.bin/memcg_low_test
#!/bin/bash
x62mb=$((62<<20))
x66mb=$((66<<20))
x94mb=$((94<<20))
x98mb=$((98<<20))
setup() {
set -e
if [[ -n $DEBUG ]]; then
set -x
fi
trap teardown EXIT HUP INT TERM
if [[ ! -e /mnt/1gb.swap ]]; then
sudo fallocate -l 1G /mnt/1gb.swap > /dev/null
sudo mkswap /mnt/1gb.swap > /dev/null
fi
if ! swapon --show=NAME | grep -q "/mnt/1gb.swap"; then
sudo swapon /mnt/1gb.swap
fi
if [[ ! -e /cgroup/cgroup.controllers ]]; then
sudo mount -t cgroup2 none /cgroup
fi
grep -q memory /cgroup/cgroup.controllers
sudo sh -c "echo '+memory' > /cgroup/cgroup.subtree_control"
sudo mkdir /cgroup/A && sudo chown $USER:$USER /cgroup/A
sudo sh -c "echo '+memory' > /cgroup/A/cgroup.subtree_control"
sudo sh -c "echo '96m' > /cgroup/A/memory.high"
mkdir /cgroup/A/0
mkdir /cgroup/A/1
echo 64m > /cgroup/A/0/memory.low
}
teardown() {
set +e
trap - EXIT HUP INT TERM
if [[ -z $1 ]]; then
printf "\n"
printf "%0.s*" {1..35}
printf "\nFAILED!\n\n"
tail /cgroup/A/**/memory.current
printf "%0.s*" {1..35}
printf "\n\n"
fi
ps | grep stress | tr -s ' ' | cut -f 2 -d ' ' | xargs -I % kill %
sleep 2
if [[ -e /cgroup/A/0 ]]; then
rmdir /cgroup/A/0
fi
if [[ -e /cgroup/A/1 ]]; then
rmdir /cgroup/A/1
fi
if [[ -e /cgroup/A ]]; then
sudo rmdir /cgroup/A
fi
}
stress_test() {
sudo sh -c "echo $$ > /cgroup/A/$1/cgroup.procs"
stress --vm 1 --vm-bytes 64M --vm-keep > /dev/null &
sudo sh -c "echo $$ > /cgroup/A/$2/cgroup.procs"
stress --vm 1 --vm-bytes 64M --vm-keep > /dev/null &
sudo sh -c "echo $$ > /cgroup/cgroup.procs"
sleep 1
# A/0 should be consuming more memory than A/1
[[ $(cat /cgroup/A/0/memory.current) -ge $(cat /cgroup/A/1/memory.current) ]]
# A/0 should be consuming ~64mb
[[ $(cat /cgroup/A/0/memory.current) -ge $x62mb ]] && [[ $(cat /cgroup/A/0/memory.current) -le $x66mb ]]
# A should cumulatively be consuming ~96mb
[[ $(cat /cgroup/A/memory.current) -ge $x94mb ]] && [[ $(cat /cgroup/A/memory.current) -le $x98mb ]]
# Stop the stressors
ps | grep stress | tr -s ' ' | cut -f 2 -d ' ' | xargs -I % kill %
}
teardown 1
setup
for ((i=1;i<=$1;i++)); do
printf "ITERATION $i of $1 - stress_test 0 1"
stress_test 0 1
printf "\x1b[2K\r"
printf "ITERATION $i of $1 - stress_test 1 0"
stress_test 1 0
printf "\x1b[2K\r"
printf "ITERATION $i of $1 - PASSED\n"
done
teardown 1
echo PASSED!
20:11:26@sjchrist-vm ? ~ $ memcg_low_test 10
--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org. For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>
next prev parent reply other threads:[~2017-06-02 20:22 UTC|newest]
Thread overview: 6+ messages / expand[flat|nested] mbox.gz Atom feed top
2017-06-02 20:13 Sean Christopherson
2017-06-02 20:22 ` Christopherson, Sean J [this message]
2017-06-03 19:15 ` Vladimir Davydov
2017-06-05 6:45 ` Balbir Singh
2017-06-05 15:16 ` Johannes Weiner
2017-06-06 10:29 ` Michal Hocko
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=37306EFA9975BE469F115FDE982C075BC6112095@ORSMSX108.amr.corp.intel.com \
--to=sean.j.christopherson@intel.com \
--cc=cgroups@vger.kernel.org \
--cc=hannes@cmpxchg.org \
--cc=linux-mm@kvack.org \
--cc=mhocko@kernel.org \
--cc=vdavydov.dev@gmail.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox