From: Vlastimil Babka <vbabka@suse.cz>
To: Qi Zheng <zhengqi.arch@bytedance.com>,
akpm@linux-foundation.org, tkhai@ya.ru, hannes@cmpxchg.org,
shakeelb@google.com, mhocko@kernel.org, roman.gushchin@linux.dev,
muchun.song@linux.dev, david@redhat.com, shy828301@gmail.com,
rppt@kernel.org
Cc: sultan@kerneltoast.com, dave@stgolabs.net,
penguin-kernel@I-love.SAKURA.ne.jp, paulmck@kernel.org,
linux-mm@kvack.org, linux-kernel@vger.kernel.org
Subject: Re: [PATCH v4 3/8] mm: vmscan: make memcg slab shrink lockless
Date: Wed, 8 Mar 2023 23:46:46 +0100
Message-ID: <a5a07356-048b-562b-6748-d6d5b99acddc@suse.cz>
In-Reply-To: <20230307065605.58209-4-zhengqi.arch@bytedance.com>
On 3/7/23 07:56, Qi Zheng wrote:
> Like global slab shrink, this commit also uses SRCU to make
> memcg slab shrink lockless.
>
> We can reproduce the down_read_trylock() hotspot through the
> following script:
>
> ```
>
> DIR="/root/shrinker/memcg/mnt"
>
> do_create()
> {
>     mkdir -p /sys/fs/cgroup/memory/test
>     mkdir -p /sys/fs/cgroup/perf_event/test
>     echo 4G > /sys/fs/cgroup/memory/test/memory.limit_in_bytes
>     for i in `seq 0 $1`;
>     do
>         mkdir -p /sys/fs/cgroup/memory/test/$i;
>         echo $$ > /sys/fs/cgroup/memory/test/$i/cgroup.procs;
>         echo $$ > /sys/fs/cgroup/perf_event/test/cgroup.procs;
>         mkdir -p $DIR/$i;
>     done
> }
>
> do_mount()
> {
>     for i in `seq $1 $2`;
>     do
>         mount -t tmpfs $i $DIR/$i;
>     done
> }
>
> do_touch()
> {
>     for i in `seq $1 $2`;
>     do
>         echo $$ > /sys/fs/cgroup/memory/test/$i/cgroup.procs;
>         echo $$ > /sys/fs/cgroup/perf_event/test/cgroup.procs;
>         dd if=/dev/zero of=$DIR/$i/file$i bs=1M count=1 &
>     done
> }
>
> case "$1" in
>   touch)
>     do_touch $2 $3
>     ;;
>   test)
>     do_create 4000
>     do_mount 0 4000
>     do_touch 0 3000
>     ;;
>   *)
>     exit 1
>     ;;
> esac
> ```
>
> Save the above script, then run test and touch commands.
> Then we can use the following perf command to view hotspots:
>
> perf top -U -F 999
>
> 1) Before applying this patchset:
>
> 32.31% [kernel] [k] down_read_trylock
> 19.40% [kernel] [k] pv_native_safe_halt
> 16.24% [kernel] [k] up_read
> 15.70% [kernel] [k] shrink_slab
> 4.69% [kernel] [k] _find_next_bit
> 2.62% [kernel] [k] shrink_node
> 1.78% [kernel] [k] shrink_lruvec
> 0.76% [kernel] [k] do_shrink_slab
>
> 2) After applying this patchset:
>
> 27.83% [kernel] [k] _find_next_bit
> 16.97% [kernel] [k] shrink_slab
> 15.82% [kernel] [k] pv_native_safe_halt
> 9.58% [kernel] [k] shrink_node
> 8.31% [kernel] [k] shrink_lruvec
> 5.64% [kernel] [k] do_shrink_slab
> 3.88% [kernel] [k] mem_cgroup_iter
>
> At the same time, we use the following perf command to capture
> IPC information:
>
> perf stat -e cycles,instructions -G test -a --repeat 5 -- sleep 10
>
> 1) Before applying this patchset:
>
> Performance counter stats for 'system wide' (5 runs):
>
> 454187219766 cycles test ( +- 1.84% )
> 78896433101 instructions test # 0.17 insn per cycle ( +- 0.44% )
>
> 10.0020430 +- 0.0000366 seconds time elapsed ( +- 0.00% )
>
> 2) After applying this patchset:
>
> Performance counter stats for 'system wide' (5 runs):
>
> 841954709443 cycles test ( +- 15.80% ) (98.69%)
> 527258677936 instructions test # 0.63 insn per cycle ( +- 15.11% ) (98.68%)
>
> 10.01064 +- 0.00831 seconds time elapsed ( +- 0.08% )
>
> We can see that IPC drops very seriously when calling
> down_read_trylock() at high frequency. After using SRCU,
> the IPC is at a normal level.
The interpretation looks somewhat weird to me. I'd say the workload is
stalled a lot as it fails the trylock (there might be some optimistic
spinning perhaps) and then goes to sleep. See how "pv_native_safe_halt" is
also more prominent in the "before" profile. And because of that sleeping,
fewer instructions are executed in the same amount of cycles (as it's a
system-wide collection; otherwise it wouldn't be collecting the sleeping
processes).
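As a rough sanity check of the numbers, the quoted counters can be plugged
into a few lines of shell. This is only a sketch: the ipc and ratio helpers
are names made up here for illustration and are not part of perf.

```shell
#!/bin/sh
# Hypothetical helpers (not part of perf) to recompute figures from the
# perf stat counters quoted above.

# ipc CYCLES INSTRUCTIONS: print instructions per cycle, two decimals.
ipc() {
    awk -v c="$1" -v i="$2" 'BEGIN { printf "%.2f\n", i / c }'
}

# ratio AFTER BEFORE: print how many times larger AFTER is, one decimal.
ratio() {
    awk -v a="$1" -v b="$2" 'BEGIN { printf "%.1f\n", a / b }'
}

# Counter values copied verbatim from the two perf stat runs.
echo "IPC before: $(ipc 454187219766 78896433101)"     # 0.17
echo "IPC after:  $(ipc 841954709443 527258677936)"    # 0.63
echo "instructions after/before: $(ratio 527258677936 78896433101)"  # 6.7
```

So the "after" run executes roughly 6.7 times as many instructions in the
same 10-second window. That is consistent with less time spent halted,
though the counters alone don't prove it.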
>
> Signed-off-by: Qi Zheng <zhengqi.arch@bytedance.com>
Other than that:
Acked-by: Vlastimil Babka <vbabka@suse.cz>
A small thing below:
> ---
> mm/vmscan.c | 46 +++++++++++++++++++++++++++-------------------
> 1 file changed, 27 insertions(+), 19 deletions(-)
>
> diff --git a/mm/vmscan.c b/mm/vmscan.c
> index 8515ac40bcaf..1de9bc3e5aa2 100644
> --- a/mm/vmscan.c
> +++ b/mm/vmscan.c
> @@ -57,6 +57,7 @@
> #include <linux/khugepaged.h>
> #include <linux/rculist_nulls.h>
> #include <linux/random.h>
> +#include <linux/srcu.h>
I guess this should have been in patch 2/8 already? It may work accidentally
because some other header pulls it in transitively...