From: Shakeel Butt <shakeelb@google.com>
To: Ivan Babrou <ivan@cloudflare.com>
Cc: "Daniel Dao" <dqminh@cloudflare.com>,
kernel-team <kernel-team@cloudflare.com>,
"Linux MM" <linux-mm@kvack.org>,
"Johannes Weiner" <hannes@cmpxchg.org>,
"Roman Gushchin" <guro@fb.com>, "Feng Tang" <feng.tang@intel.com>,
"Michal Hocko" <mhocko@kernel.org>,
"Hillf Danton" <hdanton@sina.com>,
"Michal Koutný" <mkoutny@suse.com>,
"Andrew Morton" <akpm@linux-foundation.org>,
"Linus Torvalds" <torvalds@linux-foundation.org>
Subject: Re: Regression in workingset_refault latency on 5.15
Date: Wed, 23 Feb 2022 12:28:23 -0800
Message-ID: <CALvZod4TRnsWuq3J1jmBuX6eDW9vmdqO_ncinSFEDo0CAS2BnQ@mail.gmail.com>
In-Reply-To: <CABWYdi0yzaKdOC2r6tgWODmQtHT_8NgaHQhrFEo_to0BWDKP2A@mail.gmail.com>
On Wed, Feb 23, 2022 at 11:28 AM Ivan Babrou <ivan@cloudflare.com> wrote:
>
[...]
> > 2) Can you please use the similar bpf+kprobe tracing for the
> > memcg_rstat_updated() (or __mod_memcg_lruvec_state()) to find the
> > source of frequent stat updates.
>
> "memcg_rstat_updated" is "static inline".
>
> With the following:
>
> bpftrace -e 'kprobe:__mod_memcg_lruvec_state { @stacks[kstack(10)]++ }'
>
[...]
Thanks, this is helpful. Most of the stat updates appear to happen on
anon page faults and, judging by the stack signature, they look like
swap refaults.
>
> > 3) I am still pondering why disabling swap resolves the issue for you.
> > Is that only for a workload different from xfs read?
>
> My understanding is that any block IO (including swap) triggers new
> memcg accounting code. In our process we don't have any other IO than
> swap, so disabling swap removes the major (if not only) vector of
> triggering this issue.
>
Now I understand why disabling swap helps in your case: the number of
stat updates drops drastically, so the rstat flush happens
asynchronously most of the time.
[...]
> I should mention that there are really two issues:
>
> 1. Expensive workingset_refault, which shows up on flamegraphs. We see
> it for our rocksdb based database, which persists data on xfs (local
> nvme).
> 2. Expensive workingset_refault that causes latency hiccups, but
> doesn't show up on flamegraphs. We see it in our nginx based proxy
> with swap enabled (either zram or regular file on xfs on local nvme).
>
> We solved the latter by disabling swap. I think the proper solution
> would be for workingset_refault to be fast enough to be invisible, in
> line with what was happening on Linux 5.10.
Thanks for the info. Is it possible to test this patch?
https://lore.kernel.org/all/20210929235936.2859271-1-shakeelb@google.com/
If that patch does not help, then we have to either optimize rstat
flushing or further increase the update buffer, which is nr_cpus * 32.
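
To make the "nr_cpus * 32" figure and the sync-vs-async point above concrete, here is a rough toy model in Python. It is not kernel code: the function names, the 2-second background flush period, and the even spreading of updates between reads are all assumptions made for illustration. It only shows the general idea that a reader pays for a synchronous flush once the pending updates exceed the buffer, while a low update rate lets the periodic flush absorb everything.

```python
# Toy model, not kernel code. All names and the async period are assumptions.
UPDATES_PER_CPU = 32  # per-CPU factor taken from "nr_cpus * 32" in this thread


def flush_threshold(nr_cpus: int, per_cpu: int = UPDATES_PER_CPU) -> int:
    """Pending stat updates tolerated before a reader flushes synchronously."""
    return nr_cpus * per_cpu


def count_sync_flushes(update_rate: int, read_rate: int, nr_cpus: int,
                       seconds: int, async_period: int = 2) -> int:
    """Count reader-side (synchronous) flushes over a simulated interval.

    update_rate: stat updates per second (e.g. driven by swap refaults)
    read_rate:   stat reads per second (e.g. workingset_refault calls)
    Assumes update_rate >= read_rate; updates are spread evenly over reads.
    """
    threshold = flush_threshold(nr_cpus)
    pending = 0
    sync = 0
    for second in range(1, seconds + 1):
        for _ in range(read_rate):
            pending += update_rate // read_rate
            if pending > threshold:
                sync += 1    # the reader pays for the flush
                pending = 0
        if second % async_period == 0:
            pending = 0      # periodic worker flushes in the background
    return sync


if __name__ == "__main__":
    print(flush_threshold(64))                          # 64 * 32 = 2048
    print(count_sync_flushes(500, 100, 64, 10))         # low update rate
    print(count_sync_flushes(100_000, 100, 64, 10))     # high update rate
```

In this model, raising the per-CPU factor or lowering the update rate (for example by disabling swap) both push the reader-side flush count toward zero, which matches the two options discussed above.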