From: Yosry Ahmed <yosryahmed@google.com>
To: Shakeel Butt <shakeelb@google.com>
Cc: "Andrew Morton" <akpm@linux-foundation.org>,
"Johannes Weiner" <hannes@cmpxchg.org>,
"Michal Hocko" <mhocko@kernel.org>,
"Roman Gushchin" <roman.gushchin@linux.dev>,
"Muchun Song" <muchun.song@linux.dev>,
"Ivan Babrou" <ivan@cloudflare.com>, "Tejun Heo" <tj@kernel.org>,
"Michal Koutný" <mkoutny@suse.com>,
"Waiman Long" <longman@redhat.com>,
kernel-team@cloudflare.com, "Wei Xu" <weixugc@google.com>,
"Greg Thelen" <gthelen@google.com>,
"Domenico Cerasuolo" <cerasuolodomenico@gmail.com>,
linux-mm@kvack.org, cgroups@vger.kernel.org,
linux-kernel@vger.kernel.org
Subject: Re: [mm-unstable v4 5/5] mm: memcg: restore subtree stats flushing
Date: Mon, 4 Dec 2023 12:12:25 -0800
Message-ID: <CAJD7tkZPcBbvcK+Xj0edevemB+801wRvvcFDJEjk4ZcjNVoV_w@mail.gmail.com>
In-Reply-To: <20231202083129.3pmds2cddy765szr@google.com>
On Sat, Dec 2, 2023 at 12:31 AM Shakeel Butt <shakeelb@google.com> wrote:
>
> On Wed, Nov 29, 2023 at 03:21:53AM +0000, Yosry Ahmed wrote:
> [...]
> > +void mem_cgroup_flush_stats(struct mem_cgroup *memcg)
> >  {
> > -	if (memcg_should_flush_stats(root_mem_cgroup))
> > -		do_flush_stats();
> > +	static DEFINE_MUTEX(memcg_stats_flush_mutex);
> > +
> > +	if (mem_cgroup_disabled())
> > +		return;
> > +
> > +	if (!memcg)
> > +		memcg = root_mem_cgroup;
> > +
> > +	if (memcg_should_flush_stats(memcg)) {
> > +		mutex_lock(&memcg_stats_flush_mutex);
>
> What's the point of this mutex now? What is it providing? I understand
> we can not try_lock here due to targeted flushing. Why not just let the
> global rstat serialize the flushes? Actually this mutex can cause
> latency hiccups as the mutex owner can get resched during flush and then
> no one can flush for a potentially long time.

I was hoping this was clear from the commit message and code comments,
but apparently it was not, sorry. Let me give more context.

In previous versions and/or series, the mutex was only used for
flushes from userspace, to guard in-kernel flushers against high
contention from userspace. Later on, I kept the mutex for all memcg
flushers for the following reasons:

(a) Allow waiters to sleep:
Unlike other flushers, the memcg flushing path can see a lot of
concurrency. The mutex avoids having a lot of CPUs spinning (e.g.
concurrent reclaimers) by allowing waiters to sleep.

(b) Check the threshold under lock but before calling cgroup_rstat_flush():
Calls to cgroup_rstat_flush() are not cheap even if there's
nothing to flush, as we still need to iterate all CPUs. If flushers
contend directly on the rstat lock, overlapping flushes will
unnecessarily do the percpu iteration once they hold the lock. With
the mutex, they will check the threshold again once they hold the
mutex, and skip the flush if another flusher already did the work.

(c) Protect non-memcg flushers from contention from memcg flushers:
This is not as strong an argument as protecting in-kernel flushers
from userspace flushers.
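
To make (a) and (b) a bit more concrete, here is a rough userspace
sketch of the "check, sleep on a mutex, check again" pattern. It is
not the kernel code: pthread mutexes stand in for the kernel mutex,
and every name in it (FLUSH_THRESHOLD, pending_updates,
expensive_flush(), run_concurrent_flushers(), ...) is made up for
illustration. expensive_flush() only stands in for the cost of
cgroup_rstat_flush() walking all CPUs.

```c
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

#define FLUSH_THRESHOLD 64	/* hypothetical threshold, not the kernel's */
#define MAX_FLUSHERS 64		/* hypothetical cap for this demo */

static atomic_long pending_updates;	/* updates since the last flush */
static long flush_count;		/* protected by flush_mutex */
static pthread_mutex_t flush_mutex = PTHREAD_MUTEX_INITIALIZER;

static void record_updates(long n)
{
	atomic_fetch_add(&pending_updates, n);
}

static bool should_flush(void)
{
	return atomic_load(&pending_updates) > FLUSH_THRESHOLD;
}

/* Stand-in for the expensive per-CPU walk; caller holds flush_mutex. */
static void expensive_flush(void)
{
	atomic_store(&pending_updates, 0);
	flush_count++;
}

static void flush_stats(void)
{
	/* Cheap unlocked check: most callers return here. */
	if (!should_flush())
		return;

	/* (a): waiters sleep on the mutex instead of spinning. */
	pthread_mutex_lock(&flush_mutex);
	/* (b): check again, another flusher may have done the work. */
	if (should_flush())
		expensive_flush();
	pthread_mutex_unlock(&flush_mutex);
}

static void *flusher_thread(void *arg)
{
	(void)arg;
	flush_stats();
	return NULL;
}

/* Push pending updates over the threshold, then race nthreads flushers.
 * Returns how many of them actually ran expensive_flush(). */
static long run_concurrent_flushers(int nthreads)
{
	pthread_t threads[MAX_FLUSHERS];
	long before = flush_count;
	int created = 0;

	assert(nthreads > 0 && nthreads <= MAX_FLUSHERS);
	record_updates(FLUSH_THRESHOLD + 1);
	for (int i = 0; i < nthreads; i++) {
		if (pthread_create(&threads[created], NULL, flusher_thread, NULL) != 0)
			break;
		created++;
	}
	for (int i = 0; i < created; i++)
		pthread_join(threads[i], NULL);
	return flush_count - before;
}
```

In this sketch, no matter how many threads race in
run_concurrent_flushers(), only the first one to take the mutex pays
for the flush; the rest either fail the unlocked check or fail the
re-check after sleeping on the mutex.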

There have been discussions before about changing the rstat lock itself
to be a mutex, which would resolve (a), but there are concerns about
priority inversions if a low priority task holds the mutex and gets
preempted, as well as about how long the rstat lock holder keeps
the lock:

https://lore.kernel.org/lkml/ZO48h7c9qwQxEPPA@slm.duckdns.org/

I agree about possible hiccups due to the inner lock being dropped
while the mutex is held. Running a synthetic test with high
concurrency between reclaimers (in-kernel flushers) and stats readers
shows no material performance difference with or without the mutex.
Maybe things cancel out, or don't really matter in practice.

I would prefer to keep the current code, as I think dropping (a) and
(b) could cause problems in the future, and the current form of the
code (with the mutex) has already seen mileage with production
workloads.