From: Shakeel Butt <shakeel.butt@linux.dev>
To: Yosry Ahmed <yosryahmed@google.com>
Cc: Jesper Dangaard Brouer <hawk@kernel.org>,
tj@kernel.org, cgroups@vger.kernel.org, hannes@cmpxchg.org,
lizefan.x@bytedance.com, longman@redhat.com,
kernel-team@cloudflare.com, linux-mm@kvack.org,
linux-kernel@vger.kernel.org
Subject: Re: [PATCH V2] cgroup/rstat: Avoid thundering herd problem by kswapd across NUMA nodes
Date: Tue, 25 Jun 2024 14:20:43 -0700
Message-ID: <ntpnm3kdpqexncc4hz4xmfliay3tmbasxl6zatmsauo3sruwf3@zcmgz7oq5huy>
In-Reply-To: <CAJD7tkZ_aba9N9Qe8WeaLcp_ON_jQvuP9dg4tW0919QbCLLTMA@mail.gmail.com>
On Tue, Jun 25, 2024 at 01:45:00PM GMT, Yosry Ahmed wrote:
> On Tue, Jun 25, 2024 at 9:21 AM Shakeel Butt <shakeel.butt@linux.dev> wrote:
> >
> > On Tue, Jun 25, 2024 at 09:00:03AM GMT, Yosry Ahmed wrote:
> > [...]
> > >
> > > My point is not about accuracy, although I think it's a reasonable
> > > argument on its own (a lot of things could change in a short amount of
> > > time, which is why I prefer magnitude-based ratelimiting).
> > >
> > > My point is about logical ordering. If a userspace program reads the
> > > stats *after* an event occurs, it expects to get a snapshot of the
> > > system state after that event. Two examples are:
> > >
> > > - A proactive reclaimer reading the stats after a reclaim attempt to
> > > check if it needs to reclaim more memory or fallback.
> > > - A userspace OOM killer reading the stats after a usage spike to
> > > decide which workload to kill.
> > >
> > > I listed such examples with more detail in [1], when I removed
> > > stats_flush_ongoing from the memcg code.
> > >
> > > [1]https://lore.kernel.org/lkml/20231129032154.3710765-6-yosryahmed@google.com/
> >
> > You are kind of arbitrarily adding restrictions and rules here. Why not
> > follow the rules of a well established and battle tested stats infra
> > used by everyone i.e. vmstats? There is no sync flush and there are
> > frequent async flushes. I think that is what Jesper wants as well.
>
> That's how the memcg stats worked previously since before rstat and
> until the introduction of stats_flush_ongoing AFAICT. We saw an actual
> behavioral change when we were moving from a pre-rstat kernel to a
> kernel with stats_flush_ongoing. This was the rationale when I removed
> stats_flush_ongoing in [1]. It's not a new argument, I am just
> reiterating what we discussed back then.

In my reply above, I am not arguing to go back to the older
stats_flush_ongoing situation. Rather, I am discussing what the best
eventual solution should be. From the vmstats infra, we can learn that
with frequent async flushes and no sync flush, users are fine with the
'non-determinism'. Of course, cgroup stats are different from vmstats,
i.e. they are hierarchical, but I think we can try out this approach and
see whether it works.

BTW it seems like this topic should be discussed face-to-face, over vc
or at LPC. What do you folks think?
Shakeel