From: "Michal Koutný" <mkoutny@suse.com>
To: Shakeel Butt <shakeel.butt@linux.dev>,
"T.J. Mercier" <tjmercier@google.com>
Cc: Tejun Heo <tj@kernel.org>, Johannes Weiner <hannes@cmpxchg.org>,
Michal Hocko <mhocko@kernel.org>,
Roman Gushchin <roman.gushchin@linux.dev>,
Muchun Song <muchun.song@linux.dev>,
linux-mm@kvack.org, cgroups@vger.kernel.org,
linux-kernel@vger.kernel.org,
Meta kernel team <kernel-team@meta.com>
Subject: Re: [PATCH] memcg: add hierarchical effective limits for v2
Date: Mon, 10 Feb 2025 17:24:17 +0100
Message-ID: <5jwdklebrnbym6c7ynd5y53t3wq453lg2iup6rj4yux5i72own@ay52cqthg3hy>
In-Reply-To: <ctuqkowzqhxvpgij762dcuf24i57exuhjjhuh243qhngxi5ymg@lazsczjvy4yd>
Hello.
On Thu, Feb 06, 2025 at 11:09:05AM -0800, Shakeel Butt <shakeel.butt@linux.dev> wrote:
> Oh I totally forgot about your series. In my use-case, it is not about
> dynamically knowing how much they can expand and adjust themselves but
> rather knowing statically upfront what resources they have been given.
From the memcg PoV, the effective value doesn't tell them how much they were
given, because the limit may be shared with other cgroups.
> More concretely, these are workloads which used to completely occupy a
> single machine, though within containers but without limits. These
> workloads used to look at machine level metrics at startup on how much
> resources are available.
I've been there, but I haven't found a convincing mapping of global to memcg
limits.
The issue is that staying below such a value doesn't guarantee freedom from
OOM, because the limit can (generally) be shared with other cgroups.
(Alas, apps typically don't express their memory needs in units of
PSI. So it boils down to a system-wide monitor like systemd-oomd and
cooperation with it.)
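To make the sharing argument concrete, here is a minimal sketch in Python. It
assumes a hypothetical layout that is not taken from the patch: one parent
with memory.max set to 10 GiB and two children that both have memory.max=max,
so each child's effective limit is 10 GiB. The helper name headroom() is made
up for illustration.

```python
GIB = 1 << 30


def headroom(own_usage: int, sibling_usage: int, parent_max: int) -> int:
    """Bytes a child can still charge before the shared parent limit is hit.

    Assumption: both children have memory.max=max, so the effective limit
    of each child equals parent_max.
    """
    return max(0, parent_max - own_usage - sibling_usage)


parent_max = 10 * GIB

# The child uses 1 GiB, far below its effective limit of 10 GiB.
idle_sibling = headroom(1 * GIB, 0, parent_max)
busy_sibling = headroom(1 * GIB, 9 * GIB, parent_max)

print(idle_sibling // GIB)  # sibling idle
print(busy_sibling // GIB)  # sibling already uses 9 GiB
```

In both cases the child reads the same effective limit, yet what it can
actually allocate depends on its sibling.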
> Now these workloads are being moved to multi-tenant environment but
> still the machine is partitioned statically between the workloads. So,
> these workloads need to know upfront how much resources are allocated to
> them upfront and the way the cgroup hierarchy is setup, that information
> is a bit above the tree.
FTR, e.g. in systemd setups, this can be partially overcome by the exposed
EffectiveMemoryMax= property (the service manager that configures the
resources can also do the ancestry traversal).
Kubernetes has the downward API, where generic resource info is shared into
containers, and I recall that lxcfs could mangle procfs
memory info wrt memory limits for legacy apps.
As I think about it, the cgroupns (in)visibility should be resolved by
assigning the proper limit to the namespace's root group memory.max
(read-only for the contained user) and the traversal...
On Thu, Feb 06, 2025 at 11:37:31AM -0800, "T.J. Mercier" <tjmercier@google.com> wrote:
> but having a single file to read instead of walking up the
> tree with multiple reads to calculate an effective limit would be
> nice.
...in the kernel is nice, but the possible performance gain isn't worth hiding
the shareability of the effective limit.
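For reference, the userspace traversal discussed here could look roughly like
the Python sketch below. It is not code from the patch. It assumes cgroup v2
mounted at /sys/fs/cgroup, and the function names are hypothetical. It can
only see ancestors that are visible to the caller, so inside a cgroup
namespace it stops at the namespace root.

```python
import math
from pathlib import Path


def read_memory_max(cgroup_dir: Path) -> float:
    """Return memory.max of one cgroup in bytes, math.inf if unlimited."""
    try:
        raw = (cgroup_dir / "memory.max").read_text().strip()
    except FileNotFoundError:
        # The root cgroup has no memory.max file; treat it as unlimited.
        return math.inf
    return math.inf if raw == "max" else int(raw)


def effective_memory_max(cgroup_dir: Path,
                         root: Path = Path("/sys/fs/cgroup")) -> float:
    """Minimum memory.max over cgroup_dir and its ancestors up to root."""
    current = cgroup_dir.resolve()
    root = root.resolve()
    if current != root and root not in current.parents:
        raise ValueError(f"{cgroup_dir} is not below {root}")

    limit = math.inf
    while True:
        # One read per level: this is the "multiple reads" cost.
        limit = min(limit, read_memory_max(current))
        if current == root:
            break
        current = current.parent
    return limit
```

The result is only the tightest limit on the path to the root; as argued
above, it says nothing about how much of that limit siblings already consume.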
So I wonder what the current PoV of more MM people is...
Michal