From: Joshua Hahn <joshua.hahnjy@gmail.com>
To: Donet Tom <donettom@linux.ibm.com>
Cc: Gregory Price <gourry@gourry.net>,
Johannes Weiner <hannes@cmpxchg.org>,
Kaiyang Zhao <kaiyang2@cs.cmu.edu>,
Andrew Morton <akpm@linux-foundation.org>,
David Hildenbrand <david@kernel.org>,
Lorenzo Stoakes <lorenzo.stoakes@oracle.com>,
"Liam R . Howlett" <Liam.Howlett@oracle.com>,
Vlastimil Babka <vbabka@kernel.org>,
Mike Rapoport <rppt@kernel.org>,
Suren Baghdasaryan <surenb@google.com>,
Michal Hocko <mhocko@suse.com>,
Roman Gushchin <roman.gushchin@linux.dev>,
Shakeel Butt <shakeel.butt@linux.dev>,
Muchun Song <muchun.song@linux.dev>,
Waiman Long <longman@redhat.com>,
Chen Ridong <chenridong@huaweicloud.com>,
Tejun Heo <tj@kernel.org>, Michal Koutny <mkoutny@suse.com>,
Axel Rasmussen <axelrasmussen@google.com>,
Yuanchu Xie <yuanchu@google.com>, Wei Xu <weixugc@google.com>,
Qi Zheng <zhengqi.arch@bytedance.com>,
linux-mm@kvack.org, cgroups@vger.kernel.org,
linux-kernel@vger.kernel.org, kernel-team@meta.com
Subject: Re: [RFC PATCH 0/6] mm/memcontrol: Make memcg limits tier-aware
Date: Tue, 24 Mar 2026 07:58:16 -0700
Message-ID: <20260324145816.3939303-1-joshua.hahnjy@gmail.com>
In-Reply-To: <13eb0f7a-95bc-4337-9d38-a06db0700777@linux.ibm.com>

On Tue, 24 Mar 2026 16:00:34 +0530 Donet Tom <donettom@linux.ibm.com> wrote:

> Hi Joshua
>
> On 2/24/26 4:08 AM, Joshua Hahn wrote:
> > Memory cgroups provide an interface that allows multiple workloads on a
> > host to co-exist, and establish both weak and strong memory isolation
> > guarantees. For large servers and small embedded systems alike, memcgs
> > provide an effective way to provide a baseline quality of service for
> > protected workloads.
> >
> > This works, because for the most part, all memory is equal (except for
> > zram / zswap). Restricting a cgroup's memory footprint restricts how
> > much it can hurt other workloads competing for memory. Likewise, setting
> > memory.low or memory.min limits can provide weak and strong guarantees
> > to the performance of a cgroup.
> >
> > However, on systems with tiered memory (e.g. CXL / compressed memory),
> > the quality of service guarantees that memcg limits enforce become less
> > effective, as memcg has no awareness of the physical location of its
> > charged memory. In other words, a workload that is well-behaved within
> > its memcg limits may still be hurting the performance of other
> > well-behaving workloads on the system by hogging more than its
> > "fair share" of toptier memory.
> >
> > Introduce tier-aware memcg limits, which scale memory.low/high to
> > reflect the ratio of toptier:total memory the cgroup has access to.
> >
> > Take the following scenario as an example:
> > On a host with 3:1 toptier:lowtier, say 150G toptier, and 50G lowtier,
> > setting a cgroup's limits to:
> > memory.min: 15G
> > memory.low: 20G
> > memory.high: 40G
> > memory.max: 50G
> >
> > Will be enforced at the toptier as:
> > memory.min: 15G
> > memory.toptier_low: 15G (20 * 150/200)
> > memory.toptier_high: 30G (40 * 150/200)
> > memory.max: 50G
>
>

Hello Donet,

Thank you for reviewing the series! I hope you are doing well.

> Currently, the high and low thresholds are adjusted based on the ratio
> of top-tier to total memory. One concern I see is that if the working
> set size exceeds the top-tier high threshold, it could lead to frequent
> demotions and promotions. Instead, would it make sense to introduce a
> tunable knob to configure the top-tier high threshold?

Yes, this is true. It is also a concern that I have, and I think that
adding a tunable knob could be helpful. The other side of the question is
whether there are already too many tunables for users, with min /
low / high / max. I'm hoping to reach a consensus on this at LSFMMBPF;
I hope we can talk about it there!

The other way to approach this is to throttle promotions and demotions
when workloads are thrashing. Personally I prefer this approach, although
it isn't mutually exclusive with adding more knobs.

> Another concern is that if the lower-tier memory size is very large, the
> cgroup may end up getting only a small portion of higher-tier memory.

I think the issue you mentioned above is a bigger problem.

If the lower tier memory is large and the toptier memory is small, then it
makes toptier memory an even more constrained resource, so splitting it
fairly among the cgroups becomes an even bigger issue. Remember, we're
limiting workloads' toptier memory usage because other workloads have
to use it; if we let a cgroup use more toptier memory, it has to come
from another cgroup's share.
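
To make the ratio concrete, here is a rough userspace sketch of the
scaling described in the cover letter. It is only an illustration of the
arithmetic, not the code in the series: the helper name toptier_limit()
and the 64G / 448G host are made up for the example, and the kernel works
in pages rather than GiB.

```python
def toptier_limit(limit_gib, toptier_gib, total_gib):
    """Scale a memcg limit by the toptier:total capacity ratio.

    Hypothetical helper for illustration only; not a kernel interface.
    """
    return limit_gib * toptier_gib / total_gib


# Cover-letter example: 150G toptier + 50G lowtier = 200G total.
cover_low = toptier_limit(20, 150, 200)   # memory.low  20G -> 15G toptier
cover_high = toptier_limit(40, 150, 200)  # memory.high 40G -> 30G toptier

# Assumed host with a large lower tier: 64G toptier + 448G lowtier.
# The same memory.high of 40G now maps to a much smaller toptier share.
large_lowtier_high = toptier_limit(40, 64, 512)

print(cover_low, cover_high, large_lowtier_high)
```

On the second host the cgroup's toptier share of memory.high drops to 5G,
which is the situation you describe. The ratio is the same for every
cgroup on the host, though, so raising one cgroup's share still means
lowering someone else's.
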

Thanks again. Please let me know if you have any other concerns. I'm
excited to talk about this more as well!

Joshua