From: David Rientjes <rientjes@google.com>
To: Kaiyang Zhao <kaiyang2@cs.cmu.edu>
Cc: linux-mm@kvack.org, cgroups@vger.kernel.org,
roman.gushchin@linux.dev, shakeel.butt@linux.dev,
muchun.song@linux.dev, akpm@linux-foundation.org,
mhocko@kernel.org, nehagholkar@meta.com, abhishekd@meta.com,
hannes@cmpxchg.org
Subject: Re: [PATCH] mm,memcg: provide per-cgroup counters for NUMA balancing operations
Date: Sun, 11 Aug 2024 13:16:53 -0700 (PDT) [thread overview]
Message-ID: <e34a841c-c4c6-30fd-ca20-312c84654c34@google.com> (raw)
In-Reply-To: <20240809212115.59291-1-kaiyang2@cs.cmu.edu>
On Fri, 9 Aug 2024, kaiyang2@cs.cmu.edu wrote:
> From: Kaiyang Zhao <kaiyang2@cs.cmu.edu>
>
> The ability to observe the demotion and promotion decisions made by the
> kernel on a per-cgroup basis is important for monitoring and tuning
> containerized workloads on either NUMA machines or machines
> equipped with tiered memory.
>
> Different containers in the system may experience drastically different
> memory tiering actions that cannot be distinguished from the global
> counters alone.
>
> For example, a container running a workload with a much hotter
> memory access pattern will likely see more promotions and fewer
> demotions, potentially depriving a colocated container of top-tier
> memory to such an extent that its performance degrades unacceptably.
>
> As another example, some containers may exhibit longer periods between
> data reuse, causing many more numa_hint_faults than numa_pages_migrated.
> In this case, tuning hot_threshold_ms may be appropriate, but the signal
> can easily be lost if only global counters are available.
>
> This patch set adds five counters to memory.stat in a cgroup:
> numa_pages_migrated, numa_pte_updates, numa_hint_faults,
> pgdemote_kswapd and pgdemote_direct.
>
> count_memcg_events_mm() is added to count multiple event occurrences at
> once, and get_mem_cgroup_from_folio() is added because we need to get a
> reference to the memcg of a folio before it's migrated to track
> numa_pages_migrated. The accounting of PGDEMOTE_* is moved to
> shrink_inactive_list() before being changed to per-cgroup.
>
> Signed-off-by: Kaiyang Zhao <kaiyang2@cs.cmu.edu>
Hi Kaiyang, have you considered per-memcg control over NUMA balancing
operations as well?
Wondering if that's the direction that you're heading in, because it would
be very useful to be able to control NUMA balancing at memcg granularity
on multi-tenant systems.
I mentioned this at LSF/MM/BPF this year. If people believe this is out
of scope for memcg, that would be good feedback as well.
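For illustration, once these counters land in memory.stat, a monitoring agent could pull the five values for a cgroup and derive the hint-faults-to-migrations ratio the changelog mentions as a tuning signal. A minimal sketch follows; the counter values are made up, and parse_numa_counters is a hypothetical helper, not part of the patch:

```python
def parse_numa_counters(stat_text):
    """Extract the five NUMA balancing / demotion counters from
    memory.stat-style "name value" lines."""
    wanted = {"numa_pages_migrated", "numa_pte_updates",
              "numa_hint_faults", "pgdemote_kswapd", "pgdemote_direct"}
    counters = {}
    for line in stat_text.splitlines():
        parts = line.split()
        if len(parts) == 2 and parts[0] in wanted:
            counters[parts[0]] = int(parts[1])
    return counters

# Illustrative values only; real values come from e.g.
# /sys/fs/cgroup/<group>/memory.stat on a kernel with this patch.
sample = """\
numa_pte_updates 120000
numa_hint_faults 45000
numa_pages_migrated 3000
pgdemote_kswapd 800
pgdemote_direct 50
"""

c = parse_numa_counters(sample)
# A large faults-to-migrations ratio means hint faults rarely lead to
# promotion, the per-cgroup signal the changelog says may warrant
# tuning hot_threshold_ms.
ratio = c["numa_hint_faults"] / max(c["numa_pages_migrated"], 1)
print(f"hint faults per migrated page: {ratio:.1f}")
```

With the sample values above this prints a ratio of 15.0; the point is that the same computation on global vmstat counters would blend all cgroups together and hide a per-container outlier.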
Thread overview: 7+ messages
2024-08-09 21:21 kaiyang2
2024-08-10 0:28 ` Andrew Morton
2024-08-11 20:16 ` David Rientjes [this message]
2024-08-12 22:49 ` Kaiyang Zhao
2024-08-13 5:09 ` David Rientjes
2024-08-13 18:21 ` Kaiyang Zhao
2024-08-14 20:48 ` Andrew Morton