linux-mm.kvack.org archive mirror
 help / color / mirror / Atom feed
From: Yosry Ahmed <yosryahmed@google.com>
To: Andrew Morton <akpm@linux-foundation.org>
Cc: Johannes Weiner <hannes@cmpxchg.org>,
	Michal Hocko <mhocko@kernel.org>,
	 Roman Gushchin <roman.gushchin@linux.dev>,
	Shakeel Butt <shakeelb@google.com>,
	 Muchun Song <muchun.song@linux.dev>,
	Ivan Babrou <ivan@cloudflare.com>, Tejun Heo <tj@kernel.org>,
	 linux-mm@kvack.org, cgroups@vger.kernel.org,
	linux-kernel@vger.kernel.org,
	 Yosry Ahmed <yosryahmed@google.com>
Subject: [PATCH 3/3] mm: memcg: use non-unified stats flushing for userspace reads
Date: Mon, 21 Aug 2023 20:54:58 +0000	[thread overview]
Message-ID: <20230821205458.1764662-4-yosryahmed@google.com> (raw)
In-Reply-To: <20230821205458.1764662-1-yosryahmed@google.com>

Unified flushing allows for great concurrency for paths that attempt to
flush the stats, at the expense of potential staleness and a single
flusher paying the extra cost of flushing the full tree.

This tradeoff makes sense for in-kernel flushers that may observe high
concurrency (e.g. reclaim, refault). For userspace readers, stale stats
may be unexpected and problematic, especially when such stats are used
for critical paths such as userspace OOM handling. Additionally, a
userspace reader will occasionally pay the cost of flushing the entire
hierarchy, which also causes problems in some cases [1].

Opt userspace reads out of unified flushing. This makes the cost of
reading the stats more predictable (proportional to the size of the
subtree), as well as the freshness of the stats. Since userspace readers
are not expected to have similar concurrency to in-kernel flushers,
serializing them among themselves and among in-kernel flushers should be
okay.

This was tested on a machine with 256 cpus by running a synthetic test
The script that creates 50 top-level cgroups, each with 5 children (250
leaf cgroups). Each leaf cgroup has 10 processes running that allocate
memory beyond the cgroup limit, invoking reclaim (which is an in-kernel
unified flusher). Concurrently, one thread is spawned per-cgroup to read
the stats every second (including root, top-level, and leaf cgroups --
so total 251 threads). No regressions were observed in the total running
time; which means that non-unified userspace readers are not slowing
down in-kernel unified flushers:

Base (mm-unstable):

real	0m18.228s
user	0m9.463s
sys	60m15.879s

real	0m20.828s
user	0m8.535s
sys	70m12.364s

real	0m19.789s
user	0m9.177s
sys	66m10.798s

With this patch:

real	0m19.632s
user	0m8.608s
sys	64m23.483s

real	0m18.463s
user	0m7.465s
sys	60m34.089s

real	0m20.309s
user	0m7.754s
sys	68m2.392s

Additionally, the average latency for reading stats went from roughly
40ms to 5 ms, because we mostly read the stats of leaf cgroups in this
script, so we only have to flush one cgroup, instead of *sometimes*
flushing the entire tree with unified flushing.

[1]https://lore.kernel.org/lkml/CABWYdi0c6__rh-K7dcM_pkf9BJdTRtAU08M43KO9ME4-dsgfoQ@mail.gmail.com/

Signed-off-by: Yosry Ahmed <yosryahmed@google.com>
---
 mm/memcontrol.c | 10 +++++-----
 1 file changed, 5 insertions(+), 5 deletions(-)

diff --git a/mm/memcontrol.c b/mm/memcontrol.c
index 90f08b35fa77..d3b13a06224c 100644
--- a/mm/memcontrol.c
+++ b/mm/memcontrol.c
@@ -1606,7 +1606,7 @@ static void memcg_stat_format(struct mem_cgroup *memcg, struct seq_buf *s)
 	 *
 	 * Current memory state:
 	 */
-	mem_cgroup_try_flush_stats();
+	do_stats_flush(memcg);
 
 	for (i = 0; i < ARRAY_SIZE(memory_stats); i++) {
 		u64 size;
@@ -4048,7 +4048,7 @@ static int memcg_numa_stat_show(struct seq_file *m, void *v)
 	int nid;
 	struct mem_cgroup *memcg = mem_cgroup_from_seq(m);
 
-	mem_cgroup_try_flush_stats();
+	do_stats_flush(memcg);
 
 	for (stat = stats; stat < stats + ARRAY_SIZE(stats); stat++) {
 		seq_printf(m, "%s=%lu", stat->name,
@@ -4123,7 +4123,7 @@ static void memcg1_stat_format(struct mem_cgroup *memcg, struct seq_buf *s)
 
 	BUILD_BUG_ON(ARRAY_SIZE(memcg1_stat_names) != ARRAY_SIZE(memcg1_stats));
 
-	mem_cgroup_try_flush_stats();
+	do_stats_flush(memcg);
 
 	for (i = 0; i < ARRAY_SIZE(memcg1_stats); i++) {
 		unsigned long nr;
@@ -4625,7 +4625,7 @@ void mem_cgroup_wb_stats(struct bdi_writeback *wb, unsigned long *pfilepages,
 	struct mem_cgroup *memcg = mem_cgroup_from_css(wb->memcg_css);
 	struct mem_cgroup *parent;
 
-	mem_cgroup_try_flush_stats();
+	do_stats_flush(memcg);
 
 	*pdirty = memcg_page_state(memcg, NR_FILE_DIRTY);
 	*pwriteback = memcg_page_state(memcg, NR_WRITEBACK);
@@ -6640,7 +6640,7 @@ static int memory_numa_stat_show(struct seq_file *m, void *v)
 	int i;
 	struct mem_cgroup *memcg = mem_cgroup_from_seq(m);
 
-	mem_cgroup_try_flush_stats();
+	do_stats_flush(memcg);
 
 	for (i = 0; i < ARRAY_SIZE(memory_stats); i++) {
 		int nid;
-- 
2.42.0.rc1.204.g551eb34607-goog



  parent reply	other threads:[~2023-08-21 20:55 UTC|newest]

Thread overview: 43+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2023-08-21 20:54 [PATCH 0/3] memcg: non-unified flushing for userspace stats Yosry Ahmed
2023-08-21 20:54 ` [PATCH 1/3] mm: memcg: properly name and document unified stats flushing Yosry Ahmed
2023-08-21 20:54 ` [PATCH 2/3] mm: memcg: add a helper for non-unified " Yosry Ahmed
2023-08-22 13:01   ` Michal Koutný
2023-08-22 16:00     ` Yosry Ahmed
2023-08-22 16:35       ` Michal Koutný
2023-08-22 16:48         ` Yosry Ahmed
2023-08-21 20:54 ` Yosry Ahmed [this message]
2023-08-22  9:06   ` [PATCH 3/3] mm: memcg: use non-unified stats flushing for userspace reads Michal Hocko
2023-08-22 15:30     ` Yosry Ahmed
2023-08-23  7:33       ` Michal Hocko
2023-08-23 14:55         ` Yosry Ahmed
2023-08-24  7:13           ` Michal Hocko
2023-08-24 18:15             ` Yosry Ahmed
2023-08-24 18:50               ` Yosry Ahmed
2023-08-25  7:05                 ` Michal Hocko
2023-08-25 15:14                   ` Yosry Ahmed
2023-08-25 18:17                     ` Michal Hocko
2023-08-25 18:21                       ` Yosry Ahmed
2023-08-25 18:43                         ` Michal Hocko
2023-08-25 18:44                           ` Michal Hocko
2023-08-28 15:47                     ` Michal Hocko
2023-08-28 16:15                       ` Yosry Ahmed
2023-08-28 17:00                         ` Shakeel Butt
2023-08-28 17:07                           ` Yosry Ahmed
2023-08-28 17:27                             ` Waiman Long
2023-08-28 17:28                               ` Yosry Ahmed
2023-08-28 17:35                                 ` Waiman Long
2023-08-28 17:43                                   ` Waiman Long
2023-08-28 18:35                                     ` Yosry Ahmed
2023-08-29  7:27                               ` Michal Hocko
2023-08-29 15:05                                 ` Waiman Long
2023-08-29 15:17                                   ` Michal Hocko
2023-08-29 16:04                                     ` Yosry Ahmed
2023-08-29 18:44                   ` Tejun Heo
2023-08-29 19:13                     ` Yosry Ahmed
2023-08-29 19:36                       ` Tejun Heo
2023-08-29 19:54                         ` Yosry Ahmed
2023-08-29 20:12                           ` Tejun Heo
2023-08-29 20:20                             ` Yosry Ahmed
2023-08-31  9:05                               ` Michal Hocko
2023-08-22 13:00 ` [PATCH 0/3] memcg: non-unified flushing for userspace stats Michal Koutný
2023-08-22 15:43   ` Yosry Ahmed

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20230821205458.1764662-4-yosryahmed@google.com \
    --to=yosryahmed@google.com \
    --cc=akpm@linux-foundation.org \
    --cc=cgroups@vger.kernel.org \
    --cc=hannes@cmpxchg.org \
    --cc=ivan@cloudflare.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-mm@kvack.org \
    --cc=mhocko@kernel.org \
    --cc=muchun.song@linux.dev \
    --cc=roman.gushchin@linux.dev \
    --cc=shakeelb@google.com \
    --cc=tj@kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox