linux-mm.kvack.org archive mirror
 help / color / mirror / Atom feed
* [PATCH 0/2] mm: memcg: fix tracking of pending stats updates values
@ 2023-09-22  8:25 Yosry Ahmed
  2023-09-22  8:25 ` [PATCH 1/2] mm: memcg: refactor page state unit helpers Yosry Ahmed
                   ` (2 more replies)
  0 siblings, 3 replies; 5+ messages in thread
From: Yosry Ahmed @ 2023-09-22  8:25 UTC (permalink / raw)
  To: Andrew Morton, Shakeel Butt
  Cc: Johannes Weiner, Michal Hocko, Roman Gushchin, Muchun Song,
	Michal Koutný,
	linux-mm, cgroups, linux-kernel, Yosry Ahmed

While working on adjacent code [1], I realized that the values passed
into memcg_rstat_updated() to keep track of the magnitude of pending
updates is consistent. It is mostly in pages, but sometimes it can be in
bytes or KBs. Fix that.

Patch 1 reworks memcg_page_state_unit() so that we can reuse it in patch
2 to check and normalize the units of state updates.

[1]https://lore.kernel.org/lkml/20230921081057.3440885-1-yosryahmed@google.com/

Yosry Ahmed (2):
  mm: memcg: refactor page state unit helpers
  mm: memcg: normalize the value passed into memcg_rstat_updated()

 mm/memcontrol.c | 64 +++++++++++++++++++++++++++++++++++++++----------
 1 file changed, 51 insertions(+), 13 deletions(-)

-- 
2.42.0.515.g380fc7ccd1-goog



^ permalink raw reply	[flat|nested] 5+ messages in thread

* [PATCH 1/2] mm: memcg: refactor page state unit helpers
  2023-09-22  8:25 [PATCH 0/2] mm: memcg: fix tracking of pending stats updates values Yosry Ahmed
@ 2023-09-22  8:25 ` Yosry Ahmed
  2023-09-22  8:25 ` [PATCH 2/2] mm: memcg: normalize the value passed into memcg_rstat_updated() Yosry Ahmed
  2023-09-22 17:03 ` [PATCH 0/2] mm: memcg: fix tracking of pending stats updates values Andrew Morton
  2 siblings, 0 replies; 5+ messages in thread
From: Yosry Ahmed @ 2023-09-22  8:25 UTC (permalink / raw)
  To: Andrew Morton, Shakeel Butt
  Cc: Johannes Weiner, Michal Hocko, Roman Gushchin, Muchun Song,
	Michal Koutný,
	linux-mm, cgroups, linux-kernel, Yosry Ahmed

memcg_page_state_unit() is currently used to identify the unit of a
memcg state item so that all stats in memory.stat are in bytes. However,
it lies about the units of WORKINGSET_* stats. These stats actually
represent pages, but we present them to userspace as a scalar number of
events. In retrospect, maybe those stats should have been memcg "events"
rather than memcg "state".

In preparation for using memcg_page_state_unit() for other purposes that
need to know the truthful units of different stat items, break it down
into two helpers:
- memcg_page_state_unit() retuns the actual unit of the item.
- memcg_page_state_output_unit() returns the unit used for output.

Use the latter instead of the former in memcg_page_state_output() and
lruvec_page_state_output(). While we are at it, let's show cgroup v1
some love and add memcg_page_state_local_output() for consistency.

No functional change intended.

Signed-off-by: Yosry Ahmed <yosryahmed@google.com>
---
 mm/memcontrol.c | 44 +++++++++++++++++++++++++++++++++-----------
 1 file changed, 33 insertions(+), 11 deletions(-)

diff --git a/mm/memcontrol.c b/mm/memcontrol.c
index a4d3282493b6..683aa8405c22 100644
--- a/mm/memcontrol.c
+++ b/mm/memcontrol.c
@@ -1533,7 +1533,7 @@ static const struct memory_stat memory_stats[] = {
 	{ "workingset_nodereclaim",	WORKINGSET_NODERECLAIM		},
 };
 
-/* Translate stat items to the correct unit for memory.stat output */
+/* The actual unit of the state item, not the same as the output unit */
 static int memcg_page_state_unit(int item)
 {
 	switch (item) {
@@ -1541,6 +1541,22 @@ static int memcg_page_state_unit(int item)
 	case MEMCG_ZSWAP_B:
 	case NR_SLAB_RECLAIMABLE_B:
 	case NR_SLAB_UNRECLAIMABLE_B:
+		return 1;
+	case NR_KERNEL_STACK_KB:
+		return SZ_1K;
+	default:
+		return PAGE_SIZE;
+	}
+}
+
+/* Translate stat items to the correct unit for memory.stat output */
+static int memcg_page_state_output_unit(int item)
+{
+	/*
+	 * Workingset state is actually in pages, but we export it to userspace
+	 * as a scalar count of events, so special case it here.
+	 */
+	switch (item) {
 	case WORKINGSET_REFAULT_ANON:
 	case WORKINGSET_REFAULT_FILE:
 	case WORKINGSET_ACTIVATE_ANON:
@@ -1549,17 +1565,23 @@ static int memcg_page_state_unit(int item)
 	case WORKINGSET_RESTORE_FILE:
 	case WORKINGSET_NODERECLAIM:
 		return 1;
-	case NR_KERNEL_STACK_KB:
-		return SZ_1K;
 	default:
-		return PAGE_SIZE;
+		return memcg_page_state_unit(item);
 	}
 }
 
 static inline unsigned long memcg_page_state_output(struct mem_cgroup *memcg,
 						    int item)
 {
-	return memcg_page_state(memcg, item) * memcg_page_state_unit(item);
+	return memcg_page_state(memcg, item) *
+		memcg_page_state_output_unit(item);
+}
+
+static inline unsigned long memcg_page_state_local_output(
+		struct mem_cgroup *memcg, int item)
+{
+	return memcg_page_state_local(memcg, item) *
+		memcg_page_state_output_unit(item);
 }
 
 static void memcg_stat_format(struct mem_cgroup *memcg, struct seq_buf *s)
@@ -4100,9 +4122,8 @@ static void memcg1_stat_format(struct mem_cgroup *memcg, struct seq_buf *s)
 
 		if (memcg1_stats[i] == MEMCG_SWAP && !do_memsw_account())
 			continue;
-		nr = memcg_page_state_local(memcg, memcg1_stats[i]);
-		seq_buf_printf(s, "%s %lu\n", memcg1_stat_names[i],
-			   nr * memcg_page_state_unit(memcg1_stats[i]));
+		nr = memcg_page_state_local_output(memcg, memcg1_stats[i]);
+		seq_buf_printf(s, "%s %lu\n", memcg1_stat_names[i], nr);
 	}
 
 	for (i = 0; i < ARRAY_SIZE(memcg1_events); i++)
@@ -4131,9 +4152,9 @@ static void memcg1_stat_format(struct mem_cgroup *memcg, struct seq_buf *s)
 
 		if (memcg1_stats[i] == MEMCG_SWAP && !do_memsw_account())
 			continue;
-		nr = memcg_page_state(memcg, memcg1_stats[i]);
+		nr = memcg_page_state_output(memcg, memcg1_stats[i]);
 		seq_buf_printf(s, "total_%s %llu\n", memcg1_stat_names[i],
-			   (u64)nr * memcg_page_state_unit(memcg1_stats[i]));
+			       (u64)nr);
 	}
 
 	for (i = 0; i < ARRAY_SIZE(memcg1_events); i++)
@@ -6609,7 +6630,8 @@ static int memory_stat_show(struct seq_file *m, void *v)
 static inline unsigned long lruvec_page_state_output(struct lruvec *lruvec,
 						     int item)
 {
-	return lruvec_page_state(lruvec, item) * memcg_page_state_unit(item);
+	return lruvec_page_state(lruvec, item) *
+		memcg_page_state_output_unit(item);
 }
 
 static int memory_numa_stat_show(struct seq_file *m, void *v)
-- 
2.42.0.515.g380fc7ccd1-goog



^ permalink raw reply	[flat|nested] 5+ messages in thread

* [PATCH 2/2] mm: memcg: normalize the value passed into memcg_rstat_updated()
  2023-09-22  8:25 [PATCH 0/2] mm: memcg: fix tracking of pending stats updates values Yosry Ahmed
  2023-09-22  8:25 ` [PATCH 1/2] mm: memcg: refactor page state unit helpers Yosry Ahmed
@ 2023-09-22  8:25 ` Yosry Ahmed
  2023-09-22 17:03 ` [PATCH 0/2] mm: memcg: fix tracking of pending stats updates values Andrew Morton
  2 siblings, 0 replies; 5+ messages in thread
From: Yosry Ahmed @ 2023-09-22  8:25 UTC (permalink / raw)
  To: Andrew Morton, Shakeel Butt
  Cc: Johannes Weiner, Michal Hocko, Roman Gushchin, Muchun Song,
	Michal Koutný,
	linux-mm, cgroups, linux-kernel, Yosry Ahmed

memcg_rstat_updated() uses the value of the state update to keep track
of the magnitude of pending updates, so that we only do a stats flush
when it's worth the work. Most values passed into memcg_rstat_updated()
are in pages, however, a few of them are actually in bytes or KBs.

To put this into perspective, a 512 byte slab allocation today would
look the same as allocating 512 pages. This may result in premature
flushes, which means unnecessary work and latency.

Normalize all the state values passed into memcg_rstat_updated() to
pages. Round up non-zero sub-page to 1 page, because
memcg_rstat_updated() ignores 0 page updates.

Fixes: 5b3be698a872 ("memcg: better bounds on the memcg stats updates")
Signed-off-by: Yosry Ahmed <yosryahmed@google.com>
---
 mm/memcontrol.c | 20 ++++++++++++++++++--
 1 file changed, 18 insertions(+), 2 deletions(-)

diff --git a/mm/memcontrol.c b/mm/memcontrol.c
index 683aa8405c22..ea050908338a 100644
--- a/mm/memcontrol.c
+++ b/mm/memcontrol.c
@@ -761,6 +761,22 @@ unsigned long memcg_page_state(struct mem_cgroup *memcg, int idx)
 	return x;
 }
 
+static int memcg_page_state_unit(int item);
+
+/*
+ * Normalize the value passed into memcg_rstat_updated() to be in pages. Round
+ * up non-zero sub-page updates to 1 page as zero page updates are ignored.
+ */
+static int memcg_state_val_in_pages(int idx, int val)
+{
+	int unit = memcg_page_state_unit(idx);
+
+	if (!val || unit == PAGE_SIZE)
+		return val;
+	else
+		return max(val * unit / PAGE_SIZE, 1UL);
+}
+
 /**
  * __mod_memcg_state - update cgroup memory statistics
  * @memcg: the memory cgroup
@@ -773,7 +789,7 @@ void __mod_memcg_state(struct mem_cgroup *memcg, int idx, int val)
 		return;
 
 	__this_cpu_add(memcg->vmstats_percpu->state[idx], val);
-	memcg_rstat_updated(memcg, val);
+	memcg_rstat_updated(memcg, memcg_state_val_in_pages(idx, val));
 }
 
 /* idx can be of type enum memcg_stat_item or node_stat_item. */
@@ -824,7 +840,7 @@ void __mod_memcg_lruvec_state(struct lruvec *lruvec, enum node_stat_item idx,
 	/* Update lruvec */
 	__this_cpu_add(pn->lruvec_stats_percpu->state[idx], val);
 
-	memcg_rstat_updated(memcg, val);
+	memcg_rstat_updated(memcg, memcg_state_val_in_pages(idx, val));
 	memcg_stats_unlock();
 }
 
-- 
2.42.0.515.g380fc7ccd1-goog



^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: [PATCH 0/2] mm: memcg: fix tracking of pending stats updates values
  2023-09-22  8:25 [PATCH 0/2] mm: memcg: fix tracking of pending stats updates values Yosry Ahmed
  2023-09-22  8:25 ` [PATCH 1/2] mm: memcg: refactor page state unit helpers Yosry Ahmed
  2023-09-22  8:25 ` [PATCH 2/2] mm: memcg: normalize the value passed into memcg_rstat_updated() Yosry Ahmed
@ 2023-09-22 17:03 ` Andrew Morton
  2023-09-22 17:59   ` Yosry Ahmed
  2 siblings, 1 reply; 5+ messages in thread
From: Andrew Morton @ 2023-09-22 17:03 UTC (permalink / raw)
  To: Yosry Ahmed
  Cc: Shakeel Butt, Johannes Weiner, Michal Hocko, Roman Gushchin,
	Muchun Song,  Michal Koutný ,
	linux-mm, cgroups, linux-kernel

On Fri, 22 Sep 2023 08:25:40 +0000 Yosry Ahmed <yosryahmed@google.com> wrote:

> While working on adjacent code [1], I realized that the values passed
> into memcg_rstat_updated() to keep track of the magnitude of pending
> updates is consistent. It is mostly in pages, but sometimes it can be in
> bytes or KBs. Fix that.

Clashes with "memcg: remove unused do_memsw_account in
memcg1_stat_format", and maybe other things.  Can you please redo
against mm-unsstable or linux-next?


^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: [PATCH 0/2] mm: memcg: fix tracking of pending stats updates values
  2023-09-22 17:03 ` [PATCH 0/2] mm: memcg: fix tracking of pending stats updates values Andrew Morton
@ 2023-09-22 17:59   ` Yosry Ahmed
  0 siblings, 0 replies; 5+ messages in thread
From: Yosry Ahmed @ 2023-09-22 17:59 UTC (permalink / raw)
  To: Andrew Morton
  Cc: Shakeel Butt, Johannes Weiner, Michal Hocko, Roman Gushchin,
	Muchun Song, Michal Koutný,
	linux-mm, cgroups, linux-kernel

On Fri, Sep 22, 2023 at 10:03 AM Andrew Morton
<akpm@linux-foundation.org> wrote:
>
> On Fri, 22 Sep 2023 08:25:40 +0000 Yosry Ahmed <yosryahmed@google.com> wrote:
>
> > While working on adjacent code [1], I realized that the values passed
> > into memcg_rstat_updated() to keep track of the magnitude of pending
> > updates is consistent. It is mostly in pages, but sometimes it can be in
> > bytes or KBs. Fix that.
>
> Clashes with "memcg: remove unused do_memsw_account in
> memcg1_stat_format", and maybe other things.  Can you please redo
> against mm-unsstable or linux-next?

Done, sent v2:
https://lore.kernel.org/linux-mm/20230922175741.635002-1-yosryahmed@google.com/

Thanks!


^ permalink raw reply	[flat|nested] 5+ messages in thread

end of thread, other threads:[~2023-09-22 17:59 UTC | newest]

Thread overview: 5+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2023-09-22  8:25 [PATCH 0/2] mm: memcg: fix tracking of pending stats updates values Yosry Ahmed
2023-09-22  8:25 ` [PATCH 1/2] mm: memcg: refactor page state unit helpers Yosry Ahmed
2023-09-22  8:25 ` [PATCH 2/2] mm: memcg: normalize the value passed into memcg_rstat_updated() Yosry Ahmed
2023-09-22 17:03 ` [PATCH 0/2] mm: memcg: fix tracking of pending stats updates values Andrew Morton
2023-09-22 17:59   ` Yosry Ahmed

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox