* [PATCH v2 0/3] Use kmem_cache for memcg alloc
@ 2025-04-24 12:09 Huan Yang
2025-04-24 12:09 ` [PATCH v2 1/3] mm/memcg: use kmem_cache when alloc memcg Huan Yang
` (2 more replies)
0 siblings, 3 replies; 10+ messages in thread
From: Huan Yang @ 2025-04-24 12:09 UTC (permalink / raw)
To: Johannes Weiner, Michal Hocko, Roman Gushchin, Shakeel Butt,
Muchun Song, Andrew Morton, Petr Mladek,
Sebastian Andrzej Siewior, Huan Yang, Francesco Valla,
Huang Shijie, KP Singh, Paul E. McKenney, Rasmus Villemoes,
Uladzislau Rezki (Sony),
Guo Weikang, Raul E Rangel, cgroups, linux-mm, linux-kernel,
Boqun Feng, Geert Uytterhoeven
Cc: opensource.kernel
The mem_cgroup_alloc() function creates the mem_cgroup struct and its
associated structures, including mem_cgroup_per_node.
Through detailed analysis on our test machine (Arm64, 16GB RAM, 6.6 kernel,
1 NUMA node, memcgv2 with nokmem,nosocket,cgroup_disable=pressure),
we can observe the memory allocation for these structures using the
following shell commands:
# Enable tracing
echo 1 > /sys/kernel/tracing/events/kmem/kmalloc/enable
echo 1 > /sys/kernel/tracing/tracing_on
cat /sys/kernel/tracing/trace_pipe | grep kmalloc | grep mem_cgroup
# Trigger allocation if the cgroup subtree does not already enable memcg
echo +memory > /sys/fs/cgroup/cgroup.subtree_control
Ftrace Output:
# mem_cgroup struct allocation
sh-6312 [000] ..... 58015.698365: kmalloc:
call_site=mem_cgroup_css_alloc+0xd8/0x5b4
ptr=000000003e4c3799 bytes_req=2312 bytes_alloc=4096
gfp_flags=GFP_KERNEL|__GFP_ZERO node=-1 accounted=false
# mem_cgroup_per_node allocation
sh-6312 [000] ..... 58015.698389: kmalloc:
call_site=mem_cgroup_css_alloc+0x1d8/0x5b4
ptr=00000000d798700c bytes_req=2896 bytes_alloc=4096
gfp_flags=GFP_KERNEL|__GFP_ZERO node=0 accounted=false
Key Observations:
1. Both structures are allocated with kmalloc, with requested sizes between
2KB and 4KB
2. The predefined kmalloc size classes (64B, 128B, ..., 2KB, 4KB, 8KB) force
both allocations into 4KB slab objects
3. Memory waste per memcg instance:
Base struct: 4096 - 2312 = 1784 bytes
Per-node struct: 4096 - 2896 = 1200 bytes
Total waste: 2984 bytes (1-node system)
NUMA scaling: (1200 + 8) * nr_node_ids bytes
So each memcg instance wastes roughly 3KB on a single-node system; the
sketch below walks through the arithmetic.
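For reference, a minimal userspace sketch of that arithmetic (not kernel
code; the sizes are the bytes_req values from the traces, and the size
classes mirror the power-of-two buckets listed above):

#include <stdio.h>

/* Round a request up to the next size class listed above (64B ... 8KB). */
static unsigned int kmalloc_bucket(unsigned int size)
{
	unsigned int bucket = 64;

	while (bucket < size)
		bucket <<= 1;
	return bucket;
}

int main(void)
{
	unsigned int memcg_req = 2312;	/* mem_cgroup, 1-node system */
	unsigned int pn_req = 2896;	/* mem_cgroup_per_node */
	unsigned int nr_nodes = 1;

	unsigned int waste = (kmalloc_bucket(memcg_req) - memcg_req) +
			     (kmalloc_bucket(pn_req) - pn_req) * nr_nodes;

	printf("waste per memcg: %u bytes\n", waste);	/* 1784 + 1200 = 2984 */
	return 0;
}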
This patchset introduces dedicated kmem_caches:
Patch1 - mem_cgroup kmem_cache - memcg_cachep
Patch2 - mem_cgroup_per_node kmem_cache - memcg_pn_cachep
The benefits of this change can be observed with the following tracing
commands:
# Enable tracing
echo 1 > /sys/kernel/tracing/events/kmem/kmem_cache_alloc/enable
echo 1 > /sys/kernel/tracing/tracing_on
cat /sys/kernel/tracing/trace_pipe | grep kmem_cache_alloc | grep mem_cgroup
# In another terminal:
echo +memory > /sys/fs/cgroup/cgroup.subtree_control
The output might now look like this:
# mem_cgroup struct allocation
sh-9827 [000] ..... 289.513598: kmem_cache_alloc:
call_site=mem_cgroup_css_alloc+0xbc/0x5d4 ptr=00000000695c1806
bytes_req=2312 bytes_alloc=2368 gfp_flags=GFP_KERNEL|__GFP_ZERO node=-1
accounted=false
# mem_cgroup_per_node allocation
sh-9827 [000] ..... 289.513602: kmem_cache_alloc:
call_site=mem_cgroup_css_alloc+0x1b8/0x5d4 ptr=000000002989e63a
bytes_req=2896 bytes_alloc=2944 gfp_flags=GFP_KERNEL|__GFP_ZERO node=0
accounted=false
This indicates that the `mem_cgroup` struct now requests 2312 bytes
and is allocated 2368 bytes, while `mem_cgroup_per_node` requests 2896 bytes
and is allocated 2944 bytes.
The slight increase in allocated size is due to `SLAB_HWCACHE_ALIGN` in the
`kmem_cache`.
Without `SLAB_HWCACHE_ALIGN`, the allocation might appear as:
# mem_cgroup struct allocation
sh-9269 [003] ..... 80.396366: kmem_cache_alloc:
call_site=mem_cgroup_css_alloc+0xbc/0x5d4 ptr=000000005b12b475
bytes_req=2312 bytes_alloc=2312 gfp_flags=GFP_KERNEL|__GFP_ZERO node=-1
accounted=false
# mem_cgroup_per_node allocation
sh-9269 [003] ..... 80.396411: kmem_cache_alloc:
call_site=mem_cgroup_css_alloc+0x1b8/0x5d4 ptr=00000000f347adc6
bytes_req=2896 bytes_alloc=2896 gfp_flags=GFP_KERNEL|__GFP_ZERO node=0
accounted=false
While the `bytes_alloc` now matches the `bytes_req`, this patchset defaults
to using `SLAB_HWCACHE_ALIGN` as it is generally considered more beneficial
for performance. Please let me know if there are any issues or if I've
misunderstood anything.
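For reference, the aligned sizes above are simply the requests rounded up to
the next cache-line boundary; a tiny sketch, assuming the 64-byte cache lines
typical of this arm64 machine:

#include <stdio.h>

#define CACHELINE	64u
#define ALIGN_UP(x, a)	(((x) + (a) - 1) / (a) * (a))

int main(void)
{
	/* SLAB_HWCACHE_ALIGN rounds the object size up to a cache line */
	printf("mem_cgroup:          %u -> %u\n", 2312u, ALIGN_UP(2312u, CACHELINE));	/* 2368 */
	printf("mem_cgroup_per_node: %u -> %u\n", 2896u, ALIGN_UP(2896u, CACHELINE));	/* 2944 */
	return 0;
}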
Patch3 - introduce the mem_cgroup_early_init() function to pre-allocate
essential resources before cgroup_init() creates the root_mem_cgroup.
Currently it creates memcg_cachep and memcg_pn_cachep, which keeps the
struct allocation paths clean.
ChangeLog:
v1 -> v2:
Patch1-2: minor commit message changes.
Patch3: add mem_cgroup_early_init() to help memcg prepare resources
before cgroup_init().
v1: https://lore.kernel.org/all/20250423084306.65706-1-link@vivo.com/
Huan Yang (3):
mm/memcg: use kmem_cache when alloc memcg
mm/memcg: use kmem_cache when alloc memcg pernode info
mm/memcg: introduce mem_cgroup_early_init
include/linux/memcontrol.h | 5 +++++
init/main.c | 2 ++
mm/memcontrol.c | 29 +++++++++++++++++++++++++++--
3 files changed, 34 insertions(+), 2 deletions(-)
base-commit: 2c9c612abeb38aab0e87d48496de6fd6daafb00b
--
2.48.1
* [PATCH v2 1/3] mm/memcg: use kmem_cache when alloc memcg
2025-04-24 12:09 [PATCH v2 0/3] Use kmem_cache for memcg alloc Huan Yang
@ 2025-04-24 12:09 ` Huan Yang
2025-04-24 12:09 ` [PATCH v2 2/3] mm/memcg: use kmem_cache when alloc memcg pernode info Huan Yang
2025-04-24 12:09 ` [PATCH v2 3/3] mm/memcg: introduce mem_cgroup_early_init Huan Yang
2 siblings, 0 replies; 10+ messages in thread
From: Huan Yang @ 2025-04-24 12:09 UTC (permalink / raw)
To: Johannes Weiner, Michal Hocko, Roman Gushchin, Shakeel Butt,
Muchun Song, Andrew Morton, Petr Mladek,
Sebastian Andrzej Siewior, Huan Yang, Francesco Valla,
Huang Shijie, KP Singh, Paul E. McKenney, Rasmus Villemoes,
Uladzislau Rezki (Sony),
Guo Weikang, Raul E Rangel, cgroups, linux-mm, linux-kernel,
Boqun Feng, Geert Uytterhoeven, Palmer Dabbelt
Cc: opensource.kernel
When tracing mem_cgroup_alloc() with kmalloc ftrace, we observe:
kmalloc: call_site=mem_cgroup_css_alloc+0xd8/0x5b4 ptr=000000003e4c3799
bytes_req=2312 bytes_alloc=4096 gfp_flags=GFP_KERNEL|__GFP_ZERO node=-1
accounted=false
The output indicates that while allocating the mem_cgroup struct (2312 bytes),
the slab allocator actually provides a 4KB chunk. This occurs because:
1. The slab allocator predefines bucket sizes from 64B to 8192B
2. The mem_cgroup allocation size (2312 bytes) falls between the 2KB and
4KB slabs
3. The allocator rounds up to the nearest larger slab (4KB), resulting in
~1.7KB of wasted memory per allocation
This patch introduces a dedicated kmem_cache for mem_cgroup structs,
achieving precise memory allocation. Post-patch ftrace verification shows:
kmem_cache_alloc: call_site=mem_cgroup_css_alloc+0xbc/0x5d4
ptr=00000000695c1806 bytes_req=2312 bytes_alloc=2368
gfp_flags=GFP_KERNEL|__GFP_ZERO node=-1 accounted=false
Each memcg allocation now takes 2368 bytes (including hardware cacheline
alignment), compared to the previous 4KB, avoiding the waste.
Signed-off-by: Huan Yang <link@vivo.com>
---
mm/memcontrol.c | 12 +++++++++++-
1 file changed, 11 insertions(+), 1 deletion(-)
diff --git a/mm/memcontrol.c b/mm/memcontrol.c
index 5e2ea8b8a898..cb32a498e5ae 100644
--- a/mm/memcontrol.c
+++ b/mm/memcontrol.c
@@ -95,6 +95,8 @@ static bool cgroup_memory_nokmem __ro_after_init;
/* BPF memory accounting disabled? */
static bool cgroup_memory_nobpf __ro_after_init;
+static struct kmem_cache *memcg_cachep;
+
#ifdef CONFIG_CGROUP_WRITEBACK
static DECLARE_WAIT_QUEUE_HEAD(memcg_cgwb_frn_waitq);
#endif
@@ -3652,7 +3654,10 @@ static struct mem_cgroup *mem_cgroup_alloc(struct mem_cgroup *parent)
int __maybe_unused i;
long error;
- memcg = kzalloc(struct_size(memcg, nodeinfo, nr_node_ids), GFP_KERNEL);
+ memcg = likely(memcg_cachep) ?
+ kmem_cache_zalloc(memcg_cachep, GFP_KERNEL) :
+ kzalloc(struct_size(memcg, nodeinfo, nr_node_ids),
+ GFP_KERNEL);
if (!memcg)
return ERR_PTR(-ENOMEM);
@@ -5039,6 +5044,7 @@ __setup("cgroup.memory=", cgroup_memory);
static int __init mem_cgroup_init(void)
{
int cpu;
+ unsigned int memcg_size;
/*
* Currently s32 type (can refer to struct batched_lruvec_stat) is
@@ -5055,6 +5061,10 @@ static int __init mem_cgroup_init(void)
INIT_WORK(&per_cpu_ptr(&memcg_stock, cpu)->work,
drain_local_stock);
+ memcg_size = struct_size_t(struct mem_cgroup, nodeinfo, nr_node_ids);
+ memcg_cachep = kmem_cache_create("mem_cgroup", memcg_size, 0,
+ SLAB_PANIC | SLAB_HWCACHE_ALIGN, NULL);
+
return 0;
}
subsys_initcall(mem_cgroup_init);
--
2.48.1
* [PATCH v2 2/3] mm/memcg: use kmem_cache when alloc memcg pernode info
2025-04-24 12:09 [PATCH v2 0/3] Use kmem_cache for memcg alloc Huan Yang
2025-04-24 12:09 ` [PATCH v2 1/3] mm/memcg: use kmem_cache when alloc memcg Huan Yang
@ 2025-04-24 12:09 ` Huan Yang
2025-04-24 12:09 ` [PATCH v2 3/3] mm/memcg: introduce mem_cgroup_early_init Huan Yang
2 siblings, 0 replies; 10+ messages in thread
From: Huan Yang @ 2025-04-24 12:09 UTC (permalink / raw)
To: Johannes Weiner, Michal Hocko, Roman Gushchin, Shakeel Butt,
Muchun Song, Andrew Morton, Petr Mladek,
Sebastian Andrzej Siewior, Huan Yang, Francesco Valla,
Huang Shijie, KP Singh, Paul E. McKenney, Rasmus Villemoes,
Uladzislau Rezki (Sony),
Guo Weikang, Raul E Rangel, cgroups, linux-mm, linux-kernel,
Boqun Feng, Geert Uytterhoeven
Cc: opensource.kernel
When tracing mem_cgroup_per_node allocations with kmalloc ftrace:
kmalloc: call_site=mem_cgroup_css_alloc+0x1d8/0x5b4 ptr=00000000d798700c
bytes_req=2896 bytes_alloc=4096 gfp_flags=GFP_KERNEL|__GFP_ZERO node=0
accounted=false
This reveals that the slab allocator provides a 4KB chunk for the 2896B
mem_cgroup_per_node because:
1. The slab allocator predefines bucket sizes from 64B to 8192B
2. The mem_cgroup_per_node allocation size (2896 bytes) falls between the
2KB and 4KB slabs
3. The allocator rounds up to the nearest larger slab (4KB), resulting in
~1.2KB of wasted memory per memcg, per node.
This patch introduces a dedicated kmem_cache for mem_cgroup_per_node structs,
achieving precise memory allocation. Post-patch ftrace verification shows:
kmem_cache_alloc: call_site=mem_cgroup_css_alloc+0x1b8/0x5d4
ptr=000000002989e63a bytes_req=2896 bytes_alloc=2944
gfp_flags=GFP_KERNEL|__GFP_ZERO node=0 accounted=false
Each mem_cgroup_per_node allocation now takes 2944 bytes (including hardware
cacheline alignment), compared to the previous 4KB, avoiding the waste.
Signed-off-by: Huan Yang <link@vivo.com>
---
mm/memcontrol.c | 9 ++++++++-
1 file changed, 8 insertions(+), 1 deletion(-)
diff --git a/mm/memcontrol.c b/mm/memcontrol.c
index cb32a498e5ae..e8797382aeb4 100644
--- a/mm/memcontrol.c
+++ b/mm/memcontrol.c
@@ -96,6 +96,7 @@ static bool cgroup_memory_nokmem __ro_after_init;
static bool cgroup_memory_nobpf __ro_after_init;
static struct kmem_cache *memcg_cachep;
+static struct kmem_cache *memcg_pn_cachep;
#ifdef CONFIG_CGROUP_WRITEBACK
static DECLARE_WAIT_QUEUE_HEAD(memcg_cgwb_frn_waitq);
@@ -3601,7 +3602,10 @@ static bool alloc_mem_cgroup_per_node_info(struct mem_cgroup *memcg, int node)
{
struct mem_cgroup_per_node *pn;
- pn = kzalloc_node(sizeof(*pn), GFP_KERNEL, node);
+ pn = likely(memcg_pn_cachep) ?
+ kmem_cache_alloc_node(memcg_pn_cachep,
+ GFP_KERNEL | __GFP_ZERO, node) :
+ kzalloc_node(sizeof(*pn), GFP_KERNEL, node);
if (!pn)
return false;
@@ -5065,6 +5069,9 @@ static int __init mem_cgroup_init(void)
memcg_cachep = kmem_cache_create("mem_cgroup", memcg_size, 0,
SLAB_PANIC | SLAB_HWCACHE_ALIGN, NULL);
+ memcg_pn_cachep = KMEM_CACHE(mem_cgroup_per_node,
+ SLAB_PANIC | SLAB_HWCACHE_ALIGN);
+
return 0;
}
subsys_initcall(mem_cgroup_init);
--
2.48.1
* [PATCH v2 3/3] mm/memcg: introduce mem_cgroup_early_init
2025-04-24 12:09 [PATCH v2 0/3] Use kmem_cache for memcg alloc Huan Yang
2025-04-24 12:09 ` [PATCH v2 1/3] mm/memcg: use kmem_cache when alloc memcg Huan Yang
2025-04-24 12:09 ` [PATCH v2 2/3] mm/memcg: use kmem_cache when alloc memcg pernode info Huan Yang
@ 2025-04-24 12:09 ` Huan Yang
2025-04-24 16:00 ` Shakeel Butt
2 siblings, 1 reply; 10+ messages in thread
From: Huan Yang @ 2025-04-24 12:09 UTC (permalink / raw)
To: Johannes Weiner, Michal Hocko, Roman Gushchin, Shakeel Butt,
Muchun Song, Andrew Morton, Petr Mladek,
Sebastian Andrzej Siewior, Huan Yang, Francesco Valla,
Huang Shijie, KP Singh, Paul E. McKenney, Rasmus Villemoes,
Uladzislau Rezki (Sony),
Guo Weikang, Raul E Rangel, cgroups, linux-mm, linux-kernel,
Boqun Feng, Geert Uytterhoeven, Paul Moore,
Mike Rapoport (Microsoft)
Cc: opensource.kernel
When cgroup_init() creates root_mem_cgroup through the css_online callback,
some critical resources might not be fully initialized, forcing later
operations to perform conditional checks for resource availability.
This patch introduces mem_cgroup_early_init() to address the init order:
it is invoked before cgroup_init(), so, unlike mem_cgroup_init() which runs
as an initcall, it can be used to prepare key resources before
root_mem_cgroup is allocated.
Signed-off-by: Huan Yang <link@vivo.com>
Suggested-by: Shakeel Butt <shakeel.butt@linux.dev>
---
include/linux/memcontrol.h | 5 +++++
init/main.c | 2 ++
mm/memcontrol.c | 40 +++++++++++++++++++++++---------------
3 files changed, 31 insertions(+), 16 deletions(-)
diff --git a/include/linux/memcontrol.h b/include/linux/memcontrol.h
index 5264d148bdd9..231f3c577294 100644
--- a/include/linux/memcontrol.h
+++ b/include/linux/memcontrol.h
@@ -1057,6 +1057,7 @@ static inline u64 cgroup_id_from_mm(struct mm_struct *mm)
return id;
}
+extern void mem_cgroup_early_init(void);
#else /* CONFIG_MEMCG */
#define MEM_CGROUP_ID_SHIFT 0
@@ -1472,6 +1473,10 @@ static inline u64 cgroup_id_from_mm(struct mm_struct *mm)
{
return 0;
}
+
+static inline void mem_cgroup_early_init(void)
+{
+}
#endif /* CONFIG_MEMCG */
/*
diff --git a/init/main.c b/init/main.c
index 6b14e6116a1f..fd59d5ba2dc7 100644
--- a/init/main.c
+++ b/init/main.c
@@ -50,6 +50,7 @@
#include <linux/writeback.h>
#include <linux/cpu.h>
#include <linux/cpuset.h>
+#include <linux/memcontrol.h>
#include <linux/cgroup.h>
#include <linux/efi.h>
#include <linux/tick.h>
@@ -1087,6 +1088,7 @@ void start_kernel(void)
nsfs_init();
pidfs_init();
cpuset_init();
+ mem_cgroup_early_init();
cgroup_init();
taskstats_init_early();
delayacct_init();
diff --git a/mm/memcontrol.c b/mm/memcontrol.c
index e8797382aeb4..bef1be3aad6f 100644
--- a/mm/memcontrol.c
+++ b/mm/memcontrol.c
@@ -3602,10 +3602,8 @@ static bool alloc_mem_cgroup_per_node_info(struct mem_cgroup *memcg, int node)
{
struct mem_cgroup_per_node *pn;
- pn = likely(memcg_pn_cachep) ?
- kmem_cache_alloc_node(memcg_pn_cachep,
- GFP_KERNEL | __GFP_ZERO, node) :
- kzalloc_node(sizeof(*pn), GFP_KERNEL, node);
+ pn = kmem_cache_alloc_node(memcg_pn_cachep, GFP_KERNEL | __GFP_ZERO,
+ node);
if (!pn)
return false;
@@ -3658,10 +3656,7 @@ static struct mem_cgroup *mem_cgroup_alloc(struct mem_cgroup *parent)
int __maybe_unused i;
long error;
- memcg = likely(memcg_cachep) ?
- kmem_cache_zalloc(memcg_cachep, GFP_KERNEL) :
- kzalloc(struct_size(memcg, nodeinfo, nr_node_ids),
- GFP_KERNEL);
+ memcg = kmem_cache_zalloc(memcg_cachep, GFP_KERNEL);
if (!memcg)
return ERR_PTR(-ENOMEM);
@@ -5037,6 +5032,27 @@ static int __init cgroup_memory(char *s)
}
__setup("cgroup.memory=", cgroup_memory);
+/**
+ * Called before cgroup_init() creates root_mem_cgroup, so we can prepare
+ * anything here that root_mem_cgroup may need.
+ * This currently initializes:
+ * 1) memcg_cachep - kmem_cache for mem_cgroup struct allocations
+ * 2) memcg_pn_cachep - kmem_cache for mem_cgroup_per_node structs
+ * (one per NUMA node)
+ */
+void __init mem_cgroup_early_init(void)
+{
+ struct mem_cgroup *memcg;
+ unsigned int memcg_size;
+
+ memcg_size = struct_size(memcg, nodeinfo, nr_node_ids);
+ memcg_cachep = kmem_cache_create("mem_cgroup", memcg_size, 0,
+ SLAB_PANIC | SLAB_HWCACHE_ALIGN, NULL);
+
+ memcg_pn_cachep = KMEM_CACHE(mem_cgroup_per_node,
+ SLAB_PANIC | SLAB_HWCACHE_ALIGN);
+}
+
/*
* subsys_initcall() for memory controller.
*
@@ -5048,7 +5064,6 @@ __setup("cgroup.memory=", cgroup_memory);
static int __init mem_cgroup_init(void)
{
int cpu;
- unsigned int memcg_size;
/*
* Currently s32 type (can refer to struct batched_lruvec_stat) is
@@ -5065,13 +5080,6 @@ static int __init mem_cgroup_init(void)
INIT_WORK(&per_cpu_ptr(&memcg_stock, cpu)->work,
drain_local_stock);
- memcg_size = struct_size_t(struct mem_cgroup, nodeinfo, nr_node_ids);
- memcg_cachep = kmem_cache_create("mem_cgroup", memcg_size, 0,
- SLAB_PANIC | SLAB_HWCACHE_ALIGN, NULL);
-
- memcg_pn_cachep = KMEM_CACHE(mem_cgroup_per_node,
- SLAB_PANIC | SLAB_HWCACHE_ALIGN);
-
return 0;
}
subsys_initcall(mem_cgroup_init);
--
2.48.1
* Re: [PATCH v2 3/3] mm/memcg: introduce mem_cgroup_early_init
2025-04-24 12:09 ` [PATCH v2 3/3] mm/memcg: introduce mem_cgroup_early_init Huan Yang
@ 2025-04-24 16:00 ` Shakeel Butt
2025-04-24 23:00 ` Shakeel Butt
2025-04-25 1:11 ` Huan Yang
0 siblings, 2 replies; 10+ messages in thread
From: Shakeel Butt @ 2025-04-24 16:00 UTC (permalink / raw)
To: Huan Yang
Cc: Johannes Weiner, Michal Hocko, Roman Gushchin, Muchun Song,
Andrew Morton, Petr Mladek, Sebastian Andrzej Siewior,
Francesco Valla, Huang Shijie, KP Singh, Paul E. McKenney,
Rasmus Villemoes, Uladzislau Rezki (Sony),
Guo Weikang, Raul E Rangel, cgroups, linux-mm, linux-kernel,
Boqun Feng, Geert Uytterhoeven, Paul Moore,
Mike Rapoport (Microsoft),
opensource.kernel
On Thu, Apr 24, 2025 at 08:09:29PM +0800, Huan Yang wrote:
> When cgroup_init() creates root_mem_cgroup through css_online callback,
> some critical resources might not be fully initialized, forcing later
> operations to perform conditional checks for resource availability.
>
> This patch introduces mem_cgroup_early_init() to address the init order,
> it invoke before cgroup_init, so, compare mem_cgroup_init which invoked
> by initcall, mem_cgroup_early_init can use to prepare some key resources
> before root_mem_cgroup alloc.
>
> Signed-off-by: Huan Yang <link@vivo.com>
> Suggested-by: Shakeel Butt <shakeel.butt@linux.dev>
Please move this patch as the first patch of the series and also remove
the "early" from the function name as it has a different meaning in the
context of cgroup init. Something like either memcg_init() or
memcg_kmem_caches_init().
* Re: [PATCH v2 3/3] mm/memcg: introduce mem_cgroup_early_init
2025-04-24 16:00 ` Shakeel Butt
@ 2025-04-24 23:00 ` Shakeel Butt
2025-04-25 1:11 ` Huan Yang
2025-04-25 1:11 ` Huan Yang
1 sibling, 1 reply; 10+ messages in thread
From: Shakeel Butt @ 2025-04-24 23:00 UTC (permalink / raw)
To: Huan Yang
Cc: Johannes Weiner, Michal Hocko, Roman Gushchin, Muchun Song,
Andrew Morton, Petr Mladek, Sebastian Andrzej Siewior,
Francesco Valla, Huang Shijie, KP Singh, Paul E. McKenney,
Rasmus Villemoes, Uladzislau Rezki (Sony),
Guo Weikang, Raul E Rangel, cgroups, linux-mm, linux-kernel,
Boqun Feng, Geert Uytterhoeven, Paul Moore,
Mike Rapoport (Microsoft),
opensource.kernel
On Thu, Apr 24, 2025 at 09:00:01AM -0700, Shakeel Butt wrote:
> On Thu, Apr 24, 2025 at 08:09:29PM +0800, Huan Yang wrote:
> > When cgroup_init() creates root_mem_cgroup through css_online callback,
> > some critical resources might not be fully initialized, forcing later
> > operations to perform conditional checks for resource availability.
> >
> > This patch introduces mem_cgroup_early_init() to address the init order,
> > it invoke before cgroup_init, so, compare mem_cgroup_init which invoked
> > by initcall, mem_cgroup_early_init can use to prepare some key resources
> > before root_mem_cgroup alloc.
> >
> > Signed-off-by: Huan Yang <link@vivo.com>
> > Suggested-by: Shakeel Butt <shakeel.butt@linux.dev>
>
> Please move this patch as the first patch of the series and also remove
> the "early" from the function name as it has a different meaning in the
> context of cgroup init. Something like either memcg_init() or
> memcg_kmem_caches_init().
BTW I think just putting this kmem cache creation in mem_cgroup_init()
and explicitly calling it before cgroup_init() would be fine. In that
case there would be a single memcg init function.
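A rough sketch of what that single init function could look like
(illustrative only, not part of this series; error handling and the header
declaration are omitted, and the caches are the ones introduced here):

/* mm/memcontrol.c: one init function, called explicitly (sketch only) */
void __init mem_cgroup_init(void)
{
	/* ... the existing per-cpu memcg_stock INIT_WORK setup ... */

	memcg_cachep = kmem_cache_create("mem_cgroup",
			struct_size_t(struct mem_cgroup, nodeinfo, nr_node_ids),
			0, SLAB_PANIC | SLAB_HWCACHE_ALIGN, NULL);
	memcg_pn_cachep = KMEM_CACHE(mem_cgroup_per_node,
				     SLAB_PANIC | SLAB_HWCACHE_ALIGN);
}

/* init/main.c, start_kernel(): no subsys_initcall() needed any more */
	mem_cgroup_init();
	cgroup_init();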
* Re: [PATCH v2 3/3] mm/memcg: introduce mem_cgroup_early_init
2025-04-24 23:00 ` Shakeel Butt
@ 2025-04-25 1:11 ` Huan Yang
2025-04-25 1:30 ` Shakeel Butt
0 siblings, 1 reply; 10+ messages in thread
From: Huan Yang @ 2025-04-25 1:11 UTC (permalink / raw)
To: Shakeel Butt
Cc: Johannes Weiner, Michal Hocko, Roman Gushchin, Muchun Song,
Andrew Morton, Petr Mladek, Sebastian Andrzej Siewior,
Francesco Valla, Huang Shijie, KP Singh, Paul E. McKenney,
Rasmus Villemoes, Uladzislau Rezki (Sony),
Guo Weikang, Raul E Rangel, cgroups, linux-mm, linux-kernel,
Boqun Feng, Geert Uytterhoeven, Paul Moore,
Mike Rapoport (Microsoft),
opensource.kernel
Hi Shakeel
On 2025/4/25 07:00, Shakeel Butt wrote:
> On Thu, Apr 24, 2025 at 09:00:01AM -0700, Shakeel Butt wrote:
>> On Thu, Apr 24, 2025 at 08:09:29PM +0800, Huan Yang wrote:
>>> When cgroup_init() creates root_mem_cgroup through css_online callback,
>>> some critical resources might not be fully initialized, forcing later
>>> operations to perform conditional checks for resource availability.
>>>
>>> This patch introduces mem_cgroup_early_init() to address the init order,
>>> it invoke before cgroup_init, so, compare mem_cgroup_init which invoked
>>> by initcall, mem_cgroup_early_init can use to prepare some key resources
>>> before root_mem_cgroup alloc.
>>>
>>> Signed-off-by: Huan Yang <link@vivo.com>
>>> Suggested-by: Shakeel Butt <shakeel.butt@linux.dev>
>> Please move this patch as the first patch of the series and also remove
>> the "early" from the function name as it has a different meaning in the
>> context of cgroup init. Something like either memcg_init() or
>> memcg_kmem_caches_init().
> BTW I think just putting this kmem cache creation in mem_cgroup_init()
> and explicitly calling it before cgroup_init() would be fine. In that
> case there would be a single memcg init function.
Maybe someone will also need to init something after cgroup init is done?
Currently no, but it may be needed in the future?
So: memcg_init(), then cgroup_init(), then initcall -> mem_cgroup_init().
Thanks,
Huan
* Re: [PATCH v2 3/3] mm/memcg: introduce mem_cgroup_early_init
2025-04-24 16:00 ` Shakeel Butt
2025-04-24 23:00 ` Shakeel Butt
@ 2025-04-25 1:11 ` Huan Yang
1 sibling, 0 replies; 10+ messages in thread
From: Huan Yang @ 2025-04-25 1:11 UTC (permalink / raw)
To: Shakeel Butt
Cc: Johannes Weiner, Michal Hocko, Roman Gushchin, Muchun Song,
Andrew Morton, Petr Mladek, Sebastian Andrzej Siewior,
Francesco Valla, Huang Shijie, KP Singh, Paul E. McKenney,
Rasmus Villemoes, Uladzislau Rezki (Sony),
Guo Weikang, Raul E Rangel, cgroups, linux-mm, linux-kernel,
Boqun Feng, Geert Uytterhoeven, Paul Moore,
Mike Rapoport (Microsoft),
opensource.kernel
On 2025/4/25 00:00, Shakeel Butt wrote:
> On Thu, Apr 24, 2025 at 08:09:29PM +0800, Huan Yang wrote:
>> When cgroup_init() creates root_mem_cgroup through css_online callback,
>> some critical resources might not be fully initialized, forcing later
>> operations to perform conditional checks for resource availability.
>>
>> This patch introduces mem_cgroup_early_init() to address the init order,
>> it invoke before cgroup_init, so, compare mem_cgroup_init which invoked
>> by initcall, mem_cgroup_early_init can use to prepare some key resources
>> before root_mem_cgroup alloc.
>>
>> Signed-off-by: Huan Yang <link@vivo.com>
>> Suggested-by: Shakeel Butt <shakeel.butt@linux.dev>
> Please move this patch as the first patch of the series and also remove
> the "early" from the function name as it has a different meaning in the
OK,
Thanks.
> context of cgroup init. Something like either memcg_init() or
> memcg_kmem_caches_init().
* Re: [PATCH v2 3/3] mm/memcg: introduce mem_cgroup_early_init
2025-04-25 1:11 ` Huan Yang
@ 2025-04-25 1:30 ` Shakeel Butt
2025-04-25 1:55 ` Huan Yang
0 siblings, 1 reply; 10+ messages in thread
From: Shakeel Butt @ 2025-04-25 1:30 UTC (permalink / raw)
To: Huan Yang
Cc: Johannes Weiner, Michal Hocko, Roman Gushchin, Muchun Song,
Andrew Morton, Petr Mladek, Sebastian Andrzej Siewior,
Francesco Valla, Huang Shijie, KP Singh, Paul E. McKenney,
Rasmus Villemoes, Uladzislau Rezki (Sony),
Guo Weikang, Raul E Rangel, cgroups, linux-mm, linux-kernel,
Boqun Feng, Geert Uytterhoeven, Paul Moore,
Mike Rapoport (Microsoft),
opensource.kernel
On Fri, Apr 25, 2025 at 09:11:01AM +0800, Huan Yang wrote:
> Hi Shakeel
>
> On 2025/4/25 07:00, Shakeel Butt wrote:
> > On Thu, Apr 24, 2025 at 09:00:01AM -0700, Shakeel Butt wrote:
> > > On Thu, Apr 24, 2025 at 08:09:29PM +0800, Huan Yang wrote:
> > > > When cgroup_init() creates root_mem_cgroup through css_online callback,
> > > > some critical resources might not be fully initialized, forcing later
> > > > operations to perform conditional checks for resource availability.
> > > >
> > > > This patch introduces mem_cgroup_early_init() to address the init order,
> > > > it invoke before cgroup_init, so, compare mem_cgroup_init which invoked
> > > > by initcall, mem_cgroup_early_init can use to prepare some key resources
> > > > before root_mem_cgroup alloc.
> > > >
> > > > Signed-off-by: Huan Yang <link@vivo.com>
> > > > Suggested-by: Shakeel Butt <shakeel.butt@linux.dev>
> > > Please move this patch as the first patch of the series and also remove
> > > the "early" from the function name as it has a different meaning in the
> > > context of cgroup init. Something like either memcg_init() or
> > > memcg_kmem_caches_init().
> > BTW I think just putting this kmem cache creation in mem_cgroup_init()
> > and explicitly calling it before cgroup_init() would be fine. In that
> > case there would be a single memcg init function.
>
> Maybe someone will also need to init something after cgroup init is done?
>
> Currently no, but it may be needed in the future?
If that is needed in future then that can be done in future. I would say
simply call mem_cgroup_init() before cgroup_init() for now.
* Re: [PATCH v2 3/3] mm/memcg: introduce mem_cgroup_early_init
2025-04-25 1:30 ` Shakeel Butt
@ 2025-04-25 1:55 ` Huan Yang
0 siblings, 0 replies; 10+ messages in thread
From: Huan Yang @ 2025-04-25 1:55 UTC (permalink / raw)
To: Shakeel Butt
Cc: Johannes Weiner, Michal Hocko, Roman Gushchin, Muchun Song,
Andrew Morton, Petr Mladek, Sebastian Andrzej Siewior,
Francesco Valla, Huang Shijie, KP Singh, Paul E. McKenney,
Rasmus Villemoes, Uladzislau Rezki (Sony),
Guo Weikang, Raul E Rangel, cgroups, linux-mm, linux-kernel,
Boqun Feng, Geert Uytterhoeven, Paul Moore,
Mike Rapoport (Microsoft),
opensource.kernel
On 2025/4/25 09:30, Shakeel Butt wrote:
> On Fri, Apr 25, 2025 at 09:11:01AM +0800, Huan Yang wrote:
>> Hi Shakeel
>>
>> On 2025/4/25 07:00, Shakeel Butt wrote:
>>> On Thu, Apr 24, 2025 at 09:00:01AM -0700, Shakeel Butt wrote:
>>>> On Thu, Apr 24, 2025 at 08:09:29PM +0800, Huan Yang wrote:
>>>>> When cgroup_init() creates root_mem_cgroup through css_online callback,
>>>>> some critical resources might not be fully initialized, forcing later
>>>>> operations to perform conditional checks for resource availability.
>>>>>
>>>>> This patch introduces mem_cgroup_early_init() to address the init order,
>>>>> it invoke before cgroup_init, so, compare mem_cgroup_init which invoked
>>>>> by initcall, mem_cgroup_early_init can use to prepare some key resources
>>>>> before root_mem_cgroup alloc.
>>>>>
>>>>> Signed-off-by: Huan Yang <link@vivo.com>
>>>>> Suggested-by: Shakeel Butt <shakeel.butt@linux.dev>
>>>> Please move this patch as the first patch of the series and also remove
>>>> the "early" from the function name as it has a different meaning in the
>>>> context of cgroup init. Something like either memcg_init() or
>>>> memcg_kmem_caches_init().
>>> BTW I think just putting this kmem cache creation in mem_cgroup_init()
>>> and explicitly calling it before cgroup_init() would be fine. In that
>>> case there would be a single memcg init function.
>> Maybe someone will also need to init something after cgroup init is done?
>>
>> Currently no, but it may be needed in the future?
> If that is needed in future then that can be done in future. I would say
Yes, that's right.
> simply call mem_cgroup_init() before cgroup_init() for now.
OK, I'll do it.
Thanks.