linux-mm.kvack.org archive mirror
* [PATCH 0/2] mm: vmscan: add PID and cgroup ID to vmscan tracepoints
@ 2025-12-08 18:14 Thomas Ballasi
  2025-12-08 18:14 ` [PATCH 1/2] mm: vmscan: add cgroup IDs " Thomas Ballasi
                   ` (2 more replies)
  0 siblings, 3 replies; 24+ messages in thread
From: Thomas Ballasi @ 2025-12-08 18:14 UTC (permalink / raw)
  To: Steven Rostedt, Masami Hiramatsu, Andrew Morton
  Cc: linux-mm, linux-trace-kernel

Attributing vmscan tracepoint events to specific processes or cgroups
can be challenging in some scenarios.  Adding PIDs and cgroup IDs to
these tracepoints provides the context needed to improve analysis.

Signed-off-by: Thomas Ballasi <tballasi@linux.microsoft.com>

Thomas Ballasi (2):
  mm: vmscan: add cgroup IDs to vmscan tracepoints
  mm: vmscan: add PIDs to vmscan tracepoints

 include/trace/events/vmscan.h | 77 +++++++++++++++++++++++------------
 mm/vmscan.c                   | 17 ++++----
 2 files changed, 60 insertions(+), 34 deletions(-)

-- 
2.33.8




* [PATCH 1/2] mm: vmscan: add cgroup IDs to vmscan tracepoints
  2025-12-08 18:14 [PATCH 0/2] mm: vmscan: add PID and cgroup ID to vmscan tracepoints Thomas Ballasi
@ 2025-12-08 18:14 ` Thomas Ballasi
  2025-12-08 18:14 ` [PATCH 2/2] mm: vmscan: add PIDs " Thomas Ballasi
  2025-12-16 14:02 ` [PATCH v2 0/2] mm: vmscan: add PID and cgroup ID " Thomas Ballasi
  2 siblings, 0 replies; 24+ messages in thread
From: Thomas Ballasi @ 2025-12-08 18:14 UTC (permalink / raw)
  To: Steven Rostedt, Masami Hiramatsu, Andrew Morton
  Cc: linux-mm, linux-trace-kernel

Memory reclaim events are currently difficult to attribute to
specific cgroups, which makes memory pressure issues hard to
debug.  This patch adds the memory cgroup ID (memcg_id) to key
vmscan tracepoints to enable better correlation and analysis.

For operations not associated with a specific cgroup, the field
defaults to 0.

Signed-off-by: Thomas Ballasi <tballasi@linux.microsoft.com>
---
 include/trace/events/vmscan.h | 65 +++++++++++++++++++++--------------
 mm/vmscan.c                   | 17 ++++-----
 2 files changed, 48 insertions(+), 34 deletions(-)

diff --git a/include/trace/events/vmscan.h b/include/trace/events/vmscan.h
index d2123dd960d59..afc9f80d03f34 100644
--- a/include/trace/events/vmscan.h
+++ b/include/trace/events/vmscan.h
@@ -114,85 +114,92 @@ TRACE_EVENT(mm_vmscan_wakeup_kswapd,
 
 DECLARE_EVENT_CLASS(mm_vmscan_direct_reclaim_begin_template,
 
-	TP_PROTO(int order, gfp_t gfp_flags),
+	TP_PROTO(int order, gfp_t gfp_flags, unsigned short memcg_id),
 
-	TP_ARGS(order, gfp_flags),
+	TP_ARGS(order, gfp_flags, memcg_id),
 
 	TP_STRUCT__entry(
 		__field(	int,	order		)
 		__field(	unsigned long,	gfp_flags	)
+		__field(	unsigned short,	memcg_id	)
 	),
 
 	TP_fast_assign(
 		__entry->order		= order;
 		__entry->gfp_flags	= (__force unsigned long)gfp_flags;
+		__entry->memcg_id	= memcg_id;
 	),
 
-	TP_printk("order=%d gfp_flags=%s",
+	TP_printk("order=%d gfp_flags=%s memcg_id=%u",
 		__entry->order,
-		show_gfp_flags(__entry->gfp_flags))
+		show_gfp_flags(__entry->gfp_flags),
+		__entry->memcg_id)
 );
 
 DEFINE_EVENT(mm_vmscan_direct_reclaim_begin_template, mm_vmscan_direct_reclaim_begin,
 
-	TP_PROTO(int order, gfp_t gfp_flags),
+	TP_PROTO(int order, gfp_t gfp_flags, unsigned short memcg_id),
 
-	TP_ARGS(order, gfp_flags)
+	TP_ARGS(order, gfp_flags, memcg_id)
 );
 
 #ifdef CONFIG_MEMCG
 DEFINE_EVENT(mm_vmscan_direct_reclaim_begin_template, mm_vmscan_memcg_reclaim_begin,
 
-	TP_PROTO(int order, gfp_t gfp_flags),
+	TP_PROTO(int order, gfp_t gfp_flags, unsigned short memcg_id),
 
-	TP_ARGS(order, gfp_flags)
+	TP_ARGS(order, gfp_flags, memcg_id)
 );
 
 DEFINE_EVENT(mm_vmscan_direct_reclaim_begin_template, mm_vmscan_memcg_softlimit_reclaim_begin,
 
-	TP_PROTO(int order, gfp_t gfp_flags),
+	TP_PROTO(int order, gfp_t gfp_flags, unsigned short memcg_id),
 
-	TP_ARGS(order, gfp_flags)
+	TP_ARGS(order, gfp_flags, memcg_id)
 );
 #endif /* CONFIG_MEMCG */
 
 DECLARE_EVENT_CLASS(mm_vmscan_direct_reclaim_end_template,
 
-	TP_PROTO(unsigned long nr_reclaimed),
+	TP_PROTO(unsigned long nr_reclaimed, unsigned short memcg_id),
 
-	TP_ARGS(nr_reclaimed),
+	TP_ARGS(nr_reclaimed, memcg_id),
 
 	TP_STRUCT__entry(
 		__field(	unsigned long,	nr_reclaimed	)
+		__field(	unsigned short,	memcg_id	)
 	),
 
 	TP_fast_assign(
 		__entry->nr_reclaimed	= nr_reclaimed;
+		__entry->memcg_id	= memcg_id;
 	),
 
-	TP_printk("nr_reclaimed=%lu", __entry->nr_reclaimed)
+	TP_printk("nr_reclaimed=%lu memcg_id=%u",
+		__entry->nr_reclaimed,
+		__entry->memcg_id)
 );
 
 DEFINE_EVENT(mm_vmscan_direct_reclaim_end_template, mm_vmscan_direct_reclaim_end,
 
-	TP_PROTO(unsigned long nr_reclaimed),
+	TP_PROTO(unsigned long nr_reclaimed, unsigned short memcg_id),
 
-	TP_ARGS(nr_reclaimed)
+	TP_ARGS(nr_reclaimed, memcg_id)
 );
 
 #ifdef CONFIG_MEMCG
 DEFINE_EVENT(mm_vmscan_direct_reclaim_end_template, mm_vmscan_memcg_reclaim_end,
 
-	TP_PROTO(unsigned long nr_reclaimed),
+	TP_PROTO(unsigned long nr_reclaimed, unsigned short memcg_id),
 
-	TP_ARGS(nr_reclaimed)
+	TP_ARGS(nr_reclaimed, memcg_id)
 );
 
 DEFINE_EVENT(mm_vmscan_direct_reclaim_end_template, mm_vmscan_memcg_softlimit_reclaim_end,
 
-	TP_PROTO(unsigned long nr_reclaimed),
+	TP_PROTO(unsigned long nr_reclaimed, unsigned short memcg_id),
 
-	TP_ARGS(nr_reclaimed)
+	TP_ARGS(nr_reclaimed, memcg_id)
 );
 #endif /* CONFIG_MEMCG */
 
@@ -209,6 +216,7 @@ TRACE_EVENT(mm_shrink_slab_start,
 		__field(struct shrinker *, shr)
 		__field(void *, shrink)
 		__field(int, nid)
+		__field(unsigned short, memcg_id)
 		__field(long, nr_objects_to_shrink)
 		__field(unsigned long, gfp_flags)
 		__field(unsigned long, cache_items)
@@ -221,6 +229,7 @@ TRACE_EVENT(mm_shrink_slab_start,
 		__entry->shr = shr;
 		__entry->shrink = shr->scan_objects;
 		__entry->nid = sc->nid;
+		__entry->memcg_id = sc->memcg ? mem_cgroup_id(sc->memcg) : 0;
 		__entry->nr_objects_to_shrink = nr_objects_to_shrink;
 		__entry->gfp_flags = (__force unsigned long)sc->gfp_mask;
 		__entry->cache_items = cache_items;
@@ -229,10 +238,11 @@ TRACE_EVENT(mm_shrink_slab_start,
 		__entry->priority = priority;
 	),
 
-	TP_printk("%pS %p: nid: %d objects to shrink %ld gfp_flags %s cache items %ld delta %lld total_scan %ld priority %d",
+	TP_printk("%pS %p: nid: %d memcg_id: %u objects to shrink %ld gfp_flags %s cache items %ld delta %lld total_scan %ld priority %d",
 		__entry->shrink,
 		__entry->shr,
 		__entry->nid,
+		__entry->memcg_id,
 		__entry->nr_objects_to_shrink,
 		show_gfp_flags(__entry->gfp_flags),
 		__entry->cache_items,
@@ -242,15 +252,16 @@ TRACE_EVENT(mm_shrink_slab_start,
 );
 
 TRACE_EVENT(mm_shrink_slab_end,
-	TP_PROTO(struct shrinker *shr, int nid, int shrinker_retval,
+	TP_PROTO(struct shrinker *shr, struct shrink_control *sc, int shrinker_retval,
 		long unused_scan_cnt, long new_scan_cnt, long total_scan),
 
-	TP_ARGS(shr, nid, shrinker_retval, unused_scan_cnt, new_scan_cnt,
+	TP_ARGS(shr, sc, shrinker_retval, unused_scan_cnt, new_scan_cnt,
 		total_scan),
 
 	TP_STRUCT__entry(
 		__field(struct shrinker *, shr)
 		__field(int, nid)
+		__field(unsigned short, memcg_id)
 		__field(void *, shrink)
 		__field(long, unused_scan)
 		__field(long, new_scan)
@@ -260,7 +271,8 @@ TRACE_EVENT(mm_shrink_slab_end,
 
 	TP_fast_assign(
 		__entry->shr = shr;
-		__entry->nid = nid;
+		__entry->nid = sc->nid;
+		__entry->memcg_id = sc->memcg ? mem_cgroup_id(sc->memcg) : 0;
 		__entry->shrink = shr->scan_objects;
 		__entry->unused_scan = unused_scan_cnt;
 		__entry->new_scan = new_scan_cnt;
@@ -268,10 +280,11 @@ TRACE_EVENT(mm_shrink_slab_end,
 		__entry->total_scan = total_scan;
 	),
 
-	TP_printk("%pS %p: nid: %d unused scan count %ld new scan count %ld total_scan %ld last shrinker return val %d",
+	TP_printk("%pS %p: nid: %d memcg_id: %u unused scan count %ld new scan count %ld total_scan %ld last shrinker return val %d",
 		__entry->shrink,
 		__entry->shr,
 		__entry->nid,
+		__entry->memcg_id,
 		__entry->unused_scan,
 		__entry->new_scan,
 		__entry->total_scan,
@@ -463,9 +476,9 @@ TRACE_EVENT(mm_vmscan_node_reclaim_begin,
 
 DEFINE_EVENT(mm_vmscan_direct_reclaim_end_template, mm_vmscan_node_reclaim_end,
 
-	TP_PROTO(unsigned long nr_reclaimed),
+	TP_PROTO(unsigned long nr_reclaimed, unsigned short memcg_id),
 
-	TP_ARGS(nr_reclaimed)
+	TP_ARGS(nr_reclaimed, memcg_id)
 );
 
 TRACE_EVENT(mm_vmscan_throttled,
diff --git a/mm/vmscan.c b/mm/vmscan.c
index 258f5472f1e90..0e65ec3a087a5 100644
--- a/mm/vmscan.c
+++ b/mm/vmscan.c
@@ -931,7 +931,7 @@ static unsigned long do_shrink_slab(struct shrink_control *shrinkctl,
 	 */
 	new_nr = add_nr_deferred(next_deferred, shrinker, shrinkctl);
 
-	trace_mm_shrink_slab_end(shrinker, shrinkctl->nid, freed, nr, new_nr, total_scan);
+	trace_mm_shrink_slab_end(shrinker, shrinkctl, freed, nr, new_nr, total_scan);
 	return freed;
 }
 
@@ -7092,11 +7092,11 @@ unsigned long try_to_free_pages(struct zonelist *zonelist, int order,
 		return 1;
 
 	set_task_reclaim_state(current, &sc.reclaim_state);
-	trace_mm_vmscan_direct_reclaim_begin(order, sc.gfp_mask);
+	trace_mm_vmscan_direct_reclaim_begin(order, sc.gfp_mask, 0);
 
 	nr_reclaimed = do_try_to_free_pages(zonelist, &sc);
 
-	trace_mm_vmscan_direct_reclaim_end(nr_reclaimed);
+	trace_mm_vmscan_direct_reclaim_end(nr_reclaimed, 0);
 	set_task_reclaim_state(current, NULL);
 
 	return nr_reclaimed;
@@ -7126,7 +7126,8 @@ unsigned long mem_cgroup_shrink_node(struct mem_cgroup *memcg,
 			(GFP_HIGHUSER_MOVABLE & ~GFP_RECLAIM_MASK);
 
 	trace_mm_vmscan_memcg_softlimit_reclaim_begin(sc.order,
-						      sc.gfp_mask);
+						      sc.gfp_mask,
+						      mem_cgroup_id(memcg));
 
 	/*
 	 * NOTE: Although we can get the priority field, using it
@@ -7137,7 +7138,7 @@ unsigned long mem_cgroup_shrink_node(struct mem_cgroup *memcg,
 	 */
 	shrink_lruvec(lruvec, &sc);
 
-	trace_mm_vmscan_memcg_softlimit_reclaim_end(sc.nr_reclaimed);
+	trace_mm_vmscan_memcg_softlimit_reclaim_end(sc.nr_reclaimed, mem_cgroup_id(memcg));
 
 	*nr_scanned = sc.nr_scanned;
 
@@ -7171,13 +7172,13 @@ unsigned long try_to_free_mem_cgroup_pages(struct mem_cgroup *memcg,
 	struct zonelist *zonelist = node_zonelist(numa_node_id(), sc.gfp_mask);
 
 	set_task_reclaim_state(current, &sc.reclaim_state);
-	trace_mm_vmscan_memcg_reclaim_begin(0, sc.gfp_mask);
+	trace_mm_vmscan_memcg_reclaim_begin(0, sc.gfp_mask, mem_cgroup_id(memcg));
 	noreclaim_flag = memalloc_noreclaim_save();
 
 	nr_reclaimed = do_try_to_free_pages(zonelist, &sc);
 
 	memalloc_noreclaim_restore(noreclaim_flag);
-	trace_mm_vmscan_memcg_reclaim_end(nr_reclaimed);
+	trace_mm_vmscan_memcg_reclaim_end(nr_reclaimed, mem_cgroup_id(memcg));
 	set_task_reclaim_state(current, NULL);
 
 	return nr_reclaimed;
@@ -8072,7 +8073,7 @@ static int __node_reclaim(struct pglist_data *pgdat, gfp_t gfp_mask, unsigned in
 	fs_reclaim_release(sc.gfp_mask);
 	psi_memstall_leave(&pflags);
 
-	trace_mm_vmscan_node_reclaim_end(sc.nr_reclaimed);
+	trace_mm_vmscan_node_reclaim_end(sc.nr_reclaimed, 0);
 
 	return sc.nr_reclaimed >= nr_pages;
 }
-- 
2.33.8
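
As an illustration of how the new memcg_id field could be consumed, here
is a minimal user-space sketch that sums nr_reclaimed per cgroup ID from
the text of the *_reclaim_end events.  The helper names and the sample
line layout are hypothetical and not part of the patch; only the
"nr_reclaimed=" and "memcg_id=" keys come from the TP_printk() format
above.  Index 0 collects events not tied to a specific cgroup.

```c
#include <stdlib.h>
#include <string.h>

#define MAX_MEMCG_IDS 65536	/* memcg_id is an unsigned short in the patch */

/* Pages reclaimed per memcg_id; slot 0 holds events with no cgroup. */
unsigned long reclaimed_by_memcg[MAX_MEMCG_IDS];

/* Hypothetical helper: find "key" in line and parse the decimal after it. */
static int parse_ulong_field(const char *line, const char *key,
			     unsigned long *out)
{
	const char *p = strstr(line, key);
	char *end;

	if (!p)
		return -1;
	p += strlen(key);
	*out = strtoul(p, &end, 10);
	return end == p ? -1 : 0;	/* -1 if no digits followed the key */
}

/*
 * Account one *_reclaim_end line.  Returns 0 on success, -1 if either
 * field is missing or the ID is out of range; nothing is accumulated
 * on failure.
 */
int account_reclaim_end(const char *line)
{
	unsigned long nr_reclaimed, memcg_id;

	if (parse_ulong_field(line, "nr_reclaimed=", &nr_reclaimed))
		return -1;
	if (parse_ulong_field(line, "memcg_id=", &memcg_id))
		return -1;
	if (memcg_id >= MAX_MEMCG_IDS)
		return -1;

	reclaimed_by_memcg[memcg_id] += nr_reclaimed;
	return 0;
}
```

Because each key is looked up separately, the sketch does not depend on
the two fields being adjacent in the output.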




* [PATCH 2/2] mm: vmscan: add PIDs to vmscan tracepoints
  2025-12-08 18:14 [PATCH 0/2] mm: vmscan: add PID and cgroup ID to vmscan tracepoints Thomas Ballasi
  2025-12-08 18:14 ` [PATCH 1/2] mm: vmscan: add cgroup IDs " Thomas Ballasi
@ 2025-12-08 18:14 ` Thomas Ballasi
  2025-12-10  3:09   ` Steven Rostedt
  2025-12-16 14:02 ` [PATCH v2 0/2] mm: vmscan: add PID and cgroup ID " Thomas Ballasi
  2 siblings, 1 reply; 24+ messages in thread
From: Thomas Ballasi @ 2025-12-08 18:14 UTC (permalink / raw)
  To: Steven Rostedt, Masami Hiramatsu, Andrew Morton
  Cc: linux-mm, linux-trace-kernel

This change adds additional tracepoint variables to help debuggers
attribute events to specific processes.

The PID field uses in_task() to reliably detect when we're in process
context and can safely access current->pid.  When not in process
context (such as in interrupt or in an asynchronous RCU context), the
field is set to -1 as a sentinel value.

Signed-off-by: Thomas Ballasi <tballasi@linux.microsoft.com>
---
 include/trace/events/vmscan.h | 20 ++++++++++++++++----
 1 file changed, 16 insertions(+), 4 deletions(-)

diff --git a/include/trace/events/vmscan.h b/include/trace/events/vmscan.h
index afc9f80d03f34..eddb4e75e2e23 100644
--- a/include/trace/events/vmscan.h
+++ b/include/trace/events/vmscan.h
@@ -121,18 +121,21 @@ DECLARE_EVENT_CLASS(mm_vmscan_direct_reclaim_begin_template,
 	TP_STRUCT__entry(
 		__field(	int,	order		)
 		__field(	unsigned long,	gfp_flags	)
+		__field(	int,	pid		)
 		__field(	unsigned short,	memcg_id	)
 	),
 
 	TP_fast_assign(
 		__entry->order		= order;
 		__entry->gfp_flags	= (__force unsigned long)gfp_flags;
+		__entry->pid		= in_task() ? current->pid : -1;
 		__entry->memcg_id	= memcg_id;
 	),
 
-	TP_printk("order=%d gfp_flags=%s memcg_id=%u",
+	TP_printk("order=%d gfp_flags=%s pid=%d memcg_id=%u",
 		__entry->order,
 		show_gfp_flags(__entry->gfp_flags),
+		__entry->pid,
 		__entry->memcg_id)
 );
 
@@ -167,16 +170,19 @@ DECLARE_EVENT_CLASS(mm_vmscan_direct_reclaim_end_template,
 
 	TP_STRUCT__entry(
 		__field(	unsigned long,	nr_reclaimed	)
+		__field(	int,	pid		)
 		__field(	unsigned short,	memcg_id	)
 	),
 
 	TP_fast_assign(
 		__entry->nr_reclaimed	= nr_reclaimed;
+		__entry->pid		= in_task() ? current->pid : -1;
 		__entry->memcg_id	= memcg_id;
 	),
 
-	TP_printk("nr_reclaimed=%lu memcg_id=%u",
+	TP_printk("nr_reclaimed=%lu pid=%d memcg_id=%u",
 		__entry->nr_reclaimed,
+		__entry->pid,
 		__entry->memcg_id)
 );
 
@@ -216,6 +222,7 @@ TRACE_EVENT(mm_shrink_slab_start,
 		__field(struct shrinker *, shr)
 		__field(void *, shrink)
 		__field(int, nid)
+		__field(int, pid)
 		__field(unsigned short, memcg_id)
 		__field(long, nr_objects_to_shrink)
 		__field(unsigned long, gfp_flags)
@@ -229,6 +236,7 @@ TRACE_EVENT(mm_shrink_slab_start,
 		__entry->shr = shr;
 		__entry->shrink = shr->scan_objects;
 		__entry->nid = sc->nid;
+		__entry->pid = in_task() ? current->pid : -1;
 		__entry->memcg_id = sc->memcg ? mem_cgroup_id(sc->memcg) : 0;
 		__entry->nr_objects_to_shrink = nr_objects_to_shrink;
 		__entry->gfp_flags = (__force unsigned long)sc->gfp_mask;
@@ -238,10 +246,11 @@ TRACE_EVENT(mm_shrink_slab_start,
 		__entry->priority = priority;
 	),
 
-	TP_printk("%pS %p: nid: %d memcg_id: %u objects to shrink %ld gfp_flags %s cache items %ld delta %lld total_scan %ld priority %d",
+	TP_printk("%pS %p: nid: %d pid: %d memcg_id: %u objects to shrink %ld gfp_flags %s cache items %ld delta %lld total_scan %ld priority %d",
 		__entry->shrink,
 		__entry->shr,
 		__entry->nid,
+		__entry->pid,
 		__entry->memcg_id,
 		__entry->nr_objects_to_shrink,
 		show_gfp_flags(__entry->gfp_flags),
@@ -261,6 +270,7 @@ TRACE_EVENT(mm_shrink_slab_end,
 	TP_STRUCT__entry(
 		__field(struct shrinker *, shr)
 		__field(int, nid)
+		__field(int, pid)
 		__field(unsigned short, memcg_id)
 		__field(void *, shrink)
 		__field(long, unused_scan)
@@ -272,6 +282,7 @@ TRACE_EVENT(mm_shrink_slab_end,
 	TP_fast_assign(
 		__entry->shr = shr;
 		__entry->nid = sc->nid;
+		__entry->pid = in_task() ? current->pid : -1;
 		__entry->memcg_id = sc->memcg ? mem_cgroup_id(sc->memcg) : 0;
 		__entry->shrink = shr->scan_objects;
 		__entry->unused_scan = unused_scan_cnt;
@@ -280,10 +291,11 @@ TRACE_EVENT(mm_shrink_slab_end,
 		__entry->total_scan = total_scan;
 	),
 
-	TP_printk("%pS %p: nid: %d memcg_id: %u unused scan count %ld new scan count %ld total_scan %ld last shrinker return val %d",
+	TP_printk("%pS %p: nid: %d pid: %d memcg_id: %u unused scan count %ld new scan count %ld total_scan %ld last shrinker return val %d",
 		__entry->shrink,
 		__entry->shr,
 		__entry->nid,
+		__entry->pid,
 		__entry->memcg_id,
 		__entry->unused_scan,
 		__entry->new_scan,
-- 
2.33.8
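
A trace consumer has to treat the -1 sentinel described in the commit
message differently from a real PID.  The following is a minimal
user-space sketch under that assumption; the enum and function names are
hypothetical, and it only handles the "pid=%d" spelling used by the
reclaim begin/end templates (the shrink_slab events print "pid: %d"
instead).

```c
#include <stdio.h>
#include <string.h>

enum pid_origin {
	PID_MISSING,		/* line has no pid= field */
	PID_NOT_IN_TASK,	/* pid=-1: event was not recorded in task context */
	PID_TASK		/* pid is a real task PID */
};

/*
 * Extract the pid= value from one formatted event line and classify it.
 * The leading space in " pid=" avoids matching keys that merely end in
 * "pid=".  *pid is only meaningful when PID_MISSING is not returned.
 */
enum pid_origin classify_pid(const char *line, int *pid)
{
	const char *p = strstr(line, " pid=");

	if (!p || sscanf(p, " pid=%d", pid) != 1)
		return PID_MISSING;

	return *pid == -1 ? PID_NOT_IN_TASK : PID_TASK;
}
```

A caller can then skip per-process accounting for PID_NOT_IN_TASK events
instead of attributing them to a bogus process.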




* Re: [PATCH 2/2] mm: vmscan: add PIDs to vmscan tracepoints
  2025-12-08 18:14 ` [PATCH 2/2] mm: vmscan: add PIDs " Thomas Ballasi
@ 2025-12-10  3:09   ` Steven Rostedt
  0 siblings, 0 replies; 24+ messages in thread
From: Steven Rostedt @ 2025-12-10  3:09 UTC (permalink / raw)
  To: Thomas Ballasi
  Cc: Masami Hiramatsu, Andrew Morton, linux-mm, linux-trace-kernel

On Mon,  8 Dec 2025 10:14:13 -0800
Thomas Ballasi <tballasi@linux.microsoft.com> wrote:

> ---
>  include/trace/events/vmscan.h | 20 ++++++++++++++++----
>  1 file changed, 16 insertions(+), 4 deletions(-)
> 
> diff --git a/include/trace/events/vmscan.h b/include/trace/events/vmscan.h
> index afc9f80d03f34..eddb4e75e2e23 100644
> --- a/include/trace/events/vmscan.h
> +++ b/include/trace/events/vmscan.h
> @@ -121,18 +121,21 @@ DECLARE_EVENT_CLASS(mm_vmscan_direct_reclaim_begin_template,
>  	TP_STRUCT__entry(
>  		__field(	int,	order		)
>  		__field(	unsigned long,	gfp_flags	)
> +		__field(	int,	pid		)

This puts a hole in the ring buffer on 64-bit machines. Please keep pid
next to order, as they are both 'int', rather than having an "unsigned
long" between the two.
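
The hole can be seen with a small user-space sketch.  The two structs
below are hypothetical stand-ins for the payload fields only (the real
entry is generated by TP_STRUCT__entry() and begins with a common
header, omitted here), and the numbers assume a typical LP64 target
where unsigned long is 8 bytes and 8-byte aligned.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

struct entry_v1 {		/* field order as posted in this patch */
	int order;
	unsigned long gfp_flags;
	int pid;
	unsigned short memcg_id;
};

struct entry_v2 {		/* pid moved next to order, as requested */
	int order;
	int pid;
	unsigned long gfp_flags;
	unsigned short memcg_id;
};

/* Padding bytes the compiler inserts directly before gfp_flags. */
size_t hole_before_gfp_v1(void)
{
	return offsetof(struct entry_v1, gfp_flags) -
	       (offsetof(struct entry_v1, order) + sizeof(int));
}

size_t hole_before_gfp_v2(void)
{
	return offsetof(struct entry_v2, gfp_flags) -
	       (offsetof(struct entry_v2, pid) + sizeof(int));
}

void print_layouts(void)
{
	/* Pairing the two ints should never make the interior hole larger. */
	assert(hole_before_gfp_v2() <= hole_before_gfp_v1());

	printf("v1: hole before gfp_flags = %zu bytes\n", hole_before_gfp_v1());
	printf("v2: hole before gfp_flags = %zu bytes\n", hole_before_gfp_v2());
}
```

In this isolated sketch sizeof() may come out the same for both orders,
because padding can move to the tail of the struct; the sketch only
demonstrates the interior hole between order and gfp_flags.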

>  		__field(	unsigned short,	memcg_id	)
>  	),

-- Steve



* [PATCH v2 0/2] mm: vmscan: add PID and cgroup ID to vmscan tracepoints
  2025-12-08 18:14 [PATCH 0/2] mm: vmscan: add PID and cgroup ID to vmscan tracepoints Thomas Ballasi
  2025-12-08 18:14 ` [PATCH 1/2] mm: vmscan: add cgroup IDs " Thomas Ballasi
  2025-12-08 18:14 ` [PATCH 2/2] mm: vmscan: add PIDs " Thomas Ballasi
@ 2025-12-16 14:02 ` Thomas Ballasi
  2025-12-16 14:02   ` [PATCH v2 1/2] mm: vmscan: add cgroup IDs " Thomas Ballasi
                     ` (2 more replies)
  2 siblings, 3 replies; 24+ messages in thread
From: Thomas Ballasi @ 2025-12-16 14:02 UTC (permalink / raw)
  To: Steven Rostedt, Masami Hiramatsu, Andrew Morton
  Cc: linux-mm, linux-trace-kernel

Changes in v2:
- Swapped field entries to prevent a hole in the ring buffer

Link to v1:
https://lore.kernel.org/linux-trace-kernel/20251208181413.4722-1-tballasi@linux.microsoft.com/

Signed-off-by: Thomas Ballasi <tballasi@linux.microsoft.com>

Thomas Ballasi (2):
  mm: vmscan: add cgroup IDs to vmscan tracepoints
  mm: vmscan: add PIDs to vmscan tracepoints

 include/trace/events/vmscan.h | 77 +++++++++++++++++++++++------------
 mm/vmscan.c                   | 17 ++++----
 2 files changed, 60 insertions(+), 34 deletions(-)

-- 
2.33.8




* [PATCH v2 1/2] mm: vmscan: add cgroup IDs to vmscan tracepoints
  2025-12-16 14:02 ` [PATCH v2 0/2] mm: vmscan: add PID and cgroup ID " Thomas Ballasi
@ 2025-12-16 14:02   ` Thomas Ballasi
  2025-12-16 18:50     ` Shakeel Butt
  2025-12-17 22:21     ` Steven Rostedt
  2025-12-16 14:02   ` [PATCH v2 2/2] mm: vmscan: add PIDs " Thomas Ballasi
  2026-01-05 16:04   ` [PATCH v3 0/2] mm: vmscan: add PID and cgroup ID " Thomas Ballasi
  2 siblings, 2 replies; 24+ messages in thread
From: Thomas Ballasi @ 2025-12-16 14:02 UTC (permalink / raw)
  To: Steven Rostedt, Masami Hiramatsu, Andrew Morton
  Cc: linux-mm, linux-trace-kernel

Memory reclaim events are currently difficult to attribute to
specific cgroups, which makes memory pressure issues hard to
debug.  This patch adds the memory cgroup ID (memcg_id) to key
vmscan tracepoints to enable better correlation and analysis.

For operations not associated with a specific cgroup, the field
defaults to 0.

Signed-off-by: Thomas Ballasi <tballasi@linux.microsoft.com>
---
 include/trace/events/vmscan.h | 65 +++++++++++++++++++++--------------
 mm/vmscan.c                   | 17 ++++-----
 2 files changed, 48 insertions(+), 34 deletions(-)

diff --git a/include/trace/events/vmscan.h b/include/trace/events/vmscan.h
index d2123dd960d59..afc9f80d03f34 100644
--- a/include/trace/events/vmscan.h
+++ b/include/trace/events/vmscan.h
@@ -114,85 +114,92 @@ TRACE_EVENT(mm_vmscan_wakeup_kswapd,
 
 DECLARE_EVENT_CLASS(mm_vmscan_direct_reclaim_begin_template,
 
-	TP_PROTO(int order, gfp_t gfp_flags),
+	TP_PROTO(int order, gfp_t gfp_flags, unsigned short memcg_id),
 
-	TP_ARGS(order, gfp_flags),
+	TP_ARGS(order, gfp_flags, memcg_id),
 
 	TP_STRUCT__entry(
 		__field(	int,	order		)
 		__field(	unsigned long,	gfp_flags	)
+		__field(	unsigned short,	memcg_id	)
 	),
 
 	TP_fast_assign(
 		__entry->order		= order;
 		__entry->gfp_flags	= (__force unsigned long)gfp_flags;
+		__entry->memcg_id	= memcg_id;
 	),
 
-	TP_printk("order=%d gfp_flags=%s",
+	TP_printk("order=%d gfp_flags=%s memcg_id=%u",
 		__entry->order,
-		show_gfp_flags(__entry->gfp_flags))
+		show_gfp_flags(__entry->gfp_flags),
+		__entry->memcg_id)
 );
 
 DEFINE_EVENT(mm_vmscan_direct_reclaim_begin_template, mm_vmscan_direct_reclaim_begin,
 
-	TP_PROTO(int order, gfp_t gfp_flags),
+	TP_PROTO(int order, gfp_t gfp_flags, unsigned short memcg_id),
 
-	TP_ARGS(order, gfp_flags)
+	TP_ARGS(order, gfp_flags, memcg_id)
 );
 
 #ifdef CONFIG_MEMCG
 DEFINE_EVENT(mm_vmscan_direct_reclaim_begin_template, mm_vmscan_memcg_reclaim_begin,
 
-	TP_PROTO(int order, gfp_t gfp_flags),
+	TP_PROTO(int order, gfp_t gfp_flags, unsigned short memcg_id),
 
-	TP_ARGS(order, gfp_flags)
+	TP_ARGS(order, gfp_flags, memcg_id)
 );
 
 DEFINE_EVENT(mm_vmscan_direct_reclaim_begin_template, mm_vmscan_memcg_softlimit_reclaim_begin,
 
-	TP_PROTO(int order, gfp_t gfp_flags),
+	TP_PROTO(int order, gfp_t gfp_flags, unsigned short memcg_id),
 
-	TP_ARGS(order, gfp_flags)
+	TP_ARGS(order, gfp_flags, memcg_id)
 );
 #endif /* CONFIG_MEMCG */
 
 DECLARE_EVENT_CLASS(mm_vmscan_direct_reclaim_end_template,
 
-	TP_PROTO(unsigned long nr_reclaimed),
+	TP_PROTO(unsigned long nr_reclaimed, unsigned short memcg_id),
 
-	TP_ARGS(nr_reclaimed),
+	TP_ARGS(nr_reclaimed, memcg_id),
 
 	TP_STRUCT__entry(
 		__field(	unsigned long,	nr_reclaimed	)
+		__field(	unsigned short,	memcg_id	)
 	),
 
 	TP_fast_assign(
 		__entry->nr_reclaimed	= nr_reclaimed;
+		__entry->memcg_id	= memcg_id;
 	),
 
-	TP_printk("nr_reclaimed=%lu", __entry->nr_reclaimed)
+	TP_printk("nr_reclaimed=%lu memcg_id=%u",
+		__entry->nr_reclaimed,
+		__entry->memcg_id)
 );
 
 DEFINE_EVENT(mm_vmscan_direct_reclaim_end_template, mm_vmscan_direct_reclaim_end,
 
-	TP_PROTO(unsigned long nr_reclaimed),
+	TP_PROTO(unsigned long nr_reclaimed, unsigned short memcg_id),
 
-	TP_ARGS(nr_reclaimed)
+	TP_ARGS(nr_reclaimed, memcg_id)
 );
 
 #ifdef CONFIG_MEMCG
 DEFINE_EVENT(mm_vmscan_direct_reclaim_end_template, mm_vmscan_memcg_reclaim_end,
 
-	TP_PROTO(unsigned long nr_reclaimed),
+	TP_PROTO(unsigned long nr_reclaimed, unsigned short memcg_id),
 
-	TP_ARGS(nr_reclaimed)
+	TP_ARGS(nr_reclaimed, memcg_id)
 );
 
 DEFINE_EVENT(mm_vmscan_direct_reclaim_end_template, mm_vmscan_memcg_softlimit_reclaim_end,
 
-	TP_PROTO(unsigned long nr_reclaimed),
+	TP_PROTO(unsigned long nr_reclaimed, unsigned short memcg_id),
 
-	TP_ARGS(nr_reclaimed)
+	TP_ARGS(nr_reclaimed, memcg_id)
 );
 #endif /* CONFIG_MEMCG */
 
@@ -209,6 +216,7 @@ TRACE_EVENT(mm_shrink_slab_start,
 		__field(struct shrinker *, shr)
 		__field(void *, shrink)
 		__field(int, nid)
+		__field(unsigned short, memcg_id)
 		__field(long, nr_objects_to_shrink)
 		__field(unsigned long, gfp_flags)
 		__field(unsigned long, cache_items)
@@ -221,6 +229,7 @@ TRACE_EVENT(mm_shrink_slab_start,
 		__entry->shr = shr;
 		__entry->shrink = shr->scan_objects;
 		__entry->nid = sc->nid;
+		__entry->memcg_id = sc->memcg ? mem_cgroup_id(sc->memcg) : 0;
 		__entry->nr_objects_to_shrink = nr_objects_to_shrink;
 		__entry->gfp_flags = (__force unsigned long)sc->gfp_mask;
 		__entry->cache_items = cache_items;
@@ -229,10 +238,11 @@ TRACE_EVENT(mm_shrink_slab_start,
 		__entry->priority = priority;
 	),
 
-	TP_printk("%pS %p: nid: %d objects to shrink %ld gfp_flags %s cache items %ld delta %lld total_scan %ld priority %d",
+	TP_printk("%pS %p: nid: %d memcg_id: %u objects to shrink %ld gfp_flags %s cache items %ld delta %lld total_scan %ld priority %d",
 		__entry->shrink,
 		__entry->shr,
 		__entry->nid,
+		__entry->memcg_id,
 		__entry->nr_objects_to_shrink,
 		show_gfp_flags(__entry->gfp_flags),
 		__entry->cache_items,
@@ -242,15 +252,16 @@ TRACE_EVENT(mm_shrink_slab_start,
 );
 
 TRACE_EVENT(mm_shrink_slab_end,
-	TP_PROTO(struct shrinker *shr, int nid, int shrinker_retval,
+	TP_PROTO(struct shrinker *shr, struct shrink_control *sc, int shrinker_retval,
 		long unused_scan_cnt, long new_scan_cnt, long total_scan),
 
-	TP_ARGS(shr, nid, shrinker_retval, unused_scan_cnt, new_scan_cnt,
+	TP_ARGS(shr, sc, shrinker_retval, unused_scan_cnt, new_scan_cnt,
 		total_scan),
 
 	TP_STRUCT__entry(
 		__field(struct shrinker *, shr)
 		__field(int, nid)
+		__field(unsigned short, memcg_id)
 		__field(void *, shrink)
 		__field(long, unused_scan)
 		__field(long, new_scan)
@@ -260,7 +271,8 @@ TRACE_EVENT(mm_shrink_slab_end,
 
 	TP_fast_assign(
 		__entry->shr = shr;
-		__entry->nid = nid;
+		__entry->nid = sc->nid;
+		__entry->memcg_id = sc->memcg ? mem_cgroup_id(sc->memcg) : 0;
 		__entry->shrink = shr->scan_objects;
 		__entry->unused_scan = unused_scan_cnt;
 		__entry->new_scan = new_scan_cnt;
@@ -268,10 +280,11 @@ TRACE_EVENT(mm_shrink_slab_end,
 		__entry->total_scan = total_scan;
 	),
 
-	TP_printk("%pS %p: nid: %d unused scan count %ld new scan count %ld total_scan %ld last shrinker return val %d",
+	TP_printk("%pS %p: nid: %d memcg_id: %u unused scan count %ld new scan count %ld total_scan %ld last shrinker return val %d",
 		__entry->shrink,
 		__entry->shr,
 		__entry->nid,
+		__entry->memcg_id,
 		__entry->unused_scan,
 		__entry->new_scan,
 		__entry->total_scan,
@@ -463,9 +476,9 @@ TRACE_EVENT(mm_vmscan_node_reclaim_begin,
 
 DEFINE_EVENT(mm_vmscan_direct_reclaim_end_template, mm_vmscan_node_reclaim_end,
 
-	TP_PROTO(unsigned long nr_reclaimed),
+	TP_PROTO(unsigned long nr_reclaimed, unsigned short memcg_id),
 
-	TP_ARGS(nr_reclaimed)
+	TP_ARGS(nr_reclaimed, memcg_id)
 );
 
 TRACE_EVENT(mm_vmscan_throttled,
diff --git a/mm/vmscan.c b/mm/vmscan.c
index 258f5472f1e90..0e65ec3a087a5 100644
--- a/mm/vmscan.c
+++ b/mm/vmscan.c
@@ -931,7 +931,7 @@ static unsigned long do_shrink_slab(struct shrink_control *shrinkctl,
 	 */
 	new_nr = add_nr_deferred(next_deferred, shrinker, shrinkctl);
 
-	trace_mm_shrink_slab_end(shrinker, shrinkctl->nid, freed, nr, new_nr, total_scan);
+	trace_mm_shrink_slab_end(shrinker, shrinkctl, freed, nr, new_nr, total_scan);
 	return freed;
 }
 
@@ -7092,11 +7092,11 @@ unsigned long try_to_free_pages(struct zonelist *zonelist, int order,
 		return 1;
 
 	set_task_reclaim_state(current, &sc.reclaim_state);
-	trace_mm_vmscan_direct_reclaim_begin(order, sc.gfp_mask);
+	trace_mm_vmscan_direct_reclaim_begin(order, sc.gfp_mask, 0);
 
 	nr_reclaimed = do_try_to_free_pages(zonelist, &sc);
 
-	trace_mm_vmscan_direct_reclaim_end(nr_reclaimed);
+	trace_mm_vmscan_direct_reclaim_end(nr_reclaimed, 0);
 	set_task_reclaim_state(current, NULL);
 
 	return nr_reclaimed;
@@ -7126,7 +7126,8 @@ unsigned long mem_cgroup_shrink_node(struct mem_cgroup *memcg,
 			(GFP_HIGHUSER_MOVABLE & ~GFP_RECLAIM_MASK);
 
 	trace_mm_vmscan_memcg_softlimit_reclaim_begin(sc.order,
-						      sc.gfp_mask);
+						      sc.gfp_mask,
+						      mem_cgroup_id(memcg));
 
 	/*
 	 * NOTE: Although we can get the priority field, using it
@@ -7137,7 +7138,7 @@ unsigned long mem_cgroup_shrink_node(struct mem_cgroup *memcg,
 	 */
 	shrink_lruvec(lruvec, &sc);
 
-	trace_mm_vmscan_memcg_softlimit_reclaim_end(sc.nr_reclaimed);
+	trace_mm_vmscan_memcg_softlimit_reclaim_end(sc.nr_reclaimed, mem_cgroup_id(memcg));
 
 	*nr_scanned = sc.nr_scanned;
 
@@ -7171,13 +7172,13 @@ unsigned long try_to_free_mem_cgroup_pages(struct mem_cgroup *memcg,
 	struct zonelist *zonelist = node_zonelist(numa_node_id(), sc.gfp_mask);
 
 	set_task_reclaim_state(current, &sc.reclaim_state);
-	trace_mm_vmscan_memcg_reclaim_begin(0, sc.gfp_mask);
+	trace_mm_vmscan_memcg_reclaim_begin(0, sc.gfp_mask, mem_cgroup_id(memcg));
 	noreclaim_flag = memalloc_noreclaim_save();
 
 	nr_reclaimed = do_try_to_free_pages(zonelist, &sc);
 
 	memalloc_noreclaim_restore(noreclaim_flag);
-	trace_mm_vmscan_memcg_reclaim_end(nr_reclaimed);
+	trace_mm_vmscan_memcg_reclaim_end(nr_reclaimed, mem_cgroup_id(memcg));
 	set_task_reclaim_state(current, NULL);
 
 	return nr_reclaimed;
@@ -8072,7 +8073,7 @@ static int __node_reclaim(struct pglist_data *pgdat, gfp_t gfp_mask, unsigned in
 	fs_reclaim_release(sc.gfp_mask);
 	psi_memstall_leave(&pflags);
 
-	trace_mm_vmscan_node_reclaim_end(sc.nr_reclaimed);
+	trace_mm_vmscan_node_reclaim_end(sc.nr_reclaimed, 0);
 
 	return sc.nr_reclaimed >= nr_pages;
 }
-- 
2.33.8




* [PATCH v2 2/2] mm: vmscan: add PIDs to vmscan tracepoints
  2025-12-16 14:02 ` [PATCH v2 0/2] mm: vmscan: add PID and cgroup ID " Thomas Ballasi
  2025-12-16 14:02   ` [PATCH v2 1/2] mm: vmscan: add cgroup IDs " Thomas Ballasi
@ 2025-12-16 14:02   ` Thomas Ballasi
  2025-12-16 18:03     ` Steven Rostedt
  2026-01-05 16:04   ` [PATCH v3 0/2] mm: vmscan: add PID and cgroup ID " Thomas Ballasi
  2 siblings, 1 reply; 24+ messages in thread
From: Thomas Ballasi @ 2025-12-16 14:02 UTC (permalink / raw)
  To: Steven Rostedt, Masami Hiramatsu, Andrew Morton
  Cc: linux-mm, linux-trace-kernel

This change adds additional tracepoint variables to help debuggers
attribute events to specific processes.

The PID field uses in_task() to reliably detect when we're in process
context and can safely access current->pid.  When not in process
context (such as in interrupt or in an asynchronous RCU context), the
field is set to -1 as a sentinel value.

Signed-off-by: Thomas Ballasi <tballasi@linux.microsoft.com>
---
 include/trace/events/vmscan.h | 20 ++++++++++++++++----
 1 file changed, 16 insertions(+), 4 deletions(-)

diff --git a/include/trace/events/vmscan.h b/include/trace/events/vmscan.h
index afc9f80d03f34..315725f30b504 100644
--- a/include/trace/events/vmscan.h
+++ b/include/trace/events/vmscan.h
@@ -120,19 +120,22 @@ DECLARE_EVENT_CLASS(mm_vmscan_direct_reclaim_begin_template,
 
 	TP_STRUCT__entry(
 		__field(	int,	order		)
+		__field(	int,	pid		)
 		__field(	unsigned long,	gfp_flags	)
 		__field(	unsigned short,	memcg_id	)
 	),
 
 	TP_fast_assign(
 		__entry->order		= order;
+		__entry->pid		= in_task() ? current->pid : -1;
 		__entry->gfp_flags	= (__force unsigned long)gfp_flags;
 		__entry->memcg_id	= memcg_id;
 	),
 
-	TP_printk("order=%d gfp_flags=%s memcg_id=%u",
+	TP_printk("order=%d gfp_flags=%s pid=%d memcg_id=%u",
 		__entry->order,
 		show_gfp_flags(__entry->gfp_flags),
+		__entry->pid,
 		__entry->memcg_id)
 );
 
@@ -167,16 +170,19 @@ DECLARE_EVENT_CLASS(mm_vmscan_direct_reclaim_end_template,
 
 	TP_STRUCT__entry(
 		__field(	unsigned long,	nr_reclaimed	)
+		__field(	int,	pid		)
 		__field(	unsigned short,	memcg_id	)
 	),
 
 	TP_fast_assign(
 		__entry->nr_reclaimed	= nr_reclaimed;
+		__entry->pid		= in_task() ? current->pid : -1;
 		__entry->memcg_id	= memcg_id;
 	),
 
-	TP_printk("nr_reclaimed=%lu memcg_id=%u",
+	TP_printk("nr_reclaimed=%lu pid=%d memcg_id=%u",
 		__entry->nr_reclaimed,
+		__entry->pid,
 		__entry->memcg_id)
 );
 
@@ -216,6 +222,7 @@ TRACE_EVENT(mm_shrink_slab_start,
 		__field(struct shrinker *, shr)
 		__field(void *, shrink)
 		__field(int, nid)
+		__field(int, pid)
 		__field(unsigned short, memcg_id)
 		__field(long, nr_objects_to_shrink)
 		__field(unsigned long, gfp_flags)
@@ -229,6 +236,7 @@ TRACE_EVENT(mm_shrink_slab_start,
 		__entry->shr = shr;
 		__entry->shrink = shr->scan_objects;
 		__entry->nid = sc->nid;
+		__entry->pid = in_task() ? current->pid : -1;
 		__entry->memcg_id = sc->memcg ? mem_cgroup_id(sc->memcg) : 0;
 		__entry->nr_objects_to_shrink = nr_objects_to_shrink;
 		__entry->gfp_flags = (__force unsigned long)sc->gfp_mask;
@@ -238,10 +246,11 @@ TRACE_EVENT(mm_shrink_slab_start,
 		__entry->priority = priority;
 	),
 
-	TP_printk("%pS %p: nid: %d memcg_id: %u objects to shrink %ld gfp_flags %s cache items %ld delta %lld total_scan %ld priority %d",
+	TP_printk("%pS %p: nid: %d pid: %d memcg_id: %u objects to shrink %ld gfp_flags %s cache items %ld delta %lld total_scan %ld priority %d",
 		__entry->shrink,
 		__entry->shr,
 		__entry->nid,
+		__entry->pid,
 		__entry->memcg_id,
 		__entry->nr_objects_to_shrink,
 		show_gfp_flags(__entry->gfp_flags),
@@ -261,6 +270,7 @@ TRACE_EVENT(mm_shrink_slab_end,
 	TP_STRUCT__entry(
 		__field(struct shrinker *, shr)
 		__field(int, nid)
+		__field(int, pid)
 		__field(unsigned short, memcg_id)
 		__field(void *, shrink)
 		__field(long, unused_scan)
@@ -272,6 +282,7 @@ TRACE_EVENT(mm_shrink_slab_end,
 	TP_fast_assign(
 		__entry->shr = shr;
 		__entry->nid = sc->nid;
+		__entry->pid = in_task() ? current->pid : -1;
 		__entry->memcg_id = sc->memcg ? mem_cgroup_id(sc->memcg) : 0;
 		__entry->shrink = shr->scan_objects;
 		__entry->unused_scan = unused_scan_cnt;
@@ -280,10 +291,11 @@ TRACE_EVENT(mm_shrink_slab_end,
 		__entry->total_scan = total_scan;
 	),
 
-	TP_printk("%pS %p: nid: %d memcg_id: %u unused scan count %ld new scan count %ld total_scan %ld last shrinker return val %d",
+	TP_printk("%pS %p: nid: %d pid: %d memcg_id: %u unused scan count %ld new scan count %ld total_scan %ld last shrinker return val %d",
 		__entry->shrink,
 		__entry->shr,
 		__entry->nid,
+		__entry->pid,
 		__entry->memcg_id,
 		__entry->unused_scan,
 		__entry->new_scan,
-- 
2.33.8




* Re: [PATCH v2 2/2] mm: vmscan: add PIDs to vmscan tracepoints
  2025-12-16 14:02   ` [PATCH v2 2/2] mm: vmscan: add PIDs " Thomas Ballasi
@ 2025-12-16 18:03     ` Steven Rostedt
  2025-12-29 10:54       ` Thomas Ballasi
  0 siblings, 1 reply; 24+ messages in thread
From: Steven Rostedt @ 2025-12-16 18:03 UTC (permalink / raw)
  To: Thomas Ballasi
  Cc: Masami Hiramatsu, Andrew Morton, linux-mm, linux-trace-kernel

On Tue, 16 Dec 2025 06:02:52 -0800
Thomas Ballasi <tballasi@linux.microsoft.com> wrote:

> These changes add additional tracepoint variables to help
> debuggers attribute events to specific processes.
> 
> The PID field uses in_task() to reliably detect when we're in process
> context and can safely access current->pid.  When not in process
> context (such as in interrupt or in an asynchronous RCU context), the
> field is set to -1 as a sentinel value.
> 
> Signed-off-by: Thomas Ballasi <tballasi@linux.microsoft.com>

Is this really needed? The trace events already show if you are in
interrupt context or not.

# tracer: nop
#
# entries-in-buffer/entries-written: 25817/25817   #P:8
#
#                                _-----=> irqs-off/BH-disabled
#                               / _----=> need-resched
#                              | / _---=> hardirq/softirq   <<<<------ Shows irq context
#                              || / _--=> preempt-depth
#                              ||| / _-=> migrate-disable
#                              |||| /     delay
#           TASK-PID     CPU#  |||||  TIMESTAMP  FUNCTION
#              | |         |   |||||     |         |
          <idle>-0       [002] d..1. 11429.293552: rcu_watching: Startirq 0 1 0x74c
          <idle>-0       [000] d.H1. 11429.293564: rcu_utilization: Start scheduler-tick
          <idle>-0       [000] d.H1. 11429.293566: rcu_utilization: End scheduler-tick
          <idle>-0       [002] dN.1. 11429.293567: rcu_watching: Endirq 1 0 0x74c
          <idle>-0       [002] dN.1. 11429.293568: rcu_watching: Start 0 1 0x754
          <idle>-0       [000] d.s1. 11429.293577: rcu_watching: --= 3 1 0xdf4
          <idle>-0       [002] dN.1. 11429.293579: rcu_utilization: Start context switch
          <idle>-0       [002] dN.1. 11429.293580: rcu_utilization: End context switch
       rcu_sched-15      [002] d..1. 11429.293589: rcu_grace_period: rcu_sched 132685 start
          <idle>-0       [000] dN.1. 11429.293592: rcu_watching: Endirq 1 0 0xdf4
       rcu_sched-15      [002] d..1. 11429.293592: rcu_grace_period: rcu_sched 132685 cpustart
       rcu_sched-15      [002] d..1. 11429.293592: rcu_grace_period_init: rcu_sched 132685 0 0 7 ff
          <idle>-0       [000] dN.1. 11429.293593: rcu_watching: Start 0 1 0xdfc

Thus, you can already tell if you are in interrupt context or not, and you
always get the current pid. An 'H', 'h' or 's' means you are in an
interrupt-type context. ('H' for a hard interrupt interrupting a softirq, 'h'
for just a hard interrupt, and 's' for a softirq).

What's the point of adding another field to cover the same information
that's already available?
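
As a rough illustration of how that column can be decoded, here is a
minimal userspace sketch. It assumes the hardirq and softirq bits of the
per-event flags byte are 0x08 and 0x10 (the values of TRACE_FLAG_HARDIRQ
and TRACE_FLAG_SOFTIRQ); the macro and function names are made up for
this example, and NMI context is left out.

```c
/* Assumed bit values, matching TRACE_FLAG_HARDIRQ / TRACE_FLAG_SOFTIRQ. */
#define EX_FLAG_HARDIRQ	0x08
#define EX_FLAG_SOFTIRQ	0x10

/*
 * Return the character shown in the hardirq/softirq column for a
 * given flags byte (hypothetical helper, not a kernel function).
 */
char irq_context_char(unsigned char flags)
{
	int hard = flags & EX_FLAG_HARDIRQ;
	int soft = flags & EX_FLAG_SOFTIRQ;

	if (hard && soft)
		return 'H';	/* hard interrupt interrupting a softirq */
	if (hard)
		return 'h';	/* just a hard interrupt */
	if (soft)
		return 's';	/* softirq */
	return '.';		/* neither bit set */
}
```

A tool reading the common flags of an event could call this to tell
whether the recorded pid belongs to the task that triggered the event or
merely to the task that happened to be interrupted.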

-- Steve



* Re: [PATCH v2 1/2] mm: vmscan: add cgroup IDs to vmscan tracepoints
  2025-12-16 14:02   ` [PATCH v2 1/2] mm: vmscan: add cgroup IDs " Thomas Ballasi
@ 2025-12-16 18:50     ` Shakeel Butt
  2025-12-17 22:21     ` Steven Rostedt
  1 sibling, 0 replies; 24+ messages in thread
From: Shakeel Butt @ 2025-12-16 18:50 UTC (permalink / raw)
  To: Thomas Ballasi
  Cc: Steven Rostedt, Masami Hiramatsu, Andrew Morton, linux-mm,
	linux-trace-kernel

On Tue, Dec 16, 2025 at 06:02:51AM -0800, Thomas Ballasi wrote:
> Memory reclaim events are currently difficult to attribute to
> specific cgroups, making debugging memory pressure issues
> challenging.  This patch adds memory cgroup ID (memcg_id) to key
> vmscan tracepoints to enable better correlation and analysis.
> 
> For operations not associated with a specific cgroup, the field
> is defaulted to 0.
> 
> Signed-off-by: Thomas Ballasi <tballasi@linux.microsoft.com>
> ---
...
> +		__entry->memcg_id = sc->memcg ? mem_cgroup_id(sc->memcg) : 0;
...
> +		__entry->memcg_id = sc->memcg ? mem_cgroup_id(sc->memcg) : 0;
...
>  
...
>  	trace_mm_vmscan_memcg_softlimit_reclaim_begin(sc.order,
> -						      sc.gfp_mask);
> +						      sc.gfp_mask,
> +						      mem_cgroup_id(memcg));
>  

...

> +	trace_mm_vmscan_memcg_softlimit_reclaim_end(sc.nr_reclaimed, mem_cgroup_id(memcg));
...
> +	trace_mm_vmscan_memcg_reclaim_begin(0, sc.gfp_mask, mem_cgroup_id(memcg));
...
> +	trace_mm_vmscan_memcg_reclaim_end(nr_reclaimed, mem_cgroup_id(memcg));

Please don't use mem_cgroup_id() here, as it is an ID internal to memcg.
Use cgroup_id(memcg->css.cgroup) instead, which is the inode number and is
exposed to userspace.
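
For anyone matching the traced ID from userspace, a minimal sketch
follows. It assumes a cgroup v2 hierarchy mounted somewhere such as
/sys/fs/cgroup (the mount point and the helper name are assumptions, not
part of the patch) and simply reads the inode number of a cgroup
directory with stat().

```c
#include <stdint.h>
#include <sys/stat.h>

/*
 * Return the inode number of the directory at @path, or 0 if stat()
 * fails. For a cgroup v2 directory this is the number that
 * cgroup_id() is described as exposing (hypothetical helper).
 */
uint64_t cgroup_dir_inode(const char *path)
{
	struct stat st;

	if (stat(path, &st) != 0)
		return 0;
	return (uint64_t)st.st_ino;
}
```

The returned value can then be compared against the ID printed by the
tracepoint, for example for a directory under /sys/fs/cgroup.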




* Re: [PATCH v2 1/2] mm: vmscan: add cgroup IDs to vmscan tracepoints
  2025-12-16 14:02   ` [PATCH v2 1/2] mm: vmscan: add cgroup IDs " Thomas Ballasi
  2025-12-16 18:50     ` Shakeel Butt
@ 2025-12-17 22:21     ` Steven Rostedt
  1 sibling, 0 replies; 24+ messages in thread
From: Steven Rostedt @ 2025-12-17 22:21 UTC (permalink / raw)
  To: Thomas Ballasi
  Cc: Masami Hiramatsu, Andrew Morton, linux-mm, linux-trace-kernel

On Tue, 16 Dec 2025 06:02:51 -0800
Thomas Ballasi <tballasi@linux.microsoft.com> wrote:

> ---
>  include/trace/events/vmscan.h | 65 +++++++++++++++++++++--------------
>  mm/vmscan.c                   | 17 ++++-----
>  2 files changed, 48 insertions(+), 34 deletions(-)
> 
> diff --git a/include/trace/events/vmscan.h b/include/trace/events/vmscan.h
> index d2123dd960d59..afc9f80d03f34 100644
> --- a/include/trace/events/vmscan.h
> +++ b/include/trace/events/vmscan.h
> @@ -114,85 +114,92 @@ TRACE_EVENT(mm_vmscan_wakeup_kswapd,
>  
>  DECLARE_EVENT_CLASS(mm_vmscan_direct_reclaim_begin_template,
>  
> -	TP_PROTO(int order, gfp_t gfp_flags),
> +	TP_PROTO(int order, gfp_t gfp_flags, unsigned short memcg_id),
>  
> -	TP_ARGS(order, gfp_flags),
> +	TP_ARGS(order, gfp_flags, memcg_id),
>  
>  	TP_STRUCT__entry(
>  		__field(	int,	order		)
>  		__field(	unsigned long,	gfp_flags	)
> +		__field(	unsigned short,	memcg_id	)
>  	),

Hmm, the above adds some holes. Note that events are, at a minimum, 4-byte
aligned. On 64-bit, they can be 8-byte aligned. Still, the above is the same as:

	struct {
		int		order;
		unsigned long	gfp_flags;
		unsigned short	memcg_id;
	};

See the issue? Perhaps it may be better to add the memcg_id in between the
order and gfp_flags?
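
To make the holes concrete, here is a minimal userspace sketch, not
kernel code. It assumes the entry fields are laid out like an ordinary C
struct and ignores the common header that precedes them in a real ring
buffer record; the struct and function names are made up for
illustration.

```c
#include <stddef.h>
#include <stdio.h>

/* Field order as posted: padding after order and after memcg_id. */
struct entry_as_posted {
	int		order;
	unsigned long	gfp_flags;
	unsigned short	memcg_id;
};

/* memcg_id moved between order and gfp_flags, into the first gap. */
struct entry_reordered {
	int		order;
	unsigned short	memcg_id;
	unsigned long	gfp_flags;
};

/* Print the size and field offsets of both layouts for comparison. */
void print_layouts(void)
{
	printf("as posted: size=%zu gfp_flags@%zu memcg_id@%zu\n",
	       sizeof(struct entry_as_posted),
	       offsetof(struct entry_as_posted, gfp_flags),
	       offsetof(struct entry_as_posted, memcg_id));
	printf("reordered: size=%zu memcg_id@%zu gfp_flags@%zu\n",
	       sizeof(struct entry_reordered),
	       offsetof(struct entry_reordered, memcg_id),
	       offsetof(struct entry_reordered, gfp_flags));
}
```

On a typical LP64 target such as x86-64, the first layout occupies 24
bytes and the second 16, because the short lands in padding that would
otherwise be wasted.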

-- Steve



* Re: [PATCH v2 2/2] mm: vmscan: add PIDs to vmscan tracepoints
  2025-12-16 18:03     ` Steven Rostedt
@ 2025-12-29 10:54       ` Thomas Ballasi
  2025-12-29 18:29         ` Steven Rostedt
  0 siblings, 1 reply; 24+ messages in thread
From: Thomas Ballasi @ 2025-12-29 10:54 UTC (permalink / raw)
  To: rostedt; +Cc: akpm, linux-mm, linux-trace-kernel, mhiramat, tballasi

On Tue, Dec 16, 2025 at 01:03:02PM -0500, Steven Rostedt wrote:
> On Tue, 16 Dec 2025 06:02:52 -0800
> Thomas Ballasi <tballasi@linux.microsoft.com> wrote:
> 
> > These changes add additional tracepoint variables to help
> > debuggers attribute events to specific processes.
> > 
> > The PID field uses in_task() to reliably detect when we're in process
> > context and can safely access current->pid.  When not in process
> > context (such as in interrupt or in an asynchronous RCU context), the
> > field is set to -1 as a sentinel value.
> > 
> > Signed-off-by: Thomas Ballasi <tballasi@linux.microsoft.com>
> 
> Is this really needed? The trace events already show if you are in
> interrupt context or not.
> 
> # tracer: nop
> #
> # entries-in-buffer/entries-written: 25817/25817   #P:8
> #
> #                                _-----=> irqs-off/BH-disabled
> #                               / _----=> need-resched
> #                              | / _---=> hardirq/softirq   <<<<------ Shows irq context
> #                              || / _--=> preempt-depth
> #                              ||| / _-=> migrate-disable
> #                              |||| /     delay
> #           TASK-PID     CPU#  |||||  TIMESTAMP  FUNCTION
> #              | |         |   |||||     |         |
>           <idle>-0       [002] d..1. 11429.293552: rcu_watching: Startirq 0 1 0x74c
>           <idle>-0       [000] d.H1. 11429.293564: rcu_utilization: Start scheduler-tick
>           <idle>-0       [000] d.H1. 11429.293566: rcu_utilization: End scheduler-tick
>           <idle>-0       [002] dN.1. 11429.293567: rcu_watching: Endirq 1 0 0x74c
>           <idle>-0       [002] dN.1. 11429.293568: rcu_watching: Start 0 1 0x754
>           <idle>-0       [000] d.s1. 11429.293577: rcu_watching: --= 3 1 0xdf4
>           <idle>-0       [002] dN.1. 11429.293579: rcu_utilization: Start context switch
>           <idle>-0       [002] dN.1. 11429.293580: rcu_utilization: End context switch
>        rcu_sched-15      [002] d..1. 11429.293589: rcu_grace_period: rcu_sched 132685 start
>           <idle>-0       [000] dN.1. 11429.293592: rcu_watching: Endirq 1 0 0xdf4
>        rcu_sched-15      [002] d..1. 11429.293592: rcu_grace_period: rcu_sched 132685 cpustart
>        rcu_sched-15      [002] d..1. 11429.293592: rcu_grace_period_init: rcu_sched 132685 0 0 7 ff
>           <idle>-0       [000] dN.1. 11429.293593: rcu_watching: Start 0 1 0xdfc
> 
> Thus, you can already tell if you are in interrupt context or not, and you
> always get the current pid. An 'H', 'h' or 's' means you are in an
> interrupt-type context. ('H' for a hard interrupt interrupting a softirq, 'h'
> for just a hard interrupt, and 's' for a softirq).
> 
> What's the point of adding another field to cover the same information
> that's already available?
> 
> -- Steve

(re-sending the reply as I believe I missed the reply all)

It indeed shows whether or not we're in an IRQ, but I believe the
kernel shouldn't show erroneous debugging values. Even though it can be
obvious that we're in an interrupt, some people might look directly at
the garbage PID value and take it for granted without a second
thought. On the other hand, it takes just a small check to mark the
debugging information as clearly invalid, which complements the IRQ
context flag.

If we shouldn't put that check there, I'd happily remove it, but I'd
tend to think it's a trivial addition that can only be for the best.

Thomas




* Re: [PATCH v2 2/2] mm: vmscan: add PIDs to vmscan tracepoints
  2025-12-29 10:54       ` Thomas Ballasi
@ 2025-12-29 18:29         ` Steven Rostedt
  2025-12-29 21:36           ` Steven Rostedt
  0 siblings, 1 reply; 24+ messages in thread
From: Steven Rostedt @ 2025-12-29 18:29 UTC (permalink / raw)
  To: Thomas Ballasi; +Cc: akpm, linux-mm, linux-trace-kernel, mhiramat

On Mon, 29 Dec 2025 02:54:27 -0800
Thomas Ballasi <tballasi@linux.microsoft.com> wrote:

> It indeed shows whether or not we're in an IRQ, but I believe the
> kernel shouldn't show erroneous debugging values. Even though it can be
> obvious that we're in an interrupt, some people might look directly at
> the garbage PID value and take it for granted without a second
> thought. On the other hand, it takes just a small check to mark the
> debugging information as clearly invalid, which complements the IRQ
> context flag.
> 
> If we shouldn't put that check there, I'd happily remove it, but I'd
> tend to think it's a trivial addition that can only be for the best.

I just don't like wasting valuable ring buffer space for something that can
be easily determined without it.

How about this. I just wrote up this patch, and it could be something you
use. I tested it against the sched waking events, by adding:

 		__entry->target_cpu	= task_cpu(p);
 	),
 
-	TP_printk("comm=%s pid=%d prio=%d target_cpu=%03d",
+	TP_printk("comm=%s pid=%d prio=%d target_cpu=%03d %s",
 		  __entry->comm, __entry->pid, __entry->prio,
-		  __entry->target_cpu)
+		  __entry->target_cpu,
+		  __event_in_irq() ? "(in-irq)" : "")
 );
 
Which produces:

          <idle>-0     [003] d.h4.    44.832126: sched_waking:         comm=in:imklog pid=619 prio=120 target_cpu=006 (in-irq)
          <idle>-0     [003] d.s3.    44.832180: sched_waking:         comm=rcu_preempt pid=15 prio=120 target_cpu=001 (in-irq)
       in:imklog-619   [006] d..2.    44.832393: sched_waking:         comm=rs:main Q:Reg pid=620 prio=120 target_cpu=003 

You can see it adds "(in-irq)" when the event is executed from IRQ context
(soft or hard irq). But I also added __event_in_hardirq() and
__event_in_softirq() if you wanted to distinguish them.

Now you don't need to update what goes into the ring buffer (and waste its
space), but only update the output format, which makes it obvious whether the
task was in interrupt context or not.
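
The idea can be sketched in plain userspace C. This is only an
illustration under assumptions: struct ex_common_hdr is a hypothetical
stand-in for the common header every event already carries, and 0x08 and
0x10 are assumed to be the hardirq and softirq flag bits.

```c
/* Hypothetical stand-in for the header recorded with every event. */
struct ex_common_hdr {
	unsigned char	flags;
	int		pid;
};

#define EX_HDR_HARDIRQ	0x08	/* assumed hardirq flag bit */
#define EX_HDR_SOFTIRQ	0x10	/* assumed softirq flag bit */

/*
 * Decide the suffix at print time from flags that were already
 * recorded, so no extra field has to be stored per event.
 */
const char *irq_suffix(const struct ex_common_hdr *hdr)
{
	if (hdr->flags & (EX_HDR_HARDIRQ | EX_HDR_SOFTIRQ))
		return "(in-irq)";
	return "";
}
```

The record itself stays the same size; only the text produced when the
event is formatted changes.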

I also used trace-cmd to record the events, and it still parses properly
with no updates to libtraceevent needed.

Would this work for you?

Below is the patch that allows for this:

-- Steve


diff --git a/include/trace/stages/stage3_trace_output.h b/include/trace/stages/stage3_trace_output.h
index 1e7b0bef95f5..53a23988a3b8 100644
--- a/include/trace/stages/stage3_trace_output.h
+++ b/include/trace/stages/stage3_trace_output.h
@@ -150,3 +150,11 @@
 
 #undef __get_buf
 #define __get_buf(len)		trace_seq_acquire(p, (len))
+
+#undef __event_in_hardirq
+#undef __event_in_softirq
+#undef __event_in_irq
+
+#define __event_in_hardirq()	(__entry->ent.flags & TRACE_FLAG_HARDIRQ)
+#define __event_in_softirq()	(__entry->ent.flags & TRACE_FLAG_SOFTIRQ)
+#define __event_in_irq()	(__entry->ent.flags & (TRACE_FLAG_HARDIRQ | TRACE_FLAG_SOFTIRQ))
diff --git a/include/trace/stages/stage7_class_define.h b/include/trace/stages/stage7_class_define.h
index fcd564a590f4..47008897a795 100644
--- a/include/trace/stages/stage7_class_define.h
+++ b/include/trace/stages/stage7_class_define.h
@@ -26,6 +26,25 @@
 #undef __print_hex_dump
 #undef __get_buf
 
+#undef __event_in_hardirq
+#undef __event_in_softirq
+#undef __event_in_irq
+
+/*
+ * The TRACE_FLAG_* are enums. Instead of using TRACE_DEFINE_ENUM(),
+ * use their hardcoded values. These values are parsed by user space
+ * tooling elsewhere so they will never change.
+ *
+ * See "enum trace_flag_type" in linux/trace_events.h:
+ *   TRACE_FLAG_HARDIRQ
+ *   TRACE_FLAG_SOFTIRQ
+ */
+
+/* This is what is displayed in the format files */
+#define __event_in_hardirq()	(REC->common_flags & 0x8)
+#define __event_in_softirq()	(REC->common_flags & 0x10)
+#define __event_in_irq()	(REC->common_flags & 0x18)
+
 /*
  * The below is not executed in the kernel. It is only what is
  * displayed in the print format for userspace to parse.



* Re: [PATCH v2 2/2] mm: vmscan: add PIDs to vmscan tracepoints
  2025-12-29 18:29         ` Steven Rostedt
@ 2025-12-29 21:36           ` Steven Rostedt
  0 siblings, 0 replies; 24+ messages in thread
From: Steven Rostedt @ 2025-12-29 21:36 UTC (permalink / raw)
  To: Thomas Ballasi; +Cc: akpm, linux-mm, linux-trace-kernel, mhiramat

On Mon, 29 Dec 2025 13:29:42 -0500
Steven Rostedt <rostedt@goodmis.org> wrote:

> I just don't like wasting valuable ring buffer space for something that can
> be easily determined without it.
> 
> How about this. I just wrote up this patch, and it could be something you
> use. I tested it against the sched waking events, by adding:
> 
>  		__entry->target_cpu	= task_cpu(p);
>  	),
>  
> -	TP_printk("comm=%s pid=%d prio=%d target_cpu=%03d",
> +	TP_printk("comm=%s pid=%d prio=%d target_cpu=%03d %s",
>  		  __entry->comm, __entry->pid, __entry->prio,
> -		  __entry->target_cpu)
> +		  __entry->target_cpu,
> +		  __event_in_irq() ? "(in-irq)" : "")
>  );
>  
> Which produces:
> 
>           <idle>-0     [003] d.h4.    44.832126: sched_waking:         comm=in:imklog pid=619 prio=120 target_cpu=006 (in-irq)
>           <idle>-0     [003] d.s3.    44.832180: sched_waking:         comm=rcu_preempt pid=15 prio=120 target_cpu=001 (in-irq)
>        in:imklog-619   [006] d..2.    44.832393: sched_waking:         comm=rs:main Q:Reg pid=620 prio=120 target_cpu=003 
> 
> You can see it adds "(in-irq)" when the event is executed from IRQ context
> (soft or hard irq). But I also added __event_in_hardirq() and
> __event_in_softirq() if you wanted to distinguish them.
> 
> Now you don't need to update what goes into the ring buffer (and waste its
> space), but only update the output format, which makes it obvious whether the
> task was in interrupt context or not.
> 
> I also used trace-cmd to record the events, and it still parses properly
> with no updates to libtraceevent needed.
> 
> Would this work for you?

If this would work for you, feel free to take the patch I posted and use that:

   https://lore.kernel.org/all/20251229163515.3d1b0bba@gandalf.local.home/

-- Steve



* [PATCH v3 0/2] mm: vmscan: add PID and cgroup ID to vmscan tracepoints
  2025-12-16 14:02 ` [PATCH v2 0/2] mm: vmscan: add PID and cgroup ID " Thomas Ballasi
  2025-12-16 14:02   ` [PATCH v2 1/2] mm: vmscan: add cgroup IDs " Thomas Ballasi
  2025-12-16 14:02   ` [PATCH v2 2/2] mm: vmscan: add PIDs " Thomas Ballasi
@ 2026-01-05 16:04   ` Thomas Ballasi
  2026-01-05 16:04     ` [PATCH v3 1/2] mm: vmscan: add cgroup IDs " Thomas Ballasi
                       ` (2 more replies)
  2 siblings, 3 replies; 24+ messages in thread
From: Thomas Ballasi @ 2026-01-05 16:04 UTC (permalink / raw)
  To: tballasi; +Cc: akpm, linux-mm, linux-trace-kernel, mhiramat, rostedt

Changes in v3:
- Swapped multiple field entries to prevent a hole in the ring buffer
- Replaced in_task() with __event_in_irq
- Replaced mem_cgroup_id(memcg) with cgroup_id(memcg->css.cgroup)
- Rebased the tree to latest 6.18

Link to v2:
https://lore.kernel.org/linux-trace-kernel/20251216140252.11864-1-tballasi@linux.microsoft.com/

Signed-off-by: Thomas Ballasi <tballasi@linux.microsoft.com>

Thomas Ballasi (2):
  mm: vmscan: add cgroup IDs to vmscan tracepoints
  mm: vmscan: add PIDs to vmscan tracepoints

 include/trace/events/vmscan.h | 100 ++++++++++++++++++++++------------
 mm/shrinker.c                 |   2 +-
 mm/vmscan.c                   |  17 +++---
 3 files changed, 74 insertions(+), 45 deletions(-)

-- 
2.33.8




* [PATCH v3 1/2] mm: vmscan: add cgroup IDs to vmscan tracepoints
  2026-01-05 16:04   ` [PATCH v3 0/2] mm: vmscan: add PID and cgroup ID " Thomas Ballasi
@ 2026-01-05 16:04     ` Thomas Ballasi
  2026-01-05 22:46       ` Shakeel Butt
  2026-01-07  1:56       ` build error on CONFIG_MEMCG=n "error: invalid use of undefined type 'struct mem_cgroup'" Harry Yoo
  2026-01-05 16:04     ` [PATCH v3 2/2] mm: vmscan: add PIDs to vmscan tracepoints Thomas Ballasi
  2026-01-06  2:06     ` [PATCH v3 0/2] mm: vmscan: add PID and cgroup ID " Andrew Morton
  2 siblings, 2 replies; 24+ messages in thread
From: Thomas Ballasi @ 2026-01-05 16:04 UTC (permalink / raw)
  To: tballasi; +Cc: akpm, linux-mm, linux-trace-kernel, mhiramat, rostedt

Memory reclaim events are currently difficult to attribute to
specific cgroups, making debugging memory pressure issues
challenging.  This patch adds memory cgroup ID (memcg_id) to key
vmscan tracepoints to enable better correlation and analysis.

For operations not associated with a specific cgroup, the field
is defaulted to 0.

Signed-off-by: Thomas Ballasi <tballasi@linux.microsoft.com>
---
 include/trace/events/vmscan.h | 79 ++++++++++++++++++++---------------
 mm/shrinker.c                 |  2 +-
 mm/vmscan.c                   | 17 ++++----
 3 files changed, 56 insertions(+), 42 deletions(-)

diff --git a/include/trace/events/vmscan.h b/include/trace/events/vmscan.h
index 490958fa10dee..93a9a9ba9405d 100644
--- a/include/trace/events/vmscan.h
+++ b/include/trace/events/vmscan.h
@@ -114,85 +114,92 @@ TRACE_EVENT(mm_vmscan_wakeup_kswapd,
 
 DECLARE_EVENT_CLASS(mm_vmscan_direct_reclaim_begin_template,
 
-	TP_PROTO(int order, gfp_t gfp_flags),
+	TP_PROTO(gfp_t gfp_flags, int order, unsigned short memcg_id),
 
-	TP_ARGS(order, gfp_flags),
+	TP_ARGS(gfp_flags, order, memcg_id),
 
 	TP_STRUCT__entry(
-		__field(	int,	order		)
 		__field(	unsigned long,	gfp_flags	)
+		__field(	int,	order		)
+		__field(	unsigned short,	memcg_id	)
 	),
 
 	TP_fast_assign(
-		__entry->order		= order;
 		__entry->gfp_flags	= (__force unsigned long)gfp_flags;
+		__entry->order		= order;
+		__entry->memcg_id	= memcg_id;
 	),
 
-	TP_printk("order=%d gfp_flags=%s",
+	TP_printk("order=%d gfp_flags=%s memcg_id=%u",
 		__entry->order,
-		show_gfp_flags(__entry->gfp_flags))
+		show_gfp_flags(__entry->gfp_flags),
+		__entry->memcg_id)
 );
 
 DEFINE_EVENT(mm_vmscan_direct_reclaim_begin_template, mm_vmscan_direct_reclaim_begin,
 
-	TP_PROTO(int order, gfp_t gfp_flags),
+	TP_PROTO(gfp_t gfp_flags, int order, unsigned short memcg_id),
 
-	TP_ARGS(order, gfp_flags)
+	TP_ARGS(gfp_flags, order, memcg_id)
 );
 
 #ifdef CONFIG_MEMCG
 DEFINE_EVENT(mm_vmscan_direct_reclaim_begin_template, mm_vmscan_memcg_reclaim_begin,
 
-	TP_PROTO(int order, gfp_t gfp_flags),
+	TP_PROTO(gfp_t gfp_flags, int order, unsigned short memcg_id),
 
-	TP_ARGS(order, gfp_flags)
+	TP_ARGS(gfp_flags, order, memcg_id)
 );
 
 DEFINE_EVENT(mm_vmscan_direct_reclaim_begin_template, mm_vmscan_memcg_softlimit_reclaim_begin,
 
-	TP_PROTO(int order, gfp_t gfp_flags),
+	TP_PROTO(gfp_t gfp_flags, int order, unsigned short memcg_id),
 
-	TP_ARGS(order, gfp_flags)
+	TP_ARGS(gfp_flags, order, memcg_id)
 );
 #endif /* CONFIG_MEMCG */
 
 DECLARE_EVENT_CLASS(mm_vmscan_direct_reclaim_end_template,
 
-	TP_PROTO(unsigned long nr_reclaimed),
+	TP_PROTO(unsigned long nr_reclaimed, unsigned short memcg_id),
 
-	TP_ARGS(nr_reclaimed),
+	TP_ARGS(nr_reclaimed, memcg_id),
 
 	TP_STRUCT__entry(
 		__field(	unsigned long,	nr_reclaimed	)
+		__field(	unsigned short,	memcg_id	)
 	),
 
 	TP_fast_assign(
 		__entry->nr_reclaimed	= nr_reclaimed;
+		__entry->memcg_id	= memcg_id;
 	),
 
-	TP_printk("nr_reclaimed=%lu", __entry->nr_reclaimed)
+	TP_printk("nr_reclaimed=%lu memcg_id=%u",
+		__entry->nr_reclaimed,
+		__entry->memcg_id)
 );
 
 DEFINE_EVENT(mm_vmscan_direct_reclaim_end_template, mm_vmscan_direct_reclaim_end,
 
-	TP_PROTO(unsigned long nr_reclaimed),
+	TP_PROTO(unsigned long nr_reclaimed, unsigned short memcg_id),
 
-	TP_ARGS(nr_reclaimed)
+	TP_ARGS(nr_reclaimed, memcg_id)
 );
 
 #ifdef CONFIG_MEMCG
 DEFINE_EVENT(mm_vmscan_direct_reclaim_end_template, mm_vmscan_memcg_reclaim_end,
 
-	TP_PROTO(unsigned long nr_reclaimed),
+	TP_PROTO(unsigned long nr_reclaimed, unsigned short memcg_id),
 
-	TP_ARGS(nr_reclaimed)
+	TP_ARGS(nr_reclaimed, memcg_id)
 );
 
 DEFINE_EVENT(mm_vmscan_direct_reclaim_end_template, mm_vmscan_memcg_softlimit_reclaim_end,
 
-	TP_PROTO(unsigned long nr_reclaimed),
+	TP_PROTO(unsigned long nr_reclaimed, unsigned short memcg_id),
 
-	TP_ARGS(nr_reclaimed)
+	TP_ARGS(nr_reclaimed, memcg_id)
 );
 #endif /* CONFIG_MEMCG */
 
@@ -208,31 +215,34 @@ TRACE_EVENT(mm_shrink_slab_start,
 	TP_STRUCT__entry(
 		__field(struct shrinker *, shr)
 		__field(void *, shrink)
-		__field(int, nid)
 		__field(long, nr_objects_to_shrink)
 		__field(unsigned long, gfp_flags)
 		__field(unsigned long, cache_items)
 		__field(unsigned long long, delta)
 		__field(unsigned long, total_scan)
 		__field(int, priority)
+		__field(int, nid)
+		__field(unsigned short, memcg_id)
 	),
 
 	TP_fast_assign(
 		__entry->shr = shr;
 		__entry->shrink = shr->scan_objects;
-		__entry->nid = sc->nid;
 		__entry->nr_objects_to_shrink = nr_objects_to_shrink;
 		__entry->gfp_flags = (__force unsigned long)sc->gfp_mask;
 		__entry->cache_items = cache_items;
 		__entry->delta = delta;
 		__entry->total_scan = total_scan;
 		__entry->priority = priority;
+		__entry->nid = sc->nid;
+		__entry->memcg_id = sc->memcg ? cgroup_id(sc->memcg->css.cgroup) : 0;
 	),
 
-	TP_printk("%pS %p: nid: %d objects to shrink %ld gfp_flags %s cache items %ld delta %lld total_scan %ld priority %d",
+	TP_printk("%pS %p: nid: %d memcg_id: %u objects to shrink %ld gfp_flags %s cache items %ld delta %lld total_scan %ld priority %d",
 		__entry->shrink,
 		__entry->shr,
 		__entry->nid,
+		__entry->memcg_id,
 		__entry->nr_objects_to_shrink,
 		show_gfp_flags(__entry->gfp_flags),
 		__entry->cache_items,
@@ -242,36 +252,39 @@ TRACE_EVENT(mm_shrink_slab_start,
 );
 
 TRACE_EVENT(mm_shrink_slab_end,
-	TP_PROTO(struct shrinker *shr, int nid, int shrinker_retval,
+	TP_PROTO(struct shrinker *shr, struct shrink_control *sc, int shrinker_retval,
 		long unused_scan_cnt, long new_scan_cnt, long total_scan),
 
-	TP_ARGS(shr, nid, shrinker_retval, unused_scan_cnt, new_scan_cnt,
+	TP_ARGS(shr, sc, shrinker_retval, unused_scan_cnt, new_scan_cnt,
 		total_scan),
 
 	TP_STRUCT__entry(
 		__field(struct shrinker *, shr)
-		__field(int, nid)
 		__field(void *, shrink)
 		__field(long, unused_scan)
 		__field(long, new_scan)
-		__field(int, retval)
 		__field(long, total_scan)
+		__field(int, nid)
+		__field(int, retval)
+		__field(unsigned short, memcg_id)
 	),
 
 	TP_fast_assign(
 		__entry->shr = shr;
-		__entry->nid = nid;
 		__entry->shrink = shr->scan_objects;
 		__entry->unused_scan = unused_scan_cnt;
 		__entry->new_scan = new_scan_cnt;
-		__entry->retval = shrinker_retval;
 		__entry->total_scan = total_scan;
+		__entry->nid = sc->nid;
+		__entry->retval = shrinker_retval;
+		__entry->memcg_id = sc->memcg ? cgroup_id(sc->memcg->css.cgroup) : 0;
 	),
 
-	TP_printk("%pS %p: nid: %d unused scan count %ld new scan count %ld total_scan %ld last shrinker return val %d",
+	TP_printk("%pS %p: nid: %d memcg_id: %u unused scan count %ld new scan count %ld total_scan %ld last shrinker return val %d",
 		__entry->shrink,
 		__entry->shr,
 		__entry->nid,
+		__entry->memcg_id,
 		__entry->unused_scan,
 		__entry->new_scan,
 		__entry->total_scan,
@@ -504,9 +517,9 @@ TRACE_EVENT(mm_vmscan_node_reclaim_begin,
 
 DEFINE_EVENT(mm_vmscan_direct_reclaim_end_template, mm_vmscan_node_reclaim_end,
 
-	TP_PROTO(unsigned long nr_reclaimed),
+	TP_PROTO(unsigned long nr_reclaimed, unsigned short memcg_id),
 
-	TP_ARGS(nr_reclaimed)
+	TP_ARGS(nr_reclaimed, memcg_id)
 );
 
 TRACE_EVENT(mm_vmscan_throttled,
diff --git a/mm/shrinker.c b/mm/shrinker.c
index 4a93fd433689a..e3b894c20bec8 100644
--- a/mm/shrinker.c
+++ b/mm/shrinker.c
@@ -461,7 +461,7 @@ static unsigned long do_shrink_slab(struct shrink_control *shrinkctl,
 	 */
 	new_nr = add_nr_deferred(next_deferred, shrinker, shrinkctl);
 
-	trace_mm_shrink_slab_end(shrinker, shrinkctl->nid, freed, nr, new_nr, total_scan);
+	trace_mm_shrink_slab_end(shrinker, shrinkctl, freed, nr, new_nr, total_scan);
 	return freed;
 }
 
diff --git a/mm/vmscan.c b/mm/vmscan.c
index b2fc8b626d3df..3ac9f45461795 100644
--- a/mm/vmscan.c
+++ b/mm/vmscan.c
@@ -6642,11 +6642,11 @@ unsigned long try_to_free_pages(struct zonelist *zonelist, int order,
 		return 1;
 
 	set_task_reclaim_state(current, &sc.reclaim_state);
-	trace_mm_vmscan_direct_reclaim_begin(order, sc.gfp_mask);
+	trace_mm_vmscan_direct_reclaim_begin(sc.gfp_mask, order, 0);
 
 	nr_reclaimed = do_try_to_free_pages(zonelist, &sc);
 
-	trace_mm_vmscan_direct_reclaim_end(nr_reclaimed);
+	trace_mm_vmscan_direct_reclaim_end(nr_reclaimed, 0);
 	set_task_reclaim_state(current, NULL);
 
 	return nr_reclaimed;
@@ -6675,8 +6675,9 @@ unsigned long mem_cgroup_shrink_node(struct mem_cgroup *memcg,
 	sc.gfp_mask = (gfp_mask & GFP_RECLAIM_MASK) |
 			(GFP_HIGHUSER_MOVABLE & ~GFP_RECLAIM_MASK);
 
-	trace_mm_vmscan_memcg_softlimit_reclaim_begin(sc.order,
-						      sc.gfp_mask);
+	trace_mm_vmscan_memcg_softlimit_reclaim_begin(sc.gfp_mask,
+						      sc.order,
+						      cgroup_id(memcg->css.cgroup));
 
 	/*
 	 * NOTE: Although we can get the priority field, using it
@@ -6687,7 +6688,7 @@ unsigned long mem_cgroup_shrink_node(struct mem_cgroup *memcg,
 	 */
 	shrink_lruvec(lruvec, &sc);
 
-	trace_mm_vmscan_memcg_softlimit_reclaim_end(sc.nr_reclaimed);
+	trace_mm_vmscan_memcg_softlimit_reclaim_end(sc.nr_reclaimed, cgroup_id(memcg->css.cgroup));
 
 	*nr_scanned = sc.nr_scanned;
 
@@ -6723,13 +6724,13 @@ unsigned long try_to_free_mem_cgroup_pages(struct mem_cgroup *memcg,
 	struct zonelist *zonelist = node_zonelist(numa_node_id(), sc.gfp_mask);
 
 	set_task_reclaim_state(current, &sc.reclaim_state);
-	trace_mm_vmscan_memcg_reclaim_begin(0, sc.gfp_mask);
+	trace_mm_vmscan_memcg_reclaim_begin(sc.gfp_mask, 0, cgroup_id(memcg->css.cgroup));
 	noreclaim_flag = memalloc_noreclaim_save();
 
 	nr_reclaimed = do_try_to_free_pages(zonelist, &sc);
 
 	memalloc_noreclaim_restore(noreclaim_flag);
-	trace_mm_vmscan_memcg_reclaim_end(nr_reclaimed);
+	trace_mm_vmscan_memcg_reclaim_end(nr_reclaimed, cgroup_id(memcg->css.cgroup));
 	set_task_reclaim_state(current, NULL);
 
 	return nr_reclaimed;
@@ -7675,7 +7676,7 @@ static unsigned long __node_reclaim(struct pglist_data *pgdat, gfp_t gfp_mask,
 	delayacct_freepages_end();
 	psi_memstall_leave(&pflags);
 
-	trace_mm_vmscan_node_reclaim_end(sc->nr_reclaimed);
+	trace_mm_vmscan_node_reclaim_end(sc->nr_reclaimed, 0);
 
 	return sc->nr_reclaimed;
 }
-- 
2.33.8




* [PATCH v3 2/2] mm: vmscan: add PIDs to vmscan tracepoints
  2026-01-05 16:04   ` [PATCH v3 0/2] mm: vmscan: add PID and cgroup ID " Thomas Ballasi
  2026-01-05 16:04     ` [PATCH v3 1/2] mm: vmscan: add cgroup IDs " Thomas Ballasi
@ 2026-01-05 16:04     ` Thomas Ballasi
  2026-01-06  2:06     ` [PATCH v3 0/2] mm: vmscan: add PID and cgroup ID " Andrew Morton
  2 siblings, 0 replies; 24+ messages in thread
From: Thomas Ballasi @ 2026-01-05 16:04 UTC (permalink / raw)
  To: tballasi; +Cc: akpm, linux-mm, linux-trace-kernel, mhiramat, rostedt

These changes add additional tracepoint variables to help
debuggers attribute events to specific processes.

The PID field records current->pid.  Instead of storing a sentinel
value when the event fires outside process context, the output is
annotated with "(in-irq)" via __event_in_irq(), which reads the
interrupt-context flags already recorded with every event.

Signed-off-by: Thomas Ballasi <tballasi@linux.microsoft.com>
---
 include/trace/events/vmscan.h | 33 ++++++++++++++++++++++++---------
 1 file changed, 24 insertions(+), 9 deletions(-)

diff --git a/include/trace/events/vmscan.h b/include/trace/events/vmscan.h
index 93a9a9ba9405d..d438abfa03ebb 100644
--- a/include/trace/events/vmscan.h
+++ b/include/trace/events/vmscan.h
@@ -121,19 +121,23 @@ DECLARE_EVENT_CLASS(mm_vmscan_direct_reclaim_begin_template,
 	TP_STRUCT__entry(
 		__field(	unsigned long,	gfp_flags	)
 		__field(	int,	order		)
+		__field(	int,	pid		)
 		__field(	unsigned short,	memcg_id	)
 	),
 
 	TP_fast_assign(
 		__entry->gfp_flags	= (__force unsigned long)gfp_flags;
 		__entry->order		= order;
+		__entry->pid		= current->pid;
 		__entry->memcg_id	= memcg_id;
 	),
 
-	TP_printk("order=%d gfp_flags=%s memcg_id=%u",
+	TP_printk("order=%d gfp_flags=%s pid=%d memcg_id=%u %s",
 		__entry->order,
 		show_gfp_flags(__entry->gfp_flags),
-		__entry->memcg_id)
+		__entry->pid,
+		__entry->memcg_id,
+		__event_in_irq() ? "(in-irq)" : "")
 );
 
 DEFINE_EVENT(mm_vmscan_direct_reclaim_begin_template, mm_vmscan_direct_reclaim_begin,
@@ -167,17 +171,21 @@ DECLARE_EVENT_CLASS(mm_vmscan_direct_reclaim_end_template,
 
 	TP_STRUCT__entry(
 		__field(	unsigned long,	nr_reclaimed	)
+		__field(	int,	pid		)
 		__field(	unsigned short,	memcg_id	)
 	),
 
 	TP_fast_assign(
 		__entry->nr_reclaimed	= nr_reclaimed;
+		__entry->pid		= current->pid;
 		__entry->memcg_id	= memcg_id;
 	),
 
-	TP_printk("nr_reclaimed=%lu memcg_id=%u",
+	TP_printk("nr_reclaimed=%lu pid=%d memcg_id=%u %s",
 		__entry->nr_reclaimed,
-		__entry->memcg_id)
+		__entry->pid,
+		__entry->memcg_id,
+		__event_in_irq() ? "(in-irq)" : "")
 );
 
 DEFINE_EVENT(mm_vmscan_direct_reclaim_end_template, mm_vmscan_direct_reclaim_end,
@@ -222,6 +230,7 @@ TRACE_EVENT(mm_shrink_slab_start,
 		__field(unsigned long, total_scan)
 		__field(int, priority)
 		__field(int, nid)
+		__field(int, pid)
 		__field(unsigned short, memcg_id)
 	),
 
@@ -235,20 +244,23 @@ TRACE_EVENT(mm_shrink_slab_start,
 		__entry->total_scan = total_scan;
 		__entry->priority = priority;
 		__entry->nid = sc->nid;
+		__entry->pid = current->pid;
 		__entry->memcg_id = sc->memcg ? cgroup_id(sc->memcg->css.cgroup) : 0;
 	),
 
-	TP_printk("%pS %p: nid: %d memcg_id: %u objects to shrink %ld gfp_flags %s cache items %ld delta %lld total_scan %ld priority %d",
+	TP_printk("%pS %p: nid: %d pid: %d memcg_id: %u objects to shrink %ld gfp_flags %s cache items %ld delta %lld total_scan %ld priority %d %s",
 		__entry->shrink,
 		__entry->shr,
 		__entry->nid,
+		__entry->pid,
 		__entry->memcg_id,
 		__entry->nr_objects_to_shrink,
 		show_gfp_flags(__entry->gfp_flags),
 		__entry->cache_items,
 		__entry->delta,
 		__entry->total_scan,
-		__entry->priority)
+		__entry->priority,
+		__event_in_irq() ? "(in-irq)" : "")
 );
 
 TRACE_EVENT(mm_shrink_slab_end,
@@ -266,29 +278,32 @@ TRACE_EVENT(mm_shrink_slab_end,
 		__field(long, total_scan)
 		__field(int, nid)
 		__field(int, retval)
+		__field(int, pid)
 		__field(unsigned short, memcg_id)
 	),
 
 	TP_fast_assign(
 		__entry->shr = shr;
-		__entry->shrink = shr->scan_objects;
 		__entry->unused_scan = unused_scan_cnt;
 		__entry->new_scan = new_scan_cnt;
 		__entry->total_scan = total_scan;
 		__entry->nid = sc->nid;
 		__entry->retval = shrinker_retval;
+		__entry->pid = current->pid;
 		__entry->memcg_id = cgroup_id(sc->memcg->css.cgroup);
 	),
 
-	TP_printk("%pS %p: nid: %d memcg_id: %u unused scan count %ld new scan count %ld total_scan %ld last shrinker return val %d",
+	TP_printk("%pS %p: nid: %d pid: %d memcg_id: %u unused scan count %ld new scan count %ld total_scan %ld last shrinker return val %d %s",
 		__entry->shrink,
 		__entry->shr,
 		__entry->nid,
+		__entry->pid,
 		__entry->memcg_id,
 		__entry->unused_scan,
 		__entry->new_scan,
 		__entry->total_scan,
-		__entry->retval)
+		__entry->retval,
+		__event_in_irq() ? "(in-irq)" : "")
 );
 
 TRACE_EVENT(mm_vmscan_lru_isolate,
-- 
2.33.8
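
The __field() entries in this series are ordered so that pointer- and
long-sized members come first and the narrower int and short members
follow.  As a rough illustration of why that ordering matters, the
sketch below compares two layouts of the same members.  The struct
names and the ENTRY_PAYLOAD macro are hypothetical and are not the
real trace entry structures; the exact sizes depend on the ABI.

```c
#include <stddef.h>

/* Hypothetical layouts for illustration, not the real trace entries. */

/* ints placed between pointer-sized members can force padding. */
struct entry_interleaved {
	void *shr;
	int nid;
	void *shrink;
	long unused_scan;
	int retval;
	long total_scan;
};

/* Same members, widest first, so the two ints sit side by side. */
struct entry_grouped {
	void *shr;
	void *shrink;
	long unused_scan;
	long total_scan;
	int nid;
	int retval;
};

/* Sum of the member sizes, shared by both layouts. */
#define ENTRY_PAYLOAD \
	(2 * sizeof(void *) + 2 * sizeof(long) + 2 * sizeof(int))

/* Padding bytes the compiler added to the interleaved layout. */
static inline size_t padding_interleaved(void)
{
	return sizeof(struct entry_interleaved) - ENTRY_PAYLOAD;
}

/* Padding bytes the compiler added to the grouped layout. */
static inline size_t padding_grouped(void)
{
	return sizeof(struct entry_grouped) - ENTRY_PAYLOAD;
}
```

On a typical LP64 ABI such as x86_64 the interleaved layout carries
8 bytes of padding and the grouped layout none; on ABIs where int,
long and pointers are the same size both layouts are equally compact.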




* Re: [PATCH v3 1/2] mm: vmscan: add cgroup IDs to vmscan tracepoints
  2026-01-05 16:04     ` [PATCH v3 1/2] mm: vmscan: add cgroup IDs " Thomas Ballasi
@ 2026-01-05 22:46       ` Shakeel Butt
  2026-01-07 18:14         ` Shakeel Butt
  2026-01-07  1:56       ` build error on CONFIG_MEMCG=n "error: invalid use of undefined type 'struct mem_cgroup'" Harry Yoo
  1 sibling, 1 reply; 24+ messages in thread
From: Shakeel Butt @ 2026-01-05 22:46 UTC (permalink / raw)
  To: Thomas Ballasi; +Cc: akpm, linux-mm, linux-trace-kernel, mhiramat, rostedt

On Mon, Jan 05, 2026 at 08:04:22AM -0800, Thomas Ballasi wrote:
> Memory reclaim events are currently difficult to attribute to
> specific cgroups, making debugging memory pressure issues
> challenging.  This patch adds memory cgroup ID (memcg_id) to key
> vmscan tracepoints to enable better correlation and analysis.
> 
> For operations not associated with a specific cgroup, the field
> is defaulted to 0.
> 
> Signed-off-by: Thomas Ballasi <tballasi@linux.microsoft.com>

Couple of comments:

1. memcg_id is u64 but the patch is using 'unsigned short'.
2. I would prefer memcg pointer be passed in tracepoint and then in
trace header file cgroup_id() be used similar to other users in
include/trace/events/ folder.

Orthogonally I am cleaning up memcg id usage and after that cleanup,
mem_cgroup_id() would be preferred way to get the ID. No need to do
anything now as I will cleanup this usage later as well.
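
To illustrate comment 1, here is a small userspace sketch of what
happens when a 64-bit ID is stored in an 'unsigned short' field.  The
struct and function names are hypothetical stand-ins, not the trace
entry definitions from the patch, and the sketch assumes the usual
16-bit 'unsigned short'.

```c
#include <stdint.h>

/* Hypothetical stand-ins for a trace entry's ID field. */
struct entry_narrow {
	unsigned short memcg_id;	/* field width used in the patch */
};

struct entry_wide {
	uint64_t memcg_id;		/* wide enough for a u64 ID */
};

/* Store a 64-bit ID in the narrow field and read it back. */
static inline unsigned short store_narrow(uint64_t id)
{
	struct entry_narrow e = { .memcg_id = (unsigned short)id };

	return e.memcg_id;
}

/* Store a 64-bit ID in the wide field and read it back. */
static inline uint64_t store_wide(uint64_t id)
{
	struct entry_wide e = { .memcg_id = id };

	return e.memcg_id;
}
```

With a 16-bit 'unsigned short', the IDs 0x1 and 0x10001 become
indistinguishable after store_narrow(), while store_wide() keeps them
apart.  A wider field would also need a matching format specifier in
TP_printk() instead of %u.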



* Re: [PATCH v3 0/2] mm: vmscan: add PID and cgroup ID to vmscan tracepoints
  2026-01-05 16:04   ` [PATCH v3 0/2] mm: vmscan: add PID and cgroup ID " Thomas Ballasi
  2026-01-05 16:04     ` [PATCH v3 1/2] mm: vmscan: add cgroup IDs " Thomas Ballasi
  2026-01-05 16:04     ` [PATCH v3 2/2] mm: vmscan: add PIDs to vmscan tracepoints Thomas Ballasi
@ 2026-01-06  2:06     ` Andrew Morton
  2026-01-06  2:21       ` Steven Rostedt
  2 siblings, 1 reply; 24+ messages in thread
From: Andrew Morton @ 2026-01-06  2:06 UTC (permalink / raw)
  To: Thomas Ballasi; +Cc: linux-mm, linux-trace-kernel, mhiramat, rostedt

On Mon,  5 Jan 2026 08:04:21 -0800 Thomas Ballasi <tballasi@linux.microsoft.com> wrote:

> Changes in v3:
> - Swapped multiple field entries to prevent a hole in the ring buffer
> - Replaced in_task() with __event_in_irq
> - Replaced mem_cgroup_id(memcg) with cgroup_id(memcg->css.cgroup)
> - Rebased the tree to latest 6.18

x86_64 allmodconfig;


In file included from ./include/trace/define_trace.h:132,
                 from ./include/trace/events/vmscan.h:569,
                 from mm/vmscan.c:73:
./include/trace/events/vmscan.h: In function 'trace_raw_output_mm_vmscan_direct_reclaim_begin_template':
./include/trace/events/vmscan.h:140:17: error: implicit declaration of function '__event_in_irq' [-Wimplicit-function-declaration]
  140 | +               __event_in_irq() ? "(in-irq)" : "")
      |                 ^~~~~~~~~~~~~~
./include/trace/trace_events.h:219:34: note: in definition of macro 'DECLARE_EVENT_CLASS'
  219 |         trace_event_printf(iter, print);                                \
      |                                  ^~~~~
./include/trace/events/vmscan.h:135:9: note: in expansion of macro 'TP_printk'
  135 |         TP_printk("order=%d gfp_flags=%s pid=%d memcg_id=%u %s",
      |         ^~~~~~~~~
make[3]: *** [scripts/Makefile.build:287: mm/vmscan.o] Error 1
make[2]: *** [scripts/Makefile.build:556: mm] Error 2
make[1]: *** [/usr/src/25/Makefile:2054: .] Error 2
make: *** [Makefile:248: __sub-make] Error 2




* Re: [PATCH v3 0/2] mm: vmscan: add PID and cgroup ID to vmscan tracepoints
  2026-01-06  2:06     ` [PATCH v3 0/2] mm: vmscan: add PID and cgroup ID " Andrew Morton
@ 2026-01-06  2:21       ` Steven Rostedt
  0 siblings, 0 replies; 24+ messages in thread
From: Steven Rostedt @ 2026-01-06  2:21 UTC (permalink / raw)
  To: Andrew Morton; +Cc: Thomas Ballasi, linux-mm, linux-trace-kernel, mhiramat

On Mon, 5 Jan 2026 18:06:40 -0800
Andrew Morton <akpm@linux-foundation.org> wrote:

> On Mon,  5 Jan 2026 08:04:21 -0800 Thomas Ballasi <tballasi@linux.microsoft.com> wrote:
> 
> > Changes in v3:
> > - Swapped multiple field entries to prevent a hole in the ring buffer
> > - Replaced in_task() with __event_in_irq
> > - Replaced mem_cgroup_id(memcg) with cgroup_id(memcg->css.cgroup)
> > - Rebased the tree to latest 6.18  
> 
> x86_64 allmodconfig;
> 
> 
> In file included from ./include/trace/define_trace.h:132,
>                  from ./include/trace/events/vmscan.h:569,
>                  from mm/vmscan.c:73:
> ./include/trace/events/vmscan.h: In function 'trace_raw_output_mm_vmscan_direct_reclaim_begin_template':
> ./include/trace/events/vmscan.h:140:17: error: implicit declaration of function '__event_in_irq' [-Wimplicit-function-declaration]
>   140 | +               __event_in_irq() ? "(in-irq)" : "")
>       |                 ^~~~~~~~~~~~~~
> ./include/trace/trace_events.h:219:34: note: in definition of macro 'DECLARE_EVENT_CLASS'
>   219 |         trace_event_printf(iter, print);                                \
>       |                                  ^~~~~
> ./include/trace/events/vmscan.h:135:9: note: in expansion of macro 'TP_printk'
>   135 |         TP_printk("order=%d gfp_flags=%s pid=%d memcg_id=%u %s",
>       |         ^~~~~~~~~
> make[3]: *** [scripts/Makefile.build:287: mm/vmscan.o] Error 1
> make[2]: *** [scripts/Makefile.build:556: mm] Error 2
> make[1]: *** [/usr/src/25/Makefile:2054: .] Error 2
> make: *** [Makefile:248: __sub-make] Error 2

This is dependent on my patch:

  https://lore.kernel.org/all/20251229163515.3d1b0bba@gandalf.local.home/

Where I said you can take this patch. But I don't see it as part of the
series.

  https://lore.kernel.org/all/20251229163634.5aad205d@gandalf.local.home/

-- Steve




* build error on CONFIG_MEMCG=n "error: invalid use of undefined type 'struct mem_cgroup'"
  2026-01-05 16:04     ` [PATCH v3 1/2] mm: vmscan: add cgroup IDs " Thomas Ballasi
  2026-01-05 22:46       ` Shakeel Butt
@ 2026-01-07  1:56       ` Harry Yoo
  2026-01-07  2:17         ` Andrew Morton
  1 sibling, 1 reply; 24+ messages in thread
From: Harry Yoo @ 2026-01-07  1:56 UTC (permalink / raw)
  To: Thomas Ballasi; +Cc: akpm, linux-mm, linux-trace-kernel, mhiramat, rostedt

On Mon, Jan 05, 2026 at 08:04:22AM -0800, Thomas Ballasi wrote:
> Memory reclaim events are currently difficult to attribute to
> specific cgroups, making debugging memory pressure issues
> challenging.  This patch adds memory cgroup ID (memcg_id) to key
> vmscan tracepoints to enable better correlation and analysis.
> 
> For operations not associated with a specific cgroup, the field
> is defaulted to 0.
> 
> Signed-off-by: Thomas Ballasi <tballasi@linux.microsoft.com>
> ---
>  include/trace/events/vmscan.h | 79 ++++++++++++++++++++---------------
>  mm/shrinker.c                 |  2 +-
>  mm/vmscan.c                   | 17 ++++----
>  3 files changed, 56 insertions(+), 42 deletions(-)
> 
> diff --git a/include/trace/events/vmscan.h b/include/trace/events/vmscan.h
> index 490958fa10dee..93a9a9ba9405d 100644
> --- a/include/trace/events/vmscan.h
> +++ b/include/trace/events/vmscan.h
> @@ -208,31 +215,34 @@ TRACE_EVENT(mm_shrink_slab_start,
>  	TP_fast_assign(
>  		__entry->shr = shr;
>  		__entry->shrink = shr->scan_objects;
> -		__entry->nid = sc->nid;
>  		__entry->nr_objects_to_shrink = nr_objects_to_shrink;
>  		__entry->gfp_flags = (__force unsigned long)sc->gfp_mask;
>  		__entry->cache_items = cache_items;
>  		__entry->delta = delta;
>  		__entry->total_scan = total_scan;
>  		__entry->priority = priority;
> +		__entry->nid = sc->nid;
> +		__entry->memcg_id = sc->memcg ? cgroup_id(sc->memcg->css.cgroup) : 0;

Hi Thomas, this is breaking CONFIG_MEMCG=n builds.

In file included from ./include/trace/define_trace.h:132,
                 from ./include/trace/events/vmscan.h:569,
                 from mm/vmscan.c:73:
./include/trace/events/vmscan.h: In function ‘do_trace_event_raw_event_mm_shrink_slab_start’:
./include/trace/events/vmscan.h:248:68: error: invalid use of undefined type ‘struct mem_cgroup’
  248 |                 __entry->memcg_id = sc->memcg ? cgroup_id(sc->memcg->css.cgroup) : 0;
      |                                                                    ^~
./include/trace/trace_events.h:427:11: note: in definition of macro ‘__DECLARE_EVENT_CLASS’
  427 |         { assign; }                                                     \
      |           ^~~~~~
./include/trace/trace_events.h:435:23: note: in expansion of macro ‘PARAMS’
  435 |                       PARAMS(assign), PARAMS(print))                    \
      |                       ^~~~~~
./include/trace/trace_events.h:40:9: note: in expansion of macro ‘DECLARE_EVENT_CLASS’
   40 |         DECLARE_EVENT_CLASS(name,                              \
      |         ^~~~~~~~~~~~~~~~~~~
./include/trace/trace_events.h:44:30: note: in expansion of macro ‘PARAMS’
   44 |                              PARAMS(assign),                   \
      |                              ^~~~~~
./include/trace/events/vmscan.h:214:1: note: in expansion of macro ‘TRACE_EVENT’
  214 | TRACE_EVENT(mm_shrink_slab_start,
      | ^~~~~~~~~~~
./include/trace/events/vmscan.h:237:9: note: in expansion of macro ‘TP_fast_assign’
  237 |         TP_fast_assign(
      |         ^~~~~~~~~~~~~~
./include/trace/events/vmscan.h: In function ‘do_trace_event_raw_event_mm_shrink_slab_end’:
./include/trace/events/vmscan.h:293:56: error: invalid use of undefined type ‘struct mem_cgroup’
  293 |                 __entry->memcg_id = cgroup_id(sc->memcg->css.cgroup);
      |                                                        ^~
./include/trace/trace_events.h:427:11: note: in definition of macro ‘__DECLARE_EVENT_CLASS’
  427 |         { assign; }                                                     \
      |           ^~~~~~
./include/trace/trace_events.h:435:23: note: in expansion of macro ‘PARAMS’
  435 |                       PARAMS(assign), PARAMS(print))                    \
      |                       ^~~~~~
./include/trace/trace_events.h:40:9: note: in expansion of macro ‘DECLARE_EVENT_CLASS’
   40 |         DECLARE_EVENT_CLASS(name,                              \
      |         ^~~~~~~~~~~~~~~~~~~
./include/trace/trace_events.h:44:30: note: in expansion of macro ‘PARAMS’
   44 |                              PARAMS(assign),                   \
      |                              ^~~~~~
./include/trace/events/vmscan.h:266:1: note: in expansion of macro ‘TRACE_EVENT’
  266 | TRACE_EVENT(mm_shrink_slab_end,
      | ^~~~~~~~~~~
./include/trace/events/vmscan.h:285:9: note: in expansion of macro ‘TP_fast_assign’
  285 |         TP_fast_assign(
      |         ^~~~~~~~~~~~~~
In file included from ./include/trace/define_trace.h:133:
./include/trace/events/vmscan.h: In function ‘do_perf_trace_mm_shrink_slab_start’:
./include/trace/events/vmscan.h:248:68: error: invalid use of undefined type ‘struct mem_cgroup’
  248 |                 __entry->memcg_id = sc->memcg ? cgroup_id(sc->memcg->css.cgroup) : 0;
      |                                                                    ^~
./include/trace/perf.h:51:11: note: in definition of macro ‘__DECLARE_EVENT_CLASS’
   51 |         { assign; }                                                     \
      |           ^~~~~~
./include/trace/perf.h:67:23: note: in expansion of macro ‘PARAMS’
   67 |                       PARAMS(assign), PARAMS(print))                    \
      |                       ^~~~~~
./include/trace/trace_events.h:40:9: note: in expansion of macro ‘DECLARE_EVENT_CLASS’
   40 |         DECLARE_EVENT_CLASS(name,                              \
      |         ^~~~~~~~~~~~~~~~~~~
./include/trace/trace_events.h:44:30: note: in expansion of macro ‘PARAMS’
   44 |                              PARAMS(assign),                   \
      |                              ^~~~~~
./include/trace/events/vmscan.h:214:1: note: in expansion of macro ‘TRACE_EVENT’
  214 | TRACE_EVENT(mm_shrink_slab_start,
      | ^~~~~~~~~~~
./include/trace/events/vmscan.h:237:9: note: in expansion of macro ‘TP_fast_assign’
  237 |         TP_fast_assign(
      |         ^~~~~~~~~~~~~~
./include/trace/events/vmscan.h: In function ‘do_perf_trace_mm_shrink_slab_end’:
./include/trace/events/vmscan.h:293:56: error: invalid use of undefined type ‘struct mem_cgroup’
  293 |                 __entry->memcg_id = cgroup_id(sc->memcg->css.cgroup);
      |                                                        ^~
./include/trace/perf.h:51:11: note: in definition of macro ‘__DECLARE_EVENT_CLASS’
   51 |         { assign; }                                                     \
      |           ^~~~~~
./include/trace/perf.h:67:23: note: in expansion of macro ‘PARAMS’
   67 |                       PARAMS(assign), PARAMS(print))                    \
      |                       ^~~~~~
./include/trace/trace_events.h:40:9: note: in expansion of macro ‘DECLARE_EVENT_CLASS’
   40 |         DECLARE_EVENT_CLASS(name,                              \
      |         ^~~~~~~~~~~~~~~~~~~
./include/trace/trace_events.h:44:30: note: in expansion of macro ‘PARAMS’
   44 |                              PARAMS(assign),                   \
      |                              ^~~~~~
./include/trace/events/vmscan.h:266:1: note: in expansion of macro ‘TRACE_EVENT’
  266 | TRACE_EVENT(mm_shrink_slab_end,
      | ^~~~~~~~~~~
./include/trace/events/vmscan.h:285:9: note: in expansion of macro ‘TP_fast_assign’
  285 |         TP_fast_assign(
      |         ^~~~~~~~~~~~~~
  CC      arch/x86/mm/extable.o
make[3]: *** [scripts/Makefile.build:287: mm/vmscan.o] Error 1
make[2]: *** [scripts/Makefile.build:556: mm] Error 2
make[2]: *** Waiting for unfinished jobs....

-- 
Cheers,
Harry / Hyeonggon

>  	),
>  
> -	TP_printk("%pS %p: nid: %d objects to shrink %ld gfp_flags %s cache items %ld delta %lld total_scan %ld priority %d",
> +	TP_printk("%pS %p: nid: %d memcg_id: %u objects to shrink %ld gfp_flags %s cache items %ld delta %lld total_scan %ld priority %d",
>  		__entry->shrink,
>  		__entry->shr,
>  		__entry->nid,
> +		__entry->memcg_id,
>  		__entry->nr_objects_to_shrink,
>  		show_gfp_flags(__entry->gfp_flags),
>  		__entry->cache_items,
> @@ -242,36 +252,39 @@ TRACE_EVENT(mm_shrink_slab_start,
>  );
>  
>  TRACE_EVENT(mm_shrink_slab_end,
> -	TP_PROTO(struct shrinker *shr, int nid, int shrinker_retval,
> +	TP_PROTO(struct shrinker *shr, struct shrink_control *sc, int shrinker_retval,
>  		long unused_scan_cnt, long new_scan_cnt, long total_scan),
>  
> -	TP_ARGS(shr, nid, shrinker_retval, unused_scan_cnt, new_scan_cnt,
> +	TP_ARGS(shr, sc, shrinker_retval, unused_scan_cnt, new_scan_cnt,
>  		total_scan),
>  
>  	TP_STRUCT__entry(
>  		__field(struct shrinker *, shr)
> -		__field(int, nid)
>  		__field(void *, shrink)
>  		__field(long, unused_scan)
>  		__field(long, new_scan)
> -		__field(int, retval)
>  		__field(long, total_scan)
> +		__field(int, nid)
> +		__field(int, retval)
> +		__field(unsigned short, memcg_id)
>  	),
>  
>  	TP_fast_assign(
>  		__entry->shr = shr;
> -		__entry->nid = nid;
>  		__entry->shrink = shr->scan_objects;
>  		__entry->unused_scan = unused_scan_cnt;
>  		__entry->new_scan = new_scan_cnt;
> -		__entry->retval = shrinker_retval;
>  		__entry->total_scan = total_scan;
> +		__entry->nid = sc->nid;
> +		__entry->retval = shrinker_retval;
> +		__entry->memcg_id = cgroup_id(sc->memcg->css.cgroup);
>  	),
>  
> -	TP_printk("%pS %p: nid: %d unused scan count %ld new scan count %ld total_scan %ld last shrinker return val %d",
> +	TP_printk("%pS %p: nid: %d memcg_id: %u unused scan count %ld new scan count %ld total_scan %ld last shrinker return val %d",
>  		__entry->shrink,
>  		__entry->shr,
>  		__entry->nid,
> +		__entry->memcg_id,
>  		__entry->unused_scan,
>  		__entry->new_scan,
>  		__entry->total_scan,
> @@ -504,9 +517,9 @@ TRACE_EVENT(mm_vmscan_node_reclaim_begin,
>  
>  DEFINE_EVENT(mm_vmscan_direct_reclaim_end_template, mm_vmscan_node_reclaim_end,
>  
> -	TP_PROTO(unsigned long nr_reclaimed),
> +	TP_PROTO(unsigned long nr_reclaimed, unsigned short memcg_id),
>  
> -	TP_ARGS(nr_reclaimed)
> +	TP_ARGS(nr_reclaimed, memcg_id)
>  );
>  
>  TRACE_EVENT(mm_vmscan_throttled,



* Re: build error on CONFIG_MEMCG=n "error: invalid use of undefined type 'struct mem_cgroup'"
  2026-01-07  1:56       ` build error on CONFIG_MEMCG=n "error: invalid use of undefined type 'struct mem_cgroup'" Harry Yoo
@ 2026-01-07  2:17         ` Andrew Morton
  0 siblings, 0 replies; 24+ messages in thread
From: Andrew Morton @ 2026-01-07  2:17 UTC (permalink / raw)
  To: Harry Yoo; +Cc: Thomas Ballasi, linux-mm, linux-trace-kernel, mhiramat, rostedt

On Wed, 7 Jan 2026 10:56:46 +0900 Harry Yoo <harry.yoo@oracle.com> wrote:

> > --- a/include/trace/events/vmscan.h
> > +++ b/include/trace/events/vmscan.h
> > @@ -208,31 +215,34 @@ TRACE_EVENT(mm_shrink_slab_start,
> >  	TP_fast_assign(
> >  		__entry->shr = shr;
> >  		__entry->shrink = shr->scan_objects;
> > -		__entry->nid = sc->nid;
> >  		__entry->nr_objects_to_shrink = nr_objects_to_shrink;
> >  		__entry->gfp_flags = (__force unsigned long)sc->gfp_mask;
> >  		__entry->cache_items = cache_items;
> >  		__entry->delta = delta;
> >  		__entry->total_scan = total_scan;
> >  		__entry->priority = priority;
> > +		__entry->nid = sc->nid;
> > +		__entry->memcg_id = sc->memcg ? cgroup_id(sc->memcg->css.cgroup) : 0;
> 
> Hi Thomas, this is breaking CONFIG_MEMCG=n builds.

yup, thanks, that series has been removed from mm.git.



* Re: [PATCH v3 1/2] mm: vmscan: add cgroup IDs to vmscan tracepoints
  2026-01-05 22:46       ` Shakeel Butt
@ 2026-01-07 18:14         ` Shakeel Butt
  2026-01-07 18:32           ` Andrew Morton
  0 siblings, 1 reply; 24+ messages in thread
From: Shakeel Butt @ 2026-01-07 18:14 UTC (permalink / raw)
  To: Thomas Ballasi; +Cc: akpm, linux-mm, linux-trace-kernel, mhiramat, rostedt

On Mon, Jan 05, 2026 at 02:46:39PM -0800, Shakeel Butt wrote:
[...]
> 
> Orthogonally I am cleaning up memcg id usage and after that cleanup,
> mem_cgroup_id() would be preferred way to get the ID. No need to do
> anything now as I will cleanup this usage later as well.

The series has been landed in mm-new. Please use mem_cgroup_id() instead
of cgroup_id(memcg->css.cgroup).



* Re: [PATCH v3 1/2] mm: vmscan: add cgroup IDs to vmscan tracepoints
  2026-01-07 18:14         ` Shakeel Butt
@ 2026-01-07 18:32           ` Andrew Morton
  2026-01-07 20:35             ` Shakeel Butt
  0 siblings, 1 reply; 24+ messages in thread
From: Andrew Morton @ 2026-01-07 18:32 UTC (permalink / raw)
  To: Shakeel Butt
  Cc: Thomas Ballasi, linux-mm, linux-trace-kernel, mhiramat, rostedt

On Wed, 7 Jan 2026 10:14:53 -0800 Shakeel Butt <shakeel.butt@linux.dev> wrote:

> On Mon, Jan 05, 2026 at 02:46:39PM -0800, Shakeel Butt wrote:
> [...]
> > 
> > Orthogonally I am cleaning up memcg id usage and after that cleanup,
> > mem_cgroup_id() would be preferred way to get the ID. No need to do
> > anything now as I will cleanup this usage later as well.
> 
> The series has been landed in mm-new. Please use mem_cgroup_id() instead
> of cgroup_id(memcg->css.cgroup).

fyi, I removed this series yesterday due to a CONFIG_MEMCG=n build failure.



* Re: [PATCH v3 1/2] mm: vmscan: add cgroup IDs to vmscan tracepoints
  2026-01-07 18:32           ` Andrew Morton
@ 2026-01-07 20:35             ` Shakeel Butt
  0 siblings, 0 replies; 24+ messages in thread
From: Shakeel Butt @ 2026-01-07 20:35 UTC (permalink / raw)
  To: Andrew Morton
  Cc: Thomas Ballasi, linux-mm, linux-trace-kernel, mhiramat, rostedt

On Wed, Jan 07, 2026 at 10:32:08AM -0800, Andrew Morton wrote:
> On Wed, 7 Jan 2026 10:14:53 -0800 Shakeel Butt <shakeel.butt@linux.dev> wrote:
> 
> > On Mon, Jan 05, 2026 at 02:46:39PM -0800, Shakeel Butt wrote:
> > [...]
> > > 
> > > Orthogonally I am cleaning up memcg id usage and after that cleanup,
> > > mem_cgroup_id() would be preferred way to get the ID. No need to do
> > > anything now as I will cleanup this usage later as well.
> > 
> > The series has been landed in mm-new. Please use mem_cgroup_id() instead
> > of cgroup_id(memcg->css.cgroup).
> 
> fyi, I removed this series yesterday due to a CONFIG_MEMCG=n build failure.

Oh sorry, I meant my series [1] landed in mm-new and with that Thomas
can directly use mem_cgroup_id() without worrying about CONFIG_MEMCG=n.

[1] https://lkml.kernel.org/r/20251225232116.294540-1-shakeel.butt@linux.dev
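
For context, the general pattern that avoids this kind of
CONFIG_MEMCG=n breakage is to go through an accessor that has a stub
when the option is disabled, rather than dereferencing the structure
in the trace header.  The following is only a userspace sketch of that
pattern: DEMO_MEMCG, struct demo_memcg, demo_memcg_id() and the other
names are hypothetical, and this is not the implementation of
mem_cgroup_id().

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical names; DEMO_MEMCG stands in for a Kconfig option. */
#ifdef DEMO_MEMCG
struct demo_memcg {
	uint64_t id;
};

/* Option enabled: the type is complete, so members can be read. */
static inline uint64_t demo_memcg_id(const struct demo_memcg *memcg)
{
	return memcg ? memcg->id : 0;
}
#else
struct demo_memcg;	/* incomplete type: members are not accessible */

/* Option disabled: the stub never looks inside the structure. */
static inline uint64_t demo_memcg_id(const struct demo_memcg *memcg)
{
	(void)memcg;
	return 0;
}
#endif

struct demo_shrink_control {
	int nid;
	struct demo_memcg *memcg;
};

/* Caller code builds in both configurations; it only uses the accessor. */
static inline uint64_t demo_trace_memcg_id(const struct demo_shrink_control *sc)
{
	return demo_memcg_id(sc->memcg);
}
```

Writing sc->memcg->id directly in demo_trace_memcg_id() would fail to
compile without DEMO_MEMCG, which mirrors the "invalid use of
undefined type" error reported earlier in this thread.  The accessor
also handles a NULL pointer in one place.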



Thread overview: 24+ messages
2025-12-08 18:14 [PATCH 0/2] mm: vmscan: add PID and cgroup ID to vmscan tracepoints Thomas Ballasi
2025-12-08 18:14 ` [PATCH 1/2] mm: vmscan: add cgroup IDs " Thomas Ballasi
2025-12-08 18:14 ` [PATCH 2/2] mm: vmscan: add PIDs " Thomas Ballasi
2025-12-10  3:09   ` Steven Rostedt
2025-12-16 14:02 ` [PATCH v2 0/2] mm: vmscan: add PID and cgroup ID " Thomas Ballasi
2025-12-16 14:02   ` [PATCH v2 1/2] mm: vmscan: add cgroup IDs " Thomas Ballasi
2025-12-16 18:50     ` Shakeel Butt
2025-12-17 22:21     ` Steven Rostedt
2025-12-16 14:02   ` [PATCH v2 2/2] mm: vmscan: add PIDs " Thomas Ballasi
2025-12-16 18:03     ` Steven Rostedt
2025-12-29 10:54       ` Thomas Ballasi
2025-12-29 18:29         ` Steven Rostedt
2025-12-29 21:36           ` Steven Rostedt
2026-01-05 16:04   ` [PATCH v3 0/2] mm: vmscan: add PID and cgroup ID " Thomas Ballasi
2026-01-05 16:04     ` [PATCH v3 1/2] mm: vmscan: add cgroup IDs " Thomas Ballasi
2026-01-05 22:46       ` Shakeel Butt
2026-01-07 18:14         ` Shakeel Butt
2026-01-07 18:32           ` Andrew Morton
2026-01-07 20:35             ` Shakeel Butt
2026-01-07  1:56       ` build error on CONFIG_MEMCG=n "error: invalid use of undefined type 'struct mem_cgroup'" Harry Yoo
2026-01-07  2:17         ` Andrew Morton
2026-01-05 16:04     ` [PATCH v3 2/2] mm: vmscan: add PIDs to vmscan tracepoints Thomas Ballasi
2026-01-06  2:06     ` [PATCH v3 0/2] mm: vmscan: add PID and cgroup ID " Andrew Morton
2026-01-06  2:21       ` Steven Rostedt
