* [PATCH v2 1/2] mm: vmalloc: streamline vmalloc memory accounting
@ 2026-02-23 16:01 Johannes Weiner
  2026-02-23 16:01 ` [PATCH v2 2/2] mm: memcontrol: switch to native NR_VMALLOC vmstat counter Johannes Weiner
                   ` (2 more replies)
  0 siblings, 3 replies; 6+ messages in thread
From: Johannes Weiner @ 2026-02-23 16:01 UTC (permalink / raw)
  To: Andrew Morton
  Cc: Uladzislau Rezki, Joshua Hahn, Michal Hocko, Roman Gushchin,
	Shakeel Butt, Muchun Song, linux-mm, cgroups, linux-kernel

Use a vmstat counter instead of a custom, open-coded atomic. This has
the added benefit of making the data available per-node, and prepares
for cleaning up the memcg accounting as well.

Acked-by: Shakeel Butt <shakeel.butt@linux.dev>
Signed-off-by: Johannes Weiner <hannes@cmpxchg.org>
---
 fs/proc/meminfo.c       |  3 ++-
 include/linux/mmzone.h  |  1 +
 include/linux/vmalloc.h |  3 ---
 mm/vmalloc.c            | 19 ++++++++++---------
 mm/vmstat.c             |  1 +
 5 files changed, 14 insertions(+), 13 deletions(-)

V2:
- Fix mod_node_page_state() pgdat argument (Shakeel)

diff --git a/fs/proc/meminfo.c b/fs/proc/meminfo.c
index a458f1e112fd..549793f44726 100644
--- a/fs/proc/meminfo.c
+++ b/fs/proc/meminfo.c
@@ -126,7 +126,8 @@ static int meminfo_proc_show(struct seq_file *m, void *v)
 	show_val_kb(m, "Committed_AS:   ", committed);
 	seq_printf(m, "VmallocTotal:   %8lu kB\n",
 		   (unsigned long)VMALLOC_TOTAL >> 10);
-	show_val_kb(m, "VmallocUsed:    ", vmalloc_nr_pages());
+	show_val_kb(m, "VmallocUsed:    ",
+		    global_node_page_state(NR_VMALLOC));
 	show_val_kb(m, "VmallocChunk:   ", 0ul);
 	show_val_kb(m, "Percpu:         ", pcpu_nr_pages());
 
diff --git a/include/linux/mmzone.h b/include/linux/mmzone.h
index fc5d6c88d2f0..64df797d45c6 100644
--- a/include/linux/mmzone.h
+++ b/include/linux/mmzone.h
@@ -220,6 +220,7 @@ enum node_stat_item {
 	NR_KERNEL_MISC_RECLAIMABLE,	/* reclaimable non-slab kernel pages */
 	NR_FOLL_PIN_ACQUIRED,	/* via: pin_user_page(), gup flag: FOLL_PIN */
 	NR_FOLL_PIN_RELEASED,	/* pages returned via unpin_user_page() */
+	NR_VMALLOC,
 	NR_KERNEL_STACK_KB,	/* measured in KiB */
 #if IS_ENABLED(CONFIG_SHADOW_CALL_STACK)
 	NR_KERNEL_SCS_KB,	/* measured in KiB */
diff --git a/include/linux/vmalloc.h b/include/linux/vmalloc.h
index e8e94f90d686..3b02c0c6b371 100644
--- a/include/linux/vmalloc.h
+++ b/include/linux/vmalloc.h
@@ -286,8 +286,6 @@ int unregister_vmap_purge_notifier(struct notifier_block *nb);
 #ifdef CONFIG_MMU
 #define VMALLOC_TOTAL (VMALLOC_END - VMALLOC_START)
 
-unsigned long vmalloc_nr_pages(void);
-
 int vm_area_map_pages(struct vm_struct *area, unsigned long start,
 		      unsigned long end, struct page **pages);
 void vm_area_unmap_pages(struct vm_struct *area, unsigned long start,
@@ -304,7 +302,6 @@ static inline void set_vm_flush_reset_perms(void *addr)
 #else  /* !CONFIG_MMU */
 #define VMALLOC_TOTAL 0UL
 
-static inline unsigned long vmalloc_nr_pages(void) { return 0; }
 static inline void set_vm_flush_reset_perms(void *addr) {}
 #endif /* CONFIG_MMU */
 
diff --git a/mm/vmalloc.c b/mm/vmalloc.c
index e286c2d2068c..a5fc7795aafd 100644
--- a/mm/vmalloc.c
+++ b/mm/vmalloc.c
@@ -1063,14 +1063,8 @@ static BLOCKING_NOTIFIER_HEAD(vmap_notify_list);
 static void drain_vmap_area_work(struct work_struct *work);
 static DECLARE_WORK(drain_vmap_work, drain_vmap_area_work);
 
-static __cacheline_aligned_in_smp atomic_long_t nr_vmalloc_pages;
 static __cacheline_aligned_in_smp atomic_long_t vmap_lazy_nr;
 
-unsigned long vmalloc_nr_pages(void)
-{
-	return atomic_long_read(&nr_vmalloc_pages);
-}
-
 static struct vmap_area *__find_vmap_area(unsigned long addr, struct rb_root *root)
 {
 	struct rb_node *n = root->rb_node;
@@ -3463,11 +3457,11 @@ void vfree(const void *addr)
 		 * High-order allocs for huge vmallocs are split, so
 		 * can be freed as an array of order-0 allocations
 		 */
+		if (!(vm->flags & VM_MAP_PUT_PAGES))
+			dec_node_page_state(page, NR_VMALLOC);
 		__free_page(page);
 		cond_resched();
 	}
-	if (!(vm->flags & VM_MAP_PUT_PAGES))
-		atomic_long_sub(vm->nr_pages, &nr_vmalloc_pages);
 	kvfree(vm->pages);
 	kfree(vm);
 }
@@ -3655,6 +3649,8 @@ vm_area_alloc_pages(gfp_t gfp, int nid,
 			continue;
 		}
 
+		mod_node_page_state(page_pgdat(page), NR_VMALLOC, 1 << large_order);
+
 		split_page(page, large_order);
 		for (i = 0; i < (1U << large_order); i++)
 			pages[nr_allocated + i] = page + i;
@@ -3675,6 +3671,7 @@ vm_area_alloc_pages(gfp_t gfp, int nid,
 	if (!order) {
 		while (nr_allocated < nr_pages) {
 			unsigned int nr, nr_pages_request;
+			int i;
 
 			/*
 			 * A maximum allowed request is hard-coded and is 100
@@ -3698,6 +3695,9 @@ vm_area_alloc_pages(gfp_t gfp, int nid,
 							nr_pages_request,
 							pages + nr_allocated);
 
+			for (i = nr_allocated; i < nr_allocated + nr; i++)
+				inc_node_page_state(pages[i], NR_VMALLOC);
+
 			nr_allocated += nr;
 
 			/*
@@ -3722,6 +3722,8 @@ vm_area_alloc_pages(gfp_t gfp, int nid,
 		if (unlikely(!page))
 			break;
 
+		mod_node_page_state(page_pgdat(page), NR_VMALLOC, 1 << order);
+
 		/*
 		 * High-order allocations must be able to be treated as
 		 * independent small pages by callers (as they can with
@@ -3864,7 +3866,6 @@ static void *__vmalloc_area_node(struct vm_struct *area, gfp_t gfp_mask,
 			vmalloc_gfp_adjust(gfp_mask, page_order), node,
 			page_order, nr_small_pages, area->pages);
 
-	atomic_long_add(area->nr_pages, &nr_vmalloc_pages);
 	/* All pages of vm should be charged to same memcg, so use first one. */
 	if (gfp_mask & __GFP_ACCOUNT && area->nr_pages)
 		mod_memcg_page_state(area->pages[0], MEMCG_VMALLOC,
diff --git a/mm/vmstat.c b/mm/vmstat.c
index d6e814c82952..bc199c7cd07b 100644
--- a/mm/vmstat.c
+++ b/mm/vmstat.c
@@ -1270,6 +1270,7 @@ const char * const vmstat_text[] = {
 	[I(NR_KERNEL_MISC_RECLAIMABLE)]		= "nr_kernel_misc_reclaimable",
 	[I(NR_FOLL_PIN_ACQUIRED)]		= "nr_foll_pin_acquired",
 	[I(NR_FOLL_PIN_RELEASED)]		= "nr_foll_pin_released",
+	[I(NR_VMALLOC)]				= "nr_vmalloc",
 	[I(NR_KERNEL_STACK_KB)]			= "nr_kernel_stack",
 #if IS_ENABLED(CONFIG_SHADOW_CALL_STACK)
 	[I(NR_KERNEL_SCS_KB)]			= "nr_shadow_call_stack",
-- 
2.53.0



^ permalink raw reply	[flat|nested] 6+ messages in thread

* [PATCH v2 2/2] mm: memcontrol: switch to native NR_VMALLOC vmstat counter
  2026-02-23 16:01 [PATCH v2 1/2] mm: vmalloc: streamline vmalloc memory accounting Johannes Weiner
@ 2026-02-23 16:01 ` Johannes Weiner
  2026-02-23 19:24   ` Roman Gushchin
  2026-02-23 19:48   ` Vishal Moola (Oracle)
  2026-02-23 19:22 ` [PATCH v2 1/2] mm: vmalloc: streamline vmalloc memory accounting Roman Gushchin
  2026-02-23 19:48 ` Vishal Moola (Oracle)
  2 siblings, 2 replies; 6+ messages in thread
From: Johannes Weiner @ 2026-02-23 16:01 UTC (permalink / raw)
  To: Andrew Morton
  Cc: Uladzislau Rezki, Joshua Hahn, Michal Hocko, Roman Gushchin,
	Shakeel Butt, Muchun Song, linux-mm, cgroups, linux-kernel

Eliminate the custom MEMCG_VMALLOC counter in favor of the native
NR_VMALLOC vmstat item. This results in a single, consolidated
accounting call in the vmalloc code.

Acked-by: Shakeel Butt <shakeel.butt@linux.dev>
Reviewed-by: Uladzislau Rezki (Sony) <urezki@gmail.com>
Signed-off-by: Johannes Weiner <hannes@cmpxchg.org>
---
 include/linux/memcontrol.h |  1 -
 mm/memcontrol.c            |  4 ++--
 mm/vmalloc.c               | 16 ++++------------
 3 files changed, 6 insertions(+), 15 deletions(-)

diff --git a/include/linux/memcontrol.h b/include/linux/memcontrol.h
index 67f154de10bc..c7cc4e50e59a 100644
--- a/include/linux/memcontrol.h
+++ b/include/linux/memcontrol.h
@@ -35,7 +35,6 @@ enum memcg_stat_item {
 	MEMCG_SWAP = NR_VM_NODE_STAT_ITEMS,
 	MEMCG_SOCK,
 	MEMCG_PERCPU_B,
-	MEMCG_VMALLOC,
 	MEMCG_KMEM,
 	MEMCG_ZSWAP_B,
 	MEMCG_ZSWAPPED,
diff --git a/mm/memcontrol.c b/mm/memcontrol.c
index 129eed3ff5bb..fef5bdd887e0 100644
--- a/mm/memcontrol.c
+++ b/mm/memcontrol.c
@@ -317,6 +317,7 @@ static const unsigned int memcg_node_stat_items[] = {
 	NR_SHMEM_THPS,
 	NR_FILE_THPS,
 	NR_ANON_THPS,
+	NR_VMALLOC,
 	NR_KERNEL_STACK_KB,
 	NR_PAGETABLE,
 	NR_SECONDARY_PAGETABLE,
@@ -339,7 +340,6 @@ static const unsigned int memcg_stat_items[] = {
 	MEMCG_SWAP,
 	MEMCG_SOCK,
 	MEMCG_PERCPU_B,
-	MEMCG_VMALLOC,
 	MEMCG_KMEM,
 	MEMCG_ZSWAP_B,
 	MEMCG_ZSWAPPED,
@@ -1359,7 +1359,7 @@ static const struct memory_stat memory_stats[] = {
 	{ "sec_pagetables",		NR_SECONDARY_PAGETABLE		},
 	{ "percpu",			MEMCG_PERCPU_B			},
 	{ "sock",			MEMCG_SOCK			},
-	{ "vmalloc",			MEMCG_VMALLOC			},
+	{ "vmalloc",			NR_VMALLOC			},
 	{ "shmem",			NR_SHMEM			},
 #ifdef CONFIG_ZSWAP
 	{ "zswap",			MEMCG_ZSWAP_B			},
diff --git a/mm/vmalloc.c b/mm/vmalloc.c
index a5fc7795aafd..8773bc0c4734 100644
--- a/mm/vmalloc.c
+++ b/mm/vmalloc.c
@@ -3446,9 +3446,6 @@ void vfree(const void *addr)
 
 	if (unlikely(vm->flags & VM_FLUSH_RESET_PERMS))
 		vm_reset_perms(vm);
-	/* All pages of vm should be charged to same memcg, so use first one. */
-	if (vm->nr_pages && !(vm->flags & VM_MAP_PUT_PAGES))
-		mod_memcg_page_state(vm->pages[0], MEMCG_VMALLOC, -vm->nr_pages);
 	for (i = 0; i < vm->nr_pages; i++) {
 		struct page *page = vm->pages[i];
 
@@ -3458,7 +3455,7 @@ void vfree(const void *addr)
 		 * can be freed as an array of order-0 allocations
 		 */
 		if (!(vm->flags & VM_MAP_PUT_PAGES))
-			dec_node_page_state(page, NR_VMALLOC);
+			mod_lruvec_page_state(page, NR_VMALLOC, -1);
 		__free_page(page);
 		cond_resched();
 	}
@@ -3649,7 +3646,7 @@ vm_area_alloc_pages(gfp_t gfp, int nid,
 			continue;
 		}
 
-		mod_node_page_state(page_pgdat(page), NR_VMALLOC, 1 << large_order);
+		mod_lruvec_page_state(page, NR_VMALLOC, 1 << large_order);
 
 		split_page(page, large_order);
 		for (i = 0; i < (1U << large_order); i++)
@@ -3696,7 +3693,7 @@ vm_area_alloc_pages(gfp_t gfp, int nid,
 							pages + nr_allocated);
 
 			for (i = nr_allocated; i < nr_allocated + nr; i++)
-				inc_node_page_state(pages[i], NR_VMALLOC);
+				mod_lruvec_page_state(pages[i], NR_VMALLOC, 1);
 
 			nr_allocated += nr;
 
@@ -3722,7 +3719,7 @@ vm_area_alloc_pages(gfp_t gfp, int nid,
 		if (unlikely(!page))
 			break;
 
-		mod_node_page_state(page_pgdat(page), NR_VMALLOC, 1 << order);
+		mod_lruvec_page_state(page, NR_VMALLOC, 1 << order);
 
 		/*
 		 * High-order allocations must be able to be treated as
@@ -3866,11 +3863,6 @@ static void *__vmalloc_area_node(struct vm_struct *area, gfp_t gfp_mask,
 			vmalloc_gfp_adjust(gfp_mask, page_order), node,
 			page_order, nr_small_pages, area->pages);
 
-	/* All pages of vm should be charged to same memcg, so use first one. */
-	if (gfp_mask & __GFP_ACCOUNT && area->nr_pages)
-		mod_memcg_page_state(area->pages[0], MEMCG_VMALLOC,
-				     area->nr_pages);
-
 	/*
 	 * If not enough pages were obtained to accomplish an
 	 * allocation request, free them via vfree() if any.
-- 
2.53.0




* Re: [PATCH v2 1/2] mm: vmalloc: streamline vmalloc memory accounting
  2026-02-23 16:01 [PATCH v2 1/2] mm: vmalloc: streamline vmalloc memory accounting Johannes Weiner
  2026-02-23 16:01 ` [PATCH v2 2/2] mm: memcontrol: switch to native NR_VMALLOC vmstat counter Johannes Weiner
@ 2026-02-23 19:22 ` Roman Gushchin
  2026-02-23 19:48 ` Vishal Moola (Oracle)
  2 siblings, 0 replies; 6+ messages in thread
From: Roman Gushchin @ 2026-02-23 19:22 UTC (permalink / raw)
  To: Johannes Weiner
  Cc: Andrew Morton, Uladzislau Rezki, Joshua Hahn, Michal Hocko,
	Shakeel Butt, Muchun Song, linux-mm, cgroups, linux-kernel

Johannes Weiner <hannes@cmpxchg.org> writes:

> Use a vmstat counter instead of a custom, open-coded atomic. This has
> the added benefit of making the data available per-node, and prepares
> for cleaning up the memcg accounting as well.
>
> Acked-by: Shakeel Butt <shakeel.butt@linux.dev>
> Signed-off-by: Johannes Weiner <hannes@cmpxchg.org>

Reviewed-by: Roman Gushchin <roman.gushchin@linux.dev>

Thanks!



* Re: [PATCH v2 2/2] mm: memcontrol: switch to native NR_VMALLOC vmstat counter
  2026-02-23 16:01 ` [PATCH v2 2/2] mm: memcontrol: switch to native NR_VMALLOC vmstat counter Johannes Weiner
@ 2026-02-23 19:24   ` Roman Gushchin
  2026-02-23 19:48   ` Vishal Moola (Oracle)
  1 sibling, 0 replies; 6+ messages in thread
From: Roman Gushchin @ 2026-02-23 19:24 UTC (permalink / raw)
  To: Johannes Weiner
  Cc: Andrew Morton, Uladzislau Rezki, Joshua Hahn, Michal Hocko,
	Shakeel Butt, Muchun Song, linux-mm, cgroups, linux-kernel

Johannes Weiner <hannes@cmpxchg.org> writes:

> Eliminates the custom memcg counter and results in a single,
> consolidated accounting call in vmalloc code.
>
> Acked-by: Shakeel Butt <shakeel.butt@linux.dev>
> Reviewed-by: Uladzislau Rezki (Sony) <urezki@gmail.com>
> Signed-off-by: Johannes Weiner <hannes@cmpxchg.org>

Reviewed-by: Roman Gushchin <roman.gushchin@linux.dev>

Nice series!

Thanks



* Re: [PATCH v2 1/2] mm: vmalloc: streamline vmalloc memory accounting
  2026-02-23 16:01 [PATCH v2 1/2] mm: vmalloc: streamline vmalloc memory accounting Johannes Weiner
  2026-02-23 16:01 ` [PATCH v2 2/2] mm: memcontrol: switch to native NR_VMALLOC vmstat counter Johannes Weiner
  2026-02-23 19:22 ` [PATCH v2 1/2] mm: vmalloc: streamline vmalloc memory accounting Roman Gushchin
@ 2026-02-23 19:48 ` Vishal Moola (Oracle)
  2 siblings, 0 replies; 6+ messages in thread
From: Vishal Moola (Oracle) @ 2026-02-23 19:48 UTC (permalink / raw)
  To: Johannes Weiner
  Cc: Andrew Morton, Uladzislau Rezki, Joshua Hahn, Michal Hocko,
	Roman Gushchin, Shakeel Butt, Muchun Song, linux-mm, cgroups,
	linux-kernel

On Mon, Feb 23, 2026 at 11:01:06AM -0500, Johannes Weiner wrote:
> Use a vmstat counter instead of a custom, open-coded atomic. This has
> the added benefit of making the data available per-node, and prepares
> for cleaning up the memcg accounting as well.
> 
> Acked-by: Shakeel Butt <shakeel.butt@linux.dev>
> Signed-off-by: Johannes Weiner <hannes@cmpxchg.org>

Reviewed-by: Vishal Moola (Oracle) <vishal.moola@gmail.com>



* Re: [PATCH v2 2/2] mm: memcontrol: switch to native NR_VMALLOC vmstat counter
  2026-02-23 16:01 ` [PATCH v2 2/2] mm: memcontrol: switch to native NR_VMALLOC vmstat counter Johannes Weiner
  2026-02-23 19:24   ` Roman Gushchin
@ 2026-02-23 19:48   ` Vishal Moola (Oracle)
  1 sibling, 0 replies; 6+ messages in thread
From: Vishal Moola (Oracle) @ 2026-02-23 19:48 UTC (permalink / raw)
  To: Johannes Weiner
  Cc: Andrew Morton, Uladzislau Rezki, Joshua Hahn, Michal Hocko,
	Roman Gushchin, Shakeel Butt, Muchun Song, linux-mm, cgroups,
	linux-kernel

On Mon, Feb 23, 2026 at 11:01:07AM -0500, Johannes Weiner wrote:
> Eliminates the custom memcg counter and results in a single,
> consolidated accounting call in vmalloc code.
> 
> Acked-by: Shakeel Butt <shakeel.butt@linux.dev>
> Reviewed-by: Uladzislau Rezki (Sony) <urezki@gmail.com>
> Signed-off-by: Johannes Weiner <hannes@cmpxchg.org>

Reviewed-by: Vishal Moola (Oracle) <vishal.moola@gmail.com>


