* [PATCH 2/2] mm: memcontrol: switch to native NR_VMALLOC vmstat counter
2026-02-20 19:10 [PATCH 1/2] mm: vmalloc: streamline vmalloc memory accounting Johannes Weiner
@ 2026-02-20 19:10 ` Johannes Weiner
2026-02-20 22:15 ` Shakeel Butt
2026-02-23 15:12 ` Uladzislau Rezki
2026-02-20 22:09 ` [PATCH 1/2] mm: vmalloc: streamline vmalloc memory accounting Shakeel Butt
2026-02-23 15:30 ` Uladzislau Rezki
2 siblings, 2 replies; 8+ messages in thread
From: Johannes Weiner @ 2026-02-20 19:10 UTC
To: Andrew Morton
Cc: Uladzislau Rezki, Joshua Hahn, Michal Hocko, Roman Gushchin,
Shakeel Butt, Muchun Song, linux-mm, cgroups, linux-kernel
Eliminates the custom memcg counter and results in a single,
consolidated accounting call in vmalloc code.
Signed-off-by: Johannes Weiner <hannes@cmpxchg.org>
---
include/linux/memcontrol.h | 1 -
mm/memcontrol.c | 4 ++--
mm/vmalloc.c | 16 ++++------------
3 files changed, 6 insertions(+), 15 deletions(-)
diff --git a/include/linux/memcontrol.h b/include/linux/memcontrol.h
index 67f154de10bc..c7cc4e50e59a 100644
--- a/include/linux/memcontrol.h
+++ b/include/linux/memcontrol.h
@@ -35,7 +35,6 @@ enum memcg_stat_item {
MEMCG_SWAP = NR_VM_NODE_STAT_ITEMS,
MEMCG_SOCK,
MEMCG_PERCPU_B,
- MEMCG_VMALLOC,
MEMCG_KMEM,
MEMCG_ZSWAP_B,
MEMCG_ZSWAPPED,
diff --git a/mm/memcontrol.c b/mm/memcontrol.c
index 129eed3ff5bb..fef5bdd887e0 100644
--- a/mm/memcontrol.c
+++ b/mm/memcontrol.c
@@ -317,6 +317,7 @@ static const unsigned int memcg_node_stat_items[] = {
NR_SHMEM_THPS,
NR_FILE_THPS,
NR_ANON_THPS,
+ NR_VMALLOC,
NR_KERNEL_STACK_KB,
NR_PAGETABLE,
NR_SECONDARY_PAGETABLE,
@@ -339,7 +340,6 @@ static const unsigned int memcg_stat_items[] = {
MEMCG_SWAP,
MEMCG_SOCK,
MEMCG_PERCPU_B,
- MEMCG_VMALLOC,
MEMCG_KMEM,
MEMCG_ZSWAP_B,
MEMCG_ZSWAPPED,
@@ -1359,7 +1359,7 @@ static const struct memory_stat memory_stats[] = {
{ "sec_pagetables", NR_SECONDARY_PAGETABLE },
{ "percpu", MEMCG_PERCPU_B },
{ "sock", MEMCG_SOCK },
- { "vmalloc", MEMCG_VMALLOC },
+ { "vmalloc", NR_VMALLOC },
{ "shmem", NR_SHMEM },
#ifdef CONFIG_ZSWAP
{ "zswap", MEMCG_ZSWAP_B },
diff --git a/mm/vmalloc.c b/mm/vmalloc.c
index a49a46de9c4f..8773bc0c4734 100644
--- a/mm/vmalloc.c
+++ b/mm/vmalloc.c
@@ -3446,9 +3446,6 @@ void vfree(const void *addr)
if (unlikely(vm->flags & VM_FLUSH_RESET_PERMS))
vm_reset_perms(vm);
- /* All pages of vm should be charged to same memcg, so use first one. */
- if (vm->nr_pages && !(vm->flags & VM_MAP_PUT_PAGES))
- mod_memcg_page_state(vm->pages[0], MEMCG_VMALLOC, -vm->nr_pages);
for (i = 0; i < vm->nr_pages; i++) {
struct page *page = vm->pages[i];
@@ -3458,7 +3455,7 @@ void vfree(const void *addr)
* can be freed as an array of order-0 allocations
*/
if (!(vm->flags & VM_MAP_PUT_PAGES))
- dec_node_page_state(page, NR_VMALLOC);
+ mod_lruvec_page_state(page, NR_VMALLOC, -1);
__free_page(page);
cond_resched();
}
@@ -3649,7 +3646,7 @@ vm_area_alloc_pages(gfp_t gfp, int nid,
continue;
}
- mod_node_page_state(page, NR_VMALLOC, 1 << large_order);
+ mod_lruvec_page_state(page, NR_VMALLOC, 1 << large_order);
split_page(page, large_order);
for (i = 0; i < (1U << large_order); i++)
@@ -3696,7 +3693,7 @@ vm_area_alloc_pages(gfp_t gfp, int nid,
pages + nr_allocated);
for (i = nr_allocated; i < nr_allocated + nr; i++)
- inc_node_page_state(pages[i], NR_VMALLOC);
+ mod_lruvec_page_state(pages[i], NR_VMALLOC, 1);
nr_allocated += nr;
@@ -3722,7 +3719,7 @@ vm_area_alloc_pages(gfp_t gfp, int nid,
if (unlikely(!page))
break;
- mod_node_page_state(page, NR_VMALLOC, 1 << order);
+ mod_lruvec_page_state(page, NR_VMALLOC, 1 << order);
/*
* High-order allocations must be able to be treated as
@@ -3866,11 +3863,6 @@ static void *__vmalloc_area_node(struct vm_struct *area, gfp_t gfp_mask,
vmalloc_gfp_adjust(gfp_mask, page_order), node,
page_order, nr_small_pages, area->pages);
- /* All pages of vm should be charged to same memcg, so use first one. */
- if (gfp_mask & __GFP_ACCOUNT && area->nr_pages)
- mod_memcg_page_state(area->pages[0], MEMCG_VMALLOC,
- area->nr_pages);
-
/*
* If not enough pages were obtained to accomplish an
* allocation request, free them via vfree() if any.
--
2.53.0
* Re: [PATCH 2/2] mm: memcontrol: switch to native NR_VMALLOC vmstat counter
2026-02-20 19:10 ` [PATCH 2/2] mm: memcontrol: switch to native NR_VMALLOC vmstat counter Johannes Weiner
@ 2026-02-20 22:15 ` Shakeel Butt
2026-02-23 15:12 ` Uladzislau Rezki
1 sibling, 0 replies; 8+ messages in thread
From: Shakeel Butt @ 2026-02-20 22:15 UTC
To: Johannes Weiner
Cc: Andrew Morton, Uladzislau Rezki, Joshua Hahn, Michal Hocko,
Roman Gushchin, Muchun Song, linux-mm, cgroups, linux-kernel
On Fri, Feb 20, 2026 at 02:10:35PM -0500, Johannes Weiner wrote:
> Eliminates the custom memcg counter and results in a single,
> consolidated accounting call in vmalloc code.
>
> Signed-off-by: Johannes Weiner <hannes@cmpxchg.org>
Acked-by: Shakeel Butt <shakeel.butt@linux.dev>
* Re: [PATCH 2/2] mm: memcontrol: switch to native NR_VMALLOC vmstat counter
2026-02-20 19:10 ` [PATCH 2/2] mm: memcontrol: switch to native NR_VMALLOC vmstat counter Johannes Weiner
2026-02-20 22:15 ` Shakeel Butt
@ 2026-02-23 15:12 ` Uladzislau Rezki
1 sibling, 0 replies; 8+ messages in thread
From: Uladzislau Rezki @ 2026-02-23 15:12 UTC
To: Johannes Weiner
Cc: Andrew Morton, Uladzislau Rezki, Joshua Hahn, Michal Hocko,
Roman Gushchin, Shakeel Butt, Muchun Song, linux-mm, cgroups,
linux-kernel
On Fri, Feb 20, 2026 at 02:10:35PM -0500, Johannes Weiner wrote:
> Eliminates the custom memcg counter and results in a single,
> consolidated accounting call in vmalloc code.
>
> Signed-off-by: Johannes Weiner <hannes@cmpxchg.org>
> ---
> include/linux/memcontrol.h | 1 -
> mm/memcontrol.c | 4 ++--
> mm/vmalloc.c | 16 ++++------------
> 3 files changed, 6 insertions(+), 15 deletions(-)
>
> diff --git a/include/linux/memcontrol.h b/include/linux/memcontrol.h
> index 67f154de10bc..c7cc4e50e59a 100644
> --- a/include/linux/memcontrol.h
> +++ b/include/linux/memcontrol.h
> @@ -35,7 +35,6 @@ enum memcg_stat_item {
> MEMCG_SWAP = NR_VM_NODE_STAT_ITEMS,
> MEMCG_SOCK,
> MEMCG_PERCPU_B,
> - MEMCG_VMALLOC,
> MEMCG_KMEM,
> MEMCG_ZSWAP_B,
> MEMCG_ZSWAPPED,
> diff --git a/mm/memcontrol.c b/mm/memcontrol.c
> index 129eed3ff5bb..fef5bdd887e0 100644
> --- a/mm/memcontrol.c
> +++ b/mm/memcontrol.c
> @@ -317,6 +317,7 @@ static const unsigned int memcg_node_stat_items[] = {
> NR_SHMEM_THPS,
> NR_FILE_THPS,
> NR_ANON_THPS,
> + NR_VMALLOC,
> NR_KERNEL_STACK_KB,
> NR_PAGETABLE,
> NR_SECONDARY_PAGETABLE,
> @@ -339,7 +340,6 @@ static const unsigned int memcg_stat_items[] = {
> MEMCG_SWAP,
> MEMCG_SOCK,
> MEMCG_PERCPU_B,
> - MEMCG_VMALLOC,
> MEMCG_KMEM,
> MEMCG_ZSWAP_B,
> MEMCG_ZSWAPPED,
> @@ -1359,7 +1359,7 @@ static const struct memory_stat memory_stats[] = {
> { "sec_pagetables", NR_SECONDARY_PAGETABLE },
> { "percpu", MEMCG_PERCPU_B },
> { "sock", MEMCG_SOCK },
> - { "vmalloc", MEMCG_VMALLOC },
> + { "vmalloc", NR_VMALLOC },
> { "shmem", NR_SHMEM },
> #ifdef CONFIG_ZSWAP
> { "zswap", MEMCG_ZSWAP_B },
> diff --git a/mm/vmalloc.c b/mm/vmalloc.c
> index a49a46de9c4f..8773bc0c4734 100644
> --- a/mm/vmalloc.c
> +++ b/mm/vmalloc.c
> @@ -3446,9 +3446,6 @@ void vfree(const void *addr)
>
> if (unlikely(vm->flags & VM_FLUSH_RESET_PERMS))
> vm_reset_perms(vm);
> - /* All pages of vm should be charged to same memcg, so use first one. */
> - if (vm->nr_pages && !(vm->flags & VM_MAP_PUT_PAGES))
> - mod_memcg_page_state(vm->pages[0], MEMCG_VMALLOC, -vm->nr_pages);
> for (i = 0; i < vm->nr_pages; i++) {
> struct page *page = vm->pages[i];
>
> @@ -3458,7 +3455,7 @@ void vfree(const void *addr)
> * can be freed as an array of order-0 allocations
> */
> if (!(vm->flags & VM_MAP_PUT_PAGES))
> - dec_node_page_state(page, NR_VMALLOC);
> + mod_lruvec_page_state(page, NR_VMALLOC, -1);
> __free_page(page);
> cond_resched();
> }
> @@ -3649,7 +3646,7 @@ vm_area_alloc_pages(gfp_t gfp, int nid,
> continue;
> }
>
> - mod_node_page_state(page, NR_VMALLOC, 1 << large_order);
> + mod_lruvec_page_state(page, NR_VMALLOC, 1 << large_order);
>
> split_page(page, large_order);
> for (i = 0; i < (1U << large_order); i++)
> @@ -3696,7 +3693,7 @@ vm_area_alloc_pages(gfp_t gfp, int nid,
> pages + nr_allocated);
>
> for (i = nr_allocated; i < nr_allocated + nr; i++)
> - inc_node_page_state(pages[i], NR_VMALLOC);
> + mod_lruvec_page_state(pages[i], NR_VMALLOC, 1);
>
> nr_allocated += nr;
>
> @@ -3722,7 +3719,7 @@ vm_area_alloc_pages(gfp_t gfp, int nid,
> if (unlikely(!page))
> break;
>
> - mod_node_page_state(page, NR_VMALLOC, 1 << order);
> + mod_lruvec_page_state(page, NR_VMALLOC, 1 << order);
>
> /*
> * High-order allocations must be able to be treated as
> @@ -3866,11 +3863,6 @@ static void *__vmalloc_area_node(struct vm_struct *area, gfp_t gfp_mask,
> vmalloc_gfp_adjust(gfp_mask, page_order), node,
> page_order, nr_small_pages, area->pages);
>
> - /* All pages of vm should be charged to same memcg, so use first one. */
> - if (gfp_mask & __GFP_ACCOUNT && area->nr_pages)
> - mod_memcg_page_state(area->pages[0], MEMCG_VMALLOC,
> - area->nr_pages);
> -
> /*
> * If not enough pages were obtained to accomplish an
> * allocation request, free them via vfree() if any.
> --
> 2.53.0
>
LGTM:
Reviewed-by: Uladzislau Rezki (Sony) <urezki@gmail.com>
* Re: [PATCH 1/2] mm: vmalloc: streamline vmalloc memory accounting
2026-02-20 19:10 [PATCH 1/2] mm: vmalloc: streamline vmalloc memory accounting Johannes Weiner
2026-02-20 19:10 ` [PATCH 2/2] mm: memcontrol: switch to native NR_VMALLOC vmstat counter Johannes Weiner
@ 2026-02-20 22:09 ` Shakeel Butt
2026-02-23 15:58 ` Johannes Weiner
2026-02-23 15:30 ` Uladzislau Rezki
2 siblings, 1 reply; 8+ messages in thread
From: Shakeel Butt @ 2026-02-20 22:09 UTC
To: Johannes Weiner
Cc: Andrew Morton, Uladzislau Rezki, Joshua Hahn, Michal Hocko,
Roman Gushchin, Muchun Song, linux-mm, cgroups, linux-kernel
On Fri, Feb 20, 2026 at 02:10:34PM -0500, Johannes Weiner wrote:
[...]
> static struct vmap_area *__find_vmap_area(unsigned long addr, struct rb_root *root)
> {
> struct rb_node *n = root->rb_node;
> @@ -3463,11 +3457,11 @@ void vfree(const void *addr)
> * High-order allocs for huge vmallocs are split, so
> * can be freed as an array of order-0 allocations
> */
> + if (!(vm->flags & VM_MAP_PUT_PAGES))
> + dec_node_page_state(page, NR_VMALLOC);
> __free_page(page);
> cond_resched();
> }
> - if (!(vm->flags & VM_MAP_PUT_PAGES))
> - atomic_long_sub(vm->nr_pages, &nr_vmalloc_pages);
> kvfree(vm->pages);
> kfree(vm);
> }
> @@ -3655,6 +3649,8 @@ vm_area_alloc_pages(gfp_t gfp, int nid,
> continue;
> }
>
> + mod_node_page_state(page, NR_VMALLOC, 1 << large_order);
mod_node_page_state() takes 'struct pglist_data *pgdat'; you need to use
page_pgdat(page) as the first param.
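i.e. something like this (untested, just to illustrate the expected
call signature):
	mod_node_page_state(page_pgdat(page), NR_VMALLOC, 1 << large_order);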
> +
> split_page(page, large_order);
> for (i = 0; i < (1U << large_order); i++)
> pages[nr_allocated + i] = page + i;
> @@ -3675,6 +3671,7 @@ vm_area_alloc_pages(gfp_t gfp, int nid,
> if (!order) {
> while (nr_allocated < nr_pages) {
> unsigned int nr, nr_pages_request;
> + int i;
>
> /*
> * A maximum allowed request is hard-coded and is 100
> @@ -3698,6 +3695,9 @@ vm_area_alloc_pages(gfp_t gfp, int nid,
> nr_pages_request,
> pages + nr_allocated);
>
> + for (i = nr_allocated; i < nr_allocated + nr; i++)
> + inc_node_page_state(pages[i], NR_VMALLOC);
> +
> nr_allocated += nr;
>
> /*
> @@ -3722,6 +3722,8 @@ vm_area_alloc_pages(gfp_t gfp, int nid,
> if (unlikely(!page))
> break;
>
> + mod_node_page_state(page, NR_VMALLOC, 1 << order);
Same here.
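i.e. again something like (untested):
	mod_node_page_state(page_pgdat(page), NR_VMALLOC, 1 << order);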
With the above fixes, you can add:
Acked-by: Shakeel Butt <shakeel.butt@linux.dev>
* Re: [PATCH 1/2] mm: vmalloc: streamline vmalloc memory accounting
2026-02-20 22:09 ` [PATCH 1/2] mm: vmalloc: streamline vmalloc memory accounting Shakeel Butt
@ 2026-02-23 15:58 ` Johannes Weiner
0 siblings, 0 replies; 8+ messages in thread
From: Johannes Weiner @ 2026-02-23 15:58 UTC
To: Shakeel Butt
Cc: Andrew Morton, Uladzislau Rezki, Joshua Hahn, Michal Hocko,
Roman Gushchin, Muchun Song, linux-mm, cgroups, linux-kernel
On Fri, Feb 20, 2026 at 02:09:28PM -0800, Shakeel Butt wrote:
> On Fri, Feb 20, 2026 at 02:10:34PM -0500, Johannes Weiner wrote:
> [...]
> > static struct vmap_area *__find_vmap_area(unsigned long addr, struct rb_root *root)
> > {
> > struct rb_node *n = root->rb_node;
> > @@ -3463,11 +3457,11 @@ void vfree(const void *addr)
> > * High-order allocs for huge vmallocs are split, so
> > * can be freed as an array of order-0 allocations
> > */
> > + if (!(vm->flags & VM_MAP_PUT_PAGES))
> > + dec_node_page_state(page, NR_VMALLOC);
> > __free_page(page);
> > cond_resched();
> > }
> > - if (!(vm->flags & VM_MAP_PUT_PAGES))
> > - atomic_long_sub(vm->nr_pages, &nr_vmalloc_pages);
> > kvfree(vm->pages);
> > kfree(vm);
> > }
> > @@ -3655,6 +3649,8 @@ vm_area_alloc_pages(gfp_t gfp, int nid,
> > continue;
> > }
> >
> > + mod_node_page_state(page, NR_VMALLOC, 1 << large_order);
>
> mod_node_page_state() takes 'struct pglist_data *pgdat'; you need to use
> page_pgdat(page) as the first param.
Good catch, my apologies. Serves me right for not compiling
incrementally.
> With the above fixes, you can add:
>
> Acked-by: Shakeel Butt <shakeel.butt@linux.dev>
Thanks! I'll send out v2.
* Re: [PATCH 1/2] mm: vmalloc: streamline vmalloc memory accounting
2026-02-20 19:10 [PATCH 1/2] mm: vmalloc: streamline vmalloc memory accounting Johannes Weiner
2026-02-20 19:10 ` [PATCH 2/2] mm: memcontrol: switch to native NR_VMALLOC vmstat counter Johannes Weiner
2026-02-20 22:09 ` [PATCH 1/2] mm: vmalloc: streamline vmalloc memory accounting Shakeel Butt
@ 2026-02-23 15:30 ` Uladzislau Rezki
2026-02-23 20:19 ` Johannes Weiner
2 siblings, 1 reply; 8+ messages in thread
From: Uladzislau Rezki @ 2026-02-23 15:30 UTC
To: Johannes Weiner
Cc: Andrew Morton, Uladzislau Rezki, Joshua Hahn, Michal Hocko,
Roman Gushchin, Shakeel Butt, Muchun Song, linux-mm, cgroups,
linux-kernel
On Fri, Feb 20, 2026 at 02:10:34PM -0500, Johannes Weiner wrote:
> Use a vmstat counter instead of a custom, open-coded atomic. This has
> the added benefit of making the data available per-node, and prepares
> for cleaning up the memcg accounting as well.
>
> Signed-off-by: Johannes Weiner <hannes@cmpxchg.org>
> ---
> fs/proc/meminfo.c | 3 ++-
> include/linux/mmzone.h | 1 +
> include/linux/vmalloc.h | 3 ---
> mm/vmalloc.c | 19 ++++++++++---------
> mm/vmstat.c | 1 +
> 5 files changed, 14 insertions(+), 13 deletions(-)
>
> diff --git a/fs/proc/meminfo.c b/fs/proc/meminfo.c
> index a458f1e112fd..549793f44726 100644
> --- a/fs/proc/meminfo.c
> +++ b/fs/proc/meminfo.c
> @@ -126,7 +126,8 @@ static int meminfo_proc_show(struct seq_file *m, void *v)
> show_val_kb(m, "Committed_AS: ", committed);
> seq_printf(m, "VmallocTotal: %8lu kB\n",
> (unsigned long)VMALLOC_TOTAL >> 10);
> - show_val_kb(m, "VmallocUsed: ", vmalloc_nr_pages());
> + show_val_kb(m, "VmallocUsed: ",
> + global_node_page_state(NR_VMALLOC));
> show_val_kb(m, "VmallocChunk: ", 0ul);
> show_val_kb(m, "Percpu: ", pcpu_nr_pages());
>
> diff --git a/include/linux/mmzone.h b/include/linux/mmzone.h
> index fc5d6c88d2f0..64df797d45c6 100644
> --- a/include/linux/mmzone.h
> +++ b/include/linux/mmzone.h
> @@ -220,6 +220,7 @@ enum node_stat_item {
> NR_KERNEL_MISC_RECLAIMABLE, /* reclaimable non-slab kernel pages */
> NR_FOLL_PIN_ACQUIRED, /* via: pin_user_page(), gup flag: FOLL_PIN */
> NR_FOLL_PIN_RELEASED, /* pages returned via unpin_user_page() */
> + NR_VMALLOC,
> NR_KERNEL_STACK_KB, /* measured in KiB */
> #if IS_ENABLED(CONFIG_SHADOW_CALL_STACK)
> NR_KERNEL_SCS_KB, /* measured in KiB */
> diff --git a/include/linux/vmalloc.h b/include/linux/vmalloc.h
> index e8e94f90d686..3b02c0c6b371 100644
> --- a/include/linux/vmalloc.h
> +++ b/include/linux/vmalloc.h
> @@ -286,8 +286,6 @@ int unregister_vmap_purge_notifier(struct notifier_block *nb);
> #ifdef CONFIG_MMU
> #define VMALLOC_TOTAL (VMALLOC_END - VMALLOC_START)
>
> -unsigned long vmalloc_nr_pages(void);
> -
> int vm_area_map_pages(struct vm_struct *area, unsigned long start,
> unsigned long end, struct page **pages);
> void vm_area_unmap_pages(struct vm_struct *area, unsigned long start,
> @@ -304,7 +302,6 @@ static inline void set_vm_flush_reset_perms(void *addr)
> #else /* !CONFIG_MMU */
> #define VMALLOC_TOTAL 0UL
>
> -static inline unsigned long vmalloc_nr_pages(void) { return 0; }
> static inline void set_vm_flush_reset_perms(void *addr) {}
> #endif /* CONFIG_MMU */
>
> diff --git a/mm/vmalloc.c b/mm/vmalloc.c
> index e286c2d2068c..a49a46de9c4f 100644
> --- a/mm/vmalloc.c
> +++ b/mm/vmalloc.c
> @@ -1063,14 +1063,8 @@ static BLOCKING_NOTIFIER_HEAD(vmap_notify_list);
> static void drain_vmap_area_work(struct work_struct *work);
> static DECLARE_WORK(drain_vmap_work, drain_vmap_area_work);
>
> -static __cacheline_aligned_in_smp atomic_long_t nr_vmalloc_pages;
> static __cacheline_aligned_in_smp atomic_long_t vmap_lazy_nr;
>
> -unsigned long vmalloc_nr_pages(void)
> -{
> - return atomic_long_read(&nr_vmalloc_pages);
> -}
> -
> static struct vmap_area *__find_vmap_area(unsigned long addr, struct rb_root *root)
> {
> struct rb_node *n = root->rb_node;
> @@ -3463,11 +3457,11 @@ void vfree(const void *addr)
> * High-order allocs for huge vmallocs are split, so
> * can be freed as an array of order-0 allocations
> */
> + if (!(vm->flags & VM_MAP_PUT_PAGES))
> + dec_node_page_state(page, NR_VMALLOC);
> __free_page(page);
> cond_resched();
> }
> - if (!(vm->flags & VM_MAP_PUT_PAGES))
> - atomic_long_sub(vm->nr_pages, &nr_vmalloc_pages);
> kvfree(vm->pages);
> kfree(vm);
> }
> @@ -3655,6 +3649,8 @@ vm_area_alloc_pages(gfp_t gfp, int nid,
> continue;
> }
>
> + mod_node_page_state(page, NR_VMALLOC, 1 << large_order);
> +
> split_page(page, large_order);
> for (i = 0; i < (1U << large_order); i++)
> pages[nr_allocated + i] = page + i;
> @@ -3675,6 +3671,7 @@ vm_area_alloc_pages(gfp_t gfp, int nid,
> if (!order) {
> while (nr_allocated < nr_pages) {
> unsigned int nr, nr_pages_request;
> + int i;
>
> /*
> * A maximum allowed request is hard-coded and is 100
> @@ -3698,6 +3695,9 @@ vm_area_alloc_pages(gfp_t gfp, int nid,
> nr_pages_request,
> pages + nr_allocated);
>
> + for (i = nr_allocated; i < nr_allocated + nr; i++)
> + inc_node_page_state(pages[i], NR_VMALLOC);
> +
> nr_allocated += nr;
>
> /*
> @@ -3722,6 +3722,8 @@ vm_area_alloc_pages(gfp_t gfp, int nid,
> if (unlikely(!page))
> break;
>
> + mod_node_page_state(page, NR_VMALLOC, 1 << order);
> +
> /*
Can we move the *_node_page_state() calls to the end of
vm_area_alloc_pages()?
Or should mod_node_page_state() be invoked on the high-order page
before the split in the first place (to avoid looping over the small
pages afterward)?
I mean, it would be good to keep this in one solid place, if that is
possible of course.
--
Uladzislau Rezki
* Re: [PATCH 1/2] mm: vmalloc: streamline vmalloc memory accounting
2026-02-23 15:30 ` Uladzislau Rezki
@ 2026-02-23 20:19 ` Johannes Weiner
0 siblings, 0 replies; 8+ messages in thread
From: Johannes Weiner @ 2026-02-23 20:19 UTC
To: Uladzislau Rezki
Cc: Andrew Morton, Joshua Hahn, Michal Hocko, Roman Gushchin,
Shakeel Butt, Muchun Song, linux-mm, cgroups, linux-kernel
On Mon, Feb 23, 2026 at 04:30:32PM +0100, Uladzislau Rezki wrote:
> On Fri, Feb 20, 2026 at 02:10:34PM -0500, Johannes Weiner wrote:
> > @@ -3655,6 +3649,8 @@ vm_area_alloc_pages(gfp_t gfp, int nid,
> > continue;
> > }
> >
> > + mod_node_page_state(page, NR_VMALLOC, 1 << large_order);
> > +
> > split_page(page, large_order);
> > for (i = 0; i < (1U << large_order); i++)
> > pages[nr_allocated + i] = page + i;
> > @@ -3675,6 +3671,7 @@ vm_area_alloc_pages(gfp_t gfp, int nid,
> > if (!order) {
> > while (nr_allocated < nr_pages) {
> > unsigned int nr, nr_pages_request;
> > + int i;
> >
> > /*
> > * A maximum allowed request is hard-coded and is 100
> > @@ -3698,6 +3695,9 @@ vm_area_alloc_pages(gfp_t gfp, int nid,
> > nr_pages_request,
> > pages + nr_allocated);
> >
> > + for (i = nr_allocated; i < nr_allocated + nr; i++)
> > + inc_node_page_state(pages[i], NR_VMALLOC);
> > +
> > nr_allocated += nr;
> >
> > /*
> > @@ -3722,6 +3722,8 @@ vm_area_alloc_pages(gfp_t gfp, int nid,
> > if (unlikely(!page))
> > break;
> >
> > + mod_node_page_state(page, NR_VMALLOC, 1 << order);
> > +
> > /*
> Can we move *_node_page_stat() to the end of the vm_area_alloc_pages()?
>
> Or mod_node_page_state in first place should be invoked on high-order
> page before split(to avoid of looping over small pages afterword)?
>
> I mean it would be good to place to the one solid place. If it is possible
> of course.
Note that the top one, in the fast path, IS called before the split.
We're accounting at the same granularity in which the page allocator
gives us the pages.
In the fallback paths (bulk allocator, and one-by-one loop), the issue
is that the individual pages could be coming from different nodes, so
they need to bump different counters. One possible solution would be
to remember the last node and accumulate until it differs, then flush:
	/* fallback loop */
	while (nr_allocated < nr_pages) {
		page = alloc_pages(gfp, 0);
		if (unlikely(!page))
			break;
		nid = page_to_nid(page);
		if (nid != last_nid) {
			/* flush what we accumulated for the previous node */
			if (node_count) {
				mod_node_page_state(NODE_DATA(last_nid),
						    NR_VMALLOC, node_count);
				node_count = 0;
			}
			last_nid = nid;
		}
		node_count++;
		pages[nr_allocated++] = page;
	}
	/* flush the remainder */
	if (node_count)
		mod_node_page_state(NODE_DATA(last_nid), NR_VMALLOC,
				    node_count);
But it IS the slow path, and these are fairly cheap per-cpu
counters. Especially compared to the cost of calling into the
allocator. So I'm not sure it's worth it... What do you think?