linux-mm.kvack.org archive mirror
From: Vasily Averin <vvs@virtuozzo.com>
To: Roman Gushchin <roman.gushchin@linux.dev>
Cc: Vlastimil Babka <vbabka@suse.cz>,
	Hyeonggon Yoo <42.hyeyoo@gmail.com>,
	Christoph Lameter <cl@linux.com>,
	David Rientjes <rientjes@google.com>,
	Joonsoo Kim <iamjoonsoo.kim@lge.com>,
	Pekka Enberg <penberg@kernel.org>, Linux MM <linux-mm@kvack.org>,
	Andrew Morton <akpm@linux-foundation.org>,
	kernel@openvz.org
Subject: Re: slabinfo shows incorrect active_objs ???
Date: Mon, 28 Feb 2022 09:17:27 +0300
Message-ID: <2a7d3c8a-ad92-0ffe-4374-f0bb7e029a74@virtuozzo.com>
In-Reply-To: <1c73adc1-f780-56ac-4c67-490670a27951@virtuozzo.com>

On 25.02.2022 07:37, Vasily Averin wrote:
> On 25.02.2022 03:08, Roman Gushchin wrote:
>>
>>> On Feb 24, 2022, at 5:17 AM, Vasily Averin <vvs@virtuozzo.com> wrote:
>>>
>>> On 22.02.2022 19:32, Shakeel Butt wrote:
>>>> If you are just interested in the stats, you can use SLAB for your experiments.
>>>
>>> Unfortunately memcg_slabinfo.py does not support SLAB right now.
>>>
>>>> On 23.02.2022 20:31, Vlastimil Babka wrote:
>>>>> On 2/23/22 04:45, Hyeonggon Yoo wrote:
>>>>> On Wed, Feb 23, 2022 at 01:32:36AM +0100, Vlastimil Babka wrote:
>>>>>> Hm it would be easier just to disable merging when the precise counters are
>>>>>> enabled. Assume it would be a config option (possibly boot-time option with
>>>>>> static keys) anyway so those who don't need them can avoid the overhead.
>>>>>
>>>>> Is it possible to accurately account objects in SLUB? I think it's not
>>>>> easy because a CPU can free objects to remote cpu's partial slabs using
>>>>> cmpxchg_double()...
>>>> AFAIU Roman's idea would be that each alloc/free would simply inc/dec an
>>>> object counter that's disconnected from physical handling of particular sl*b
>>>> implementation. It would provide exact count of objects from the perspective
>>>> of slab users.
>>>> I assume for reduced overhead the counters would be implemented in a percpu
>>>> fashion as e.g. vmstats. Slabinfo gathering would thus have to e.g. sum up
>>>> those percpu counters.
>>>
>>> I like this idea too and I'm going to spend some time on implementing it.
>>
>> Sounds good!
>>
>> Unfortunately it’s quite tricky: the problem is that there is potentially a large and dynamic set of cgroups and also a large and dynamic set of slab caches. Given the performance considerations, it’s also unlikely that we can avoid using percpu variables.
>> So we come to (nr_slab_caches * nr_cgroups * nr_cpus) “objects”. If we create them proactively, we’re likely wasting a lot of memory. Creating them on demand is tricky too (especially without losing some accounting accuracy).
> 
> I was talking about global (i.e. non-memcg) precise slab counters only.
> I expect this can be done under a new config option and/or static key, and the counters, if present, can be used in the /proc/slabinfo output.
> 
> At present I'm still going to extract memcg counters via your memcg_slabinfo script.
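
For a sense of scale, the product quoted above can be worked through in a few lines of C. This is only a back-of-the-envelope sketch: counters_footprint() is a made-up helper, and the figures used below (200 caches, 1000 cgroups, 64 CPUs, 4-byte counters) are illustrative assumptions, not measurements from any system.

```c
#include <stddef.h>

/*
 * Hypothetical helper: bytes needed when one counter exists per
 * (slab cache, cgroup, CPU) triple. Pass nr_cgroups = 1 to model
 * global (non-memcg) counters only.
 */
static size_t counters_footprint(size_t nr_slab_caches, size_t nr_cgroups,
				 size_t nr_cpus, size_t counter_size)
{
	return nr_slab_caches * nr_cgroups * nr_cpus * counter_size;
}
```

With those assumed figures, counters_footprint(200, 1000, 64, 4) comes to 51,200,000 bytes, while the global-only case counters_footprint(200, 1, 64, 4) is 51,200 bytes. That gap is why the global variant looks much cheaper to try first.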

I'm not sure I'll be able to debug this patch properly, so I decided to submit it as is.
I hope it can be useful.

In general it works and /proc/slabinfo shows reasonable numbers;
however, in some cases they differ from the crash "kmem -s" output by +1 or -1.
Obviously I missed something.
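
The counting scheme used by the patch can be modelled in userspace roughly as follows. This is a sketch, not kernel code: NR_CPUS, model_alloc(), model_free(), model_sum() and model_demo() are made-up names, and a plain array stands in for the per-CPU data. The point it illustrates is that an object allocated on one CPU and freed on another pushes the freeing CPU's counter below zero, yet because the counters are unsigned and wrap, the sum over all CPUs still equals the number of live objects.

```c
#include <assert.h>

#define NR_CPUS 4	/* hypothetical CPU count for this model */

/* Stands in for the per-CPU "inuse" field added to struct kmem_cache_cpu. */
static unsigned int inuse[NR_CPUS];

/* Allocation path: the allocating CPU bumps its own counter. */
static void model_alloc(int cpu)
{
	inuse[cpu]++;
}

/* Free path: the freeing CPU subtracts, even if another CPU allocated. */
static void model_free(int cpu, unsigned int cnt)
{
	inuse[cpu] -= cnt;
}

/* Reader side: sum the per-CPU values, as the get_slabinfo() hunk does. */
static unsigned int model_sum(void)
{
	unsigned int cpu, sum = 0;

	for (cpu = 0; cpu < NR_CPUS; cpu++)
		sum += inuse[cpu];
	return sum;
}

/* Three allocations on CPU 0, two frees on CPU 1: one object stays live. */
static unsigned int model_demo(void)
{
	model_alloc(0);
	model_alloc(0);
	model_alloc(0);
	model_free(1, 2);
	assert(inuse[1] == (unsigned int)-2);	/* wrapped on purpose */
	return model_sum();
}
```

The model says nothing about where the +1/-1 difference comes from; it only shows that cross-CPU frees by themselves do not break the sum.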

---[cut here]---
[PATCH RFC] slub: precise in-use counter for /proc/slabinfo output

Signed-off-by: Vasily Averin <vvs@virtuozzo.com>
---
  include/linux/slub_def.h |  3 +++
  init/Kconfig             |  7 +++++++
  mm/slub.c                | 20 +++++++++++++++++++-
  3 files changed, 29 insertions(+), 1 deletion(-)

diff --git a/include/linux/slub_def.h b/include/linux/slub_def.h
index 33c5c0e3bd8d..d22e18dfe905 100644
--- a/include/linux/slub_def.h
+++ b/include/linux/slub_def.h
@@ -56,6 +56,9 @@ struct kmem_cache_cpu {
  #ifdef CONFIG_SLUB_STATS
  	unsigned stat[NR_SLUB_STAT_ITEMS];
  #endif
+#ifdef CONFIG_SLUB_PRECISE_INUSE
+	unsigned inuse;		/* Precise in-use counter */
+#endif
  };
  
  #ifdef CONFIG_SLUB_CPU_PARTIAL
diff --git a/init/Kconfig b/init/Kconfig
index e9119bf54b1f..5c57bdbb8938 100644
--- a/init/Kconfig
+++ b/init/Kconfig
@@ -1995,6 +1995,13 @@ config SLUB_CPU_PARTIAL
  	  which requires the taking of locks that may cause latency spikes.
  	  Typically one would choose no for a realtime system.
  
+config SLUB_PRECISE_INUSE
+	default n
+	depends on SLUB && SMP
+	bool "SLUB precise in-use counter"
+	help
+	  Per cpu in-use counter shows precise statistic in slabinfo.
+
  config MMAP_ALLOW_UNINITIALIZED
  	bool "Allow mmapped anonymous memory to be uninitialized"
  	depends on EXPERT && !MMU
diff --git a/mm/slub.c b/mm/slub.c
index 261474092e43..90750cae0af9 100644
--- a/mm/slub.c
+++ b/mm/slub.c
@@ -3228,6 +3228,9 @@ static __always_inline void *slab_alloc_node(struct kmem_cache *s,
  
  out:
  	slab_post_alloc_hook(s, objcg, gfpflags, 1, &object, init);
+#ifdef CONFIG_SLUB_PRECISE_INUSE
+	raw_cpu_inc(s->cpu_slab->inuse);
+#endif
  
  	return object;
  }
@@ -3506,8 +3509,12 @@ static __always_inline void slab_free(struct kmem_cache *s, struct slab *slab,
  	 * With KASAN enabled slab_free_freelist_hook modifies the freelist
  	 * to remove objects, whose reuse must be delayed.
  	 */
-	if (slab_free_freelist_hook(s, &head, &tail, &cnt))
+	if (slab_free_freelist_hook(s, &head, &tail, &cnt)) {
  		do_slab_free(s, slab, head, tail, cnt, addr);
+#ifdef CONFIG_SLUB_PRECISE_INUSE
+		raw_cpu_sub(s->cpu_slab->inuse, cnt);
+#endif
+	}
  }
  
  #ifdef CONFIG_KASAN_GENERIC
@@ -6253,6 +6260,17 @@ void get_slabinfo(struct kmem_cache *s, struct slabinfo *sinfo)
  		nr_free += count_partial(n, count_free);
  	}
  
+#ifdef CONFIG_SLUB_PRECISE_INUSE
+	{
+		unsigned int cpu, nr_inuse = 0;
+
+		for_each_possible_cpu(cpu)
+			nr_inuse += per_cpu_ptr((s)->cpu_slab, cpu)->inuse;
+
+		if (nr_inuse <= nr_objs)
+			nr_free = nr_objs - nr_inuse;
+	}
+#endif
  	sinfo->active_objs = nr_objs - nr_free;
  	sinfo->num_objs = nr_objs;
  	sinfo->active_slabs = nr_slabs;
-- 
2.25.1
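
The reader-side logic in the get_slabinfo() hunk can be restated as a small standalone function. This is a userspace sketch with made-up names (model_active_objs() and its parameters), not the kernel code itself. It shows that the summed counter is only trusted when it does not exceed nr_objs; otherwise the existing count_partial()-based estimate of free objects is kept.

```c
/*
 * Hypothetical model of the patched get_slabinfo() tail:
 *   nr_objs          - total objects in all slabs of the cache
 *   nr_free_estimate - free objects as counted from the partial lists
 *   nr_inuse         - sum of the per-CPU precise counters
 * Returns the value that would be reported as active_objs.
 */
static unsigned long model_active_objs(unsigned long nr_objs,
				       unsigned long nr_free_estimate,
				       unsigned int nr_inuse)
{
	unsigned long nr_free = nr_free_estimate;

	/* Use the precise counter only when it is plausible. */
	if (nr_inuse <= nr_objs)
		nr_free = nr_objs - nr_inuse;

	return nr_objs - nr_free;
}
```

For example, with 100 objects, an estimate of 40 free and a summed counter of 70, the function reports 70 active objects. If the sum is implausibly large (say, a transiently wrapped value), it falls back to 100 - 40 = 60.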

