Date: Mon, 28 Feb 2022 10:43:34 +0000
From: Hyeonggon Yoo <42.hyeyoo@gmail.com>
To: Vasily Averin
Cc: Roman Gushchin, Vlastimil Babka, Christoph Lameter, David Rientjes, Joonsoo Kim, Pekka Enberg, Linux MM, Andrew Morton, kernel@openvz.org
Subject: Re: slabinfo shows incorrect active_objs ???
References: <4BC89091-F314-4785-BCBB-189CE42B0192@linux.dev> <1c73adc1-f780-56ac-4c67-490670a27951@virtuozzo.com> <2a7d3c8a-ad92-0ffe-4374-f0bb7e029a74@virtuozzo.com>
In-Reply-To: <2a7d3c8a-ad92-0ffe-4374-f0bb7e029a74@virtuozzo.com>

On Mon, Feb 28, 2022 at 09:17:27AM +0300, Vasily Averin wrote:
> On 25.02.2022 07:37, Vasily Averin wrote:
> > On 25.02.2022 03:08, Roman Gushchin wrote:
> > >
> > > > On Feb 24, 2022, at 5:17 AM, Vasily Averin wrote:
> > > >
> > > > On 22.02.2022 19:32, Shakeel Butt wrote:
> > > > > If you are just interested in the stats, you can use SLAB for your experiments.
> > > >
> > > > Unfortunately memcg_slabino.py does not support SLAB right now.
> > > >
> > > > > On 23.02.2022 20:31, Vlastimil Babka wrote:
> > > > > > On 2/23/22 04:45, Hyeonggon Yoo wrote:
> > > > > > On Wed, Feb 23, 2022 at 01:32:36AM +0100, Vlastimil Babka wrote:
> > > > > > > Hm it would be easier just to disable merging when the precise counters are
> > > > > > > enabled. Assume it would be a config option (possibly boot-time option with
> > > > > > > static keys) anyway so those who don't need them can avoid the overhead.
> > > > > >
> > > > > > Is it possible to accurately account objects in SLUB? I think it's not
> > > > > > easy because a CPU can free objects to remote cpu's partial slabs using
> > > > > > cmpxchg_double()...
> > > > > AFAIU Roman's idea would be that each alloc/free would simply inc/dec an
> > > > > object counter that's disconnected from physical handling of particular sl*b
> > > > > implementation. It would provide exact count of objects from the perspective
> > > > > of slab users.
> > > > > I assume for reduced overhead the counters would be implemented in a percpu
> > > > > fashion as e.g. vmstats. Slabinfo gathering would thus have to e.g. sum up
> > > > > those percpu counters.
> > > >
> > > > I like this idea too and I'm going to spend some time for its implementation.
> > >
> > > Sounds good!
> > >
> > > Unfortunately it’s quite tricky: the problem is that there is potentially a large and dynamic set of cgroups and also large and dynamic set of slab caches. Given the performance considerations, it’s also unlikely to avoid using percpu variables.
> > > So we come to the (nr_slab_caches * nr_cgroups * nr_cpus) number of “objects”. If we create them proactively, we’re likely wasting lot of memory. Creating them on demand is tricky too (especially without losing some accounting accuracy).
> >
> > I told about global (i.e. non-memcg) precise slab counters only.
> > I'm expect it can done under new config option and/or static key, and if present use them in /proc/slabinfo output.
> >
> > At present I'm still going to extract memcg counters via your memcg_slabinfo script.
>
> I'm not sure I'll be able to debug this patch properly and decided to submit it as is.
> I hope it can be useful.
>
> In general it works and /proc/slabinfo shows reasonable numbers,
> however in some cases they differs from crash' "kmem -s" output, either +1 or -1.
> Obviously I missed something.
>
> ---[cut here]---
> [PATCH RFC] slub: precise in-use counter for /proc/slabinfo output
>
> Signed-off-by: Vasily Averin
> ---
>  include/linux/slub_def.h |  3 +++
>  init/Kconfig             |  7 +++++++
>  mm/slub.c                | 20 +++++++++++++++++++-
>  3 files changed, 29 insertions(+), 1 deletion(-)
>
> diff --git a/include/linux/slub_def.h b/include/linux/slub_def.h
> index 33c5c0e3bd8d..d22e18dfe905 100644
> --- a/include/linux/slub_def.h
> +++ b/include/linux/slub_def.h
> @@ -56,6 +56,9 @@ struct kmem_cache_cpu {
>  #ifdef CONFIG_SLUB_STATS
>  	unsigned stat[NR_SLUB_STAT_ITEMS];
>  #endif
> +#ifdef CONFIG_SLUB_PRECISE_INUSE
> +	unsigned inuse;		/* Precise in-use counter */
> +#endif
>  };
>  #ifdef CONFIG_SLUB_CPU_PARTIAL
> diff --git a/init/Kconfig b/init/Kconfig
> index e9119bf54b1f..5c57bdbb8938 100644
> --- a/init/Kconfig
> +++ b/init/Kconfig
> @@ -1995,6 +1995,13 @@ config SLUB_CPU_PARTIAL
>  	  which requires the taking of locks that may cause latency spikes.
>  	  Typically one would choose no for a realtime system.
> +config SLUB_PRECISE_INUSE
> +	default n
> +	depends on SLUB && SMP
> +	bool "SLUB precise in-use counter"
> +	help
> +	  Per cpu in-use counter shows precise statistic in slabinfo.
> +
>  config MMAP_ALLOW_UNINITIALIZED
>  	bool "Allow mmapped anonymous memory to be uninitialized"
>  	depends on EXPERT && !MMU
> diff --git a/mm/slub.c b/mm/slub.c
> index 261474092e43..90750cae0af9 100644
> --- a/mm/slub.c
> +++ b/mm/slub.c
> @@ -3228,6 +3228,9 @@ static __always_inline void *slab_alloc_node(struct kmem_cache *s,
>  out:
>  	slab_post_alloc_hook(s, objcg, gfpflags, 1, &object, init);
> +#ifdef CONFIG_SLUB_PRECISE_INUSE
> +	raw_cpu_inc(s->cpu_slab->inuse);
> +#endif

I think this is the wrong place to increment s->cpu_slab->inuse.
I thought s->cpu_slab->inuse was meant to count the in-use objects of the
current cpu slab, isn't it?

If so, you need to make sure that the allocation was actually served from
the cpu slab. Let me know if I'm missing something...
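
To make the two possible readings concrete, here is a minimal userspace sketch (plain C, not kernel code). It assumes the other reading, where inuse is just one CPU's share of a cache-wide counter, as in the percpu scheme quoted earlier in this thread. The names NR_CPUS_DEMO, count_alloc(), count_free(), total_inuse() and run_example() are hypothetical and exist only for this illustration; the array index stands in for "the CPU that executed the alloc or free".

```c
#include <assert.h>

#define NR_CPUS_DEMO 4	/* hypothetical CPU count for this sketch */

/* Stand-in for the per-CPU "inuse" field; one slot per CPU. */
static unsigned int inuse[NR_CPUS_DEMO];

/* Account one allocation on the CPU that performed it. */
static void count_alloc(int cpu)
{
	assert(cpu >= 0 && cpu < NR_CPUS_DEMO);
	inuse[cpu]++;
}

/* Account cnt frees on the CPU that performed them, which may not be
 * the CPU that allocated the objects. */
static void count_free(int cpu, unsigned int cnt)
{
	assert(cpu >= 0 && cpu < NR_CPUS_DEMO);
	inuse[cpu] -= cnt;	/* may wrap below zero; that is fine, see below */
}

/* Sum all slots. Unsigned arithmetic wraps modulo 2^N, so the sum is
 * exact even when an individual slot has wrapped. */
static unsigned int total_inuse(void)
{
	unsigned int sum = 0;
	int cpu;

	for (cpu = 0; cpu < NR_CPUS_DEMO; cpu++)
		sum += inuse[cpu];
	return sum;
}

/* Three objects allocated on CPU 0, two of them freed on CPU 1. */
static unsigned int run_example(void)
{
	int cpu;

	for (cpu = 0; cpu < NR_CPUS_DEMO; cpu++)
		inuse[cpu] = 0;

	count_alloc(0);
	count_alloc(0);
	count_alloc(0);
	count_free(1, 2);

	return total_inuse();
}
```

Calling run_example() returns 1, the number of objects still allocated, while the slot for CPU 1 holds a wrapped value that says nothing about any cpu slab. So under this reading the field only has meaning as a sum over all CPUs, and the sketch does not model preemption or interrupts hitting a non-atomic update, which would be a separate question for raw_cpu_inc()/raw_cpu_sub().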
>  	return object;
>  }
> @@ -3506,8 +3509,12 @@ static __always_inline void slab_free(struct kmem_cache *s, struct slab *slab,
>  	 * With KASAN enabled slab_free_freelist_hook modifies the freelist
>  	 * to remove objects, whose reuse must be delayed.
>  	 */
> -	if (slab_free_freelist_hook(s, &head, &tail, &cnt))
> +	if (slab_free_freelist_hook(s, &head, &tail, &cnt)) {
>  		do_slab_free(s, slab, head, tail, cnt, addr);
> +#ifdef CONFIG_SLUB_PRECISE_INUSE
> +		raw_cpu_sub(s->cpu_slab->inuse, cnt);
> +#endif
> +	}

Same here.

>  }
>  #ifdef CONFIG_KASAN_GENERIC
> @@ -6253,6 +6260,17 @@ void get_slabinfo(struct kmem_cache *s, struct slabinfo *sinfo)
>  		nr_free += count_partial(n, count_free);
>  	}
> +#ifdef CONFIG_SLUB_PRECISE_INUSE
> +	{
> +		unsigned int cpu, nr_inuse = 0;
> +
> +		for_each_possible_cpu(cpu)
> +			nr_inuse += per_cpu_ptr((s)->cpu_slab, cpu)->inuse;
> +
> +		if (nr_inuse <= nr_objs)
> +			nr_free = nr_objs - nr_inuse;
> +	}
> +#endif
>  	sinfo->active_objs = nr_objs - nr_free;
>  	sinfo->num_objs = nr_objs;
>  	sinfo->active_slabs = nr_slabs;
> --
> 2.25.1

--
Thank you, You are awesome!
Hyeonggon :-)