From: Vasily Averin
To: Roman Gushchin
Cc: Vlastimil Babka, Hyeonggon Yoo <42.hyeyoo@gmail.com>, Christoph Lameter, David Rientjes, Joonsoo Kim, Pekka Enberg, Linux MM, Andrew Morton, kernel@openvz.org
Subject: Re: slabinfo shows incorrect active_objs ???
Date: Mon, 28 Feb 2022 09:17:27 +0300
Message-ID: <2a7d3c8a-ad92-0ffe-4374-f0bb7e029a74@virtuozzo.com>
In-Reply-To: <1c73adc1-f780-56ac-4c67-490670a27951@virtuozzo.com>
References: <4BC89091-F314-4785-BCBB-189CE42B0192@linux.dev> <1c73adc1-f780-56ac-4c67-490670a27951@virtuozzo.com>

On 25.02.2022 07:37, Vasily Averin wrote:
> On 25.02.2022 03:08, Roman Gushchin wrote:
>>
>>> On Feb 24, 2022, at 5:17 AM, Vasily Averin wrote:
>>>
>>> On 22.02.2022 19:32, Shakeel Butt wrote:
>>>> If you are just interested in the stats, you can use SLAB for your experiments.
>>>
>>> Unfortunately memcg_slabinfo.py does not support SLAB right now.
>>>
>>>> On 23.02.2022 20:31, Vlastimil Babka wrote:
>>>>> On 2/23/22 04:45, Hyeonggon Yoo wrote:
>>>>> On Wed, Feb 23, 2022 at 01:32:36AM +0100, Vlastimil Babka wrote:
>>>>>> Hm it would be easier just to disable merging when the precise counters are
>>>>>> enabled. Assume it would be a config option (possibly boot-time option with
>>>>>> static keys) anyway so those who don't need them can avoid the overhead.
>>>>>
>>>>> Is it possible to accurately account objects in SLUB? I think it's not
>>>>> easy because a CPU can free objects to remote cpu's partial slabs using
>>>>> cmpxchg_double()...
>>>> AFAIU Roman's idea would be that each alloc/free would simply inc/dec an
>>>> object counter that's disconnected from physical handling of particular sl*b
>>>> implementation. It would provide exact count of objects from the perspective
>>>> of slab users.
>>>> I assume for reduced overhead the counters would be implemented in a percpu
>>>> fashion as e.g. vmstats. Slabinfo gathering would thus have to e.g. sum up
>>>> those percpu counters.
>>>
>>> I like this idea too and I'm going to spend some time on its implementation.
>>
>> Sounds good!
>>
>> Unfortunately it’s quite tricky: the problem is that there is potentially a large and dynamic set of cgroups and also a large and dynamic set of slab caches. Given the performance considerations, we’re also unlikely to avoid using percpu variables.
>> So we come to the (nr_slab_caches * nr_cgroups * nr_cpus) number of “objects”. If we create them proactively, we’re likely wasting a lot of memory. Creating them on demand is tricky too (especially without losing some accounting accuracy).
>
> I was talking about global (i.e. non-memcg) precise slab counters only.
> I expect it can be done under a new config option and/or static key, and, if present, the counters can be used in the /proc/slabinfo output.
>
> At present I'm still going to extract memcg counters via your memcg_slabinfo script.

I'm not sure I'll be able to debug this patch properly, so I decided to submit it as is.
I hope it can be useful.
In general it works and /proc/slabinfo shows reasonable numbers;
however, in some cases they differ from the crash utility's "kmem -s" output by +1 or -1.
Obviously I missed something.
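
To illustrate the per-CPU counting idea from the quoted discussion, here is a minimal userspace sketch in C. It is not kernel code: the fixed NR_CPUS value, the plain array standing in for a percpu variable, and the helper names count_alloc(), count_free(), total_inuse() and demo_remote_free() are all assumptions made for the example. It shows that a counter bumped on the allocating CPU and decremented on a different, freeing CPU still sums to the right total, because unsigned wraparound cancels out in the sum.

```c
#include <assert.h>

#define NR_CPUS 4	/* hypothetical CPU count, chosen for the sketch */

/* One counter per CPU; a plain array stands in for a percpu variable. */
static unsigned int inuse[NR_CPUS];

/* Called on the allocating CPU: touch only that CPU's counter. */
static void count_alloc(int cpu)
{
	assert(cpu >= 0 && cpu < NR_CPUS);
	inuse[cpu]++;
}

/* Called on the freeing CPU, which may differ from the allocating one. */
static void count_free(int cpu, unsigned int cnt)
{
	assert(cpu >= 0 && cpu < NR_CPUS);
	inuse[cpu] -= cnt;	/* may wrap below zero; unsigned wrap is well defined */
}

/* Reader side: sum every per-CPU value; wrapped values cancel modulo 2^32. */
static unsigned int total_inuse(void)
{
	unsigned int sum = 0;

	for (int cpu = 0; cpu < NR_CPUS; cpu++)
		sum += inuse[cpu];
	return sum;
}

/* Three objects allocated on CPU 0, two of them freed on CPU 1. */
static unsigned int demo_remote_free(void)
{
	count_alloc(0);
	count_alloc(0);
	count_alloc(0);
	count_free(1, 2);	/* CPU 1's counter wraps to UINT_MAX - 1 */
	return total_inuse();	/* 3 + (UINT_MAX - 1) == 1 modulo 2^32 */
}
```

In this sketch an individual per-CPU value is meaningless on its own; only the sum over all CPUs is. The sketch also ignores concurrency entirely, so it says nothing about the +1/-1 differences mentioned above.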
---[cut here]---
[PATCH RFC] slub: precise in-use counter for /proc/slabinfo output

Signed-off-by: Vasily Averin
---
 include/linux/slub_def.h |  3 +++
 init/Kconfig             |  7 +++++++
 mm/slub.c                | 20 +++++++++++++++++++-
 3 files changed, 29 insertions(+), 1 deletion(-)

diff --git a/include/linux/slub_def.h b/include/linux/slub_def.h
index 33c5c0e3bd8d..d22e18dfe905 100644
--- a/include/linux/slub_def.h
+++ b/include/linux/slub_def.h
@@ -56,6 +56,9 @@ struct kmem_cache_cpu {
 #ifdef CONFIG_SLUB_STATS
 	unsigned stat[NR_SLUB_STAT_ITEMS];
 #endif
+#ifdef CONFIG_SLUB_PRECISE_INUSE
+	unsigned inuse;		/* Precise in-use counter */
+#endif
 };
 
 #ifdef CONFIG_SLUB_CPU_PARTIAL
diff --git a/init/Kconfig b/init/Kconfig
index e9119bf54b1f..5c57bdbb8938 100644
--- a/init/Kconfig
+++ b/init/Kconfig
@@ -1995,6 +1995,13 @@ config SLUB_CPU_PARTIAL
 	  which requires the taking of locks that may cause latency spikes.
 	  Typically one would choose no for a realtime system.
 
+config SLUB_PRECISE_INUSE
+	default n
+	depends on SLUB && SMP
+	bool "SLUB precise in-use counter"
+	help
+	  Per cpu in-use counter shows precise statistic in slabinfo.
+
 config MMAP_ALLOW_UNINITIALIZED
 	bool "Allow mmapped anonymous memory to be uninitialized"
 	depends on EXPERT && !MMU
diff --git a/mm/slub.c b/mm/slub.c
index 261474092e43..90750cae0af9 100644
--- a/mm/slub.c
+++ b/mm/slub.c
@@ -3228,6 +3228,9 @@ static __always_inline void *slab_alloc_node(struct kmem_cache *s,
 out:
 	slab_post_alloc_hook(s, objcg, gfpflags, 1, &object, init);
 
+#ifdef CONFIG_SLUB_PRECISE_INUSE
+	raw_cpu_inc(s->cpu_slab->inuse);
+#endif
 	return object;
 }
 
@@ -3506,8 +3509,12 @@ static __always_inline void slab_free(struct kmem_cache *s, struct slab *slab,
 	 * With KASAN enabled slab_free_freelist_hook modifies the freelist
 	 * to remove objects, whose reuse must be delayed.
 	 */
-	if (slab_free_freelist_hook(s, &head, &tail, &cnt))
+	if (slab_free_freelist_hook(s, &head, &tail, &cnt)) {
 		do_slab_free(s, slab, head, tail, cnt, addr);
+#ifdef CONFIG_SLUB_PRECISE_INUSE
+		raw_cpu_sub(s->cpu_slab->inuse, cnt);
+#endif
+	}
 }
 
 #ifdef CONFIG_KASAN_GENERIC
@@ -6253,6 +6260,17 @@ void get_slabinfo(struct kmem_cache *s, struct slabinfo *sinfo)
 		nr_free += count_partial(n, count_free);
 	}
 
+#ifdef CONFIG_SLUB_PRECISE_INUSE
+	{
+		unsigned int cpu, nr_inuse = 0;
+
+		for_each_possible_cpu(cpu)
+			nr_inuse += per_cpu_ptr((s)->cpu_slab, cpu)->inuse;
+
+		if (nr_inuse <= nr_objs)
+			nr_free = nr_objs - nr_inuse;
+	}
+#endif
 	sinfo->active_objs = nr_objs - nr_free;
 	sinfo->num_objs = nr_objs;
 	sinfo->active_slabs = nr_slabs;
-- 
2.25.1
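
The read side of the get_slabinfo() hunk can be restated as a small userspace C function. This is only a sketch under assumptions: sketch_active_objs() and its parameter names are hypothetical, and reading the `nr_inuse <= nr_objs` test as a sanity fallback is one plausible interpretation rather than something the patch states. The function returns the summed per-CPU counter when it is plausible, and otherwise the value derived from the partial-list estimate.

```c
#include <assert.h>

/*
 * Hypothetical helper mirroring the check in the get_slabinfo() hunk.
 * nr_objs:          total objects the cache's slabs can hold
 * nr_free_estimate: free objects counted from the partial lists
 * nr_inuse:         sum of the per-CPU in-use counters
 */
static unsigned long sketch_active_objs(unsigned long nr_objs,
					unsigned long nr_free_estimate,
					unsigned int nr_inuse)
{
	unsigned long nr_free = nr_free_estimate;

	assert(nr_free_estimate <= nr_objs);

	/* Use the summed counter only if it does not exceed nr_objs. */
	if (nr_inuse <= nr_objs)
		nr_free = nr_objs - nr_inuse;

	return nr_objs - nr_free;
}
```

For example, with 100 objects and a partial-list estimate of 10 free, a summed counter of 42 yields 42 active objects, while an implausible sum such as 4294967295 leaves the estimate in place and yields 90.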