From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by smtp.lore.kernel.org (Postfix) with ESMTP id D0CBDC54749 for ; Wed, 28 Aug 2024 02:36:51 +0000 (UTC) Received: by kanga.kvack.org (Postfix) id 5D3BB6B0089; Tue, 27 Aug 2024 22:36:51 -0400 (EDT) Received: by kanga.kvack.org (Postfix, from userid 40) id 5848A6B008A; Tue, 27 Aug 2024 22:36:51 -0400 (EDT) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id 4251F6B008C; Tue, 27 Aug 2024 22:36:51 -0400 (EDT) X-Delivered-To: linux-mm@kvack.org Received: from relay.hostedemail.com (smtprelay0014.hostedemail.com [216.40.44.14]) by kanga.kvack.org (Postfix) with ESMTP id 22C0B6B0089 for ; Tue, 27 Aug 2024 22:36:51 -0400 (EDT) Received: from smtpin19.hostedemail.com (a10.router.float.18 [10.200.18.1]) by unirelay06.hostedemail.com (Postfix) with ESMTP id 9E738AB028 for ; Wed, 28 Aug 2024 02:36:50 +0000 (UTC) X-FDA: 82500091380.19.95CE674 Received: from out-186.mta1.migadu.com (out-186.mta1.migadu.com [95.215.58.186]) by imf15.hostedemail.com (Postfix) with ESMTP id 912EBA0009 for ; Wed, 28 Aug 2024 02:36:48 +0000 (UTC) Authentication-Results: imf15.hostedemail.com; dkim=pass header.d=linux.dev header.s=key1 header.b=YKjwqr+M; dmarc=pass (policy=none) header.from=linux.dev; spf=pass (imf15.hostedemail.com: domain of muchun.song@linux.dev designates 95.215.58.186 as permitted sender) smtp.mailfrom=muchun.song@linux.dev ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=hostedemail.com; s=arc-20220608; t=1724812511; h=from:from:sender:reply-to:subject:subject:date:date: message-id:message-id:to:to:cc:cc:mime-version:mime-version: content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references:dkim-signature; bh=d8P34oY8EX3iB+WHKFWcxCiPxWsH5mZsU2qtXZYBDx8=; b=d3tmOjisB1NTr8N0bIG0e3ps9w3LI0dMXtvzim704rwvRkKpV/3HVsk9NJcq5IY8HbSYHX LGHv2gwqt3iOBzRBSdtWnLSByZSvdXcammOi8Qpi52+X/l+bz1CyJbpQOBEz273mvtlWRw YuzVy1HPrHHKORDO65qzBx7Upb5gNiE= ARC-Seal: i=1; s=arc-20220608; d=hostedemail.com; t=1724812511; a=rsa-sha256; cv=none; b=2Pqu1XNPxszudussdwiYTE+g/wW1pHrWNiX4ZFo0SKNfU2Xdbk7/TEgdGoUuT0dwiB+01j pOwXFZIDSFJQLaO/fhUHVdGqU1T+FJ5fgVAl7Ffi1jQ2PBNhbnqhQs1WRTCYGG9c0OZEeQ 7rfKvRtwQzXPmB/tv0/XC6TnaFlwPUs= ARC-Authentication-Results: i=1; imf15.hostedemail.com; dkim=pass header.d=linux.dev header.s=key1 header.b=YKjwqr+M; dmarc=pass (policy=none) header.from=linux.dev; spf=pass (imf15.hostedemail.com: domain of muchun.song@linux.dev designates 95.215.58.186 as permitted sender) smtp.mailfrom=muchun.song@linux.dev Content-Type: text/plain; charset=us-ascii DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linux.dev; s=key1; t=1724812606; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version:content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=d8P34oY8EX3iB+WHKFWcxCiPxWsH5mZsU2qtXZYBDx8=; b=YKjwqr+Ms2AzytPHOJSEy7hIUuE3z3kiC2SNH71agCNDB0qczG2rbXsaOUosiWH2XcN8gS I9g2bFPumA0+f+icndZTriMglTW6AY6PWXaBG/9DaxualYBzxxEeSC72Nhi74bDX5wvc93 FaxJlji4d14otrl1zaw7d0LWgP6Sl+8= Mime-Version: 1.0 (Mac OS X Mail 16.0 \(3776.700.51\)) Subject: Re: [PATCH v1] memcg: add charging of already allocated slab objects X-Report-Abuse: Please report any abuse attempt to abuse@migadu.com and include these headers. From: Muchun Song In-Reply-To: Date: Wed, 28 Aug 2024 10:36:06 +0800 Cc: Roman Gushchin , Andrew Morton , Johannes Weiner , Michal Hocko , Vlastimil Babka , David Rientjes , Hyeonggon Yoo <42.hyeyoo@gmail.com>, Eric Dumazet , "David S . Miller" , Jakub Kicinski , Paolo Abeni , Linux Memory Management List , LKML , Meta kernel team , cgroups@vger.kernel.org, netdev Content-Transfer-Encoding: quoted-printable Message-Id: References: <20240826232908.4076417-1-shakeel.butt@linux.dev> To: Shakeel Butt X-Migadu-Flow: FLOW_OUT X-Rspamd-Server: rspam07 X-Rspamd-Queue-Id: 912EBA0009 X-Stat-Signature: 6mqnfp7wb4kjkbm13zeq8kfkb1zuktem X-Rspam-User: X-HE-Tag: 1724812608-778011 X-HE-Meta: U2FsdGVkX1/2N2PRGUff888b3q8b1H/CkTFDAt3tIAxtLzS7U7OoeQs+5ztKySRCBEap/9z0yqcJ9FnOhR9iqY3l2P1jA6/nTuInRdJ75Ui73rxtarcFMFAr6gQW4Bc+fy8+OUaTHuvKCGxBKMDcIOZbRWm6jAcmxqmXgqsot4Ciy9LZjQLTGekZSqpy8xkhUFKIL89Ef27wEZjyOJzE/FCYp/HZUtjfoLaoXxCajKHNc25XOSpIB1lEFo6thgtETzSCLlZvnjxIB2FxW7/upMP6zBclwmGsN3bxMjYzQ+SnZhISuIbKr0yJ8tTNkN2agrMYGD9uxsetFZJSeul+erbSz+ZsNdZ58Cal5re9xjIJfKM6aT6xac+uLabZuujlwwUJmryJnfmymull+yMxavUxVqwvuZ1OeUtqe+gm7sHdCVYBB9gkniepwPhpGOvIs9pQTki4hSiIi9XmgfHc7ulnRClMvMObrWG2fZVT7DB0uuOrHJLG7RrQU2JLauHCUAL2+FrDGKEhuEZPP6c8Pz/MDAzZ8jC18VKB3NdcrVz7OJx40WJ3OMI0kY0p7b6nRAetpgbtV2zzKN72ePY85hLCGQnk6pjNJ1WdyuYGQTTETuhhQ4EcP+7loan7D3kZ9xcPHBRnSNBUCWdY3DCRcXelX2yOj5bW8srKBI/wOJSa9XDZKEeDhOqJ6YozEw23Dopc9tLKh/hZ8TtJUGUS9OFh+5AKHvYmapkYnMWb1pIXK5EXe/7MuHLIkIE4PJyqcDMIeRbZomSaWJgfPJsn5Y6zALSc/g8k1rMKSYUkLTrcd6xp8F3eF4G0aTv8L0eEC49zGSurC89ios4t/alYHo5vjZBWbi0HuK2mKJN8EF2rmRmOjGPv7vbfqT3tdYNe4Dicf4r86yiyykEh+Tq9675eOH9+OmLPanYwMgOOvS9M+gxbi5G6P+BmXh2cTzeOPRfudss11qqYrC9+CGl o2ww1VVC wdc4Ew145llIKGzJ4+h9F4QcUFt7Iu0MrwXBoCAVA/JH+8mRp0vyF/4JJ+ZTsawX1bqIiRMX0v23rDOOvG/yZQ5RdzeyYTcwj6k+Eg6DVLhopfhu30MTcDKQlpctJAbWsA4osYxXYgvptD9TdmWPnPhPnMbZY7ksBK5BoroDw6PCjPQZ4NAxu5U8gC4rBHQksSGE/FP/tejxGceFtIIQ8P90UKk39tWvQaGC/VL0jobq0OiU= X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: List-Subscribe: List-Unsubscribe: > On Aug 28, 2024, at 01:23, Shakeel Butt = wrote: >=20 > On Tue, Aug 27, 2024 at 03:06:32AM GMT, Roman Gushchin wrote: >> On Mon, Aug 26, 2024 at 04:29:08PM -0700, Shakeel Butt wrote: >>> At the moment, the slab objects are charged to the memcg at the >>> allocation time. However there are cases where slab objects are >>> allocated at the time where the right target memcg to charge it to = is >>> not known. One such case is the network sockets for the incoming >>> connection which are allocated in the softirq context. >>>=20 >>> Couple hundred thousand connections are very normal on large loaded >>> server and almost all of those sockets underlying those connections = get >>> allocated in the softirq context and thus not charged to any memcg. >>> However later at the accept() time we know the right target memcg to >>> charge. Let's add new API to charge already allocated objects, so we = can >>> have better accounting of the memory usage. >>>=20 >>> To measure the performance impact of this change, tcp_crr is used = from >>> the neper [1] performance suite. Basically it is a network ping pong >>> test with new connection for each ping pong. >>>=20 >>> The server and the client are run inside 3 level of cgroup hierarchy >>> using the following commands: >>>=20 >>> Server: >>> $ tcp_crr -6 >>>=20 >>> Client: >>> $ tcp_crr -6 -c -H ${server_ip} >>>=20 >>> If the client and server run on different machines with 50 GBPS NIC, >>> there is no visible impact of the change. >>>=20 >>> For the same machine experiment with v6.11-rc5 as base. >>>=20 >>> base (throughput) with-patch >>> tcp_crr 14545 (+- 80) 14463 (+- 56) >>>=20 >>> It seems like the performance impact is within the noise. >>>=20 >>> Link: https://github.com/google/neper [1] >>> Signed-off-by: Shakeel Butt >>=20 >> Hi Shakeel, >>=20 >> I like the idea and performance numbers look good. However some = comments on >> the implementation: >>=20 >=20 > Thanks for taking a look. >=20 >>> --- >>>=20 >>> Changes since the RFC: >>> - Added check for already charged slab objects. >>> - Added performance results from neper's tcp_crr >>>=20 >>> include/linux/slab.h | 1 + >>> mm/slub.c | 54 = +++++++++++++++++++++++++++++++++ >>> net/ipv4/inet_connection_sock.c | 5 +-- >>> 3 files changed, 58 insertions(+), 2 deletions(-) >>>=20 >>> diff --git a/include/linux/slab.h b/include/linux/slab.h >>> index eb2bf4629157..05cfab107c72 100644 >>> --- a/include/linux/slab.h >>> +++ b/include/linux/slab.h >>> @@ -547,6 +547,7 @@ void *kmem_cache_alloc_lru_noprof(struct = kmem_cache *s, struct list_lru *lru, >>> gfp_t gfpflags) __assume_slab_alignment __malloc; >>> #define kmem_cache_alloc_lru(...) = alloc_hooks(kmem_cache_alloc_lru_noprof(__VA_ARGS__)) >>>=20 >>> +bool kmem_cache_charge(void *objp, gfp_t gfpflags); >>> void kmem_cache_free(struct kmem_cache *s, void *objp); >>>=20 >>> kmem_buckets *kmem_buckets_create(const char *name, slab_flags_t = flags, >>> diff --git a/mm/slub.c b/mm/slub.c >>> index c9d8a2497fd6..580683597b5c 100644 >>> --- a/mm/slub.c >>> +++ b/mm/slub.c >>> @@ -2185,6 +2185,16 @@ void memcg_slab_free_hook(struct kmem_cache = *s, struct slab *slab, void **p, >>>=20 >>> __memcg_slab_free_hook(s, slab, p, objects, obj_exts); >>> } >>> + >>> +static __fastpath_inline >>> +bool memcg_slab_post_charge(struct kmem_cache *s, void *p, gfp_t = flags) >>> +{ >>> + if (likely(!memcg_kmem_online())) >>> + return true; >>=20 >> We do have this check in kmem_cache_charge(), why do we need to check = it again? >>=20 >=20 > I missed to remove this one. I am going to rearrange the code bit more > in these functions to avoid the build errors in non MEMCG builds. >=20 >>> + >>> + return __memcg_slab_post_alloc_hook(s, NULL, flags, 1, &p); >>> +} >>> + >>> #else /* CONFIG_MEMCG */ >>> static inline bool memcg_slab_post_alloc_hook(struct kmem_cache *s, >>> struct list_lru *lru, >>> @@ -2198,6 +2208,13 @@ static inline void = memcg_slab_free_hook(struct kmem_cache *s, struct slab *slab, >>> void **p, int objects) >>> { >>> } >>> + >>> +static inline bool memcg_slab_post_charge(struct kmem_cache *s, >>> + void *p, >>> + gfp_t flags) >>> +{ >>> + return true; >>> +} >>> #endif /* CONFIG_MEMCG */ >>>=20 >>> /* >>> @@ -4062,6 +4079,43 @@ void *kmem_cache_alloc_lru_noprof(struct = kmem_cache *s, struct list_lru *lru, >>> } >>> EXPORT_SYMBOL(kmem_cache_alloc_lru_noprof); >>>=20 >>> +#define KMALLOC_TYPE (SLAB_KMALLOC | SLAB_CACHE_DMA | \ >>> + SLAB_ACCOUNT | SLAB_RECLAIM_ACCOUNT) >>> + >>> +bool kmem_cache_charge(void *objp, gfp_t gfpflags) >>> +{ >>> + struct slabobj_ext *slab_exts; >>> + struct kmem_cache *s; >>> + struct folio *folio; >>> + struct slab *slab; >>> + unsigned long off; >>> + >>> + if (!memcg_kmem_online()) >>> + return true; >>> + >>> + folio =3D virt_to_folio(objp); >>> + if (unlikely(!folio_test_slab(folio))) >>> + return false; >>=20 >> Does it handle the case of a too-big-to-be-a-slab-object allocation? >> I think it's better to handle it properly. Also, why return false = here? >>=20 >=20 > Yes I will fix the too-big-to-be-a-slab-object allocations. I presume = I > should just follow the kfree() hanlding on !folio_test_slab() i.e. = that > the given object is the large or too-big-to-be-a-slab-object. Hi Shakeel, If we decide to do this, I suppose you will use memcg_kmem_charge_page to charge big-object. To be consistent, I suggest renaming = kmem_cache_charge to memcg_kmem_charge to handle both slab object and big-object. And I = saw all the functions related to object charging is moved to memcontrol.c = (e.g. __memcg_slab_post_alloc_hook), so maybe we should also do this for memcg_kmem_charge? Muhcun, Thanks.=