From: Yosry Ahmed
Date: Tue, 27 Aug 2024 17:34:24 -0700
Subject: Re: [PATCH v2] memcg: add charging of already allocated slab objects
To: Shakeel Butt
Cc: Andrew Morton, Johannes Weiner, Michal Hocko, Roman Gushchin,
 Muchun Song, Vlastimil Babka, David Rientjes, Hyeonggon Yoo
 <42.hyeyoo@gmail.com>, Eric Dumazet, "David S. Miller", Jakub Kicinski,
 Paolo Abeni, linux-mm@kvack.org, linux-kernel@vger.kernel.org,
 Meta kernel team, cgroups@vger.kernel.org, netdev@vger.kernel.org
References: <20240827235228.1591842-1-shakeel.butt@linux.dev>
In-Reply-To: <20240827235228.1591842-1-shakeel.butt@linux.dev>
Miller" , Jakub Kicinski , Paolo Abeni , linux-mm@kvack.org, linux-kernel@vger.kernel.org, Meta kernel team , cgroups@vger.kernel.org, netdev@vger.kernel.org Content-Type: text/plain; charset="UTF-8" Content-Transfer-Encoding: quoted-printable X-Stat-Signature: 9tggjheqrjgzmpkz1j9cxz97ffzs6q8m X-Rspam-User: X-Rspamd-Queue-Id: 3BBCD40014 X-Rspamd-Server: rspam02 X-HE-Tag: 1724805303-669091 X-HE-Meta: U2FsdGVkX19xefXFyro7zXrsFg+JPfDENI7iCJWe1bk7yJROxeL2sRVpJIChXNhFYwNNeGqChPQfsWH74Hu14sBkkjOZHwSMqc25pk9mvBnJBik84BFoIYaRVBrzabcN3trbo18qZUL627CDIsEKldfbXi15ftOPlH/5jswNhnborBLriJqpuMoRpP9RPcrO0lC8aJ1TBxZbDDZVHeM3l8d0EyGBiuAa82Dk8mMj3s5DOuveb0z6KXcyKY5/ITieIdgEtBypsQNUCpUFvoma0Vi00At555g0gOVzWBsOQ25fImf/ucGn2Z6qJt/I7WP94J9KDHsMzA312j4CbH6tASvya6jXRjHte3nwGrlskAb8fqFfs7wIUHvKtN6soIXkOf5w8inG6sZkjD1SaGCNwnpDONTyhQqQ+OY/AowPGQgDb6Zskp4THy4YR77rnIr9X3SJrlithEu7HmclRgqM7nvAfVTyVsh1lVPEcCW0UoIXPI80zUbEfBXEBNrQ4TyetgaBvWn60IQ1NOVL4ka64Z+DYBotqWwjbSRIBdJH/x88zBXOVL0GpJcRtm1+Dy6Kuvmk7CWs9PvIddAtUN1IYy0ocx1O5QK/wH8CB1rGZ2bsRz/hJhNnOlhkA5Dl4y6JPX38cRFmCSAAILXLg2iqKnDMhxCG7AErwa/FhiWDh87laPDSnsS+UBfhqpGBs3S37IWgQxlmVLyltqcF3dpnGRhjIWNfRxrGcqEoJkLWEFuQIulQ+5C2nlXoAG06+vO9CnQ6tti228yrc4GsYdScNszWQMJvKHLqGZ8Ej76XV5wpOasHtiWDzhVmNRT78XUkIp3rS9CWTerGBcEnuUYbYyf8hVeoRBcphAcaLOQG0XyhjLEvSUKEz7CAcqJK95KSMtOwO9QQoP2sFM/ay+SXVSedGn+7/cC3iXSiecmwoPQA0r3uq5gBYouIEGogAmdC0htsNx8gkhdDt6Q+ndR 6GU2K/Rx jBgVoUKsZgrD8RNfm3jQB7DmYVQaDug7KRTXh337ODTaXD0/BmT5Wsf+ReQvYU/B41T0QQTxooNHqnanYU2p6smwE5WOqq2gM1KB/Ej2VXpcbKHoa6ncLDqmHJOqWjCsKhw/CB5rB992vtK0zjkUvVnk/Pl981CXQs0a4ARDZhNRKvSCbTsdMBfCjg2E7afOmjE9DfTHHY1CY0Pgixom88AtQc/s8arwdXiessUtDPPyydpoRcZ4QYDiGGXLmpGk6IBD+cXYxzkbCrmCE4485gPDsQdk7VYvwr2xGDXMlGcmzlBvxnEFiAVtG/RxkxhqPwsyl6NvGdOJYxeu4GI/C3fIc2OEe+1Sg7C+U X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: List-Subscribe: List-Unsubscribe: On Tue, Aug 27, 2024 at 4:52=E2=80=AFPM Shakeel Butt wrote: > > At the moment, the slab objects are charged to the memcg at the > allocation time. However there are cases where slab objects are > allocated at the time where the right target memcg to charge it to is > not known. One such case is the network sockets for the incoming > connection which are allocated in the softirq context. > > Couple hundred thousand connections are very normal on large loaded > server and almost all of those sockets underlying those connections get > allocated in the softirq context and thus not charged to any memcg. > However later at the accept() time we know the right target memcg to > charge. Let's add new API to charge already allocated objects, so we can > have better accounting of the memory usage. > > To measure the performance impact of this change, tcp_crr is used from > the neper [1] performance suite. Basically it is a network ping pong > test with new connection for each ping pong. > > The server and the client are run inside 3 level of cgroup hierarchy > using the following commands: > > Server: > $ tcp_crr -6 > > Client: > $ tcp_crr -6 -c -H ${server_ip} > > If the client and server run on different machines with 50 GBPS NIC, > there is no visible impact of the change. > > For the same machine experiment with v6.11-rc5 as base. > > base (throughput) with-patch > tcp_crr 14545 (+- 80) 14463 (+- 56) > > It seems like the performance impact is within the noise. 
>
> Link: https://github.com/google/neper [1]
> Signed-off-by: Shakeel Butt
> ---
> v1: https://lore.kernel.org/all/20240826232908.4076417-1-shakeel.butt@linux.dev/
> Changes since v1:
> - Correctly handle large allocations which bypass slab
> - Rearrange code to avoid compilation errors for !CONFIG_MEMCG builds
>
> RFC: https://lore.kernel.org/all/20240824010139.1293051-1-shakeel.butt@linux.dev/
> Changes since the RFC:
> - Added check for already charged slab objects.
> - Added performance results from neper's tcp_crr
>
>  include/linux/slab.h            |  1 +
>  mm/slub.c                       | 51 +++++++++++++++++++++++++++++++++
>  net/ipv4/inet_connection_sock.c |  5 ++--
>  3 files changed, 55 insertions(+), 2 deletions(-)
>
> diff --git a/include/linux/slab.h b/include/linux/slab.h
> index eb2bf4629157..05cfab107c72 100644
> --- a/include/linux/slab.h
> +++ b/include/linux/slab.h
> @@ -547,6 +547,7 @@ void *kmem_cache_alloc_lru_noprof(struct kmem_cache *s, struct list_lru *lru,
>                             gfp_t gfpflags) __assume_slab_alignment __malloc;
>  #define kmem_cache_alloc_lru(...)       alloc_hooks(kmem_cache_alloc_lru_noprof(__VA_ARGS__))
>
> +bool kmem_cache_charge(void *objp, gfp_t gfpflags);
>  void kmem_cache_free(struct kmem_cache *s, void *objp);
>
>  kmem_buckets *kmem_buckets_create(const char *name, slab_flags_t flags,
> diff --git a/mm/slub.c b/mm/slub.c
> index c9d8a2497fd6..8265ea5f25be 100644
> --- a/mm/slub.c
> +++ b/mm/slub.c
> @@ -2185,6 +2185,43 @@ void memcg_slab_free_hook(struct kmem_cache *s, struct slab *slab, void **p,
>
>         __memcg_slab_free_hook(s, slab, p, objects, obj_exts);
>  }
> +
> +#define KMALLOC_TYPE (SLAB_KMALLOC | SLAB_CACHE_DMA | \
> +                      SLAB_ACCOUNT | SLAB_RECLAIM_ACCOUNT)
> +
> +static __fastpath_inline
> +bool memcg_slab_post_charge(void *p, gfp_t flags)
> +{
> +       struct slabobj_ext *slab_exts;
> +       struct kmem_cache *s;
> +       struct folio *folio;
> +       struct slab *slab;
> +       unsigned long off;
> +
> +       folio = virt_to_folio(p);
> +       if (!folio_test_slab(folio)) {
> +               return __memcg_kmem_charge_page(folio_page(folio, 0), flags,
> +                                               folio_order(folio)) == 0;

Will this charge the folio again if it was already charged? It seems
like we avoid this for already charged slab objects below, but not
here (one possible guard is sketched at the end of this mail).

> +       }
> +
> +       slab = folio_slab(folio);
> +       s = slab->slab_cache;
> +
> +       /* Ignore KMALLOC_NORMAL cache to avoid circular dependency. */
> +       if ((s->flags & KMALLOC_TYPE) == SLAB_KMALLOC)
> +               return true;

Would it be clearer to check if the slab cache is one of
kmalloc_caches[KMALLOC_NORMAL]? This should be doable by comparing the
address of the slab cache with the addresses of
kmalloc_caches[KMALLOC_NORMAL] (perhaps in a helper). I need to refer
to your reply to Roman to understand why this works.
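
Roughly what I had in mind (untested, and is_kmalloc_normal_cache() is
a made-up name; it assumes kmalloc_caches[] and KMALLOC_SHIFT_HIGH are
visible here, which I believe they are via the slab headers):

	/* True iff @s is one of the shared KMALLOC_NORMAL caches. */
	static inline bool is_kmalloc_normal_cache(struct kmem_cache *s)
	{
		int i;

		for (i = 0; i <= KMALLOC_SHIFT_HIGH; i++) {
			if (s == kmalloc_caches[KMALLOC_NORMAL][i])
				return true;
		}
		return false;
	}

That said, this is a linear scan where your flags check is a single
branch, so perhaps the flags-based check is the better fit for the
fastpath and a helper could just wrap it for readability.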
> +
> +       /* Ignore already charged objects. */
> +       slab_exts = slab_obj_exts(slab);
> +       if (slab_exts) {
> +               off = obj_to_index(s, slab, p);
> +               if (unlikely(slab_exts[off].objcg))
> +                       return true;
> +       }
> +
> +       return __memcg_slab_post_alloc_hook(s, NULL, flags, 1, &p);
> +}
> +
>  #else /* CONFIG_MEMCG */
>  static inline bool memcg_slab_post_alloc_hook(struct kmem_cache *s,
>                                                struct list_lru *lru,
> @@ -2198,6 +2235,11 @@ static inline void memcg_slab_free_hook(struct kmem_cache *s, struct slab *slab,
>                                          void **p, int objects)
>  {
>  }
> +
> +static inline bool memcg_slab_post_charge(void *p, gfp_t flags)
> +{
> +       return true;
> +}
>  #endif /* CONFIG_MEMCG */
>
>  /*
> @@ -4062,6 +4104,15 @@ void *kmem_cache_alloc_lru_noprof(struct kmem_cache *s, struct list_lru *lru,
>  }
>  EXPORT_SYMBOL(kmem_cache_alloc_lru_noprof);
>
> +bool kmem_cache_charge(void *objp, gfp_t gfpflags)
> +{
> +       if (!memcg_kmem_online())
> +               return true;
> +
> +       return memcg_slab_post_charge(objp, gfpflags);
> +}
> +EXPORT_SYMBOL(kmem_cache_charge);
> +
>  /**
>   * kmem_cache_alloc_node - Allocate an object on the specified node
>   * @s: The cache to allocate from.
> diff --git a/net/ipv4/inet_connection_sock.c b/net/ipv4/inet_connection_sock.c
> index 64d07b842e73..3c13ca8c11fb 100644
> --- a/net/ipv4/inet_connection_sock.c
> +++ b/net/ipv4/inet_connection_sock.c
> @@ -715,6 +715,7 @@ struct sock *inet_csk_accept(struct sock *sk, struct proto_accept_arg *arg)
>         release_sock(sk);
>         if (newsk && mem_cgroup_sockets_enabled) {
>                 int amt = 0;
> +               gfp_t gfp = GFP_KERNEL | __GFP_NOFAIL;
>
>                 /* atomically get the memory usage, set and charge the
>                  * newsk->sk_memcg.
> @@ -731,8 +732,8 @@ struct sock *inet_csk_accept(struct sock *sk, struct proto_accept_arg *arg)
>                 }
>
>                 if (amt)
> -                       mem_cgroup_charge_skmem(newsk->sk_memcg, amt,
> -                                               GFP_KERNEL | __GFP_NOFAIL);
> +                       mem_cgroup_charge_skmem(newsk->sk_memcg, amt, gfp);
> +               kmem_cache_charge(newsk, gfp);
>
>                 release_sock(newsk);
>         }
> --
> 2.43.5
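
And for the double-charging question above: one possible guard
(untested; it assumes folio_memcg_kmem() gives the right answer for
large kmalloc folios) would be to bail out early in
memcg_slab_post_charge():

	folio = virt_to_folio(p);
	if (!folio_test_slab(folio)) {
		/* Skip folios already charged to a memcg as kmem. */
		if (folio_memcg_kmem(folio))
			return true;
		return __memcg_kmem_charge_page(folio_page(folio, 0), flags,
						folio_order(folio)) == 0;
	}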