linux-mm.kvack.org archive mirror
 help / color / mirror / Atom feed
From: Muchun Song <muchun.song@linux.dev>
To: Shakeel Butt <shakeel.butt@linux.dev>
Cc: Roman Gushchin <roman.gushchin@linux.dev>,
	Andrew Morton <akpm@linux-foundation.org>,
	Johannes Weiner <hannes@cmpxchg.org>,
	Michal Hocko <mhocko@kernel.org>,
	Vlastimil Babka <vbabka@suse.cz>,
	David Rientjes <rientjes@google.com>,
	Hyeonggon Yoo <42.hyeyoo@gmail.com>,
	Eric Dumazet <edumazet@google.com>,
	"David S . Miller" <davem@davemloft.net>,
	Jakub Kicinski <kuba@kernel.org>, Paolo Abeni <pabeni@redhat.com>,
	Linux Memory Management List <linux-mm@kvack.org>,
	LKML <linux-kernel@vger.kernel.org>,
	Meta kernel team <kernel-team@meta.com>,
	cgroups@vger.kernel.org, netdev <netdev@vger.kernel.org>
Subject: Re: [PATCH v1] memcg: add charging of already allocated slab objects
Date: Wed, 28 Aug 2024 10:36:06 +0800	[thread overview]
Message-ID: <EA5F7851-B519-4570-B299-8A096A09D6E7@linux.dev> (raw)
In-Reply-To: <yiyx4fh6dklqpexfstkzp3gf23hjpbjujci2o6gs7nb4sutzvb@b5korjrjio3m>



> On Aug 28, 2024, at 01:23, Shakeel Butt <shakeel.butt@linux.dev> wrote:
> 
> On Tue, Aug 27, 2024 at 03:06:32AM GMT, Roman Gushchin wrote:
>> On Mon, Aug 26, 2024 at 04:29:08PM -0700, Shakeel Butt wrote:
>>> At the moment, the slab objects are charged to the memcg at the
>>> allocation time. However there are cases where slab objects are
>>> allocated at the time where the right target memcg to charge it to is
>>> not known. One such case is the network sockets for the incoming
>>> connection which are allocated in the softirq context.
>>> 
>>> Couple hundred thousand connections are very normal on large loaded
>>> server and almost all of those sockets underlying those connections get
>>> allocated in the softirq context and thus not charged to any memcg.
>>> However later at the accept() time we know the right target memcg to
>>> charge. Let's add new API to charge already allocated objects, so we can
>>> have better accounting of the memory usage.
>>> 
>>> To measure the performance impact of this change, tcp_crr is used from
>>> the neper [1] performance suite. Basically it is a network ping pong
>>> test with new connection for each ping pong.
>>> 
>>> The server and the client are run inside 3 level of cgroup hierarchy
>>> using the following commands:
>>> 
>>> Server:
>>> $ tcp_crr -6
>>> 
>>> Client:
>>> $ tcp_crr -6 -c -H ${server_ip}
>>> 
>>> If the client and server run on different machines with 50 GBPS NIC,
>>> there is no visible impact of the change.
>>> 
>>> For the same machine experiment with v6.11-rc5 as base.
>>> 
>>>         base (throughput)     with-patch
>>> tcp_crr   14545 (+- 80)         14463 (+- 56)
>>> 
>>> It seems like the performance impact is within the noise.
>>> 
>>> Link: https://github.com/google/neper [1]
>>> Signed-off-by: Shakeel Butt <shakeel.butt@linux.dev>
>> 
>> Hi Shakeel,
>> 
>> I like the idea and performance numbers look good. However some comments on
>> the implementation:
>> 
> 
> Thanks for taking a look.
> 
>>> ---
>>> 
>>> Changes since the RFC:
>>> - Added check for already charged slab objects.
>>> - Added performance results from neper's tcp_crr
>>> 
>>> include/linux/slab.h            |  1 +
>>> mm/slub.c                       | 54 +++++++++++++++++++++++++++++++++
>>> net/ipv4/inet_connection_sock.c |  5 +--
>>> 3 files changed, 58 insertions(+), 2 deletions(-)
>>> 
>>> diff --git a/include/linux/slab.h b/include/linux/slab.h
>>> index eb2bf4629157..05cfab107c72 100644
>>> --- a/include/linux/slab.h
>>> +++ b/include/linux/slab.h
>>> @@ -547,6 +547,7 @@ void *kmem_cache_alloc_lru_noprof(struct kmem_cache *s, struct list_lru *lru,
>>>    gfp_t gfpflags) __assume_slab_alignment __malloc;
>>> #define kmem_cache_alloc_lru(...) alloc_hooks(kmem_cache_alloc_lru_noprof(__VA_ARGS__))
>>> 
>>> +bool kmem_cache_charge(void *objp, gfp_t gfpflags);
>>> void kmem_cache_free(struct kmem_cache *s, void *objp);
>>> 
>>> kmem_buckets *kmem_buckets_create(const char *name, slab_flags_t flags,
>>> diff --git a/mm/slub.c b/mm/slub.c
>>> index c9d8a2497fd6..580683597b5c 100644
>>> --- a/mm/slub.c
>>> +++ b/mm/slub.c
>>> @@ -2185,6 +2185,16 @@ void memcg_slab_free_hook(struct kmem_cache *s, struct slab *slab, void **p,
>>> 
>>> __memcg_slab_free_hook(s, slab, p, objects, obj_exts);
>>> }
>>> +
>>> +static __fastpath_inline
>>> +bool memcg_slab_post_charge(struct kmem_cache *s, void *p, gfp_t flags)
>>> +{
>>> + if (likely(!memcg_kmem_online()))
>>> + return true;
>> 
>> We do have this check in kmem_cache_charge(), why do we need to check it again?
>> 
> 
> I missed to remove this one. I am going to rearrange the code bit more
> in these functions to avoid the build errors in non MEMCG builds.
> 
>>> +
>>> + return __memcg_slab_post_alloc_hook(s, NULL, flags, 1, &p);
>>> +}
>>> +
>>> #else /* CONFIG_MEMCG */
>>> static inline bool memcg_slab_post_alloc_hook(struct kmem_cache *s,
>>>      struct list_lru *lru,
>>> @@ -2198,6 +2208,13 @@ static inline void memcg_slab_free_hook(struct kmem_cache *s, struct slab *slab,
>>> void **p, int objects)
>>> {
>>> }
>>> +
>>> +static inline bool memcg_slab_post_charge(struct kmem_cache *s,
>>> +   void *p,
>>> +   gfp_t flags)
>>> +{
>>> + return true;
>>> +}
>>> #endif /* CONFIG_MEMCG */
>>> 
>>> /*
>>> @@ -4062,6 +4079,43 @@ void *kmem_cache_alloc_lru_noprof(struct kmem_cache *s, struct list_lru *lru,
>>> }
>>> EXPORT_SYMBOL(kmem_cache_alloc_lru_noprof);
>>> 
>>> +#define KMALLOC_TYPE (SLAB_KMALLOC | SLAB_CACHE_DMA | \
>>> +       SLAB_ACCOUNT | SLAB_RECLAIM_ACCOUNT)
>>> +
>>> +bool kmem_cache_charge(void *objp, gfp_t gfpflags)
>>> +{
>>> + struct slabobj_ext *slab_exts;
>>> + struct kmem_cache *s;
>>> + struct folio *folio;
>>> + struct slab *slab;
>>> + unsigned long off;
>>> +
>>> + if (!memcg_kmem_online())
>>> + return true;
>>> +
>>> + folio = virt_to_folio(objp);
>>> + if (unlikely(!folio_test_slab(folio)))
>>> + return false;
>> 
>> Does it handle the case of a too-big-to-be-a-slab-object allocation?
>> I think it's better to handle it properly. Also, why return false here?
>> 
> 
> Yes I will fix the too-big-to-be-a-slab-object allocations. I presume I
> should just follow the kfree() hanlding on !folio_test_slab() i.e. that
> the given object is the large or too-big-to-be-a-slab-object.

Hi Shakeel,

If we decide to do this, I suppose you will use memcg_kmem_charge_page
to charge big-object. To be consistent, I suggest renaming kmem_cache_charge
to memcg_kmem_charge to handle both slab object and big-object. And I saw
all the functions related to object charging is moved to memcontrol.c (e.g.
__memcg_slab_post_alloc_hook), so maybe we should also do this for
memcg_kmem_charge?

Muhcun,
Thanks.

  reply	other threads:[~2024-08-28  2:36 UTC|newest]

Thread overview: 10+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2024-08-26 23:29 Shakeel Butt
2024-08-27  3:06 ` Roman Gushchin
2024-08-27 17:23   ` Shakeel Butt
2024-08-28  2:36     ` Muchun Song [this message]
2024-08-28 19:03       ` Shakeel Butt
2024-08-29  2:36         ` Muchun Song
2024-08-29 15:49           ` Shakeel Butt
2024-08-30  6:09             ` Muchun Song
2024-08-27 13:40 ` kernel test robot
2024-08-27 16:03 ` kernel test robot

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=EA5F7851-B519-4570-B299-8A096A09D6E7@linux.dev \
    --to=muchun.song@linux.dev \
    --cc=42.hyeyoo@gmail.com \
    --cc=akpm@linux-foundation.org \
    --cc=cgroups@vger.kernel.org \
    --cc=davem@davemloft.net \
    --cc=edumazet@google.com \
    --cc=hannes@cmpxchg.org \
    --cc=kernel-team@meta.com \
    --cc=kuba@kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-mm@kvack.org \
    --cc=mhocko@kernel.org \
    --cc=netdev@vger.kernel.org \
    --cc=pabeni@redhat.com \
    --cc=rientjes@google.com \
    --cc=roman.gushchin@linux.dev \
    --cc=shakeel.butt@linux.dev \
    --cc=vbabka@suse.cz \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox