linux-mm.kvack.org archive mirror
From: Hyeonggon Yoo <42.hyeyoo@gmail.com>
To: Feng Tang <feng.tang@intel.com>
Cc: Vlastimil Babka <vbabka@suse.cz>,
	Andrew Morton <akpm@linux-foundation.org>,
	 Christoph Lameter <cl@linux.com>,
	Pekka Enberg <penberg@kernel.org>,
	David Rientjes <rientjes@google.com>,
	 Joonsoo Kim <iamjoonsoo.kim@lge.com>,
	Roman Gushchin <roman.gushchin@linux.dev>,
	linux-mm@kvack.org,  linux-kernel@vger.kernel.org
Subject: Re: [RFC Patch 3/3] mm/slub: setup maxim per-node partial according to cpu numbers
Date: Tue, 12 Sep 2023 13:48:23 +0900
Message-ID: <CAB=+i9RWVvUb5LyoTpzZ0XXWoSNxbKJuA6fynvOd4U+P5q+uaA@mail.gmail.com>
In-Reply-To: <20230905141348.32946-4-feng.tang@intel.com>

On Tue, Sep 5, 2023 at 11:07 PM Feng Tang <feng.tang@intel.com> wrote:
>
> Currently most slabs' min_partial is set to 5 (as MIN_PARTIAL is 5).
> This is fine for older or small systems, but can be too small for a
> large system with hundreds of CPUs, where the per-node 'list_lock' is
> contended when allocating from and freeing to the per-node partial
> list.
>
> So enlarge it based on the number of CPUs per node.
>
> Signed-off-by: Feng Tang <feng.tang@intel.com>
> ---
>  include/linux/nodemask.h | 1 +
>  mm/slub.c                | 9 +++++++--
>  2 files changed, 8 insertions(+), 2 deletions(-)
>
> diff --git a/include/linux/nodemask.h b/include/linux/nodemask.h
> index 8d07116caaf1..6e22caab186d 100644
> --- a/include/linux/nodemask.h
> +++ b/include/linux/nodemask.h
> @@ -530,6 +530,7 @@ static inline int node_random(const nodemask_t *maskp)
>
>  #define num_online_nodes()     num_node_state(N_ONLINE)
>  #define num_possible_nodes()   num_node_state(N_POSSIBLE)
> +#define num_cpu_nodes()                num_node_state(N_CPU)
>  #define node_online(node)      node_state((node), N_ONLINE)
>  #define node_possible(node)    node_state((node), N_POSSIBLE)
>
> diff --git a/mm/slub.c b/mm/slub.c
> index 09ae1ed642b7..984e012d7bbc 100644
> --- a/mm/slub.c
> +++ b/mm/slub.c
> @@ -4533,6 +4533,7 @@ static int calculate_sizes(struct kmem_cache *s)
>
>  static int kmem_cache_open(struct kmem_cache *s, slab_flags_t flags)
>  {
> +       unsigned long min_partial;
>         s->flags = kmem_cache_flags(s->size, flags, s->name);
>  #ifdef CONFIG_SLAB_FREELIST_HARDENED
>         s->random = get_random_long();
> @@ -4564,8 +4565,12 @@ static int kmem_cache_open(struct kmem_cache *s, slab_flags_t flags)
>          * The larger the object size is, the more slabs we want on the partial
>          * list to avoid pounding the page allocator excessively.
>          */
> -       s->min_partial = min_t(unsigned long, MAX_PARTIAL, ilog2(s->size) / 2);
> -       s->min_partial = max_t(unsigned long, MIN_PARTIAL, s->min_partial);
> +
> +       min_partial = rounddown_pow_of_two(num_cpus() / num_cpu_nodes());
> +       min_partial = max_t(unsigned long, MIN_PARTIAL, min_partial);
> +
> +       s->min_partial = min_t(unsigned long, min_partial * 2, ilog2(s->size) / 2);
> +       s->min_partial = max_t(unsigned long, min_partial, s->min_partial);

Hello Feng,

How much memory is consumed by this change on your machine?

I won't argue that it would be huge for large machines, but it increases
the minimum value for every cache (even those that are not contended),
and there is no way to reclaim this.
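
To put a number on that, below is a quick user-space sketch of the sizing
math in this patch. The 128-CPU / 2-node machine and the kmalloc-1k size
are made up, and num_cpus(), rounddown_pow_of_two() and ilog2() are
stubbed locally, so treat it as an illustration rather than the real
kernel code:

#include <stdio.h>

#define MIN_PARTIAL 5			/* values from mm/slub.c */
#define MAX_PARTIAL 10

/* local stand-ins for the kernel helpers used by the patch */
static unsigned long rounddown_pow_of_two(unsigned long n)
{
	unsigned long p = 1;

	while (p * 2 <= n)
		p *= 2;
	return p;
}

static unsigned long ilog2(unsigned long n)
{
	unsigned long l = 0;

	while (n >>= 1)
		l++;
	return l;
}

int main(void)
{
	unsigned long ncpus = 128, nnodes = 2;	/* hypothetical machine */
	unsigned long size = 1024;		/* e.g. kmalloc-1k */
	unsigned long old_mp, min_partial, new_mp;

	/* current formula: stays between 5 and 10 on any machine */
	old_mp = ilog2(size) / 2;
	if (old_mp > MAX_PARTIAL)
		old_mp = MAX_PARTIAL;
	if (old_mp < MIN_PARTIAL)
		old_mp = MIN_PARTIAL;

	/* proposed formula, mirroring the hunk above */
	min_partial = rounddown_pow_of_two(ncpus / nnodes);
	if (min_partial < MIN_PARTIAL)
		min_partial = MIN_PARTIAL;
	new_mp = ilog2(size) / 2;
	if (new_mp > min_partial * 2)
		new_mp = min_partial * 2;
	if (new_mp < min_partial)
		new_mp = min_partial;

	/* prints: old min_partial = 5, new min_partial = 64 */
	printf("old min_partial = %lu, new min_partial = %lu\n",
	       old_mp, new_mp);
	return 0;
}

If I am reading the patch right, on such a machine every cache ends up
with min_partial = 64 per node regardless of object size, because
ilog2(size) / 2 stays in the single digits and the final max_t() always
picks the CPU-derived value.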

Maybe a way to reclaim full slabs under memory pressure (on the buddy
side) wouldn't hurt?
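
For reference, my rough understanding is that min_partial bounds how many
partial (including completely empty) slabs each node keeps before empty
ones are handed back to the page allocator, so the worst case held per
cache per node is roughly min_partial * slab size. With the hypothetical
numbers above, 64 slabs of an order-1 (8 KiB) slab would be up to 512 KiB
per node per cache, and across a couple of hundred caches that could add
up to tens of MB per node. Actual numbers from your machines would of
course be more convincing than this back-of-envelope estimate.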

>         set_cpu_partial(s);
>
> --
> 2.27.0
>



Thread overview: 11+ messages
2023-09-05 14:13 [RFC Patch 0/3] mm/slub: reduce contention for per-node list_lock for large systems Feng Tang
2023-09-05 14:13 ` [RFC Patch 1/3] mm/slub: increase the maximum slab order to 4 for big systems Feng Tang
2023-09-12  4:52   ` Hyeonggon Yoo
2023-09-12 15:52     ` Feng Tang
2023-09-05 14:13 ` [RFC Patch 2/3] mm/slub: double per-cpu partial number for large systems Feng Tang
2023-09-05 14:13 ` [RFC Patch 3/3] mm/slub: setup maxim per-node partial according to cpu numbers Feng Tang
2023-09-12  4:48   ` Hyeonggon Yoo [this message]
2023-09-14  7:05     ` Feng Tang
2023-09-15  2:40       ` Lameter, Christopher
2023-09-15  5:05         ` Feng Tang
2023-09-15 16:13           ` Lameter, Christopher
