linux-mm.kvack.org archive mirror
From: Vlastimil Babka <vbabka@suse.cz>
To: xiongwei.song@windriver.com, rientjes@google.com, cl@linux.com,
	penberg@kernel.org, iamjoonsoo.kim@lge.com,
	akpm@linux-foundation.org, roman.gushchin@linux.dev,
	42.hyeyoo@gmail.com
Cc: linux-mm@kvack.org, linux-kernel@vger.kernel.org,
	chengming.zhou@linux.dev
Subject: Re: [PATCH v2 3/3] mm/slub: simplify get_partial_node()
Date: Thu, 4 Apr 2024 11:26:47 +0200	[thread overview]
Message-ID: <8a2890f0-108f-4fca-98e1-913373aa2cff@suse.cz> (raw)
In-Reply-To: <20240404055826.1469415-4-xiongwei.song@windriver.com>

On 4/4/24 7:58 AM, xiongwei.song@windriver.com wrote:
> From: Xiongwei Song <xiongwei.song@windriver.com>
> 
> The break conditions for filling the cpu partial list can be made
> simpler and more readable.
> 
> If slub_get_cpu_partial() returns 0, we know there is no need to fill
> the cpu partial list, so we should break from the loop. Likewise, we
> should break from the loop once we have added enough cpu partial slabs.
> 
> Meanwhile, the logic above gets rid of the #ifdef and also fixes a weird
> corner case: if cpu_partial_slabs is set to 0 from sysfs, the old code
> would still put at least one slab on the cpu partial list here.
> 
> Signed-off-by: Xiongwei Song <xiongwei.song@windriver.com>
> ---
> 
> The measurement below compares the performance effect of breaking from
> the cpu-partial filling loop with either of the following conditions:
> 
> Condition 1:
> When the count of added cpu slabs is greater than cpu_partial_slabs/2:
> (partial_slabs > slub_get_cpu_partial(s) / 2)
> 
> Condition 2:
> When the count of added cpu slabs is greater than or equal to
> cpu_partial_slabs/2:
> (partial_slabs >= slub_get_cpu_partial(s) / 2)
> 
> The choice of break condition affects how many cpu partial slabs are
> put on the cpu partial list.
> 
> The tests were run on an "Intel(R) Core(TM) i7-10700 CPU @ 2.90GHz"
> (8 cores / 16 threads). The OS is Ubuntu 22.04.
> 
> hackbench-process-pipes
>                   6.9.0-rc2 (with ">")   6.9.0-rc2 (with ">=")
> Amean     1       0.0373 (   0.00%)      0.0356 *   4.60%*
> Amean     4       0.0984 (   0.00%)      0.1014 *  -3.05%*
> Amean     7       0.1803 (   0.00%)      0.1851 *  -2.69%*
> Amean     12      0.2947 (   0.00%)      0.3141 *  -6.59%*
> Amean     21      0.4577 (   0.00%)      0.4927 *  -7.65%*
> Amean     30      0.6326 (   0.00%)      0.6649 *  -5.10%*
> Amean     48      0.9396 (   0.00%)      0.9884 *  -5.20%*
> Amean     64      1.2321 (   0.00%)      1.3004 *  -5.54%*
> 
> hackbench-process-sockets
>                   6.9.0-rc2 (with ">")   6.9.0-rc2 (with ">=")
> Amean     1       0.0609 (   0.00%)      0.0623 *  -2.35%*
> Amean     4       0.2107 (   0.00%)      0.2140 *  -1.56%*
> Amean     7       0.3754 (   0.00%)      0.3966 *  -5.63%*
> Amean     12      0.6456 (   0.00%)      0.6734 *  -4.32%*
> Amean     21      1.1440 (   0.00%)      1.1769 *  -2.87%*
> Amean     30      1.6629 (   0.00%)      1.7031 *  -2.42%*
> Amean     48      2.7321 (   0.00%)      2.7897 *  -2.11%*
> Amean     64      3.7397 (   0.00%)      3.7640 *  -0.65%*
> 
> There appears to be a slight performance penalty when using ">=" to
> break out of the loop, so we should keep using ">" here.

Thanks for evaluating that; I suspected that would be the case, so we should
not change that performance aspect as part of a cleanup.

> ---
>  mm/slub.c | 9 +++------
>  1 file changed, 3 insertions(+), 6 deletions(-)
> 
> diff --git a/mm/slub.c b/mm/slub.c
> index 590cc953895d..6beff3b1e22c 100644
> --- a/mm/slub.c
> +++ b/mm/slub.c
> @@ -2619,13 +2619,10 @@ static struct slab *get_partial_node(struct kmem_cache *s,
>  			stat(s, CPU_PARTIAL_NODE);
>  			partial_slabs++;
>  		}
> -#ifdef CONFIG_SLUB_CPU_PARTIAL
> -		if (partial_slabs > s->cpu_partial_slabs / 2)
> -			break;
> -#else
> -		break;
> -#endif
>  
> +		if ((slub_get_cpu_partial(s) == 0) ||
> +		    (partial_slabs > slub_get_cpu_partial(s) / 2))
> +			break;
>  	}
>  	spin_unlock_irqrestore(&n->list_lock, flags);
>  	return partial;

After looking at the result and your v1 again, I arrived at this
modification that incorporates the core v1 idea without reintroducing
kmem_cache_has_cpu_partial(). The modified patch is below. Is it OK with
you? I have pushed the whole series with this modification to slab/for-next
for now.

----8<-----
--- a/mm/slub.c
+++ b/mm/slub.c
@@ -2614,18 +2614,17 @@ static struct slab *get_partial_node(struct kmem_cache *s,
                if (!partial) {
                        partial = slab;
                        stat(s, ALLOC_FROM_PARTIAL);
+                       if (slub_get_cpu_partial(s) == 0) {
+                               break;
+                       }
                } else {
                        put_cpu_partial(s, slab, 0);
                        stat(s, CPU_PARTIAL_NODE);
-                       partial_slabs++;
-               }
-#ifdef CONFIG_SLUB_CPU_PARTIAL
-               if (partial_slabs > s->cpu_partial_slabs / 2)
-                       break;
-#else
-               break;
-#endif
 
+                       if (++partial_slabs > slub_get_cpu_partial(s) / 2) {
+                               break;
+                       }
+               }
        }
        spin_unlock_irqrestore(&n->list_lock, flags);
        return partial;



Thread overview: 6+ messages
2024-04-04  5:58 [PATCH v2 0/3] SLUB: improve filling cpu partial a bit in get_partial_node() xiongwei.song
2024-04-04  5:58 ` [PATCH v2 1/3] mm/slub: remove the check of !kmem_cache_has_cpu_partial() xiongwei.song
2024-04-04  5:58 ` [PATCH v2 2/3] mm/slub: add slub_get_cpu_partial() helper xiongwei.song
2024-04-04  5:58 ` [PATCH v2 3/3] mm/slub: simplify get_partial_node() xiongwei.song
2024-04-04  9:26   ` Vlastimil Babka [this message]
2024-04-07  1:47     ` Song, Xiongwei
