From: "Song, Xiongwei" <Xiongwei.Song@windriver.com>
To: Vlastimil Babka <vbabka@suse.cz>,
"rientjes@google.com" <rientjes@google.com>,
"cl@linux.com" <cl@linux.com>,
"penberg@kernel.org" <penberg@kernel.org>,
"iamjoonsoo.kim@lge.com" <iamjoonsoo.kim@lge.com>,
"akpm@linux-foundation.org" <akpm@linux-foundation.org>,
"roman.gushchin@linux.dev" <roman.gushchin@linux.dev>,
"42.hyeyoo@gmail.com" <42.hyeyoo@gmail.com>
Cc: "linux-mm@kvack.org" <linux-mm@kvack.org>,
"linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>,
"chengming.zhou@linux.dev" <chengming.zhou@linux.dev>
Subject: RE: [PATCH v2 3/3] mm/slub: simplify get_partial_node()
Date: Sun, 7 Apr 2024 01:47:40 +0000
Message-ID: <PH0PR11MB5192E2D674F7191C9571470DEC012@PH0PR11MB5192.namprd11.prod.outlook.com>
In-Reply-To: <8a2890f0-108f-4fca-98e1-913373aa2cff@suse.cz>
> On 4/4/24 7:58 AM, xiongwei.song@windriver.com wrote:
> > From: Xiongwei Song <xiongwei.song@windriver.com>
> >
> > The break conditions for filling cpu partial can be made simpler and
> > more readable.
> >
> > If slub_get_cpu_partial() returns 0, we can confirm that we don't need
> > to fill cpu partial, then we should break from the loop. On the other
> > hand, we also should break from the loop if we have added enough cpu
> > partial slabs.
> >
> > Meanwhile, the logic above gets rid of the #ifdef and also fixes a weird
> > corner case where, if we set cpu_partial_slabs to 0 from sysfs, we still
> > allocate at least one here.
> >
> > Signed-off-by: Xiongwei Song <xiongwei.song@windriver.com>
> > ---
> >
> > The measurement below compares the performance effect of two
> > alternative conditions for breaking out of the loop that fills
> > cpu partial:
> >
> > Condition 1:
> > When the count of added cpu slabs is greater than cpu_partial_slabs/2:
> > (partial_slabs > slub_get_cpu_partial(s) / 2)
> >
> > Condition 2:
> > When the count of added cpu slabs is greater than or equal to
> > cpu_partial_slabs/2:
> > (partial_slabs >= slub_get_cpu_partial(s) / 2)
> >
> > Changing the break condition can affect how many cpu partial slabs
> > are put on the cpu partial list.
> >
> > The test was run on an "Intel(R) Core(TM) i7-10700 CPU @ 2.90GHz" cpu
> > with 16 cores. The OS is Ubuntu 22.04.
> >
> > hackbench-process-pipes
> >                 6.9-rc2(with ">")     6.9.0-rc2(with ">=")
> > Amean     1     0.0373 (   0.00%)     0.0356 *   4.60%*
> > Amean     4     0.0984 (   0.00%)     0.1014 *  -3.05%*
> > Amean     7     0.1803 (   0.00%)     0.1851 *  -2.69%*
> > Amean     12    0.2947 (   0.00%)     0.3141 *  -6.59%*
> > Amean     21    0.4577 (   0.00%)     0.4927 *  -7.65%*
> > Amean     30    0.6326 (   0.00%)     0.6649 *  -5.10%*
> > Amean     48    0.9396 (   0.00%)     0.9884 *  -5.20%*
> > Amean     64    1.2321 (   0.00%)     1.3004 *  -5.54%*
> >
> > hackbench-process-sockets
> >                 6.9-rc2(with ">")     6.9.0-rc2(with ">=")
> > Amean     1     0.0609 (   0.00%)     0.0623 *  -2.35%*
> > Amean     4     0.2107 (   0.00%)     0.2140 *  -1.56%*
> > Amean     7     0.3754 (   0.00%)     0.3966 *  -5.63%*
> > Amean     12    0.6456 (   0.00%)     0.6734 *  -4.32%*
> > Amean     21    1.1440 (   0.00%)     1.1769 *  -2.87%*
> > Amean     30    1.6629 (   0.00%)     1.7031 *  -2.42%*
> > Amean     48    2.7321 (   0.00%)     2.7897 *  -2.11%*
> > Amean     64    3.7397 (   0.00%)     3.7640 *  -0.65%*
> >
> > It seems there is a slight performance penalty when using ">=" to break
> > out of the loop. Hence, we should still use ">" here.
>
> Thanks for evaluating that, I suspected that would be the case so we should
> not change that performance aspect as part of a cleanup.
>
> > ---
> > mm/slub.c | 9 +++------
> > 1 file changed, 3 insertions(+), 6 deletions(-)
> >
> > diff --git a/mm/slub.c b/mm/slub.c
> > index 590cc953895d..6beff3b1e22c 100644
> > --- a/mm/slub.c
> > +++ b/mm/slub.c
> > @@ -2619,13 +2619,10 @@ static struct slab *get_partial_node(struct kmem_cache *s,
> >  			stat(s, CPU_PARTIAL_NODE);
> >  			partial_slabs++;
> >  		}
> > -#ifdef CONFIG_SLUB_CPU_PARTIAL
> > -		if (partial_slabs > s->cpu_partial_slabs / 2)
> > -			break;
> > -#else
> > -		break;
> > -#endif
> >
> > +		if ((slub_get_cpu_partial(s) == 0) ||
> > +		    (partial_slabs > slub_get_cpu_partial(s) / 2))
> > +			break;
> >  	}
> >  	spin_unlock_irqrestore(&n->list_lock, flags);
> >  	return partial;
>
> After looking at the result and your v1 again, I arrived at this
> modification that incorporates the core v1 idea without reintroducing
> kmem_cache_has_cpu_partial(). The modified patch looks like below. Is it OK
> with you? Pushed the whole series with this modification to slab/for-next
> for now.

Sorry for the late response; I was on vacation.
I'm OK with the patch below.

Thanks,
Xiongwei

>
> ----8<-----
> --- a/mm/slub.c
> +++ b/mm/slub.c
> @@ -2614,18 +2614,17 @@ static struct slab *get_partial_node(struct kmem_cache *s,
>  		if (!partial) {
>  			partial = slab;
>  			stat(s, ALLOC_FROM_PARTIAL);
> +			if ((slub_get_cpu_partial(s) == 0)) {
> +				break;
> +			}
>  		} else {
>  			put_cpu_partial(s, slab, 0);
>  			stat(s, CPU_PARTIAL_NODE);
> -			partial_slabs++;
> -		}
> -#ifdef CONFIG_SLUB_CPU_PARTIAL
> -		if (partial_slabs > s->cpu_partial_slabs / 2)
> -			break;
> -#else
> -		break;
> -#endif
>
> +			if (++partial_slabs > slub_get_cpu_partial(s) / 2) {
> +				break;
> +			}
> +		}
>  	}
>  	spin_unlock_irqrestore(&n->list_lock, flags);
>  	return partial;
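
For reference, the difference between the old and the modified break logic
can be modeled in user space. This is only a rough sketch, not kernel code:
fill_old(), fill_new() and check_model() are hypothetical names, the model
assumes CONFIG_SLUB_CPU_PARTIAL is enabled, and it assumes every slab on the
node partial list can be taken (the real loop may skip slabs). Each function
returns how many slabs would be moved to the cpu partial list.

```c
#include <assert.h>

/* Old logic: the break check runs after every iteration (assumed model). */
unsigned int fill_old(unsigned int cpu_partial_slabs, unsigned int available)
{
	unsigned int partial_slabs = 0;
	int have_partial = 0;

	for (unsigned int i = 0; i < available; i++) {
		if (!have_partial)
			have_partial = 1;	/* first slab is returned to the caller */
		else
			partial_slabs++;	/* later slabs go to cpu partial */

		if (partial_slabs > cpu_partial_slabs / 2)
			break;
	}
	return partial_slabs;
}

/* Modified logic: stop right after the first slab when the limit is 0. */
unsigned int fill_new(unsigned int cpu_partial_slabs, unsigned int available)
{
	unsigned int partial_slabs = 0;
	int have_partial = 0;

	for (unsigned int i = 0; i < available; i++) {
		if (!have_partial) {
			have_partial = 1;
			if (cpu_partial_slabs == 0)
				break;
		} else {
			if (++partial_slabs > cpu_partial_slabs / 2)
				break;
		}
	}
	return partial_slabs;
}

/* Both variants agree for a non-zero limit and differ only for 0. */
void check_model(void)
{
	assert(fill_old(6, 10) == fill_new(6, 10));
	assert(fill_old(0, 10) == 1);	/* the corner case from the changelog */
	assert(fill_new(0, 10) == 0);
}
```

In this model, a limit of 0 still puts one slab on the cpu partial list with
the old check, while the modified check puts none there; for a non-zero limit
both variants move the same number of slabs.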