Re: [PATCH] mm, slub: better heuristic for number of cpus when calculating slab order

From: Vincent Guittot <vincent.guittot@linaro.org>
To: Vlastimil Babka <vbabka@suse.cz>
Cc: Catalin Marinas <Catalin.Marinas@arm.com>,
	Andrew Morton <akpm@linux-foundation.org>,
	 aneesh.kumar@linux.ibm.com,
	Bharata B Rao <bharata@linux.ibm.com>,
	 Christoph Lameter <cl@linux.com>,
	guro@fb.com, Johannes Weiner <hannes@cmpxchg.org>,
	 Joonsoo Kim <iamjoonsoo.kim@lge.com>,
	Jann Horn <jannh@google.com>,
	 linux-kernel <linux-kernel@vger.kernel.org>,
	linux-mm@kvack.org,  Michal Hocko <mhocko@kernel.org>,
	David Rientjes <rientjes@google.com>,
	 Shakeel Butt <shakeelb@google.com>,
	Will Deacon <will@kernel.org>,
	 Mel Gorman <mgorman@techsingularity.net>,
	"# v4 . 16+" <stable@vger.kernel.org>
Subject: Re: [PATCH] mm, slub: better heuristic for number of cpus when calculating slab order
Date: Mon, 8 Feb 2021 15:54:02 +0100
Message-ID: <CAKfTPtBR4AjOGE-h2q=jKjf55hc_xiJOAywzOWZtsWgNvbmYYg@mail.gmail.com>
In-Reply-To: <20210208134108.22286-1-vbabka@suse.cz>

On Mon, 8 Feb 2021 at 14:41, Vlastimil Babka <vbabka@suse.cz> wrote:
>
> When creating a new kmem cache, SLUB determines how large the slab pages will
> be based on a number of inputs, including the number of CPUs in the system.
> Larger slab pages mean that more objects can be allocated/freed from per-cpu
> slabs before accessing shared structures, but also potentially more memory can
> be wasted due to low slab usage and fragmentation.
> The rough idea of using the number of CPUs is that larger systems will be more
> likely to benefit from reduced contention, and also should have enough memory
> to spare.
>
> The number of CPUs used to be determined as nr_cpu_ids, which is the number of
> possible cpus, but on some systems many will never be onlined, thus commit
> 045ab8c9487b ("mm/slub: let number of online CPUs determine the slub page
> order") changed it to num_online_cpus(). However, for kmem caches created early
> before CPUs are onlined, this may lead to permanently low slab page sizes.
>
> Vincent reports a regression [1] of hackbench on arm64 systems:
>
> > I'm facing a significant performance regression on a large arm64 server
> > system (224 CPUs). The regression is also present on a small arm64 system
> > (8 CPUs), but at a far smaller order of magnitude
>
> > On 224 CPUs system : 9 iterations of hackbench -l 16000 -g 16
> > v5.11-rc4 : 9.135sec (+/- 0.45%)
> > v5.11-rc4 + revert this patch: 3.173sec (+/- 0.48%)
> > v5.10: 3.136sec (+/- 0.40%)
>
> Mel reports a regression [2] of hackbench on x86_64, with lockstat suggesting
> page allocator contention:
>
> > i.e. the patch incurs a 7% to 32% performance penalty. This bisected
> > cleanly yesterday when I was looking for the regression and then found
> > the thread.
>
> > Numerous caches change size. For example, kmalloc-512 goes from order-0
> > (vanilla) to order-2 with the revert.
>
> > So mostly this is down to the number of times SLUB calls into the page
> > allocator which only caches order-0 pages on a per-cpu basis.
>
> Clearly num_online_cpus() doesn't work too early in bootup. We could change
> the order dynamically in a memory hotplug callback, but runtime order changing
> for existing kmem caches has already been shown to be dangerous, and was removed in
> 32a6f409b693 ("mm, slub: remove runtime allocation order changes"). It could be
> resurrected in a safe manner with some effort, but to fix the regression we
> need something simpler.
>
> We could use num_present_cpus() that should be the number of physically present
> CPUs even before they are onlined. That would for for PowerPC [3], which

Minor typo: "That would for for PowerPC" should be "That would work
for PowerPC"?

> triggered the original commit, but that still doesn't work on arm64 [4] as
> explained in [5].
>
> So this patch tries to determine the best available value without specific arch
> knowledge.
> - num_present_cpus() if the number is larger than 1, as that means the arch is
> likely setting it properly
> - nr_cpu_ids otherwise
>
> This should fix the reported regressions while also keeping the effect of
> 045ab8c9487b for PowerPC systems. It's possible there are configurations where
> num_present_cpus() is 1 during boot while nr_cpu_ids is at the same time
> bloated, so these (if they exist) would keep the large orders based on
> nr_cpu_ids as was before 045ab8c9487b.
>
> [1] https://lore.kernel.org/linux-mm/CAKfTPtA_JgMf_+zdFbcb_V9rM7JBWNPjAz9irgwFj7Rou=xzZg@mail.gmail.com/
> [2] https://lore.kernel.org/linux-mm/20210128134512.GF3592@techsingularity.net/
> [3] https://lore.kernel.org/linux-mm/20210123051607.GC2587010@in.ibm.com/
> [4] https://lore.kernel.org/linux-mm/CAKfTPtAjyVmS5VYvU6DBxg4-JEo5bdmWbngf-03YsY18cmWv_g@mail.gmail.com/
> [5] https://lore.kernel.org/linux-mm/20210126230305.GD30941@willie-the-truck/
>
> Fixes: 045ab8c9487b ("mm/slub: let number of online CPUs determine the slub page order")
> Reported-by: Vincent Guittot <vincent.guittot@linaro.org>
> Reported-by: Mel Gorman <mgorman@techsingularity.net>
> Cc: <stable@vger.kernel.org>
> Signed-off-by: Vlastimil Babka <vbabka@suse.cz>

Tested on both large and small arm64 systems. There is no regression
with this patch applied.

Tested-by: Vincent Guittot <vincent.guittot@linaro.org>

> ---
>
> OK, this is a 5.11 regression, so we should try to fix it by 5.12. I've also
> Cc'd stable for that reason although it's not a crash fix.
> We can still try later to replace this with a safe order update in hotplug
> callbacks, but that's infeasible for 5.12.
>
>  mm/slub.c | 18 ++++++++++++++++--
>  1 file changed, 16 insertions(+), 2 deletions(-)
>
> diff --git a/mm/slub.c b/mm/slub.c
> index 176b1cb0d006..8fc9190e6cb3 100644
> --- a/mm/slub.c
> +++ b/mm/slub.c
> @@ -3454,6 +3454,7 @@ static inline int calculate_order(unsigned int size)
>         unsigned int order;
>         unsigned int min_objects;
>         unsigned int max_objects;
> +       unsigned int nr_cpus;
>
>         /*
>          * Attempt to find best configuration for a slab. This
> @@ -3464,8 +3465,21 @@ static inline int calculate_order(unsigned int size)
>          * we reduce the minimum objects required in a slab.
>          */
>         min_objects = slub_min_objects;
> -       if (!min_objects)
> -               min_objects = 4 * (fls(num_online_cpus()) + 1);
> +       if (!min_objects) {
> +               /*
> +                * Some architectures will only update present cpus when
> +                * onlining them, so don't trust the number if it's just 1. But
> +                * we also don't want to use nr_cpu_ids always, as on some other
> +                * architectures, there can be many possible cpus, but never
> +                * onlined. Here we compromise between trying to avoid too high
> +                * order on systems that appear larger than they are, and too
> +                * low order on systems that appear smaller than they are.
> +                */
> +               nr_cpus = num_present_cpus();
> +               if (nr_cpus <= 1)
> +                       nr_cpus = nr_cpu_ids;
> +               min_objects = 4 * (fls(nr_cpus) + 1);
> +       }
>         max_objects = order_objects(slub_max_order, size);
>         min_objects = min(min_objects, max_objects);
>
> --
> 2.30.0
>
