From mboxrd@z Thu Jan  1 00:00:00 1970
Date: Tue, 26 Jan 2021 14:59:18 +0100
From: Michal Hocko <mhocko@suse.com>
To: Vincent Guittot <vincent.guittot@linaro.org>
Cc: Vlastimil Babka <vbabka@suse.cz>, Christoph Lameter <cl@linux.com>,
	Bharata B Rao <bharata@linux.ibm.com>,
	linux-kernel <linux-kernel@vger.kernel.org>, linux-mm@kvack.org,
	David Rientjes <rientjes@google.com>,
	Joonsoo Kim <iamjoonsoo.kim@lge.com>,
	Andrew Morton <akpm@linux-foundation.org>, guro@fb.com,
	Shakeel Butt <shakeelb@google.com>,
	Johannes Weiner <hannes@cmpxchg.org>, aneesh.kumar@linux.ibm.com,
	Jann Horn <jannh@google.com>
Subject: Re: [RFC PATCH v0] mm/slub: Let number of online CPUs determine the
 slub page order
Message-ID: <20210126135918.GQ827@dhcp22.suse.cz>
References: <20201118082759.1413056-1-bharata@linux.ibm.com>
 <CAKfTPtA_JgMf_+zdFbcb_V9rM7JBWNPjAz9irgwFj7Rou=xzZg@mail.gmail.com>
 <20210121053003.GB2587010@in.ibm.com>
 <alpine.DEB.2.22.394.2101210959060.100764@www.lameter.com>
 <d7fb9425-9a62-c7b8-604d-5828d7e6b1da@suse.cz>
 <20210126085243.GE827@dhcp22.suse.cz>
 <CAKfTPtAhqiHtPMUTZv8Bs3Cg5=HXLmrda=j4_HFrF=7ztYZLGA@mail.gmail.com>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To: <CAKfTPtAhqiHtPMUTZv8Bs3Cg5=HXLmrda=j4_HFrF=7ztYZLGA@mail.gmail.com>
X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4
Sender: owner-linux-mm@kvack.org
Precedence: bulk
X-Loop: owner-majordomo@kvack.org
List-ID: <linux-mm.kvack.org>

On Tue 26-01-21 14:38:14, Vincent Guittot wrote:
> On Tue, 26 Jan 2021 at 09:52, Michal Hocko <mhocko@suse.com> wrote:
> >
> > On Thu 21-01-21 19:19:21, Vlastimil Babka wrote:
> > [...]
> > > We could also start questioning the very assumption that number of cpus should
> > > affect slab page size in the first place. Should it? After all, each CPU will
> > > have one or more slab pages privately cached, as we discuss in the other
> > > thread... So why make the slab pages also larger?
> >
> > I do agree. What is the actual justification for this scaling?
> >         /*
> >          * Attempt to find best configuration for a slab. This
> >          * works by first attempting to generate a layout with
> >          * the best configuration and backing off gradually.
> >          *
> >          * First we increase the acceptable waste in a slab. Then
> >          * we reduce the minimum objects required in a slab.
> >          */
> >
> > doesn't speak about CPUs.  9b2cd506e5f2 ("slub: Calculate min_objects
> > based on number of processors.") does talk about hackbench "This has
> > been shown to address the performance issues in hackbench on 16p etc."
> > but it doesn't give any more details to tell actually _why_ that works.
> >
> > This thread shows that this is still somehow related to performance but
> > the real reason is not clear. I believe we should be focusing on the
> > actual reasons for the performance impact rather than playing with
> > some fancy math and tuning for a benchmark on a particular machine
> > which doesn't work for others due to subtle initialization timing issues.
> >
> > Fundamentally, why should a higher number of CPUs imply a larger slab
> > size in the first place?
> 
> A first answer is that the activity and the number of threads involved
> scale with the number of CPUs. Taking the hackbench benchmark as an
> example, the number of groups/threads is raised to a higher level on
> the server than on the small system, which doesn't seem unreasonable.
> 
> On 8 CPUs, I run hackbench with up to 16 groups, which means 16*40
> threads. But I go up to 256 groups, which means 256*40 threads, on
> the 224-CPU system. In fact, hackbench -g 1 (with 1 group) doesn't
> regress on the 224-CPU system. The next test, with 4 groups, starts
> to regress by 7%. But the next one, hackbench -g 16, regresses by 187%
> (duration is almost 3 times longer). It seems reasonable to assume
> that the number of running threads and resources scales with the
> number of CPUs because we want to run more stuff.

OK, I do understand that the number of jobs scales with the number of
CPUs, but I would also expect higher-order pages to be generally more
expensive to get, so this is not really clear-cut, especially under
higher memory demand where allocations are not smooth. So the question
really is whether this is just optimizing for artificial conditions.
-- 
Michal Hocko
SUSE Labs