To: Jann Horn
Cc: Christoph Lameter, Bharata B Rao, Vincent Guittot, linux-kernel, Linux-MM, David Rientjes, Joonsoo Kim, Andrew Morton, Roman Gushchin, Shakeel Butt, Johannes Weiner, aneesh.kumar@linux.ibm.com, Michal Hocko
References: <20201118082759.1413056-1-bharata@linux.ibm.com> <20210121053003.GB2587010@in.ibm.com>
From: Vlastimil Babka
Subject: Re: [RFC PATCH v0] mm/slub: Let number of online CPUs determine the slub page order
Date: Fri, 22 Jan 2021 16:27:12 +0100

On 1/22/21 2:05 PM, Jann Horn wrote:
> On Thu, Jan 21, 2021 at 7:19 PM Vlastimil Babka wrote:
>> On 1/21/21 11:01 AM, Christoph Lameter wrote:
>> > On Thu, 21 Jan 2021, Bharata B Rao wrote:
>> >
>> >> > The problem is that calculate_order() is called a number of times
>> >> > before secondary CPUs are booted and it returns 1 instead of 224.
>> >> > This makes the use of num_online_cpus() irrelevant for those cases
>> >> >
>> >> > After adding in my command line "slub_min_objects=36" which equals
>> >> > 4 * (fls(num_online_cpus()) + 1) with a correct num_online_cpus == 224,
>> >> > the regression disappears:
>> >> >
>> >> > 9 iterations of hackbench -l 16000 -g 16: 3.201sec (+/- 0.90%)
>>
>> I'm surprised that hackbench is that sensitive to slab performance, anyway. It's
>> supposed to be a scheduler benchmark? What exactly is going on?
>
> Uuuh, I think powerpc doesn't have cmpxchg_double?

The benchmark was done by Vincent on arm64, AFAICS. PowerPC (ppc64) was what
Bharata had used to demonstrate the order calculation change in his patch.

There seems to be some implementation dependency on CONFIG_ARM64_LSE_ATOMICS but
AFAICS that doesn't determine whether cmpxchg_double is provided.

> "vgrep cmpxchg_double arch/" just spits out arm64, s390 and x86? And
>
> says under "POWERPC": "no DW LL/SC"

Interesting find in any case.

> So powerpc is probably hitting the page-bitlock-based implementation
> all the time for stuff like __slab_free()? Do you have detailed
> profiling results from "perf top" or something like that?
>
> (I actually have some WIP patches and a design document for getting
> rid of cmpxchg_double in struct page that I hacked together in the
> last couple days; I'm currently in the process of sending them over to
> some other folks in the company who hopefully have cycles to
> review/polish/benchmark them so that they can be upstreamed, assuming
> that those folks think they're important enough. I don't have the
> cycles for it...)

I'm curious, so I hope this works out :)