From: Hyeonggon Yoo <42.hyeyoo@gmail.com>
Date: Sat, 19 Aug 2023 00:18:48 +0900
Subject: Re: [RFC 0/2] An attempt to improve SLUB on NUMA / under memory pressure
To: jaypatel@linux.ibm.com
Cc: Vlastimil Babka, Christoph Lameter, Pekka Enberg, Joonsoo Kim,
 David Rientjes, Andrew Morton, Roman Gushchin, Feng Tang,
 "Sang, Oliver", Binder Makin, aneesh.kumar@linux.ibm.com,
 tsahu@linux.ibm.com, piyushs@linux.ibm.com, fengwei.yin@intel.com,
 ying.huang@intel.com, lkp, "oe-lkp@lists.linux.dev",
 linux-mm@kvack.org, linux-kernel@vger.kernel.org

On Fri, Aug 18, 2023 at 4:11 PM Jay Patel wrote:
>
> On Fri, 2023-08-11 at 03:06 +0900, Hyeonggon Yoo wrote:
> > On Thu, Aug 10, 2023 at 7:56 PM Jay Patel wrote:
> > > On Mon, 2023-07-24 at 04:09 +0900, Hyeonggon Yoo wrote:
> > > > Hello folks,
> > > >
> > > > This series is motivated by kernel test bot report [1] on Jay's
> > > > patch that modifies slab order.
> > > > While the patch was not merged and not in the final form, I think
> > > > it was a good lesson that changing slab order has more impacts on
> > > > performance than we expected.
> > > >
> > > > While inspecting the report, I found some potential points to
> > > > improve SLUB. [2] It's _potential_ because it shows no improvements
> > > > on hackbench. but I believe more realistic workloads would benefit
> > > > from this. Due to lack of resources and lack of my understanding of
> > > > *realistic* workloads, I am asking you to help evaluating this
> > > > together.
> > >
> > > Hi Hyeonggon,
> > > I tried hackbench test on Powerpc machine with 16 cpus but
> > > got ~32% of Regression with patch.
> >
> > Thank you so much for measuring this! That's very helpful.
> > It's interesting because on an AMD machine with 2 NUMA nodes there
> > was not much difference.
> >
> > Does it have more than one socket?
> I have tested on single socket system.
> >
> > Could you confirm if the offending patch is patch 1 or 2?
> > If the offending one is patch 2, can you please check how large is L3
> > cache miss rate during hackbench?
> >
> Below regression is cause by Patch 1 "Revert mm, slub: change percpu
> partial accounting from objects to pages"

Fortunately I was able to reproduce the regression (5~10%) on my AMD
laptop :)
It's interesting, and thank you so much for pointing it out!

Patch 1 only modifies the slowpath, so the overhead of the calculation
itself should be negligible. I think it's fair to assume that the
regression occurs because the freelist is shortened by the patch, since
the reverted code rounds up the number of slabs:

> nr_slabs = DIV_ROUND_UP(nr_objects * 2, oo_objects(s->oo));

So before the patch more objects were cached than intended.
I'll try to bump up the default value to the point where it does not
use more memory than before.

By the way, it is very unclear to me what the optimal default value is.
Obviously 'Good enough value for hackbench' is not a good standard,
because it's quite a synthetic workload.

> Thanks
> Jay Patel
>
> > > Results as
> > >
> > > +-------+----+---------+------------+------------+
> > > |       |    | Normal  | With Patch |            |
> > > +-------+----+---------+------------+------------+
> > > | Amean | 1  | 1.3700  | 2.0353     | ( -32.69%) |
> > > | Amean | 4  | 5.1663  | 7.6563     | ( -32.52%) |
> > > | Amean | 7  | 8.9180  | 13.3353    | ( -33.13%) |
> > > | Amean | 12 | 15.4290 | 23.0757    | ( -33.14%) |
> > > | Amean | 21 | 27.3333 | 40.7823    | ( -32.98%) |
> > > | Amean | 30 | 38.7677 | 58.5300    | ( -33.76%) |
> > > | Amean | 48 | 62.2987 | 92.9850    | ( -33.00%) |
> > > | Amean | 64 | 82.8993 | 123.4717   | ( -32.86%) |
> > > +-------+----+---------+------------+------------+
> > >
> > > Thanks
> > > Jay Patel