From: Hyeonggon Yoo <42.hyeyoo@gmail.com>
Date: Mon, 31 Jul 2023 18:49:03 +0900
Subject: Re: [PATCH] [RFC PATCH v2]mm/slub: Optimize slub memory usage
To: Oliver Sang
Cc: Jay Patel, oe-lkp@lists.linux.dev, lkp@intel.com, linux-mm@kvack.org, ying.huang@intel.com, feng.tang@intel.com, fengwei.yin@intel.com, cl@linux.com, penberg@kernel.org, rientjes@google.com, iamjoonsoo.kim@lge.com, akpm@linux-foundation.org, vbabka@suse.cz, aneesh.kumar@linux.ibm.com, tsahu@linux.ibm.com, piyushs@linux.ibm.com
References: <20230628095740.589893-1-jaypatel@linux.ibm.com> <202307172140.3b34825a-oliver.sang@intel.com>

On Mon, Jul 24, 2023 at 11:40 AM Oliver Sang wrote:
>
> hi, Hyeonggon Yoo,
>
> On Thu, Jul 20, 2023 at 11:15:04PM +0900, Hyeonggon Yoo wrote:
> > On Thu, Jul 20, 2023 at 10:46 PM Hyeonggon Yoo <42.hyeyoo@gmail.com> wrote:
> > >
> > > On Thu, Jul 20, 2023 at 9:59 PM Hyeonggon Yoo <42.hyeyoo@gmail.com> wrote:
> > > > On Thu, Jul 20, 2023 at 12:01 PM Oliver Sang wrote:
> > > > > > > commit:
> > > > > > >   7bc162d5cc ("Merge branches 'slab/for-6.5/prandom', 'slab/for-6.5/slab_no_merge' and 'slab/for-6.5/slab-deprecate' into slab/for-next")
> > > > > > >   a0fd217e6d ("mm/slub: Optimize slub memory usage")
> > > > > > >
> > > > > > > 7bc162d5cc4de5c3 a0fd217e6d6fbd23e91f8796787
> > > > > > > ---------------- ---------------------------
> > > > > > >          %stddev     %change         %stddev
> > > > > > >              \          |                \
> > > > > > >     222503 ± 86%    +108.7%     464342 ± 58%  numa-meminfo.node1.Active
> > > > > > >     222459 ± 86%    +108.7%     464294 ± 58%  numa-meminfo.node1.Active(anon)
> > > > > > >      55573 ± 85%    +108.0%     115619 ± 58%  numa-vmstat.node1.nr_active_anon
> > > > > > >      55573 ± 85%    +108.0%     115618 ± 58%  numa-vmstat.node1.nr_zone_active_anon
> > > > > >
> > > > > > I'm quite baffled while reading this.
> > > > > > How did changing slab order calculation double the number of active anon pages?
> > > > > > I doubt two experiments were performed on the same settings.
> > > > >
> > > > > let me introduce our test process.
> > > > >
> > > > > we make sure the tests upon commit and its parent have exact same environment
> > > > > except the kernel difference, and we also make sure the config to build the
> > > > > commit and its parent are identical.
> > > > >
> > > > > we run tests for one commit at least 6 times to make sure the data is stable.
> > > > >
> > > > > such like for this case, we rebuild the commit and its parent's kernel, the
> > > > > config is attached FYI.
> > >
> > > Oh I missed the attachments.
> > > I need more time to look more into that, but could you please test
> > > this patch (attached)?
> >
> > Oh, my mistake. It has nothing to do with reclamation modifiers.
> > The correct patch should be this. Sorry for the noise.
>
> I applied below patch directly upon "mm/slub: Optimize slub memory usage",
> so our tree looks like below:
>
> * 6ba0286048431 (linux-devel/fixup-a0fd217e6d6fbd23e91f8796787b621e7d576088) mm/slub: do not allocate from remote node to allocate high order slab
> * a0fd217e6d6fb (linux-review/Jay-Patel/mm-slub-Optimize-slub-memory-usage/20230628-180050) mm/slub: Optimize slub memory usage
> *---.   7bc162d5cc4de (vbabka-slab/for-linus) Merge branches 'slab/for-6.5/prandom', 'slab/for-6.5/slab_no_merge' and 'slab/for-6.5/slab-deprecate' into slab/for-next
>
> 6ba0286048431 is as below [1]
> since there are some line number differences, no sure if my applying ok? or
> should I pick another base?

It was fine; it was tested correctly.

> by this applying, we noticed the regression still exists.
> on 128 threads 2 sockets Intel(R) Xeon(R) Gold 6338 CPU @ 2.00GHz (Ice Lake) with 256G memory

Thank you for testing it!
Unfortunately my guess seems to be wrong in this case,
based on the information that Feng Tang gave us.

While I'm still interested in evaluating potential gains in SLUB,
for this case I would like to focus more on the v4, as
Vlastimil pointed out!

Thanks,
Hyeonggon

> =========================================================================================
> compiler/cpufreq_governor/ipc/iterations/kconfig/mode/nr_threads/rootfs/tbox_group/testcase:
>   gcc-12/performance/socket/4/x86_64-rhel-8.3/process/100%/debian-11.1-x86_64-20220510.cgz/lkp-icl-2sp2/hackbench
>
> 7bc162d5cc4de5c3 a0fd217e6d6fbd23e91f8796787 6ba02860484315665e300d9f415
> ---------------- --------------------------- ---------------------------
>          %stddev     %change         %stddev     %change         %stddev
>              \          |                \          |                \
>     479042           -12.5%     419357           -12.0%     421407        hackbench.throughput
>
> detail data is attached as hackbench-6ba0286048431-ICL-Gold-6338
>
>
> on 128 threads 2 sockets Intel(R) Xeon(R) Platinum 8358 CPU @ 2.60GHz (Ice Lake) with 128G memory
>
> =========================================================================================
> compiler/cpufreq_governor/ipc/iterations/kconfig/mode/nr_threads/rootfs/tbox_group/testcase:
>   gcc-12/performance/socket/4/x86_64-rhel-8.3/process/100%/debian-11.1-x86_64-20220510.cgz/lkp-icl-2sp6/hackbench
>
> 7bc162d5cc4de5c3 a0fd217e6d6fbd23e91f8796787 6ba02860484315665e300d9f415
> ---------------- --------------------------- ---------------------------
>          %stddev     %change         %stddev     %change         %stddev
>              \          |                \          |                \
>     455347            -5.9%     428458            -6.4%     426221        hackbench.throughput
>
> detail data is attached as hackbench-6ba0286048431-ICL-Platinum-8358
>
>
> [1]
> commit 6ba02860484315665e300d9f41511f36940a50f0 (linux-devel/fixup-a0fd217e6d6fbd23e91f8796787b621e7d576088)
> Author: Hyeonggon Yoo <42.hyeyoo@gmail.com>
> Date:   Thu Jul 20 22:29:16 2023 +0900
>
>     mm/slub: do not allocate from remote node to allocate high order slab
>
>     Signed-off-by: Hyeonggon Yoo <42.hyeyoo@gmail.com>
>
> diff --git a/mm/slub.c b/mm/slub.c
> index 8ea7a5ccac0dc..303c57ee0f560 100644
> --- a/mm/slub.c
> +++ b/mm/slub.c
> @@ -1981,7 +1981,7 @@ static struct slab *allocate_slab(struct kmem_cache *s, gfp_t flags, int node)
>          * Let the initial higher-order allocation fail under memory pressure
>          * so we fall-back to the minimum order allocation.
>          */
> -       alloc_gfp = (flags | __GFP_NOWARN | __GFP_NORETRY) & ~__GFP_NOFAIL;
> +       alloc_gfp = (flags | __GFP_THISNODE | __GFP_NOWARN | __GFP_NORETRY) & ~__GFP_NOFAIL;
>         if ((alloc_gfp & __GFP_DIRECT_RECLAIM) && oo_order(oo) > oo_order(s->min))
>                 alloc_gfp = (alloc_gfp | __GFP_NOMEMALLOC) & ~__GFP_RECLAIM;
>
>
>
> >
> > From 74142b5131e731f662740d34623d93fd324f9b65 Mon Sep 17 00:00:00 2001
> > From: Hyeonggon Yoo <42.hyeyoo@gmail.com>
> > Date: Thu, 20 Jul 2023 22:29:16 +0900
> > Subject: [PATCH] mm/slub: do not allocate from remote node to allocate high
> >  order slab
> >
> > Signed-off-by: Hyeonggon Yoo <42.hyeyoo@gmail.com>
> > ---
> >  mm/slub.c | 2 +-
> >  1 file changed, 1 insertion(+), 1 deletion(-)
> >
> > diff --git a/mm/slub.c b/mm/slub.c
> > index f7940048138c..c584237d6a0d 100644
> > --- a/mm/slub.c
> > +++ b/mm/slub.c
> > @@ -2010,7 +2010,7 @@ static struct slab *allocate_slab(struct kmem_cache *s, gfp_t flags, int node)
> >          * Let the initial higher-order allocation fail under memory pressure
> >          * so we fall-back to the minimum order allocation.
> >          */
> > -       alloc_gfp = (flags | __GFP_NOWARN | __GFP_NORETRY) & ~__GFP_NOFAIL;
> > +       alloc_gfp = (flags | __GFP_THISNODE | __GFP_NOWARN | __GFP_NORETRY) & ~__GFP_NOFAIL;
> >         if ((alloc_gfp & __GFP_DIRECT_RECLAIM) && oo_order(oo) > oo_order(s->min))
> >                 alloc_gfp = (alloc_gfp | __GFP_NOMEMALLOC) & ~__GFP_RECLAIM;
> >
> > --
> > 2.41.0
> >
>
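
For anyone reading the quoted hunk without the kernel tree at hand, here is a minimal user-space sketch of how the mask for the first, opportunistic higher-order attempt ends up being composed. It is only an illustration: the DEMO_* names and their bit positions are made up for this example and are not the kernel's real __GFP_* values, and demo_first_try_gfp() is a hypothetical helper that takes the two orders as plain integers instead of calling oo_order().

```c
typedef unsigned int gfp_t;

/* Made-up bit positions, for illustration only (not the kernel's values). */
#define DEMO_DIRECT_RECLAIM (1u << 0)
#define DEMO_KSWAPD_RECLAIM (1u << 1)
#define DEMO_RECLAIM        (DEMO_DIRECT_RECLAIM | DEMO_KSWAPD_RECLAIM)
#define DEMO_NOWARN         (1u << 2)
#define DEMO_NORETRY        (1u << 3)
#define DEMO_NOFAIL         (1u << 4)
#define DEMO_NOMEMALLOC     (1u << 5)
#define DEMO_THISNODE       (1u << 6)

/*
 * Hypothetical stand-in for the two statements in the quoted hunk:
 * add THISNODE/NOWARN/NORETRY, drop NOFAIL, and if the caller allowed
 * direct reclaim while we are trying an order above the minimum, add
 * NOMEMALLOC and drop both reclaim bits.
 */
static gfp_t demo_first_try_gfp(gfp_t flags, unsigned int order,
                                unsigned int min_order)
{
    gfp_t alloc_gfp = (flags | DEMO_THISNODE | DEMO_NOWARN | DEMO_NORETRY)
                      & ~DEMO_NOFAIL;

    if ((alloc_gfp & DEMO_DIRECT_RECLAIM) && order > min_order)
        alloc_gfp = (alloc_gfp | DEMO_NOMEMALLOC) & ~DEMO_RECLAIM;

    return alloc_gfp;
}
```

With flags of DEMO_RECLAIM | DEMO_NOFAIL and order 3 against a minimum of 0, the result carries DEMO_THISNODE, DEMO_NOWARN, DEMO_NORETRY and DEMO_NOMEMALLOC and neither reclaim bit. When order equals the minimum, the reclaim bits stay set. As noted in the reply above, this change did not remove the hackbench regression.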