From mboxrd@z Thu Jan 1 00:00:00 1970
Date: Sat, 16 Oct 2021 11:38:38 +0000
From: Hyeonggon Yoo <42.hyeyoo@gmail.com>
To: linux-mm@kvack.org
Cc: Christoph Lameter, Pekka Enberg, David Rientjes, Joonsoo Kim,
	Andrew Morton, Vlastimil Babka, linux-kernel@vger.kernel.org
Subject: Re: [PATCH v2] mm, slub: Use prefetchw instead of prefetch
Message-ID: <20211016113838.GA12841@kvm.asia-northeast3-a.c.our-ratio-313919.internal>
References: <20211011144331.70084-1-42.hyeyoo@gmail.com>
In-Reply-To: <20211011144331.70084-1-42.hyeyoo@gmail.com>

Andrew, can you please update the patch to v2?

On Mon, Oct 11, 2021 at 02:43:31PM +0000, Hyeonggon Yoo wrote:
> commit 0ad9500e16fe ("slub: prefetch next freelist pointer in
> slab_alloc()") introduced prefetch_freepointer() because when other cpu(s)
> freed objects into a page that current cpu owns, the freelist link is
> hot on cpu(s) which freed objects and possibly very cold on current cpu.
>
> But if freelist link chain is hot on cpu(s) which freed objects,
> it's better to invalidate that chain because they're not going to access
> again within a short time.
>
> So use prefetchw instead of prefetch.
> On supported architectures like x86
> and arm, it invalidates other copied instances of a cache line when
> prefetching it.
>
> Before:
>
> Time: 91.677
>
>  Performance counter stats for 'hackbench -g 100 -l 10000':
>
> 1462938.07 msec cpu-clock             # 15.908 CPUs utilized
>        18072550 context-switches      # 12.354 K/sec
>         1018814 cpu-migrations        # 696.416 /sec
>          104558 page-faults           # 71.471 /sec
>   1580035699271 cycles                # 1.080 GHz  (54.51%)
>   2003670016013 instructions          # 1.27 insn per cycle  (54.31%)
>      5702204863 branch-misses         (54.28%)
>    643368500985 cache-references      # 439.778 M/sec  (54.26%)
>     18475582235 cache-misses          # 2.872 % of all cache refs  (54.28%)
>    642206796636 L1-dcache-loads       # 438.984 M/sec  (46.87%)
>     18215813147 L1-dcache-load-misses # 2.84% of all L1-dcache accesses  (46.83%)
>    653842996501 dTLB-loads            # 446.938 M/sec  (46.63%)
>      3227179675 dTLB-load-misses      # 0.49% of all dTLB cache accesses  (46.85%)
>    537531951350 iTLB-loads            # 367.433 M/sec  (54.33%)
>       114750630 iTLB-load-misses      # 0.02% of all iTLB cache accesses  (54.37%)
>    630135543177 L1-icache-loads       # 430.733 M/sec  (46.80%)
>     22923237620 L1-icache-load-misses # 3.64% of all L1-icache accesses  (46.76%)
>
>    91.964452802 seconds time elapsed
>
>    43.416742000 seconds user
>  1422.441123000 seconds sys
>
> After:
>
> Time: 90.220
>
>  Performance counter stats for 'hackbench -g 100 -l 10000':
>
> 1437418.48 msec cpu-clock             # 15.880 CPUs utilized
>        17694068 context-switches      # 12.310 K/sec
>          958257 cpu-migrations        # 666.651 /sec
>          100604 page-faults           # 69.989 /sec
>   1583259429428 cycles                # 1.101 GHz  (54.57%)
>   2004002484935 instructions          # 1.27 insn per cycle  (54.37%)
>      5594202389 branch-misses         (54.36%)
>    643113574524 cache-references      # 447.409 M/sec  (54.39%)
>     18233791870 cache-misses          # 2.835 % of all cache refs  (54.37%)
>    640205852062 L1-dcache-loads       # 445.386 M/sec  (46.75%)
>     17968160377 L1-dcache-load-misses # 2.81% of all L1-dcache accesses  (46.79%)
>    651747432274 dTLB-loads            # 453.415 M/sec  (46.59%)
>      3127124271 dTLB-load-misses      # 0.48% of all dTLB cache accesses  (46.75%)
>    535395273064 iTLB-loads            # 372.470 M/sec  (54.38%)
>       113500056 iTLB-load-misses      # 0.02% of all iTLB cache accesses  (54.35%)
>    628871845924 L1-icache-loads       # 437.501 M/sec  (46.80%)
>     22585641203 L1-icache-load-misses # 3.59% of all L1-icache accesses  (46.79%)
>
>    90.514819303 seconds time elapsed
>
>    43.877656000 seconds user
>  1397.176001000 seconds sys
>
> Link: https://lkml.org/lkml/2021/10/8/598
> Signed-off-by: Hyeonggon Yoo <42.hyeyoo@gmail.com>
> ---
>  mm/slub.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/mm/slub.c b/mm/slub.c
> index 3d2025f7163b..ce3d8b11215c 100644
> --- a/mm/slub.c
> +++ b/mm/slub.c
> @@ -354,7 +354,7 @@ static inline void *get_freepointer(struct kmem_cache *s, void *object)
>
>  static void prefetch_freepointer(const struct kmem_cache *s, void *object)
>  {
> -	prefetch(object + s->offset);
> +	prefetchw(object + s->offset);
>  }
>
>  static inline void *get_freepointer_safe(struct kmem_cache *s, void *object)
> --
> 2.27.0
>