From: Yu Zhao
Date: Mon, 26 Jun 2023 11:04:51 -0600
Subject: Re: [PATCH 3/3] mm/lru_gen: Don't build multi-gen LRU page table walk code on architecture not supported
To: Aneesh Kumar K V
Cc: linux-mm@kvack.org, akpm@linux-foundation.org
References: <20230613120047.149573-1-aneesh.kumar@linux.ibm.com> <20230613120047.149573-3-aneesh.kumar@linux.ibm.com> <87bkh4x661.fsf@linux.ibm.com> <959537fc-cee4-f5e8-d7be-5e4402feda9f@linux.ibm.com>
In-Reply-To: <959537fc-cee4-f5e8-d7be-5e4402feda9f@linux.ibm.com>

On Mon, Jun 26, 2023 at 4:52 AM Aneesh Kumar K V wrote:
>
> On 6/26/23 1:04 AM, Yu Zhao wrote:
> > On Sat, Jun 24, 2023 at 8:54 AM Aneesh Kumar K.V
> > wrote:
> >>
> >> Hi Yu Zhao,
> >>
> >> "Aneesh Kumar K.V" writes:
> >>
> >>> Not all architectures support hardware atomic updates of access bits. On
> >>> such an arch, we don't use page table walk to classify pages into
> >>> generations.
> >>> Add a kernel config option and remove adding all the page
> >>> table walk code on such architecture.
> >>>
> >>> No performance change observed with mongodb ycsb test:
> >>>
> >>> Patch details      Throughput(Ops/sec)
> >>> without patch      93278
> >>> With patch         93400
> >>>
> >>> Without patch:
> >>> $ size mm/vmscan.o
> >>>    text    data     bss     dec     hex filename
> >>>  112102   42721      40  154863   25cef mm/vmscan.o
> >>>
> >>> With patch
> >>>
> >>> $ size mm/vmscan.o
> >>>    text    data     bss     dec     hex filename
> >>>  105430   41333      24  146787   23d63 mm/vmscan.o
> >>>
> >>
> >> Any feedback on this patch? Can we look at merging this change?
> >
> > Just want to make sure I fully understand the motivation: are there
> > any other end goals besides reducing the footprint mentioned above?
> > E.g., preparing for HCA, etc. (My current understanding is that HCA
> > shouldn't care about it, since it's already runtime disabled if HCA
> > doesn't want to use it.)
> >
>
> My goal with this change was to remove all that dead code from getting compiled
> in for ppc64.

I see. But the first thing (lru_gen_add_folio()) you moved has nothing
to do with this goal, because it's still compiled after the entire
series.

> > Also as explained offline, solely relying on folio_activate() in
> > lru_gen_look_around() can cause a measurable regression on powerpc,
> > because
> > 1. PAGEVEC_SIZE is 15 whereas pglist_data->mm_walk.batched is
> > virtually unlimited.
> > 2. Once folio_activate() reaches that limit, it takes the LRU lock on
> > top of the PTL, which can be shared by multiple page tables on
> > powerpc.
> >
> > In fact, I think we should try the opposite direction first, before arriving
> > at any conclusions, i.e.,
> >     #define arch_has_hw_pte_young() radix_enabled()
>
> The reason it is disabled on powerpc was that a reference bit update takes a page fault
> on powerpc irrespective of the translation mode.

This is not true.

From "IBM POWER9 Processor User Manual":
https://openpowerfoundation.org/resources/ibmpower9usermanual/

  4.10.14 Reference and Change Bits
  ...
  When performing HPT translation, the hardware performs the R and C bit
  updates nonatomically.
  ...

The radix case is more complex, and I'll leave it to you to interpret
what it means:

From "Power ISA Version 3.0 B":
https://openpowerfoundation.org/specifications/isa/

  5.7.12 Reference and Change Recording
  ...
  For Radix Tree translation, the Reference and Change bits are set atomically.
  ...

> > on powerpc. This might benefit platforms with the A-bit but not HCA,
> > e.g., POWER9. I just ran a quick test (memcached/memtier I previously
> > shared with you) and it showed far less PTL contention in kswapd. I'm
> > attaching the flamegraphs for you to analyze. Could you try some
> > benchmarks with the above change on your end as well?
> >
>
> The ptl lock is a valid concern even though I didn't observe the contention increasing with
> the change. I will rerun the test to verify. We have possibly two options here:
>
> 1) Delay the lruvec->nr_pages update until the sort phase. But as you explained earlier, that
> can impact should_run_aging().
>
>
> 2) Add another batching mechanism similar to pglist_data->mm_walk which can be used on architectures
> that don't support hardware update of access/reference bit to be used only by lru_gen_look_around()

Sounds good. Thanks.
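
For illustration only, the batching idea in option 2 can be modelled in user space. The sketch below is not kernel code: struct promo_batch, struct lru_model, SKETCH_NR_GENS and every function name are hypothetical, and a counter stands in for taking the LRU lock. It only shows why a small fixed batch (such as the PAGEVEC_SIZE of 15 mentioned above) leads to many more lock acquisitions than a batch that is flushed once per walk.

```c
#include <assert.h>
#include <stddef.h>

#define SKETCH_NR_GENS 4	/* hypothetical number of generations */

/* Pending per-generation size changes, accumulated without the lock. */
struct promo_batch {
	long delta[SKETCH_NR_GENS];
	size_t pending;		/* promotions recorded since the last flush */
	size_t limit;		/* flush once this many are pending */
};

/* Stand-in for the per-lruvec counters protected by the LRU lock. */
struct lru_model {
	long nr_pages[SKETCH_NR_GENS];
	unsigned long lock_acquisitions;
};

/* Apply all pending deltas under one (modelled) lock acquisition. */
static void batch_flush(struct lru_model *lru, struct promo_batch *b)
{
	if (!b->pending)
		return;

	lru->lock_acquisitions++;	/* models lock + unlock of the LRU lock */
	for (int gen = 0; gen < SKETCH_NR_GENS; gen++) {
		lru->nr_pages[gen] += b->delta[gen];
		b->delta[gen] = 0;
	}
	b->pending = 0;
}

/* Record moving nr pages from old_gen to new_gen; flush when full. */
static void batch_promote(struct lru_model *lru, struct promo_batch *b,
			  int old_gen, int new_gen, long nr)
{
	assert(old_gen >= 0 && old_gen < SKETCH_NR_GENS);
	assert(new_gen >= 0 && new_gen < SKETCH_NR_GENS);

	b->delta[old_gen] -= nr;
	b->delta[new_gen] += nr;
	if (++b->pending >= b->limit)
		batch_flush(lru, b);
}

/* Promote nr_folios single-page folios; return modelled lock acquisitions. */
static unsigned long simulate_lock_acquisitions(size_t nr_folios, size_t limit)
{
	struct lru_model lru = { .nr_pages = { [0] = (long)nr_folios } };
	struct promo_batch b = { .limit = limit };

	assert(limit > 0);
	for (size_t i = 0; i < nr_folios; i++)
		batch_promote(&lru, &b, 0, SKETCH_NR_GENS - 1, 1);
	batch_flush(&lru, &b);	/* final flush at the end of the walk */

	/* Every page must have moved from the oldest to the youngest slot. */
	assert(lru.nr_pages[0] == 0);
	assert(lru.nr_pages[SKETCH_NR_GENS - 1] == (long)nr_folios);
	return lru.lock_acquisitions;
}
```

With an arbitrary 512 promotions, simulate_lock_acquisitions(512, 15) returns 35, while simulate_lock_acquisitions(512, 512) returns 1. How a real batch would interact with should_run_aging() and the PTL is not captured by this model.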