From: Yu Zhao
Date: Tue, 27 Jun 2023 13:10:55 -0600
Subject: Re: [PATCH 3/3] mm/lru_gen: Don't build multi-gen LRU page table walk code on architecture not supported
To: Aneesh Kumar K V
Cc: linux-mm@kvack.org, akpm@linux-foundation.org, "npiggin@gmail.com", linuxppc-dev
References: <20230613120047.149573-1-aneesh.kumar@linux.ibm.com> <20230613120047.149573-3-aneesh.kumar@linux.ibm.com> <87bkh4x661.fsf@linux.ibm.com> <959537fc-cee4-f5e8-d7be-5e4402feda9f@linux.ibm.com>

On Tue, Jun 27, 2023 at 5:48 AM Aneesh Kumar K V wrote:
>
> On 6/26/23 10:34 PM, Yu Zhao wrote:
> > On Mon, Jun 26, 2023 at 4:52 AM Aneesh Kumar K V wrote:
> >>
> >> On 6/26/23 1:04 AM, Yu Zhao wrote:
> >>> On Sat, Jun 24, 2023 at 8:54 AM Aneesh Kumar K.V wrote:
> >>>>
> >>>> Hi Yu Zhao,
> >>>>
> >>>> "Aneesh Kumar K.V" writes:
> >>>>
> >>>>> Not all architecture supports hardware atomic updates of access bits. On
> >>>>> such an arch, we don't use page table walk to classify pages into
> >>>>> generations. Add a kernel config option and remove adding all the page
> >>>>> table walk code on such architecture.
> >>>>>
> >>>>> No performance change observed with mongodb ycsb test:
> >>>>>
> >>>>> Patch details      Throughput(Ops/sec)
> >>>>> without patch      93278
> >>>>> With patch         93400
> >>>>>
> >>>>> Without patch:
> >>>>> $ size mm/vmscan.o
> >>>>>    text    data     bss     dec     hex filename
> >>>>>  112102   42721      40  154863   25cef mm/vmscan.o
> >>>>>
> >>>>> With patch
> >>>>>
> >>>>> $ size mm/vmscan.o
> >>>>>    text    data     bss     dec     hex filename
> >>>>>  105430   41333      24  146787   23d63 mm/vmscan.o
> >>>>>
> >>>>
> >>>> Any feedback on this patch? Can we look at merging this change?
> >>>
> >>> Just want to make sure I fully understand the motivation: are there
> >>> any other end goals besides reducing the footprint mentioned above?
> >>> E.g., preparing for HCA, etc. (My current understanding is that HCA
> >>> shouldn't care about it, since it's already runtime disabled if HCA
> >>> doesn't want to use it.)
> >>>
> >>
> >> My goal with this change was to remove all those dead code from getting compiled
> >> in for ppc64.
> >
> > I see. But the first thing (lru_gen_add_folio()) you moved has nothing
> > to do with this goal, because it's still compiled after the entire
> > series.
> >
>
> Sure. will drop that change.
>
> >>> Also as explained offline, solely relying on folio_activate() in
> >>> lru_gen_look_around() can cause a measurable regression on powerpc,
> >>> because
> >>> 1. PAGEVEC_SIZE is 15 whereas pglist_data->mm_walk.batched is
> >>>    virtually unlimited.
> >>> 2. Once folio_activate() reaches that limit, it takes the LRU lock on
> >>>    top of the PTL, which can be shared by multiple page tables on
> >>>    powerpc.
> >>>
> >>> In fact, I think we try the opposite direction first, before arriving
> >>> at any conclusions, i.e.,
> >>>     #define arch_has_hw_pte_young() radix_enabled()
> >>
> >> The reason it is disabled on powerpc was that a reference bit update takes a pagefault
> >> on powerpc irrespective of the translation mode.
> >
> > This is not true.
> >
> > From "IBM POWER9 Processor User Manual":
> > https://openpowerfoundation.org/resources/ibmpower9usermanual/
> >
> >   4.10.14 Reference and Change Bits
> >   ...
> >   When performing HPT translation, the hardware performs the R and C
> >   bit updates nonatomically.
> >   ...
> >
> > The radix case is more complex, and I'll leave it to you to interpret
> > what it means:
> >
> > From "Power ISA Version 3.0 B":
> > https://openpowerfoundation.org/specifications/isa/
> >
> >   5.7.12 Reference and Change Recording
> >   ...
> >   For Radix Tree translation, the Reference and Change bits are set atomically.
> >   ...
> >
>
> it is atomic in that software use ldarx/stdcx to update these bits. Hardware/core won't
> update this directly even though Nest can update this directly without taking a fault. So
> for all purpose we can assume that on radix R/C bit is updated by page fault handler.

Thanks. To me, it sounds like stating a function provided by h/w, not a
requirement for s/w. (IMO, the latter would be something like "software
must/should set the bits atomically.") But I'll take your word for it.
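
For what it's worth, the batching concern quoted above (points 1 and 2)
can be put into a rough userspace model. This is only a sketch, not
kernel code: PAGEVEC_SIZE = 15 comes from the thread, while the two
helper functions and their names are made up for illustration, and the
assumption that a fixed-size batch takes the LRU lock exactly once each
time it fills up is a simplification.

```c
#include <assert.h>
#include <stddef.h>

/* Value quoted in the thread; everything else here is a hypothetical model. */
#define PAGEVEC_SIZE 15

/*
 * Simplified model of the folio_activate() path: folios are collected in a
 * fixed-size batch, and the LRU lock is taken once each time the batch
 * becomes full. Returns how many times the lock would be taken for
 * nr_young folios.
 */
static size_t lru_locks_with_fixed_batch(size_t nr_young, size_t batch_size)
{
	assert(batch_size > 0);
	return nr_young / batch_size;
}

/*
 * Simplified model of the page table walk path: updates accumulate without
 * a practical limit and are applied in one go, so the lock is taken at
 * most once.
 */
static size_t lru_locks_with_unbounded_batch(size_t nr_young)
{
	return nr_young ? 1 : 0;
}
```

In this model, 64 young folios cost 4 LRU lock acquisitions with the
fixed batch (64 / 15, rounded down) versus 1 with the unbounded batch.
It says nothing about the PTL sharing in point 2, which would make each
of those acquisitions more expensive on powerpc.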