From: Aneesh Kumar K V
Date: Tue, 27 Jun 2023 17:18:47 +0530
Subject: Re: [PATCH 3/3] mm/lru_gen: Don't build multi-gen LRU page table walk code on architecture not supported
To: Yu Zhao
Cc: linux-mm@kvack.org, akpm@linux-foundation.org, "npiggin@gmail.com"
References: <20230613120047.149573-1-aneesh.kumar@linux.ibm.com> <20230613120047.149573-3-aneesh.kumar@linux.ibm.com> <87bkh4x661.fsf@linux.ibm.com> <959537fc-cee4-f5e8-d7be-5e4402feda9f@linux.ibm.com>

On 6/26/23 10:34 PM, Yu Zhao wrote:
> On Mon, Jun 26, 2023 at 4:52 AM Aneesh Kumar K V wrote:
>>
>> On 6/26/23 1:04 AM, Yu Zhao wrote:
>>> On Sat, Jun 24, 2023 at 8:54 AM Aneesh Kumar K.V wrote:
>>>>
>>>> Hi Yu Zhao,
>>>>
>>>> "Aneesh Kumar K.V" writes:
>>>>
>>>>> Not all architectures support hardware atomic updates of access bits. On
>>>>> such an arch, we don't use page table walk to classify pages into
>>>>> generations. Add a kernel config option and avoid building all the page
>>>>> table walk code on such architectures.
>>>>>
>>>>> No performance change observed with mongodb ycsb test:
>>>>>
>>>>> Patch details      Throughput(Ops/sec)
>>>>> without patch      93278
>>>>> With patch         93400
>>>>>
>>>>> Without patch:
>>>>> $ size mm/vmscan.o
>>>>>    text    data     bss     dec     hex filename
>>>>>  112102   42721      40  154863   25cef mm/vmscan.o
>>>>>
>>>>> With patch
>>>>>
>>>>> $ size mm/vmscan.o
>>>>>    text    data     bss     dec     hex filename
>>>>>  105430   41333      24  146787   23d63 mm/vmscan.o
>>>>>
>>>>
>>>> Any feedback on this patch? Can we look at merging this change?
>>>
>>> Just want to make sure I fully understand the motivation: are there
>>> any other end goals besides reducing the footprint mentioned above?
>>> E.g., preparing for HCA, etc. (My current understanding is that HCA
>>> shouldn't care about it, since it's already runtime disabled if HCA
>>> doesn't want to use it.)
>>>
>>
>> My goal with this change was to keep all that dead code from getting compiled
>> in for ppc64.
>
> I see.
> But the first thing (lru_gen_add_folio()) you moved has nothing
> to do with this goal, because it's still compiled after the entire
> series.
>

Sure. Will drop that change.

>>> Also as explained offline, solely relying on folio_activate() in
>>> lru_gen_look_around() can cause a measurable regression on powerpc,
>>> because
>>> 1. PAGEVEC_SIZE is 15 whereas pglist_data->mm_walk.batched is
>>>    virtually unlimited.
>>> 2. Once folio_activate() reaches that limit, it takes the LRU lock on
>>>    top of the PTL, which can be shared by multiple page tables on
>>>    powerpc.
>>>
>>> In fact, I think we should try the opposite direction first, before arriving
>>> at any conclusions, i.e.,
>>>     #define arch_has_hw_pte_young() radix_enabled()
>>
>> The reason it is disabled on powerpc was that a reference bit update takes a pagefault
>> on powerpc irrespective of the translation mode.
>
> This is not true.
>
> From "IBM POWER9 Processor User Manual":
> https://openpowerfoundation.org/resources/ibmpower9usermanual/
>
>   4.10.14 Reference and Change Bits
>   ...
>   When performing HPT translation, the hardware performs the R and C
>   bit updates nonatomically.
>   ...
>
> The radix case is more complex, and I'll leave it to you to interpret
> what it means:
>
> From "Power ISA Version 3.0 B":
> https://openpowerfoundation.org/specifications/isa/
>
>   5.7.12 Reference and Change Recording
>   ...
>   For Radix Tree translation, the Reference and Change bits are set atomically.
>   ...
>

It is atomic in that software uses ldarx/stdcx to update these bits. The
hardware/core won't update them directly, even though Nest can update them
directly without taking a fault. So for all practical purposes we can assume
that on radix the R/C bits are updated by the page fault handler. The generic
page table update sequence is slightly different with hash translation, in
that some page table field updates require marking the page table entry
invalid.

>>> on powerpc. This might benefit platforms with the A-bit but not HCA,
>>> e.g., POWER9.
>>> I just ran a quick test (memcached/memtier I previously
>>> shared with you) and it showed far less PTL contention in kswapd. I'm
>>> attaching the flamegraphs for you to analyze. Could you try some
>>> benchmarks with the above change on your end as well?
>>>
>>
>> The ptl lock is a valid concern even though I didn't observe the contention increasing with
>> the change. I will rerun the test to verify. We have possibly two options here:
>>
>> 1) Delay the lruvec->nr_pages update until the sort phase. But as you explained earlier, that
>> can impact should_run_aging().
>>
>>
>> 2) Add another batching mechanism, similar to pglist_data->mm_walk, for architectures
>> that don't support hardware update of the access/reference bit, to be used only by lru_gen_look_around().
>
> Sounds good. Thanks.

Should I go ahead and work on approach 2, which I outlined above?

-aneesh
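
P.S. To make approach 2 concrete, below is a rough userspace model of the
batching idea. It is only a sketch, not kernel code and not the eventual
patch. All names in it (struct lruvec_model, struct look_around_batch,
batch_promote(), batch_flush(), BATCH_LIMIT) are hypothetical, and the
"lock" is just a counter standing in for an LRU lock round trip. The
point it illustrates is that per-generation size deltas can be
accumulated locally and applied once per batch, instead of taking the
lock every time a small fixed-size vector fills up.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_NR_GENS 4   /* illustrative number of generations */
#define BATCH_LIMIT 64  /* hypothetical flush threshold, not a kernel constant */

/* Hypothetical stand-in for the per-lruvec generation counters. */
struct lruvec_model {
	long nr_pages[MAX_NR_GENS];
	unsigned long lock_acquisitions; /* simulated LRU lock round trips */
};

/* Hypothetical look-around batch: pending per-generation deltas. */
struct look_around_batch {
	long delta[MAX_NR_GENS];
	int batched; /* number of promotions accumulated so far */
};

/* Apply all pending deltas under one simulated lock acquisition. */
static void batch_flush(struct lruvec_model *lruvec,
			struct look_around_batch *b)
{
	if (!b->batched)
		return;

	lruvec->lock_acquisitions++; /* models one lock/unlock pair */
	for (size_t gen = 0; gen < MAX_NR_GENS; gen++) {
		lruvec->nr_pages[gen] += b->delta[gen];
		b->delta[gen] = 0;
	}
	b->batched = 0;
}

/* Record moving nr pages from old_gen to new_gen; flush when full. */
static void batch_promote(struct lruvec_model *lruvec,
			  struct look_around_batch *b,
			  int old_gen, int new_gen, long nr)
{
	assert(old_gen >= 0 && old_gen < MAX_NR_GENS);
	assert(new_gen >= 0 && new_gen < MAX_NR_GENS);

	b->delta[old_gen] -= nr;
	b->delta[new_gen] += nr;
	if (++b->batched >= BATCH_LIMIT)
		batch_flush(lruvec, b);
}
```

In this model a caller would invoke batch_promote() for each young PTE
found during the look-around and call batch_flush() once before dropping
the PTL, so the counters are only touched under the simulated lock once
per BATCH_LIMIT promotions plus one final flush.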