Date: Mon, 13 Nov 2023 05:18:08 +0000
From: Matthew Wilcox
To: John Hubbard
Cc: Ryan Roberts, Andrew Morton, Yin Fengwei, David Hildenbrand,
 Yu Zhao, Catalin Marinas, Anshuman Khandual, Yang Shi, "Huang, Ying",
 Zi Yan, Luis Chamberlain, Itaru Kitayama, "Kirill A. Shutemov",
 David Rientjes, Vlastimil Babka, Hugh Dickins, linux-mm@kvack.org,
 linux-kernel@vger.kernel.org, linux-arm-kernel@lists.infradead.org
Subject: Re: [PATCH v6 0/9] variable-order, large folios for anonymous memory
References: <20230929114421.3761121-1-ryan.roberts@arm.com>

On Sun, Nov 12, 2023 at
10:57:47PM -0500, John Hubbard wrote:
> I've done some initial performance testing of this patchset on an arm64
> SBSA server. When these patches are combined with the arm64 arch contpte
> patches in Ryan's git tree (he has conveniently combined everything
> here: [1]), we are seeing a remarkable, consistent speedup of 10.5x on
> some memory-intensive workloads. Many test runs, conducted independently
> by different engineers and on different machines, have convinced me and
> my colleagues that this is an accurate result.
>
> In order to achieve that result, we used the git tree in [1] with
> the following settings:
>
>     echo always >/sys/kernel/mm/transparent_hugepage/enabled
>     echo recommend >/sys/kernel/mm/transparent_hugepage/anon_orders
>
> This was on an aarch64 machine configured to use a 64KB base page size.
> That configuration means that the PMD size is 512MB, which is of course
> too large for practical use as a pure PMD-THP. However, with these
> small-size (less than PMD-sized) THPs, we get the improvements in TLB
> coverage, while still getting pages that are small enough to be
> effectively usable.

That is quite remarkable!

My hope is to abolish the 64kB page size configuration. That is, instead
of the mixture of page sizes that you are currently using -- 64k and 1M
(right? Order-0 and order-4) -- I hope that a mixture of 4k, 64k and 2MB
(order-0, order-4 and order-9) will provide better performance.

Have you run any experiments with a 4kB page size?
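
For reference, the sizes quoted in this thread follow directly from the base page size and the folio order. Below is a minimal sketch in Python of that arithmetic. The helper names folio_size() and pmd_order() are hypothetical, not kernel APIs, and the sketch assumes 8-byte page-table entries with each table occupying exactly one base page, which matches the 4kB and 64kB arm64 configurations discussed here.

```python
KIB = 1024
MIB = 1024 * KIB


def folio_size(base_page_size: int, order: int) -> int:
    """Size in bytes of an order-N folio: base_page_size * 2**order."""
    return base_page_size << order


def pmd_order(base_page_size: int, pte_size: int = 8) -> int:
    """Order of a PMD-sized mapping.

    Assumption: one page-table page holds base_page_size / pte_size
    entries, so a PMD entry covers that many base pages.
    """
    entries = base_page_size // pte_size
    return entries.bit_length() - 1  # log2 of a power of two


if __name__ == "__main__":
    for base in (4 * KIB, 64 * KIB):
        for order in (0, 4, pmd_order(base)):
            size_kib = folio_size(base, order) // KIB
            print(f"base={base // KIB}kB order={order}: {size_kib}kB")
```

With a 64kB base page this gives 1MB for order-4 and 512MB for the PMD size; with a 4kB base page it gives 64kB for order-4 and 2MB (order-9) for the PMD size, matching the figures in the messages above.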