From: Ryan Roberts
To: Matthew Wilcox
Cc: John Hubbard, Andrew Morton, Yin Fengwei, David Hildenbrand, Yu Zhao, Catalin Marinas, Anshuman Khandual, Yang Shi, "Huang, Ying", Zi Yan, Luis Chamberlain, Itaru Kitayama, "Kirill A. Shutemov", David Rientjes, Vlastimil Babka, Hugh Dickins, linux-mm@kvack.org, linux-kernel@vger.kernel.org, linux-arm-kernel@lists.infradead.org
Subject: Re: [PATCH v6 0/9] variable-order, large folios for anonymous memory
Date: Tue, 14 Nov 2023 10:57:07 +0000
Message-ID: <181a25c2-219e-4af9-9f8e-e5f514bbc4b6@arm.com>
References: <20230929114421.3761121-1-ryan.roberts@arm.com>

On 13/11/2023 15:04, Matthew Wilcox wrote:
> On Mon, Nov 13, 2023 at 10:19:48AM +0000, Ryan Roberts wrote:
>> On 13/11/2023 05:18, Matthew Wilcox wrote:
>>> My hope is to abolish the 64kB page size configuration. ie instead of
>>> using the mixture of page sizes that you currently are -- 64k and
>>> 1M (right? Order-0, and order-4)
>>
>> Not quite; the contpte-size for a 64K page size is 2M/order-5. (and yes, it is
>> 64K/order-4 for a 4K page size, and 2M/order-7 for a 16K page size. I agree that
>> intuitively you would expect the order to remain constant, but it doesn't).
>>
>> The "recommend" setting above will actually enable order-3 as well even though
>> there is no HW benefit to this. So the full set of available memory sizes here is:
>>
>> 64K/order-0, 512K/order-3, 2M/order-5, 512M/order-13
>>
>>> , that 4k, 64k and 2MB (order-0,
>>> order-4 and order-9) will provide better performance.
>>>
>>> Have you run any experiments with a 4kB page size?
>>
>> Agree that would be interesting with 64K small-sized THP enabled. And I'd love
>> to get to a world where we universally deal in variable sized chunks of memory,
>> aligned on 4K boundaries.
>>
>> In my experience though, there are still some performance benefits to 64K base
>> page vs 4K+contpte; the page tables are more cache efficient for the former case
>> - 64K of memory is described by 8 bytes in the former vs 8x16=128 bytes in the
>> latter. In practice the HW will still only read 8 bytes in the latter but that's
>> taking up a full cache line vs the former where a single cache line stores 8x
>> 64K entries.
>
> This is going to depend on your workload though -- if you're using more
> 2MB than 64kB, you get to elide a layer of page table with 4k base,
> rather than taking up 4 cache lines with a 64k base.

True, but again depending on workload/config, you may have fewer levels of lookup
for the 64K native case in the first place because you consume more VA bits at
each level.
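
The arithmetic traded in this exchange (orders, PTE footprint, cache lines,
lookup levels) can be checked with a short script. This is only an illustrative
sketch in Python, not kernel code. The helper names (order_of, pte_bytes,
cache_lines, lookup_levels) are hypothetical. It assumes 8-byte PTEs (as stated
above), 64-byte cache lines (implied by "a single cache line stores 8x 64K
entries"), and a 48-bit VA space, which nobody in the thread actually specifies.

```python
# Illustrative arithmetic only; not kernel code.
# Assumptions: 8-byte PTEs, 64-byte cache lines, 48-bit VA (the VA size is a guess).

KIB = 1024
MIB = 1024 * KIB
PTE_BYTES = 8
CACHE_LINE_BYTES = 64

# contpte block size for each base page size, as quoted in the thread.
CONTPTE_BYTES = {4 * KIB: 64 * KIB, 16 * KIB: 2 * MIB, 64 * KIB: 2 * MIB}


def order_of(size_bytes, base_page_bytes):
    """Return n such that base_page_bytes << n == size_bytes."""
    pages, rem = divmod(size_bytes, base_page_bytes)
    if rem or pages < 1 or pages & (pages - 1):
        raise ValueError("size is not a power-of-two multiple of the base page")
    return pages.bit_length() - 1


def pte_bytes(region_bytes, base_page_bytes):
    """Bytes of last-level PTEs needed to map region_bytes with base pages."""
    return (region_bytes // base_page_bytes) * PTE_BYTES


def cache_lines(nbytes):
    """Number of cache lines spanned by nbytes of contiguous, aligned PTEs."""
    return -(-nbytes // CACHE_LINE_BYTES)  # ceiling division


def lookup_levels(va_bits, base_page_bytes):
    """Page table levels needed when every table occupies one base page."""
    page_shift = base_page_bytes.bit_length() - 1
    bits_per_level = page_shift - (PTE_BYTES.bit_length() - 1)
    return -(-(va_bits - page_shift) // bits_per_level)


if __name__ == "__main__":
    for base, contpte in CONTPTE_BYTES.items():
        print(
            f"base {base // KIB}K: contpte order {order_of(contpte, base)}, "
            f"64K costs {pte_bytes(64 * KIB, base) if base <= 64 * KIB else 0} PTE bytes, "
            f"{lookup_levels(48, base)} levels at 48-bit VA"
        )
    # 2M mapped with 64K base pages: 32 PTEs, i.e. the "4 cache lines" above.
    print(cache_lines(pte_bytes(2 * MIB, 64 * KIB)))
```

Under these assumptions lookup_levels(48, 4 * KIB) gives 4 and
lookup_levels(48, 64 * KIB) gives 3. A different VA size changes both counts,
so treat the level comparison as config-dependent.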