From: Ryan Roberts
Date: Wed, 17 Jul 2024 11:45:48 +0100
Subject: Re: [RFC PATCH v1 0/4] Control folio sizes used for page cache memory
To: David Hildenbrand, Andrew Morton, Hugh Dickins, Jonathan Corbet,
 "Matthew Wilcox (Oracle)", Barry Song, Lance Yang, Baolin Wang, Gavin Shan,
 Pankaj Raghav, Daniel Gomez
Cc: linux-kernel@vger.kernel.org, linux-mm@kvack.org
Message-ID: <99b33a29-e97a-4932-8d7a-85bc01885d18@arm.com>
References: <20240717071257.4141363-1-ryan.roberts@arm.com>

On 17/07/2024 11:31, David Hildenbrand wrote:
> On 17.07.24 09:12, Ryan Roberts wrote:
>> Hi All,
>>
>> This series is an RFC that adds sysfs and kernel cmdline controls to configure
>> the set of allowed large folio sizes that can be used when allocating
>> file-memory for the page cache. As part of the control mechanism, it provides
>> for a special-case "preferred folio size for executable mappings" marker.
>>
>> I'm trying to solve 2 separate problems with this series:
>>
>> 1. Reduce pressure in iTLB and improve performance on arm64: This is a modified
>> approach for the change at [1]. Instead of hardcoding the preferred executable
>> folio size into the arch, user space can now select it. This decouples the arch
>> code and also makes the mechanism more generic; it can be bypassed (the default)
>> or any folio size can be set. For my use case, 64K is preferred, but I've also
>> heard from Willy of a use case where putting all text into 2M PMD-sized folios
>> is preferred. This approach avoids the need for synchronous MADV_COLLAPSE (and
>> therefore faulting in all text ahead of time) to achieve that.
>>
>> 2. Reduce memory fragmentation in systems under high memory pressure (e.g.
>> Android): The theory goes that if all folios are 64K, then failure to allocate a
>> 64K folio should become unlikely. But if the page cache is allocating lots of
>> different orders, with most allocations having an order below 64K (as is the
>> case today) then the ability to allocate 64K folios diminishes. By providing
>> control over the allowed set of folio sizes, we can tune to avoid crucial 64K
>> folio allocation failure. Additionally, I've heard (second hand) of the need to
>> disable large folios in the page cache entirely due to latency concerns in some
>> settings. These controls allow all of this without kernel changes.
>>
>> The value of (1) is clear and the performance improvements are documented in
>> patch 2. I don't yet have any data demonstrating the theory for (2) since I
>> can't reproduce the setup that Barry had at [2]. But my view is that by adding
>> these controls we will enable the community to explore further, in the same way
>> that the anon mTHP controls helped harden the understanding for anonymous
>> memory.
>>
>> ---
>
> How would this interact with other requirements we get from the filesystem (for
> example, because of the device) [1].
>
> Assuming a device has a filesystem with a min order of X, but we disable anything
> >= X, how would we combine that configuration/information?

Currently order-0 is implicitly the "always-on" fallback order. My thinking was
that, with [1], the min order specified by the filesystem simply becomes that
"always-on" fallback order: every order below it is masked off and the min order
itself is forced on.

Today:

  orders = file_orders_always() | BIT(0);

Tomorrow:

  orders = (file_orders_always() & ~(BIT(min_order) - 1)) | BIT(min_order);

That does mean that in this case, a user-disabled order could still be used. So
the controls are really hints rather than definitive commands.

>
>
> [1]
> https://lore.kernel.org/all/20240715094457.452836-2-kernel@pankajraghav.com/T/#u
>
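
For anyone who wants to check the mask arithmetic in the "Today" and "Tomorrow"
expressions, here is a minimal user-space sketch. It is an illustration under
assumptions, not code from the series: the helper names orders_today() and
orders_with_min() are hypothetical, the "always" argument stands in for
whatever file_orders_always() would return, and BIT() is redefined locally
rather than taken from the kernel headers.

```c
/* Local stand-in for the kernel's BIT() macro (assumption: unsigned long mask). */
#define BIT(n) (1UL << (n))

/*
 * "Today": order-0 is implicitly the always-on fallback.
 * 'always' is a hypothetical stand-in for file_orders_always().
 */
static unsigned long orders_today(unsigned long always)
{
	return always | BIT(0);
}

/*
 * "Tomorrow": orders below min_order are masked off and min_order
 * becomes the always-on fallback, even if the user disabled it.
 */
static unsigned long orders_with_min(unsigned long always,
				     unsigned int min_order)
{
	return (always & ~(BIT(min_order) - 1)) | BIT(min_order);
}
```

With only orders 0 and 4 enabled (mask 0x11) and a min order of 2, the second
helper yields 0x14: order 0 is dropped, order 4 is kept, and order 2 is enabled
even though the user did not ask for it, which is the "hint rather than command"
behaviour described above. With a min order of 0 both helpers agree.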