Re: [RFC PATCH v1 0/4] Control folio sizes used for page cache memory


From: David Hildenbrand <david@redhat.com>
To: Ryan Roberts <ryan.roberts@arm.com>,
	Andrew Morton <akpm@linux-foundation.org>,
	Hugh Dickins <hughd@google.com>, Jonathan Corbet <corbet@lwn.net>,
	"Matthew Wilcox (Oracle)" <willy@infradead.org>,
	Barry Song <baohua@kernel.org>, Lance Yang <ioworker0@gmail.com>,
	Baolin Wang <baolin.wang@linux.alibaba.com>,
	Gavin Shan <gshan@redhat.com>,
	Pankaj Raghav <kernel@pankajraghav.com>,
	Daniel Gomez <da.gomez@samsung.com>
Cc: linux-kernel@vger.kernel.org, linux-mm@kvack.org
Subject: Re: [RFC PATCH v1 0/4] Control folio sizes used for page cache memory
Date: Wed, 17 Jul 2024 16:25:40 +0200
Message-ID: <7efc4b0b-4e12-4b29-815f-0854182ce135@redhat.com>
In-Reply-To: <99b33a29-e97a-4932-8d7a-85bc01885d18@arm.com>

On 17.07.24 12:45, Ryan Roberts wrote:
> On 17/07/2024 11:31, David Hildenbrand wrote:
>> On 17.07.24 09:12, Ryan Roberts wrote:
>>> Hi All,
>>>
>>> This series is an RFC that adds sysfs and kernel cmdline controls to configure
>>> the set of allowed large folio sizes that can be used when allocating
>>> file-memory for the page cache. As part of the control mechanism, it provides
>>> for a special-case "preferred folio size for executable mappings" marker.
>>>
>>> I'm trying to solve 2 separate problems with this series:
>>>
>>> 1. Reduce pressure in iTLB and improve performance on arm64: This is a modified
>>> approach for the change at [1]. Instead of hardcoding the preferred executable
>>> folio size into the arch, user space can now select it. This decouples the arch
>>> code and also makes the mechanism more generic; it can be bypassed (the default)
>>> or any folio size can be set. For my use case, 64K is preferred, but I've also
>>> heard from Willy of a use case where putting all text into 2M PMD-sized folios
>>> is preferred. This approach avoids the need for synchronous MADV_COLLAPSE (and
>>> therefore faulting in all text ahead of time) to achieve that.
>>>
>>> 2. Reduce memory fragmentation in systems under high memory pressure (e.g.
>>> Android): The theory goes that if all folios are 64K, then failure to allocate a
>>> 64K folio should become unlikely. But if the page cache is allocating lots of
>>> different orders, with most allocations being smaller than 64K (as is the
>>> case today), then the ability to allocate 64K folios diminishes. By providing control
>>> over the allowed set of folio sizes, we can tune to avoid crucial 64K folio
>>> allocation failure. Additionally I've heard (second hand) of the need to disable
>>> large folios in the page cache entirely due to latency concerns in some
>>> settings. These controls allow all of this without kernel changes.
>>>
>>> The value of (1) is clear and the performance improvements are documented in
>>> patch 2. I don't yet have any data demonstrating the theory for (2) since I
>>> can't reproduce the setup that Barry had at [2]. But my view is that by adding
>>> these controls we will enable the community to explore further, in the same way
>>> that the anon mTHP controls helped harden the understanding for anonymous
>>> memory.
>>>
>>> ---
>>
>> How would this interact with other requirements we get from the filesystem (for
>> example, because of the device) [1]?
>>
>> Assuming a filesystem on a device has a min order of X, but we disable anything
>> >= X, how would we combine that configuration/information?
> 
> Currently order-0 is implicitly the "always-on" fallback order. My thinking was
> that with [1], the specified min order just becomes that "always-on" fallback order.
> 
> Today:
> 
>    orders = file_orders_always() | BIT(0);
> 
> Tomorrow:
> 
>    orders = (file_orders_always() & ~(BIT(min_order) - 1)) | BIT(min_order);
> 
> That does mean that in this case, a user-disabled order could still be used. So
> the controls are really hints rather than definitive commands.

Okay, so that differs from order-0, which is -- as you note -- always-on
(not even a toggle).
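
To make the difference concrete, here is a minimal userspace sketch of the
two expressions quoted above. It is only an illustration under stated
assumptions, not code from the series: effective_orders() and its
user_orders argument are hypothetical names, with user_orders standing in
for the return value of file_orders_always(), and BIT() is defined locally
to mirror the kernel macro.

```c
#include <assert.h>
#include <limits.h>

/* Local stand-in for the kernel's BIT() macro. */
#define BIT(n) (1UL << (n))

/*
 * Hypothetical helper: combine the user-enabled order bitmask with the
 * filesystem's minimum order. user_orders models what
 * file_orders_always() would return in the series.
 */
static unsigned long effective_orders(unsigned long user_orders,
				      unsigned int min_order)
{
	/* Guard against an undefined shift by the full word width or more. */
	assert(min_order < sizeof(unsigned long) * CHAR_BIT);

	/*
	 * Clear every order below min_order, then force min_order on as
	 * the "always-on" fallback. With min_order == 0 this reduces to
	 * user_orders | BIT(0), i.e. the "Today" expression.
	 */
	return (user_orders & ~(BIT(min_order) - 1)) | BIT(min_order);
}
```

With orders 4 and 9 enabled by the user (mask 0x210), min_order = 0 gives
0x211, min_order = 2 gives 0x214 and min_order = 5 gives 0x220. In the last
two cases the min order is set in the result although the user never
enabled it, which is the "hint rather than command" behaviour.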

Staring at patch #1, you use the name "file_enabled". That might indeed
cause some confusion. Thinking out loud, I wonder if different
terminology could better express the semantics. Hm ... but maybe it
would only have to be documented.

Thanks for the details.

-- 
Cheers,

David / dhildenb
