From: Ryan Roberts <ryan.roberts@arm.com>
To: "David Hildenbrand (Arm)" <david@kernel.org>,
Matthew Wilcox <willy@infradead.org>, Dev Jain <dev.jain@arm.com>
Cc: lsf-pc@lists.linux-foundation.org, catalin.marinas@arm.com,
will@kernel.org, ardb@kernel.org, hughd@google.com,
baolin.wang@linux.alibaba.com, akpm@linux-foundation.org,
lorenzo.stoakes@oracle.com, Liam.Howlett@oracle.com,
vbabka@suse.cz, rppt@kernel.org, surenb@google.com,
mhocko@suse.com, linux-mm@kvack.org,
linux-arm-kernel@lists.infradead.org,
linux-kernel@vger.kernel.org
Subject: Re: [LSF/MM/BPF TOPIC] Per-process page size
Date: Tue, 17 Feb 2026 15:51:05 +0000
Message-ID: <0a772015-f9ea-4c7b-a42f-393a2f591f98@arm.com>
In-Reply-To: <2e68ef61-dcf2-46b2-913f-14980a104faf@kernel.org>
On 17/02/2026 15:30, David Hildenbrand (Arm) wrote:
> On 2/17/26 16:22, Matthew Wilcox wrote:
>> On Tue, Feb 17, 2026 at 08:20:26PM +0530, Dev Jain wrote:
>>> 2. Generic Linux MM enlightenment
>>> ---------------------------------
>>> We enlighten the Linux MM code to always hand out memory in the granularity
>>
>> Please don't use the term "enlighten". That's used to describe something
>> or other with hypervisors. Come up with a new term or use one that
>> already exists.
>>
>>> File memory
>>> -----------
>>> For a growing list of compliant file systems, large folios can already be
>>> stored in the page cache. There is even a mechanism, introduced to support
>>> filesystems with block sizes larger than the system page size, to set a
>>> hard-minimum size for folios on a per-address-space basis. This mechanism
>>> will be reused and extended to service the per-process page size requirements.
>>>
>>> One key reason that the 64K kernel currently consumes considerably more memory
>>> than the 4K kernel is that Linux systems often have lots of small
>>> configuration files which each require a page in the page cache. But these
>>> small files are (likely) only used by certain processes. So, we prefer to
>>> continue to cache those using a 4K page.
>>> Therefore, if a process with a larger page size maps a file whose page cache
>>> contains smaller folios, we drop them and re-read the range at a folio
>>> order at least that of the process.
>>
>> That's going to be messy. I don't have a good idea for solving this
>> problem, but the page cache really isn't set up to change minimum folio
>> order while the inode is in use.
Dev has a prototype up and running, but based on your comments, I'm guessing
there is some horrible race that just hasn't hit yet. It would be good to debug
that gap in understanding at some point!
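For reference, the rough shape I have in mind is below. This is only a sketch,
not Dev's actual patch, and it glosses over error handling, locking against
concurrent faults, and pinned/mlocked folios, i.e. exactly the areas where the
races presumably live. It assumes the existing LBS helpers
mapping_min_folio_order()/mapping_set_folio_min_order() plus
invalidate_inode_pages2():

/*
 * Sketch: raise the minimum folio order of a mapping to at least
 * 'new_min' and drop the existing page cache so the range gets
 * repopulated at the larger order on the next fault/read.
 * A real implementation would only target folios below new_min and
 * would need to synchronise against concurrent faults, pins and mlock.
 */
static int mapping_raise_min_order(struct address_space *mapping,
				   unsigned int new_min)
{
	if (new_min <= mapping_min_folio_order(mapping))
		return 0;

	mapping_set_folio_min_order(mapping, new_min);

	/* Evict everything currently cached; fails with -EBUSY if a
	 * folio cannot be invalidated. */
	return invalidate_inode_pages2(mapping);
}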
>
> In a private conversation I also raised that some situations might make it
> impossible/hard to drop+re-read.
>
> One example I came up with is if a folio is simply long-term R/O pinned. But I
> am also not quite sure how mlock might interfere here.
>
> So yes, I think the page cache is likely one of the most problematic/messy
> things to handle.
>
I guess we could side-step the problem for now by initially requiring that the
minimum folio size always be the maximum supported process page size. That
would at least allow us to get something up and running, but then we lose the
memory-saving benefits.
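To illustrate, that side-step would amount to something like the below at
address_space initialisation time. Again just a sketch of the idea;
MAX_PPS_SHIFT is a made-up name for the largest process page size we would
support:

/* Hypothetical: largest per-process page size we'd support (64K here). */
#define MAX_PPS_SHIFT	16

static void mapping_init_pps_min_order(struct address_space *mapping)
{
	/*
	 * Always cache this inode at (or above) the largest supported
	 * process page size, so the minimum folio order never has to
	 * change while the inode is in use, at the cost of the memory
	 * savings discussed above.
	 */
	mapping_set_folio_min_order(mapping, MAX_PPS_SHIFT - PAGE_SHIFT);
}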
Of course, I'm conveniently ignoring that not all filesystems support large
folios, but perhaps we could do a generic fallback adapter with a bounce buffer
for that case?