From: "David Hildenbrand (Arm)" <david@kernel.org>
To: "Tejun Heo" <tj@kernel.org>,
"Johannes Weiner" <hannes@cmpxchg.org>,
"Michal Koutný" <mkoutny@suse.com>,
"Jonathan Corbet" <corbet@lwn.net>,
"Shuah Khan" <skhan@linuxfoundation.org>,
"Andrew Morton" <akpm@linux-foundation.org>,
"Lorenzo Stoakes" <ljs@kernel.org>,
"Liam R. Howlett" <Liam.Howlett@oracle.com>,
"Vlastimil Babka" <vbabka@kernel.org>,
"Mike Rapoport" <rppt@kernel.org>,
"Suren Baghdasaryan" <surenb@google.com>,
"Michal Hocko" <mhocko@suse.com>,
"Rik van Riel" <riel@surriel.com>, "Harry Yoo" <harry@kernel.org>,
"Jann Horn" <jannh@google.com>,
"Brendan Jackman" <jackmanb@google.com>,
"Zi Yan" <ziy@nvidia.com>, "Pedro Falcato" <pfalcato@suse.de>,
"Matthew Wilcox" <willy@infradead.org>
Cc: cgroups@vger.kernel.org, linux-doc@vger.kernel.org,
linux-kernel@vger.kernel.org, linux-mm@kvack.org,
linux-fsdevel@vger.kernel.org,
"David Hildenbrand (Arm)" <david@kernel.org>
Subject: [PATCH RFC 00/13] mm/rmap: support arbitrary folio mappings
Date: Sun, 12 Apr 2026 20:59:31 +0200
Message-ID: <20260412-mapcount-v1-0-05e8dfab52e0@kernel.org>
This series is related to my LSF/MM/BPF topic:
[LSF/MM/BPF TOPIC] Towards removing CONFIG_PAGE_MAPCOUNT [1]
And does the following things:
(a) Gets rid of CONFIG_PAGE_MAPCOUNT, so that rmap-related code no longer
uses page->_mapcount.
(b) Converts the entire mapcount to a "total mapped pages" counter that
can trivially be used to calculate the per-page average mapcount in
a folio.
(c) Cleans up the code heavily.
(d) Teaches RMAP code to support arbitrary folio mappings: For example,
supporting PMD-mapping of folios that span multiple PMDs.
Initially, I wanted to use a PMD + PUD mapcount, but once I realized that
we can do the same thing much more easily with a "total mapped pages"
counter, I tried that, and was surprised by how clean it looks.
More details in the last patch.
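To illustrate the "total mapped pages" idea from (b) and (d), here is a
minimal userspace sketch. It is not code from this series: struct
folio_model, the model_*() helpers, and the round-to-nearest policy with
a floor of 1 are all assumptions made for illustration only.

```c
#include <assert.h>

/* Hypothetical userspace model; not the kernel's struct folio. */
struct folio_model {
	unsigned int order;               /* folio spans (1 << order) pages */
	unsigned long total_mapped_pages; /* pages covered, summed over all mappings */
};

/* Account one mapping covering nr_pages pages of the folio. */
static void model_add_mapping(struct folio_model *f, unsigned long nr_pages)
{
	assert(nr_pages > 0 && nr_pages <= (1UL << f->order));
	f->total_mapped_pages += nr_pages;
}

/* Undo the accounting of one mapping covering nr_pages pages. */
static void model_remove_mapping(struct folio_model *f, unsigned long nr_pages)
{
	assert(nr_pages > 0 && nr_pages <= f->total_mapped_pages);
	f->total_mapped_pages -= nr_pages;
}

/*
 * Average per-page mapcount. Assumed policy: round to the nearest
 * integer, but never report 0 for a folio that has any page mapped.
 */
static unsigned long model_average_mapcount(const struct folio_model *f)
{
	unsigned long nr = 1UL << f->order;
	unsigned long avg;

	if (!f->total_mapped_pages)
		return 0;
	avg = (f->total_mapped_pages + nr / 2) >> f->order;
	return avg ? avg : 1;
}
```

In this model, a mapping only has to say how many pages it covers. A
folio of order 10 (1024 pages) mapped by two PMD-sized mappings of 512
pages each ends up with total_mapped_pages == 1024 and an average
per-page mapcount of 1, without needing a separate per-level counter.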
Functional Changes
------------------
The kernel now always behaves like CONFIG_PAGE_NO_MAPCOUNT currently
does, in particular:
(1) System/node/memcg stats account large folios as fully mapped as soon
as a single page is mapped, instead of the precise number of pages
a partially-mapped folio has mapped. For example, this affects
"AnonPages:", "Mapped:" and "Shmem" in /proc/meminfo.
(2) "mapmax" part of /proc/$PID/numa_maps uses the average page mapcount
in a folio instead of the effective page mapcount.
(3) Determining the PM_MMAP_EXCLUSIVE flag for /proc/$PID/pagemap is based on
folio_maybe_mapped_shared() instead of the effective page mapcount.
(4) /proc/kpagecount exposes the average page mapcount in a folio
instead of the effective page mapcount.
(5) Calculating the Pss for /proc/$PID/smaps and /proc/$PID/smaps_rollup
uses the average page mapcount in a folio instead of the effective
page mapcount.
(6) Calculating the Uss for /proc/$PID/smaps and /proc/$PID/smaps_rollup
uses folio_maybe_mapped_shared() instead of the effective page
mapcount.
(7) Detecting partially-mapped anonymous folios uses the average
per-page mapcount. This implies that we cannot detect partial
mappings of shared anonymous folios in all cases.
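To make the effect of (5) more concrete, the following userspace sketch
contrasts a Pss computed from effective per-page mapcounts with one
computed from a single folio-wide average. The helper names, the 4 KiB
page size, and the plain integer division are assumptions for
illustration; this is not the smaps_account() implementation.

```c
#include <assert.h>
#include <stdint.h>

#define MODEL_PAGE_SIZE 4096ULL /* assumed page size in bytes */

/* Pss contribution of one mapped page, given the mapcount used for it. */
static uint64_t pss_of_page(unsigned int mapcount)
{
	assert(mapcount > 0);
	return MODEL_PAGE_SIZE / mapcount;
}

/* Precise variant: one effective mapcount per page this process maps. */
static uint64_t pss_precise(const unsigned int *page_mapcounts,
			    unsigned int nr_mapped)
{
	uint64_t pss = 0;

	for (unsigned int i = 0; i < nr_mapped; i++)
		pss += pss_of_page(page_mapcounts[i]);
	return pss;
}

/* Averaged variant: every mapped page uses the same folio-wide average. */
static uint64_t pss_averaged(unsigned int avg_mapcount, unsigned int nr_mapped)
{
	return (uint64_t)nr_mapped * pss_of_page(avg_mapcount);
}
```

Consider a 4-page folio that one process maps fully while a second
process maps only the first page. The effective mapcounts seen by the
first process are {2, 1, 1, 1}, so pss_precise() yields 14336 bytes. If
the average rounds to 1, pss_averaged(1, 4) yields 16384 bytes, so the
averaged Pss can differ from the precise one for partially-mapped
folios.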
TODOs
-----
Partially-mapped folios:
If deemed relevant, we could detect more partially-mapped shared
anonymous folios on the memory reclaim path (e.g., during access-bit
harvesting) and flag them accordingly, so they can get deferred-split.
We might also just let the deferred splitting logic perform more such
scanning of possible candidates.
Mapcount overflows:
It may already be possible to overflow a large folio's mapcount
(+refcount). With this series, it may be possible to overflow
"total mapped pages" on 32bit; and I'd like to avoid making it an
unsigned long long on 32bit.
In a distant future, we may want a 64bit mapcount value, but for
the time being (no relevant use cases), we should likely reject new
folio mappings early if there is a possibility of mapcount +
"total mapped pages" overflows. I assume doing some basic checks
during fork() + file folio mapping should be good enough (e.g., stop
once it would turn negative).
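One possible shape of such an early check, as a userspace sketch only:
it assumes the counter is a signed 32bit int and that a mapping is
rejected when accounting it would push the counter past INT_MAX (that
is, make it turn negative). The function name try_account_mapping() is
hypothetical and not part of this series.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>

/*
 * Try to account a new mapping covering nr_pages pages. Returns false
 * and leaves the counter untouched if the addition would overflow the
 * signed counter; returns true and updates the counter otherwise.
 */
static bool try_account_mapping(int *total_mapped_pages, int nr_pages)
{
	assert(nr_pages > 0 && *total_mapped_pages >= 0);

	/* Compare before adding, so the signed addition never overflows. */
	if (*total_mapped_pages > INT_MAX - nr_pages)
		return false;

	*total_mapped_pages += nr_pages;
	return true;
}
```

A caller on a fork() or file-mapping path would treat a false return as
"do not create this mapping" and fail the operation instead of letting
the counter wrap.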
This series saw only very basic testing on 64bit and no performance
fine-tuning yet.
[1] https://lore.kernel.org/all/fe6afcc3-7539-4650-863b-04d971e89cfb@kernel.org/
---
David Hildenbrand (Arm) (13):
mm/rmap: remove folio->_nr_pages_mapped
fs/proc/task_mmu: remove CONFIG_PAGE_MAPCOUNT handling for "mapmax"
fs/proc/page: remove CONFIG_PAGE_MAPCOUNT handling for kpagecount
fs/proc/task_mmu: remove CONFIG_PAGE_MAPCOUNT handling for PM_MMAP_EXCLUSIVE
fs/proc/task_mmu: remove mapcount comment in smaps_account()
fs/proc/task_mmu: remove CONFIG_PAGE_MAPCOUNT handling in smaps_account()
mm/rmap: remove CONFIG_PAGE_MAPCOUNT
mm: re-consolidate folio->_entire_mapcount
mm: move _large_mapcount to _mapcount in page[1] of a large folio
mm: re-consolidate folio->_pincount
mm/rmap: stop using the entire mapcount for hugetlb folios
mm/rmap: large mapcount interface cleanups
mm/rmap: support arbitrary folio mappings
Documentation/admin-guide/cgroup-v1/memory.rst | 6 +-
Documentation/admin-guide/cgroup-v2.rst | 13 +-
Documentation/admin-guide/mm/pagemap.rst | 30 ++-
Documentation/filesystems/proc.rst | 41 ++--
Documentation/mm/transhuge.rst | 29 +--
fs/proc/internal.h | 58 +----
fs/proc/page.c | 10 +-
fs/proc/task_mmu.c | 69 ++----
include/linux/mm.h | 37 +--
include/linux/mm_types.h | 22 +-
include/linux/pgtable.h | 22 ++
include/linux/rmap.h | 221 ++++++++----------
mm/Kconfig | 17 --
mm/debug.c | 10 +-
mm/internal.h | 30 +--
mm/memory.c | 3 +-
mm/page_alloc.c | 31 +--
mm/rmap.c | 302 ++++++++-----------------
18 files changed, 325 insertions(+), 626 deletions(-)
---
base-commit: 196ab4af58d724f24335fed3da62920c3cea945f
change-id: 20260330-mapcount-32066c687010
Best regards,
--
David Hildenbrand (Arm) <david@kernel.org>