From: "David Hildenbrand (Arm)" <david@kernel.org>
To: Zi Yan <ziy@nvidia.com>
Cc: lsf-pc@lists.linux-foundation.org, linux-mm@kvack.org
Subject: Re: [LSF/MM/BPF TOPIC] Towards removing CONFIG_PAGE_MAPCOUNT
Date: Fri, 20 Feb 2026 11:35:20 +0100
Message-ID: <dd01a49a-d655-49d6-b088-da01544de5c6@kernel.org>
In-Reply-To: <F0B18B46-EC62-47F7-88FB-C55B0E7FAE1C@nvidia.com>

On 2/19/26 18:07, Zi Yan wrote:
> On 17 Feb 2026, at 16:04, David Hildenbrand (Arm) wrote:
>
>> Hi,
>>
>> although I like mapcounts very much, I'd rather prefer to not have mapcount work on my todo list.
>>
>> We now have CONFIG_NO_PAGE_MAPCOUNT in the kernel that doesn't touch any mapcount values of tail pages, which is great. But we still have CONFIG_PAGE_MAPCOUNT around, being used as default.
>>
>>
>> To make my dream come true, some things I have in mind are still pending. In particular, I want to:
>>
>> (a) Support mapping of folios > PMD through PMDs.
>>
>> (b) Get rid of CONFIG_PAGE_MAPCOUNT to stop messing with
>> page->_mapcount on tail pages and to clean up the rmap code.
>>
>> (c) Better detect partially-mapped anon folios with
>> CONFIG_NO_PAGE_MAPCOUNT.
>>
>> + some other small things.
>>
>>
>> I discussed some of these challenges at LSF/MM 2024 [1], before we had CONFIG_NO_PAGE_MAPCOUNT. Now we have it and we can discuss the next steps.
>>
>>
>> Sorting out (a) is fairly easy once we removed CONFIG_PAGE_MAPCOUNT: we'll primarily have to split folio->_entire_mapcount into folio->_pmd_mapcount and folio->_pud_mapcount.
>
> Then, for a PMD-sized folio, _pmd_mapcount is its “_entire_mapcount”; for
> a PUD-sized folio, _pud_mapcount is its “_entire_mapcount”. For a multi-PMD
> or multi-PUD folio, _pmd_mapcount and _pud_mapcount are similar to
> _nr_pages_mapped but with a PMD_NR/PUD_NR multiplier. Maybe we would have
> _pte_mapcount instead of _nr_pages_mapped?

What we'd have (ignoring hugetlb) is essentially:

* mapcount
* pmd_mapcount
* pud_mapcount

pte_mapcount = mapcount - pmd_mapcount - pud_mapcount

That is sufficient to calculate folio_average_page_mapcount().

There are some possible evolutions of this concept (but some other stuff
would have to change), but the above is what we would start with.
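
To make the arithmetic concrete, here is a rough userspace sketch (not
kernel code). struct folio_counts and both helpers are hypothetical names
made up for illustration. It assumes that mapcount counts every mapping
once regardless of its size, and that the average weights a PMD/PUD mapping
by the number of pages it covers; the real folio_average_page_mapcount()
may well end up looking different.

```c
#include <assert.h>

/* Hypothetical userspace model; this is not the kernel's struct folio. */
struct folio_counts {
	unsigned int order;     /* folio spans 1 << order pages */
	unsigned int pmd_order; /* one PMD mapping covers 1 << pmd_order pages */
	unsigned int pud_order; /* one PUD mapping covers 1 << pud_order pages */
	int mapcount;           /* all mappings: PTE + PMD + PUD (assumption) */
	int pmd_mapcount;       /* mappings through PMDs */
	int pud_mapcount;       /* mappings through PUDs */
};

/* Derived value: pte_mapcount = mapcount - pmd_mapcount - pud_mapcount. */
static int pte_mapcount(const struct folio_counts *c)
{
	assert(c->mapcount >= c->pmd_mapcount + c->pud_mapcount);
	return c->mapcount - c->pmd_mapcount - c->pud_mapcount;
}

/*
 * Average per-page mapcount, rounded to the closest integer, but never 0
 * while the folio is mapped at all. The weighting is an assumption.
 */
static int average_page_mapcount(const struct folio_counts *c)
{
	unsigned long long nr_pages = 1ULL << c->order;
	unsigned long long page_mappings, avg;

	if (c->mapcount <= 0)
		return 0;

	/* Total number of (page, mapping) pairs across all mapping sizes. */
	page_mappings = (unsigned long long)pte_mapcount(c)
		+ ((unsigned long long)c->pmd_mapcount << c->pmd_order)
		+ ((unsigned long long)c->pud_mapcount << c->pud_order);

	avg = (page_mappings + nr_pages / 2) >> c->order;
	return avg ? (int)avg : 1;
}
```

For a folio of order 10 with pmd_order 9, mapcount = 5 and pmd_mapcount = 2
leave 3 PTE mappings, and the three stored counters are all that is needed
to compute the average.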
>
>>
>> Sorting out (b) requires switching to CONFIG_NO_PAGE_MAPCOUNT first, which will introduce some imprecision with large folios in:
>>
>> (1) Process memory stats: Pss + Uss accounting like "Pss" and "Shared_"
>> vs "Private_" in /proc/$PID/smaps and /proc/$PID/smaps_rollup
>>
>> (2) PM_MMAP_EXCLUSIVE flag in /proc/$PID/pagemap
>>
>> (3) System memory stats: "mapped" memory like "AnonPages", "Mapped"
>> and "Shmem" in /proc/meminfo
>>
>> And some other smaller things. While I think that all changes here should be fine, I want to be a bit careful and have a discussion on how to tackle it without realizing in a couple of releases that some use cases still require CONFIG_PAGE_MAPCOUNT.
>>
>> Sorting out (c) is a harder nut to crack, and I wonder to which degree we care and whether I am being too careful. I have some ideas that I want to discuss. One idea is to just remove the deferred split lists and let memory reclaim deal with that:
>
> Or let a workqueue walk the rmap to check whether an unmapped subpage indeed
> has no mapping left and put that folio on the deferred_split list. Or let
> deferred_list_scan do the rmap walk and decide whether to split the folio.

Exactly. That is one solution I had in mind and started prototyping at
some point after LPC: flag folios as possibly-partially-mapped and let
deferred splitting figure it out.
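
Very roughly, the idea can be modelled in userspace like this. This is only
a sketch and not the prototype: struct folio_model, struct mapping_model and
both functions are hypothetical, and the array of mapped ranges merely
stands in for what an rmap walk would discover.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical: one mapping covering a contiguous page range of the folio. */
struct mapping_model {
	size_t first_page;
	size_t nr_pages;
};

/* Hypothetical userspace model; this is not the kernel's struct folio. */
struct folio_model {
	size_t nr_pages;
	bool maybe_partially_mapped;          /* set cheaply at unmap time */
	const struct mapping_model *mappings; /* stand-in for the rmap */
	size_t nr_mappings;
};

/* The "rmap walk": true if some, but not all, pages are still mapped. */
static bool folio_model_is_partially_mapped(const struct folio_model *f)
{
	size_t mapped_pages = 0;

	assert(f->nr_pages > 0);
	for (size_t page = 0; page < f->nr_pages; page++) {
		for (size_t i = 0; i < f->nr_mappings; i++) {
			const struct mapping_model *m = &f->mappings[i];

			if (page >= m->first_page &&
			    page - m->first_page < m->nr_pages) {
				mapped_pages++;
				break; /* count each page at most once */
			}
		}
	}
	return mapped_pages > 0 && mapped_pages < f->nr_pages;
}

/* Deferred scan: only flagged folios pay for the expensive walk. */
static bool deferred_scan_should_split(struct folio_model *f)
{
	if (!f->maybe_partially_mapped)
		return false;
	if (folio_model_is_partially_mapped(f))
		return true;
	/* False positive: every page is still mapped, so clear the hint. */
	f->maybe_partially_mapped = false;
	return false;
}
```

The point of the split is that the unmap path only sets a flag, and the
precise (and expensive) check runs later, off the hot path.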
--
Cheers,
David