From: Kiryl Shutsemau <kirill@shutemov.name>
To: Wenchao Hao <haowenchao22@gmail.com>
Cc: "David Hildenbrand (Arm)" <david@kernel.org>,
Andrew Morton <akpm@linux-foundation.org>,
Lorenzo Stoakes <lorenzo.stoakes@oracle.com>,
"Liam R . Howlett" <Liam.Howlett@oracle.com>,
Vlastimil Babka <vbabka@suse.cz>,
Mike Rapoport <rppt@kernel.org>,
Suren Baghdasaryan <surenb@google.com>,
Michal Hocko <mhocko@suse.com>,
linux-mm@kvack.org, linux-kernel@vger.kernel.org
Subject: Re: [PATCH] mm: Add AnonZero accounting for zero-filled anonymous pages
Date: Mon, 16 Feb 2026 16:54:05 +0000
Message-ID: <aZNJZTiRt_vm7oKq@thinkstation>
In-Reply-To: <CAOptpSNMx7moZnLP1Q9xX_6zw2xJL4OV4N=f8rzyZBLuTMEJnQ@mail.gmail.com>
On Mon, Feb 16, 2026 at 11:59:50PM +0800, Wenchao Hao wrote:
> On Mon, Feb 16, 2026 at 7:58 PM Kiryl Shutsemau <kirill@shutemov.name> wrote:
> >
> > On Mon, Feb 16, 2026 at 12:45:13PM +0100, David Hildenbrand (Arm) wrote:
> > > On 2/16/26 12:34, Kiryl Shutsemau wrote:
> > > > On Sat, Feb 14, 2026 at 04:45:14PM +0800, Wenchao Hao wrote:
> > > > > Add kernel command line option "count_zero_page" to track anonymous pages
> > > > > have been allocated and mapped to userspace but zero-filled.
> > > > >
> > > > > This feature is mainly used to debug large folio mechanism, which
> > > > > pre-allocates and map more pages than actually needed, leading to memory
> > > > > waste from unaccessed pages.
> > > > >
> > > > > Export the result in /proc/pid/smaps as "AnonZero" field.
> > > >
> > > > I expect it to slowdown /proc/pid/smaps read substantially. I don't
> > > > think this line in smaps worth it.
> > > >
> > >
> > > That's why it's enabled through a command line parameter.
> >
> > One users want the stat and all users on the machine pay the price?
> > That's a poor trade off.
> >
> > In general, smaps scales poorly. It collects a lot of stats and most of
> > them are ignored by user. We need something like statx(2) where user can
> > declare what he is interested in, so kernel won't waste cycles.
> >
>
> I initially considered two approaches:
>
> First, exposing the needed information via smaps. This does incur some
> performance cost but is the simplest to implement. The new feature can be
> dynamically toggled via a command-line parameter. When disabled, the
> overhead is negligible—only a minor if check, which is insignificant compared
> to the full smaps cost.
>
> Second, adding a new system call or extending madvise with a new command
> like MADV_GET_ZEROANON. Userspace tools can then use it to measure
> memory waste from zero-filled anonymous huge pages.
>
> This is slightly more complex but minimizes system impact: environments that
> don’t care about zero-filled anonymous pages pay zero overhead when the
> command is not used.
>
> The exact implementation approach can be discussed after we confirm whether
> the upstream kernel needs this debugging feature.

What I would like to see in the kernel is a syscall that returns the
memory stats in binary form. Something like:

	size_t memstat(int pidfd, struct memstat memstatbuf[], size_t n,
		       unsigned long flags, unsigned long start,
		       unsigned long end);

The syscall will fill up to n memstatbufs, one per VMA. What exactly
gets filled in is defined by flags. The return value is how many
memstatbufs were populated. The caller can call it multiple times to
walk the part of the address space it is interested in.
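
To make the calling convention concrete, here is an untested userspace
sketch of the walk. None of this exists: the struct memstat layout, the
MEMSTAT_* flag names and memstat_mock() are made up for illustration. The
mock runs over a fake VMA table instead of a real process (so the pidfd
argument is dropped), and resuming from the end of the last returned VMA
is only one possible way to continue the walk.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical flags: the caller declares which stats it wants. */
#define MEMSTAT_RSS	(1UL << 0)
#define MEMSTAT_ANON	(1UL << 1)

/* Hypothetical per-VMA record; the real layout is undecided. */
struct memstat {
	unsigned long start;
	unsigned long end;
	unsigned long rss_kb;	/* filled only if MEMSTAT_RSS is set */
	unsigned long anon_kb;	/* filled only if MEMSTAT_ANON is set */
};

/* Fake address space standing in for the target's VMAs. */
static const struct memstat fake_vmas[] = {
	{ 0x1000, 0x3000,   8, 4 },
	{ 0x5000, 0x6000,   4, 4 },
	{ 0x8000, 0xc000,  12, 0 },
	{ 0xf000, 0x10000,  4, 0 },
};
#define NR_FAKE_VMAS (sizeof(fake_vmas) / sizeof(fake_vmas[0]))

/*
 * Userspace stand-in for the proposed syscall: fill up to n records for
 * VMAs overlapping [start, end), reporting only what flags ask for.
 * Returns the number of records populated.
 */
static size_t memstat_mock(struct memstat buf[], size_t n,
			   unsigned long flags,
			   unsigned long start, unsigned long end)
{
	size_t filled = 0;

	assert(n > 0);
	for (size_t i = 0; i < NR_FAKE_VMAS && filled < n; i++) {
		const struct memstat *v = &fake_vmas[i];

		if (v->end <= start || v->start >= end)
			continue;
		buf[filled].start = v->start;
		buf[filled].end = v->end;
		buf[filled].rss_kb = (flags & MEMSTAT_RSS) ? v->rss_kb : 0;
		buf[filled].anon_kb = (flags & MEMSTAT_ANON) ? v->anon_kb : 0;
		filled++;
	}
	return filled;
}

/* Walk [start, end) two VMAs at a time, resuming after the last one. */
static unsigned long total_rss_kb(unsigned long start, unsigned long end)
{
	struct memstat buf[2];
	unsigned long total = 0;
	size_t got;

	while (start < end &&
	       (got = memstat_mock(buf, 2, MEMSTAT_RSS, start, end)) > 0) {
		for (size_t i = 0; i < got; i++)
			total += buf[i].rss_kb;
		start = buf[got - 1].end;
	}
	return total;
}
```

With the fake table above, total_rss_kb(0, ~0UL) takes two batches and
sums to 28. A caller that asks only for MEMSTAT_ANON gets rss_kb left at
zero, which is the point: stats nobody requested are never computed.
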

We can also have a flag that mirrors smaps_rollup behaviour and collects
all the data into a single memstatbuf.

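
A rough, untested sketch of what such a rollup flag could mean, again as a
userspace mock over a fake VMA table. MEMSTAT_ROLLUP, the struct layout
and memstat_mock() are hypothetical names, and reporting the covered range
as first start to last end is my assumption, not a settled semantic.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical flag: accumulate everything into memstatbuf[0]. */
#define MEMSTAT_ROLLUP	(1UL << 31)

/* Hypothetical record, trimmed to a single stat for brevity. */
struct memstat {
	unsigned long start;
	unsigned long end;
	unsigned long rss_kb;
};

/* Fake address space standing in for the target's VMAs. */
static const struct memstat fake_vmas[] = {
	{ 0x1000, 0x3000,   8 },
	{ 0x5000, 0x6000,   4 },
	{ 0x8000, 0xc000,  12 },
	{ 0xf000, 0x10000,  4 },
};
#define NR_FAKE_VMAS (sizeof(fake_vmas) / sizeof(fake_vmas[0]))

/*
 * Userspace stand-in for the proposed syscall. Without MEMSTAT_ROLLUP it
 * fills one record per VMA, up to n. With it, every VMA overlapping
 * [start, end) is folded into buf[0] and at most one record is returned.
 */
static size_t memstat_mock(struct memstat buf[], size_t n,
			   unsigned long flags,
			   unsigned long start, unsigned long end)
{
	size_t filled = 0;

	assert(n > 0);
	for (size_t i = 0; i < NR_FAKE_VMAS; i++) {
		const struct memstat *v = &fake_vmas[i];

		if (v->end <= start || v->start >= end)
			continue;

		if (flags & MEMSTAT_ROLLUP) {
			if (!filled) {
				buf[0] = *v;
				filled = 1;
			} else {
				/* Extend the covered range, add the stats. */
				buf[0].end = v->end;
				buf[0].rss_kb += v->rss_kb;
			}
			continue;
		}

		if (filled == n)
			break;
		buf[filled++] = *v;
	}
	return filled;
}
```

Calling memstat_mock() with n = 1 and MEMSTAT_ROLLUP over the whole fake
table returns one record with rss_kb = 28, while the same call without the
flag returns only the first VMA.
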
Internally, the kernel can use the infrastructure built for this syscall
to provide /proc/<PID>/{maps,smaps,smaps_rollup}. This way we will not
duplicate the code.
--
Kiryl Shutsemau / Kirill A. Shutemov