From: Hongru Zhang <zhanghongru06@gmail.com>
To: vbabka@suse.cz
Cc: Liam.Howlett@oracle.com, akpm@linux-foundation.org,
	axelrasmussen@google.com, david@kernel.org, hannes@cmpxchg.org,
	jackmanb@google.com, linux-kernel@vger.kernel.org,
	linux-mm@kvack.org, lorenzo.stoakes@oracle.com, mhocko@suse.com,
	rppt@kernel.org, surenb@google.com, weixugc@google.com,
	yuanchu@google.com, zhanghongru06@gmail.com,
	zhanghongru@xiaomi.com, ziy@nvidia.com
Subject: Re: [PATCH 0/3] mm: add per-migratetype counts to buddy allocator and optimize pagetypeinfo access
Date: Mon,  1 Dec 2025 10:36:47 +0800
Message-ID: <20251201023647.2538502-1-zhanghongru@xiaomi.com>
In-Reply-To: <97a9e695-487a-4428-87b7-cb8a505c9966@suse.cz>

> > On mobile devices, some user-space memory management components check
> > memory pressure and fragmentation status periodically or via PSI, and
> > take actions such as killing processes or performing memory compaction
> > based on this information.
>
> Hm /proc/buddyinfo could be enough to determine fragmentation? Also we have
> in-kernel proactive compaction these days.

In fact, /proc/pagetypeinfo is only one input. Other system resource
information is also collected at appropriate times, and resource usage
is tracked throughout the process lifecycle. User-space management
components combine all of this information to make decisions and take
the proper actions.

> > Under high load scenarios, reading /proc/pagetypeinfo causes memory
> > management components or memory allocation/free paths to be blocked
> > for extended periods waiting for the zone lock, leading to the following
> > issues:
> > 1. Long interrupt-disabled spinlocks - occasionally exceeding 10ms on Qcom
> >    8750 platforms, reducing system real-time performance
> > 2. Memory management components being blocked for extended periods,
> >    preventing rapid acquisition of memory fragmentation information for
> >    critical memory management decisions and actions
> > 3. Increased latency in memory allocation and free paths due to prolonged
> >    zone lock contention
>
> It could be argued that not capturing /proc/pagetypeinfo (often) would help.
> I wonder if we can find also other benefits from the counters in the kernel
> itself.

Collecting system and app resource statistics and making decisions based
on this information is a common practice among Android device manufacturers.

There are likely over a billion Android phones in daily use worldwide.
The diversity of hardware configurations across Android devices makes it
difficult for kernel mechanisms alone to maintain good performance
across all usage scenarios.

First, hardware capabilities vary greatly - flagship phones may have up to
24GB of memory, while low-end devices may have as little as 4GB. CPU,
storage, battery, and passive cooling capabilities vary significantly due
to market positioning and cost factors. Hardware resources always seem
inadequate.

Second, usage scenarios also differ - some people use devices in hot
environments while others use them in cold ones; some enjoy
high-definition gaming while others simply browse the web.

Third, user habits vary as well. Some people rarely restart their phones
except when the battery dies or the system crashes; others restart daily,
like me. Some users never actively close apps, only switching them to
the background, resulting in dozens of apps running in the background and
keeping system resources (especially memory) consumed. Still others use
only a few apps and close unused ones rather than leaving them in the
background.

Despite the above challenges, Android device manufacturers hope to ensure
a good user experience (no UI jank) across all situations.

Even at a 60 Hz refresh rate (90 Hz and 120 Hz are also supported now),
all work from user input to rendering and display must finish within
16.7 ms. To achieve this goal, the management components perform tasks
such as:
- Track system resource status: what the system has
  (system resource awareness)
- Learn and predict app resource demands: what each app needs
  (resource demand awareness)
- Monitor app launch, exit, and foreground-background switches: the
  least important apps give resources back to the system to serve the
  most important one, usually the foreground app
  (user intent awareness)
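
The 16.7 ms figure above is simply the period of one frame at 60 Hz, and
the budget shrinks as the refresh rate goes up. A small illustration of
that arithmetic follows; frame_budget_us() is a hypothetical helper name
used only for this example, not something from the patch series.

```c
#include <assert.h>

/*
 * Per-frame time budget in microseconds for a given refresh rate.
 * Integer division truncates, so 60 Hz gives 16666 us (about 16.7 ms).
 */
static unsigned int frame_budget_us(unsigned int refresh_hz)
{
	assert(refresh_hz > 0);
	return 1000000u / refresh_hz;
}
```

At 90 Hz the budget is about 11.1 ms and at 120 Hz about 8.3 ms, so a
zone lock hold time on the order of 10 ms can consume most or all of a
single frame.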

Tracking system resources seems necessary for Android devices, not
optional. So the related paths are not that cold on Android devices.

All of the above is from the workload perspective. From the kernel
perspective, regardless of when or how frequently user-space tools read
statistical information, those reads should not significantly affect the
kernel's own efficiency. That's why I submitted this patch series to
make the read side of /proc/pagetypeinfo lock-free. But this does
introduce overhead in the hot path, and I would greatly appreciate it if
we could discuss how to improve that here.
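
To make the trade-off concrete, here is a minimal user-space sketch of
the general idea: writers update a per-order, per-migratetype counter
while they already hold the zone lock, and the reader samples the
counters without taking it. This is an illustration under stated
assumptions, not the code from the patch series. struct zone_sketch,
the array sizes, and the three function names are all hypothetical, and
C11 relaxed atomics stand in for whatever primitive the kernel code
would actually use.

```c
#include <assert.h>
#include <stdatomic.h>

#define NR_ORDERS	11	/* assumed number of buddy orders */
#define NR_MIGRATETYPES	3	/* assumed: unmovable, movable, reclaimable */

/* Hypothetical stand-in for the per-zone free area bookkeeping. */
struct zone_sketch {
	atomic_ulong nr_free[NR_ORDERS][NR_MIGRATETYPES];
};

/*
 * Writer side (free path). The caller is assumed to hold the zone lock,
 * which is not modeled here, so a plain load/store pair is enough and
 * no atomic read-modify-write is needed. This extra store is the hot
 * path overhead under discussion.
 */
static void count_page_freed(struct zone_sketch *z, int order, int mt)
{
	unsigned long v = atomic_load_explicit(&z->nr_free[order][mt],
					       memory_order_relaxed);

	atomic_store_explicit(&z->nr_free[order][mt], v + 1,
			      memory_order_relaxed);
}

/* Writer side (allocation path), same locking assumption as above. */
static void count_page_allocated(struct zone_sketch *z, int order, int mt)
{
	unsigned long v = atomic_load_explicit(&z->nr_free[order][mt],
					       memory_order_relaxed);

	assert(v > 0);
	atomic_store_explicit(&z->nr_free[order][mt], v - 1,
			      memory_order_relaxed);
}

/*
 * Reader side (statistics path). No lock is taken, so each value is
 * read without tearing, but values for different orders or
 * migratetypes may come from slightly different points in time.
 */
static unsigned long read_free_count(struct zone_sketch *z, int order, int mt)
{
	return atomic_load_explicit(&z->nr_free[order][mt],
				    memory_order_relaxed);
}
```

The reader no longer walks the free lists or delays allocators, at the
cost of one extra load and store per list operation on the writer side
and a snapshot that is only approximately consistent across counters.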

> Adding these migratetype counters is something that wouldn't be even
> possible in the past, until the freelist migratetype hygiene was merged.
> So now it should be AFAIK possible, but it's still some overhead in
> relatively hot paths. I wonder if we even considered this before in the
> context of migratetype hygiene? Couldn't find anything quickly.

Yes. I initially wrote the code on an older kernel. At that time, I
reused set_pcppage_migratetype (and renamed it) to cache the exact
migratetype list that the page block is on. After the freelist
migratetype hygiene patches were merged, I removed that logic.

