* [Linux Memory Hotness and Promotion] Notes from February 12, 2026
@ 2026-02-15 4:04 David Rientjes
From: David Rientjes @ 2026-02-15 4:04 UTC
To: Davidlohr Bueso, Fan Ni, Gregory Price, Jonathan Cameron,
Joshua Hahn, Raghavendra K T, Rao, Bharata Bhasker,
SeongJae Park, Wei Xu, Xuezheng Chu, Yiannis Nikolakopoulos,
Zi Yan
Cc: linux-mm
Hi everybody,
Here are the notes from the last Linux Memory Hotness and Promotion call
that happened on Thursday, February 12. Thanks to everybody who was
involved!
These notes are intended to bring people up to speed who could not attend
the call as well as keep the conversation going in between meetings.
----->o-----
Bharata updated on the status of v5 of his patch series from a couple
weeks ago with two modes of operation. Based on discussion with people
at LPC, he proposed two mechanisms to track hotness that would have
vastly different memory footprint requirements, one as little as one byte
per page. He posted benchmark numbers over a series of updates upstream.
He found a clear advantage of hot page promotion when top tier memory was
not saturated for both of these modes of operation.
However, when there is top-tier memory pressure (when the benchmark
working set spills over into CXL memory), then we tend to demote and
promote simultaneously as expected. Interestingly, he has not observed
promotion in these cases actually being helpful; in fact, in some cases it
actually turns out to hurt. He asked for feedback on tunings for when the
demotion case can actually help this scenario. Note in one of these
benchmarking scenarios, however, that the benchmark would perform random
access.
Gregory suggested that we may want to track the number of demotions and
promotions; when the rate of demotions equals the rate of promotions then
we should reduce the number of promotions. Bharata noted that promotions
are ratelimited, but that the ratelimit does not consider the current rate
of demotions. In this case, Gregory said, we are seeing promotions drive
demotions and there will be an endless loop with these workload
characteristics that will impact performance.
Gregory's idea was that if we promote 1,000 pages and then a few seconds
later we demote 1,000 pages, then this is an indication of churn. If this
is consistent, we should back off. In an ideal scenario, we would be
doing some proactive demotion from top tier so there is room to promote.
He wanted to see data on the rate of promotions and rate of demotions over
time for these benchmarks. This may reveal an indicator that we can use
as a heuristic for a back off mechanism.
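To make the churn idea above concrete, here is a minimal sketch of what
such a back off heuristic could look like. It is not taken from any
posted patch: struct churn_state, churn_should_back_off() and the three
CHURN_* tunables are hypothetical names and values chosen only for
illustration, and it assumes cumulative promotion and demotion counts are
sampled once per interval, however they are collected.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical tunables, picked only for illustration. */
#define CHURN_MIN_PAGES   256  /* ignore intervals with little promotion */
#define CHURN_RATIO_PCT   80   /* demotions >= 80% of promotions is churn */
#define CHURN_STREAK_MAX  3    /* back off after this many churny intervals */

/* Hypothetical state; not an existing kernel structure. */
struct churn_state {
	uint64_t last_promoted;    /* cumulative promotions at last sample */
	uint64_t last_demoted;     /* cumulative demotions at last sample */
	unsigned int churn_streak; /* consecutive intervals that were churn */
};

/*
 * Feed in the cumulative promotion and demotion counts once per sampling
 * interval.  Returns true when promotions have been matched by demotions
 * for CHURN_STREAK_MAX consecutive intervals, which is the "consistent
 * churn" signal suggesting that promotion should back off.
 */
static bool churn_should_back_off(struct churn_state *s,
				  uint64_t promoted, uint64_t demoted)
{
	uint64_t dp = promoted - s->last_promoted;
	uint64_t dd = demoted - s->last_demoted;

	s->last_promoted = promoted;
	s->last_demoted = demoted;

	if (dp >= CHURN_MIN_PAGES && dd * 100 >= dp * CHURN_RATIO_PCT)
		s->churn_streak++;
	else
		s->churn_streak = 0;

	return s->churn_streak >= CHURN_STREAK_MAX;
}
```

Requiring a streak rather than a single interval is one way to act only
when the pattern is consistent. It does not by itself address what
happens after backing off.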
I asked if we need a back off mechanism or we actually need a fairness
mechanism; the worry was that we would find churn in promotion and
demotion, then back off, then start promoting and demoting again at the
same rate only to rinse and repeat. Gregory noted that this will continue
to consume precious bandwidth on the device so any churn will naturally
impact the rest of the system.
Wei Xu thought that this was more about the sorting of the cold memory: we
need to rank the page coldness across tiers because what we promoted may
not necessarily be hotter than what we demoted. One possible approach is
to use a time dimension where we refuse to demote any memory that is not
cold enough (like MGLRU's min_ttl). Gregory noted this is a function of
the call that we use to promote the memory which, today, is just a call to
migrate_pages(); we could pass gfp flags to specify what to do if the
promotion node is out of memory. If we are calling direct reclaim, that's
probably an indication that we are not aging off inactive pages and we may
be forcing demotion of active pages.
Bharata's current approach uses migrate_misplaced_folio() which
unfortunately does not give a migration context but does wake up kswapd if
the top tier is near an oom condition. There are no gfp flags being
passed, however. We may want to extend the batch function to allow
passing its own allocation function; the current benchmarks would not be
calling into direct reclaim today.
Wei noted that we do not have a mechanism today to be able to compare
relative coldness of memory to make promotion and demotion decisions. He
suggested using a time dimension such that the page being demoted is
guaranteed to not have been accessed in the last interval whereas the page
being promoted *is* guaranteed to have been accessed in that interval.
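As a rough illustration of the time dimension suggested above, the sketch
below gates both decisions on a single shared interval. It is only a
sketch under stated assumptions: struct page_hotness and the helper names
are hypothetical, it assumes a per-page last access timestamp is
available from whatever hotness source is in use, and it assumes
now >= last_access in the same time units.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical per-page record; not an existing kernel structure. */
struct page_hotness {
	uint64_t last_access; /* time of the most recent observed access */
};

/* True if the page was accessed less than one interval ago. */
static bool accessed_in_last_interval(const struct page_hotness *p,
				      uint64_t now, uint64_t interval)
{
	return now - p->last_access < interval;
}

/* Only promote pages that were accessed within the last interval. */
static bool may_promote(const struct page_hotness *p,
			uint64_t now, uint64_t interval)
{
	return accessed_in_last_interval(p, now, interval);
}

/* Only demote pages that were not accessed within the last interval. */
static bool may_demote(const struct page_hotness *p,
		       uint64_t now, uint64_t interval)
{
	return !accessed_in_last_interval(p, now, interval);
}
```

With both checks evaluated against the same now and interval, any page
that passes may_promote() has a more recent last access than any page
that passes may_demote(), so a promotion never displaces a page that was
touched more recently than the page being promoted.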
----->o-----
Next meeting will be on Thursday, February 26 at 8:30am PST (UTC-8),
everybody is welcome: https://meet.google.com/jak-ytdx-hnm
Topics for the next meeting:
- any on-going work for CHMU and a generic hotness interface to leverage
it
- RFC v5 of Bharata's patch series for pghot with two modes of operation
- promotion and demotion thrashing detection and back-off mechanisms
- LSF/MM/BPF 2026 topics to propose for discussion on hotness, promotion,
and memory tiering overall
- Gregory's testing of reclaim fairness with Bharata's changes and first
posting for an RFC
- discuss generalized subsystem for providing bandwidth information
independent of the underlying platform, ideally through resctrl,
otherwise utilizing bandwidth information will be challenging
+ preferably this bandwidth monitoring is not per NUMA node but rather
slow and fast
- determine minimal viable upstream opportunity to optimize for tiering
that is extensible for future use cases and optimizations
+ extensible for multiple tiers
+ must be possible to disable with no memory or performance overhead
- update on non-temporal stores enlightenment for memory tiering
- enlightening migrate_pages() for hardware assists and how this work
will be charged to userspace, including for memory compaction
Please let me know if you'd like to propose additional topics for
discussion, thank you!