On 10/11/25 10:53AM, Bharata B Rao wrote:
>This introduces a sub-system for collecting memory access
>information from different sources. It maintains the hotness
>information based on the access history and time of access.
>
>Additionally, it provides per-lowertier-node kernel threads
>(named kmigrated) that periodically promote the pages that
>are eligible for promotion.
>
>Sub-systems that generate hot page access info can report that
>using this API:
>
>int pghot_record_access(unsigned long pfn, int nid, int src,
>			unsigned long time)
>
>@pfn: The PFN of the memory accessed
>@nid: The accessing NUMA node ID
>@src: The temperature source (sub-system) that generated the
>      access info
>@time: The access time in jiffies
>
>Some temperature sources may not provide the nid from which
>the page was accessed. This is true for sources that use
>page table scanning for PTE Accessed bit. For such sources,
>the default toptier node to which such pages should be promoted
>is hard coded.
>
>Also, the access time provided some sources may at best be
>considered approximate. This is especially true for hot pages
>detected by PTE A bit scanning.
>
>The hotness information is stored for every page of lower
>tier memory in an unsigned long variable that is part of
>mem_section data structure.
>
>kmigrated is a per-lowertier-node kernel thread that migrates
>the folios marked for migration in batches. Each kmigrated
>thread walks the PFN range spanning its node and checks
>for potential migration candidates.
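For anyone following the storage description above, here is a minimal userspace sketch of how a node ID, an access count, and an access time could share one unsigned long per page. The field widths, the saturating counter, and every hot_* name are assumptions made for illustration only; the layout the patch actually uses in mm/pghot.c is not visible in this excerpt.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical field widths; not taken from the patch. */
#define HOT_NID_BITS	10u
#define HOT_FREQ_BITS	3u

#define HOT_FREQ_SHIFT	HOT_NID_BITS
#define HOT_TIME_SHIFT	(HOT_NID_BITS + HOT_FREQ_BITS)

#define HOT_NID_MASK	((1UL << HOT_NID_BITS) - 1)
#define HOT_FREQ_MASK	((1UL << HOT_FREQ_BITS) - 1)
/* The time field takes whatever bits remain in the word. */
#define HOT_TIME_MASK	(~0UL >> HOT_TIME_SHIFT)

/* Pack the three fields into a single per-page record word. */
static unsigned long hot_pack(unsigned long nid, unsigned long freq,
			      unsigned long time)
{
	return (nid & HOT_NID_MASK) |
	       ((freq & HOT_FREQ_MASK) << HOT_FREQ_SHIFT) |
	       ((time & HOT_TIME_MASK) << HOT_TIME_SHIFT);
}

/* Split a record word back into its fields. */
static void hot_unpack(unsigned long rec, unsigned long *nid,
		       unsigned long *freq, unsigned long *time)
{
	*nid = rec & HOT_NID_MASK;
	*freq = (rec >> HOT_FREQ_SHIFT) & HOT_FREQ_MASK;
	*time = (rec >> HOT_TIME_SHIFT) & HOT_TIME_MASK;
}

/*
 * Record one access: bump the counter (saturating at its maximum)
 * and overwrite the node ID and time with the latest values.
 * Assumes the caller passes a valid, non-negative node ID.
 */
static void hot_record_access(unsigned long *rec, int nid,
			      unsigned long time)
{
	unsigned long freq;

	assert(rec != NULL);
	freq = (*rec >> HOT_FREQ_SHIFT) & HOT_FREQ_MASK;
	if (freq < HOT_FREQ_MASK)
		freq++;
	*rec = hot_pack((unsigned long)nid, freq, time);
}
```

In this sketch the time field is truncated to the bits left over after the other two fields, so a real implementation would have to compare times with wraparound in mind.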
>
>Signed-off-by: Bharata B Rao
>---
> include/linux/mmzone.h        |  14 ++
> include/linux/pghot.h         |  52 ++++
> include/linux/vm_event_item.h |   4 +
> mm/Kconfig                    |  11 +
> mm/Makefile                   |   1 +
> mm/mm_init.c                  |  10 +
> mm/page_ext.c                 |  11 +
> mm/pghot.c                    | 446 ++++++++++++++++++++++++++++++++++
> mm/vmstat.c                   |   4 +
> 9 files changed, 553 insertions(+)
> create mode 100644 include/linux/pghot.h
> create mode 100644 mm/pghot.c
>
>+
>+/*
>+ * Walks the PFNs of the zone, isolates and migrates them in batches.
>+ */
>+static void kmigrated_walk_zone(unsigned long start_pfn, unsigned long end_pfn,
>+				int src_nid)
>+{
>+	int cur_nid = NUMA_NO_NODE;
>+	LIST_HEAD(migrate_list);
>+	int batch_count = 0;
>+	struct folio *folio;
>+	struct page *page;
>+	unsigned long pfn;
>+
>+	pfn = start_pfn;
>+	do {
>+		unsigned long nid = NUMA_NO_NODE, freq = 0, time = 0, nr = 1;
>+
>+		if (!pfn_valid(pfn))
>+			goto out_next;
>+
>+		page = pfn_to_online_page(pfn);
>+		if (!page)
>+			goto out_next;
>+
>+		folio = page_folio(page);
>+		nr = folio_nr_pages(folio);
>+		if (folio_nid(folio) != src_nid)
>+			goto out_next;
>+
>+		if (!folio_test_lru(folio))
>+			goto out_next;
>+
>+		if (pghot_get_hotness(pfn, &nid, &freq, &time))

Better to remove the freq value; it is not used later.

Regards,
Alok Rathore