* [PATCH v2 0/4] mm: zone lock tracepoint instrumentation
From: Dmitry Ilvokhin @ 2026-02-25 14:43 UTC
To: Andrew Morton, David Hildenbrand, Lorenzo Stoakes,
Liam R. Howlett, Vlastimil Babka, Mike Rapoport,
Suren Baghdasaryan, Michal Hocko, Steven Rostedt,
Masami Hiramatsu, Mathieu Desnoyers, Brendan Jackman,
Johannes Weiner, Zi Yan, Oscar Salvador, Qi Zheng, Shakeel Butt,
Axel Rasmussen, Yuanchu Xie, Wei Xu
Cc: linux-kernel, linux-mm, linux-trace-kernel, linux-cxl,
kernel-team, Benjamin Cheatham, Dmitry Ilvokhin
Zone lock contention can significantly impact allocation and
reclaim latency, as it is a central synchronization point in
the page allocator and reclaim paths. Improved visibility into
its behavior is therefore important for diagnosing performance
issues in memory-intensive workloads.
On some production workloads at Meta, we have observed noticeable
zone lock contention. Deeper analysis of lock holders and waiters
is currently difficult with existing instrumentation.
While generic lock contention_begin/contention_end tracepoints
cover the slow path, they do not provide sufficient visibility
into lock hold times. In particular, the lack of a release-side
event makes it difficult to identify long lock holders and
correlate them with waiters. As a result, distinguishing between
short bursts of contention and pathological long hold times
requires additional instrumentation.
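To make the point about the missing release-side event concrete: once both an acquire and a release timestamp exist for a critical section, hold time is a simple subtraction, and the longest holder can be found by scanning the spans. The following is a minimal userspace sketch, not code from this series; struct lock_span, lock_hold_ns() and lock_max_hold_ns() are hypothetical names chosen for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical trace record: one acquire/release pair for a lock. */
struct lock_span {
	uint64_t acquired_ns;	/* timestamp of the acquire-side event */
	uint64_t released_ns;	/* timestamp of the release-side event */
};

/* Hold time of one span; only computable when a release event exists. */
static uint64_t lock_hold_ns(const struct lock_span *s)
{
	assert(s->released_ns >= s->acquired_ns);
	return s->released_ns - s->acquired_ns;
}

/* Longest hold time across n spans; returns 0 when n is 0. */
static uint64_t lock_max_hold_ns(const struct lock_span *spans, size_t n)
{
	uint64_t max = 0;

	for (size_t i = 0; i < n; i++) {
		uint64_t h = lock_hold_ns(&spans[i]);

		if (h > max)
			max = h;
	}
	return max;
}
```

With contention_begin/contention_end alone, only the waiter's wait interval is known, so neither function above could be fed real data.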
This patch series adds dedicated tracepoint instrumentation to
zone lock, following the existing mmap_lock tracing model.
The goal is to enable detailed holder/waiter analysis and lock
hold time measurements without affecting the fast path when
tracing is disabled.
The series is structured as follows:
1. Introduce zone lock wrappers.
2. Mechanically convert zone lock users to the wrappers.
3. Convert compaction to use the wrappers (requires minor
restructuring of compact_lock_irqsave()).
4. Add zone lock tracepoints.
The tracepoints are added via lightweight inline helpers in the
wrappers. When tracing is disabled, the fast path remains
unchanged.
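The general shape of such a helper can be sketched in userspace as follows. This is an illustration under stated assumptions, not the code from patch 4: every demo_* name is hypothetical, a plain bool stands in for the tracepoint's enabled state (the kernel normally uses a static key for this check), and __builtin_expect is a GCC/Clang builtin.

```c
#include <assert.h>
#include <stdbool.h>

/* Stand-in for a tracepoint's enabled state (a static key in the kernel). */
static bool demo_trace_enabled;
/* Counts how often the out-of-line trace path ran. */
static unsigned long demo_trace_hits;

/* Out-of-line slow path, reached only when tracing is switched on. */
static void demo_trace_acquire(void)
{
	demo_trace_hits++;
}

/* Single-threaded stand-in for a lock; it only records held/not held. */
struct demo_lock {
	bool held;
};

static inline void demo_lock_acquire(struct demo_lock *l)
{
	assert(!l->held);
	l->held = true;
	/* Tracing disabled: one predicted-not-taken branch, nothing else. */
	if (__builtin_expect(demo_trace_enabled, 0))
		demo_trace_acquire();
}

static inline void demo_lock_release(struct demo_lock *l)
{
	assert(l->held);
	l->held = false;
}
```

The design choice illustrated here is that the wrapper stays inline and the tracing work lives in a separate function, so callers pay only for the enabled check when tracing is off.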
The compaction changes required abstracting compact_lock_irqsave() away from
raw spinlock_t. I chose a small tagged struct to handle both zone and LRU
locks uniformly. If there is a preferred alternative (e.g. splitting helpers
or using a different abstraction), I would appreciate feedback.
Changes in v2:
- Move mechanical changes in mm/compaction.c to a separate commit.
- Remove compact_do_zone_trylock() and compact_do_raw_trylock_irqsave().
Dmitry Ilvokhin (4):
mm: introduce zone lock wrappers
mm: convert zone lock users to wrappers
mm: convert compaction to zone lock wrappers
mm: add tracepoints for zone lock
MAINTAINERS | 3 +
include/linux/zone_lock.h | 100 +++++++++++++++++++++++++++++++
include/trace/events/zone_lock.h | 64 ++++++++++++++++++++
mm/Makefile | 2 +-
mm/compaction.c | 96 +++++++++++++++++++++++------
mm/memory_hotplug.c | 9 +--
mm/mm_init.c | 3 +-
mm/page_alloc.c | 73 +++++++++++-----------
mm/page_isolation.c | 19 +++---
mm/page_reporting.c | 13 ++--
mm/show_mem.c | 5 +-
mm/vmscan.c | 5 +-
mm/vmstat.c | 9 +--
mm/zone_lock.c | 31 ++++++++++
14 files changed, 348 insertions(+), 84 deletions(-)
create mode 100644 include/linux/zone_lock.h
create mode 100644 include/trace/events/zone_lock.h
create mode 100644 mm/zone_lock.c
--
2.47.3
* [PATCH v2 1/4] mm: introduce zone lock wrappers
From: Dmitry Ilvokhin @ 2026-02-25 14:43 UTC
Add thin wrappers around zone lock acquire/release operations. This
prepares the code for future tracepoint instrumentation without
modifying individual call sites.
Centralizing zone lock operations behind wrappers allows future
instrumentation or debugging hooks to be added without touching
all users.
No functional change intended. The wrappers are introduced in
preparation for subsequent patches and are not yet used.
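One detail of the wrappers in this patch is that the irqsave lock variants are macros while the unlock variants are inline functions. spin_lock_irqsave() is itself a macro that assigns to its flags argument by name, so a wrapper has to be a macro as well (or take a pointer). The userspace sketch below mimics that shape only; the demo_* names and the 0x200 value are hypothetical stand-ins, and no real locking or IRQ state is involved.

```c
#include <assert.h>

/* Hypothetical stand-in for struct zone; no real locking is performed. */
struct demo_zone {
	int lock_depth;		/* 1 while "locked", 0 otherwise */
	unsigned long saved;	/* last flags value handed back on unlock */
};

/* Mimics spin_lock_irqsave(): a macro that writes into 'flags' by name. */
#define demo_lock_irqsave(depth, flags)		\
do {						\
	(flags) = 0x200UL;			\
	(*(depth))++;				\
} while (0)

/* The wrapper must also be a macro so 'flags' is assigned in the caller. */
#define demo_zone_lock_irqsave(zone, flags)			\
do {								\
	demo_lock_irqsave(&(zone)->lock_depth, flags);		\
} while (0)

/* Unlock only reads flags, so a plain inline function is enough. */
static inline void demo_zone_unlock_irqrestore(struct demo_zone *zone,
					       unsigned long flags)
{
	zone->saved = flags;
	zone->lock_depth--;
}
```

A call site then reads the same as with the raw primitive: declare unsigned long flags, pass it by name to the lock macro, and pass its value to the unlock function.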
Signed-off-by: Dmitry Ilvokhin <d@ilvokhin.com>
Acked-by: Shakeel Butt <shakeel.butt@linux.dev>
---
MAINTAINERS | 1 +
include/linux/zone_lock.h | 38 ++++++++++++++++++++++++++++++++++++++
2 files changed, 39 insertions(+)
create mode 100644 include/linux/zone_lock.h
diff --git a/MAINTAINERS b/MAINTAINERS
index 55af015174a5..61e3d1f5bf43 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -16680,6 +16680,7 @@ F: include/linux/pgtable.h
F: include/linux/ptdump.h
F: include/linux/vmpressure.h
F: include/linux/vmstat.h
+F: include/linux/zone_lock.h
F: kernel/fork.c
F: mm/Kconfig
F: mm/debug.c
diff --git a/include/linux/zone_lock.h b/include/linux/zone_lock.h
new file mode 100644
index 000000000000..c531e26280e6
--- /dev/null
+++ b/include/linux/zone_lock.h
@@ -0,0 +1,38 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+#ifndef _LINUX_ZONE_LOCK_H
+#define _LINUX_ZONE_LOCK_H
+
+#include <linux/mmzone.h>
+#include <linux/spinlock.h>
+
+static inline void zone_lock_init(struct zone *zone)
+{
+ spin_lock_init(&zone->lock);
+}
+
+#define zone_lock_irqsave(zone, flags) \
+do { \
+ spin_lock_irqsave(&(zone)->lock, flags); \
+} while (0)
+
+#define zone_trylock_irqsave(zone, flags) \
+({ \
+ spin_trylock_irqsave(&(zone)->lock, flags); \
+})
+
+static inline void zone_unlock_irqrestore(struct zone *zone, unsigned long flags)
+{
+ spin_unlock_irqrestore(&zone->lock, flags);
+}
+
+static inline void zone_lock_irq(struct zone *zone)
+{
+ spin_lock_irq(&zone->lock);
+}
+
+static inline void zone_unlock_irq(struct zone *zone)
+{
+ spin_unlock_irq(&zone->lock);
+}
+
+#endif /* _LINUX_ZONE_LOCK_H */
--
2.47.3
* [PATCH v2 2/4] mm: convert zone lock users to wrappers
From: Dmitry Ilvokhin @ 2026-02-25 14:43 UTC
Replace direct zone lock acquire/release operations with the
newly introduced wrappers.
The changes are purely mechanical substitutions. No functional change
intended. Locking semantics and ordering remain unchanged.
The compaction path is left unchanged for now and will be
handled separately in the following patch due to additional
non-trivial modifications.
Signed-off-by: Dmitry Ilvokhin <d@ilvokhin.com>
Acked-by: Shakeel Butt <shakeel.butt@linux.dev>
---
mm/compaction.c | 25 +++++++++-------
mm/memory_hotplug.c | 9 +++---
mm/mm_init.c | 3 +-
mm/page_alloc.c | 73 +++++++++++++++++++++++----------------------
mm/page_isolation.c | 19 ++++++------
mm/page_reporting.c | 13 ++++----
mm/show_mem.c | 5 ++--
mm/vmscan.c | 5 ++--
mm/vmstat.c | 9 +++---
9 files changed, 86 insertions(+), 75 deletions(-)
diff --git a/mm/compaction.c b/mm/compaction.c
index 1e8f8eca318c..47b26187a5df 100644
--- a/mm/compaction.c
+++ b/mm/compaction.c
@@ -24,6 +24,7 @@
#include <linux/page_owner.h>
#include <linux/psi.h>
#include <linux/cpuset.h>
+#include <linux/zone_lock.h>
#include "internal.h"
#ifdef CONFIG_COMPACTION
@@ -530,11 +531,14 @@ static bool compact_lock_irqsave(spinlock_t *lock, unsigned long *flags,
* Returns true if compaction should abort due to fatal signal pending.
* Returns false when compaction can continue.
*/
-static bool compact_unlock_should_abort(spinlock_t *lock,
- unsigned long flags, bool *locked, struct compact_control *cc)
+
+static bool compact_unlock_should_abort(struct zone *zone,
+ unsigned long flags,
+ bool *locked,
+ struct compact_control *cc)
{
if (*locked) {
- spin_unlock_irqrestore(lock, flags);
+ zone_unlock_irqrestore(zone, flags);
*locked = false;
}
@@ -582,9 +586,8 @@ static unsigned long isolate_freepages_block(struct compact_control *cc,
* contention, to give chance to IRQs. Abort if fatal signal
* pending.
*/
- if (!(blockpfn % COMPACT_CLUSTER_MAX)
- && compact_unlock_should_abort(&cc->zone->lock, flags,
- &locked, cc))
+ if (!(blockpfn % COMPACT_CLUSTER_MAX) &&
+ compact_unlock_should_abort(cc->zone, flags, &locked, cc))
break;
nr_scanned++;
@@ -649,7 +652,7 @@ static unsigned long isolate_freepages_block(struct compact_control *cc,
}
if (locked)
- spin_unlock_irqrestore(&cc->zone->lock, flags);
+ zone_unlock_irqrestore(cc->zone, flags);
/*
* Be careful to not go outside of the pageblock.
@@ -1555,7 +1558,7 @@ static void fast_isolate_freepages(struct compact_control *cc)
if (!area->nr_free)
continue;
- spin_lock_irqsave(&cc->zone->lock, flags);
+ zone_lock_irqsave(cc->zone, flags);
freelist = &area->free_list[MIGRATE_MOVABLE];
list_for_each_entry_reverse(freepage, freelist, buddy_list) {
unsigned long pfn;
@@ -1614,7 +1617,7 @@ static void fast_isolate_freepages(struct compact_control *cc)
}
}
- spin_unlock_irqrestore(&cc->zone->lock, flags);
+ zone_unlock_irqrestore(cc->zone, flags);
/* Skip fast search if enough freepages isolated */
if (cc->nr_freepages >= cc->nr_migratepages)
@@ -1988,7 +1991,7 @@ static unsigned long fast_find_migrateblock(struct compact_control *cc)
if (!area->nr_free)
continue;
- spin_lock_irqsave(&cc->zone->lock, flags);
+ zone_lock_irqsave(cc->zone, flags);
freelist = &area->free_list[MIGRATE_MOVABLE];
list_for_each_entry(freepage, freelist, buddy_list) {
unsigned long free_pfn;
@@ -2021,7 +2024,7 @@ static unsigned long fast_find_migrateblock(struct compact_control *cc)
break;
}
}
- spin_unlock_irqrestore(&cc->zone->lock, flags);
+ zone_unlock_irqrestore(cc->zone, flags);
}
cc->total_migrate_scanned += nr_scanned;
diff --git a/mm/memory_hotplug.c b/mm/memory_hotplug.c
index bc805029da51..cfc0103fa50e 100644
--- a/mm/memory_hotplug.c
+++ b/mm/memory_hotplug.c
@@ -36,6 +36,7 @@
#include <linux/rmap.h>
#include <linux/module.h>
#include <linux/node.h>
+#include <linux/zone_lock.h>
#include <asm/tlbflush.h>
@@ -1190,9 +1191,9 @@ int online_pages(unsigned long pfn, unsigned long nr_pages,
* Fixup the number of isolated pageblocks before marking the sections
* onlining, such that undo_isolate_page_range() works correctly.
*/
- spin_lock_irqsave(&zone->lock, flags);
+ zone_lock_irqsave(zone, flags);
zone->nr_isolate_pageblock += nr_pages / pageblock_nr_pages;
- spin_unlock_irqrestore(&zone->lock, flags);
+ zone_unlock_irqrestore(zone, flags);
/*
* If this zone is not populated, then it is not in zonelist.
@@ -2041,9 +2042,9 @@ int offline_pages(unsigned long start_pfn, unsigned long nr_pages,
* effectively stale; nobody should be touching them. Fixup the number
* of isolated pageblocks, memory onlining will properly revert this.
*/
- spin_lock_irqsave(&zone->lock, flags);
+ zone_lock_irqsave(zone, flags);
zone->nr_isolate_pageblock -= nr_pages / pageblock_nr_pages;
- spin_unlock_irqrestore(&zone->lock, flags);
+ zone_unlock_irqrestore(zone, flags);
lru_cache_enable();
zone_pcp_enable(zone);
diff --git a/mm/mm_init.c b/mm/mm_init.c
index 61d983d23f55..6dd37621248b 100644
--- a/mm/mm_init.c
+++ b/mm/mm_init.c
@@ -32,6 +32,7 @@
#include <linux/vmstat.h>
#include <linux/kexec_handover.h>
#include <linux/hugetlb.h>
+#include <linux/zone_lock.h>
#include "internal.h"
#include "slab.h"
#include "shuffle.h"
@@ -1425,7 +1426,7 @@ static void __meminit zone_init_internals(struct zone *zone, enum zone_type idx,
zone_set_nid(zone, nid);
zone->name = zone_names[idx];
zone->zone_pgdat = NODE_DATA(nid);
- spin_lock_init(&zone->lock);
+ zone_lock_init(zone);
zone_seqlock_init(zone);
zone_pcp_init(zone);
}
diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index fcc32737f451..c5d13fe9b79f 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -54,6 +54,7 @@
#include <linux/delayacct.h>
#include <linux/cacheinfo.h>
#include <linux/pgalloc_tag.h>
+#include <linux/zone_lock.h>
#include <asm/div64.h>
#include "internal.h"
#include "shuffle.h"
@@ -1500,7 +1501,7 @@ static void free_pcppages_bulk(struct zone *zone, int count,
/* Ensure requested pindex is drained first. */
pindex = pindex - 1;
- spin_lock_irqsave(&zone->lock, flags);
+ zone_lock_irqsave(zone, flags);
while (count > 0) {
struct list_head *list;
@@ -1533,7 +1534,7 @@ static void free_pcppages_bulk(struct zone *zone, int count,
} while (count > 0 && !list_empty(list));
}
- spin_unlock_irqrestore(&zone->lock, flags);
+ zone_unlock_irqrestore(zone, flags);
}
/* Split a multi-block free page into its individual pageblocks. */
@@ -1577,12 +1578,12 @@ static void free_one_page(struct zone *zone, struct page *page,
unsigned long flags;
if (unlikely(fpi_flags & FPI_TRYLOCK)) {
- if (!spin_trylock_irqsave(&zone->lock, flags)) {
+ if (!zone_trylock_irqsave(zone, flags)) {
add_page_to_zone_llist(zone, page, order);
return;
}
} else {
- spin_lock_irqsave(&zone->lock, flags);
+ zone_lock_irqsave(zone, flags);
}
/* The lock succeeded. Process deferred pages. */
@@ -1600,7 +1601,7 @@ static void free_one_page(struct zone *zone, struct page *page,
}
}
split_large_buddy(zone, page, pfn, order, fpi_flags);
- spin_unlock_irqrestore(&zone->lock, flags);
+ zone_unlock_irqrestore(zone, flags);
__count_vm_events(PGFREE, 1 << order);
}
@@ -2553,10 +2554,10 @@ static int rmqueue_bulk(struct zone *zone, unsigned int order,
int i;
if (unlikely(alloc_flags & ALLOC_TRYLOCK)) {
- if (!spin_trylock_irqsave(&zone->lock, flags))
+ if (!zone_trylock_irqsave(zone, flags))
return 0;
} else {
- spin_lock_irqsave(&zone->lock, flags);
+ zone_lock_irqsave(zone, flags);
}
for (i = 0; i < count; ++i) {
struct page *page = __rmqueue(zone, order, migratetype,
@@ -2576,7 +2577,7 @@ static int rmqueue_bulk(struct zone *zone, unsigned int order,
*/
list_add_tail(&page->pcp_list, list);
}
- spin_unlock_irqrestore(&zone->lock, flags);
+ zone_unlock_irqrestore(zone, flags);
return i;
}
@@ -3246,10 +3247,10 @@ struct page *rmqueue_buddy(struct zone *preferred_zone, struct zone *zone,
do {
page = NULL;
if (unlikely(alloc_flags & ALLOC_TRYLOCK)) {
- if (!spin_trylock_irqsave(&zone->lock, flags))
+ if (!zone_trylock_irqsave(zone, flags))
return NULL;
} else {
- spin_lock_irqsave(&zone->lock, flags);
+ zone_lock_irqsave(zone, flags);
}
if (alloc_flags & ALLOC_HIGHATOMIC)
page = __rmqueue_smallest(zone, order, MIGRATE_HIGHATOMIC);
@@ -3268,11 +3269,11 @@ struct page *rmqueue_buddy(struct zone *preferred_zone, struct zone *zone,
page = __rmqueue_smallest(zone, order, MIGRATE_HIGHATOMIC);
if (!page) {
- spin_unlock_irqrestore(&zone->lock, flags);
+ zone_unlock_irqrestore(zone, flags);
return NULL;
}
}
- spin_unlock_irqrestore(&zone->lock, flags);
+ zone_unlock_irqrestore(zone, flags);
} while (check_new_pages(page, order));
__count_zid_vm_events(PGALLOC, page_zonenum(page), 1 << order);
@@ -3459,7 +3460,7 @@ static void reserve_highatomic_pageblock(struct page *page, int order,
if (zone->nr_reserved_highatomic >= max_managed)
return;
- spin_lock_irqsave(&zone->lock, flags);
+ zone_lock_irqsave(zone, flags);
/* Recheck the nr_reserved_highatomic limit under the lock */
if (zone->nr_reserved_highatomic >= max_managed)
@@ -3481,7 +3482,7 @@ static void reserve_highatomic_pageblock(struct page *page, int order,
}
out_unlock:
- spin_unlock_irqrestore(&zone->lock, flags);
+ zone_unlock_irqrestore(zone, flags);
}
/*
@@ -3514,7 +3515,7 @@ static bool unreserve_highatomic_pageblock(const struct alloc_context *ac,
pageblock_nr_pages)
continue;
- spin_lock_irqsave(&zone->lock, flags);
+ zone_lock_irqsave(zone, flags);
for (order = 0; order < NR_PAGE_ORDERS; order++) {
struct free_area *area = &(zone->free_area[order]);
unsigned long size;
@@ -3562,11 +3563,11 @@ static bool unreserve_highatomic_pageblock(const struct alloc_context *ac,
*/
WARN_ON_ONCE(ret == -1);
if (ret > 0) {
- spin_unlock_irqrestore(&zone->lock, flags);
+ zone_unlock_irqrestore(zone, flags);
return ret;
}
}
- spin_unlock_irqrestore(&zone->lock, flags);
+ zone_unlock_irqrestore(zone, flags);
}
return false;
@@ -6446,7 +6447,7 @@ static void __setup_per_zone_wmarks(void)
for_each_zone(zone) {
u64 tmp;
- spin_lock_irqsave(&zone->lock, flags);
+ zone_lock_irqsave(zone, flags);
tmp = (u64)pages_min * zone_managed_pages(zone);
tmp = div64_ul(tmp, lowmem_pages);
if (is_highmem(zone) || zone_idx(zone) == ZONE_MOVABLE) {
@@ -6487,7 +6488,7 @@ static void __setup_per_zone_wmarks(void)
zone->_watermark[WMARK_PROMO] = high_wmark_pages(zone) + tmp;
trace_mm_setup_per_zone_wmarks(zone);
- spin_unlock_irqrestore(&zone->lock, flags);
+ zone_unlock_irqrestore(zone, flags);
}
/* update totalreserve_pages */
@@ -7257,7 +7258,7 @@ struct page *alloc_contig_frozen_pages_noprof(unsigned long nr_pages,
zonelist = node_zonelist(nid, gfp_mask);
for_each_zone_zonelist_nodemask(zone, z, zonelist,
gfp_zone(gfp_mask), nodemask) {
- spin_lock_irqsave(&zone->lock, flags);
+ zone_lock_irqsave(zone, flags);
pfn = ALIGN(zone->zone_start_pfn, nr_pages);
while (zone_spans_last_pfn(zone, pfn, nr_pages)) {
@@ -7271,18 +7272,18 @@ struct page *alloc_contig_frozen_pages_noprof(unsigned long nr_pages,
* allocation spinning on this lock, it may
* win the race and cause allocation to fail.
*/
- spin_unlock_irqrestore(&zone->lock, flags);
+ zone_unlock_irqrestore(zone, flags);
ret = alloc_contig_frozen_range_noprof(pfn,
pfn + nr_pages,
ACR_FLAGS_NONE,
gfp_mask);
if (!ret)
return pfn_to_page(pfn);
- spin_lock_irqsave(&zone->lock, flags);
+ zone_lock_irqsave(zone, flags);
}
pfn += nr_pages;
}
- spin_unlock_irqrestore(&zone->lock, flags);
+ zone_unlock_irqrestore(zone, flags);
}
/*
* If we failed, retry the search, but treat regions with HugeTLB pages
@@ -7436,7 +7437,7 @@ unsigned long __offline_isolated_pages(unsigned long start_pfn,
offline_mem_sections(pfn, end_pfn);
zone = page_zone(pfn_to_page(pfn));
- spin_lock_irqsave(&zone->lock, flags);
+ zone_lock_irqsave(zone, flags);
while (pfn < end_pfn) {
page = pfn_to_page(pfn);
/*
@@ -7466,7 +7467,7 @@ unsigned long __offline_isolated_pages(unsigned long start_pfn,
del_page_from_free_list(page, zone, order, MIGRATE_ISOLATE);
pfn += (1 << order);
}
- spin_unlock_irqrestore(&zone->lock, flags);
+ zone_unlock_irqrestore(zone, flags);
return end_pfn - start_pfn - already_offline;
}
@@ -7542,7 +7543,7 @@ bool take_page_off_buddy(struct page *page)
unsigned int order;
bool ret = false;
- spin_lock_irqsave(&zone->lock, flags);
+ zone_lock_irqsave(zone, flags);
for (order = 0; order < NR_PAGE_ORDERS; order++) {
struct page *page_head = page - (pfn & ((1 << order) - 1));
int page_order = buddy_order(page_head);
@@ -7563,7 +7564,7 @@ bool take_page_off_buddy(struct page *page)
if (page_count(page_head) > 0)
break;
}
- spin_unlock_irqrestore(&zone->lock, flags);
+ zone_unlock_irqrestore(zone, flags);
return ret;
}
@@ -7576,7 +7577,7 @@ bool put_page_back_buddy(struct page *page)
unsigned long flags;
bool ret = false;
- spin_lock_irqsave(&zone->lock, flags);
+ zone_lock_irqsave(zone, flags);
if (put_page_testzero(page)) {
unsigned long pfn = page_to_pfn(page);
int migratetype = get_pfnblock_migratetype(page, pfn);
@@ -7587,7 +7588,7 @@ bool put_page_back_buddy(struct page *page)
ret = true;
}
}
- spin_unlock_irqrestore(&zone->lock, flags);
+ zone_unlock_irqrestore(zone, flags);
return ret;
}
@@ -7636,7 +7637,7 @@ static void __accept_page(struct zone *zone, unsigned long *flags,
account_freepages(zone, -MAX_ORDER_NR_PAGES, MIGRATE_MOVABLE);
__mod_zone_page_state(zone, NR_UNACCEPTED, -MAX_ORDER_NR_PAGES);
__ClearPageUnaccepted(page);
- spin_unlock_irqrestore(&zone->lock, *flags);
+ zone_unlock_irqrestore(zone, *flags);
accept_memory(page_to_phys(page), PAGE_SIZE << MAX_PAGE_ORDER);
@@ -7648,9 +7649,9 @@ void accept_page(struct page *page)
struct zone *zone = page_zone(page);
unsigned long flags;
- spin_lock_irqsave(&zone->lock, flags);
+ zone_lock_irqsave(zone, flags);
if (!PageUnaccepted(page)) {
- spin_unlock_irqrestore(&zone->lock, flags);
+ zone_unlock_irqrestore(zone, flags);
return;
}
@@ -7663,11 +7664,11 @@ static bool try_to_accept_memory_one(struct zone *zone)
unsigned long flags;
struct page *page;
- spin_lock_irqsave(&zone->lock, flags);
+ zone_lock_irqsave(zone, flags);
page = list_first_entry_or_null(&zone->unaccepted_pages,
struct page, lru);
if (!page) {
- spin_unlock_irqrestore(&zone->lock, flags);
+ zone_unlock_irqrestore(zone, flags);
return false;
}
@@ -7724,12 +7725,12 @@ static bool __free_unaccepted(struct page *page)
if (!lazy_accept)
return false;
- spin_lock_irqsave(&zone->lock, flags);
+ zone_lock_irqsave(zone, flags);
list_add_tail(&page->lru, &zone->unaccepted_pages);
account_freepages(zone, MAX_ORDER_NR_PAGES, MIGRATE_MOVABLE);
__mod_zone_page_state(zone, NR_UNACCEPTED, MAX_ORDER_NR_PAGES);
__SetPageUnaccepted(page);
- spin_unlock_irqrestore(&zone->lock, flags);
+ zone_unlock_irqrestore(zone, flags);
return true;
}
diff --git a/mm/page_isolation.c b/mm/page_isolation.c
index c48ff5c00244..56a272f38b66 100644
--- a/mm/page_isolation.c
+++ b/mm/page_isolation.c
@@ -10,6 +10,7 @@
#include <linux/hugetlb.h>
#include <linux/page_owner.h>
#include <linux/migrate.h>
+#include <linux/zone_lock.h>
#include "internal.h"
#define CREATE_TRACE_POINTS
@@ -173,7 +174,7 @@ static int set_migratetype_isolate(struct page *page, enum pb_isolate_mode mode,
if (PageUnaccepted(page))
accept_page(page);
- spin_lock_irqsave(&zone->lock, flags);
+ zone_lock_irqsave(zone, flags);
/*
* We assume the caller intended to SET migrate type to isolate.
@@ -181,7 +182,7 @@ static int set_migratetype_isolate(struct page *page, enum pb_isolate_mode mode,
* set it before us.
*/
if (is_migrate_isolate_page(page)) {
- spin_unlock_irqrestore(&zone->lock, flags);
+ zone_unlock_irqrestore(zone, flags);
return -EBUSY;
}
@@ -200,15 +201,15 @@ static int set_migratetype_isolate(struct page *page, enum pb_isolate_mode mode,
mode);
if (!unmovable) {
if (!pageblock_isolate_and_move_free_pages(zone, page)) {
- spin_unlock_irqrestore(&zone->lock, flags);
+ zone_unlock_irqrestore(zone, flags);
return -EBUSY;
}
zone->nr_isolate_pageblock++;
- spin_unlock_irqrestore(&zone->lock, flags);
+ zone_unlock_irqrestore(zone, flags);
return 0;
}
- spin_unlock_irqrestore(&zone->lock, flags);
+ zone_unlock_irqrestore(zone, flags);
if (mode == PB_ISOLATE_MODE_MEM_OFFLINE) {
/*
* printk() with zone->lock held will likely trigger a
@@ -229,7 +230,7 @@ static void unset_migratetype_isolate(struct page *page)
struct page *buddy;
zone = page_zone(page);
- spin_lock_irqsave(&zone->lock, flags);
+ zone_lock_irqsave(zone, flags);
if (!is_migrate_isolate_page(page))
goto out;
@@ -280,7 +281,7 @@ static void unset_migratetype_isolate(struct page *page)
}
zone->nr_isolate_pageblock--;
out:
- spin_unlock_irqrestore(&zone->lock, flags);
+ zone_unlock_irqrestore(zone, flags);
}
static inline struct page *
@@ -641,9 +642,9 @@ int test_pages_isolated(unsigned long start_pfn, unsigned long end_pfn,
/* Check all pages are free or marked as ISOLATED */
zone = page_zone(page);
- spin_lock_irqsave(&zone->lock, flags);
+ zone_lock_irqsave(zone, flags);
pfn = __test_page_isolated_in_pageblock(start_pfn, end_pfn, mode);
- spin_unlock_irqrestore(&zone->lock, flags);
+ zone_unlock_irqrestore(zone, flags);
ret = pfn < end_pfn ? -EBUSY : 0;
diff --git a/mm/page_reporting.c b/mm/page_reporting.c
index f0042d5743af..37e54e16538b 100644
--- a/mm/page_reporting.c
+++ b/mm/page_reporting.c
@@ -7,6 +7,7 @@
#include <linux/module.h>
#include <linux/delay.h>
#include <linux/scatterlist.h>
+#include <linux/zone_lock.h>
#include "page_reporting.h"
#include "internal.h"
@@ -161,7 +162,7 @@ page_reporting_cycle(struct page_reporting_dev_info *prdev, struct zone *zone,
if (list_empty(list))
return err;
- spin_lock_irq(&zone->lock);
+ zone_lock_irq(zone);
/*
* Limit how many calls we will be making to the page reporting
@@ -219,7 +220,7 @@ page_reporting_cycle(struct page_reporting_dev_info *prdev, struct zone *zone,
list_rotate_to_front(&page->lru, list);
/* release lock before waiting on report processing */
- spin_unlock_irq(&zone->lock);
+ zone_unlock_irq(zone);
/* begin processing pages in local list */
err = prdev->report(prdev, sgl, PAGE_REPORTING_CAPACITY);
@@ -231,7 +232,7 @@ page_reporting_cycle(struct page_reporting_dev_info *prdev, struct zone *zone,
budget--;
/* reacquire zone lock and resume processing */
- spin_lock_irq(&zone->lock);
+ zone_lock_irq(zone);
/* flush reported pages from the sg list */
page_reporting_drain(prdev, sgl, PAGE_REPORTING_CAPACITY, !err);
@@ -251,7 +252,7 @@ page_reporting_cycle(struct page_reporting_dev_info *prdev, struct zone *zone,
if (!list_entry_is_head(next, list, lru) && !list_is_first(&next->lru, list))
list_rotate_to_front(&next->lru, list);
- spin_unlock_irq(&zone->lock);
+ zone_unlock_irq(zone);
return err;
}
@@ -296,9 +297,9 @@ page_reporting_process_zone(struct page_reporting_dev_info *prdev,
err = prdev->report(prdev, sgl, leftover);
/* flush any remaining pages out from the last report */
- spin_lock_irq(&zone->lock);
+ zone_lock_irq(zone);
page_reporting_drain(prdev, sgl, leftover, !err);
- spin_unlock_irq(&zone->lock);
+ zone_unlock_irq(zone);
}
return err;
diff --git a/mm/show_mem.c b/mm/show_mem.c
index 24078ac3e6bc..245beca127af 100644
--- a/mm/show_mem.c
+++ b/mm/show_mem.c
@@ -14,6 +14,7 @@
#include <linux/mmzone.h>
#include <linux/swap.h>
#include <linux/vmstat.h>
+#include <linux/zone_lock.h>
#include "internal.h"
#include "swap.h"
@@ -363,7 +364,7 @@ static void show_free_areas(unsigned int filter, nodemask_t *nodemask, int max_z
show_node(zone);
printk(KERN_CONT "%s: ", zone->name);
- spin_lock_irqsave(&zone->lock, flags);
+ zone_lock_irqsave(zone, flags);
for (order = 0; order < NR_PAGE_ORDERS; order++) {
struct free_area *area = &zone->free_area[order];
int type;
@@ -377,7 +378,7 @@ static void show_free_areas(unsigned int filter, nodemask_t *nodemask, int max_z
types[order] |= 1 << type;
}
}
- spin_unlock_irqrestore(&zone->lock, flags);
+ zone_unlock_irqrestore(zone, flags);
for (order = 0; order < NR_PAGE_ORDERS; order++) {
printk(KERN_CONT "%lu*%lukB ",
nr[order], K(1UL) << order);
diff --git a/mm/vmscan.c b/mm/vmscan.c
index 0fc9373e8251..b369e00e8415 100644
--- a/mm/vmscan.c
+++ b/mm/vmscan.c
@@ -58,6 +58,7 @@
#include <linux/random.h>
#include <linux/mmu_notifier.h>
#include <linux/parser.h>
+#include <linux/zone_lock.h>
#include <asm/tlbflush.h>
#include <asm/div64.h>
@@ -7139,9 +7140,9 @@ static int balance_pgdat(pg_data_t *pgdat, int order, int highest_zoneidx)
/* Increments are under the zone lock */
zone = pgdat->node_zones + i;
- spin_lock_irqsave(&zone->lock, flags);
+ zone_lock_irqsave(zone, flags);
zone->watermark_boost -= min(zone->watermark_boost, zone_boosts[i]);
- spin_unlock_irqrestore(&zone->lock, flags);
+ zone_unlock_irqrestore(zone, flags);
}
/*
diff --git a/mm/vmstat.c b/mm/vmstat.c
index 86b14b0f77b5..299b461a6b4b 100644
--- a/mm/vmstat.c
+++ b/mm/vmstat.c
@@ -28,6 +28,7 @@
#include <linux/mm_inline.h>
#include <linux/page_owner.h>
#include <linux/sched/isolation.h>
+#include <linux/zone_lock.h>
#include "internal.h"
@@ -1535,10 +1536,10 @@ static void walk_zones_in_node(struct seq_file *m, pg_data_t *pgdat,
continue;
if (!nolock)
- spin_lock_irqsave(&zone->lock, flags);
+ zone_lock_irqsave(zone, flags);
print(m, pgdat, zone);
if (!nolock)
- spin_unlock_irqrestore(&zone->lock, flags);
+ zone_unlock_irqrestore(zone, flags);
}
}
#endif
@@ -1603,9 +1604,9 @@ static void pagetypeinfo_showfree_print(struct seq_file *m,
}
}
seq_printf(m, "%s%6lu ", overflow ? ">" : "", freecount);
- spin_unlock_irq(&zone->lock);
+ zone_unlock_irq(zone);
cond_resched();
- spin_lock_irq(&zone->lock);
+ zone_lock_irq(zone);
}
seq_putc(m, '\n');
}
--
2.47.3
* [PATCH v2 3/4] mm: convert compaction to zone lock wrappers
From: Dmitry Ilvokhin @ 2026-02-25 14:43 UTC
Compaction uses compact_lock_irqsave(), which currently operates
on a raw spinlock_t pointer so that it can be used for both
zone->lock and lru_lock. Since zone lock operations are now wrapped,
compact_lock_irqsave() can no longer operate directly on a spinlock_t
when the lock belongs to a zone.
Introduce struct compact_lock to abstract the underlying lock type. The
structure carries a lock type enum and a union holding either a zone
pointer or a raw spinlock_t pointer, and dispatches to the appropriate
lock/unlock helper.
No functional change intended.
Signed-off-by: Dmitry Ilvokhin <d@ilvokhin.com>
---
mm/compaction.c | 73 +++++++++++++++++++++++++++++++++++++++++++------
1 file changed, 64 insertions(+), 9 deletions(-)
diff --git a/mm/compaction.c b/mm/compaction.c
index 47b26187a5df..c3b97379a963 100644
--- a/mm/compaction.c
+++ b/mm/compaction.c
@@ -494,6 +494,53 @@ static bool test_and_set_skip(struct compact_control *cc, struct page *page)
}
#endif /* CONFIG_COMPACTION */
+enum compact_lock_type {
+ COMPACT_LOCK_ZONE,
+ COMPACT_LOCK_RAW_SPINLOCK,
+};
+
+struct compact_lock {
+ enum compact_lock_type type;
+ union {
+ struct zone *zone;
+ spinlock_t *lock; /* Reference to lru lock */
+ };
+};
+
+static bool compact_do_trylock_irqsave(struct compact_lock lock,
+ unsigned long *flags)
+{
+ if (lock.type == COMPACT_LOCK_ZONE)
+ return zone_trylock_irqsave(lock.zone, *flags);
+
+ return spin_trylock_irqsave(lock.lock, *flags);
+}
+
+static void compact_do_zone_lock_irqsave(struct zone *zone,
+ unsigned long *flags)
+__acquires(zone->lock)
+{
+ zone_lock_irqsave(zone, *flags);
+}
+
+static void compact_do_raw_lock_irqsave(spinlock_t *lock,
+ unsigned long *flags)
+__acquires(lock)
+{
+ spin_lock_irqsave(lock, *flags);
+}
+
+static void compact_do_lock_irqsave(struct compact_lock lock,
+ unsigned long *flags)
+{
+ if (lock.type == COMPACT_LOCK_ZONE) {
+ compact_do_zone_lock_irqsave(lock.zone, flags);
+ return;
+ }
+
+ compact_do_raw_lock_irqsave(lock.lock, flags);
+}
+
/*
* Compaction requires the taking of some coarse locks that are potentially
* very heavily contended. For async compaction, trylock and record if the
@@ -503,19 +550,19 @@ static bool test_and_set_skip(struct compact_control *cc, struct page *page)
*
* Always returns true which makes it easier to track lock state in callers.
*/
-static bool compact_lock_irqsave(spinlock_t *lock, unsigned long *flags,
- struct compact_control *cc)
- __acquires(lock)
+static bool compact_lock_irqsave(struct compact_lock lock,
+ unsigned long *flags,
+ struct compact_control *cc)
{
/* Track if the lock is contended in async mode */
if (cc->mode == MIGRATE_ASYNC && !cc->contended) {
- if (spin_trylock_irqsave(lock, *flags))
+ if (compact_do_trylock_irqsave(lock, flags))
return true;
cc->contended = true;
}
- spin_lock_irqsave(lock, *flags);
+ compact_do_lock_irqsave(lock, flags);
return true;
}
@@ -531,7 +578,6 @@ static bool compact_lock_irqsave(spinlock_t *lock, unsigned long *flags,
* Returns true if compaction should abort due to fatal signal pending.
* Returns false when compaction can continue.
*/
-
static bool compact_unlock_should_abort(struct zone *zone,
unsigned long flags,
bool *locked,
@@ -616,8 +662,12 @@ static unsigned long isolate_freepages_block(struct compact_control *cc,
/* If we already hold the lock, we can skip some rechecking. */
if (!locked) {
- locked = compact_lock_irqsave(&cc->zone->lock,
- &flags, cc);
+ struct compact_lock zol = {
+ .type = COMPACT_LOCK_ZONE,
+ .zone = cc->zone,
+ };
+
+ locked = compact_lock_irqsave(zol, &flags, cc);
/* Recheck this is a buddy page under lock */
if (!PageBuddy(page))
@@ -1160,10 +1210,15 @@ isolate_migratepages_block(struct compact_control *cc, unsigned long low_pfn,
/* If we already hold the lock, we can skip some rechecking */
if (lruvec != locked) {
+ struct compact_lock zol = {
+ .type = COMPACT_LOCK_RAW_SPINLOCK,
+ .lock = &lruvec->lru_lock,
+ };
+
if (locked)
unlock_page_lruvec_irqrestore(locked, flags);
- compact_lock_irqsave(&lruvec->lru_lock, &flags, cc);
+ compact_lock_irqsave(zol, &flags, cc);
locked = lruvec;
lruvec_memcg_debug(lruvec, folio);
--
2.47.3
* [PATCH v2 4/4] mm: add tracepoints for zone lock
2026-02-25 14:43 [PATCH v2 0/4] mm: zone lock tracepoint instrumentation Dmitry Ilvokhin
` (2 preceding siblings ...)
2026-02-25 14:43 ` [PATCH v2 3/4] mm: convert compaction to zone lock wrappers Dmitry Ilvokhin
@ 2026-02-25 14:43 ` Dmitry Ilvokhin
3 siblings, 0 replies; 7+ messages in thread
From: Dmitry Ilvokhin @ 2026-02-25 14:43 UTC (permalink / raw)
To: Andrew Morton, David Hildenbrand, Lorenzo Stoakes,
Liam R. Howlett, Vlastimil Babka, Mike Rapoport,
Suren Baghdasaryan, Michal Hocko, Steven Rostedt,
Masami Hiramatsu, Mathieu Desnoyers, Brendan Jackman,
Johannes Weiner, Zi Yan, Oscar Salvador, Qi Zheng, Shakeel Butt,
Axel Rasmussen, Yuanchu Xie, Wei Xu
Cc: linux-kernel, linux-mm, linux-trace-kernel, linux-cxl,
kernel-team, Benjamin Cheatham, Dmitry Ilvokhin
Add tracepoint instrumentation to zone lock acquire/release operations
via the previously introduced wrappers.
The implementation follows the mmap_lock tracepoint pattern: a
lightweight inline helper checks whether the tracepoint is enabled and
calls into an out-of-line helper when tracing is active. When
CONFIG_TRACING is disabled, helpers compile to empty inline stubs.
The fast path is unaffected when tracing is disabled.
Signed-off-by: Dmitry Ilvokhin <d@ilvokhin.com>
---
MAINTAINERS | 2 +
include/linux/zone_lock.h | 64 +++++++++++++++++++++++++++++++-
include/trace/events/zone_lock.h | 64 ++++++++++++++++++++++++++++++++
mm/Makefile | 2 +-
mm/zone_lock.c | 31 ++++++++++++++++
5 files changed, 161 insertions(+), 2 deletions(-)
create mode 100644 include/trace/events/zone_lock.h
create mode 100644 mm/zone_lock.c
diff --git a/MAINTAINERS b/MAINTAINERS
index 61e3d1f5bf43..b5aa2bb5d2ba 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -16681,6 +16681,7 @@ F: include/linux/ptdump.h
F: include/linux/vmpressure.h
F: include/linux/vmstat.h
F: include/linux/zone_lock.h
+F: include/trace/events/zone_lock.h
F: kernel/fork.c
F: mm/Kconfig
F: mm/debug.c
@@ -16700,6 +16701,7 @@ F: mm/sparse.c
F: mm/util.c
F: mm/vmpressure.c
F: mm/vmstat.c
+F: mm/zone_lock.c
N: include/linux/page[-_]*
MEMORY MANAGEMENT - EXECMEM
diff --git a/include/linux/zone_lock.h b/include/linux/zone_lock.h
index c531e26280e6..cea41dd56324 100644
--- a/include/linux/zone_lock.h
+++ b/include/linux/zone_lock.h
@@ -4,6 +4,53 @@
#include <linux/mmzone.h>
#include <linux/spinlock.h>
+#include <linux/tracepoint-defs.h>
+
+DECLARE_TRACEPOINT(zone_lock_start_locking);
+DECLARE_TRACEPOINT(zone_lock_acquire_returned);
+DECLARE_TRACEPOINT(zone_lock_released);
+
+#ifdef CONFIG_TRACING
+
+void __zone_lock_do_trace_start_locking(struct zone *zone);
+void __zone_lock_do_trace_acquire_returned(struct zone *zone, bool success);
+void __zone_lock_do_trace_released(struct zone *zone);
+
+static inline void __zone_lock_trace_start_locking(struct zone *zone)
+{
+ if (tracepoint_enabled(zone_lock_start_locking))
+ __zone_lock_do_trace_start_locking(zone);
+}
+
+static inline void __zone_lock_trace_acquire_returned(struct zone *zone,
+ bool success)
+{
+ if (tracepoint_enabled(zone_lock_acquire_returned))
+ __zone_lock_do_trace_acquire_returned(zone, success);
+}
+
+static inline void __zone_lock_trace_released(struct zone *zone)
+{
+ if (tracepoint_enabled(zone_lock_released))
+ __zone_lock_do_trace_released(zone);
+}
+
+#else /* !CONFIG_TRACING */
+
+static inline void __zone_lock_trace_start_locking(struct zone *zone)
+{
+}
+
+static inline void __zone_lock_trace_acquire_returned(struct zone *zone,
+ bool success)
+{
+}
+
+static inline void __zone_lock_trace_released(struct zone *zone)
+{
+}
+
+#endif /* CONFIG_TRACING */
static inline void zone_lock_init(struct zone *zone)
{
@@ -12,26 +59,41 @@ static inline void zone_lock_init(struct zone *zone)
#define zone_lock_irqsave(zone, flags) \
do { \
+ bool success = true; \
+ \
+ __zone_lock_trace_start_locking(zone); \
spin_lock_irqsave(&(zone)->lock, flags); \
+ __zone_lock_trace_acquire_returned(zone, success); \
} while (0)
#define zone_trylock_irqsave(zone, flags) \
({ \
- spin_trylock_irqsave(&(zone)->lock, flags); \
+ bool success; \
+ \
+ __zone_lock_trace_start_locking(zone); \
+ success = spin_trylock_irqsave(&(zone)->lock, flags); \
+ __zone_lock_trace_acquire_returned(zone, success); \
+ success; \
})
static inline void zone_unlock_irqrestore(struct zone *zone, unsigned long flags)
{
+ __zone_lock_trace_released(zone);
spin_unlock_irqrestore(&zone->lock, flags);
}
static inline void zone_lock_irq(struct zone *zone)
{
+ bool success = true;
+
+ __zone_lock_trace_start_locking(zone);
spin_lock_irq(&zone->lock);
+ __zone_lock_trace_acquire_returned(zone, success);
}
static inline void zone_unlock_irq(struct zone *zone)
{
+ __zone_lock_trace_released(zone);
spin_unlock_irq(&zone->lock);
}
diff --git a/include/trace/events/zone_lock.h b/include/trace/events/zone_lock.h
new file mode 100644
index 000000000000..3df82a8c0160
--- /dev/null
+++ b/include/trace/events/zone_lock.h
@@ -0,0 +1,64 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+#undef TRACE_SYSTEM
+#define TRACE_SYSTEM zone_lock
+
+#if !defined(_TRACE_ZONE_LOCK_H) || defined(TRACE_HEADER_MULTI_READ)
+#define _TRACE_ZONE_LOCK_H
+
+#include <linux/tracepoint.h>
+#include <linux/types.h>
+
+struct zone;
+
+DECLARE_EVENT_CLASS(zone_lock,
+
+ TP_PROTO(struct zone *zone),
+
+ TP_ARGS(zone),
+
+ TP_STRUCT__entry(
+ __field(struct zone *, zone)
+ ),
+
+ TP_fast_assign(
+ __entry->zone = zone;
+ ),
+
+ TP_printk("zone=%p", __entry->zone)
+);
+
+#define DEFINE_ZONE_LOCK_EVENT(name) \
+ DEFINE_EVENT(zone_lock, name, \
+ TP_PROTO(struct zone *zone), \
+ TP_ARGS(zone))
+
+DEFINE_ZONE_LOCK_EVENT(zone_lock_start_locking);
+DEFINE_ZONE_LOCK_EVENT(zone_lock_released);
+
+TRACE_EVENT(zone_lock_acquire_returned,
+
+ TP_PROTO(struct zone *zone, bool success),
+
+ TP_ARGS(zone, success),
+
+ TP_STRUCT__entry(
+ __field(struct zone *, zone)
+ __field(bool, success)
+ ),
+
+ TP_fast_assign(
+ __entry->zone = zone;
+ __entry->success = success;
+ ),
+
+ TP_printk(
+ "zone=%p success=%s",
+ __entry->zone,
+ __entry->success ? "true" : "false"
+ )
+);
+
+#endif /* _TRACE_ZONE_LOCK_H */
+
+/* This part must be outside protection */
+#include <trace/define_trace.h>
diff --git a/mm/Makefile b/mm/Makefile
index 8ad2ab08244e..ffd06cf7a04e 100644
--- a/mm/Makefile
+++ b/mm/Makefile
@@ -55,7 +55,7 @@ obj-y := filemap.o mempool.o oom_kill.o fadvise.o \
mm_init.o percpu.o slab_common.o \
compaction.o show_mem.o \
interval_tree.o list_lru.o workingset.o \
- debug.o gup.o mmap_lock.o vma_init.o $(mmu-y)
+ debug.o gup.o mmap_lock.o zone_lock.o vma_init.o $(mmu-y)
# Give 'page_alloc' its own module-parameter namespace
page-alloc-y := page_alloc.o
diff --git a/mm/zone_lock.c b/mm/zone_lock.c
new file mode 100644
index 000000000000..f647fd2aca48
--- /dev/null
+++ b/mm/zone_lock.c
@@ -0,0 +1,31 @@
+// SPDX-License-Identifier: GPL-2.0
+#define CREATE_TRACE_POINTS
+#include <trace/events/zone_lock.h>
+
+#include <linux/zone_lock.h>
+
+EXPORT_TRACEPOINT_SYMBOL(zone_lock_start_locking);
+EXPORT_TRACEPOINT_SYMBOL(zone_lock_acquire_returned);
+EXPORT_TRACEPOINT_SYMBOL(zone_lock_released);
+
+#ifdef CONFIG_TRACING
+
+void __zone_lock_do_trace_start_locking(struct zone *zone)
+{
+ trace_zone_lock_start_locking(zone);
+}
+EXPORT_SYMBOL(__zone_lock_do_trace_start_locking);
+
+void __zone_lock_do_trace_acquire_returned(struct zone *zone, bool success)
+{
+ trace_zone_lock_acquire_returned(zone, success);
+}
+EXPORT_SYMBOL(__zone_lock_do_trace_acquire_returned);
+
+void __zone_lock_do_trace_released(struct zone *zone)
+{
+ trace_zone_lock_released(zone);
+}
+EXPORT_SYMBOL(__zone_lock_do_trace_released);
+
+#endif /* CONFIG_TRACING */
--
2.47.3
* Re: [PATCH v2 3/4] mm: convert compaction to zone lock wrappers
2026-02-25 14:43 ` [PATCH v2 3/4] mm: convert compaction to zone lock wrappers Dmitry Ilvokhin
@ 2026-02-25 20:12 ` Andrew Morton
0 siblings, 0 replies; 7+ messages in thread
From: Andrew Morton @ 2026-02-25 20:12 UTC (permalink / raw)
To: Dmitry Ilvokhin
Cc: David Hildenbrand, Lorenzo Stoakes, Liam R. Howlett,
Vlastimil Babka, Mike Rapoport, Suren Baghdasaryan, Michal Hocko,
Steven Rostedt, Masami Hiramatsu, Mathieu Desnoyers,
Brendan Jackman, Johannes Weiner, Zi Yan, Oscar Salvador,
Qi Zheng, Shakeel Butt, Axel Rasmussen, Yuanchu Xie, Wei Xu,
linux-kernel, linux-mm, linux-trace-kernel, linux-cxl,
kernel-team, Benjamin Cheatham
On Wed, 25 Feb 2026 14:43:05 +0000 Dmitry Ilvokhin <d@ilvokhin.com> wrote:
> Compaction uses compact_lock_irqsave(), which currently operates
> on a raw spinlock_t pointer so that it can be used for both
> zone->lock and lru_lock. Since zone lock operations are now wrapped,
> compact_lock_irqsave() can no longer operate directly on a spinlock_t
> when the lock belongs to a zone.
>
> Introduce struct compact_lock to abstract the underlying lock type. The
> structure carries a lock type enum and a union holding either a zone
> pointer or a raw spinlock_t pointer, and dispatches to the appropriate
> lock/unlock helper.
It's regrettable that this adds overhead - increased .text, increased
instructions.
Thing is, compact_lock_irqsave() has only two callsites. One knows
that it's dealing with the zone lock, the other knows that it's dealing
with the lruvec lock.
Would it not be simpler and more efficient to copy/paste/edit two
versions of compact_lock_irqsave()? A compact_zone_lock_irqsave() and a
compact_lruvec_lock_irqsave()?
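A rough userspace sketch of what such a split might look like follows. It
is only an illustration of the idea, not the kernel code: toy_spinlock and
its helpers, the cut-down struct zone, struct lruvec and struct
compact_control, and the shortened function names compact_zone_lock() and
compact_lruvec_lock() are all assumptions made so the example builds on its
own. IRQ flag handling is left out entirely.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Userspace stand-in for spinlock_t; purely illustrative. */
typedef struct { atomic_flag held; } toy_spinlock;

static bool toy_trylock(toy_spinlock *l)
{
	/* test_and_set returns the old value: false means we took the lock */
	return !atomic_flag_test_and_set_explicit(&l->held, memory_order_acquire);
}

static void toy_lock(toy_spinlock *l)
{
	while (!toy_trylock(l))
		;	/* spin until the holder releases */
}

static void toy_unlock(toy_spinlock *l)
{
	atomic_flag_clear_explicit(&l->held, memory_order_release);
}

/* Cut-down stand-ins for the kernel structures. */
struct zone { toy_spinlock lock; };
struct lruvec { toy_spinlock lru_lock; };

enum migrate_mode { MIGRATE_ASYNC, MIGRATE_SYNC };
struct compact_control { enum migrate_mode mode; bool contended; };

/* Zone flavour: the kernel version would call the zone lock wrappers. */
static bool compact_zone_lock(struct zone *zone, struct compact_control *cc)
{
	/* Track if the lock is contended in async mode */
	if (cc->mode == MIGRATE_ASYNC && !cc->contended) {
		if (toy_trylock(&zone->lock))
			return true;
		cc->contended = true;
	}
	toy_lock(&zone->lock);
	return true;
}

/* lruvec flavour: same logic, but the lock type is known statically. */
static bool compact_lruvec_lock(struct lruvec *lruvec,
				struct compact_control *cc)
{
	if (cc->mode == MIGRATE_ASYNC && !cc->contended) {
		if (toy_trylock(&lruvec->lru_lock))
			return true;
		cc->contended = true;
	}
	toy_lock(&lruvec->lru_lock);
	return true;
}
```

Each caller picks its variant at compile time, so no type enum, union or
runtime branch on the lock type is needed. The cost is a few duplicated
lines of contention tracking.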
* Re: [PATCH v2 1/4] mm: introduce zone lock wrappers
2026-02-25 14:43 ` [PATCH v2 1/4] mm: introduce zone lock wrappers Dmitry Ilvokhin
@ 2026-02-25 20:14 ` Andrew Morton
0 siblings, 0 replies; 7+ messages in thread
From: Andrew Morton @ 2026-02-25 20:14 UTC (permalink / raw)
To: Dmitry Ilvokhin
Cc: David Hildenbrand, Lorenzo Stoakes, Liam R. Howlett,
Vlastimil Babka, Mike Rapoport, Suren Baghdasaryan, Michal Hocko,
Steven Rostedt, Masami Hiramatsu, Mathieu Desnoyers,
Brendan Jackman, Johannes Weiner, Zi Yan, Oscar Salvador,
Qi Zheng, Shakeel Butt, Axel Rasmussen, Yuanchu Xie, Wei Xu,
linux-kernel, linux-mm, linux-trace-kernel, linux-cxl,
kernel-team, Benjamin Cheatham
On Wed, 25 Feb 2026 14:43:03 +0000 Dmitry Ilvokhin <d@ilvokhin.com> wrote:
> Add thin wrappers around zone lock acquire/release operations. This
> prepares the code for future tracepoint instrumentation without
> modifying individual call sites.
>
> Centralizing zone lock operations behind wrappers allows future
> instrumentation or debugging hooks to be added without touching
> all users.
>
> No functional change intended. The wrappers are introduced in
> preparation for subsequent patches and are not yet used.
>
> ...
>
> +static inline void zone_lock_init(struct zone *zone)
> +{
> + spin_lock_init(&zone->lock);
> +}
Please consider renaming zone.lock to something else (_lock would be
conventional) so that any present, future, or out-of-tree unconverted
code won't compile.
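A minimal userspace sketch of the effect of such a rename, assuming a
cut-down struct zone with an atomic_flag standing in for spinlock_t. The
zone_trylock() and zone_unlock() helpers here are simplified, hypothetical
stand-ins for the real wrappers, kept small so the example is
self-contained.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Cut-down stand-in for the kernel's struct zone. */
struct zone {
	atomic_flag _lock;	/* was "lock"; only the wrappers touch it */
};

static inline void zone_lock_init(struct zone *zone)
{
	atomic_flag_clear(&zone->_lock);
}

static inline bool zone_trylock(struct zone *zone)
{
	/* test_and_set returns the old value: false means we took the lock */
	return !atomic_flag_test_and_set_explicit(&zone->_lock,
						  memory_order_acquire);
}

static inline void zone_unlock(struct zone *zone)
{
	atomic_flag_clear_explicit(&zone->_lock, memory_order_release);
}

/*
 * Unconverted code that still names the old field, for example
 *
 *	atomic_flag_clear(&zone->lock);
 *
 * no longer builds, because struct zone has no member called "lock".
 * Any direct user that was missed shows up at compile time instead of
 * silently bypassing the wrappers and their tracepoints.
 */
```

Converted callers are unaffected, since they only go through the wrappers.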