From: Mauricio Faria de Oliveira <mfo@igalia.com>
To: Andrew Morton <akpm@linux-foundation.org>,
David Hildenbrand <david@kernel.org>
Cc: Lorenzo Stoakes <lorenzo.stoakes@oracle.com>,
Michal Hocko <mhocko@suse.com>, Vlastimil Babka <vbabka@suse.cz>,
Oscar Salvador <osalvador@suse.de>,
linux-mm@kvack.org, linux-kernel@vger.kernel.org,
kernel-dev@igalia.com
Subject: [PATCH RFC 0/9] mm/page_owner: add support for pages in swap space
Date: Fri, 5 Dec 2025 20:17:12 -0300
Message-ID: <20251205231721.104505-1-mfo@igalia.com>
This series extends page_owner to also track the allocation stack trace
of pages in swap space (not just pages in memory), and to preserve it
across swap-out and swap-in (it is no longer lost/overridden there).
It stores a copy of the allocation stack trace (plus some other fields)
at swap-out and loads it back at swap-in, using an xarray indexed by
swp_entry_t and the same swap hooks that are used for CONFIG_ARM64_MTE
(memory tagging extension).
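To illustrate the idea only (this is not the kernel implementation), here
is a toy shell/awk model of "save the owner at swap-out, restore it at
swap-in, keyed by swap entry". The function name, the event names and the
KEEP variable are hypothetical and exist only for this sketch; awk arrays
stand in for page_owner and the xarray.

```shell
# Toy model only (not kernel code): track which command "owns" a page.
# Events, one per line on stdin:
#   alloc   <page> <comm>            page allocated by <comm>
#   swapout <page> <entry>           page written to swap entry <entry>
#   swapin  <entry> <page> <comm>    entry read back into a new page by <comm>
# KEEP=1 saves/restores the owner across swap (the idea in this series);
# KEEP=0 models the current behaviour (the owner becomes whoever swaps in).
simulate_owner() {
    awk -v keep="${KEEP:-1}" '
        $1 == "alloc"   { owner[$2] = $3 }
        $1 == "swapout" {
            if (keep + 0) saved[$3] = owner[$2]  # copy owner, keyed by swap entry
            delete owner[$2]                     # the page itself is freed
        }
        $1 == "swapin"  {
            if ($2 in saved) {                   # saved copy found: restore it
                owner[$3] = saved[$2]
                delete saved[$2]
            } else {
                owner[$3] = $4                   # no saved copy: new allocation wins
            }
        }
        END { for (p in owner) print p, owner[p] }
    '
}
```

For example, feeding it 'alloc p1 dd', 'swapout p1 e1', 'swapin e1 p2 cat'
prints "p2 dd" with KEEP=1 and "p2 cat" with KEEP=0, which mirrors the
Before/After results in the Testing section.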
The refcounts of the allocation stack traces (i.e., number of base pages)
for the initial allocation and the allocation at swap-in are fixed-up at
swap-out and swap-in, to maintain the refcount on the initial allocation
(for 'nr_base_pages' in '/sys/kernel/debug/page_owner_stacks/show_*').
The series is intentionally split into many small patches to ease review
(so some intermediate patches fail to build with '-Werror=unused-function').
- Patches 1-4 add infrastructure.
- Patch 5 adds the key functionality.
- Patches 6-7 add reporting.
- Patches 8-9 add hooks for this.
It has had basic testing (see below), which is hopefully sufficient for an
RFC to get comments on the idea, approach, design, problems and suggestions.
For future revisions: add pr_debug() and more error messages, and write
Documentation; maybe rename functions with 'spo' prefix (swap page_owner).
(I still need to look into a FIXME in the refcount fix-up, which causes an
occasional WARN+OOPS, mostly on shutdown; it only affects 'nr_base_pages',
not the key functionality, and the fix-up can simply be commented out.)
Thanks,
Mauricio
# Testing
## Configuration
- next-20251205
- x86_64_defconfig
- CONFIG_PAGE_OWNER=n / CONFIG_SWAP_PAGE_OWNER=n (build-only)
- CONFIG_PAGE_OWNER=y / CONFIG_SWAP_PAGE_OWNER=n (build-only)
- CONFIG_PAGE_OWNER=y / CONFIG_SWAP_PAGE_OWNER=y (build/tests)
## Format
- Existing file (page_owner)
# cat /sys/kernel/debug/page_owner
...
Page allocated via order 0, mask 0x140cca(GFP_HIGHUSER_MOVABLE|__GFP_COMP), pid 231, tgid 231 (dd), ts 13255570102 ns
PFN 0xa35b type Movable Block 81 type Movable Flags 0x10000000002003c(referenced|uptodate|dirty|lru|swapbacked|node=0|zone=1)
get_page_from_freelist+0x13b9/0x1530
...
entry_SYSCALL_64_after_hwframe+0x77/0x7f
Charged to memcg session-1.scope
...
- New file (swap_page_owner)
# cat /sys/kernel/debug/swap_page_owner # new
...
Page allocated via pid 231, tgid 231 (dd), ts 13255313898 ns
SWP entry 0x2e
get_page_from_freelist+0x13b9/0x1530
...
entry_SYSCALL_64_after_hwframe+0x77/0x7f
...
- Existing file (page_owner), now with a trailing '(swapped)' marker on some lines
# cat /sys/kernel/debug/page_owner
...
Page allocated via order 0, mask 0x140cca(GFP_HIGHUSER_MOVABLE|__GFP_COMP), pid 231, tgid 231 (dd), ts 13255427451 ns (swapped)
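
As a small usage sketch for the marker above, the records carrying it can
be counted with awk. The helper name count_swapped is hypothetical, and the
pattern assumes '(swapped)' is the last token on the 'Page allocated' line,
as in the sample.

```shell
# count_swapped FILE: number of "Page allocated" records ending in "(swapped)".
# FILE would normally be /sys/kernel/debug/page_owner (or a saved copy of it).
count_swapped() {
    awk '/^Page allocated .* \(swapped\)$/ { n++ } END { print n + 0 }' "$1"
}
```

Usage: count_swapped /sys/kernel/debug/page_owner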
## Environment
# uname -r
6.18.0-next-20251205-00009-g8042ea636379
# mount | grep /tmpfs
none on /tmpfs type tmpfs (rw,relatime,size=1048576k)
none on /tmpfs-noswap type tmpfs (rw,relatime,size=1048576k,noswap)
## Before (page_owner without swap support)
# grep -o 'page_owner=.*' /proc/cmdline
page_owner=on
- Mem/Swap usage per page_owner (swap is not reported)
# free -m
total used free shared buff/cache available
Mem: 458 68 367 0 33 389
Swap: 1023 0 1023
# echo $((458-367)) # total - free
91
# cat /sys/kernel/debug/page_owner | awk -F '[ ,]' '/^Page allocated/ { PAGES+=2^$5 } END { print PAGES*4096/1024^2 " MiB" }'
91.2578 MiB
# cat /sys/kernel/debug/swap_page_owner | awk -F '[ ,]' '/^Page allocated/ { PAGES+=1 } END { print PAGES*4096/1024^2 " MiB" }'
cat: /sys/kernel/debug/swap_page_owner: Invalid argument
0 MiB
- Use 256 MiB of pages allocated by 'dd' (compare with pages allocated by 'cat')
# dd if=/dev/zero of=/tmpfs/file status=none bs=1M count=256
# COMM=dd; cat /sys/kernel/debug/page_owner | awk -F '[ ,]' '/^Page allocated .* \('${COMM}'\)/ { PAGES+=2^$5 } END { print PAGES*4096/1024^2 " MiB" }'
256.559 MiB
# COMM=cat; cat /sys/kernel/debug/page_owner | awk -F '[ ,]' '/^Page allocated .* \('${COMM}'\)/ { PAGES+=2^$5 } END { print PAGES*4096/1024^2 " MiB" }'
0.464844 MiB
- Move that 256 MiB of pages from memory to swap (use 'DD' to differentiate from 'dd's memory)
# ln -sf $(which dd) ./DD
# ./DD if=/dev/zero of=/tmpfs-noswap/file status=none bs=1M count=384
# free -m
total used free shared buff/cache available
Mem: 458 445 6 388 405 12
Swap: 1023 264 759
- Mem/Swap usage per page_owner (swap is not reported)
# echo $((458-6)) # total - free
452
# cat /sys/kernel/debug/page_owner | awk -F '[ ,]' '/^Page allocated/ { PAGES+=2^$5 } END { print PAGES*4096/1024^2 " MiB" }'
450.445 MiB
# cat /sys/kernel/debug/swap_page_owner | awk -F '[ ,]' '/^Page allocated/ { PAGES+=1 } END { print PAGES*4096/1024^2 " MiB" }'
cat: /sys/kernel/debug/swap_page_owner: Invalid argument
0 MiB
- Move that 256 MiB of pages back from swap to memory
# rm /tmpfs-noswap/file
# cat /tmpfs/file >/dev/null
- The 256 MiB of pages is _now_ allocated by 'cat' (instead of 'dd')
# COMM=dd; cat /sys/kernel/debug/page_owner | awk -F '[ ,]' '/^Page allocated .* \('${COMM}'\)/ { PAGES+=2^$5 } END { print PAGES*4096/1024^2 " MiB" }'
1.66016 MiB
# COMM=cat; cat /sys/kernel/debug/page_owner | awk -F '[ ,]' '/^Page allocated .* \('${COMM}'\)/ { PAGES+=2^$5 } END { print PAGES*4096/1024^2 " MiB" }'
258.309 MiB
## After (page_owner with swap support)
# grep -o 'page_owner=.*' /proc/cmdline
page_owner=on,swap
- Mem/Swap usage per page_owner (swap is reported now)
# free -m
total used free shared buff/cache available
Mem: 458 68 360 0 40 389
Swap: 1023 0 1023
# echo $((458-360)) # total - free
98
# cat /sys/kernel/debug/page_owner | awk -F '[ ,]' '/^Page allocated/ { PAGES+=2^$5 } END { print PAGES*4096/1024^2 " MiB" }'
97 MiB
# cat /sys/kernel/debug/swap_page_owner | awk -F '[ ,]' '/^Page allocated/ { PAGES+=1 } END { print PAGES*4096/1024^2 " MiB" }'
0 MiB
- Use 256 MiB of pages allocated by 'dd' (compare with pages allocated by 'cat')
# dd if=/dev/zero of=/tmpfs/file status=none bs=1M count=256
# COMM=dd; cat /sys/kernel/debug/page_owner | awk -F '[ ,]' '/^Page allocated .* \('${COMM}'\)/ { PAGES+=2^$5 } END { print PAGES*4096/1024^2 " MiB" }'
256.504 MiB
# COMM=cat; cat /sys/kernel/debug/page_owner | awk -F '[ ,]' '/^Page allocated .* \('${COMM}'\)/ { PAGES+=2^$5 } END { print PAGES*4096/1024^2 " MiB" }'
0.175781 MiB
- Move that 256 MiB of pages from memory to swap (use 'DD' to differentiate from 'dd's memory)
# ln -sf $(which dd) ./DD
# ./DD if=/dev/zero of=/tmpfs-noswap/file status=none bs=1M count=384
- Mem/Swap usage per page_owner (swap is reported now)
# free -m
total used free shared buff/cache available
Mem: 458 435 6 384 412 22
Swap: 1023 278 745
# echo $((458-6)) # total - free
452
# cat /sys/kernel/debug/page_owner | awk -F '[ ,]' '/^Page allocated/ { PAGES+=2^$5 } END { print PAGES*4096/1024^2 " MiB" }'
449.258 MiB
# cat /sys/kernel/debug/swap_page_owner | awk -F '[ ,]' '/^Page allocated/ { PAGES+=1 } END { print PAGES*4096/1024^2 " MiB" }'
278.793 MiB
- Move that 256 MiB of pages back from swap to memory
# rm /tmpfs-noswap/file
# cat /tmpfs/file >/dev/null
- The 256 MiB of pages is _still_ allocated by 'dd' (instead of 'cat' as above)
# COMM=dd; cat /sys/kernel/debug/page_owner | awk -F '[ ,]' '/^Page allocated .* \('${COMM}'\)/ { PAGES+=2^$5 } END { print PAGES*4096/1024^2 " MiB" }'
256.445 MiB
# COMM=cat; cat /sys/kernel/debug/page_owner | awk -F '[ ,]' '/^Page allocated .* \('${COMM}'\)/ { PAGES+=2^$5 } END { print PAGES*4096/1024^2 " MiB" }'
7.80859 MiB
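
For convenience, the awk one-liners used above can be folded into one
helper. This is only a sketch: page_owner_mib is a hypothetical name, and
like the commands above it assumes 4 KiB base pages. Records with an
'order N' field count as 2^N pages (page_owner format); records without it
count as one page (swap_page_owner format).

```shell
# page_owner_mib FILE [COMM]
# Sum the pages in "Page allocated" records of FILE and print them in MiB.
# If COMM is given, only records whose command name is "(COMM)" are counted.
page_owner_mib() {
    awk -F '[ ,]' -v comm="${2:-}" '
        /^Page allocated/ {
            # Skip records from other commands when a filter was requested.
            if (comm != "" && index($0, " (" comm ")") == 0) next
            # page_owner lines carry "order N"; swap_page_owner lines do not.
            if ($4 == "order") pages += 2^$5
            else               pages += 1
        }
        END { print pages * 4096 / 1024^2 " MiB" }
    ' "$1"
}
```

Usage: page_owner_mib /sys/kernel/debug/page_owner dd
       page_owner_mib /sys/kernel/debug/swap_page_owner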
Mauricio Faria de Oliveira (9):
mm: add config option SWAP_PAGE_OWNER
mm/page_owner: add parameter option 'page_owner=on,swap'
mm/page_owner: add 'struct swap_page_owner' and helpers
mm/page_owner: add 'xarray swap_page_owners' and helpers
mm/page_owner: add swap hooks
mm/page_owner: report '(swapped)' pages in debugfs file 'page_owner'
mm/page_owner: add debugfs file 'swap_page_owner'
mm: call arch-specific swap hooks from generic swap hooks
mm: call page_owner swap hooks from generic swap hooks
include/linux/page_owner.h | 50 ++++++
include/linux/pgtable.h | 30 ++++
mm/Kconfig.debug | 9 +
mm/memory.c | 2 +-
mm/page_io.c | 2 +-
mm/page_owner.c | 338 ++++++++++++++++++++++++++++++++++++-
mm/shmem.c | 2 +-
mm/swapfile.c | 6 +-
8 files changed, 427 insertions(+), 12 deletions(-)
--
2.51.0