* [PATCH RFC 0/9] mm/page_owner: add support for pages in swap space
@ 2025-12-05 23:17 Mauricio Faria de Oliveira
From: Mauricio Faria de Oliveira @ 2025-12-05 23:17 UTC
  To: Andrew Morton, David Hildenbrand
  Cc: Lorenzo Stoakes, Michal Hocko, Vlastimil Babka, Oscar Salvador,
	linux-mm, linux-kernel, kernel-dev

This series extends page_owner to also track the allocation stack trace of
pages in swap space (not just pages in memory), and to preserve it across
swap-out and swap-in (it is no longer lost/overridden at swap-out/swap-in).

It stores a copy of the allocation stack trace (plus some other fields)
at swap-out and loads it back at swap-in, using an xarray indexed by
swp_entry_t and the same swap hooks that are used for CONFIG_ARM64_MTE
(Memory Tagging Extension).
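
The save/restore flow described above can be modeled in user space. This is
only a minimal sketch, not the kernel code: a Python dict stands in for the
xarray indexed by swp_entry_t, and the names 'Owner', 'swap_out' and
'swap_in' are hypothetical. The 'dd' values are taken from the sample output
in the Testing section; the 'cat' pid and timestamp are made up for
illustration. Dropping the saved copy at swap-in is an assumption of the
sketch; the cover letter does not say when the patches release it.

```python
from dataclasses import dataclass
from typing import Dict, Tuple


@dataclass(frozen=True)
class Owner:
    """Simplified allocation info kept per page (hypothetical layout)."""
    pid: int
    comm: str
    ts_ns: int
    stack: Tuple[str, ...]


# Stand-in for the xarray indexed by swp_entry_t.
swap_page_owners: Dict[int, Owner] = {}


def swap_out(entry: int, owner: Owner) -> None:
    """Store a copy of the page's allocation info under its swap entry."""
    swap_page_owners[entry] = owner


def swap_in(entry: int, new_owner: Owner) -> Owner:
    """Return the saved info if there is one, else the swap-in allocation's."""
    # Assumption: the saved copy is dropped once it has been loaded back.
    saved = swap_page_owners.pop(entry, None)
    return saved if saved is not None else new_owner


dd = Owner(231, "dd", 13255313898, ("get_page_from_freelist",))
cat = Owner(240, "cat", 20000000000, ("get_page_from_freelist",))  # made up

swap_out(0x2E, dd)            # page written by 'dd' goes to swap
restored = swap_in(0x2E, cat) # 'cat' faults it back in
print(restored.comm)          # the page is still attributed to 'dd'
```

Without the saved copy (the 'Before' case in the tests), the same swap-in
would return the 'cat' allocation instead.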

The refcounts of the allocation stack traces (i.e., the number of base pages)
for the initial allocation and for the allocation at swap-in are fixed up at
swap-out and swap-in, so that the refcount stays on the initial allocation
(for 'nr_base_pages' in '/sys/kernel/debug/page_owner_stacks/show_*').
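
As a rough illustration of the fix-up, the sketch below models
'nr_base_pages' as a per-stack counter. It shows one plausible accounting
under stated assumptions, not what the patches actually do: the function
names and the stack labels 'dd_write' and 'cat_read' are hypothetical, and
order-0 pages are assumed.

```python
from collections import Counter

# nr_base_pages per allocation stack trace (stacks are plain labels here).
nr_base_pages = Counter()


def alloc(stack):
    """A page allocation accounts one base page to its stack."""
    nr_base_pages[stack] += 1


def free(stack):
    """A page free drops one base page from its stack."""
    nr_base_pages[stack] -= 1


def swap_out(initial_stack):
    # Assumed fix-up: take an extra reference on the initial stack, so that
    # freeing the in-memory page does not drop the stack's count.
    nr_base_pages[initial_stack] += 1
    free(initial_stack)


def swap_in(swapin_stack):
    # The swap-in allocation is first accounted to the swap-in stack ...
    alloc(swapin_stack)
    # ... assumed fix-up: give that reference back, because the page is
    # still accounted to the initial stack.
    nr_base_pages[swapin_stack] -= 1


alloc("dd_write")     # initial allocation
swap_out("dd_write")  # page moves to swap
swap_in("cat_read")   # page is read back by another task
print(dict(nr_base_pages))
```

In this model the count stays on the initial stack across the whole cycle,
and the swap-in stack ends up with no base pages.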

The series is split into more, smaller patches on purpose, just to ease review
(some patches fail to build with '-Werror=unused-function').
- Patches 1-4 add infrastructure.
- Patch 5 adds the key functionality.
- Patches 6-7 add reporting.
- Patches 8-9 call the arch-specific and page_owner swap hooks from generic
  swap hooks.

It has only received basic testing (below); hopefully that is sufficient for
an RFC to get comments on the idea, approach, design, and problems, as well
as suggestions.

For future revisions: add pr_debug() and more error messages, and write
Documentation; maybe rename functions with 'spo' prefix (swap page_owner).
(I'll look into a FIXME in the refcount fix-up, which causes an occasional
WARN+OOPS, mostly on shutdown; it only affects 'nr_base_pages', has no impact
on the key functionality, and can simply be commented out.)

Thanks,
Mauricio

# Testing

## Configuration

- next-20251205
- x86_64_defconfig
- CONFIG_PAGE_OWNER=n / CONFIG_SWAP_PAGE_OWNER=n (build-only)
- CONFIG_PAGE_OWNER=y / CONFIG_SWAP_PAGE_OWNER=n (build-only)
- CONFIG_PAGE_OWNER=y / CONFIG_SWAP_PAGE_OWNER=y (build/tests)

## Format

- Existing file (page_owner)

	# cat /sys/kernel/debug/page_owner
	...
	Page allocated via order 0, mask 0x140cca(GFP_HIGHUSER_MOVABLE|__GFP_COMP), pid 231, tgid 231 (dd), ts 13255570102 ns
	PFN 0xa35b type Movable Block 81 type Movable Flags 0x10000000002003c(referenced|uptodate|dirty|lru|swapbacked|node=0|zone=1)
	 get_page_from_freelist+0x13b9/0x1530
	 ...
	 entry_SYSCALL_64_after_hwframe+0x77/0x7f
	Charged to memcg session-1.scope
	
	...
	
- New file (swap_page_owner)

	# cat /sys/kernel/debug/swap_page_owner # new
	...
	Page allocated via pid 231, tgid 231 (dd), ts 13255313898 ns
	SWP entry 0x2e
	 get_page_from_freelist+0x13b9/0x1530
	 ...
	 entry_SYSCALL_64_after_hwframe+0x77/0x7f
	
	...
	
- Existing file (page_owner), with '(swapped)' appended at the end of the line for swapped pages

	# cat /sys/kernel/debug/page_owner
	...
	Page allocated via order 0, mask 0x140cca(GFP_HIGHUSER_MOVABLE|__GFP_COMP), pid 231, tgid 231 (dd), ts 13255427451 ns (swapped)
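
The header lines shown above are what the awk one-liners in the tests below
operate on: with '-F "[ ,]"', field 5 is the order, so each page_owner record
counts as 2^order pages, while swap_page_owner records have no order and
count as one page each. A minimal Python sketch of the same accounting,
assuming exactly the line formats shown in this section (the names 'HEADER'
and 'mib_by_comm' are hypothetical, and 4 KiB pages are assumed as in the
awk commands):

```python
import re
from collections import defaultdict

# Matches the first line of a record in page_owner or swap_page_owner.
HEADER = re.compile(
    r"^Page allocated via (?:order (?P<order>\d+), mask \S+, )?"
    r"pid (?P<pid>\d+), tgid (?P<tgid>\d+) \((?P<comm>.*)\), "
    r"ts (?P<ts>\d+) ns(?P<swapped> \(swapped\))?"
)


def mib_by_comm(text, page_size=4096):
    """Sum the allocated memory per command name, in MiB."""
    totals = defaultdict(int)
    for line in text.splitlines():
        m = HEADER.match(line.strip())
        if not m:
            continue  # stack frames, PFN/SWP lines, blank lines
        order = int(m.group("order") or 0)  # swap records have no order
        totals[m.group("comm")] += (1 << order) * page_size
    return {comm: nbytes / 1024**2 for comm, nbytes in totals.items()}


sample = """\
Page allocated via order 0, mask 0x140cca(GFP_HIGHUSER_MOVABLE|__GFP_COMP), pid 231, tgid 231 (dd), ts 13255570102 ns
 get_page_from_freelist+0x13b9/0x1530
Page allocated via pid 231, tgid 231 (dd), ts 13255313898 ns
SWP entry 0x2e
Page allocated via order 0, mask 0x140cca(GFP_HIGHUSER_MOVABLE|__GFP_COMP), pid 231, tgid 231 (dd), ts 13255427451 ns (swapped)
"""
print(mib_by_comm(sample))
```

The three sample headers are the ones from this section, so the sketch
reports three 4 KiB pages for 'dd'.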

## Environment

	# uname -r
	6.18.0-next-20251205-00009-g8042ea636379

	# mount | grep /tmpfs
	none on /tmpfs type tmpfs (rw,relatime,size=1048576k)
	none on /tmpfs-noswap type tmpfs (rw,relatime,size=1048576k,noswap)

## Before (page_owner without swap support)

	# grep -o 'page_owner=.*' /proc/cmdline
	page_owner=on

  - Mem/Swap usage per page_owner (swap is not reported)
 
	# free -m
		       total        used        free      shared  buff/cache   available
	Mem:             458          68         367           0          33         389
	Swap:           1023           0        1023

	# echo $((458-367)) # total - free
	91

	# cat /sys/kernel/debug/page_owner | awk -F '[ ,]' '/^Page allocated/ { PAGES+=2^$5 } END { print PAGES*4096/1024^2 " MiB" }'
	91.2578 MiB

	# cat /sys/kernel/debug/swap_page_owner | awk -F '[ ,]' '/^Page allocated/ { PAGES+=1 } END { print PAGES*4096/1024^2 " MiB" }'
	cat: /sys/kernel/debug/swap_page_owner: Invalid argument
	0 MiB

  - Use 256 MiB of pages allocated by 'dd' (compare with pages allocated by 'cat')

	# dd if=/dev/zero of=/tmpfs/file status=none bs=1M count=256

	# COMM=dd; cat /sys/kernel/debug/page_owner | awk -F '[ ,]' '/^Page allocated .* \('${COMM}'\)/ { PAGES+=2^$5 } END { print PAGES*4096/1024^2 " MiB" }'
	256.559 MiB

	# COMM=cat; cat /sys/kernel/debug/page_owner | awk -F '[ ,]' '/^Page allocated .* \('${COMM}'\)/ { PAGES+=2^$5 } END { print PAGES*4096/1024^2 " MiB" }'
	0.464844 MiB

  - Move that 256 MiB of pages from memory to swap (use 'DD' to differentiate from 'dd's memory)

	# ln -sf $(which dd) ./DD
	# ./DD if=/dev/zero of=/tmpfs-noswap/file status=none bs=1M count=384

	# free -m
		       total        used        free      shared  buff/cache   available
	Mem:             458         445           6         388         405          12
	Swap:           1023         264         759

  - Mem/Swap usage per page_owner (swap is not reported)

	# echo $((458-6)) # total - free
	452

	# cat /sys/kernel/debug/page_owner | awk -F '[ ,]' '/^Page allocated/ { PAGES+=2^$5 } END { print PAGES*4096/1024^2 " MiB" }'
	450.445 MiB

	# cat /sys/kernel/debug/swap_page_owner | awk -F '[ ,]' '/^Page allocated/ { PAGES+=1 } END { print PAGES*4096/1024^2 " MiB" }'
	cat: /sys/kernel/debug/swap_page_owner: Invalid argument
	0 MiB

  - Move that 256 MiB of pages back from swap to memory

	# rm /tmpfs-noswap/file
	# cat /tmpfs/file >/dev/null

  - The 256 MiB of pages is _now_ allocated by 'cat' (instead of 'dd')

	# COMM=dd; cat /sys/kernel/debug/page_owner | awk -F '[ ,]' '/^Page allocated .* \('${COMM}'\)/ { PAGES+=2^$5 } END { print PAGES*4096/1024^2 " MiB" }'
	1.66016 MiB

	# COMM=cat; cat /sys/kernel/debug/page_owner | awk -F '[ ,]' '/^Page allocated .* \('${COMM}'\)/ { PAGES+=2^$5 } END { print PAGES*4096/1024^2 " MiB" }'
	258.309 MiB

## After (page_owner with swap support)

	# grep -o 'page_owner=.*' /proc/cmdline
	page_owner=on,swap

  - Mem/Swap usage per page_owner (swap is reported now)

	# free -m
		       total        used        free      shared  buff/cache   available
	Mem:             458          68         360           0          40         389
	Swap:           1023           0        1023

	# echo $((458-360)) # total - free
	98

	# cat /sys/kernel/debug/page_owner | awk -F '[ ,]' '/^Page allocated/ { PAGES+=2^$5 } END { print PAGES*4096/1024^2 " MiB" }'
	97 MiB

	# cat /sys/kernel/debug/swap_page_owner | awk -F '[ ,]' '/^Page allocated/ { PAGES+=1 } END { print PAGES*4096/1024^2 " MiB" }'
	0 MiB

  - Use 256 MiB of pages allocated by 'dd' (compare with pages allocated by 'cat')
  
	# dd if=/dev/zero of=/tmpfs/file status=none bs=1M count=256

	# COMM=dd; cat /sys/kernel/debug/page_owner | awk -F '[ ,]' '/^Page allocated .* \('${COMM}'\)/ { PAGES+=2^$5 } END { print PAGES*4096/1024^2 " MiB" }'
	256.504 MiB

	# COMM=cat; cat /sys/kernel/debug/page_owner | awk -F '[ ,]' '/^Page allocated .* \('${COMM}'\)/ { PAGES+=2^$5 } END { print PAGES*4096/1024^2 " MiB" }'
	0.175781 MiB

  - Move that 256 MiB of pages from memory to swap (use 'DD' to differentiate from 'dd's memory)
  
	# ln -sf $(which dd) ./DD
	# ./DD if=/dev/zero of=/tmpfs-noswap/file status=none bs=1M count=384

  - Mem/Swap usage per page_owner (swap is reported now)
  
	# free -m
		       total        used        free      shared  buff/cache   available
	Mem:             458         435           6         384         412          22
	Swap:           1023         278         745

	# echo $((458-6)) # total - free
	452

	# cat /sys/kernel/debug/page_owner | awk -F '[ ,]' '/^Page allocated/ { PAGES+=2^$5 } END { print PAGES*4096/1024^2 " MiB" }'
	449.258 MiB

	# cat /sys/kernel/debug/swap_page_owner | awk -F '[ ,]' '/^Page allocated/ { PAGES+=1 } END { print PAGES*4096/1024^2 " MiB" }'
	278.793 MiB

  - Move that 256 MiB of pages back from swap to memory
	  
	# rm /tmpfs-noswap/file
	# cat /tmpfs/file >/dev/null

  - The 256 MiB of pages is _still_ allocated by 'dd' (instead of 'cat' as above)
  
	# COMM=dd; cat /sys/kernel/debug/page_owner | awk -F '[ ,]' '/^Page allocated .* \('${COMM}'\)/ { PAGES+=2^$5 } END { print PAGES*4096/1024^2 " MiB" }'
	256.445 MiB

	# COMM=cat; cat /sys/kernel/debug/page_owner | awk -F '[ ,]' '/^Page allocated .* \('${COMM}'\)/ { PAGES+=2^$5 } END { print PAGES*4096/1024^2 " MiB" }'
	7.80859 MiB

Mauricio Faria de Oliveira (9):
  mm: add config option SWAP_PAGE_OWNER
  mm/page_owner: add parameter option 'page_owner=on,swap'
  mm/page_owner: add 'struct swap_page_owner' and helpers
  mm/page_owner: add 'xarray swap_page_owners' and helpers
  mm/page_owner: add swap hooks
  mm/page_owner: report '(swapped)' pages in debugfs file 'page_owner'
  mm/page_owner: add debugfs file 'swap_page_owner'
  mm: call arch-specific swap hooks from generic swap hooks
  mm: call page_owner swap hooks from generic swap hooks

 include/linux/page_owner.h |  50 ++++++
 include/linux/pgtable.h    |  30 ++++
 mm/Kconfig.debug           |   9 +
 mm/memory.c                |   2 +-
 mm/page_io.c               |   2 +-
 mm/page_owner.c            | 338 ++++++++++++++++++++++++++++++++++++-
 mm/shmem.c                 |   2 +-
 mm/swapfile.c              |   6 +-
 8 files changed, 427 insertions(+), 12 deletions(-)

-- 
2.51.0


