On 2025-09-08 22:34:04 +0800, Kairui Song wrote:
> On Sun, Sep 7, 2025 at 8:59 PM Klara Modin wrote:
> >
> > On 2025-09-06 03:13:53 +0800, Kairui Song wrote:
> > > From: Kairui Song
> > >
> > > Introduce basic swap table infrastructures, which are now just a
> > > fixed-sized flat array inside each swap cluster, with access wrappers.
> > >
> > > Each cluster contains a swap table of 512 entries. Each table entry is
> > > an opaque atomic long. It could be in 3 types: a shadow type (XA_VALUE),
> > > a folio type (pointer), or NULL.
> > >
> > > In this first step, it only supports storing a folio or shadow, and it
> > > is a drop-in replacement for the current swap cache. Convert all swap
> > > cache users to use the new sets of APIs. Chris Li has been suggesting
> > > using a new infrastructure for swap cache for better performance, and
> > > that idea combined well with the swap table as the new backing
> > > structure. Now the lock contention range is reduced to 2M clusters,
> > > which is much smaller than the 64M address_space. And we can also drop
> > > the multiple address_space design.
> > >
> > > All the internal works are done with swap_cache_get_* helpers. Swap
> > > cache lookup is still lock-less like before, and the helper's contexts
> > > are same with original swap cache helpers. They still require a pin
> > > on the swap device to prevent the backing data from being freed.
> > >
> > > Swap cache updates are now protected by the swap cluster lock
> > > instead of the Xarray lock. This is mostly handled internally, but new
> > > __swap_cache_* helpers require the caller to lock the cluster. So, a
> > > few new cluster access and locking helpers are also introduced.
> > >
> > > A fully cluster-based unified swap table can be implemented on top
> > > of this to take care of all count tracking and synchronization work,
> > > with dynamic allocation. It should reduce the memory usage while
> > > making the performance even better.
> > >
> > > Co-developed-by: Chris Li
> > > Signed-off-by: Chris Li
> > > Signed-off-by: Kairui Song
> > > ---
> > >  MAINTAINERS          |   1 +
> > >  include/linux/swap.h |   2 -
> > >  mm/huge_memory.c     |  13 +-
> > >  mm/migrate.c         |  19 ++-
> > >  mm/shmem.c           |   8 +-
> > >  mm/swap.h            | 157 +++++++++++++++++------
> > >  mm/swap_state.c      | 289 +++++++++++++++++++------------------------
> > >  mm/swap_table.h      |  97 +++++++++++++++
> > >  mm/swapfile.c        | 100 +++++++++++----
> > >  mm/vmscan.c          |  20 ++-
> > >  10 files changed, 458 insertions(+), 248 deletions(-)
> > >  create mode 100644 mm/swap_table.h
> > >
> > > diff --git a/MAINTAINERS b/MAINTAINERS
> > > index 1c8292c0318d..de402ca91a80 100644
> > > --- a/MAINTAINERS
> > > +++ b/MAINTAINERS
> > > @@ -16226,6 +16226,7 @@ F:	include/linux/swapops.h
> > >  F:	mm/page_io.c
> > >  F:	mm/swap.c
> > >  F:	mm/swap.h
> > > +F:	mm/swap_table.h
> > >  F:	mm/swap_state.c
> > >  F:	mm/swapfile.c
> > >
> >
> > ...
> >
> > >  #include <linux/swapops.h> /* for swp_offset */
> >
> > Now that swp_offset() is used in folio_index(), should this perhaps also be
> > included for !CONFIG_SWAP?
>
> Hi, Thanks for looking at this series.
>
> >
> > >  #include <linux/blk_types.h> /* for bio_end_io_t */
> >
> > ...
>
> > >  	if (unlikely(folio_test_swapcache(folio)))
> >
> > > -		return swap_cache_index(folio->swap);
> > > +		return swp_offset(folio->swap);
> >
> > This is outside CONFIG_SWAP.
>
> Right, but there are users of folio_index that are outside of
> CONFIG_SWAP (mm/migrate.c), and swp_offset is also outside of SWAP so
> that's OK.
>
> If we wrap it, the CONFIG_SWAP build will fail. I've test !CONFIG_SWAP
> build on this patch and after the whole series, it works fine.
>
> We should drop the usage of folio_index in migrate.c, that's not
> really related to this series though.

Interesting that it works for you.
I have a config with !CONFIG_SWAP which fails with:

In file included from mm/shmem.c:44:
mm/swap.h: In function ‘folio_index’:
mm/swap.h:461:24: error: implicit declaration of function ‘swp_offset’; did you mean ‘pmd_offset’? [-Wimplicit-function-declaration]
  461 |                 return swp_offset(folio->swap);
      |                        ^~~~~~~~~~
      |                        pmd_offset

(though it's possible I have misapplied the series somehow).

If I just move the linux/swapops.h include outside the CONFIG_SWAP ifdef:

diff --git a/mm/swap.h b/mm/swap.h
index caff4fe30fc5..12dd7d6478ff 100644
--- a/mm/swap.h
+++ b/mm/swap.h
@@ -3,6 +3,7 @@
 #define _MM_SWAP_H
 
 #include <linux/atomic.h> /* for atomic_long_t */
+#include <linux/swapops.h> /* for swp_offset */
 struct mempolicy;
 struct swap_iocb;
 
@@ -54,7 +55,6 @@ enum swap_cluster_flags {
 };
 
 #ifdef CONFIG_SWAP
-#include <linux/swapops.h> /* for swp_offset */
 #include <linux/blk_types.h> /* for bio_end_io_t */
 
 static inline unsigned int swp_cluster_offset(swp_entry_t entry)

it fixes that issue for me, and my other CONFIG_SWAP builds do not seem
to be impacted. I attached the config in case it's useful.

> > >
> > >  	return folio->index;
> > >  }
> >
> > ...
> >
> > Regards,
> > Klara Modin
> >
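
As a side note, the failure mode can be shown in isolation. What follows is
only a userspace sketch of the header layout with the include moved out of
the ifdef, not kernel code: FAKE_CONFIG_SWAP, FAKE_OFFSET_MASK,
FAKE_CLUSTER_SIZE, fake_swp_entry_t, struct fake_folio and the fake_*
helpers are all hypothetical stand-ins, and the mask width is an arbitrary
assumption. The only number taken from the patch description is the 512
entries per cluster.

```c
#include <stdbool.h>

/* Hypothetical stand-ins; none of these are real kernel symbols. */
#define FAKE_OFFSET_MASK  ((1UL << 24) - 1)  /* assumed width, illustration only */
#define FAKE_CLUSTER_SIZE 512UL              /* 512 entries per cluster, per the patch */

typedef struct { unsigned long val; } fake_swp_entry_t;

struct fake_folio {
	bool in_swapcache;      /* stands in for folio_test_swapcache() */
	fake_swp_entry_t swap;
	unsigned long index;
};

/*
 * Plays the role of the swapops.h include. It is defined unconditionally,
 * which mirrors moving the include outside the CONFIG_SWAP ifdef.
 */
static inline unsigned long fake_swp_offset(fake_swp_entry_t entry)
{
	return entry.val & FAKE_OFFSET_MASK;
}

#ifdef FAKE_CONFIG_SWAP
/* Swap-only helper: it is fine for this one to exist only when swap is on. */
static inline unsigned long fake_swp_cluster_offset(fake_swp_entry_t entry)
{
	return fake_swp_offset(entry) % FAKE_CLUSTER_SIZE;
}
#endif

/*
 * Lives outside the ifdef, like folio_index(), so everything it calls
 * has to be visible in both configurations.
 */
static inline unsigned long fake_folio_index(const struct fake_folio *folio)
{
	if (folio->in_swapcache)
		return fake_swp_offset(folio->swap);
	return folio->index;
}
```

Built with or without -DFAKE_CONFIG_SWAP this should compile either way.
Moving fake_swp_offset() back inside the #ifdef should give the same kind
of implicit-declaration diagnostic as above for the build without the
define, since fake_folio_index() would then call something that was never
declared.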