From: Nhat Pham <nphamcs@gmail.com>
To: linux-mm@kvack.org
Cc: akpm@linux-foundation.org, hannes@cmpxchg.org, hughd@google.com,
	yosry.ahmed@linux.dev, mhocko@kernel.org, roman.gushchin@linux.dev,
	shakeel.butt@linux.dev, muchun.song@linux.dev, len.brown@intel.com,
	chengming.zhou@linux.dev, kasong@tencent.com, chrisl@kernel.org,
	huang.ying.caritas@gmail.com, ryan.roberts@arm.com,
	viro@zeniv.linux.org.uk, baohua@kernel.org, osalvador@suse.de,
	lorenzo.stoakes@oracle.com, christophe.leroy@csgroup.eu,
	pavel@kernel.org, kernel-team@meta.com, linux-kernel@vger.kernel.org,
	cgroups@vger.kernel.org, linux-pm@vger.kernel.org, peterx@redhat.com
Subject: [RFC PATCH v2 07/18] mm: swap: zswap: swap cache and zswap support for virtualized swap
Date: Tue, 29 Apr 2025 16:38:35 -0700
Message-ID: <20250429233848.3093350-8-nphamcs@gmail.com>
In-Reply-To: <20250429233848.3093350-1-nphamcs@gmail.com>
References: <20250429233848.3093350-1-nphamcs@gmail.com>
MIME-Version: 1.0
Content-Transfer-Encoding: 8bit

Currently, the swap cache code assumes that the swap space is of a
fixed size. The virtual swap space is dynamically sized, so the
existing partitioning code cannot be easily reused.
A dynamic partitioning is planned, but for now keep the design simple
and just use a flat swapcache for vswap.

Similar to the swap cache, the zswap tree code, specifically the range
partition logic, can no longer be easily reused for the new virtual
swap space design. Use a simple unified zswap tree in the new
implementation for now. As in the case of the swap cache, range
partitioning is planned as follow-up work.

Since the vswap implementation has begun to diverge from the old
implementation, we also introduce a new build config
(CONFIG_VIRTUAL_SWAP). Users who do not select this config will get the
old implementation, with no behavioral change.

Signed-off-by: Nhat Pham <nphamcs@gmail.com>
---
 mm/swap.h       | 22 ++++++++++++++--------
 mm/swap_state.c | 44 +++++++++++++++++++++++++++++++++++---------
 mm/zswap.c      | 38 ++++++++++++++++++++++++++++++++------
 3 files changed, 81 insertions(+), 23 deletions(-)

diff --git a/mm/swap.h b/mm/swap.h
index d5f8effa8015..06e20b1d79c4 100644
--- a/mm/swap.h
+++ b/mm/swap.h
@@ -22,22 +22,27 @@ void swap_write_unplug(struct swap_iocb *sio);
 int swap_writepage(struct page *page, struct writeback_control *wbc);
 void __swap_writepage(struct folio *folio, struct writeback_control *wbc);
 
-/* linux/mm/swap_state.c */
-/* One swap address space for each 64M swap space */
+/* Return the swap device position of the swap slot. */
+static inline loff_t swap_slot_pos(swp_slot_t slot)
+{
+	return ((loff_t)swp_slot_offset(slot)) << PAGE_SHIFT;
+}
+
 #define SWAP_ADDRESS_SPACE_SHIFT	14
 #define SWAP_ADDRESS_SPACE_PAGES	(1 << SWAP_ADDRESS_SPACE_SHIFT)
 #define SWAP_ADDRESS_SPACE_MASK		(SWAP_ADDRESS_SPACE_PAGES - 1)
+
+/* linux/mm/swap_state.c */
+#ifdef CONFIG_VIRTUAL_SWAP
+extern struct address_space *swap_address_space(swp_entry_t entry);
+#define swap_cache_index(entry)		entry.val
+#else
+/* One swap address space for each 64M swap space */
 extern struct address_space *swapper_spaces[];
 #define swap_address_space(entry)			    \
 	(&swapper_spaces[swp_type(entry)][swp_offset(entry) \
 		>> SWAP_ADDRESS_SPACE_SHIFT])
 
-/* Return the swap device position of the swap slot. */
-static inline loff_t swap_slot_pos(swp_slot_t slot)
-{
-	return ((loff_t)swp_slot_offset(slot)) << PAGE_SHIFT;
-}
-
 /*
  * Return the swap cache index of the swap entry.
  */
@@ -46,6 +51,7 @@ static inline pgoff_t swap_cache_index(swp_entry_t entry)
 	BUILD_BUG_ON((SWP_OFFSET_MASK | SWAP_ADDRESS_SPACE_MASK) != SWP_OFFSET_MASK);
 	return swp_offset(entry) & SWAP_ADDRESS_SPACE_MASK;
 }
+#endif
 
 void show_swap_cache_info(void);
 bool add_to_swap(struct folio *folio);
diff --git a/mm/swap_state.c b/mm/swap_state.c
index 1607d23a3d7b..f677ebf9c5d0 100644
--- a/mm/swap_state.c
+++ b/mm/swap_state.c
@@ -38,8 +38,18 @@ static const struct address_space_operations swap_aops = {
 #endif
 };
 
+#ifdef CONFIG_VIRTUAL_SWAP
+static struct address_space swapper_space __read_mostly;
+
+struct address_space *swap_address_space(swp_entry_t entry)
+{
+	return &swapper_space;
+}
+#else
 struct address_space *swapper_spaces[MAX_SWAPFILES] __read_mostly;
 static unsigned int nr_swapper_spaces[MAX_SWAPFILES] __read_mostly;
+#endif
+
 static bool enable_vma_readahead __read_mostly = true;
 
 #define SWAP_RA_ORDER_CEILING	5
@@ -718,23 +728,34 @@ struct folio *swap_cluster_readahead(swp_entry_t entry, gfp_t gfp_mask,
 	return folio;
 }
 
+static void init_swapper_space(struct address_space *space)
+{
+	xa_init_flags(&space->i_pages, XA_FLAGS_LOCK_IRQ);
+	atomic_set(&space->i_mmap_writable, 0);
+	space->a_ops = &swap_aops;
+	/* swap cache doesn't use writeback related tags */
+	mapping_set_no_writeback_tags(space);
+}
+
+#ifdef CONFIG_VIRTUAL_SWAP
+int init_swap_address_space(unsigned int type, unsigned long nr_pages)
+{
+	return 0;
+}
+
+void exit_swap_address_space(unsigned int type) {}
+#else
 int init_swap_address_space(unsigned int type, unsigned long nr_pages)
 {
-	struct address_space *spaces, *space;
+	struct address_space *spaces;
 	unsigned int i, nr;
 
 	nr = DIV_ROUND_UP(nr_pages, SWAP_ADDRESS_SPACE_PAGES);
 	spaces = kvcalloc(nr, sizeof(struct address_space), GFP_KERNEL);
 	if (!spaces)
 		return -ENOMEM;
-	for (i = 0; i < nr; i++) {
-		space = spaces + i;
-		xa_init_flags(&space->i_pages, XA_FLAGS_LOCK_IRQ);
-		atomic_set(&space->i_mmap_writable, 0);
-		space->a_ops = &swap_aops;
-		/* swap cache doesn't use writeback related tags */
-		mapping_set_no_writeback_tags(space);
-	}
+	for (i = 0; i < nr; i++)
+		init_swapper_space(spaces + i);
 	nr_swapper_spaces[type] = nr;
 	swapper_spaces[type] = spaces;
@@ -752,6 +773,7 @@ void exit_swap_address_space(unsigned int type)
 	nr_swapper_spaces[type] = 0;
 	swapper_spaces[type] = NULL;
 }
+#endif
 
 static int swap_vma_ra_win(struct vm_fault *vmf, unsigned long *start,
 			   unsigned long *end)
@@ -930,6 +952,10 @@ static int __init swap_init_sysfs(void)
 	int err;
 	struct kobject *swap_kobj;
 
+#ifdef CONFIG_VIRTUAL_SWAP
+	init_swapper_space(&swapper_space);
+#endif
+
 	err = vswap_init();
 	if (err) {
 		pr_err("failed to initialize virtual swap space\n");
diff --git a/mm/zswap.c b/mm/zswap.c
index 23365e76a3ce..c1327569ce80 100644
--- a/mm/zswap.c
+++ b/mm/zswap.c
@@ -203,8 +203,6 @@ struct zswap_entry {
 	struct list_head lru;
 };
 
-static struct xarray *zswap_trees[MAX_SWAPFILES];
-static unsigned int nr_zswap_trees[MAX_SWAPFILES];
 /* RCU-protected iteration */
 static LIST_HEAD(zswap_pools);
@@ -231,12 +229,28 @@ static bool zswap_has_pool;
  * helpers and fwd declarations
 **********************************/
 
+#ifdef CONFIG_VIRTUAL_SWAP
+static DEFINE_XARRAY(zswap_tree);
+
+static inline struct xarray *swap_zswap_tree(swp_entry_t swp)
+{
+	return &zswap_tree;
+}
+
+#define zswap_tree_index(entry)		entry.val
+#else
+static struct xarray *zswap_trees[MAX_SWAPFILES];
+static unsigned int nr_zswap_trees[MAX_SWAPFILES];
+
 static inline struct xarray *swap_zswap_tree(swp_entry_t swp)
 {
 	return &zswap_trees[swp_type(swp)][swp_offset(swp)
 		>> SWAP_ADDRESS_SPACE_SHIFT];
 }
 
+#define zswap_tree_index(entry)		swp_offset(entry)
+#endif
+
 #define zswap_pool_debug(msg, p)			\
 	pr_debug("%s pool %s/%s\n", msg, (p)->tfm_name,	\
 		 zpool_get_type((p)->zpool))
@@ -1047,7 +1061,7 @@ static int zswap_writeback_entry(struct zswap_entry *entry,
 				 swp_entry_t swpentry)
 {
 	struct xarray *tree;
-	pgoff_t offset = swp_offset(swpentry);
+	pgoff_t offset = zswap_tree_index(swpentry);
 	struct folio *folio;
 	struct mempolicy *mpol;
 	bool folio_was_allocated;
@@ -1463,7 +1477,7 @@ static bool zswap_store_page(struct page *page,
 		goto compress_failed;
 
 	old = xa_store(swap_zswap_tree(page_swpentry),
-		       swp_offset(page_swpentry),
+		       zswap_tree_index(page_swpentry),
 		       entry, GFP_KERNEL);
 	if (xa_is_err(old)) {
 		int err = xa_err(old);
@@ -1612,7 +1626,7 @@ bool zswap_store(struct folio *folio)
 bool zswap_load(struct folio *folio)
 {
 	swp_entry_t swp = folio->swap;
-	pgoff_t offset = swp_offset(swp);
+	pgoff_t offset = zswap_tree_index(swp);
 	bool swapcache = folio_test_swapcache(folio);
 	struct xarray *tree = swap_zswap_tree(swp);
 	struct zswap_entry *entry;
@@ -1670,7 +1684,7 @@ bool zswap_load(struct folio *folio)
 
 void zswap_invalidate(swp_entry_t swp)
 {
-	pgoff_t offset = swp_offset(swp);
+	pgoff_t offset = zswap_tree_index(swp);
 	struct xarray *tree = swap_zswap_tree(swp);
 	struct zswap_entry *entry;
@@ -1682,6 +1696,16 @@ void zswap_invalidate(swp_entry_t swp)
 	zswap_entry_free(entry);
 }
 
+#ifdef CONFIG_VIRTUAL_SWAP
+int zswap_swapon(int type, unsigned long nr_pages)
+{
+	return 0;
+}
+
+void zswap_swapoff(int type)
+{
+}
+#else
 int zswap_swapon(int type, unsigned long nr_pages)
 {
 	struct xarray *trees, *tree;
@@ -1718,6 +1742,8 @@ void zswap_swapoff(int type)
 	nr_zswap_trees[type] = 0;
 	zswap_trees[type] = NULL;
 }
+#endif /* CONFIG_VIRTUAL_SWAP */
+
 /*********************************
 * debugfs functions
-- 
2.47.1