From mboxrd@z Thu Jan 1 00:00:00 1970
From: Nhat Pham <nphamcs@gmail.com>
To: linux-mm@kvack.org
Cc: akpm@linux-foundation.org, hannes@cmpxchg.org, hughd@google.com,
	yosry.ahmed@linux.dev, mhocko@kernel.org, roman.gushchin@linux.dev,
	shakeel.butt@linux.dev, muchun.song@linux.dev, len.brown@intel.com,
	chengming.zhou@linux.dev, kasong@tencent.com, chrisl@kernel.org,
	huang.ying.caritas@gmail.com, ryan.roberts@arm.com,
	viro@zeniv.linux.org.uk, baohua@kernel.org, osalvador@suse.de,
	lorenzo.stoakes@oracle.com, christophe.leroy@csgroup.eu,
	pavel@kernel.org, kernel-team@meta.com, linux-kernel@vger.kernel.org,
	cgroups@vger.kernel.org, linux-pm@vger.kernel.org, peterx@redhat.com
Subject: [RFC PATCH v2 04/18] mm: swap: add an abstract API for locking out swapoff
Date: Tue, 29 Apr 2025 16:38:32 -0700
Message-ID: <20250429233848.3093350-5-nphamcs@gmail.com>
X-Mailer: git-send-email 2.47.1
In-Reply-To: <20250429233848.3093350-1-nphamcs@gmail.com>
References: <20250429233848.3093350-1-nphamcs@gmail.com>
MIME-Version: 1.0
Content-Transfer-Encoding: 8bit

Currently, we get a reference to the backing swap device in order to
lock out swapoff and ensure its validity. This does not make sense in
the new virtual swap design, especially after the swap backends are
decoupled - a swap entry might not have any backing swap device at all,
and its backend might change at any time during its lifetime.

In preparation for this, abstract the swapoff lock-out behavior into a
generic API.

Signed-off-by: Nhat Pham <nphamcs@gmail.com>
---
 include/linux/swap.h | 12 ++++++++++++
 mm/memory.c          | 13 +++++++------
 mm/shmem.c           |  7 +++----
 mm/swap_state.c      | 10 ++++------
 mm/userfaultfd.c     | 11 ++++++-----
 5 files changed, 32 insertions(+), 21 deletions(-)

diff --git a/include/linux/swap.h b/include/linux/swap.h
index 8b8c10356a5c..23eaf44791d4 100644
--- a/include/linux/swap.h
+++ b/include/linux/swap.h
@@ -709,5 +709,17 @@ static inline bool mem_cgroup_swap_full(struct folio *folio)
 }
 #endif
 
+static inline bool trylock_swapoff(swp_entry_t entry,
+				struct swap_info_struct **si)
+{
+	return (*si = get_swap_device(entry)) != NULL;
+}
+
+static inline void unlock_swapoff(swp_entry_t entry,
+				struct swap_info_struct *si)
+{
+	put_swap_device(si);
+}
+
 #endif /* __KERNEL__*/
 #endif /* _LINUX_SWAP_H */
diff --git a/mm/memory.c b/mm/memory.c
index fb7b8dc75167..e92914df5ca7 100644
--- a/mm/memory.c
+++ b/mm/memory.c
@@ -4305,6 +4305,7 @@ vm_fault_t do_swap_page(struct vm_fault *vmf)
 	struct swap_info_struct *si = NULL;
 	rmap_t rmap_flags = RMAP_NONE;
 	bool need_clear_cache = false;
+	bool swapoff_locked = false;
 	bool exclusive = false;
 	swp_entry_t entry;
 	pte_t pte;
@@ -4365,8 +4366,8 @@ vm_fault_t do_swap_page(struct vm_fault *vmf)
 	}
 
 	/* Prevent swapoff from happening to us. */
-	si = get_swap_device(entry);
-	if (unlikely(!si))
+	swapoff_locked = trylock_swapoff(entry, &si);
+	if (unlikely(!swapoff_locked))
 		goto out;
 
 	folio = swap_cache_get_folio(entry, vma, vmf->address);
@@ -4713,8 +4714,8 @@ vm_fault_t do_swap_page(struct vm_fault *vmf)
 		if (waitqueue_active(&swapcache_wq))
 			wake_up(&swapcache_wq);
 	}
-	if (si)
-		put_swap_device(si);
+	if (swapoff_locked)
+		unlock_swapoff(entry, si);
 	return ret;
 out_nomap:
 	if (vmf->pte)
@@ -4732,8 +4733,8 @@ vm_fault_t do_swap_page(struct vm_fault *vmf)
 		if (waitqueue_active(&swapcache_wq))
 			wake_up(&swapcache_wq);
 	}
-	if (si)
-		put_swap_device(si);
+	if (swapoff_locked)
+		unlock_swapoff(entry, si);
 	return ret;
 }
 
diff --git a/mm/shmem.c b/mm/shmem.c
index 1ede0800e846..8ef72dcc592e 100644
--- a/mm/shmem.c
+++ b/mm/shmem.c
@@ -2262,8 +2262,7 @@ static int shmem_swapin_folio(struct inode *inode, pgoff_t index,
 	if (is_poisoned_swp_entry(swap))
 		return -EIO;
 
-	si = get_swap_device(swap);
-	if (!si) {
+	if (!trylock_swapoff(swap, &si)) {
 		if (!shmem_confirm_swap(mapping, index, swap))
 			return -EEXIST;
 		else
@@ -2411,7 +2410,7 @@ static int shmem_swapin_folio(struct inode *inode, pgoff_t index,
 	}
 	folio_mark_dirty(folio);
 	swap_free_nr(swap, nr_pages);
-	put_swap_device(si);
+	unlock_swapoff(swap, si);
 
 	*foliop = folio;
 	return 0;
@@ -2428,7 +2427,7 @@ static int shmem_swapin_folio(struct inode *inode, pgoff_t index,
 		folio_unlock(folio);
 		folio_put(folio);
 	}
-	put_swap_device(si);
+	unlock_swapoff(swap, si);
 
 	return error;
 }
diff --git a/mm/swap_state.c b/mm/swap_state.c
index ca42b2be64d9..81f69b2df550 100644
--- a/mm/swap_state.c
+++ b/mm/swap_state.c
@@ -419,12 +419,11 @@ struct folio *filemap_get_incore_folio(struct address_space *mapping,
 	if (non_swap_entry(swp))
 		return ERR_PTR(-ENOENT);
 	/* Prevent swapoff from happening to us */
-	si = get_swap_device(swp);
-	if (!si)
+	if (!trylock_swapoff(swp, &si))
 		return ERR_PTR(-ENOENT);
 	index = swap_cache_index(swp);
 	folio = filemap_get_folio(swap_address_space(swp), index);
-	put_swap_device(si);
+	unlock_swapoff(swp, si);
 	return folio;
 }
 
@@ -439,8 +438,7 @@ struct folio *__read_swap_cache_async(swp_entry_t entry, gfp_t gfp_mask,
 	void *shadow = NULL;
 
 	*new_page_allocated = false;
-	si = get_swap_device(entry);
-	if (!si)
+	if (!trylock_swapoff(entry, &si))
 		return NULL;
 
 	for (;;) {
@@ -538,7 +536,7 @@ struct folio *__read_swap_cache_async(swp_entry_t entry, gfp_t gfp_mask,
 	put_swap_folio(new_folio, entry);
 	folio_unlock(new_folio);
 put_and_return:
-	put_swap_device(si);
+	unlock_swapoff(entry, si);
 	if (!(*new_page_allocated) && new_folio)
 		folio_put(new_folio);
 	return result;
diff --git a/mm/userfaultfd.c b/mm/userfaultfd.c
index d06453fa8aba..f40bbfd09fd5 100644
--- a/mm/userfaultfd.c
+++ b/mm/userfaultfd.c
@@ -1161,6 +1161,7 @@ static int move_pages_pte(struct mm_struct *mm, pmd_t *dst_pmd, pmd_t *src_pmd,
 	struct folio *src_folio = NULL;
 	struct anon_vma *src_anon_vma = NULL;
 	struct mmu_notifier_range range;
+	bool swapoff_locked = false;
 	int err = 0;
 
 	flush_cache_range(src_vma, src_addr, src_addr + PAGE_SIZE);
@@ -1367,8 +1368,8 @@ static int move_pages_pte(struct mm_struct *mm, pmd_t *dst_pmd, pmd_t *src_pmd,
 			goto out;
 		}
 
-		si = get_swap_device(entry);
-		if (unlikely(!si)) {
+		swapoff_locked = trylock_swapoff(entry, &si);
+		if (unlikely(!swapoff_locked)) {
 			err = -EAGAIN;
 			goto out;
 		}
@@ -1399,7 +1400,7 @@ static int move_pages_pte(struct mm_struct *mm, pmd_t *dst_pmd, pmd_t *src_pmd,
 				pte_unmap(src_pte);
 				pte_unmap(dst_pte);
 				src_pte = dst_pte = NULL;
-				put_swap_device(si);
+				unlock_swapoff(entry, si);
 				si = NULL;
 				/* now we can block and wait */
 				folio_lock(src_folio);
@@ -1425,8 +1426,8 @@ static int move_pages_pte(struct mm_struct *mm, pmd_t *dst_pmd, pmd_t *src_pmd,
 	if (src_pte)
 		pte_unmap(src_pte);
 	mmu_notifier_invalidate_range_end(&range);
-	if (si)
-		put_swap_device(si);
+	if (swapoff_locked)
+		unlock_swapoff(entry, si);
 
 	return err;
 }
-- 
2.47.1
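
For illustration only, the caller pattern that this patch converts each site to could be modelled in userspace roughly as below. This is a sketch under stated assumptions, not kernel code: the struct layout, model_dev, and use_entry() are hypothetical, and the get_swap_device()/put_swap_device() bodies merely stand in for the kernel's real reference counting. It assumes trylock_swapoff() stores the pinned device through its out-parameter so that the later unlock_swapoff() has a valid pointer to release.

```c
/* Userspace model of the swapoff lock-out pattern; NOT kernel code. */
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for swp_entry_t and struct swap_info_struct. */
typedef struct { unsigned long val; } swp_entry_t;

struct swap_info_struct {
	bool live;	/* cleared once the modelled "swapoff" has started */
	int users;	/* models the reference count pinning the device */
};

/* Single modelled device; the kernel derives it from the entry instead. */
static struct swap_info_struct model_dev = { .live = true, .users = 0 };

/* Model of get_swap_device(): NULL once swapoff has started. */
static struct swap_info_struct *get_swap_device(swp_entry_t entry)
{
	(void)entry;
	if (!model_dev.live)
		return NULL;
	model_dev.users++;
	return &model_dev;
}

/* Model of put_swap_device(): a NULL or unpinned device is a caller bug. */
static void put_swap_device(struct swap_info_struct *si)
{
	assert(si != NULL && si->users > 0);
	si->users--;
}

/* Returns true and fills *si when swapoff is locked out. */
static bool trylock_swapoff(swp_entry_t entry, struct swap_info_struct **si)
{
	*si = get_swap_device(entry);
	return *si != NULL;
}

static void unlock_swapoff(swp_entry_t entry, struct swap_info_struct *si)
{
	(void)entry;
	put_swap_device(si);
}

/* Hypothetical caller: 0 on success, -1 if swapoff won the race. */
static int use_entry(swp_entry_t entry)
{
	struct swap_info_struct *si = NULL;

	if (!trylock_swapoff(entry, &si))
		return -1;
	/* The entry can be used safely here; the device stays pinned. */
	unlock_swapoff(entry, si);
	return 0;
}
```

Callers that may bail out on several paths, such as do_swap_page(), track the result in a separate boolean (swapoff_locked in the patch) rather than testing si, since under the virtual swap design a locked entry need not have a backing device at all.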