From: Barry Song <21cnbao@gmail.com>
To: akpm@linux-foundation.org, linux-mm@kvack.org
Cc: linux-kernel@vger.kernel.org, Barry Song, David Hildenbrand,
	Baolin Wang, Ryan Roberts, Lorenzo Stoakes, "Liam R. Howlett",
	Vlastimil Babka, Mike Rapoport, Suren Baghdasaryan, Michal Hocko,
	Rik van Riel, Harry Yoo, Kairui Song, Chris Li, Baoquan He,
	Dan Schatzberg, Kaixiong Yu, Fan Ni, Tangquan Zheng
Subject: [PATCH RFC] mm: make try_to_unmap_one support batched unmap for anon large folios
Date: Tue, 13 May 2025 20:46:20 +1200
Message-Id: <20250513084620.58231-1-21cnbao@gmail.com>
X-Mailer: git-send-email 2.39.3 (Apple Git-146)
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

From: Barry Song

My commit 354dffd29575c ("mm: support batched unmap for lazyfree large
folios during reclamation") introduced support for unmapping entire
lazyfree anonymous large folios at once, instead of one page at a time.
This patch extends that support to generic (non-lazyfree) anonymous
large folios.

Handling __folio_try_share_anon_rmap() and swap_duplicate() becomes
extremely complex—if not outright impractical—for non-exclusive
anonymous folios. As a result, this patch limits support to exclusive
large folios. Fortunately, most anonymous folios are exclusive in
practice, so this restriction should be acceptable in the majority of
cases.
SPARC is currently the only architecture that implements
arch_unmap_one(), which would also need to be batched for consistency.
Since that is not yet supported, SPARC is excluded for now.

The following micro-benchmark measures the time taken to perform
MADV_PAGEOUT on 256MB of 64KiB anonymous large folios:

#define _GNU_SOURCE
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <time.h>
#include <unistd.h>
#include <errno.h>
#include <sys/mman.h>

#define SIZE_MB 256
#define SIZE_BYTES (SIZE_MB * 1024 * 1024)

int main(void)
{
	void *addr = mmap(NULL, SIZE_BYTES, PROT_READ | PROT_WRITE,
			  MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
	if (addr == MAP_FAILED) {
		perror("mmap failed");
		return 1;
	}

	memset(addr, 0, SIZE_BYTES);

	struct timespec start, end;

	clock_gettime(CLOCK_MONOTONIC, &start);
	if (madvise(addr, SIZE_BYTES, MADV_PAGEOUT) != 0) {
		perror("madvise(MADV_PAGEOUT) failed");
		munmap(addr, SIZE_BYTES);
		return 1;
	}
	clock_gettime(CLOCK_MONOTONIC, &end);

	long duration_ns = (end.tv_sec - start.tv_sec) * 1e9 +
			   (end.tv_nsec - start.tv_nsec);

	printf("madvise(MADV_PAGEOUT) took %ld ns (%.3f ms)\n",
	       duration_ns, duration_ns / 1e6);

	munmap(addr, SIZE_BYTES);
	return 0;
}

w/o patch:
~ # ./a.out
madvise(MADV_PAGEOUT) took 1337334000 ns (1337.334 ms)
~ # ./a.out
madvise(MADV_PAGEOUT) took 1340471008 ns (1340.471 ms)
~ # ./a.out
madvise(MADV_PAGEOUT) took 1385718992 ns (1385.719 ms)
~ # ./a.out
madvise(MADV_PAGEOUT) took 1366070000 ns (1366.070 ms)
~ # ./a.out
madvise(MADV_PAGEOUT) took 1347834992 ns (1347.835 ms)

w/ patch:
~ # ./a.out
madvise(MADV_PAGEOUT) took 698178000 ns (698.178 ms)
~ # ./a.out
madvise(MADV_PAGEOUT) took 708570000 ns (708.570 ms)
~ # ./a.out
madvise(MADV_PAGEOUT) took 693884000 ns (693.884 ms)
~ # ./a.out
madvise(MADV_PAGEOUT) took 693366000 ns (693.366 ms)
~ # ./a.out
madvise(MADV_PAGEOUT) took 690790000 ns (690.790 ms)

The time to reclaim this memory is roughly halved: about 1355 ms on
average without the patch versus about 697 ms with it, a ~1.9x speedup.
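At the interface level, the change to swap_duplicate() is mechanical:
callers now pass the number of contiguous swap entries to duplicate.
Illustrative call sites, matching the diff below:

	/* single-entry users such as fork's copy_nonpresent_pte() pass 1 */
	if (swap_duplicate(entry, 1) < 0)
		return -EIO;

	/* the batched unmap path passes the folio's page count */
	if (swap_duplicate(entry, nr_pages) < 0)
		goto walk_abort;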
Cc: David Hildenbrand
Cc: Baolin Wang
Cc: Ryan Roberts
Cc: Lorenzo Stoakes
Cc: Liam R. Howlett
Cc: Vlastimil Babka
Cc: Mike Rapoport
Cc: Suren Baghdasaryan
Cc: Michal Hocko
Cc: Rik van Riel
Cc: Harry Yoo
Cc: Kairui Song
Cc: Chris Li
Cc: Baoquan He
Cc: Dan Schatzberg
Cc: Kaixiong Yu
Cc: Fan Ni
Cc: Tangquan Zheng
Signed-off-by: Barry Song
---
 include/linux/swap.h |  4 +--
 mm/memory.c          |  2 +-
 mm/rmap.c            | 79 +++++++++++++++++++++++++++++---------------
 mm/swapfile.c        | 10 ++++--
 4 files changed, 62 insertions(+), 33 deletions(-)

diff --git a/include/linux/swap.h b/include/linux/swap.h
index bc0e1c275fc0..8fbb8ce72016 100644
--- a/include/linux/swap.h
+++ b/include/linux/swap.h
@@ -479,7 +479,7 @@ void put_swap_folio(struct folio *folio, swp_entry_t entry);
 extern swp_entry_t get_swap_page_of_type(int);
 extern int add_swap_count_continuation(swp_entry_t, gfp_t);
 extern void swap_shmem_alloc(swp_entry_t, int);
-extern int swap_duplicate(swp_entry_t);
+extern int swap_duplicate(swp_entry_t, int nr);
 extern int swapcache_prepare(swp_entry_t entry, int nr);
 extern void swap_free_nr(swp_entry_t entry, int nr_pages);
 extern void free_swap_and_cache_nr(swp_entry_t entry, int nr);
@@ -546,7 +546,7 @@ static inline void swap_shmem_alloc(swp_entry_t swp, int nr)
 {
 }
 
-static inline int swap_duplicate(swp_entry_t swp)
+static inline int swap_duplicate(swp_entry_t swp, int nr)
 {
 	return 0;
 }
diff --git a/mm/memory.c b/mm/memory.c
index 99af83434e7c..5a7e4c0e89c7 100644
--- a/mm/memory.c
+++ b/mm/memory.c
@@ -803,7 +803,7 @@ copy_nonpresent_pte(struct mm_struct *dst_mm, struct mm_struct *src_mm,
 	swp_entry_t entry = pte_to_swp_entry(orig_pte);
 
 	if (likely(!non_swap_entry(entry))) {
-		if (swap_duplicate(entry) < 0)
+		if (swap_duplicate(entry, 1) < 0)
 			return -EIO;
 
 		/* make sure dst_mm is on swapoff's mmlist. */
diff --git a/mm/rmap.c b/mm/rmap.c
index fb63d9256f09..2607e02a0960 100644
--- a/mm/rmap.c
+++ b/mm/rmap.c
@@ -1845,23 +1845,42 @@ void folio_remove_rmap_pud(struct folio *folio, struct page *page,
 #endif
 }
 
-/* We support batch unmapping of PTEs for lazyfree large folios */
+/*
+ * We support batch unmapping of PTEs for lazyfree or exclusive anon large
+ * folios.
+ */
 static inline bool can_batch_unmap_folio_ptes(unsigned long addr,
-		struct folio *folio, pte_t *ptep)
+		struct folio *folio, pte_t *ptep, bool exclusive)
 {
 	const fpb_t fpb_flags = FPB_IGNORE_DIRTY | FPB_IGNORE_SOFT_DIRTY;
 	int max_nr = folio_nr_pages(folio);
+#ifndef __HAVE_ARCH_UNMAP_ONE
+	bool no_arch_unmap = true;
+#else
+	bool no_arch_unmap = false;
+#endif
 	pte_t pte = ptep_get(ptep);
+	int mapped_nr;
 
-	if (!folio_test_anon(folio) || folio_test_swapbacked(folio))
+	if (!folio_test_anon(folio))
 		return false;
 	if (pte_unused(pte))
 		return false;
 	if (pte_pfn(pte) != folio_pfn(folio))
 		return false;
 
-	return folio_pte_batch(folio, addr, ptep, pte, max_nr, fpb_flags, NULL,
-			       NULL, NULL) == max_nr;
+	mapped_nr = folio_pte_batch(folio, addr, ptep, pte, max_nr, fpb_flags, NULL,
+				    NULL, NULL);
+	if (mapped_nr != max_nr)
+		return false;
+	if (!folio_test_swapbacked(folio))
+		return true;
+
+	/*
+	 * Since the large folio is fully mapped and its mapcount equals its
+	 * number of pages, it must be exclusive.
+	 */
+	return no_arch_unmap && exclusive && folio_mapcount(folio) == max_nr;
 }
 
 /*
@@ -2025,7 +2044,8 @@ static bool try_to_unmap_one(struct folio *folio, struct vm_area_struct *vma,
 				folio_mark_dirty(folio);
 		} else if (likely(pte_present(pteval))) {
 			if (folio_test_large(folio) && !(flags & TTU_HWPOISON) &&
-			    can_batch_unmap_folio_ptes(address, folio, pvmw.pte))
+			    can_batch_unmap_folio_ptes(address, folio, pvmw.pte,
+						       anon_exclusive))
 				nr_pages = folio_nr_pages(folio);
 			end_addr = address + nr_pages * PAGE_SIZE;
 			flush_cache_range(vma, address, end_addr);
@@ -2141,8 +2161,8 @@ static bool try_to_unmap_one(struct folio *folio, struct vm_area_struct *vma,
 				goto discard;
 			}
 
-			if (swap_duplicate(entry) < 0) {
-				set_pte_at(mm, address, pvmw.pte, pteval);
+			if (swap_duplicate(entry, nr_pages) < 0) {
+				set_ptes(mm, address, pvmw.pte, pteval, nr_pages);
 				goto walk_abort;
 			}
 
@@ -2159,9 +2179,10 @@ static bool try_to_unmap_one(struct folio *folio, struct vm_area_struct *vma,
 
 			/* See folio_try_share_anon_rmap(): clear PTE first. */
 			if (anon_exclusive &&
-			    folio_try_share_anon_rmap_pte(folio, subpage)) {
-				swap_free(entry);
-				set_pte_at(mm, address, pvmw.pte, pteval);
+			    __folio_try_share_anon_rmap(folio, subpage, nr_pages,
+							RMAP_LEVEL_PTE)) {
+				swap_free_nr(entry, nr_pages);
+				set_ptes(mm, address, pvmw.pte, pteval, nr_pages);
 				goto walk_abort;
 			}
 			if (list_empty(&mm->mmlist)) {
@@ -2170,23 +2191,27 @@ static bool try_to_unmap_one(struct folio *folio, struct vm_area_struct *vma,
 					list_add(&mm->mmlist, &init_mm.mmlist);
 				spin_unlock(&mmlist_lock);
 			}
-			dec_mm_counter(mm, MM_ANONPAGES);
-			inc_mm_counter(mm, MM_SWAPENTS);
-			swp_pte = swp_entry_to_pte(entry);
-			if (anon_exclusive)
-				swp_pte = pte_swp_mkexclusive(swp_pte);
-			if (likely(pte_present(pteval))) {
-				if (pte_soft_dirty(pteval))
-					swp_pte = pte_swp_mksoft_dirty(swp_pte);
-				if (pte_uffd_wp(pteval))
-					swp_pte = pte_swp_mkuffd_wp(swp_pte);
-			} else {
-				if (pte_swp_soft_dirty(pteval))
-					swp_pte = pte_swp_mksoft_dirty(swp_pte);
-				if (pte_swp_uffd_wp(pteval))
-					swp_pte = pte_swp_mkuffd_wp(swp_pte);
+			add_mm_counter(mm, MM_ANONPAGES, -nr_pages);
+			add_mm_counter(mm, MM_SWAPENTS, nr_pages);
+			/* TODO: let set_ptes() support swp_offset advance */
+			for (pte_t *ptep = pvmw.pte; address < end_addr;
+			     entry.val++, address += PAGE_SIZE, ptep++) {
+				swp_pte = swp_entry_to_pte(entry);
+				if (anon_exclusive)
+					swp_pte = pte_swp_mkexclusive(swp_pte);
+				if (likely(pte_present(pteval))) {
+					if (pte_soft_dirty(pteval))
+						swp_pte = pte_swp_mksoft_dirty(swp_pte);
+					if (pte_uffd_wp(pteval))
+						swp_pte = pte_swp_mkuffd_wp(swp_pte);
+				} else {
+					if (pte_swp_soft_dirty(pteval))
+						swp_pte = pte_swp_mksoft_dirty(swp_pte);
+					if (pte_swp_uffd_wp(pteval))
+						swp_pte = pte_swp_mkuffd_wp(swp_pte);
+				}
+				set_pte_at(mm, address, ptep, swp_pte);
 			}
-			set_pte_at(mm, address, pvmw.pte, swp_pte);
 		} else {
 			/*
 			 * This is a locked file-backed folio,
diff --git a/mm/swapfile.c b/mm/swapfile.c
index 026090bf3efe..189e3474ffc6 100644
--- a/mm/swapfile.c
+++ b/mm/swapfile.c
@@ -3550,13 +3550,17 @@ static int __swap_duplicate(swp_entry_t entry, unsigned char usage, int nr)
 	offset = swp_offset(entry);
 	VM_WARN_ON(nr > SWAPFILE_CLUSTER - offset % SWAPFILE_CLUSTER);
-	VM_WARN_ON(usage == 1 && nr > 1);
 	ci = lock_cluster(si, offset);
 
 	err = 0;
 	for (i = 0; i < nr; i++) {
 		count = si->swap_map[offset + i];
 
+		/*
+		 * We only support batched swap_duplicate() for unmapping
+		 * exclusive large folios, where the count should be zero.
+		 */
+		VM_WARN_ON(usage == 1 && nr > 1 && swap_count(count));
 		/*
 		 * swapin_readahead() doesn't check if a swap entry is valid, so the
 		 * swap entry could be SWAP_MAP_BAD. Check here with lock held.
@@ -3626,11 +3630,11 @@ void swap_shmem_alloc(swp_entry_t entry, int nr)
  * if __swap_duplicate() fails for another reason (-EINVAL or -ENOENT), which
  * might occur if a page table entry has got corrupted.
  */
-int swap_duplicate(swp_entry_t entry)
+int swap_duplicate(swp_entry_t entry, int nr)
 {
 	int err = 0;
 
-	while (!err && __swap_duplicate(entry, 1, 1) == -ENOMEM)
+	while (!err && __swap_duplicate(entry, 1, nr) == -ENOMEM)
 		err = add_swap_count_continuation(entry, GFP_ATOMIC);
 	return err;
 }
-- 
2.39.3 (Apple Git-146)