From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by smtp.lore.kernel.org (Postfix) with ESMTP id BD711CF318A for ; Wed, 2 Oct 2024 01:33:54 +0000 (UTC) Received: by kanga.kvack.org (Postfix) id 460746B00EA; Tue, 1 Oct 2024 21:33:54 -0400 (EDT) Received: by kanga.kvack.org (Postfix, from userid 40) id 4111C6B0100; Tue, 1 Oct 2024 21:33:54 -0400 (EDT) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id 2D8906B0106; Tue, 1 Oct 2024 21:33:54 -0400 (EDT) X-Delivered-To: linux-mm@kvack.org Received: from relay.hostedemail.com (smtprelay0013.hostedemail.com [216.40.44.13]) by kanga.kvack.org (Postfix) with ESMTP id 05C556B00EA for ; Tue, 1 Oct 2024 21:33:53 -0400 (EDT) Received: from smtpin24.hostedemail.com (a10.router.float.18 [10.200.18.1]) by unirelay08.hostedemail.com (Postfix) with ESMTP id 51C741415C4 for ; Wed, 2 Oct 2024 01:33:53 +0000 (UTC) X-FDA: 82626940746.24.5006E32 Received: from mail-ej1-f46.google.com (mail-ej1-f46.google.com [209.85.218.46]) by imf07.hostedemail.com (Postfix) with ESMTP id 753F440014 for ; Wed, 2 Oct 2024 01:33:51 +0000 (UTC) Authentication-Results: imf07.hostedemail.com; dkim=pass header.d=google.com header.s=20230601 header.b=xZLFglYx; spf=pass (imf07.hostedemail.com: domain of yosryahmed@google.com designates 209.85.218.46 as permitted sender) smtp.mailfrom=yosryahmed@google.com; dmarc=pass (policy=reject) header.from=google.com ARC-Seal: i=1; s=arc-20220608; d=hostedemail.com; t=1727832766; a=rsa-sha256; cv=none; b=ky4MAThIn22n5UCPIOgBdIB7paHtQsGMFiyTuNbpZHQOUjqbuoMmBqIqzewYlGQeg1os/L DGp4nOt8DPcUjjVT5xj3Wbyz2NDhOoidTE1/y9y+U+vURBIjYnaEi2Fj+skdn5OwXqk6q1 T9Uh2JYWj7jFoFvsqwlvfvGcTWO9OCU= ARC-Authentication-Results: i=1; imf07.hostedemail.com; dkim=pass header.d=google.com header.s=20230601 header.b=xZLFglYx; spf=pass (imf07.hostedemail.com: domain of yosryahmed@google.com designates 209.85.218.46 as permitted sender) smtp.mailfrom=yosryahmed@google.com; dmarc=pass (policy=reject) header.from=google.com ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=hostedemail.com; s=arc-20220608; t=1727832766; h=from:from:sender:reply-to:subject:subject:date:date: message-id:message-id:to:to:cc:cc:mime-version:mime-version: content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references:dkim-signature; bh=uTSnTMyNZTOwo0t32cs6MGbO9x8fhKc49DjoEIEGpnw=; b=jtGAN3v+eD5X+d+sCMQf5XvGSLFZqpLkg/dE410HOtJH5e9GANw9OeWMAtnRSnT4rpZuDY OsFoQhj1QTjxVABA7GVg5S/LfZQLowo4XpTVEdn/Bvv9U1mYl78Qa50H5l+OYQJkgvG9+q 5D44zKnZpD3/LgYr8vY2d4cQuyFARU0= Received: by mail-ej1-f46.google.com with SMTP id a640c23a62f3a-a8d100e9ce0so621638366b.2 for ; Tue, 01 Oct 2024 18:33:51 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=20230601; t=1727832830; x=1728437630; darn=kvack.org; h=content-transfer-encoding:cc:to:subject:message-id:date:from :in-reply-to:references:mime-version:from:to:cc:subject:date :message-id:reply-to; bh=uTSnTMyNZTOwo0t32cs6MGbO9x8fhKc49DjoEIEGpnw=; b=xZLFglYxXb99TJoO7LJWE14jNeoZNMSHw6S1QItilTvm7LnRukDVwntz2JOaBIxXwe EuyCuCcqtv8f/1phR6A9Na/m/7PeliUCEedcGBKH2ggQc3U3Vfg74XqRVWJLwTCKrlHw 1Zvu5sFyt/Q7Qn1CjI4pFX9xpdf/0k5e1CnLwHXwg4W7vye44prD8W634mXIsq5bmZtP LZodn2hpi3La326u7zp2Re2uXn53FkFEmz4E2uOvx33dn9OwpmKgmwJizzvDoid0k0yw Y8WzEgo86Mfkcr7nwtxxA1GmrIQJiIfd7TWT3s3OCPAimG4J0UiyDNPsBvbybdlf6Dt9 OEag== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1727832830; x=1728437630; h=content-transfer-encoding:cc:to:subject:message-id:date:from :in-reply-to:references:mime-version:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=uTSnTMyNZTOwo0t32cs6MGbO9x8fhKc49DjoEIEGpnw=; b=PPh6eGpPLSMvh79Xmv+VYJsuwi9J/iJnN9qYWuxLzDa1vFxs1zaOI+Ir1B6DH3PRG1 chKiJvYU9j8srb2ff6wRb4WcHjS+LPrCQoQf8wds5CP8foF/L36ZwDg4byBy+vA1mNWx XAjO3FcRcovIQcxsCJ/UwWvodUWSbMVZZCqgO+YVcsISJbp4TPCckUQk9xuBI7+e68DC fOh6ctWXwveE243Kcg5UcKpx9KbWAcVpYcde3Ag8VSuKV0GUz8zVhN9Tr8I4HF8HwtzE YX46R9E3x5esS0Vdk3OVmvu4lNOuGdXalg0PuAgzC1a009Y4ZCz5bK71DVfKbEqE5l4a zToA== X-Forwarded-Encrypted: i=1; AJvYcCXcwI1/D0w7kvmfznBIZc3l7dS8eHgrIpc3MBQZImYBvPkNfyznjfpcBsTRVBsGpWCvsghf5wlKlA==@kvack.org X-Gm-Message-State: AOJu0YwVLw37EPJHUPXICiultqqxvsoHGqi8OAv40+Q2ALVIFo3PYdDd sbSeXTd5AxV+k2kKOsir6mNxcmA5fGFrp+WJRNVgilg5zB6vPVzjxiKswnUQMblxNsc/JIh3Jp8 KU7YRBl/Ly2Ko4GXdlnDFshLyHBPqMM2nnT6z X-Google-Smtp-Source: AGHT+IGTDQRoOwxwmBuGYVvNXvu/jxHtkl/Zpt9fVApOACkOXvRtX2CraxUrT0FgEQnBRBEG5AWEot9k5sl9DNH+ESg= X-Received: by 2002:a17:907:96a1:b0:a86:7e2d:f10b with SMTP id a640c23a62f3a-a98f8264d7amr112888766b.37.1727832829476; Tue, 01 Oct 2024 18:33:49 -0700 (PDT) MIME-Version: 1.0 References: <20241002012042.2753174-1-nphamcs@gmail.com> <20241002012042.2753174-2-nphamcs@gmail.com> In-Reply-To: <20241002012042.2753174-2-nphamcs@gmail.com> From: Yosry Ahmed Date: Tue, 1 Oct 2024 18:33:12 -0700 Message-ID: Subject: Re: [PATCH v2 1/1] swap: shmem: remove SWAP_MAP_SHMEM To: Nhat Pham Cc: akpm@linux-foundation.org, hannes@cmpxchg.org, hughd@google.com, shakeel.butt@linux.dev, ryan.roberts@arm.com, ying.huang@intel.com, chrisl@kernel.org, david@redhat.com, kasong@tencent.com, willy@infradead.org, viro@zeniv.linux.org.uk, baohua@kernel.org, chengming.zhou@linux.dev, v-songbaohua@oppo.com, linux-mm@kvack.org, kernel-team@meta.com, linux-kernel@vger.kernel.org Content-Type: text/plain; charset="UTF-8" Content-Transfer-Encoding: quoted-printable X-Stat-Signature: m11b7zo7faak6isedocycjdy1b7qsicx X-Rspamd-Queue-Id: 753F440014 X-Rspam-User: X-Rspamd-Server: rspam10 X-HE-Tag: 1727832831-787168 X-HE-Meta: U2FsdGVkX1+BNcEeKxbhddn/CuCfAf3TUY+fml4F6jcRTMISSKFJr8jLDYW61ANFwWMeULOXf7wdLSw7ZKEMXkKOEXtA3TS5rKKo+i7qoucmalNE2digBiGjgdI5qRSIocgkJlzH+NGElTDVylHtEIFWOxOUlmnLT4BBBWislgPyFgVl4oUrZeR2fMKH9N3P6o9RTJ4+Dxg6L8aTaDiNI7IPlw5i0INwEVtWDMT/Whoiiu4Fa9EK84KaSlcqrZ0pAERoyidntJiBJMK+2tg8pEHZS71V3H33Dsk3qnt0jR/OvrfZFPIuPaGyB7PunQ2ZHVTl7TuP0IpTgo4hYJL8pTud9fHRhAcBcwL4BicUiZIoE3DXedFejWLcaeXCph0s71fYYuXqEh3t+kDDR3SwBAhGqwCbcIbyO+ZDADLy93M4IKL2FW5emgKu/GWPE7stNRHYDHVYPnscYUZr3vTUIbLQj3YrX6uOLw2nOhtbeyDzyDL6FVmMjDGXgYPk7hEkby3d1TqaMGgR/mNOJDWxjeuvNATW0fnoAWBAuBZEVdNG532uySWhnAw1E2ZwAjDh61SOZuRHN3Ba3J8kRmY29vXzjV+s/YipK0BG90inFV2d67CZ9Pbg/jv/r5rJwm1xK+KaS6q55qrnDD7IduEpciUlxdk78DQn2iOc+VzB1NbhevjzuxfunWMTLEWwkBkNejX8ns/XY9o5B7BH/+WXgEzDV/zUQxVmFFd84Y85e+SC2D4et2yZW7WAGJUvedfOhARsLlWI4Wag+t2ikU3Dv7Vyft4VzYE0U0U+Tc+k2Om6PdtRrW8yHu/GjEkdd48ZA5OVXmD6Jq71ZBRcKX9hN7C+fA4KzkiLgphScjIpQL5MVZx19OwG3FWJMtXROlVAujd78TAfU2TUxm2LZM4R+0Lm5S73+OmnmGILDEFT7sd+9Vav1aQtv1z4z6wnYM3fpj6dqlnprgy3qV+FdnR RDpRKYl5 IFTn9zliLvvMwd6TTP6i7uB9o0WIF7KhZgJf+m4RQeWH+UjRvReVt1HovrEiD4nqbgN3cG2EVTtMrL5t2gBCMBcC8zq+HfHLTvnBZnINbYN8VQ7ph3VlxhQPf/dpOueh1w2YSK/CFnKSaKAoLelFjnUX8m0eIV66EBpO8U7/9ssWp7apGhJqpRHe/WxlqCZYerMYLXF6AJsJJ37xZ3i0Mf2bo8X/Dpe2yeEchFBLETw4igIViWjSwxU+kkmvw3vDp97zHNvz1aS1sY0kVUz4ilvPnpkgDz91eXwYw/9dxrx14cp0= X-Bogosity: Ham, tests=bogofilter, spamicity=0.000001, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: List-Subscribe: List-Unsubscribe: On Tue, Oct 1, 2024 at 6:20=E2=80=AFPM Nhat Pham wrote: > > The SWAP_MAP_SHMEM state was introduced in the commit aaa468653b4a > ("swap_info: note SWAP_MAP_SHMEM"), to quickly determine if a swap entry > belongs to shmem during swapoff. > > However, swapoff has since been rewritten in the commit b56a2d8af914 > ("mm: rid swapoff of quadratic complexity"). Now having swap count =3D=3D > SWAP_MAP_SHMEM value is basically the same as having swap count =3D=3D 1, > and swap_shmem_alloc() behaves analogously to swap_duplicate(). The only > difference of note is that swap_shmem_alloc() does not check for > -ENOMEM returned from __swap_duplicate(), but it is OK because shmem > never re-duplicates any swap entry it owns. This will stil be safe if we > use (batched) swap_duplicate() instead. > > This commit adds swap_duplicate_nr(), the batched variant of > swap_duplicate(), and removes the SWAP_MAP_SHMEM state and the > associated swap_shmem_alloc() helper to simplify the state machine (both > mentally and in terms of actual code). We will also have an extra > state/special value that can be repurposed (for swap entries that never > gets re-duplicated). > > Signed-off-by: Nhat Pham > --- > include/linux/swap.h | 16 ++++++++-------- > mm/shmem.c | 2 +- > mm/swapfile.c | 41 +++++++++++++++++++++-------------------- > 3 files changed, 30 insertions(+), 29 deletions(-) > > diff --git a/include/linux/swap.h b/include/linux/swap.h > index ca533b478c21..017f3c03ff7a 100644 > --- a/include/linux/swap.h > +++ b/include/linux/swap.h > @@ -232,7 +232,6 @@ enum { > /* Special value in first swap_map */ > #define SWAP_MAP_MAX 0x3e /* Max count */ > #define SWAP_MAP_BAD 0x3f /* Note page is bad */ > -#define SWAP_MAP_SHMEM 0xbf /* Owned by shmem/tmpfs */ > > /* Special value in each swap_map continuation */ > #define SWAP_CONT_MAX 0x7f /* Max count */ > @@ -482,8 +481,7 @@ void put_swap_folio(struct folio *folio, swp_entry_t = entry); > extern swp_entry_t get_swap_page_of_type(int); > extern int get_swap_pages(int n, swp_entry_t swp_entries[], int order); > extern int add_swap_count_continuation(swp_entry_t, gfp_t); > -extern void swap_shmem_alloc(swp_entry_t, int); > -extern int swap_duplicate(swp_entry_t); > +extern int swap_duplicate_nr(swp_entry_t, int); > extern int swapcache_prepare(swp_entry_t entry, int nr); > extern void swap_free_nr(swp_entry_t entry, int nr_pages); > extern void swapcache_free_entries(swp_entry_t *entries, int n); > @@ -549,11 +547,7 @@ static inline int add_swap_count_continuation(swp_en= try_t swp, gfp_t gfp_mask) > return 0; > } > > -static inline void swap_shmem_alloc(swp_entry_t swp, int nr) > -{ > -} > - > -static inline int swap_duplicate(swp_entry_t swp) > +static inline int swap_duplicate_nr(swp_entry_t swp, int nr) > { > return 0; > } > @@ -606,6 +600,12 @@ static inline int add_swap_extent(struct swap_info_s= truct *sis, > } > #endif /* CONFIG_SWAP */ > > +static inline int swap_duplicate(swp_entry_t entry) > +{ > + return swap_duplicate_nr(entry, 1); > +} > + > + Nit: extra blank line. > static inline void free_swap_and_cache(swp_entry_t entry) > { > free_swap_and_cache_nr(entry, 1); > diff --git a/mm/shmem.c b/mm/shmem.c > index 0613421e09e7..e3f72f99be32 100644 > --- a/mm/shmem.c > +++ b/mm/shmem.c > @@ -1561,7 +1561,7 @@ static int shmem_writepage(struct page *page, struc= t writeback_control *wbc) > __GFP_HIGH | __GFP_NOMEMALLOC | __GFP_NOWARN, > NULL) =3D=3D 0) { > shmem_recalc_inode(inode, 0, nr_pages); > - swap_shmem_alloc(swap, nr_pages); > + swap_duplicate_nr(swap, nr_pages); > shmem_delete_from_page_cache(folio, swp_to_radix_entry(sw= ap)); > > mutex_unlock(&shmem_swaplist_mutex); > diff --git a/mm/swapfile.c b/mm/swapfile.c > index 0cded32414a1..9bb94e618914 100644 > --- a/mm/swapfile.c > +++ b/mm/swapfile.c > @@ -1381,12 +1381,6 @@ static unsigned char __swap_entry_free_locked(stru= ct swap_info_struct *si, > if (usage =3D=3D SWAP_HAS_CACHE) { > VM_BUG_ON(!has_cache); > has_cache =3D 0; > - } else if (count =3D=3D SWAP_MAP_SHMEM) { > - /* > - * Or we could insist on shmem.c using a special > - * swap_shmem_free() and free_shmem_swap_and_cache()... > - */ > - count =3D 0; > } else if ((count & ~COUNT_CONTINUED) <=3D SWAP_MAP_MAX) { > if (count =3D=3D COUNT_CONTINUED) { > if (swap_count_continued(si, offset, count)) > @@ -3626,7 +3620,6 @@ static int __swap_duplicate(swp_entry_t entry, unsi= gned char usage, int nr) > > offset =3D swp_offset(entry); > VM_WARN_ON(nr > SWAPFILE_CLUSTER - offset % SWAPFILE_CLUSTER); > - VM_WARN_ON(usage =3D=3D 1 && nr > 1); > ci =3D lock_cluster_or_swap_info(si, offset); > > err =3D 0; > @@ -3652,6 +3645,13 @@ static int __swap_duplicate(swp_entry_t entry, uns= igned char usage, int nr) > err =3D -EEXIST; > } else if ((count & ~COUNT_CONTINUED) > SWAP_MAP_MAX) { > err =3D -EINVAL; > + } else { > + /* > + * The only swap_duplicate_nr() caller that passe= s nr > 1 is shmem, > + * who never re-duplicates any swap entry it owns= . So this should nit: I think "which" is the right word here, but I am not a native speaker = :) > + * not happen. > + */ > + VM_WARN_ON(nr > 1 && (count & ~COUNT_CONTINUED) = =3D=3D SWAP_MAP_MAX); Why not return an error in this case? I think we should add recovery for bugs when it's possible and simple, which I believe is the case here. In shmem_writepage() we can add a WARN if swap_duplicate_nr() fails, or propagate an error to the caller as well (perhaps this belongs in a separate patch that does this for swap_shmem_alloc() first). Sorry if I am being paranoid here, please let me know if this is the case. > } > > if (err) > @@ -3686,27 +3686,28 @@ static int __swap_duplicate(swp_entry_t entry, un= signed char usage, int nr) > return err; > } > > -/* > - * Help swapoff by noting that swap entry belongs to shmem/tmpfs > - * (in which case its reference count is never incremented). > - */ > -void swap_shmem_alloc(swp_entry_t entry, int nr) > -{ > - __swap_duplicate(entry, SWAP_MAP_SHMEM, nr); > -} > - > -/* > - * Increase reference count of swap entry by 1. > +/** > + * swap_duplicate_nr() - Increase reference count of nr contiguous swap = entries > + * by 1. Can we avoid the line break by using "refcount" instead of "reference count= "? > + * > + * @entry: first swap entry from which we want to increase the refcount. > + * @nr: Number of entries in range. > + * > * Returns 0 for success, or -ENOMEM if a swap_count_continuation is req= uired > * but could not be atomically allocated. Returns 0, just as if it succ= eeded, > * if __swap_duplicate() fails for another reason (-EINVAL or -ENOENT), = which > * might occur if a page table entry has got corrupted. > + * > + * Note that we are currently not handling the case where nr > 1 and we = need to > + * add swap count continuation. This is OK, because no such user exists = - shmem > + * is the only user that can pass nr > 1, and it never re-duplicates any= swap > + * entry it owns. Do we need this comment when we have the WARN + comment in __swap_duplicate= ()? > */ > -int swap_duplicate(swp_entry_t entry) > +int swap_duplicate_nr(swp_entry_t entry, int nr) > { > int err =3D 0; > > - while (!err && __swap_duplicate(entry, 1, 1) =3D=3D -ENOMEM) > + while (!err && __swap_duplicate(entry, 1, nr) =3D=3D -ENOMEM) > err =3D add_swap_count_continuation(entry, GFP_ATOMIC); > return err; > } > -- > 2.43.5