From mboxrd@z Thu Jan 1 00:00:00 1970
Date: Fri, 18 Jul 2025 17:51:16 -0700 (PDT)
From: Hugh Dickins <hughd@google.com>
To: Baolin Wang
cc: Hugh Dickins, Andrew Morton, Baoquan He, Barry Song <21cnbao@gmail.com>,
    Chris Li, David Rientjes, Kairui Song, Kemeng Shi, Shakeel Butt,
    linux-kernel@vger.kernel.org, linux-mm@kvack.org
Subject: Re: [PATCH mm-new 2/2] mm/shmem: writeout free swap if swap_writeout() reactivates
In-Reply-To: <853a5211-cdab-4bdf-b0c4-8092dd943ff5@linux.alibaba.com>
Message-ID:
References: <87beaec6-a3b0-ce7a-c892-1e1e5bd57aa3@google.com>
 <5c911f7a-af7a-5029-1dd4-2e00b66d565c@google.com>
 <853a5211-cdab-4bdf-b0c4-8092dd943ff5@linux.alibaba.com>
MIME-Version: 1.0
Content-Type: text/plain; charset=US-ASCII
On Thu, 17 Jul 2025, Baolin Wang wrote:

> Hi Hugh,
>
> On 2025/7/16 16:08, Hugh Dickins wrote:
> > If swap_writeout() returns AOP_WRITEPAGE_ACTIVATE (for example, because
> > zswap cannot compress and memcg disables writeback), there is no virtue
> > in keeping that folio in swap cache and holding the swap allocation:
> > shmem_writeout() switches it back to shmem page cache before returning.
> >
> > Folio lock is held, and folio->memcg_data remains set throughout, so
> > there is no need to get into any memcg or memsw charge complications:
> > swap_free_nr() and delete_from_swap_cache() do as much as is needed (but
> > beware the race with shmem_free_swap() when inode truncated or evicted).
> >
> > Doing the same for an anonymous folio is harder, since it will usually
> > have been unmapped, with references to the swap left in the page tables.
> > Adding a function to remap the folio would be fun, but not worthwhile
> > unless it has other uses, or an urgent bug with anon is demonstrated.
> >
> > Signed-off-by: Hugh Dickins <hughd@google.com>
> > ---
> >  mm/shmem.c | 33 ++++++++++++++++++++++++++++++++-
> >  1 file changed, 32 insertions(+), 1 deletion(-)
> >
> > diff --git a/mm/shmem.c b/mm/shmem.c
> > index 33675361031b..5a7ce4c8bad6 100644
> > --- a/mm/shmem.c
> > +++ b/mm/shmem.c
> > @@ -1655,6 +1655,7 @@ int shmem_writeout(struct folio *folio, struct swap_iocb **plug,
> >  
> >  	if (!folio_alloc_swap(folio, __GFP_HIGH | __GFP_NOMEMALLOC | __GFP_NOWARN)) {
> >  		bool first_swapped = shmem_recalc_inode(inode, 0, nr_pages);
> > +		int error;
> >  
> >  		/*
> >  		 * Add inode to shmem_unuse()'s list of swapped-out inodes,
> > @@ -1675,7 +1676,37 @@ int shmem_writeout(struct folio *folio, struct swap_iocb **plug,
> >  		shmem_delete_from_page_cache(folio, swp_to_radix_entry(folio->swap));
> >  
> >  		BUG_ON(folio_mapped(folio));
> > -		return swap_writeout(folio, plug);
> > +		error = swap_writeout(folio, plug);
> > +		if (error != AOP_WRITEPAGE_ACTIVATE) {
> > +			/* folio has been unlocked */
> > +			return error;
> > +		}
> > +
> > +		/*
> > +		 * The intention here is to avoid holding on to the swap when
> > +		 * zswap was unable to compress and unable to writeback; but
> > +		 * it will be appropriate if other reactivate cases are added.
> > +		 */
> > +		error = shmem_add_to_page_cache(folio, mapping, index,
> > +				swp_to_radix_entry(folio->swap),
> > +				__GFP_HIGH | __GFP_NOMEMALLOC | __GFP_NOWARN);
> > +		/* Swap entry might be erased by racing shmem_free_swap() */
> > +		if (!error) {
> > +			spin_lock(&info->lock);
> > +			info->swapped -= nr_pages;
> > +			spin_unlock(&info->lock);
>
> Using the helper 'shmem_recalc_inode(inode, 0, -nr_pages)' seems more
> readable?

Yes, that's better, thanks: I don't know if I'd say "more readable",
but it is much more in the spirit of shmem_recalc_inode(), bringing the
counts into balance sooner rather than later. I'll follow up with a
"fix" patch to Andrew.

> > +			swap_free_nr(folio->swap, nr_pages);
> > +		}
> > +
> > +		/*
> > +		 * The delete_from_swap_cache() below could be left for
> > +		 * shrink_folio_list()'s folio_free_swap() to dispose of;
> > +		 * but I'm a little nervous about letting this folio out of
> > +		 * shmem_writeout() in a hybrid half-tmpfs-half-swap state
> > +		 * e.g. folio_mapping(folio) might give an unexpected answer.
> > +		 */
> > +		delete_from_swap_cache(folio);
>
> IIUC, should the delete_from_swap_cache() also be moved into the
> 'if (!error)' branch? Since if shmem_free_swap() has freed the swap
> entry, it would also reclaim the swap cache, no?

No, but it was a good point to raise, and led into more research than
I had anticipated.

No: because shmem_free_swap() -> free_swap_and_cache_nr() ->
__try_to_reclaim_swap() has to return after doing nothing if its
folio_trylock() fails: it cannot do the delete_from_swap_cache() part
of the job, which we do here - on this AOP_WRITEPAGE_ACTIVATE path, we
hold the folio_lock throughout.

But it led into more research, because I wanted to point you to the
equivalent coding in shmem_swapin_folio(): but, to my initial alarm,
the equivalent is not there; it used to be. See 5.8 commit 14235ab36019
("mm: shmem: remove rare optimization when swapin races with hole
punching"). There (in the deleted lines) you can see the helpful
comment on this case, with its delete_from_swap_cache() when
shmem_add_to_page_cache() fails. But for memcg-charging reasons, 5.8
found it simpler to drop that, and just let shrink_page_list() clear up
the debris later.
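To spell out the folio_trylock() point, here is a minimal userspace
analogy (purely illustrative, not kernel code: a pthread mutex stands
in for the folio lock, a bool for the swap-cache entry, and all the
names here are made up):

/* Build with: cc -pthread trylock_demo.c */
#include <pthread.h>
#include <stdbool.h>
#include <stdio.h>

static pthread_mutex_t folio_lock = PTHREAD_MUTEX_INITIALIZER;
static bool in_swap_cache = true;	/* stands in for the swap-cache entry */

/* Plays the part of __try_to_reclaim_swap(): back off if trylock fails */
static void *reclaimer(void *arg)
{
	(void)arg;
	if (pthread_mutex_trylock(&folio_lock) != 0) {
		/* Lock held by the writeout path: return having done nothing */
		printf("reclaimer: trylock failed, swap cache entry left alone\n");
		return NULL;
	}
	in_swap_cache = false;		/* the delete_from_swap_cache() analogue */
	pthread_mutex_unlock(&folio_lock);
	return NULL;
}

int main(void)
{
	pthread_t t;

	/* The AOP_WRITEPAGE_ACTIVATE path holds the lock throughout */
	pthread_mutex_lock(&folio_lock);
	pthread_create(&t, NULL, reclaimer, NULL);
	pthread_join(t, NULL);

	/* Still in swap cache: only the lock holder can delete it */
	printf("writer: entry %s in swap cache\n",
	       in_swap_cache ? "still" : "no longer");
	pthread_mutex_unlock(&folio_lock);
	return 0;
}

Whichever side loses the trylock must leave the deletion to the lock
holder - which is exactly why the delete_from_swap_cache() has to stay
with shmem_writeout() on this path.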
Here in shmem_writeout(), holding folio_lock throughout, we have no
memcg complications, and can go ahead with delete_from_swap_cache(),
both when successfully added back to page cache, and when that fails.

Thanks for checking!
Hugh
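For concreteness, the "fix" patch Hugh promises above would presumably
amount to a hunk like this (a sketch of Baolin's suggestion only, not
the actual follow-up sent to Andrew):

--- a/mm/shmem.c
+++ b/mm/shmem.c
@@ ... @@ int shmem_writeout(struct folio *folio, struct swap_iocb **plug,
 		/* Swap entry might be erased by racing shmem_free_swap() */
 		if (!error) {
-			spin_lock(&info->lock);
-			info->swapped -= nr_pages;
-			spin_unlock(&info->lock);
+			/* Bring the swapped count back into balance now */
+			shmem_recalc_inode(inode, 0, -nr_pages);
 			swap_free_nr(folio->swap, nr_pages);
 		}

This uses shmem_recalc_inode() with the same signature as the call
already visible at the top of the patch, per the agreement reached in
the thread.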