From: Baolin Wang <baolin.wang@linux.alibaba.com>
Date: Sat, 19 Jul 2025 12:32:41 +0800
Subject: Re: [PATCH mm-new 2/2] mm/shmem: writeout free swap if
 swap_writeout() reactivates
To: Hugh Dickins
Cc: Andrew Morton, Baoquan He, Barry Song <21cnbao@gmail.com>, Chris Li,
 David Rientjes, Kairui Song, Kemeng Shi, Shakeel Butt,
 linux-kernel@vger.kernel.org, linux-mm@kvack.org

On 2025/7/19 08:51, Hugh Dickins wrote:
> On Thu, 17 Jul 2025, Baolin Wang wrote:
>
>> Hi Hugh,
>>
>> On 2025/7/16 16:08, Hugh Dickins wrote:
>>> If swap_writeout() returns AOP_WRITEPAGE_ACTIVATE (for example, because
>>> zswap cannot compress and memcg disables writeback), there is no virtue
>>> in keeping that folio in swap cache and holding the swap allocation:
>>> shmem_writeout() switches it back to shmem page cache before returning.
>>>
>>> Folio lock is held, and folio->memcg_data remains set throughout, so
>>> there is no need to get into any memcg or memsw charge complications:
>>> swap_free_nr() and delete_from_swap_cache() do as much as is needed (but
>>> beware the race with shmem_free_swap() when inode truncated or evicted).
>>>
>>> Doing the same for an anonymous folio is harder, since it will usually
>>> have been unmapped, with references to the swap left in the page tables.
>>> Adding a function to remap the folio would be fun, but not worthwhile
>>> unless it has other uses, or an urgent bug with anon is demonstrated.
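(For context on the trigger: swap_writeout() hands back
AOP_WRITEPAGE_ACTIVATE, with the folio still locked, when zswap cannot
compress the folio and the memcg has swap writeback disabled; every
other return path unlocks the folio first. A simplified sketch of that
shape, not the exact mm/page_io.c code:

	int swap_writeout(struct folio *folio, struct swap_iocb **swap_plug)
	{
		/* Compressed into zswap: no backing-device I/O needed */
		if (zswap_store(folio)) {
			folio_unlock(folio);
			return 0;
		}
		/*
		 * zswap could not store the folio, and this memcg forbids
		 * writeback to the swap device: nowhere to put it, so hand
		 * it back, still locked, for the caller to reactivate.
		 */
		if (!mem_cgroup_zswap_writeback_enabled(folio_memcg(folio))) {
			folio_mark_dirty(folio);
			return AOP_WRITEPAGE_ACTIVATE;
		}
		__swap_writepage(folio, swap_plug);	/* submits I/O, unlocks */
		return 0;
	}

That still-locked return is what the reactivation path added below
relies on.)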
>>>
>>> Signed-off-by: Hugh Dickins
>>> ---
>>>  mm/shmem.c | 33 ++++++++++++++++++++++++++++++++-
>>>  1 file changed, 32 insertions(+), 1 deletion(-)
>>>
>>> diff --git a/mm/shmem.c b/mm/shmem.c
>>> index 33675361031b..5a7ce4c8bad6 100644
>>> --- a/mm/shmem.c
>>> +++ b/mm/shmem.c
>>> @@ -1655,6 +1655,7 @@ int shmem_writeout(struct folio *folio, struct swap_iocb **plug,
>>>
>>>  	if (!folio_alloc_swap(folio, __GFP_HIGH | __GFP_NOMEMALLOC | __GFP_NOWARN)) {
>>>  		bool first_swapped = shmem_recalc_inode(inode, 0, nr_pages);
>>> +		int error;
>>>
>>>  		/*
>>>  		 * Add inode to shmem_unuse()'s list of swapped-out inodes,
>>> @@ -1675,7 +1676,37 @@ int shmem_writeout(struct folio *folio, struct swap_iocb **plug,
>>>  			shmem_delete_from_page_cache(folio, swp_to_radix_entry(folio->swap));
>>>
>>>  		BUG_ON(folio_mapped(folio));
>>> -		return swap_writeout(folio, plug);
>>> +		error = swap_writeout(folio, plug);
>>> +		if (error != AOP_WRITEPAGE_ACTIVATE) {
>>> +			/* folio has been unlocked */
>>> +			return error;
>>> +		}
>>> +
>>> +		/*
>>> +		 * The intention here is to avoid holding on to the swap when
>>> +		 * zswap was unable to compress and unable to writeback; but
>>> +		 * it will be appropriate if other reactivate cases are added.
>>> +		 */
>>> +		error = shmem_add_to_page_cache(folio, mapping, index,
>>> +				swp_to_radix_entry(folio->swap),
>>> +				__GFP_HIGH | __GFP_NOMEMALLOC | __GFP_NOWARN);
>>> +		/* Swap entry might be erased by racing shmem_free_swap() */
>>> +		if (!error) {
>>> +			spin_lock(&info->lock);
>>> +			info->swapped -= nr_pages;
>>> +			spin_unlock(&info->lock);
>>
>> Using the helper 'shmem_recalc_inode(inode, 0, -nr_pages)' seems more
>> readable?
>
> Yes, that's better, thanks: I don't know if I'd say "more readable",
> but it is much more in the spirit of shmem_recalc_inode(), bringing
> the counts into balance sooner rather than later.
>
> I'll follow up with a "fix" patch to Andrew.
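Presumably the fixup will look something like this (just a sketch of
the direction, not the actual patch):

 		/* Swap entry might be erased by racing shmem_free_swap() */
 		if (!error) {
-			spin_lock(&info->lock);
-			info->swapped -= nr_pages;
-			spin_unlock(&info->lock);
+			shmem_recalc_inode(inode, 0, -nr_pages);
 			swap_free_nr(folio->swap, nr_pages);
 		}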
>>> +			swap_free_nr(folio->swap, nr_pages);
>>> +		}
>>> +
>>> +		/*
>>> +		 * The delete_from_swap_cache() below could be left for
>>> +		 * shrink_folio_list()'s folio_free_swap() to dispose of;
>>> +		 * but I'm a little nervous about letting this folio out of
>>> +		 * shmem_writeout() in a hybrid half-tmpfs-half-swap state
>>> +		 * e.g. folio_mapping(folio) might give an unexpected answer.
>>> +		 */
>>> +		delete_from_swap_cache(folio);
>>
>> IIUC, should the delete_from_swap_cache() also be moved into the
>> 'if (!error)' branch? Since if shmem_free_swap() has freed the swap
>> entry, it would also reclaim the swap cache, no?
>
> No, but it was a good point to raise, and led into more research than
> I had anticipated.
>
> No: because shmem_free_swap->free_swap_and_cache_nr->__try_to_reclaim_swap
> has to return after doing nothing if its folio_trylock fails: it cannot do
> the delete_from_swap_cache() part of the job, which we do here - on this
> AOP_WRITEPAGE_ACTIVATE path, we hold the folio_lock throughout.

I missed the 'folio_trylock'; yes, you are right. Thanks for the
explanation.

> But it led into more research, because I wanted to point you to the
> equivalent coding in shmem_swapin_folio(): but, to my initial alarm,
> the equivalent is not there; but used to be.
>
> See 5.8 commit 14235ab36019 ("mm: shmem: remove rare optimization when
> swapin races with hole punching"). There (in the deleted lines) you can
> see the helpful comment on this case, with its delete_from_swap_cache()
> when shmem_add_to_page_cache() fails. But for memcg-charging reasons,
> 5.8 found it simpler to drop that, and just let shrink_page_list()
> clear up the debris later.
>
> Here in shmem_writeout(), holding folio_lock throughout, we have no
> memcg complications, and can go ahead with delete_from_swap_cache(),
> both when successfully added back to page cache, and when that fails.

OK. Thanks for pointing out the change history here, and I have no
further questions. With your follow-up fix, feel free to add:

Reviewed-by: Baolin Wang <baolin.wang@linux.alibaba.com>