From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by smtp.lore.kernel.org (Postfix) with ESMTP id ED8EAC54E58 for ; Mon, 25 Mar 2024 04:54:22 +0000 (UTC) Received: by kanga.kvack.org (Postfix) id 5823F6B0087; Mon, 25 Mar 2024 00:54:22 -0400 (EDT) Received: by kanga.kvack.org (Postfix, from userid 40) id 5038E6B0089; Mon, 25 Mar 2024 00:54:22 -0400 (EDT) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id 3CA8F6B008A; Mon, 25 Mar 2024 00:54:22 -0400 (EDT) X-Delivered-To: linux-mm@kvack.org Received: from relay.hostedemail.com (smtprelay0016.hostedemail.com [216.40.44.16]) by kanga.kvack.org (Postfix) with ESMTP id 2D7E16B0087 for ; Mon, 25 Mar 2024 00:54:22 -0400 (EDT) Received: from smtpin09.hostedemail.com (a10.router.float.18 [10.200.18.1]) by unirelay08.hostedemail.com (Postfix) with ESMTP id DB2581407C6 for ; Mon, 25 Mar 2024 04:54:21 +0000 (UTC) X-FDA: 81934345122.09.5422B53 Received: from mail-ua1-f42.google.com (mail-ua1-f42.google.com [209.85.222.42]) by imf22.hostedemail.com (Postfix) with ESMTP id 259F8C0007 for ; Mon, 25 Mar 2024 04:54:19 +0000 (UTC) Authentication-Results: imf22.hostedemail.com; dkim=pass header.d=gmail.com header.s=20230601 header.b=TrxybGHE; dmarc=pass (policy=none) header.from=gmail.com; spf=pass (imf22.hostedemail.com: domain of 21cnbao@gmail.com designates 209.85.222.42 as permitted sender) smtp.mailfrom=21cnbao@gmail.com ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=hostedemail.com; s=arc-20220608; t=1711342460; h=from:from:sender:reply-to:subject:subject:date:date: message-id:message-id:to:to:cc:cc:mime-version:mime-version: content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references:dkim-signature; bh=3flMnA28xnVDE8zVj79YDHLjc6D0x73P2ccGoK196hA=; b=QAcqARqN3OGNVU33awIG0urUy+2jkfWgKL/oQar0tlgjY5v7swFsjiRh+cxGzjEYOSPeag 6wzchA2PdQ49u1YKVp7Or+QkVjWFwGhrv2lURvvxbqNMZtYVlt2I8Stw6BC3c0bQiQsjtc O2QSIBnUr6hRsET7R39NmTma8SCy36A= ARC-Authentication-Results: i=1; imf22.hostedemail.com; dkim=pass header.d=gmail.com header.s=20230601 header.b=TrxybGHE; dmarc=pass (policy=none) header.from=gmail.com; spf=pass (imf22.hostedemail.com: domain of 21cnbao@gmail.com designates 209.85.222.42 as permitted sender) smtp.mailfrom=21cnbao@gmail.com ARC-Seal: i=1; s=arc-20220608; d=hostedemail.com; t=1711342460; a=rsa-sha256; cv=none; b=jTWgVVfFDnhTs1X00oywqsHfHzeLfxWRRUuRfo0PyRt62lwJhNfYrP1yTaapmANddHWO3M 9atGMHZJLvoxYwA36uQVICFhWWKd1ic0E5Sf4UUnRYN+rM4imar8WBcof/7Z4VplSgfujA V1dWC/XLRYYPsUwaBrlzGlds9lTnIAM= Received: by mail-ua1-f42.google.com with SMTP id a1e0cc1a2514c-7e05b1ef941so1415038241.3 for ; Sun, 24 Mar 2024 21:54:19 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20230601; t=1711342459; x=1711947259; darn=kvack.org; h=content-transfer-encoding:cc:to:subject:message-id:date:from :in-reply-to:references:mime-version:from:to:cc:subject:date :message-id:reply-to; bh=3flMnA28xnVDE8zVj79YDHLjc6D0x73P2ccGoK196hA=; b=TrxybGHENJjz4G2em7jd2vyMSC4N0WwPrMaoOEGK0Ljo33Y6h1s6KheKzUsXIYz18b 2YvGHij+TQQ1tin6To2qbzphrYxTk+GpF5wd/qogKPUf/gOWTyq0Tj75pt3aqkMRhedl KgJX2us2tpouO6aw5TOTAG3H9GbMDS15Pb3pJYuA0hqZNyw/VPlRZjEOw6LKatMQA2yk uQh2LBK+C3Cudo3uj483jR5lsSx/IiM1ZN0HibbaX++ikO5/joJwwi8Pj0HrLb3EY6B5 HmAV0yMIQZs0J+m4PRTEkRDLqyeqIsBckQcObU+m+vrOggbZlzXuP5p2DwMk05ROh8C2 2Etw== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1711342459; x=1711947259; h=content-transfer-encoding:cc:to:subject:message-id:date:from :in-reply-to:references:mime-version:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=3flMnA28xnVDE8zVj79YDHLjc6D0x73P2ccGoK196hA=; b=N/+0Fk6cCl8vJ3Srkd5s/vOh4WSBti+lI1ospRe7juKy8gzm0MV3JtsZ1VdbPtwZiH ecxKTw1zowmYrsP24laSKDTiD93OX7rMDX8M4Fk0ezIz4t3wbf51EVX43RfSQIgTQnxO 9yQnEPcPOP46JVe+1mGHZbmn/TrjHQl6yCGqPUTtoYxaOoXBsP2B9GUPNHayjYiu8ybm lLkvpFFtwTig1yGCX5AFXg/wXnp+i/evXpX9/02zRzki91HEKA2jUgNHpZCoSyBjDI4Q XwMYa1G6dl+CCfThOoTzcgmi3sDnLrxCTDaJBiJOFeOdGjlhqe+SgZE00Xdrf1Bi/DhD GVDw== X-Forwarded-Encrypted: i=1; AJvYcCWr7Z36qPDjJKgt2N/oupuDDCSKH4rOw02h0vYfEO/Hxx/XINjnMVwG8drjIXem1MiTSzAMQ7ctThCO/pL8X7WcVRs= X-Gm-Message-State: AOJu0Yx9M347pLqtcP0LfSqUUos5Xqbz3H9c7tRbcul9P/gpFrZQx5YI o0gHyOExZzChw6g1KdRECfW+Ax12LtL/MsluOZr2aBpusT7IhVGUBW8yUxC7pLW/QwYtsr8LL0K 26SXgHaYOO/yBzzADFJDVcy7OMaU= X-Google-Smtp-Source: AGHT+IEvq/NXn7dV2CoT+ELfsUm6EQA0YC7HkzwkQKfanBNR/yZjNX1onExkSyn1ahh09EqwKPvLef5Atf/363j39Ws= X-Received: by 2002:a05:6102:32c6:b0:477:e55d:668f with SMTP id o6-20020a05610232c600b00477e55d668fmr202722vss.23.1711342459098; Sun, 24 Mar 2024 21:54:19 -0700 (PDT) MIME-Version: 1.0 References: <20240324210447.956973-1-hannes@cmpxchg.org> In-Reply-To: From: Barry Song <21cnbao@gmail.com> Date: Mon, 25 Mar 2024 17:54:07 +1300 Message-ID: Subject: Re: [PATCH] mm: zswap: fix data loss on SWP_SYNCHRONOUS_IO devices To: Yosry Ahmed Cc: Johannes Weiner , Andrew Morton , Zhongkun He , Chengming Zhou , Chris Li , Nhat Pham , linux-mm@kvack.org, linux-kernel@vger.kernel.org Content-Type: text/plain; charset="UTF-8" Content-Transfer-Encoding: quoted-printable X-Rspamd-Queue-Id: 259F8C0007 X-Rspam-User: X-Rspamd-Server: rspam05 X-Stat-Signature: i4q8f8cnedtqe76yzgh4adebgtimucw3 X-HE-Tag: 1711342459-161283 X-HE-Meta: U2FsdGVkX1/Sxv25OgI0iRMswHRnGiDLJt1qBejkjDkwpqV0AQ4gIQYtLoBTzjDx/yCYUZpPa5SRy8z2tUqLH90KK/bVotHBIkjddh79Fl57Dr88C20HxHvOYhPBqUkI7Etxic48Jf98rySCfc1uZSytPPeoQnugnOCoXjcJfCapPipZtQAsh7Z0uOdZ9n+0VW0GokmK2kJFspslFrLiiuy6+1dR/h4anrw7F5kUNZ7XnJ8iiBarpOZrIa9lN9PpT0AQ5d4D4l3nUYr5rFRLPhjEMCq5jVdZMfwrFWpi664x0qIA/opjqkXwPHMfy+UWuCYl6wrHpDzgP3Xvj/qyNlr3d6gv1IUsbBGZEdJqvQ8aQvR2+bGv5RuSbeEA3P1eL8j+HOroS9zVnzmQ7ie+aPHLy+PmS9zS0Anheb5AGpYfWP6zcOkXPeQtrVx8nAwAnaXgz4dQ1W756dMwsIRR+1Q9KJzHYWMV744g36KEco1uTglr0vmxJLUWpTAHf8Heyv1+G56n6Cq/kl9OC1wcNhrVdLk629ghepGTCio+1vdI++AQDSkKVwp+g9s4BkC/lPtIOm5C7ThJEBabFSk+OGJ1DTGaM12flzPdhjZalyc/H5T5QXzftlWYRsekosJ25Eu61TRwTPEQH5/RXpDyrBIGU4SDwu6UTgzanRfgV0VmvzYYT+MY9dYxWTp5h+DrW9soay+4zdJy8N+3MohPgUIVw3YSj0PvOXStHKqp0nUoqFitAO5EJ56/fecw1+mC94xU3fwiIw6jFzmfs3EGK235H4opCeHueE3PhYCs2bB1KUNKruoPSx9BXTO7OfB/d3Cbb53G4eIz2fSFZgydpTT1xQJrcXNnW9kS2f1TFOa1y5mV0C4IYbCAKe4qDM3b6Aa9pMgWjOC8/LHuPSLBG/qLc25xqePnHUm6PcBvpB6im3uYBxaMSuEhrYYjuHBOrvpIBaymlDHLo8jJ5Os b2E77Lkh cqUzG5g5Q6wc55D5VjhRsF5mxkLQj9ZgnnVyJY2CiIN8RxAO/4KkWtGNGwfqlAqBEL9SqemjSAOzH3zFNaOc51b7KzObb3+jlxRrMe51uYuQ2XOKm3jSAU4H6wX9dBr1I6lxWJdW0efAy33+amKFvF/Lyb7bzTKVSVf0MtvYjTlhNs7HoRmLIRQZA5NudDlVpgvIc6JwgqJz+LxQNCikSxW57RwmjOmPJZ80UxkZjGXT2M4B+dmYc1uwXQuNGw0Sks1k5SkbBeZ4O3vp2Cl1CwIEIJNuGpEhTWzC+rPj3RJbCS0Cg8enS++p2N7Ii4Thvi2o/v4+divpa9op/8RplDH2Ar7Wt385/zB0UZyX7IBGx7cNPxhlaA/QzShSKk5YlYVaB3BXOBOVS5rM/s13lLdEHwSTBqatXcQEwTNWsQ/Yw5OWz+qsc2+Xy2Q1dUuYFOGbAdmyiY+8ASG/fSDKWbohuL1w0Jgq4c/8SQvg68ThGqJQqmdP9Xtn3yPtvn9lmv022r8enpxiNlZUlBLVbpI2IKQX5cBKzEUr1cP+tapRTPLjqwNtE5N0HFIiGw70Qp+xB41rW6Wdv+glT08MAGoNR95JP+JzY7M3lBQdpExgjNV31lcFKhbKzSA== X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: List-Subscribe: List-Unsubscribe: On Mon, Mar 25, 2024 at 10:23=E2=80=AFAM Yosry Ahmed wrote: > > On Sun, Mar 24, 2024 at 2:04=E2=80=AFPM Johannes Weiner wrote: > > > > Zhongkun He reports data corruption when combining zswap with zram. > > > > The issue is the exclusive loads we're doing in zswap. They assume > > that all reads are going into the swapcache, which can assume > > authoritative ownership of the data and so the zswap copy can go. > > > > However, zram files are marked SWP_SYNCHRONOUS_IO, and faults will try > > to bypass the swapcache. This results in an optimistic read of the > > swap data into a page that will be dismissed if the fault fails due to > > races. In this case, zswap mustn't drop its authoritative copy. > > > > Link: https://lore.kernel.org/all/CACSyD1N+dUvsu8=3DzV9P691B9bVq33erwOX= NTmEaUbi9DrDeJzw@mail.gmail.com/ > > Reported-by: Zhongkun He > > Fixes: b9c91c43412f ("mm: zswap: support exclusive loads") > > Cc: stable@vger.kernel.org [6.5+] > > Signed-off-by: Johannes Weiner > > Tested-by: Zhongkun He Acked-by: Barry Song > > Do we also want to mention somewhere (commit log or comment) that > keeping the entry in the tree is fine because we are still protected > from concurrent loads/invalidations/writeback by swapcache_prepare() > setting SWAP_HAS_CACHE or so? It seems that Kairui's patch comprehensively addresses the issue at hand. Johannes's solution, on the other hand, appears to align zswap behavior more closely with that of a traditional swap device, only releasing an entr= y when the corresponding swap slot is freed, particularly in the sync-io case= . Johannes' patch has inspired me to consider whether zRAM could achieve a comparable outcome by immediately releasing objects in swap cache scenarios. When I have the opportunity, I plan to experiment with zRAM. > > Anyway, this LGTM. > > Acked-by: Yosry Ahmed > > > --- > > mm/zswap.c | 23 +++++++++++++++++++---- > > 1 file changed, 19 insertions(+), 4 deletions(-) > > > > diff --git a/mm/zswap.c b/mm/zswap.c > > index 535c907345e0..41a1170f7cfe 100644 > > --- a/mm/zswap.c > > +++ b/mm/zswap.c > > @@ -1622,6 +1622,7 @@ bool zswap_load(struct folio *folio) > > swp_entry_t swp =3D folio->swap; > > pgoff_t offset =3D swp_offset(swp); > > struct page *page =3D &folio->page; > > + bool swapcache =3D folio_test_swapcache(folio); > > struct zswap_tree *tree =3D swap_zswap_tree(swp); > > struct zswap_entry *entry; > > u8 *dst; > > @@ -1634,7 +1635,20 @@ bool zswap_load(struct folio *folio) > > spin_unlock(&tree->lock); > > return false; > > } > > - zswap_rb_erase(&tree->rbroot, entry); > > + /* > > + * When reading into the swapcache, invalidate our entry. The > > + * swapcache can be the authoritative owner of the page and > > + * its mappings, and the pressure that results from having two > > + * in-memory copies outweighs any benefits of caching the > > + * compression work. > > + * > > + * (Most swapins go through the swapcache. The notable > > + * exception is the singleton fault on SWP_SYNCHRONOUS_IO > > + * files, which reads into a private page and may free it if > > + * the fault fails. We remain the primary owner of the entry.) > > + */ > > + if (swapcache) > > + zswap_rb_erase(&tree->rbroot, entry); > > spin_unlock(&tree->lock); > > > > if (entry->length) > > @@ -1649,9 +1663,10 @@ bool zswap_load(struct folio *folio) > > if (entry->objcg) > > count_objcg_event(entry->objcg, ZSWPIN); > > > > - zswap_entry_free(entry); > > - > > - folio_mark_dirty(folio); > > + if (swapcache) { > > + zswap_entry_free(entry); > > + folio_mark_dirty(folio); > > + } > > > > return true; > > } > > -- > > 2.44.0 Thanks Barry