From: Zhongkun He
Date: Mon, 25 Mar 2024 11:01:52 +0800
Subject: Re: [External] [PATCH] mm: zswap: fix data loss on SWP_SYNCHRONOUS_IO devices
To: Johannes Weiner
Cc: Andrew Morton, Chengming Zhou, Yosry Ahmed, Barry Song <21cnbao@gmail.com>, Chris Li, Nhat Pham, linux-mm@kvack.org, linux-kernel@vger.kernel.org
In-Reply-To: <20240324210447.956973-1-hannes@cmpxchg.org>
References: <20240324210447.956973-1-hannes@cmpxchg.org>

On Mon, Mar 25, 2024 at 5:05 AM Johannes Weiner wrote:
>
> Zhongkun He reports data corruption when combining zswap with zram.
>
> The issue is the exclusive loads we're doing in zswap. They assume
> that all reads are going into the swapcache, which can assume
> authoritative ownership of the data and so the zswap copy can go.
>
> However, zram files are marked SWP_SYNCHRONOUS_IO, and faults will try
> to bypass the swapcache. This results in an optimistic read of the
> swap data into a page that will be dismissed if the fault fails due to
> races. In this case, zswap mustn't drop its authoritative copy.
>
> Link: https://lore.kernel.org/all/CACSyD1N+dUvsu8=zV9P691B9bVq33erwOXNTmEaUbi9DrDeJzw@mail.gmail.com/
> Reported-by: Zhongkun He
> Fixes: b9c91c43412f ("mm: zswap: support exclusive loads")
> Cc: stable@vger.kernel.org [6.5+]
> Signed-off-by: Johannes Weiner
> Tested-by: Zhongkun He
> ---
>  mm/zswap.c | 23 +++++++++++++++++++----
>  1 file changed, 19 insertions(+), 4 deletions(-)
>
> diff --git a/mm/zswap.c b/mm/zswap.c
> index 535c907345e0..41a1170f7cfe 100644
> --- a/mm/zswap.c
> +++ b/mm/zswap.c
> @@ -1622,6 +1622,7 @@ bool zswap_load(struct folio *folio)
>  	swp_entry_t swp = folio->swap;
>  	pgoff_t offset = swp_offset(swp);
>  	struct page *page = &folio->page;
> +	bool swapcache = folio_test_swapcache(folio);
>  	struct zswap_tree *tree = swap_zswap_tree(swp);
>  	struct zswap_entry *entry;
>  	u8 *dst;
> @@ -1634,7 +1635,20 @@ bool zswap_load(struct folio *folio)
>  		spin_unlock(&tree->lock);
>  		return false;
>  	}
> -	zswap_rb_erase(&tree->rbroot, entry);
> +	/*
> +	 * When reading into the swapcache, invalidate our entry. The
> +	 * swapcache can be the authoritative owner of the page and
> +	 * its mappings, and the pressure that results from having two
> +	 * in-memory copies outweighs any benefits of caching the
> +	 * compression work.
> +	 *
> +	 * (Most swapins go through the swapcache. The notable
> +	 * exception is the singleton fault on SWP_SYNCHRONOUS_IO
> +	 * files, which reads into a private page and may free it if
> +	 * the fault fails. We remain the primary owner of the entry.)
> +	 */
> +	if (swapcache)
> +		zswap_rb_erase(&tree->rbroot, entry);
>  	spin_unlock(&tree->lock);
>
>  	if (entry->length)
> @@ -1649,9 +1663,10 @@ bool zswap_load(struct folio *folio)
>  	if (entry->objcg)
>  		count_objcg_event(entry->objcg, ZSWPIN);
>
> -	zswap_entry_free(entry);
> -
> -	folio_mark_dirty(folio);
> +	if (swapcache) {
> +		zswap_entry_free(entry);
> +		folio_mark_dirty(folio);
> +	}
>
>  	return true;
>  }
> --
> 2.44.0
>

Good solution and makes great sense to me.
Thanks.
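
For anyone reading the thread later, the ownership rule in the patch can be modelled in user space. This is only a rough sketch and not kernel code: toy_tree, toy_store() and toy_load() are hypothetical names made up for illustration, and a fixed array stands in for the zswap tree. It shows the one behaviour that matters here: a load that bypasses the swapcache leaves the entry in place, so a retried fault can still find the data.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

#define TOY_SLOTS 8	/* hypothetical number of swap offsets */
#define TOY_PAGE  16	/* hypothetical "page" size in bytes */

/* Hypothetical stand-in for a zswap entry: one stored page per offset. */
struct toy_entry {
	bool present;
	char data[TOY_PAGE];
};

static struct toy_entry toy_tree[TOY_SLOTS];

/* Model of swap-out: keep a copy of the page at the given offset. */
void toy_store(unsigned int offset, const char *src)
{
	assert(offset < TOY_SLOTS);
	memcpy(toy_tree[offset].data, src, TOY_PAGE);
	toy_tree[offset].present = true;
}

/*
 * Model of swap-in. The entry is dropped only when the destination
 * page is in the swapcache, which then owns the data. A private page
 * (swapcache == false) may be thrown away by the caller, so the entry
 * has to stay.
 */
bool toy_load(unsigned int offset, char *dst, bool swapcache)
{
	assert(offset < TOY_SLOTS);
	if (!toy_tree[offset].present)
		return false;
	memcpy(dst, toy_tree[offset].data, TOY_PAGE);
	if (swapcache)
		toy_tree[offset].present = false;
	return true;
}
```

Calling toy_load() twice with swapcache set to false succeeds both times, which corresponds to a failed fault followed by a retry. With the old unconditional erase, the second call would have found nothing.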