From: Barry Song <21cnbao@gmail.com>
Date: Sat, 8 Jun 2024 09:20:07 +1200
Subject: Re: [PATCH] mm: zswap: add VM_BUG_ON() if large folio swapin is attempted
To: Yosry Ahmed
Cc: David Hildenbrand, Andrew Morton, Johannes Weiner, Nhat Pham,
	Chengming Zhou, Baolin Wang, Chris Li, Ryan Roberts, Matthew Wilcox,
	linux-mm@kvack.org, linux-kernel@vger.kernel.org
References: <20240606184818.1566920-1-yosryahmed@google.com>
	<84d78362-e75c-40c8-b6c2-56d5d5292aa7@redhat.com>
	<7507d075-9f4d-4a9b-836c-1fbb2fbd2257@redhat.com>
	<9374758d-9f81-4e4f-8405-1f972234173e@redhat.com>

On Sat, Jun 8, 2024 at 5:43 AM Yosry Ahmed wrote:
>
> On Fri, Jun 7, 2024 at 12:11 AM David Hildenbrand wrote:
> >
> > On 06.06.24 23:53, Barry Song wrote:
> > > On Fri, Jun 7, 2024 at 9:37 AM Yosry Ahmed wrote:
> > >>
> > >> On Thu, Jun 6, 2024 at 2:30 PM Barry Song <21cnbao@gmail.com> wrote:
> > >>>
> > >>> On Fri, Jun 7, 2024 at 9:17 AM David Hildenbrand wrote:
> > >>>>
> > >>>> On 06.06.24 22:31, Yosry Ahmed wrote:
> > >>>>> On Thu, Jun 6, 2024 at 1:22 PM David Hildenbrand wrote:
> > >>>>>>
> > >>>>>> On 06.06.24 20:48, Yosry Ahmed wrote:
> > >>>>>>> With ongoing work to support large folio swapin, it is important to make
> > >>>>>>> sure we do not pass large folios to zswap_load() without implementing
> > >>>>>>> proper support.
> > >>>>>>>
> > >>>>>>> For example, if a swapin fault observes that contiguous PTEs are
> > >>>>>>> pointing to contiguous swap entries and tries to swap them in as a large
> > >>>>>>> folio, swap_read_folio() will pass in a large folio to zswap_load(), but
> > >>>>>>> zswap_load() will only effectively load the first page in the folio. If
> > >>>>>>> the first page is not in zswap, the folio will be read from disk, even
> > >>>>>>> though other pages may be in zswap.
> > >>>>>>>
> > >>>>>>> In both cases, this will lead to silent data corruption.
> > >>>>>>>
> > >>>>>>> Proper large folio swapin support needs to go into zswap before zswap
> > >>>>>>> can be enabled in a system that supports large folio swapin.
> > >>>>>>>
> > >>>>>>> Looking at callers of swap_read_folio(), it seems like they are either
> > >>>>>>> allocated from __read_swap_cache_async() or do_swap_page() in the
> > >>>>>>> SWP_SYNCHRONOUS_IO path. Both of which allocate order-0 folios, so we
> > >>>>>>> are fine for now.
> > >>>>>>>
> > >>>>>>> Add a VM_BUG_ON() in zswap_load() to make sure that we detect changes in
> > >>>>>>> the order of those allocations without proper handling of zswap.
> > >>>>>>>
> > >>>>>>> Alternatively, swap_read_folio() (or its callers) can be updated to have
> > >>>>>>> a fallback mechanism that splits large folios or reads subpages
> > >>>>>>> separately. Similar logic may be needed anyway in case part of a large
> > >>>>>>> folio is already in the swapcache and the rest of it is swapped out.
> > >>>>>>>
> > >>>>>>> Signed-off-by: Yosry Ahmed
> > >>>>>>> ---
> > >>>>>>>
> > >>>>>>> Sorry for the long CC list, I just found myself repeatedly looking at
> > >>>>>>> new series that add swap support for mTHPs / large folios, making sure
> > >>>>>>> they do not break with zswap or make incorrect assumptions. This debug
> > >>>>>>> check should give us some peace of mind. Hopefully this patch will also
> > >>>>>>> raise awareness among people who are working on this.
> > >>>>>>>
> > >>>>>>> ---
> > >>>>>>>  mm/zswap.c | 3 +++
> > >>>>>>>  1 file changed, 3 insertions(+)
> > >>>>>>>
> > >>>>>>> diff --git a/mm/zswap.c b/mm/zswap.c
> > >>>>>>> index b9b35ef86d9be..6007252429bb2 100644
> > >>>>>>> --- a/mm/zswap.c
> > >>>>>>> +++ b/mm/zswap.c
> > >>>>>>> @@ -1577,6 +1577,9 @@ bool zswap_load(struct folio *folio)
> > >>>>>>>        if (!entry)
> > >>>>>>>                return false;
> > >>>>>>>
> > >>>>>>> +      /* Zswap loads do not handle large folio swapins correctly yet */
> > >>>>>>> +      VM_BUG_ON(folio_test_large(folio));
> > >>>>>>> +
> > >>>>>>
> > >>>>>> There is no way we could have a WARN_ON_ONCE() and recover, right?
> > >>>>>
> > >>>>> Not without making more fundamental changes to the surrounding swap
> > >>>>> code. Currently zswap_load() returns either true (folio was loaded
> > >>>>> from zswap) or false (folio is not in zswap).
> > >>>>>
> > >>>>> To handle this correctly zswap_load() would need to tell
> > >>>>> swap_read_folio() which subpages are in zswap and have been loaded,
> > >>>>> and then swap_read_folio() would need to read the remaining subpages
> > >>>>> from disk. This of course assumes that the caller of swap_read_folio()
> > >>>>> made sure that the entire folio is swapped out and protected against
> > >>>>> races with other swapins.
> > >>>>>
> > >>>>> Also, because swap_read_folio() cannot split the folio itself, other
> > >>>>> swap_read_folio_*() functions that are called from it should be
> > >>>>> updated to handle swapping in tail subpages, which may be questionable
> > >>>>> in its own right.
> > >>>>>
> > >>>>> An alternative would be that zswap_load() (or a separate interface)
> > >>>>> could tell swap_read_folio() that the folio is partially in zswap,
> > >>>>> then we can just bail and tell the caller that it cannot read the
> > >>>>> large folio and that it should be split.
> > >>>>>
> > >>>>> There may be other options as well, but the bottom line is that it is
> > >>>>> possible, but probably not something that we want to do right now.
> > >>>>>
> > >>>>> A stronger protection method would be to introduce a config option or
> > >>>>> boot parameter for large folio swapin, and then make CONFIG_ZSWAP
> > >>>>> depend on it being disabled, or have zswap check it at boot and refuse
> > >>>>> to be enabled if it is on.
> > >>>>
> > >>>> Right, sounds like the VM_BUG_ON() really is not that easily avoidable.
> > >>>>
> > >>>> I was wondering, if we could WARN_ON_ONCE and make the swap code detect
> > >>>> this like a read-error from disk.
> > >>>>
> > >>>> I think do_swap_page() detects that by checking if the folio is not
> > >>>> uptodate:
> > >>>>
> > >>>>          if (unlikely(!folio_test_uptodate(folio))) {
> > >>>>                  ret = VM_FAULT_SIGBUS;
> > >>>>                  goto out_nomap;
> > >>>>          }
> > >>>>
> > >>>> So maybe WARN_ON_ONCE() + triggering that might be a bit nicer to the
> > >>>> system (but the app would crash either way, there is no way around it).
> > >>>>
> > >>>
> > >>> I'd rather fallback to small folios swapin instead crashing apps till we fix
> > >>> the large folio swapin in zswap :-)
> > >>
> > >> I think David is referring to catching the buggy cases that do not
> > >> properly fallback to small folios with zswap, not as an alternative to
> > >> the fallback. This is at least what I had in mind with the patch.
> > >
> > > Cool. Thanks for the clarification. I'm fine with keeping the fallback,
> > > whether it's the current VM_BUG_ON or David's recommended
> > > SIGBUS.
> > >
> > > Currently, mainline doesn't support large folios swap-in, so I see
> > > your patch as a helpful warning for those attempting large folio
> > > swap-in to avoid making mistakes like loading large folios from
> > > zswap.
> > >
> > > In fact, I spent a week trying to figure out why my app was crashing
> > > before I added 'if (zswap_is_enabled()) goto fallback'. If your patch
> > > had come earlier, it would have saved me that week of work :-)
> > >
> > > To me, a direct crash seems like a more straightforward way to
> > > prompt people to fallback when attempting large folio swap-ins.
> > > So, I am slightly in favor of your current patch.
> >
> > BUG_ON and friends are frowned-upon, in particular in scenarios where we
> > can recover or that are so unexpected that they can be found during
> > early testing.
> >
> > I have no strong opinion on this one, but likely a VM_WARN_ON would also
> > be sufficient to find such issues early during testing. No need to crash
> > the machine.
>
> I thought VM_BUG_ON() was less frowned-upon than BUG_ON(), but after
> some digging I found your patches to checkpatch and Linus clearly
> stating that it isn't.
>
> How about something like the following (untested), it is the minimal
> recovery we can do but should work for a lot of cases, and does
> nothing beyond a warning if we can swapin the large folio from disk:
>
> diff --git a/mm/page_io.c b/mm/page_io.c
> index f1a9cfab6e748..8f441dd8e109f 100644
> --- a/mm/page_io.c
> +++ b/mm/page_io.c
> @@ -517,7 +517,6 @@ void swap_read_folio(struct folio *folio, struct swap_iocb **plug)
>         delayacct_swapin_start();
>
>         if (zswap_load(folio)) {
> -               folio_mark_uptodate(folio);
>                 folio_unlock(folio);
>         } else if (data_race(sis->flags & SWP_FS_OPS)) {
>                 swap_read_folio_fs(folio, plug);
> diff --git a/mm/zswap.c b/mm/zswap.c
> index 6007252429bb2..cc04db6bb217e 100644
> --- a/mm/zswap.c
> +++ b/mm/zswap.c
> @@ -1557,6 +1557,22 @@ bool zswap_load(struct folio *folio)
>
>         VM_WARN_ON_ONCE(!folio_test_locked(folio));
>
> +       /*
> +        * Large folios should not be swapped in while zswap is being used, as
> +        * they are not properly handled.
> +        *
> +        * If any of the subpages are in zswap, reading from disk would result
> +        * in data corruption, so return true without marking the folio uptodate
> +        * so that an IO error is emitted (e.g. do_swap_page() will sigfault).
> +        *
> +        * Otherwise, return false and read the folio from disk.
> +        */
> +       if (WARN_ON_ONCE(folio_test_large(folio))) {
> +               if (xa_find(tree, &offset, offset + folio_nr_pages(folio) - 1, 0))
> +                       return true;
> +               return false;
> +       }
> +
>         /*
>          * When reading into the swapcache, invalidate our entry. The
>          * swapcache can be the authoritative owner of the page and
> @@ -1593,7 +1609,7 @@ bool zswap_load(struct folio *folio)
>                 zswap_entry_free(entry);
>                 folio_mark_dirty(folio);
>         }
> -
> +       folio_mark_uptodate(folio);
>         return true;
>  }
>
> One problem is that even if zswap was never enabled, the warning will
> be emitted just if CONFIG_ZSWAP is on. Perhaps we need a variable or
> static key if zswap was "ever" enabled.
>
> Barry, I suspect your is_zswap_enabled() check is deficient for
> similar reasons, zswap could have been enabled before then became
> disabled.

I don't understand this. If zswap was enabled before but is disabled
when I am loading data, will I get corrupted data because zswap was
once enabled? If not, it doesn't seem important.

Thanks
Barry
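
As a rough illustration of the "ever enabled" idea mentioned above, a one-way
static key could be flipped the first time zswap is switched on and consulted
by swapin code afterwards. This is only a sketch: the key name, the
zswap_mark_ever_enabled() helper, and where it gets hooked in are assumptions
for illustration, not the actual kernel implementation.

#include <linux/jump_label.h>

/* set the first time zswap is switched on; intentionally never cleared */
static DEFINE_STATIC_KEY_FALSE(zswap_ever_enabled_key);

/* hypothetical hook: call from wherever zswap first becomes enabled */
static void zswap_mark_ever_enabled(void)
{
        static_branch_enable(&zswap_ever_enabled_key);
}

/* large folio swapin paths would test this instead of the current state */
bool zswap_never_enabled(void)
{
        return !static_branch_unlikely(&zswap_ever_enabled_key);
}

Entries written while zswap was enabled stay in the zswap pool until they are
loaded or written back, so a check against only the current enablement state
cannot rule out leftovers from an earlier enable/disable cycle, which is the
gap being pointed out above.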