From mboxrd@z Thu Jan 1 00:00:00 1970
From: Barry Song <21cnbao@gmail.com>
Date: Wed, 21 Aug 2024 19:38:25 +1200
Subject: Re: [syzbot] [mm?] WARNING in zswap_swapoff
To: Kairui Song
Cc: akpm@linux-foundation.org, chengming.zhou@linux.dev, chrisl@kernel.org,
 hannes@cmpxchg.org, linux-kernel@vger.kernel.org, linux-mm@kvack.org,
 nphamcs@gmail.com, ryan.roberts@arm.com,
 syzbot+ce6029250d7fd4d0476d@syzkaller.appspotmail.com,
 syzkaller-bugs@googlegroups.com, ying.huang@intel.com, yosryahmed@google.com
References: <20240821054921.43468-1-21cnbao@gmail.com>

On Wed, Aug 21, 2024 at 6:42 PM Kairui Song wrote:
>
> On Wed, Aug 21, 2024 at 1:49 PM Barry Song <21cnbao@gmail.com> wrote:
> >
> > On Tue, Aug 20, 2024 at 9:02 PM Kairui Song wrote:
> > >
> > > On Tue, Aug 20, 2024 at 4:47 PM Kairui Song wrote:
> > > >
> > > > On Tue, Aug 20, 2024 at 4:13 AM Yosry Ahmed wrote:
> > > > > On Fri, Aug 16, 2024 at 12:52 PM syzbot
> > > > > wrote:
> > > > > >
> > > > > > Hello,
> > > > > >
> > > > > > syzbot found the following issue on:
> > > > > >
> > > > > > HEAD commit: 367b5c3d53e5 Add linux-next specific files for 20240816
> > > >
> > > > I can't find this commit, seems this commit is not in linux-next any more?
> > > >
> > > > > > git tree: linux-next
> > > > > > console output: https://syzkaller.appspot.com/x/log.txt?x=12489105980000
> > > > > > kernel config: https://syzkaller.appspot.com/x/.config?x=61ba6f3b22ee5467
> > > > > > dashboard link: https://syzkaller.appspot.com/bug?extid=ce6029250d7fd4d0476d
> > > > > > compiler: Debian clang version 15.0.6, GNU ld (GNU Binutils for Debian) 2.40
> > > > > >
> > > > > > Unfortunately, I don't have any reproducer for this issue yet.
> > > > > >
> > > > > > Downloadable assets:
> > > > > > disk image: https://storage.googleapis.com/syzbot-assets/0b1b4e3cad3c/disk-367b5c3d.raw.xz
> > > > > > vmlinux: https://storage.googleapis.com/syzbot-assets/5bb090f7813c/vmlinux-367b5c3d.xz
> > > > > > kernel image: https://storage.googleapis.com/syzbot-assets/6674cb0709b1/bzImage-367b5c3d.xz
> > > > > >
> > > > > > IMPORTANT: if you fix the issue, please add the following tag to the commit:
> > > > > > Reported-by: syzbot+ce6029250d7fd4d0476d@syzkaller.appspotmail.com
> > > > > >
> > > > > > ------------[ cut here ]------------
> > > > > > WARNING: CPU: 0 PID: 11298 at mm/zswap.c:1700 zswap_swapoff+0x11b/0x2b0 mm/zswap.c:1700
> > > > > > Modules linked in:
> > > > > > CPU: 0 UID: 0 PID: 11298 Comm: swapoff Not tainted 6.11.0-rc3-next-20240816-syzkaller #0
> > > > > > Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 06/27/2024
> > > > > > RIP: 0010:zswap_swapoff+0x11b/0x2b0 mm/zswap.c:1700
> > > > > > Code: 74 05 e8 78 73 07 00 4b 83 7c 35 00 00 75 15 e8 1b bd 9e ff 48 ff c5 49 83 c6 50 83 7c 24 0c 17 76 9b eb 24 e8 06 bd 9e ff 90 <0f> 0b 90 eb e5 48 8b 0c 24 80 e1 07 80 c1 03 38 c1 7c 90 48 8b 3c
> > > > > > RSP: 0018:ffffc9000302fa38 EFLAGS: 00010293
> > > > > > RAX: ffffffff81f4d66a RBX: dffffc0000000000 RCX: ffff88802c19bc00
> > > > > > RDX: 0000000000000000 RSI: 0000000000000002 RDI: ffff888015986248
> > > > > > RBP: 0000000000000000 R08: ffffffff81f4d620 R09: 1ffffffff1d476ac
> > > > > > R10: dffffc0000000000 R11: fffffbfff1d476ad R12: dffffc0000000000
> > > > > > R13: ffff888015986200 R14: 0000000000000048 R15: 0000000000000002
> > > > > > FS: 00007f9e628a5380(0000) GS:ffff8880b9000000(0000) knlGS:0000000000000000
> > > > > > CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> > > > > > CR2: 0000001b30f15ff8 CR3: 000000006c5f0000 CR4: 00000000003506f0
> > > > > > DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
> > > > > > DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
> > > > > > Call Trace:
> > > > > >
> > > > > > __do_sys_swapoff mm/swapfile.c:2837 [inline]
> > > > > > __se_sys_swapoff+0x4653/0x4cf0 mm/swapfile.c:2706
> > > > > > do_syscall_x64 arch/x86/entry/common.c:52 [inline]
> > > > > > do_syscall_64+0xf3/0x230 arch/x86/entry/common.c:83
> > > > > > entry_SYSCALL_64_after_hwframe+0x77/0x7f
> > > > > > RIP: 0033:0x7f9e629feb37
> > > > > > Code: 73 01 c3 48 8b 0d f1 52 0d 00 f7 d8 64 89 01 48 83 c8 ff c3 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 44 00 00 b8 a8 00 00 00 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d c1 52 0d 00 f7 d8 64 89 01 48
> > > > > > RSP: 002b:00007fff17734f68 EFLAGS: 00000246 ORIG_RAX: 00000000000000a8
> > > > > > RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 00007f9e629feb37
> > > > > > RDX: 00007f9e62a9e7e8 RSI: 00007f9e62b9beed RDI: 0000563090942a20
> > > > > > RBP: 0000563090942a20 R08: 0000000000000000 R09: 77872e07ed164f94
> > > > > > R10: 000000000000001f R11: 0000000000000246 R12: 00007fff17735188
> > > > > > R13: 00005630909422a0 R14: 0000563073724169 R15: 00007f9e62bdda80
> > > > > >
> > > > >
> > > > > I am hoping syzbot would find a reproducer and bisect this for us.
> > > > > Meanwhile, from a high-level it looks to me like we are missing a
> > > > > zswap_invalidate() call in some paths.
> > > > >
> > > > > If I have to guess, I would say it's related to the latest mTHP swap
> > > > > changes, but I am not following closely. Perhaps one of the following
> > > > > things happened:
> > > > >
> > > > > (1) We are not calling zswap_invalidate() in some invalidation paths.
> > > > > It used to not be called for the cluster freeing path, so maybe we end
> > > > > up with some order-0 swap entries in a cluster? or maybe there is an
> > > > > entirely new invalidation path that does not go through
> > > > > free_swap_slot() for order-0 entries?
> > > > >
> > > > > (2) Some higher order swap entries (i.e. a cluster) end up in zswap
> > > > > somehow. zswap_store() has a warning to cover that though. Maybe
> > > > > somehow some swap entries are allocated as a cluster, but then pages
> > > > > are swapped out one-by-one as order-0 (which can go to zswap), but
> > > > > then we still free the swap entries as a cluster?
> > > >
> > > > Hi Yosry, thanks for the report.
> > > >
> > > > There are many mTHP related optimizations recently, for this problem I
> > > > can reproduce this locally. Can confirm the problem is gone for me
> > > > after reverting:
> > > >
> > > > "mm: attempt to batch free swap entries for zap_pte_range()"
> > > >
> > > > Hi Barry,
> > > >
> > > > If a set of continuous slots are having the same value, they are
> > > > considered a mTHP and freed, bypassing the slot cache, and causing
> > > > zswap leak.
> > > > This didn't happen in put_swap_folio because that function is
> > > > expecting an actual mTHP folio behind the slots but
> > > > free_swap_and_cache_nr is simply walking the slots.
> > > >
> > > > For the testing, I actually have to disable mTHP, because linux-next
> > > > will panic with mTHP due to lack of following fixes:
> > > > https://lore.kernel.org/linux-mm/a4b1b34f-0d8c-490d-ab00-eaedbf3fe780@gmail.com/
> > > > https://lore.kernel.org/linux-mm/403b7f3c-6e5b-4030-ab1c-3198f36e3f73@gmail.com/
> > > >
> > > > >
> > > > > I am not closely following the latest changes so I am not sure. CCing
> > > > > folks who have done work in that area recently.
> > > > >
> > > > > I am starting to think maybe it would be more reliable to just call
> > > > > zswap_invalidate() for all freed swap entries anyway. Would that be
> > > > > too expensive? We used to do that before the zswap_invalidate() call
> > > > > was moved by commit 0827a1fb143f ("mm/zswap: invalidate zswap entry
> > > > > when swap entry free"), and that was before we started using the
> > > > > xarray (so it was arguably worse than it would be now).
> > > > >
> > > >
> > > > That might be a good idea, I suggest moving zswap_invalidate to
> > > > swap_range_free and call it for every freed slot.
> > > >
> > > > Below patch can be squash into or put before "mm: attempt to batch
> > > > free swap entries for zap_pte_range()".
> > >
> > > Hmm, on second thought, the commit message in the attachment commit
> > > might be not suitable, current zswap_invalidate is also designed to
> > > only work for order 0 ZSWAP, so things are not clean even after this.
> >
> > Kairui, what about the below? we don't touch the path of __try_to_reclaim_swap() where
> > you have one folio backed?
> >
> > diff --git a/mm/swapfile.c b/mm/swapfile.c
> > index c1638a009113..8ff58be40544 100644
> > --- a/mm/swapfile.c
> > +++ b/mm/swapfile.c
> > @@ -1514,6 +1514,8 @@ static bool __swap_entries_free(struct swap_info_struct *si,
> >          unlock_cluster_or_swap_info(si, ci);
> >
> >          if (!has_cache) {
> > +                for (i = 0; i < nr; i++)
> > +                        zswap_invalidate(swp_entry(si->type, offset + i));
> >                  spin_lock(&si->lock);
> >                  swap_entry_range_free(si, entry, nr);
> >                  spin_unlock(&si->lock);
> >
>
> Hi Barry,
>
> Thanks for updating this thread, I'm thinking maybe something will
> better be done at the zswap side?
>
> The concern of using zswap_invalidate is that it calls xa_erase which
> requires the xa spin lock. But if we are calling zswap_invalidate in
> swap_entry_range_free, and ensure the slot is HAS_CACHE pinned, doing
> a lockless read first with xa_load should be OK for checking if the
> slot needs a ZSWAP invalidation. The performance cost will be minimal
> and we only need to call zswap_invalidate in one place, something like
> this (haven't tested, comments are welcome). Also ZSWAP mthp will
> still store entried in order 0 so this should be OK for future.

Hi Kairui,

I fully welcome callers of swap_entry_range_free() no longer having to
worry about zswap at all. However, zswap_invalidate() currently runs
outside of the si lock, and after this change it will run inside the si
lock, which may increase the time spent holding it. So, would it be
possible to move the acquisition of the si lock into
swap_entry_range_free() itself, and take it only after zswap_invalidate()
has completed? Or am I overthinking it?

swap_entry_range_free(si, entry, size)
{
        for (i = 0; i < size; i++)
                zswap_invalidate(....);   /* done before taking si->lock */

        spin_lock(&si->lock);
        ...
        spin_unlock(&si->lock);
}

>
> diff --git a/mm/swap_slots.c b/mm/swap_slots.c
> index 13ab3b771409..d7bb3caa9d4e 100644
> --- a/mm/swap_slots.c
> +++ b/mm/swap_slots.c
> @@ -273,9 +273,6 @@ void free_swap_slot(swp_entry_t entry)
>  {
>          struct swap_slots_cache *cache;
>
> -        /* Large folio swap slot is not covered. */
> -        zswap_invalidate(entry);
> -
>          cache = raw_cpu_ptr(&swp_slots);
>          if (likely(use_swap_slot_cache && cache->slots_ret)) {
>                  spin_lock_irq(&cache->free_lock);
> diff --git a/mm/swapfile.c b/mm/swapfile.c
> index f947f4dd31a9..fbc25d38a27e 100644
> --- a/mm/swapfile.c
> +++ b/mm/swapfile.c
> @@ -242,9 +242,6 @@ static int __try_to_reclaim_swap(struct
> swap_info_struct *si,
>          folio_set_dirty(folio);
>
>          spin_lock(&si->lock);
> -        /* Only sinple page folio can be backed by zswap */
> -        if (nr_pages == 1)
> -                zswap_invalidate(entry);
>          swap_entry_range_free(si, entry, nr_pages);
>          spin_unlock(&si->lock);
>          ret = nr_pages;
> @@ -1545,6 +1542,10 @@ static void swap_entry_range_free(struct
> swap_info_struct *si, swp_entry_t entry
>          unsigned char *map_end = map + nr_pages;
>          struct swap_cluster_info *ci;
>
> +        /* Slots are pinned with SWAP_HAS_CACHE, safe to invalidate */
> +        for (int i = 0; i < nr_pages; ++i)
> +                zswap_invalidate(swp_entry(si->type, offset + i));
> +
>          ci = lock_cluster(si, offset);
>          do {
>                  VM_BUG_ON(*map != SWAP_HAS_CACHE);
> diff --git a/mm/zswap.c b/mm/zswap.c
> index df66ab102d27..100ad04397fe 100644
> --- a/mm/zswap.c
> +++ b/mm/zswap.c
> @@ -1656,15 +1656,18 @@ bool zswap_load(struct folio *folio)
>          return true;
>  }
>
> +/* Caller need to pin the slot to prevent parallel store */
>  void zswap_invalidate(swp_entry_t swp)
>  {
>          pgoff_t offset = swp_offset(swp);
>          struct xarray *tree = swap_zswap_tree(swp);
>          struct zswap_entry *entry;
>
> -        entry = xa_erase(tree, offset);
> -        if (entry)
> -                zswap_entry_free(entry);
> +        if (xa_load(tree, offset)) {
> +                entry = xa_erase(tree, offset);
> +                if (entry)
> +                        zswap_entry_free(entry);
> +        }
>  }
>
>  int zswap_swapon(int type, unsigned long nr_pages)
> --
> 2.45.2
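
To make the ordering question above concrete, here is a minimal user-space sketch in C11. It is not kernel code and has not been tested against any kernel tree. Every name prefixed with toy_ is hypothetical: toy_lock stands in for a spinlock, the zswap[] array for the zswap xarray, and in_use[] for the swap map. It combines the two ideas from this thread: a lockless peek before taking the tree lock (the xa_load before xa_erase idea), and doing all invalidation before the si lock is taken. It does not model SWAP_HAS_CACHE pinning, which is what would make the lockless peek safe in the real code.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

#define TOY_NR_SLOTS 16

/* Hypothetical toy spinlock; counts acquisitions so they can be inspected. */
struct toy_lock {
    atomic_flag held;
    unsigned long acquisitions;
};

static void toy_lock_init(struct toy_lock *l)
{
    atomic_flag_clear(&l->held);
    l->acquisitions = 0;
}

static void toy_lock_acquire(struct toy_lock *l)
{
    while (atomic_flag_test_and_set_explicit(&l->held, memory_order_acquire))
        ;   /* spin until the flag is free */
    l->acquisitions++;
}

static void toy_lock_release(struct toy_lock *l)
{
    atomic_flag_clear_explicit(&l->held, memory_order_release);
}

/* Hypothetical toy swap device. */
struct toy_si {
    struct toy_lock si_lock;              /* stands in for si->lock */
    struct toy_lock tree_lock;            /* stands in for the xarray lock */
    _Atomic(void *) zswap[TOY_NR_SLOTS];  /* stands in for the zswap tree */
    bool in_use[TOY_NR_SLOTS];            /* stands in for the swap map */
};

static void toy_si_init(struct toy_si *si)
{
    toy_lock_init(&si->si_lock);
    toy_lock_init(&si->tree_lock);
    for (size_t i = 0; i < TOY_NR_SLOTS; i++) {
        atomic_init(&si->zswap[i], (void *)NULL);
        si->in_use[i] = true;   /* pretend every slot is allocated */
    }
}

/* Store an entry in the toy tree under the tree lock. */
static void toy_store(struct toy_si *si, size_t off, void *entry)
{
    toy_lock_acquire(&si->tree_lock);
    atomic_store_explicit(&si->zswap[off], entry, memory_order_release);
    toy_lock_release(&si->tree_lock);
}

/* Lockless peek first; take the tree lock only when an entry is present. */
static bool toy_invalidate(struct toy_si *si, size_t off)
{
    void *entry;

    if (!atomic_load_explicit(&si->zswap[off], memory_order_acquire))
        return false;   /* nothing stored: tree lock never touched */

    toy_lock_acquire(&si->tree_lock);
    entry = atomic_exchange_explicit(&si->zswap[off], (void *)NULL,
                                     memory_order_acq_rel);
    toy_lock_release(&si->tree_lock);
    return entry != NULL;
}

/* Invalidate the whole range first, then take si_lock for bookkeeping only. */
static size_t toy_range_free(struct toy_si *si, size_t off, size_t nr)
{
    size_t invalidated = 0;

    for (size_t i = 0; i < nr; i++)
        invalidated += toy_invalidate(si, off + i);

    toy_lock_acquire(&si->si_lock);
    for (size_t i = 0; i < nr; i++)
        si->in_use[off + i] = false;
    toy_lock_release(&si->si_lock);

    return invalidated;
}
```

In this model, freeing a range that holds two stored entries takes the tree lock twice and si_lock once; freeing the same range again takes only si_lock, because the lockless peek finds nothing to erase. Whether the same ordering is acceptable in the real swap_entry_range_free() depends on the pinning guarantees discussed above.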