From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by smtp.lore.kernel.org (Postfix) with ESMTP id D2704C47DDB for ; Tue, 23 Jan 2024 15:39:00 +0000 (UTC) Received: by kanga.kvack.org (Postfix) id 2E1BA8D0002; Tue, 23 Jan 2024 10:39:00 -0500 (EST) Received: by kanga.kvack.org (Postfix, from userid 40) id 29C638D0001; Tue, 23 Jan 2024 10:39:00 -0500 (EST) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id 159058D0002; Tue, 23 Jan 2024 10:39:00 -0500 (EST) X-Delivered-To: linux-mm@kvack.org Received: from relay.hostedemail.com (smtprelay0016.hostedemail.com [216.40.44.16]) by kanga.kvack.org (Postfix) with ESMTP id 06B938D0001 for ; Tue, 23 Jan 2024 10:39:00 -0500 (EST) Received: from smtpin23.hostedemail.com (a10.router.float.18 [10.200.18.1]) by unirelay01.hostedemail.com (Postfix) with ESMTP id AF6851C1508 for ; Tue, 23 Jan 2024 15:38:59 +0000 (UTC) X-FDA: 81710983998.23.AE17FF0 Received: from mail-oa1-f50.google.com (mail-oa1-f50.google.com [209.85.160.50]) by imf02.hostedemail.com (Postfix) with ESMTP id 812738000F for ; Tue, 23 Jan 2024 15:38:57 +0000 (UTC) Authentication-Results: imf02.hostedemail.com; dkim=pass header.d=cmpxchg-org.20230601.gappssmtp.com header.s=20230601 header.b=QBO7njBK; dmarc=pass (policy=none) header.from=cmpxchg.org; spf=pass (imf02.hostedemail.com: domain of hannes@cmpxchg.org designates 209.85.160.50 as permitted sender) smtp.mailfrom=hannes@cmpxchg.org ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=hostedemail.com; s=arc-20220608; t=1706024337; h=from:from:sender:reply-to:subject:subject:date:date: message-id:message-id:to:to:cc:cc:mime-version:mime-version: content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references:dkim-signature; bh=1nP0lWvVSdV3ePRrfwC4/bRX6mgra3C2ztXq0kulIvo=; b=rygqaGXhcC3BO9CcwDPiglWDUWvI7gUFRv83M4qCoXvp/JyQ//4Z8MHthUfCDG+KO3kBk0 Rj50/NKYVpmuRuRaQIr3FKhyn7amK3yhoMnS/oGQdNUr7o02JjFaIaknzEC22FANyrOsx+ wnylofb2LJq60/IemjZQ8SGz+2R6XmY= ARC-Authentication-Results: i=1; imf02.hostedemail.com; dkim=pass header.d=cmpxchg-org.20230601.gappssmtp.com header.s=20230601 header.b=QBO7njBK; dmarc=pass (policy=none) header.from=cmpxchg.org; spf=pass (imf02.hostedemail.com: domain of hannes@cmpxchg.org designates 209.85.160.50 as permitted sender) smtp.mailfrom=hannes@cmpxchg.org ARC-Seal: i=1; s=arc-20220608; d=hostedemail.com; t=1706024337; a=rsa-sha256; cv=none; b=fArwg9G8AJcSxJn4pCc2Y8A+p4ohTYivz4pzI0uVGl/tsegzHKSECZ2mZBHqY1J2dbECEG JgRp9YlV/nbJrdntjyoIo4ZMWlRvvjp/lKRn1k/TGr2aJ/wJlHZ1j0c/S9xHWaBZeFoEci c3iWkHN3Mt/gimaUB5pwfygLDZDr5YQ= Received: by mail-oa1-f50.google.com with SMTP id 586e51a60fabf-2143a96d185so692484fac.3 for ; Tue, 23 Jan 2024 07:38:57 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=cmpxchg-org.20230601.gappssmtp.com; s=20230601; t=1706024336; x=1706629136; darn=kvack.org; h=in-reply-to:content-transfer-encoding:content-disposition :mime-version:references:message-id:subject:cc:to:from:date:from:to :cc:subject:date:message-id:reply-to; bh=1nP0lWvVSdV3ePRrfwC4/bRX6mgra3C2ztXq0kulIvo=; b=QBO7njBKJoehBNtDdbD8/NYCra+kv3n1yzGuE7frRMB6WlyEiiDxFOWXJ0FkOuuagp U+LCFBEe+xMQUJ6m5OVO/GZMVObh7OumKDVTzNAqVz41E1nrK7hf1uZblheE0K6g4vlZ lGGQYQTJDAbr4pub7ClThi4hz9HC/Yvfqb59jXS9XNZL0YS6zNrUsyZlkL28oZ+a4Tbr zQQjeor5ESJuuFdS2fFhq9Nn3yW2ebSd3RPQK2xu7+mAmhT8qjrg82aN3TDfIF3+V/MI f3g6yBUc4+xVGi+0KElJOyHP4awowH3QKc90jTZg2EOsEJpnqocdQkUyJ6LrdRCY8syx Lg2g== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1706024336; x=1706629136; h=in-reply-to:content-transfer-encoding:content-disposition :mime-version:references:message-id:subject:cc:to:from:date :x-gm-message-state:from:to:cc:subject:date:message-id:reply-to; bh=1nP0lWvVSdV3ePRrfwC4/bRX6mgra3C2ztXq0kulIvo=; b=tdjsme0mMtEVzwY2SEKocK2JKIBE6rgDqu/BBluMgTfqlScqzIxJOxZfqNLMK1WPEI AX/fg+N10Bn7y+pv9UsYydu9qAKGqmPqbuS5HbP6FC8yc/nkN5uMVssFS5nDVnwt7AoO XUYGGNWxCsmB+YN/uDjMvKe4ju7b/3kNYwwkeMHFrJ9xBzFFMWVWCsmO/cmGIrYrFgV5 JYkz0Kc2hT3scxrOp6Ro+29qADt2+NSwD5BPHzOoX73tlDEAzKH5waduiwm0twOjvDdo 47jPcweZXStycZAhybgagsnRVMsyokk0NAG54xURlho95PdHDTYrhILgRvpGU7Edxza5 MoDA== X-Gm-Message-State: AOJu0Yw5WGs6J0ypJUOVyuD3vQV/aLQfTCDMf5+l4e8Gs/DkbxMaXinc T4ECvR8RLAC54HrTTCaieTAAslk1mtfYIHlzwhzAJCErSFnodNOE6311Njm3APc= X-Google-Smtp-Source: AGHT+IEjyxHRQmDZau+xZqO7ups8UN2Vf2ApNGwlzm1r1pP4SSCaGsrMcbfuRBOxqpOwUglj0NWFgw== X-Received: by 2002:a05:6871:5224:b0:203:88b1:31ce with SMTP id ht36-20020a056871522400b0020388b131cemr1659751oac.117.1706024336410; Tue, 23 Jan 2024 07:38:56 -0800 (PST) Received: from localhost (2603-7000-0c01-2716-97cf-7b55-44af-acd6.res6.spectrum.com. [2603:7000:c01:2716:97cf:7b55:44af:acd6]) by smtp.gmail.com with ESMTPSA id d15-20020a056214184f00b006819a4354basm3567426qvy.37.2024.01.23.07.38.55 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Tue, 23 Jan 2024 07:38:56 -0800 (PST) Date: Tue, 23 Jan 2024 10:38:51 -0500 From: Johannes Weiner To: Yosry Ahmed Cc: Andrew Morton , Nhat Pham , Chris Li , Chengming Zhou , Huang Ying , linux-mm@kvack.org, linux-kernel@vger.kernel.org Subject: Re: [PATCH 2/2] mm: zswap: remove unnecessary tree cleanups in zswap_swapoff() Message-ID: <20240123153851.GA1745986@cmpxchg.org> References: <20240120024007.2850671-1-yosryahmed@google.com> <20240120024007.2850671-3-yosryahmed@google.com> <20240122201906.GA1567330@cmpxchg.org> MIME-Version: 1.0 Content-Type: text/plain; charset=utf-8 Content-Disposition: inline Content-Transfer-Encoding: 8bit In-Reply-To: X-Rspamd-Server: rspam09 X-Rspamd-Queue-Id: 812738000F X-Stat-Signature: t8zgdkmck1teaxy8rxkjm7cqwox38sfd X-Rspam-User: X-HE-Tag: 1706024337-585607 X-HE-Meta: U2FsdGVkX18ya+vuyB8wJ9hIp+S4ddrrLg1k1Y202l6z4C8i6GwyT/5kqEkXR4VjfRQZjbMgX1ECEkWorq95q1n6tQXwVeWuxcbBqJhMTOtnID5S83/WJBeeByOdMN5/5v+iLXuomRkJ1nu+sbZ9+JCwz6R+1O9ROFrsPFvmDPHdVmA/kIMWMh6ugS5cuDoGN9yetL0ldFAPgiPYyKsY55KImEulBXzhkqLJ2dJ8JM0HBreEcqDTq5xo9jGRnk2XoaY7WZBcLl+TyJ0zld5V1mU7aCswj03bY3mNcnaZgPFp42Oib/3uBkmuFDbx++cqp0MFI08zYJFTi+6NQo00Mu4SQvQDPaAqKELvtQbxtoeKAHynoY560z73RKLS4VyRMAO1Za+Vsv2XQUKEDXmQ7MzEFYkg6stZ2ieo0YzSPrEwrtR2C+zYeObOgIHQD33UeHawD2o1mJokDgpd2TB5MXpQmNIs+uWGR6jb1PwWO1FSsYQTZzDyB/PVNDdVPU36A2uC9XQT5Eu2yRbo+oLBbvng8tlgNR4GXKm0l6SK0u7zo+It99xZXLtFPk0klc7TzG6dIVmMrEUOs4jmKr6a9ZqgCSP8lIy4CxM8ZcEbYiLeslBnPfyRwKBne7hMR7Hvt6TUw71S4NS8iN7fHNHsJwLkavJ6wD/OJ52rmgIJw5Kqk+4UjhLygW9h5Vds+wRZtSochpWPKqdj4vxv21SkxMxdOxM9Kv8QoaqE5hZfS+kkgOf/zLz0t4L9Ahep4rhsLmRX8E9HPQ8ryM6wsK5jkywW2e9RDT5Ii/Hz7VeWHLLqpjOK1XLtQQ9ldFSkx3UCGWoXprmXRWpKsyBfiULPOodLg3YzcUPBcPWBaaPtROI76e3qlW6V/0gOtRguLnmIdDOA0J3o/g5a0Tl9oifwndhGEiyqhXNNQ3JTCGsWxY3BoFzPs1SPhNdYDnD8wxwBNhj8HlPoJjYxBx6Wb8C C5MeUFti 0KaRypFXbx/UD2EHaWamtIOW55p2t8wFDBqqyBYL7zXNS0th9SPphoiIpf+TW1RSKk6FodA3ry9jPiIxNOyi295DKH+a/9wn2Foe2OK6VDSZQQmEh8qD/yT2zmQ/uVQPHGy5O2XEQG/NKmLNrKQZX4zvzyYNFgrEAfMpyqW3r4j2r3R7Vv/0jQhkb2iEeftiYscMpqEQ4X/oHe1dYFxkU4nW0E0jHO5GREl13kXZ97+3BvFjoFg9Me6NgynoWhR+XykhUhxjoIPCVh8dxotjtoUdFo6bcJTUDtt9ZAtWkmX3/onn5C7za+h3TRAfFvvZHk01I/GDXXSbnbyYt/Np3BsIq50gx7w5VnYf5YzKA7a+9pDN40NUzk+paDqEL+DRom+uMX2Fqusy6dP+zAH4sinZuuJ9LvDYY5ORUPYln/Bna/UXokf27KYKmPEwFVzq5J14kjPnZDzwnw2XPP+ycT9zt0XPcC3hVyx3AQNgRrdZVBMjbYR5KJzSSBK8i1pMHng/fMjElfjtCF6TYkvADEc84xg== X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: List-Subscribe: List-Unsubscribe: On Mon, Jan 22, 2024 at 12:39:16PM -0800, Yosry Ahmed wrote: > On Mon, Jan 22, 2024 at 12:19 PM Johannes Weiner wrote: > > > > On Sat, Jan 20, 2024 at 02:40:07AM +0000, Yosry Ahmed wrote: > > > During swapoff, try_to_unuse() makes sure that zswap_invalidate() is > > > called for all swap entries before zswap_swapoff() is called. This means > > > that all zswap entries should already be removed from the tree. Simplify > > > zswap_swapoff() by removing the tree cleanup loop, and leaving an > > > assertion in its place. > > > > > > Signed-off-by: Yosry Ahmed > > > > Acked-by: Johannes Weiner > > > > That's a great simplification. > > > > Removing the tree->lock made me double take, but at this point the > > swapfile and its cache should be fully dead and I don't see how any of > > the zswap operations that take tree->lock could race at this point. > > It took me a while staring at the code to realize this loop is pointless. > > However, while I have your attention on the swapoff path, there's a > slightly irrelevant problem that I think might be there, but I am not > sure. > > It looks to me like swapoff can race with writeback, and there may be > a chance of UAF for the zswap tree. For example, if zswap_swapoff() > races with shrink_memcg_cb(), I feel like we may free the tree as it > is being used. For example if zswap_swapoff()->kfree(tree) happen > right before shrink_memcg_cb()->list_lru_isolate(l, item). > > Please tell me that I am being paranoid and that there is some > protection against zswap writeback racing with swapoff. It feels like > we are very careful with zswap entries refcounting, but not with the > zswap tree itself. Hm, I don't see how. Writeback operates on entries from the LRU. By the time zswap_swapoff() is called, try_to_unuse() -> zswap_invalidate() should will have emptied out the LRU and tree. Writeback could have gotten a refcount to the entry and dropped the tree->lock. But then it does __read_swap_cache_async(), and while holding the page lock checks the tree under lock once more; if that finds the entry valid, it means try_to_unuse() hasn't started on this page yet, and would be held up by the page lock/writeback state. > > > Chengming, Chris, I think this should make the tree split and the xarray > > > conversion patches simpler (especially the former). If others agree, > > > both changes can be rebased on top of this. > > > > The resulting code is definitely simpler, but this patch is not a > > completely trivial cleanup, either. If you put it before Chengming's > > patch and it breaks something, it would be difficult to pull out > > without affecting the tree split. > > Are you suggesting I rebase this on top of Chengming's patches? I can > definitely do this, but the patch will be slightly less > straightforward, and if the tree split patches break something it > would be difficult to pull out as well. If you feel like this patch is > more likely to break things, I can rebase. Yeah I think it's more subtle. I'd only ask somebody to rebase an already tested patch on a newer one if the latter were an obvious, low-risk, prep-style patch. Your patch is good, but it doesn't quite fit into this particular category, so I'd say no jumping the queue ;)