From: Yosry Ahmed
To: Kairui Song
Cc: Nhat Pham, linux-mm@kvack.org, akpm@linux-foundation.org, hannes@cmpxchg.org, hughd@google.com, mhocko@kernel.org, roman.gushchin@linux.dev, shakeel.butt@linux.dev, muchun.song@linux.dev, len.brown@intel.com, chengming.zhou@linux.dev, chrisl@kernel.org, huang.ying.caritas@gmail.com, ryan.roberts@arm.com, viro@zeniv.linux.org.uk, baohua@kernel.org, osalvador@suse.de, lorenzo.stoakes@oracle.com, christophe.leroy@csgroup.eu, pavel@kernel.org, kernel-team@meta.com, linux-kernel@vger.kernel.org, cgroups@vger.kernel.org, linux-pm@vger.kernel.org
Subject: Re: [RFC PATCH 00/14] Virtual Swap Space
Date: Tue, 22 Apr 2025 07:43:38 -0700
References: <20250407234223.1059191-1-nphamcs@gmail.com>

On Wed, Apr 09, 2025 at 12:59:24AM +0800, Kairui Song wrote:
> On Wed, Apr 9, 2025 at 12:48 AM Nhat Pham wrote:
> >
> > On Tue, Apr 8, 2025 at 9:23 AM Kairui Song wrote:
> > >
> > >
> > > Thanks for sharing the code, my initial idea after the discussion at
> > > LSFMM is that there is a simple way to combine this with the "swap
> > > table" [1] design of mine to solve the performance issue of this
> > > series: just store the pointer of this struct in the swap table. It's
> > > a brute-force, glue-like solution but the contention issue will be
> > > gone.
> >
> > Was waiting for your submission, but I figured I should send what I
> > had out first for immediate feedback :)
> >
> > Johannes actually proposed something similar to your physical swap
> > allocator for the virtual swap slots allocation logic, to solve our
> > lock contention problem. My apologies - I should have name-dropped you
> > in the RFC cover as well (the cover was a bit outdated, and I haven't
> > updated the newest developments that came from the LSFMMBPF
> > conversation in the cover letter).
> >
> > >
> > > Of course it's not a good approach, ideally the data structure can be
> > > simplified to an entry type in the swap table. The swap table series
> > > handles locking and synchronization using either cluster lock
> > > (reusing swap allocator and existing swap logic) or folio lock (kind
> > > of like page cache). So many parts can be much simplified, I think it
> > > will be at most ~32 bytes per page with a virtual device (including
> > > the intermediate pointers). It will require quite some work though.
> > >
> > > The good side with that approach is we will have a much lower memory
> > > overhead and even better performance. And the virtual space part will
> > > be optional, for non-virtual setups the memory consumption will be only
> > > 8 bytes per page and also dynamically allocated, as discussed at
> > > LSFMM.
> >
> > I think one problem with your design, which I alluded to at the
> > conference, is that it doesn't quite work for our requirements -
> > namely the separation of zswap from its underlying backend.
> >
> > All the metadata HAVE to live at the virtual layer. For one, we are
> > duplicating the logic if we push this to the backend.
> >
> > But more than that, there are lifetime operations that HAVE to be
> > backend-agnostic. For instance, on the swap out path, when we unmap
> > the page from the page table, we do swap_duplicate() (i.e., increasing
> > the swap count/reference count of the swap entries). At that point, we
> > have not made (and cannot make) a decision regarding the backend
> > storage yet, and thus do not have any backend-specific place to hold
> > this piece of information. If we couple all the backends then yeah sure
> > we can store it at the physical swapfile level, but that defeats the
> > purpose of swap virtualization :)
>
> Ah, now I get why you have to store the data in the virtual layer.
>
> I was thinking that doing it in the physical layer will make it easier
> to reuse what swap already has. But if you need to be completely
> backend-agnostic, then just keep it in the virtual layer. It doesn't
> seem to be a fundamental issue, it could be worked out in some way I
> think, e.g. using another table type. I'll check if that would work
> after I've done the initial parts.

Watching from the sidelines, I am happy to see Nhat's proposal materializing, and think there is definitely room for collaboration here with Kairui's. Overall, both proposals seem to be complementary concepts, and we just need to figure out the right way to combine them :)
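
To make the layering argument in the quoted text a bit more concrete, here is a rough userspace C sketch of why the swap count would sit in a backend-agnostic descriptor: the count is taken (as swap_duplicate() does at unmap time) before any backend has been picked. Everything named below (struct vswap_desc, enum vswap_backend, vswap_dup(), vswap_put(), vswap_assign_backend()) is hypothetical and for illustration only; none of it is taken from the RFC or from the swap table series, and the real swap_duplicate() is not modeled here.

```c
#include <assert.h>

/* Hypothetical backend kinds; the names are illustrative assumptions. */
enum vswap_backend {
	VSWAP_NONE,      /* no backend chosen yet (e.g. right after unmap) */
	VSWAP_ZSWAP,     /* compressed in memory */
	VSWAP_SWAPFILE,  /* slot on a physical swap device */
};

/*
 * Hypothetical per-virtual-slot descriptor. The swap count lives here,
 * at the virtual layer, so it does not depend on which backend (if any)
 * currently holds the data.
 */
struct vswap_desc {
	unsigned int swap_count;        /* references from page tables */
	enum vswap_backend backend;     /* may still be VSWAP_NONE */
	unsigned long backend_handle;   /* meaning depends on backend */
};

/* Take a reference; valid even while backend == VSWAP_NONE. */
static void vswap_dup(struct vswap_desc *d)
{
	d->swap_count++;
}

/* Drop a reference; returns 1 when the last reference is gone. */
static int vswap_put(struct vswap_desc *d)
{
	assert(d->swap_count > 0);
	return --d->swap_count == 0;
}

/* Bind (or re-bind) the slot to a backend without touching the count. */
static void vswap_assign_backend(struct vswap_desc *d,
				 enum vswap_backend backend,
				 unsigned long handle)
{
	d->backend = backend;
	d->backend_handle = handle;
}
```

In this sketch, moving a slot between backends only rewrites backend and backend_handle, so the reference count never has to be migrated. That is the property the quoted discussion is after; how it would map onto a swap table entry type is left open here.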