Date: Sat, 3 Feb 2024 13:13:57 +0800
From: Gao Xiang
To: David Howells
Cc: lsf-pc@lists.linux-foundation.org, Matthew Wilcox,
 netfs@lists.linux.dev, linux-fsdevel@vger.kernel.org, linux-mm@kvack.org
Subject: Re: [LSF/MM/BPF TOPIC] Large folios, swap and fscache
References: <2701740.1706864989@warthog.procyon.org.uk>
In-Reply-To: <2701740.1706864989@warthog.procyon.org.uk>

Hi David,

On Fri, Feb 02, 2024 at 09:09:49AM +0000, David Howells wrote:
> Hi,
>
> The topic came up in a recent discussion about how to deal with large folios
> when it comes to swap as a swap device is normally considered a simple array
> of PAGE_SIZE-sized elements that can be indexed by a single integer.
>
> With the advent of large folios, however, we might need to change this in
> order to be better able to swap out a compound page efficiently.  Swap
> fragmentation raises its head, as does the need to potentially save multiple
> indices per folio.  Does swap need to grow more filesystem features?
>
> Further to this, we have at least two ways to cache data on disk/flash/etc. -
> swap and fscache - and both want to set aside disk space for their operation.
> Might it be possible to combine the two?
>
> One thing I want to look at for fscache is the possibility of switching from a
> file-per-object-based approach to a tagged cache more akin to the way OpenAFS
> does things.  In OpenAFS, you have a whole bunch of small files, each
> containing a single block (e.g. 256K) of data, and an index that maps a
> particular {volume,file,version,block} to one of these files in the cache.
>
> Now, I could also consider holding all the data blocks in a single file (or
> blockdev) - and this might work for swap.  For fscache, I do, however, need to
> have some sort of integrity across reboots that swap does not require.

If my understanding is correct, the old swapfile approach just works
with pinned local fs extents: it looks up the extents in advance and
doesn't expect them to be moved, so the real swap data I/O paths always
work without the fs being involved.

I haven't looked into the new SWP_FS_OPS/.swap_rw way, and it seems that
only some network fses use it, but IMHO it might carry some deadlock
risk if swapout triggers local fs block allocation.

But overall I think it's a good idea to combine the two.

Just slightly off topic:

Recently I had another rough thought.  As you said, even a single
fscache block (or a single fscache chunk, if you call it that) is 256K
or so.  Would it be possible to implement _optional_ partial uptodate
tracking for cached data, i.e. at fsblock rather than fscache chunk
granularity?  For example, a bitmap could be attached to each 256K or
1M chunk.  That would be very helpful.

Thanks,
Gao Xiang

>
> David
>
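
A minimal userspace sketch of the per-chunk bitmap idea mentioned in the
reply above, in case it helps make the suggestion concrete.  Everything
here is an assumption for illustration only: the 4K fsblock size, the
struct chunk_state layout and the helper names are hypothetical and are
not existing fscache or netfs interfaces.  With a 256K chunk and 4K
blocks, one 64-bit word is enough to record which blocks hold valid
cached data.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed sizes: a 256K cache chunk tracked in 4K fs blocks. */
#define CHUNK_SIZE        (256u * 1024u)
#define FSBLOCK_SIZE      4096u
#define BLOCKS_PER_CHUNK  (CHUNK_SIZE / FSBLOCK_SIZE)

_Static_assert(BLOCKS_PER_CHUNK == 64, "bitmap below assumes 64 blocks");

/* Hypothetical per-chunk state: bit i is set once block i is valid. */
struct chunk_state {
	uint64_t uptodate;
};

/* Mask with bits [first, last] set; requires first <= last < 64. */
static uint64_t block_mask(unsigned first, unsigned last)
{
	assert(first <= last && last < BLOCKS_PER_CHUNK);
	unsigned n = last - first + 1;
	uint64_t m = (n == 64) ? ~UINT64_C(0) : ((UINT64_C(1) << n) - 1);
	return m << first;
}

/*
 * Record that bytes [off, off + len) of the chunk were written to the
 * cache.  Only blocks that are fully covered are marked uptodate.
 */
static void chunk_mark_uptodate(struct chunk_state *c, size_t off, size_t len)
{
	assert(off <= CHUNK_SIZE && len <= CHUNK_SIZE - off);
	size_t first = (off + FSBLOCK_SIZE - 1) / FSBLOCK_SIZE; /* round up */
	size_t end = (off + len) / FSBLOCK_SIZE;          /* exclusive */

	if (first < end)
		c->uptodate |= block_mask((unsigned)first, (unsigned)(end - 1));
}

/* True if every block touched by [off, off + len) is uptodate. */
static bool chunk_range_uptodate(const struct chunk_state *c,
				 size_t off, size_t len)
{
	assert(len > 0 && off < CHUNK_SIZE && len <= CHUNK_SIZE - off);
	uint64_t m = block_mask((unsigned)(off / FSBLOCK_SIZE),
				(unsigned)((off + len - 1) / FSBLOCK_SIZE));

	return (c->uptodate & m) == m;
}
```

A read that finds chunk_range_uptodate() true could be served from the
cache, while anything else would fall back to the backing store.  A 1M
chunk would need a 256-bit bitmap instead of a single word, and a real
on-disk cache would also have to persist the bitmap consistently with
the data, which this sketch does not attempt.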