From: Jared Hulbert
Date: Tue, 5 Mar 2024 12:56:48 -0800
Subject: Re: [LSF/MM/BPF TOPIC] Swap Abstraction "the pony"
To: Chris Li
Cc: Nhat Pham, Chengming Zhou, Matthew Wilcox, lsf-pc@lists.linux-foundation.org, linux-mm, ryan.roberts@arm.com, David Hildenbrand, Barry Song <21cnbao@gmail.com>, Chuanhua Han

On Tue, Mar 5, 2024 at 11:20 AM Chris Li wrote:
>
> On Tue, Mar 5, 2024 at 2:55 AM Nhat Pham wrote:
> >
> > On Tue, Mar 5, 2024 at 4:52 PM Chengming Zhou wrote:
> > >
> > > Looks sensible. Now the zswap middle layer is transparent to frontend users,
> > > which just allocate a swap entry and swap out, and don't care whether it's
> > > swapped out to zswap or to a swap file.
> > >
> > > By decoupling, the frontend users need to know they want to allocate a zswap
> > > entry instead of a swap entry, right? Which becomes not transparent to users.
> >
> > Hmm for now, I was just thinking that it should always try zswap
> > first, and only fall back to swap if it fails to store to zswap, to
> > maintain the overall LRU ordering (best effort).
> >
> > The minimal viable implementation I'm thinking of right now for this is
> > basically the "ghost swapfile" approach, i.e. represent zswap as a
> > swapfile.
>
> Google has been using the ghost swapfile in production for many years.
> If it helps, I can rebase the ghost swap file patches to mm-unstable
> and then send them out for RFC discussion. I am not expecting it to be
> merged as is; it is just a starting point in case anyone is interested
> in the ghost swap file.
>
> I think zswap with a ghost swap file will make zswap behave more like
> other swap back ends. If you use the ghost swap file, migrating from
> zswap to another swap device is very similar to migrating from SSD to
> hard drive, for example.

Yes please.

> > Writeback becomes quite hairy though, because there might be two
> > "swap" entries of the same object (the zswap swap entry and the newly
> > reserved swap entry) lying around near the end of the writeback step,
> > so gotta be careful with synchronization (read: juggling the swap
> > cache) to make sure concurrent swap-ins get something that makes
> > sense.
>
> Dealing with two swap device entries while writing back from one to
> another is unavoidable. I consider it a necessary evil.
> If we can have the swap offset look up different swap entry types, one
> idea is to introduce a migration type of swap entry, which has both the
> source and destination swap entries stored in it. Then you just read in
> the source swap entry data (compressed or not) and write it to the
> destination entry. Every swap-in of the source swap entry will notice
> it has a migration swap entry type. Then it will ask the destination
> swap device to perform the IO. The same folio will exist in both the
> source and destination swap caches.
>
> The limitation of this approach is that, until the source entry's usage
> count drops to zero (every user has swapped in the entry), the source
> swap entry stays occupied. It can't be reused for other data.
>
> Chris
>
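
To make the migration-entry idea quoted above a bit more concrete, here
is a rough userspace model in C. It is only a sketch under stated
assumptions, not kernel code and not the ghost swapfile patches: every
name in it (slot_kind, struct slot, devs, store, migrate, swap_in) is
made up for illustration, the two "devices" are plain arrays, and the
destination is recorded directly in the source slot as a simplification
of "both source and destination stored in the entry". There is no
locking and no swap cache, so it says nothing about the synchronization
problem Nhat raised.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical userspace model; none of these names are kernel APIs. */
#define NR_DEVS    2   /* dev 0 stands in for zswap, dev 1 for a disk */
#define NR_SLOTS   8
#define PAGE_BYTES 16

enum slot_kind { SLOT_FREE = 0, SLOT_DATA, SLOT_MIGRATION };

struct slot {
    enum slot_kind kind;
    int users;              /* how many users still reference the entry */
    char data[PAGE_BYTES];  /* payload, valid when kind == SLOT_DATA */
    int dst_dev, dst_off;   /* destination, valid when kind == SLOT_MIGRATION */
};

static struct slot devs[NR_DEVS][NR_SLOTS];

/* Swap out: store a NUL-terminated payload shared by `users` (>= 1) users. */
static void store(int dev, int off, const char *payload, int users)
{
    struct slot *s = &devs[dev][off];

    s->kind = SLOT_DATA;
    s->users = users;
    strncpy(s->data, payload, PAGE_BYTES - 1);
    s->data[PAGE_BYTES - 1] = '\0';
}

/*
 * Write back: copy the source data to the destination slot, then turn
 * the source into a migration entry that points at the destination.
 */
static void migrate(int sdev, int soff, int ddev, int doff)
{
    struct slot *src = &devs[sdev][soff];
    struct slot *dst = &devs[ddev][doff];

    assert(src->kind == SLOT_DATA && dst->kind == SLOT_FREE);
    memcpy(dst->data, src->data, PAGE_BYTES);
    dst->kind = SLOT_DATA;
    dst->users = src->users;
    src->kind = SLOT_MIGRATION;
    src->dst_dev = ddev;
    src->dst_off = doff;
}

/* One user swaps in. Returns 0 and fills `out`, or -1 if the slot is free. */
static int swap_in(int dev, int off, char out[PAGE_BYTES])
{
    struct slot *s = &devs[dev][off];

    if (s->kind == SLOT_FREE)
        return -1;
    if (s->kind == SLOT_MIGRATION) {
        /* Redirect: the destination device performs the read. */
        if (swap_in(s->dst_dev, s->dst_off, out) != 0)
            return -1;
    } else {
        memcpy(out, s->data, PAGE_BYTES);
    }
    /* The slot stays occupied until its last user has swapped in. */
    if (--s->users == 0)
        s->kind = SLOT_FREE;
    return 0;
}
```

In this model, calling store(0, 3, "hello", 2) followed by
migrate(0, 3, 1, 5) leaves slot 3 on device 0 as a migration entry.
The first swap_in(0, 3, buf) is served from device 1 and the source
slot remains occupied; only the second swap-in drops the count to zero
and frees it, which is the limitation Chris describes.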