From: Kairui Song
Date: Wed, 5 Feb 2025 03:25:20 +0800
Subject: Re: [LSF/MM/BPF TOPIC] Integrate Swap Cache, Swap Maps with Swap Allocator
To: Johannes Weiner
Cc: Yosry Ahmed, lsf-pc@lists.linux-foundation.org, linux-mm, Andrew Morton, Chris Li, Chengming Zhou, Shakeel Butt, Hugh Dickins, Matthew Wilcox, Barry Song <21cnbao@gmail.com>, Nhat Pham, Usama Arif, Ryan Roberts, "Huang, Ying"
References: <20250204162426.GB705532@cmpxchg.org> <20250204190904.GC705532@cmpxchg.org>
In-Reply-To: <20250204190904.GC705532@cmpxchg.org>

On Wed, Feb 5, 2025 at 3:09 AM Johannes Weiner wrote:
>
> On Wed, Feb 05, 2025 at 02:38:39AM +0800, Kairui Song wrote:
> > On Wed, Feb 5, 2025 at 2:11 AM Yosry Ahmed wrote:
> > > However, what we should *not* do is have these clusters be tied to the
> > > disk swap space with the ability to redirect some entries to use
> > > something like zswap.
> > > This does not fix the problem Johannes is
> > > describing.
> >
> > Yes, a virtual swap file can have its own swap space, which is indexed
> > by the cache / table, and reuse all the logic. As long as we don't
> > dramatically change the kernel swapout path, adding a folio to
> > swapcache seems a very reasonable way to avoid redundant IO, and
> > synchronize it upon swapin/swapout, and reusing a lot of
> > infrastructure, even if that's a virtual file. For example a current
> > busy loop issue can be just fixed by leveraging the folio lock:
> > https://lore.kernel.org/lkml/CAMgjq7D5qoFEK9Omvd5_Zqs6M+TEoG03+2i_mhuP5CQPSOPrmQ@mail.gmail.com/
> >
> > The virtual file/space can be decoupled from the lower device. But the
> > virtual file/space's table entry can point to an underlying physical
> > SWAP device or some meta struct.
>
> It's a bit unclear to me still which level will use the struct
> swap_cluster_info in the layered scenario.
>
> Would it be the virtual address space, where ->table has tagged
> pointers to resolve to swapcache/zeromap/zswap/swapfile?
>
> Or would it be the swapfile space, where ->table resolves to disk
> slots?
>
> Or are you proposing to use the same struct on both levels, with
> ->table catering to different needs?

I was thinking about the first case: in the virtual address space,
->table[n] will resolve to an offset in a lower (physical) layer or to
some other meta structure. But we would still reuse the same struct for
both layers. On the lower layer, the table could be in dense mode for
used clusters (3 bytes (memcg + count), or even 1 bit per entry,
depending on how we want to store info like memcg_id).

This also brings a nice side effect (feature): we can have multiple
swap files/devices, and if the upper one (virtual or not) is full, it
can fall back to the lower one just fine.
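To make the first case a bit more concrete, here is a rough userspace
sketch of a tagged ->table entry in the virtual layer. It is only an
illustration under my own assumptions, not the actual kernel layout:
the tag values, the 2-bit tag width, and all names (struct cluster,
mk_lower, mk_meta, ...) are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical tags kept in the low 2 bits of each table entry. */
enum vtag { VT_FREE = 0, VT_LOWER = 1, VT_META = 2 };

#define VT_TAG_BITS 2
#define VT_TAG_MASK ((uintptr_t)((1u << VT_TAG_BITS) - 1))
#define CLUSTER_SLOTS 8 /* tiny on purpose, illustrative only */

/* One struct for both layers; only the meaning of table[] differs. */
struct cluster {
	uintptr_t table[CLUSTER_SLOTS];
};

/* Encode "this slot lives at offset off in the lower (physical) layer". */
static uintptr_t mk_lower(unsigned long off)
{
	return ((uintptr_t)off << VT_TAG_BITS) | VT_LOWER;
}

/* Encode a pointer to some meta structure; it must be 4-byte aligned. */
static uintptr_t mk_meta(void *p)
{
	assert(((uintptr_t)p & VT_TAG_MASK) == 0);
	return (uintptr_t)p | VT_META;
}

static enum vtag entry_tag(uintptr_t e)
{
	return (enum vtag)(e & VT_TAG_MASK);
}

/* Decode the lower-layer offset from a VT_LOWER entry. */
static unsigned long entry_lower_off(uintptr_t e)
{
	assert(entry_tag(e) == VT_LOWER);
	return (unsigned long)(e >> VT_TAG_BITS);
}

/* Decode the meta pointer from a VT_META entry. */
static void *entry_meta(uintptr_t e)
{
	assert(entry_tag(e) == VT_META);
	return (void *)(e & ~VT_TAG_MASK);
}
```

A lookup in the virtual layer would read table[n], check the tag, and
then either go to the lower layer with the decoded offset or follow the
meta pointer. A zeroed entry reads as free.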
>
> Keep in mind, in the virtualized case, it's the top layer that would
> have to keep track of the page table count, the swapcache pointer and
> likely the memcg linkage. That also means the physical layer could
> likely be reduced to a single bit per entry - used or free.
>
> I suppose void *table could also point to such a bitmap? But not sure
> about the other members that would become redundant/unused.

That's very doable. I also wanted shmem to have a dense table, which
may also reduce each entry to a single bit.
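
As a rough illustration of the one-bit-per-entry idea, together with
the fall-back between devices, here is a small userspace sketch. It is
not kernel code and makes several assumptions: the cluster size, the
first-fit scan, and all names (struct dense_cluster, dense_alloc,
alloc_with_fallback, ...) are made up for this example.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

#define SLOTS_PER_CLUSTER 64 /* illustrative size only */
#define BITS_PER_WORD (sizeof(unsigned long) * CHAR_BIT)
#define WORDS ((SLOTS_PER_CLUSTER + BITS_PER_WORD - 1) / BITS_PER_WORD)

/* Dense table for a physical layer: one bit per entry, 1 = used. */
struct dense_cluster {
	unsigned long used[WORDS];
};

static int dense_is_used(const struct dense_cluster *c, size_t slot)
{
	assert(slot < SLOTS_PER_CLUSTER);
	return (c->used[slot / BITS_PER_WORD] >> (slot % BITS_PER_WORD)) & 1UL;
}

/* First-fit allocation; returns the slot index, or -1 if the cluster is full. */
static int dense_alloc(struct dense_cluster *c)
{
	for (size_t i = 0; i < SLOTS_PER_CLUSTER; i++) {
		if (!dense_is_used(c, i)) {
			c->used[i / BITS_PER_WORD] |= 1UL << (i % BITS_PER_WORD);
			return (int)i;
		}
	}
	return -1;
}

static void dense_free(struct dense_cluster *c, size_t slot)
{
	assert(slot < SLOTS_PER_CLUSTER);
	c->used[slot / BITS_PER_WORD] &= ~(1UL << (slot % BITS_PER_WORD));
}

/*
 * Try each device in priority order and fall back to the next one when
 * the current one is full. Returns the device index and stores the slot
 * in *slot, or returns -1 if every device is full.
 */
static int alloc_with_fallback(struct dense_cluster *devs, size_t ndevs,
			       int *slot)
{
	for (size_t d = 0; d < ndevs; d++) {
		int s = dense_alloc(&devs[d]);

		if (s >= 0) {
			*slot = s;
			return (int)d;
		}
	}
	return -1;
}
```

In this sketch the count, swapcache pointer and memcg info are assumed
to live in the upper layer, so the lower layer only has to answer
"used or free".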