From mboxrd@z Thu Jan 1 00:00:00 1970
References: <20240417160842.76665-1-ryncsn@gmail.com> <87zftlx25p.fsf@yhuang6-desk2.ccr.corp.intel.com> <87o79zsdku.fsf@yhuang6-desk2.ccr.corp.intel.com>
In-Reply-To: <87o79zsdku.fsf@yhuang6-desk2.ccr.corp.intel.com>
From: Chris Li
Date: Fri, 26 Apr 2024 16:16:01 -0700
Subject: Re: [PATCH 0/8] mm/swap: optimize swap cache search space
To: "Huang, Ying"
Cc: Matthew Wilcox, Kairui Song, linux-mm@kvack.org, Kairui Song,
 Andrew Morton, Barry Song, Ryan Roberts, Neil Brown, Minchan Kim,
 Hugh Dickins, David Hildenbrand, Yosry Ahmed,
 linux-fsdevel@vger.kernel.org, linux-kernel@vger.kernel.org

Hi Ying,

On Tue, Apr 23, 2024 at 7:26 PM Huang, Ying wrote:
>
> Hi, Matthew,
>
> Matthew Wilcox writes:
>
> > On Mon, Apr 22, 2024 at 03:54:58PM +0800, Huang, Ying wrote:
> >> Is it possible to add "start_offset" support in xarray, so "index"
> >> will subtract "start_offset" before looking up / inserting?
> >
> > We kind of have that with XA_FLAGS_ZERO_BUSY which is used for
> > XA_FLAGS_ALLOC1.  But that's just one bit for the entry at 0.  We could
> > generalise it, but then we'd have to store that somewhere and there's
> > no obvious good place to store it that wouldn't enlarge struct xarray,
> > which I'd be reluctant to do.
> >
> >> Is it possible to use multiple range locks to protect one xarray to
> >> improve the lock scalability?  This is why we have multiple "struct
> >> address_space" for one swap device.
> >> And, we may have same lock
> >> contention issue for large files too.
> >
> > It's something I've considered.  The issue is search marks.  If we delete
> > an entry, we may have to walk all the way up the xarray clearing bits as
> > we go and I'd rather not grab a lock at each level.  There's a convenient
> > 4 byte hole between nr_values and parent where we could put it.
> >
> > Oh, another issue is that we use i_pages.xa_lock to synchronise
> > address_space.nrpages, so I'm not sure that a per-node lock will help.
>
> Thanks for looking at this.
>
> > But I'm conscious that there are workloads which show contention on
> > xa_lock as their limiting factor, so I'm open to ideas to improve all
> > these things.
>
> I have no idea so far because my very limited knowledge about xarray.

For the swap file usage, I have been considering an idea to remove the
index part of the xarray from the swap cache. The swap cache differs
from the file cache in a few aspects.

For one, if we want a folio equivalent of a "large swap entry", then
the natural alignment of those swap offsets does not make sense.
Ideally we should be able to write the folio to unaligned swap file
locations.

The other aspect for swap files is that we already have different data
structures organized around the swap offset: swap_map and swap_cgroup.
If we group the swap-related data structures together, we can add a
pointer to a union of a folio or a shadow swap entry. We can use atomic
updates on the swap struct member, or break down the access lock by
ranges just like the swap cluster does.

I want to discuss those ideas at the upcoming LSF/MM meetup as well.

Chris
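
For illustration only, here is a rough user-space sketch of the grouped per-offset record described above, with atomic updates on a single "folio or shadow" word. Everything in it is hypothetical: struct swap_slot, its fields, and the slot_* helpers are made-up names, not existing kernel structures or APIs. The low-bit tag for shadow values is an assumption borrowed from how the xarray encodes value entries, and C11 atomics stand in for whatever the kernel would actually use.

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

struct folio;				/* opaque stand-in for the kernel type */

/* Hypothetical per-offset record, indexed directly by swap offset. */
struct swap_slot {
	unsigned char count;		/* the kind of data swap_map[] holds */
	unsigned short cgroup_id;	/* the kind of data swap_cgroup holds */
	_Atomic uintptr_t cache;	/* 0, a folio pointer, or a tagged shadow */
};

/* Assumption: folio pointers are at least 2-byte aligned, so bit 0 is free. */
#define SLOT_SHADOW_TAG ((uintptr_t)1)

static inline bool slot_is_shadow(uintptr_t v)
{
	return (v & SLOT_SHADOW_TAG) != 0;
}

/* Look up the cached folio; returns NULL for an empty slot or a shadow. */
static inline struct folio *slot_lookup(struct swap_slot *s)
{
	uintptr_t v = atomic_load(&s->cache);

	return (v && !slot_is_shadow(v)) ? (struct folio *)v : NULL;
}

/*
 * Install a folio with compare-and-swap, no lock taken. Fails if another
 * folio is already cached. On success, *old_shadow receives the shadow
 * value that was replaced, or 0 if the slot was empty.
 */
static inline bool slot_add_folio(struct swap_slot *s, struct folio *f,
				  uintptr_t *old_shadow)
{
	uintptr_t old = atomic_load(&s->cache);

	do {
		if (old && !slot_is_shadow(old))
			return false;
		/* a failed exchange reloads 'old', so the check is redone */
	} while (!atomic_compare_exchange_weak(&s->cache, &old, (uintptr_t)f));

	*old_shadow = slot_is_shadow(old) ? old >> 1 : 0;
	return true;
}

/* Drop the folio and leave a tagged shadow value in its place. */
static inline void slot_replace_with_shadow(struct swap_slot *s,
					    uintptr_t value)
{
	atomic_store(&s->cache, (value << 1) | SLOT_SHADOW_TAG);
}
```

With an array of such records per swap device, a lookup is just slots[offset] with no tree walk, and large folios need no offset alignment. A range lock per group of slots could replace the compare-and-swap where several fields must change together.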