From: Chris Li <chrisl@kernel.org>
Date: Sat, 27 Apr 2024 19:43:20 -0700
Subject: Re: [PATCH 0/8] mm/swap: optimize swap cache search space
To: "Huang, Ying"
Cc: Matthew Wilcox, Kairui Song, linux-mm@kvack.org, Andrew Morton, Barry Song, Ryan Roberts, Neil Brown, Minchan Kim, Hugh Dickins, David Hildenbrand, Yosry Ahmed, linux-fsdevel@vger.kernel.org, linux-kernel@vger.kernel.org
In-Reply-To: <87bk5uqoem.fsf@yhuang6-desk2.ccr.corp.intel.com>
On Sat, Apr 27, 2024 at 6:16 PM Huang, Ying wrote:
>
> Chris Li writes:
>
> > Hi Ying,
> >
> > For the swap file usage, I have been considering an idea to remove
> > the index part of the xarray from the swap cache. The swap cache is
> > different from the file cache in a few aspects.
> > For one, if we want a folio equivalent of a "large swap entry", then
> > the natural alignment of those swap offsets does not make sense.
> > Ideally we should be able to write the folio to unaligned locations
> > in the swap file.
> >
> > The other aspect of swap files is that we already have different
> > data structures organized around the swap offset: swap_map and
> > swap_cgroup. If we group the swap-related data structures together,
> > we can add a pointer to a union of a folio or a shadow swap entry.
>
> The shadow swap entry may be freed. So we need to prepare for that.

Freeing the shadow swap entry will just set the pointer to NULL. Are you
concerned that the memory allocated for the pointer is not freed back to
the system after the shadow swap entry is freed? It will be subject to
fragmentation on the free swap entry. In that regard, the xarray is also
subject to fragmentation: it will not free an internal node while that
node still has a single xa_index in use. Even when an xarray node is
freed to the slab, there is fragmentation at the slab level as well; the
backing page might not be freed back to the system.

> And, in the current design, only swap_map[] is allocated if the swap
> space isn't used. That needs to be considered too.

I am aware of that. I want to make swap_map[] no longer statically
allocated either. The static allocation of swap_map forces the rest of
the swap data structures to find other means to allocate their data
sparsely, repeating the fragmentation elsewhere in different ways. That
is also one major source of pain when hacking on the swap code: the data
is spread across too many different places.

> > We can use atomic updates on the swap struct members, or break down
> > the access lock by ranges just as the swap cluster does.
>
> The swap code uses the xarray in a simple way. That gives us an
> opportunity to optimize. For example, it makes it easy to use multiple
> xarrays.

The fixed swap offset range makes it behave like an array. There are
many ways to shard an array of swap entries; the swap cluster is one way
to shard it, and multiple xarrays are another way.
We can also do multiple-xarray-like sharding, or even fancier schemes.

Chris
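[Editor's note: as a rough userspace sketch of the grouped per-slot structure discussed above — one record holding the swap_map count plus a single pointer-sized field that is either a folio pointer or a tagged shadow entry. All names (`swap_slot`, `SHADOW_TAG`, the helpers) are hypothetical, not existing kernel code; the low-bit tag mirrors the value-entry trick the page cache uses.]

```c
#include <assert.h>
#include <stdint.h>
#include <stddef.h>

/* Hypothetical stand-in for the kernel's struct folio. */
struct folio { int dummy; };

/*
 * One record per swap slot: the reference count that today lives in
 * swap_map[], plus one pointer-sized word that holds either a folio
 * (slot is in the swap cache) or a shadow entry (workingset info),
 * distinguished by the low bit.
 */
struct swap_slot {
	uint8_t count;            /* replaces the swap_map[] byte */
	uintptr_t folio_or_shadow;
};

#define SHADOW_TAG 1UL

static void slot_set_folio(struct swap_slot *s, struct folio *f)
{
	/* Folio pointers are at least 2-byte aligned, so bit 0 is free. */
	s->folio_or_shadow = (uintptr_t)f;
}

static void slot_set_shadow(struct swap_slot *s, uintptr_t shadow)
{
	s->folio_or_shadow = (shadow << 1) | SHADOW_TAG;
}

static int slot_has_shadow(const struct swap_slot *s)
{
	return (int)(s->folio_or_shadow & SHADOW_TAG);
}

/* "Freeing" the shadow is just storing NULL, as argued in the mail. */
static void slot_clear(struct swap_slot *s)
{
	s->folio_or_shadow = 0;
}
```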
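[Editor's note: the "break down the access lock by ranges just as the swap cluster does" idea can be sketched like this — a flat slot array sharded into fixed-size ranges, each guarded by its own lock. `SHARD_SIZE` and all helper names are invented for illustration; this is pthread userspace code, not the kernel's spinlock-per-cluster implementation.]

```c
#include <assert.h>
#include <pthread.h>

enum {
	NR_SLOTS   = 1024,
	SHARD_SIZE = 64,                 /* slots per lock, like one cluster */
	NR_SHARDS  = NR_SLOTS / SHARD_SIZE,
};

static unsigned char slot_count[NR_SLOTS];   /* stand-in for swap_map[] */
static pthread_mutex_t shard_lock[NR_SHARDS];

static void shards_init(void)
{
	for (int i = 0; i < NR_SHARDS; i++)
		pthread_mutex_init(&shard_lock[i], NULL);
}

/* Map a slot index to the lock covering its range. */
static pthread_mutex_t *lock_for(unsigned int slot)
{
	return &shard_lock[slot / SHARD_SIZE];
}

/* Bump a slot's count while holding only its shard's lock, so updates
 * to slots in different ranges never contend with each other. */
static void slot_get(unsigned int slot)
{
	pthread_mutex_lock(lock_for(slot));
	slot_count[slot]++;
	pthread_mutex_unlock(lock_for(slot));
}
```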