From: Kairui Song
Date: Mon, 29 Apr 2024 01:37:04 +0800
Subject: Re: [PATCH 0/8] mm/swap: optimize swap cache search space
To: Chris Li
Cc: "Huang, Ying", Matthew Wilcox, linux-mm@kvack.org, Andrew Morton, Barry Song, Ryan Roberts, Neil Brown, Minchan Kim, Hugh Dickins, David Hildenbrand, Yosry Ahmed, linux-fsdevel@vger.kernel.org, linux-kernel@vger.kernel.org
References: <20240417160842.76665-1-ryncsn@gmail.com> <87zftlx25p.fsf@yhuang6-desk2.ccr.corp.intel.com> <87o79zsdku.fsf@yhuang6-desk2.ccr.corp.intel.com>

On Sat, Apr 27, 2024 at 7:16 AM Chris Li wrote:
>
> Hi Ying,
>
> On Tue, Apr 23, 2024 at 7:26 PM Huang, Ying wrote:
> >
> > Hi, Matthew,
> >
> > Matthew Wilcox writes:
> >
> > > On Mon, Apr 22, 2024 at 03:54:58PM +0800, Huang, Ying wrote:
> > >> Is it possible to add "start_offset"
> > >> support in xarray, so "index"
> > >> will subtract "start_offset" before looking up / inserting?
> > >
> > > We kind of have that with XA_FLAGS_ZERO_BUSY which is used for
> > > XA_FLAGS_ALLOC1. But that's just one bit for the entry at 0. We could
> > > generalise it, but then we'd have to store that somewhere and there's
> > > no obvious good place to store it that wouldn't enlarge struct xarray,
> > > which I'd be reluctant to do.
> > >
> > >> Is it possible to use multiple range locks to protect one xarray to
> > >> improve the lock scalability? This is why we have multiple "struct
> > >> address_space" for one swap device. And, we may have same lock
> > >> contention issue for large files too.
> > >
> > > It's something I've considered. The issue is search marks. If we delete
> > > an entry, we may have to walk all the way up the xarray clearing bits as
> > > we go and I'd rather not grab a lock at each level. There's a convenient
> > > 4 byte hole between nr_values and parent where we could put it.
> > >
> > > Oh, another issue is that we use i_pages.xa_lock to synchronise
> > > address_space.nrpages, so I'm not sure that a per-node lock will help.
> >
> > Thanks for looking at this.
> >
> > > But I'm conscious that there are workloads which show contention on
> > > xa_lock as their limiting factor, so I'm open to ideas to improve all
> > > these things.
> >
> > I have no idea so far because my very limited knowledge about xarray.
>
> For the swap file usage, I have been considering an idea to remove the
> index part of the xarray from swap cache. Swap cache is different from
> file cache in a few aspects.
> For one if we want to have a folio equivalent of "large swap entry".
> Then the natural alignment of those swap offset on does not make
> sense. Ideally we should be able to write the folio to un-aligned swap
> file locations.
>

Hi Chris,

This sounds interesting, I have a few questions though...
Are you suggesting we handle swap on file and swap on device
differently? I think swap on file is used much less often than swap on
device.

And what do you mean by "index part of the xarray"? If we need a cache,
xarray still seems to be one of the best choices to hold the content.

> The other aspect for swap files is that, we already have different
> data structures organized around swap offset, swap_map and
> swap_cgroup. If we group the swap related data structure together. We
> can add a pointer to a union of folio or a shadow swap entry. We can
> use atomic updates on the swap struct member or breakdown the access
> lock by ranges just like swap cluster does.
>
> I want to discuss those ideas in the upcoming LSF/MM meet up as well.

Looking forward to it!

>
> Chris
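
For illustration only, here is one way to read the per-offset idea quoted above (group the swap_map count, the swap_cgroup id, and a folio-or-shadow word in one entry indexed by swap offset, then update the cache word atomically). This is a minimal userspace C11 sketch, not kernel code: every demo_* name is hypothetical, and the low-bit tagging plus compare-and-exchange scheme is an assumption made for the example, not necessarily what the proposal has in mind.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for struct folio; not the kernel type. */
struct demo_folio { int id; };

/* Assumption: low bit set means the word holds a shadow value, not a pointer. */
#define DEMO_SHADOW_BIT ((uintptr_t)1)

/* One entry per swap offset, grouping data that is kept in separate arrays today. */
struct demo_swap_slot {
	unsigned char map_count;   /* plays the role of a swap_map[] entry */
	unsigned short cgroup_id;  /* plays the role of a swap_cgroup entry */
	_Atomic uintptr_t cache;   /* 0 = empty, folio pointer, or tagged shadow */
};

static inline uintptr_t demo_from_folio(struct demo_folio *folio)
{
	/* Pointers must be at least 2-byte aligned for the tag bit to be free. */
	assert(((uintptr_t)folio & DEMO_SHADOW_BIT) == 0);
	return (uintptr_t)folio;
}

static inline uintptr_t demo_from_shadow(uintptr_t value)
{
	return (value << 1) | DEMO_SHADOW_BIT;
}

static inline bool demo_is_shadow(uintptr_t entry)
{
	return (entry & DEMO_SHADOW_BIT) != 0;
}

/* Lockless lookup: the swap offset is the array index, no tree walk. */
static struct demo_folio *demo_cache_lookup(struct demo_swap_slot *slots, size_t offset)
{
	uintptr_t entry = atomic_load(&slots[offset].cache);

	if (entry == 0 || demo_is_shadow(entry))
		return NULL;
	return (struct demo_folio *)entry;
}

/*
 * Install a folio if the slot is empty or holds a shadow. On success the
 * previous shadow value (or 0) is returned through *old_shadow. Fails
 * without side effects if another folio is already cached.
 */
static bool demo_cache_add(struct demo_swap_slot *slots, size_t offset,
			   struct demo_folio *folio, uintptr_t *old_shadow)
{
	uintptr_t old = atomic_load(&slots[offset].cache);

	do {
		if (old != 0 && !demo_is_shadow(old))
			return false;
	} while (!atomic_compare_exchange_weak(&slots[offset].cache, &old,
					       demo_from_folio(folio)));

	*old_shadow = demo_is_shadow(old) ? old >> 1 : 0;
	return true;
}

/* Replace a cached folio with a shadow value; fails if the slot changed. */
static bool demo_cache_evict(struct demo_swap_slot *slots, size_t offset,
			     struct demo_folio *folio, uintptr_t shadow)
{
	uintptr_t expected = demo_from_folio(folio);

	return atomic_compare_exchange_strong(&slots[offset].cache, &expected,
					      demo_from_shadow(shadow));
}
```

In this sketch a caller would allocate a zero-initialized array of struct demo_swap_slot sized to the swap device and index it directly by swap offset. The sketch says nothing about large folios spanning several offsets or about how the count and cgroup fields would be synchronized, which are exactly the parts that would need discussion.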