From: Kairui Song <ryncsn@gmail.com>
Date: Mon, 22 Apr 2024 23:20:19 +0800
Subject: Re: [PATCH 0/8] mm/swap: optimize swap cache search space
To: "Huang, Ying"
Cc: Matthew Wilcox, linux-mm@kvack.org, Andrew Morton, Chris Li, Barry Song,
 Ryan Roberts, Neil Brown, Minchan Kim, Hugh Dickins, David Hildenbrand,
 Yosry Ahmed, linux-fsdevel@vger.kernel.org, linux-kernel@vger.kernel.org
In-Reply-To: <87zftlx25p.fsf@yhuang6-desk2.ccr.corp.intel.com>
References: <20240417160842.76665-1-ryncsn@gmail.com>
 <87zftlx25p.fsf@yhuang6-desk2.ccr.corp.intel.com>
On Mon, Apr 22, 2024 at 3:56 PM Huang, Ying wrote:
>
> Hi, Kairui,
>
> Kairui Song writes:
>
> > From: Kairui Song
> >
> > Currently we use one swap_address_space for every 64M chunk to reduce
> > lock contention; this is like having a set of smaller swap files inside
> > one big swap file. But when doing a swap cache lookup or insert, we are
> > still using the offset into the whole large swap file. This is OK for
> > correctness, as the offset (key) is unique.
> >
> > But the XArray is specially optimized for small indexes: it creates the
> > radix tree levels lazily, just enough to fit the largest key stored in
> > one XArray. So we are wasting tree nodes unnecessarily.
> >
> > For a 64M chunk it should take at most 3 levels to contain everything.
> > But we are using the offset from the whole swap file, so the offset (key)
> > value will be way beyond 64M, and so will the tree depth.
> >
> > Optimize this by reducing the swap cache search space to a 64M scope.
>

Hi,

Thanks for the comments!

> In general, I think that it makes sense to reduce the depth of the
> xarray.
>
> One concern is that IIUC we make swap cache behave like file cache if
> possible. And your change makes swap cache and file cache diverge more.
> Is it possible for us to keep them similar?

So far in this series I think there is no problem with that: the two main
helpers for retrieving the file & cache offset, folio_index and
folio_file_pos, will work fine and remain compatible with current users.

And if we convert to sharing the filemap_* functions for swap cache / page
cache, they mostly already accept an index as an argument, so no trouble
at all.

> For example,
>
> Is it possible to return the offset inside the 64M range in
> __page_file_index() (maybe rename it)?

I'm not sure what you mean by this; __page_file_index will be gone as we
convert to folios. This series does delete / rename it (this might not be
easy to see: the usage of these helpers was not well organized before this
series, so some cleanup is involved).

It was previously only used through page_index (now deleted) / folio_index,
and folio_index now returns the offset inside the 64M range. I guess I just
did what you wanted? :)

My cover letter and commit message might not be clear enough; I can update
them.
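To make the index arithmetic concrete, here is a rough userspace sketch --
not code from the series. It only assumes the existing convention of one
swap address_space per 64MB chunk (SWAP_ADDRESS_SPACE_SHIFT == 14, i.e.
2^14 4K pages) and approximates the xarray fan-out as 64 slots per node
(XA_CHUNK_SHIFT == 6):

/*
 * Illustrative only -- not code from the series.  Models one swap
 * address_space per 64MB chunk and a 64-slot xarray node fan-out.
 */
#include <stdio.h>

#define SWAP_ADDRESS_SPACE_SHIFT 14
#define SWAP_ADDRESS_SPACE_PAGES (1UL << SWAP_ADDRESS_SPACE_SHIFT)
#define SWAP_ADDRESS_SPACE_MASK  (SWAP_ADDRESS_SPACE_PAGES - 1)
#define XA_CHUNK_SHIFT           6

/* Rough tree height needed to hold keys up to max_index. */
static unsigned int xa_levels_for(unsigned long max_index)
{
	unsigned int levels = 1;

	while (max_index >> XA_CHUNK_SHIFT) {
		max_index >>= XA_CHUNK_SHIFT;
		levels++;
	}
	return levels;
}

int main(void)
{
	/* A swap offset deep inside a large (~16GB) swap file. */
	unsigned long offset = 0x3f2a17;
	/* Which 64MB chunk (address_space) the entry belongs to. */
	unsigned long space  = offset >> SWAP_ADDRESS_SPACE_SHIFT;
	/* The key stored in that chunk's tree once the search space is reduced. */
	unsigned long index  = offset & SWAP_ADDRESS_SPACE_MASK;

	printf("space %lu: full offset needs %u levels, in-chunk index needs %u\n",
	       space, xa_levels_for(offset), xa_levels_for(index));
	return 0;
}

The in-chunk key never exceeds 2^14 - 1, so the per-chunk tree stays at
3 levels no matter how large the swap file is.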
> Is it possible to add "start_offset" support in xarray, so "index"
> will subtract "start_offset" before looking up / inserting?

The xarray struct already seems very full, and this usage doesn't look
generic to me; it might be better to fix this kind of issue case by case.

> Is it possible to use multiple range locks to protect one xarray to
> improve the lock scalability? This is why we have multiple "struct
> address_space" for one swap device. And, we may have the same lock
> contention issue for large files too.

Good question. This series can improve the tree depth issue for the swap
cache, but contention on the address_space is still a thing. A more
generic solution might involve changes to the xarray API, or some other
data structure?

(BTW, I think reducing the search space and resolving lock contention are
not necessarily related; reducing the search space by having a large table
of small trees should still perform better for the swap cache.)

> I haven't looked at the code in detail, so my idea may not make sense
> at all. If so, sorry about that.
>
> Hi, Matthew,
>
> Can you teach me on this too?
>
> --
> Best Regards,
> Huang, Ying
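P.S. a structural sketch of the "large table of small trees" idea above --
again not code from the series, and the struct/helper names here are made
up; only the 64MB-per-space constant follows the existing convention:

/*
 * One xarray per 64MB chunk of the swap device: each tree is indexed
 * 0..2^14-1 (at most 3 levels) and its internal xa_lock only sees
 * traffic for that chunk.  Names are illustrative only.
 */
#include <linux/types.h>
#include <linux/xarray.h>

#define SWAP_ADDRESS_SPACE_SHIFT 14	/* 2^14 4K pages == 64MB per chunk */
#define SWAP_ADDRESS_SPACE_MASK	 ((1UL << SWAP_ADDRESS_SPACE_SHIFT) - 1)

struct swap_cache_table {
	unsigned long	 nr_spaces;	/* swap size in pages >> SWAP_ADDRESS_SPACE_SHIFT */
	struct xarray	*spaces;	/* one small tree per 64MB chunk */
};

/* Pick the chunk's tree and the in-chunk key for a given swap offset. */
static struct xarray *swap_cache_tree(struct swap_cache_table *table,
				      pgoff_t offset, pgoff_t *index)
{
	*index = offset & SWAP_ADDRESS_SPACE_MASK;
	return &table->spaces[offset >> SWAP_ADDRESS_SPACE_SHIFT];
}

Lock contention then scales with how hot a single 64MB chunk is rather than
with the total swap size, which is the same property the current
multiple-address_space scheme already provides.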