From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by smtp.lore.kernel.org (Postfix) with ESMTP id A1C22C54E60 for ; Thu, 14 Mar 2024 13:12:42 +0000 (UTC) Received: by kanga.kvack.org (Postfix) id 1FE0B800A4; Thu, 14 Mar 2024 09:12:42 -0400 (EDT) Received: by kanga.kvack.org (Postfix, from userid 40) id 1AE1680073; Thu, 14 Mar 2024 09:12:42 -0400 (EDT) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id 04F3B800A4; Thu, 14 Mar 2024 09:12:41 -0400 (EDT) X-Delivered-To: linux-mm@kvack.org Received: from relay.hostedemail.com (smtprelay0014.hostedemail.com [216.40.44.14]) by kanga.kvack.org (Postfix) with ESMTP id E97AF80073 for ; Thu, 14 Mar 2024 09:12:41 -0400 (EDT) Received: from smtpin29.hostedemail.com (a10.router.float.18 [10.200.18.1]) by unirelay02.hostedemail.com (Postfix) with ESMTP id 82EA6120933 for ; Thu, 14 Mar 2024 13:12:41 +0000 (UTC) X-FDA: 81895684122.29.674117F Received: from mail-ed1-f43.google.com (mail-ed1-f43.google.com [209.85.208.43]) by imf05.hostedemail.com (Postfix) with ESMTP id 99488100023 for ; Thu, 14 Mar 2024 13:12:38 +0000 (UTC) Authentication-Results: imf05.hostedemail.com; dkim=pass header.d=gmail.com header.s=20230601 header.b=HamvDyYC; dmarc=pass (policy=none) header.from=gmail.com; spf=pass (imf05.hostedemail.com: domain of chuanhuahan@gmail.com designates 209.85.208.43 as permitted sender) smtp.mailfrom=chuanhuahan@gmail.com ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=hostedemail.com; s=arc-20220608; t=1710421958; h=from:from:sender:reply-to:subject:subject:date:date: message-id:message-id:to:to:cc:cc:mime-version:mime-version: content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references:dkim-signature; bh=H62lO+oi6cqZ8jJaBCDOw4WoFlGyuOK3o5bFL4QH/2Y=; b=GH8kVGdJsy9qlBG/JMLCovCKFMRtPc4Xc60RjmF4l1EWxIncAEqESpuPnda7dqLKOEMv5w bkWsJLi8pgNfSw2tkLrpUuMfrRvWrBrRfnueeRmkxUGkMzz+6G1iSKPd4G77mU13JS87iz S2vztNaLlrTIjKp8Azu8YAM8Sa+6GO8= ARC-Authentication-Results: i=1; imf05.hostedemail.com; dkim=pass header.d=gmail.com header.s=20230601 header.b=HamvDyYC; dmarc=pass (policy=none) header.from=gmail.com; spf=pass (imf05.hostedemail.com: domain of chuanhuahan@gmail.com designates 209.85.208.43 as permitted sender) smtp.mailfrom=chuanhuahan@gmail.com ARC-Seal: i=1; s=arc-20220608; d=hostedemail.com; t=1710421958; a=rsa-sha256; cv=none; b=pkKS0fH1uF7V14Ur1LXN8Q2Pr9BdbsijKyIQ+18Np83/hRF9FCayvR154QhCsjLV5+JMFK Dawxty/WELyHtEyRVPKdulXdMfjm8uEj+8V32Crhv3q0krubGW4F3kEH6eoEk9A3GJlzoC auBVqgRtENc/+gnPvVzcm+N6iQWo6x4= Received: by mail-ed1-f43.google.com with SMTP id 4fb4d7f45d1cf-5687e7662a5so1312770a12.0 for ; Thu, 14 Mar 2024 06:12:38 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20230601; t=1710421957; x=1711026757; darn=kvack.org; h=content-transfer-encoding:cc:to:subject:message-id:date:from :in-reply-to:references:mime-version:from:to:cc:subject:date :message-id:reply-to; bh=H62lO+oi6cqZ8jJaBCDOw4WoFlGyuOK3o5bFL4QH/2Y=; b=HamvDyYCD3hrVtu1HY7CmTsbV6MUMAyocKhLU+YTVBpu24Rq+tsJKNdMJedhpXNbO0 S3uonmbfO2xQa+QJ0ykiwUQEW2Vw1jWDZCAVqj3i/VhYLWOWuObIoUgUR445QW9e+YiG Aeq8gu6MrYttNOqER5ReGLNrlCz86BxgVaccxkQhVk39gVMP6cDh6QkBov5/gnX4I3wQ Nt4ax0t8v9zOyr6fjodTHpoUfv5WjP9qk5xMDRHoOHe1Fodndg7BkazNEVY09fv85Vnd ZIenLfG86qzYiYX2c9Czn6cp/AKQq2v4Lw8jyNLjmzqzZBkBeiPSmAbHr3B8dxKzY9DT qQ5A== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1710421957; x=1711026757; h=content-transfer-encoding:cc:to:subject:message-id:date:from :in-reply-to:references:mime-version:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=H62lO+oi6cqZ8jJaBCDOw4WoFlGyuOK3o5bFL4QH/2Y=; b=W2e81ap/sd+SI4+uUnyD4+UfLbZZnZ/+mYOKVAPrhxZti6CFFYssLSPiFpM/WOuShS 7KyPFmyIWL369i2i5TL5aKcSzXOTwLQep/2rKjQad9A3T1U/WO4sX0XFaxZUwJq6F0CH SI6n17DYwKagXWN8OPROzVcAvJRBYdl3EhR1etAbLxYPgulsjc5urvMWUpMAlGAKE5nI VUWhp7CIsscb/i5Xrx86fTOiqPo/wXnO+P0zZwu7JnkFIB5XEytCVgDYzueYpSdVYNbI yaxf/XnuUt/LqmQa+4MpXMJmswvLqf3UVeWNOgTjbW3lwUgha2p//ujnnb7PFrbfM0Tb 43AA== X-Forwarded-Encrypted: i=1; AJvYcCWH7ozKHc7zyOp3Bdb6FP4bBsHCMmzPe8p4vWj4Zr549rJTmEEYGRI/hMR/PkMeFxslnH3KmpiWJOHeOxASfti19Bg= X-Gm-Message-State: AOJu0Yx7dxc6TKKMBkROsSOSCxC5UcEvGA6wmxw+JOVuk7QJhWD4g4S/ v8HM4D2YcKF8eX+JQSezDOUWossp/DtyXp2cC+6OJt9+GJRqVKzdJFNpW3yV99j4XCc17VCiQA7 pRYcQ28F84Vc1tPfn3QuKPNxM7YnlcmcveWN1oQ== X-Google-Smtp-Source: AGHT+IGVeVLKvGGTbfxnWNv6CU4y+YUBNoogOO9HebA4vVKRVOECbhRQfqfq76ixhWrOIDUth6fTQBnfg3pOPg/Do64= X-Received: by 2002:a05:6402:2486:b0:566:4dc1:522c with SMTP id q6-20020a056402248600b005664dc1522cmr1405949eda.15.1710421956967; Thu, 14 Mar 2024 06:12:36 -0700 (PDT) MIME-Version: 1.0 References: <20240304081348.197341-1-21cnbao@gmail.com> <20240304081348.197341-3-21cnbao@gmail.com> <499a60c6-eeb8-4bbd-8563-9717c0d2e43d@arm.com> In-Reply-To: <499a60c6-eeb8-4bbd-8563-9717c0d2e43d@arm.com> From: Chuanhua Han Date: Thu, 14 Mar 2024 21:12:25 +0800 Message-ID: Subject: Re: [RFC PATCH v3 2/5] mm: swap: introduce swap_nr_free() for batched swap_free() To: Ryan Roberts Cc: Barry Song <21cnbao@gmail.com>, akpm@linux-foundation.org, linux-mm@kvack.org, chengming.zhou@linux.dev, chrisl@kernel.org, david@redhat.com, hannes@cmpxchg.org, kasong@tencent.com, linux-arm-kernel@lists.infradead.org, linux-kernel@vger.kernel.org, mhocko@suse.com, nphamcs@gmail.com, shy828301@gmail.com, steven.price@arm.com, surenb@google.com, wangkefeng.wang@huawei.com, willy@infradead.org, xiang@kernel.org, ying.huang@intel.com, yosryahmed@google.com, yuzhao@google.com, Chuanhua Han , Barry Song Content-Type: text/plain; charset="UTF-8" Content-Transfer-Encoding: quoted-printable X-Rspam-User: X-Rspamd-Server: rspam12 X-Rspamd-Queue-Id: 99488100023 X-Stat-Signature: ddt88r6zjw5tdx77h7swe7t3qam4dbcn X-HE-Tag: 1710421958-646540 X-HE-Meta: U2FsdGVkX1/NtVg2kewIeIdL00HBQ/L/rphYWY5ldJjMmIQLM7lHjeU3zaHMYVWrKuR/VnokpPPly/OJfRIM/gnpXUzg/8nWGJ0RrNS5i8+K3dMujUsWyvVzxAWhNFanfjs48lMnnf4xP+LJ5d6XCiAVLp2yVvQhDN1MxRi2BXsUW0aZH97RNRJKFhvfK7y++7yyV2Lu2Y0KrAWQZElaY7YWJErUJnW3eUt/607JqY36eUs8v2EO75xW2Lg55aNUd+vEwMYqIut7EuPmA2kKWs6kVRzmPXXTwCM1JcrEDXlVVbHB5BQYXjtWFd8GZ5K8XN3QTuXPmbkWuFYCyPzNaulW5a7CEvKKpU5gDiNr9eyNoatubMxQNIEhRWaR1R20JTBFYD45Bom+Z2dXDxIxPL66R8ypFNjsaiTNl53Cc4v25ReIUbNiBGmajMqPKjg8HqtsCfdt3ynnlpGjZnjFoz2p1ybrFJOqpk0zIUwrb3Lt9MRg8A5TgV+/tUzyA4d4YZ9y0b62Ch7YL6JwMHiirvy2Ez58sru7f9oW7ssa+cc/9YtPUgVV9gPuhfcbEn6VLYnZcE1rDyfCBzTTu+1dkgmnvfjBrn9PcVaCDXSvs4tCPlzQKZp36yE8hvflV6bS124sf6zmThtnK/MpwJg6Ho2FgerkDn92I9wfKHfUulAiuNdA4vmPtqhJlob8/j5fwrF6raititrpbD/ekyBwFBaI6nppIuw7XDCHnyO3qSJ38ihVL1ln1RYxVoT/6TD6e7KEDtVObeLs6IvmdHLWHd/4mDtTkpYwVzC6mcMdjwJozRrAWhdhxiHw6xsrTC4t+jEhlg6Ql2g/mZuvFmfxNVhFicRS1B7kPu+KU6wqsZIzIAI6AyGrZBpM0BS1OFvJTAxK7sb5Wp0gLv8HqHRWJ81tWl+5KVZMkpRX0FDAGuS1f1umEnKo8JEkAX8DhAog5cTiQdu7uyNu4q+j7Rr 9Hl5jpJc CyJjdsKN1VXXxSPa9wkpca7xuI0fVqJ4CU1ydV0smcEvyC0Kl/dGp9qp5+l1c+qOFSxj5Xl07WfdPEwj7ybzAfqZgEQ3x5EVjolP10d/0MLC5e2V76faajLPvVgxVT2bPyNHfH33ivIJrJRnen7tVNv+REjYqLqCuJSymVemWhHKVB19vGY3MNABf/02Qkh9sF/RxFuwgHqGgu5vtWQn2fPFEkDQe54Omd56uy2CZCidsWhy2sIOaqwYGbb+/ZDFuz+I4dcBJr/uQ3al1iBTskwH0g9MbXaR8sOX1viqRjMRbk0tT066EKCZ1KV9knGaQch/gs7LVcQsAdVlLzkSBvynzYpOS/jZddnFH/D9Zb+QSywX62IEeTnuAOU+GwnC2Xy1oeQG8ejWSOpb+IvsKc9lZd3nU0SWngLX9OnVc6sX1vgcxzycn5KL1T20ft6rMBwX9xjSbhn2twNSZ54ff2EAyCq7oH52g+Vrrhc3/MyJbnkytZ1X4w4+FQZlXGrhPYBkLQS7740P5Uzo= X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: List-Subscribe: List-Unsubscribe: Ryan Roberts =E4=BA=8E2024=E5=B9=B43=E6=9C=8812=E6= =97=A5=E5=91=A8=E4=BA=8C 02:51=E5=86=99=E9=81=93=EF=BC=9A > > On 04/03/2024 08:13, Barry Song wrote: > > From: Chuanhua Han > > > > While swapping in a large folio, we need to free swaps related to the w= hole > > folio. To avoid frequently acquiring and releasing swap locks, it is be= tter > > to introduce an API for batched free. > > > > Signed-off-by: Chuanhua Han > > Co-developed-by: Barry Song > > Signed-off-by: Barry Song > > --- > > include/linux/swap.h | 6 ++++++ > > mm/swapfile.c | 35 +++++++++++++++++++++++++++++++++++ > > 2 files changed, 41 insertions(+) > > > > diff --git a/include/linux/swap.h b/include/linux/swap.h > > index 2955f7a78d8d..d6ab27929458 100644 > > --- a/include/linux/swap.h > > +++ b/include/linux/swap.h > > @@ -481,6 +481,7 @@ extern void swap_shmem_alloc(swp_entry_t); > > extern int swap_duplicate(swp_entry_t); > > extern int swapcache_prepare(swp_entry_t); > > extern void swap_free(swp_entry_t); > > +extern void swap_nr_free(swp_entry_t entry, int nr_pages); > > nit: In my swap-out v4 series, I've created a batched version of > free_swap_and_cache() and called it free_swap_and_cache_nr(). Perhaps it = is > preferable to align the naming schemes - i.e. call this swap_free_nr(). Y= our > scheme doesn't really work when applied to free_swap_and_cache(). Thanks for your suggestions, and for the next version, we'll see which package is more appropriate! > > > extern void swapcache_free_entries(swp_entry_t *entries, int n); > > extern int free_swap_and_cache(swp_entry_t); > > int swap_type_of(dev_t device, sector_t offset); > > @@ -561,6 +562,11 @@ static inline void swap_free(swp_entry_t swp) > > { > > } > > > > +void swap_nr_free(swp_entry_t entry, int nr_pages) > > +{ > > + > > +} > > + > > static inline void put_swap_folio(struct folio *folio, swp_entry_t swp= ) > > { > > } > > diff --git a/mm/swapfile.c b/mm/swapfile.c > > index 3f594be83b58..244106998a69 100644 > > --- a/mm/swapfile.c > > +++ b/mm/swapfile.c > > @@ -1341,6 +1341,41 @@ void swap_free(swp_entry_t entry) > > __swap_entry_free(p, entry); > > } > > > > +/* > > + * Called after swapping in a large folio, batched free swap entries > > + * for this large folio, entry should be for the first subpage and > > + * its offset is aligned with nr_pages > > + */ > > +void swap_nr_free(swp_entry_t entry, int nr_pages) > > +{ > > + int i; > > + struct swap_cluster_info *ci; > > + struct swap_info_struct *p; > > + unsigned type =3D swp_type(entry); > > nit: checkpatch.py will complain about bare "unsigned", preferring "unsig= ned > int" or at least it did for me when I did something similar in my swap-ou= t patch > set. Gee, thanks for pointing that out! > > > + unsigned long offset =3D swp_offset(entry); > > + DECLARE_BITMAP(usage, SWAPFILE_CLUSTER) =3D { 0 }; > > I don't love this, as it could blow the stack if SWAPFILE_CLUSTER ever > increases. But the only other way I can think of is to explicitly loop ov= er > fixed size chunks, and that's not much better. Is it possible to save kernel stack better by using bit_map here? If SWAPFILE_CLUSTER=3D512, we consume only (512/64)*8=3D 64 bytes. > > > + > > + /* all swap entries are within a cluster for mTHP */ > > + VM_BUG_ON(offset % SWAPFILE_CLUSTER + nr_pages > SWAPFILE_CLUSTER= ); > > + > > + if (nr_pages =3D=3D 1) { > > + swap_free(entry); > > + return; > > + } > > + > > + p =3D _swap_info_get(entry); > > You need to handle this returning NULL, like swap_free() does. Yes, you're right! We did forget to judge NULL here. > > > + > > + ci =3D lock_cluster(p, offset); > > The existing swap_free() calls lock_cluster_or_swap_info(). So if swap is= backed > by rotating media, and clusters are not in use, it will lock the whole sw= ap > info. But your new version only calls lock_cluster() which won't lock any= thing > if clusters are not in use. So I think this is a locking bug. Again, you're right, it's bug! > > > + for (i =3D 0; i < nr_pages; i++) { > > + if (__swap_entry_free_locked(p, offset + i, 1)) > > + __bitmap_set(usage, i, 1); > > + } > > + unlock_cluster(ci); > > + > > + for_each_clear_bit(i, usage, nr_pages) > > + free_swap_slot(swp_entry(type, offset + i)); > > +} > > + > > /* > > * Called after dropping swapcache to decrease refcnt to swap entries. > > */ > > Thanks, > Ryan > > --=20 Thanks, Chuanhua