From: "Huang, Ying"
To: Barry Song <21cnbao@gmail.com>
Cc: akpm@linux-foundation.org, linux-mm@kvack.org,
	baolin.wang@linux.alibaba.com, chrisl@kernel.org, david@redhat.com,
	hanchuanhua@oppo.com, hannes@cmpxchg.org, hughd@google.com,
	kasong@tencent.com, ryan.roberts@arm.com, surenb@google.com,
	v-songbaohua@oppo.com, willy@infradead.org, xiang@kernel.org,
	yosryahmed@google.com, yuzhao@google.com, ziy@nvidia.com,
	linux-kernel@vger.kernel.org
Subject: Re: [PATCH v2 1/5] mm: swap: introduce swap_free_nr() for batched swap_free()
In-Reply-To: <20240409082631.187483-2-21cnbao@gmail.com> (Barry Song's message of "Tue, 9 Apr 2024 20:26:27 +1200")
References: <20240409082631.187483-1-21cnbao@gmail.com> <20240409082631.187483-2-21cnbao@gmail.com>
Date: Mon, 15 Apr 2024 14:17:40 +0800
Message-ID: <87y19f2lq3.fsf@yhuang6-desk2.ccr.corp.intel.com>

Barry Song <21cnbao@gmail.com> writes:

> From: Chuanhua Han
>
> While swapping in a large folio, we need to free swaps related to the whole
> folio. To avoid frequently acquiring and releasing swap locks, it is better
> to introduce an API for batched free.
>
> Signed-off-by: Chuanhua Han
> Co-developed-by: Barry Song
> Signed-off-by: Barry Song
> ---
>  include/linux/swap.h |  5 +++++
>  mm/swapfile.c        | 51 ++++++++++++++++++++++++++++++++++++++++++++
>  2 files changed, 56 insertions(+)
>
> diff --git a/include/linux/swap.h b/include/linux/swap.h
> index 11c53692f65f..b7a107e983b8 100644
> --- a/include/linux/swap.h
> +++ b/include/linux/swap.h
> @@ -483,6 +483,7 @@ extern void swap_shmem_alloc(swp_entry_t);
>  extern int swap_duplicate(swp_entry_t);
>  extern int swapcache_prepare(swp_entry_t);
>  extern void swap_free(swp_entry_t);
> +extern void swap_free_nr(swp_entry_t entry, int nr_pages);
>  extern void swapcache_free_entries(swp_entry_t *entries, int n);
>  extern void free_swap_and_cache_nr(swp_entry_t entry, int nr);
>  int swap_type_of(dev_t device, sector_t offset);
> @@ -564,6 +565,10 @@ static inline void swap_free(swp_entry_t swp)
>  {
>  }
>
> +void swap_free_nr(swp_entry_t entry, int nr_pages)
> +{
> +}
> +
>  static inline void put_swap_folio(struct folio *folio, swp_entry_t swp)
>  {
>  }
> diff --git a/mm/swapfile.c b/mm/swapfile.c
> index 28642c188c93..f4c65aeb088d 100644
> --- a/mm/swapfile.c
> +++ b/mm/swapfile.c
> @@ -1356,6 +1356,57 @@ void swap_free(swp_entry_t entry)
>  	__swap_entry_free(p, entry);
>  }
>
> +/*
> + * Free up the maximum number of swap entries at once to limit the
> + * maximum kernel stack usage.
> + */
> +#define SWAP_BATCH_NR (SWAPFILE_CLUSTER > 512 ? 512 : SWAPFILE_CLUSTER)
> +
> +/*
> + * Called after swapping in a large folio,

IMHO, it's not good to document the caller in the function definition,
because this will discourage reuse of the function.

> batched free swap entries
> + * for this large folio, entry should be for the first subpage and
> + * its offset is aligned with nr_pages

Why do we need this?

> + */
> +void swap_free_nr(swp_entry_t entry, int nr_pages)
> +{
> +	int i, j;
> +	struct swap_cluster_info *ci;
> +	struct swap_info_struct *p;
> +	unsigned int type = swp_type(entry);
> +	unsigned long offset = swp_offset(entry);
> +	int batch_nr, remain_nr;
> +	DECLARE_BITMAP(usage, SWAP_BATCH_NR) = { 0 };
> +
> +	/* all swap entries are within a cluster for mTHP */
> +	VM_BUG_ON(offset % SWAPFILE_CLUSTER + nr_pages > SWAPFILE_CLUSTER);
> +
> +	if (nr_pages == 1) {
> +		swap_free(entry);
> +		return;
> +	}

Is it possible to unify swap_free() and swap_free_nr() into one function
with acceptable performance?  IIUC, the general rule in the mTHP effort is
to avoid duplicating functions between mTHP and normal small folios.  Right?

> +
> +	remain_nr = nr_pages;
> +	p = _swap_info_get(entry);
> +	if (p) {
> +		for (i = 0; i < nr_pages; i += batch_nr) {
> +			batch_nr = min_t(int, SWAP_BATCH_NR, remain_nr);
> +
> +			ci = lock_cluster_or_swap_info(p, offset);
> +			for (j = 0; j < batch_nr; j++) {
> +				if (__swap_entry_free_locked(p, offset + i * SWAP_BATCH_NR + j, 1))
> +					__bitmap_set(usage, j, 1);
> +			}
> +			unlock_cluster_or_swap_info(p, ci);
> +
> +			for_each_clear_bit(j, usage, batch_nr)
> +				free_swap_slot(swp_entry(type, offset + i * SWAP_BATCH_NR + j));
> +
> +			bitmap_clear(usage, 0, SWAP_BATCH_NR);
> +			remain_nr -= batch_nr;
> +		}
> +	}
> +}
> +
>  /*
>   * Called after dropping swapcache to decrease refcnt to swap entries.
>   */

put_swap_folio() implements batching with a different method.  Do you
think it would be good to use that function's batching method here?  It
avoids using bitmap operations and stack space.

--
Best Regards,
Huang, Ying
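
For illustration only, the bitmap-free batching style mentioned above can be
modelled in userspace roughly as follows.  This is a sketch under stated
assumptions, not kernel code: struct swap_model and all model_*() names are
hypothetical stand-ins for the swap map, the cluster lock and
free_swap_slot(), and the loop only mirrors the general shape of holding the
lock across a run of entries and dropping it around each slot that becomes
free.

```c
#include <assert.h>
#include <stddef.h>

#define MODEL_NR_SLOTS 64	/* hypothetical size of the modelled swap area */

/* Hypothetical userspace stand-in for the swap map and its lock. */
struct swap_model {
	unsigned char map[MODEL_NR_SLOTS];	/* per-slot usage count */
	int locked;				/* 1 while the modelled lock is held */
	unsigned int lock_acquisitions;		/* how often the lock was taken */
	unsigned int slots_freed;		/* slots whose count dropped to zero */
};

static void model_lock(struct swap_model *m)
{
	assert(!m->locked);
	m->locked = 1;
	m->lock_acquisitions++;
}

static void model_unlock(struct swap_model *m)
{
	assert(m->locked);
	m->locked = 0;
}

/* Stand-in for free_swap_slot(): must be called without the lock held. */
static void model_free_slot(struct swap_model *m, size_t offset)
{
	assert(!m->locked);
	assert(m->map[offset] == 0);
	m->slots_freed++;
}

/*
 * Drop one reference on each of nr consecutive slots starting at offset.
 * The lock is held across the run and dropped only around slots that
 * become free, so no bitmap of per-slot results is kept on the stack.
 */
static void model_free_nr(struct swap_model *m, size_t offset, size_t nr)
{
	size_t i;

	assert(offset + nr <= MODEL_NR_SLOTS);

	model_lock(m);
	for (i = 0; i < nr; i++) {
		assert(m->map[offset + i] > 0);
		if (--m->map[offset + i] == 0) {
			model_unlock(m);
			model_free_slot(m, offset + i);
			if (i == nr - 1)
				return;
			model_lock(m);
		}
	}
	model_unlock(m);
}
```

In this model the lock is retaken once for every slot that becomes free, so
the saving in lock round trips is largest when most of the slots stay in use
after the call; the stack saving from not keeping a bitmap holds either way.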