From mboxrd@z Thu Jan  1 00:00:00 1970
Date: Thu, 9 Jan 2025 12:07:43 +0800
From: Baoquan He <bhe@redhat.com>
To: Kairui Song
Cc: linux-mm@kvack.org, Andrew Morton, Chris Li, Barry Song,
	Ryan Roberts, Hugh Dickins, Yosry Ahmed, "Huang, Ying",
	Nhat Pham, Johannes Weiner, Kalesh Singh,
	linux-kernel@vger.kernel.org
Subject: Re: [PATCH v3 04/13] mm, swap: use cluster lock for HDD
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
References: <20241230174621.61185-1-ryncsn@gmail.com>
	<20241230174621.61185-5-ryncsn@gmail.com>
In-Reply-To: <20241230174621.61185-5-ryncsn@gmail.com>
On 12/31/24 at 01:46am, Kairui Song wrote:
> From: Kairui Song
>
> Cluster lock (ci->lock) was introduce to reduce contention for certain
                              ~~~~~~~~~ typo, introduced.
> operations. Using cluster lock for HDD is not helpful as HDD have a poor
> performance, so locking isn't the bottleneck. But having different set
> of locks for HDD / non-HDD prevents further rework of device lock
> (si->lock).
>
> This commit just changed all lock_cluster_or_swap_info to lock_cluster,
> which is a safe and straight conversion since cluster info is always
> allocated now, also removed all cluster_info related checks.
>
> Suggested-by: Chris Li
> Signed-off-by: Kairui Song
> ---
>  mm/swapfile.c | 107 ++++++++++++++++----------------------------------
>  1 file changed, 34 insertions(+), 73 deletions(-)

Other than the nit in the patch log, LGTM,

Reviewed-by: Baoquan He

>
> diff --git a/mm/swapfile.c b/mm/swapfile.c
> index fca58d43b836..d0e5b9fa0c48 100644
> --- a/mm/swapfile.c
> +++ b/mm/swapfile.c
> @@ -58,10 +58,9 @@ static void swap_entry_range_free(struct swap_info_struct *si, swp_entry_t entry
>  static void swap_range_alloc(struct swap_info_struct *si, unsigned long offset,
>  			     unsigned int nr_entries);
>  static bool folio_swapcache_freeable(struct folio *folio);
> -static struct swap_cluster_info *lock_cluster_or_swap_info(
> -		struct swap_info_struct *si, unsigned long offset);
> -static void unlock_cluster_or_swap_info(struct swap_info_struct *si,
> -					struct swap_cluster_info *ci);
> +static struct swap_cluster_info *lock_cluster(struct swap_info_struct *si,
> +					      unsigned long offset);
> +static void unlock_cluster(struct swap_cluster_info *ci);
>
>  static DEFINE_SPINLOCK(swap_lock);
>  static unsigned int nr_swapfiles;
> @@ -222,9 +221,9 @@ static int __try_to_reclaim_swap(struct swap_info_struct *si,
>  	 * swap_map is HAS_CACHE only, which means the slots have no page table
>  	 * reference or pending writeback, and can't be allocated to others.
>  	 */
> -	ci = lock_cluster_or_swap_info(si, offset);
> +	ci = lock_cluster(si, offset);
>  	need_reclaim = swap_is_has_cache(si, offset, nr_pages);
> -	unlock_cluster_or_swap_info(si, ci);
> +	unlock_cluster(ci);
>  	if (!need_reclaim)
>  		goto out_unlock;
>
> @@ -404,45 +403,15 @@ static inline struct swap_cluster_info *lock_cluster(struct swap_info_struct *si
>  {
>  	struct swap_cluster_info *ci;
>
> -	ci = si->cluster_info;
> -	if (ci) {
> -		ci += offset / SWAPFILE_CLUSTER;
> -		spin_lock(&ci->lock);
> -	}
> -	return ci;
> -}
> -
> -static inline void unlock_cluster(struct swap_cluster_info *ci)
> -{
> -	if (ci)
> -		spin_unlock(&ci->lock);
> -}
> -
> -/*
> - * Determine the locking method in use for this device. Return
> - * swap_cluster_info if SSD-style cluster-based locking is in place.
> - */
> -static inline struct swap_cluster_info *lock_cluster_or_swap_info(
> -		struct swap_info_struct *si, unsigned long offset)
> -{
> -	struct swap_cluster_info *ci;
> -
> -	/* Try to use fine-grained SSD-style locking if available: */
> -	ci = lock_cluster(si, offset);
> -	/* Otherwise, fall back to traditional, coarse locking: */
> -	if (!ci)
> -		spin_lock(&si->lock);
> +	ci = &si->cluster_info[offset / SWAPFILE_CLUSTER];
> +	spin_lock(&ci->lock);
>
>  	return ci;
>  }
>
> -static inline void unlock_cluster_or_swap_info(struct swap_info_struct *si,
> -					       struct swap_cluster_info *ci)
> +static inline void unlock_cluster(struct swap_cluster_info *ci)
>  {
> -	if (ci)
> -		unlock_cluster(ci);
> -	else
> -		spin_unlock(&si->lock);
> +	spin_unlock(&ci->lock);
>  }
>
>  /* Add a cluster to discard list and schedule it to do discard */
> @@ -558,9 +527,6 @@ static void inc_cluster_info_page(struct swap_info_struct *si,
>  	unsigned long idx = page_nr / SWAPFILE_CLUSTER;
>  	struct swap_cluster_info *ci;
>
> -	if (!cluster_info)
> -		return;
> -
>  	ci = cluster_info + idx;
>  	ci->count++;
>
> @@ -576,9 +542,6 @@ static void inc_cluster_info_page(struct swap_info_struct *si,
>  static void dec_cluster_info_page(struct swap_info_struct *si,
>  				  struct swap_cluster_info *ci, int nr_pages)
>  {
> -	if (!si->cluster_info)
> -		return;
> -
>  	VM_BUG_ON(ci->count < nr_pages);
>  	VM_BUG_ON(cluster_is_free(ci));
>  	lockdep_assert_held(&si->lock);
> @@ -1007,8 +970,6 @@ static int cluster_alloc_swap(struct swap_info_struct *si,
>  {
>  	int n_ret = 0;
>
> -	VM_BUG_ON(!si->cluster_info);
> -
>  	si->flags += SWP_SCANNING;
>
>  	while (n_ret < nr) {
> @@ -1052,10 +1013,10 @@ static int scan_swap_map_slots(struct swap_info_struct *si,
>  	}
>
>  	/*
> -	 * Swapfile is not block device or not using clusters so unable
> +	 * Swapfile is not block device so unable
>  	 * to allocate large entries.
>  	 */
> -	if (!(si->flags & SWP_BLKDEV) || !si->cluster_info)
> +	if (!(si->flags & SWP_BLKDEV))
>  		return 0;
>  	}
>
> @@ -1295,9 +1256,9 @@ static unsigned char __swap_entry_free(struct swap_info_struct *si,
>  	unsigned long offset = swp_offset(entry);
>  	unsigned char usage;
>
> -	ci = lock_cluster_or_swap_info(si, offset);
> +	ci = lock_cluster(si, offset);
>  	usage = __swap_entry_free_locked(si, offset, 1);
> -	unlock_cluster_or_swap_info(si, ci);
> +	unlock_cluster(ci);
>  	if (!usage)
>  		free_swap_slot(entry);
>
> @@ -1320,14 +1281,14 @@ static bool __swap_entries_free(struct swap_info_struct *si,
>  	if (nr > SWAPFILE_CLUSTER - offset % SWAPFILE_CLUSTER)
>  		goto fallback;
>
> -	ci = lock_cluster_or_swap_info(si, offset);
> +	ci = lock_cluster(si, offset);
>  	if (!swap_is_last_map(si, offset, nr, &has_cache)) {
> -		unlock_cluster_or_swap_info(si, ci);
> +		unlock_cluster(ci);
>  		goto fallback;
>  	}
>  	for (i = 0; i < nr; i++)
>  		WRITE_ONCE(si->swap_map[offset + i], SWAP_HAS_CACHE);
> -	unlock_cluster_or_swap_info(si, ci);
> +	unlock_cluster(ci);
>
>  	if (!has_cache) {
>  		for (i = 0; i < nr; i++)
> @@ -1383,7 +1344,7 @@ static void cluster_swap_free_nr(struct swap_info_struct *si,
>  	DECLARE_BITMAP(to_free, BITS_PER_LONG) = { 0 };
>  	int i, nr;
>
> -	ci = lock_cluster_or_swap_info(si, offset);
> +	ci = lock_cluster(si, offset);
>  	while (nr_pages) {
>  		nr = min(BITS_PER_LONG, nr_pages);
>  		for (i = 0; i < nr; i++) {
> @@ -1391,18 +1352,18 @@ static void cluster_swap_free_nr(struct swap_info_struct *si,
>  			bitmap_set(to_free, i, 1);
>  		}
>  		if (!bitmap_empty(to_free, BITS_PER_LONG)) {
> -			unlock_cluster_or_swap_info(si, ci);
> +			unlock_cluster(ci);
>  			for_each_set_bit(i, to_free, BITS_PER_LONG)
>  				free_swap_slot(swp_entry(si->type, offset + i));
>  			if (nr == nr_pages)
>  				return;
>  			bitmap_clear(to_free, 0, BITS_PER_LONG);
> -			ci = lock_cluster_or_swap_info(si, offset);
> +			ci = lock_cluster(si, offset);
>  		}
>  		offset += nr;
>  		nr_pages -= nr;
>  	}
> -	unlock_cluster_or_swap_info(si, ci);
> +	unlock_cluster(ci);
>  }
>
>  /*
> @@ -1441,9 +1402,9 @@ void put_swap_folio(struct folio *folio, swp_entry_t entry)
>  	if (!si)
>  		return;
>
> -	ci = lock_cluster_or_swap_info(si, offset);
> +	ci = lock_cluster(si, offset);
>  	if (size > 1 && swap_is_has_cache(si, offset, size)) {
> -		unlock_cluster_or_swap_info(si, ci);
> +		unlock_cluster(ci);
>  		spin_lock(&si->lock);
>  		swap_entry_range_free(si, entry, size);
>  		spin_unlock(&si->lock);
> @@ -1451,14 +1412,14 @@ void put_swap_folio(struct folio *folio, swp_entry_t entry)
>  	}
>  	for (int i = 0; i < size; i++, entry.val++) {
>  		if (!__swap_entry_free_locked(si, offset + i, SWAP_HAS_CACHE)) {
> -			unlock_cluster_or_swap_info(si, ci);
> +			unlock_cluster(ci);
>  			free_swap_slot(entry);
>  			if (i == size - 1)
>  				return;
> -			lock_cluster_or_swap_info(si, offset);
> +			lock_cluster(si, offset);
>  		}
>  	}
> -	unlock_cluster_or_swap_info(si, ci);
> +	unlock_cluster(ci);
>  }
>
>  static int swp_entry_cmp(const void *ent1, const void *ent2)
> @@ -1522,9 +1483,9 @@ int swap_swapcount(struct swap_info_struct *si, swp_entry_t entry)
>  	struct swap_cluster_info *ci;
>  	int count;
>
> -	ci = lock_cluster_or_swap_info(si, offset);
> +	ci = lock_cluster(si, offset);
>  	count = swap_count(si->swap_map[offset]);
> -	unlock_cluster_or_swap_info(si, ci);
> +	unlock_cluster(ci);
>  	return count;
>  }
>
> @@ -1547,7 +1508,7 @@ int swp_swapcount(swp_entry_t entry)
>
>  	offset = swp_offset(entry);
>
> -	ci = lock_cluster_or_swap_info(si, offset);
> +	ci = lock_cluster(si, offset);
>
>  	count = swap_count(si->swap_map[offset]);
>  	if (!(count & COUNT_CONTINUED))
> @@ -1570,7 +1531,7 @@ int swp_swapcount(swp_entry_t entry)
>  		n *= (SWAP_CONT_MAX + 1);
>  	} while (tmp_count & COUNT_CONTINUED);
>  out:
> -	unlock_cluster_or_swap_info(si, ci);
> +	unlock_cluster(ci);
>  	return count;
>  }
>
> @@ -1585,8 +1546,8 @@ static bool swap_page_trans_huge_swapped(struct swap_info_struct *si,
>  	int i;
>  	bool ret = false;
>
> -	ci = lock_cluster_or_swap_info(si, offset);
> -	if (!ci || nr_pages == 1) {
> +	ci = lock_cluster(si, offset);
> +	if (nr_pages == 1) {
>  		if (swap_count(map[roffset]))
>  			ret = true;
>  		goto unlock_out;
> @@ -1598,7 +1559,7 @@ static bool swap_page_trans_huge_swapped(struct swap_info_struct *si,
>  		}
>  	}
>  unlock_out:
> -	unlock_cluster_or_swap_info(si, ci);
> +	unlock_cluster(ci);
>  	return ret;
>  }
>
> @@ -3428,7 +3389,7 @@ static int __swap_duplicate(swp_entry_t entry, unsigned char usage, int nr)
>  	offset = swp_offset(entry);
>  	VM_WARN_ON(nr > SWAPFILE_CLUSTER - offset % SWAPFILE_CLUSTER);
>  	VM_WARN_ON(usage == 1 && nr > 1);
> -	ci = lock_cluster_or_swap_info(si, offset);
> +	ci = lock_cluster(si, offset);
>
>  	err = 0;
>  	for (i = 0; i < nr; i++) {
> @@ -3483,7 +3444,7 @@ static int __swap_duplicate(swp_entry_t entry, unsigned char usage, int nr)
>  	}
>
>  unlock_out:
> -	unlock_cluster_or_swap_info(si, ci);
> +	unlock_cluster(ci);
>  	return err;
>  }
>
> --
> 2.47.1
>
>