From mboxrd@z Thu Jan  1 00:00:00 1970
From: Kairui Song <ryncsn@gmail.com>
To: linux-mm@kvack.org
Cc: Andrew Morton, Chris Li, Barry Song, Ryan Roberts, Hugh Dickins,
	Yosry Ahmed, "Huang, Ying", Nhat Pham, Johannes Weiner,
	Kalesh Singh, linux-kernel@vger.kernel.org, Kairui Song
Subject: [PATCH v3 04/13] mm, swap: use cluster lock for HDD
Date: Tue, 31 Dec 2024 01:46:12 +0800
Message-ID: <20241230174621.61185-5-ryncsn@gmail.com>
In-Reply-To: <20241230174621.61185-1-ryncsn@gmail.com>
References: <20241230174621.61185-1-ryncsn@gmail.com>
Reply-To: Kairui Song <ryncsn@gmail.com>
MIME-Version: 1.0
Content-Transfer-Encoding: 8bit
From: Kairui Song <ryncsn@gmail.com>

Cluster lock (ci->lock) was introduced to reduce contention for certain
operations.
Using the cluster lock for HDD is not helpful, as HDDs have poor
performance anyway; locking is not the bottleneck there. But having a
different set of locks for HDD and non-HDD devices prevents further
rework of the device lock (si->lock).

This commit changes all lock_cluster_or_swap_info callers to use
lock_cluster, which is a safe and straightforward conversion since
cluster info is always allocated now, and removes all
cluster_info-related checks.

Suggested-by: Chris Li
Signed-off-by: Kairui Song
---
 mm/swapfile.c | 107 ++++++++++++++++----------------------------------
 1 file changed, 34 insertions(+), 73 deletions(-)

diff --git a/mm/swapfile.c b/mm/swapfile.c
index fca58d43b836..d0e5b9fa0c48 100644
--- a/mm/swapfile.c
+++ b/mm/swapfile.c
@@ -58,10 +58,9 @@ static void swap_entry_range_free(struct swap_info_struct *si, swp_entry_t entry
 static void swap_range_alloc(struct swap_info_struct *si, unsigned long offset,
			     unsigned int nr_entries);
 static bool folio_swapcache_freeable(struct folio *folio);
-static struct swap_cluster_info *lock_cluster_or_swap_info(
-		struct swap_info_struct *si, unsigned long offset);
-static void unlock_cluster_or_swap_info(struct swap_info_struct *si,
-					struct swap_cluster_info *ci);
+static struct swap_cluster_info *lock_cluster(struct swap_info_struct *si,
+					      unsigned long offset);
+static void unlock_cluster(struct swap_cluster_info *ci);
 
 static DEFINE_SPINLOCK(swap_lock);
 static unsigned int nr_swapfiles;
@@ -222,9 +221,9 @@ static int __try_to_reclaim_swap(struct swap_info_struct *si,
	 * swap_map is HAS_CACHE only, which means the slots have no page table
	 * reference or pending writeback, and can't be allocated to others.
	 */
-	ci = lock_cluster_or_swap_info(si, offset);
+	ci = lock_cluster(si, offset);
 	need_reclaim = swap_is_has_cache(si, offset, nr_pages);
-	unlock_cluster_or_swap_info(si, ci);
+	unlock_cluster(ci);
 
 	if (!need_reclaim)
 		goto out_unlock;
@@ -404,45 +403,15 @@ static inline struct swap_cluster_info *lock_cluster(struct swap_info_struct *si
 {
 	struct swap_cluster_info *ci;
 
-	ci = si->cluster_info;
-	if (ci) {
-		ci += offset / SWAPFILE_CLUSTER;
-		spin_lock(&ci->lock);
-	}
-	return ci;
-}
-
-static inline void unlock_cluster(struct swap_cluster_info *ci)
-{
-	if (ci)
-		spin_unlock(&ci->lock);
-}
-
-/*
- * Determine the locking method in use for this device.  Return
- * swap_cluster_info if SSD-style cluster-based locking is in place.
- */
-static inline struct swap_cluster_info *lock_cluster_or_swap_info(
-		struct swap_info_struct *si, unsigned long offset)
-{
-	struct swap_cluster_info *ci;
-
-	/* Try to use fine-grained SSD-style locking if available: */
-	ci = lock_cluster(si, offset);
-	/* Otherwise, fall back to traditional, coarse locking: */
-	if (!ci)
-		spin_lock(&si->lock);
+	ci = &si->cluster_info[offset / SWAPFILE_CLUSTER];
+	spin_lock(&ci->lock);
 	return ci;
 }
 
-static inline void unlock_cluster_or_swap_info(struct swap_info_struct *si,
-					       struct swap_cluster_info *ci)
+static inline void unlock_cluster(struct swap_cluster_info *ci)
 {
-	if (ci)
-		unlock_cluster(ci);
-	else
-		spin_unlock(&si->lock);
+	spin_unlock(&ci->lock);
 }
 
 /* Add a cluster to discard list and schedule it to do discard */
@@ -558,9 +527,6 @@ static void inc_cluster_info_page(struct swap_info_struct *si,
 	unsigned long idx = page_nr / SWAPFILE_CLUSTER;
 	struct swap_cluster_info *ci;
 
-	if (!cluster_info)
-		return;
-
 	ci = cluster_info + idx;
 
 	ci->count++;
@@ -576,9 +542,6 @@ static void inc_cluster_info_page(struct swap_info_struct *si,
 static void dec_cluster_info_page(struct swap_info_struct *si,
 				  struct swap_cluster_info *ci, int nr_pages)
 {
-	if (!si->cluster_info)
-		return;
-
 	VM_BUG_ON(ci->count < nr_pages);
 	VM_BUG_ON(cluster_is_free(ci));
 	lockdep_assert_held(&si->lock);
@@ -1007,8 +970,6 @@ static int cluster_alloc_swap(struct swap_info_struct *si,
 {
 	int n_ret = 0;
 
-	VM_BUG_ON(!si->cluster_info);
-
 	si->flags += SWP_SCANNING;
 
 	while (n_ret < nr) {
@@ -1052,10 +1013,10 @@ static int scan_swap_map_slots(struct swap_info_struct *si,
 		}
 
 		/*
-		 * Swapfile is not block device or not using clusters so unable
+		 * Swapfile is not block device so unable
 		 * to allocate large entries.
 		 */
-		if (!(si->flags & SWP_BLKDEV) || !si->cluster_info)
+		if (!(si->flags & SWP_BLKDEV))
 			return 0;
 	}
 
@@ -1295,9 +1256,9 @@ static unsigned char __swap_entry_free(struct swap_info_struct *si,
 	unsigned long offset = swp_offset(entry);
 	unsigned char usage;
 
-	ci = lock_cluster_or_swap_info(si, offset);
+	ci = lock_cluster(si, offset);
 	usage = __swap_entry_free_locked(si, offset, 1);
-	unlock_cluster_or_swap_info(si, ci);
+	unlock_cluster(ci);
 
 	if (!usage)
 		free_swap_slot(entry);
@@ -1320,14 +1281,14 @@ static bool __swap_entries_free(struct swap_info_struct *si,
 	if (nr > SWAPFILE_CLUSTER - offset % SWAPFILE_CLUSTER)
 		goto fallback;
 
-	ci = lock_cluster_or_swap_info(si, offset);
+	ci = lock_cluster(si, offset);
 	if (!swap_is_last_map(si, offset, nr, &has_cache)) {
-		unlock_cluster_or_swap_info(si, ci);
+		unlock_cluster(ci);
 		goto fallback;
 	}
 	for (i = 0; i < nr; i++)
 		WRITE_ONCE(si->swap_map[offset + i], SWAP_HAS_CACHE);
-	unlock_cluster_or_swap_info(si, ci);
+	unlock_cluster(ci);
 
 	if (!has_cache) {
 		for (i = 0; i < nr; i++)
@@ -1383,7 +1344,7 @@ static void cluster_swap_free_nr(struct swap_info_struct *si,
 	DECLARE_BITMAP(to_free, BITS_PER_LONG) = { 0 };
 	int i, nr;
 
-	ci = lock_cluster_or_swap_info(si, offset);
+	ci = lock_cluster(si, offset);
 	while (nr_pages) {
 		nr = min(BITS_PER_LONG, nr_pages);
 		for (i = 0; i < nr; i++) {
@@ -1391,18 +1352,18 @@ static void cluster_swap_free_nr(struct swap_info_struct *si,
 				bitmap_set(to_free, i, 1);
 		}
 		if (!bitmap_empty(to_free, BITS_PER_LONG)) {
-			unlock_cluster_or_swap_info(si, ci);
+			unlock_cluster(ci);
 			for_each_set_bit(i, to_free, BITS_PER_LONG)
 				free_swap_slot(swp_entry(si->type, offset + i));
 			if (nr == nr_pages)
 				return;
 			bitmap_clear(to_free, 0, BITS_PER_LONG);
-			ci = lock_cluster_or_swap_info(si, offset);
+			ci = lock_cluster(si, offset);
 		}
 		offset += nr;
 		nr_pages -= nr;
 	}
-	unlock_cluster_or_swap_info(si, ci);
+	unlock_cluster(ci);
 }
 
 /*
@@ -1441,9 +1402,9 @@ void put_swap_folio(struct folio *folio, swp_entry_t entry)
 	if (!si)
 		return;
 
-	ci = lock_cluster_or_swap_info(si, offset);
+	ci = lock_cluster(si, offset);
 	if (size > 1 && swap_is_has_cache(si, offset, size)) {
-		unlock_cluster_or_swap_info(si, ci);
+		unlock_cluster(ci);
 		spin_lock(&si->lock);
 		swap_entry_range_free(si, entry, size);
 		spin_unlock(&si->lock);
@@ -1451,14 +1412,14 @@ void put_swap_folio(struct folio *folio, swp_entry_t entry)
 	}
 	for (int i = 0; i < size; i++, entry.val++) {
 		if (!__swap_entry_free_locked(si, offset + i, SWAP_HAS_CACHE)) {
-			unlock_cluster_or_swap_info(si, ci);
+			unlock_cluster(ci);
 			free_swap_slot(entry);
 			if (i == size - 1)
 				return;
-			lock_cluster_or_swap_info(si, offset);
+			lock_cluster(si, offset);
 		}
 	}
-	unlock_cluster_or_swap_info(si, ci);
+	unlock_cluster(ci);
 }
 
 static int swp_entry_cmp(const void *ent1, const void *ent2)
@@ -1522,9 +1483,9 @@ int swap_swapcount(struct swap_info_struct *si, swp_entry_t entry)
 	struct swap_cluster_info *ci;
 	int count;
 
-	ci = lock_cluster_or_swap_info(si, offset);
+	ci = lock_cluster(si, offset);
 	count = swap_count(si->swap_map[offset]);
-	unlock_cluster_or_swap_info(si, ci);
+	unlock_cluster(ci);
 	return count;
 }
 
@@ -1547,7 +1508,7 @@ int swp_swapcount(swp_entry_t entry)
 
 	offset = swp_offset(entry);
 
-	ci = lock_cluster_or_swap_info(si, offset);
+	ci = lock_cluster(si, offset);
 
 	count = swap_count(si->swap_map[offset]);
 	if (!(count & COUNT_CONTINUED))
@@ -1570,7 +1531,7 @@ int swp_swapcount(swp_entry_t entry)
 		n *= (SWAP_CONT_MAX + 1);
 	} while (tmp_count & COUNT_CONTINUED);
 out:
-	unlock_cluster_or_swap_info(si, ci);
+	unlock_cluster(ci);
 	return count;
 }
 
@@ -1585,8 +1546,8 @@ static bool swap_page_trans_huge_swapped(struct swap_info_struct *si,
 	int i;
 	bool ret = false;
 
-	ci = lock_cluster_or_swap_info(si, offset);
-	if (!ci || nr_pages == 1) {
+	ci = lock_cluster(si, offset);
+	if (nr_pages == 1) {
 		if (swap_count(map[roffset]))
 			ret = true;
 		goto unlock_out;
@@ -1598,7 +1559,7 @@ static bool swap_page_trans_huge_swapped(struct swap_info_struct *si,
 		}
 	}
 unlock_out:
-	unlock_cluster_or_swap_info(si, ci);
+	unlock_cluster(ci);
 	return ret;
 }
 
@@ -3428,7 +3389,7 @@ static int __swap_duplicate(swp_entry_t entry, unsigned char usage, int nr)
 	offset = swp_offset(entry);
 	VM_WARN_ON(nr > SWAPFILE_CLUSTER - offset % SWAPFILE_CLUSTER);
 	VM_WARN_ON(usage == 1 && nr > 1);
-	ci = lock_cluster_or_swap_info(si, offset);
+	ci = lock_cluster(si, offset);
 
 	err = 0;
 	for (i = 0; i < nr; i++) {
@@ -3483,7 +3444,7 @@ static int __swap_duplicate(swp_entry_t entry, unsigned char usage, int nr)
 	}
 
 unlock_out:
-	unlock_cluster_or_swap_info(si, ci);
+	unlock_cluster(ci);
 	return err;
 }
-- 
2.47.1