From: Takero Funaki <flintglass@gmail.com>
To: Johannes Weiner, Yosry Ahmed, Nhat Pham, Chengming Zhou, Andrew Morton
Cc: Takero Funaki, linux-mm@kvack.org, linux-kernel@vger.kernel.org
Subject: [PATCH] mm: zswap: limit number of zpools based on CPU and RAM
Date: Thu, 6 Jun 2024 16:53:01 +0000
Message-ID: <20240606165303.431215-1-flintglass@gmail.com>
This patch limits the number of zpools used by zswap on smaller systems.

Currently, zswap allocates 32 zpools unconditionally. This was done to
reduce contention on per-zpool locks, but it incurs allocation overhead
by distributing pages across pools and wastes memory on systems with
fewer CPUs and less RAM.
This patch allocates approximately 2*CPU zpools, with a minimum of 1
zpool for single-CPU systems and up to 32 zpools for systems with 16 or
more CPUs. This number is sufficient to keep the probability that a
thread busy-waits on a per-zpool lock below 40%. The upper limit of 32
zpools remains unchanged.

For memory, the number of zpools is limited to 1 zpool per 60MB of
memory at the default 20% max pool size limit, assuming the best case
with no fragmentation in zspages. This expects 90% pool usage for
zsmalloc.

Signed-off-by: Takero Funaki <flintglass@gmail.com>
---
 mm/zswap.c | 67 ++++++++++++++++++++++++++++++++++++++++++++++++------
 1 file changed, 60 insertions(+), 7 deletions(-)

diff --git a/mm/zswap.c b/mm/zswap.c
index 4de342a63bc2..e957bfdeaf70 100644
--- a/mm/zswap.c
+++ b/mm/zswap.c
@@ -124,8 +124,11 @@ static unsigned int zswap_accept_thr_percent = 90; /* of max pool size */
 module_param_named(accept_threshold_percent, zswap_accept_thr_percent,
 		   uint, 0644);
 
-/* Number of zpools in zswap_pool (empirically determined for scalability) */
-#define ZSWAP_NR_ZPOOLS 32
+/*
+ * Maximum number of zpools in zswap_pool (empirically determined for
+ * scalability). This must be a power of 2, for pointer hashing.
+ */
+#define ZSWAP_NR_ZPOOLS_MAX 32
 
 /* Enable/disable memory pressure-based shrinker. */
 static bool zswap_shrinker_enabled = IS_ENABLED(
@@ -157,12 +160,13 @@ struct crypto_acomp_ctx {
  * needs to be verified that it's still valid in the tree.
 */
 struct zswap_pool {
-	struct zpool *zpools[ZSWAP_NR_ZPOOLS];
+	struct zpool *zpools[ZSWAP_NR_ZPOOLS_MAX];
 	struct crypto_acomp_ctx __percpu *acomp_ctx;
 	struct percpu_ref ref;
 	struct list_head list;
 	struct work_struct release_work;
 	struct hlist_node node;
+	unsigned char nr_zpools_order;
 	char tfm_name[CRYPTO_MAX_ALG_NAME];
 };
 
@@ -243,11 +247,55 @@ static inline struct xarray *swap_zswap_tree(swp_entry_t swp)
 	pr_debug("%s pool %s/%s\n", msg, (p)->tfm_name,		\
 		 zpool_get_type((p)->zpools[0]))
 
+static unsigned long zswap_max_pages(void);
+
 /*********************************
 * pool functions
 **********************************/
 static void __zswap_pool_empty(struct percpu_ref *ref);
 
+/*
+ * Estimate the optimal number of zpools based on CPU and memory.
+ *
+ * For CPUs, aim for a 40% or lower probability of busy-waiting by a
+ * thread, assuming all cores are accessing zswap concurrently.
+ * The threshold is chosen for the simplicity of the formula:
+ * the probability is 1-(1-(1/pools))^(threads-1). For the 40% threshold,
+ * this is approximately pools = 2 * threads, rounded up to a power of 2.
+ *
+ * Threads \ Pools
+ *          2     4     8    16    32
+ *   2    0.50  0.25< 0.13  0.06  0.03
+ *   4    0.88  0.58  0.33< 0.18  0.09
+ *   6    0.97  0.76  0.49  0.28< 0.15
+ *   8    0.99  0.87  0.61  0.36< 0.20
+ *  10    1.00  0.92  0.70  0.44  0.25<
+ *  16    1.00  0.99  0.87  0.62  0.38<
+ *  18    1.00  0.99  0.90  0.67  0.42
+ *
+ * For memory, expect 90% pool usage for zsmalloc in the best case.
+ * Assuming uniform distribution, we need to store:
+ *   590       : sum of pages_per_zspage
+ *   * 0.5     : about half of each zspage is empty if no fragmentation
+ *   / (1-0.9) : 90% target usage
+ *   = 2950    : expected max pages of a zpool,
+ *               equivalent to 60MB RAM for a 20% max_pool_percent.
+ */
+static void __zswap_set_nr_zpools(struct zswap_pool *pool)
+{
+	unsigned long mem = zswap_max_pages();
+	unsigned long cpu = num_online_cpus();
+
+	mem = DIV_ROUND_UP(mem, 2950);
+	mem = clamp_t(unsigned long, mem, 1, ZSWAP_NR_ZPOOLS_MAX);
+
+	if (cpu <= 1)
+		cpu = 1;
+	else
+		cpu = 1 << ilog2(min_t(unsigned long, cpu * 2,
+				       ZSWAP_NR_ZPOOLS_MAX));
+
+	pool->nr_zpools_order = ilog2(min(mem, cpu));
+}
+
 static struct zswap_pool *zswap_pool_create(char *type, char *compressor)
 {
 	int i;
@@ -271,7 +319,9 @@ static struct zswap_pool *zswap_pool_create(char *type, char *compressor)
 	if (!pool)
 		return NULL;
 
-	for (i = 0; i < ZSWAP_NR_ZPOOLS; i++) {
+	__zswap_set_nr_zpools(pool);
+
+	for (i = 0; i < (1 << pool->nr_zpools_order); i++) {
 		/* unique name for each pool specifically required by zsmalloc */
 		snprintf(name, 38, "zswap%x",
 			 atomic_inc_return(&zswap_pools_count));
@@ -372,7 +422,7 @@ static void zswap_pool_destroy(struct zswap_pool *pool)
 	cpuhp_state_remove_instance(CPUHP_MM_ZSWP_POOL_PREPARE, &pool->node);
 	free_percpu(pool->acomp_ctx);
 
-	for (i = 0; i < ZSWAP_NR_ZPOOLS; i++)
+	for (i = 0; i < (1 << pool->nr_zpools_order); i++)
 		zpool_destroy_pool(pool->zpools[i]);
 	kfree(pool);
 }
@@ -513,7 +563,7 @@ unsigned long zswap_total_pages(void)
 	list_for_each_entry_rcu(pool, &zswap_pools, list) {
 		int i;
 
-		for (i = 0; i < ZSWAP_NR_ZPOOLS; i++)
+		for (i = 0; i < (1 << pool->nr_zpools_order); i++)
 			total += zpool_get_total_pages(pool->zpools[i]);
 	}
 	rcu_read_unlock();
@@ -822,7 +872,10 @@ static void zswap_entry_cache_free(struct zswap_entry *entry)
 
 static struct zpool *zswap_find_zpool(struct zswap_entry *entry)
 {
-	return entry->pool->zpools[hash_ptr(entry, ilog2(ZSWAP_NR_ZPOOLS))];
+	if (entry->pool->nr_zpools_order == 0)
+		return entry->pool->zpools[0];
+
+	return entry->pool->zpools[hash_ptr(entry,
+					    entry->pool->nr_zpools_order)];
 }
 
 /*
-- 
2.43.0