From: Yu Zhao
Date: Mon, 17 Jun 2024 14:19:51 -0600
Subject: Re: [PATCH 2/2] mm/zswap: use only one pool in zswap
To: Chengming Zhou
Cc: Minchan Kim, Sergey Senozhatsky, Andrew Morton, Johannes Weiner,
    Yosry Ahmed, Nhat Pham, Takero Funaki, Chengming Zhou,
    linux-mm@kvack.org, linux-kernel@vger.kernel.org
References: <20240617-zsmalloc-lock-mm-everything-v1-0-5e5081ea11b3@linux.dev>
    <20240617-zsmalloc-lock-mm-everything-v1-2-5e5081ea11b3@linux.dev>

On Mon, Jun 17, 2024 at 2:16 PM Yu Zhao wrote:
>
> On Mon, Jun 17, 2024 at 6:58 AM Chengming Zhou wrote:
> >
> > Zswap uses 32 pools to work around the locking scalability problem in
> > zsmalloc,
>
> Note that zpool can have other backends (zbud, z3fold), and the
> original patch was developed (even before zswap could use zsmalloc) to
> make sure it works for all the backends.
>
> This patch makes sense now only because zsmalloc became a lot
> more popular than other backends (even though some distros still
> default to zbud).

And this might also have answered Yosry's question about the
"historical context" here [1].

[1] https://lore.kernel.org/CAJD7tkbO+ZLdhs-9BpthztZX32i8C4=QEnoiXGS7bM399nqwzg@mail.gmail.com/

> > which brings its own problems like memory waste and more
> > memory fragmentation.
> >
> > Testing results show that we can have nearly the same performance
> > with only one pool in zswap after changing zsmalloc to use a
> > per-size_class lock instead of the pool spinlock.
> >
> > Testing kernel build (make bzImage -j32) on tmpfs with memory.max=1GB,
> > and zswap shrinker enabled with 10GB swapfile on ext4.
> >
> >                                 real    user     sys
> > 6.10.0-rc3                      138.18  1241.38  1452.73
> > 6.10.0-rc3-onepool              149.45  1240.45  1844.69
> > 6.10.0-rc3-onepool-perclass     138.23  1242.37  1469.71
> >
> > Signed-off-by: Chengming Zhou
> > ---
> >  mm/zswap.c | 60 +++++++++++++++++++-----------------------------------------
> >  1 file changed, 19 insertions(+), 41 deletions(-)
> >
> > diff --git a/mm/zswap.c b/mm/zswap.c
> > index e25a6808c2ed..5063c5372e51 100644
> > --- a/mm/zswap.c
> > +++ b/mm/zswap.c
> > @@ -122,9 +122,6 @@ static unsigned int zswap_accept_thr_percent = 90; /* of max pool size */
> >  module_param_named(accept_threshold_percent, zswap_accept_thr_percent,
> >                     uint, 0644);
> >
> > -/* Number of zpools in zswap_pool (empirically determined for scalability) */
> > -#define ZSWAP_NR_ZPOOLS 32
> > -
> >  /* Enable/disable memory pressure-based shrinker. */
> >  static bool zswap_shrinker_enabled = IS_ENABLED(
> >                 CONFIG_ZSWAP_SHRINKER_DEFAULT_ON);
> > @@ -160,7 +157,7 @@ struct crypto_acomp_ctx {
> >   * needs to be verified that it's still valid in the tree.
> >   */
> >  struct zswap_pool {
> > -       struct zpool *zpools[ZSWAP_NR_ZPOOLS];
> > +       struct zpool *zpool;
> >         struct crypto_acomp_ctx __percpu *acomp_ctx;
> >         struct percpu_ref ref;
> >         struct list_head list;
> > @@ -237,7 +234,7 @@ static inline struct xarray *swap_zswap_tree(swp_entry_t swp)
> >
> >  #define zswap_pool_debug(msg, p)                                \
> >         pr_debug("%s pool %s/%s\n", msg, (p)->tfm_name,         \
> > -                zpool_get_type((p)->zpools[0]))
> > +                zpool_get_type((p)->zpool))
> >
> >  /*********************************
> >  * pool functions
> > @@ -246,7 +243,6 @@ static void __zswap_pool_empty(struct percpu_ref *ref);
> >
> >  static struct zswap_pool *zswap_pool_create(char *type, char *compressor)
> >  {
> > -       int i;
> >         struct zswap_pool *pool;
> >         char name[38]; /* 'zswap' + 32 char (max) num + \0 */
> >         gfp_t gfp = __GFP_NORETRY | __GFP_NOWARN | __GFP_KSWAPD_RECLAIM;
> > @@ -267,18 +263,14 @@ static struct zswap_pool *zswap_pool_create(char *type, char *compressor)
> >         if (!pool)
> >                 return NULL;
> >
> > -       for (i = 0; i < ZSWAP_NR_ZPOOLS; i++) {
> > -               /* unique name for each pool specifically required by zsmalloc */
> > -               snprintf(name, 38, "zswap%x",
> > -                        atomic_inc_return(&zswap_pools_count));
> > -
> > -               pool->zpools[i] = zpool_create_pool(type, name, gfp);
> > -               if (!pool->zpools[i]) {
> > -                       pr_err("%s zpool not available\n", type);
> > -                       goto error;
> > -               }
> > +       /* unique name for each pool specifically required by zsmalloc */
> > +       snprintf(name, 38, "zswap%x", atomic_inc_return(&zswap_pools_count));
> > +       pool->zpool = zpool_create_pool(type, name, gfp);
> > +       if (!pool->zpool) {
> > +               pr_err("%s zpool not available\n", type);
> > +               goto error;
> >         }
> > -       pr_debug("using %s zpool\n", zpool_get_type(pool->zpools[0]));
> > +       pr_debug("using %s zpool\n", zpool_get_type(pool->zpool));
> >
> >         strscpy(pool->tfm_name, compressor, sizeof(pool->tfm_name));
> >
> > @@ -311,8 +303,7 @@ static struct zswap_pool *zswap_pool_create(char *type, char *compressor)
> >  error:
> >         if (pool->acomp_ctx)
> >                 free_percpu(pool->acomp_ctx);
> > -       while (i--)
> > -               zpool_destroy_pool(pool->zpools[i]);
> > +       zpool_destroy_pool(pool->zpool);
> >         kfree(pool);
> >         return NULL;
> >  }
> > @@ -361,15 +352,12 @@ static struct zswap_pool *__zswap_pool_create_fallback(void)
> >
> >  static void zswap_pool_destroy(struct zswap_pool *pool)
> >  {
> > -       int i;
> > -
> >         zswap_pool_debug("destroying", pool);
> >
> >         cpuhp_state_remove_instance(CPUHP_MM_ZSWP_POOL_PREPARE, &pool->node);
> >         free_percpu(pool->acomp_ctx);
> >
> > -       for (i = 0; i < ZSWAP_NR_ZPOOLS; i++)
> > -               zpool_destroy_pool(pool->zpools[i]);
> > +       zpool_destroy_pool(pool->zpool);
> >         kfree(pool);
> >  }
> >
> > @@ -464,8 +452,7 @@ static struct zswap_pool *zswap_pool_find_get(char *type, char *compressor)
> >         list_for_each_entry_rcu(pool, &zswap_pools, list) {
> >                 if (strcmp(pool->tfm_name, compressor))
> >                         continue;
> > -               /* all zpools share the same type */
> > -               if (strcmp(zpool_get_type(pool->zpools[0]), type))
> > +               if (strcmp(zpool_get_type(pool->zpool), type))
> >                         continue;
> >                 /* if we can't get it, it's about to be destroyed */
> >                 if (!zswap_pool_get(pool))
> > @@ -492,12 +479,8 @@ unsigned long zswap_total_pages(void)
> >         unsigned long total = 0;
> >
> >         rcu_read_lock();
> > -       list_for_each_entry_rcu(pool, &zswap_pools, list) {
> > -               int i;
> > -
> > -               for (i = 0; i < ZSWAP_NR_ZPOOLS; i++)
> > -                       total += zpool_get_total_pages(pool->zpools[i]);
> > -       }
> > +       list_for_each_entry_rcu(pool, &zswap_pools, list)
> > +               total += zpool_get_total_pages(pool->zpool);
> >         rcu_read_unlock();
> >
> >         return total;
> > @@ -802,11 +785,6 @@ static void zswap_entry_cache_free(struct zswap_entry *entry)
> >         kmem_cache_free(zswap_entry_cache, entry);
> >  }
> >
> > -static struct zpool *zswap_find_zpool(struct zswap_entry *entry)
> > -{
> > -       return entry->pool->zpools[hash_ptr(entry, ilog2(ZSWAP_NR_ZPOOLS))];
> > -}
> > -
> >  /*
> >   * Carries out the common pattern of freeing and entry's zpool allocation,
> >   * freeing the entry itself, and decrementing the number of stored pages.
> > @@ -814,7 +792,7 @@ static struct zpool *zswap_find_zpool(struct zswap_entry *entry)
> >  static void zswap_entry_free(struct zswap_entry *entry)
> >  {
> >         zswap_lru_del(&zswap_list_lru, entry);
> > -       zpool_free(zswap_find_zpool(entry), entry->handle);
> > +       zpool_free(entry->pool->zpool, entry->handle);
> >         zswap_pool_put(entry->pool);
> >         if (entry->objcg) {
> >                 obj_cgroup_uncharge_zswap(entry->objcg, entry->length);
> > @@ -939,7 +917,7 @@ static bool zswap_compress(struct folio *folio, struct zswap_entry *entry)
> >         if (comp_ret)
> >                 goto unlock;
> >
> > -       zpool = zswap_find_zpool(entry);
> > +       zpool = entry->pool->zpool;
> >         gfp = __GFP_NORETRY | __GFP_NOWARN | __GFP_KSWAPD_RECLAIM;
> >         if (zpool_malloc_support_movable(zpool))
> >                 gfp |= __GFP_HIGHMEM | __GFP_MOVABLE;
> > @@ -968,7 +946,7 @@ static bool zswap_compress(struct folio *folio, struct zswap_entry *entry)
> >
> >  static void zswap_decompress(struct zswap_entry *entry, struct folio *folio)
> >  {
> > -       struct zpool *zpool = zswap_find_zpool(entry);
> > +       struct zpool *zpool = entry->pool->zpool;
> >         struct scatterlist input, output;
> >         struct crypto_acomp_ctx *acomp_ctx;
> >         u8 *src;
> > @@ -1467,7 +1445,7 @@ bool zswap_store(struct folio *folio)
> >         return true;
> >
> >  store_failed:
> > -       zpool_free(zswap_find_zpool(entry), entry->handle);
> > +       zpool_free(entry->pool->zpool, entry->handle);
> >  put_pool:
> >         zswap_pool_put(entry->pool);
> >  freepage:
> > @@ -1683,7 +1661,7 @@ static int zswap_setup(void)
> >         pool = __zswap_pool_create_fallback();
> >         if (pool) {
> >                 pr_info("loaded using pool %s/%s\n", pool->tfm_name,
> > -                       zpool_get_type(pool->zpools[0]));
> > +                       zpool_get_type(pool->zpool));
> >                 list_add(&pool->list, &zswap_pools);
> >                 zswap_has_pool = true;
> >                 static_branch_enable(&zswap_ever_enabled);
> >
> > --
> > 2.45.2
> >
> >
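
For anyone reading the quoted diff without the tree at hand, the pool
selection that the patch removes (zswap_find_zpool(), i.e.
hash_ptr(entry, ilog2(ZSWAP_NR_ZPOOLS))) can be roughly sketched in user
space as below. This is only an illustration, not kernel code: the names
sketch_hash_ptr(), pool_index_before() and pool_index_after() are made up
for this note, and the hash assumes the generic 64-bit multiplicative
variant with the golden-ratio constant, which may differ from what a
given architecture actually uses.

```c
#include <assert.h>
#include <stdint.h>

#define NR_ZPOOLS       32u     /* mirrors the removed ZSWAP_NR_ZPOOLS */
#define NR_ZPOOLS_BITS  5u      /* ilog2(32) */

/* Assumed constant: the 64-bit golden-ratio multiplier of the generic hash. */
#define GOLDEN_RATIO_64 0x61C8864680B583EBull

/*
 * Hypothetical user-space stand-in for hash_ptr(ptr, bits): multiply the
 * pointer value by the constant and keep the top 'bits' bits.
 */
static unsigned int sketch_hash_ptr(const void *ptr, unsigned int bits)
{
	assert(bits > 0 && bits <= 32);

	uint64_t val = (uint64_t)(uintptr_t)ptr;

	return (unsigned int)((val * GOLDEN_RATIO_64) >> (64 - bits));
}

/* Before the patch: the entry's address picks one of the 32 zpools. */
static unsigned int pool_index_before(const void *entry)
{
	return sketch_hash_ptr(entry, NR_ZPOOLS_BITS);
}

/* After the patch: there is a single zpool, so every entry maps to it. */
static unsigned int pool_index_after(const void *entry)
{
	(void)entry;
	return 0;
}
```

The point of the sketch is just that the old scheme spread entries, and
therefore lock contention, across 32 backend pools keyed by the entry
address, while the new scheme relies on the backend's own locking to
scale with a single pool.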