From mboxrd@z Thu Jan 1 00:00:00 1970
Message-ID: <2a22e9b5-dc8c-4c4d-81c2-2f4e1850cf3d@bytedance.com>
Date: Wed, 20 Dec 2023 20:20:49 +0800
Subject: Re: [PATCH v3 6/6] mm/zswap: directly use percpu mutex and buffer in load/store
From: Chengming Zhou <zhouchengming@bytedance.com>
To: Yosry Ahmed, Nhat Pham, Chris Li
Cc: Seth Jennings, Vitaly Wool, Dan Streetman, Johannes Weiner, Andrew Morton, linux-kernel@vger.kernel.org, linux-mm@kvack.org
References: <20231213-zswap-dstmem-v3-0-4eac09b94ece@bytedance.com> <20231213-zswap-dstmem-v3-6-4eac09b94ece@bytedance.com>
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit
On 2023/12/20 05:39, Yosry Ahmed wrote:
> On Tue, Dec 19, 2023 at 10:43 AM Nhat Pham wrote:
>>
>> On Tue, Dec 19, 2023 at 5:29 AM Chris Li wrote:
>>>
>>> Hi Chengming and Yosry,
>>>
>>> On Mon, Dec 18, 2023 at 3:50 AM Chengming Zhou wrote:
>>>>
>>>> Since the introduction of dstmem reuse in the load path, it seems
>>>> confusing that we are now using acomp_ctx->dstmem and acomp_ctx->mutex
>>>> for purposes other than what the naming suggests.
>>>>
>>>> Yosry suggested removing these two fields from acomp_ctx, directly
>>>> using zswap_dstmem and zswap_mutex in both the load and store paths,
>>>> renaming them, and adding proper comments above their definitions that
>>>> they are for generic percpu buffering on the load and store paths.
>>>>
>>>> So this patch removes dstmem and mutex from acomp_ctx, renames
>>>> zswap_dstmem to zswap_buffer, and uses the percpu mutex and buffer on
>>>> the load and store paths.
>>>
>>> Sorry for joining this discussion late.
>>>
>>> I get the rename of "dstmem" to "buffer", because the buffer is used
>>> for both load and store. What I don't get is why we move it out of
>>> the acomp_ctx struct. Now we have 3 per-cpu entries: buffer, mutex
>>> and acomp_ctx. I think we should do the reverse: fold these three
>>> per-cpu entries into one struct, the acomp_ctx. Each per-cpu lookup
>>> involves a sequence of dancing around the cpu id, disabling
>>> preemption, etc., while each load of a struct member is just a plain
>>> memory load. It seems to me it would be more optimal to combine these
>>> three per-cpu entries into acomp_ctx and do the per-cpu lookup for
>>> the acomp_ctx only once.
>>
>> I agree with Chris. From a practicality POV, what Chris says here
>> makes sense. From a semantic POV, this buffer is only used in
>> (de)compression contexts - be it in store, load, or writeback - so it
>> belonging to the original struct still makes sense to me. Why separate
>> it out, without any benefit? Just rename the old field to buffer or
>> zswap_buffer and call it a day? It will be a smaller patch too!
>>
> My main concern is that the struct name is specific to the crypto
> acomp stuff, but that buffer and mutex are not.
> How about we keep it in the struct, but refactor the struct as follows:
>
> struct zswap_ctx {
>         struct {
>                 struct crypto_acomp *acomp;
>                 struct acomp_req *req;
>                 struct crypto_wait wait;
>         } acomp_ctx;
>         u8 *dstmem;
>         struct mutex *mutex;
> };
>
> , and then rename zswap_pool.acomp_ctx to zswap_pool.ctx?

I think there are two viewpoints here, and both work for me.

The first is ownership or lifetime: the percpu mutex and dstmem are
shared between all pools, so no single pool owns them, nor does the
percpu acomp_ctx in each pool.

The second is usage: the percpu mutex and dstmem are used by the percpu
acomp_ctx in each pool, so it's convenient to put pointers to them in
the percpu acomp_ctx.

Actually, I think it's simpler to let each percpu acomp_ctx have its own
mutex and dstmem, which in fact are necessary parts whenever it uses the
acomp interfaces. This way, we could delete the separate percpu mutex
and dstmem and their cpuhp callbacks, and they would no longer be shared
between all pools.

Maybe we will have many pools at the same time in the future, like
different compression algorithms or different zpools for different memcg
workloads. Who knows? :-)

So how about the patch below? Just an RFC.
Subject: [PATCH] mm/zswap: make each acomp_ctx have its own mutex and dstmem

Signed-off-by: Chengming Zhou
---
 include/linux/cpuhotplug.h |  1 -
 mm/zswap.c                 | 86 ++++++++++++--------------------------
 2 files changed, 26 insertions(+), 61 deletions(-)

diff --git a/include/linux/cpuhotplug.h b/include/linux/cpuhotplug.h
index efc0c0b07efb..c3e06e21766a 100644
--- a/include/linux/cpuhotplug.h
+++ b/include/linux/cpuhotplug.h
@@ -124,7 +124,6 @@ enum cpuhp_state {
 	CPUHP_ARM_BL_PREPARE,
 	CPUHP_TRACE_RB_PREPARE,
 	CPUHP_MM_ZS_PREPARE,
-	CPUHP_MM_ZSWP_MEM_PREPARE,
 	CPUHP_MM_ZSWP_POOL_PREPARE,
 	CPUHP_KVM_PPC_BOOK3S_PREPARE,
 	CPUHP_ZCOMP_PREPARE,
diff --git a/mm/zswap.c b/mm/zswap.c
index 2c349fd88904..37301f1a80a9 100644
--- a/mm/zswap.c
+++ b/mm/zswap.c
@@ -694,63 +694,31 @@ static void zswap_alloc_shrinker(struct zswap_pool *pool)
 /*********************************
 * per-cpu code
 **********************************/
-static DEFINE_PER_CPU(u8 *, zswap_dstmem);
-/*
- * If users dynamically change the zpool type and compressor at runtime, i.e.
- * zswap is running, zswap can have more than one zpool on one cpu, but they
- * are sharing dtsmem. So we need this mutex to be per-cpu.
- */
-static DEFINE_PER_CPU(struct mutex *, zswap_mutex);
-
-static int zswap_dstmem_prepare(unsigned int cpu)
-{
-	struct mutex *mutex;
-	u8 *dst;
-
-	dst = kmalloc_node(PAGE_SIZE, GFP_KERNEL, cpu_to_node(cpu));
-	if (!dst)
-		return -ENOMEM;
-
-	mutex = kmalloc_node(sizeof(*mutex), GFP_KERNEL, cpu_to_node(cpu));
-	if (!mutex) {
-		kfree(dst);
-		return -ENOMEM;
-	}
-
-	mutex_init(mutex);
-	per_cpu(zswap_dstmem, cpu) = dst;
-	per_cpu(zswap_mutex, cpu) = mutex;
-	return 0;
-}
-
-static int zswap_dstmem_dead(unsigned int cpu)
-{
-	struct mutex *mutex;
-	u8 *dst;
-
-	mutex = per_cpu(zswap_mutex, cpu);
-	kfree(mutex);
-	per_cpu(zswap_mutex, cpu) = NULL;
-
-	dst = per_cpu(zswap_dstmem, cpu);
-	kfree(dst);
-	per_cpu(zswap_dstmem, cpu) = NULL;
-
-	return 0;
-}
-
 static int zswap_cpu_comp_prepare(unsigned int cpu, struct hlist_node *node)
 {
 	struct zswap_pool *pool = hlist_entry(node, struct zswap_pool, node);
 	struct crypto_acomp_ctx *acomp_ctx = per_cpu_ptr(pool->acomp_ctx, cpu);
 	struct crypto_acomp *acomp;
 	struct acomp_req *req;
+	int ret = 0;
+
+	acomp_ctx->dstmem = kmalloc_node(PAGE_SIZE, GFP_KERNEL, cpu_to_node(cpu));
+	if (!acomp_ctx->dstmem)
+		return -ENOMEM;
+
+	acomp_ctx->mutex = kmalloc_node(sizeof(struct mutex), GFP_KERNEL, cpu_to_node(cpu));
+	if (!acomp_ctx->mutex) {
+		ret = -ENOMEM;
+		goto mutex_fail;
+	}
+	mutex_init(acomp_ctx->mutex);
 
 	acomp = crypto_alloc_acomp_node(pool->tfm_name, 0, 0, cpu_to_node(cpu));
 	if (IS_ERR(acomp)) {
 		pr_err("could not alloc crypto acomp %s : %ld\n",
				pool->tfm_name, PTR_ERR(acomp));
-		return PTR_ERR(acomp);
+		ret = PTR_ERR(acomp);
+		goto acomp_fail;
 	}
 	acomp_ctx->acomp = acomp;
 
@@ -758,8 +726,8 @@ static int zswap_cpu_comp_prepare(unsigned int cpu, struct hlist_node *node)
 	if (!req) {
 		pr_err("could not alloc crypto acomp_request %s\n",
		       pool->tfm_name);
-		crypto_free_acomp(acomp_ctx->acomp);
-		return -ENOMEM;
+		ret = -ENOMEM;
+		goto req_fail;
 	}
 	acomp_ctx->req = req;
 
@@ -772,10 +740,15 @@ static int zswap_cpu_comp_prepare(unsigned int cpu, struct hlist_node *node)
 	acomp_request_set_callback(req, CRYPTO_TFM_REQ_MAY_BACKLOG,
				   crypto_req_done, &acomp_ctx->wait);
 
-	acomp_ctx->mutex = per_cpu(zswap_mutex, cpu);
-	acomp_ctx->dstmem = per_cpu(zswap_dstmem, cpu);
-
 	return 0;
+
+req_fail:
+	crypto_free_acomp(acomp_ctx->acomp);
+acomp_fail:
+	kfree(acomp_ctx->mutex);
+mutex_fail:
+	kfree(acomp_ctx->dstmem);
+
+	return ret;
 }
 
 static int zswap_cpu_comp_dead(unsigned int cpu, struct hlist_node *node)
@@ -788,6 +761,8 @@ static int zswap_cpu_comp_dead(unsigned int cpu, struct hlist_node *node)
 			acomp_request_free(acomp_ctx->req);
 		if (!IS_ERR_OR_NULL(acomp_ctx->acomp))
 			crypto_free_acomp(acomp_ctx->acomp);
+		kfree(acomp_ctx->mutex);
+		kfree(acomp_ctx->dstmem);
 	}
 
 	return 0;
@@ -1901,13 +1876,6 @@ static int zswap_setup(void)
 		goto cache_fail;
 	}
 
-	ret = cpuhp_setup_state(CPUHP_MM_ZSWP_MEM_PREPARE, "mm/zswap:prepare",
-				zswap_dstmem_prepare, zswap_dstmem_dead);
-	if (ret) {
-		pr_err("dstmem alloc failed\n");
-		goto dstmem_fail;
-	}
-
 	ret = cpuhp_setup_state_multi(CPUHP_MM_ZSWP_POOL_PREPARE,
				      "mm/zswap_pool:prepare",
				      zswap_cpu_comp_prepare,
@@ -1939,8 +1907,6 @@ static int zswap_setup(void)
 	if (pool)
 		zswap_pool_destroy(pool);
 hp_fail:
-	cpuhp_remove_state(CPUHP_MM_ZSWP_MEM_PREPARE);
-dstmem_fail:
 	kmem_cache_destroy(zswap_entry_cache);
 cache_fail:
 	/* if built-in, we aren't unloaded on failure; don't allow use */
-- 
2.20.1