From: Nhat Pham <nphamcs@gmail.com>
To: "Kanchana P. Sridhar" <kanchanapsridhar2026@gmail.com>
Cc: hannes@cmpxchg.org, yosry@kernel.org, chengming.zhou@linux.dev,
akpm@linux-foundation.org, linux-mm@kvack.org,
linux-kernel@vger.kernel.org, herbert@gondor.apana.org.au,
senozhatsky@chromium.org
Subject: Re: [PATCH v3 2/2] mm: zswap: Tie per-CPU acomp_ctx lifetime to the pool.
Date: Sun, 12 Apr 2026 17:42:02 -0700
Message-ID: <CAKEwX=PMbbuMeo9Ppk=cW_hAthZ-cf0rMjALsETpDZzDj6oYJQ@mail.gmail.com>
In-Reply-To: <20260331183351.29844-3-kanchanapsridhar2026@gmail.com>

On Tue, Mar 31, 2026 at 11:34 AM Kanchana P. Sridhar
<kanchanapsridhar2026@gmail.com> wrote:
>
> Currently, per-CPU acomp_ctx are allocated on pool creation and/or CPU
> hotplug, and destroyed on pool destruction or CPU hotunplug. This
> complicates lifetime management solely to save memory while a CPU is
> offlined, which is not a common case.
>
> Simplify lifetime management by allocating per-CPU acomp_ctx once on
> pool creation (or CPU hotplug for CPUs onlined later), and keeping them
> allocated until the pool is destroyed.
>
> Refactor cleanup code from zswap_cpu_comp_dead() into
> acomp_ctx_free() to be used elsewhere.
>
> The main benefit of using the CPU hotplug multi state instance startup
> callback to allocate the acomp_ctx resources is that it prevents the
> cores from being offlined until the multi state instance addition call
> returns.
>
> From Documentation/core-api/cpu_hotplug.rst:
>
> "The node list add/remove operations and the callback invocations are
> serialized against CPU hotplug operations."
>
> Furthermore, zswap_[de]compress() cannot contend with
> zswap_cpu_comp_prepare() because:
>
> - During pool creation/deletion, the pool is not in the zswap_pools
> list.
>
> - During CPU hot[un]plug, the CPU is not yet online, as Yosry pointed
> out. zswap_cpu_comp_prepare() will be run on a control CPU,
> since CPUHP_MM_ZSWP_POOL_PREPARE is in the PREPARE section of "enum
> cpuhp_state".
>
> In both these cases, any recursions into zswap reclaim from
> zswap_cpu_comp_prepare() will be handled by the old pool.
>
> The above two observations enable the following simplifications:
>
> 1) zswap_cpu_comp_prepare():
>
> a) acomp_ctx mutex locking:
>
> If the process gets migrated while zswap_cpu_comp_prepare() is
> running, it will complete on the new CPU. In case of failures, we
> pass the acomp_ctx pointer obtained at the start of
> zswap_cpu_comp_prepare() to acomp_ctx_free(), which again, can
> only undergo migration. There appear to be no contention
> scenarios that might cause inconsistent values of acomp_ctx's
> members. Hence, it seems there is no need for
> mutex_lock(&acomp_ctx->mutex) in zswap_cpu_comp_prepare().
>
> b) acomp_ctx mutex initialization:
>
> Since the pool is not yet on zswap_pools list, we don't need to
> initialize the per-CPU acomp_ctx mutex in
> zswap_pool_create(). This has been restored to occur in
> zswap_cpu_comp_prepare().
>
> c) Subsequent CPU offline-online transitions:
>
> zswap_cpu_comp_prepare() checks upfront if acomp_ctx->acomp is
> valid. If so, it returns success. This should handle any CPU
> hotplug online-offline transitions after pool creation is done.
>
> 2) CPU offline vis-a-vis zswap ops:
>
> Suppose the process is migrated to another CPU before the current
> CPU goes offline. If zswap_[de]compress() holds the
> acomp_ctx->mutex lock of the offlined CPU, that mutex will be
> released once it completes on the new CPU. Since there is no
> teardown callback, there is no possibility of UAF.
>
> 3) Pool creation/deletion and process migration to another CPU:
>
> During pool creation/deletion, the pool is not in the zswap_pools
> list. Hence it cannot contend with zswap ops on that CPU. However,
> the process can get migrated.
>
> a) Pool creation --> zswap_cpu_comp_prepare()
> --> process migrated:
> * Old CPU offline: no-op.
> * zswap_cpu_comp_prepare() continues
> to run on the new CPU to finish
> allocating acomp_ctx resources for
> the offlined CPU.
>
> b) Pool deletion --> acomp_ctx_free()
> --> process migrated:
> * Old CPU offline: no-op.
> * acomp_ctx_free() continues
> to run on the new CPU to finish
> de-allocating acomp_ctx resources
> for the offlined CPU.
>
> 4) Pool deletion vis-a-vis CPU onlining:
>
> The call to cpuhp_state_remove_instance() cannot race with
> zswap_cpu_comp_prepare() because of hotplug synchronization.
>
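The lifetime rules in points 1) to 4) above can be modeled in plain userspace C. This is only a sketch under stated assumptions, not the patch itself: the `*_model` names, `NR_CPUS_MODEL`, `BUF_SIZE`, and the `malloc()`-backed fields are hypothetical stand-ins for the kernel objects. It shows the two properties the description relies on: the prepare step is idempotent (a second call for the same CPU, as after an offline/online cycle, returns success without reallocating), and resources are released only when the pool is destroyed.

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>

#define NR_CPUS_MODEL 4		/* hypothetical CPU count for the model */
#define BUF_SIZE      8192	/* hypothetical per-CPU buffer size */

/* Simplified stand-in for the per-CPU context; the types are placeholders. */
struct acomp_ctx_model {
	void *acomp;	/* stands in for the compression handle */
	void *buffer;	/* stands in for the per-CPU scratch buffer */
};

struct pool_model {
	struct acomp_ctx_model ctx[NR_CPUS_MODEL];
};

/* Release everything a context owns and reset it to the unallocated state. */
static void acomp_ctx_free_model(struct acomp_ctx_model *ctx)
{
	free(ctx->buffer);
	free(ctx->acomp);
	ctx->buffer = NULL;
	ctx->acomp = NULL;
}

/* Models the startup callback: allocate once, then succeed immediately. */
static int cpu_comp_prepare_model(struct pool_model *pool, int cpu)
{
	struct acomp_ctx_model *ctx;

	assert(cpu >= 0 && cpu < NR_CPUS_MODEL);
	ctx = &pool->ctx[cpu];

	/* Already set up by an earlier online transition: nothing to do. */
	if (ctx->acomp)
		return 0;

	ctx->buffer = malloc(BUF_SIZE);
	ctx->acomp = malloc(64);
	if (!ctx->buffer || !ctx->acomp) {
		/* Failure path frees via the pointer taken at the start. */
		acomp_ctx_free_model(ctx);
		return -ENOMEM;
	}
	return 0;
}

/* There is no per-CPU teardown: contexts are freed only with the pool. */
static void pool_destroy_model(struct pool_model *pool)
{
	for (int cpu = 0; cpu < NR_CPUS_MODEL; cpu++)
		acomp_ctx_free_model(&pool->ctx[cpu]);
}
```

Calling `cpu_comp_prepare_model()` twice for the same CPU leaves the first allocation in place, which corresponds to the early return described in 1c).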
> The current acomp_ctx_get_cpu_lock()/acomp_ctx_put_unlock() are
> deleted. Instead, zswap_[de]compress() directly call
> mutex_[un]lock(&acomp_ctx->mutex).
>
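As a rough illustration of the direct locking described above, here is a userspace model using a pthread mutex in place of the kernel mutex. Everything in it is an assumption made for illustration: `locked_ctx_model`, `ctxs_init_model()` and `compress_model()` are hypothetical names, and the "compression" is just a bounded copy. The point it tries to show is that the caller locks the mutex embedded in the context it looked up, so it does not matter which CPU the task is running on by the time it unlocks.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>
#include <string.h>

#define NR_CPUS_MODEL 4		/* hypothetical CPU count for the model */

/* Stand-in for a per-CPU context that carries its own mutex. */
struct locked_ctx_model {
	pthread_mutex_t mutex;
	unsigned char buffer[64];
	size_t len;
};

static struct locked_ctx_model ctxs[NR_CPUS_MODEL];

/* One-time setup, analogous to initializing the mutex in the prepare step. */
static void ctxs_init_model(void)
{
	for (int cpu = 0; cpu < NR_CPUS_MODEL; cpu++) {
		pthread_mutex_init(&ctxs[cpu].mutex, NULL);
		ctxs[cpu].len = 0;
	}
}

/*
 * "cpu" is the CPU the caller was on when it picked the context. The
 * context stays valid and locked even if the caller later runs elsewhere,
 * because nothing frees it before the pool goes away.
 */
static size_t compress_model(int cpu, const void *src, size_t len)
{
	struct locked_ctx_model *ctx;

	assert(cpu >= 0 && cpu < NR_CPUS_MODEL);
	ctx = &ctxs[cpu];

	pthread_mutex_lock(&ctx->mutex);
	if (len > sizeof(ctx->buffer))
		len = sizeof(ctx->buffer);	/* clamp to the scratch buffer */
	memcpy(ctx->buffer, src, len);
	ctx->len = len;
	pthread_mutex_unlock(&ctx->mutex);

	return len;
}
```

Because the context is never torn down on CPU offline in this model, no get/put helper is needed around the lock.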
> The per-CPU memory cost of not deleting the acomp_ctx resources upon CPU
> offlining, and only deleting them when the pool is destroyed, is 8.28 KB
> on x86_64. This cost is only paid when a CPU is offlined, until it is
> onlined again.
>
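To put the quoted figure in perspective, the retained memory scales linearly with the number of CPUs that are currently offlined. The helper below is a hypothetical illustration, not part of the patch; it takes the 8.28 KB per-CPU figure from the description as given and works in hundredths of a KB to stay in integer arithmetic.

```c
#include <assert.h>

/* 8.28 KB per CPU, as stated in the patch description, in 1/100 KB. */
#define ACOMP_CTX_COST_CENTI_KB 828L

/* Memory kept allocated for offlined CPUs, in hundredths of a KB. */
static long retained_centi_kb(long offlined_cpus)
{
	assert(offlined_cpus >= 0);
	return offlined_cpus * ACOMP_CTX_COST_CENTI_KB;
}
```

For example, `retained_centi_kb(16)` returns 13248, meaning 132.48 KB would stay allocated per pool while 16 CPUs are offline.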
> Co-developed-by: Kanchana P. Sridhar <kanchanapsridhar2026@gmail.com>
> Signed-off-by: Kanchana P. Sridhar <kanchanapsridhar2026@gmail.com>
> Signed-off-by: Kanchana P Sridhar <kanchana.p.sridhar@intel.com>

Lol.

> Acked-by: Yosry Ahmed <yosry@kernel.org>

Thanks for simplifying this :) My brain always hurts when I have to
handle CPU offlining for per-cpu structures. I had to deal with this
because I added per-CPU caching for a structure (with reference
counting) in another patch series of mine :)

Acked-by: Nhat Pham <nphamcs@gmail.com>
Thread overview: 9+ messages
2026-03-31 18:33 [PATCH v3 0/2] zswap pool per-CPU acomp_ctx simplifications Kanchana P. Sridhar
2026-03-31 18:33 ` [PATCH v3 1/2] mm: zswap: Remove redundant checks in zswap_cpu_comp_dead() Kanchana P. Sridhar
2026-04-13 0:34 ` Nhat Pham
2026-03-31 18:33 ` [PATCH v3 2/2] mm: zswap: Tie per-CPU acomp_ctx lifetime to the pool Kanchana P. Sridhar
2026-04-13 0:42 ` Nhat Pham [this message]
2026-03-31 19:22 ` [PATCH v3 0/2] zswap pool per-CPU acomp_ctx simplifications Yosry Ahmed
2026-03-31 20:58 ` Kanchana P. Sridhar
2026-03-31 22:19 ` Andrew Morton
2026-03-31 22:38 ` Kanchana P. Sridhar