From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by smtp.lore.kernel.org (Postfix) with ESMTP id 337BAE77197 for ; Tue, 7 Jan 2025 18:14:25 +0000 (UTC) Received: by kanga.kvack.org (Postfix) id A37F16B007B; Tue, 7 Jan 2025 13:14:24 -0500 (EST) Received: by kanga.kvack.org (Postfix, from userid 40) id 9E8836B00A4; Tue, 7 Jan 2025 13:14:24 -0500 (EST) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id 8AFC86B00A8; Tue, 7 Jan 2025 13:14:24 -0500 (EST) X-Delivered-To: linux-mm@kvack.org Received: from relay.hostedemail.com (smtprelay0014.hostedemail.com [216.40.44.14]) by kanga.kvack.org (Postfix) with ESMTP id 6DB676B00A4 for ; Tue, 7 Jan 2025 13:14:24 -0500 (EST) Received: from smtpin24.hostedemail.com (a10.router.float.18 [10.200.18.1]) by unirelay08.hostedemail.com (Postfix) with ESMTP id 23566140B83 for ; Tue, 7 Jan 2025 18:14:24 +0000 (UTC) X-FDA: 82981455648.24.7FC1A7E Received: from mail-qv1-f52.google.com (mail-qv1-f52.google.com [209.85.219.52]) by imf10.hostedemail.com (Postfix) with ESMTP id 44218C0003 for ; Tue, 7 Jan 2025 18:14:22 +0000 (UTC) Authentication-Results: imf10.hostedemail.com; dkim=pass header.d=google.com header.s=20230601 header.b="Jwj8i/yR"; spf=pass (imf10.hostedemail.com: domain of yosryahmed@google.com designates 209.85.219.52 as permitted sender) smtp.mailfrom=yosryahmed@google.com; dmarc=pass (policy=reject) header.from=google.com ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=hostedemail.com; s=arc-20220608; t=1736273662; h=from:from:sender:reply-to:subject:subject:date:date: message-id:message-id:to:to:cc:cc:mime-version:mime-version: content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references:dkim-signature; bh=oG61dw26vvXoPcjn2vS9jiBka5lWOF0xVLwsEZhkvY4=; b=XznEQu2Pzt78WSo3b1+zfixMjzTc2GFuuV/63B4/hZojHGr86PIg7J3x9XmHKxSbD/YH8T qEBHkZarmyvSmOqKibbZR4TbjdCOYAVFQWSAZHGJ7LqRoFTg3t0H99MOYYyatEYZFtnAuZ zFGm68vWlzbJxjSd3CxcTXHww8dT+Ag= ARC-Authentication-Results: i=1; imf10.hostedemail.com; dkim=pass header.d=google.com header.s=20230601 header.b="Jwj8i/yR"; spf=pass (imf10.hostedemail.com: domain of yosryahmed@google.com designates 209.85.219.52 as permitted sender) smtp.mailfrom=yosryahmed@google.com; dmarc=pass (policy=reject) header.from=google.com ARC-Seal: i=1; s=arc-20220608; d=hostedemail.com; t=1736273662; a=rsa-sha256; cv=none; b=lCiCscLCjcPgA7IOgIu8NU0yRqe4EVOSFZUf7TioCTnUF1TPOQK1m/nGaSu8DJHkivAZWx Hn3OdECezlX1u0xZ+uuzcHMSh8DU/jJ/JuG+ZQbjhmbuBOSd0JX8iapjkwvm2ehUyLtABe 5H93eHTGWIvDC1ZVRCDcOohLo8d/lzo= Received: by mail-qv1-f52.google.com with SMTP id 6a1803df08f44-6dcdd9a3e54so157845446d6.3 for ; Tue, 07 Jan 2025 10:14:22 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=20230601; t=1736273661; x=1736878461; darn=kvack.org; h=content-transfer-encoding:cc:to:subject:message-id:date:from :in-reply-to:references:mime-version:from:to:cc:subject:date :message-id:reply-to; bh=oG61dw26vvXoPcjn2vS9jiBka5lWOF0xVLwsEZhkvY4=; b=Jwj8i/yRGLZrWyUxRfnXbqjtGA1o5Brs09eb219cMu6slrmKStmCDiaqVDV8mENOxQ WeRCMBhkjHtTAYTzBpqDoiyqdldr9CsIpqzJhdRs353SqNUGsKEVpxmei/KdCZdV8MU4 rfwU2jpNznHZV7F0efh7+/Yq6Owa7xfnJ8l5y9J6X755zHW0UdP1qv2M7RYowjSZEEgx Se74yCPUOr9UHVu8fLaZizto0Pgi+4WVdYv+K6vPtafIT98PrBkauVwlG1YbqxEQY0iv ndT5fI44cQckCC7QkPJgnbrW0tq2OEZK5345KlRvx82+STr6EwY5GPfOFS0uji+ygDPm 18OA== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1736273661; x=1736878461; h=content-transfer-encoding:cc:to:subject:message-id:date:from :in-reply-to:references:mime-version:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=oG61dw26vvXoPcjn2vS9jiBka5lWOF0xVLwsEZhkvY4=; b=IPlJKtx+wlAcslpIxHAdIjP0G5MUXRJ8F8+SNC3Zre1MEBbM1JsXwWBmwPaUvAwXlT DjDDlek5WCNumgUx25dB8cRmA5MFG7Pcj0yueI6d7sZU83S5it5MCj0UOsIwDfX06pRk Z1tzOQupilT0O3lilBe4YYu5VTNPJLoCv3OYdC+bKRlcX+YmBf6TgXopYAIKQFLbwo// fRS26Jk4Hms0quYsELDNeco1P1XgfUUQ7Eeoc4wAbeyXZqfBGtj0inXlhDQnrpxHN0vV LKujOSZc/AHclvOW3W30ju8H5TLBcTWfgPheg/ZhSBiYYFB9a9jKNzSz1wv79/qE1NJQ JkYA== X-Forwarded-Encrypted: i=1; AJvYcCXkJ4EBOWdEnvwWb8fHUgNLOJ4TFWv6/EEe2g8ghlTkCkRhkPvkwEOF6+ifpm82UZparSJWf6Ko2A==@kvack.org X-Gm-Message-State: AOJu0YzW0axL4Nkrli8dFsmyXAgyxQvQQwBOEl52n73Id3hzroCxi52M J+L1N6ZS7I6UQR/TH54Jb2MAZ7cYjUGvMWfwKj3m1G0E5OkTSqpyvIHtp41UNYKVDYebWxN30Df vInIGixIxaZVKHjCiul0j3XdkR8rHwots0thL X-Gm-Gg: ASbGnctCnjId/H9RYtj6vcakklvfIGPG+EwssYhzdwZsIaR7PgVBPK1dFicPwIw8OHo xnTJxuzZRWPAJPxbgh8gl0PBOl+XgqVhPijWbLocwzEgSvalPeADMw+rsxxrlr0a45AA= X-Google-Smtp-Source: AGHT+IFV+lxuKL2KP92p5pI89yhB4KNGKqG5ib6N9eKaNVmY+2f0dzgmRVshSSL1WDV/rFTjSFFI7ui/PF6lHR2evOQ= X-Received: by 2002:ad4:5bc3:0:b0:6d8:e5f4:b969 with SMTP id 6a1803df08f44-6df9b1f4300mr607216d6.10.1736273661036; Tue, 07 Jan 2025 10:14:21 -0800 (PST) MIME-Version: 1.0 References: <20250107074724.1756696-1-yosryahmed@google.com> <20250107074724.1756696-2-yosryahmed@google.com> <20250107180345.GD37530@cmpxchg.org> In-Reply-To: <20250107180345.GD37530@cmpxchg.org> From: Yosry Ahmed Date: Tue, 7 Jan 2025 10:13:44 -0800 X-Gm-Features: AbW1kvbDEG3OoD4yS1bJmqQIUzrjE3LTEH-Qw1MG0uO_F4UgZIN2Nh1JLId1D4I Message-ID: Subject: Re: [PATCH RESEND 2/2] mm: zswap: use SRCU to synchronize with CPU hotunplug To: Johannes Weiner Cc: Andrew Morton , Nhat Pham , Chengming Zhou , Vitaly Wool , Barry Song , Sam Sun , linux-mm@kvack.org, linux-kernel@vger.kernel.org, stable@vger.kernel.org Content-Type: text/plain; charset="UTF-8" Content-Transfer-Encoding: quoted-printable X-Rspamd-Server: rspam05 X-Stat-Signature: f3kjktf9oj6f3e1ub8uzr78wizitec3d X-Rspamd-Queue-Id: 44218C0003 X-Rspam-User: X-HE-Tag: 1736273662-488890 X-HE-Meta: U2FsdGVkX1+Br3StGTWr6yaUT8WfJJE6mGM1isrfk0btjLy5bRVB48JLC6uGhCJq3/AlPXXBgM8fLEVfedQUSIZbX5FUdbSx0wj/NKW+8j8Cbq//xqp7H4yq6x7wmCNhpwg+8RhHB9XCVEIfMWutw/frvaw5lHmasIGyDkkd3dZGVKcr+7rtxiAkPUTOXFwdiWfKI9SV102r3Mbe3OUiQyAB+/qIB20ACLTiaNrszrgwj3/YydZAWbNrqehJDm8JQ/zGNFKkHc1p53nMunVFVQFKZhTIf2TGuQfkKJUZPk/IWM98stzCtiFYI1q9EADZNhVsZPxlwjhT/oQ/HJH5mo506E2UBz05zGwxyuqL7o8tAybreh4PJQ+MQFNgyRnCOF9JWLDAnrTN1FVAYV3dfEGkPhWD31WKoPyy4YOSLol95RjPttQmUo3b5QA4TjTvheEzaZoIwKBkX393UeUkgNTzWLjl7iSGq9TISxBFIIsboTDe6wYaRNjf+e9TbVaYUwJqMeF5A1lkY8/bOg2fgjlA+2wMph7H9vUCERV8m30ghOieyAtRkKVRr2RijFvA8pwz4RqwZe1mzy7gtQhiIE0Du/Adw3+nT3WM/QARSekyeLzcbJ9C0JIm3pqX5kapCdXw0J/vmOitmjm4IcxaWcxm3+828uz3pGFudFABaf8pwo/dSs2jHqK9W/anbHRJPSkpesTVyDzy8SnO531+P3M5ZF9w6yBupSArY9ZahuNt/FSVwCcsovgVtA/uOQccrGIvzT9Kaq/0uvrK5BIN+Kj1XTM7tDcqPgC9fmHgHUuz+wdIoyHu7OkqDBpGQSKZFRPl/8XhMt2YWE++GKouZ/1O6nYnffeimFdgh+umsHL1OZpQZoOqR8VF3McDpGuAyJLyf121A8XantbwhEZxFWiUsZQJ+fYiwZxwDF2zQ8ZkYS+Kd/Igg9NFHmWxDBJrx2EXjvmGEhPM/v+aYoX TR9o9kHb AILrwwKQ2l4HoWKkjz5czb5sRFR358tszPWYn7oigwPcLsSO7P0c++Adc/DYk7Pvl/spaJbWHceKjpu0qFrVxAxdP0gtxWaKuSoNTr1twcDw8ghyOUGTQzN7Ydz4bo0SGla/KKrfC+pKXtHVz+Lm+7CX3922ryQgGuDrLSL2NYCkkJ1tuz8Pb82aYRbiiiGFHX2gr1oZ/GWLzJSbY+j1cxC+/xAap3Z1Pptbk6/vzqfrD/SnRDMUBjqmfGvxJbK0hZ1w2QC7ywi2ceuOljKN+vZh06QZo7RtLs2M2ZUBpJnKL4Khe+ZenwmpMGiaTLb4YKWs5QMzzksRqERs= X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: List-Subscribe: List-Unsubscribe: On Tue, Jan 7, 2025 at 10:03=E2=80=AFAM Johannes Weiner wrote: > > On Tue, Jan 07, 2025 at 07:47:24AM +0000, Yosry Ahmed wrote: > > In zswap_compress() and zswap_decompress(), the per-CPU acomp_ctx of th= e > > current CPU at the beginning of the operation is retrieved and used > > throughout. However, since neither preemption nor migration are disabl= ed, > > it is possible that the operation continues on a different CPU. > > > > If the original CPU is hotunplugged while the acomp_ctx is still in use= , > > we run into a UAF bug as the resources attached to the acomp_ctx are fr= eed > > during hotunplug in zswap_cpu_comp_dead(). > > > > The problem was introduced in commit 1ec3b5fe6eec ("mm/zswap: move to u= se > > crypto_acomp API for hardware acceleration") when the switch to the > > crypto_acomp API was made. Prior to that, the per-CPU crypto_comp was > > retrieved using get_cpu_ptr() which disables preemption and makes sure = the > > CPU cannot go away from under us. Preemption cannot be disabled with t= he > > crypto_acomp API as a sleepable context is needed. > > > > Commit 8ba2f844f050 ("mm/zswap: change per-cpu mutex and buffer to > > per-acomp_ctx") increased the UAF surface area by making the per-CPU > > buffers dynamic, adding yet another resource that can be freed from und= er > > zswap compression/decompression by CPU hotunplug. > > > > There are a few ways to fix this: > > (a) Add a refcount for acomp_ctx. > > (b) Disable migration while using the per-CPU acomp_ctx. > > (c) Use SRCU to wait for other CPUs using the acomp_ctx of the CPU bein= g > > hotunplugged. Normal RCU cannot be used as a sleepable context is > > required. > > > > Implement (c) since it's simpler than (a), and (b) involves using > > migrate_disable() which is apparently undesired (see huge comment in > > include/linux/preempt.h). > > > > Fixes: 1ec3b5fe6eec ("mm/zswap: move to use crypto_acomp API for hardwa= re acceleration") > > Cc: > > Signed-off-by: Yosry Ahmed > > Reported-by: Johannes Weiner > > Closes: https://lore.kernel.org/lkml/20241113213007.GB1564047@cmpxchg.o= rg/ > > Reported-by: Sam Sun > > Closes: https://lore.kernel.org/lkml/CAEkJfYMtSdM5HceNsXUDf5haghD5+o2e7= Qv4OcuruL4tPg6OaQ@mail.gmail.com/ > > --- > > mm/zswap.c | 31 ++++++++++++++++++++++++++++--- > > 1 file changed, 28 insertions(+), 3 deletions(-) > > > > diff --git a/mm/zswap.c b/mm/zswap.c > > index f6316b66fb236..add1406d693b8 100644 > > --- a/mm/zswap.c > > +++ b/mm/zswap.c > > @@ -864,12 +864,22 @@ static int zswap_cpu_comp_prepare(unsigned int cp= u, struct hlist_node *node) > > return ret; > > } > > > > +DEFINE_STATIC_SRCU(acomp_srcu); > > + > > static int zswap_cpu_comp_dead(unsigned int cpu, struct hlist_node *no= de) > > { > > struct zswap_pool *pool =3D hlist_entry(node, struct zswap_pool, = node); > > struct crypto_acomp_ctx *acomp_ctx =3D per_cpu_ptr(pool->acomp_ct= x, cpu); > > > > if (!IS_ERR_OR_NULL(acomp_ctx)) { > > + /* > > + * Even though the acomp_ctx should not be currently in u= se on > > + * @cpu, it may still be used by compress/decompress oper= ations > > + * that started on @cpu and migrated to a different CPU. = Wait > > + * for such usages to complete, any news usages would be = a bug. > > + */ > > + synchronize_srcu(&acomp_srcu); > > The docs suggest you can't solve it like that :( > > Documentation/RCU/Design/Requirements/Requirements.rst: > > Also unlike other RCU flavors, synchronize_srcu() may **not** be > invoked from CPU-hotplug notifiers, due to the fact that SRCU grace > periods make use of timers and the possibility of timers being > temporarily =E2=80=9Cstranded=E2=80=9D on the outgoing CPU. This strand= ing of timers > means that timers posted to the outgoing CPU will not fire until > late in the CPU-hotplug process. The problem is that if a notifier > is waiting on an SRCU grace period, that grace period is waiting on > a timer, and that timer is stranded on the outgoing CPU, then the > notifier will never be awakened, in other words, deadlock has > occurred. This same situation of course also prohibits > srcu_barrier() from being invoked from CPU-hotplug notifiers. Thanks for checking, I completely missed this. I guess it only works with SRCU if we use call_srcu(), but then we need to copy the pointers to a new struct to avoid racing with the CPU getting onlined again. Otherwise we can just bite the bullet and add a refcount, or use migrate_disable() despite that being undesirable. Do you have a favorite? :)