From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by smtp.lore.kernel.org (Postfix) with ESMTP id 607A6D6C28E for ; Tue, 19 Nov 2024 19:51:46 +0000 (UTC) Received: by kanga.kvack.org (Postfix) id D3D6E6B0088; Tue, 19 Nov 2024 14:51:45 -0500 (EST) Received: by kanga.kvack.org (Postfix, from userid 40) id CED0D6B008A; Tue, 19 Nov 2024 14:51:45 -0500 (EST) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id B8D8F6B008C; Tue, 19 Nov 2024 14:51:45 -0500 (EST) X-Delivered-To: linux-mm@kvack.org Received: from relay.hostedemail.com (smtprelay0015.hostedemail.com [216.40.44.15]) by kanga.kvack.org (Postfix) with ESMTP id 9507C6B0088 for ; Tue, 19 Nov 2024 14:51:45 -0500 (EST) Received: from smtpin18.hostedemail.com (a10.router.float.18 [10.200.18.1]) by unirelay01.hostedemail.com (Postfix) with ESMTP id 33F951C61FE for ; Tue, 19 Nov 2024 19:51:45 +0000 (UTC) X-FDA: 82803888678.18.C8D5AE7 Received: from mail-vk1-f176.google.com (mail-vk1-f176.google.com [209.85.221.176]) by imf13.hostedemail.com (Postfix) with ESMTP id D482D20006 for ; Tue, 19 Nov 2024 19:50:49 +0000 (UTC) Authentication-Results: imf13.hostedemail.com; dkim=pass header.d=google.com header.s=20230601 header.b=zeD9HtG3; spf=pass (imf13.hostedemail.com: domain of yosryahmed@google.com designates 209.85.221.176 as permitted sender) smtp.mailfrom=yosryahmed@google.com; dmarc=pass (policy=reject) header.from=google.com ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=hostedemail.com; s=arc-20220608; t=1732045701; h=from:from:sender:reply-to:subject:subject:date:date: message-id:message-id:to:to:cc:cc:mime-version:mime-version: content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references:dkim-signature; bh=7JNN3j9AcXK1kn/AArAYx0mr1pUblQFYF/cde5OTfCM=; b=ET0dVZSCZtTDbeswgzI39RxsDSUa5EIRU2aufh3W+lPhujg9+xsoN4kIZpD8BsTkHrAu5l YGYvaOg6CHddeIVoF3tp4Om+y/AnWWJs1hBKXHCfzbWtbJfov1fJjFw9lF2mOq5zn7uv12 w89EeNW+ldfksX8ZOEimy/ID5PO/lR0= ARC-Authentication-Results: i=1; imf13.hostedemail.com; dkim=pass header.d=google.com header.s=20230601 header.b=zeD9HtG3; spf=pass (imf13.hostedemail.com: domain of yosryahmed@google.com designates 209.85.221.176 as permitted sender) smtp.mailfrom=yosryahmed@google.com; dmarc=pass (policy=reject) header.from=google.com ARC-Seal: i=1; s=arc-20220608; d=hostedemail.com; t=1732045701; a=rsa-sha256; cv=none; b=iFkxx96LeDrlbv7JYkcfaP5hxDwsZf18o643dGViIP8U7IlVrwlK4cAwyW+MZ/NQTKTCJB COkplgZ5lJgX5IxwMb1nuvSXkngd4aPryVMWNhIkJa66FCdLDY5qr3PKhPBq2rEOSRKusn xOO5tX8StX05y7gqKvtpv70uw/0dTpA= Received: by mail-vk1-f176.google.com with SMTP id 71dfb90a1353d-513de4267d8so1547497e0c.3 for ; Tue, 19 Nov 2024 11:51:43 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=20230601; t=1732045902; x=1732650702; darn=kvack.org; h=content-transfer-encoding:cc:to:subject:message-id:date:from :in-reply-to:references:mime-version:from:to:cc:subject:date :message-id:reply-to; bh=7JNN3j9AcXK1kn/AArAYx0mr1pUblQFYF/cde5OTfCM=; b=zeD9HtG3SqXo0XbYf1uPBoy4RiVYBOGqlP0olrrzNoXsaZVMz3agIdbco4BMM8uS3S NfghTiFdh741OvvUQKznoW0vsqOQwmUvuDUA+LTl1uLnUyTYCTXtE0qiwvYj3haaVcsK vOmND0S6jMIOrgwvMMwzJMYGJcBu525qzpNa3YcbxeXAxC6vNDl7b/8iGx45d4+rJnL/ 8DhJ0I3cytfN/zlG52ZynFvDP5Ox89RZ0hPokWSyX5829RajqaF4FCCyJM3SI31BWROu Z7Hu2dYwNEorOHGMcpEi8vONsFVJXPUpVu9+hNWv17oXyOYGwLYAgKCHj75KxNhr69M0 w0pw== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1732045902; x=1732650702; h=content-transfer-encoding:cc:to:subject:message-id:date:from :in-reply-to:references:mime-version:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=7JNN3j9AcXK1kn/AArAYx0mr1pUblQFYF/cde5OTfCM=; b=BxqqSDFwmd6roeZcU7hd3fYQSZSh5cPHUMwavOx4yuSXvigwy7dCgF3D/13c2sZX29 jzQ1HHChGq/DzOGiqTvzJM2zkAn+TtJXL9RxxaWa79eQuOtwB1ZpLP6QA+nX5NdrfQU6 +g6Cg94vGV8Cus9weMi9+rOY4yPGJOBzDMhScckSyn3fqZufy9VhlaWCIRMLNtDW5khE AikPrgbBWGWYSnYt6SwKLVHz5E8MfUOrtEmvhJ5CYmJvkhrtiFLvCQTrw4GEufD3rxqJ TbNMhdDRbfR2fZPUMOgF1jmVYVK+6m6I39OStaUdfOfYMFF8hl8ZVmFxWY4Oo5qD9UlZ DkWA== X-Forwarded-Encrypted: i=1; AJvYcCVHrjz8VlwGYelqfif97Guvp+gxAtldqMzF+2K9GaqtgfsZL0JCVK1ryF8hHU2/Nl9/xPeDemcrKg==@kvack.org X-Gm-Message-State: AOJu0YxFU/dGDo3WUwEHrhaR0VLmTdLe2dbcpMivukNlRfIoaoOa+t9H W7Qwi58dYb6f3Qcplj0g08ZkAQkSiOz/EgOF/FdVeXFFC/6lc1DcCSeJh60vDPN9Yfq9/9BG901 fGSg5NlMu1REbivrQvuMaUS75mi+OJcakye3Q X-Google-Smtp-Source: AGHT+IHjYo0EcDoyKZ19FOw3cRDHm/vybvcM2ZH27pw0uD0NP0HI4IPAjlfOCjC+wSo4kzr42kcPJsBfpaozKXZEaqw= X-Received: by 2002:ac5:c0c9:0:b0:50a:c7cd:bee4 with SMTP id 71dfb90a1353d-514cf890574mr170689e0c.1.1732045901091; Tue, 19 Nov 2024 11:51:41 -0800 (PST) MIME-Version: 1.0 References: <20241113052413.157039-1-kanchana.p.sridhar@intel.com> <20241113213007.GB1564047@cmpxchg.org> <20241114051149.GC1564047@cmpxchg.org> <9a807484-6693-4e2a-a087-97bbc5ee4ed9@linux.dev> In-Reply-To: From: Yosry Ahmed Date: Tue, 19 Nov 2024 11:51:04 -0800 Message-ID: Subject: Re: [PATCH v1] mm: zswap: Fix a potential memory leak in zswap_decompress(). To: "Sridhar, Kanchana P" Cc: Chengming Zhou , Johannes Weiner , Nhat Pham , "linux-kernel@vger.kernel.org" , "linux-mm@kvack.org" , "usamaarif642@gmail.com" , "ryan.roberts@arm.com" , "21cnbao@gmail.com" <21cnbao@gmail.com>, "akpm@linux-foundation.org" , "Feghali, Wajdi K" , "Gopal, Vinodh" Content-Type: text/plain; charset="UTF-8" Content-Transfer-Encoding: quoted-printable X-Rspamd-Server: rspam07 X-Rspamd-Queue-Id: D482D20006 X-Stat-Signature: b6x3zcwhjso6jyduxr6bmor747tbchrg X-Rspam-User: X-HE-Tag: 1732045849-302325 X-HE-Meta: U2FsdGVkX1/A1NlvflWOvBDqrBFhMcxGDjOuQYlHyJb/RKCBS8OnYYkIlseelmFiNhUDcM34SkAUszc18L+GBqd5yCL8x75d/yvu5344BuB3zyPvHpGvhHfVjDhwDgcRfZWha4Y1PPK+jvnNLzSBjPjBZXjsKgVNXYRVxE2LryQJd1TYyqHdCbKnq5VfCLCYVGRte7vy/URwlKKlVoI8WAxMiSu7S0LD2BlUDgXsSqPVmOgooOsfnbIEaVrdsXg4NcoflpnpaJuLN0X7ypYJs6kMrHfbIjCMdKevr/Dhc/1TSJcEwnG5fDZOxp9wx/IznHtUYhkqKg4zgSQX1Y03+kSuajyitVPVibLWu/dOaXcR5el7F82AvTkb1AxM5oammjpENyAeHXc7Wwkzv7h7sjzISIM8YFlddMPuOHBOajmRaBgO9l+dUVCfRylxRoxqgrbIXtKklGGFLoVrAAWfd3J2214CxPCE4Yhby/FWkVGSbz01BMs63wBZWf2aErjcBzMq6Cj8aajrAIu1gc2SSOWbiLCbAUcpNfomR4FUljt1QBj5RvqPcGk+PW+APayVKoqevFDPxcyWONZdrHwb3AiSG4E4e5uTYx5Ge7Ohp/Av4aZ2mt61w627qzHuwa/4UTdUUl7j63Y7IeC+Dho0QaiW+H1y0l9ICf3955Fk+Qc7s1EMxWstk/69PLNiLZ9KY/raax1dvdMF8G+wKEfvSE6HVRCS6+ecJa38r4AIE9YgdUD0cLwuqqSJIWkL2hu/RdR6cnLvTSUerYzttGat36d3eEKLJO2hVihKYRs4RyCYOsQh648Dtp8+yCBZdLTRwcix8rRimBNva0x/DwTdaeaAMfX4ExfiBUsBnf1jgDLztjn37hc28dMuUTaCApP+zOx7NNAvBLgZPTRo/EKhYNki815lYpSFsG1AH5R//z7OkvStjTFoKb2cgH/BP+Rqk/H0DWa6FurvZ/S2Xmr k/fXaZSK dVlWI8jdHqxZPaAP4tvpU/ttwOM9D3UUVIwt4AcaWGUUbLHHHyY1qv6lGht9f43DgrnRWr8Vp69WwFKNcXu1UC3+Af0m5wU/BwEMgQWIoOHoqZ5QkAuZTV39P5Ri+8EwAeZsycxi4ZZBGxyCJ5+VKH4yOfxLpzlzuSYY92G5PqsBg72PbWsgMl/lTa5QI/Q5vPD/8F1PkZjLz8xfpUeEdsnoZXGn7X2dfv4cKPdzskpMuxxSAotGRlP/meFFixHr/uZZTDFrkh69VS18SYigLa/r38bMbteUVQWcTSPPx16M8ZbCaj3sYa4mjufYFnrWNQMnrmpFUJWNnIjH7riuyMk5ZTMdw70r3OoI2yZbgHNL0WxFcaVHyr2UBjXtPUtP0CxECIeYcfnJmJt/96m+KyBq9nlXlqbek4EMjCBCPf8LWsqmjZhYMmgncZW9zNGggIqAlNrfbprM0NxQN1fty9mkD2kTfjKhyt2ic08nNQGsHEW4y61820JmiPnCkJ0/Evj8DVojI7DmFvewIVjTXlnUv6VMbEmMT3rR/mKnPEqveCgwS88HkeGBOzNMbQ4aVAPOcQU1uHye/ads2Fgs56i72+51hlZ42F2M/QCjqan5Qr7n+8JOCHIDuYNlv1I41la2CM5On/EOmqYTBBSqRM26738TunYbIE0aJEzQQMS4c/cM= X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: List-Subscribe: List-Unsubscribe: On Tue, Nov 19, 2024 at 11:42=E2=80=AFAM Sridhar, Kanchana P wrote: > > > > -----Original Message----- > > From: Yosry Ahmed > > Sent: Tuesday, November 19, 2024 11:27 AM > > To: Sridhar, Kanchana P > > Cc: Chengming Zhou ; Johannes Weiner > > ; Nhat Pham ; linux- > > kernel@vger.kernel.org; linux-mm@kvack.org; usamaarif642@gmail.com; > > ryan.roberts@arm.com; Huang, Ying ; > > 21cnbao@gmail.com; akpm@linux-foundation.org; Feghali, Wajdi K > > ; Gopal, Vinodh > > Subject: Re: [PATCH v1] mm: zswap: Fix a potential memory leak in > > zswap_decompress(). > > > > On Tue, Nov 19, 2024 at 11:22=E2=80=AFAM Sridhar, Kanchana P > > wrote: > > > > > > > > > > -----Original Message----- > > > > From: Yosry Ahmed > > > > Sent: Friday, November 15, 2024 1:49 PM > > > > To: Sridhar, Kanchana P > > > > Cc: Chengming Zhou ; Johannes Weiner > > > > ; Nhat Pham ; linux- > > > > kernel@vger.kernel.org; linux-mm@kvack.org; usamaarif642@gmail.com; > > > > ryan.roberts@arm.com; Huang, Ying ; > > > > 21cnbao@gmail.com; akpm@linux-foundation.org; Feghali, Wajdi K > > > > ; Gopal, Vinodh > > > > Subject: Re: [PATCH v1] mm: zswap: Fix a potential memory leak in > > > > zswap_decompress(). > > > > > > > > On Fri, Nov 15, 2024 at 1:14=E2=80=AFPM Sridhar, Kanchana P > > > > wrote: > > > > > > > > > > Hi Chengming, > > > > > > > > > > > -----Original Message----- > > > > > > From: Chengming Zhou > > > > > > Sent: Wednesday, November 13, 2024 11:24 PM > > > > > > To: Sridhar, Kanchana P ; Johanne= s > > > > Weiner > > > > > > > > > > > > Cc: Nhat Pham ; Yosry Ahmed > > > > > > ; linux-kernel@vger.kernel.org; linux- > > > > > > mm@kvack.org; usamaarif642@gmail.com; ryan.roberts@arm.com; > > > > Huang, > > > > > > Ying ; 21cnbao@gmail.com; akpm@linux- > > > > > > foundation.org; Feghali, Wajdi K ; G= opal, > > > > Vinodh > > > > > > > > > > > > Subject: Re: [PATCH v1] mm: zswap: Fix a potential memory leak = in > > > > > > zswap_decompress(). > > > > > > > > > > > > Hello, > > > > > > > > > > > > On 2024/11/14 14:37, Sridhar, Kanchana P wrote: > > > > > > > > > > > > > >> -----Original Message----- > > > > > > >> From: Johannes Weiner > > > > > > >> Sent: Wednesday, November 13, 2024 9:12 PM > > > > > > >> To: Sridhar, Kanchana P > > > > > > >> Cc: Nhat Pham ; Yosry Ahmed > > > > > > >> ; linux-kernel@vger.kernel.org; linux= - > > > > > > >> mm@kvack.org; chengming.zhou@linux.dev; > > > > usamaarif642@gmail.com; > > > > > > >> ryan.roberts@arm.com; Huang, Ying ; > > > > > > >> 21cnbao@gmail.com; akpm@linux-foundation.org; Feghali, Wajdi= K > > > > > > >> ; Gopal, Vinodh > > > > > > > > >> Subject: Re: [PATCH v1] mm: zswap: Fix a potential memory le= ak in > > > > > > >> zswap_decompress(). > > > > > > >> > > > > > > >> On Thu, Nov 14, 2024 at 01:56:16AM +0000, Sridhar, Kanchana = P > > > > wrote: > > > > > > >>> So my question was, can we prevent the migration to a diffe= rent > > cpu > > > > > > >>> by relinquishing the mutex lock after this conditional > > > > > > >> > > > > > > >> Holding the mutex doesn't prevent preemption/migration. > > > > > > > > > > > > > > Sure, however, is this also applicable to holding the mutex o= f a per- > > cpu > > > > > > > structure obtained via raw_cpu_ptr()? > > > > > > > > > > > > Yes, unless you use migration_disable() or cpus_read_lock() to = protect > > > > > > this section. > > > > > > > > > > Ok. > > > > > > > > > > > > > > > > > > > > > > > > > Would holding the mutex prevent the acomp_ctx of the cpu prio= r to > > > > > > > the migration (in the UAF scenario you described) from being > > deleted? > > > > > > > > > > > > No, cpu offline can kick in anytime to free the acomp_ctx->buff= er. > > > > > > > > > > > > > > > > > > > > If holding the per-cpu acomp_ctx's mutex isn't sufficient to = prevent > > the > > > > > > > UAF, I agree, we might need a way to prevent the acomp_ctx fr= om > > being > > > > > > > deleted, e.g. with refcounts as you've suggested, or to not u= se the > > > > > > > > > > > > Right, refcount solution from Johannes is very good IMHO. > > > > > > > > > > > > > acomp_ctx at all for the check, instead use a boolean. > > > > > > > > > > > > But this is not enough to just avoid using acomp_ctx for the ch= eck, > > > > > > the usage of acomp_ctx inside the mutex is also UAF, since cpu = offline > > > > > > can kick in anytime to free the acomp_ctx->buffer. > > > > > > > > > > I see. How would the refcounts work? Would this add latency to zs= wap > > > > > ops? In low memory situations, could the cpu offlining code over-= ride > > > > > the refcounts? > > > > > > > > I think what Johannes meant is that the zswap compress/decompress > > > > paths grab a ref on the acomp_ctx before using it, and the CPU > > > > offlining code only drops the initial ref, and does not free the > > > > buffer directly. The buffer is only freed when the ref drops to zer= o. > > > > > > > > I am not familiar with CPU hotplug, would it be simpler if we have = a > > > > wrapper like get_acomp_ctx() that disables migration or calls > > > > cpus_read_lock() before grabbing the per-CPU acomp_ctx? A similar > > > > wrapper, put_acompt_ctx() will be used after we are done using the > > > > acomp_ctx. > > > > > > Would it be sufficient to add a check for mutex_is_locked() in > > > zswap_cpu_comp_dead() and if this returns true, to exit without delet= ing > > > the acomp? > > > > I don't think this works. First of all, it's racy. It's possible the > > mutex gets locked after we check mutex_is_locked() but before we > > delete the acomp_ctx. Also, if we find that the mutex is locked, then > > we do nothing and essentially leak the memory. > > Yes, this would assume the cpu offlining code retries at some interval, > which could prevent the memory leak. I am not sure about that, but even so, it wouldn't handle the first scenario where the mutex gets locked after we check mutex_is_locked(). > > > > > Second, and probably more important, this only checks if anyone is > > currently holding the mutex. What about tasks that may be sleeping > > waiting for the mutex to be unlocked? The mutex will be deleted from > > under them as well. > > Wouldn't this and the race described above, also be issues for the > refcount based approach? I don't think so, at least if implemented correctly. There are a lot of examples around the kernel that use RCU + refcounts for such use cases. I think there are also some examples in kernel docs. That being said, I am wondering if we can get away with something simpler like holding the cpus read lock or disabling migration as I suggested earlier, but I am not quite sure. > > Also, I am wondering if the mutex design already handles cases where > tasks are sleeping, waiting for a mutex that disappears? I don't believe so. It doesn't make sense for someone to free a mutex while someone is waiting for it. How would the waiter know if the memory backing the mutex was freed?