From: Joanne Koong
Date: Tue, 11 Feb 2025 13:10:40 -0800
Subject: Re: [REGRESSION][BISECTED] Crash with Bad page state for FUSE/Flatpak related applications since v6.13
To: Jeff Layton
Cc: Matthew Wilcox, Josef Bacik, Vlastimil Babka, Miklos Szeredi, Christian Heusel, Miklos Szeredi, regressions@lists.linux.dev, linux-kernel@vger.kernel.org, linux-fsdevel@vger.kernel.org, linux-mm, Mantas Mikulėnas

On Tue, Feb 11, 2025 at 11:41 AM Jeff Layton wrote:
>
> On Tue, 2025-02-11 at 11:23 -0800, Joanne Koong wrote:
> > On Tue, Feb 11, 2025 at 6:01 AM Jeff Layton wrote:
> > >
> > > On Mon, 2025-02-10 at 17:38 -0500, Jeff Layton wrote:
> > > > On Mon, 2025-02-10 at 20:36 +0000, Matthew Wilcox wrote:
> > > > > On Mon, Feb 10, 2025 at 02:12:35PM -0500, Josef Bacik wrote:
> > > > > > From: Josef Bacik
> > > > > > Date: Mon, 10 Feb 2025 14:06:40 -0500
> > > > > > Subject: [PATCH] fuse: drop extra put of folio when using pipe splice
> > > > > >
> > > > > > In 3eab9d7bc2f4 ("fuse: convert readahead to use folios"), I converted
> > > > > > us to using the new folio readahead code, which drops the reference on
> > > > > > the folio once it is locked, using an inferred reference on the folio.
> > > > > > Previously we held a reference on the folio for the entire duration of
> > > > > > the readpages call.
> > > > > >
> > > > > > This is fine, however I failed to catch the case for splice pipe
> > > > > > responses where we will remove the old folio and splice in the new
> > > > > > folio.  Here we assumed that there is a reference held on the folio for
> > > > > > ap->folios, which is no longer the case.
> > > > > >
> > > > > > To fix this, simply drop the extra put to keep us consistent with the
> > > > > > non-splice variation.  This will fix the UAF bug that was reported.
> > > > > >
> > > > > > Link: https://lore.kernel.org/linux-fsdevel/2f681f48-00f5-4e09-8431-2b3dbfaa881e@heusel.eu/
> > > > > > Fixes: 3eab9d7bc2f4 ("fuse: convert readahead to use folios")
> > > > > > Signed-off-by: Josef Bacik
> > > > > > ---
> > > > > >  fs/fuse/dev.c | 2 --
> > > > > >  1 file changed, 2 deletions(-)
> > > > > >
> > > > > > diff --git a/fs/fuse/dev.c b/fs/fuse/dev.c
> > > > > > index 5b5f789b37eb..5bd6e2e184c0 100644
> > > > > > --- a/fs/fuse/dev.c
> > > > > > +++ b/fs/fuse/dev.c
> > > > > > @@ -918,8 +918,6 @@ static int fuse_try_move_page(struct fuse_copy_state *cs, struct page **pagep)
> > > > > >         }
> > > > > >
> > > > > >         folio_unlock(oldfolio);
> > > > > > -       /* Drop ref for ap->pages[] array */
> > > > > > -       folio_put(oldfolio);
> > > > > >         cs->len = 0;
> > > > >
> > > > > But aren't we now leaking a reference to newfolio?  ie shouldn't
> > > > > we also:
> > > > >
> > > > > -       folio_get(newfolio);
> > > > >
> > > > > a few lines earlier?
> > > > >
> > > >
> > > >
> > > > I think that ref was leaking without Josef's patch, but your proposed
> > > > fix seems correct to me. There is:
> > > >
> > > > - 1 reference stolen from the pipe_buffer
> > > > - 1 reference taken for the pagecache in replace_page_cache_folio()
> > > > - the folio_get(newfolio) just after that
> > > >
> > > > The pagecache ref doesn't count here, and we only need the reference
> > > > that was stolen from the pipe_buffer to replace the one in pagep.
> > >
> > > Actually, no. I'm wrong here. A little after the folio_get(newfolio)
> > > call, we do:
> > >
> > >         /*
> > >          * Release while we have extra ref on stolen page.  Otherwise
> > >          * anon_pipe_buf_release() might think the page can be reused.
> > >          */
> > >         pipe_buf_release(cs->pipe, buf);
> > >
> > > ...so that accounts for the extra reference. I think the newfolio
> > > refcounting is correct as-is.
> >
> > I think we do need to remove the folio_get(newfolio); here or we are
> > leaking the reference.
> >
> > new_folio = page_folio(buf->page)   # ref is 1
> > replace_page_cache_folio()          # ref is 2
> > folio_get()                         # ref is 3
> > pipe_buf_release()                  # ref is 2
> >
> > One ref belongs to the page cache and will get dropped by that, but
> > the other ref is unaccounted for (since the original patch removed
> > "folio_put()" from fuse_readpages_end()).
> >
> > I still think acquiring an explicit reference on the folio before we
> > add it to ap->folio and then dropping it when we're completely done
> > with it in fuse_readpages_end() is the best solution, as that imo
> > makes the refcounting / lifetimes the most explicit / clear. For
> > example, in try_move_pages(), if we get rid of that "folio_get()"
> > call, the page cache is the holder of the remaining reference on it,
> > and we rely on the earlier "folio_clear_uptodate(newfolio);" line in
> > try_move_pages() to guarantee that the newfolio isn't freed out from
> > under us if memory gets tight and it's evicted from the page cache.
> >
> > imo, a patch like this makes the refcounting the most clear:
> >
> > From 923fa98b97cf6dfba3bb486833179c349d566d64 Mon Sep 17 00:00:00 2001
> > From: Joanne Koong
> > Date: Tue, 11 Feb 2025 10:59:40 -0800
> > Subject: [PATCH] fuse: acquire explicit folio refcount for readahead
> >
> > In 3eab9d7bc2f4 ("fuse: convert readahead to use folios"), the logic
> > was converted to using the new folio readahead code, which drops the
> > reference on the folio once it is locked, using an inferred reference
> > on the folio. Previously we held a reference on the folio for the
> > entire duration of the readpages call.
> >
> > This is fine, however for the case for splice pipe responses where we
> > will remove the old folio and splice in the new folio (see
> > fuse_try_move_page()), we assume that there is a reference held on the
> > folio for ap->folios, which is no longer the case.
> >
> > To fix this and make the refcounting explicit, acquire a refcount on the
> > folio before we add it to ap->folios[] and drop it when we are done with
> > the folio in fuse_readpages_end(). This will fix the UAF bug that was
> > reported.
> >
> > Link: https://lore.kernel.org/linux-fsdevel/2f681f48-00f5-4e09-8431-2b3dbfaa881e@heusel.eu/
> > Fixes: 3eab9d7bc2f4 ("fuse: convert readahead to use folios")
> > Signed-off-by: Joanne Koong
> > ---
> >  fs/fuse/file.c | 10 +++++++++-
> >  1 file changed, 9 insertions(+), 1 deletion(-)
> >
> > diff --git a/fs/fuse/file.c b/fs/fuse/file.c
> > index 7d92a5479998..6fa535c73d93 100644
> > --- a/fs/fuse/file.c
> > +++ b/fs/fuse/file.c
> > @@ -955,8 +955,10 @@ static void fuse_readpages_end(struct fuse_mount *fm, struct fuse_args *args,
> >                 fuse_invalidate_atime(inode);
> >         }
> >
> > -       for (i = 0; i < ap->num_folios; i++)
> > +       for (i = 0; i < ap->num_folios; i++) {
> >                 folio_end_read(ap->folios[i], !err);
> > +               folio_put(ap->folios[i]);
> > +       }
> >         if (ia->ff)
> >                 fuse_file_put(ia->ff, false);
> >
> > @@ -1049,6 +1051,12 @@ static void fuse_readahead(struct readahead_control *rac)
> >
> >                 while (ap->num_folios < cur_pages) {
> >                         folio = readahead_folio(rac);
> > +                       /*
> > +                        * Acquire an explicit reference on the folio in case
> > +                        * it's replaced in the page cache in the splice case
> > +                        * (see fuse_try_move_page()).
> > +                        */
> > +                       folio_get(folio);
> >                         ap->folios[ap->num_folios] = folio;
> >                         ap->descs[ap->num_folios].length = folio_size(folio);
> >                         ap->num_folios++;
>
> That makes sense. My mistake was assuming the pointer in passed in via
> pagep would hold a reference, and that the replacement folio would
> carry one. I like the above better than assuming we have implicit
> reference due to readpages. It's slightly more expensive due to the
> refcounting, but it seems less brittle.
>
> We should couple this with a comment over fuse_try_move_page().
> Something like this maybe?
>
> /*
>  * Attempt to steal a page from the splice() pipe and move it into the
>  * pagecache. If successful, the pointer in @pagep will be updated. The
>  * folio that was originally in @pagep will lose a reference and the new
>  * folio returned in @pagep will carry a reference.
>  */

Great idea, I'll add this in.

>
> ...
>
> In any case, for this patch:
>
> Reviewed-by: Jeff Layton
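
For anyone following the newfolio arithmetic in this thread, here is a minimal userspace sketch of the reference counts traced above. It is not kernel code: struct toy_folio, toy_get(), toy_put() and toy_newfolio_refs() are hypothetical names introduced only for illustration, and each step simply mirrors one line of the quoted trace. The put_in_readpages_end flag is an assumption standing in for whether fuse_readpages_end() drops a reference, as it would with the explicit-refcount patch.

```c
#include <assert.h>

/* Toy stand-in for a folio: only the reference count is modelled. */
struct toy_folio {
	int refcount;
};

/* Hypothetical helpers; they only adjust the counter. */
static void toy_get(struct toy_folio *f)
{
	f->refcount++;
}

static void toy_put(struct toy_folio *f)
{
	assert(f->refcount > 0); /* a put without a matching get is a bug */
	f->refcount--;
}

/*
 * Replay the newfolio sequence quoted in the thread and return the
 * final count. put_in_readpages_end selects whether the completion
 * path drops one reference when it is done with the folio.
 */
static int toy_newfolio_refs(int put_in_readpages_end)
{
	/* new_folio = page_folio(buf->page): ref held via the pipe buffer */
	struct toy_folio newfolio = { .refcount = 1 };

	toy_get(&newfolio); /* replace_page_cache_folio(): ref is 2 */
	toy_get(&newfolio); /* folio_get(newfolio):        ref is 3 */
	toy_put(&newfolio); /* pipe_buf_release():         ref is 2 */

	if (put_in_readpages_end)
		toy_put(&newfolio); /* folio_put() at completion: ref is 1 */

	return newfolio.refcount;
}
```

Under these assumptions, toy_newfolio_refs(0) returns 2 (the page cache reference plus one that nothing drops), while toy_newfolio_refs(1) returns 1, leaving only the page cache reference once the completion path puts the folio.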