Date: Thu, 3 Apr 2025 11:25:17 +0200
Subject: Re: [PATCH v6 4/5] mm/migrate: skip migrating folios under writeback with AS_WRITEBACK_INDETERMINATE mappings
To: David Hildenbrand, Jingbo Xu, Joanne Koong
Cc: miklos@szeredi.hu, linux-fsdevel@vger.kernel.org, shakeel.butt@linux.dev, josef@toxicpanda.com, linux-mm@kvack.org, kernel-team@meta.com, Matthew Wilcox, Zi Yan, Oscar Salvador, Michal Hocko, Keith Busch
References: <20241122232359.429647-1-joannelkoong@gmail.com> <20241122232359.429647-5-joannelkoong@gmail.com> <1036199a-3145-464b-8bbb-13726be86f46@linux.alibaba.com> <1577c4be-c6ee-4bc6-bb9c-d0a6d553b156@redhat.com>
From: Bernd Schubert
In-Reply-To: <1577c4be-c6ee-4bc6-bb9c-d0a6d553b156@redhat.com>

On 4/3/25 11:18, David Hildenbrand wrote:
> On 03.04.25 05:31, Jingbo Xu wrote:
>>
>>
>> On 4/3/25 5:34 AM, Joanne Koong wrote:
>>> On Thu, Dec 19, 2024 at 5:05 AM David Hildenbrand wrote:
>>>>
>>>> On 23.11.24 00:23, Joanne Koong wrote:
>>>>> For migrations called in MIGRATE_SYNC mode, skip
>>>>> migrating the folio if it is under writeback and has the
>>>>> AS_WRITEBACK_INDETERMINATE flag set on its mapping. If the
>>>>> AS_WRITEBACK_INDETERMINATE flag is set on the mapping, the
>>>>> writeback may take an indeterminate amount of time to complete, and
>>>>> waits may get stuck.
>>>>>
>>>>> Signed-off-by: Joanne Koong
>>>>> Reviewed-by: Shakeel Butt
>>>>> ---
>>>>>    mm/migrate.c | 5 ++++-
>>>>>    1 file changed, 4 insertions(+), 1 deletion(-)
>>>>>
>>>>> diff --git a/mm/migrate.c b/mm/migrate.c
>>>>> index df91248755e4..fe73284e5246 100644
>>>>> --- a/mm/migrate.c
>>>>> +++ b/mm/migrate.c
>>>>> @@ -1260,7 +1260,10 @@ static int migrate_folio_unmap(new_folio_t get_new_folio,
>>>>>                 */
>>>>>                switch (mode) {
>>>>>                case MIGRATE_SYNC:
>>>>> -                     break;
>>>>> +                     if (!src->mapping ||
>>>>> +                         !mapping_writeback_indeterminate(src->mapping))
>>>>> +                             break;
>>>>> +                     fallthrough;
>>>>>                default:
>>>>>                        rc = -EBUSY;
>>>>>                        goto out;
>>>>
>>>> Ehm, doesn't this mean that any fuse user can essentially completely
>>>> block CMA allocations, memory compaction, memory hotunplug, memory
>>>> poisoning... ?!
>>>>
>>>> That sounds very bad.
>>>
>>> I took a closer look at the migration code and the FUSE code. In the
>>> migration code in migrate_folio_unmap(), I see that any MIGRATE_SYNC
>>> mode folio lock holds will block migration until that folio is
>>> unlocked. This is the snippet in migrate_folio_unmap() I'm looking at:
>>>
>>>          if (!folio_trylock(src)) {
>>>                  if (mode == MIGRATE_ASYNC)
>>>                          goto out;
>>>
>>>                  if (current->flags & PF_MEMALLOC)
>>>                          goto out;
>>>
>>>                  if (mode == MIGRATE_SYNC_LIGHT && !folio_test_uptodate(src))
>>>                          goto out;
>>>
>>>                  folio_lock(src);
>>>          }
>>>
>
> Right, I raised that also in my LSF/MM talk: waiting for readahead
> currently implies waiting for the folio lock (there is no separate
> readahead flag like there would be for writeback).
>
> The more I look into this and fuse, the more I realize that what fuse
> does is just completely broken right now.
>
>>> If this is all that is needed for a malicious FUSE server to block
>>> migration, then it makes no difference if AS_WRITEBACK_INDETERMINATE
>>> mappings are skipped in migration. A malicious server has easier and
>>> more powerful ways of blocking migration in FUSE than trying to do it
>>> through writeback. For a malicious fuse server, we in fact wouldn't
>>> even get far enough to hit writeback - a write triggers
>>> aops->write_begin() and a malicious server would deliberately hang
>>> forever while the folio is locked in write_begin().
>>
>> Indeed it seems possible.  A malicious FUSE server may already be
>> capable of blocking the synchronous migration in this way.
>
> Yes, I think the conclusion is that we should advise people against
> using unprivileged FUSE if they care about any features that rely on
> page migration or page reclaim.
>
>>
>>
>>>
>>> I looked into whether we could eradicate all the places in FUSE where
>>> we may hold the folio lock for an indeterminate amount of time,
>>> because if that is possible, then we should not add this writeback way
>>> for a malicious fuse server to affect migration. But I don't think we
>>> can. Taking one case as an example, the folio lock needs to be held as
>>> we read in the folio from the server when servicing page faults, else
>>> the page cache would contain stale data if there was a concurrent
>>> write that happened just before, which would lead to data corruption
>>> in the filesystem.
>>> Imo, we need a more encompassing solution for all
>>> these cases if we're serious about preventing FUSE from blocking
>>> migration, which probably looks like a globally enforced default
>>> timeout of some sort or an mm solution for mitigating the blast radius
>>> of how much memory can be blocked from migration, but that is outside
>>> the scope of this patchset and is its own standalone topic.
>
> I'm still skeptical about timeouts: we can only get it wrong.
>
> I think a proper solution is making these pages movable, which does seem
> feasible if (a) splice is not involved and (b) we can find a way to not
> hold the folio lock forever e.g., in the readahead case.
>
> Maybe readahead would have to be handled more similarly to writeback
> (e.g., having a separate flag, or using a combination of e.g.,
> writeback+uptodate flag, not sure)
>
> In both cases (readahead+writeback), we'd want to call into the FS to
> migrate a folio that is under readahead/writeback. In case of fuse
> without splice, a migration might be doable, and as discussed, splice
> might just be avoided.

My personal take here is that we should move away from splice.
Keith (or a colleague) is working on ZC with io_uring anyway, so this
may be good timing. We should just ensure that the new approach doesn't
have the same issue.

Thanks,
Bernd
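
For readers following the thread, the writeback check that the quoted
patch changes can be modelled in plain userspace C. This is only a sketch
under stated assumptions: struct folio_model, its fields, and
writeback_decision() are hypothetical names made up for illustration and
are not kernel API. Only the shape of the switch statement is taken from
the quoted diff, and the "under writeback" precondition from the quoted
commit message.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Mirrors the three migrate modes named in the thread. */
enum migrate_mode { MIGRATE_ASYNC, MIGRATE_SYNC_LIGHT, MIGRATE_SYNC };

/* Hypothetical stand-in for the folio state the check looks at. */
struct folio_model {
	bool has_mapping;             /* models src->mapping != NULL */
	bool under_writeback;         /* models the folio being under writeback */
	bool writeback_indeterminate; /* models AS_WRITEBACK_INDETERMINATE */
};

/*
 * Returns 0 when migration may go on (for MIGRATE_SYNC this means it
 * would wait for writeback to finish), or -EBUSY when the folio is
 * skipped. Assumption: this models only the writeback check, not the
 * folio lock handling discussed elsewhere in the thread.
 */
static int writeback_decision(const struct folio_model *f,
			      enum migrate_mode mode)
{
	assert(f != NULL);

	if (!f->under_writeback)
		return 0;

	switch (mode) {
	case MIGRATE_SYNC:
		/* Before the patch this case was an unconditional break. */
		if (!f->has_mapping || !f->writeback_indeterminate)
			return 0;
		/* fall through */
	default:
		return -EBUSY;
	}
}
```

With this model, a folio under writeback on an indeterminate mapping is
skipped with -EBUSY even in MIGRATE_SYNC mode, which is the behaviour
David's question about CMA, compaction and hotunplug refers to.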