From: Joanne Koong
Date: Fri, 18 Oct 2024 12:57:08 -0700
Subject: Re: [PATCH v2 2/2] fuse: remove tmp folio for writebacks and internal rb tree
To: Shakeel Butt
Cc: Miklos Szeredi, linux-fsdevel@vger.kernel.org, josef@toxicpanda.com, bernd.schubert@fastmail.fm, jefflexu@linux.alibaba.com, hannes@cmpxchg.org, linux-mm@kvack.org, kernel-team@meta.com
References: <20241014182228.1941246-1-joannelkoong@gmail.com> <20241014182228.1941246-3-joannelkoong@gmail.com>

On Thu, Oct 17, 2024 at 10:57 PM Shakeel Butt wrote:
>
> On Thu, Oct 17, 2024 at 06:30:08PM GMT, Joanne Koong wrote:
> > On Tue, Oct 15, 2024 at 3:01 AM Miklos Szeredi wrote:
> > >
> > > On Mon, 14 Oct 2024 at 20:23, Joanne Koong wrote:
> > >
> > > > This change sets AS_NO_WRITEBACK_RECLAIM on the inode mapping so that
> > > > FUSE folios are not reclaimed and waited on while in writeback, and
> > > > removes the temporary folio + extra copying and the internal rb tree.
> > >
> > > What about sync(2)? And page migration?
> > >
> > > Hopefully there are no other cases, but I think a careful review of
> > > places where generic code waits for writeback is needed before we can
> > > say for sure.
> >
> > The places where I see this potential deadlock still being possible are:
> > * page migration when handling a page fault:
> >    In particular, this path: handle_mm_fault() ->
> > __handle_mm_fault() -> handle_pte_fault() -> do_numa_page() ->
> > migrate_misplaced_folio() -> migrate_pages() -> migrate_pages_sync()
> > -> migrate_pages_batch() -> migrate_folio_unmap() ->
> > folio_wait_writeback()
>
> So, this is numa fault and if fuse server is not mapping the fuse folios
> which it is serving, in its address space then this is not an issue.
> However hugepage allocation on page fault can cause compaction which
> might migrate unrelated fuse folios. So, fuse server doing compaction
> is an issue and we need to resolve similar to reclaim codepath. (Though
> I think for THP it is not doing MIGRATE_SYNC but doing for gigantic
> hugetlb pages).

Thanks for the explanation. Would you mind pointing me to the
compaction function where this triggers the migrate? Is this in
compact_zone() where it calls migrate_pages() on the cc->migratepages
list?

>
> > * syscalls that trigger waits on writeback, which will lead to
> > deadlock if a single-threaded fuse server calls this when servicing
> > requests:
> >    - sync(), sync_file_range(), fsync(), fdatasync()
> >    - swapoff()
> >    - move_pages()
> >
> > I need to analyze the page fault path more to get a clearer picture of
> > what is happening, but so far this looks like a valid case for a
> > correctly written fuse server to run into.
> > For the syscalls however, is it valid/safe in general (disregarding
> > the writeback deadlock scenario for a minute) for fuse servers to be
> > invoking these syscalls in their handlers anyways?
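
As a rough illustration of the syscall case above, here is a toy
Python model of why a single-threaded server is the problematic one.
It is only a sketch: ToyFuseServer and its fields are hypothetical
names, not libfuse or kernel interfaces, and it assumes that a FUSE
folio leaves writeback only once the server has replied to the
corresponding write request.

```python
from dataclasses import dataclass


@dataclass
class ToyFuseServer:
    """Hypothetical stand-in for a FUSE daemon; not a libfuse or kernel API."""

    threads: int                  # worker threads able to answer requests
    blocked_in_wait: int = 0      # workers stuck in a writeback wait, e.g. fsync()
    folios_in_writeback: int = 0  # folios on its own mount awaiting a write reply

    def wait_can_complete(self) -> bool:
        """Return True if a worker is left to answer the pending write requests."""
        if self.folios_in_writeback == 0:
            return True  # nothing is under writeback, so the wait returns at once
        # Assumption of this model: only the server itself can end writeback,
        # so at least one worker must still be free to reply.
        return self.threads - self.blocked_in_wait > 0


# One worker, and it is the thread doing the waiting: nobody can reply.
single = ToyFuseServer(threads=1, blocked_in_wait=1, folios_in_writeback=3)
# Two workers, one waiting: the other can still service the write requests.
multi = ToyFuseServer(threads=2, blocked_in_wait=1, folios_in_writeback=3)

print(single.wait_can_complete())
print(multi.wait_can_complete())
```

The model deliberately ignores everything else that can block a
worker; it only captures the dependency between the waiter and the
thread that has to complete the write.
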
> >
> > The other places where I see a generic wait on writeback seem safe:
> > * splice, page_cache_pipe_buf_try_steal() (fs/splice.c):
> >    We hit this in fuse when we try to move a page from the pipe buffer
> > into the page cache (fuse_try_move_page()) for the SPLICE_F_MOVE case.
> > This wait seems fine, since the folio that's being waited on is the
> > folio in the pipe buffer which is not a fuse folio.
> > * memory failure (mm/memory-failure.c):
> >    Soft offlining a page and handling page memory failure - these can
> > be triggered asynchronously (memory_failure_work_func()), but this
> > should be fine for the fuse use case since the server isn't blocked on
> > servicing any writeback requests while memory failure handling is
> > waiting on writeback
> > * page truncation (mm/truncate.c):
> >    Same here. These cases seem fine since the server isn't blocked on
> > servicing writeback requests while truncation waits on writeback
> >
> >
> > Thanks,
> > Joanne
> >
> > >
> > > Thanks,
> > > Miklos