From: Shakeel Butt
To: Joanne Koong
Cc: Miklos Szeredi, linux-fsdevel@vger.kernel.org, josef@toxicpanda.com, bernd.schubert@fastmail.fm, jefflexu@linux.alibaba.com, hannes@cmpxchg.org, linux-mm@kvack.org, kernel-team@meta.com
Subject: Re: [PATCH v2 2/2] fuse: remove tmp folio for writebacks and internal rb tree
Date: Thu, 17 Oct 2024 22:57:51 -0700
References: <20241014182228.1941246-1-joannelkoong@gmail.com> <20241014182228.1941246-3-joannelkoong@gmail.com>

On Thu, Oct 17, 2024 at 06:30:08PM GMT, Joanne Koong wrote:
> On Tue, Oct 15, 2024 at 3:01 AM Miklos Szeredi wrote:
> >
> > On Mon, 14 Oct 2024 at 20:23, Joanne Koong wrote:
> >
> > > This change sets AS_NO_WRITEBACK_RECLAIM on the inode mapping so that
> > > FUSE folios are not reclaimed and waited on while in writeback, and
> > > removes the temporary folio + extra copying and the internal rb tree.
> >
> > What about sync(2)? And page migration?
> >
> > Hopefully there are no other cases, but I think a careful review of
> > places where generic code waits for writeback is needed before we can
> > say for sure.
>
> The places where I see this potential deadlock still being possible are:
> * page migration when handling a page fault:
>     In particular, this path: handle_mm_fault() ->
> __handle_mm_fault() -> handle_pte_fault() -> do_numa_page() ->
> migrate_misplaced_folio() -> migrate_pages() -> migrate_pages_sync()
> -> migrate_pages_batch() -> migrate_folio_unmap() ->
> folio_wait_writeback()

So, this is the NUMA fault path. If the fuse server does not map the
fuse folios it is serving into its own address space, then this is not
an issue. However, hugepage allocation on page fault can trigger
compaction, which might migrate unrelated fuse folios. So a fuse server
triggering compaction is an issue, and we need to resolve it similarly
to the reclaim codepath. (Though I think the THP path does not use
MIGRATE_SYNC, while the gigantic hugetlb page path does.)

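A minimal userspace sketch of the MIGRATE_SYNC distinction above. It
assumes that the writeback check in migrate_folio_unmap() only blocks in
folio_wait_writeback() when the mode is MIGRATE_SYNC and the caller is
forcing the migration, and otherwise skips the folio as busy. The mode
names follow include/linux/migrate_mode.h; the unmap_result enum and the
model_*() helpers are hypothetical names for illustration, not kernel
code.

```c
#include <assert.h>
#include <stdbool.h>

/* Mode names mirror include/linux/migrate_mode.h. */
enum migrate_mode {
	MIGRATE_ASYNC,
	MIGRATE_SYNC_LIGHT,
	MIGRATE_SYNC,
};

/* Hypothetical outcome type for this model only. */
enum unmap_result {
	UNMAP_PROCEED,		/* folio not under writeback, no wait */
	UNMAP_SKIP_BUSY,	/* folio skipped, caller does not block */
	UNMAP_WAIT_WRITEBACK,	/* caller blocks until writeback ends */
};

/*
 * Simplified model (an assumption, not the kernel source) of how the
 * unmap step treats a folio that may be under writeback.
 */
static inline enum unmap_result model_unmap(enum migrate_mode mode,
					    bool under_writeback, bool force)
{
	assert(mode == MIGRATE_ASYNC || mode == MIGRATE_SYNC_LIGHT ||
	       mode == MIGRATE_SYNC);

	if (!under_writeback)
		return UNMAP_PROCEED;
	if (mode != MIGRATE_SYNC || !force)
		return UNMAP_SKIP_BUSY;
	return UNMAP_WAIT_WRITEBACK;
}

/*
 * True when a fuse server that triggers this migration could sleep on
 * writeback that only the server itself can complete.
 */
static inline bool model_can_deadlock(enum migrate_mode mode, bool force)
{
	return model_unmap(mode, true, force) == UNMAP_WAIT_WRITEBACK;
}
```

Under this model only the forced MIGRATE_SYNC case can put the server to
sleep on its own writeback, so those are the callers that would need the
same kind of treatment as the reclaim codepath.
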
> * syscalls that trigger waits on writeback, which will lead to
> deadlock if a single-threaded fuse server calls this when servicing
> requests:
>     - sync(), sync_file_range(), fsync(), fdatasync()
>     - swapoff()
>     - move_pages()
>
> I need to analyze the page fault path more to get a clearer picture of
> what is happening, but so far this looks like a valid case for a
> correctly written fuse server to run into.
> For the syscalls however, is it valid/safe in general (disregarding
> the writeback deadlock scenario for a minute) for fuse servers to be
> invoking these syscalls in their handlers anyways?
>
> The other places where I see a generic wait on writeback seem safe:
> * splice, page_cache_pipe_buf_try_steal() (fs/splice.c):
>     We hit this in fuse when we try to move a page from the pipe buffer
> into the page cache (fuse_try_move_page()) for the SPLICE_F_MOVE case.
> This wait seems fine, since the folio that's being waited on is the
> folio in the pipe buffer which is not a fuse folio.
> * memory failure (mm/memory_failure.c):
>     Soft offlining a page and handling page memory failure - these can
> be triggered asynchronously (memory_failure_work_func()), but this
> should be fine for the fuse use case since the server isn't blocked on
> servicing any writeback requests while memory failure handling is
> waiting on writeback
> * page truncation (mm/truncate.c):
>     Same here. These cases seem fine since the server isn't blocked on
> servicing writeback requests while truncation waits on writeback
>
>
> Thanks,
> Joanne
>
> >
> > Thanks,
> > Miklos