From: Jingbo Xu
To: Joanne Koong, miklos@szeredi.hu, linux-fsdevel@vger.kernel.org
Cc: shakeel.butt@linux.dev, josef@toxicpanda.com, linux-mm@kvack.org, bernd.schubert@fastmail.fm, kernel-team@meta.com
Subject: Re: [PATCH v4 6/6] fuse: remove tmp folio for writebacks and internal rb tree
Date: Fri, 8 Nov 2024 16:48:55 +0800
Message-ID: <1b3a36fe-1f62-410c-97fa-d59e7385f683@linux.alibaba.com>
In-Reply-To: <20241107235614.3637221-7-joannelkoong@gmail.com>
References: <20241107235614.3637221-1-joannelkoong@gmail.com> <20241107235614.3637221-7-joannelkoong@gmail.com>

Hi, Joanne,

Thanks for the continuing work!

On 11/8/24 7:56 AM, Joanne Koong wrote:
> Currently, we allocate and copy data to a temporary folio when
> handling writeback in order to mitigate the following deadlock scenario
> that may arise if reclaim waits on writeback to complete:
> * single-threaded FUSE server is in the middle of handling a request
>   that needs a memory allocation
> * memory allocation triggers direct reclaim
> * direct reclaim waits on a folio under writeback
> * the FUSE server can't write back the folio since it's stuck in
>   direct reclaim
>
> To work around this, we allocate a temporary folio and copy over the
> original folio to the temporary folio so that writeback can be
> immediately cleared on the original folio. This additionally requires us
> to maintain an internal rb tree to keep track of writeback state on the
> temporary folios.
>
> A recent change prevents reclaim logic from waiting on writeback for
> folios whose mappings have the AS_WRITEBACK_MAY_BLOCK flag set in it.
> This commit sets AS_WRITEBACK_MAY_BLOCK on FUSE inode mappings (which
> will prevent FUSE folios from running into the reclaim deadlock described
> above) and removes the temporary folio + extra copying and the internal
> rb tree.
>
> fio benchmarks --
> (using averages observed from 10 runs, throwing away outliers)
>
> Setup:
> sudo mount -t tmpfs -o size=30G tmpfs ~/tmp_mount
> ./libfuse/build/example/passthrough_ll -o writeback -o max_threads=4 -o source=~/tmp_mount ~/fuse_mount
>
> fio --name=writeback --ioengine=sync --rw=write --bs={1k,4k,1M} --size=2G
> --numjobs=2 --ramp_time=30 --group_reporting=1 --directory=/root/fuse_mount
>
>         bs =  1k            4k            1M
> Before  351 MiB/s     1818 MiB/s     1851 MiB/s
> After   341 MiB/s     2246 MiB/s     2685 MiB/s
> % diff       -3%           23%           45%
>
> Signed-off-by: Joanne Koong

> @@ -1622,7 +1543,7 @@ ssize_t fuse_direct_io(struct fuse_io_priv *io, struct iov_iter *iter,
>  			return res;
>  		}
>  	}
> -	if (!cuse && fuse_range_is_writeback(inode, idx_from, idx_to)) {
> +	if (!cuse && filemap_range_has_writeback(mapping, pos, (pos + count - 1))) {

filemap_range_has_writeback() is not equivalent to
fuse_range_is_writeback(), is it? It returns true as long as there is
any locked or dirty page in the range, not only when a page is under
writeback. I can't find an equivalent helper function at hand, though.

> @@ -3423,7 +3143,6 @@ void fuse_init_file_inode(struct inode *inode, unsigned int flags)
>  	fi->iocachectr = 0;
>  	init_waitqueue_head(&fi->page_waitq);
>  	init_waitqueue_head(&fi->direct_io_waitq);
> -	fi->writepages = RB_ROOT;

It seems that 'struct rb_root writepages' is not removed from struct
fuse_inode.

I also looked through the previous five patches and could not find any
obvious errors at first glance. Hopefully the MM guys can offer a more
professional review.

-- 
Thanks,
Jingbo
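
To illustrate the filemap_range_has_writeback() concern above, here is a minimal userspace sketch, not kernel code. The struct model_folio type and both model_* functions are hypothetical names invented for this illustration. It only models the difference between a check that fires on any dirty, locked, or under-writeback folio and a check that fires only on folios actually under writeback.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical userspace model of per-folio state; not a kernel structure. */
struct model_folio {
	bool dirty;
	bool locked;
	bool writeback;
};

/*
 * Broad check, modelled on the behaviour described in the review for
 * filemap_range_has_writeback(): a folio in [first, last] that is dirty,
 * locked, or under writeback makes the range count.
 */
bool model_range_dirty_locked_or_writeback(const struct model_folio *folios,
					   size_t nr, size_t first, size_t last)
{
	for (size_t i = first; i <= last && i < nr; i++) {
		if (folios[i].dirty || folios[i].locked || folios[i].writeback)
			return true;
	}
	return false;
}

/*
 * Narrow check, modelling what the removed fuse_range_is_writeback() was
 * used for: only folios actually under writeback make the range count.
 */
bool model_range_is_writeback(const struct model_folio *folios,
			      size_t nr, size_t first, size_t last)
{
	for (size_t i = first; i <= last && i < nr; i++) {
		if (folios[i].writeback)
			return true;
	}
	return false;
}
```

With a range that contains only a dirty folio, the broad check returns true while the narrow one returns false. In this model, a caller relying on the broad check would therefore take the "wait for writeback" branch more often than before.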