From: Jingbo Xu
To: Joanne Koong
Cc: Miklos Szeredi, Bernd Schubert, "linux-fsdevel@vger.kernel.org", "linux-kernel@vger.kernel.org", lege.wang@jaguarmicro.com, "Matthew Wilcox (Oracle)", "linux-mm@kvack.org"
Subject: Re: [HELP] FUSE writeback performance bottleneck
Date: Fri, 13 Sep 2024 09:25:13 +0800
Message-ID: <67cdcde3-1095-41cc-9d99-a0b97274d7be@linux.alibaba.com>
References: <495d2400-1d96-4924-99d3-8b2952e05fc3@linux.alibaba.com> <67771830-977f-4fca-9d0b-0126abf120a5@fastmail.fm> <2f834b5c-d591-43c5-86ba-18509d77a865@fastmail.fm>

On 9/13/24 8:00 AM, Joanne Koong wrote:
> On Thu, Aug 22, 2024 at 8:34 PM Jingbo Xu wrote:
>>
>> On 6/4/24 6:02 PM, Miklos Szeredi wrote:
>>> On Tue, 4 Jun 2024 at 11:32, Bernd
>>> Schubert wrote:
>>>
>>>> Back to the background for the copy, so it copies pages to avoid
>>>> blocking on memory reclaim. With that allocation it in fact increases
>>>> memory pressure even more. Isn't the right solution to mark those pages
>>>> as not reclaimable and to avoid blocking on it? Which is what the tmp
>>>> pages do, just not in beautiful way.
>>>
>>> Copying to the tmp page is the same as marking the pages as
>>> non-reclaimable and non-syncable.
>>>
>>> Conceptually it would be nice to only copy when there's something
>>> actually waiting for writeback on the page.
>>>
>>> Note: normally the WRITE request would be copied to userspace along
>>> with the contents of the pages very soon after starting writeback.
>>> After this the contents of the page no longer matter, and we can just
>>> clear writeback without doing the copy.
>>
>> OK this really deviates from my previous understanding of the deadlock
>> issue. Previously I thought *after* the server has received the WRITE
>> request, i.e. has copied the request and page content to userspace, the
>> server needs to allocate some memory to handle the WRITE request, e.g.
>> make the data persistent on disk, or send the data to the remote
>> storage. It is the memory allocation at this point that actually
>> triggers a memory direct reclaim (on the FUSE dirty page) and causes a
>> deadlock. It seems that I misunderstand it.
>
> I think your previous understanding is correct (or if not, then my
> understanding of this is incorrect too lol).
> The first write request makes it to userspace and when the server is
> in the middle of handling it, a memory reclaim is triggered where
> pages need to be written back.
> This leads to a SECOND write request
> (eg writing back the pages that are reclaimed) but this second write
> request will never be copied out to userspace because the server is
> stuck handling the first write request and waiting for the page
> reclaim bits of the reclaimed pages to be unset, but those reclaim
> bits can only be unset when the pages have been copied out to
> userspace, which only happens when the server reads /dev/fuse for the
> next request.

Right, that's true.

>
>>
>> If that's true, we can clear PF_writeback as long as the whole request
>> along with the page content has already been copied to userspace, and
>> thus eliminate the tmp page copying.
>>
>
> I think the problem is that on a single-threaded server, the pages
> will not be copied out to userspace for the second request (aka
> writing back the dirty reclaimed pages) since the server is stuck on
> the first request.

Agreed.

-- 
Thanks,
Jingbo
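
For illustration only, the single-threaded scenario discussed above can be sketched as a toy model. This is not kernel code: ToyFuseConn, start_writeback, server_reads_request and single_threaded_server are hypothetical names, and the model assumes that without tmp pages a page stays under writeback until the server reads its WRITE request, while with tmp pages the data is copied and writeback on the original page is cleared right away.

```python
from collections import deque


class ToyFuseConn:
    """Toy model of the writeback path described in this thread (hypothetical)."""

    def __init__(self, use_tmp_pages):
        self.use_tmp_pages = use_tmp_pages
        self.pending = deque()          # WRITE requests not yet read from /dev/fuse
        self.under_writeback = set()    # page cache pages still flagged as under writeback

    def start_writeback(self, page):
        self.pending.append(page)
        if not self.use_tmp_pages:
            # Assumption: no copy is made, so the page itself stays under
            # writeback until the request has been copied out to userspace.
            self.under_writeback.add(page)
        # With tmp pages the data is copied and the original page is
        # released immediately, so nothing is added to under_writeback.

    def server_reads_request(self):
        page = self.pending.popleft()
        self.under_writeback.discard(page)  # contents copied to userspace
        return page


def single_threaded_server(use_tmp_pages):
    conn = ToyFuseConn(use_tmp_pages)
    conn.start_writeback("A")
    conn.server_reads_request()         # first WRITE reaches the server

    # While handling "A" the server allocates memory; direct reclaim picks a
    # dirty FUSE page "B" and starts writeback on it (the SECOND request).
    conn.start_writeback("B")

    # Reclaim now waits for "B" to leave writeback. The only thread that can
    # read the second request is the one doing the waiting.
    if "B" in conn.under_writeback:
        return "deadlock"

    conn.server_reads_request()         # server finishes "A", then reads "B"
    return "progress"
```

Calling single_threaded_server(False) models clearing writeback only after the copy to userspace, and single_threaded_server(True) models the tmp page copy.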