From: Jingbo Xu
Date: Fri, 23 Aug 2024 11:34:31 +0800
Subject: Re: [HELP] FUSE writeback performance bottleneck
To: Miklos Szeredi, Bernd Schubert
Cc: "linux-fsdevel@vger.kernel.org", "linux-kernel@vger.kernel.org", lege.wang@jaguarmicro.com, "Matthew Wilcox (Oracle)", "linux-mm@kvack.org"
References: <495d2400-1d96-4924-99d3-8b2952e05fc3@linux.alibaba.com> <67771830-977f-4fca-9d0b-0126abf120a5@fastmail.fm> <2f834b5c-d591-43c5-86ba-18509d77a865@fastmail.fm>

On 6/4/24 6:02 PM, Miklos Szeredi wrote:
> On Tue, 4 Jun 2024 at 11:32, Bernd Schubert wrote:
>
>> Back to the background for the copy, so it copies pages to avoid
>> blocking on memory reclaim. With that allocation it in fact increases
>> memory pressure even more.
>> Isn't the right solution to mark those pages
>> as not reclaimable and to avoid blocking on it? Which is what the tmp
>> pages do, just not in beautiful way.
>
> Copying to the tmp page is the same as marking the pages as
> non-reclaimable and non-syncable.
>
> Conceptually it would be nice to only copy when there's something
> actually waiting for writeback on the page.
>
> Note: normally the WRITE request would be copied to userspace along
> with the contents of the pages very soon after starting writeback.
> After this the contents of the page no longer matter, and we can just
> clear writeback without doing the copy.

OK, this really deviates from my previous understanding of the deadlock
issue. Previously I thought that *after* the server has received the
WRITE request, i.e. has copied the request and page content to
userspace, the server needs to allocate some memory to handle the WRITE
request, e.g. to make the data persistent on disk or to send the data
to remote storage. I thought it was the memory allocation at this point
that actually triggers direct memory reclaim (on the FUSE dirty page)
and causes a deadlock. It seems that I misunderstood it.

If what you describe is true, we can clear PG_writeback as soon as the
whole request, along with the page content, has been copied to
userspace, and thus eliminate the tmp page copying.

>
> But if the request gets stuck in the input queue before being copied
> to userspace, then deadlock can still happen if the server blocks on
> direct reclaim and won't continue with processing the queue. And
> sync(2) will also block in that case.
>

Hi, Miklos,

Would you please give more details on how "the request can get stuck in
the input queue before being copied to userspace"?
Do you mean that the WRITE requests (submitted from writeback) are
still pending in the background/pending list, waiting to be processed
by the server, while at the same time the server is blocked from
processing the queue, either because it is blocked on direct reclaim
(when handling *another* request), or because it is a malicious server
that refuses to process any request?

--
Thanks,
Jingbo