References: <495d2400-1d96-4924-99d3-8b2952e05fc3@linux.alibaba.com>
 <67771830-977f-4fca-9d0b-0126abf120a5@fastmail.fm>
 <2f834b5c-d591-43c5-86ba-18509d77a865@fastmail.fm>
From: Joanne Koong
Date: Thu, 12 Sep 2024 17:00:22 -0700
Subject: Re: [HELP] FUSE writeback performance bottleneck
To: Jingbo Xu
Cc: Miklos Szeredi, Bernd Schubert,
 "linux-fsdevel@vger.kernel.org", "linux-kernel@vger.kernel.org",
 lege.wang@jaguarmicro.com, "Matthew Wilcox (Oracle)",
 "linux-mm@kvack.org"

On Thu, Aug 22, 2024 at 8:34 PM Jingbo Xu wrote:
>
> On 6/4/24 6:02 PM, Miklos Szeredi wrote:
> > On Tue, 4 Jun 2024 at 11:32, Bernd Schubert wrote:
> >
> >> Back to the background for the copy, so it copies pages to avoid
> >> blocking on memory reclaim. With that allocation it in fact increases
> >> memory pressure even more. Isn't the right solution to mark those pages
> >> as not reclaimable and to avoid blocking on it?
> >> Which is what the tmp pages do, just not in beautiful way.
> >
> > Copying to the tmp page is the same as marking the pages as
> > non-reclaimable and non-syncable.
> >
> > Conceptually it would be nice to only copy when there's something
> > actually waiting for writeback on the page.
> >
> > Note: normally the WRITE request would be copied to userspace along
> > with the contents of the pages very soon after starting writeback.
> > After this the contents of the page no longer matter, and we can just
> > clear writeback without doing the copy.
>
> OK this really deviates from my previous understanding of the deadlock
> issue. Previously I thought *after* the server has received the WRITE
> request, i.e. has copied the request and page content to userspace, the
> server needs to allocate some memory to handle the WRITE request, e.g.
> make the data persistent on disk, or send the data to the remote
> storage. It is the memory allocation at this point that actually
> triggers a memory direct reclaim (on the FUSE dirty page) and causes a
> deadlock. It seems that I misunderstand it.

I think your previous understanding is correct (or if not, then my
understanding of this is incorrect too lol). The first write request
makes it to userspace, and while the server is in the middle of
handling it, memory reclaim is triggered and pages need to be written
back. This leads to a SECOND write request (e.g. writing back the pages
that are being reclaimed), but this second write request will never be
copied out to userspace: the server is stuck handling the first write
request, waiting for the page reclaim bits of the reclaimed pages to be
unset. Those bits can only be unset once the pages have been copied out
to userspace, which only happens when the server reads /dev/fuse for
the next request.
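To make the sequence concrete, here is a toy Python model of it. This is
only a sketch, not kernel or libfuse code: the request names and the
simulate() helper are made up for illustration, and it assumes that
handling the first WRITE always triggers reclaim of dirty FUSE pages.

```python
from collections import deque


def simulate(num_server_threads):
    """Toy model: True if every queued WRITE gets read, False on deadlock."""
    queue = deque(["WRITE-1"])   # requests waiting to be read from /dev/fuse
    idle = num_server_threads    # server threads free to read a request
    waiting_on = None            # request a blocked thread needs copied out

    while queue:
        if idle == 0:
            # Requests are queued but nobody can read /dev/fuse: deadlock.
            return False
        request = queue.popleft()  # request is copied out to userspace
        idle -= 1
        if request == waiting_on:
            # Copy-out lets the page state be cleared, so the thread
            # blocked in reclaim and the current thread both finish.
            waiting_on = None
            idle += 2
        elif request == "WRITE-1":
            # Handling WRITE-1 allocates memory; reclaim picks dirty FUSE
            # pages and queues a second WRITE, then this thread blocks.
            queue.append("WRITE-2")
            waiting_on = "WRITE-2"
    return waiting_on is None


print(simulate(1))
print(simulate(2))
```

In this model a single server thread never gets back to /dev/fuse, so
WRITE-2 stays queued forever. A second thread lets it drain, although I
would expect a real multi-threaded server could still wedge if every
thread ended up blocked in reclaim.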

>
> If that's true, we can clear PF_writeback as long as the whole request
> along with the page content has already been copied to userspace, and
> thus eliminate the tmp page copying.
>

I think the problem is that on a single-threaded server, the pages for
the second request (i.e. the writeback of the dirty pages being
reclaimed) will never be copied out to userspace, since the server is
stuck on the first request.

> >
> > But if the request gets stuck in the input queue before being copied
> > to userspace, then deadlock can still happen if the server blocks on
> > direct reclaim and won't continue with processing the queue. And
> > sync(2) will also block in that case.
> >
>
> Hi, Miklos,
>
> Would you please give more details on how "the request can get stuck in
> the input queue before being copied userspace"? Do you mean the WRITE
> requests (submitted from writeback) are still pending in the
> background/pending list, waiting to be processed by the server, while at
> the same time the server gets blocked from processing the queue, either
> due to the server is blocked on direct reclaim (when handling *another*
> request), or it's a malicious server and refuses to process any request?
>
>
> --
> Thanks,
> Jingbo
>
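
For comparison, the two policies discussed in the quoted text can be put
side by side in a small Python sketch. The policy names and the
writeback_cleared() helper are hypothetical and only restate the
discussion above; they are not how fs/fuse is structured.

```python
def writeback_cleared(policy, copied_to_userspace):
    """Toy model: is the page cache page no longer under writeback?"""
    if policy == "tmp_page":
        # Data is copied to a tmp page when writeback starts, so the
        # original page does not wait on the server at all.
        return True
    if policy == "clear_on_copyout":
        # Proposed alternative: skip the tmp copy and clear writeback
        # once the request and page contents reach userspace.
        return copied_to_userspace
    raise ValueError(f"unknown policy: {policy}")


# A request stuck in the input queue has not been copied out yet.
for policy in ("tmp_page", "clear_on_copyout"):
    print(policy, writeback_cleared(policy, copied_to_userspace=False))
```

The second policy only helps once the copy-out has happened, which is
the case Miklos points out: a request still sitting in the input queue
leaves the page under writeback.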