linux-mm.kvack.org archive mirror
 help / color / mirror / Atom feed
From: Jingbo Xu <jefflexu@linux.alibaba.com>
To: Joanne Koong <joannelkoong@gmail.com>
Cc: Miklos Szeredi <miklos@szeredi.hu>,
	Bernd Schubert <bernd.schubert@fastmail.fm>,
	"linux-fsdevel@vger.kernel.org" <linux-fsdevel@vger.kernel.org>,
	"linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>,
	lege.wang@jaguarmicro.com,
	"Matthew Wilcox (Oracle)" <willy@infradead.org>,
	"linux-mm@kvack.org" <linux-mm@kvack.org>
Subject: Re: [HELP] FUSE writeback performance bottleneck
Date: Fri, 13 Sep 2024 09:25:13 +0800	[thread overview]
Message-ID: <67cdcde3-1095-41cc-9d99-a0b97274d7be@linux.alibaba.com> (raw)
In-Reply-To: <CAJnrk1bTt7r1hfkp6oA3-_x3ixEd_qKb8Kkxrugv8XOOJz7U4Q@mail.gmail.com>



On 9/13/24 8:00 AM, Joanne Koong wrote:
> On Thu, Aug 22, 2024 at 8:34 PM Jingbo Xu <jefflexu@linux.alibaba.com> wrote:
>>
>> On 6/4/24 6:02 PM, Miklos Szeredi wrote:
>>> On Tue, 4 Jun 2024 at 11:32, Bernd Schubert <bernd.schubert@fastmail.fm> wrote:
>>>
>>>> Back to the background for the copy, so it copies pages to avoid
>>>> blocking on memory reclaim. With that allocation it in fact increases
>>>> memory pressure even more. Isn't the right solution to mark those pages
>>>> as not reclaimable and to avoid blocking on it? Which is what the tmp
>>>> pages do, just not in beautiful way.
>>>
>>> Copying to the tmp page is the same as marking the pages as
>>> non-reclaimable and non-syncable.
>>>
>>> Conceptually it would be nice to only copy when there's something
>>> actually waiting for writeback on the page.
>>>
>>> Note: normally the WRITE request would be copied to userspace along
>>> with the contents of the pages very soon after starting writeback.
>>> After this the contents of the page no longer matter, and we can just
>>> clear writeback without doing the copy.
>>
>> OK this really deviates from my previous understanding of the deadlock
>> issue.  Previously I thought *after* the server has received the WRITE
>> request, i.e. has copied the request and page content to userspace, the
>> server needs to allocate some memory to handle the WRITE request, e.g.
>> make the data persistent on disk, or send the data to the remote
>> storage.  It is the memory allocation at this point that actually
>> triggers a memory direct reclaim (on the FUSE dirty page) and causes a
>> deadlock.  It seems that I misunderstand it.
> 
> I think your previous understanding is correct (or if not, then my
> understanding of this is incorrect too lol).
> The first write request makes it to userspace and when the server is
> in the middle of handling it, a memory reclaim is triggered where
> pages need to be written back. This leads to a SECOND write request
> (eg writing back the pages that are reclaimed) but this second write
> request will never be copied out to userspace because the server is
> stuck handling the first write request and waiting for the page
> reclaim bits of the reclaimed pages to be unset, but those reclaim
> bits can only be unset when the pages have been copied out to
> userspace, which only happens when the server reads /dev/fuse for the
> next request.

Right, that's true.

> 
>>
>> If that's true, we can clear PF_writeback as long as the whole request
>> along with the page content has already been copied to userspace, and
>> thus eliminate the tmp page copying.
>>
> 
> I think the problem is that on a single-threaded server,  the pages
> will not be copied out to userspace for the second request (aka
> writing back the dirty reclaimed pages) since the server is stuck on
> the first request.

Agreed.


-- 
Thanks,
Jingbo


  reply	other threads:[~2024-09-13  1:25 UTC|newest]

Thread overview: 14+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
     [not found] <495d2400-1d96-4924-99d3-8b2952e05fc3@linux.alibaba.com>
     [not found] ` <67771830-977f-4fca-9d0b-0126abf120a5@fastmail.fm>
     [not found]   ` <CAJfpeguts=V9KkBsMJN_WfdkLHPzB6RswGvumVHUMJ87zOAbDQ@mail.gmail.com>
     [not found]     ` <bd49fcba-3eb6-4e84-a0f0-e73bce31ddb2@linux.alibaba.com>
     [not found]       ` <CAJfpegsfF77SV96wvaxn9VnRkNt5FKCnA4mJ0ieFsZtwFeRuYw@mail.gmail.com>
     [not found]         ` <ffca9534-cb75-4dc6-9830-fe8e84db2413@linux.alibaba.com>
2024-06-04  9:32           ` Bernd Schubert
2024-06-04 10:02             ` Miklos Szeredi
2024-06-04 14:13               ` Bernd Schubert
2024-06-04 16:53                 ` Josef Bacik
2024-06-04 21:39                   ` Bernd Schubert
2024-06-04 22:16                     ` Josef Bacik
2024-06-05  5:49                       ` Amir Goldstein
2024-06-05 15:35                         ` Josef Bacik
2024-08-22 17:00               ` Joanne Koong
2024-08-22 21:01                 ` Joanne Koong
2024-08-23  3:34               ` Jingbo Xu
2024-09-13  0:00                 ` Joanne Koong
2024-09-13  1:25                   ` Jingbo Xu [this message]
2024-06-04 12:24             ` Jingbo Xu

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=67cdcde3-1095-41cc-9d99-a0b97274d7be@linux.alibaba.com \
    --to=jefflexu@linux.alibaba.com \
    --cc=bernd.schubert@fastmail.fm \
    --cc=joannelkoong@gmail.com \
    --cc=lege.wang@jaguarmicro.com \
    --cc=linux-fsdevel@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-mm@kvack.org \
    --cc=miklos@szeredi.hu \
    --cc=willy@infradead.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox