From: Matthew Wilcox <willy@infradead.org>
To: Mateusz Guzik <mjguzik@gmail.com>
Cc: Bharata B Rao <bharata@amd.com>,
	linux-block@vger.kernel.org, linux-kernel@vger.kernel.org,
	linux-fsdevel@vger.kernel.org, linux-mm@kvack.org,
	nikunj@amd.com, vbabka@suse.cz, david@redhat.com,
	akpm@linux-foundation.org, yuzhao@google.com, axboe@kernel.dk,
	viro@zeniv.linux.org.uk, brauner@kernel.org, jack@suse.cz,
	joshdon@google.com, clm@meta.com
Subject: Re: [RFC PATCH 0/1] Large folios in block buffered IO path
Date: Thu, 28 Nov 2024 04:43:18 +0000
Message-ID: <Z0f05sdTiL8kW9U8@casper.infradead.org>
In-Reply-To: <CAGudoHHBu663RSjQUwi14_d+Ln6mw_ESvYCc6dTec-O0Wi1-Eg@mail.gmail.com>

On Thu, Nov 28, 2024 at 05:22:41AM +0100, Mateusz Guzik wrote:
> This means that the folio waiting stuff has poor scalability, but
> without digging into it I have no idea what can be done. The easy way

Actually the easy way is to change:

#define PAGE_WAIT_TABLE_BITS 8

to a larger number.
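
For readers following along: in mm/filemap.c, waiters are not queued on the folio itself. The folio's address is hashed into a shared table of 1 << PAGE_WAIT_TABLE_BITS wait queues, so unrelated folios can land on the same queue. The following is a minimal userspace sketch of that bucket selection, not the kernel code. The helper names (wait_bucket, sharers_of_first), the base address and the 64-byte spacing are assumptions for illustration. The multiplier is the 64-bit golden-ratio constant used by the kernel's hash_64().

```c
#include <assert.h>
#include <stdint.h>

/* Multiplier used by the kernel's 64-bit multiplicative hash. */
#define GOLDEN_RATIO_64 0x61C8864680B583EBull

/*
 * Map a folio address to a wait-table bucket by keeping the top
 * `bits` bits of the product (hypothetical stand-in for
 * hash_ptr(folio, PAGE_WAIT_TABLE_BITS)).
 */
static unsigned int wait_bucket(uint64_t folio_addr, unsigned int bits)
{
	assert(bits > 0 && bits < 32);
	return (unsigned int)((folio_addr * GOLDEN_RATIO_64) >> (64 - bits));
}

/*
 * Count how many of nr_folios consecutive folio structures share a
 * bucket with the first one.  Assumes 64-byte structures starting at
 * a made-up base address.
 */
static unsigned int sharers_of_first(unsigned int nr_folios, unsigned int bits)
{
	const uint64_t base = 0xffffea0000000000ull; /* hypothetical */
	unsigned int target = wait_bucket(base, bits);
	unsigned int n = 0;

	for (unsigned int i = 1; i < nr_folios; i++)
		if (wait_bucket(base + (uint64_t)i * 64, bits) == target)
			n++;
	return n;
}
```

Because the bucket is the top bits of the same product, a table with more bits only ever splits existing buckets: two folios that share a bucket at 12 bits also share one at 8 bits, never the reverse. Raising the constant therefore cannot make sharing between unrelated folios worse, at the cost of a larger static table. It does not shorten the queue when all the waiters are waiting on the same folio.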

> out would be to speculatively spin before buggering off, but one would
> have to check what happens in real workloads -- presumably the lock
> owner can be off cpu for a long time (I presume there is no way to
> store the owner).

So ...

 - There's no space in struct folio to put a rwsem.
 - But we want to be able to sleep waiting for a folio to (e.g.) do I/O.

This is the solution we have.  For the read case, there are three
important bits in folio->flags to pay attention to:

 - PG_locked.  This is held during the read.
 - PG_uptodate.  This is set if the read succeeded.
 - PG_waiters.  This is set if anyone is waiting for PG_locked [*]

The first thread comes along, allocates a folio, locks it, inserts
it into the mapping.
The second thread comes along, finds the folio, sees it's !uptodate,
sets the waiter bit, adds itself to the waitqueue.
The third thread, ditto.
The read completes.  In interrupt or maybe softirq context, the
BIO completion sets the uptodate bit, clears the locked bit and tests
the waiter bit.  Since the waiter bit is set, it walks the waitqueue
looking for waiters which match the locked bit and folio (see
folio_wake_bit()).
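
The sequence above can be sketched in userspace as a single-threaded model. This is an illustration only, under several assumptions: all sketch_* names and the bit values are made up, there is no real sleeping or atomicity, and the waiters bit is cleared unconditionally after the walk, which is simpler than what the kernel does. The point it shows is that the completion side walks the whole shared queue but wakes only entries whose folio and bit match.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical bit values; the kernel's PG_* numbering differs. */
enum {
	PG_LOCKED   = 1 << 0,	/* held during the read */
	PG_UPTODATE = 1 << 1,	/* set if the read succeeded */
	PG_WAITERS  = 1 << 2,	/* set if anyone waits for PG_LOCKED */
};

struct sketch_folio {
	unsigned int flags;
};

/* One queued waiter: which folio and which bit it is waiting on. */
struct sketch_waiter {
	const struct sketch_folio *folio;
	unsigned int bit;
	bool woken;
};

#define MAX_WAITERS 16

/* One bucket of the wait table, possibly shared by many folios. */
struct sketch_waitqueue {
	struct sketch_waiter entries[MAX_WAITERS];
	size_t nr;
};

/* Second and third thread: set the waiter bit, join the queue. */
static void sketch_wait_on_locked(struct sketch_waitqueue *q,
				  struct sketch_folio *folio)
{
	assert(q->nr < MAX_WAITERS);
	folio->flags |= PG_WAITERS;
	q->entries[q->nr++] = (struct sketch_waiter){
		.folio = folio, .bit = PG_LOCKED, .woken = false,
	};
}

/*
 * Read completion: set uptodate, clear locked, test the waiter bit,
 * then walk the queue and wake only the matching entries.
 * Returns the number of waiters woken.
 */
static size_t sketch_read_complete(struct sketch_waitqueue *q,
				   struct sketch_folio *folio)
{
	size_t woken = 0;

	folio->flags |= PG_UPTODATE;
	folio->flags &= ~(unsigned int)PG_LOCKED;
	if (!(folio->flags & PG_WAITERS))
		return 0;	/* fast path: nobody queued */

	for (size_t i = 0; i < q->nr; i++) {
		struct sketch_waiter *w = &q->entries[i];

		if (w->folio == folio && w->bit == PG_LOCKED && !w->woken) {
			w->woken = true;
			woken++;
		}
	}
	folio->flags &= ~(unsigned int)PG_WAITERS;	/* simplification */
	return woken;
}
```

In this model the cost of a completion is proportional to the length of the queue, including entries for other folios that hashed to the same bucket, even though only the matching waiters are woken.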

So there's not _much_ of a thundering herd problem here.  Most likely
the waitqueue is just too damn long with a lot of threads waiting for
I/O.

[*] oversimplification; don't worry about it.

