Re: [LSF/MM/BPF TOPIC] breaking the 512 KiB IO boundary on x86_64

From: Ritesh Harjani (IBM) <ritesh.list@gmail.com>
To: Keith Busch <kbusch@kernel.org>
Cc: "Darrick J. Wong" <djwong@kernel.org>,
	Luis Chamberlain <mcgrof@kernel.org>,
	linux-fsdevel@vger.kernel.org, linux-mm@kvack.org,
	linux-block@vger.kernel.org, lsf-pc@lists.linux-foundation.org,
	david@fromorbit.com, leon@kernel.org, hch@lst.de,
	sagi@grimberg.me, axboe@kernel.dk, joro@8bytes.org,
	brauner@kernel.org, hare@suse.de, willy@infradead.org,
	john.g.garry@oracle.com, p.raghav@samsung.com,
	gost.dev@samsung.com, da.gomez@samsung.com
Subject: Re: [LSF/MM/BPF TOPIC] breaking the 512 KiB IO boundary on x86_64
Date: Fri, 21 Mar 2025 22:51:42 +0530
Message-ID: <87frj6s3ix.fsf@gmail.com>
In-Reply-To: <Z92WBePJ620r5-13@kbusch-mbp>

Keith Busch <kbusch@kernel.org> writes:

> On Fri, Mar 21, 2025 at 07:43:09AM +0530, Ritesh Harjani wrote:
>> i.e. w/o large folios in block devices one could do direct-io &
>> buffered-io in parallel even just next to each other (assuming 4k pagesize). 
>> 
>>            |4k-direct-io | 4k-buffered-io | 
>> 
>> 
>> However with large folios now supported in buffered-io path for block
>> devices, the application cannot submit such direct-io + buffered-io
>> pattern in parallel. Since direct-io can end up invalidating the folio
>> spanning over it's 4k range, on which buffered-io is in progress.
>
> Why would buffered io span more than the 4k range here? You're talking
> to the raw block device in both cases, so they have the exact same
> logical block size alignment. Why is buffered io allocating beyond
> the logical size granularity?

This can happen in the following two cases:
1. The system's page size is 64k. Then, even though the logical block
size granularity for buffered-io is set to 4k (blockdev --setbsz 4k
/dev/sdc), it will still instantiate a 64k page in the page cache.

2. The second is the recent case where (correct me if I am wrong) we now
have large folio support for block devices. So here again we can
instantiate a large folio in the page cache where buffered-io is in
progress, correct? (Say a previous read triggers readahead and installs
a large folio in that region.) Even iomap_write_iter() these days
first tries to allocate a chunk of size mapping_max_folio_size().

However, with large folio support now in block devices, I am not sure
whether an application retains much benefit from buffered-io if it
carefully mixes buffered-io and direct-io across a logical block
boundary, because the direct-io can end up invalidating the entire
large folio, if there is one, in the region where the direct-io
operation is taking place. Large folio support may still be useful if
only buffered-io is being performed on the block device.

-ritesh
