From: Ritesh Harjani (IBM) <ritesh.list@gmail.com>
To: Christoph Hellwig <hch@lst.de>, Theodore Ts'o <tytso@mit.edu>
Cc: "Darrick J. Wong" <djwong@kernel.org>,
Luis Chamberlain <mcgrof@kernel.org>,
linux-fsdevel@vger.kernel.org, linux-mm@kvack.org,
linux-block@vger.kernel.org, lsf-pc@lists.linux-foundation.org,
david@fromorbit.com, leon@kernel.org, hch@lst.de,
kbusch@kernel.org, sagi@grimberg.me, axboe@kernel.dk,
joro@8bytes.org, brauner@kernel.org, hare@suse.de,
willy@infradead.org, john.g.garry@oracle.com,
p.raghav@samsung.com, gost.dev@samsung.com, da.gomez@samsung.com
Subject: Re: [LSF/MM/BPF TOPIC] breaking the 512 KiB IO boundary on x86_64
Date: Sat, 22 Mar 2025 00:09:09 +0530
Message-ID: <87ecyqrzxu.fsf@gmail.com>
In-Reply-To: <20250321050023.GB1831@lst.de>

Christoph Hellwig <hch@lst.de> writes:

> On Fri, Mar 21, 2025 at 12:56:04AM -0400, Theodore Ts'o wrote:
>> As I recall, in the early days Linux's safety for DIO and Buffered I/O
>> was best efforts, and on other Unix systems the recommendation to "don't
>> mix the streams" was far stronger. Even if it works reliably for
>> Linux, it's still something I recommend that people avoid if at all
>> possible.
>
> It still is a best effort, just a much better effort now. It's still
> pretty easy to break the coherency.

Thanks, Ted and Christoph, for the info. Do you think we should document
this recommendation somewhere in the kernel Documentation, where we
could also list the possible cases in which the coherency can break?
(I am not too well aware of those cases myself, though.)

One case I recently came across was an application that was not
setting --setbsz properly on a block device, on a system whose page size
is 64k. If I understand correctly, this installs one buffer_head per 64k
page for any buffered I/O operation. If someone then issues a 4k
buffered write right next to a 4k direct I/O write, it definitely ends
up being problematic: the 4k buffered write does a read-modify-write
over the whole 64k page (one buffer_head). We now have the entire 64k
page dirty while there is also a direct I/O write to that same region.
In effect the two writes overlap, which causes coherency issues.
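
To make the overlap concrete, here is a rough user-space sketch of the
arithmetic only. It is not kernel code: the helper names and the offsets
are made up for illustration, and it simply assumes that a buffered
write dirties every whole block (one buffer_head) it touches.

```python
PAGE_SIZE = 64 * 1024  # 64k page size, as in the case described above


def rmw_range(offset, length, block_size):
    """Byte range [start, end) covered once a buffered write is rounded
    out to whole blocks (hypothetical helper, illustration only)."""
    start = (offset // block_size) * block_size
    end = -(-(offset + length) // block_size) * block_size  # round up
    return start, end


def overlaps(a, b):
    """True if the half-open byte ranges a and b intersect."""
    return a[0] < b[1] and b[0] < a[1]


# A 4k direct I/O write placed right after a 4k buffered write.
direct_write = (4096, 8192)

# Block size 4k: the buffered write dirties only its own 4k.
dirty_4k = rmw_range(0, 4096, 4096)

# Block size equal to the 64k page: the same 4k buffered write turns
# into a read-modify-write over the whole 64k block.
dirty_64k = rmw_range(0, 4096, PAGE_SIZE)

print("4k blocks:  dirty", dirty_4k, "overlaps DIO:",
      overlaps(dirty_4k, direct_write))
print("64k blocks: dirty", dirty_64k, "overlaps DIO:",
      overlaps(dirty_64k, direct_write))
```

With 4k blocks the two writes stay disjoint; with a single 64k block the
dirtied range swallows the neighbouring direct I/O write.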

Such cases, I believe, are easy to miss. And now that large folios are
being used for block devices, I am not sure there is much value in
applications mixing buffered I/O and direct I/O. Since a direct I/O
write will just end up invalidating the entire large folio, it could
negate any benefit of using buffered I/O alongside it on the same block
device.

-ritesh