From: Qu Wenruo <quwenruo.btrfs@gmx.com>
To: Matthew Wilcox <willy@infradead.org>
Cc: "Pankaj Raghav (Samsung)" <kernel@pankajraghav.com>,
"linux-fsdevel@vger.kernel.org" <linux-fsdevel@vger.kernel.org>,
linux-btrfs <linux-btrfs@vger.kernel.org>,
linux-mm@kvack.org, mcgrof@kernel.org, p.raghav@samsung.com
Subject: Re: Any way to ensure minimal folio size and alignment for iomap based direct IO?
Date: Tue, 16 Sep 2025 08:50:56 +0930
Message-ID: <a54b3a0b-76da-4fb3-bdcd-db54941a255b@gmx.com>
In-Reply-To: <aMibrFB8_21GQWUD@casper.infradead.org>
On 2025/9/16 08:35, Matthew Wilcox wrote:
> On Tue, Sep 16, 2025 at 07:16:48AM +0930, Qu Wenruo wrote:
>>> Is it very difficult to add multi-shot checksum calls for a data block
>>> in btrfs? Does it break certain reliability guarantees?
>>
>> I'd say it's not impossible, but still not an easy thing to do.
>>
>> E.g. at data read time we need to verify the checksum. Currently we're able
>> to do the checksum for one block in one go, then advance the bio iter.
>>
>> But with multi-shot one, we have to update the shash several times before we
>> can determine if the result is correct.
>>
>> There is even compression algorithm which can not support multi-shot
>> interface, lzo.
>>
>> Thankfully compression is only possible for buffered IO, so it's not
>> involved in this case.
>
> Would it be acceptable to vmap() the pages and do the checksum on the
> virtual address?
That may not be any better than multi-shot runs, as we would still need
to advance the iter by a sub-block-sized length and map each piece.
Considering we need to do sub-block handling anyway, I'll just come up
with a helper to handle the iteration.
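
As a rough illustration of what such an iteration has to do, here is a
user-space sketch, not btrfs code. It assumes CRC32C as the checksum and
uses a hypothetical `struct seg` to stand for one physically contiguous
piece of a block; the function names are made up for this example. The
multi-shot path carries the running state across segments and only
finalizes once the whole block has been fed in, which is the "update
several times before we can tell if the result is correct" pattern
described above.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical: one contiguous piece of a (possibly scattered) block. */
struct seg {
	const void *addr;
	size_t len;
};

/* Feed len bytes into a running CRC32C state (bitwise, reflected poly). */
uint32_t crc32c_update(uint32_t state, const void *data, size_t len)
{
	const uint8_t *p = data;

	while (len--) {
		state ^= *p++;
		for (int bit = 0; bit < 8; bit++)
			state = (state >> 1) ^ (0x82F63B78u & -(state & 1u));
	}
	return state;
}

/* One-shot: the whole block is reachable through a single pointer. */
uint32_t csum_block_oneshot(const void *block, size_t blocksize)
{
	return crc32c_update(0xFFFFFFFFu, block, blocksize) ^ 0xFFFFFFFFu;
}

/*
 * Multi-shot: the block is split over nr segments. The result is only
 * meaningful after every segment has been fed in.
 */
uint32_t csum_block_multishot(const struct seg *segs, size_t nr,
			      size_t blocksize)
{
	uint32_t state = 0xFFFFFFFFu;
	size_t done = 0;

	for (size_t i = 0; i < nr; i++) {
		state = crc32c_update(state, segs[i].addr, segs[i].len);
		done += segs[i].len;
	}
	/* The segments must cover exactly one block. */
	assert(done == blocksize);
	return state ^ 0xFFFFFFFFu;
}
```

Both paths give the same value for the same bytes; the difference is
only that the multi-shot caller has to walk the segments and keep the
intermediate state around until the block is complete.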
>
>> However then the problem is why the read iov_iter passes the alignment
>> check, but we still get the bio not meeting the large folio requirement?
>
> The virtual address _is_ aligned. It's just not backed with large
> folios, for whatever reason.
>
Oh, that explains the problem.
So even if we add extra checks inside btrfs to ensure all the pages of
the iter are backed by large folios, it will still be very problematic
for user space programs.
They have no control over the underlying page layout, and would hit
random DIO failures or fallbacks, which is not acceptable for end users.
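
To make the distinction concrete, a toy sketch (plain C, not kernel
code; both helpers and the page frame numbers are hypothetical). The
first check is the one user space can satisfy, e.g. with an aligned
allocation. The second only tests physical contiguity, which is a
necessary condition for the pages of a block to sit in one large folio,
and it depends entirely on which page frames the kernel happened to
hand out.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* What user space controls: the virtual address of the buffer. */
bool vaddr_is_block_aligned(uintptr_t vaddr, size_t blocksize)
{
	return vaddr % blocksize == 0;
}

/*
 * What user space does not control: pfns[i] is the (hypothetical) page
 * frame number behind the i-th page of one block. Pages inside a single
 * folio are physically contiguous, so a gap here rules that out.
 */
bool block_is_physically_contiguous(const uint64_t *pfns, size_t nr_pages)
{
	if (nr_pages == 0)
		return false;
	for (size_t i = 1; i < nr_pages; i++)
		if (pfns[i] != pfns[0] + i)
			return false;
	return true;
}
```

A buffer can pass the first check and still fail the second, which is
the case the reads here were hitting.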
Thanks a lot for the definitive answer,
Qu