From: "Ritesh Harjani (IBM)" <ritesh.list@gmail.com>
To: Jan Kara <jack@suse.cz>
Cc: linux-fsdevel@vger.kernel.org, linux-mm@kvack.org,
lsf-pc@lists.linux-foundation.org,
"Ritesh Harjani (IBM)" <ritesh.list@gmail.com>
Subject: [RFCv1][WIP] ext2: Move direct-io to use iomap
Date: Thu, 16 Mar 2023 20:10:29 +0530
Message-ID: <eae9d2125de1887f55186668937df7475b0a33f4.1678977084.git.ritesh.list@gmail.com>
In-Reply-To: <87ttz889ns.fsf@doe.com>
[DO NOT MERGE] [WORK-IN-PROGRESS]
Hello Jan,
This is an initial version of the patch set which I wanted to share
before today's call. It is still a work in progress, but it at least
passes the set of test cases I keep for dio testing (except one from my
list).
It looks like few, if any, changes will be required on the iomap side to
support moving ext2 to the iomap APIs.
I will be doing some more testing, specifically of generic/083, which is
occasionally failing in my testing.
Also, once this is stabilized, I can do some performance testing too if
you think it is worthwhile. As far as I remember, we saw some performance
regressions when ext4 moved to iomap for dio.
PS: Please ignore any silly mistakes. As I said, I wanted
to get this out before today's discussion. :)
Thanks for your help!!
Signed-off-by: Ritesh Harjani (IBM) <ritesh.list@gmail.com>
---
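Note: the write path in this patch sets IOMAP_DIO_FORCE_WAIT when a write
extends the file or is not block-aligned. The decision can be modelled in
userspace roughly as below. This is only a sketch for illustration, not
kernel code: the function name dio_write_needs_wait() and its parameters are
hypothetical stand-ins for iocb->ki_pos, iov_iter_count(from),
iov_iter_alignment(from), i_size_read(inode) and the superblock block size,
and it assumes a power-of-two block size.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Userspace model (not kernel code) of the check in ext2_dio_write_iter()
 * that decides whether to set IOMAP_DIO_FORCE_WAIT.
 *
 * pos        - file offset of the write (iocb->ki_pos in the patch)
 * len        - number of bytes to write (iov_iter_count(from))
 * iov_align  - OR of the user buffer addresses and segment lengths,
 *              standing in for iov_iter_alignment(from)
 * i_size     - current file size (i_size_read(inode))
 * blocksize  - filesystem block size; assumed to be a power of two
 */
static bool dio_write_needs_wait(uint64_t pos, size_t len,
				 unsigned long iov_align,
				 uint64_t i_size, unsigned long blocksize)
{
	/* The mask test below is only valid for power-of-two block sizes. */
	assert(blocksize != 0 && (blocksize & (blocksize - 1)) == 0);

	/* Extending write: the I/O ends beyond the current end of file. */
	if (pos + len > i_size)
		return true;

	/* Unaligned write: a low bit is set in the offset or the buffers. */
	if (((pos | iov_align) & (blocksize - 1)) != 0)
		return true;

	return false;
}
```

With a 4096-byte block size, an aligned overwrite inside i_size does not
need the forced wait, while a 512-byte write at offset 512 or any write
that ends past i_size does.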
fs/ext2/ext2.h | 1 +
fs/ext2/file.c | 114 ++++++++++++++++++++++++++++++++++++++++++++++++
fs/ext2/inode.c | 20 +--------
3 files changed, 117 insertions(+), 18 deletions(-)
diff --git a/fs/ext2/ext2.h b/fs/ext2/ext2.h
index cb78d7dcfb95..cb5e309fe040 100644
--- a/fs/ext2/ext2.h
+++ b/fs/ext2/ext2.h
@@ -753,6 +753,7 @@ extern unsigned long ext2_count_free (struct buffer_head *, unsigned);
extern struct inode *ext2_iget (struct super_block *, unsigned long);
extern int ext2_write_inode (struct inode *, struct writeback_control *);
extern void ext2_evict_inode(struct inode *);
+extern void ext2_write_failed(struct address_space *mapping, loff_t to);
extern int ext2_get_block(struct inode *, sector_t, struct buffer_head *, int);
extern int ext2_setattr (struct mnt_idmap *, struct dentry *, struct iattr *);
extern int ext2_getattr (struct mnt_idmap *, const struct path *,
diff --git a/fs/ext2/file.c b/fs/ext2/file.c
index 6b4bebe982ca..7a8561304559 100644
--- a/fs/ext2/file.c
+++ b/fs/ext2/file.c
@@ -161,12 +161,123 @@ int ext2_fsync(struct file *file, loff_t start, loff_t end, int datasync)
return ret;
}
+static ssize_t ext2_dio_read_iter(struct kiocb *iocb, struct iov_iter *to)
+{
+ struct file *file = iocb->ki_filp;
+ struct inode *inode = file->f_mapping->host;
+ ssize_t ret;
+
+ inode_lock_shared(inode);
+ ret = iomap_dio_rw(iocb, to, &ext2_iomap_ops, NULL, 0, NULL, 0);
+ inode_unlock_shared(inode);
+
+ return ret;
+}
+
+static int ext2_dio_write_end_io(struct kiocb *iocb, ssize_t size,
+ int error, unsigned int flags)
+{
+ loff_t pos = iocb->ki_pos;
+ struct inode *inode = file_inode(iocb->ki_filp);
+
+ if (error)
+ return error;
+
+ pos += size;
+ if (pos > i_size_read(inode))
+ i_size_write(inode, pos);
+
+ return 0;
+}
+
+static const struct iomap_dio_ops ext2_dio_write_ops = {
+ .end_io = ext2_dio_write_end_io,
+};
+
+static ssize_t ext2_dio_write_iter(struct kiocb *iocb, struct iov_iter *from)
+{
+ struct file *file = iocb->ki_filp;
+ struct inode *inode = file->f_mapping->host;
+ ssize_t ret;
+ unsigned int flags;
+ unsigned long blocksize = inode->i_sb->s_blocksize;
+ loff_t offset = iocb->ki_pos;
+ loff_t count = iov_iter_count(from);
+
+
+ inode_lock(inode);
+ ret = generic_write_checks(iocb, from);
+ if (ret <= 0)
+ goto out_unlock;
+ ret = file_remove_privs(file);
+ if (ret)
+ goto out_unlock;
+ ret = file_update_time(file);
+ if (ret)
+ goto out_unlock;
+
+ /*
+ * We pass IOMAP_DIO_NOSYNC because otherwise iomap_dio_rw()
+ * calls generic_write_sync() in iomap_dio_complete().
+ * Since ext2_fsync() must be called without the inode lock held,
+ * we pass IOMAP_DIO_NOSYNC and handle generic_write_sync()
+ * ourselves.
+ */
+ flags = IOMAP_DIO_NOSYNC;
+
+ /* use IOMAP_DIO_FORCE_WAIT for unaligned or extending writes */
+ if (iocb->ki_pos + iov_iter_count(from) > i_size_read(inode) ||
+ (!IS_ALIGNED(iocb->ki_pos | iov_iter_alignment(from), blocksize)))
+ flags |= IOMAP_DIO_FORCE_WAIT;
+
+ ret = iomap_dio_rw(iocb, from, &ext2_iomap_ops, &ext2_dio_write_ops,
+ flags, NULL, 0);
+
+ if (ret == -ENOTBLK)
+ ret = 0;
+
+ if (ret < 0 && ret != -EIOCBQUEUED)
+ ext2_write_failed(inode->i_mapping, offset + count);
+
+ /* handle case for partial write or fallback to buffered write */
+ if (ret >= 0 && iov_iter_count(from)) {
+ loff_t pos, endbyte;
+ ssize_t status;
+ ssize_t ret2;
+
+ pos = iocb->ki_pos;
+ status = generic_perform_write(iocb, from);
+ if (unlikely(status < 0)) {
+ ret = status;
+ goto out_unlock;
+ }
+ endbyte = pos + status - 1;
+ ret2 = filemap_write_and_wait_range(inode->i_mapping, pos,
+ endbyte);
+ if (ret2 == 0) {
+ iocb->ki_pos = endbyte + 1;
+ ret += status;
+ invalidate_mapping_pages(inode->i_mapping,
+ pos >> PAGE_SHIFT,
+ endbyte >> PAGE_SHIFT);
+ }
+ }
+out_unlock:
+ inode_unlock(inode);
+ if (ret > 0)
+ ret = generic_write_sync(iocb, ret);
+ return ret;
+}
+
static ssize_t ext2_file_read_iter(struct kiocb *iocb, struct iov_iter *to)
{
#ifdef CONFIG_FS_DAX
if (IS_DAX(iocb->ki_filp->f_mapping->host))
return ext2_dax_read_iter(iocb, to);
#endif
+ if (iocb->ki_flags & IOCB_DIRECT)
+ return ext2_dio_read_iter(iocb, to);
+
return generic_file_read_iter(iocb, to);
}
@@ -176,6 +287,9 @@ static ssize_t ext2_file_write_iter(struct kiocb *iocb, struct iov_iter *from)
if (IS_DAX(iocb->ki_filp->f_mapping->host))
return ext2_dax_write_iter(iocb, from);
#endif
+ if (iocb->ki_flags & IOCB_DIRECT)
+ return ext2_dio_write_iter(iocb, from);
+
return generic_file_write_iter(iocb, from);
}
diff --git a/fs/ext2/inode.c b/fs/ext2/inode.c
index 26f135e7ffce..7ff669d0b6d2 100644
--- a/fs/ext2/inode.c
+++ b/fs/ext2/inode.c
@@ -56,7 +56,7 @@ static inline int ext2_inode_is_fast_symlink(struct inode *inode)
static void ext2_truncate_blocks(struct inode *inode, loff_t offset);
-static void ext2_write_failed(struct address_space *mapping, loff_t to)
+void ext2_write_failed(struct address_space *mapping, loff_t to)
{
struct inode *inode = mapping->host;
@@ -908,22 +908,6 @@ static sector_t ext2_bmap(struct address_space *mapping, sector_t block)
return generic_block_bmap(mapping,block,ext2_get_block);
}
-static ssize_t
-ext2_direct_IO(struct kiocb *iocb, struct iov_iter *iter)
-{
- struct file *file = iocb->ki_filp;
- struct address_space *mapping = file->f_mapping;
- struct inode *inode = mapping->host;
- size_t count = iov_iter_count(iter);
- loff_t offset = iocb->ki_pos;
- ssize_t ret;
-
- ret = blockdev_direct_IO(iocb, inode, iter, ext2_get_block);
- if (ret < 0 && iov_iter_rw(iter) == WRITE)
- ext2_write_failed(mapping, offset + count);
- return ret;
-}
-
static int
ext2_writepages(struct address_space *mapping, struct writeback_control *wbc)
{
@@ -946,7 +930,7 @@ const struct address_space_operations ext2_aops = {
.write_begin = ext2_write_begin,
.write_end = ext2_write_end,
.bmap = ext2_bmap,
- .direct_IO = ext2_direct_IO,
+ .direct_IO = noop_direct_IO,
.writepages = ext2_writepages,
.migrate_folio = buffer_migrate_folio,
.is_partially_uptodate = block_is_partially_uptodate,
--
2.39.2