From: Amir Goldstein <amir73il@gmail.com>
To: "Darrick J. Wong" <darrick.wong@oracle.com>
Cc: Dave Chinner <david@fromorbit.com>,
Eric Sandeen <sandeen@redhat.com>,
Linux NFS Mailing List <linux-nfs@vger.kernel.org>,
linux-cifs@vger.kernel.org,
overlayfs <linux-unionfs@vger.kernel.org>,
linux-xfs <linux-xfs@vger.kernel.org>,
Linux MM <linux-mm@kvack.org>,
Linux Btrfs <linux-btrfs@vger.kernel.org>,
linux-fsdevel <linux-fsdevel@vger.kernel.org>,
ocfs2-devel@oss.oracle.com
Subject: Re: [PATCH 17/25] vfs: make remapping to source file eof more explicit
Date: Wed, 10 Oct 2018 15:29:06 +0300 [thread overview]
Message-ID: <CAOQ4uxivwLR5assf0VwHdp5p06Er4w7urB637Z3wiQ1eZoT9tQ@mail.gmail.com> (raw)
In-Reply-To: <153913043746.32295.17463515265798256890.stgit@magnolia>
On Wed, Oct 10, 2018 at 3:14 AM Darrick J. Wong <darrick.wong@oracle.com> wrote:
>
> From: Darrick J. Wong <darrick.wong@oracle.com>
>
> Create a RFR_TO_SRC_EOF flag to explicitly declare that the caller wants
> the remap implementation to remap to the end of the source file, once
> the files are locked.
>
> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
> ---
> fs/ioctl.c | 3 ++-
> fs/nfsd/vfs.c | 3 ++-
> fs/read_write.c | 13 +++++++------
> include/linux/fs.h | 2 ++
> 4 files changed, 13 insertions(+), 8 deletions(-)
>
>
> diff --git a/fs/ioctl.c b/fs/ioctl.c
> index 505275ec5596..7fec997abd2f 100644
> --- a/fs/ioctl.c
> +++ b/fs/ioctl.c
> @@ -224,6 +224,7 @@ static long ioctl_file_clone(struct file *dst_file, unsigned long srcfd,
> {
> struct fd src_file = fdget(srcfd);
> loff_t cloned;
> + unsigned int flags = olen == 0 ? RFR_TO_SRC_EOF : 0;
> int ret;
>
> if (!src_file.file)
> @@ -232,7 +233,7 @@ static long ioctl_file_clone(struct file *dst_file, unsigned long srcfd,
> if (src_file.file->f_path.mnt != dst_file->f_path.mnt)
> goto fdput;
> cloned = vfs_clone_file_range(src_file.file, off, dst_file, destoff,
> - olen, 0);
> + olen, flags);
> if (cloned < 0)
> ret = cloned;
> else if (olen && cloned != olen)
> diff --git a/fs/nfsd/vfs.c b/fs/nfsd/vfs.c
> index 726fc5b2b27a..d1f2ae08adf6 100644
> --- a/fs/nfsd/vfs.c
> +++ b/fs/nfsd/vfs.c
> @@ -542,8 +542,9 @@ __be32 nfsd4_clone_file_range(struct file *src, u64 src_pos, struct file *dst,
> u64 dst_pos, u64 count)
> {
> loff_t cloned;
> + unsigned int flags = count == 0 ? RFR_TO_SRC_EOF : 0;
>
> - cloned = vfs_clone_file_range(src, src_pos, dst, dst_pos, count, 0);
> + cloned = vfs_clone_file_range(src, src_pos, dst, dst_pos, count, flags);
> if (count && cloned != count)
> cloned = -EINVAL;
> return nfserrno(cloned < 0 ? cloned : 0);
> diff --git a/fs/read_write.c b/fs/read_write.c
> index 479eb810c8e6..a628fd9a47cf 100644
> --- a/fs/read_write.c
> +++ b/fs/read_write.c
> @@ -1748,11 +1748,12 @@ int generic_remap_file_range_prep(struct file *file_in, loff_t pos_in,
>
> isize = i_size_read(inode_in);
>
> - /* Zero length dedupe exits immediately; reflink goes to EOF. */
> - if (*len == 0) {
> - if (is_dedupe || pos_in == isize)
> - return 0;
> - if (pos_in > isize)
> + /*
> + * If the caller asked to go all the way to the end of the source file,
> + * set *len now that we have the file locked.
> + */
> + if (remap_flags & RFR_TO_SRC_EOF) {
> + if (pos_in >= isize)
> return -EINVAL;
> *len = isize - pos_in;
> }
> @@ -1828,7 +1829,7 @@ loff_t do_clone_file_range(struct file *file_in, loff_t pos_in,
> struct inode *inode_out = file_inode(file_out);
> loff_t ret;
>
> - WARN_ON_ONCE(remap_flags);
> + WARN_ON_ONCE(remap_flags & ~(RFR_TO_SRC_EOF));
>
> if (S_ISDIR(inode_in->i_mode) || S_ISDIR(inode_out->i_mode))
> return -EISDIR;
> diff --git a/include/linux/fs.h b/include/linux/fs.h
> index 5c1bf1c35bc6..9f90dcd4df3b 100644
> --- a/include/linux/fs.h
> +++ b/include/linux/fs.h
> @@ -1725,8 +1725,10 @@ struct block_device_operations;
> * These flags control the behavior of the remap_file_range function pointer.
> *
> * RFR_IDENTICAL_DATA: only remap if contents identical (i.e. deduplicate)
<bikeshedding> not that I care so much, but is there any reason you chose
to use _IDENTICAL_ vs. _SAME_? The latter is shorter and already engraved
in the dedup uapi. </bikeshedding>
> + * RFR_TO_SRC_EOF: remap to the end of the source file
> */
> #define RFR_IDENTICAL_DATA (1 << 0)
> +#define RFR_TO_SRC_EOF (1 << 1)
>
So what is the best way to make sure that all filesystems can
properly handle this flag? and the RFR_CAN_SHORTEN flag?
The way that your patches took is to not check for invalid flags
at all in filesystems, but I don't think that is a viable option.
Another way would be to individually add those flags to invalid
flags check in all relevant filesystems.
Another way would be to follow a pattern similar to
fiemap_check_flags(), except in case filesystem does not declare
to support the RFR_ "advisory" flags, it will not fail the operation.
Comparing to FIEMAP_ flags, no filesystem would have needed to declare
support for FIEMAP_FLAG_SYNC, because vfs dealt with it anyway
before calling into the filesystem. So de-facto, any filesystem supports
FIEMAP_FLAG_SYNC without doing anything, but it is still worth passing
the flag into filesystem in case it matter (it does for overlayfs).
I am not sure if giving fiemap_check_flags() as an example for sorting
out VFS API flags is a good idea, because I completely don't understand
what is going on in ext4_fiemap(). Why do FIEMAP_FLAG_CACHE and
the case of !EXT4_INODE_EXTENTS bypass fiemap_check_flags()?
Is there a rational behind all this or just plain old API bit rot?
If bit rot, then perhaps the fiemap_check_flags() wasn't a good enough
coding pattern?
Thanks,
Amir.
next prev parent reply other threads:[~2018-10-10 12:29 UTC|newest]
Thread overview: 47+ messages / expand[flat|nested] mbox.gz Atom feed top
2018-10-10 0:10 [PATCH v2 00/25] fs: fixes for serious clone/dedupe problems Darrick J. Wong
2018-10-10 0:10 ` [PATCH 01/25] xfs: add a per-xfs trace_printk macro Darrick J. Wong
2018-10-10 0:36 ` Dave Chinner
2018-10-10 15:00 ` [PATCH v2 " Darrick J. Wong
2018-10-10 0:10 ` [PATCH 02/25] xfs: refactor clonerange preparation into a separate helper Darrick J. Wong
2018-10-10 0:10 ` [PATCH 03/25] xfs: zero posteof blocks when cloning above eof Darrick J. Wong
2018-10-10 0:11 ` [PATCH 04/25] xfs: update ctime and remove suid before cloning files Darrick J. Wong
2018-10-10 0:11 ` [PATCH 05/25] vfs: check file ranges " Darrick J. Wong
2018-10-10 23:06 ` Dave Chinner
2018-10-10 23:13 ` Darrick J. Wong
2018-10-10 0:11 ` [PATCH 06/25] vfs: strengthen checking of file range inputs to generic_remap_checks Darrick J. Wong
2018-10-10 5:23 ` Amir Goldstein
2018-10-10 17:01 ` Darrick J. Wong
2018-10-10 17:26 ` Amir Goldstein
2018-10-10 0:11 ` [PATCH 07/25] vfs: skip zero-length dedupe requests Darrick J. Wong
2018-10-10 0:11 ` [PATCH 08/25] vfs: combine the clone and dedupe into a single remap_file_range Darrick J. Wong
2018-10-10 5:54 ` Amir Goldstein
2018-10-10 15:13 ` Darrick J. Wong
2018-10-10 15:23 ` Amir Goldstein
2018-10-10 0:11 ` [PATCH 09/25] vfs: rename vfs_clone_file_prep to be more descriptive Darrick J. Wong
2018-10-10 0:11 ` [PATCH 10/25] vfs: rename clone_verify_area to remap_verify_area Darrick J. Wong
2018-10-10 0:13 ` [PATCH 11/25] vfs: create generic_remap_file_range_touch to update inode metadata Darrick J. Wong
2018-10-10 0:13 ` [PATCH 12/25] vfs: pass remap flags to generic_remap_file_range_prep Darrick J. Wong
2018-10-10 0:13 ` [PATCH 13/25] vfs: pass remap flags to generic_remap_checks Darrick J. Wong
2018-10-10 0:13 ` [PATCH 14/25] vfs: make remap_file_range functions take and return bytes completed Darrick J. Wong
2018-10-10 6:47 ` Amir Goldstein
2018-10-10 15:50 ` Darrick J. Wong
2018-10-10 18:28 ` Amir Goldstein
2018-10-10 18:32 ` Darrick J. Wong
2018-10-10 0:13 ` [PATCH 15/25] vfs: plumb RFR_* remap flags through the vfs clone functions Darrick J. Wong
2018-10-10 6:22 ` Amir Goldstein
2018-10-10 6:39 ` Amir Goldstein
2018-10-10 0:13 ` [PATCH 16/25] vfs: plumb RFR_* remap flags through the vfs dedupe functions Darrick J. Wong
2018-10-10 0:13 ` [PATCH 17/25] vfs: make remapping to source file eof more explicit Darrick J. Wong
2018-10-10 12:29 ` Amir Goldstein [this message]
2018-10-10 16:29 ` Darrick J. Wong
2018-10-10 17:31 ` Amir Goldstein
2018-10-10 0:14 ` [PATCH 18/25] vfs: enable remap callers that can handle short operations Darrick J. Wong
2018-10-10 0:14 ` [PATCH 19/25] vfs: hide file range comparison function Darrick J. Wong
2018-10-10 0:14 ` [PATCH 20/25] vfs: implement opportunistic short dedupe Darrick J. Wong
2018-10-10 0:14 ` [PATCH 21/25] ocfs2: truncate page cache for clone destination file before remapping Darrick J. Wong
2018-10-10 0:14 ` [PATCH 22/25] ocfs2: fix pagecache truncation prior to reflink Darrick J. Wong
2018-10-10 0:14 ` [PATCH 23/25] ocfs2: support partial clone range and dedupe range Darrick J. Wong
2018-10-10 0:14 ` [PATCH 24/25] xfs: fix pagecache truncation prior to reflink Darrick J. Wong
2018-10-10 0:14 ` [PATCH 25/25] xfs: support returning partial reflink results Darrick J. Wong
2018-10-10 1:02 ` [PATCH v2 00/25] fs: fixes for serious clone/dedupe problems Dave Chinner
2018-10-10 1:06 ` Darrick J. Wong
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=CAOQ4uxivwLR5assf0VwHdp5p06Er4w7urB637Z3wiQ1eZoT9tQ@mail.gmail.com \
--to=amir73il@gmail.com \
--cc=darrick.wong@oracle.com \
--cc=david@fromorbit.com \
--cc=linux-btrfs@vger.kernel.org \
--cc=linux-cifs@vger.kernel.org \
--cc=linux-fsdevel@vger.kernel.org \
--cc=linux-mm@kvack.org \
--cc=linux-nfs@vger.kernel.org \
--cc=linux-unionfs@vger.kernel.org \
--cc=linux-xfs@vger.kernel.org \
--cc=ocfs2-devel@oss.oracle.com \
--cc=sandeen@redhat.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox