Re: [PATCH v5 1/10] fs: Add new flag(FALLOC_FL_COLLAPSE_RANGE) for fallocate

linux-mm.kvack.org archive mirror
 help / color / mirror / Atom feed

From: Hugh Dickins <hughd@google.com>
To: Dave Chinner <david@fromorbit.com>
Cc: Hugh Dickins <hughd@google.com>,
	Andrew Morton <akpm@linux-foundation.org>,
	Theodore Ts'o <tytso@mit.edu>, Namjae Jeon <linkinjeon@gmail.com>,
	viro@zeniv.linux.org.uk, bpm@sgi.com, adilger.kernel@dilger.ca,
	jack@suse.cz, mtk.manpages@gmail.com, lczerner@redhat.com,
	linux-fsdevel@vger.kernel.org, xfs@oss.sgi.com,
	linux-ext4@vger.kernel.org, linux-kernel@vger.kernel.org,
	linux-mm@kvack.org, Namjae Jeon <namjae.jeon@samsung.com>,
	Ashish Sangwan <a.sangwan@samsung.com>
Subject: Re: [PATCH v5 1/10] fs: Add new flag(FALLOC_FL_COLLAPSE_RANGE) for fallocate
Date: Wed, 26 Feb 2014 15:48:07 -0800 (PST)	[thread overview]
Message-ID: <alpine.LSU.2.11.1402261511270.2880@eggly.anvils> (raw)
In-Reply-To: <20140226100439.GV13647@dastard>

On Wed, 26 Feb 2014, Dave Chinner wrote:
> On Tue, Feb 25, 2014 at 09:25:40PM -0800, Hugh Dickins wrote:
> 
> > But I wasn't really thinking of the offset > i_size case, just the
> > offset + len >= i_size case: which would end with i_size at offset,
> > and the areas you're worried about still beyond EOF - or am I confused?
> 
> Right, offset beyond EOF is just plain daft. But you're not thinking
> of the entire picture. What happens when a system crashes half way
> through a collapse operation? On tmpfs you don't care - everything
> is lost, but on real filesystems we have to care about. 
> 
> offset + len beyond EOF is just truncate(offset).
> 
> From the point of view of an application offloading a data movement
> operation via collapse range, any range that overlaps EOF is wrong -
> data beyond EOF is not accessible and is not available for the
> application to move. Hence EINVAL - it's an invalid set of
> parameters.
> 
> If we do allow it and implement it by block shifting (which,
> technically, is the correct interpretation of the collapse range
> behaviour because it preserves preallocation beyond
> the collapsed region beyond EOF), then we have
> thr problem of moving data blocks below EOF by extent shifting
> before we change the EOF. That exposes blocks of undefined content
> to the user if we crash and recover up to that point of the
> operation. It's just plain dangerous, and if we allow this
> possibility via the API, someone is going to make that mistake in a
> filesystem because it's difficult to test and hard to get right.

Again, I would have thought that this is a problem you are already
having to solve in the case when offset + len is below EOF, with
blocks of undefined content preallocated beyond EOF.

But I don't know xfs, you do: so I accept there may be subtle reasons
why the offset + len below EOF case is easier for you to handle - and
please don't spend your time trying to hammer those into my head!

I think I've been somewhat unreasonable: I insisted in the other
thread that "Collapse is significantly more challenging than either
hole-punch or truncation", so I should give you a break, not demand
that you provide a perfect smooth implementation in all circumstances.

None of our filesystems were designed with this operation in mind,
each may have its own sound reasons to reject those peculiare cases
which would pose more trouble and risk than they are worth.

Whether that should be enforced at the VFS level is anther matter:
if it turns out that the xfs and ext4 limitations match up, okay.

I think we have different preferences, for whether to return error
or success, when there is nothing to be done; but I notice now that
fallocate fails on len 0, so you are being consistent with that.

Reject "offset + len >= i_size" or "offset + len > i_size"?

Hugh

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

     prev parent reply	other threads:[~2014-02-26 23:49 UTC|newest]

Thread overview: 5+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
     [not found] <1392741464-20029-1-git-send-email-linkinjeon@gmail.com>
     [not found] ` <20140222140625.GD26637@thunk.org>
     [not found]   ` <20140223213606.GE4317@dastard>
2014-02-25 23:41     ` Hugh Dickins
2014-02-26  1:57       ` Dave Chinner
2014-02-26  5:25         ` Hugh Dickins
2014-02-26 10:04           ` Dave Chinner
2014-02-26 23:48             ` Hugh Dickins [this message]

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=alpine.LSU.2.11.1402261511270.2880@eggly.anvils \
    --to=hughd@google.com \
    --cc=a.sangwan@samsung.com \
    --cc=adilger.kernel@dilger.ca \
    --cc=akpm@linux-foundation.org \
    --cc=bpm@sgi.com \
    --cc=david@fromorbit.com \
    --cc=jack@suse.cz \
    --cc=lczerner@redhat.com \
    --cc=linkinjeon@gmail.com \
    --cc=linux-ext4@vger.kernel.org \
    --cc=linux-fsdevel@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-mm@kvack.org \
    --cc=mtk.manpages@gmail.com \
    --cc=namjae.jeon@samsung.com \
    --cc=tytso@mit.edu \
    --cc=viro@zeniv.linux.org.uk \
    --cc=xfs@oss.sgi.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Be sure your reply has a Subject: header at the top and a blank line before the message body.

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox