linux-mm.kvack.org archive mirror
From: Theodore Ts'o <tytso@mit.edu>
To: NeilBrown <neilb@suse.com>
Cc: Jan Kara <jack@suse.cz>,
	Trond Myklebust <trondmy@primarydata.com>,
	"kwolf@redhat.com" <kwolf@redhat.com>,
	"riel@redhat.com" <riel@redhat.com>,
	"hch@infradead.org" <hch@infradead.org>,
	"linux-mm@kvack.org" <linux-mm@kvack.org>,
	"linux-fsdevel@vger.kernel.org" <linux-fsdevel@vger.kernel.org>,
	"jlayton@poochiereds.net" <jlayton@poochiereds.net>,
	"lsf-pc@lists.linux-foundation.org"
	<lsf-pc@lists.linux-foundation.org>,
	"rwheeler@redhat.com" <rwheeler@redhat.com>
Subject: Re: [Lsf-pc] [LSF/MM TOPIC] I/O error handling and fsync()
Date: Thu, 26 Jan 2017 22:23:18 -0500
Message-ID: <20170127032318.rkdiwu6nog3nifdo@thunk.org>
In-Reply-To: <87r33ptqg1.fsf@notabene.neil.brown.name>

On Fri, Jan 27, 2017 at 09:19:10AM +1100, NeilBrown wrote:
> I don't think it has.
> The original topic was about graceful handling of recoverable IO errors.
> The question was framed as about retrying fsync() if it reported an
> error, but this was based on a misunderstanding.  fsync() doesn't report
> an error for recoverable errors.  It hangs.
> So the original topic is really about gracefully handling IO operations
> which currently can hang indefinitely.

Well, the problem is that it is up to the device driver to decide whether
an error is recoverable or not.  This might include waiting X minutes,
and then deciding that the fibre channel connection isn't coming back,
and then turning it into an unrecoverable error.  Or for other
devices, the timeout might be much smaller.

Which is fine --- I think that's where the decision ought to live, and
if users want to tune a different timeout before the driver stops
waiting, that should be between the system administrator and the
device driver /sys tuning knob.
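
As an illustration of such a knob, here is a minimal sketch in C that
reads and writes a per-device timeout attribute.  It assumes a SCSI disk,
where /sys/block/<dev>/device/timeout holds the command timeout in
seconds; other drivers and transports expose different attributes, and
the helper names read_timeout_attr() and write_timeout_attr() are
hypothetical, not an existing API.

```c
#include <stdio.h>

/*
 * Read a decimal integer attribute, for example
 * /sys/block/sda/device/timeout (SCSI command timeout, in seconds).
 * The path is an assumption; other drivers use other attributes.
 * Returns 0 on success, -1 on failure.
 */
static int read_timeout_attr(const char *path, long *seconds)
{
    FILE *f = fopen(path, "r");
    int ok;

    if (!f)
        return -1;
    ok = (fscanf(f, "%ld", seconds) == 1);
    fclose(f);
    return ok ? 0 : -1;
}

/*
 * Write a new value to the attribute.  On a real sysfs file this
 * normally requires root.  Returns 0 on success, -1 on failure.
 */
static int write_timeout_attr(const char *path, long seconds)
{
    FILE *f = fopen(path, "w");
    int ok;

    if (!f)
        return -1;
    ok = (fprintf(f, "%ld\n", seconds) > 0);
    /* The buffered data is flushed by fclose(), so a write error
     * (such as a rejected value) shows up in its return value. */
    if (fclose(f) != 0)
        ok = 0;
    return ok ? 0 : -1;
}
```

An administrator's tool could call read_timeout_attr() to report the
current value and write_timeout_attr() to shorten or lengthen it; the
policy itself stays in the driver.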

> >> When combined with O_DIRECT, it effectively means "no retries".  For
> >> block devices and files backed by block devices,
> >> REQ_FAILFAST_DEV|REQ_FAILFAST_TRANSPORT is used and a failure will be
> >> reported as EWOULDBLOCK, unless it is obvious that retrying wouldn't
> >> help.

Absolutely no retries?  Even TCP retries in the case of iSCSI?  I
don't think turning every TCP packet drop into EWOULDBLOCK would make
sense under any circumstances.  What might make sense is to have a
"short timeout" where it's up to the block device to decide what
"short timeout" means.

EWOULDBLOCK is also a little misleading, because even if the I/O
request is submitted immediately to the block device and immediately
serviced and returned, the I/O request would still be "blocking".
Maybe ETIMEDOUT instead?
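
To make the distinction concrete, here is a sketch in C of how a caller
might tell the two codes apart.  The semantics are only the ones floated
in this thread, not existing kernel behaviour, and the enum and the
function classify_io_error() are hypothetical names used for
illustration.

```c
#include <errno.h>

/* Hypothetical caller-side policy; not an existing kernel or libc API. */
enum io_next_step {
    IO_RETRY_LATER,       /* request would have blocked: resubmit */
    IO_DEVICE_TIMED_OUT,  /* driver's short timeout expired */
    IO_HARD_ERROR         /* anything else: treat as unrecoverable */
};

static enum io_next_step classify_io_error(int err)
{
    switch (err) {
    case EAGAIN:
#if EWOULDBLOCK != EAGAIN
    case EWOULDBLOCK:     /* same value as EAGAIN on Linux */
#endif
        return IO_RETRY_LATER;
    case ETIMEDOUT:
        return IO_DEVICE_TIMED_OUT;
    default:
        return IO_HARD_ERROR;
    }
}
```

With a separate ETIMEDOUT, a caller can tell "the device gave up after
its own short timeout" from "this call could not proceed without
blocking", which a single EWOULDBLOCK would conflate.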

> And aio_write() isn't non-blocking for O_DIRECT already because .... oh,
> it doesn't even try.  Is there something intrinsically hard about async
> O_DIRECT writes, or is it just that no-one has written acceptable code
> yet?

AIO/DIO writes can indeed be non-blocking, if the file system doesn't
need to do any metadata operations.  So if the file is preallocated,
you should be able to issue an async DIO write without losing the CPU.
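
A minimal sketch in C of that pattern, using the raw Linux AIO system
calls rather than libaio: preallocate the file, then submit one
O_DIRECT write asynchronously.  The function name, the 4096-byte block
size, and the single-request queue are assumptions made for the
example.  How "preallocated" a file must be for submission not to block
depends on the file system, and O_DIRECT is not supported everywhere,
so the function simply reports failure in that case.

```c
#define _GNU_SOURCE             /* for O_DIRECT; must precede all includes */
#include <fcntl.h>
#include <linux/aio_abi.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>
#include <sys/syscall.h>
#include <unistd.h>

#define BLK 4096                /* assumed alignment and I/O size */

/* Returns the number of bytes written (BLK) on success, -1 on failure. */
static long prealloc_and_aio_dio_write(const char *path)
{
    long ret = -1;
    void *buf = NULL;
    aio_context_t ctx = 0;
    struct iocb cb, *cbs[1] = { &cb };
    struct io_event ev;
    int fd = open(path, O_CREAT | O_WRONLY | O_DIRECT, 0600);

    if (fd < 0)
        return -1;

    /* Reserve the space up front so the write itself should not
     * need to allocate blocks. */
    if (posix_fallocate(fd, 0, BLK) != 0)
        goto out;
    /* O_DIRECT requires an aligned buffer, length and offset. */
    if (posix_memalign(&buf, BLK, BLK) != 0)
        goto out;
    memset(buf, 'x', BLK);

    if (syscall(SYS_io_setup, 1, &ctx) < 0)
        goto out;

    memset(&cb, 0, sizeof(cb));
    cb.aio_fildes = fd;
    cb.aio_lio_opcode = IOCB_CMD_PWRITE;
    cb.aio_buf = (uint64_t)(uintptr_t)buf;
    cb.aio_nbytes = BLK;
    cb.aio_offset = 0;

    /* Submit, then wait for the single completion event. */
    if (syscall(SYS_io_submit, ctx, 1, cbs) == 1 &&
        syscall(SYS_io_getevents, ctx, 1, 1, &ev, NULL) == 1)
        ret = (long)ev.res;     /* bytes written, or negative errno */

    syscall(SYS_io_destroy, ctx);
out:
    free(buf);
    close(fd);
    return ret;
}
```

A real application would do other work between io_submit and
io_getevents instead of waiting immediately; the sketch waits only to
keep the example short.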

> A truly async O_DIRECT aio_write() combined with a working io_cancel()
> would probably be sufficient.  The block layer doesn't provide any way
> to cancel a bio though, so that would need to be wired up.

Kent Overstreet worked up io_cancel for AIO/DIO writes when he was at
Google.  As I recall the patchset did get posted a few times, but it
never ended up getting accepted for upstream adoption.

We even had some very rough code that would propagate the cancellation
request to the hard drive, for those hard drives that had a facility
for accepting a cancellation request for an I/O which was queued via
NCQ but which hadn't executed yet.  It sort-of worked, but it never
hit a state where it could be published before the project was
abandoned.

						- Ted


