From: Jan Kara <jack@suse.cz>
To: Theodore Ts'o <tytso@mit.edu>
Cc: linux-fsdevel@vger.kernel.org, Jan Kara <jack@suse.cz>,
Michal Hocko <mhocko@kernel.org>,
linux-mm@kvack.org
Subject: Re: [RFC PATCH] mm: retry writepages() on ENOMEM when doing an data integrity writeback
Date: Wed, 15 Mar 2017 12:59:33 +0100
Message-ID: <20170315115933.GF12989@quack2.suse.cz>
In-Reply-To: <20170315050743.5539-1-tytso@mit.edu>
On Wed 15-03-17 01:07:43, Ted Tso wrote:
> Currently, a file system's writepages() function must not fail with
> ENOMEM, since if it does, it's possible for buffered data to be lost.
> This is because on a data integrity writeback writepages() gets called
> but once, and if it returns ENOMEM and you're lucky the error will get
> reflected back to the userspace process calling fsync() --- at which
> point the application may or may not be properly checking error codes.
> If you aren't lucky, the user is unmounting the file system, and the
> dirty pages will simply be lost.
>
> For this reason, file system code generally will use GFP_NOFS, and in
> some cases, will retry the allocation in a loop, on the theory that
> "kernel livelocks are temporary; data loss is forever".
> Unfortunately, this can indeed cause livelocks, since inside the
> writepages() call, the file system is holding various mutexes, and
> these mutexes may prevent the OOM killer from killing its targeted
> victim if it is also holding on to those mutexes.
>
> A better solution would be to allow writepages() to call the memory
> allocator with flags that give greater latitude to the allocator to
> fail, and then release its locks and return ENOMEM, and in the case of
> background writeback, the writes can be retried at a later time. In
> the case of data-integrity writeback, the call is retried after
> waiting a brief amount of time.
>
> Signed-off-by: Theodore Ts'o <tytso@mit.edu>
> ---
>
> As we had discussed in an e-mail thread last week, I'm interested in
> allowing ext4_writepages() to return ENOMEM without causing dirty
> pages from buffered writes getting lost. It looks like doing so
> should be fairly straightforward. What do folks think?
Makes sense to me. One comment below:
> +	while (1) {
> +		if (mapping->a_ops->writepages)
> +			ret = mapping->a_ops->writepages(mapping, wbc);
> +		else
> +			ret = generic_writepages(mapping, wbc);
> +		if ((ret != ENOMEM) || (wbc->sync_mode != WB_SYNC_ALL))

This should be -ENOMEM, I guess. writepages() returns negative errno
values, so the comparison against positive ENOMEM never matches and the
loop would always break without retrying.

> +			break;
> +		cond_resched();
> +		congestion_wait(BLK_RW_ASYNC, HZ/50);
> +	}
> 	return ret;
> }
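
To illustrate the sign issue, here is a minimal userspace sketch, not
kernel code. fake_writepages(), do_writepages_sketch(), the sync_mode
enum and failures_left are all hypothetical names made up for this
example; only the shape of the retry loop follows the quoted patch, and
the wait between attempts is left out.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-ins for WB_SYNC_NONE / WB_SYNC_ALL. */
enum sync_mode { SYNC_NONE, SYNC_ALL };

/* How many more times the fake writepages() should fail. */
static int failures_left;

/*
 * Fake writepages(): returns a negative errno, kernel style, for the
 * first few calls and then succeeds.
 */
static int fake_writepages(void)
{
	if (failures_left > 0) {
		failures_left--;
		return -ENOMEM;
	}
	return 0;
}

/*
 * Same loop shape as the quoted patch. 'err' is the value that ret is
 * compared against, so the caller can try both ENOMEM and -ENOMEM.
 * Returns the number of times fake_writepages() was called and stores
 * the final return value in *ret_out.
 */
static int do_writepages_sketch(enum sync_mode mode, int err, int *ret_out)
{
	int calls = 0;
	int ret;

	assert(ret_out != NULL);

	while (1) {
		ret = fake_writepages();
		calls++;
		if ((ret != err) || (mode != SYNC_ALL))
			break;
		/* The patch waits here before the next attempt. */
	}
	*ret_out = ret;
	return calls;
}
```

With failures_left set to 3 and SYNC_ALL, comparing against ENOMEM
makes one call and hands -ENOMEM back to the caller, while comparing
against -ENOMEM keeps retrying and returns 0 on the fourth call. With
SYNC_NONE the loop exits after the first call in either case.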
Honza
--
Jan Kara <jack@suse.com>
SUSE Labs, CR