From: Dave Chinner <david@fromorbit.com>
To: Al Viro <viro@ZenIV.linux.org.uk>
Cc: Jan Kara <jack@suse.cz>,
    lsf-pc@lists.linux-foundation.org, linux-fsdevel@vger.kernel.org,
    linux-mm@kvack.org
Subject: Re: [TOPIC] Last iput() from flusher thread, last fput() from munmap()...
Date: Wed, 28 Mar 2012 15:45:18 +1100
Message-ID: <20120328044518.GB32741@dastard>
In-Reply-To: <20120328023852.GP6589@ZenIV.linux.org.uk>

On Wed, Mar 28, 2012 at 03:38:52AM +0100, Al Viro wrote:
> On Tue, Mar 27, 2012 at 11:08:58PM +0200, Jan Kara wrote:
> > Hello,
> >
> > maybe the name of this topic could be "How hard should be life of
> > filesystems?" but that's kind of broad topic and suggests too much of
> > bikeshedding. I'd like to concentrate on concrete possible pain points
> > between filesystems & VFS (possibly writeback or even generally MM).
> > Lately, I've myself come across the two issues in $SUBJECT:
> > 1) dropping of last file reference can happen from munmap() and in that
> > case mmap_sem will be held when ->release() is called. Even more it
> > could be held when ->evict_inode() is called to delete inode because
> > inode was unlinked.
>
> Yes, it can.
>
> > 2) since flusher thread takes inode reference when writing inode out, the
> > last inode reference can be dropped from flusher thread. Thus inode may
> > get deleted in the flusher thread context. This does not seem that
> > problematic on its own but if we realize progress of memory reclaim
> > depends (at least from a longterm perspective) on flusher thread making
> > progress, things start looking a bit uncertain. Even more so when we
> > would like to avoid ->writepage() calls from reclaim and let flusher thread
> > do the work instead. That would then require filesystems to carefully
> > design their ->evict_inode() routines so that things are not
> > deadlockable.
>
> You mean "use GFP_NOIO for allocations when holding fs-internal locks"?
>
> > Both these issues should be avoidable (we can postpone fput() after we
> > drop mmap_sem; we can tweak inode refcounting to avoid last iput() from
> > flusher thread) but obviously there's some cost in the complexity of generic
> > layer. So the question is, is it worth it?
>
> I don't think it is. ->i_mutex in ->release() is never needed; existing
> cases are racy and dropping preallocation that way is simply wrong.

The alternative to using ->release is ....?

Cheers,

Dave.
--
Dave Chinner
david@fromorbit.com