linux-mm.kvack.org archive mirror
From: Hugh Dickins <hughd@google.com>
To: Miklos Szeredi <miklos@szeredi.hu>
Cc: hch@infradead.org, akpm@linux-foundation.org,
	gurudas.pai@oracle.com, lkml20101129@newton.leun.net,
	linux-kernel@vger.kernel.org, linux-mm@kvack.org
Subject: Re: [PATCH] mm: prevent concurrent unmap_mapping_range() on the same inode
Date: Wed, 26 Jan 2011 20:19:15 -0800
Message-ID: <AANLkTimBR=CuMpWE2juJG2jsLsTqK=tc00sRrEjhkHg=@mail.gmail.com>
In-Reply-To: <E1PhSO8-0005yN-Dp@pomaz-ex.szeredi.hu>

On Mon, Jan 24, 2011 at 11:47 AM, Miklos Szeredi <miklos@szeredi.hu> wrote:
> On Fri, 21 Jan 2011, Hugh Dickins wrote:
>> On Thu, 20 Jan 2011, Miklos Szeredi wrote:
>> > On Thu, 20 Jan 2011, Christoph Hellwig wrote:
>> > > On Thu, Jan 20, 2011 at 01:30:58PM +0100, Miklos Szeredi wrote:
>> > > >
>> > > > Truncate and hole punching already serialize with i_mutex.  Other
>> > > > callers of unmap_mapping_range() do not, and it's difficult to get
>> > > > i_mutex protection for all callers.  In particular ->d_revalidate(),
>> > > > which calls invalidate_inode_pages2_range() in fuse, may be called
>> > > > with or without i_mutex.
>> > >
>> > >
>> > > Which I think is mostly a fuse problem.  I really hate bloating the
>> > > generic inode (into which the address_space is embedded) with another
>> > > mutex for deficits in rather special case filesystems.
>> >
>> > As Hugh pointed out unmap_mapping_range() has grown a varied set of
>> > callers, which are difficult to fix up wrt i_mutex.  Fuse was just an
>> > example.
>> >
>> > I don't like the bloat either, but this is the best I could come up
>> > with for fixing this problem generally.  If you have a better idea,
>> > please share it.
>>
>> If we start from the point that this is mostly a fuse problem (I expect
>> that a thorough audit will show up a few other filesystems too, but
>> let's start from this point): you cite ->d_revalidate as a particular
>> problem, but can we fix up its call sites so that it is always called
>> either with, or much preferably without, i_mutex held?  Though actually
>> I couldn't find where ->d_revalidate() is called while holding i_mutex.
>
> lookup_one_len
> lookup_hash
>  __lookup_hash
>    do_revalidate
>     d_revalidate

Right, thanks.

>
> I don't see an easy way to get rid of i_mutex for lookup_one_len() and
> lookup_hash().
>
>> Failing that, can fuse down_write i_alloc_sem before calling
>> invalidate_inode_pages2(_range), to achieve the same exclusion?
>> The setattr truncation path takes i_alloc_sem as well as i_mutex,
>> though I'm not certain of its full coverage.
>
> Yeah, fuse could use i_alloc_sem or a private mutex, but that would
> leave the other uses of unmap_mapping_range() to sort this out for
> themselves.

I had wanted to propose that for now you modify just fuse to use
i_alloc_sem for serialization there, and I provide a patch to
unmap_mapping_range() to give safety to whatever other cases there are
(I'm now sure there are other cases, but also sure that I cannot
safely identify them all and fix them correctly at source myself -
even if I found time to do the patches, they'd need at least a release
cycle to bed in with BUG_ONs).

I've spent quite a while on it, but not succeeded: even if I could get
around the restart_addr issue, we're stuck with the deadly embrace
when two are in unmap_mapping_range(), each repeatedly yielding to the
other, each having to start over again.  Anything I came up with was
inferior to the two alternatives you have proposed: your original
wait_on_bit patch, or your current unmap_mutex patch.
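
The "each having to start over again" pattern can be shown with a toy
model.  This is a deliberately simplified sketch and not the kernel's
algorithm: the shared generation counter, struct walker, NITEMS and the
run_*() helpers are all assumptions made up for the illustration.  Each
walker handles one item per turn and then yields; if another pass has
started in the meantime, it restarts from the beginning.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define NITEMS 4	/* hypothetical number of items in one full pass */

struct walker {
	unsigned seen_gen;	/* generation recorded when this pass began */
	size_t pos;		/* number of items handled in this pass */
	bool done;
};

/* One scheduling turn: handle one item, then yield to the other walker. */
static void walker_step(struct walker *w, unsigned *shared_gen)
{
	if (w->done)
		return;
	if (w->pos == 0 || w->seen_gen != *shared_gen) {
		/* First turn, or the other walker began a pass: start over. */
		w->seen_gen = ++*shared_gen;
		w->pos = 0;
	}
	w->pos++;
	assert(w->pos <= NITEMS);
	if (w->pos == NITEMS)
		w->done = true;
}

/* Two walkers strictly alternating turns; true if both finish in time. */
static bool run_interleaved(unsigned max_turns)
{
	struct walker a = {0}, b = {0};
	unsigned gen = 0;

	for (unsigned t = 0; t < max_turns && !(a.done && b.done); t++)
		walker_step((t % 2) ? &b : &a, &gen);
	return a.done && b.done;
}

/* The same work with the walkers serialized, as under a mutex. */
static bool run_serialized(unsigned max_turns)
{
	struct walker a = {0}, b = {0};
	unsigned gen = 0, t = 0;

	for (; !a.done && t < max_turns; t++)
		walker_step(&a, &gen);
	for (; !b.done && t < max_turns; t++)
		walker_step(&b, &gen);
	return a.done && b.done;
}
```

In this model the alternating walkers never get past their first item,
however many turns they are given, while the serialized pair finishes
in 2 * NITEMS turns.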

Your wait_on_bit patch doesn't bloat (and may be attractive to
enterprise distros seeking binary compatibility), but several of us
agreed with Andrew's comments:

> I do think this was premature optimisation.  The open-coded lock is
> hidden from lockdep so we won't find out if this introduces potential
> deadlocks.  It would be better to add a new mutex at least temporarily,
> then look at replacing it with a MiklosLock later on, when the code is
> bedded in.
>
> At which time, replacing mutexes with MiklosLocks becomes part of a
> general "shrink the address_space" exercise in which there's no reason
> to exclusively concentrate on that new mutex!

It really does seem a mutex too far; but we may let Peter do away with
all that lock breaking when/if his preemptibility patches go in, and
could cut it out at that time.  I don't see a good alternative.
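
For readers unfamiliar with what "open-coded lock" means in Andrew's
comment, here is a minimal userspace sketch using C11 atomics.  It is
not the wait_on_bit patch: struct model_mapping, MODEL_UNMAP_BIT and
both helpers are hypothetical names, and a real waiter would sleep
rather than just fail the trylock.  The point is that the lock is only
a bit in a flags word, so nothing in it is visible to a lock validator
such as lockdep, whereas a real mutex would be.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

#define MODEL_UNMAP_BIT 0x1u	/* hypothetical "unmap in progress" flag */

/* Hypothetical stand-in for the flags word of an address_space. */
struct model_mapping {
	atomic_uint flags;
};

/* Atomically set the bit; succeeds only if it was previously clear. */
static bool model_bit_trylock(struct model_mapping *m)
{
	unsigned old = atomic_fetch_or_explicit(&m->flags, MODEL_UNMAP_BIT,
						memory_order_acquire);
	return !(old & MODEL_UNMAP_BIT);
}

/* Clear the bit; the caller must currently hold the bit lock. */
static void model_bit_unlock(struct model_mapping *m)
{
	unsigned old = atomic_fetch_and_explicit(&m->flags, ~MODEL_UNMAP_BIT,
						 memory_order_release);
	assert(old & MODEL_UNMAP_BIT);
	(void)old;
}
```

This costs no extra space in the structure, which is the attraction,
but any lock-ordering mistake involving it would go undetected.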

Hugh
