linux-mm.kvack.org archive mirror
From: Hugh Dickins <hughd@google.com>
To: Mike Kravetz <mike.kravetz@oracle.com>
Cc: Andrew Morton <akpm@linux-foundation.org>,
	Hugh Dickins <hughd@google.com>,
	linux-mm@kvack.org, linux-kernel@vger.kernel.org,
	Dave Hansen <dave.hansen@linux.intel.com>,
	Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>,
	Davidlohr Bueso <dave@stgolabs.net>
Subject: Re: [PATCH 2/3] mm/hugetlb: Setup hugetlb_falloc during fallocate hole punch
Date: Mon, 19 Oct 2015 19:22:15 -0700 (PDT)
Message-ID: <alpine.LSU.2.11.1510191852460.5432@eggly.anvils>
In-Reply-To: <56259BD0.7060307@oracle.com>

On Mon, 19 Oct 2015, Mike Kravetz wrote:
> On 10/19/2015 04:16 PM, Andrew Morton wrote:
> > On Fri, 16 Oct 2015 15:08:29 -0700 Mike Kravetz <mike.kravetz@oracle.com> wrote:
> 
> >>  		mutex_lock(&inode->i_mutex);
> >> +
> >> +		spin_lock(&inode->i_lock);
> >> +		inode->i_private = &hugetlb_falloc;
> >> +		spin_unlock(&inode->i_lock);
> > 
> > Locking around a single atomic assignment is a bit peculiar.  I can
> > kinda see that it kinda protects the logic in hugetlb_fault(), but I
> > would like to hear (in comment form) your description of how this logic
> > works?
> 
> To be honest, this code/scheme was copied from shmem as it addresses
> the same situation there.  I did not notice how strange this looks until
> you pointed it out.  At first glance, the locking does appear to be
> unnecessary.  The fault code initially checks this value outside the
> lock.  However, the fault code (on another CPU) will take the lock
> and access values within the structure.  Without the locking or some other
> type of memory barrier here, there is no guarantee that the structure
> will be initialized before setting i_private.  So, the faulting code
> could see invalid values in the structure.
> 
> Hugh, is that accurate?  You provided the shmem code.

Yes, I think that's accurate; but confess I'm replying now for the
sake of replying in a rare timely fashion, before having spent any
time looking through your hugetlbfs reimplementation of the same.

The peculiar thing in the shmem case was that the structure being
pointed to is on the kernel stack of the fallocating task (with
i_mutex guaranteeing that only one task at a time per file could be
doing this): so the faulting end has to be careful not to access
stale memory after the fallocator has retreated back up its stack.
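
[Editorial sketch, not part of the original message.] A minimal userspace
analogue of the scheme described above, assuming a toy spinlock in place
of inode->i_lock. The names fake_inode, falloc_range, punch_begin,
punch_end and fault_hits_punch are hypothetical and are not the shmem or
hugetlbfs code. It shows two points: the range is fully initialized
before the pointer is published under the lock, and the pointer is
cleared under the same lock before the on-stack structure goes away.
Unlike the kernel fault path discussed in this thread, which first
checks i_private outside the lock, this sketch always takes the lock.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical stand-in for the on-stack hole punch descriptor. */
struct falloc_range {
	unsigned long start;	/* first index being punched */
	unsigned long next;	/* one past the last index being punched */
};

/* Hypothetical stand-in for the inode fields involved. */
struct fake_inode {
	atomic_flag i_lock;		/* toy spinlock standing in for i_lock */
	struct falloc_range *i_private;	/* non-NULL only while a punch runs */
};

static void fake_inode_init(struct fake_inode *inode)
{
	atomic_flag_clear(&inode->i_lock);
	inode->i_private = NULL;
}

static void lock(struct fake_inode *inode)
{
	/* Acquire ordering: later reads cannot move before taking the lock. */
	while (atomic_flag_test_and_set_explicit(&inode->i_lock,
						 memory_order_acquire))
		;	/* spin */
}

static void unlock(struct fake_inode *inode)
{
	/* Release ordering: earlier writes are visible to the next locker. */
	atomic_flag_clear_explicit(&inode->i_lock, memory_order_release);
}

/* Punch side: the caller has fully initialized *range before this call. */
static void punch_begin(struct fake_inode *inode, struct falloc_range *range)
{
	assert(range->start < range->next);
	lock(inode);
	inode->i_private = range;
	unlock(inode);
}

/* Punch side: must run before the stack frame holding the range returns. */
static void punch_end(struct fake_inode *inode)
{
	lock(inode);
	inode->i_private = NULL;
	unlock(inode);
}

/* Fault side: returns 1 if index lies inside a punch in progress. */
static int fault_hits_punch(struct fake_inode *inode, unsigned long index)
{
	struct falloc_range *range;
	int hit = 0;

	lock(inode);
	range = inode->i_private;
	if (range && index >= range->start && index < range->next)
		hit = 1;
	unlock(inode);
	return hit;
}
```

Because the fault side only dereferences the pointer while holding the
lock, and the punch side clears it under that lock before returning, the
fault side cannot read the range after the puncher's stack frame is gone.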

And in the shmem case, this "temporary inode extension" also had to
communicate to shmem_writepage(), the swapout end of things.  Which
is not a complication you have with hugetlbfs: perhaps it could be
simpler if just between fallocate and fault, or perhaps not.

Whilst it does all work for tmpfs, it looks as if tmpfs was ahead of
the pack (or trinity was attacking tmpfs before other filesystems),
and the issue of faulting versus holepunching (and DAX) has captured
wider interest recently, with Dave Chinner formulating answers in XFS,
and hoping to set an example for other filesystems.

If that work were further along, and if I had had time to digest any
of what he is doing about it, I would point you in his direction rather
than this; but since this does work for tmpfs, I shouldn't discourage you.

I'll try to take a look through yours in the coming days, but there
are several other patchsets I need to look through too, plus a few more
patches from me, if I can find time to send them in: juggling priorities.

Hugh


Thread overview: 10+ messages
2015-10-16 22:08 [PATCH 0/3] hugetlbfs fallocate hole punch race with page faults Mike Kravetz
2015-10-16 22:08 ` [PATCH 1/3] mm/hugetlb: Define hugetlb_falloc structure for hole punch race Mike Kravetz
2015-10-16 22:08 ` [PATCH 2/3] mm/hugetlb: Setup hugetlb_falloc during fallocate hole punch Mike Kravetz
2015-10-19 23:16   ` Andrew Morton
2015-10-20  1:41     ` Mike Kravetz
2015-10-20  2:22       ` Hugh Dickins [this message]
2015-10-20  3:12         ` Mike Kravetz
2015-10-16 22:08 ` [PATCH 3/3] mm/hugetlb: page faults check for fallocate hole punch in progress and wait Mike Kravetz
2015-10-19 23:18 ` [PATCH 0/3] hugetlbfs fallocate hole punch race with page faults Andrew Morton
2015-10-20  1:54   ` Mike Kravetz
