linux-mm.kvack.org archive mirror
 help / color / mirror / Atom feed
From: Hugh Dickins <hugh@veritas.com>
To: Nick Piggin <npiggin@suse.de>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>,
	linux-arch@vger.kernel.org,
	Andrew Morton <akpm@linux-foundation.org>,
	Linux Kernel Mailing List <linux-kernel@vger.kernel.org>,
	Linux Memory Management List <linux-mm@kvack.org>,
	Linus Torvalds <torvalds@linux-foundation.org>
Subject: Re: [rfc] optimise unlock_page
Date: Wed, 16 May 2007 20:28:36 +0100 (BST)	[thread overview]
Message-ID: <Pine.LNX.4.64.0705161946170.28185@blonde.wat.veritas.com> (raw)
In-Reply-To: <20070516181847.GD5883@wotan.suse.de>

On Wed, 16 May 2007, Nick Piggin wrote:
> On Wed, May 16, 2007 at 06:54:15PM +0100, Hugh Dickins wrote:
> > On Sun, 13 May 2007, Nick Piggin wrote:
> > > 
> > > Well I think so, but not completely sure.
> > 
> > That's not quite enough to convince me!
> 
> I did ask Linus, and he was very sure it works.

Good, that's very encouraging.

> Not so much from a high level view (although it does put more constraints
> on the flags layout), but from a CPU level... the way we intermix different
> sized loads and stores can run into store forwarding issues[*] which might
> be expensive as well. Not to mention that we can't do the non-atomic
> unlock on all architectures.

Ah yes, that's easier to envisage than an actual correctness problem.

> The other option of moving the bit into ->mapping hopefully avoids all
> the issues, and would probably be a little faster again on the P4, at the
> expense of being a more intrusive (but it doesn't look too bad, at first
> glance)...

Hmm, I'm so happy with PG_swapcache in there, that I'm reluctant to
cede it to your PG_locked, though I can't deny your use should take
precedence.  Perhaps we could enforce 8-byte alignment of struct
address_space and struct anon_vma to make both bits available
(along with the anon bit).

But I think you may not be appreciating how intrusive PG_locked
will be.  There are many references to page->mapping (often ->host)
throughout fs/ : when we keep anon/swap flags in page->mapping, we
know the filesystems will never see those bits set in their pages,
so no page_mapping-like conversion is needed; just a few places in
common code need to adapt.

And given our deprecation discipline for in-kernel interfaces,
wouldn't we have to wait a similar period before making page->mapping
unavailable to out-of-tree filesystems?

> > Please seek out those guarantees.  Like you, I can't really see how
> > it would go wrong (how could moving in the unlocked char mess with
> > the flag bits in the rest of the long? how could atomically modifying
> > the long have a chance of undoing that move?), but it feels like it
> > might take us into errata territory.
> 
> I think we can just rely on the cache coherency protocol taking care of
> it for us, on x86. movb would not affect other data other than the dest.
> A non-atomic op _could_ of course undo the movb, but it could likewise
> undo any other store to the word or byte. An atomic op on the flags does
> not modify the movb byte so the movb before/after possibilities should
> look exactly the same regardless of the atomic operations happening.

Yes, I've gone through that same thought process (my questions were
intended as rhetorical exclamations of inconceivabilty, rather than
actual queries).  But if you do go that way, I'd still like you to
check with Intel and AMD for errata.  See include/asm-i386/spinlock.h
for the CONFIG_X86_OOSTORE || CONFIG_X86_PPRO_FENCE __raw_spin_unlock
using xchgb: doesn't that hint that exceptions may be needed?

Hugh

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

  reply	other threads:[~2007-05-16 19:28 UTC|newest]

Thread overview: 23+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
     [not found] <20070508113709.GA19294@wotan.suse.de>
2007-05-08 11:40 ` Nick Piggin
2007-05-08 20:08   ` Hugh Dickins
2007-05-08 21:30   ` Benjamin Herrenschmidt
2007-05-08 22:41     ` Nick Piggin
2007-05-08 22:50       ` Nick Piggin
2007-05-09 19:33         ` Hugh Dickins
2007-05-09 21:21           ` Benjamin Herrenschmidt
2007-05-10  3:37           ` Nick Piggin
2007-05-10 19:14             ` Hugh Dickins
2007-05-11  8:54               ` Nick Piggin
2007-05-11 13:15                 ` Hugh Dickins
2007-05-13  3:32                   ` Nick Piggin
2007-05-13  4:39                     ` Hugh Dickins
2007-05-13  6:52                       ` Nick Piggin
2007-05-16 17:54                         ` Hugh Dickins
2007-05-16 18:18                           ` Nick Piggin
2007-05-16 19:28                             ` Hugh Dickins [this message]
2007-05-16 19:47                               ` Linus Torvalds
2007-05-17  6:27                                 ` Nick Piggin
2007-05-16 17:21                     ` Hugh Dickins
2007-05-16 17:38                       ` Nick Piggin
2007-05-08 12:13 ` David Howells
2007-05-08 22:35   ` Nick Piggin

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=Pine.LNX.4.64.0705161946170.28185@blonde.wat.veritas.com \
    --to=hugh@veritas.com \
    --cc=akpm@linux-foundation.org \
    --cc=benh@kernel.crashing.org \
    --cc=linux-arch@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-mm@kvack.org \
    --cc=npiggin@suse.de \
    --cc=torvalds@linux-foundation.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox