From: Nick Piggin <npiggin@suse.de>
To: Hugh Dickins <hugh@veritas.com>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>,
linux-arch@vger.kernel.org,
Andrew Morton <akpm@linux-foundation.org>,
Linux Kernel Mailing List <linux-kernel@vger.kernel.org>,
Linux Memory Management List <linux-mm@kvack.org>
Subject: Re: [rfc] optimise unlock_page
Date: Sun, 13 May 2007 05:32:10 +0200
Message-ID: <20070513033210.GA3667@wotan.suse.de>
In-Reply-To: <Pine.LNX.4.64.0705111357120.3350@blonde.wat.veritas.com>
On Fri, May 11, 2007 at 02:15:03PM +0100, Hugh Dickins wrote:
> On Fri, 11 May 2007, Nick Piggin wrote:
> >
> > Don't worry, I'm only just beginning ;) Can we then do something crazy
> > like this? (working on x86-64 only, so far. It seems to eliminate
> > lat_pagefault and lat_proc regressions here).
>
> I think Mr __NickPiggin_Lock is squirming ever more desperately.
Really? I thought it was pretty cool to be able to shave several
hundreds of cycles off our page lock :)
> So, in essence, you'd like to expand PG_locked from 1 to 8 bits,
> despite the fact that page flags are known to be in short supply?
> Ah, no, you're keeping it near the static mmzone FLAGS_RESERVED.
Yep, no flags bloating at all.
> Hmm, well, I think that's fairly horrid, and would it even be
> guaranteed to work on all architectures? Playing with one char
> of an unsigned long in one way, while playing with the whole of
> the unsigned long in another way (bitops) sounds very dodgy to me.
Of course not, but they can just use a regular atomic word sized
bitop. The problem with i386 is that its atomic ops also imply
memory barriers that you obviously don't need on unlock. So I think
getting rid of them is pretty good. A grep of mm/ and fs/ for
lock_page tells me we want this to be as fast as possible even
if it isn't being used in the nopage fastpath.
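
[Illustration, not part of the original patch.] The word-sized fallback
mentioned above can be sketched in portable C11 as a lock bit taken with an
acquire read-modify-write and dropped with a release-only one. All names
here (struct sketch_page, SKETCH_PG_LOCKED, sketch_trylock_page,
sketch_unlock_page) are hypothetical stand-ins, not the kernel's struct page
or bitop API, and the bit position is an assumption.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for struct page; only the flags word matters here. */
struct sketch_page {
    atomic_ulong flags;
};

/* Illustrative bit position, not the kernel's actual PG_locked value. */
#define SKETCH_PG_LOCKED (1UL << 0)

/*
 * Try to take the lock with an atomic read-modify-write on the whole word.
 * Acquire ordering keeps the critical section from moving above the lock.
 * Returns true if this caller set the bit, false if it was already set.
 */
static bool sketch_trylock_page(struct sketch_page *page)
{
    unsigned long old = atomic_fetch_or_explicit(&page->flags,
                                                 SKETCH_PG_LOCKED,
                                                 memory_order_acquire);
    return !(old & SKETCH_PG_LOCKED);
}

/*
 * Drop the lock. Only release ordering is requested: earlier stores must be
 * visible before the bit clears, but no full barrier is asked for.
 */
static void sketch_unlock_page(struct sketch_page *page)
{
    unsigned long old = atomic_fetch_and_explicit(&page->flags,
                                                  ~SKETCH_PG_LOCKED,
                                                  memory_order_release);
    assert(old & SKETCH_PG_LOCKED); /* unlocking an unlocked page is a bug */
    (void)old;
}
```

On x86 even this release-only read-modify-write still compiles to a
lock-prefixed instruction, which is the unlock cost under discussion here;
on weakly ordered architectures the relaxed ordering can be cheaper.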
> I think I'd rather just accept that your changes have slowed some
> microbenchmarks down: it is not always possible to fix a serious
> bug without slowing something down. That's probably what you're
> trying to push me into saying by this patch ;)
Well, I was resigned to that few-percent regression in the page fault
path until the G5 numbers showed that we needed to improve things. But
now it looks like (at least on my 2*HT P4 Xeon) we don't have to have
any regression there.
> But again I wonder just what the gain has been, once your double
> unmap_mapping_range is factored in. When I suggested before that
> perhaps the double (well, treble including the one in truncate.c)
> unmap_mapping_range might solve the problem you set out to solve
> (I've lost sight of that!) without pagelock when faulting, you said:
>
> > Well aside from being terribly ugly, it means we can still drop
> > the dirty bit where we'd otherwise rather not, so I don't think
> > we can do that.
>
> but that didn't give me enough information to agree or disagree.
Oh, well, invalidate wants to be able to skip dirty pages or have the
filesystem do something special with them first. Once you have taken
the page out of the pagecache while it is still mapped shared, blowing
it away doesn't actually solve the data loss problem; it only makes
the window of VM inconsistency smaller.
> > What architecture and workloads are you testing with, btw?
>
> i386 (2*HT P4 Xeons), x86_64 (2*HT P4 Xeons), PowerPC (G5 Quad).
>
> Workloads mostly lmbench and my usual pair of make -j20 kernel builds,
> one to tmpfs and one to ext2 looped on tmpfs, restricted to 512M RAM
> plus swap. Which is ever so old but still finds enough to keep me busy.
Thanks,
Nick