From: Hugh Dickins <hugh@veritas.com>
To: Nick Piggin <npiggin@suse.de>
Cc: Andrew Morton <akpm@osdl.org>,
Linux Memory Management <linux-mm@kvack.org>
Subject: Re: [patch 3/9] mm: speculative get page
Date: Sun, 24 Sep 2006 19:01:11 +0100 (BST) [thread overview]
Message-ID: <Pine.LNX.4.64.0609241802400.7935@blonde.wat.veritas.com> (raw)
In-Reply-To: <20060922172110.22370.33715.sendpatchset@linux.site>
On Fri, 22 Sep 2006, Nick Piggin wrote:
>
> Index: linux-2.6/include/linux/page-flags.h
> ===================================================================
> --- linux-2.6.orig/include/linux/page-flags.h
> +++ linux-2.6/include/linux/page-flags.h
> @@ -86,6 +86,8 @@
> #define PG_nosave_free 18 /* Free, should not be written */
> #define PG_buddy 19 /* Page is free, on buddy lists */
>
> +#define PG_nonewrefs 20 /* Block concurrent pagecache lookups
> + * while testing refcount */
Something I didn't get around to mentioning last time: I could well
be mistaken, but it seemed that you could get along without all the
PageNoNewRefs stuff, at cost of using something (too expensive?)
like atomic_cmpxchg(&page->_count, 2, 0) in remove_mapping() and
migrate_page_move_mapping(); compensated by simplification at the
other end in page_cache_get_speculative(), which is already
expected to be the hotter path.
I find it unaesthetic (suspect you do too) to add that adhoc
PageNoNewRefs method of freezing the count, when you're already
demanding that count 0 must be frozen: why not make use of that?
then since you know it's frozen while 0, you can easily insert
the proper count at the end of the critical region.
I didn't attempt to work out what memory barriers would be needed,
but did test a version working that way on i386 - though I seem
to have tidied those mods away to /dev/null since then.
We disagreed over whether PageNoNewRefs usage in add_to_page_cache
and __add_to_swap_cache was the same as in remove_mapping; but I
think we agreed it could be avoided completely in those, just by
being more careful about the ordering of the updates to struct page
(I think it looked like the SetPageLocked needed to come earlier,
but I forget the logic right now).
> +static inline int page_cache_get_speculative(struct page *page)
> +{
> + VM_BUG_ON(in_interrupt());
> +
> +#ifndef CONFIG_SMP
> +# ifdef CONFIG_PREEMPT
> + VM_BUG_ON(!in_atomic());
> +# endif
> + /*
> + * Preempt must be disabled here - we rely on rcu_read_lock doing
> + * this for us.
> + *
> + * Pagecache won't be truncated from interrupt context, so if we have
> + * found a page in the radix tree here, we have pinned its refcount by
> + * disabling preempt, and hence no need for the "speculative get" that
> + * SMP requires.
> + */
> + VM_BUG_ON(page_count(page) == 0);
> + atomic_inc(&page->_count);
> +
> +#else
> + if (unlikely(!get_page_unless_zero(page)))
> + return 0; /* page has been freed */
This is the test which fails nicely whenever count is set to 0,
whether because the page has been freed or because you wish to
freeze it. But if you do make such a change, callers of
page_cache_get_speculative may need to loop a little differently
when it fails (the page might not be freed).
> + VM_BUG_ON(PageCompound(page) && (struct page *)page_private(page) != page);
I found that VM_BUG_ON confusing, because it's only catching a tiny
proportion of the cases you're interested in ruling out: most high
order pages aren't PageCompound (but only the PageCompound ones offer
that kind of check). If you really want to keep the check, I think
it needs a comment to explain that; but I'd just delete the line.
Hugh
--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org. For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>
next prev parent reply other threads:[~2006-09-24 18:01 UTC|newest]
Thread overview: 14+ messages / expand[flat|nested] mbox.gz Atom feed top
2006-09-22 19:22 [patch 0/4] lockless pagecache for 2.6.18-rc7-mm1 Nick Piggin
2006-09-22 19:22 ` [patch 2/4] radix-tree: use indirect bit Nick Piggin
2006-09-22 19:22 ` [patch 2/9] radix-tree: gang_lookup_slot Nick Piggin
2006-09-22 19:22 ` [patch 3/9] mm: speculative get page Nick Piggin
2006-09-23 10:01 ` Peter Zijlstra
2006-09-24 18:01 ` Hugh Dickins [this message]
2006-09-25 2:00 ` Nick Piggin
2006-09-25 11:47 ` Nick Piggin
2006-09-25 13:04 ` Peter Zijlstra
2006-09-25 22:41 ` Nick Piggin
2006-09-22 19:22 ` [patch 4/9] mm: lockless pagecache lookups Nick Piggin
2006-09-22 20:01 ` Lee Schermerhorn
2006-09-23 2:35 ` Nick Piggin
2006-09-22 19:24 ` [patch 0/4] lockless pagecache for 2.6.18-rc7-mm1 Nick Piggin
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=Pine.LNX.4.64.0609241802400.7935@blonde.wat.veritas.com \
--to=hugh@veritas.com \
--cc=akpm@osdl.org \
--cc=linux-mm@kvack.org \
--cc=npiggin@suse.de \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox