From: Magnus Damm <magnus.damm@gmail.com>
To: Nick Piggin <nickpiggin@yahoo.com.au>
Cc: Magnus Damm <magnus@valinux.co.jp>,
linux-mm@kvack.org, Magnus Damm <damm@opensource.se>
Subject: Re: [RFC] Removing page->flags
Date: Thu, 9 Feb 2006 14:19:51 +0900 [thread overview]
Message-ID: <aec7e5c30602082119v4127aa92ga3c9d9ba6dee0378@mail.gmail.com> (raw)
In-Reply-To: <43EAC2CE.2010108@yahoo.com.au>
On 2/9/06, Nick Piggin <nickpiggin@yahoo.com.au> wrote:
> Magnus Damm wrote:
> > On 2/8/06, Nick Piggin <nickpiggin@yahoo.com.au> wrote:
>
> >>There are a large number of paths which access essentially random struct
> >>pages (any memory allocation / deallocation, many pagecache operations).
> >>Your proposal basically guarantees at least an extra cache miss on such
> >>paths. On most modern machines the struct page should be less than or
> >>equal to a cacheline I think.
> >
> >
> > And this extra cache miss comes from accessing the flags in a
> > different cache line than the rest of the struct page, right? OTOH,
>
> Yes
>
> > maybe it is more likely that a certain struct page is in the cache if
> > struct page would become smaller.
> >
>
> In some very select cases, yes. Most of the time I'd say it would
> be more likely that you'll actually have to take two cache misses
> (basic operations like page allocation and freeing touch flags).
Yes, that makes sense.
> >>Also, would you mind explaining how you'd allow non-atomic access to
> >>bits which are already protected somewhere else? Without adding extra
> >>cache misses for each different type of bit that is manipulated? Which
> >>bits do you have in mind, exactly?
> >
> >
> > I'm thinking about PG_lru and PG_active. PG_lru is always modified
> > under zone->lru_lock, and the same goes for PG_active (except in
> > shrink_list(), grr). But as you say above, breaking out the page flags
> > may result in an extra cache miss...
> >
>
> Also, it will still be difficult to enable non-atomic operations on
> them while still keeping overhead to just a single cache miss:
>
> If your flags bits are arranged as an array of flag words, eg
> | page 0 flags | page 1 flags | page 2 flags | ... then obviously
> you can't use non atomic operations.
>
> Otherwise if they are arranged as bits
>
> | PG_lru bits for pages 0..n | PG_active bits | PG_locked bits |
>
> Then you take 3 extra cache misses when locking the page, then
> looking at PG_lru and PG_active.
So, if the optimization is to allow non-atomic operations, then a
better way would be to arrange the flags like this:
| page 0 atomic flags | page 1 atomic flags | ....
together with
| page 0 non-atomic flags | page 1 non-atomic flags | ...
But introducing a second page->flags is out of the question, and
breaking out flags and placing a pointer to them in the node data
structure will introduce more cache misses. So it is probably not
worth it.
> > Also, I think it would be interesting to break out the page
> > replacement policy code and make it pluggable. Different page
> > replacement algorithms need different flags/information associated
> > with each page, so moving the flags from struct page was my way of
> > solving that. Page replacement flags would in that case be stored
> > somewhere else than the rest of the flags.
> >
>
> It seems pretty unlikely that we'll get a pluggable replacement
> policy in mainline any time soon though.
So, do you think it is more likely that a ClockPro implementation will
be accepted then? Or is Linux "doomed" to LRU forever?
> >>If we accept that type A bits are a good idea, then removing just type B
> >>is no point. Sometimes the more complex memory layouts will require more
> >>than just arithmetic (ie. memory loads) so I doubt that is worthwhile
> >>either.
> >
> >
> > Yes, removing type B bits only is no point. But I wonder how the
> > performance would be affected by using the "parent" struct page
> > instead of type B bits.
> >
>
> An essentially random memory access is going to be worth hundreds or
> thousands of integer ops though, and you'd increase cache footprint
> of 'struct page' operations by 50-100% on most architectures.
Yeah, that sounds like a bad idea. =)
> I don't see the problem with type B bits in flags?
There are no problems, especially since we already have the bits available in
page->flags.
Thanks for the comments!
/ magnus
--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org. For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>
next prev parent reply other threads:[~2006-02-09 5:19 UTC|newest]
Thread overview: 17+ messages / expand[flat|nested] mbox.gz Atom feed top
2006-02-08 6:46 Magnus Damm
2006-02-08 11:54 ` Nick Piggin
2006-02-09 2:35 ` Magnus Damm
2006-02-09 4:19 ` Nick Piggin
2006-02-09 5:19 ` Magnus Damm [this message]
2006-02-09 5:37 ` Nick Piggin
2006-02-11 5:30 ` Marcelo Tosatti
2006-02-10 15:03 ` Rik van Riel
2006-02-08 19:37 ` Dave Hansen
2006-02-09 2:50 ` Magnus Damm
2006-02-09 17:27 ` Dave Hansen
2006-02-09 1:55 ` KAMEZAWA Hiroyuki
2006-02-09 2:57 ` Magnus Damm
2006-02-09 3:14 ` KAMEZAWA Hiroyuki
2006-02-09 3:38 ` Magnus Damm
2006-02-09 3:51 ` KAMEZAWA Hiroyuki
2006-02-09 5:24 ` Magnus Damm
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=aec7e5c30602082119v4127aa92ga3c9d9ba6dee0378@mail.gmail.com \
--to=magnus.damm@gmail.com \
--cc=damm@opensource.se \
--cc=linux-mm@kvack.org \
--cc=magnus@valinux.co.jp \
--cc=nickpiggin@yahoo.com.au \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox