From: Roger Larsson <roger.larsson@norran.net>
To: Rik van Riel <riel@conectiva.com.br>
Cc: linux-mm@kvack.org, "Scott F. Kaplan" <sfkaplan@cs.amherst.edu>
Subject: Re: RFC: design for new VM
Date: Fri, 04 Aug 2000 00:28:59 +0200 [thread overview]
Message-ID: <3989F22B.F466FA58@norran.net> (raw)
In-Reply-To: <Pine.LNX.4.21.0008031703250.24022-100000@duckman.distro.conectiva>
Rik van Riel wrote:
>
> On Thu, 3 Aug 2000, Roger Larsson wrote:
>
> > > - data structures (page lists)
> > > - active list
> > > - per node/pgdat
> > > - contains pages with page->age > 0
> > > - pages may be mapped into processes
> > > - scanned and aged whenever we are short
> > > on free + inactive pages
> > > - maybe multiple lists for different ages,
> > > to be better resistant against streaming IO
> > > (and for lower overhead)
> >
> > Does this really need to be a list? Since most pages should
> > be on this list can't it be virtual - pages on no other list
> > are on active list. All pages are scanned all the time...
>
> It doesn't have to be a list per se, but since we have the
> list head in the page struct anyway we might as well make
> it one.
>
If we do not want to increase the size of the page struct.
A union could be added instead - age info _or_ actual list.
(not on any list unless age is zero)
> > > - inactive_dirty list
> > > - per zone
> > > - contains dirty, old pages (page->age == 0)
> > > - pages are not mapped in any process
> > > - inactive_clean list
> > > - per zone
> > > - contains clean, old pages
> > > - can be reused by __alloc_pages, like free pages
> > > - pages are not mapped in any process
> >
> > What will happen to pages on these lists if pages gets referenced?
> > * Move them back to the active list? Then it is hard to know how
> > many free able pages there really are...
>
> Indeed, we will move such a page back to the active list.
> "Luckily" the inactive pages are not mapped, so we have to
> locate them through find_page_nolock() and friends, which
> allows us to move the page back to the active list, adjust
> statistics and maybe even wake up kswapd as needed.
>
> > > - other data structures
> > > - int memory_pressure
> > > - on page allocation or reclaim, memory_pressure++
> > > - on page freeing, memory_pressure-- (keep it >= 0, though)
> > > - decayed on a regular basis (eg. every second x -= x>>6)
> > > - used to determine inactive_target
> > > - inactive_target == one (two?) second(s) worth of memory_pressure,
> > > which is the amount of page reclaims we'll do in one second
> > > - free + inactive_clean >= zone->pages_high
> > > - free + inactive_clean + inactive_dirty >= zone->pages_high \
> > > + one_second_of_memory_pressure * (zone_size / memory_size)
> >
> > One of the most interesting aspects (IMHO) of Scott F. Kaplands
> > "Compressed Cache and Virtual Memory Simulation" was the use of
> > VM time instead of wall time. One second could be too long of a
> > reaction time - relative to X allocations/sec etc.
>
> It's just the inactive target. Trying to keep one second of
> unmapped pages with page->age==0 around is mainly done to:
> - make sure we can flush all of them on time
> - put an "appropriate" amount of pressure on the
> pages in the active list, so page aging is smoothed
> out a little bit
>
Yes, but why one second? Why not 1/10 second, one jiffie, ...
> > > If memory_pressure is high and we're doing a lot of dirty
> > > disk writes, the bdflush percentage will kick in and we'll
> > > be doing extra-agressive cleaning. In that case bdflush
> > > will automatically become more agressive the more page
> > > replacement is going on, which is a good thing.
> >
> > I think that one of the omissions in Kaplands report is the
> > time it takes to clean dirty pages. (Or have I missed
> > something... Need to select the pages earlier)
>
> Page replacement (select which page to replace) should always
> be independant from page flushing. You can make pretty decent
> decisions on which page(s) to free and the last thing you want
> is having them messed up by page flushing.
>
> > > Misc
> > >
> > > Page aging and flushing are completely separated in this
> > > scheme. We'll never end up aging and freeing a "wrong" clean
> > > page because we're waiting for IO completion of old and
> > > to-be-freed pages.
> >
> > Is page ageing modification of LRU enough?
>
> It seems to work fine for FreeBSD. Also, we can always change
> the "aging" of the active pages with something else. The
> system is modular enough that we can do that.
>
> > In many cases it will probably behave worse than plain LRU
> > (slower phase adaptions).
>
> We can change that by using exponential decay for the page
> age, or by using some different aging technique...
>
Another research subject...
> > The access pattern diagrams in Kaplans report are very
> > enlightening...
>
> They are very interesting indeed, but I miss one very
> common workload in their report. A lot of systems do
> (multimedia) streaming IO these days, where a lot of
> data passes through the cache quickly, but all of the
> data is only touched once (or maybe twice).
And at the same time browsing www with netscape...
/RogerL
--
Home page:
http://www.norran.net/nra02596/
--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org. For more info on Linux MM,
see: http://www.linux.eu.org/Linux-MM/
next prev parent reply other threads:[~2000-08-03 22:28 UTC|newest]
Thread overview: 46+ messages / expand[flat|nested] mbox.gz Atom feed top
2000-08-02 22:08 Rik van Riel
2000-08-03 7:19 ` Chris Wedgwood
2000-08-03 16:01 ` Rik van Riel
2000-08-04 15:41 ` Matthew Dillon
2000-08-04 17:49 ` Linus Torvalds
2000-08-04 23:51 ` Matthew Dillon
2000-08-05 0:03 ` Linus Torvalds
2000-08-05 1:52 ` Matthew Dillon
2000-08-05 1:09 ` Matthew Wilcox
2000-08-05 2:05 ` Linus Torvalds
2000-08-05 2:17 ` Alexander Viro
2000-08-07 17:55 ` Matthew Dillon
2000-08-05 22:48 ` Theodore Y. Ts'o
2000-08-03 18:27 ` lamont
2000-08-03 18:34 ` Linus Torvalds
2000-08-03 19:11 ` Chris Wedgwood
2000-08-03 21:04 ` Benjamin C.R. LaHaise
2000-08-03 19:32 ` Rik van Riel
2000-08-03 18:05 ` Linus Torvalds
2000-08-03 18:50 ` Rik van Riel
2000-08-03 20:22 ` Linus Torvalds
2000-08-03 22:05 ` Rik van Riel
2000-08-03 22:19 ` Linus Torvalds
2000-08-03 19:00 ` Richard B. Johnson
2000-08-03 19:29 ` Rik van Riel
2000-08-03 20:23 ` Linus Torvalds
2000-08-03 19:37 ` Ingo Oeser
2000-08-03 20:40 ` Linus Torvalds
2000-08-03 21:56 ` Ingo Oeser
2000-08-03 22:12 ` Linus Torvalds
2000-08-04 2:33 ` David Gould
2000-08-16 15:10 ` Stephen C. Tweedie
2000-08-03 19:26 ` Roger Larsson
2000-08-03 21:50 ` Rik van Riel
2000-08-03 22:28 ` Roger Larsson [this message]
2000-08-04 13:52 Mark_H_Johnson
[not found] <8725692F.0079E22B.00@d53mta03h.boulder.ibm.com>
2000-08-07 17:40 ` Gerrit.Huizenga
2000-08-07 18:37 ` Matthew Wilcox
2000-08-07 20:55 ` Chuck Lever
2000-08-07 21:59 ` Rik van Riel
2000-08-08 3:26 ` David Gould
2000-08-08 5:54 ` Kanoj Sarcar
2000-08-08 7:15 ` David Gould
[not found] <87256934.0072FA16.00@d53mta04h.boulder.ibm.com>
2000-08-08 0:36 ` Gerrit.Huizenga
[not found] <87256934.0078DADB.00@d53mta03h.boulder.ibm.com>
2000-08-08 0:48 ` Gerrit.Huizenga
2000-08-08 15:21 ` Rik van Riel
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=3989F22B.F466FA58@norran.net \
--to=roger.larsson@norran.net \
--cc=linux-mm@kvack.org \
--cc=riel@conectiva.com.br \
--cc=sfkaplan@cs.amherst.edu \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox