From: Rik van Riel <H.H.vanRiel@phys.uu.nl>
To: "Eric W. Biederman" <ebiederm+eric@ccr.net>
Cc: linux-mm <linux-mm@kvack.org>
Subject: Re: Linux-2.1.129..
Date: Tue, 24 Nov 1998 08:56:57 +0100 (CET)
Message-ID: <Pine.LNX.3.96.981124084018.14227A-100000@mirkwood.dummy.home>
In-Reply-To: <m13e79eha7.fsf@flinx.ccr.net>
On 24 Nov 1998, Eric W. Biederman wrote:
> >>>>> "RR" == Rik van Riel <H.H.vanRiel@phys.uu.nl> writes:
> RR> On 23 Nov 1998, Eric W. Biederman wrote:
>
> RR> This waiting is also a good thing if we want to do proper
> RR> I/O clustering. I believe DU has a switch to only write
> RR> dirty data when there's more than XX kB of contiguous data
> RR> at that place on the disk (or the data is old).
>
> I can tell who has been reading Digital Unix literature lately.
DU and IRIX scale to much larger machines than Linux does,
so I've been reading the DU bookshelf for quite a while
now. Guess where some of the stuff in /proc/sys/vm comes
from :)
I'd be grateful if anyone can point me to IRIX documentation
(will be bugging our sysadmins later today -- I know they've
got an Origin and several Indys :).
> >> Ideally/Theoretically I think that is what we should be doing for
> >> swap as well, as it would spread out the swap writes evenly
> >> across time. And should leave most of our pages clean.
>
> RR> In order to spread out the disk I/O evenly (why would we
> RR> want to do this?
>
> Imagine a machine with 1 Gigabyte of RAM and 8 Gigabyte of swap, in
> heavy use. Swapping but not thrashing. You can't swap out several
> hundred megabytes all at once.
OK, I see your point now. In your original message I thought
I had read that you wanted to do swap I/O on an individual
page basis as opposed to proper I/O clustering. Your second
version of the story is remarkably like what I had in mind :)
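A minimal sketch of the clustering rule quoted above (write dirty
data only when the contiguous run is large enough, or the data is
old). All names here are hypothetical, and both thresholds are left
as parameters because the thread gives no concrete values:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical description of one run of contiguous dirty data on disk. */
struct dirty_run {
    size_t contiguous_kb;    /* size of the contiguous dirty region, in kB */
    unsigned long age_secs;  /* age of the oldest dirty page in the run */
};

/* Hypothetical tunables; the values are an assumption, not from DU or Linux. */
struct cluster_policy {
    size_t min_cluster_kb;       /* write only if MORE than this is contiguous */
    unsigned long max_age_secs;  /* ...or if the data is at least this old */
};

/* Return true when the run should be written out now. */
static bool should_write_run(const struct dirty_run *run,
                             const struct cluster_policy *pol)
{
    assert(run != NULL && pol != NULL);
    return run->contiguous_kb > pol->min_cluster_kb ||
           run->age_secs >= pol->max_age_secs;
}
```

Small, young runs keep waiting so they can grow into a bigger
cluster; the age limit makes sure nothing waits forever.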
> You can handle a sudden flurry of network traffic much better this
> way for example.
This is the main reason why we should push through the new
VM code ASAP. Gigabit ethernet will be in common use long
before 2.4 hits the street.
> >> The correct ratio (of pages to free from each source) (computed
> >> dynamically) would be: (# of process pages)/(# of pages)
> >>
> >> Basically for every page kswapd frees shrink_mmap must also free one
> >> page. Plus however many pages shrink_mmap used to return.
>
> RR> This is clearly wrong.
>
> No. If for each page we schedule to be swapped, we reclaim a different
> page with shrink_mmap immediately.... so we have free ram.
We only need to have a very small amount of free ram, since
we can easily reclaim memory if we just make sure that we've
got enough unmapped swap cache and page cache lying around.
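For reference, one possible reading of the ratio Eric proposes,
(# of process pages)/(# of pages), is as the share of a free request
that comes from process pages, with the rest coming from the cache.
This is only a sketch of that interpretation; the struct and function
names are hypothetical and not taken from the kernel:

```c
#include <assert.h>

/* Hypothetical result type: how many pages to take from each source. */
struct free_split {
    unsigned long from_processes;  /* pages to swap out of process memory */
    unsigned long from_cache;      /* pages to reclaim from page/swap cache */
};

/*
 * Split a request to free `to_free` pages using the ratio
 * process_pages / total_pages (ASSUMED interpretation of the proposal).
 */
static struct free_split split_free_target(unsigned long to_free,
                                           unsigned long process_pages,
                                           unsigned long total_pages)
{
    struct free_split s;

    assert(total_pages > 0 && process_pages <= total_pages);
    /* Widen before multiplying so large page counts cannot overflow. */
    s.from_processes = (unsigned long)
        ((unsigned long long)to_free * process_pages / total_pages);
    s.from_cache = to_free - s.from_processes;
    return s;
}
```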
> As far as fixed percentages go, it's a loss every time, and I won't
> drop a working feature for an older lesser design. Having tuneable
> fixed percentages is only a win on a 1 application, 1 load pattern
> box.
The only reason for something like that is that we need to
have some control over the amount of memory that's in the
unmapped/cached state, since:
- we want the pages to undergo somewhat of an aging in order
  to avoid easy thrashing
- we need a large enough amount of unmapped memory which we
  can reclaim fast when we're under heavy (network) pressure
- having a lot of unmapped memory around will give minor page
  faults, decreasing the amount of unmapped memory and requiring
  us to keep scanning memory in a slow but steady pace, this:
  - spreads out swap I/O evenly over time
  - spreads out page aging evenly over space, giving us more
    performance and fair aging than we ever dreamt of
Maybe we want the system to auto-tune the mapped:unmapped
ratio depending on the amount of minor faults and actual
page reclaims going on, with a bottom value of 1/16th of
memory so we always have enough buffer to catch big things.
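A rough sketch of what such auto-tuning could look like. Only the
1/16th floor comes from the text above; the direction of adjustment,
the step size, and all names are assumptions made for illustration:

```c
#include <assert.h>

/*
 * Adjust the target number of unmapped (cached, reclaimable) pages.
 *
 * ASSUMED policy, not specified in this message:
 *  - more actual reclaims than minor faults: the unmapped pool is being
 *    consumed, so grow the target by `step`
 *  - more minor faults than reclaims: pages are being pulled straight
 *    back by their owners, so shrink the target by `step`
 * The result never drops below 1/16th of memory (from the message)
 * and never exceeds total memory.
 */
static unsigned long tune_unmapped_target(unsigned long target,
                                          unsigned long total_pages,
                                          unsigned long minor_faults,
                                          unsigned long reclaims,
                                          unsigned long step)
{
    unsigned long min_target = total_pages / 16;

    assert(total_pages > 0);
    if (target > total_pages)
        target = total_pages;

    if (reclaims > minor_faults) {
        /* Grow, saturating at total_pages. */
        target = (total_pages - target > step) ? target + step : total_pages;
    } else if (minor_faults > reclaims) {
        /* Shrink, saturating at zero before the floor is applied. */
        target = (target > step) ? target - step : 0;
    }

    if (target < min_target)
        target = min_target;
    return target;
}
```

Called once per scanning interval with the fault and reclaim counts
for that interval, this keeps the buffer from ever falling under the
1/16th floor, whatever the load does.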
Rik -- slowly getting used to dvorak kbd layout...
+-------------------------------------------------------------------+
| Linux memory management tour guide. H.H.vanRiel@phys.uu.nl |
| Scouting Vries cubscout leader. http://www.phys.uu.nl/~riel/ |
+-------------------------------------------------------------------+