From: Linus Torvalds <torvalds@transmeta.com>
To: Jamie Lokier <lk@tantalophile.demon.co.uk>
Cc: Ed Tomlinson <tomlins@cam.org>,
	Marcelo Tosatti <marcelo@conectiva.com.br>,
	linux-mm@kvack.org
Subject: Re: swapout selection change in pre1
Date: Mon, 15 Jan 2001 10:24:19 -0800 (PST)
Message-ID: <Pine.LNX.4.10.10101151011340.6108-100000@penguin.transmeta.com>
In-Reply-To: <20010115102445.B18014@pcep-jamie.cern.ch>


On Mon, 15 Jan 2001, Jamie Lokier wrote:
> 
> Freeing pages aggressively from a process that's paging lots will make
> that process page more, meaning more aggressive freeing etc. etc.
> Either it works and reduces overall paging fairly (great), or it spirals
> out of control, which will be obvious, or it'll simply be stable at many
> different rates, which is undesirable but not so obvious in testing.

I doubt that it gets to any of the bad cases.

See - when the VM layer frees pages from a virtual mapping, it doesn't
throw them away. The pages are still there, and there won't be any "spiral
of death". If the faulter faults them in quickly, a soft-fault will happen
without any new memory allocation, and you won't see any more VMA scanning.
It doesn't get "worse" if the working set actually fits in memory.
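
To make that concrete, here's a rough userspace illustration of the same
effect - just a sketch, not kernel code; the default file name is an
arbitrary example and most error handling is skipped. Touch a file, drop
the mapping, touch it again: the second pass only bumps the minor-fault
count, because the pages never left the page cache.

/*
 * Rough userspace illustration (not kernel code): touch a file, drop
 * the mapping, touch it again.  The second pass only bumps the
 * minor-fault count, because the pages are still in the page cache.
 * Error handling is mostly skipped; pass any file path as argv[1].
 */
#include <fcntl.h>
#include <stdio.h>
#include <sys/mman.h>
#include <sys/resource.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

static void report(const char *when)
{
        struct rusage ru;

        getrusage(RUSAGE_SELF, &ru);
        printf("%-16s minor %ld  major %ld\n",
               when, ru.ru_minflt, ru.ru_majflt);
}

static void touch(const char *path)
{
        struct stat st;
        int fd = open(path, O_RDONLY);
        volatile char sum = 0;
        char *p;
        off_t i;

        if (fd < 0 || fstat(fd, &st) < 0)
                return;
        p = mmap(NULL, st.st_size, PROT_READ, MAP_PRIVATE, fd, 0);
        if (p != MAP_FAILED) {
                for (i = 0; i < st.st_size; i += 4096)
                        sum += p[i];            /* fault every page in */
                munmap(p, st.st_size);          /* drop the mapping... */
        }
        close(fd);
        (void)sum;
}

int main(int argc, char **argv)
{
        const char *path = argc > 1 ? argv[1] : "/etc/services";

        report("before");
        touch(path);
        report("after 1st pass");
        touch(path);                    /* ...but the pages are still there */
        report("after 2nd pass");       /* only the minor count grows */
        return 0;
}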

So the only case that actually triggers a "meltdown" is when the working
set does _not_ fit in memory, in which case not only will the pages be
unmapped, but they'll also get freed aggressively by the page_launder()
logic. At that point, the big process will actually end up waiting for the
pages, penalizing itself, which is exactly what we want.

So it should never "spiral out of control", simply because if the working
set fits in memory, the only impact is initially doing more soft page
faults while it finds the right balancing point. It only really kicks in
when people are continually trying to free memory: which is only true when
we really have a working set bigger than available memory, and which is
exactly the case where we _want_ to penalize the people who seem to be the
worst offenders.
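
If it helps to see why that feedback is self-limiting rather than a spiral,
here's a trivial toy model - the costs and numbers below are made up purely
for illustration, and it has nothing to do with the real page_launder()
code. The cost of faulting only explodes once the working set is bigger
than memory, which is exactly the case where we want the offender to stall.

/*
 * Toy model only - made-up costs, not the real VM.  Refaulting a page
 * that is still resident is nearly free; refaulting one that
 * page_launder()-style reclaim actually freed costs a disk read.
 */
#include <stdio.h>

#define SOFT_FAULT_US      1L   /* rough cost of a soft fault */
#define DISK_READ_US    5000L   /* rough cost of reading a page back */

static long fault_cost(long working_set, long memory, long faults)
{
        double miss;

        if (working_set <= memory)
                return faults * SOFT_FAULT_US;  /* pages are still there */

        /* fraction of faults that find the page really gone */
        miss = 1.0 - (double)memory / (double)working_set;
        return (long)(faults * (miss * DISK_READ_US +
                                (1.0 - miss) * SOFT_FAULT_US));
}

int main(void)
{
        long memory = 32768;    /* pages of RAM in the toy machine */
        long sets[] = { 16384, 32768, 49152, 65536 };
        int i;

        for (i = 0; i < 4; i++)
                printf("working set %5ld pages -> %10ld us faulting\n",
                       sets[i], fault_cost(sets[i], memory, 100000L));
        return 0;
}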

So I doubt you get any "subtle cases".

Note that this ties in to the thread issue too: if you have a single VM
and 50 threads that all fault pages in, that single VM _will_ be penalized.
Not because it has 50 threads (which is what the old code penalized), but
because it has very active paging behaviour.

Which again is exactly what we want: we don't want to penalize threads per
se, because threads are often used for user interfaces etc. and can be
largely dormant. What we really want to penalize is bad VM behaviour,
and that's exactly the information we get from heavy page faulting.

NOTE! I'm not saying that tuning isn't necessary. Of course it is. And I
suspect that we actually want to add a page allocation flag (__GFP_VM)
that says that "this allocation is for growing our VM", and perhaps make
the VM shrinking conditional on that - so that the VM shrinking really
only kicks in for the big VM offenders, not for people who just read files
into the page cache.
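
Just to show the shape of that idea - this is a standalone toy, nothing
like real allocator code, and the flag and helper names below are made up
for the illustration: only allocations tagged as growing a VM would
trigger the shrinking pass, so plain page-cache reads never pay for it.

/*
 * Standalone toy, not kernel code: GFP_VM and the helpers are made up
 * to show the gating idea.  Only allocations that grow a VM trigger
 * the "shrink the big VM offender" pass; page-cache reads just take
 * what is free, or fail and have to wait.
 */
#include <stdio.h>

#define GFP_VM  0x1             /* hypothetical "this grows a VM" flag */

static int free_pages = 2;      /* pretend memory is nearly gone */

static void shrink_biggest_vm(void)
{
        /* stand-in for unmapping pages from the heaviest VM user */
        printf("  shrinking the biggest VM offender\n");
        free_pages += 4;
}

static int toy_alloc_page(unsigned int flags)
{
        if (free_pages == 0 && (flags & GFP_VM))
                shrink_biggest_vm();
        if (free_pages == 0)
                return -1;      /* caller has to wait and retry */
        free_pages--;
        return 0;
}

int main(void)
{
        int i, ret;

        for (i = 0; i < 4; i++) {
                ret = toy_alloc_page(0);
                printf("page-cache alloc: %2d (free %d)\n", ret, free_pages);
        }
        for (i = 0; i < 4; i++) {
                ret = toy_alloc_page(GFP_VM);
                printf("VM-growing alloc: %2d (free %d)\n", ret, free_pages);
        }
        return 0;
}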

So yes, we'll have VM tuning, the same as 2.2.x had and probably still
has. But I think our algorithms are a lot more "fundamentally stable" than
they were before. Which is not to say that the tuning is obvious - I just
claim that we will probably have a much easier time doing it, and that we
have more tools in our tool-chest.

			Linus

