From: Jamie Lokier <lk@tantalophile.demon.co.uk>
To: Linus Torvalds <torvalds@transmeta.com>
Cc: Ed Tomlinson <tomlins@cam.org>,
	Marcelo Tosatti <marcelo@conectiva.com.br>,
	linux-mm@kvack.org
Subject: Re: swapout selection change in pre1
Date: Mon, 15 Jan 2001 19:40:00 +0100
Message-ID: <20010115194000.C18795@pcep-jamie.cern.ch>
In-Reply-To: <Pine.LNX.4.10.10101151011340.6108-100000@penguin.transmeta.com>; from torvalds@transmeta.com on Mon, Jan 15, 2001 at 10:24:19AM -0800

Linus Torvalds wrote:
> See - when the VM layer frees pages from a virtual mapping, it doesn't
> throw them away. The pages are still there, and there won't be any "spiral
> of death". If the faulter faults them in quickly, a soft-fault will happen
> without any new memory allocation, and you won't see any more vmascanning.
> It doesn't get "worse", if the working set actually fits in memory.

OK, as long as the aggressive scanning is only increased by hard faults.

> So the only case that actually triggers a "meltdown" is when the working
> set does _not_ fit in memory, in which case not only will the pages be
> unmapped, but they'll also get freed aggressively by the page_launder()
> logic. At that point, the big process will actually end up waiting for the
> pages, and will end up penalizing itself, which is exactly what we want.
> 
> So it should never "spiral out of control", simply because of the fact
> that if we fit in memory it has no other impact than initially doing more
> soft page faults when it tries to find the right balancing point. It only
> really kicks in for real when people are continually trying to free
> memory: which is only true when we really have a working set bigger than
> available memory, and which is exactly the case where we _want_ to
> penalize the people who seem to be the worst offenders.
> 
> So I doubt you get any "subtle cases".

Suppose you have two processes with the same size working set.  Process
A is almost entirely paged out and so everything it does triggers a hard
fault.  This causes A to be aggressively vmscanned, which ensures that
most of A's working set pages aren't mapped, and therefore can be paged
out.

Process B is almost entirely paged in and doesn't fault very much.  It
is not being aggressively vmscanned.  After it takes a hard fault, there is
a good chance that the subsequent few pages it wants are still mapped.

So process A is heavily hard faulting, process B is not, and the
aggressive vmscanning of process A conspires to keep it that way.

It is like the TCP unfairness problem, where one stream captures the
link and the other streams cannot get a fair share.

I am waving my hands a bit but no more than Linus I think :)
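
To make the hand-waving a little more concrete, here is a toy userspace
model (nothing below comes from the real VM code; all numbers and names
are invented): two processes with identical working sets, with the same
total reclaim distributed either in proportion to recent hard faults or
in proportion to resident size, the latter being roughly what picking
victim pages directly would give you.  The fault-driven run stays stuck
at the initial imbalance; the size-driven one converges:

/*
 * Toy model, not kernel code: two processes with identical 100-page
 * working sets, each touching 50 pages per round.  "fault-driven"
 * hands out scan pressure in proportion to each process's recent hard
 * faults; "size-driven" hands out the same total pressure in
 * proportion to resident size.  All numbers are made up.
 */
#include <stdio.h>

#define WSET   100.0    /* working-set pages per process */
#define TOUCH   50.0    /* pages touched per round       */
#define ROUNDS  12

static void simulate(const char *policy, int fault_driven)
{
    double res[2] = { 5.0, 95.0 };  /* A nearly paged out, B resident */
    double flt[2];
    int r, i;

    printf("%s scanning:\n", policy);
    for (r = 0; r < ROUNDS; r++) {
        double need = 0.0, weight = 0.0;

        for (i = 0; i < 2; i++) {
            /* Touched pages that aren't resident hard-fault in. */
            flt[i]  = TOUCH * (1.0 - res[i] / WSET);
            res[i] += flt[i];
            need   += flt[i];           /* pages reclaim must free */
        }
        for (i = 0; i < 2; i++)
            weight += fault_driven ? flt[i] : res[i];
        for (i = 0; i < 2; i++)
            res[i] -= need * (fault_driven ? flt[i] : res[i]) / weight;

        printf("  round %2d: A resident %5.1f (faults %4.1f), "
               "B resident %5.1f (faults %4.1f)\n",
               r, res[0], flt[0], res[1], flt[1]);
    }
}

int main(void)
{
    simulate("fault-driven", 1);
    simulate("size-driven", 0);
    return 0;
}

Obviously a toy, but that is the feedback loop I mean.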

Btw, reverse page mapping resolves this and makes it very simple: no
vmscanning (*), so no hand-waving heuristic.  I agree that every
reverse-mapping scheme except Dave's has appeared rather too heavy.  I
don't know if anyone remembers the one I suggested a few months ago,
based on Dave's.  I believe it addresses the problems Dave noted with
anonymous pages etc.  Must find the time etc.

(*) You might vmscan for efficiency's sake anyway, but it needn't affect
paging decisions.
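
For anyone who hasn't followed the reverse-mapping threads, the shape
of the thing is roughly this -- a standalone sketch with made-up types
and names, not Dave's code and not my patch:

/*
 * Standalone sketch only -- every type and name below is invented for
 * illustration.  The idea: each physical page carries a chain of
 * back-pointers to the places that map it, so reclaim can unmap a
 * chosen victim page directly instead of scanning address spaces to
 * find out who maps it.
 */
#include <stdio.h>
#include <stdlib.h>

struct mapping {                 /* one pte that maps a given page   */
    int             asid;        /* stand-in for the owning mm       */
    unsigned long   vaddr;       /* virtual address of the mapping   */
    struct mapping *next;
};

struct page_frame {              /* stand-in for struct page         */
    struct mapping *rmap;        /* chain of everything mapping it   */
};

/* Record a back-pointer whenever a pte is established. */
static void rmap_add(struct page_frame *pg, int asid, unsigned long va)
{
    struct mapping *m = malloc(sizeof(*m));

    m->asid  = asid;
    m->vaddr = va;
    m->next  = pg->rmap;
    pg->rmap = m;
}

/*
 * Reclaim: pick a victim page by age/LRU/whatever and unmap every
 * pte pointing at it -- no vmscanning, no fault-driven heuristic
 * needed to guess whose working set the page belongs to.
 */
static void rmap_unmap_all(struct page_frame *pg)
{
    while (pg->rmap) {
        struct mapping *m = pg->rmap;

        printf("clear pte: asid %d vaddr %#lx\n", m->asid, m->vaddr);
        pg->rmap = m->next;
        free(m);
    }
}

int main(void)
{
    struct page_frame pg = { NULL };

    rmap_add(&pg, 1, 0x400000UL);    /* page shared by two "mms" */
    rmap_add(&pg, 2, 0x7f0000UL);
    rmap_unmap_all(&pg);
    return 0;
}

The per-mapping bookkeeping is of course exactly where the "rather too
heavy" worry above comes from.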

> Note that this ties in to the thread issue too: if you have a single VM
> and 50 threads that all fault in, that single VM _will_ be penalized. Not
> because it has 50 threads (like the old code did), but because it has a
> very active paging behaviour.
> 
> Which again is exactly what we want: we don't want to penalize threads per
> se, because threads are often used for user interfaces etc and can often
> be largely dormant. What we really want to penalize is bad VM behaviour,
> and that's exactly the information we get from heavy page faulting.

Certainly, it's most desirable to simply treat VMs as just VMs.

What _may_ be a factor is that a VM shared by many threads gets an
unfair share of the processor.  Probably it should not, but right now
it does, and that unfair share certainly skews the scanning and paging
statistics.  I'm not sure whether any counterbalance is needed.

-- Jamie

