Re: RFC: design for new VM - Gerrit.Huizenga

From: Gerrit.Huizenga@us.ibm.com
To: chucklever@bigfoot.com
Cc: linux-mm@kvack.org, linux-kernel@vger.rutgers.edu,
	Linus Torvalds <torvalds@transmeta.com>
Subject: Re: RFC: design for new VM
Date: Mon, 07 Aug 2000 17:36:43 -0700
Message-ID: <200008080036.RAA03032@eng2.sequent.com>
In-Reply-To: Your message of Mon, 07 Aug 2000 16:55:32 EDT. <87256934.0072FA16.00@d53mta04h.boulder.ibm.com>

Hi Chuck,

> 1.  kswapd runs in the background and wakes up every so often to handle
> the corner cases that smooth bursty memory request workloads.  it executes
> the same code that is invoked from the kernel's memory allocator to
> reclaim pages.

 yep...  We do the same, although primarily through RSS management and our
 pageout daemon (separate from swapout).

 One possible difference - dirty pages are scheduled for asynchronous
 flush to disk and then moved to the end of the free list after IO
 is complete.  If the process faults on that page, either before it is
 paged out or afterwards, it can be "reclaimed" either from the dirty
 list or the free list, without re-reading from disk.  The pageout daemon
 runs when the dirty list reaches a tuneable size, and it shrinks
 the list to a tuneable size, moving all written pages to the
 free list.
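
 To make the list handling concrete, here is a minimal sketch of the
 scheme in Python.  It is a toy model, not our actual code: the class
 name PagePool and the two thresholds high_water and low_water are
 names invented for illustration, and the asynchronous write is
 reduced to moving a page between lists.

```python
from collections import OrderedDict


class PagePool:
    """Toy model of a dirty list feeding the tail of a free list."""

    def __init__(self, high_water, low_water):
        # Hypothetical tunables: pageout starts when the dirty list reaches
        # high_water and stops once it has shrunk to low_water.
        self.high_water = high_water
        self.low_water = low_water
        self.dirty = OrderedDict()  # page id -> owner, oldest first
        self.free = OrderedDict()   # page id -> owner, head is reused first
        self.disk_reads = 0

    def release_dirty(self, page, owner):
        """A process gives up a dirty page; it waits on the dirty list."""
        self.dirty[page] = owner
        if len(self.dirty) >= self.high_water:
            self.pageout()

    def pageout(self):
        """Write the oldest dirty pages until the list is down to low_water.

        Written pages are appended to the END of the free list, so they are
        the last to be reused and stay reclaimable as long as possible.
        """
        while len(self.dirty) > self.low_water:
            page, owner = self.dirty.popitem(last=False)
            self.free[page] = owner

    def fault(self, page):
        """Report where a faulting page was found: dirty, free, or disk."""
        if page in self.dirty:
            del self.dirty[page]
            return "dirty"
        if page in self.free:
            del self.free[page]
            return "free"
        self.disk_reads += 1  # only this path costs a read from disk
        return "disk"


pool = PagePool(high_water=4, low_water=2)
for page_id in range(4):
    pool.release_dirty(page_id, owner="A")

print(pool.fault(0))  # written out already, reclaimed from the free list
print(pool.fault(3))  # not yet written, reclaimed from the dirty list
print(pool.fault(9))  # never seen, has to come from disk
```

 Only the last fault costs a disk read; the first two are satisfied
 from the lists.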

 In many ways, similar to what Rik is proposing, although I don't see any
 "fast reclaim" capability.  Also, the method by which pages are aged
 is quite different (global phys memory scan vs. processes maintaining
 their own LRU set).  Having a list of prime candidates to flush makes
 the kswapd/pageout overhead lower than using a global clock hand, but
 the global clock hand *may* perform better global optimisation
 of page aging.
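
 The two aging styles can be contrasted with a rough sketch.  Both
 halves are textbook simplifications written for this comparison, not
 the code of either system: clock_pick_victim is a plain second-chance
 sweep over every physical frame, and ProcessLRU is a hypothetical
 per-process list that always has its next candidate at the head.

```python
from collections import OrderedDict


def clock_pick_victim(ref_bits, hand):
    """Global clock hand (second chance) over all physical frames.

    Clears reference bits as it sweeps and stops at the first frame whose
    bit is already clear.  Returns (victim, new_hand, frames_examined).
    """
    n = len(ref_bits)
    examined = 0
    while True:
        examined += 1
        if ref_bits[hand]:
            ref_bits[hand] = False      # give the frame a second chance
            hand = (hand + 1) % n
        else:
            return hand, (hand + 1) % n, examined


class ProcessLRU:
    """Per-process LRU set: the coldest page is always at the head."""

    def __init__(self):
        self.pages = OrderedDict()

    def touch(self, page):
        # Move the page to the most-recently-used end.
        self.pages.pop(page, None)
        self.pages[page] = True

    def pick_victim(self):
        # No scan needed: the prime candidate is already known.
        page, _ = self.pages.popitem(last=False)
        return page


victim, hand, examined = clock_pick_victim([True, True, False, True], hand=0)
print(victim, hand, examined)  # 2 3 3

lru = ProcessLRU()
for page in ["a", "b", "c", "a"]:
    lru.touch(page)
print(lru.pick_victim())  # b
```

 The clock hand had to examine three frames to find one victim, while
 the LRU set produced its candidate in one step, but only the clock
 hand compares pages across all processes.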

> 2.  i agree with you that when the system exhausts memory, it hits a hard
> knee; it would be better to soften this.  however, the VM system is
> designed to optimize the case where the system has enough memory.  in
> other words, it is designed to avoid unnecessary work when there is no
> need to reclaim memory.  this design was optimized for a desktop workload,
> like the scheduler or ext2 "async" mode.  if i can paraphrase other
> comments i've heard on these lists, it epitomizes a basic design
> philosophy: "to optimize the common case gains the most performance
> advantage."

 This works fine until I have a stable load on my system and then
 start {Netscape, StarOffice, VMware, etc.} which then causes IO for
 demand paging of the executable, as well as paging/swapping activity
 to make room for the piggish footprints of these bigger applications.

 This is where it might help to pre-write dirty pages when the system
 is more idle, without fully returning those pages to the free list.
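
 As a rough illustration of that idea, the sketch below cleans dirty
 pages during idle time but leaves them resident.  The function names,
 the is_idle flag and the batch size are all assumptions made up for
 the example; a real kernel would issue asynchronous writes rather
 than flip a flag.

```python
def prewrite_when_idle(pages, is_idle, batch):
    """Write back up to `batch` dirty pages, but only when the system is idle.

    Pages stay where they are; only their dirty flag is cleared, which
    stands in here for a completed asynchronous write.
    """
    if not is_idle:
        return 0
    written = 0
    for page in pages:
        if written == batch:
            break
        if page["dirty"]:
            page["dirty"] = False
            written += 1
    return written


def writes_needed_to_free(pages):
    """Disk writes still required before every page could be reused."""
    return sum(1 for page in pages if page["dirty"])


pages = [{"dirty": True} for _ in range(5)]
print(writes_needed_to_free(pages))                      # 5
print(prewrite_when_idle(pages, is_idle=True, batch=3))  # 3
print(writes_needed_to_free(pages))                      # 2
```

 When the big application finally starts, only two writes stand
 between it and five reusable pages instead of five.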

> can a soft-knee swapping algorithm be demonstrated that doesn't impact the
> performance of applications running on a system that hasn't exhausted its
> memory?
> 
>      - Chuck Lever

 Our VM doesn't exhibit a strong knee, but its method of avoiding that
 is again the flexing RSS management.  Inactive processes tend to shrink
 to their working footprint, larger processes tend to grow
 their footprint but still self-manage within the limits of available
 memory.  I think it is possible to soften the knee on a per-workload
 basis, and that's probably a spot for some tuneables, e.g. when to
 flush old dirty pages and how many to flush.  I think Rik has already
 talked about having those tuneables.
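
 A very small sketch of what "flexing" an RSS limit could look like
 follows.  The policy and every parameter name (faulting, grow_step,
 shrink_step) are assumptions chosen for illustration, not a
 description of our implementation.

```python
def flex_rss(rss_limit, working_set, faulting, free_pages,
             grow_step=8, shrink_step=8):
    """Return an adjusted RSS limit (in pages) for one process.

    A faulting process grows, but never by more than memory allows.
    A quiet process shrinks toward its working set and no further.
    grow_step and shrink_step are hypothetical tunables.
    """
    if faulting:
        return rss_limit + min(grow_step, max(free_pages, 0))
    if rss_limit > working_set:
        return max(working_set, rss_limit - shrink_step)
    return rss_limit


print(flex_rss(100, 60, faulting=False, free_pages=50))  # 92
print(flex_rss(64, 60, faulting=False, free_pages=50))   # 60
print(flex_rss(100, 60, faulting=True, free_pages=3))    # 103
print(flex_rss(100, 60, faulting=True, free_pages=0))    # 100
```

 Because growth is capped by free_pages, a large process slows its
 expansion as memory tightens instead of hitting a wall, which is the
 softening effect described above.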

 Despite the fact that our systems have been primarily deployed for
 a single workload type (databases), we have still found that (the
 right!) VM tuneables can have an enormous impact on performance. I
 think the same will be much more true of an OS like Linux which tries
 to be many things to all people.

gerrit
