From: Andrew Morton <akpm@zip.com.au>
To: Rik van Riel <riel@conectiva.com.br>
Cc: "linux-mm@kvack.org" <linux-mm@kvack.org>
Subject: Re: inactive_dirty list
Date: Fri, 06 Sep 2002 15:48:54 -0700
Message-ID: <3D7930D6.F658E5B9@zip.com.au>
In-Reply-To: <Pine.LNX.4.44L.0209061923020.1857-100000@imladris.surriel.com>

Rik van Riel wrote:
> 
> On Fri, 6 Sep 2002, Andrew Morton wrote:
> 
> > > So basically pages should _only_ go into the inactive_dirty list
> > > when they are under writeout.
> >
> > Or if they're just dirty.  The thing I'm trying to achieve
> > is to minimise the amount of scanning of unreclaimable pages.
> >
> > So park them elsewhere, and don't scan them.  We know how many
> > pages are there, so we can make decisions based on that.  But let
> > IO completion bring them back onto the inactive_reclaimable(?)
> > list.
> 
> I guess this means the dirty limit should be near 1% for the
> VM.

What is the thinking behind that?
 
> Every time there is a noticeable amount of dirty pages, kick
> pdflush and have it write out a few of them, maybe the number
> of pages needed to reach zone->pages_high ?

Well we can certainly do that - the current wakeup_bdflush()
is pretty crude:

void wakeup_bdflush(void)
{
        struct page_state ps;

        get_page_state(&ps);
        pdflush_operation(background_writeout, ps.nr_dirty);
}

We can pass background_writeout 42 pages if necessary.  That's
not aware of zones, of course.  It will just write back the
oldest 42 pages from the oldest dirty inode against the last-mounted
superblock.
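
For illustration only, sizing the request from a zone's shortfall
(the "pages needed to reach zone->pages_high" idea quoted above)
might look roughly like the sketch below.  It is a userspace model,
not kernel code: struct zone_model and writeout_target() are
hypothetical names, and the fields only mimic the counters under
discussion.

```c
/*
 * Hypothetical userspace stand-in for per-zone counters.
 * This is NOT the kernel's struct zone; the field names are assumptions.
 */
struct zone_model {
	unsigned long free_pages;	/* pages currently free in the zone */
	unsigned long pages_high;	/* watermark we would like to reach */
	unsigned long nr_dirty;		/* dirty pages attributed to the zone */
};

/*
 * Number of pages to request from writeback: the shortfall below
 * pages_high, capped by the number of dirty pages actually available.
 */
static unsigned long writeout_target(const struct zone_model *z)
{
	unsigned long shortfall;

	if (z->free_pages >= z->pages_high)
		return 0;	/* already at or above the watermark */

	shortfall = z->pages_high - z->free_pages;
	return shortfall < z->nr_dirty ? shortfall : z->nr_dirty;
}
```

Such a count could be handed to background_writeout in place of
ps.nr_dirty, but as noted above the writeback path is not aware of
zones, so nothing guarantees that the pages written belong to the
zone that is short.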

I still have not got my head around:

> We did this in early 2.4 kernels and it was a disaster. The
> reason it was a disaster was that in many workloads we'd
> always have some clean pages and we'd end up always reclaiming
> those before even starting writeout on any of the dirty pages.

Does this imply that we need to block on writeout *instead*
of reclaiming clean pagecache?

We could do something like:

	if (zone->nr_inactive_dirty > zone->nr_inactive_clean) {
		wakeup_bdflush();	/* Hope this writes the correct zone */
		yield();
	}

which would get the IO underway promptly.  But the caller would
still go in and gobble remaining clean pagecache.


The thing which happened (basically by accident) from my Wednesday
hackery was a partitioning of the machine.  40% of memory is
available to pagecache writeout, and that's clamped (ignoring
MAP_SHARED for now..).  And everyone else just walks around it.

So a 1G box running dbench 1000 acts like a 600M box.  Which
is not a bad model, perhaps.  If we can twiddle that 40%
up and down based on <mumble> criteria...
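
As a quick check of that arithmetic, assuming the clamp is a plain
percentage of total memory, a minimal sketch follows.  The helper
name usable_mb() and the round figure of 1000M for a 1G box are
illustrative assumptions, not anything in the kernel.

```c
/*
 * Memory left for everything else when a fixed percentage of the
 * machine is clamped for pagecache writeout.  Hypothetical helper,
 * for illustration only.
 */
static unsigned long usable_mb(unsigned long total_mb, unsigned int writeout_pct)
{
	if (writeout_pct > 100)
		writeout_pct = 100;	/* a clamp cannot exceed the whole machine */

	return total_mb - total_mb * writeout_pct / 100;
}
```

With these assumptions usable_mb(1000, 40) gives 600, matching the
"1G box acts like a 600M box" figure; making the 40% tunable amounts
to varying writeout_pct.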

But that separation of the 40% of unusable memory from the
60% of usable memory is done by scanning at present, and
it costs a bit of CPU.  Not much, but a bit.


(btw, is there any reason at all for having page reserves
in ZONE_HIGHMEM?  I have a suspicion that this is just wasted
memory...)
