From: Matthew Dillon <dillon@apollo.backplane.com>
To: Rik van Riel <riel@conectiva.com.br>
Cc: Daniel Phillips <phillips@innominate.de>, linux-mm@kvack.org
Subject: Re: Interesting item came up while working on FreeBSD's pageout daemon
Date: Thu, 21 Dec 2000 19:20:53 -0800 (PST)
Message-ID: <200012220320.eBM3Kr605128@apollo.backplane.com>
In-Reply-To: <Pine.LNX.4.21.0012211741410.1613-100000@duckman.distro.conectiva>

Right. I am going to add another addendum... let me give a little
background first. I've been testing the FBsd VM system with two
extremes... at one extreme is Yahoo, which tends to wind up running
servers that collect a huge number of dirty pages that need to be
flushed, but have lots of disk bandwidth available to flush them.
At the other extreme is a heavily loaded newsreader box which operates
under extreme memory pressure but has mostly clean pages. Heavy
load in this case means 400-600 newsreader processes on a 512MB box
eating around 8MB/sec in new memory.

My original solution for Yahoo was to treat clean and dirty pages at
the head of the inactive queue the same... that is, flush dirty pages
as they were encountered in the inactive queue and free clean pages,
with no limit on dirty page flushes. This worked great for Yahoo,
but failed utterly with the poor news machines. News machines that
were running at a load of 1-2 were suddenly running at loads of 50-150.
i.e. they began to thrash and get really sludgy.
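
In rough C, that first policy amounts to something like the following
standalone sketch (the page/queue types and the launder() and free_page()
helpers here are illustrative stand-ins, not the actual FreeBSD code):

#include <stdbool.h>
#include <stddef.h>

struct page  { struct page *next; bool dirty; };
struct queue { struct page *head; };

/* Stand-ins: start async writeback / return a page to the free list. */
static void launder(struct page *pg)   { pg->dirty = false; }
static void free_page(struct page *pg) { (void)pg; }

/* Scan from the head of the inactive queue until 'target' pages are
 * freed: free clean pages, flush dirty pages as they are encountered,
 * with no cap on how many writes one pass may start. */
static int pageout_pass_unlimited(struct queue *inactive, int target)
{
    int freed = 0;

    while (freed < target && inactive->head != NULL) {
        struct page *pg = inactive->head;
        inactive->head = pg->next;

        if (pg->dirty) {
            launder(pg);            /* flush immediately, no limit */
        } else {
            free_page(pg);
            freed++;
        }
    }
    return freed;
}
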
It took me a few days to figure out what was going on, because the
stats from the news machines showed the pageout daemon having no
problems... it was finding around 10,000 clean pages and 200-400
dirty pages per pass, and flushing the 200-400 dirty pages. That's
at least a 25:1 clean:dirty ratio.

Well, it turns out that the flushing of 200-400 dirty pages per pageout
pass was responsible for the load blowups. The machines had already
been running at 100% disk load, you may recall. Adding the additional
write load, even at 25:1, slowed the drives down enough that suddenly
many of the newsreader processes were blocking on disk I/O. Hence the
load shot through the roof.

I tried to 'fix' the problem by saying "well, ok, so we won't flush
dirty pages immediately, we will give them another runaround in the
inactive queue before we flush them". This worked for medium loads and
I thought I was done, so I wrote my first summary message to Rik and
Linus describing the problem and solution.
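
Sketched the same way (again with illustrative types and helpers, not
the real code), that intermediate fix amounts to: the first time a dirty
page reaches the head of the inactive queue it gets requeued at the tail,
and it is only flushed when it comes around again on a later pass.

#include <stdbool.h>
#include <stddef.h>

struct page  { struct page *next; bool dirty; bool second_chance; };
struct queue { struct page *head, *tail; };

static void enqueue_tail(struct queue *q, struct page *pg)
{
    pg->next = NULL;
    if (q->tail != NULL)
        q->tail->next = pg;
    else
        q->head = pg;
    q->tail = pg;
}

static struct page *dequeue_head(struct queue *q)
{
    struct page *pg = q->head;

    if (pg != NULL) {
        q->head = pg->next;
        if (q->head == NULL)
            q->tail = NULL;
    }
    return pg;
}

/* Stand-ins: start async writeback / return a page to the free list. */
static void launder(struct page *pg)   { pg->dirty = false; }
static void free_page(struct page *pg) { (void)pg; }

/* One pass: scan at most the pages present when the pass started, so a
 * requeued dirty page is not revisited until a later pass. */
static int pageout_pass_second_chance(struct queue *inactive, int target)
{
    int freed = 0;
    int scan = 0;
    struct page *pg;

    for (pg = inactive->head; pg != NULL; pg = pg->next)
        scan++;                              /* queue length at pass start */

    while (freed < target && scan-- > 0 &&
           (pg = dequeue_head(inactive)) != NULL) {
        if (!pg->dirty) {
            free_page(pg);
            freed++;
        } else if (!pg->second_chance) {
            pg->second_chance = true;        /* give it another runaround */
            enqueue_tail(inactive, pg);
        } else {
            launder(pg);                     /* second encounter: flush it */
        }
    }
    return freed;
}
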
--

But the story continues. It turns out that that did NOT fix the
problem. The number of dirty pages being flushed went down, but
not enough. Newsreader machine loads still ran in the 50-100 range.
At this point we really are talking about truly idle-but-dirty pages.
No matter, the machines were still blowing up.

So, to make a long story even longer, after further experiments I
determined that it was the write-load itself blowing up the machines.
Never mind what they were writing ... the simple *act* of writing
anything made the HDs much less efficient than under a read-only load.
Even limiting the number of pages flushed to a reasonable-sounding
number like 64 didn't solve the problem... the load still hovered around
20.

The patch I currently have under test which solves the problem is a
combination of what I had in 4.2-release, which limited the dirty page
flushing to 32 pages per pass, and what I have in 4.2-stable which
has no limit. The new patch basically does this:
(remember pageout passes always free/flush pages from the inactive
queue, never the active queue!)
* Run a pageout pass with a dirty page flushing limit of 32 plus
give dirty inactive pages a second go-around in the inactive
queue.
* If the pass succeeds we are done.
* If the pass cannot free up enough pages (i.e. the machine happens
to have a huge number of dirty pages sitting around, aka the Yahoo
scenario), then take a second pass immediately and do not have any
limit whatsoever on dirty page flushes in the second pass.
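
Spelled out the same way (reusing the hypothetical struct page, queue
helpers, and launder()/free_page() stubs from the previous sketch, so
again not the actual patch), the combined policy adds a per-pass flush
cap and a driver that only falls back to an uncapped pass when the
capped pass misses its free target:

#define FIRST_PASS_FLUSH_LIMIT 32   /* dirty-page flush cap, first pass */

/* One pass over the inactive queue; flush_limit < 0 means "no limit".
 * The scan is bounded by the queue length at the start of the pass. */
static int pageout_pass(struct queue *inactive, int target, int flush_limit)
{
    int freed = 0, flushed = 0, scan = 0;
    struct page *pg;

    for (pg = inactive->head; pg != NULL; pg = pg->next)
        scan++;

    while (freed < target && scan-- > 0 &&
           (pg = dequeue_head(inactive)) != NULL) {
        if (!pg->dirty) {
            free_page(pg);
            freed++;
        } else if (!pg->second_chance) {
            pg->second_chance = true;        /* second go-around */
            enqueue_tail(inactive, pg);
        } else if (flush_limit < 0 || flushed < flush_limit) {
            launder(pg);                     /* flush it this pass */
            flushed++;
        } else {
            enqueue_tail(inactive, pg);      /* over the cap: wait */
        }
    }
    return freed;
}

/* Capped first pass; uncapped second pass only if it fell short. */
static void pageout_daemon(struct queue *inactive, int target)
{
    int freed = pageout_pass(inactive, target, FIRST_PASS_FLUSH_LIMIT);

    if (freed < target)
        pageout_pass(inactive, target - freed, -1 /* no limit */);
}

The uncapped second pass only ever runs when the capped pass cannot
reach its free target, so a mostly-clean news box normally never sees
the extra write load, while a dirty-heavy box still gets its pages
flushed.
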
*THIS* appears to work for both extremes. It's what I'm going to be
committing in the next few days to FreeBSD. BTW, years ago John Dyson
theorized that disk writing could have this effect on read efficiency,
which is why FBsd originally had a 32 page dirty flush limit per pass.
Now it all makes sense, and I've got proof that it's still a problem
with modern systems.

-Matt