From: Roger Luethi <rl@hellgate.ch>
To: Rik van Riel <riel@redhat.com>
Cc: William Lee Irwin III <wli@holomorphy.com>,
linux-mm@kvack.org, Andrew Morton <akpm@digeo.com>
Subject: Re: load control demotion/promotion policy
Date: Mon, 22 Dec 2003 00:55:42 +0100 [thread overview]
Message-ID: <20031221235541.GA22896@k3.hellgate.ch> (raw)
In-Reply-To: <Pine.LNX.4.44.0312202125580.26393-100000@chimarrao.boston.redhat.com>
On Sat, 20 Dec 2003 21:33:34 -0500, Rik van Riel wrote:
> I've got an idea for a load control / memory scheduling
> policy that is inspired by the following requirements
> and data points:
It is my understanding that wli is interested in load control because
he knows this Russian guy who puts an insane load on his box. Do you
have friends in Russia as well? Isn't there _anybody_ interested in
the fact that 2.6 performance completely breaks down under a light
overload where 2.4 doesn't and where load control would be more of a
problem than a solution? Heck, I even showed that you don't have to give
up physical scanning to get most of the pageout performance back! Oh,
and btw: Did I overlook this problem on akpm's should/must fix lists,
or is it missing for a reason?
I can't help but think of the man who looks for his keys not where he
lost them but near the lamp post, where the light is. While I agree
that working on load control is a lot more fun, it is _pageout_ that
has been completely borked in 2.6 and there is no way in hell load
control can fix that. Load control trades latency for throughput and
makes sense for some situations after pageout tuning has been exhausted,
which is not true at all for Linux 2.6.
I hate to be a pest but I am still entirely unconvinced that load control
is what 2.6 needs at this point. Maybe I should make that ceterum censeo
a sig.
That said, here's my take:
> 1) wli pointed out that one of the better performing load
> control mechanisms is one that swaps out the SMALLEST
> process (easy to swap out, removes one process worth of
> IO load from the system)
According to wli this strategy was 15% better than random selection in
terms of throughput / CPU usage. Those 15% may well be quite solid for
transaction based systems, but typical Linux systems and workloads are
different animals and it doesn't seem safe to rely on those numbers here.
Also, on modern servers/workstations with load control, latency will
become a much bigger problem than +/- 15% throughput could ever be.
Bottom line: We would have to benchmark various criteria anyway and
choosing the smallest process is arguably quite arbitrary. The best I
could say about it is that for all we know it's as good as any other
policy.
> 2) small processes, like root shells, should not be
> swapped out for a long time, but should be swapped
> back in relatively quickly
>
> 3) because swapping big processes in or out is a lot of
> work, we should do that infrequently
>
> 4) however, once a big process is swapped out, it should
> stay out for a long time because it greatly reduces
> the amount of memory the system needs
>
> The swapout selection loop would be as follows:
> - calculate (rss / resident time) for every process
> - swap out the process where this value is lowest
> - remember the rss and swapout time in the task struct
>
> At swapin time we can do the opposite, looking at
> every process in the swapped out queue and waking up
> the process where (swap_rss / (now - swap_time)) is
> the smallest.
If I understand your description correctly, you'll probably stun sshd
early on, because it will have accrued an impressive resident time.
If the user starts a fat GUI administration tool to study/fix the load
problem, it will likely hit the sack as well and stay there for a long
time. IOW, you will help some users and quite possibly make things worse
for others.
Of course I don't claim your selection algorithm is any worse than mine,
but I doubt it is much better. It is hard to get right -- looks like
the OOM killer all over again.
As for the implementation: An overload situation that is grave enough
to make load control worthwhile should be a rare event. I didn't think
I could justify growing the task struct even further for that. So when
I wanted to save some state (like RSS at stunning time), I kept it in
local variables where the processes hit the wait queue. I didn't use
it for global comparisons like what you are suggesting, but even that
is possible with some extra effort. And at the time load control is
kicking in, we've got plenty of CPU cycles to spend on extra efforts.
Roger
--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org. For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: aart@kvack.org