Subject: differential throttling & local page replacement
From: Rik van Riel
Date: 2000-08-20  5:18 UTC
To: linux-mm

Hi,

I think I have worked out two (related) changes that make the
system more resistant to thrashing memory hogs, more interactive
under heavy IO load and better behaved under heavy disk writes.

The basic idea revolves around one statistic, kept both per
process and globally; let's call it fault_rate.

On every page fault (in handle_mm_fault()) and on every write()
call we do current->fault_rate++; global_fault_rate++;
Both values are decayed on a regular basis.
(That this may count write faults twice is probably a
beneficial effect.)

This means that for every process we know what percentage of
the page faults in the system belong to that process
(current->fault_rate * 100 / global_fault_rate), and we can
use this ratio for two different, but related, things.
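
A minimal sketch of the bookkeeping (the fault_rate fields, the
decay interval and the halve-each-time policy are my assumptions,
not existing code):

/* Sketch only: task_struct.fault_rate and global_fault_rate are new. */
extern unsigned long global_fault_rate;

/* Called from handle_mm_fault() and from the write() path. */
static inline void account_fault(struct task_struct *tsk)
{
	tsk->fault_rate++;
	global_fault_rate++;
}

/* Run on a regular basis (say once a second, from kswapd or a
 * timer): halve all counters so the ratio reflects recent
 * activity only. */
static void decay_fault_rates(void)
{
	struct task_struct *p;

	read_lock(&tasklist_lock);
	for_each_task(p)
		p->fault_rate >>= 1;
	read_unlock(&tasklist_lock);
	global_fault_rate >>= 1;
}

/* This process' share of recent VM activity, in percent. */
static inline int local_fault_ratio(struct task_struct *tsk)
{
	if (!global_fault_rate)
		return 0;
	return tsk->fault_rate * 100 / global_fault_rate;
}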


First there's write throttling. The current write throttling
code has the disadvantage that one heavily writing thread can
stall all the other threads as well. The formula
(in balance_dirty_state()) is as follows:

if (nr_dirty * 100 > total * max_dirty_percentage/2)
	wake_up_bdflush(NOWAIT);

if (nr_dirty * 100 > total * max_dirty_percentage)
	wake_up_bdflush(WAIT);

Since waiting on a disk write is rather expensive, it
may well be worth it to change the second formula a
bit to take into account the percentage of VM activity
of the current process and put VM or disk bandwidth
hogs to sleep a bit earlier so they won't impact the
performance of lower activity tasks.

local_max_percentage = max_dirty_percentage -
	((current->fault_rate * max_dirty_percentage / 3) /
	 global_fault_rate);

if (nr_dirty * 100 > total * local_max_percentage)
	wake_up_bdflush(WAIT);

This will still allow disk bandwidth eaters to run at
full disk speed, but without stalling the less IO intensive
processes on IO.
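
Put together, the throttling check could look something like the
sketch below (keeping the wake_up_bdflush(NOWAIT/WAIT) shorthand
and the max_dirty_percentage name from above; the in-tree
balance_dirty_state() has a different interface, and the
divide-by-three scaling is just the factor from the formula):

/* Sketch: simplified interface, nr_dirty and total in buffers. */
static void balance_dirty_sketch(int nr_dirty, int total)
{
	int local_max_percentage = max_dirty_percentage;

	/* Kick bdflush early, without waiting. */
	if (nr_dirty * 100 > total * max_dirty_percentage / 2)
		wake_up_bdflush(NOWAIT);

	/* Lower the hard limit by up to a third of
	 * max_dirty_percentage for tasks that cause a large
	 * share of the VM/IO activity. */
	if (global_fault_rate)
		local_max_percentage = max_dirty_percentage -
			((current->fault_rate * max_dirty_percentage / 3) /
			 global_fault_rate);

	if (nr_dirty * 100 > total * local_max_percentage)
		wake_up_bdflush(WAIT);
}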


The second place where we can use these statistics is
in making sure VM hogs (where a hog is defined as a
process that takes lots of page faults and keeps the VM
subsystem busy) don't impact the rest of the system
too much.

The idea here is to make the VM hog do local page replacement
some percentage of the time, the percentage of course being
its own share (current->fault_rate * 100 / global_fault_rate).
To prevent a task from doing only local page replacement while
there are idle pages sitting around, this should only happen
when the amount of free+freeable memory is _lower_ than the
amount of memory kswapd tries to keep free, so local page
replacement only kicks in at the moment when our task truly
starts to burden the rest of the system.

So, in pseudocode:

if (nr_free_pages() + nr_inactive_clean_pages() < kswapd_goal / 2) {
	do_local_fault_rate_percentage_of_the_time()
		swap_out_mm(current->mm);
}

Or, alternatively:

if (nr_free_pages() + nr_inactive_clean_pages() <
				 kswapd_goal * local_fault_ratio)
	swap_out_mm(current->mm);
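
One way to fill in the "percentage of the time" step is a cheap
per-task counter rolled against the task's share of the fault
rate; fault_dice and kswapd_goal below are purely illustrative
names, not existing symbols:

/* Sketch: swap against our own address space 'ratio' percent of
 * the time, but only once free + freeable memory has dropped
 * below half of the kswapd target. */
static void maybe_do_local_replacement(void)
{
	int ratio;

	if (nr_free_pages() + nr_inactive_clean_pages() >= kswapd_goal / 2)
		return;		/* enough easily freeable memory around */

	if (!global_fault_rate)
		return;
	ratio = current->fault_rate * 100 / global_fault_rate;

	current->fault_dice = (current->fault_dice + 1) % 100;
	if (current->fault_dice < ratio)
		swap_out_mm(current->mm);
}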

This way the memory hog (say, mmap002, tar, ...) will do some
page replacement on itself to protect the rest of the system.
Of course we need to do something like that for the page cache
too, but that should be trivial.

regards,

Rik
--
"What you're running that piece of shit Gnome?!?!"
       -- Miguel de Icaza, UKUUG 2000

http://www.conectiva.com/		http://www.surriel.com/

