throttling dirtiers

* throttling dirtiers
@ 2002-07-31  8:26 Andrew Morton
  2002-07-31 20:06 ` William Lee Irwin III
  0 siblings, 1 reply; 14+ messages in thread
From: Andrew Morton @ 2002-07-31  8:26 UTC (permalink / raw)
  To: Rik van Riel; +Cc: linux-mm

Here's an interesting test.

- mem=512m
- Run a program which mallocs 400 megs and then madly sits
  there touching each page.
- Do a big `dd' to a file.

Everything works nicely - all the anon memory sits on the active
list, all writeback is via shrink_cache -> vm_writeback.
Bandwidth is good.

But as we discussed, we really shouldn't be doing the IO from
within the VM.  balance_dirty_pages() is never triggering
because the system is not reaching 40% dirty.

It would make sense for the VM to detect an overload
of dirty pages coming off the tail of the LRU and to reach
over and tell balance_dirty_pages() to provide throttling.

If we were to say "gee, of the last 1,000 pages, 25% were
dirty, so tell balance_dirty_pages() to throttle everyone"
then that would be too late because the LRU will be _full_
of dirty pages.

I can't think of a sane way of keeping track of the number
of dirty pages on the inactive list, because the locking
is quite disjoint.

But we can certainly track the amount of anon memory in
the machine, and set the balance_dirty_pages thresholds
at 0.4 * (total memory - anon memory) or something like
that.

Thoughts?
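
A minimal sketch of the threshold arithmetic proposed above, using the
numbers from the test (mem=512m, 400 megs of anon memory). The function
name and the MB units are hypothetical, and kernel-reserved memory is
ignored; this is an illustration, not kernel code.

```python
DIRTY_RATIO = 0.4  # the 40% figure from the message


def dirty_threshold_mb(total_mb, anon_mb=0.0, ratio=DIRTY_RATIO):
    """Dirty-page threshold taken over the memory not occupied by anon pages."""
    return ratio * max(total_mb - anon_mb, 0.0)


total_mb, anon_mb = 512.0, 400.0

# Threshold as a fraction of all memory: about 204.8 MB.
global_limit = dirty_threshold_mb(total_mb)

# Threshold as a fraction of (total memory - anon memory): about 44.8 MB.
scaled_limit = dirty_threshold_mb(total_mb, anon_mb)

# Upper bound on the pagecache that the dd can dirty while anon stays resident.
max_dirty = total_mb - anon_mb

print(f"global limit {global_limit:.1f} MB, scaled limit {scaled_limit:.1f} MB, "
      f"at most {max_dirty:.1f} MB can be dirty")
```

Under these assumptions the dd can dirty at most about 112 MB, which
never reaches the global limit but does exceed the scaled one.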
--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/

* Re: throttling dirtiers
  2002-07-31  8:26 throttling dirtiers Andrew Morton
@ 2002-07-31 20:06 ` William Lee Irwin III
  2002-07-31 20:23   ` Benjamin LaHaise
  0 siblings, 1 reply; 14+ messages in thread
From: William Lee Irwin III @ 2002-07-31 20:06 UTC (permalink / raw)
  To: Andrew Morton; +Cc: Rik van Riel, linux-mm

On Wed, Jul 31, 2002 at 01:26:09AM -0700, Andrew Morton wrote:
> Here's an interesting test.
> - mem=512m
> - Run a program which mallocs 400 megs and then madly sits
>   there touching each page.
> - Do a big `dd' to a file.
> Everything works nicely - all the anon memory sits on the active
> list, all writeback is via shrink_cache -> vm_writeback.
> Bandwidth is good.

The VM has to do some of the writeback, because pages can be dirtied
through mmap()'d access. Only scanning for modified bits or trapping
write access (ugh) can find these pages.

On Wed, Jul 31, 2002 at 01:26:09AM -0700, Andrew Morton wrote:
> But as we discussed, we really shouldn't be doing the IO from
> within the VM.  balance_dirty_pages() is never triggering
> because the system is not reaching 40% dirty.
> It would make sense for the VM to detect an overload
> of dirty pages coming off the tail of the LRU and to reach
> over and tell balance_dirty_pages() to provide throttling.
> If we were to say "gee, of the last 1,000 pages, 25% were
> dirty, so tell balance_dirty_pages() to throttle everyone"
> then that would be too late because the LRU will be _full_
> of dirty pages.

balance_dirty_pages() is the closest thing to source throttling
available, so it should definitely be used before VM background
writeback. Perhaps assigning dirty page budgets to tasks and/or
struct address_space and checking for budget excess would be good?
Trouble is I'm not sure exactly how well they can be enforced
given the mmap() problem.
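
A rough sketch of what a dirty page budget could look like, assuming a
single fixed budget per owner (a task or an address_space). The class
and method names are hypothetical. Note that it only sees pages reported
to account_dirty(); stores through mmap() would bypass it, which is the
enforcement problem mentioned above.

```python
from collections import defaultdict


class DirtyBudget:
    """Track dirty pages per owner and flag owners that exceed a budget."""

    def __init__(self, budget_pages):
        self.budget_pages = budget_pages
        self.dirty = defaultdict(int)  # owner -> pages currently dirty

    def account_dirty(self, owner, pages=1):
        """Record newly dirtied pages; return True if the owner is over budget."""
        self.dirty[owner] += pages
        return self.dirty[owner] > self.budget_pages

    def account_cleaned(self, owner, pages=1):
        """Record pages written back for this owner."""
        self.dirty[owner] = max(self.dirty[owner] - pages, 0)


budgets = DirtyBudget(budget_pages=100)
dd_over = budgets.account_dirty("dd", 150)         # heavy writer exceeds budget
editor_over = budgets.account_dirty("editor", 10)  # light writer does not
budgets.account_cleaned("dd", 60)                  # writeback brings dd to 90
dd_after_writeback = budgets.account_dirty("dd", 0)
```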

On Wed, Jul 31, 2002 at 01:26:09AM -0700, Andrew Morton wrote:
> I can't think of a sane way of keeping track of the number
> of dirty pages on the inactive list, because the locking
> is quite disjoint.
> But we can certainly track the amount of anon memory in
> the machine, and set the balance_dirty_pages thresholds
> at 0.4 * (total memory - anon memory) or something like
> that.
> Thoughts?

I'm not a fan of this kind of global decision. For example, I/O devices
may be fast enough and memory small enough to dump all memory in < 1s,
in which case dirtying most or all of memory is okay from a latency
standpoint, or it may take hours to finish dumping out 40% of memory,
in which case it should be far more eager about writeback.
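
To put numbers on that, here is a small sketch that derives a dirty
limit from device bandwidth and a latency target instead of a fixed
fraction of memory. The bandwidth figures, the one-second target and the
function names are all assumptions chosen for illustration.

```python
def flush_seconds(dirty_mb, bandwidth_mb_s):
    """Time needed to write out dirty_mb at a sustained bandwidth."""
    return dirty_mb / bandwidth_mb_s


def dirty_limit_mb(total_mb, bandwidth_mb_s, target_seconds):
    """Largest amount of dirty memory that can be flushed within the target."""
    return min(total_mb, bandwidth_mb_s * target_seconds)


total_mb = 512.0
fast_dev = 1000.0  # MB/s, hypothetical fast device
slow_dev = 0.05    # MB/s, hypothetical very slow device

# Fast device: all of memory can be dumped in well under a second.
fast_all = flush_seconds(total_mb, fast_dev)

# Slow device: 40% of memory takes over an hour to dump.
slow_40 = flush_seconds(0.4 * total_mb, slow_dev)

fast_limit = dirty_limit_mb(total_mb, fast_dev, target_seconds=1.0)
slow_limit = dirty_limit_mb(total_mb, slow_dev, target_seconds=1.0)
```

With these made-up devices the fast one may dirty all 512 MB, while the
slow one gets a limit of a few tens of kilobytes.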

Cheers,
Bill

* Re: throttling dirtiers
  2002-07-31 20:06 ` William Lee Irwin III
@ 2002-07-31 20:23   ` Benjamin LaHaise
  2002-07-31 20:26     ` Rik van Riel
                       ` (2 more replies)
  0 siblings, 3 replies; 14+ messages in thread
From: Benjamin LaHaise @ 2002-07-31 20:23 UTC (permalink / raw)
  To: William Lee Irwin III, Andrew Morton, Rik van Riel, linux-mm

On Wed, Jul 31, 2002 at 01:06:12PM -0700, William Lee Irwin III wrote:
> I'm not a fan of this kind of global decision. For example, I/O devices
> may be fast enough and memory small enough to dump all memory in < 1s,
> in which case dirtying most or all of memory is okay from a latency
> standpoint, or it may take hours to finish dumping out 40% of memory,
> in which case it should be far more eager about writeback.

Why?  Filling the entire ram with dirty pages is okay, and in fact you 
want to support that behaviour for apps that "just fit" (think big 
scientific apps).  The only interesting point is that when you hit the 
limit of available memory, the system needs to block on *any* io 
completing and resulting in clean memory (which is reasonably low 
latency), not a specific io which may have very high latency.
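
A toy comparison of the two waiting strategies, assuming four in-flight
writes with made-up completion times. The function names are
hypothetical; nothing here corresponds to an actual kernel interface.

```python
from statistics import mean

# Hypothetical completion times, in seconds, for four in-flight writes.
completion_times = [0.02, 0.50, 3.00, 0.01]


def wait_for_specific(times, index):
    """Latency when the allocator sleeps on one particular write."""
    return times[index]


def wait_for_any(times):
    """Latency when the allocator wakes on whichever write finishes first."""
    return min(times)


specific = [wait_for_specific(completion_times, i)
            for i in range(len(completion_times))]
worst_specific = max(specific)     # unlucky pick: the slowest write
average_specific = mean(specific)  # pick chosen without knowledge of timing
any_latency = wait_for_any(completion_times)
```

Waiting on any completion is bounded by the fastest write, whereas
waiting on a specific one can be as bad as the slowest.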

		-ben
-- 
"You will be reincarnated as a toad; and you will be much happier."

* Re: throttling dirtiers
  2002-07-31 20:23   ` Benjamin LaHaise
@ 2002-07-31 20:26     ` Rik van Riel
  2002-07-31 20:59     ` William Lee Irwin III
  2002-07-31 21:02     ` Andrew Morton
  2 siblings, 0 replies; 14+ messages in thread
From: Rik van Riel @ 2002-07-31 20:26 UTC (permalink / raw)
  To: Benjamin LaHaise; +Cc: William Lee Irwin III, Andrew Morton, linux-mm

On Wed, 31 Jul 2002, Benjamin LaHaise wrote:

> Why?  Filling the entire ram with dirty pages is okay, and in fact you
> want to support that behaviour for apps that "just fit" (think big
> scientific apps).  The only interesting point is that when you hit the
> limit of available memory, the system needs to block on *any* io
> completing and resulting in clean memory (which is reasonably low
> latency), not a specific io which may have very high latency.

Also, the system shouldn't try writing out the complete inactive
list at once and blocking in __get_request_wait instead of grabbing
pages as they become cleaned.

regards,

Rik
-- 
Bravely reimplemented by the knights who say "NIH".

http://www.surriel.com/		http://distro.conectiva.com/

* Re: throttling dirtiers
  2002-07-31 20:23   ` Benjamin LaHaise
  2002-07-31 20:26     ` Rik van Riel
@ 2002-07-31 20:59     ` William Lee Irwin III
  2002-07-31 21:02     ` Andrew Morton
  2 siblings, 0 replies; 14+ messages in thread
From: William Lee Irwin III @ 2002-07-31 20:59 UTC (permalink / raw)
  To: Benjamin LaHaise; +Cc: Andrew Morton, Rik van Riel, linux-mm

On Wed, Jul 31, 2002 at 01:06:12PM -0700, William Lee Irwin III wrote:
>> I'm not a fan of this kind of global decision. For example, I/O devices
>> may be fast enough and memory small enough to dump all memory in < 1s,
>> in which case dirtying most or all of memory is okay from a latency
>> standpoint, or it may take hours to finish dumping out 40% of memory,
>> in which case it should be far more eager about writeback.

On Wed, Jul 31, 2002 at 04:23:57PM -0400, Benjamin LaHaise wrote:
> Why?  Filling the entire ram with dirty pages is okay, and in fact you 
> want to support that behaviour for apps that "just fit" (think big 
> scientific apps).  The only interesting point is that when you hit the 
> limit of available memory, the system needs to block on *any* io 
> completing and resulting in clean memory (which is reasonably low 
> latency), not a specific io which may have very high latency.

I had more in mind the case of streaming I/O, not things that "just fit".
IIRC scientific apps mmap, so their I/O has to be handled by background
scanning (or trapping writes), and they should end up in the situation
you describe because no one has any idea when to throttle them anyway.
If I/O requests are allowed to proceed without blocking and/or failing
at a greater rate than devices can process them, eventually one is
forced to shove data down the device's throat faster than it can handle.
You just end up with a backlog of dirty memory that could be used
elsewhere but can't be written out, because the rest of memory is
dirtied just as quickly as it's cleaned. That is, if you can't keep up
with dirtiers, you're never going to make forward progress cleaning, and
everything will block/fail anyway when it gets to the end of the memory
supply.  Background VM writeback should also be aware of the rate at
which it should submit I/O, as the most visible symptom is kswapd itself
generating excessive arrival rates to the I/O queues.
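
The forward-progress argument can be sketched as a simple rate model:
pages are dirtied and cleaned at fixed rates per step, and dirty memory
is capped by capacity. The rates, units and function name are
illustrative assumptions only.

```python
def simulate_backlog(dirty_rate, clean_rate, capacity, steps):
    """Return (dirty level after each step, first step at which memory is full)."""
    dirty = 0.0
    history = []
    full_at = None
    for step in range(steps):
        # Net change per step is what was dirtied minus what was cleaned.
        dirty = max(dirty + dirty_rate - clean_rate, 0.0)
        if dirty >= capacity:
            dirty = capacity  # nothing left to dirty: dirtiers block or fail
            if full_at is None:
                full_at = step
        history.append(dirty)
    return history, full_at


# Dirtiers outpace the device: the backlog grows until memory is full.
overrun_history, overrun_full_at = simulate_backlog(30.0, 20.0, 100.0, 20)

# The device keeps up: no backlog ever accumulates.
steady_history, steady_full_at = simulate_backlog(15.0, 20.0, 100.0, 20)
```

In the first run the backlog grows by 10 units per step and memory fills
on the tenth step; unthrottled dirtiers only delay the point at which
everyone blocks.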

Cheers,
Bill

* Re: throttling dirtiers
  2002-07-31 20:23   ` Benjamin LaHaise
  2002-07-31 20:26     ` Rik van Riel
  2002-07-31 20:59     ` William Lee Irwin III
@ 2002-07-31 21:02     ` Andrew Morton
  2002-07-31 21:14       ` Benjamin LaHaise
  2 siblings, 1 reply; 14+ messages in thread
From: Andrew Morton @ 2002-07-31 21:02 UTC (permalink / raw)
  To: Benjamin LaHaise; +Cc: William Lee Irwin III, Rik van Riel, linux-mm

Benjamin LaHaise wrote:
> 
> On Wed, Jul 31, 2002 at 01:06:12PM -0700, William Lee Irwin III wrote:
> > I'm not a fan of this kind of global decision. For example, I/O devices
> > may be fast enough and memory small enough to dump all memory in < 1s,
> > in which case dirtying most or all of memory is okay from a latency
> > standpoint, or it may take hours to finish dumping out 40% of memory,
> > in which case it should be far more eager about writeback.
> 
> Why?  Filling the entire ram with dirty pages is okay, and in fact you
> want to support that behaviour for apps that "just fit" (think big
> scientific apps).  The only interesting point is that when you hit the
> limit of available memory, the system needs to block on *any* io
> completing and resulting in clean memory (which is reasonably low
> latency), not a specific io which may have very high latency.
> 

I hear what you say.  Sometimes we want to allow a lot of
writeback buffering.  But sometimes we don't.

But let's back off a bit.   The problem is that a process
doing a large write() can penalise innocent processes which
want to allocate memory.

How to fix that?

* Re: throttling dirtiers
  2002-07-31 21:02     ` Andrew Morton
@ 2002-07-31 21:14       ` Benjamin LaHaise
  2002-07-31 21:25         ` Rik van Riel
  2002-07-31 21:28         ` Andrew Morton
  0 siblings, 2 replies; 14+ messages in thread
From: Benjamin LaHaise @ 2002-07-31 21:14 UTC (permalink / raw)
  To: Andrew Morton; +Cc: William Lee Irwin III, Rik van Riel, linux-mm

On Wed, Jul 31, 2002 at 02:02:03PM -0700, Andrew Morton wrote:
> But let's back off a bit.   The problem is that a process
> doing a large write() can penalise innocent processes which
> want to allocate memory.
> 
> How to fix that?

First off, make it obvious where we block in the allocation path (pawning 
off all memory reaping to kswapd et al is an easy first step here).  Then 
make allocators cycle through on a FIFO basis by using something like the 
page reservation patch I came up with a while ago.  That'll give us an 
easy place to change scheduling behaviour.
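
The page reservation patch itself is not shown in this thread, so the
following is only a hypothetical illustration of FIFO ordering among
blocked allocators: freed pages go to the oldest waiter first, and later
waiters cannot jump the queue. All names are invented for the sketch.

```python
from collections import deque


class FifoPageReserver:
    """Hand freed pages to blocked allocators strictly in arrival order."""

    def __init__(self):
        self.waiters = deque()  # entries are [name, pages_still_needed]

    def request(self, name, pages):
        """Queue an allocator that needs `pages` pages."""
        self.waiters.append([name, pages])

    def pages_freed(self, pages):
        """Distribute freed pages; return names of allocators now satisfied."""
        satisfied = []
        while pages > 0 and self.waiters:
            waiter = self.waiters[0]
            grant = min(pages, waiter[1])
            waiter[1] -= grant
            pages -= grant
            if waiter[1] == 0:
                self.waiters.popleft()
                satisfied.append(waiter[0])
        return satisfied


reserver = FifoPageReserver()
reserver.request("A", 3)
reserver.request("B", 1)
first = reserver.pages_freed(2)   # A is still one page short; B must wait
second = reserver.pages_freed(2)  # A completes first, then B
```

A single queue like this is also the "easy place" where a different
scheduling policy could later be substituted.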

		-ben
-- 
"You will be reincarnated as a toad; and you will be much happier."

* Re: throttling dirtiers
  2002-07-31 21:14       ` Benjamin LaHaise
@ 2002-07-31 21:25         ` Rik van Riel
  2002-07-31 21:32           ` Andrew Morton
  2002-07-31 21:28         ` Andrew Morton
  1 sibling, 1 reply; 14+ messages in thread
From: Rik van Riel @ 2002-07-31 21:25 UTC (permalink / raw)
  To: Benjamin LaHaise; +Cc: Andrew Morton, William Lee Irwin III, linux-mm

On Wed, 31 Jul 2002, Benjamin LaHaise wrote:
> On Wed, Jul 31, 2002 at 02:02:03PM -0700, Andrew Morton wrote:
> > But let's back off a bit.   The problem is that a process
> > doing a large write() can penalise innocent processes which
> > want to allocate memory.
> >
> > How to fix that?
>
> First off, make it obvious where we block in the allocation path (pawning
> off all memory reaping to kswapd et al is an easy first step here).  Then
> make allocators cycle through on a FIFO basis by using something like the
> page reservation patch I came up with a while ago.  That'll give us an
> easy place to change scheduling behaviour.

These ingredients are already in 2.4-rmap.

We need an extra component, a lower-latency shrink_cache/page_launder.

regards,

Rik
-- 
Bravely reimplemented by the knights who say "NIH".

http://www.surriel.com/		http://distro.conectiva.com/

* Re: throttling dirtiers
  2002-07-31 21:14       ` Benjamin LaHaise
  2002-07-31 21:25         ` Rik van Riel
@ 2002-07-31 21:28         ` Andrew Morton
  2002-07-31 21:35           ` Rik van Riel
  1 sibling, 1 reply; 14+ messages in thread
From: Andrew Morton @ 2002-07-31 21:28 UTC (permalink / raw)
  To: Benjamin LaHaise; +Cc: William Lee Irwin III, Rik van Riel, linux-mm

Benjamin LaHaise wrote:
> 
> On Wed, Jul 31, 2002 at 02:02:03PM -0700, Andrew Morton wrote:
> > But let's back off a bit.   The problem is that a process
> > doing a large write() can penalise innocent processes which
> > want to allocate memory.
> >
> > How to fix that?
> 
> First off, make it obvious where we block in the allocation path (pawning
> off all memory reaping to kswapd et al is an easy first step here).  Then
> make allocators cycle through on a FIFO basis by using something like the
> page reservation patch I came up with a while ago.  That'll give us an
> easy place to change scheduling behaviour.

None of that will preferentially throttle the source of
dirty pages, which seems a good thing to do?

* Re: throttling dirtiers
  2002-07-31 21:25         ` Rik van Riel
@ 2002-07-31 21:32           ` Andrew Morton
  2002-07-31 21:55             ` Rik van Riel
  0 siblings, 1 reply; 14+ messages in thread
From: Andrew Morton @ 2002-07-31 21:32 UTC (permalink / raw)
  To: Rik van Riel; +Cc: Benjamin LaHaise, William Lee Irwin III, linux-mm

Rik van Riel wrote:
> 
> On Wed, 31 Jul 2002, Benjamin LaHaise wrote:
> > On Wed, Jul 31, 2002 at 02:02:03PM -0700, Andrew Morton wrote:
> > > But let's back off a bit.   The problem is that a process
> > > doing a large write() can penalise innocent processes which
> > > want to allocate memory.
> > >
> > > How to fix that?
> >
> > First off, make it obvious where we block in the allocation path (pawning
> > off all memory reaping to kswapd et al is an easy first step here).  Then
> > make allocators cycle through on a FIFO basis by using something like the
> > page reservation patch I came up with a while ago.  That'll give us an
> > easy place to change scheduling behaviour.
> 
> These ingredients are already in 2.4-rmap.

It doesn't seem to work.  The -ac kernel has weird stalls on storms
of ext3 writeback.  It's quite irritating, although probably not to
do with the VM.

The scheduler in the -ac kernel is also bad.  Start a kernel build
and things like X apps and gdb become hugely slow.  2.5 is like that
too.  I'll be going back to Marcelo.

* Re: throttling dirtiers
  2002-07-31 21:28         ` Andrew Morton
@ 2002-07-31 21:35           ` Rik van Riel
  0 siblings, 0 replies; 14+ messages in thread
From: Rik van Riel @ 2002-07-31 21:35 UTC (permalink / raw)
  To: Andrew Morton; +Cc: Benjamin LaHaise, William Lee Irwin III, linux-mm

On Wed, 31 Jul 2002, Andrew Morton wrote:

> > First off, make it obvious where we block in the allocation path (pawning
> > off all memory reaping to kswapd et al is an easy first step here).  Then
> > make allocators cycle through on a FIFO basis by using something like the
> > page reservation patch I came up with a while ago.  That'll give us an
> > easy place to change scheduling behaviour.
>
> None of that will preferentially throttle the source of
> dirty pages, which seems a good thing to do?

But it will throttle the page dirtiers we care about, ie. the
ones allocating new memory.

I'm not sure we care too much about re-dirtying pagecache pages;
if that is happening we want to keep those pages resident anyway.

regards,

Rik
-- 
Bravely reimplemented by the knights who say "NIH".

http://www.surriel.com/		http://distro.conectiva.com/

* Re: throttling dirtiers
  2002-07-31 21:32           ` Andrew Morton
@ 2002-07-31 21:55             ` Rik van Riel
  2002-07-31 22:24               ` Andrew Morton
  0 siblings, 1 reply; 14+ messages in thread
From: Rik van Riel @ 2002-07-31 21:55 UTC (permalink / raw)
  To: Andrew Morton; +Cc: Benjamin LaHaise, William Lee Irwin III, linux-mm

On Wed, 31 Jul 2002, Andrew Morton wrote:

> > These ingredients are already in 2.4-rmap.
>
> It doesn't seem to work.  The -ac kernel has weird stalls on
> storms of ext3 writeback.

Maybe you shouldn't have cut off the other line from my
2-line mail ;)))

The most probable reason for the stalls is the fact that
page_launder (like shrink_cache) will try to write out
the complete inactive list if it's almost full of dirty
pages, so the system will still be stuck in __get_request_wait
seconds after the first few megabytes of the paged out
inactive pages have been cleaned already.
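
A back-of-the-envelope model of that stall, assuming writes complete one
page at a time at a fixed cost and that the caller only needs a handful
of clean pages. The page counts, the per-page cost and the function
names are invented for illustration.

```python
def stall_whole_list(dirty_pages, secs_per_page):
    """Stall when the caller pushes out the entire dirty inactive list."""
    return dirty_pages * secs_per_page


def stall_until_enough(pages_needed, dirty_pages, secs_per_page):
    """Stall when the caller stops as soon as enough pages have been cleaned."""
    return min(pages_needed, dirty_pages) * secs_per_page


dirty_pages = 50_000    # hypothetical inactive list, almost all dirty
secs_per_page = 0.0002  # hypothetical cost of writing one page
pages_needed = 32       # what the allocating task actually wants

whole = stall_whole_list(dirty_pages, secs_per_page)
enough = stall_until_enough(pages_needed, dirty_pages, secs_per_page)
```

Under these assumptions the whole-list approach stalls for about ten
seconds, while grabbing pages as they become clean stalls for a few
milliseconds.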

cheers,

Rik
-- 
Bravely reimplemented by the knights who say "NIH".

http://www.surriel.com/		http://distro.conectiva.com/

* Re: throttling dirtiers
  2002-07-31 21:55             ` Rik van Riel
@ 2002-07-31 22:24               ` Andrew Morton
  2002-07-31 22:32                 ` Rik van Riel
  0 siblings, 1 reply; 14+ messages in thread
From: Andrew Morton @ 2002-07-31 22:24 UTC (permalink / raw)
  To: Rik van Riel; +Cc: Benjamin LaHaise, William Lee Irwin III, linux-mm

Rik van Riel wrote:
> 
> On Wed, 31 Jul 2002, Andrew Morton wrote:
> 
> > > These ingredients are already in 2.4-rmap.
> >
> > It doesn't seem to work.  The -ac kernel has weird stalls on
> > storms of ext3 writeback.
> 
> Maybe you shouldn't have cut off the other line from my
> 2-line mail ;)))
> 
> The most probable reason for the stalls is the fact that
> page_launder (like shrink_cache) will try to write out
> the complete inactive list if it's almost full of dirty
> pages, so the system will still be stuck in __get_request_wait
> seconds after the first few megabytes of the paged out
> inactive pages have been cleaned already.

I doubt if it's that, although it might be.

It happens just during a kernel build, 768M of RAM.  And/or
during big CVS operations.  Possibly it's due to ext3 checkpointing.
In ordered data mode with these workloads, kupdate should normally
be doing that, so it may be a kupdate problem, or a missing
wakeup_bdflush.

It's not a big issue - people would be unlikely to notice unless
they were switching between kernels, and were ravingly impatient,
like me.

* Re: throttling dirtiers
  2002-07-31 22:24               ` Andrew Morton
@ 2002-07-31 22:32                 ` Rik van Riel
  0 siblings, 0 replies; 14+ messages in thread
From: Rik van Riel @ 2002-07-31 22:32 UTC (permalink / raw)
  To: Andrew Morton; +Cc: Benjamin LaHaise, William Lee Irwin III, linux-mm

On Wed, 31 Jul 2002, Andrew Morton wrote:

> > The most probable reason for the stalls is the fact that
> > page_launder (like shrink_cache) will try to write out
> > the complete inactive list if it's almost full of dirty
> > pages, so the system will still be stuck in __get_request_wait
> > seconds after the first few megabytes of the paged out
> > inactive pages have been cleaned already.
>
> I doubt if it's that, although it might be.
>
> It happens just during a kernel build, 768M of RAM.  And/or
> during big CVS operations.  Possibly it's due to ext3 checkpointing.

Indeed, my scenario above is unlikely to be the reason with
these workloads.

However, I have noticed the problem with fillmem, or just
when the system has the sudden urge to swapout a large
process ;)

regards,

Rik
-- 
Bravely reimplemented by the knights who say "NIH".

http://www.surriel.com/		http://distro.conectiva.com/
