* nonblocking-vm.patch
From: Andrew Morton @ 2002-09-04 10:28 UTC (permalink / raw)
To: linux-mm
This is the only interesting part of today's patches really.
- If the page is dirty and its queue is not congested, do some writeback.
- If the page is dirty and its queue is congested, refile the page.
- If the page is under writeback, refile it.
- If the page is dirty and mapped into pagetables, then write the
thing anyway (haven't tested this yet). This is to get around the
problem of big dirty mmaps - everything stalls on request queues.
Oh well.
It'll also have the effect of throttling everyone under heavy
swapout. Which is also not a big worry, IMO.
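That decision tree, condensed into C (a sketch distilled from the patch
below, not the literal code -- the vmstats accounting and locking are
omitted):

	/* Sketch of the new shrink_list() policy; see the patch below. */
	if (PageWriteback(page))
		goto keep_locked;		/* refile; don't wait for IO */
	if (PageDirty(page)) {
		if (!is_page_cache_freeable(page) || !mapping || !may_enter_fs)
			goto keep_locked;	/* refile */
		if (!page->pte.direct &&
		    bdi_write_congested(mapping->backing_dev_info))
			goto keep_locked;	/* queue congested: refile */
		/* mapped-into-pagetables pages fall through: write anyway */
		(*writeback)(page, &wbc);	/* nonblocking clustered writeback */
		goto keep;
	}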
An optimisation of course is to get those dirty pagecache pages
away onto another list. But even with the dopey 400-dbench workload,
50% of the pages were successfully reclaimed. And I don't really care
about CPU efficiency in that case - I think it's only important to
care about CPU efficiency in situations where the VM isn't providing
any benefit (there are free pages, trivially reclaimable clean pagecache,
etc).
The way all this works is to basically partition the machine. 40%
of memory is available to the "heavy dirtier" and 60% is available to
the rest of the world. So if the working set of the innocent processes
exceeds 60% of physical, they get evicted, swapped out, whatever. It's
like that memory just isn't there.
Which is a reasonable and simple model, I think. The 40% is governed by
/proc/sys/vm/dirty_async_ratio, and we could adaptively twiddle it down if
we think the heavy writer is consuming too many resources.
Any suggestions on an algorithm for that? Other comments, improvements,
etc?
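To make that concrete, a minimal sketch of how the write-side gate might
look (only dirty_async_ratio is a real name here; everything else is a
stand-in):

	/*
	 * Hypothetical sketch of the write-side throttle.  Only
	 * dirty_async_ratio is real; the helpers are made up.
	 */
	void throttle_heavy_dirtier(struct address_space *mapping)
	{
		unsigned long limit = total_pages() * dirty_async_ratio / 100;

		while (nr_dirty_pages() > limit)	/* assumed helper */
			write_some_pages(mapping);	/* assumed helper */
	}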
How much complexity would it add to put an inactive_dirty list in there?
--- 2.5.33/mm/vmscan.c~nonblocking-vm	Wed Sep 4 03:05:28 2002
+++ 2.5.33-akpm/mm/vmscan.c	Wed Sep 4 03:09:07 2002
@@ -25,6 +25,7 @@
 #include <linux/buffer_head.h>	/* for try_to_release_page() */
 #include <linux/mm_inline.h>
 #include <linux/pagevec.h>
+#include <linux/backing-dev.h>
 #include <linux/rmap-locking.h>
 #include <asm/pgalloc.h>
@@ -94,6 +95,17 @@ static inline int is_page_cache_freeable
 	return page_count(page) - !!PagePrivate(page) == 2;
 }
 
+struct vmstats {
+	int inspected;
+	int reclaimed;
+	int refiled_nonfreeable;
+	int refiled_no_mapping;
+	int refiled_nofs;
+	int refiled_congested;
+	int written_back;
+	int refiled_writeback;
+} vmstats;
+
 static /* inline */ int
 shrink_list(struct list_head *page_list, int nr_pages, unsigned int gfp_mask,
 		int priority, int *max_scan, int *prunes_needed)
@@ -112,6 +124,8 @@ shrink_list(struct list_head *page_list,
 		page = list_entry(page_list->prev, struct page, lru);
 		list_del(&page->lru);
 
+		vmstats.inspected++;
+
 		if (TestSetPageLocked(page))
 			goto keep;
 		BUG_ON(PageActive(page));
@@ -135,10 +149,8 @@ shrink_list(struct list_head *page_list,
 			(PageSwapCache(page) && (gfp_mask & __GFP_IO));
 
 		if (PageWriteback(page)) {
-			if (may_enter_fs)
-				wait_on_page_writeback(page);	/* throttling */
-			else
-				goto keep_locked;
+			vmstats.refiled_writeback++;
+			goto keep_locked;
 		}
 
 		pte_chain_lock(page);
@@ -188,19 +200,38 @@ shrink_list(struct list_head *page_list,
 		 * will write it. So we're back to page-at-a-time writepage
 		 * in LRU order.
 		 */
-		if (PageDirty(page) && is_page_cache_freeable(page) &&
-				mapping && may_enter_fs) {
+		if (PageDirty(page)) {
 			int (*writeback)(struct page *,
 					struct writeback_control *);
 			const int cluster_size = SWAP_CLUSTER_MAX;
 			struct writeback_control wbc = {
 				.nr_to_write = cluster_size,
+				.nonblocking = 1,
 			};
 
+			if (!is_page_cache_freeable(page)) {
+				vmstats.refiled_nonfreeable++;
+				goto keep_locked;
+			}
+			if (!mapping) {
+				vmstats.refiled_no_mapping++;
+				goto keep_locked;
+			}
+			if (!may_enter_fs) {
+				vmstats.refiled_nofs++;
+				goto keep_locked;
+			}
+			if (!page->pte.direct &&
+			    bdi_write_congested(mapping->backing_dev_info)) {
+				vmstats.refiled_congested++;
+				goto keep_locked;
+			}
+
 			writeback = mapping->a_ops->vm_writeback;
 			if (writeback == NULL)
 				writeback = generic_vm_writeback;
 			(*writeback)(page, &wbc);
+			vmstats.written_back += cluster_size - wbc.nr_to_write;
 			*max_scan -= (cluster_size - wbc.nr_to_write);
 			goto keep;
 		}
@@ -262,6 +293,7 @@ free_ref:
 free_it:
 		unlock_page(page);
 		nr_pages--;
+		vmstats.reclaimed++;
 		if (!pagevec_add(&freed_pvec, page))
 			__pagevec_release_nonlru(&freed_pvec);
 		continue;
* Re: nonblocking-vm.patch
From: Rik van Riel @ 2002-09-04 13:32 UTC (permalink / raw)
To: Andrew Morton; +Cc: linux-mm
On Wed, 4 Sep 2002, Andrew Morton wrote:
> - If the page is dirty and mapped into pagetables, then write the
> thing anyway (haven't tested this yet). This is to get around the
> problem of big dirty mmaps - everything stalls on request queues.
> Oh well.
I don't think we need this. If the request queue is saturated, and
free memory is low, the request queue is guaranteed to be full of
writes, which will result in memory becoming freeable soon.
regards,
Rik
--
Bravely reimplemented by the knights who say "NIH".
http://www.surriel.com/ http://distro.conectiva.com/
* Re: nonblocking-vm.patch
From: Andrew Morton @ 2002-09-04 18:44 UTC (permalink / raw)
To: Rik van Riel; +Cc: linux-mm
Rik van Riel wrote:
>
> On Wed, 4 Sep 2002, Andrew Morton wrote:
>
> > - If the page is dirty and mapped into pagetables, then write the
> > thing anyway (haven't tested this yet). This is to get around the
> > problem of big dirty mmaps - everything stalls on request queues.
> > Oh well.
>
> I don't think we need this. If the request queue is saturated, and
> free memory is low, the request queue is guaranteed to be full of
> writes, which will result in memory becoming freeable soon.
>
OK. But I've gone and removed just about all the VM throttling (with
some glee, I might add).
We do need something in there to prevent kswapd from going berserk.
I'm thinking something like this:
- My code only addresses write(2) pagecache. Need to handle the (IMO rare)
situation of large amounts of dirty MAP_SHARED data.
We do this by always writing it out, and blocking on the request queue.
And by waiting on PageWriteback pages. That's just the pre-me behaviour.
Should be OK for a first pass.
- Similarly, always write out dirty pagecache, so we throttle on the swapdev's
request queue.
Which I think just leaves us with the no-swap-available problem. In this case
we really do need to slow page allocators down (I think. I haven't done _any_
swapless testing).
I have a new function in the block layer `blk_congestion_wait()' which will
make the caller take a nap until some request queue comes unblocked. That's
probably appropriate. There's a corner case where there's writeout underway, but
no queues are congested. In that case we can probably add a wakeup to
end_page_writeback(), and kick it on every 32nd page or whatever. I'll play
with that a bit.
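As a sketch, the throttle point might look like this (the
blk_congestion_wait() signature here is a guess, since the function is
new in this tree):

	/* Sketch: nap until some queue uncongests, or the timeout expires. */
	if (!enough_reclaimed)				/* stand-in condition */
		blk_congestion_wait(WRITE, HZ / 10);	/* assumed signature */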
Now, wrt the magical 40% thing. I'm thinking that we can change it in
this manner:
maximum amount of dirty+writeback pagecache =
min((total memory - mapped memory) / 2, 40% of memory)
(Need some more accurate logic to calculate "total memory")
This means that half of the pool of unmapped memory is available to
heavy writers. So if the machine is busy with lots of mapped memory,
and a burst of writes happens then they will initially be throttled
back fairly hard. But if the write activity continues, `mapped memory'
will shrink due to swapout and pageout, and the amount of memory which
is available to the heavy writer will climb until it hits the (configurable)
40%.
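In code form that's roughly (all helper names here are made up):

	/* Sketch of the proposed dirty-memory limit; helpers are stand-ins. */
	unsigned long dirty_limit(void)
	{
		unsigned long total = total_usable_pages();  /* needs care, as noted */
		unsigned long mapped = nr_mapped_pages();

		return min((total - mapped) / 2,
			   total * dirty_async_ratio / 100);
	}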
* Re: nonblocking-vm.patch
From: Rik van Riel @ 2002-09-04 19:42 UTC (permalink / raw)
To: Andrew Morton; +Cc: linux-mm
On Wed, 4 Sep 2002, Andrew Morton wrote:
> We do need something in there to prevent kswapd from going berserk.
Agreed, but it can be a lot simpler than your idea.
As long as we can free up to zone->pages_high pages,
we don't need to throttle since we're succeeding in
keeping enough pages free to not be woken up for a
while.
If we don't succeed in freeing enough pages, that is
because the pages are still under IO and haven't hit
the disk yet. In this case, we need to wait for the
IO to finish, or at least for some of the pages to
get cleaned. We can do this by simply refusing to
scan that zone again for a number of jiffies, say
1/4 of a second.
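A sketch of that backoff (the zone field name is hypothetical, and the
shrink call is simplified):

	/* Sketch: skip a zone that recently failed to free enough pages. */
	if (time_before(jiffies, zone->scan_defer_until))  /* hypothetical field */
		continue;
	if (shrink_zone(zone) < zone->pages_high)	   /* simplified call */
		zone->scan_defer_until = jiffies + HZ / 4; /* back off 1/4 sec */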
regards,
Rik
--
Bravely reimplemented by the knights who say "NIH".
http://www.surriel.com/ http://distro.conectiva.com/
* Re: nonblocking-vm.patch
From: Andrew Morton @ 2002-09-04 20:14 UTC (permalink / raw)
To: Rik van Riel; +Cc: linux-mm
Rik van Riel wrote:
>
> On Wed, 4 Sep 2002, Andrew Morton wrote:
>
> > We do need something in there to prevent kswapd from going berserk.
>
> Agreed, but it can be a lot simpler than your idea.
>
> As long as we can free up to zone->pages_high pages,
> we don't need to throttle since we're succeeding in
> keeping enough pages free to not be woken up for a
> while.
OK, so after we've taken a scan through shrink_caches,
if we didn't reclaim the required pages then take a
nap.
Suspect that would work. I get a bit upset over scanning non-reclaimable
pages (they shouldn't have been on that list!). But instrumentation
indicates that perhaps I'm being silly ;)
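Something like (shrink_caches() arguments simplified; the nap primitive
is the one discussed above):

	/* Sketch of the kswapd nap; the call signatures are simplified. */
	while (kswapd_has_work())			/* stand-in condition */
	{
		int reclaimed = shrink_caches(classzone, gfp_mask, &max_scan);
		if (reclaimed < SWAP_CLUSTER_MAX)
			blk_congestion_wait(WRITE, HZ / 4);	/* take a nap */
	}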
> If we don't succeed in freeing enough pages, that is
> because the pages are still under IO and haven't hit
> the disk yet. In this case, we need to wait for the
> IO to finish, or at least for some of the pages to
> get cleaned. We can do this by simply refusing to
> scan that zone again for a number of jiffies, say
> 1/4 of a second.
Well, it may be better to terminate that sleep earlier if IO
completes. We can do that in end_page_writeback or in
blk_congestion_wait(). The latter takes a timeout, and
wakes you up earlier if _any_ queue exits congestion, or
if any queue puts back a request against an uncongested queue.
Which is, I think, precisely what we want - a request typically
covers a whole bunch of pages. If the dirty memory is backed
by a non-request-oriented device (are there any such? NFS seems
to be synchronous a lot of the time) then you'll hit the timeout.
* Re: nonblocking-vm.patch
From: Rik van Riel @ 2002-09-04 20:55 UTC (permalink / raw)
To: Andrew Morton; +Cc: linux-mm
On Wed, 4 Sep 2002, Andrew Morton wrote:
> > get cleaned. We can do this by simply refusing to
> > scan that zone again for a number of jiffies, say
> > 1/4 of a second.
>
> Well, it may be better to terminate that sleep earlier if IO
> completes.
But only if enough IO completes. Otherwise we'll just end
up doing too much scanning for no gain again.
Rik
--
Bravely reimplemented by the knights who say "NIH".
http://www.surriel.com/ http://distro.conectiva.com/
* Re: nonblocking-vm.patch
From: Andrew Morton @ 2002-09-04 21:22 UTC (permalink / raw)
To: Rik van Riel; +Cc: linux-mm
Rik van Riel wrote:
>
> On Wed, 4 Sep 2002, Andrew Morton wrote:
>
> > > get cleaned. We can do this by simply refusing to
> > > scan that zone again for a number of jiffies, say
> > > 1/4 of a second.
> >
> > Well, it may be better to terminate that sleep earlier if IO
> > completes.
>
> But only if enough IO completes. Otherwise we'll just end
> up doing too much scanning for no gain again.
>
Well we want to _find_ the just-completed IO, yes? Which implies
parking it onto the cold end of the inactive list at interrupt
time, or a separate list or something.
But let's look at the instrumentation and the profiles first. I
expect it'll be OK.
* Re: nonblocking-vm.patch
From: Rik van Riel @ 2002-09-04 21:34 UTC (permalink / raw)
To: Andrew Morton; +Cc: linux-mm
On Wed, 4 Sep 2002, Andrew Morton wrote:
> > But only if enough IO completes. Otherwise we'll just end
> > up doing too much scanning for no gain again.
>
> Well we want to _find_ the just-completed IO, yes? Which implies
> parking it onto the cold end of the inactive list at interrupt
> time, or a separate list or something.
In rmap14 I'm doing the following things when scanning the
inactive list:
1) if the page was referenced, activate
2) if the page is clean, reclaim
3) if the page is written to disk, keep it at the end of
the list where we start scanning from
4) if we don't write the page to disk (I don't submit too
much IO at once) we move it to the far end of the inactive
list
This means that the pages for which IO completed will be found
somewhere near the start of the list.
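As pseudo-C, a sketch of that policy as described -- not the literal
rmap14 code; the move_to_*() helpers are stand-ins:

	if (page_referenced(page))
		activate_page(page);	/* 1: referenced -> active list */
	else if (!PageDirty(page) && !PageWriteback(page))
		reclaim_page(page);	/* 2: clean -> reclaim */
	else if (PageWriteback(page))
		move_to_scan_end(page);	/* 3: IO submitted -> found again soon */
	else
		move_to_far_end(page);	/* 4: dirty, IO deferred this pass */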
regards,
Rik
--
Bravely reimplemented by the knights who say "NIH".
http://www.surriel.com/ http://distro.conectiva.com/
* Re: nonblocking-vm.patch
From: Andrew Morton @ 2002-09-04 21:46 UTC (permalink / raw)
To: Rik van Riel; +Cc: linux-mm
Rik van Riel wrote:
>
> On Wed, 4 Sep 2002, Andrew Morton wrote:
>
> > > But only if enough IO completes. Otherwise we'll just end
> > > up doing too much scanning for no gain again.
> >
> > Well we want to _find_ the just-completed IO, yes? Which implies
> > parking it onto the cold end of the inactive list at interrupt
> > time, or a separate list or something.
>
> In rmap14 I'm doing the following things when scanning the
> inactive list:
>
> 1) if the page was referenced, activate
> 2) if the page is clean, reclaim
OK. We need to start getting some of that stuff going now. We're
way too swappy at present. I'll merge up your NRU/dropbehind
patch soon. I imagine that you're waiting for me to stop changing
things.
> 3) if the page is written to disk, keep it at the end of
> the list where we start scanning from
hum. With the clustered-writeback-from-the-vm regime, this is
done over in mpage_writepages(). And that walks mapping->dirty_pages,
and moves the pages to the hot end of the inactive list (if they're
already on the inactive list).
I suppose we could just move them to the cold end and scan past them,
but that's a bit lazy.
They could be taken off the LRU altogether and reattached to the cold end
at IO completion.
But then, very little writeback actually happens from inside shrink_list.
> 4) if we don't write the page to disk (I don't submit too
> much IO at once) we move it to the far end of the inactive
> list
>
> This means that the pages for which IO completed will be found
> somewhere near the start of the list.
OK.
(Why don't you move them over to inactive_dirty? I've never understood
those two lists. I suspect the names are misleading?)
* Re: nonblocking-vm.patch
From: Rik van Riel @ 2002-09-04 22:12 UTC (permalink / raw)
To: Andrew Morton; +Cc: linux-mm
On Wed, 4 Sep 2002, Andrew Morton wrote:
> OK. We need to start getting some of that stuff going now. We're
> way too swappy at present. I'll merge up your NRU/dropbehind
> patch soon. I imagine that you're waiting for me to stop changing
> things.
You seemed busy enough already ;)
> > 3) if the page is written to disk, keep it at the end of
> > the list where we start scanning from
>
> hum. With the clustered-writeback-from-the-vm regime, this is
> done over in mpage_writepages(). And that walks mapping->dirty_pages,
> and moves the pages to the hot end of the inactive list (if they're
> already on the inactive list).
>
> I suppose we could just move them to the cold end and scan past them,
> but that's a bit lazy.
Better yet, just leave them in place and scan over them only if
they aren't cleaned yet when they reach the end of the list.
The closer page reclaim is done to pure LRU order, the smoother
the VM seems to work. Quite possibly this is a side effect of
not doing too much IO at once, but still ... ;)
> > 4) if we don't write the page to disk (I don't submit too
> > much IO at once) we move it to the far end of the inactive
> > list
> >
> > This means that the pages for which IO completed will be found
> > somewhere near the start of the list.
>
> OK.
>
> (Why don't you move them over to inactive_dirty? I've never understood
> those two lists. I suspect the names are misleading?)
Sorry, I should have been clearer here.
Page_launder (shrink_cache) scans the inactive_dirty list.
Pages which are ready to be reclaimed get moved to the inactive_clean
list, from where __alloc_pages() deals with them.
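Roughly, the handoff looks like this (a sketch of the description above,
with stand-in helper names):

	/* page_launder(), run mostly by kswapd: clean and demote pages. */
	if (page_ready_to_reclaim(page))		/* stand-in */
		move_to_inactive_clean(page);		/* stand-in */

	/* __alloc_pages(), under pressure: reclaim from the clean list. */
	page = reclaim_from_inactive_clean(zone);	/* stand-in */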
regards,
Rik
--
Bravely reimplemented by the knights who say "NIH".
http://www.surriel.com/ http://distro.conectiva.com/
* Re: nonblocking-vm.patch
From: Andrew Morton @ 2002-09-04 22:41 UTC (permalink / raw)
To: Rik van Riel; +Cc: linux-mm
Rik van Riel wrote:
>
> ...
> Page_launder (shrink_cache) scans the inactive_dirty list.
>
> Pages which are ready to be reclaimed get moved to the inactive_clean
> list, from where __alloc_pages() deals with them.
>
The clang you heard was a penny. (Nickel? Dime?)
So you have kswapd running page_launder most of the time, but under
stress, page allocators will do it too.
With all this infrastructure, we can tell beforehand whether
a writeout will block. And I think that changes everything. It
presumably means that we can get quite a bit smarter in there - if
kswapd sees a non-blockingly-writeable mapping, go write it and move
the pages <here>. If kswapd sees some dirty pages which might cause
request queue blockage, then move them <there>. If the caller is _not_
kswapd then blocking is sometimes desirable, so do something else.
I think I'm pretty much finished mangling vmscan.c (honest). Let
me get the current stuff settled in and working not-completely-terribly,
then you can get it working properly, OK? Should be a few days more..
I'll leave the additional instrumentation in place for a while, and find some
way of getting the kernel to spit it out on demand.
* Re: nonblocking-vm.patch
From: Rik van Riel @ 2002-09-04 22:46 UTC (permalink / raw)
To: Andrew Morton; +Cc: linux-mm
On Wed, 4 Sep 2002, Andrew Morton wrote:
> Rik van Riel wrote:
> >
> > ...
> > Page_launder (shrink_cache) scans the inactive_dirty list.
> >
> > Pages which are ready to be reclaimed get moved to the inactive_clean
> > list, from where __alloc_pages() deals with them.
>
> The clang you heard was a penny. (Nickel? Dime?)
>
> So you have kswapd running page_launder most of the time, but under
> stress, page allocators will do it too.
kswapd (well, page_launder) moves pages from the inactive_dirty list to the
inactive_clean list. Page allocators grab pages from the inactive_clean
list.
> With all this infrastructure, we can tell beforehand whether
> a writeout will block. And I think that changes everything. It
> presumably means that we can get quite a bit smarter in there - if
> kswapd sees a non-blockingly-writeable mapping, go write it and move
> the pages <here>. If kswapd sees some dirty pages which might cause
> request queue blockage, then move them <there>. If the caller is _not_
> kswapd then blocking is sometimes desirable, so do something else.
Absolutely.
> I think I'm pretty much finished mangling vmscan.c (honest). Let
> me get the current stuff settled in and working not-completely-terribly,
> then you can get it working properly, OK? Should be a few days more..
>
> I'll leave the additional instrumentation in place for the while, find some
> way of getting the kernel to spit it out on demand.
Sounds great. Btw, what I have found is that once the right mechanism
is in place, additional tweaking of magic numbers achieves exactly ...
nothing.
A good mechanism balances itself.
regards,
Rik
--
Bravely reimplemented by the knights who say "NIH".
http://www.surriel.com/ http://distro.conectiva.com/
* Re: nonblocking-vm.patch
From: Andrew Morton @ 2002-09-05 5:43 UTC (permalink / raw)
To: Rik van Riel; +Cc: linux-mm
For the record...
One thing we could do is to make the heavy write()r perform
blocking writeback in the page allocator:
generic_file_write()
{
	/* remember which queue this writer is dirtying (the bdi field
	   on the task is part of the proposal, not existing code) */
	current->bdi = mapping->backing_dev_info;
	...
	current->bdi = NULL;
}

shrink_list()
{
	...
	/* the heavy dirtier cleans his own pages, blocking on his queue */
	if (PageDirty(page) && mapping->backing_dev_info == current->bdi)
		writeback(page->mapping);
	...
}
So when that writer allocates a page, he gets to clean up
his own mess, rather than scanning past those pages.
We have to write back just that queue; otherwise we get back to
the situation where one queue becomes congested and that blocks the
whole world.
It's just an idea to bear in mind - balance_dirty_pages() is
supposed to be the place where this happens, but the above would
perhaps mop up some mmapped dirty memory, stray dirty pages which
reach the cold end of the LRU, etc. And this is definitely a
writeback resource which we can use in that situation.