linux-mm.kvack.org archive mirror
* TODO list for new VM  (oct 2000)
@ 2000-10-02 18:01 Rik van Riel
  2000-10-02 18:20 ` Linus Torvalds
  0 siblings, 1 reply; 5+ messages in thread
From: Rik van Riel @ 2000-10-02 18:01 UTC (permalink / raw)
  To: linux-kernel; +Cc: linux-mm, Matthew Dillon, Linus Torvalds

[MM TODO list, updated for October 2000]

---
Here is the TODO list for the new VM. The only thing
really needed for 2.4 is the OOM handler and a fix
for the highmem deadlock.

The page->mapping->flush() callback is really wanted
by the journaling filesystem folks.

The rest are mostly extras that would be nice; these
things won't be pushed for inclusion unless they turn
out to be really trivial to implement, perform well in
the cases they're supposed to affect, and have a highly
localised influence...

(sorry folks, but for 2.4 I'll be really conservative)

---> TODO list for the new VM <---

for kernel 2.4, necessary:
- out of memory handling
	[integrate the OOM killer, 10 minutes work]
- fix the highmem deadlock, where the swapper cannot create
  low memory bounce buffers OR swap out low memory because
  it has consumed all resources
	[old bug, already reported with 2.4.0-test6, probably before]

for kernel 2.4, really wanted:
- page->mapping->flush() callback in page_launder(),
  for easier integration with journaling filesystems
  and maybe the network filesystems
	[about 30 minutes of work on the VM side]

for kernel 2.4, wanted:
- maybe rebalance the swapper a bit ... we do page aging
  now so maybe refill_inactive_scan() / shm_swap() and
  swap_out() need to be rebalanced a bit

for kernel 2.5:    (maybe available as patch for 2.4 ???)
- physical->virtual reverse mapping, so we can do much
  better page aging with less CPU usage spikes
- better IO clustering for swap (and filesystem) IO
- move all the global VM variables, lists, etc. into
  the pgdat struct for better NUMA scalability
- (maybe) some QoS things, as far as they are major
  improvements with minor intrusion
- thrashing control, maybe process suspension with some
  forced swapping ?
- include Ben LaHaise's code, which moves readahead
  to the VMA level, this way we can do streaming swap
  IO, complete with drop_behind()
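
[Editorial sketch, not part of the original mail.] The
page->mapping->flush() item above can be pictured with a small
user-space model. Everything here is an assumption made for
illustration: the struct layouts, launder_one_page(),
default_writepage() and the example_* names are hypothetical and are
not the kernel's actual definitions. The idea is only that the
laundering loop asks the owning mapping how to write a dirty page
back, and falls back to a generic path when no hook is installed.

```c
#include <assert.h>
#include <stddef.h>

struct page;

/* Hypothetical stand-in for the per-mapping operations. */
struct address_space {
    /* Optional hook: returns 0 once the page is clean, -1 if the
     * filesystem cannot write it out yet. */
    int (*flush)(struct page *page);
};

struct page {
    struct address_space *mapping;
    int dirty;
};

/* Example hook state: a journaling filesystem that must not write a
 * page before its transaction has committed (assumed behaviour). */
int example_txn_committed = 0;
int example_flush_calls = 0;

int example_journal_flush(struct page *page)
{
    example_flush_calls++;
    if (!example_txn_committed)
        return -1;          /* leave the page dirty for a later pass */
    page->dirty = 0;
    return 0;
}

/* Generic writeback used when the mapping installs no flush() hook. */
static int default_writepage(struct page *page)
{
    page->dirty = 0;
    return 0;
}

/* One step of a page_launder()-like loop: try to clean one page. */
int launder_one_page(struct page *page)
{
    assert(page != NULL);
    if (!page->dirty)
        return 0;
    if (page->mapping != NULL && page->mapping->flush != NULL)
        return page->mapping->flush(page);
    return default_writepage(page);
}
```

With a hook of this shape the VM side would not need to know anything
about journal ordering; a filesystem that cannot write the page yet
simply reports that, and the page stays dirty until a later pass.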

regards,

Rik
--
"What you're running that piece of shit Gnome?!?!"
       -- Miguel de Icaza, UKUUG 2000

http://www.conectiva.com/		http://www.surriel.com/


--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux.eu.org/Linux-MM/


* Re: TODO list for new VM  (oct 2000)
  2000-10-02 18:01 TODO list for new VM (oct 2000) Rik van Riel
@ 2000-10-02 18:20 ` Linus Torvalds
  2000-10-02 18:24   ` Rik van Riel
  0 siblings, 1 reply; 5+ messages in thread
From: Linus Torvalds @ 2000-10-02 18:20 UTC (permalink / raw)
  To: Rik van Riel; +Cc: linux-kernel, linux-mm, Matthew Dillon

Why do you apparently ignore the fact that page-out write-back performance
is horribly crappy because it always starts out doing synchronous writes?

I pointed out previously in a private email that page_launder() must be
buggy as it stands now; you seem to have ignored that part (and the
test-program that shows 1MB/s writeout speeds due to it) completely.

The whole _point_ of the new VM was performance. Without that, the new VM
is pointless, and discussing TODO features is equally pointless.

		Linus


* Re: TODO list for new VM  (oct 2000)
  2000-10-02 18:20 ` Linus Torvalds
@ 2000-10-02 18:24   ` Rik van Riel
  2000-10-02 18:34     ` Rik van Riel
  0 siblings, 1 reply; 5+ messages in thread
From: Rik van Riel @ 2000-10-02 18:24 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: linux-kernel, linux-mm, Matthew Dillon

On Mon, 2 Oct 2000, Linus Torvalds wrote:

> Why do you apparently ignore the fact that page-out write-back
> performance is horribly crappy because it always starts out
> doing synchronous writes?

Because it is fixed in the patch I mailed yesterday?

regards,

Rik

* Re: TODO list for new VM  (oct 2000)
  2000-10-02 18:24   ` Rik van Riel
@ 2000-10-02 18:34     ` Rik van Riel
  2000-10-05  1:08       ` Matthew Dillon
  0 siblings, 1 reply; 5+ messages in thread
From: Rik van Riel @ 2000-10-02 18:34 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: linux-kernel, linux-mm, Matthew Dillon

On Mon, 2 Oct 2000, Rik van Riel wrote:
> On Mon, 2 Oct 2000, Linus Torvalds wrote:
> 
> > Why do you apparently ignore the fact that page-out write-back
> > performance is horribly crappy because it always starts out
> > doing synchronous writes?
> 
> Because it is fixed in the patch I mailed yesterday?

One small warning though. Please don't apply that patch
yet because I fixed 3 more small problems today. I'll
send you an updated patch...

- the compile warnings are fixed
- in try_to_free_pages(), we forgot to set
  PF_MEMALLOC in current->flags  (oops)
- in grow_buffers(), in case we cannot get a
  buffer head, we must unlock the page

A patch against 2.4.0-test9-pre8 with these 3 changes will
be on its way once I've tested it a bit...

regards,

Rik

* Re: TODO list for new VM  (oct 2000)
  2000-10-02 18:34     ` Rik van Riel
@ 2000-10-05  1:08       ` Matthew Dillon
  0 siblings, 0 replies; 5+ messages in thread
From: Matthew Dillon @ 2000-10-05  1:08 UTC (permalink / raw)
  To: Rik van Riel; +Cc: Linus Torvalds, linux-kernel, linux-mm, Matthew Dillon

    My experience with FreeBSD's asynchronous paging
    is that you have to carefully limit the number of
    I/Os you queue at once.  Or, more specifically, you
    have to limit the seeking load the async pageouts
    place on the system.

    The performance curve from the point of view of user
    processes in the system looks like a bell, while the paging
    performance looks like a log curve (increased performance
    with diminishing returns)... if you queue too few
    pages (degenerate into synchronous paging), you have low
    paging performance and high user process performance,
    but you can't clean pages fast enough in a heavily loaded
    system.  If you queue too many pages at once, you have
    high paging performance (but with diminishing returns)
    and low user process performance due to the seeking
    load you've placed on the disk.  Excessive seeking
    from pageouts will ruin the disk's performance from
    the point of view of other processes in the system.

    FreeBSD has a sysctl variable called vm.max_page_launder
    which limits the number of pages the pageout daemon
    will queue to I/O at once.  The default is 32.   Numbers
    between 16 and 32 were found to fit the sweet spot of
    the curve the best.  Numbers lower than 16 reduced
    system performance because potentially contiguous pageouts
    would get split (causing more seeking rather than less when
    mixed with I/O initiated from user processes), and numbers
    higher than 32 reduced user process performance due to the
    additional seeking from the queued pageouts.

    The sysadmin can adjust the value to effectively give
    paging more or less priority.  A smaller number reduces
    paging performance but increases system performance
    for other processes (though anything less than 4 will
    reduce performance for everyone).  A higher number
    increases paging performance at the cost of system
    performance for other processes.  Virtually all FreeBSD
    installations that I know about leave the sysctl variable
    alone.
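
[Editorial sketch, not part of the original mail.] The cap described
above can be modelled as a pageout pass that stops queueing once a
limit is reached. This is a minimal user-space illustration, not
FreeBSD's pageout daemon: launder_page, launder_pass() and
MAX_PAGE_LAUNDER_DEFAULT are hypothetical names, and only the default
of 32 comes from the text.

```c
#include <stddef.h>

#define MAX_PAGE_LAUNDER_DEFAULT 32   /* default quoted in the mail */

struct launder_page {
    int dirty;    /* page needs to be written out */
    int queued;   /* page already handed to the I/O layer */
};

/* Queue at most max_launder dirty pages for asynchronous writeout in
 * one pass.  Returns the number of pages queued by this pass. */
size_t launder_pass(struct launder_page *pages, size_t n,
                    size_t max_launder)
{
    size_t queued = 0;

    for (size_t i = 0; i < n && queued < max_launder; i++) {
        if (pages[i].dirty && !pages[i].queued) {
            pages[i].queued = 1;   /* stand-in for starting the write */
            queued++;
        }
    }
    return queued;
}
```

With 100 dirty pages and the default limit, successive passes queue
32, 32, 32 and then 4 pages. Raising the limit shortens that sequence
at the cost of a larger burst of writes per pass, which is the
trade-off the paragraph above describes.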

    Note that the performance bell holds true whether you
    sort disk requests or not; the whole bell simply moves up
    or down on the graph.

    There are a number of things that can be done to mitigate
    the seeking issue, which I discussed with Rik a few months
    ago.  The gist of it, though, is that there is a trade-off
    between page-in and page-out performance based on how you
    try to cluster swap allocation.  FreeBSD clusters swap
    allocations to optimize page-out performance at the cost
    of page-in performance and that seems to work very
    well under heavy system loads.

					-Matt
					Matthew Dillon 
					<dillon@backplane.com>
