From: "Stephen C. Tweedie" <sct@redhat.com>
To: Andrea Arcangeli <andrea@suse.de>
Cc: "Stephen C. Tweedie" <sct@redhat.com>,
Marcelo Tosatti <marcelo@conectiva.com.br>,
Rik van Riel <riel@conectiva.com.br>, Jens Axboe <axboe@suse.de>,
Alan Cox <alan@redhat.com>,
Derek Martin <derek@cerberus.ne.mediaone.net>,
Linux Kernel <linux-kernel@vger.rutgers.edu>,
linux-mm@kvack.org, "David S. Miller" <davem@redhat.com>
Subject: Re: [PATCH] 2.2.17pre7 VM enhancement Re: I/O performance on 2.4.0-test2
Date: Thu, 6 Jul 2000 14:29:45 +0100
Message-ID: <20000706142945.A4237@redhat.com>
In-Reply-To: <Pine.LNX.4.21.0007061211480.4810-100000@inspiron.random>; from andrea@suse.de on Thu, Jul 06, 2000 at 12:35:58PM +0200
Hi,
On Thu, Jul 06, 2000 at 12:35:58PM +0200, Andrea Arcangeli wrote:
>
> I'm not sure what you planned exactly to do (maybe we can talk about this
> some time soon) but I'll tell you what I planned to do taking basic idea
> to throw-out-swap_out from the very _cool_ DaveM throw-swap_out patch
> floating around that's been the _only_ recent VM 2.[34].x patch that I
> seen floating around that really excited me (I've not focused all the
> details of his patch but I'm pretty sure it's very similar design even if
> probably not equal to what I'm trying to do).
Right, this is obviously needed for 2.5 (at least as an experimental
branch), but we simply can't do it in time for 2.4. It's too big a
change. If we get rid of swap_out, and do our reclaim based on
physical page lists, then suddenly a whole new class of problems
arises. For example, our swap clustering relies on allocating
sequential swap addresses to sequentially scanned VM addresses, so
that clustered swapout and swapin work naturally. Switch to
physically-ordered swapping and there's no longer any natural way of
getting the on-disk swap related to VA ordering, so that swapin
clustering breaks completely. To fix this, you need the final swapout
to try to swap nearby pages in VA space at the same time. It's a lot
of work to get it right.
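As a toy illustration of the clustering problem (not kernel code: the
helper names assign_slots and neighbours_adjacent, the 64-page address
space and the seeded shuffle standing in for physical page order are all
assumptions made up for this sketch), compare how often VA-adjacent pages
end up in adjacent swap slots under the two swap-out orders:

```python
import random

def assign_slots(order):
    """Give each virtual page the next free swap slot, in swap-out order."""
    return {vpage: slot for slot, vpage in enumerate(order)}

def neighbours_adjacent(slots):
    """Fraction of VA-adjacent page pairs that land in adjacent swap slots."""
    pages = sorted(slots)
    pairs = list(zip(pages, pages[1:]))
    hits = sum(1 for a, b in pairs if abs(slots[a] - slots[b]) == 1)
    return hits / len(pairs)

vpages = list(range(64))           # virtual page numbers of one process

# swap_out-style scan: walk the VA space, so slots follow VA order.
va_order = assign_slots(vpages)

# Physical-list reclaim: the order has nothing to do with VA order.
# A seeded shuffle is only a stand-in for that (assumption).
rng = random.Random(0)
shuffled = vpages[:]
rng.shuffle(shuffled)
phys_order = assign_slots(shuffled)

print(neighbours_adjacent(va_order))    # 1.0: every neighbour pair is contiguous on disk
print(neighbours_adjacent(phys_order))  # far lower: swapin readahead finds few neighbours
```

With VA-ordered swapout a readahead of the slots next to a faulting page
pulls in its VA neighbours for free; with the shuffled order it mostly
pulls in unrelated pages.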
> Then we'll need a page-to-pte_chain reverse lookup.
Right, and I think there are ways we can do this relatively cheaply.
Use the address_space's vma ring for shared pages, use the struct page
itself to encode the VA of the page for unshared anon pages, and keep
a separate hash of all shared anon ptes.
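Roughly, the three cases could be modelled like this. It is a sketch in
Python rather than kernel C, and every name in it (Vma, Page, anon_owner,
shared_anon, find_ptes) is hypothetical, chosen only to mirror the three
lookup paths above:

```python
from dataclasses import dataclass

@dataclass
class Vma:
    mm: str        # owning address space (just a label here)
    start: int     # first virtual page number covered by the mapping
    pgoff: int     # file offset, in pages, mapped at `start`
    npages: int    # length of the mapping in pages

@dataclass
class Page:
    mapping: list = None      # file-backed: list of Vma, standing in for the vma ring
    index: int = 0            # file-backed: page offset within the file
    anon_owner: tuple = None  # unshared anon: (mm, vpn) encoded in the page itself

# Separate hash for shared anonymous pages: id(page) -> list of (mm, vpn).
shared_anon = {}

def find_ptes(page):
    """Return every (mm, virtual page number) that may map `page`."""
    if page.mapping is not None:
        # Shared/file page: walk the vma ring and compute the VA in each vma.
        return [(v.mm, v.start + page.index - v.pgoff)
                for v in page.mapping
                if v.pgoff <= page.index < v.pgoff + v.npages]
    if page.anon_owner is not None:
        # Unshared anon page: the single mapping is stored in the page.
        return [page.anon_owner]
    # Shared anon page: fall back to the global hash.
    return list(shared_anon.get(id(page), []))
```

Only the shared anon case pays for extra per-pte bookkeeping; the other
two reuse state that already exists or fits in the page structure.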
> Once we'll have that
> too we'll can remove swap_out and do everything (except dcache/icache
> things) in shrink_mmap
Right, but this is all completely orthogonal to the problems I was
talking about in my original email. Those problems were to do with
things like write-throttling and managing free space, and did not
concern identifying which pages to throw out or how to age them.
Rik's multi-queued code, or the new code from Ludovic Fernandez which
separates out page aging to a different thread.
> So basically we'll have these completly different lists:
>
> lru_swap_cache
> lru_cache
> lru_mapped
>
> The three caches have completly different importance that is implicit by
> the semantics of the memory they are queuing.
I think this is entirely the wrong way to be thinking about the
problem. It seems to me to be much more important that we know:
1) What pages are unreferenced by the VM (except for page cache
references) and which can therefore be freed at a moment's notice;
2) What pages are queued for write;
3) What pages are referenced and in use for other reasons.
Completely unreferenced pages can be freed on a moment's notice. If
we are careful with the spinlocks we can even free them from within an
interrupt.
By measuring the throughput of these different page classes we can
work out what the VM pressure and write pressure is. When we get a
write page fault, we can (for example) block until the write queue
comes down to a certain size, to obtain write flow control.
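A minimal user-space sketch of that kind of write flow control, assuming
a fixed queue limit and a single flusher thread (the WriteQueue class,
its method names and the limit of 8 are illustrative assumptions, not
anything from the kernel):

```python
import threading
from collections import deque

class WriteQueue:
    def __init__(self, limit):
        self.limit = limit            # maximum pages allowed on the write queue
        self.pages = deque()
        self.cond = threading.Condition()
        self.high_water = 0           # largest queue length ever observed

    def write_fault(self, page):
        """Dirty a page: block while the write queue is at its limit."""
        with self.cond:
            while len(self.pages) >= self.limit:
                self.cond.wait()      # throttled until I/O completes
            self.pages.append(page)
            self.high_water = max(self.high_water, len(self.pages))
            self.cond.notify_all()

    def complete_one(self):
        """I/O completion: wait for a queued page, retire it, wake writers."""
        with self.cond:
            while not self.pages:
                self.cond.wait()
            page = self.pages.popleft()
            self.cond.notify_all()
            return page

def run(n_pages=100, limit=8):
    q = WriteQueue(limit)
    done = []
    flusher = threading.Thread(
        target=lambda: done.extend(q.complete_one() for _ in range(n_pages)))
    flusher.start()
    for p in range(n_pages):
        q.write_fault(p)              # the "process" dirtying pages
    flusher.join()
    return q.high_water, done
```

Calling run() returns the high-water mark of the queue and the pages in
completion order; the writer can never push the queue past the limit, no
matter how much faster it is than the flusher.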
More importantly, the scanning of the dirty and in-use queues can go
on separately from the freeing of clean pages. The more memory
pressure we are under --- ie. the faster we are gobbling unmapped
pages off the unreferenced queue --- the more rapidly we let the aging
thread walk the referenced pages and try to age pages onto the
unreferenced queue.
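One way to picture that coupling, as a toy model only: the linear
scan_quota policy, its base and factor constants, and the dict-based
pages are all assumptions for the sketch, and pages taken by the
allocator simply leave the model.

```python
from collections import deque

def scan_quota(taken_last_tick, base=4, factor=2):
    """Hypothetical policy: scan more in-use pages the faster free pages vanish."""
    return base + factor * taken_last_tick

def tick(in_use, unreferenced, demand):
    """One step: the allocator takes pages, then the aging pass refills the queue."""
    taken = min(demand, len(unreferenced))
    for _ in range(taken):
        unreferenced.popleft()           # freed and handed to the allocator
    quota = min(scan_quota(taken), len(in_use))
    for _ in range(quota):
        page = in_use.popleft()
        if page["referenced"]:
            page["referenced"] = False   # recently used: clear the bit, keep the page
            in_use.append(page)
        else:
            unreferenced.append(page)    # aged out: freeable at a moment's notice
    return taken, quota

in_use = deque({"referenced": False} for _ in range(100))
unreferenced = deque({"referenced": False} for _ in range(10))
print(tick(in_use, unreferenced, demand=0))   # (0, 4): idle, slow background aging
print(tick(in_use, unreferenced, demand=5))   # (5, 14): pressure speeds up the scan
```

The freeing of clean pages and the aging scan share nothing here except
the rate signal, which is the point of keeping the queues separate.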
Cheers,
Stephen