From: "Ray Lee" <ray-lk@madrabbit.org>
To: Andrew Morton <akpm@linux-foundation.org>
Cc: Nick Piggin <nickpiggin@yahoo.com.au>,
Eric St-Laurent <ericstl34@sympatico.ca>,
Rene Herman <rene.herman@gmail.com>,
Jesper Juhl <jesper.juhl@gmail.com>, ck list <ck@vds.kolivas.org>,
Ingo Molnar <mingo@elte.hu>, Paul Jackson <pj@sgi.com>,
linux-mm@kvack.org, linux-kernel@vger.kernel.org
Subject: Re: -mm merge plans for 2.6.23
Date: Wed, 25 Jul 2007 23:33:24 -0700
Message-ID: <2c0942db0707252333uc7631fduadb080193f6ad323@mail.gmail.com>
In-Reply-To: <20070725215717.df1d2eea.akpm@linux-foundation.org>
On 7/25/07, Andrew Morton <akpm@linux-foundation.org> wrote:
> On Wed, 25 Jul 2007 09:09:01 -0700
> "Ray Lee" <ray-lk@madrabbit.org> wrote:
>
> > No, there's a third case which I find the most annoying. I have
> > multiple working sets, the sum of which won't fit into RAM. When I
> > finish one, the kernel had time to preemptively swap back in the
> > other, and yet it didn't. So, I sit around, twiddling my thumbs,
> > waiting for my music player to come back to life, or thunderbird,
> > or...
>
> Yes, I'm thinking that's a good problem statement and it isn't something
> which the kernel even vaguely attempts to address, apart from normal
> demand paging.
>
> We could perhaps improve things with larger and smarter fault readaround,
> perhaps guided by refault-rate measurement. But that's still demand-paged
> rather than being proactive/predictive/whatever.
>
> None of this is swap-specific though: exactly the same problem would need
> to be solved for mmapped files and even plain old pagecache.
<nod> Could be what I'm noticing, but it's important to note that as
others have shown improvement with Con's swap prefetch, it's easily
arguable that targeting just swap is good enough for a first
approximation.
> In fact I'd restate the problem as "system is in steady state A, then there
> is a workload shift causing transition to state B, then the system goes
> idle. We now wish to reinstate state A in anticipation of a resumption of
> the original workload".
Yes, that's a fair transformation / generalization. It's always nice
talking to someone with more clarity than oneself.
> swap-prefetch solves a part of that.
>
> A complete solution for anon and file-backed memory could be implemented
> (ta-da) in userspace using the kernel inspection tools in -mm's maps2-*
> patches.
> We would need to add a means by which userspace can repopulate
> swapcache,
Okay, let's run with that for argument's sake.
> but that doesn't sound too hard (especially when you haven't
> thought about it).
I've always thought your sense of humor was underappreciated.
> And userspace can right now work out which pages from which files are in
> pagecache so this application can handle pagecache, swap and file-backed
> memory. (file-backed memory might not even need special treatment, given
> that it's pagecache anyway).
So in your proposed scheme, would userspace be polling, er, <goes and
looks through email for maps2 stuff, only finds Rusty's patches to
it>, well, /proc/<pids>/something_or_another?

A userspace daemon that wakes up regularly to poll a bunch of proc
files fills me with glee. Wait, is that glee? I think, no... wait...
horror, yes, horror is what I'm feeling.

I'm wrong, right? I love being wrong about this kind of stuff.
> And userspace can do a much better implementation of this
> how-to-handle-large-load-shifts problem, because it is really quite
> complex. The system needs to be monitored to determine what is the "usual"
> state (ie: the thing we wish to reestablish when the transient workload
> subsides). The system then needs to be monitored to determine when the
> exceptional workload has started, and when it has subsided, and userspace
> then needs to decide when to start reestablishing the old working set, at
> what rate, when to abort doing that, etc.
Oy. I mean this in the most respectful way possible, but you're too
smart for your own good.
I mean, sure, it's possible one could have multiply-chained transient
workloads, each of which has its own optimum workingset with
little overlap with the previous one. Mainframes made their names
on such loads. Workingset A starts, generates data, finishes and
invokes workingset B, of which the only thing they share in common is
said data. B finishes and invokes C, etc.

So, yeah, that's way too complex to stuff into the kernel. Even if it
were possible to do so, I cringe at the thought. And I can't believe
that would be a common enough pattern nowadays to justify any
heuristics on anyone's part. It's certainly complex enough that I'd
like to punt that scenario out of the conversation entirely -- I think
it has the potential to give a false impression as to how involved
a process we're talking about here.
Let's go back to your restatement:
> In fact I'd restate the problem as "system is in steady state A, then there
> is a workload shift causing transition to state B, then the system goes
> idle. We now wish to reinstate state A in anticipation of a resumption of
> the original workload".
I'll take an 80% solution for that one problem, and happily declare
that the kernel's job is done. In particular, when a resource hog
exits (or whatever heuristics prefetch is currently hooking into),
the kernel (or userspace, if that interface could be made sane) could
exercise a completely workload-agnostic refetch of the last n things
evicted, where n is determined by what's suddenly become free (or
whatever Con came up with).
Just, y'know, MRU style.
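A minimal sketch of the MRU-style refetch described above, written as
userspace Python purely for illustration. The names EvictionLog,
record_eviction and refetch_candidates are hypothetical; nothing here is
taken from Con's swap prefetch patch or from any kernel interface. It
only shows the policy: keep a bounded log of evictions and, once memory
frees up, hand back the most recently evicted entries first, up to the
amount that has become free.

```python
from collections import deque


class EvictionLog:
    """Bounded record of evicted pages, newest last (hypothetical)."""

    def __init__(self, capacity):
        # Once capacity is reached, the oldest entry falls off the left.
        self._log = deque(maxlen=capacity)

    def record_eviction(self, page_id):
        self._log.append(page_id)

    def refetch_candidates(self, free_pages):
        """Return up to free_pages entries, most recently evicted first."""
        picked = []
        while self._log and len(picked) < free_pages:
            picked.append(self._log.pop())
        return picked


log = EvictionLog(capacity=4)
for page in ["a", "b", "c", "d", "e"]:
    log.record_eviction(page)

# Suppose a resource hog just exited and three pages' worth became free.
candidates = log.refetch_candidates(free_pages=3)
```

The policy is deliberately workload agnostic: it never tries to work out
what the "usual" state is, it just undoes the latest evictions. A real
implementation would also have to handle entries that were already
faulted back in, which this sketch ignores.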
> All this would end up needing runtime configurability and tweakability and
> customisability. All standard fare for userspace stuff - much easier than
> patching the kernel.
We're talking about patching the kernel for whatever API you're coming
up with to repopulate pagecache, swap, and inodes, aren't we? If we
are, it doesn't seem like we're saving any work here. Also, we're
talking about creating a new user-visible API instead of augmenting
a pre-existing heuristic -- page replacement -- that the kernel
doesn't export and so can change at a moment's notice. Augmenting an
opaque heuristic seems a lot more friendly to long-term maintenance.
> So. We can
>
> a) provide a way for userspace to reload pagecache and
>
> b) merge maps2 (once it's finished) (pokes mpm)
>
> and we're done?
Eh, dunno. Maybe?
We're assuming we come up with an API for userspace to get
notifications of evictions (without polling, though poll() would be
fine -- you know what I mean), and an API for re-victing those things
on demand. If you think that adding that API and maintaining it is
simpler/better than including a variation on the above heuristic I
offered, then yeah, I guess we are. It'll all have that vague
userspace s2ram odor about it, but I'm sure it could be made to work.
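For the plain file-backed half of (a), userspace can already hint the
kernel to read a file back into pagecache with
posix_fadvise(POSIX_FADV_WILLNEED). The sketch below assumes a
hypothetical daemon that has somehow obtained a list of file paths worth
reloading; prefetch_files is an illustrative name. It does nothing for
swap or anonymous memory, which is exactly the part that would need a
new interface.

```python
import os


def prefetch_files(paths):
    """Ask the kernel to start reading each file into pagecache.

    Returns the number of files for which the hint was issued.
    """
    hinted = 0
    for path in paths:
        try:
            fd = os.open(path, os.O_RDONLY)
        except OSError:
            continue  # file vanished or is unreadable; skip it
        try:
            # offset=0 and len=0 cover the whole file; the call is only
            # a hint and returns without waiting for the reads.
            os.posix_fadvise(fd, 0, 0, os.POSIX_FADV_WILLNEED)
            hinted += 1
        except OSError:
            pass  # hint not supported for this file; nothing to do
        finally:
            os.close(fd)
    return hinted
```

Because it is advisory, the kernel is free to ignore the request under
memory pressure, so a caller cannot assume the pages are resident
afterwards.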
As I think I've successfully Peter Principled my way through this
conversation to my level of incompetence, I'll shut up now.
Ray