From: Nick Piggin <nickpiggin@yahoo.com.au>
To: Paul Menage <menage@google.com>
Cc: linux-mm@kvack.org, akpm@osdl.org
Subject: Re: [RFC][PATCH 1/1] Expose per-node reclaim and migration to userspace
Date: Thu, 30 Nov 2006 21:15:41 +1100
Message-ID: <456EAF4D.5000804@yahoo.com.au>
In-Reply-To: <6599ad830611300145gae22510te7eaa63edf539ad1@mail.gmail.com>
Paul Menage wrote:
> On 11/30/06, Nick Piggin <nickpiggin@yahoo.com.au> wrote:
>
>> >> AFAIK they do that in their higher level APIs (at least HPC numa
>> >> does).
>> >
>> > Could you point me at an example?
>>
>> kernel/cpuset.c:cpuset_migrate_mm
>
> No, that doesn't really do what we want. It basically just calls
> do_migrate_pages, which has the drawbacks of:
I know it doesn't do what you want. It is an example of page migration
being used underneath a higher level API, which is what I thought you
wanted to see.
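For reference, the relevant bit of kernel/cpuset.c looks roughly like this
(a trimmed sketch from memory; the locking and the mems_allowed update are
elided, so don't hold me to the exact details):

	/* kernel/cpuset.c, roughly -- trimmed sketch, not the exact source */
	static void cpuset_migrate_mm(struct mm_struct *mm, const nodemask_t *from,
				      const nodemask_t *to)
	{
		/*
		 * The cpuset layer decides the policy (which mm, which old
		 * and new nodemasks); the migration core does the work.
		 */
		do_migrate_pages(mm, from, to, MPOL_MF_MOVE_ALL);
	}

The point is that do_migrate_pages() is the mechanism, and everything above
it is a higher level API deciding what to move and where.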
>> How about "try to change the memory reservation charge of this
>> 'container' from xMB to yMB"? Underneath that API, your fakenode
>> controller would do the node reclaim and consolidation stuff --
>> but it could be implemented completely differently in the case of
>> a different type of controller.
>
> How would it make decisions such as which node to free up (e.g.
> userspace might have a strong preference for keeping a job on one
> particular real node, or moving it to a different one.) I think that
> policy decisions like this belong in userspace, in the same way that
> the existing cpusets API provides a way to say "this cpuset uses these
> nodes" rather than "this cpuset should have N nodes".
Now you're talking about physical nodes as well, which is exactly the kind
of problem you get when you mix the two. But there is no reason why you
shouldn't be able to specify physical nodes while also altering the
reservation, even if that does mean hiding the fake nodes from the cpuset
interface.
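To make that concrete, the sort of interface I have in mind would look
something like the following. This is purely a sketch with made-up names
(struct mem_container, mem_controller_ops and so on don't exist anywhere);
it is only meant to show where the policy/mechanism split could go:

	/*
	 * Hypothetical sketch only: a per-container memory controller
	 * hook.  None of these names exist in the tree.
	 */
	struct mem_container;

	struct mem_controller_ops {
		/*
		 * Try to change the container's reservation to 'bytes'.
		 * A fakenode based controller would do its node reclaim
		 * and consolidation behind this call; a different
		 * controller could implement it completely differently.
		 */
		int (*set_reservation)(struct mem_container *cont,
				       unsigned long long bytes);

		/*
		 * Hint for controllers that understand physical topology:
		 * prefer placing this container's memory on the given
		 * physical nodes.  That is where "keep this job on that
		 * real node" policy would go, without userspace ever
		 * seeing the fake nodes.
		 */
		int (*set_node_preference)(struct mem_container *cont,
					   nodemask_t preferred);
	};

Userspace then asks for "512MB for this container, preferably on node 2",
and the controller is free to implement that however it likes.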
>> If it is exporting any kind of implementation details, then it needs
>> to be justified with a specific user that can't be implemented in a
>> better way, IMO.
>
> It's not really exporting any more implementation details than the
> existing cpusets API (i.e. explicitly binding a job to a set of nodes
> chosen by userspace). The only true exposed implementation detail is
> the "priority" value from try_to_free_pages, and that could be
> abstracted away as a value in some range 0-N where 0 means "try very
> hard" and N means "hardly try at all", and it wouldn't have to be
> directly linked to the try_to_free_pages() priority.
There is also the fact that memory reservation is implemented with nodes,
which is an implementation detail in its own right. I'm still not convinced
that idea is the best way to export memory control to userspace, regardless
of whether it is quick and easy to develop (or even to deploy, at Google).
--
SUSE Labs, Novell Inc.