From: Christoph Lameter <clameter@engr.sgi.com>
To: Andi Kleen <ak@suse.de>
Cc: Paul Jackson <pj@sgi.com>,
kenneth.w.chen@intel.com, linux-mm@kvack.org,
linux-ia64@vger.kernel.org
Subject: Re: [NUMA] Display and modify the memory policy of a process through /proc/<pid>/numa_policy
Date: Fri, 15 Jul 2005 16:56:34 -0700 (PDT)
Message-ID: <Pine.LNX.4.62.0507151647300.12832@schroedinger.engr.sgi.com>
In-Reply-To: <20050715234402.GN15783@wotan.suse.de>

On Sat, 16 Jul 2005, Andi Kleen wrote:

> > If you encounter a different situation then you may need a different
> > address translation. For example, let's say you want to move a process
> > from nodes 3 and 4 to node 5. That won't work with the existing patches.
> > Or you want a process running on node 1 to be split to nodes 2 and 3.
> > You want 1G to be moved to node 2 and the rest to node 3. That cannot be
> > done with the old page migration.
>
> Ok, let's review it slowly. Why would you want to move 1GB
> of an existing process and another GB to different nodes?

Many reasons. One is to optimize access through interleaving. Or there just
happens to be space on these nodes, and one needs the space on this node
for something else.

> Considering you want to optimize for latency:
> - It doesn't make sense here because your external agent doesn't know
> which thread is using the first GB and which thread is using the last 2GBs.
> Most likely they use malloc and everything is pretty much mixed up.
> That is information only the code knows or the kernel indirectly from its
> first touch policy. But you need it otherwise you violate local
> memory policy for one thread or another.
>
> In short blocks of memory are useless here because they have no
> relationship to what the code actually does.
>
> If you want to optimize for bandwidth:
>
> - A similar problem applies. The first GB and last GB of memory have no
> relationship to how the memory is interleaved.
>
> So it doesn't make much sense to work on smaller pieces
> than processes here. Files are corner cases, but they can
> already be handled with some existing patches to mbind.

You are now prescribing how things have to be done. This is no longer
manual page migration. Manual page migration would allow control over the
memory locations of a process.

Let's say I want neither of the above. I just need to run a process on a
certain node because there is disk storage attached to that node, and the
other processes need to get out of the way for the next 30 minutes.

One always needs control over what is migrated. Ideally one would be able
to specify that, if memory becomes tight, only the vma containing the huge
amount of sparsely accessed data is migrated while the process continues
to run on the same node. The stack, the text segments, and the libraries
should stay on the node.

On the other hand, if the process is migrated to another node by the
scheduler, then one may want to migrate the text segment and the stack
but leave the 6G vma containing data where it originally was.
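
The two cases above amount to choosing a different subset of a process's
vmas depending on why the migration happens. What follows is a toy model
in Python, not kernel code and not any existing interface: the names
Vma, Kind, Scenario and vmas_to_migrate are hypothetical, and the sizes of
the text, library and stack vmas are made-up illustration values (only the
6G data vma comes from the text). Leaving libraries in place when the
scheduler moves the process is also an assumption, since that case is not
spelled out above.

```python
from dataclasses import dataclass
from enum import Enum, auto


class Kind(Enum):
    TEXT = auto()
    STACK = auto()
    LIBRARY = auto()
    DATA = auto()


class Scenario(Enum):
    MEMORY_TIGHT = auto()   # process stays on its node, memory runs short
    PROCESS_MOVED = auto()  # scheduler moved the process to another node


@dataclass(frozen=True)
class Vma:
    name: str
    kind: Kind
    size_mb: int


def vmas_to_migrate(vmas, scenario):
    """Return the vmas a caller with full control might choose to move."""
    if scenario is Scenario.MEMORY_TIGHT:
        # Only the bulk data leaves; stack, text and libraries stay local.
        wanted = {Kind.DATA}
    else:
        # Text and stack follow the process; the large data vma stays put.
        # (Libraries staying behind is an assumption of this sketch.)
        wanted = {Kind.TEXT, Kind.STACK}
    return [v for v in vmas if v.kind in wanted]


# Hypothetical process layout; only the 6G data vma is taken from the text.
process = [
    Vma("text", Kind.TEXT, 4),
    Vma("libc", Kind.LIBRARY, 2),
    Vma("stack", Kind.STACK, 8),
    Vma("data", Kind.DATA, 6 * 1024),
]
```

Calling vmas_to_migrate(process, Scenario.MEMORY_TIGHT) selects only the
data vma, while Scenario.PROCESS_MOVED selects text and stack. Neither
selection can be expressed if the only unit of migration is the whole
process.
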

It all boils down to the following:

Are you willing to allow us to control memory placement? Or will it be
automatic? If automatic, then maybe you need to get rid of libnuma and
numactl and put it all in the scheduler. Otherwise, please give us full
control and not some half-way measures.