From: Paul Jackson <pj@sgi.com>
To: Christoph Lameter <clameter@engr.sgi.com>
Cc: ak@suse.de, kenneth.w.chen@intel.com, linux-mm@kvack.org,
	linux-ia64@vger.kernel.org
Subject: Re: [NUMA] Display and modify the memory policy of a process through /proc/<pid>/numa_policy
Date: Sun, 17 Jul 2005 01:17:02 -0700
Message-ID: <20050717011702.23f8a269.pj@sgi.com>
In-Reply-To: <Pine.LNX.4.62.0507162256180.28788@schroedinger.engr.sgi.com>

Christoph wrote:
> Could you give me some more detail on how this should integrate with 
> cpusets? I am not aware of any thing that I would call "hard".

I can't speak to how "hard" it is, but what I have in mind is the
following lines from the mm/mempolicy.c get_nodes() routine:

        /* Update current mems_allowed */
        cpuset_update_current_mems_allowed();
        /* Ignore nodes not set in current->mems_allowed */
        cpuset_restrict_to_mems_allowed(nodes);

These lines ensure that the current task's mems_allowed is up to date
with any constraints imposed by the task's cpuset, and then they
restrict the nodes to that mems_allowed.

Offhand, I do not know a safe way to update a task's mems_allowed
from its cpuset, except within the task's own context.  This is why
'mems_generation' and cpuset_update_current_mems_allowed() exist.
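
As a rough illustration of that generation-counter idea, here is a
user-space toy, not the kernel code.  Every toy_* name is hypothetical,
and the sketch assumes a single thread, so it omits all locking.  The
cpuset bumps a counter whenever its memory nodes change; the task
compares counters and refreshes its private copy only from its own
context, then masks the requested nodes against that copy.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical node mask: bit n set means memory node n. */
typedef uint64_t toy_nodemask_t;

struct toy_cpuset {
    toy_nodemask_t mems_allowed;     /* authoritative setting */
    unsigned int mems_generation;    /* bumped on every change */
};

struct toy_task {
    const struct toy_cpuset *cpuset;
    toy_nodemask_t mems_allowed;          /* task-private cached copy */
    unsigned int cpuset_mems_generation;  /* generation of that copy */
};

/* May be called from any context: touches only the cpuset. */
static void toy_cpuset_set_mems(struct toy_cpuset *cs, toy_nodemask_t mems)
{
    cs->mems_allowed = mems;
    cs->mems_generation++;
}

/* Called only by the task itself: refresh the copy if it is stale. */
static void toy_update_current_mems_allowed(struct toy_task *cur)
{
    assert(cur->cpuset != NULL);
    if (cur->cpuset_mems_generation != cur->cpuset->mems_generation) {
        cur->mems_allowed = cur->cpuset->mems_allowed;
        cur->cpuset_mems_generation = cur->cpuset->mems_generation;
    }
}

/* Mirrors the two steps quoted above: refresh, then restrict. */
static toy_nodemask_t toy_get_nodes(struct toy_task *cur,
                                    toy_nodemask_t requested)
{
    toy_update_current_mems_allowed(cur);
    return requested & cur->mems_allowed;
}
```

In this toy, a call to toy_cpuset_set_mems() never touches the task;
the task's copy stays stale until the task itself next calls
toy_get_nodes().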

If you can find a way, more power to you.  I could simplify the
cpuset mems_generation apparatus if I had such a way.

The above get_nodes() routine is called by mbind() and set_mempolicy(),
when passing in a list of memory nodes as part of a memory policy.


> What do you mean by synchronously? 

Probably what Andi is referring to when he worries about locking.
If so, he certainly understands this better than I.

But for example, I notice that the check_range() routine is called
for mbind() requests.  The check_range() code does a bunch of poking
around in the current task's vma structs.  How do you propose to allow
a separate task to do this safely?

Also, there are several dereferences of the pointer 'current', and of
further mm and vma state reached via current, to pick up various
attributes of the current task and its memory.  Each one of these
has to be examined, I presume, in order to determine what accesses
can safely be done from an external task, and still obtain consistent
results.


> There is no transactional behavior that allows the changes of multiple
> items at once, nor is there any guarantee that the vma you are changing
> is still there after you have read /proc/<pid>/numa_maps. Why would
> such synchronicity be necessary?

I agree that such is neither possible, present, nor necessary.

I am worried about what happens within a single mbind or set_mempolicy
call attempted on an external task, not what happens between one such
call and the next.

Clearly the mm/mempolicy code for mbind and set_mempolicy was written
with the assumption that it applied to the current task, its mm
and vmas, and hence the current task was stuck inside this code.

A variety of task and memory state is read and written, without
need for much locking, because we are single threaded in the only
task that is allowed to modify this state.  The author of this code
repeatedly expresses concerns that external modification will fail
due to locking issues.

To me, that means it will take, at best, a careful and detailed
analysis to have any hope of safe external modification of this state,
if it is possible at all.

This is why I suspect we need a way to plug in code that executes in
the context of a task, to apply externally determined changes to the
task's memory layout.
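
One possible shape for such a hook, sketched in user-space C11 rather
than kernel code: an external actor only publishes a request, and the
owning task applies it to its own state at a point it chooses.  All
toy_* names are hypothetical, and the single-slot mailbox is an
assumption made for brevity, not a proposal for the real interface.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>
#include <stdint.h>

struct toy_policy_request {
    uint64_t nodes;                  /* requested node mask */
};

struct toy_owner {
    uint64_t policy_nodes;           /* touched only by the owning task */
    _Atomic(struct toy_policy_request *) pending;  /* one-slot mailbox */
};

/*
 * External side: publish a request without touching the owner's
 * private state.  Returns any earlier, still unapplied request that
 * this one displaced, so the caller can dispose of it.
 */
static struct toy_policy_request *
toy_post_request(struct toy_owner *target, struct toy_policy_request *req)
{
    assert(req != NULL);
    return atomic_exchange(&target->pending, req);
}

/*
 * Owner side: called by the task itself at a safe point.
 * Returns 1 if a request was applied, 0 if none was pending.
 */
static int toy_run_pending(struct toy_owner *self)
{
    struct toy_policy_request *req =
        atomic_exchange(&self->pending, (struct toy_policy_request *)NULL);

    if (req == NULL)
        return 0;
    self->policy_nodes = req->nodes;  /* applied in the owner's context */
    return 1;
}
```

The point of the split is that only toy_run_pending() writes
policy_nodes, so that state needs no lock of its own; the atomic
exchange on the mailbox is the only cross-task synchronization.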

-- 
                  I won't rest till it's the best ...
                  Programmer, Linux Scalability
                  Paul Jackson <pj@sgi.com> 1.925.600.0401
