From: Christoph Lameter <cl@linux.com>
To: Rafael Aquini <aquini@redhat.com>
Cc: linux-mm@kvack.org, Andrew Morton <akpm@linux-foundation.org>,
KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>,
Stephen Wilson <wilsons@start.ca>,
Andrea Arcangeli <aarcange@redhat.com>,
Rik van Riel <riel@redhat.com>,
linux-kernel@vger.kernel.org
Subject: Re: [PATCH] mm/mempolicy.c: make sys_mbind & sys_set_mempolicy aware of task_struct->mems_allowed
Date: Wed, 3 Aug 2011 09:19:11 -0500 (CDT)
Message-ID: <alpine.DEB.2.00.1108030913090.24201@router.home>
In-Reply-To: <20110803123721.GA2892@x61.redhat.com>
On Wed, 3 Aug 2011, Rafael Aquini wrote:
> Among several other features enabled when CONFIG_CPUSETS is defined, task_struct is enhanced with the nodemask_t mems_allowed element that serves to register/report on which memory nodes the task may obtain memory. Also, two new lines that reflect the value registered at task_struct->mems_allowed are added to the '/proc/[pid]/status' file:
> Mems_allowed: ...,00000000,0000000f
> Mems_allowed_list: 0-3
>
> The system calls sys_mbind and sys_set_mempolicy, which serve to cope
> with NUMA memory policies, and receive a nodemask_t parameter, do not
> set task_struct->mems_allowed accordingly to their received nodemask,
> when CONFIG_CPUSETS is defined. This unawareness causes unexpected
> values being reported at '/proc/[pid]/status' Mems_allowed fields, for
> applications relying on those syscalls, or spawned by numactl.
That is intentional, since mbind does not restrict the memory nodes
allowed for the process. mbind means that the process is directing its
allocations to a specific set of nodes. The process can still specify a
memory policy that allocates from any other node in mems_allowed.
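To make the distinction concrete, here is a minimal userspace sketch. It is a toy model, not kernel code: `struct toy_task`, `toy_set_mempolicy()` and `toy_set_mempolicy_patched()` are hypothetical names, and node sets are modelled as plain bitmasks (bit n stands for node n). The first function only selects nodes within mems_allowed; the second additionally overwrites mems_allowed, roughly mirroring the effect of the proposed set_mems_allowed(nodes) call.

```c
/* Toy userspace model, NOT kernel code. All names are hypothetical. */

struct toy_task {
	unsigned long mems_allowed;	/* nodes the task may use (bit n = node n) */
	unsigned long policy_nodes;	/* nodes the current memory policy targets */
};

/*
 * Assumed model of the existing behaviour: a policy picks nodes inside
 * mems_allowed and leaves mems_allowed itself untouched.
 * Returns 0 on success, -1 if the request names no allowed node.
 */
static int toy_set_mempolicy(struct toy_task *t, unsigned long nodes)
{
	unsigned long usable = nodes & t->mems_allowed;

	if (!usable)
		return -1;
	t->policy_nodes = usable;
	return 0;
}

/*
 * Assumed model of the proposed change: on success, mems_allowed is
 * also replaced by the requested nodes, so it can only shrink.
 */
static int toy_set_mempolicy_patched(struct toy_task *t, unsigned long nodes)
{
	int err = toy_set_mempolicy(t, nodes);

	if (!err)
		t->mems_allowed = nodes;
	return err;
}
```

Starting from mems_allowed = 0xf (nodes 0-3), binding to 0x6 (nodes 1-2) and then requesting 0x8 (node 3) succeeds with the first function and leaves mems_allowed at 0xf. With the second function, the later request fails in this model because mems_allowed has already shrunk to 0x6.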
> Despite not affecting the memory policy operation itself, the
> aforementioned unawareness is source of confusion and annoyance when one
> is trying to figure out which resources are bound to a given task.
Nope, this is so for a reason.
> As we can check, the expected reported list would be "1,2", instead of "0-3".
Wrong. The process is allowed to allocate from nodes 0-3. Reporting
anything else would be misleading.
> @@ -1256,7 +1264,10 @@ SYSCALL_DEFINE6(mbind, unsigned long, start, unsigned long, len,
>  	err = get_nodes(&nodes, nmask, maxnode);
>  	if (err)
>  		return err;
> -	return do_mbind(start, len, mode, mode_flags, &nodes, flags);
> +	err = do_mbind(start, len, mode, mode_flags, &nodes, flags);
> +	if (!err)
> +		set_mems_allowed(nodes);
> +	return err;
>  }
Uhhh. set_mems_allowed() suffers from various races and cannot easily be
used in random locations. Special serialization is required. See
cpuset_mems_allowed() and cpuset_change_task_nodemask().
> @@ -1276,7 +1287,10 @@ SYSCALL_DEFINE3(set_mempolicy, int, mode, unsigned long __user *, nmask,
>  	err = get_nodes(&nodes, nmask, maxnode);
>  	if (err)
>  		return err;
> -	return do_set_mempolicy(mode, flags, &nodes);
> +	err = do_set_mempolicy(mode, flags, &nodes);
> +	if (!err)
> +		set_mems_allowed(nodes);
> +	return err;
>  }
Same issue.