From: Miao Xie <miaox@cn.fujitsu.com>
To: David Rientjes <rientjes@google.com>
Cc: Lee Schermerhorn <lee.schermerhorn@hp.com>,
Nick Piggin <npiggin@suse.de>, Paul Menage <menage@google.com>,
Linux-Kernel <linux-kernel@vger.kernel.org>,
Linux-MM <linux-mm@kvack.org>
Subject: Re: [PATCH V2 4/4] cpuset,mm: update task's mems_allowed lazily
Date: Tue, 09 Mar 2010 15:25:54 +0800 [thread overview]
Message-ID: <4B95F802.9020308@cn.fujitsu.com> (raw)
In-Reply-To: <alpine.DEB.2.00.1003081330370.18502@chino.kir.corp.google.com>
on 2010-3-9 5:46, David Rientjes wrote:
[snip]
>> Considering the change of task->mems_allowed is not frequent, so in this patch,
>> I use two variables as a tag to indicate whether task->mems_allowed need be
>> update or not. And before setting the tag, cpuset caches the new mask of every
>> task at its task_struct.
>>
>
> So what exactly is the benefit of 58568d2 from last June that caused this
> issue to begin with? It seems like this entire patchset is a revert of
> that commit. So why shouldn't we just revert that one commit and then add
> the locking and updating necessary for configs where
> MAX_NUMNODES > BITS_PER_LONG on top?
I worried about the consistency of task->mempolicy with task->mems_allowed for
configs where MAX_NUMNODES <= BITS_PER_LONG.
The problem that I worried is fowllowing:
When the kernel allocator allocates pages for tasks, it will access task->mempolicy
first and get the allowed node, then check whether that node is allowed by
task->mems_allowed.
But, Without this patch, ->mempolicy and ->mems_allowed is not updated at the same
time. the kernel allocator may access the inconsistent information of ->mempolicy
and ->mems_allowed, sush as the allocator gets the allowed node from old mempolicy,
but checks whether that node is allowed by new mems_allowed which does't intersect
old mempolicy.
So I made this patchset.
>> +/**
>> + * cpuset_update_task_mems_allowed - update task memory placement
>> + *
>> + * If the current task's mems_allowed_for_update and mempolicy_for_update are
>> + * changed by cpuset behind our backs, update current->mems_allowed,
>> + * mems_generation and task NUMA mempolicy to the new value.
>> + *
>> + * Call WITHOUT mems_lock held.
>> + *
>> + * This routine is needed to update the pre-task mems_allowed and mempolicy
>> + * within the tasks context, when it is trying to allocate memory.
>> + */
>> +static __always_inline void cpuset_update_task_mems_allowed(void)
>> +{
>> + struct task_struct *tsk = current;
>> + unsigned long flags;
>> +
>> + if (unlikely(tsk->mems_generation != tsk->mems_generation_for_update)) {
>> + task_mems_lock_irqsave(tsk, flags);
>> + tsk->mems_allowed = tsk->mems_allowed_for_update;
>> + tsk->mems_generation = tsk->mems_generation_for_update;
>> + task_mems_unlock_irqrestore(tsk, flags);
>
> By this synchronization, you're guaranteeing that no other kernel code
> ever reads tsk->mems_allowed when tsk != current? Otherwise, you're
> simply protecting the store to tsk->mems_allowed here and not serializing
> on the loads that can return empty nodemasks.
I guarantee that no other kernel code changes tsk->mems_allowed when tsk != current.
so every task can be safe to read tsk->mems_allowed without lock.
I will use mems_lock to protect it when other task reads.
>> + /* Protection of ->mems_allowed_for_update */
>> + spinlock_t mems_lock;
>> + /*
>> + * This variable(mems_allowed_for_update) are just used for caching
>> + * memory placement information.
>> + *
>> + * ->mems_allowed are used by the kernel allocator.
>> + */
>> + nodemask_t mems_allowed_for_update; /* Protected by mems_lock */
>
> Another nodemask_t in struct task_struct for this? And for all configs,
> including those that can do atomic updates to mems_allowed?
Yes, for all configs.
>
>> +
>> + /*
>> + * Increment this integer everytime ->mems_allowed_for_update is
>> + * changed by cpuset. Task can compare this number with mems_generation,
>> + * and if they are not the same, mems_allowed_for_update is changed and
>> + * ->mems_allowed must be updated. In this way, tasks can avoid having
>> + * to lock and reload mems_allowed_for_update unless it is changed.
>> + */
>> + int mems_generation_for_update;
>> + /*
>> + * After updating mems_allowed, set mems_generation to
>> + * mems_generation_for_update.
>> + */
>> + int mems_generation;
>
> I don't see why you need two mems_generation numbers, one should belong in
> the task's cpuset. Then you can compare tsk->mems_generation to
> task_cs(tsk)->mems_generation at cpuset_update_task_memory_state() if you
> set tsk->mems_generation = task_cs(tsk)->mems_generation on
> cpuset_attach() or update_nodemask().
In this way, we must use rcu_read_lock() to protect task's cs, and the performance
will slowdown though rcu read lock's spending is very small.
Thanks!
Miao
--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org. For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>
next prev parent reply other threads:[~2010-03-09 7:26 UTC|newest]
Thread overview: 14+ messages / expand[flat|nested] mbox.gz Atom feed top
2010-03-08 10:10 Miao Xie
2010-03-08 21:46 ` David Rientjes
2010-03-09 7:25 ` Miao Xie [this message]
2010-03-11 8:15 ` Nick Piggin
2010-03-11 10:33 ` Miao Xie
2010-03-11 11:03 ` Nick Piggin
2010-03-25 10:23 ` Miao Xie
2010-03-25 12:56 ` Miao Xie
2010-03-25 13:33 ` [PATCH] [PATCH -mmotm] cpuset,mm: use seqlock to protect task->mempolicy and mems_allowed (v2) (was: Re: [PATCH V2 4/4] cpuset,mm: update task's mems_allowed lazily) Miao Xie
2010-03-28 5:30 ` Bob Liu
2010-03-31 19:42 ` Andrew Morton
2010-03-31 9:54 ` [PATCH V2 4/4] cpuset,mm: update task's mems_allowed lazily Miao Xie
2010-03-31 10:34 ` David Rientjes
2010-04-01 2:16 ` Miao Xie
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=4B95F802.9020308@cn.fujitsu.com \
--to=miaox@cn.fujitsu.com \
--cc=lee.schermerhorn@hp.com \
--cc=linux-kernel@vger.kernel.org \
--cc=linux-mm@kvack.org \
--cc=menage@google.com \
--cc=npiggin@suse.de \
--cc=rientjes@google.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox