From: "Paul Menage" <menage@google.com>
To: balbir@linux.vnet.ibm.com
Cc: Pavel Emelianov <xemul@openvz.org>,
Hugh Dickins <hugh@veritas.com>,
Sudhir Kumar <skumar@linux.vnet.ibm.com>,
YAMAMOTO Takashi <yamamoto@valinux.co.jp>,
lizf@cn.fujitsu.com, linux-kernel@vger.kernel.org,
taka@valinux.co.jp, linux-mm@kvack.org,
David Rientjes <rientjes@google.com>,
Andrew Morton <akpm@linux-foundation.org>,
KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Subject: Re: [-mm] Add an owner to the mm_struct (v2)
Date: Fri, 28 Mar 2008 08:38:16 -0700
Message-ID: <6599ad830803280838s19ffc366w1a950ebb12e2907b@mail.gmail.com>
In-Reply-To: <47ED0621.4050304@linux.vnet.ibm.com>
On Fri, Mar 28, 2008 at 7:52 AM, Balbir Singh <balbir@linux.vnet.ibm.com> wrote:
>
> mm->owner_lock is there to protect the mm->owner field from changing simultaneously
> as tasks fork/exit.
>
But the *hardware* already does that for you - individual writes to
pointers are already atomic operations and so will be serialized.
Using a lock to guard something is only useful if at least one of the
critical regions that takes the lock consists of more than a single
atomic operation, or if you have a mixture of read sections and write
sections. Now it's true that your critical region in
mm_fork_init_owner() is more than a single atomic op, but I'm arguing
below that it's a no-op. So that just leaves the single region
    spin_lock(&mm->owner_lock);
    mm->owner = new_owner;
    spin_unlock(&mm->owner_lock);
which isn't observably different if you remove the spinlock.
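To make that concrete, here's a minimal sketch of the lock-free version
(not the actual patch; new_owner is just whatever task the hand-off code
picked):

    struct task_struct *owner;

    /* Writer side: a single aligned pointer store, already atomic */
    rcu_assign_pointer(mm->owner, new_owner);

    /* Reader side, e.g. in the memory controller */
    rcu_read_lock();
    owner = rcu_dereference(mm->owner);
    /* ... use owner under the RCU read lock ... */
    rcu_read_unlock();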
>
> Oh! yes.. my bad again. The check should have been p == p->thread_group, but
> that is not required either. The check should now ideally be
>
> if (!(clone_flags & CLONE_VM))
>
OK, so if the new thread has its own mm (and hence will already have
mm->owner set up to point to p in mm_init()) then we do:
> + if (mm->owner != p)
> + rcu_assign_pointer(mm->owner, p->group_leader);
which is a no-op since we know mm->owner == p.
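Spelling that out (my sketch of what the helper reduces to; the exact
signature is my assumption, not taken from your patch):

    static void mm_fork_init_owner(unsigned long clone_flags,
                                   struct task_struct *p, struct mm_struct *mm)
    {
            if (clone_flags & CLONE_VM)
                    return;                 /* shared mm: ownership unchanged */

            /* New mm: mm_init() has already set mm->owner = p ... */
            if (mm->owner != p)             /* ... so this test is never true ... */
                    rcu_assign_pointer(mm->owner, p->group_leader);
                                            /* ... and this line is dead code. */
    }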
>
> Yes.. I think we need to call it earlier.
>
No, I think we need to call it later - after we've cleared current->mm
(from within task_lock(current)) - so we can't rely on p->mm in this
function; we have to pass the mm in explicitly. If we call it earlier,
while current->mm == mm, then we risk a race where the (new or
existing) owner exits and passes ownership back to us *after* we've
already checked whether we need to find a new owner. If we ensure that
current->mm != mm before we call mm_update_next_owner(), then we know
we're not a candidate for receiving ownership if we don't have it already.
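In exit_mm() that ordering would look roughly like this (a sketch;
mmap_sem handling elided):

    struct mm_struct *mm = current->mm;

    task_lock(current);
    current->mm = NULL;
    enter_lazy_tlb(mm, current);
    task_unlock(current);

    /* current->mm != mm now, so we can't be picked as the new owner */
    mm_update_next_owner(mm);       /* mm has to be passed in explicitly */
    mmput(mm);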
>
> But there is no way to guarantee that; what if the new_owner execs after we've
> done the check and assigned? Won't we end up breaking the invariant? How about
> we have mm_update_next_owner() called in exec_mmap() as well? That way, we can
> still use owner_lock and keep the invariant.
>
Oops, I thought that exit_mm() already got called in the execve()
path, but you're right, it doesn't.
Yes, exec_mmap() should call mm_update_next_owner() after the call to
task_unlock(), i.e. after it has set its new mm.
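Roughly (a sketch; the active_mm and mmap_sem handling are elided, and
the details may differ in the actual patch):

    static int exec_mmap(struct mm_struct *mm)
    {
            struct mm_struct *old_mm = current->mm;

            task_lock(current);
            current->mm = mm;
            activate_mm(old_mm, mm);
            task_unlock(current);

            if (old_mm) {
                    /* current->mm != old_mm now, so current is no longer
                       a candidate for owning old_mm */
                    mm_update_next_owner(old_mm);
                    mmput(old_mm);
            }
            return 0;
    }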
So I need to express the invariant more carefully.
What we need to preserve is that, for every mm at all times, mm->owner
points to a valid task. So either:
1) mm->owner->mm == mm AND mm->owner will check to see whether it
needs to pass ownership before it exits or execs.
OR
2) mm->owner is the last user of mm and is about to free mm.
OR
3) mm->owner is currently searching for another user of mm to pass the
ownership to.
In order to get from state 3 to state 1 safely we have to hold
task_lock(new_owner). Otherwise we can race with an exit or exec in
new_owner and hand ownership to a process that has already passed the
point of checking current->mm->owner, so it would never pass the
ownership on itself.
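As a concrete sketch of the hand-off I have in mind (simplified; a real
version would need to retry the search if the re-check under
task_lock() fails):

    void mm_update_next_owner(struct mm_struct *mm)
    {
            struct task_struct *c;

            if (mm->owner != current)
                    return;                 /* we don't own it: nothing to do */

            read_lock(&tasklist_lock);
            for_each_process(c) {
                    if (c == current || c->mm != mm)
                            continue;
                    get_task_struct(c);
                    read_unlock(&tasklist_lock);

                    /* Hold off c's own exit/exec check on mm->owner */
                    task_lock(c);
                    if (c->mm == mm)        /* re-check under the lock */
                            rcu_assign_pointer(mm->owner, c);
                    task_unlock(c);
                    put_task_struct(c);
                    return;
            }
            read_unlock(&tasklist_lock);
            /* no other user: state 2, mm is about to be freed */
    }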
I don't see why we need mm->owner_lock to maintain this invariant.
(But am quite prepared to be proven wrong).
Paul