Re: [-mm] Add an owner to the mm_struct (v2)

linux-mm.kvack.org archive mirror
 help / color / mirror / Atom feed

From: Balbir Singh <balbir@linux.vnet.ibm.com>
To: Paul Menage <menage@google.com>
Cc: Pavel Emelianov <xemul@openvz.org>,
	Hugh Dickins <hugh@veritas.com>,
	Sudhir Kumar <skumar@linux.vnet.ibm.com>,
	YAMAMOTO Takashi <yamamoto@valinux.co.jp>,
	lizf@cn.fujitsu.com, linux-kernel@vger.kernel.org,
	taka@valinux.co.jp, linux-mm@kvack.org,
	David Rientjes <rientjes@google.com>,
	Andrew Morton <akpm@linux-foundation.org>,
	KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Subject: Re: [-mm] Add an owner to the mm_struct (v2)
Date: Sat, 29 Mar 2008 11:16:02 +0530	[thread overview]
Message-ID: <47EDD79A.7040108@linux.vnet.ibm.com> (raw)
In-Reply-To: <6599ad830803281152g33e693f5s4c7090a503d2751d@mail.gmail.com>

Paul Menage wrote:
> Hi Balbir,
> 
> Could you send out the latest version of your patch? I suspect it's
> changed a bit based on on this review and it would be good to make
> sure we're both on the same page.
> 
> On Fri, Mar 28, 2008 at 11:10 AM, Balbir Singh
> <balbir@linux.vnet.ibm.com> wrote:
>>  At fork time, we can have do_fork() run in parallel and we need to protect
>>  mm->owner, if several threads are created at the same time. We don't want to
>>  overwrite mm->owner for each thread that is created.
> 
> Why would you want to overwrite mm->owner for any of the threads? If
> they're sharing an existing mm, then that mm must already have an
> owner, so no need to update it.
> 

Yes, that's why we have the lock and check.

>>  > No, I think we need to call it later - after we've cleared current->mm
>>  > (from within task_lock(current)) - so we can't rely on p->mm in this
>>  > function, we have to pass it in. If we call it before while
>>  > current->mm == mm, then we risk a race where the (new or existing)
>>  > owner exits and passes it back to us *after* we've done a check to see
>>  > if we need to find a new owner. If we ensure that current->mm != mm
>>  > before we call mm_update_next_owner(), then we know we're not a
>>  > candidate for receiving the ownership if we don't have it already.
>>  >
>>
>>  Yes and we could also check for flags & PF_EXITING
>>
> 
> A couple of issues with that:
> 
> - I'm not sure how that handles the exec case
> 
> - assume two users; the owner exits and wants to pass the ownership to
> the other user. It finds it, but sees that it's PF_EXITING. What
> should it do? If it waits for that other user to exit, it could take a
> long time (e.g. core dumps can take many seconds). If it exits
> immediately, then it will leave mm->owner pointing to an invalid task.
> If it passes ownership to the other task, it might pass it after the
> other task had done its mm_update_next_owner() check, which would be
> too late.
> 
> - assume three users; the owner exits and wants to pass the ownership
> to one of the other two users. it searches through the candidates and
> finds one of the other users, which is in PF_EXITING, so it skips it.
> Just after this it sees that the user count has fallen to two users.
> How does it know whether the user that dropped the count was the
> PF_EXITING process that it saw previously (in which case it should
> keep searching) or the third user that it's not encountered yet (in
> which case it's not going to find the other user anywhere in its
> search).
> 
>>  >>  But there is no way to guarantee that, what is the new_owner exec's after we've
>>  >>  done the check and assigned. Won't we end up breaking the invariant? How about
>>  >>  we have mm_update_new_owner() call in exec_mmap() as well? That way, we can
>>  >>  still use owner_lock and keep the invariant.
>>  >>
>>  >
>>  > Oops, I thought that exit_mm() already got called in the execve()
>>  > path, but you're right, it doesn't.
>>  >
>>  > Yes, exit_mmap() should call mm_update_next_owner() after the call to
>>  > task_unlock(), i.e. after it's set its new mm.
>>  >
>>  > So I need to express the invariant more carefully.
>>  >
>>  > What we need to preserve is that, for every mm at all times, mm->owner
>>  > points to a valid task. So either:
>>  >
>>  > 1) mm->owner->mm == mm AND mm->owner will check to see whether it
>>  > needs to pass ownership before it exits or execs.
>>  >
>>  > OR
>>  >
>>  > 2) mm->owner is the last user of mm and is about to free mm.
>>  >
>>  > OR
>>  >
>>  > 3) mm->owner is currently searching for another user of mm to pass the
>>  > ownership to.
>>  >
>>  > In order to get from state 3 to state 1 safely we have to hold
>>  > task_lock(new_owner). Otherwise we can race with an exit or exec in
>>  > new_owner, resulting in a process that has already passed the point of
>>  > checking current->mm->owner.
>>  >
>>
>>  No.. like you said if we do it after current->mm has changed and is different
>>  from mm, then it's safe to find a new owner. I still don't see why we need
>>  task_lock(new_owner).
> 
> How about the following sequence: A is old owner, B is new owner
> 
> A gets to the task_unlock() in exit_mm(): A->mm is now NULL, mm->owner == A
> B starts to execve()
> A calls mm_update_next_owner()
> B gets to the "active_mm = tsk->active_mm" in exec_mmap()
> A finds that B->mm == mm
> B continues through the critical section, gets past the point where it
> needs to check for ownership
> A sets mm->owner = B
> B finishes its exec, and carries on with its new mmap
> 

OK, a task changing the mm from underneath us can be a problem. Let me move over
to using task_lock(). I wish there was a simpler solution for implementing
mm->owner, but the fact that CLONE_THREAD and CLONE_VM can be called
independently is a huge bottleneck.

> 
>> Even if we have task_lock(new_owner), it can still exit or
>>  exec later.
> 
> Yes, but once we've set mm->owner to the other task and released its
> task_lock, the new owner is responsible for handing off the mm to yet
> another owner if necessary.
> 
>>  Why mix task_lock() to protect mm->owner?
> 
> We're not protecting mm->owner - we're protecting new_owner->mm
>

Yes


> Paul

-- 
	Warm Regards,
	Balbir Singh
	Linux Technology Center
	IBM, ISTL

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

     prev parent reply	other threads:[~2008-03-29  5:50 UTC|newest]

Thread overview: 23+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2008-03-28  8:23 Balbir Singh
2008-03-28  9:41 ` Jiri Slaby
2008-03-28  9:43   ` Jiri Slaby
2008-03-28 10:11     ` Balbir Singh
2008-03-28 10:48 ` KAMEZAWA Hiroyuki
2008-03-28 10:51   ` Balbir Singh
2008-03-28 11:06     ` KAMEZAWA Hiroyuki
2008-03-28 10:55 ` KAMEZAWA Hiroyuki
2008-03-28 10:52   ` Balbir Singh
2008-03-28 11:04     ` Paul Menage
2008-03-28 11:15     ` KAMEZAWA Hiroyuki
2008-03-28 11:21       ` KAMEZAWA Hiroyuki
2008-03-28 11:01 ` Paul Menage
2008-03-28 12:36   ` Balbir Singh
2008-03-28 12:54     ` Balbir Singh
2008-03-28 14:06       ` Paul Menage
2008-03-28 14:05     ` Paul Menage
2008-03-28 14:52       ` Balbir Singh
2008-03-28 15:38         ` Paul Menage
2008-03-28 18:10           ` Balbir Singh
2008-03-28 18:52             ` Paul Menage
2008-03-29  1:02               ` Balbir Singh
2008-03-29  5:46               ` Balbir Singh [this message]

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=47EDD79A.7040108@linux.vnet.ibm.com \
    --to=balbir@linux.vnet.ibm.com \
    --cc=akpm@linux-foundation.org \
    --cc=hugh@veritas.com \
    --cc=kamezawa.hiroyu@jp.fujitsu.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-mm@kvack.org \
    --cc=lizf@cn.fujitsu.com \
    --cc=menage@google.com \
    --cc=rientjes@google.com \
    --cc=skumar@linux.vnet.ibm.com \
    --cc=taka@valinux.co.jp \
    --cc=xemul@openvz.org \
    --cc=yamamoto@valinux.co.jp \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Be sure your reply has a Subject: header at the top and a blank line before the message body.

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox