linux-mm.kvack.org archive mirror
From: Pavel Emelyanov <xemul@parallels.com>
To: Minchan Kim <minchan@kernel.org>
Cc: Andrew Morton <akpm@linux-foundation.org>,
	KOSAKI Motohiro <kosaki.motohiro@gmail.com>,
	linux-mm@kvack.org
Subject: Re: Change soft-dirty interface?
Date: Fri, 14 Jun 2013 14:01:23 +0400
Message-ID: <51BAE9F3.5030301@parallels.com>
In-Reply-To: <20130614050738.GA21852@bbox>

>>>>> If it's not allowed, another approach should be new system call.
>>>>>
>>>>>         int sys_softdirty(pid_t pid, void *addr, size_t len);
>>>>
>>>> This looks like existing sys_madvise() one.
>>>
>>> Except pid part. It is added by your purpose, which external task
>>> can control any process.

In CRIU we can work with pid-less syscalls just fine :) So extending regular
madvise would work.

>>>>
>>>>> If we approach new system call, we don't need to maintain current
>>>>> proc interface and it would be very handy to get a information
>>>>> without pagemap (open/read/close) so we can add a parameter to
>>>>> get a dirty information easily.
>>>>>
>>>>>         int sys_softdirty(pid_t pid, void *addr, size_t len, unsigned char *vec)
>>>>>
>>>>> What do you think about it?
>>>>>
>>>>
>>>> This is OK for me, though there's another issue with this API I'd like
>>>> to mention -- consider your app is doing these tricks with soft-dirty
>>>> and at the same time CRIU tools live-migrate it using the soft-dirty bits
>>>> to optimize the freeze time.
>>>>
>>>> In that case soft-dirty bits would be in the wrong state for both -- your app
>>>> and CRIU, but with the proc API we could compare the ctime-s of the
>>>> clear_refs file and find out, that someone spoiled the soft-dirty state
>>>> from last time we messed with it and handle it somehow (copy all the memory
>>>> in the worst case). Can we somehow handle this with your proposal?
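For reference, the ctime check mentioned above could look roughly like this
from user space. This is only a minimal sketch: the helper names snap_take()
and snap_changed() are made up for illustration, and it assumes, as described
above, that a write to the clear_refs file updates its ctime.

```c
#define _POSIX_C_SOURCE 200809L
#include <sys/stat.h>
#include <time.h>

/* ctime of a file as seen right after we last touched it. */
struct ctime_snap {
	struct timespec ctime;
};

/* Record the current ctime of path. Returns 0 on success, -1 on error. */
static int snap_take(const char *path, struct ctime_snap *snap)
{
	struct stat st;

	if (stat(path, &st) != 0)
		return -1;
	snap->ctime = st.st_ctim;
	return 0;
}

/*
 * Returns 1 if the ctime of path differs from the snapshot (someone else
 * touched the file since), 0 if it is unchanged, -1 if stat() failed.
 */
static int snap_changed(const char *path, const struct ctime_snap *snap)
{
	struct stat st;

	if (stat(path, &st) != 0)
		return -1;
	return st.st_ctim.tv_sec != snap->ctime.tv_sec ||
	       st.st_ctim.tv_nsec != snap->ctime.tv_nsec;
}
```

A tracker would call snap_take() on /proc/<pid>/clear_refs right after
writing to it, and snap_changed() before trusting the soft-dirty bits; a
result of 1 means falling back to copying all the memory.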
>>>
>>> Good point, I hadn't thought about that.
>>> A simple idea that comes to mind is to use a read/write lock:
>>> if pid is equal to the calling process's one or pid is NULL,
>>> we take the read-side lock, which allows several vmas to be
>>> marked soft-dirty in parallel. If pid is not equal to the calling
>>> process's one, the API should try to take the write-side lock
>>> and, if that fails, return EAGAIN so that CRIU
>>> can progress other processes and retry after a while.
>>> Of course, that could live-lock, so sys_softdirty might
>>> need another argument like "int block".
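The locking rule described above can be modelled in user space with a
pthread rwlock. This is a sketch only, not the proposed kernel code:
softdirty_lock() and its "external" and "block" parameters are hypothetical
names, and the pthread lock merely stands in for whatever lock the kernel
would use.

```c
#define _POSIX_C_SOURCE 200809L
#include <errno.h>
#include <pthread.h>
#include <stdbool.h>

/*
 * Self callers share the lock (read side), so several of them can run
 * in parallel. External callers need it exclusively (write side); when
 * "block" is false they get -EAGAIN instead of waiting.
 * Returns 0 with the lock held, or a negative errno value.
 */
static int softdirty_lock(pthread_rwlock_t *lock, bool external, bool block)
{
	int err;

	if (!external)
		err = pthread_rwlock_rdlock(lock);
	else if (block)
		err = pthread_rwlock_wrlock(lock);
	else
		err = pthread_rwlock_trywrlock(lock);

	if (err == EBUSY)
		return -EAGAIN;	/* someone else holds it, retry later */
	return err ? -err : 0;
}
```

With block set to false, an external tracker that gets -EAGAIN can move on to
other processes and come back later; with block set to true it waits, which is
what the extra "int block" argument is meant to allow.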
>>
>> And we need a flag to show SELF_SOFT_DIRTY or EXTERNAL_SOFT_DIRTY
>> and the flag will be protected by above lock. It could prevent mixed
>> case by self and external.
> 
> I realized it's not enough. Another idea is here.
> The intention is as follows:
> 
> self softdirty VS self softdirty -> NOT exclusive
> self softdirty VS external softdirty -> exclusive
> external softdirty VS external softdirty -> exclusive

I think it might work for us. However, I have two comments on the
implementation, please see below.

> struct softdirty {
>         u64 external;
>         u64 internal;
> };
> 
>        int sys_set_softdirty(pid_t pid, unsigned long start, size_t len,
>                                 struct softdirty *token);
>        int sys_get_softdirty(pid_t pid, unsigned long start, size_t len,
>                                 struct softdirty token, char *vec);

Can you please show an example of how to use these two? I don't quite get how
I can do external soft-dirty tracking in an atomic manner.

> 
> SYSCALL(set_softdirty, ..., token)
> {
>         struct task_struct *tsk = task_from_pid(pid);
>         struct mm_struct *mm = tsk->mm;
> 
>         mutex_lock(&mm->st_lock);
>         if (tsk == current)
>                 mm->token.internal++;
>         else
>                 mm->token.external++;
>         token->external = mm->token.external;
>         token->internal = mm->token.internal;
>         mutex_unlock(&mm->st_lock);
>         ..
>         ..
> 
> }
> 
> SYSCALL(get_softdirty, ..., token, ...)
> {
>         struct task_struct *tsk = task_from_pid(pid);
>         struct mm_struct *mm = tsk->mm;
> 
>         mutex_lock(&mm->st_lock);
>         if (tsk == current) {
>                 if (mm->token.external != token.external) {
>                         mutex_unlock(&mm->st_lock);
>                         return -EAGAIN;
>                 }
>         } else {
>                 if (mm->token.external != token.external ||
>                     mm->token.internal != token.internal) {
>                         mutex_unlock(&mm->st_lock);
>                         return -EAGAIN;
>                 }
>         }
>         mutex_unlock(&mm->st_lock);

Presumably the critical section should be longer. If the tokens match and we
release the lock before working on the pagemap, a concurrent call to
set_softdirty can proceed in between and spoil the picture.

>         ...
> }
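To illustrate the longer critical section mentioned above, here is a rough
user-space model in C. It is not kernel code: struct mm_model,
model_set_softdirty() and model_get_softdirty() are names invented for this
sketch, a pthread mutex stands in for st_lock, and a byte array stands in for
the per-page soft-dirty bits. It also assumes that set_softdirty clears the
bits, which the quoted code elides.

```c
#define _POSIX_C_SOURCE 200809L
#include <errno.h>
#include <pthread.h>
#include <stdint.h>
#include <string.h>

#define MODEL_PAGES 16

struct softdirty_token {
	uint64_t external;
	uint64_t internal;
};

/* Stand-in for the mm: lock, generation counters, one byte per page. */
struct mm_model {
	pthread_mutex_t st_lock;
	struct softdirty_token token;
	unsigned char dirty[MODEL_PAGES];
};

static void model_init(struct mm_model *mm)
{
	pthread_mutex_init(&mm->st_lock, NULL);
	memset(&mm->token, 0, sizeof(mm->token));
	memset(mm->dirty, 0, sizeof(mm->dirty));
}

/* Bump the matching counter and clear the bits in one critical section. */
static void model_set_softdirty(struct mm_model *mm, int is_self,
				struct softdirty_token *out)
{
	pthread_mutex_lock(&mm->st_lock);
	if (is_self)
		mm->token.internal++;
	else
		mm->token.external++;
	*out = mm->token;
	memset(mm->dirty, 0, sizeof(mm->dirty));
	pthread_mutex_unlock(&mm->st_lock);
}

/*
 * The token check and the copy of the dirty state happen under the same
 * lock hold, so a concurrent set cannot slip in between the two.
 */
static int model_get_softdirty(struct mm_model *mm, int is_self,
			       struct softdirty_token tok, unsigned char *vec)
{
	int ret = 0;

	pthread_mutex_lock(&mm->st_lock);
	if (mm->token.external != tok.external ||
	    (!is_self && mm->token.internal != tok.internal))
		ret = -EAGAIN;
	else
		memcpy(vec, mm->dirty, sizeof(mm->dirty));
	pthread_mutex_unlock(&mm->st_lock);
	return ret;
}
```

In this model an external tracker that sees -EAGAIN knows its picture was
spoiled and has to start over (or copy everything); a successful return means
vec was filled before any other set could run.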
> 
> 
> 
> 
>>
>> -- 
>> Kind regards,
>> Minchan Kim
>>
>> --
>> To unsubscribe, send a message with 'unsubscribe linux-mm' in
>> the body to majordomo@kvack.org.  For more info on Linux MM,
>> see: http://www.linux-mm.org/ .
>> Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>
> 


