From: Kirill Korotaev <dev@sw.ru>
To: Kirill Korotaev <dev@sw.ru>
Cc: Christoph Lameter <christoph@lameter.com>,
Pavel Machek <pavel@ucw.cz>, Linus Torvalds <torvalds@osdl.org>,
linux-mm@kvack.org, linux-kernel@vger.kernel.org,
raybry@engr.sgi.com, Alexey Kuznetsov <kuznet@ms2.inr.ac.ru>
Subject: Re: [RFC] Fix SMP brokenness for PF_FREEZE and make freezing usable for other purposes
Date: Tue, 28 Jun 2005 12:13:04 +0400 [thread overview]
Message-ID: <42C10690.10108@sw.ru> (raw)
In-Reply-To: <42C0FCB3.4030205@sw.ru>
Christoph I was wrong a bit. Due to use of completion you have no one
race I described before. If the task is leaving refrigarator with
TIF_FREEZE it will just visit refrigarator() once more, but won't sleep
there since completion is done. BTW, I see no place where you
initialize the completion.
Kirill
>>>> -static inline int freezing(struct task_struct *p)
>>>> -{
>>>> - return p->flags & PF_FREEZE;
>>>> -}
>>>> +#if defined(CONFIG_PM) || defined(CONFIG_MIGRATE)
>>>
>>>
>>> <<<< why not to make a single option CONFIG_REFRIGERATOR? It looks to
>>> be a
>>> more robust way, since there are multiple users of it.
>>
>>
>>
>> Yes. That may be better. We can do that once the migration code is
>> finished and when we know what kind of CONFIG_XXX the migration code
>> really needs.
>>
>>
>>>> +#ifdef CONFIG_PM
>>>
>>>
>>> <<<< is it intentionaly? or you just lost CONFIG_MIGRATE?
>>
>> It is intentional. freeze_processes and thaw_processes are only needed
>> for suspend. One only needs to freeze a couple of processes for
>> process migration.
>
> But PM and your migrate code can be not the only users of it.
>
>>> <<<< I still think this refrigerator is racy with freeze_processes():
>>> <<<< scenarios:
>>> <<<< scenario 1
>>> <<<<
>>> task1 -> freeze_processes(): taskXXX ->refrigerator()
>>> checked (task->flags & PF_FROZEN) == 0 cur->flags |= PF_FROZEN
>>> clear TIF_FREEZE
>>> <sleep on thaw>
>>> set TIF_FREEZING
>>> clear PF_FROZEN
>>>
>>> <<<< so the task awakes with TIF_FREEZE flag set!!!
>>
>>
>>
>> Hmm... If we wait to clear both flags until after the completion
>> notification then we do not have the race right? But then we need to
>> move the signal recalc since it tests for TIF_FREEZE too.
>
> It is almost ok, but it is still not fine :)
>
> look what happens if you call freeze/unfreeze in a loop:
>
> refrigerator:
> awakes
>
> freezer:
> check PF_FROZEN, it is still set, skips task and thinks it is finished
> freezing.
>
> refrigerator:
> clears PF_FROZEN and TIF_FREEZE and returns.
>
> I think you can fix this by moving PF_FROZEN check and set in both
> places under siglock.
>
> Kirill
>
>
>>> <<<< scenario 2
>>> <<<< look at error path in freeze_processes (on timeout), it is
>>> broken as
>>> well. You need to wakeup tasks there...
>>
>>
>>
>> Ok. How about this additional patch? This still requires that process
>> freezing does not immediately occurr again after the completion
>> handler. All of this is iffy due to not having a real lock protecting
>> all these values and we may still need to add some barriers.
>>
>> Index: linux-2.6.12/kernel/power/process.c
>> ===================================================================
>> --- linux-2.6.12.orig/kernel/power/process.c 2005-06-28
>> 06:34:52.000000000 +0000
>> +++ linux-2.6.12/kernel/power/process.c 2005-06-28
>> 06:40:28.000000000 +0000
>> @@ -47,12 +47,13 @@ int freeze_processes(void)
>> unsigned long flags;
>> if (!freezeable(p))
>> continue;
>> - if ((p->flags & PF_FROZEN) ||
>> - (p->state == TASK_TRACED) ||
>> + if ((p->state == TASK_TRACED) ||
>> (p->state == TASK_STOPPED))
>> continue;
>>
>> set_thread_flag(TIF_FREEZE);
>> + if (p->flags & PF_FROZEN)
>> + continue;
>> spin_lock_irqsave(&p->sighand->siglock, flags);
>> signal_wake_up(p, 0);
>> spin_unlock_irqrestore(&p->sighand->siglock, flags);
>> @@ -63,6 +64,8 @@ int freeze_processes(void)
>> if (time_after(jiffies, start_time + TIMEOUT)) {
>> printk( "\n" );
>> printk(KERN_ERR " stopping tasks failed (%d tasks
>> remaining)\n", todo );
>> + complete_all(&thaw);
>> + up(&freezer_sem);
>> return todo;
>> }
>> } while(todo);
>> Index: linux-2.6.12/kernel/sched.c
>> ===================================================================
>> --- linux-2.6.12.orig/kernel/sched.c 2005-06-28 06:34:52.000000000
>> +0000
>> +++ linux-2.6.12/kernel/sched.c 2005-06-28 06:37:36.000000000 +0000
>> @@ -5210,13 +5210,13 @@ DECLARE_COMPLETION(thaw);
>> void refrigerator(void)
>> {
>> current->flags |= PF_FROZEN;
>> + wait_for_completion(&thaw);
>> clear_thread_flag(TIF_FREEZE);
>> + current->flags &= ~PF_FROZEN;
>> /* A fake signal 0 may have been sent. Recalculate sigpending */
>> spin_lock_irq(¤t->sighand->siglock);
>> recalc_sigpending();
>> spin_unlock_irq(¤t->sighand->siglock);
>> - wait_for_completion(&thaw);
>> - current->flags &= ~PF_FROZEN;
>> }
>> EXPORT_SYMBOL(refrigerator);
>> #endif
>>
>
>
--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org. For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"aart@kvack.org"> aart@kvack.org </a>
next prev parent reply other threads:[~2005-06-28 8:13 UTC|newest]
Thread overview: 35+ messages / expand[flat|nested] mbox.gz Atom feed top
2005-06-24 20:20 Christoph Lameter
2005-06-24 22:25 ` Nigel Cunningham
2005-06-25 2:51 ` Pavel Machek
2005-06-25 4:31 ` Christoph Lameter
2005-06-25 4:46 ` Nigel Cunningham
2005-06-25 22:37 ` Pavel Machek
2005-06-25 6:13 ` Christoph Lameter
2005-06-26 2:30 ` Pavel Machek
2005-06-26 2:55 ` Christoph Lameter
2005-06-26 3:09 ` Pavel Machek
2005-06-27 3:51 ` Christoph Lameter
2005-06-27 4:21 ` Pavel Machek
2005-06-27 4:24 ` Linus Torvalds
2005-06-27 5:53 ` Christoph Lameter
2005-06-27 14:13 ` Pavel Machek
2005-06-27 15:13 ` Christoph Lameter
2005-06-28 6:18 ` Kirill Korotaev
2005-06-28 7:01 ` Christoph Lameter
2005-06-28 7:30 ` Kirill Korotaev
2005-06-28 8:13 ` Kirill Korotaev [this message]
2005-06-28 14:02 ` Christoph Lameter
2005-06-28 14:01 ` Christoph Lameter
2005-06-28 12:47 ` Pavel Machek
2005-07-01 17:06 ` [SUSPEND 1/2]Replace PF_FREEZE with TIF_FREEZE Christoph Lameter
2005-07-01 17:14 ` [SUSPEND 2/2] Replace PF_FROZEN with TASK_FROZEN Christoph Lameter
2005-06-28 21:50 ` [RFC] Fix SMP brokenness for PF_FREEZE and make freezing usable for other purposes Ray Bryant
2005-06-28 21:54 ` Christoph Lameter
2005-07-03 11:06 ` Pavel Machek
2005-06-26 10:54 ` Nigel Cunningham
2005-06-25 6:18 ` Christoph Lameter
2005-06-25 7:35 ` Kirill Korotaev
2005-06-27 7:06 ` Ray Bryant
2005-06-27 13:17 ` Pavel Machek
2005-06-27 14:59 ` Ray Bryant
2005-06-27 18:05 ` Pavel Machek
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=42C10690.10108@sw.ru \
--to=dev@sw.ru \
--cc=christoph@lameter.com \
--cc=kuznet@ms2.inr.ac.ru \
--cc=linux-kernel@vger.kernel.org \
--cc=linux-mm@kvack.org \
--cc=pavel@ucw.cz \
--cc=raybry@engr.sgi.com \
--cc=torvalds@osdl.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox