linux-mm.kvack.org archive mirror
 help / color / mirror / Atom feed
From: Kirill Korotaev <dev@sw.ru>
To: Christoph Lameter <christoph@lameter.com>
Cc: Pavel Machek <pavel@ucw.cz>, Linus Torvalds <torvalds@osdl.org>,
	linux-mm@kvack.org, linux-kernel@vger.kernel.org,
	raybry@engr.sgi.com, Alexey Kuznetsov <kuznet@ms2.inr.ac.ru>
Subject: Re: [RFC] Fix SMP brokenness for PF_FREEZE and make freezing usable for other purposes
Date: Tue, 28 Jun 2005 11:30:59 +0400	[thread overview]
Message-ID: <42C0FCB3.4030205@sw.ru> (raw)
In-Reply-To: <Pine.LNX.4.62.0506272323490.30956@graphe.net>

>>>-static inline int freezing(struct task_struct *p)
>>>-{
>>>-	return p->flags & PF_FREEZE;
>>>-}
>>>+#if defined(CONFIG_PM) || defined(CONFIG_MIGRATE)
>>
>><<<< why not to make a single option CONFIG_REFRIGERATOR? It looks to be a
>>more robust way, since there are multiple users of it.
> 
> 
> Yes. That may be better. We can do that once the migration code is 
> finished and when we know what kind of CONFIG_XXX the migration code 
> really needs.
> 
> 
>>>+#ifdef CONFIG_PM
>>
>><<<< is it intentionaly? or you just lost CONFIG_MIGRATE?
> It is intentional. freeze_processes and thaw_processes are only needed 
> for suspend. One only needs to freeze a couple of processes for process 
> migration.
But PM and your migrate code can be not the only users of it.

>><<<< I still think this refrigerator is racy with freeze_processes():
>><<<< scenarios:
>><<<< scenario 1
>><<<<
>>task1 -> freeze_processes():			taskXXX ->refrigerator()
>>  checked (task->flags & PF_FROZEN) == 0	cur->flags |= PF_FROZEN
>>						clear TIF_FREEZE
>>						<sleep on thaw>
>>  set TIF_FREEZING
>>						clear PF_FROZEN
>>
>><<<< so the task awakes with TIF_FREEZE flag set!!!
> 
> 
> Hmm... If we wait to clear both flags until after the completion 
> notification then we do not have the race right? But then we need to move 
> the signal recalc since it tests for TIF_FREEZE too.
It is almost ok, but it is still not fine :)

look what happens if you call freeze/unfreeze in a loop:

refrigerator:
awakes

freezer:
check PF_FROZEN, it is still set, skips task and thinks it is finished 
freezing.

refrigerator:
clears PF_FROZEN and TIF_FREEZE and returns.

I think you can fix this by moving PF_FROZEN check and set in both 
places under siglock.

Kirill


>><<<< scenario 2
>><<<< look at error path in freeze_processes (on timeout), it is broken as
>>well. You need to wakeup tasks there...
> 
> 
> Ok. How about this additional patch? This still requires that process 
> freezing does not immediately occurr again after the completion 
> handler. All of this is iffy due to not having a real lock protecting all 
> these values and we may still need to add some barriers.
> 
> Index: linux-2.6.12/kernel/power/process.c
> ===================================================================
> --- linux-2.6.12.orig/kernel/power/process.c	2005-06-28 06:34:52.000000000 +0000
> +++ linux-2.6.12/kernel/power/process.c	2005-06-28 06:40:28.000000000 +0000
> @@ -47,12 +47,13 @@ int freeze_processes(void)
>  			unsigned long flags;
>  			if (!freezeable(p))
>  				continue;
> -			if ((p->flags & PF_FROZEN) ||
> -			    (p->state == TASK_TRACED) ||
> +			if ((p->state == TASK_TRACED) ||
>  			    (p->state == TASK_STOPPED))
>  				continue;
>  
>  			set_thread_flag(TIF_FREEZE);
> +			if (p->flags & PF_FROZEN)
> +				continue;
>  			spin_lock_irqsave(&p->sighand->siglock, flags);
>  			signal_wake_up(p, 0);
>  			spin_unlock_irqrestore(&p->sighand->siglock, flags);
> @@ -63,6 +64,8 @@ int freeze_processes(void)
>  		if (time_after(jiffies, start_time + TIMEOUT)) {
>  			printk( "\n" );
>  			printk(KERN_ERR " stopping tasks failed (%d tasks remaining)\n", todo );
> +			complete_all(&thaw);
> +			up(&freezer_sem);
>  			return todo;
>  		}
>  	} while(todo);
> Index: linux-2.6.12/kernel/sched.c
> ===================================================================
> --- linux-2.6.12.orig/kernel/sched.c	2005-06-28 06:34:52.000000000 +0000
> +++ linux-2.6.12/kernel/sched.c	2005-06-28 06:37:36.000000000 +0000
> @@ -5210,13 +5210,13 @@ DECLARE_COMPLETION(thaw);
>  void refrigerator(void)
>  {
>  	current->flags |= PF_FROZEN;
> +	wait_for_completion(&thaw);
>  	clear_thread_flag(TIF_FREEZE);
> +	current->flags &= ~PF_FROZEN;
>  	/* A fake signal 0 may have been sent. Recalculate sigpending */
>  	spin_lock_irq(&current->sighand->siglock);
>  	recalc_sigpending();
>  	spin_unlock_irq(&current->sighand->siglock);
> -	wait_for_completion(&thaw);
> -	current->flags &= ~PF_FROZEN;
>  }
>  EXPORT_SYMBOL(refrigerator);
>  #endif
> 


--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"aart@kvack.org"> aart@kvack.org </a>

  reply	other threads:[~2005-06-28  7:30 UTC|newest]

Thread overview: 35+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2005-06-24 20:20 Christoph Lameter
2005-06-24 22:25 ` Nigel Cunningham
2005-06-25  2:51 ` Pavel Machek
2005-06-25  4:31   ` Christoph Lameter
2005-06-25  4:46     ` Nigel Cunningham
2005-06-25 22:37     ` Pavel Machek
2005-06-25  6:13   ` Christoph Lameter
2005-06-26  2:30     ` Pavel Machek
2005-06-26  2:55       ` Christoph Lameter
2005-06-26  3:09         ` Pavel Machek
2005-06-27  3:51           ` Christoph Lameter
2005-06-27  4:21             ` Pavel Machek
2005-06-27  4:24             ` Linus Torvalds
2005-06-27  5:53               ` Christoph Lameter
2005-06-27 14:13                 ` Pavel Machek
2005-06-27 15:13                   ` Christoph Lameter
2005-06-28  6:18                     ` Kirill Korotaev
2005-06-28  7:01                       ` Christoph Lameter
2005-06-28  7:30                         ` Kirill Korotaev [this message]
2005-06-28  8:13                           ` Kirill Korotaev
2005-06-28 14:02                             ` Christoph Lameter
2005-06-28 14:01                           ` Christoph Lameter
2005-06-28 12:47                       ` Pavel Machek
2005-07-01 17:06                         ` [SUSPEND 1/2]Replace PF_FREEZE with TIF_FREEZE Christoph Lameter
2005-07-01 17:14                           ` [SUSPEND 2/2] Replace PF_FROZEN with TASK_FROZEN Christoph Lameter
2005-06-28 21:50                 ` [RFC] Fix SMP brokenness for PF_FREEZE and make freezing usable for other purposes Ray Bryant
2005-06-28 21:54                   ` Christoph Lameter
2005-07-03 11:06                     ` Pavel Machek
2005-06-26 10:54       ` Nigel Cunningham
2005-06-25  6:18   ` Christoph Lameter
2005-06-25  7:35 ` Kirill Korotaev
2005-06-27  7:06   ` Ray Bryant
2005-06-27 13:17     ` Pavel Machek
2005-06-27 14:59       ` Ray Bryant
2005-06-27 18:05         ` Pavel Machek

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=42C0FCB3.4030205@sw.ru \
    --to=dev@sw.ru \
    --cc=christoph@lameter.com \
    --cc=kuznet@ms2.inr.ac.ru \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-mm@kvack.org \
    --cc=pavel@ucw.cz \
    --cc=raybry@engr.sgi.com \
    --cc=torvalds@osdl.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox