linux-mm.kvack.org archive mirror
 help / color / mirror / Atom feed
From: Michal Hocko <mhocko@suse.cz>
To: Tejun Heo <tj@kernel.org>
Cc: "Rafael J. Wysocki" <rjw@rjwysocki.net>,
	Andrew Morton <akpm@linux-foundation.org>,
	Cong Wang <xiyou.wangcong@gmail.com>,
	David Rientjes <rientjes@google.com>,
	Oleg Nesterov <oleg@redhat.com>,
	LKML <linux-kernel@vger.kernel.org>,
	linux-mm@kvack.org, Linux PM list <linux-pm@vger.kernel.org>
Subject: Re: [PATCH 3/4] OOM, PM: OOM killed task shouldn't escape PM suspend
Date: Wed, 5 Nov 2014 15:55:51 +0100	[thread overview]
Message-ID: <20141105145551.GF4527@dhcp22.suse.cz> (raw)
In-Reply-To: <20141105124620.GB4527@dhcp22.suse.cz>

On Wed 05-11-14 13:46:20, Michal Hocko wrote:
[...]
> From ef6227565fa65b52986c4626d49ba53b499e54d1 Mon Sep 17 00:00:00 2001
> From: Michal Hocko <mhocko@suse.cz>
> Date: Wed, 5 Nov 2014 11:49:14 +0100
> Subject: [PATCH] OOM, PM: make OOM detection in the freezer path raceless
> 
> 5695be142e20 (OOM, PM: OOM killed task shouldn't escape PM suspend)
> has left a race window when OOM killer manages to note_oom_kill after
> freeze_processes checks the counter. The race window is quite small
> and really unlikely and deemed sufficient at the time of submission.
> 
> Tejun wasn't happy about this partial solution though and insisted on
> a full solution. That requires the full OOM and freezer exclusion,
> though. This is done by this patch which introduces oom_sem RW lock.
> Page allocation OOM path takes the lock for reading because there might
> be concurrent OOM happening on disjunct zonelists. oom_killer_disabled
> check is moved right before out_of_memory is called because it was
> checked too early before and we do not want to hold the lock while doing
> the last attempt for allocation which might involve zone_reclaim.

This is incorrect because it would cause an endless allocation loop
because we really have to got to no_page if OOM is disabled.

> freeze_processes then takes the lock for write throughout the whole
> freezing process and OOM disabling.
> 
> There is no need to recheck all the processes with the full
> synchronization anymore.
> 
> Signed-off-by: Michal Hocko <mhocko@suse.cz>
> ---
>  include/linux/oom.h    |  5 +++++
>  kernel/power/process.c | 50 +++++++++-----------------------------------------
>  mm/oom_kill.c          | 17 -----------------
>  mm/page_alloc.c        | 24 ++++++++++++------------
>  4 files changed, 26 insertions(+), 70 deletions(-)
> 
> diff --git a/include/linux/oom.h b/include/linux/oom.h
> index e8d6e1058723..350b9b2ffeec 100644
> --- a/include/linux/oom.h
> +++ b/include/linux/oom.h
> @@ -73,7 +73,12 @@ extern void out_of_memory(struct zonelist *zonelist, gfp_t gfp_mask,
>  extern int register_oom_notifier(struct notifier_block *nb);
>  extern int unregister_oom_notifier(struct notifier_block *nb);
>  
> +/*
> + * oom_killer_disabled can be modified only under oom_sem taken for write
> + * and checked under read lock along with the full OOM handler.
> + */
>  extern bool oom_killer_disabled;
> +extern struct rw_semaphore oom_sem;
>  
>  static inline void oom_killer_disable(void)
>  {
> diff --git a/kernel/power/process.c b/kernel/power/process.c
> index 5a6ec8678b9a..befce9785233 100644
> --- a/kernel/power/process.c
> +++ b/kernel/power/process.c
> @@ -108,30 +108,6 @@ static int try_to_freeze_tasks(bool user_only)
>  	return todo ? -EBUSY : 0;
>  }
>  
> -static bool __check_frozen_processes(void)
> -{
> -	struct task_struct *g, *p;
> -
> -	for_each_process_thread(g, p)
> -		if (p != current && !freezer_should_skip(p) && !frozen(p))
> -			return false;
> -
> -	return true;
> -}
> -
> -/*
> - * Returns true if all freezable tasks (except for current) are frozen already
> - */
> -static bool check_frozen_processes(void)
> -{
> -	bool ret;
> -
> -	read_lock(&tasklist_lock);
> -	ret = __check_frozen_processes();
> -	read_unlock(&tasklist_lock);
> -	return ret;
> -}
> -
>  /**
>   * freeze_processes - Signal user space processes to enter the refrigerator.
>   * The current thread will not be frozen.  The same process that calls
> @@ -142,7 +118,6 @@ static bool check_frozen_processes(void)
>  int freeze_processes(void)
>  {
>  	int error;
> -	int oom_kills_saved;
>  
>  	error = __usermodehelper_disable(UMH_FREEZING);
>  	if (error)
> @@ -157,27 +132,20 @@ int freeze_processes(void)
>  	pm_wakeup_clear();
>  	printk("Freezing user space processes ... ");
>  	pm_freezing = true;
> -	oom_kills_saved = oom_kills_count();
> +
> +	/*
> +	 * Need to exlude OOM killer from triggering while tasks are
> +	 * getting frozen to make sure none of them gets killed after
> +	 * try_to_freeze_tasks is done.
> +	 */
> +	down_write(&oom_sem);
>  	error = try_to_freeze_tasks(true);
>  	if (!error) {
>  		__usermodehelper_set_disable_depth(UMH_DISABLED);
>  		oom_killer_disable();
> -
> -		/*
> -		 * There might have been an OOM kill while we were
> -		 * freezing tasks and the killed task might be still
> -		 * on the way out so we have to double check for race.
> -		 */
> -		if (oom_kills_count() != oom_kills_saved &&
> -		    !check_frozen_processes()) {
> -			__usermodehelper_set_disable_depth(UMH_ENABLED);
> -			printk("OOM in progress.");
> -			error = -EBUSY;
> -		} else {
> -			printk("done.");
> -		}
> +		printk("done.\n");
>  	}
> -	printk("\n");
> +	up_write(&oom_sem);
>  	BUG_ON(in_atomic());
>  
>  	if (error)
> diff --git a/mm/oom_kill.c b/mm/oom_kill.c
> index 5340f6b91312..bbf405a3a18f 100644
> --- a/mm/oom_kill.c
> +++ b/mm/oom_kill.c
> @@ -404,23 +404,6 @@ static void dump_header(struct task_struct *p, gfp_t gfp_mask, int order,
>  		dump_tasks(memcg, nodemask);
>  }
>  
> -/*
> - * Number of OOM killer invocations (including memcg OOM killer).
> - * Primarily used by PM freezer to check for potential races with
> - * OOM killed frozen task.
> - */
> -static atomic_t oom_kills = ATOMIC_INIT(0);
> -
> -int oom_kills_count(void)
> -{
> -	return atomic_read(&oom_kills);
> -}
> -
> -void note_oom_kill(void)
> -{
> -	atomic_inc(&oom_kills);
> -}
> -
>  #define K(x) ((x) << (PAGE_SHIFT-10))
>  /*
>   * Must be called while holding a reference to p, which will be released upon
> diff --git a/mm/page_alloc.c b/mm/page_alloc.c
> index 9cd36b822444..76095266c4b5 100644
> --- a/mm/page_alloc.c
> +++ b/mm/page_alloc.c
> @@ -243,6 +243,7 @@ void set_pageblock_migratetype(struct page *page, int migratetype)
>  }
>  
>  bool oom_killer_disabled __read_mostly;
> +DECLARE_RWSEM(oom_sem);
>  
>  #ifdef CONFIG_DEBUG_VM
>  static int page_outside_zone_boundaries(struct zone *zone, struct page *page)
> @@ -2252,14 +2253,6 @@ __alloc_pages_may_oom(gfp_t gfp_mask, unsigned int order,
>  	}
>  
>  	/*
> -	 * PM-freezer should be notified that there might be an OOM killer on
> -	 * its way to kill and wake somebody up. This is too early and we might
> -	 * end up not killing anything but false positives are acceptable.
> -	 * See freeze_processes.
> -	 */
> -	note_oom_kill();
> -
> -	/*
>  	 * Go through the zonelist yet one more time, keep very high watermark
>  	 * here, this is only to catch a parallel oom killing, we must fail if
>  	 * we're still under heavy pressure.
> @@ -2288,8 +2281,17 @@ __alloc_pages_may_oom(gfp_t gfp_mask, unsigned int order,
>  		if (gfp_mask & __GFP_THISNODE)
>  			goto out;
>  	}
> -	/* Exhausted what can be done so it's blamo time */
> -	out_of_memory(zonelist, gfp_mask, order, nodemask, false);
> +
> +	/*
> +	 * Exhausted what can be done so it's blamo time.
> +	 * Just make sure that we cannot race with oom_killer disabling
> +	 * e.g. PM freezer needs to make sure that no OOM happens after
> +	 * all tasks are frozen.
> +	 */
> +	down_read(&oom_sem);
> +	if (!oom_killer_disabled)
> +		out_of_memory(zonelist, gfp_mask, order, nodemask, false);
> +	up_read(&oom_sem);
>  
>  out:
>  	oom_zonelist_unlock(zonelist, gfp_mask);
> @@ -2716,8 +2718,6 @@ rebalance:
>  	 */
>  	if (!did_some_progress) {
>  		if (oom_gfp_allowed(gfp_mask)) {
> -			if (oom_killer_disabled)
> -				goto nopage;
>  			/* Coredumps can quickly deplete all memory reserves */
>  			if ((current->flags & PF_DUMPCORE) &&
>  			    !(gfp_mask & __GFP_NOFAIL))
> -- 
> 2.1.1
> 
> 
> -- 
> Michal Hocko
> SUSE Labs
> 
> --
> To unsubscribe, send a message with 'unsubscribe linux-mm' in
> the body to majordomo@kvack.org.  For more info on Linux MM,
> see: http://www.linux-mm.org/ .
> Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

-- 
Michal Hocko
SUSE Labs

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

  parent reply	other threads:[~2014-11-05 14:55 UTC|newest]

Thread overview: 93+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2014-10-21  7:27 [PATCH 0/4 -v2] OOM vs. freezer interaction fixes Michal Hocko
2014-10-21  7:27 ` [PATCH 1/4] freezer: Do not freeze tasks killed by OOM killer Michal Hocko
2014-10-21 12:04   ` Rafael J. Wysocki
2014-10-21  7:27 ` [PATCH 2/4] freezer: remove obsolete comments in __thaw_task() Michal Hocko
2014-10-21 12:04   ` Rafael J. Wysocki
2014-10-21  7:27 ` [PATCH 3/4] OOM, PM: OOM killed task shouldn't escape PM suspend Michal Hocko
2014-10-21 12:09   ` Rafael J. Wysocki
2014-10-21 13:14     ` Michal Hocko
2014-10-21 13:42       ` Rafael J. Wysocki
2014-10-21 14:11         ` Michal Hocko
2014-10-21 14:41           ` Rafael J. Wysocki
2014-10-21 14:29             ` Michal Hocko
2014-10-22 14:39               ` Rafael J. Wysocki
2014-10-22 14:22                 ` Michal Hocko
2014-10-22 21:18                   ` Rafael J. Wysocki
2014-10-26 18:49               ` Pavel Machek
2014-11-04 19:27               ` Tejun Heo
2014-11-05 12:46                 ` Michal Hocko
2014-11-05 13:02                   ` Tejun Heo
2014-11-05 13:31                     ` Michal Hocko
2014-11-05 13:42                       ` Michal Hocko
2014-11-05 14:14                         ` Michal Hocko
2014-11-05 15:45                           ` Michal Hocko
2014-11-05 15:44                         ` Tejun Heo
2014-11-05 16:01                           ` Michal Hocko
2014-11-05 16:29                             ` Tejun Heo
2014-11-05 16:39                               ` Michal Hocko
2014-11-05 16:54                                 ` Tejun Heo
2014-11-05 17:01                                   ` Tejun Heo
2014-11-06 13:05                                     ` Michal Hocko
2014-11-06 15:09                                       ` Tejun Heo
2014-11-06 16:01                                         ` Michal Hocko
2014-11-06 16:12                                           ` Tejun Heo
2014-11-06 16:31                                             ` Michal Hocko
2014-11-06 16:33                                               ` Tejun Heo
2014-11-06 16:58                                                 ` Michal Hocko
2014-11-05 17:46                                   ` Michal Hocko
2014-11-05 17:55                                     ` Tejun Heo
2014-11-06 12:49                                       ` Michal Hocko
2014-11-06 15:01                                         ` Tejun Heo
2014-11-06 16:02                                           ` Michal Hocko
2014-11-06 16:28                                             ` Tejun Heo
2014-11-10 16:30                                               ` Michal Hocko
2014-11-12 18:58                                                 ` [RFC 0/4] OOM vs PM freezer fixes Michal Hocko
2014-11-12 18:58                                                   ` [RFC 1/4] OOM, PM: Do not miss OOM killed frozen tasks Michal Hocko
2014-11-14 17:55                                                     ` Tejun Heo
2014-11-12 18:58                                                   ` [RFC 2/4] OOM, PM: make OOM detection in the freezer path raceless Michal Hocko
2014-11-12 18:58                                                   ` [RFC 3/4] OOM, PM: handle pm freezer as an OOM victim correctly Michal Hocko
2014-11-12 18:58                                                   ` [RFC 4/4] OOM: thaw the OOM victim if it is frozen Michal Hocko
2014-11-14 20:14                                                   ` [RFC 0/4] OOM vs PM freezer fixes Tejun Heo
2014-11-18 21:08                                                     ` Michal Hocko
2014-11-18 21:10                                                       ` [RFC 1/2] oom: add helper for setting and clearing TIF_MEMDIE Michal Hocko
2014-11-18 21:10                                                         ` [RFC 2/2] OOM, PM: make OOM detection in the freezer path raceless Michal Hocko
2014-11-27  0:47                                                           ` Rafael J. Wysocki
2014-12-02 22:08                                                           ` Tejun Heo
2014-12-04 14:16                                                             ` Michal Hocko
2014-12-04 14:44                                                               ` Tejun Heo
2014-12-04 16:56                                                                 ` Michal Hocko
2014-12-04 17:18                                                                   ` Michal Hocko
2014-12-05 16:41                                                 ` [PATCH 0/4] OOM vs PM freezer fixes Michal Hocko
2014-12-05 16:41                                                   ` [PATCH -v2 1/5] oom: add helpers for setting and clearing TIF_MEMDIE Michal Hocko
2014-12-06 12:56                                                     ` Tejun Heo
2014-12-07 10:13                                                       ` Michal Hocko
2015-01-07 17:57                                                     ` Tejun Heo
2015-01-07 18:23                                                       ` Michal Hocko
2014-12-05 16:41                                                   ` [PATCH -v2 2/5] OOM: thaw the OOM victim if it is frozen Michal Hocko
2014-12-06 13:06                                                     ` Tejun Heo
2014-12-07 10:24                                                       ` Michal Hocko
2014-12-07 10:45                                                         ` Michal Hocko
2014-12-07 13:59                                                           ` Tejun Heo
2014-12-07 18:55                                                             ` Michal Hocko
2014-12-05 16:41                                                   ` [PATCH -v2 3/5] PM: convert printk to pr_* equivalent Michal Hocko
2014-12-05 22:40                                                     ` Rafael J. Wysocki
2014-12-07 10:26                                                       ` Michal Hocko
2014-12-06 13:08                                                     ` Tejun Heo
2014-12-05 16:41                                                   ` [PATCH -v2 4/5] sysrq: " Michal Hocko
2014-12-06 13:09                                                     ` Tejun Heo
2014-12-05 16:41                                                   ` [PATCH -v2 5/5] OOM, PM: make OOM detection in the freezer path raceless Michal Hocko
2014-12-06 13:11                                                     ` Tejun Heo
2014-12-07 10:11                                                       ` Michal Hocko
2015-01-07 18:41                                                     ` Tejun Heo
2015-01-07 18:48                                                       ` Michal Hocko
2015-01-08 11:51                                                     ` Michal Hocko
2014-12-07 10:09                                                   ` [PATCH 0/4] OOM vs PM freezer fixes Michal Hocko
2014-12-07 13:55                                                     ` Tejun Heo
2014-12-07 19:00                                                       ` Michal Hocko
2014-12-18 16:27                                                         ` Michal Hocko
2014-11-05 14:55                   ` Michal Hocko [this message]
2014-10-26 18:40   ` [PATCH 3/4] OOM, PM: OOM killed task shouldn't escape PM suspend Pavel Machek
2014-10-21  7:27 ` [PATCH 4/4] PM: convert do_each_thread to for_each_process_thread Michal Hocko
2014-10-21 12:10   ` Rafael J. Wysocki
2014-10-21 13:19     ` Michal Hocko
2014-10-21 13:43       ` Rafael J. Wysocki

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20141105145551.GF4527@dhcp22.suse.cz \
    --to=mhocko@suse.cz \
    --cc=akpm@linux-foundation.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-mm@kvack.org \
    --cc=linux-pm@vger.kernel.org \
    --cc=oleg@redhat.com \
    --cc=rientjes@google.com \
    --cc=rjw@rjwysocki.net \
    --cc=tj@kernel.org \
    --cc=xiyou.wangcong@gmail.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox