From: Michal Hocko <mhocko@suse.cz>
To: Tejun Heo <tj@kernel.org>
Cc: "Rafael J. Wysocki" <rjw@rjwysocki.net>,
	Andrew Morton <akpm@linux-foundation.org>,
	Cong Wang <xiyou.wangcong@gmail.com>,
	David Rientjes <rientjes@google.com>,
	Oleg Nesterov <oleg@redhat.com>,
	LKML <linux-kernel@vger.kernel.org>,
	linux-mm@kvack.org, Linux PM list <linux-pm@vger.kernel.org>
Subject: Re: [PATCH 3/4] OOM, PM: OOM killed task shouldn't escape PM suspend
Date: Thu, 6 Nov 2014 17:02:23 +0100
Message-ID: <20141106160223.GJ7202@dhcp22.suse.cz>
In-Reply-To: <20141106150121.GA25642@htj.dyndns.org>

On Thu 06-11-14 10:01:21, Tejun Heo wrote:
> On Thu, Nov 06, 2014 at 01:49:53PM +0100, Michal Hocko wrote:
> > On Wed 05-11-14 12:55:27, Tejun Heo wrote:
> > > On Wed, Nov 05, 2014 at 06:46:09PM +0100, Michal Hocko wrote:
> > > > Because out_of_memory can be called from multiple paths. And
> > > > the only interesting one should be the page allocation path.
> > > > pagefault_out_of_memory is not interesting because it cannot happen for
> > > > the frozen task.
> > > 
> > > Hmmm.... wouldn't that be broken by definition tho?  So, if the oom
> > > killer is invoked from somewhere else than page allocation path, it
> > > would proceed ignoring the disabled setting and would race against PM
> > > freeze path all the same. 
> > 
> > Not really because try_to_freeze_tasks doesn't finish until _all_ tasks
> > are frozen and a task in the page fault path cannot be frozen, can it?
> 
> We used to have freezing points deep in file system code which may be
> reachable from page fault.

If that is really the case then there is no way around it and we have
to check out_of_memory from the page fault path as well. I cannot say I
would be happy about that though. Ideally there should be only a single
freezing place. But that is another story.

> Please take a step back and look at the paragraph above.  Doesn't
> it sound extremely contrived and brittle even if it's not outright
> broken?  What if somebody adds another oom killing site somewhere
> else?

The only way to add an OOM killing site is to call out_of_memory, and
with my patch that function does all the necessary checks itself.
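
To make the single entry point argument concrete, here is a minimal
userspace sketch. It is not the patch and not kernel code. The names
oom_killer_disabled, oom_kill_count, select_and_kill_victim and
out_of_memory_model are hypothetical stand-ins, and the bool return
value is an assumption made for illustration only.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical flag: set by the PM freezer once tasks are frozen. */
static bool oom_killer_disabled;

/* Hypothetical counter of victims killed so far. */
static unsigned long oom_kill_count;

/* Stand-in for victim selection and kill; here it only counts. */
static void select_and_kill_victim(void)
{
	/* Must never be reached while the OOM killer is disabled. */
	assert(!oom_killer_disabled);
	oom_kill_count++;
}

/*
 * Model of a single OOM entry point. Every caller (page allocator,
 * page fault path) goes through this function, so a newly added call
 * site cannot bypass the disabled check.
 *
 * Returns true if a victim was killed, false if the caller has to
 * fail the allocation instead.
 */
static bool out_of_memory_model(void)
{
	if (oom_killer_disabled)
		return false;

	select_and_kill_victim();
	return true;
}
```

A caller that gets false back is expected to fail its allocation
rather than retry for ever. The sketch ignores locking completely.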

> How can this possibly be a solution that we intentionally implement?
>
> > I mean there shouldn't be any problem to not invoke OOM killer
> > from the page fault path as well but that might lead to looping in the
> > page fault path without any progress until freezer enables OOM killer on
> > the failure path because the said task cannot be frozen.
> > 
> > Is this preferable?
> 
> Why would PM freezing make OOM killing fail?  That doesn't make much
> sense.  Sure, it can block it for a finite duration for sync purposes
> but making OOM killing fail seems the wrong way around.  

We cannot block in the allocation path because the request might come
from the freezer path itself (e.g. when suspending devices etc.).
At least that is my understanding of why the original OOM disable
approach was implemented.

> We're doing one thing for non-PM freezing and the other way around for
> PM freezing, which indicates one of the two directions is wrong.

Because those two paths are quite different in their requirements. The
cgroup freezer only cares about freezing tasks and it doesn't have to
care about tasks accessing a possibly half suspended device on their way
out.

> Shouldn't it be that OOM killing happening while PM freezing is in
> progress cancels PM freezing rather than the other way around?  Find a
> point in PM suspend/hibernation operation where everything must be
> stable, disable OOM killing there and check whether OOM killing
> happened in between and if so back out.

That point is freeze_processes AFAIU. I might be wrong of course but
from that moment on nobody should be waking processes up, because they
could access half suspended devices.
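
The "disable at a stable point, check, back out" sequence can be
sketched in plain userspace C as below. This is a single-threaded model
under stated assumptions, not the patch: oom_disabled, oom_kills,
oom_kill_model, freeze_processes_model and the two freeze callbacks are
hypothetical names, and real code would need proper synchronization
between the counter check and the flag update.

```c
#include <assert.h>
#include <stdbool.h>

static bool oom_disabled;        /* hypothetical "OOM killer off" flag */
static unsigned long oom_kills;  /* hypothetical count of OOM kills */

/* Model of an OOM kill attempt; refuses once the killer is disabled. */
static bool oom_kill_model(void)
{
	if (oom_disabled)
		return false;
	oom_kills++;
	return true;
}

/*
 * Model of the freezing step. freeze_tasks is a caller-supplied
 * stand-in for the task freezing loop and returns 0 on success.
 *
 * Returns 0 if all tasks are frozen and the OOM killer is disabled,
 * -1 if freezing failed or an OOM kill raced with it.
 */
static int freeze_processes_model(int (*freeze_tasks)(void))
{
	unsigned long kills_before = oom_kills;

	assert(freeze_tasks);
	if (freeze_tasks() != 0)
		return -1;

	oom_disabled = true;
	if (oom_kills != kills_before) {
		/* A victim was killed while we were freezing: back out. */
		oom_disabled = false;
		return -1;
	}
	return 0;
}

/* Example callbacks: a quiet freeze and one that races with an OOM kill. */
static int quiet_freeze(void)
{
	return 0;
}

static int freeze_racing_with_oom(void)
{
	oom_kill_model();
	return 0;
}
```

With freeze_racing_with_oom the model backs out and leaves the OOM
killer enabled. With quiet_freeze it succeeds and any later
oom_kill_model call is refused.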

> It seems rather obvious to me that OOM killing has to have precedence
> over PM freezing.
> 
> Sure, once the system reaches a point where the whole system must be
> in a stable state for snapshotting or whatever, disabling OOM killing
> is fine but at that point the system is in a very limited execution
> mode and sure won't be processing page faults from userland for
> example and we can actually disable OOM killing knowing that anything
> afterwards is ready to handle memory allocation failures.

I am really confused now. This is basically what the final patch
actually does. Here is what I currently have, just to make further
discussion easier.
---
