Re: [RFC PATCH 2/2] mm,oom: Try last second allocation after selecting an OOM victim.

From: Michal Hocko <mhocko@suse.com>
To: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
Cc: aarcange@redhat.com, hannes@cmpxchg.org,
	akpm@linux-foundation.org, linux-mm@kvack.org,
	rientjes@google.com, mjaggi@caviumnetworks.com, mgorman@suse.de,
	oleg@redhat.com, vdavydov.dev@gmail.com, vbabka@suse.cz
Subject: Re: [RFC PATCH 2/2] mm,oom: Try last second allocation after selecting an OOM victim.
Date: Wed, 25 Oct 2017 13:09:55 +0200
Message-ID: <20171025110955.jsc4lqjbg6ww5va6@dhcp22.suse.cz>
In-Reply-To: <201710251948.EJH00500.MOOStFLFQOHFJV@I-love.SAKURA.ne.jp>

On Wed 25-10-17 19:48:09, Tetsuo Handa wrote:
> Michal Hocko wrote:
[...]
> > The OOM killer is the last hand brake. At the time you hit the OOM
> > condition your system is usually hard to use anyway. And that is why I
> > do care to make this path deadlock free. I have mentioned multiple times
> > that I find real life triggers much more important than artificial
> > DoS-like workloads which make your system unusable long before you hit
> > the OOM killer.
> 
> Being unable to invoke the OOM killer (i.e. an OOM lockup) is worse than a hand brake injury.
> 
> If you do care to make this path deadlock free, you had better stop depending on
> mutex_trylock(&oom_lock). Not only can printk() from oom_kill_process() trigger a
> deadlock due to the console_sem versus oom_lock dependency, but also

And this means that we have to fix printk. A completely silent OOM path is
out of the question IMHO.

> schedule_timeout_killable(1) from out_of_memory() can also trigger a deadlock
> due to the SCHED_IDLE versus !SCHED_IDLE dependency (as I suggested at
> http://lkml.kernel.org/r/201603031941.CBC81272.OtLMSFVOFJHOFQ@I-love.SAKURA.ne.jp ).

You are still missing the point here. You do not really have to sleep to
get preempted by a high priority task here. Moreover, the sleep is done
after we have killed the victim, so the reaper can already start tearing
down the memory. If you oversubscribe your system with high priority tasks
you are screwed no matter what.
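
To make the trylock dependency discussed above concrete, here is a minimal
user-space sketch in C. It is not kernel code: sketch_oom_lock,
sketch_trylock(), sketch_unlock() and sketch_alloc_may_oom() are hypothetical
names, and a C11 atomic_flag stands in for the kernel mutex. It only
illustrates the pattern where a caller that fails the trylock backs off on
the assumption that the lock holder is making progress.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical user-space stand-in for oom_lock; not the kernel mutex. */
static atomic_flag sketch_oom_lock = ATOMIC_FLAG_INIT;

static bool sketch_trylock(void)
{
	/* test_and_set returns the previous value: false means we took the lock. */
	return !atomic_flag_test_and_set(&sketch_oom_lock);
}

static void sketch_unlock(void)
{
	atomic_flag_clear(&sketch_oom_lock);
}

enum alloc_step { STEP_INVOKED_KILLER, STEP_BACKED_OFF };

/*
 * One pass through a simplified slow path. A caller that loses the trylock
 * assumes the holder is making progress and leaves the retry to its caller.
 */
static enum alloc_step sketch_alloc_may_oom(int *kills)
{
	if (!sketch_trylock())
		return STEP_BACKED_OFF;

	(*kills)++;	/* stands in for selecting and killing a victim */
	sketch_unlock();
	return STEP_INVOKED_KILLER;
}
```

If the holder is itself stalled (for example inside printk(), or preempted),
every other caller keeps taking the STEP_BACKED_OFF branch, which is the
lockup scenario being argued about.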
 
> > > The current code is rather prone to OOM lockups due to the printk() versus
> > > oom_lock dependency. I'm proposing a patch that mitigates this dependency
> > > using oom_printk_lock, because I have hardly been able to examine OOM
> > > related problems since linux-4.9, and your response was "Hell no!".
> > 
> > Because you are repeatedly proposing to paper over the problem rather than
> > attempting something resembling a solution. And this is highly annoying.
> > I've already said that I am willing to sacrifice the stall warning rather
> > than fiddle with random locks put here and there.
> 
> I've already said that I do welcome removing the stall warning if it is
> replaced with a better approach. If there is no acceptable alternative now,
> I do want to avoid "warn_alloc() without oom_lock held" versus
> "oom_kill_process() with oom_lock held" dependency. And I'm waiting for your
> answer in that thread.

I have already responded. Nagging me further doesn't help.

[...]
> Although you have said
> 
>   So let's agree to disagree about importance of the reliability
>   warn_alloc. I see it as an improvement which doesn't really have to be
>   perfect.

And I stand by this statement.

> at https://patchwork.kernel.org/patch/9381891/ , can we agree on killing
> the synchronous allocation stall warning messages and start looking for
> an asynchronous approach?

I've already said that I will not oppose removing it if regular
workloads are tripping over it. Johannes had some real world examples
AFAIR but didn't provide any details which we could use for the
changelog. I wouldn't be entirely happy about that but the reality says
that the printk infrastructure is not really prepared for extreme loads.
 
> --- a/mm/page_alloc.c
> +++ b/mm/page_alloc.c
> @@ -3868,8 +3868,6 @@ bool gfp_pfmemalloc_allowed(gfp_t gfp_mask)
>  	enum compact_result compact_result;
>  	int compaction_retries;
>  	int no_progress_loops;
> -	unsigned long alloc_start = jiffies;
> -	unsigned int stall_timeout = 10 * HZ;
>  	unsigned int cpuset_mems_cookie;
>  	int reserve_flags;
>  
> @@ -4001,14 +3999,6 @@ bool gfp_pfmemalloc_allowed(gfp_t gfp_mask)
>  	if (!can_direct_reclaim)
>  		goto nopage;
>  
> -	/* Make sure we know about allocations which stall for too long */
> -	if (time_after(jiffies, alloc_start + stall_timeout)) {
> -		warn_alloc(gfp_mask & ~__GFP_NOWARN, ac->nodemask,
> -			"page allocation stalls for %ums, order:%u",
> -			jiffies_to_msecs(jiffies-alloc_start), order);
> -		stall_timeout += 10 * HZ;
> -	}
> -
>  	/* Avoid recursion of direct reclaim */
>  	if (current->flags & PF_MEMALLOC)
>  		goto nopage;
> -- 
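
For reference, the check removed by the hunk above can be sketched in user
space roughly as follows. SKETCH_HZ, struct stall_tracker and the
sketch_time_after() / stall_tracker_*() helpers are hypothetical names, and
the tick rate of 100 is an assumption; only the wraparound-safe comparison
and the 10 second escalation mirror the quoted code.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for the kernel's HZ; 100 ticks per second is assumed. */
#define SKETCH_HZ 100u

/* Wraparound-safe "a is after b", the same idea as the kernel's time_after(). */
static bool sketch_time_after(uint32_t a, uint32_t b)
{
	return (int32_t)(b - a) < 0;
}

struct stall_tracker {
	uint32_t alloc_start;	/* tick at which the allocation began */
	uint32_t stall_timeout;	/* ticks after alloc_start until the next warning */
};

static void stall_tracker_init(struct stall_tracker *t, uint32_t now)
{
	t->alloc_start = now;
	t->stall_timeout = 10 * SKETCH_HZ;
}

/* Returns true when a warning is due, then pushes the next deadline out by 10s. */
static bool stall_tracker_check(struct stall_tracker *t, uint32_t now)
{
	if (!sketch_time_after(now, t->alloc_start + t->stall_timeout))
		return false;

	t->stall_timeout += 10 * SKETCH_HZ;
	return true;
}
```

The point of the escalation is that a single stalled allocation warns at most
once every 10 seconds rather than on every pass through the retry loop.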

-- 
Michal Hocko
SUSE Labs

