linux-mm.kvack.org archive mirror
From: Sameer Nanda <snanda@chromium.org>
To: David Rientjes <rientjes@google.com>
Cc: Luigi Semenzato <semenzato@google.com>,
	msb@facebook.com, Andrew Morton <akpm@linux-foundation.org>,
	mhocko@suse.cz, Johannes Weiner <hannes@cmpxchg.org>,
	Rusty Russell <rusty@rustcorp.com.au>,
	oleg@redhat.com, linux-mm@kvack.org,
	linux-kernel@vger.kernel.org
Subject: Re: [PATCH] mm, oom: Fix race when selecting process to kill
Date: Thu, 7 Nov 2013 11:34:43 -0800
Message-ID: <CANMivWYzp_Eqw3BjeUz5ycQLftBuHjcZ7ZoFEwazekJNY2cJXA@mail.gmail.com>
In-Reply-To: <alpine.DEB.2.02.1311061631280.22318@chino.kir.corp.google.com>

On Wed, Nov 6, 2013 at 4:35 PM, David Rientjes <rientjes@google.com> wrote:
> On Wed, 6 Nov 2013, Sameer Nanda wrote:
>
>> David -- I think we can make the duration that the tasklist_lock is
>> held smaller by consolidating the process selection logic that is
>> currently split across select_bad_process and oom_kill_process into
>> one place in select_bad_process.  The tasklist_lock would then need to
>> be held only when the thread lists are being traversed.  Would you be
>> ok with that?  I can re-spin the patch if that sounds like a workable
>> option.
>>
>
> No, this caused hundreds of machines to hit soft lockups for Google
> because there's no synchronization that prevents dozens of cpus from
> taking tasklist_lock in the oom killer during parallel memcg oom
> conditions and never allowing the write_lock_irq() on fork() or exit()
> to make progress.  We absolutely must hold tasklist_lock for as little
> time as possible in the oom killer.
>
> That said, I've never actually seen your reported bug manifest in our
> production environment so let's see if Oleg has any ideas.

Is the path you are referring to the one where
mem_cgroup_out_of_memory() calls oom_kill_process()?  If so, that path
doesn't appear to suffer from the race between the two steps,
select_bad_process() and oom_kill_process(), since
mem_cgroup_out_of_memory() calls oom_kill_process() directly without
going through select_bad_process().  This also means that the patch I
sent is incorrect, since it removes the existing tasklist_lock
protection in oom_kill_process().

I'll respin the patch to take care of this case.
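
To make the two call paths concrete, here is a rough userspace model.
It is not kernel code: every toy_* name, the reader counter standing in
for tasklist_lock, and the badness field are made up for illustration.
It also assumes the first patch hoisted the read lock into the global
caller and dropped it from the kill step. Under that assumption, the
memcg caller, which never goes through the selection step, ends up
walking the thread list with no lock held.

```c
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-in for tasklist_lock: counts readers, nothing more. */
static int tasklist_readers;

static void tasklist_read_lock(void)   { tasklist_readers++; }
static void tasklist_read_unlock(void) { tasklist_readers--; }

struct toy_task {
    int pid;
    int badness;   /* hypothetical score; higher means a better victim */
};

/* Number of thread-list walks performed with no reader holding the lock. */
static int unlocked_walks;

/* Stand-in for the thread-list traversal inside oom_kill_process(). */
static void walk_threads(const struct toy_task *victim)
{
    (void)victim;
    if (tasklist_readers == 0)
        unlocked_walks++;
}

/* Model of oom_kill_process(); callee_locks says who owns the locking. */
static void toy_oom_kill_process(const struct toy_task *victim,
                                 bool callee_locks)
{
    if (callee_locks)
        tasklist_read_lock();
    walk_threads(victim);
    if (callee_locks)
        tasklist_read_unlock();
}

/* Model of select_bad_process(): pick the highest-scoring task. */
static const struct toy_task *
toy_select_bad_process(const struct toy_task *tasks, size_t n)
{
    const struct toy_task *chosen = NULL;

    for (size_t i = 0; i < n; i++)
        if (!chosen || tasks[i].badness > chosen->badness)
            chosen = &tasks[i];
    return chosen;
}

/*
 * Global path: select, then kill.  With callee_locks == false the
 * caller holds the lock across both steps (assumed shape of the
 * first patch).
 */
static void toy_out_of_memory(const struct toy_task *tasks, size_t n,
                              bool callee_locks)
{
    if (!callee_locks)
        tasklist_read_lock();
    const struct toy_task *victim = toy_select_bad_process(tasks, n);
    if (victim)
        toy_oom_kill_process(victim, callee_locks);
    if (!callee_locks)
        tasklist_read_unlock();
}

/* memcg path: the caller already has a victim and kills it directly. */
static void toy_mem_cgroup_out_of_memory(const struct toy_task *victim,
                                         bool callee_locks)
{
    toy_oom_kill_process(victim, callee_locks);
}
```

With callee_locks set to true, both paths walk the thread list under
the lock.  With it set to false, toy_out_of_memory() is still covered,
but toy_mem_cgroup_out_of_memory() increments unlocked_walks, which is
the case the respin has to handle.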

-- 
Sameer


Thread overview: 27+ messages
2013-11-05 23:26 Sameer Nanda
2013-11-06  1:18 ` David Rientjes
2013-11-06  1:25   ` Luigi Semenzato
2013-11-06  1:27     ` David Rientjes
2013-11-06  3:00       ` Vladimir Murzin
2013-11-06  3:04       ` Sameer Nanda
2013-11-06  4:45         ` Luigi Semenzato
2013-11-06  7:17           ` Luigi Semenzato
2013-11-06 16:58             ` Sameer Nanda
2013-11-07  0:35               ` David Rientjes
2013-11-07 19:34                 ` Sameer Nanda [this message]
2013-11-08 18:07                 ` [PATCH v2] " Sameer Nanda
2013-11-08 18:45                   ` Oleg Nesterov
2013-11-08 19:49                     ` [PATCH v3] " Sameer Nanda
2013-11-09 15:16                       ` Oleg Nesterov
2013-11-11 23:15                         ` Sameer Nanda
2013-11-12  0:21                         ` [PATCH v4] " Sameer Nanda
2013-11-12 15:13                           ` Michal Hocko
2013-11-12 20:01                           ` Oleg Nesterov
2013-11-12 20:08                             ` Sameer Nanda
2013-11-12 20:23                               ` [PATCH v5] " Sameer Nanda
2013-11-13  2:33                                 ` David Rientjes
2013-11-13 16:46                                   ` Sameer Nanda
2013-11-13 17:18                                     ` [PATCH v6] " Sameer Nanda
2013-11-13 17:29                                       ` Oleg Nesterov
2013-11-14 13:43                                       ` dserrg
2013-11-14 17:03                                         ` Sameer Nanda
