linux-mm.kvack.org archive mirror
 help / color / mirror / Atom feed
From: Andrew Morton <akpm@linux-foundation.org>
To: Dave Chinner <david@fromorbit.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>,
	Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>,
	mhocko@suse.cz, dchinner@redhat.com, linux-mm@kvack.org,
	rientjes@google.com, oleg@redhat.com, mgorman@suse.de,
	torvalds@linux-foundation.org, xfs@oss.sgi.com
Subject: Re: How to handle TIF_MEMDIE stalls?
Date: Sun, 22 Feb 2015 17:29:30 -0800	[thread overview]
Message-ID: <20150222172930.6586516d.akpm@linux-foundation.org> (raw)
In-Reply-To: <20150223004521.GK12722@dastard>

On Mon, 23 Feb 2015 11:45:21 +1100 Dave Chinner <david@fromorbit.com> wrote:

> > > I really don't care about the OOM Killer corner cases - it's
> > > completely the wrong way line of development to be spending time on
> > > and you aren't going to convince me otherwise. The OOM killer a
> > > crutch used to justify having a memory allocation subsystem that
> > > can't provide forward progress guarantee mechanisms to callers that
> > > need it.
> > 
> > We can provide this.  Are all these callers able to preallocate?
> 
> Anything that allocates in transaction context (and therefor is
> GFP_NOFS by definition) can preallocate at transaction reservation
> time. However, preallocation is dumb, complex, CPU and memory
> intensive and will have a *massive* impact on performance.
> Allocating 10-100 pages to a reserve which we will almost *never
> use* and then free them again *on every single transaction* is a lot
> of unnecessary additional fast path overhead.  Hence a "preallocate
> for every context" reserve pool is not a viable solution.

Yup.

> Reservations are simply an *accounting* of the maximum amount of a
> reserve required by an operation to guarantee forwards progress. In
> filesystems, we do this for log space (transactions) and some do it
> for filesystem space (e.g. delayed allocation needs correct ENOSPC
> detection so we don't overcommit disk space).  The VM already has
> such concepts (e.g. watermarks and things like min_free_kbytes) that
> it uses to ensure that there are sufficient reserves for certain
> types of allocations to succeed.

Yes, as we do for __GFP_HIGH and PF_MEMALLOC etc.  Add a dynamic
reserve.  So to reserve N pages we increase the page allocator dynamic
reserve by N, do some reclaim if necessary then deposit N tokens into
the caller's task_struct (it'll be a set of zone/nr-pages tuples I
suppose).

When allocating pages the caller should drain its reserves in
preference to dipping into the regular freelist.  This guy has already
done his reclaim and shouldn't be penalised a second time.  I guess
Johannes's preallocation code should switch to doing this for the same
reason, plus the fact that snipping a page off
task_struct.prealloc_pages is super-fast and needs to be done sometime
anyway so why not do it by default.

Both reservation and preallocation are vulnerable to deadlocks - 10,000
tasks all trying to reserve/prealloc 100 pages, they all have 50 pages
and we ran out of memory.  Whoops.  We can undeadlock by returning
ENOMEM but I suspect there will still be problematic situations where
massive numbers of pages are temporarily AWOL.  Perhaps some form of
queuing and throttling will be needed, to limit the peak number of
reserved pages.  Per zone, I guess.

And it'll be a huge pain handling order>0 pages.  I'd be inclined to
make it order-0 only, and tell the lamer callers that
vmap-is-thattaway.  Alas, one lame caller is slub.


But the biggest issue is how the heck does a caller work out how many
pages to reserve/prealloc?  Even a single sb_bread() - it's sitting on
loop on a sparse NTFS file on loop on a five-deep DM stack on a
six-deep MD stack on loop on NFS on an eleventy-deep networking stack. 
And then there will be an unknown number of slab allocations of unknown
size with unknown slabs-per-page rules - how many pages needed for
them?  And to make it much worse, how many pages of which orders? 
Bless its heart, slub will go and use a 1-order page for allocations
which should have been in 0-order pages..


--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

  reply	other threads:[~2015-02-23  1:30 UTC|newest]

Thread overview: 177+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2014-12-12 13:54 [RFC PATCH] oom: Don't count on mm-less current process Tetsuo Handa
2014-12-16 12:47 ` Michal Hocko
2014-12-17 11:54   ` Tetsuo Handa
2014-12-17 13:08     ` Michal Hocko
2014-12-18 12:11       ` Tetsuo Handa
2014-12-18 15:33         ` Michal Hocko
2014-12-19 12:07           ` Tetsuo Handa
2014-12-19 12:49             ` Michal Hocko
2014-12-20  9:13               ` Tetsuo Handa
2014-12-20 11:42                 ` Tetsuo Handa
2014-12-22 20:25                   ` Michal Hocko
2014-12-23  1:00                     ` Tetsuo Handa
2014-12-23  9:51                       ` Michal Hocko
2014-12-23 11:46                         ` Tetsuo Handa
2014-12-23 11:57                           ` Tetsuo Handa
2014-12-23 12:12                             ` Tetsuo Handa
2014-12-23 12:27                             ` Michal Hocko
2014-12-23 12:24                           ` Michal Hocko
2014-12-23 13:00                             ` Tetsuo Handa
2014-12-23 13:09                               ` Michal Hocko
2014-12-23 13:20                                 ` Tetsuo Handa
2014-12-23 13:43                                   ` Michal Hocko
2014-12-23 14:11                                     ` Tetsuo Handa
2014-12-23 14:57                                       ` Michal Hocko
2014-12-19 12:22           ` How to handle TIF_MEMDIE stalls? Tetsuo Handa
2014-12-20  2:03             ` Dave Chinner
2014-12-20 12:41               ` Tetsuo Handa
2014-12-20 22:35                 ` Dave Chinner
2014-12-21  8:45                   ` Tetsuo Handa
2014-12-21 20:42                     ` Dave Chinner
2014-12-22 16:57                       ` Michal Hocko
2014-12-22 21:30                         ` Dave Chinner
2014-12-23  9:41                           ` Johannes Weiner
2014-12-24  1:06                             ` Dave Chinner
2014-12-24  2:40                               ` Linus Torvalds
2014-12-29 18:19                     ` Michal Hocko
2014-12-30  6:42                       ` Tetsuo Handa
2014-12-30 11:21                         ` Michal Hocko
2014-12-30 13:33                           ` Tetsuo Handa
2014-12-31 10:24                             ` Tetsuo Handa
2015-02-09 11:44                           ` Tetsuo Handa
2015-02-10 13:58                             ` Tetsuo Handa
2015-02-10 15:19                               ` Johannes Weiner
2015-02-11  2:23                                 ` Tetsuo Handa
2015-02-11 13:37                                   ` Tetsuo Handa
2015-02-11 18:50                                     ` Oleg Nesterov
2015-02-11 18:59                                       ` Oleg Nesterov
2015-03-14 13:03                                         ` Tetsuo Handa
2015-02-17 12:23                                   ` Tetsuo Handa
2015-02-17 12:53                                     ` Johannes Weiner
2015-02-17 15:38                                       ` Michal Hocko
2015-02-17 22:54                                       ` Dave Chinner
2015-02-17 23:32                                         ` Dave Chinner
2015-02-18  8:25                                         ` Michal Hocko
2015-02-18 10:48                                           ` Dave Chinner
2015-02-18 12:16                                             ` Michal Hocko
2015-02-18 21:31                                               ` Dave Chinner
2015-02-19  9:40                                                 ` Michal Hocko
2015-02-19 22:03                                                   ` Dave Chinner
2015-02-20  9:27                                                     ` Michal Hocko
2015-02-19 11:01                                               ` Johannes Weiner
2015-02-19 12:29                                                 ` Michal Hocko
2015-02-19 12:58                                                   ` Michal Hocko
2015-02-19 15:29                                                     ` Tetsuo Handa
2015-02-19 21:53                                                       ` Tetsuo Handa
2015-02-20  9:13                                                       ` Michal Hocko
2015-02-20 13:37                                                         ` Stefan Ring
2015-02-19 13:29                                                   ` Tetsuo Handa
2015-02-20  9:10                                                     ` Michal Hocko
2015-02-20 12:20                                                       ` Tetsuo Handa
2015-02-20 12:38                                                         ` Michal Hocko
2015-02-19 21:43                                                   ` Dave Chinner
2015-02-20 12:48                                                     ` Michal Hocko
2015-02-20 23:09                                                       ` Dave Chinner
2015-02-19 10:24                                         ` Johannes Weiner
2015-02-19 22:52                                           ` Dave Chinner
2015-02-20 10:36                                             ` Tetsuo Handa
2015-02-20 23:15                                               ` Dave Chinner
2015-02-21  3:20                                                 ` Theodore Ts'o
2015-02-21  9:19                                                   ` Andrew Morton
2015-02-21 13:48                                                     ` Tetsuo Handa
2015-02-21 21:38                                                     ` Dave Chinner
2015-02-22  0:20                                                     ` Johannes Weiner
2015-02-23 10:48                                                       ` Michal Hocko
2015-02-23 11:23                                                         ` Tetsuo Handa
2015-02-23 21:33                                                       ` David Rientjes
2015-02-22 14:48                                                     ` __GFP_NOFAIL and oom_killer_disabled? Tetsuo Handa
2015-02-23 10:21                                                       ` Michal Hocko
2015-02-23 13:03                                                         ` Tetsuo Handa
2015-02-24 18:14                                                           ` Michal Hocko
2015-02-25 11:22                                                             ` Tetsuo Handa
2015-02-25 16:02                                                               ` Michal Hocko
2015-02-25 21:48                                                                 ` Tetsuo Handa
2015-02-25 21:51                                                                   ` Andrew Morton
2015-02-21 12:00                                                   ` How to handle TIF_MEMDIE stalls? Tetsuo Handa
2015-02-23 10:26                                                   ` Michal Hocko
2015-02-21 11:12                                                 ` Tetsuo Handa
2015-02-21 21:48                                                   ` Dave Chinner
2015-02-21 23:52                                             ` Johannes Weiner
2015-02-23  0:45                                               ` Dave Chinner
2015-02-23  1:29                                                 ` Andrew Morton [this message]
2015-02-23  7:32                                                   ` Dave Chinner
2015-02-27 18:24                                                     ` Vlastimil Babka
2015-02-28  0:03                                                       ` Dave Chinner
2015-02-28 15:17                                                         ` Theodore Ts'o
2015-03-02  9:39                                                     ` Vlastimil Babka
2015-03-02 22:31                                                       ` Dave Chinner
2015-03-03  9:13                                                         ` Vlastimil Babka
2015-03-04  1:33                                                           ` Dave Chinner
2015-03-04  8:50                                                             ` Vlastimil Babka
2015-03-04 11:03                                                               ` Dave Chinner
2015-03-07  0:20                                                         ` Johannes Weiner
2015-03-07  3:43                                                           ` Dave Chinner
2015-03-07 15:08                                                             ` Johannes Weiner
2015-03-02 20:22                                                     ` Johannes Weiner
2015-03-02 23:12                                                       ` Dave Chinner
2015-03-03  2:50                                                         ` Johannes Weiner
2015-03-04  6:52                                                           ` Dave Chinner
2015-03-04 15:04                                                             ` Johannes Weiner
2015-03-04 17:38                                                               ` Theodore Ts'o
2015-03-04 23:17                                                                 ` Dave Chinner
2015-02-28 16:29                                                 ` Johannes Weiner
2015-02-28 16:41                                                   ` Theodore Ts'o
2015-02-28 22:15                                                     ` Johannes Weiner
2015-03-01 11:17                                                       ` Tetsuo Handa
2015-03-06 11:53                                                         ` Tetsuo Handa
2015-03-01 13:43                                                       ` Theodore Ts'o
2015-03-01 16:15                                                         ` Johannes Weiner
2015-03-01 19:36                                                           ` Theodore Ts'o
2015-03-01 20:44                                                             ` Johannes Weiner
2015-03-01 20:17                                                         ` Johannes Weiner
2015-03-01 21:48                                                       ` Dave Chinner
2015-03-02  0:17                                                         ` Dave Chinner
2015-03-02 12:46                                                           ` Brian Foster
2015-02-28 18:36                                                 ` Vlastimil Babka
2015-03-02 15:18                                                 ` Michal Hocko
2015-03-02 16:05                                                   ` Johannes Weiner
2015-03-02 17:10                                                     ` Michal Hocko
2015-03-02 17:27                                                       ` Johannes Weiner
2015-03-02 16:39                                                   ` Theodore Ts'o
2015-03-02 16:58                                                     ` Michal Hocko
2015-03-04 12:52                                                       ` Dave Chinner
2015-02-17 14:59                                     ` Michal Hocko
2015-02-17 14:50                                 ` Michal Hocko
2015-02-17 14:37                             ` Michal Hocko
2015-02-17 14:44                               ` Michal Hocko
2015-02-16 11:23                           ` Tetsuo Handa
2015-02-16 15:42                             ` Johannes Weiner
2015-02-17 11:57                               ` Tetsuo Handa
2015-02-17 13:16                                 ` Johannes Weiner
2015-02-17 16:50                                   ` Michal Hocko
2015-02-17 23:25                                     ` Dave Chinner
2015-02-18  8:48                                       ` Michal Hocko
2015-02-18 11:23                                         ` Tetsuo Handa
2015-02-18 12:29                                           ` Michal Hocko
2015-02-18 14:06                                             ` Tetsuo Handa
2015-02-18 14:25                                               ` Michal Hocko
2015-02-19 10:48                                                 ` Tetsuo Handa
2015-02-20  8:26                                                   ` Michal Hocko
2015-02-23 22:08                                 ` David Rientjes
2015-02-24 11:20                                   ` Tetsuo Handa
2015-02-24 15:20                                     ` Theodore Ts'o
2015-02-24 21:02                                       ` Dave Chinner
2015-02-25 14:31                                         ` Tetsuo Handa
2015-02-27  7:39                                           ` Dave Chinner
2015-02-27 12:42                                             ` Tetsuo Handa
2015-02-27 13:12                                               ` Dave Chinner
2015-03-04 12:41                                                 ` Tetsuo Handa
2015-03-04 13:25                                                   ` Dave Chinner
2015-03-04 14:11                                                     ` Tetsuo Handa
2015-03-05  1:36                                                       ` Dave Chinner
2015-02-17 16:33                             ` Michal Hocko
2014-12-29 17:40                   ` [PATCH] mm: get rid of radix tree gfp mask for pagecache_get_page (was: Re: How to handle TIF_MEMDIE stalls?) Michal Hocko
2014-12-29 18:45                     ` Linus Torvalds
2014-12-29 19:33                       ` Michal Hocko
2014-12-30 13:42                         ` Michal Hocko
2014-12-30 21:45                           ` Linus Torvalds

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20150222172930.6586516d.akpm@linux-foundation.org \
    --to=akpm@linux-foundation.org \
    --cc=david@fromorbit.com \
    --cc=dchinner@redhat.com \
    --cc=hannes@cmpxchg.org \
    --cc=linux-mm@kvack.org \
    --cc=mgorman@suse.de \
    --cc=mhocko@suse.cz \
    --cc=oleg@redhat.com \
    --cc=penguin-kernel@I-love.SAKURA.ne.jp \
    --cc=rientjes@google.com \
    --cc=torvalds@linux-foundation.org \
    --cc=xfs@oss.sgi.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox