From: Hugh Dickins <hughd@google.com>
To: Matthew Wilcox <willy@infradead.org>
Cc: Vlastimil Babka <vbabka@suse.cz>,
Chris Murphy <lists@colorremedies.com>,
Linux Memory Management List <linux-mm@kvack.org>,
Chris Wilson <chris@chris-wilson.co.uk>,
Hugh Dickins <hughd@google.com>
Subject: Re: 5.7.0 page allocation failure: order:0, mode:0x400d0
Date: Mon, 8 Jun 2020 14:33:08 -0700 (PDT) [thread overview]
Message-ID: <alpine.LSU.2.11.2006081321120.10251@eggly.anvils> (raw)
In-Reply-To: <20200608114445.GT19604@bombadil.infradead.org>
On Mon, 8 Jun 2020, Matthew Wilcox wrote:
> On Mon, Jun 08, 2020 at 11:08:33AM +0200, Vlastimil Babka wrote:
> > On 6/6/20 5:12 PM, Matthew Wilcox wrote:
> > > On Sat, Jun 06, 2020 at 01:38:57AM -0600, Chris Murphy wrote:
> > >> [ 5225.501756] gnome-shell: page allocation failure: order:0,
> > >> mode:0x400d0(__GFP_IO|__GFP_FS|__GFP_COMP|__GFP_RECLAIMABLE),
> > >> nodemask=(null),cpuset=/,mems_allowed=0
> > >
> > > These are relatively liberal constraints on what the page allocator is
> > > allowed to do in order to succeed.
> >
> > They are not, it's not allowed to reclaim at all - __GFP_RECLAIMABLE is not the
> > same thing as __GFP_RECLAIM :)
>
> Ohh, __GFP_RECLAIMABLE is set because it's an XArray allocation and the
> slab can shrink the xarray nodes (by pruning page cache shadow entries),
> but __GFP_RECLAIM isn't set because shmem_gfp_pages() removes it. So slab
> is saying "you can reclaim", but shmem is saying "not in this context".
>
> > AFAICS the masks starts in shmem_gfp_pages()
>
> Ah! You mean in i915's shmem_get_pages().
Yes, it is surprising to find shmem_get_pages() over there. It's static,
but still confusing. ChrisW, any chance that your shmem_get_pages()
could be renamed i915_gem_shmem_get_pages(), and similarly the several
other shmem_* functions in there?
>
> > * Fail silently without starting the shrinker
> > noreclaim = mapping_gfp_constraint(mapping, ~__GFP_RECLAIM);
> > noreclaim |= __GFP_NORETRY | __GFP_NOWARN;
> >
> > possibly mapping has GFP_KERNEL, but this removes the GFP_RECLAIM part and adds
> > __GFP_NORETRY | __GFP_NOWARN
> >
> > if this fails (silently) there's a fallback
> >
> > But when this reaches __read_swap_cache_async() it does:
> >
> > /* May fail (-ENOMEM) if XArray node allocation failed. */
> > err = add_to_swap_cache(new_page, entry, gfp_mask & GFP_KERNEL);
> >
> > So we lose the __GFP_NORETRY and importantly __GFP_NOWARN. Looks like you added
> > that with commit 8d93b41c09d1b :)
>
> I just moved code around:
>
> - /*
> - * call radix_tree_preload() while we can wait.
> - */
> - err = radix_tree_maybe_preload(gfp_mask & GFP_KERNEL);
> - if (err)
> - break;
>
> so this problem could have occurred without this patch.
>
> Yes, it seems to me that the problem is that i915 set GFP_NOWARN and
> swap_state removed it. It's been this way since 2008 when Hugh committed
> f000944d03a5
>
> I wouldn't have a problem with turning that '& GFP_KERNEL' into
> '& GFP_KERNEL | __GFP_NOWARN | __GFP_NORETRY'.
Yes, I'd be fine with that too (with some parentheses perhaps).
Or (looking at what you used in __add_to_page_cache_locked()),
would "gfp_mask & GFP_RECLAIM_MASK" be better? I think so,
and guess you'd be glad to follow your own precedent.
Quiz question: where would you expect GFP_RECLAIM_MASK to be defined?
ChrisM, could you try with the patch below, and see if it works
for you - I hope it doesn't just give you a blank screen.
I've often wondered whether the swapin readaheads would do better
to pass in such flags on all but the one page required - but that's
a separate matter.
>
> i915 has been using __GFP_NOWARN | __GFP_NORETRY in this path since 2012
> (!) with commit 6c085a728cf0 from Chris Wilson. So I guess we just
> got away with it until now?
I tried, but haven't found a convincing explanation.
Is i915 putting more pressure on now in this setup, perhaps?
>
> Adding Chris & Hugh.
>
> > ...
> >
> > >> [ 5225.502339] Mem-Info:
> > >> [ 5225.502345] active_anon:1433763 inactive_anon:207289 isolated_anon:182
> > >> active_file:10333 inactive_file:8393 isolated_file:2
> > >> unevictable:4657 dirty:100 writeback:0 unstable:0
> > >> slab_reclaimable:16672 slab_unreclaimable:38093
> > >> mapped:8919 shmem:4496 pagetables:10454 bounce:0
> > >> free:26161 free_pcp:2054 free_cma:0
> > >> [ 5225.502350] Node 0 active_anon:5735052kB inactive_anon:829156kB
> > >> active_file:41332kB inactive_file:33572kB unevictable:18628kB
> > >> isolated(anon):728kB isolated(file):8kB mapped:35676kB dirty:400kB
> > >> writeback:0kB shmem:17984kB shmem_thp: 0kB shmem_pmdmapped: 0kB
> > >> anon_thp: 0kB writeback_tmp:0kB unstable:0kB all_unreclaimable? no
> > >> [ 5225.502352] Node 0 DMA free:15344kB min:128kB low:160kB high:192kB
> > >> reserved_highatomic:0KB active_anon:0kB inactive_anon:0kB
> > >> active_file:0kB inactive_file:0kB unevictable:0kB writepending:0kB
> > >> present:15988kB managed:15360kB mlocked:0kB kernel_stack:0kB
> > >> pagetables:0kB bounce:0kB free_pcp:0kB local_pcp:0kB free_cma:0kB
> > >> [ 5225.502357] lowmem_reserve[]: 0 2069 7810 7810 7810
> > >> [ 5225.502360] Node 0 DMA32 free:40212kB min:17868kB low:22332kB
> > >> high:26796kB reserved_highatomic:0KB active_anon:1640016kB
> > >> inactive_anon:265148kB active_file:9856kB inactive_file:12584kB
> > >> unevictable:2968kB writepending:136kB present:2255864kB
> > >> managed:2157904kB mlocked:0kB kernel_stack:48kB pagetables:8524kB
> > >> bounce:0kB free_pcp:2904kB local_pcp:156kB free_cma:0kB
> > >> [ 5225.502365] lowmem_reserve[]: 0 0 5741 5741 5741
> > >> [ 5225.502368] Node 0 Normal free:49088kB min:49584kB low:61980kB
> > >> high:74376kB reserved_highatomic:2048KB active_anon:4098076kB
> > >> inactive_anon:563004kB active_file:32212kB inactive_file:21312kB
> > >> unevictable:13680kB writepending:0kB present:6027264kB
> > >> managed:5879476kB mlocked:1792kB kernel_stack:7312kB
> > >> pagetables:33292kB bounce:0kB free_pcp:5388kB local_pcp:780kB
> > >> free_cma:0kB
> > >> [ 5225.502373] lowmem_reserve[]: 0 0 0 0 0
> > >> [ 5225.502376] Node 0 DMA: 0*4kB 0*8kB 1*16kB (U) 1*32kB (U) 1*64kB
> > >> (U) 1*128kB (U) 1*256kB (U) 1*512kB (U) 0*1024kB 1*2048kB (M) 3*4096kB
> > >> (M) = 15344kB
> > >> [ 5225.502385] Node 0 DMA32: 1160*4kB (UM) 471*8kB (UME) 84*16kB (UM)
> > >> 103*32kB (UME) 116*64kB (UME) 59*128kB (UME) 26*256kB (UE) 8*512kB (E)
> > >> 2*1024kB (E) 0*2048kB 0*4096kB = 40824kB
> > >> [ 5225.502394] Node 0 Normal: 4778*4kB (UMH) 1400*8kB (UMEH) 346*16kB
> > >> (UMH) 270*32kB (UMEH) 20*64kB (UMEH) 20*128kB (MEH) 4*256kB (MEH)
> > >> 0*512kB 0*1024kB 0*2048kB 0*4096kB = 49352kB
> > >
> > > Umm ... seems like there's lots of memory free. Why did this fail?
> >
> > Normal is below min watermark, and for DMA32 and DMA the lowmem reserve most
> > likely kicked in.
--- 5.7.0/mm/swap_state.c 2020-05-31 16:49:15.000000000 -0700
+++ linux/mm/swap_state.c 2020-06-08 14:27:38.211813658 -0700
@@ -23,6 +23,7 @@
#include <linux/huge_mm.h>
#include <asm/pgtable.h>
+#include "internal.h"
/*
* swapper_space is a fiction, retained to simplify the path through
@@ -418,7 +419,8 @@ struct page *__read_swap_cache_async(swp
/* May fail (-ENOMEM) if XArray node allocation failed. */
__SetPageLocked(new_page);
__SetPageSwapBacked(new_page);
- err = add_to_swap_cache(new_page, entry, gfp_mask & GFP_KERNEL);
+ err = add_to_swap_cache(new_page, entry,
+ gfp_mask & GFP_RECLAIM_MASK);
if (likely(!err)) {
/* Initiate read into locked page */
SetPageWorkingset(new_page);
next prev parent reply other threads:[~2020-06-08 21:34 UTC|newest]
Thread overview: 14+ messages / expand[flat|nested] mbox.gz Atom feed top
2020-06-06 7:38 Chris Murphy
2020-06-06 15:12 ` Matthew Wilcox
2020-06-07 0:50 ` Chris Murphy
2020-06-08 9:08 ` Vlastimil Babka
2020-06-08 11:44 ` Matthew Wilcox
2020-06-08 21:33 ` Hugh Dickins [this message]
2020-06-09 2:01 ` Matthew Wilcox
2020-06-10 15:21 ` Chris Murphy
2020-06-10 15:28 ` Chris Murphy
2020-06-10 23:31 ` Chris Murphy
2020-06-10 23:33 ` Matthew Wilcox
2020-06-10 23:43 ` Chris Murphy
2020-06-11 3:14 ` Chris Murphy
2020-06-15 20:29 ` Hugh Dickins
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=alpine.LSU.2.11.2006081321120.10251@eggly.anvils \
--to=hughd@google.com \
--cc=chris@chris-wilson.co.uk \
--cc=linux-mm@kvack.org \
--cc=lists@colorremedies.com \
--cc=vbabka@suse.cz \
--cc=willy@infradead.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox