From: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
To: peterz@infradead.org
Cc: torvalds@linux-foundation.org, davej@codemonkey.org.uk,
npiggin@gmail.com, linux-kernel@vger.kernel.org,
linux-mm@kvack.org, netdev@vger.kernel.org, mhocko@kernel.org
Subject: Re: [4.15-rc9] fs_reclaim lockdep trace
Date: Mon, 29 Jan 2018 20:47:20 +0900 [thread overview]
Message-ID: <201801292047.EHC05241.OHSQOJOVtFMFLF@I-love.SAKURA.ne.jp> (raw)
In-Reply-To: <20180129102746.GQ2269@hirez.programming.kicks-ass.net>
Peter Zijlstra wrote:
> On Sun, Jan 28, 2018 at 02:55:28PM +0900, Tetsuo Handa wrote:
> > This warning seems to be caused by commit d92a8cfcb37ecd13
> > ("locking/lockdep: Rework FS_RECLAIM annotation") which moved the
> > location of
> >
> > /* this guy won't enter reclaim */
> > if ((current->flags & PF_MEMALLOC) && !(gfp_mask & __GFP_NOMEMALLOC))
> > return false;
> >
> > check added by commit cf40bd16fdad42c0 ("lockdep: annotate reclaim context
> > (__GFP_NOFS)").
>
> I'm not entirly sure I get what you mean here. How did I move it? It was
> part of lockdep_trace_alloc(), if __GFP_NOMEMALLOC was set, it would not
> mark the lock as held.
d92a8cfcb37ecd13 replaced lockdep_set_current_reclaim_state() with
fs_reclaim_acquire(), and removed current->lockdep_recursion handling.
----------
# git show d92a8cfcb37ecd13 | grep recursion
-# define INIT_LOCKDEP .lockdep_recursion = 0, .lockdep_reclaim_gfp = 0,
+# define INIT_LOCKDEP .lockdep_recursion = 0,
unsigned int lockdep_recursion;
- if (unlikely(current->lockdep_recursion))
- current->lockdep_recursion = 1;
- current->lockdep_recursion = 0;
- * context checking code. This tests GFP_FS recursion (a lock taken
----------
>
> The new code has it in fs_reclaim_acquire/release to the same effect, if
> __GFP_NOMEMALLOC, we'll not acquire/release the lock.
Excuse me, but I can't catch.
We currently acquire/release __fs_reclaim_map if __GFP_NOMEMALLOC.
----------
+static bool __need_fs_reclaim(gfp_t gfp_mask)
+{
(...snipped...)
+ /* this guy won't enter reclaim */
+ if ((current->flags & PF_MEMALLOC) && !(gfp_mask & __GFP_NOMEMALLOC))
+ return false;
(...snipped...)
+}
----------
>
>
> > Since __kmalloc_reserve() from __alloc_skb() adds
> > __GFP_NOMEMALLOC | __GFP_NOWARN to gfp_mask, __need_fs_reclaim() is
> > failing to return false despite PF_MEMALLOC context (and resulted in
> > lockdep warning).
>
> But that's correct right, __GFP_NOMEMALLOC should negate PF_MEMALLOC.
> That's what the name says.
__GFP_NOMEMALLOC negates PF_MEMALLOC regarding what watermark that allocation
request should use.
----------
static inline int __gfp_pfmemalloc_flags(gfp_t gfp_mask)
{
if (unlikely(gfp_mask & __GFP_NOMEMALLOC))
return 0;
if (gfp_mask & __GFP_MEMALLOC)
return ALLOC_NO_WATERMARKS;
if (in_serving_softirq() && (current->flags & PF_MEMALLOC))
return ALLOC_NO_WATERMARKS;
if (!in_interrupt()) {
if (current->flags & PF_MEMALLOC)
return ALLOC_NO_WATERMARKS;
else if (oom_reserves_allowed(current))
return ALLOC_OOM;
}
return 0;
}
----------
But at the same time, PF_MEMALLOC negates __GFP_DIRECT_RECLAIM.
----------
/* Attempt with potentially adjusted zonelist and alloc_flags */
page = get_page_from_freelist(gfp_mask, order, alloc_flags, ac);
if (page)
goto got_pg;
/* Caller is not willing to reclaim, we can't balance anything */
if (!can_direct_reclaim)
goto nopage;
/* Avoid recursion of direct reclaim */
if (current->flags & PF_MEMALLOC)
goto nopage;
/* Try direct reclaim and then allocating */
page = __alloc_pages_direct_reclaim(gfp_mask, order, alloc_flags, ac,
&did_some_progress);
if (page)
goto got_pg;
/* Try direct compaction and then allocating */
page = __alloc_pages_direct_compact(gfp_mask, order, alloc_flags, ac,
compact_priority, &compact_result);
if (page)
goto got_pg;
/* Do not loop if specifically requested */
if (gfp_mask & __GFP_NORETRY)
goto nopage;
----------
Then, how can fs_reclaim contribute to deadlock?
>
> > Since there was no PF_MEMALLOC safeguard as of cf40bd16fdad42c0, checking
> > __GFP_NOMEMALLOC might make sense. But since this safeguard was added by
> > commit 341ce06f69abfafa ("page allocator: calculate the alloc_flags for
> > allocation only once"), checking __GFP_NOMEMALLOC no longer makes sense.
> > Thus, let's remove __GFP_NOMEMALLOC check and allow __need_fs_reclaim() to
> > return false.
>
> This does not in fact explain what's going on, it just points to
> 'random' patches.
>
> Are you talking about this:
>
> + /* Avoid recursion of direct reclaim */
> + if (p->flags & PF_MEMALLOC)
> + goto nopage;
>
> bit?
Yes.
>
> > Reported-by: Dave Jones <davej@codemonkey.org.uk>
> > Signed-off-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
> > Cc: Peter Zijlstra <peterz@infradead.org>
> > Cc: Nick Piggin <npiggin@gmail.com>
> > ---
> > mm/page_alloc.c | 2 +-
> > 1 file changed, 1 insertion(+), 1 deletion(-)
> >
> > diff --git a/mm/page_alloc.c b/mm/page_alloc.c
> > index 76c9688..7804b0e 100644
> > --- a/mm/page_alloc.c
> > +++ b/mm/page_alloc.c
> > @@ -3583,7 +3583,7 @@ static bool __need_fs_reclaim(gfp_t gfp_mask)
> > return false;
> >
> > /* this guy won't enter reclaim */
> > - if ((current->flags & PF_MEMALLOC) && !(gfp_mask & __GFP_NOMEMALLOC))
> > + if (current->flags & PF_MEMALLOC)
> > return false;
>
> I'm _really_ uncomfortable doing that. Esp. without a solid explanation
> of how this raelly can't possibly lead to trouble. Which the above semi
> incoherent rambling is not.
>
> Your backtrace shows the btrfs shrinker doing an allocation, that's the
> exact kind of thing we need to be extremely careful with.
>
If btrfs is already holding some lock (and thus __GFP_FS is not safe),
that lock must be printed at
2 locks held by sshd/24800:
#0: (sk_lock-AF_INET6){+.+.}, at: [<000000001a069652>] tcp_sendmsg+0x19/0x40
#1: (fs_reclaim){+.+.}, at: [<0000000084f438c2>] fs_reclaim_acquire.part.102+0x5/0x30
doesn't it? But sk_lock-AF_INET6 is not a FS lock, and fs_reclaim does not
actually lock something.
--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org. For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>
next prev parent reply other threads:[~2018-01-29 11:47 UTC|newest]
Thread overview: 15+ messages / expand[flat|nested] mbox.gz Atom feed top
2018-01-24 1:36 Dave Jones
2018-01-27 22:24 ` Dave Jones
2018-01-27 22:43 ` Linus Torvalds
2018-01-28 1:16 ` Tetsuo Handa
2018-01-28 4:25 ` Tetsuo Handa
2018-01-28 5:55 ` Tetsuo Handa
2018-01-29 2:43 ` Dave Jones
2018-01-29 10:27 ` Peter Zijlstra
2018-01-29 11:47 ` Tetsuo Handa [this message]
2018-01-29 13:55 ` Peter Zijlstra
2018-02-01 11:36 ` Tetsuo Handa
2018-02-08 11:43 ` [PATCH v2] lockdep: Fix fs_reclaim warning Tetsuo Handa
2018-02-12 12:08 ` Nikolay Borisov
2018-02-12 13:46 ` Tetsuo Handa
2018-02-19 11:52 ` Tetsuo Handa
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=201801292047.EHC05241.OHSQOJOVtFMFLF@I-love.SAKURA.ne.jp \
--to=penguin-kernel@i-love.sakura.ne.jp \
--cc=davej@codemonkey.org.uk \
--cc=linux-kernel@vger.kernel.org \
--cc=linux-mm@kvack.org \
--cc=mhocko@kernel.org \
--cc=netdev@vger.kernel.org \
--cc=npiggin@gmail.com \
--cc=peterz@infradead.org \
--cc=torvalds@linux-foundation.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox