From: Catalin Marinas <catalin.marinas@arm.com>
To: "Paul E. McKenney" <paulmck@kernel.org>
Cc: Qian Cai <cai@lca.pw>, Linux-MM <linux-mm@kvack.org>,
	LKML <linux-kernel@vger.kernel.org>
Subject: Re: Kmemleak infrastructure improvement for task_struct leaks and call_rcu()
Date: Thu, 7 May 2020 18:14:19 +0100
Message-ID: <20200507171418.GC3180@gaia>
In-Reply-To: <20200506174019.GA2869@paulmck-ThinkPad-P72>

On Wed, May 06, 2020 at 10:40:19AM -0700, Paul E. McKenney wrote:
> On Wed, May 06, 2020 at 12:22:37PM -0400, Qian Cai wrote:
> > == call_rcu() leaks ==
> > Another issue that might be relevant is that kmemleak sometimes
> > gives a lot of false positives (hundreds) because the memory was
> > supposed to be freed by call_rcu() (for example, in dst_release()),
> > but for some reason it takes a long time, probably waiting for grace
> > periods or some kind of RCU self-stall, while the memory has already
> > become an orphan. I am not sure how we can resolve this properly
> > without first figuring out why call_rcu() is taking so long to
> > finish.
> 
> I know nothing about kmemleak, but I won't let that stop me from making
> random suggestions...
> 
> One approach is to do an rcu_barrier() inside kmemleak just before
> printing leaked blocks, and check to see if any are still leaked after
> the rcu_barrier().

The main issue is that kmemleak doesn't stop the world when scanning
(which can take over a minute, depending on your hardware), so we get
lots of transient pointer misses. There are some heuristics to cope
with this, but obviously they don't always work.

With RCU, objects are queued for later freeing and chained via
rcu_head.next (IIUC). Under load this list can be pretty volatile, and
if that happens during a kmemleak scan, losing track of a single next
pointer is enough for the rest of the list to be reported as a leak.
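
For reference, roughly (struct rcu_head matches the real definition in
include/linux/types.h, modulo the callback_head alias; struct foo and
its callbacks are made up for illustration):

	struct rcu_head {
		struct rcu_head *next;		/* links the callback list */
		void (*func)(struct rcu_head *head);
	};

	struct foo {				/* hypothetical call_rcu() user */
		/* ... payload ... */
		struct rcu_head rcu;
	};

	static void free_foo_rcu(struct rcu_head *head)
	{
		kfree(container_of(head, struct foo, rcu));
	}

	static void foo_release(struct foo *p)
	{
		/*
		 * Until the grace period ends and free_foo_rcu() runs, p
		 * is reachable only through the RCU callback list: the
		 * rcu_head.next of the previously queued callback (or the
		 * per-CPU list head for the first one). Miss one next
		 * pointer during a scan and the rest of the chain is lost.
		 */
		call_rcu(&p->rcu, free_foo_rcu);
	}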

I think an rcu_barrier() just before starting the kmemleak scan may
help if it reduces the number of queued objects.
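
Something along these lines (untested sketch; kmemleak_scan() and
scan_mutex are the names used in mm/kmemleak.c, the exact call site is
an assumption on my part):

	mutex_lock(&scan_mutex);
	rcu_barrier();		/* wait for all pending call_rcu() callbacks */
	kmemleak_scan();	/* then do the (non-atomic) memory scan */
	mutex_unlock(&scan_mutex);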

Now, I wonder whether kmemleak itself can break this RCU chain. The
kmemleak metadata is allocated from the slab allocation callback. The
freeing, however, is done via call_rcu() because originally calling
back into the slab freeing path from kmemleak_free() didn't go well.
Since the kmemleak_object structure is not itself tracked by kmemleak,
I wonder whether its rcu_head would break this directed pointer
reference graph.
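
For context, the relevant pattern in mm/kmemleak.c looks roughly like
this (condensed from memory, details elided):

	struct kmemleak_object {
		/* ... */
		struct rcu_head rcu;		/* for deferred freeing */
		/* ... */
	};

	static void free_object_rcu(struct rcu_head *rcu)
	{
		struct kmemleak_object *object =
			container_of(rcu, struct kmemleak_object, rcu);
		/* ... free any scan areas ... */
		kmem_cache_free(object_cache, object);
	}

	static void put_object(struct kmemleak_object *object)
	{
		if (!atomic_dec_and_test(&object->use_count))
			return;
		/* ... */
		call_rcu(&object->rcu, free_object_rcu);
	}

i.e. the untracked kmemleak_object is queued on the same per-CPU
rcu_head.next callback lists as everything else freed via call_rcu().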

Let's try the rcu_barrier() first and I'll think about the metadata case
over the weekend.

Thanks.

-- 
Catalin


Thread overview: 11+ messages
2020-05-06 16:22 Qian Cai
2020-05-06 17:40 ` Paul E. McKenney
2020-05-07 17:14   ` Catalin Marinas [this message]
2020-05-07 17:54     ` Paul E. McKenney
2020-05-07 17:16 ` Catalin Marinas
2020-05-07 17:29   ` Qian Cai
2020-05-09  9:44     ` Catalin Marinas
2020-05-10 21:27       ` Qian Cai
2020-05-12 14:15         ` Catalin Marinas
2020-05-12 18:09           ` Qian Cai
2020-05-13  9:59             ` Catalin Marinas
