Re: [PATCH] kmemleak: don't use __GFP_NOFAIL

From: Catalin Marinas <catalin.marinas@arm.com>
To: Dmitry Vyukov <dvyukov@google.com>
Cc: Michal Hocko <mhocko@suse.com>, Chunyu Hu <chuhu@redhat.com>,
	Tetsuo Handa <penguin-kernel@i-love.sakura.ne.jp>,
	malat@debian.org, Linux-MM <linux-mm@kvack.org>,
	Akinobu Mita <akinobu.mita@gmail.com>
Subject: Re: [PATCH] kmemleak: don't use __GFP_NOFAIL
Date: Mon, 4 Jun 2018 17:41:03 +0100
Message-ID: <20180604164102.vemn3htz6qgmonjq@armageddon.cambridge.arm.com>
In-Reply-To: <CACT4Y+YzaFeBD2nBmv5BGv6Cq_-4RK+D9MhUTjwOUuc4jN5pYQ@mail.gmail.com>

On Mon, Jun 04, 2018 at 05:36:31PM +0200, Dmitry Vyukov wrote:
> On Mon, Jun 4, 2018 at 5:08 PM, Catalin Marinas <catalin.marinas@arm.com> wrote:
> > On Mon, Jun 04, 2018 at 02:42:10PM +0200, Michal Hocko wrote:
> >> On Mon 04-06-18 10:41:39, Dmitry Vyukov wrote:
> >> [...]
> >> > FWIW this problem is traditionally solved in dynamic analysis tools by
> >> > embedding meta info right in headers of heap blocks. All of KASAN,
> >> > KMSAN, slub debug, LeakSanitizer, asan, valgrind work this way. Then
> >> > an object is either allocated or not. If caller has something to
> >> > prevent allocations from failing in any context, then the same will be
> >> > true for KMEMLEAK meta data.
> >>
> >> This makes much more sense, of course. I thought there were some
> >> fundamental reasons why kmemleak needs to have an off-object tracking
> >> which makes the whole thing much more complicated of course.
> >
> > Kmemleak needs to track all memory blocks that may contain pointers
> > (otherwise the dependency graph cannot be correctly tracked leading to
> > lots of false positives). Not all of these objects come from the slab
> > allocator; for example, it tracks certain alloc_pages() blocks and all
> > memblock_alloc() allocations.
> 
> I understand that this will make KMEMLEAK tracking non-uniform, but
> heap objects are the most important class of allocations.
> page struct already contains stackdepot id if CONFIG_PAGE_OWNER is
> enabled. Do we need anything else other than stack trace for pages?
> I don't know about memblock's.

Well, it needs most of the other stuff that's in struct kmemleak_object
(list_head, rb_node, some counters, spinlock_t).

> > An option would be to use separate metadata only for non-slab objects,
> > though I'd have to see how intrusive this is for mm/sl*b.c. Also there
> > is RCU freeing for the kmemleak metadata to avoid locking when
> > traversing the internal lists. If the metadata is in the slab object
> > itself, we'd have to either defer its freeing or add some bigger lock to
> > kmemleak.
> 
> This relates to scanning without a stopped world, right?

Initially the RCU mechanism was added so that kmemleak could defer
freeing its metadata rather than make another recursive call into the
slab freeing routine (since it frees the metadata when the tracked
object is freed). This came in handy for other list traversals in
kmemleak. For the actual memory
scanning, there is some fine-grained locking per metadata object as we
want to block the freeing until the scanning of the specific object
completes (e.g. vfree() must not unmap the object during scanning).

> In our
> experience with large-scale systematic testing, any tool with false
> positives can't be used in practice in a systematic way. KMEMLEAK false
> positives prevent us from enabling it on syzbot. We know there are tons
> of leaks, we have the tool, but we are not detecting leaks. I don't
> know who uses KMEMLEAK in non-stop-the-world mode, or how, but
> stop-the-world is pretty much a requirement for deployment for us. And
> it would also solve the problem of heap blocks disappearing under our
> feet, right?

A hard requirement during the early kmemleak development was not to
actually stop the world (as it can even take minutes to complete the
scanning). It employs various heuristics to deal with false positives
like checksumming, delaying the actual reporting, waiting for an object
to be detected as a leak in two successive scans while its checksum is
the same. While not ideal, it works most of the time.
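
As a rough illustration of the "two successive scans with an unchanged
checksum" heuristic described above, here is a minimal userspace sketch in
C. It is not kmemleak's actual code: struct tracked_obj, obj_checksum() and
end_of_scan() are hypothetical names invented for this example, and the
FNV-1a hash only stands in for whatever checksum the kernel really uses.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical per-object metadata; NOT the kernel's struct kmemleak_object. */
struct tracked_obj {
	const unsigned char *data;	/* start of the tracked block */
	size_t len;			/* size of the tracked block */
	uint32_t checksum;		/* content checksum at the previous scan */
	unsigned int ref_count;		/* references found during the current scan */
	bool suspect;			/* unreferenced at the previous scan */
	bool reported;			/* already reported as a leak */
};

/* Illustrative FNV-1a hash, standing in for the real checksum. */
static uint32_t obj_checksum(const unsigned char *p, size_t len)
{
	uint32_t h = 2166136261u;

	for (size_t i = 0; i < len; i++) {
		h ^= p[i];
		h *= 16777619u;
	}
	return h;
}

/*
 * Call once per object at the end of each scan. Returns true only when
 * the object was unreferenced in two successive scans and its content
 * did not change in between; each object is reported at most once.
 */
static bool end_of_scan(struct tracked_obj *o)
{
	uint32_t sum;
	bool report = false;

	assert(o != NULL && o->data != NULL);
	sum = obj_checksum(o->data, o->len);

	if (o->ref_count > 0) {
		o->suspect = false;		/* still referenced: not a leak */
	} else if (o->suspect && sum == o->checksum) {
		report = !o->reported;		/* second unreferenced scan, same content */
		o->reported = true;
	} else {
		o->suspect = true;		/* first sighting or content changed: wait */
	}

	o->checksum = sum;
	o->ref_count = 0;			/* reset for the next scan */
	return report;
}
```

The delay costs at least one extra scan before a real leak is reported, but
an object that is still being modified (and so may only be transiently
unreferenced) is not flagged in this sketch.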

Now, there was indeed a recent requirement to implement stop-the-world
scanning via a "stopscan" command to /sys/kernel/debug/kmemleak (using
stop_machine()) but I never got around to implementing it. This would be
very useful for non-interactive sessions like automated testing.

> FWIW In LeakSanitizer we don't specifically keep track of heap blocks.
> Instead we stop the world and then ask memory allocator for metainfo.
> I would expect that sl*b also have all required info, maybe in not
> O(1) accessible form, so it may require some preprocessing (e.g.
> collecting all free objects in a slab and then subtracting it from set
> of all objects in the slab to get set of allocated objects).
> But I understand that all of this turns this from "add a flag" to
> almost a complete rewrite of the tool...
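
The preprocessing step mentioned in the quoted text (collect the free
objects in a slab, then subtract them from the set of all objects) could
look roughly like the C sketch below. It assumes a deliberately simplified
slab model: struct toy_slab, SLAB_MAX_OBJS and slab_allocated() are
hypothetical and do not correspond to anything in mm/sl*b.c.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define SLAB_MAX_OBJS 64	/* hypothetical upper bound for this sketch */

/* Hypothetical simplified slab: fixed-size objects, free ones listed by index. */
struct toy_slab {
	size_t nr_objs;				/* total objects in the slab */
	size_t nr_free;				/* entries used in free_idx[] */
	size_t free_idx[SLAB_MAX_OBJS];		/* indices of free objects */
};

/*
 * Write the indices of allocated objects (all objects minus the free
 * ones) into allocated[], in ascending order, and return how many
 * there are. allocated[] must have room for s->nr_objs entries.
 */
static size_t slab_allocated(const struct toy_slab *s, size_t *allocated)
{
	bool is_free[SLAB_MAX_OBJS] = { false };
	size_t n = 0;

	assert(s->nr_objs <= SLAB_MAX_OBJS && s->nr_free <= s->nr_objs);

	/* Pass 1: mark every object found on the free list. */
	for (size_t i = 0; i < s->nr_free; i++) {
		assert(s->free_idx[i] < s->nr_objs);
		is_free[s->free_idx[i]] = true;
	}

	/* Pass 2: whatever is not marked free is allocated. */
	for (size_t i = 0; i < s->nr_objs; i++)
		if (!is_free[i])
			allocated[n++] = i;

	return n;
}
```

This is only consistent if nothing allocates or frees while the two passes
run, which is why the approach goes together with stopping the world.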

As I said above, background scanning is still a requirement but we could
add a stopscan command on top; it shouldn't be too hard.

-- 
Catalin

Thread overview: 23+ messages
2018-05-26  7:14 WARNING: CPU: 0 PID: 21 at ../mm/page_alloc.c:4258 __alloc_pages_nodemask+0xa88/0xfec Mathieu Malaterre
2018-05-28  8:34 ` Michal Hocko
2018-05-28 13:05   ` [PATCH] kmemleak: don't use __GFP_NOFAIL Tetsuo Handa
2018-05-28 13:24     ` Michal Hocko
2018-05-28 21:05       ` Tetsuo Handa
2018-05-29 13:27         ` Chunyu Hu
2018-05-29 13:46           ` Tetsuo Handa
2018-05-30  9:35             ` Chunyu Hu
2018-05-30 10:46               ` Michal Hocko
2018-05-30 11:42                 ` Chunyu Hu
2018-05-30 12:38                   ` Michal Hocko
2018-05-31 10:51                     ` Chunyu Hu
2018-05-31 11:35                       ` Michal Hocko
2018-05-31 12:28                         ` Chunyu Hu
2018-05-31 15:22                     ` Catalin Marinas
2018-05-31 18:41                       ` Michal Hocko
2018-06-01  1:50                         ` Chunyu Hu
2018-06-01  4:53                           ` Chunyu Hu
2018-06-04  8:41                             ` Dmitry Vyukov
2018-06-04 12:42                               ` Michal Hocko
2018-06-04 15:08                                 ` Catalin Marinas
2018-06-04 15:36                                   ` Dmitry Vyukov
2018-06-04 16:41                                     ` Catalin Marinas [this message]
