Re: tty: memory leak in tty_register

linux-mm.kvack.org archive mirror

* Re: tty: memory leak in tty_register_driver
       [not found] <CACT4Y+bZticikTpnc0djxRBLCWhj=2DqQk=KRf5zDvrLdHzEbQ@mail.gmail.com>
@ 2016-02-28 16:42 ` Dmitry Vyukov
  2016-02-28 23:47   ` Catalin Marinas
  0 siblings, 1 reply; 7+ messages in thread
From: Dmitry Vyukov @ 2016-02-28 16:42 UTC (permalink / raw)
  To: Greg Kroah-Hartman, Jiri Slaby, LKML, Peter Hurley,
	One Thousand Gnomes, J Freyensee, Catalin Marinas, linux-mm,
	Paul Bolle
  Cc: Alexander Potapenko, Kostya Serebryany, Sasha Levin, syzkaller

On Mon, Feb 15, 2016 at 11:42 AM, Dmitry Vyukov <dvyukov@google.com> wrote:
> Hello,
>
> When I am running the following program in a parallel loop, kmemleak
> starts reporting memory leaks of objects allocated in
> tty_register_driver during boot. These leaks start popping up
> chaotically and as you can see they originate in different drivers
> (synclinkmp_init, isdn_init, chr_dev_init, sysfs_init).
>
> On commit 388f7b1d6e8ca06762e2454d28d6c3c55ad0fe95 (4.5-rc3).
>
> // autogenerated by syzkaller (http://github.com/google/syzkaller)
> #include <sys/types.h>
> #include <sys/stat.h>
> #include <fcntl.h>
> #include <sys/ioctl.h>
>
> int main()
> {
>   int fd, val;
>
>   fd = open("/dev/ptmx", O_RDWR);
>   val = 21;
>   ioctl(fd, TIOCSETD, &val);
>   return 0;
> }
>
> unreferenced object 0xffff88006708dc20 (size 8):
>   comm "swapper/0", pid 1, jiffies 4294672590 (age 930.839s)
>   hex dump (first 8 bytes):
>     74 74 79 53 4c 4d 38 00                          ttySLM8.
>   backtrace:
>     [<ffffffff81765d10>] __kmalloc_track_caller+0x1b0/0x320 mm/slub.c:4068
>     [<ffffffff816b37a9>] kstrdup+0x39/0x70 mm/util.c:53
>     [<ffffffff816b3826>] kstrdup_const+0x46/0x60 mm/util.c:74
>     [<ffffffff8194e5bb>] __kernfs_new_node+0x2b/0x2b0 fs/kernfs/dir.c:536
>     [<ffffffff81951c70>] kernfs_new_node+0x80/0xe0 fs/kernfs/dir.c:572
>     [<ffffffff81957223>] kernfs_create_link+0x33/0x150 fs/kernfs/symlink.c:32
>     [<ffffffff81959c4b>] sysfs_do_create_link_sd.isra.2+0x8b/0x120
> fs/sysfs/symlink.c:44
>     [<     inline     >] sysfs_do_create_link fs/sysfs/symlink.c:80
>     [<ffffffff81959d45>] sysfs_create_link+0x65/0xc0 fs/sysfs/symlink.c:92
>     [<     inline     >] device_add_class_symlinks drivers/base/core.c:891
>     [<ffffffff835897fc>] device_add+0x73c/0x1480 drivers/base/core.c:1086
>     [<ffffffff8358a55d>] device_register+0x1d/0x20 drivers/base/core.c:1189
>     [<ffffffff82f80b50>] tty_register_device_attr+0x320/0x760
> drivers/tty/tty_io.c:3312
>     [<     inline     >] tty_register_device drivers/tty/tty_io.c:3239
>     [<ffffffff82f8133b>] tty_register_driver+0x36b/0x670
> drivers/tty/tty_io.c:3504
>     [<ffffffff889babcb>] synclinkmp_init+0x35a/0x40e
> drivers/tty/synclinkmp.c:3992
>     [<ffffffff81002259>] do_one_initcall+0x159/0x380 init/main.c:794
>     [<     inline     >] do_initcall_level init/main.c:859
>     [<     inline     >] do_initcalls init/main.c:867
>     [<     inline     >] do_basic_setup init/main.c:885
>     [<ffffffff888fcc29>] kernel_init_freeable+0x474/0x52d init/main.c:1010
> unreferenced object 0xffff88006709b330 (size 152):
>   comm "swapper/0", pid 1, jiffies 4294672590 (age 930.839s)
>   hex dump (first 32 bytes):
>     01 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
>     00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
>   backtrace:
>     [<ffffffff81761ba3>] kmem_cache_alloc+0x153/0x2e0 mm/slub.c:2609
>     [<     inline     >] kmem_cache_zalloc include/linux/slab.h:597
>     [<ffffffff8194e5fc>] __kernfs_new_node+0x6c/0x2b0 fs/kernfs/dir.c:540
>     [<ffffffff81951c70>] kernfs_new_node+0x80/0xe0 fs/kernfs/dir.c:572
>     [<ffffffff81957223>] kernfs_create_link+0x33/0x150 fs/kernfs/symlink.c:32
>     [<ffffffff81959c4b>] sysfs_do_create_link_sd.isra.2+0x8b/0x120
> fs/sysfs/symlink.c:44
>     [<     inline     >] sysfs_do_create_link fs/sysfs/symlink.c:80
>     [<ffffffff81959d45>] sysfs_create_link+0x65/0xc0 fs/sysfs/symlink.c:92
>     [<     inline     >] device_add_class_symlinks drivers/base/core.c:891
>     [<ffffffff835897fc>] device_add+0x73c/0x1480 drivers/base/core.c:1086
>     [<ffffffff8358a55d>] device_register+0x1d/0x20 drivers/base/core.c:1189
>     [<ffffffff82f80b50>] tty_register_device_attr+0x320/0x760
> drivers/tty/tty_io.c:3312
>     [<     inline     >] tty_register_device drivers/tty/tty_io.c:3239
>     [<ffffffff82f8133b>] tty_register_driver+0x36b/0x670
> drivers/tty/tty_io.c:3504
>     [<ffffffff889babcb>] synclinkmp_init+0x35a/0x40e
> drivers/tty/synclinkmp.c:3992
>     [<ffffffff81002259>] do_one_initcall+0x159/0x380 init/main.c:794
>     [<     inline     >] do_initcall_level init/main.c:859
>     [<     inline     >] do_initcalls init/main.c:867
>     [<     inline     >] do_basic_setup init/main.c:885
>     [<ffffffff888fcc29>] kernel_init_freeable+0x474/0x52d init/main.c:1010
>     [<ffffffff8663ca33>] kernel_init+0x13/0x150 init/main.c:936
>     [<ffffffff866617af>] ret_from_fork+0x3f/0x70 arch/x86/entry/entry_64.S:468
> unreferenced object 0xffff88006708d860 (size 8):
>   comm "swapper/0", pid 1, jiffies 4294672591 (age 930.838s)
>   hex dump (first 8 bytes):
>     74 74 79 53 4c 4d 39 00                          ttySLM9.
>   backtrace:
>     [<ffffffff81765d10>] __kmalloc_track_caller+0x1b0/0x320 mm/slub.c:4068
>     [<ffffffff816b37a9>] kstrdup+0x39/0x70 mm/util.c:53
>     [<ffffffff816b3826>] kstrdup_const+0x46/0x60 mm/util.c:74
>     [<ffffffff8194e5bb>] __kernfs_new_node+0x2b/0x2b0 fs/kernfs/dir.c:536
>     [<ffffffff81951c70>] kernfs_new_node+0x80/0xe0 fs/kernfs/dir.c:572
>     [<ffffffff81957223>] kernfs_create_link+0x33/0x150 fs/kernfs/symlink.c:32
>     [<ffffffff81959c4b>] sysfs_do_create_link_sd.isra.2+0x8b/0x120
> fs/sysfs/symlink.c:44
>     [<     inline     >] sysfs_do_create_link fs/sysfs/symlink.c:80
>     [<ffffffff81959d45>] sysfs_create_link+0x65/0xc0 fs/sysfs/symlink.c:92
>     [<     inline     >] device_add_class_symlinks drivers/base/core.c:891
>     [<ffffffff835897fc>] device_add+0x73c/0x1480 drivers/base/core.c:1086
>     [<ffffffff8358a55d>] device_register+0x1d/0x20 drivers/base/core.c:1189
>     [<ffffffff82f80b50>] tty_register_device_attr+0x320/0x760
> drivers/tty/tty_io.c:3312
>     [<     inline     >] tty_register_device drivers/tty/tty_io.c:3239
>     [<ffffffff82f8133b>] tty_register_driver+0x36b/0x670
> drivers/tty/tty_io.c:3504
>     [<ffffffff889babcb>] synclinkmp_init+0x35a/0x40e
> drivers/tty/synclinkmp.c:3992
>     [<ffffffff81002259>] do_one_initcall+0x159/0x380 init/main.c:794
>     [<     inline     >] do_initcall_level init/main.c:859
>     [<     inline     >] do_initcalls init/main.c:867
>     [<     inline     >] do_basic_setup init/main.c:885
>     [<ffffffff888fcc29>] kernel_init_freeable+0x474/0x52d init/main.c:1010
> unreferenced object 0xffff88006709a490 (size 152):
>   comm "swapper/0", pid 1, jiffies 4294672591 (age 930.853s)
>   hex dump (first 32 bytes):
>     01 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
>     00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
>   backtrace:
>     [<ffffffff81761ba3>] kmem_cache_alloc+0x153/0x2e0 mm/slub.c:2609
>     [<     inline     >] kmem_cache_zalloc include/linux/slab.h:597
>     [<ffffffff8194e5fc>] __kernfs_new_node+0x6c/0x2b0 fs/kernfs/dir.c:540
>     [<ffffffff81951c70>] kernfs_new_node+0x80/0xe0 fs/kernfs/dir.c:572
>     [<ffffffff81957223>] kernfs_create_link+0x33/0x150 fs/kernfs/symlink.c:32
>     [<ffffffff81959c4b>] sysfs_do_create_link_sd.isra.2+0x8b/0x120
> fs/sysfs/symlink.c:44
>     [<     inline     >] sysfs_do_create_link fs/sysfs/symlink.c:80
>     [<ffffffff81959d45>] sysfs_create_link+0x65/0xc0 fs/sysfs/symlink.c:92
>     [<     inline     >] device_add_class_symlinks drivers/base/core.c:891
>     [<ffffffff835897fc>] device_add+0x73c/0x1480 drivers/base/core.c:1086
>     [<ffffffff8358a55d>] device_register+0x1d/0x20 drivers/base/core.c:1189
>     [<ffffffff82f80b50>] tty_register_device_attr+0x320/0x760
> drivers/tty/tty_io.c:3312
>     [<     inline     >] tty_register_device drivers/tty/tty_io.c:3239
>     [<ffffffff82f8133b>] tty_register_driver+0x36b/0x670
> drivers/tty/tty_io.c:3504
>     [<ffffffff889babcb>] synclinkmp_init+0x35a/0x40e
> drivers/tty/synclinkmp.c:3992
>     [<ffffffff81002259>] do_one_initcall+0x159/0x380 init/main.c:794
>     [<     inline     >] do_initcall_level init/main.c:859
>     [<     inline     >] do_initcalls init/main.c:867
>     [<     inline     >] do_basic_setup init/main.c:885
>     [<ffffffff888fcc29>] kernel_init_freeable+0x474/0x52d init/main.c:1010
>     [<ffffffff8663ca33>] kernel_init+0x13/0x150 init/main.c:936
>     [<ffffffff866617af>] ret_from_fork+0x3f/0x70 arch/x86/entry/entry_64.S:468
> unreferenced object 0xffff880064f3c960 (size 8):
>   comm "swapper/0", pid 1, jiffies 4294674404 (age 929.065s)
>   hex dump (first 8 bytes):
>     74 74 79 49 31 30 00 ff                          ttyI10..
>   backtrace:
>     [<ffffffff81765d10>] __kmalloc_track_caller+0x1b0/0x320 mm/slub.c:4068
>     [<ffffffff816b37a9>] kstrdup+0x39/0x70 mm/util.c:53
>     [<ffffffff816b3826>] kstrdup_const+0x46/0x60 mm/util.c:74
>     [<ffffffff8194e5bb>] __kernfs_new_node+0x2b/0x2b0 fs/kernfs/dir.c:536
>     [<ffffffff81951c70>] kernfs_new_node+0x80/0xe0 fs/kernfs/dir.c:572
>     [<ffffffff81957223>] kernfs_create_link+0x33/0x150 fs/kernfs/symlink.c:32
>     [<ffffffff81959c4b>] sysfs_do_create_link_sd.isra.2+0x8b/0x120
> fs/sysfs/symlink.c:44
>     [<     inline     >] sysfs_do_create_link fs/sysfs/symlink.c:80
>     [<ffffffff81959d45>] sysfs_create_link+0x65/0xc0 fs/sysfs/symlink.c:92
>     [<     inline     >] device_add_class_symlinks drivers/base/core.c:891
>     [<ffffffff835897fc>] device_add+0x73c/0x1480 drivers/base/core.c:1086
>     [<ffffffff8358a55d>] device_register+0x1d/0x20 drivers/base/core.c:1189
>     [<ffffffff82f80b50>] tty_register_device_attr+0x320/0x760
> drivers/tty/tty_io.c:3312
>     [<     inline     >] tty_register_device drivers/tty/tty_io.c:3239
>     [<ffffffff82f8133b>] tty_register_driver+0x36b/0x670
> drivers/tty/tty_io.c:3504
>     [<ffffffff848b4ef8>] isdn_tty_modem_init+0x3a8/0x1220
> drivers/isdn/i4l/isdn_tty.c:1785
>     [<ffffffff889ed622>] isdn_init+0x2c3/0x505
> drivers/isdn/i4l/isdn_common.c:2334
>     [<ffffffff81002259>] do_one_initcall+0x159/0x380 init/main.c:794
> unreferenced object 0xffff880064f41380 (size 152):
>   comm "swapper/0", pid 1, jiffies 4294674404 (age 929.066s)
>   hex dump (first 32 bytes):
>     01 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
>     00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
>   backtrace:
>     [<ffffffff81761ba3>] kmem_cache_alloc+0x153/0x2e0 mm/slub.c:2609
>     [<     inline     >] kmem_cache_zalloc include/linux/slab.h:597
>     [<ffffffff8194e5fc>] __kernfs_new_node+0x6c/0x2b0 fs/kernfs/dir.c:540
>     [<ffffffff81951c70>] kernfs_new_node+0x80/0xe0 fs/kernfs/dir.c:572
>     [<ffffffff81957223>] kernfs_create_link+0x33/0x150 fs/kernfs/symlink.c:32
>     [<ffffffff81959c4b>] sysfs_do_create_link_sd.isra.2+0x8b/0x120
> fs/sysfs/symlink.c:44
>     [<     inline     >] sysfs_do_create_link fs/sysfs/symlink.c:80
>     [<ffffffff81959d45>] sysfs_create_link+0x65/0xc0 fs/sysfs/symlink.c:92
>     [<     inline     >] device_add_class_symlinks drivers/base/core.c:891
>     [<ffffffff835897fc>] device_add+0x73c/0x1480 drivers/base/core.c:1086
>     [<ffffffff8358a55d>] device_register+0x1d/0x20 drivers/base/core.c:1189
>     [<ffffffff82f80b50>] tty_register_device_attr+0x320/0x760
> drivers/tty/tty_io.c:3312
>     [<     inline     >] tty_register_device drivers/tty/tty_io.c:3239
>     [<ffffffff82f8133b>] tty_register_driver+0x36b/0x670
> drivers/tty/tty_io.c:3504
>     [<ffffffff848b4ef8>] isdn_tty_modem_init+0x3a8/0x1220
> drivers/isdn/i4l/isdn_tty.c:1785
>     [<ffffffff889ed622>] isdn_init+0x2c3/0x505
> drivers/isdn/i4l/isdn_common.c:2334
>     [<ffffffff81002259>] do_one_initcall+0x159/0x380 init/main.c:794
>     [<     inline     >] do_initcall_level init/main.c:859
>     [<     inline     >] do_initcalls init/main.c:867
>     [<     inline     >] do_basic_setup init/main.c:885
>     [<ffffffff888fcc29>] kernel_init_freeable+0x474/0x52d init/main.c:1010
>     [<ffffffff8663ca33>] kernel_init+0x13/0x150 init/main.c:936
> unreferenced object 0xffff88006717e960 (size 8):
>   comm "swapper/0", pid 1, jiffies 4294672708 (age 973.931s)
>   hex dump (first 8 bytes):
>     32 33 37 3a 31 38 39 00                          237:189.
>   backtrace:
>     [<ffffffff81765d10>] __kmalloc_track_caller+0x1b0/0x320 mm/slub.c:4068
>     [<ffffffff816b37a9>] kstrdup+0x39/0x70 mm/util.c:53
>     [<ffffffff816b3826>] kstrdup_const+0x46/0x60 mm/util.c:74
>     [<ffffffff8194e5bb>] __kernfs_new_node+0x2b/0x2b0 fs/kernfs/dir.c:536
>     [<ffffffff81951c70>] kernfs_new_node+0x80/0xe0 fs/kernfs/dir.c:572
>     [<ffffffff81957223>] kernfs_create_link+0x33/0x150 fs/kernfs/symlink.c:32
>     [<ffffffff81959c4b>] sysfs_do_create_link_sd.isra.2+0x8b/0x120
> fs/sysfs/symlink.c:44
>     [<     inline     >] sysfs_do_create_link fs/sysfs/symlink.c:80
>     [<ffffffff81959d45>] sysfs_create_link+0x65/0xc0 fs/sysfs/symlink.c:92
>     [<     inline     >] device_create_sys_dev_entry drivers/base/core.c:974
>     [<ffffffff8358a05c>] device_add+0xf9c/0x1480 drivers/base/core.c:1105
>     [<ffffffff8358a55d>] device_register+0x1d/0x20 drivers/base/core.c:1189
>     [<ffffffff82f80b50>] tty_register_device_attr+0x320/0x760
> drivers/tty/tty_io.c:3312
>     [<     inline     >] tty_register_device drivers/tty/tty_io.c:3239
>     [<ffffffff82f8133b>] tty_register_driver+0x36b/0x670
> drivers/tty/tty_io.c:3504
>     [<ffffffff889babcb>] synclinkmp_init+0x35a/0x40e
> drivers/tty/synclinkmp.c:3992
>     [<ffffffff81002259>] do_one_initcall+0x159/0x380 init/main.c:794
>     [<     inline     >] do_initcall_level init/main.c:859
>     [<     inline     >] do_initcalls init/main.c:867
>     [<     inline     >] do_basic_setup init/main.c:885
>     [<ffffffff888fcc29>] kernel_init_freeable+0x474/0x52d init/main.c:1010
> unreferenced object 0xffff880067169ad0 (size 152):
>   comm "swapper/0", pid 1, jiffies 4294672708 (age 973.931s)
>   hex dump (first 32 bytes):
>     01 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
>     00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
>   backtrace:
>     [<ffffffff81761ba3>] kmem_cache_alloc+0x153/0x2e0 mm/slub.c:2609
>     [<     inline     >] kmem_cache_zalloc include/linux/slab.h:597
>     [<ffffffff8194e5fc>] __kernfs_new_node+0x6c/0x2b0 fs/kernfs/dir.c:540
>     [<ffffffff81951c70>] kernfs_new_node+0x80/0xe0 fs/kernfs/dir.c:572
>     [<ffffffff81957223>] kernfs_create_link+0x33/0x150 fs/kernfs/symlink.c:32
>     [<ffffffff81959c4b>] sysfs_do_create_link_sd.isra.2+0x8b/0x120
> fs/sysfs/symlink.c:44
>     [<     inline     >] sysfs_do_create_link fs/sysfs/symlink.c:80
>     [<ffffffff81959d45>] sysfs_create_link+0x65/0xc0 fs/sysfs/symlink.c:92
>     [<     inline     >] device_create_sys_dev_entry drivers/base/core.c:974
>     [<ffffffff8358a05c>] device_add+0xf9c/0x1480 drivers/base/core.c:1105
>     [<ffffffff8358a55d>] device_register+0x1d/0x20 drivers/base/core.c:1189
>     [<ffffffff82f80b50>] tty_register_device_attr+0x320/0x760
> drivers/tty/tty_io.c:3312
>     [<     inline     >] tty_register_device drivers/tty/tty_io.c:3239
>     [<ffffffff82f8133b>] tty_register_driver+0x36b/0x670
> drivers/tty/tty_io.c:3504
>     [<ffffffff889babcb>] synclinkmp_init+0x35a/0x40e
> drivers/tty/synclinkmp.c:3992
>     [<ffffffff81002259>] do_one_initcall+0x159/0x380 init/main.c:794
>     [<     inline     >] do_initcall_level init/main.c:859
>     [<     inline     >] do_initcalls init/main.c:867
>     [<     inline     >] do_basic_setup init/main.c:885
>     [<ffffffff888fcc29>] kernel_init_freeable+0x474/0x52d init/main.c:1010
>     [<ffffffff8663ca33>] kernel_init+0x13/0x150 init/main.c:936
>     [<ffffffff866617af>] ret_from_fork+0x3f/0x70 arch/x86/entry/entry_64.S:468
> unreferenced object 0xffff88006717eb40 (size 8):
>   comm "swapper/0", pid 1, jiffies 4294672709 (age 973.930s)
>   hex dump (first 8 bytes):
>     32 33 37 3a 31 39 30 00                          237:190.
>   backtrace:
>     [<ffffffff81765d10>] __kmalloc_track_caller+0x1b0/0x320 mm/slub.c:4068
>     [<ffffffff816b37a9>] kstrdup+0x39/0x70 mm/util.c:53
>     [<ffffffff816b3826>] kstrdup_const+0x46/0x60 mm/util.c:74
>     [<ffffffff8194e5bb>] __kernfs_new_node+0x2b/0x2b0 fs/kernfs/dir.c:536
>     [<ffffffff81951c70>] kernfs_new_node+0x80/0xe0 fs/kernfs/dir.c:572
>     [<ffffffff81957223>] kernfs_create_link+0x33/0x150 fs/kernfs/symlink.c:32
>     [<ffffffff81959c4b>] sysfs_do_create_link_sd.isra.2+0x8b/0x120
> fs/sysfs/symlink.c:44
>     [<     inline     >] sysfs_do_create_link fs/sysfs/symlink.c:80
>     [<ffffffff81959d45>] sysfs_create_link+0x65/0xc0 fs/sysfs/symlink.c:92
>     [<     inline     >] device_create_sys_dev_entry drivers/base/core.c:974
>     [<ffffffff8358a05c>] device_add+0xf9c/0x1480 drivers/base/core.c:1105
>     [<ffffffff8358a55d>] device_register+0x1d/0x20 drivers/base/core.c:1189
>     [<ffffffff82f80b50>] tty_register_device_attr+0x320/0x760
> drivers/tty/tty_io.c:3312
>     [<     inline     >] tty_register_device drivers/tty/tty_io.c:3239
>     [<ffffffff82f8133b>] tty_register_driver+0x36b/0x670
> drivers/tty/tty_io.c:3504
>     [<ffffffff889babcb>] synclinkmp_init+0x35a/0x40e
> drivers/tty/synclinkmp.c:3992
>     [<ffffffff81002259>] do_one_initcall+0x159/0x380 init/main.c:794
>     [<     inline     >] do_initcall_level init/main.c:859
>     [<     inline     >] do_initcalls init/main.c:867
>     [<     inline     >] do_basic_setup init/main.c:885
>     [<ffffffff888fcc29>] kernel_init_freeable+0x474/0x52d init/main.c:1010
> unreferenced object 0xffff8800363ffa80 (size 152):
>   comm "swapper/0", pid 1, jiffies 4294672709 (age 973.962s)
>   hex dump (first 32 bytes):
>     01 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
>     00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
>   backtrace:
>     [<ffffffff81761ba3>] kmem_cache_alloc+0x153/0x2e0 mm/slub.c:2609
>     [<     inline     >] kmem_cache_zalloc include/linux/slab.h:597
>     [<ffffffff8194e5fc>] __kernfs_new_node+0x6c/0x2b0 fs/kernfs/dir.c:540
>     [<ffffffff81951c70>] kernfs_new_node+0x80/0xe0 fs/kernfs/dir.c:572
>     [<ffffffff81957223>] kernfs_create_link+0x33/0x150 fs/kernfs/symlink.c:32
>     [<ffffffff81959c4b>] sysfs_do_create_link_sd.isra.2+0x8b/0x120
> fs/sysfs/symlink.c:44
>     [<     inline     >] sysfs_do_create_link fs/sysfs/symlink.c:80
>     [<ffffffff81959d45>] sysfs_create_link+0x65/0xc0 fs/sysfs/symlink.c:92
>     [<     inline     >] device_create_sys_dev_entry drivers/base/core.c:974
>     [<ffffffff8358a05c>] device_add+0xf9c/0x1480 drivers/base/core.c:1105
>     [<ffffffff8358a55d>] device_register+0x1d/0x20 drivers/base/core.c:1189
>     [<ffffffff82f80b50>] tty_register_device_attr+0x320/0x760
> drivers/tty/tty_io.c:3312
>     [<     inline     >] tty_register_device drivers/tty/tty_io.c:3239
>     [<ffffffff82f8133b>] tty_register_driver+0x36b/0x670
> drivers/tty/tty_io.c:3504
>     [<ffffffff889babcb>] synclinkmp_init+0x35a/0x40e
> drivers/tty/synclinkmp.c:3992
>     [<ffffffff81002259>] do_one_initcall+0x159/0x380 init/main.c:794
>     [<     inline     >] do_initcall_level init/main.c:859
>     [<     inline     >] do_initcalls init/main.c:867
>     [<     inline     >] do_basic_setup init/main.c:885
>     [<ffffffff888fcc29>] kernel_init_freeable+0x474/0x52d init/main.c:1010
>     [<ffffffff8663ca33>] kernel_init+0x13/0x150 init/main.c:936
>     [<ffffffff866617af>] ret_from_fork+0x3f/0x70 arch/x86/entry/entry_64.S:468
> unreferenced object 0xffff88006717ed20 (size 8):
>   comm "swapper/0", pid 1, jiffies 4294672711 (age 973.960s)
>   hex dump (first 8 bytes):
>     32 33 37 3a 31 39 31 00                          237:191.
>   backtrace:
>     [<ffffffff81765d10>] __kmalloc_track_caller+0x1b0/0x320 mm/slub.c:4068
>     [<ffffffff816b37a9>] kstrdup+0x39/0x70 mm/util.c:53
>     [<ffffffff816b3826>] kstrdup_const+0x46/0x60 mm/util.c:74
>     [<ffffffff8194e5bb>] __kernfs_new_node+0x2b/0x2b0 fs/kernfs/dir.c:536
>     [<ffffffff81951c70>] kernfs_new_node+0x80/0xe0 fs/kernfs/dir.c:572
>     [<ffffffff81957223>] kernfs_create_link+0x33/0x150 fs/kernfs/symlink.c:32
>     [<ffffffff81959c4b>] sysfs_do_create_link_sd.isra.2+0x8b/0x120
> fs/sysfs/symlink.c:44
>     [<     inline     >] sysfs_do_create_link fs/sysfs/symlink.c:80
>     [<ffffffff81959d45>] sysfs_create_link+0x65/0xc0 fs/sysfs/symlink.c:92
>     [<     inline     >] device_create_sys_dev_entry drivers/base/core.c:974
>     [<ffffffff8358a05c>] device_add+0xf9c/0x1480 drivers/base/core.c:1105
>     [<ffffffff8358a55d>] device_register+0x1d/0x20 drivers/base/core.c:1189
>     [<ffffffff82f80b50>] tty_register_device_attr+0x320/0x760
> drivers/tty/tty_io.c:3312
>     [<     inline     >] tty_register_device drivers/tty/tty_io.c:3239
>     [<ffffffff82f8133b>] tty_register_driver+0x36b/0x670
> drivers/tty/tty_io.c:3504
>     [<ffffffff889babcb>] synclinkmp_init+0x35a/0x40e
> drivers/tty/synclinkmp.c:3992
>     [<ffffffff81002259>] do_one_initcall+0x159/0x380 init/main.c:794
>     [<     inline     >] do_initcall_level init/main.c:859
>     [<     inline     >] do_initcalls init/main.c:867
>     [<     inline     >] do_basic_setup init/main.c:885
>     [<ffffffff888fcc29>] kernel_init_freeable+0x474/0x52d init/main.c:1010
> unreferenced object 0xffff8800671c3cf0 (size 152):
>   comm "swapper/0", pid 1, jiffies 4294672711 (age 973.960s)
>   hex dump (first 32 bytes):
>     01 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
>     00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
>   backtrace:
>     [<ffffffff81761ba3>] kmem_cache_alloc+0x153/0x2e0 mm/slub.c:2609
>     [<     inline     >] kmem_cache_zalloc include/linux/slab.h:597
>     [<ffffffff8194e5fc>] __kernfs_new_node+0x6c/0x2b0 fs/kernfs/dir.c:540
>     [<ffffffff81951c70>] kernfs_new_node+0x80/0xe0 fs/kernfs/dir.c:572
>     [<ffffffff81957223>] kernfs_create_link+0x33/0x150 fs/kernfs/symlink.c:32
>     [<ffffffff81959c4b>] sysfs_do_create_link_sd.isra.2+0x8b/0x120
> fs/sysfs/symlink.c:44
>     [<     inline     >] sysfs_do_create_link fs/sysfs/symlink.c:80
>     [<ffffffff81959d45>] sysfs_create_link+0x65/0xc0 fs/sysfs/symlink.c:92
>     [<     inline     >] device_create_sys_dev_entry drivers/base/core.c:974
>     [<ffffffff8358a05c>] device_add+0xf9c/0x1480 drivers/base/core.c:1105
>     [<ffffffff8358a55d>] device_register+0x1d/0x20 drivers/base/core.c:1189
>     [<ffffffff82f80b50>] tty_register_device_attr+0x320/0x760
> drivers/tty/tty_io.c:3312
>     [<     inline     >] tty_register_device drivers/tty/tty_io.c:3239
>     [<ffffffff82f8133b>] tty_register_driver+0x36b/0x670
> drivers/tty/tty_io.c:3504
>     [<ffffffff889babcb>] synclinkmp_init+0x35a/0x40e
> drivers/tty/synclinkmp.c:3992
>     [<ffffffff81002259>] do_one_initcall+0x159/0x380 init/main.c:794
>     [<     inline     >] do_initcall_level init/main.c:859
>     [<     inline     >] do_initcalls init/main.c:867
>     [<     inline     >] do_basic_setup init/main.c:885
>     [<ffffffff888fcc29>] kernel_init_freeable+0x474/0x52d init/main.c:1010
>     [<ffffffff8663ca33>] kernel_init+0x13/0x150 init/main.c:936
>     [<ffffffff866617af>] ret_from_fork+0x3f/0x70 arch/x86/entry/entry_64.S:468
> unreferenced object 0xffff88006ca7f5a0 (size 152):
>   comm "swapper/0", pid 1, jiffies 4294670068 (age 267.578s)
>   hex dump (first 32 bytes):
>     01 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
>     00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
>   backtrace:
>     [<ffffffff81761ba3>] kmem_cache_alloc+0x153/0x2e0 mm/slub.c:2609
>     [<     inline     >] kmem_cache_zalloc include/linux/slab.h:597
>     [<ffffffff8194e5fc>] __kernfs_new_node+0x6c/0x2b0 fs/kernfs/dir.c:540
>     [<ffffffff81951c70>] kernfs_new_node+0x80/0xe0 fs/kernfs/dir.c:572
>     [<ffffffff81957223>] kernfs_create_link+0x33/0x150 fs/kernfs/symlink.c:32
>     [<ffffffff81959c4b>] sysfs_do_create_link_sd.isra.2+0x8b/0x120
> fs/sysfs/symlink.c:44
>     [<     inline     >] sysfs_do_create_link fs/sysfs/symlink.c:80
>     [<ffffffff81959d45>] sysfs_create_link+0x65/0xc0 fs/sysfs/symlink.c:92
>     [<     inline     >] device_add_class_symlinks drivers/base/core.c:891
>     [<ffffffff835897fc>] device_add+0x73c/0x1480 drivers/base/core.c:1086
>     [<ffffffff8358a55d>] device_register+0x1d/0x20 drivers/base/core.c:1189
>     [<ffffffff82f80b50>] tty_register_device_attr+0x320/0x760
> drivers/tty/tty_io.c:3312
>     [<     inline     >] tty_register_device drivers/tty/tty_io.c:3239
>     [<ffffffff82f8133b>] tty_register_driver+0x36b/0x670
> drivers/tty/tty_io.c:3504
>     [<ffffffff889b171c>] vty_init+0x366/0x398 drivers/tty/vt/vt.c:3093
>     [<ffffffff889af3f6>] tty_init+0x146/0x14a drivers/tty/tty_io.c:3686
>     [<ffffffff889bb4e8>] chr_dev_init+0x12a/0x13c drivers/char/mem.c:869
>     [<ffffffff81002259>] do_one_initcall+0x159/0x380 init/main.c:794
>     [<     inline     >] do_initcall_level init/main.c:859
>     [<     inline     >] do_initcalls init/main.c:867
>     [<     inline     >] do_basic_setup init/main.c:885
>     [<ffffffff888fcc29>] kernel_init_freeable+0x474/0x52d init/main.c:1010
> unreferenced object 0xffff88003de48000 (size 2096):
>   comm "swapper/0", pid 0, jiffies 4294667421 (age 186.244s)
>   hex dump (first 32 bytes):
>     00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
>     00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
>   backtrace:
>     [<ffffffff81761ba3>] kmem_cache_alloc+0x153/0x2e0 mm/slub.c:2609
>     [<     inline     >] kmem_cache_zalloc include/linux/slab.h:597
>     [<     inline     >] __idr_pre_get lib/idr.c:196
>     [<ffffffff82be9253>] ida_pre_get+0x123/0x270 lib/idr.c:899
>     [<ffffffff82be9471>] ida_simple_get+0xd1/0x1d0 lib/idr.c:1096
>     [<ffffffff8194e622>] __kernfs_new_node+0x92/0x2b0 fs/kernfs/dir.c:544
>     [<ffffffff81952906>] kernfs_create_root+0xe6/0x2a0 fs/kernfs/dir.c:782
>     [<ffffffff88972422>] sysfs_init+0x18/0x8c fs/sysfs/mount.c:69
>     [<ffffffff8896d670>] mnt_init+0x1e0/0x42e fs/namespace.c:3143
>     [<ffffffff8896cfdc>] vfs_caches_init+0xaa/0x156 fs/dcache.c:3461
>     [<ffffffff888fc718>] start_kernel+0x60c/0x6a9 init/main.c:659
>     [<ffffffff888fb350>] x86_64_start_reservations+0x38/0x3a
> arch/x86/kernel/head64.c:203
>     [<ffffffff888fb4aa>] x86_64_start_kernel+0x158/0x167
> arch/x86/kernel/head64.c:184
>     [<ffffffffffffffff>] 0xffffffffffffffff
> unreferenced object 0xffff88003de4ee58 (size 2096):
>   comm "swapper/0", pid 0, jiffies 4294667421 (age 186.244s)
>   hex dump (first 32 bytes):
>     00 00 00 00 00 00 00 00 00 80 e4 3d 00 88 ff ff  ...........=....
>     00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
>   backtrace:
>     [<ffffffff81761ba3>] kmem_cache_alloc+0x153/0x2e0 mm/slub.c:2609
>     [<     inline     >] kmem_cache_zalloc include/linux/slab.h:597
>     [<     inline     >] __idr_pre_get lib/idr.c:196
>     [<ffffffff82be9253>] ida_pre_get+0x123/0x270 lib/idr.c:899
>     [<ffffffff82be9471>] ida_simple_get+0xd1/0x1d0 lib/idr.c:1096
>     [<ffffffff8194e622>] __kernfs_new_node+0x92/0x2b0 fs/kernfs/dir.c:544
>     [<ffffffff81952906>] kernfs_create_root+0xe6/0x2a0 fs/kernfs/dir.c:782
>     [<ffffffff88972422>] sysfs_init+0x18/0x8c fs/sysfs/mount.c:69
>     [<ffffffff8896d670>] mnt_init+0x1e0/0x42e fs/namespace.c:3143
>     [<ffffffff8896cfdc>] vfs_caches_init+0xaa/0x156 fs/dcache.c:3461
>     [<ffffffff888fc718>] start_kernel+0x60c/0x6a9 init/main.c:659
>     [<ffffffff888fb350>] x86_64_start_reservations+0x38/0x3a
> arch/x86/kernel/head64.c:203
>     [<ffffffff888fb4aa>] x86_64_start_kernel+0x158/0x167
> arch/x86/kernel/head64.c:184
>     [<ffffffffffffffff>] 0xffffffffffffffff


+Catalin (kmemleak maintainer)

I noticed a weird thing. I am not 100% sure, but it seems that the
leaks are reported iff I run leak checking concurrently with the
programs running. If I run the program several thousand times and
then run leak checking, no leaks are reported.

Catalin, is it possible that this is a kmemleak false positive?

I see that kmemleak just scans thread stacks one-by-one. I would
expect that kmemleak should stop all threads, then scan all stacks and
all registers of all threads, and then restart threads. If it does not
scan registers or does not stop threads, then I think it should be
possible that a pointer value can sneak off kmemleak. Does it make
sense?

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>


* Re: tty: memory leak in tty_register_driver
  2016-02-28 16:42 ` tty: memory leak in tty_register_driver Dmitry Vyukov
@ 2016-02-28 23:47   ` Catalin Marinas
  2016-02-29 10:22     ` Dmitry Vyukov
  0 siblings, 1 reply; 7+ messages in thread
From: Catalin Marinas @ 2016-02-28 23:47 UTC (permalink / raw)
  To: Dmitry Vyukov
  Cc: Greg Kroah-Hartman, Jiri Slaby, LKML, Peter Hurley,
	One Thousand Gnomes, J Freyensee, linux-mm, Paul Bolle,
	Alexander Potapenko, Kostya Serebryany, Sasha Levin, syzkaller

On Sun, Feb 28, 2016 at 05:42:24PM +0100, Dmitry Vyukov wrote:
> On Mon, Feb 15, 2016 at 11:42 AM, Dmitry Vyukov <dvyukov@google.com> wrote:
> > When I am running the following program in a parallel loop, kmemleak
> > starts reporting memory leaks of objects allocated in
> > tty_register_driver during boot. These leaks start popping up
> > chaotically and as you can see they originate in different drivers
> > (synclinkmp_init, isdn_init, chr_dev_init, sysfs_init).
> >
> > On commit 388f7b1d6e8ca06762e2454d28d6c3c55ad0fe95 (4.5-rc3).
[...]
> > unreferenced object 0xffff88006708dc20 (size 8):
> >   comm "swapper/0", pid 1, jiffies 4294672590 (age 930.839s)
> >   hex dump (first 8 bytes):
> >     74 74 79 53 4c 4d 38 00                          ttySLM8.
> >   backtrace:
> >     [<ffffffff81765d10>] __kmalloc_track_caller+0x1b0/0x320 mm/slub.c:4068
> >     [<ffffffff816b37a9>] kstrdup+0x39/0x70 mm/util.c:53
> >     [<ffffffff816b3826>] kstrdup_const+0x46/0x60 mm/util.c:74
> >     [<ffffffff8194e5bb>] __kernfs_new_node+0x2b/0x2b0 fs/kernfs/dir.c:536
> >     [<ffffffff81951c70>] kernfs_new_node+0x80/0xe0 fs/kernfs/dir.c:572
> >     [<ffffffff81957223>] kernfs_create_link+0x33/0x150 fs/kernfs/symlink.c:32
> >     [<ffffffff81959c4b>] sysfs_do_create_link_sd.isra.2+0x8b/0x120
[...]
> +Catalin (kmemleak maintainer)
> 
> I noticed a weird thing. I am not 100% sure, but it seems that the
> leaks are reported iff I run leak checking concurrently with the
> programs running. If I run the program several thousand times and
> then run leak checking, no leaks are reported.
>
> Catalin, is it possible that this is a kmemleak false positive?

Yes, it's possible. If you run kmemleak scanning continuously (or at
very short intervals), especially in parallel with some intensive
tasks, it will miss pointers that may be stored in registers (on other
CPUs) or moved between task stacks or other memory locations. Linked
lists are especially prone to such false positives.

Kmemleak tries to work around this by checksumming each object, so an
object is only reported if it hasn't changed between two consecutive
scans. Since the default scanning interval is 10 minutes, false
positives are very unlikely in such scenarios. However, if you reduce
the scanning interval (or trigger scans manually in a loop), you can
hit this condition.
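
The checksum heuristic described above can be modelled roughly as
follows. This is a simplified user-space sketch, not kmemleak's own
code: struct tracked_object, object_checksum() and
scan_should_report() are hypothetical names, and FNV-1a is only a
stand-in for the checksum the kernel uses. It also assumes that a
freshly tracked object starts with a stored checksum of 0, so its
first scan always counts as a change.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical model of one allocation tracked by a leak detector. */
struct tracked_object {
    const unsigned char *data; /* start of the allocation */
    size_t size;               /* size of the allocation in bytes */
    uint32_t checksum;         /* checksum from the previous scan, 0 if none */
};

/* FNV-1a over the object contents (stand-in checksum, an assumption). */
static uint32_t object_checksum(const unsigned char *data, size_t size)
{
    uint32_t h = 2166136261u;

    for (size_t i = 0; i < size; i++) {
        h ^= data[i];
        h *= 16777619u;
    }
    return h;
}

/*
 * Decide whether one scan should report the object as a suspected leak.
 * found_reference says whether this scan saw a pointer to the object.
 * An unreferenced object is reported only if its contents are identical
 * to what the previous scan saw; a changed object is assumed to be in
 * use and gets another chance on the next scan.
 */
static bool scan_should_report(struct tracked_object *obj, bool found_reference)
{
    uint32_t now = object_checksum(obj->data, obj->size);
    bool changed = (now != obj->checksum);

    obj->checksum = now; /* remember the contents for the next scan */
    return !found_reference && !changed;
}
```

In this model an unreferenced object is never reported on its first
scan, only on a later scan that finds it both unreferenced and
unmodified. That is why back-to-back scans racing with a busy workload
are more likely to produce false reports than scans spaced far apart.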

> I see that kmemleak just scans thread stacks one-by-one. I would
> expect that kmemleak should stop all threads, then scan all stacks and
> all registers of all threads, and then restart threads. If it does not
> scan registers or does not stop threads, then I think it should be
> possible that a pointer value can sneak off kmemleak. Does it make
> sense?

Given how long it takes to scan the memory, stopping the threads is not
really feasible. You could do something like stop_machine() only for
scanning the current stack on all CPUs but it still wouldn't catch
pointers being moved around in memory unless you stop the system
completely for a full scan. The heuristic about periodic scanning and
checksumming seems to work fine in normal usage scenarios.

For your tests, I would recommend that you run the tests for a long(ish)
time and only do two kmemleak scans at the end, after they have finished
(and with a few seconds' delay between them). Continuous scanning is
less reliable.

-- 
Catalin

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

* Re: tty: memory leak in tty_register_driver
  2016-02-28 23:47   ` Catalin Marinas
@ 2016-02-29 10:22     ` Dmitry Vyukov
  2016-02-29 10:24       ` Dmitry Vyukov
  2016-02-29 11:34       ` Catalin Marinas
  0 siblings, 2 replies; 7+ messages in thread
From: Dmitry Vyukov @ 2016-02-29 10:22 UTC (permalink / raw)
  To: Catalin Marinas
  Cc: Greg Kroah-Hartman, Jiri Slaby, LKML, Peter Hurley,
	One Thousand Gnomes, J Freyensee, linux-mm, Paul Bolle,
	Alexander Potapenko, Kostya Serebryany, Sasha Levin, syzkaller

On Mon, Feb 29, 2016 at 12:47 AM, Catalin Marinas
<catalin.marinas@arm.com> wrote:
> On Sun, Feb 28, 2016 at 05:42:24PM +0100, Dmitry Vyukov wrote:
>> On Mon, Feb 15, 2016 at 11:42 AM, Dmitry Vyukov <dvyukov@google.com> wrote:
>> > When I am running the following program in a parallel loop, kmemleak
>> > starts reporting memory leaks of objects allocated in
>> > tty_register_driver during boot. These leaks start popping up
>> > chaotically and as you can see they originate in different drivers
>> > (synclinkmp_init, isdn_init, chr_dev_init, sysfs_init).
>> >
>> > On commit 388f7b1d6e8ca06762e2454d28d6c3c55ad0fe95 (4.5-rc3).
> [...]
>> > unreferenced object 0xffff88006708dc20 (size 8):
>> >   comm "swapper/0", pid 1, jiffies 4294672590 (age 930.839s)
>> >   hex dump (first 8 bytes):
>> >     74 74 79 53 4c 4d 38 00                          ttySLM8.
>> >   backtrace:
>> >     [<ffffffff81765d10>] __kmalloc_track_caller+0x1b0/0x320 mm/slub.c:4068
>> >     [<ffffffff816b37a9>] kstrdup+0x39/0x70 mm/util.c:53
>> >     [<ffffffff816b3826>] kstrdup_const+0x46/0x60 mm/util.c:74
>> >     [<ffffffff8194e5bb>] __kernfs_new_node+0x2b/0x2b0 fs/kernfs/dir.c:536
>> >     [<ffffffff81951c70>] kernfs_new_node+0x80/0xe0 fs/kernfs/dir.c:572
>> >     [<ffffffff81957223>] kernfs_create_link+0x33/0x150 fs/kernfs/symlink.c:32
>> >     [<ffffffff81959c4b>] sysfs_do_create_link_sd.isra.2+0x8b/0x120
> [...]
>> +Catalin (kmemleak maintainer)
>>
>> I am noticed a weird thing. I am not 100% sure but it seems that the
>> leaks are reported iff I run leak checking concurrently with the
>> programs running. And if I run the program several thousand times and
>> then run leak checking, then no leaks reported.
>>
>> Catalin, it is possible that it is a kmemleak false positive?
>
> Yes, it's possible. If you run kmemleak scanning continuously (or at
> very short intervals) and especially in parallel with some intensive
> tasks, it will miss pointers that may be stored in registers (on other
> CPUs) or moved between task stacks, other memory locations. Linked lists
> are especially prone to such false positives.
>
> Kmemleak tries to work around this by checksumming each object, so it
> will only be reported if it hasn't changed on two consecutive scans.
> Since the default scanning is 10min, it is very unlikely to trigger
> false positives in such scenarios. However, if you reduce the scanning
> time (or trigger it manually in a loop), you can hit this condition.
>
>> I see that kmemleak just scans thread stacks one-by-one. I would
>> expect that kmemleak should stop all threads, then scan all stacks and
>> all registers of all threads, and then restart threads. If it does not
>> scan registers or does not stop threads, then I think it should be
>> possible that a pointer value can sneak off kmemleak. Does it make
>> sense?
>
> Given how long it takes to scan the memory, stopping the threads is not
> really feasible. You could do something like stop_machine() only for
> scanning the current stack on all CPUs but it still wouldn't catch
> pointers being moved around in memory unless you stop the system
> completely for a full scan. The heuristic about periodic scanning and
> checksumming seems to work fine in normal usage scenarios.
>
> For your tests, I would recommend that you run the tests for a long(ish)
> time and only do two kmemleak scans at the end after they finished (and
> with a few seconds delay between them). Continuous scanning is less
> reliable.


Thanks for the explanation, Catalin!

Let me describe my usage scenario first. I am running automated
testing 24x7. Currently a VM executes a dozen small programs (about a
dozen syscalls each), and then I trigger a manual leak scan. I can't run
significantly more programs between scans, because then I won't be
able to reconstruct reproducers for the bugs, and the reports will be
unactionable. I could run leak checking after each program, but that
would increase overhead significantly. So a dozen programs is a
trade-off. I also disable automatic scanning.

False positives are super unpleasant in automated testing. If a tool's
false positive rate is high, I just disable it; it is unusable. It is
not that bad for leak checking, but each false positive consumes human
(my) time.

So I need to run scanning twice, because the first one never reports leaks.

For the false positives due to registers/pointer jumping, will it help
if I run scanning one more time if leaks are detected? I mean: run
scanning twice; if leaks are found, sleep for several seconds and run
scanning a third time. Since leaks are usually not detected, I can
afford to sleep more and do one or two additional scans. The question
here: will kmemleak _remove_ an object from the list of leaked objects
if it discovers on a subsequent scan that the object is reachable or
that its contents have changed?

Regarding stopping all threads and doing a proper scan, why is it not
feasible? Will the kernel break if we stall all CPUs for seconds? In
automated testing scenarios a machine stalled for several seconds is
not a problem. On the other hand, the absence of false positives is a
must. It would also improve testing throughput, because we would not
need the sleep and the second scan.

* Re: tty: memory leak in tty_register_driver
  2016-02-29 10:22     ` Dmitry Vyukov
@ 2016-02-29 10:24       ` Dmitry Vyukov
  2016-02-29 11:34       ` Catalin Marinas
  1 sibling, 0 replies; 7+ messages in thread
From: Dmitry Vyukov @ 2016-02-29 10:24 UTC (permalink / raw)
  To: Catalin Marinas
  Cc: Greg Kroah-Hartman, Jiri Slaby, LKML, Peter Hurley,
	One Thousand Gnomes, J Freyensee, linux-mm, Paul Bolle,
	Alexander Potapenko, Kostya Serebryany, Sasha Levin, syzkaller

On Mon, Feb 29, 2016 at 11:22 AM, Dmitry Vyukov <dvyukov@google.com> wrote:
> On Mon, Feb 29, 2016 at 12:47 AM, Catalin Marinas
> <catalin.marinas@arm.com> wrote:
>> On Sun, Feb 28, 2016 at 05:42:24PM +0100, Dmitry Vyukov wrote:
>>> On Mon, Feb 15, 2016 at 11:42 AM, Dmitry Vyukov <dvyukov@google.com> wrote:
>>> > When I am running the following program in a parallel loop, kmemleak
>>> > starts reporting memory leaks of objects allocated in
>>> > tty_register_driver during boot. These leaks start popping up
>>> > chaotically and as you can see they originate in different drivers
>>> > (synclinkmp_init, isdn_init, chr_dev_init, sysfs_init).
>>> >
>>> > On commit 388f7b1d6e8ca06762e2454d28d6c3c55ad0fe95 (4.5-rc3).
>> [...]
>>> > unreferenced object 0xffff88006708dc20 (size 8):
>>> >   comm "swapper/0", pid 1, jiffies 4294672590 (age 930.839s)
>>> >   hex dump (first 8 bytes):
>>> >     74 74 79 53 4c 4d 38 00                          ttySLM8.
>>> >   backtrace:
>>> >     [<ffffffff81765d10>] __kmalloc_track_caller+0x1b0/0x320 mm/slub.c:4068
>>> >     [<ffffffff816b37a9>] kstrdup+0x39/0x70 mm/util.c:53
>>> >     [<ffffffff816b3826>] kstrdup_const+0x46/0x60 mm/util.c:74
>>> >     [<ffffffff8194e5bb>] __kernfs_new_node+0x2b/0x2b0 fs/kernfs/dir.c:536
>>> >     [<ffffffff81951c70>] kernfs_new_node+0x80/0xe0 fs/kernfs/dir.c:572
>>> >     [<ffffffff81957223>] kernfs_create_link+0x33/0x150 fs/kernfs/symlink.c:32
>>> >     [<ffffffff81959c4b>] sysfs_do_create_link_sd.isra.2+0x8b/0x120
>> [...]
>>> +Catalin (kmemleak maintainer)
>>>
>>> I am noticed a weird thing. I am not 100% sure but it seems that the
>>> leaks are reported iff I run leak checking concurrently with the
>>> programs running. And if I run the program several thousand times and
>>> then run leak checking, then no leaks reported.
>>>
>>> Catalin, it is possible that it is a kmemleak false positive?
>>
>> Yes, it's possible. If you run kmemleak scanning continuously (or at
>> very short intervals) and especially in parallel with some intensive
>> tasks, it will miss pointers that may be stored in registers (on other
>> CPUs) or moved between task stacks, other memory locations. Linked lists
>> are especially prone to such false positives.
>>
>> Kmemleak tries to work around this by checksumming each object, so it
>> will only be reported if it hasn't changed on two consecutive scans.
>> Since the default scanning is 10min, it is very unlikely to trigger
>> false positives in such scenarios. However, if you reduce the scanning
>> time (or trigger it manually in a loop), you can hit this condition.
>>
>>> I see that kmemleak just scans thread stacks one-by-one. I would
>>> expect that kmemleak should stop all threads, then scan all stacks and
>>> all registers of all threads, and then restart threads. If it does not
>>> scan registers or does not stop threads, then I think it should be
>>> possible that a pointer value can sneak off kmemleak. Does it make
>>> sense?
>>
>> Given how long it takes to scan the memory, stopping the threads is not
>> really feasible. You could do something like stop_machine() only for
>> scanning the current stack on all CPUs but it still wouldn't catch
>> pointers being moved around in memory unless you stop the system
>> completely for a full scan. The heuristic about periodic scanning and
>> checksumming seems to work fine in normal usage scenarios.
>>
>> For your tests, I would recommend that you run the tests for a long(ish)
>> time and only do two kmemleak scans at the end after they finished (and
>> with a few seconds delay between them). Continuous scanning is less
>> reliable.
>
>
> Thanks for the explanation, Catalin!
>
> Let me describe my usage scenario first. I am running automatic
> testing 24x7. Currently a VM executes a dozen of small programs (a
> dozen of syscalls each), then I run manual leak scanning. I can't run
> significantly more programs between scans, because then I won't be
> able to restore reproducers for bugs and they will be unactionable. I
> could run leak checking after each program, but it will increase
> overhead significantly. So a dozen of programs is a trade-off. And I
> disable automatic scanning.
>
> False positives are super unpleasant in automatic testing. If a tool
> false positive rate if high, I just disable it, it is unusable. It is
> not that bad for leak checking. But each false positive consumes human
> (my) time.
>
> So I need to run scanning twice, because the first one never reports leaks.
>
> For the false positives due to registers/pointer jumping, will it help
> if I run scanning one more time if leaks are detected? I mean: run
> scanning twice, if leaks are found sleep for several seconds and run
> scanning third time. Since leaks are usually not detected I can afford
> to sleep more and do one or two additional scans. The question here:
> will kmemleak _remove_ an object for leaked objects, if it discovered
> reachable or contents change on subsequent scans?
>
> Regarding stopping all threads and doing proper scan, why is not it
> feasible? Will kernel break if we stall all CPUs for seconds? In
> automatic testing scenarios a stalled for several seconds machine is
> not a problem. But on the other hand, absence of false positives is a
> must. And it would improve testing bandwidth, because we don't need
> sleep and second scan.



Paul, regarding this particular leak, let's consider it a kmemleak false
positive (until proven otherwise).

* Re: tty: memory leak in tty_register_driver
  2016-02-29 10:22     ` Dmitry Vyukov
  2016-02-29 10:24       ` Dmitry Vyukov
@ 2016-02-29 11:34       ` Catalin Marinas
  2016-03-01 15:27         ` Dmitry Vyukov
  1 sibling, 1 reply; 7+ messages in thread
From: Catalin Marinas @ 2016-02-29 11:34 UTC (permalink / raw)
  To: Dmitry Vyukov
  Cc: Greg Kroah-Hartman, Jiri Slaby, LKML, Peter Hurley,
	One Thousand Gnomes, J Freyensee, linux-mm, Paul Bolle,
	Alexander Potapenko, Kostya Serebryany, Sasha Levin, syzkaller

On Mon, Feb 29, 2016 at 11:22:58AM +0100, Dmitry Vyukov wrote:
> On Mon, Feb 29, 2016 at 12:47 AM, Catalin Marinas
> <catalin.marinas@arm.com> wrote:
> > On Sun, Feb 28, 2016 at 05:42:24PM +0100, Dmitry Vyukov wrote:
> >> On Mon, Feb 15, 2016 at 11:42 AM, Dmitry Vyukov <dvyukov@google.com> wrote:
> >> > When I am running the following program in a parallel loop, kmemleak
> >> > starts reporting memory leaks of objects allocated in
> >> > tty_register_driver during boot. These leaks start popping up
> >> > chaotically and as you can see they originate in different drivers
> >> > (synclinkmp_init, isdn_init, chr_dev_init, sysfs_init).
> >> >
> >> > On commit 388f7b1d6e8ca06762e2454d28d6c3c55ad0fe95 (4.5-rc3).
> > [...]
> >> > unreferenced object 0xffff88006708dc20 (size 8):
> >> >   comm "swapper/0", pid 1, jiffies 4294672590 (age 930.839s)
> >> >   hex dump (first 8 bytes):
> >> >     74 74 79 53 4c 4d 38 00                          ttySLM8.
> >> >   backtrace:
> >> >     [<ffffffff81765d10>] __kmalloc_track_caller+0x1b0/0x320 mm/slub.c:4068
> >> >     [<ffffffff816b37a9>] kstrdup+0x39/0x70 mm/util.c:53
> >> >     [<ffffffff816b3826>] kstrdup_const+0x46/0x60 mm/util.c:74
> >> >     [<ffffffff8194e5bb>] __kernfs_new_node+0x2b/0x2b0 fs/kernfs/dir.c:536
> >> >     [<ffffffff81951c70>] kernfs_new_node+0x80/0xe0 fs/kernfs/dir.c:572
> >> >     [<ffffffff81957223>] kernfs_create_link+0x33/0x150 fs/kernfs/symlink.c:32
> >> >     [<ffffffff81959c4b>] sysfs_do_create_link_sd.isra.2+0x8b/0x120
> > [...]
> >> +Catalin (kmemleak maintainer)
> >>
> >> I am noticed a weird thing. I am not 100% sure but it seems that the
> >> leaks are reported iff I run leak checking concurrently with the
> >> programs running. And if I run the program several thousand times and
> >> then run leak checking, then no leaks reported.
> >>
> >> Catalin, it is possible that it is a kmemleak false positive?
> >
> > Yes, it's possible. If you run kmemleak scanning continuously (or at
> > very short intervals) and especially in parallel with some intensive
> > tasks, it will miss pointers that may be stored in registers (on other
> > CPUs) or moved between task stacks, other memory locations. Linked lists
> > are especially prone to such false positives.
> >
> > Kmemleak tries to work around this by checksumming each object, so it
> > will only be reported if it hasn't changed on two consecutive scans.
> > Since the default scanning is 10min, it is very unlikely to trigger
> > false positives in such scenarios. However, if you reduce the scanning
> > time (or trigger it manually in a loop), you can hit this condition.
> >
> >> I see that kmemleak just scans thread stacks one-by-one. I would
> >> expect that kmemleak should stop all threads, then scan all stacks and
> >> all registers of all threads, and then restart threads. If it does not
> >> scan registers or does not stop threads, then I think it should be
> >> possible that a pointer value can sneak off kmemleak. Does it make
> >> sense?
> >
> > Given how long it takes to scan the memory, stopping the threads is not
> > really feasible. You could do something like stop_machine() only for
> > scanning the current stack on all CPUs but it still wouldn't catch
> > pointers being moved around in memory unless you stop the system
> > completely for a full scan. The heuristic about periodic scanning and
> > checksumming seems to work fine in normal usage scenarios.
> >
> > For your tests, I would recommend that you run the tests for a long(ish)
> > time and only do two kmemleak scans at the end after they finished (and
> > with a few seconds delay between them). Continuous scanning is less
> > reliable.
> 
> Let me describe my usage scenario first. I am running automatic
> testing 24x7. Currently a VM executes a dozen of small programs (a
> dozen of syscalls each), then I run manual leak scanning.

IIUC, you said that the leak reporting happens iff you run leak checking
concurrently with the test programs running. If you run the kmemleak
scanning afterwards, there are no leaks reported. As I explained, that's
the normal usage I would expect.

> I can't run significantly more programs between scans, because then I
> won't be able to restore reproducers for bugs and they will be
> unactionable. I could run leak checking after each program, but it
> will increase overhead significantly. So a dozen of programs is a
> trade-off. And I disable automatic scanning.

That's fine, not an issue here.

> False positives are super unpleasant in automatic testing. If a tool
> false positive rate if high, I just disable it, it is unusable. It is
> not that bad for leak checking. But each false positive consumes human
> (my) time.

Indeed. I've spent a significant amount of time in the early kmemleak
days just trying to prove whether something is a real leak or not, but
these days, with the automatic scanning, the false positive ratio seems
to be very low.

> So I need to run scanning twice, because the first one never reports leaks.

That's because of the checksumming. For example, you have objects stored
in a list. On some delete or insert, the list_head is temporarily
modified, possibly stored in CPU registers on another CPU. For this
brief time, kmemleak may no longer detect a pointer to the rest of the
list, hence reporting a big part of it as leaked objects.

Since list deletion/insertion requires a modification of an object's
list_head, its checksum changes. If this value has changed since the
previous scan, kmemleak will not report the object, assuming it is
something transient. If you run two scans in quick succession, it is
possible that the transient condition hasn't cleared yet, so you risk a
false positive. Of course, I could place some random delay in kmemleak
between successive scans but I assumed that most people would leave it
running on the default 10min scan.
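
To make the heuristic concrete, here is a minimal userspace model of
it. It is a sketch under assumptions, not the kmemleak implementation:
the struct, the function names, and the use of FNV-1a as the checksum
are all chosen for illustration only.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Simplified stand-in for a tracked allocation (not the kernel's struct). */
struct tracked_obj {
    const unsigned char *data;   /* object contents */
    size_t size;
    uint32_t checksum;           /* checksum recorded by the previous scan */
    bool referenced;             /* did the current scan find a pointer to it? */
};

/* FNV-1a, used here only as a simple illustrative checksum. */
static uint32_t checksum32(const unsigned char *p, size_t n)
{
    uint32_t h = 2166136261u;

    for (size_t i = 0; i < n; i++) {
        h ^= p[i];
        h *= 16777619u;
    }
    return h;
}

/* Recompute the checksum; return true if it differs from the last scan. */
static bool update_checksum(struct tracked_obj *o)
{
    uint32_t old = o->checksum;

    o->checksum = checksum32(o->data, o->size);
    return o->checksum != old;
}

/* Report only objects that are unreferenced AND unchanged since last scan. */
static bool should_report(struct tracked_obj *o)
{
    bool changed = update_checksum(o);

    return !o->referenced && !changed;
}
```

In this model the stored checksum starts at zero, so the first scan
practically always sees a change, which is consistent with a single
scan reporting nothing.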

I had other heuristics like object age but checksumming proved to be the
most efficient, with the minor drawback that you'd have to run the
scanning twice before reporting. And any kind of delayed reporting (e.g.
X secs since the last checksum modification) would most likely break
your testing workflow.

> For the false positives due to registers/pointer jumping, will it help
> if I run scanning one more time if leaks are detected? I mean: run
> scanning twice, if leaks are found sleep for several seconds and run
> scanning third time. Since leaks are usually not detected I can afford
> to sleep more and do one or two additional scans.

This should work.

> The question here: will kmemleak _remove_ an object for leaked
> objects, if it discovered reachable or contents change on subsequent
> scans?

Yes, it will remove them from /sys/kernel/debug/kmemleak even if they
were previously reported.

> Regarding stopping all threads and doing proper scan, why is not it
> feasible? Will kernel break if we stall all CPUs for seconds? In
> automatic testing scenarios a stalled for several seconds machine is
> not a problem. But on the other hand, absence of false positives is a
> must. And it would improve testing bandwidth, because we don't need
> sleep and second scan.

Scanning time is the main issue, with a scan taking minutes on some slow
ARM machines (my primary testing target). Such timing was significantly
improved with commit 93ada579b0ee ("mm: kmemleak: optimise kmemleak_lock
acquiring during kmemleak_scan") but even if it is a few seconds, it is
not suitable for a live, interactive system.

What we could do though, since you already trigger the scanning
manually, is to add a "stopscan" command that you echo into
/sys/kernel/debug/kmemleak and that performs a stop_machine() during
memory scanning. If you have time, please feel free to give it a try ;).

-- 
Catalin

* Re: tty: memory leak in tty_register_driver
  2016-02-29 11:34       ` Catalin Marinas
@ 2016-03-01 15:27         ` Dmitry Vyukov
  2016-03-01 15:55           ` Catalin Marinas
  0 siblings, 1 reply; 7+ messages in thread
From: Dmitry Vyukov @ 2016-03-01 15:27 UTC (permalink / raw)
  To: Catalin Marinas
  Cc: Greg Kroah-Hartman, Jiri Slaby, LKML, Peter Hurley,
	One Thousand Gnomes, J Freyensee, linux-mm, Paul Bolle,
	Alexander Potapenko, Kostya Serebryany, Sasha Levin, syzkaller

On Mon, Feb 29, 2016 at 12:34 PM, Catalin Marinas
<catalin.marinas@arm.com> wrote:
> On Mon, Feb 29, 2016 at 11:22:58AM +0100, Dmitry Vyukov wrote:
>> On Mon, Feb 29, 2016 at 12:47 AM, Catalin Marinas
>> <catalin.marinas@arm.com> wrote:
>> > On Sun, Feb 28, 2016 at 05:42:24PM +0100, Dmitry Vyukov wrote:
>> >> On Mon, Feb 15, 2016 at 11:42 AM, Dmitry Vyukov <dvyukov@google.com> wrote:
>> >> > When I am running the following program in a parallel loop, kmemleak
>> >> > starts reporting memory leaks of objects allocated in
>> >> > tty_register_driver during boot. These leaks start popping up
>> >> > chaotically and as you can see they originate in different drivers
>> >> > (synclinkmp_init, isdn_init, chr_dev_init, sysfs_init).
>> >> >
>> >> > On commit 388f7b1d6e8ca06762e2454d28d6c3c55ad0fe95 (4.5-rc3).
>> > [...]
>> >> > unreferenced object 0xffff88006708dc20 (size 8):
>> >> >   comm "swapper/0", pid 1, jiffies 4294672590 (age 930.839s)
>> >> >   hex dump (first 8 bytes):
>> >> >     74 74 79 53 4c 4d 38 00                          ttySLM8.
>> >> >   backtrace:
>> >> >     [<ffffffff81765d10>] __kmalloc_track_caller+0x1b0/0x320 mm/slub.c:4068
>> >> >     [<ffffffff816b37a9>] kstrdup+0x39/0x70 mm/util.c:53
>> >> >     [<ffffffff816b3826>] kstrdup_const+0x46/0x60 mm/util.c:74
>> >> >     [<ffffffff8194e5bb>] __kernfs_new_node+0x2b/0x2b0 fs/kernfs/dir.c:536
>> >> >     [<ffffffff81951c70>] kernfs_new_node+0x80/0xe0 fs/kernfs/dir.c:572
>> >> >     [<ffffffff81957223>] kernfs_create_link+0x33/0x150 fs/kernfs/symlink.c:32
>> >> >     [<ffffffff81959c4b>] sysfs_do_create_link_sd.isra.2+0x8b/0x120
>> > [...]
>> >> +Catalin (kmemleak maintainer)
>> >>
>> >> I am noticed a weird thing. I am not 100% sure but it seems that the
>> >> leaks are reported iff I run leak checking concurrently with the
>> >> programs running. And if I run the program several thousand times and
>> >> then run leak checking, then no leaks reported.
>> >>
>> >> Catalin, it is possible that it is a kmemleak false positive?
>> >
>> > Yes, it's possible. If you run kmemleak scanning continuously (or at
>> > very short intervals) and especially in parallel with some intensive
>> > tasks, it will miss pointers that may be stored in registers (on other
>> > CPUs) or moved between task stacks, other memory locations. Linked lists
>> > are especially prone to such false positives.
>> >
>> > Kmemleak tries to work around this by checksumming each object, so it
>> > will only be reported if it hasn't changed on two consecutive scans.
>> > Since the default scanning is 10min, it is very unlikely to trigger
>> > false positives in such scenarios. However, if you reduce the scanning
>> > time (or trigger it manually in a loop), you can hit this condition.
>> >
>> >> I see that kmemleak just scans thread stacks one-by-one. I would
>> >> expect that kmemleak should stop all threads, then scan all stacks and
>> >> all registers of all threads, and then restart threads. If it does not
>> >> scan registers or does not stop threads, then I think it should be
>> >> possible that a pointer value can sneak off kmemleak. Does it make
>> >> sense?
>> >
>> > Given how long it takes to scan the memory, stopping the threads is not
>> > really feasible. You could do something like stop_machine() only for
>> > scanning the current stack on all CPUs but it still wouldn't catch
>> > pointers being moved around in memory unless you stop the system
>> > completely for a full scan. The heuristic about periodic scanning and
>> > checksumming seems to work fine in normal usage scenarios.
>> >
>> > For your tests, I would recommend that you run the tests for a long(ish)
>> > time and only do two kmemleak scans at the end after they finished (and
>> > with a few seconds delay between them). Continuous scanning is less
>> > reliable.
>>
>> Let me describe my usage scenario first. I am running automatic
>> testing 24x7. Currently a VM executes a dozen of small programs (a
>> dozen of syscalls each), then I run manual leak scanning.
>
> IIUC, you said that the leak reporting happens iff you run leak checking
> concurrently with the test programs running. If you run the kmemleak
> scanning afterwards, there are no leaks reported. As I explained, that's
> the normal usage I would expect.
>
>> I can't run significantly more programs between scans, because then I
>> won't be able to restore reproducers for bugs and they will be
>> unactionable. I could run leak checking after each program, but it
>> will increase overhead significantly. So a dozen of programs is a
>> trade-off. And I disable automatic scanning.
>
> That's fine, not an issue here.
>
>> False positives are super unpleasant in automatic testing. If a tool
>> false positive rate if high, I just disable it, it is unusable. It is
>> not that bad for leak checking. But each false positive consumes human
>> (my) time.
>
> Indeed. I've spent a significant amount of time in the early kmemleak
> days just trying to prove whether it's a real leak or not but these days
> with the automatic scanning, false positives seem to be very low ratio.
>
>> So I need to run scanning twice, because the first one never reports leaks.
>
> That's because of the checksumming. For example, you have objects stored
> in a list. On some delete or insert, the list_head is temporarily
> modified, possibly stored in CPU registers on another CPU. For this
> brief time, kmemleak may no longer detect a pointer to the rest of the
> list, hence reporting a big part of it as leaked objects.
>
> Since list deletion/insertion requires a modification of an object
> list_head, it's checksum changes, hence if this value has changed since
> the previous scan, kmemleak will not report the object, assuming it is
> something transient. If you run two scans in quick succession, it
> possible that the transient condition hasn't cleared yet, so you risk a
> false positive. Of course, I could place some random delay in kmemleak
> between successive scans but I assumed that most people would leave it
> running on the default 10min scan.
>
> I had other heuristics like object age but checksumming proved to be the
> most efficient, with the minor drawback that you'd have to run the
> scanning twice before reporting. And any kind of delayed reporting (e.g.
> X secs since the last checksum modification) would most likely break
> your testing workflow.
>
>> For the false positives due to registers/pointer jumping, will it help
>> if I run scanning one more time if leaks are detected? I mean: run
>> scanning twice, if leaks are found sleep for several seconds and run
>> scanning third time. Since leaks are usually not detected I can afford
>> to sleep more and do one or two additional scans.
>
> This should work.
>
>> The question here: will kmemleak _remove_ an object for leaked
>> objects, if it discovered reachable or contents change on subsequent
>> scans?
>
> Yes, it will remove them from /sys/kernel/debug/kmemleak even if they
> were previously reported.
>
>> Regarding stopping all threads and doing proper scan, why is not it
>> feasible? Will kernel break if we stall all CPUs for seconds? In
>> automatic testing scenarios a stalled for several seconds machine is
>> not a problem. But on the other hand, absence of false positives is a
>> must. And it would improve testing bandwidth, because we don't need
>> sleep and second scan.
>
> Scanning time is the main issue with it taking minutes on some slow ARM
> machines (my primary testing target). Such timing was significantly
> improved with commit 93ada579b0ee ("mm: kmemleak: optimise kmemleak_lock
> acquiring during kmemleak_scan") but even if it is few seconds, it is
> not suitable for a live, interactive system.
>
> What we could do though, since you already trigger the scanning
> manually, is to add a "stopscan" command that you echo into
> /sys/kernel/debug/kmemleak and performs a stop_machine() during memory
> scanning. If you have time, please feel free to give it a try ;).


Stopscan would be useful for me, but I don't feel like I am ready to
tackle it. To be absolutely sure that we don't miss pointers we would
also need to scan all registers from stopped CPUs, and I don't know
how to obtain them.

For now I made several changes in my test driver:
 - try harder to ensure that there are no concurrent activities during scanning
 - scan twice with a one-second delay
 - if something is discovered, wait another second and scan again
I will monitor how this affects the false positive rate.
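
The scan sequence in the list above could be sketched roughly as
follows. The helper names are hypothetical, and the control and report
paths are passed separately so the logic can be exercised against
ordinary files; on a real system both would presumably be
/sys/kernel/debug/kmemleak, which needs root and a kernel built with
kmemleak enabled.

```c
#include <stdio.h>
#include <string.h>
#include <unistd.h>

/* Write one command (e.g. "scan") to the kmemleak control file. */
static int kmemleak_cmd(const char *ctl_path, const char *cmd)
{
    FILE *f = fopen(ctl_path, "w");
    int ok;

    if (!f)
        return -1;
    ok = fputs(cmd, f) >= 0;
    if (fclose(f) != 0)
        ok = 0;
    return ok ? 0 : -1;
}

/* Count report entries; each one starts with "unreferenced object". */
static int kmemleak_count(const char *report_path)
{
    static const char prefix[] = "unreferenced object";
    FILE *f = fopen(report_path, "r");
    char line[512];
    int n = 0;

    if (!f)
        return -1;
    while (fgets(line, sizeof(line), f))
        if (strncmp(line, prefix, sizeof(prefix) - 1) == 0)
            n++;
    fclose(f);
    return n;
}

/*
 * Scan twice with a delay; if anything is reported, wait again and
 * scan a third time. Returns the number of entries that remain after
 * the last scan performed, or -1 on I/O error.
 */
static int kmemleak_check(const char *ctl, const char *report,
                          unsigned int delay_s)
{
    int n;

    if (kmemleak_cmd(ctl, "scan") < 0)
        return -1;
    sleep(delay_s);
    if (kmemleak_cmd(ctl, "scan") < 0)
        return -1;
    n = kmemleak_count(report);
    if (n <= 0)
        return n;           /* nothing reported, or read error */

    sleep(delay_s);         /* give transient states time to settle */
    if (kmemleak_cmd(ctl, "scan") < 0)
        return -1;
    return kmemleak_count(report);
}
```

A caller would treat a positive return value as a candidate leak and
only then pay for saving the report and the programs that ran.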

Thanks

* Re: tty: memory leak in tty_register_driver
  2016-03-01 15:27         ` Dmitry Vyukov
@ 2016-03-01 15:55           ` Catalin Marinas
  0 siblings, 0 replies; 7+ messages in thread
From: Catalin Marinas @ 2016-03-01 15:55 UTC (permalink / raw)
  To: Dmitry Vyukov
  Cc: Greg Kroah-Hartman, Jiri Slaby, LKML, Peter Hurley,
	One Thousand Gnomes, J Freyensee, linux-mm, Paul Bolle,
	Alexander Potapenko, Kostya Serebryany, Sasha Levin, syzkaller

On Tue, Mar 01, 2016 at 04:27:28PM +0100, Dmitry Vyukov wrote:
> On Mon, Feb 29, 2016 at 12:34 PM, Catalin Marinas
> <catalin.marinas@arm.com> wrote:
> > On Mon, Feb 29, 2016 at 11:22:58AM +0100, Dmitry Vyukov wrote:
> >> Regarding stopping all threads and doing a proper scan, why is it not
> >> feasible? Will the kernel break if we stall all CPUs for seconds? In
> >> automatic testing scenarios a machine stalled for several seconds is
> >> not a problem. But on the other hand, absence of false positives is a
> >> must. And it would improve testing bandwidth, because we would not need
> >> the sleep and the second scan.
> >
> > Scanning time is the main issue with it taking minutes on some slow ARM
> > machines (my primary testing target). Such timing was significantly
> > improved with commit 93ada579b0ee ("mm: kmemleak: optimise kmemleak_lock
> > acquiring during kmemleak_scan") but even if it is a few seconds, it is
> > not suitable for a live, interactive system.
> >
> > What we could do though, since you already trigger the scanning
> > manually, is to add a "stopscan" command that you echo into
> > /sys/kernel/debug/kmemleak and performs a stop_machine() during memory
> > scanning. If you have time, please feel free to give it a try ;).
> 
> Stopscan would be useful for me, but I don't feel like I am ready to
> tackle it.

It's not that hard ;). Anyway, when I get a bit of time I'll try to look
into it.

> To be absolutely sure that we don't miss pointers we would also need
> to scan all registers from stopped CPUs, and I don't know how to
> obtain that.

With stop_machine(), we probably wouldn't need to. This mechanism causes
the other CPUs to take an IPI and execute a certain function (or wait
for the completion of a function call on another CPU). We can assume
that the functions stop_machine() is calling wouldn't manipulate
allocated objects/lists/etc., so the register file content is not
relevant to kmemleak. The previous context interrupted by the IPI would
be stored on the IRQ stack, and that's one area that kmemleak does not
scan (it's architecture specific).

On the CPU issuing the stop_machine(), this would be done as a result of
a debugfs write and I don't think we have any object
allocation/manipulation on this path (and it's only the callee-saved
registers that we would miss when calling kmemleak's scan_object()).

So yes, in addition to stop_machine(), we would have to scan the IRQ
stack if the architecture uses a separate one (vs just the current
thread stack).

-- 
Catalin



Thread overview: 7+ messages
     [not found] <CACT4Y+bZticikTpnc0djxRBLCWhj=2DqQk=KRf5zDvrLdHzEbQ@mail.gmail.com>
2016-02-28 16:42 ` tty: memory leak in tty_register_driver Dmitry Vyukov
2016-02-28 23:47   ` Catalin Marinas
2016-02-29 10:22     ` Dmitry Vyukov
2016-02-29 10:24       ` Dmitry Vyukov
2016-02-29 11:34       ` Catalin Marinas
2016-03-01 15:27         ` Dmitry Vyukov
2016-03-01 15:55           ` Catalin Marinas
