linux-mm.kvack.org archive mirror
* [REGRESSION] Null pointer dereference while shrinking zswap
@ 2024-04-16 12:19 Christian Heusel
From: Christian Heusel @ 2024-04-16 12:19 UTC
  To: Seth Jennings, Dan Streetman, Vitaly Wool, Andrew Morton,
	linux-mm, linux-kernel
  Cc: David Runge, Richard W.M. Jones, Mark W, regressions

Hello everyone,

While rebuilding a few packages for Arch Linux, we recently came across
a regression in the Linux kernel. It surfaced as a test failure in
libguestfs[0], where the booted kernel showed a call trace like the
following:

[  218.738568] CPU: 0 PID: 167 Comm: guestfsd Not tainted 6.7.0-rc4-1-mainline-00158-gb5ba474f3f51 #1 bf39861cf50acae7a79c534e25532f28afe4e593
[  218.739007] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS Arch Linux 1.16.3-1-1 04/01/2014
[  218.739787] RIP: 0010:memcg_page_state+0x9/0x30
[  218.740299] Code: 0d b8 ff ff 66 66 2e 0f 1f 84 00 00 00 00 00 66 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 66 0f 1f 00 0f 1f 44 00 00 <48> 8b 87 00 06 00 00 48 63 f6 31 d2 48 8b 04 f0 48 85 c0 48 0f 48
[  218.740727] RSP: 0018:ffffb5fa808dfc10 EFLAGS: 00000202
[  218.740862] RAX: 0000000000000000 RBX: ffffb5fa808dfce0 RCX: 0000000000000002
[  218.741016] RDX: 0000000000000001 RSI: 0000000000000033 RDI: 0000000000000000
[  218.741168] RBP: 0000000000000000 R08: ffff976681ff8000 R09: 0000000000000000
[  218.741322] R10: 0000000000000001 R11: ffff9766833f9d00 R12: ffff9766ffffe780
[  218.742167] R13: 0000000000000000 R14: ffff976680cc1800 R15: ffff976682204d80
[  218.742376] FS:  00007f1479d9f540(0000) GS:ffff9766fbc00000(0000) knlGS:0000000000000000
[  218.742569] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[  218.743256] CR2: 0000000000000600 CR3: 0000000103606000 CR4: 0000000000750ef0
[  218.743494] PKRU: 55555554
[  218.743593] Call Trace:
[  218.743733]  <TASK>
[  218.743847]  ? __die+0x23/0x70
[  218.743957]  ? page_fault_oops+0x171/0x4e0
[  218.744056]  ? free_unref_page+0xf6/0x180
[  218.744458]  ? exc_page_fault+0x7f/0x180
[  218.744551]  ? asm_exc_page_fault+0x26/0x30
[  218.744684]  ? memcg_page_state+0x9/0x30
[  218.744779]  zswap_shrinker_count+0x9d/0x110
[  218.744896]  do_shrink_slab+0x3a/0x360
[  218.744990]  shrink_slab+0xc7/0x3c0
[  218.745609]  drop_slab+0x85/0x140
[  218.745691]  drop_caches_sysctl_handler+0x7e/0xd0
[  218.745799]  proc_sys_call_handler+0x1c0/0x2e0
[  218.745912]  vfs_write+0x23d/0x400
[  218.745998]  ksys_write+0x6f/0xf0
[  218.746080]  do_syscall_64+0x64/0xe0
[  218.746169]  ? exit_to_user_mode_prepare+0x132/0x1f0
[  218.746873]  entry_SYSCALL_64_after_hwframe+0x6e/0x76
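
As a reading aid for the trace: the bytes at the faulting RIP (starting at
the <48> marker in the Code: line) appear to decode to a 64-bit load from
RDI + 0x600, and with RDI = 0 that would match the fault address in CR2.
The following is only a minimal sketch of that arithmetic; the helper name
decode_mov_load_disp32 is hypothetical and it handles just this one
instruction form, not x86 in general:

```python
# Hypothetical helper for illustration: decodes only the
# "REX.W 8B /r, mod=10 (base + disp32)" form seen at the faulting RIP.
# It is not a general x86 disassembler.
import struct

REG64 = ["rax", "rcx", "rdx", "rbx", "rsp", "rbp", "rsi", "rdi"]


def decode_mov_load_disp32(code: bytes):
    """Return (dest_reg, base_reg, displacement) for `mov disp32(%base), %dest`."""
    if len(code) < 7 or code[0] != 0x48 or code[1] != 0x8B:
        raise ValueError("not a REX.W MOV r64, r/m64 instruction")
    modrm = code[2]
    mod, reg, rm = modrm >> 6, (modrm >> 3) & 7, modrm & 7
    if mod != 0b10 or rm == 0b100:
        raise ValueError("only the plain base+disp32 form is handled")
    (disp,) = struct.unpack_from("<i", code, 3)  # little-endian signed disp32
    return REG64[reg], REG64[rm], disp


# Bytes starting at the <48> marker in the Code: line of the trace.
faulting = bytes.fromhex("488b8700060000")
dest, base, disp = decode_mov_load_disp32(faulting)

# Register values copied from the trace above.
regs = {"rdi": 0x0, "rsi": 0x33}
fault_addr = regs[base] + disp

print(f"mov {disp:#x}(%{base}), %{dest}")
print(f"faulting address: {fault_addr:#x}")
```

The computed address equals the CR2 value of 0x600, which is consistent
with memcg_page_state() having been entered with a NULL pointer in RDI (the
first argument register in the x86-64 calling convention). The trace alone
does not show why that pointer was NULL.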

The regression is present in the mainline kernel and was also
independently reported to the Red Hat bug tracker[1].

I have bisected the regression between v6.6 and v6.9-rc4 (see log[2]).
Since some of the commits were not buildable for me, the bisection did
not end on a single commit; the following candidates remain (unrelated
test commit removed):
- 7108cc3f765c ("mm: memcg: add per-memcg zswap writeback stat")
- a65b0e7607cc ("zswap: make shrinking memcg-aware")
- b5ba474f3f51 ("zswap: shrink zswap pool based on memory pressure")

I classified commits as good or bad using the relevant libguestfs tests,
but I think the reproducer in the Red Hat Bugzilla is simpler (although I
only became aware of it during the bisection and therefore didn't test
it myself):

  LIBGUESTFS_MEMSIZE=4096 LIBGUESTFS_DEBUG=1 LIBGUESTFS_TRACE=1 make -C /build/libguestfs/src/libguestfs-1.52.0/tests -k check TESTS=c-api/tests

I hope I have included everything needed to debug this further. If
anything is missing, I'm happy to provide more details!

Cheers,
Christian

[0]: https://github.com/libguestfs/libguestfs/issues/139
[1]: https://bugzilla.redhat.com/show_bug.cgi?id=2275252
[2]: https://gist.github.com/christian-heusel/d5095c36b72ae90871e27dfed32ddc46

