linux-mm.kvack.org archive mirror
* [linus:master] [mm]  ba42b524a0: segfault_at_ip_sp_error
@ 2024-05-28  7:00 kernel test robot
  2024-05-28 20:52 ` Ingo Saitz
  0 siblings, 1 reply; 4+ messages in thread
From: kernel test robot @ 2024-05-28  7:00 UTC
  To: York Jasper Niebuhr
  Cc: oe-lkp, lkp, linux-kernel, Andrew Morton, Matthew Wilcox,
	Kees Cook, linux-doc, linux-mm, linux-security-module,
	oliver.sang



Hello,

kernel test robot noticed "segfault_at_ip_sp_error" on:

commit: ba42b524a0408b5f92bd41edaee1ea84309ab9ae ("mm: init_mlocked_on_free_v3")
https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git master

[test failed on linus/master      c760b3725e52403dc1b28644fb09c47a83cacea6]
[test failed on linux-next/master 3689b0ef08b70e4e03b82ebd37730a03a672853a]

in testcase: rcutorture
version: 
with following parameters:

	runtime: 300s
	test: default
	torture_type: tasks-tracing



compiler: gcc-13
test machine: qemu-system-x86_64 -enable-kvm -cpu SandyBridge -smp 2 -m 16G

(please refer to attached dmesg/kmsg for entire log/backtrace)


We did not find a clear hint in the dmesg output below, but further runs show
that the issue is persistent:


6c47de3be3a021d8 ba42b524a0408b5f92bd41edaee
---------------- ---------------------------
       fail:runs  %reproduction    fail:runs
           |             |             |
           :30         100%          30:30    dmesg.segfault_at_ip_sp_error


We are therefore reporting, FYI, what we observed and captured in our tests.


If you fix the issue in a separate patch/commit (i.e. not just a new version of
the same patch/commit), please add the following tags
| Reported-by: kernel test robot <oliver.sang@intel.com>
| Closes: https://lore.kernel.org/oe-lkp/202405281425.92ece916-lkp@intel.com



[  OK  ] Started /etc/rc.local Compatibility.
[  OK  ] Started Getty on tty1.
[  OK  ] Reached target Login Prompts.
[  OK  ] Reached target Multi-User System.
Starting watchdog daemon...
[  351.243261][ T1844] watchdog[1844]: segfault at 0 ip 00000000 sp bfd01fac error 4 in watchdog[49d000+2000] likely on CPU 1 (core 1, socket 0)
[  351.245577][ T1844] Code: Unable to access opcode bytes at 0xffffffd6.

Code starting with the faulting instruction
===========================================
[  351.253843][ T1846] watchdog[1846]: segfault at 0 ip 00000000 sp bfd01f9c error 4 in watchdog[49d000+2000] likely on CPU 0 (core 0, socket 0)
[  351.256126][ T1846] Code: Unable to access opcode bytes at 0xffffffd6.

Code starting with the faulting instruction
===========================================
[FAILED] Failed to start watchdog daemon.
See 'systemctl status watchdog.service' for details.



The kernel config and materials to reproduce are available at:
https://download.01.org/0day-ci/archive/20240528/202405281425.92ece916-lkp@intel.com



-- 
0-DAY CI Kernel Test Service
https://github.com/intel/lkp-tests/wiki




* Re: [linus:master] [mm]  ba42b524a0: segfault_at_ip_sp_error
  2024-05-28  7:00 [linus:master] [mm] ba42b524a0: segfault_at_ip_sp_error kernel test robot
@ 2024-05-28 20:52 ` Ingo Saitz
  2024-05-29  7:52   ` Ingo Saitz
  0 siblings, 1 reply; 4+ messages in thread
From: Ingo Saitz @ 2024-05-28 20:52 UTC
  To: linux-mm

I hit the same error on devuan unstable, with startpar crashing on boot:

[   10.025881] dracut: Switching root
[   10.339687] startpar[720]: segfault at 0 ip 0000000000000000 sp 00007fffbdca9c38 error 14 likely on CPU 1 (core 0, socket 0)
[   10.340225] Code: Unable to access opcode bytes at 0xffffffffffffffd6.
[   10.349516] startpar[820]: segfault at 0 ip 0000000000000000 sp 00007ffd52c34e48 error 14 likely on CPU 3 (core 1, socket 0)
[   10.350086] Code: Unable to access opcode bytes at 0xffffffffffffffd6.

Setting init_mlocked_on_free=0 on the kernel command line works around this
crash and allows the system to boot successfully.
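
To confirm that the workaround is actually in effect on a running system, the
parameter can be looked up in /proc/cmdline. The following is only a minimal
sketch: cmdline_value() and running_kernel_param() are hypothetical helper
names, and the parsing is a plain whitespace split that ignores quoting.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: look for "key=value" among the whitespace-separated
 * tokens of a kernel command line and copy the value into out.
 * Returns 1 if found, 0 otherwise. This sketch keeps the last occurrence. */
int cmdline_value(const char *cmdline, const char *key,
                  char *out, size_t outlen)
{
    size_t klen = strlen(key);
    const char *p = cmdline;
    int found = 0;

    assert(outlen > 0);

    while (*p) {
        /* Skip separators between tokens. */
        while (*p == ' ' || *p == '\t' || *p == '\n')
            p++;

        /* Find the end of the current token. */
        const char *tok = p;
        while (*p && *p != ' ' && *p != '\t' && *p != '\n')
            p++;
        size_t tlen = (size_t)(p - tok);

        /* Match "key=" exactly at the start of the token. */
        if (tlen > klen && strncmp(tok, key, klen) == 0 && tok[klen] == '=') {
            size_t vlen = tlen - klen - 1;
            if (vlen >= outlen)
                vlen = outlen - 1;      /* truncate to fit the buffer */
            memcpy(out, tok + klen + 1, vlen);
            out[vlen] = '\0';
            found = 1;
        }
    }
    return found;
}

/* Hypothetical helper: read /proc/cmdline and fetch one parameter. */
int running_kernel_param(const char *key, char *out, size_t outlen)
{
    char line[4096];
    FILE *f = fopen("/proc/cmdline", "r");

    if (!f)
        return 0;
    if (!fgets(line, sizeof line, f)) {
        fclose(f);
        return 0;
    }
    fclose(f);
    return cmdline_value(line, key, out, outlen);
}
```

Calling running_kernel_param("init_mlocked_on_free", buf, sizeof buf) fills buf
with the value when the parameter was given explicitly. A return value of 0
only means it was not passed on the command line, in which case the kernel's
default applies.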

I managed to get a strace of startpar here:
https://hannover.ccc.de/~ingo/startpar-6.10-rc1/startpar.strace.txt.gz

Complete dmesg of the boot:
https://hannover.ccc.de/~ingo/startpar-6.10-rc1/dmesg-6.10-rc1.gz

I'm not sure whether this is actually a kernel bug or a classic
use-after-free that was only exposed by this change.

    Ingo
-- 
const_cast<long double>(Λ)



* Re: [linus:master] [mm]  ba42b524a0: segfault_at_ip_sp_error
  2024-05-28 20:52 ` Ingo Saitz
@ 2024-05-29  7:52   ` Ingo Saitz
  2024-05-29  8:25     ` Ingo Saitz
  0 siblings, 1 reply; 4+ messages in thread
From: Ingo Saitz @ 2024-05-29  7:52 UTC
  To: linux-mm

[-- Attachment #1: Type: text/plain, Size: 987 bytes --]

On Tue, May 28, 2024 at 10:52:41PM +0200, Ingo Saitz wrote:
> I hit the same error on devuan unstable, with startpar crashing on boot:
> 
> [   10.025881] dracut: Switching root
> [   10.339687] startpar[720]: segfault at 0 ip 0000000000000000 sp 00007fffbdca9c38 error 14 likely on CPU 1 (core 0, socket 0)
> [   10.340225] Code: Unable to access opcode bytes at 0xffffffffffffffd6.
> [   10.349516] startpar[820]: segfault at 0 ip 0000000000000000 sp 00007ffd52c34e48 error 14 likely on CPU 3 (core 1, socket 0)
> [   10.350086] Code: Unable to access opcode bytes at 0xffffffffffffffd6.

I can reproduce the error with the attached C program.

[ 1325.980596] thoughts[2127]: segfault at 0 ip 0000000000000000 sp 00007fff183d66e8 error 14 likely on CPU 0 (core 0, socket 0)
[ 1325.981180] Code: Unable to access opcode bytes at 0xffffffffffffffd6.

Simply calling mlockall(MCL_CURRENT) and then fork() seems to be enough
to trigger the error.
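
For automated checks it can help to run that sequence in a throwaway process
and inspect how it terminated, rather than reading dmesg. The following is a
rough sketch under that assumption; mlock_fork_probe() is a hypothetical name
and is not part of the attached program.

```c
#define _GNU_SOURCE
#include <stdio.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical probe: runs mlockall() + fork() in a throwaway process.
 * Returns 0 if that process exited cleanly, the signal number if it was
 * killed by a signal (for example SIGSEGV), or -1 if the probe could not
 * be set up (fork, waitpid or mlockall failed). */
int mlock_fork_probe(void)
{
    fflush(NULL);               /* do not duplicate pending stdio output */

    pid_t probe = fork();
    if (probe < 0)
        return -1;

    if (probe == 0) {
        /* Probe process: same sequence as the reproducer. */
        if (mlockall(MCL_CURRENT) != 0)
            _exit(2);           /* locking refused, e.g. RLIMIT_MEMLOCK */

        pid_t child = fork();
        if (child < 0)
            _exit(3);
        if (child == 0)
            _exit(0);           /* child exits at once */

        /* Wait until the child is gone, then exit normally. */
        int child_status;
        if (waitpid(child, &child_status, 0) < 0)
            _exit(3);
        _exit(0);
    }

    int status;
    if (waitpid(probe, &status, 0) < 0)
        return -1;
    if (WIFSIGNALED(status))
        return WTERMSIG(status);
    if (WIFEXITED(status) && WEXITSTATUS(status) != 0)
        return -1;              /* setup problem inside the probe */
    return 0;
}
```

A positive return value would indicate that the probe process was killed by a
signal; -1 only says the probe could not run, for instance because the memlock
limit is too low for mlockall() to succeed.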

    Ingo
-- 
const_cast<long double>(Λ)

[-- Attachment #2: thoughts.c --]
[-- Type: text/x-csrc, Size: 428 bytes --]

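#define _GNU_SOURCE

#include <signal.h>
#include <stdio.h>
#include <stdlib.h>
#include <sys/mman.h>
#include <sys/select.h>
#include <unistd.h>

int main(void) {
    /* Lock all currently mapped pages into RAM. */
    if (mlockall(MCL_CURRENT) != 0) {
        perror("mlockall");
        return EXIT_FAILURE;
    }

    /* The child exits immediately; the parent keeps running. */
    pid_t pid = fork();
    if (pid == 0) {
        _exit(0);
    }

    /* Sleep for two seconds with an empty signal mask. */
    const struct timespec timeout = {2, 0};

    sigset_t smask;
    sigemptyset(&smask);

    pselect(0, 0, 0, 0, &timeout, &smask);

    printf("Success on pid %d\n", pid);
    return 0;
}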


* Re: [linus:master] [mm]  ba42b524a0: segfault_at_ip_sp_error
  2024-05-29  7:52   ` Ingo Saitz
@ 2024-05-29  8:25     ` Ingo Saitz
  0 siblings, 0 replies; 4+ messages in thread
From: Ingo Saitz @ 2024-05-29  8:25 UTC
  To: linux-mm

Even a simple

#include <sys/mman.h>
#include <unistd.h>

int main(void) {
	mlockall(MCL_CURRENT);
	fork();
}

crashes, although with different errors:

[ 3421.236586] thoughts2[2297]: segfault at 10e0 ip 00007f598a0275c5 sp 00007ffeec71ef50 error 4 in libc.so.6[da5c5,7f5989f73000+157000] likely on CPU 0 (core 0, socket 0)
[ 3421.237280] Code: 89 c3 45 84 e4 0f 84 9a 02 00 00 e8 e5 b0 ff ff 41 89 c4 85 c0 0f 85 a2 01 00 00 48 83 05 ca 21 10 00 04 4c 8b 25 cb b9 0f 00 <49> 8b 84 24 e0 10 00 00 66 0f ef c0 49 c7 84 24 28 0a 00 00 00 00

[ 3421.957294] traps: thoughts2[2298] general protection fault ip:7f7aca1f36fd sp:7ffc6b05d6f0 error:0 in libc.so.6[3f6fd,7f7aca1da000+157000]
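
The segfault lines in this thread differ in the value printed after "error".
As a rough reading aid, the sketch below assumes that value is the x86
page-fault error code printed in hexadecimal, and decodes its low bits. The
helper names segfault_error_code() and pf_describe() are hypothetical.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Low bits of the x86 page-fault error code. */
enum {
    PF_PROT  = 1 << 0,  /* 0: page not present, 1: protection violation */
    PF_WRITE = 1 << 1,  /* 0: read access, 1: write access */
    PF_USER  = 1 << 2,  /* 0: kernel mode, 1: user mode */
    PF_RSVD  = 1 << 3,  /* reserved bit set in a page-table entry */
    PF_INSTR = 1 << 4,  /* fault happened on an instruction fetch */
};

/* Hypothetical helper: extract the hex value following " error " from a
 * "segfault at ..." log line. Returns 1 on success, 0 if no such field. */
int segfault_error_code(const char *line, unsigned long *code)
{
    const char *p = strstr(line, " error ");
    char *end;

    if (!p)
        return 0;
    p += strlen(" error ");
    *code = strtoul(p, &end, 16);   /* assumption: the field is hexadecimal */
    return end != p;
}

/* Hypothetical helper: describe the access that faulted in words. */
const char *pf_describe(unsigned long code, char *out, size_t outlen)
{
    assert(out != NULL && outlen > 0);

    snprintf(out, outlen, "%s-mode %s, %s",
             (code & PF_USER) ? "user" : "kernel",
             (code & PF_INSTR) ? "instruction fetch"
                               : (code & PF_WRITE) ? "write" : "read",
             (code & PF_PROT) ? "protection violation" : "page not present");
    return out;
}
```

Under that assumption, "error 14" in the earlier reports decodes as a user-mode
instruction fetch from a page that is not present, which fits ip being 0, while
"error 4" here is a user-mode read of a non-present page. The "traps: ...
general protection fault" line uses a different format and is not handled by
this sketch.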

    Ingo
-- 
const_cast<long double>(Λ)


