linux-mm.kvack.org archive mirror
 help / color / mirror / Atom feed
* [BUG kernel-5.15] aarch64: __pi_strncmp() out-of-bound error
@ 2023-08-07 12:31 John Hsu (許永翰)
  2023-08-07 15:32 ` Mark Rutland
  0 siblings, 1 reply; 7+ messages in thread
From: John Hsu (許永翰) @ 2023-08-07 12:31 UTC (permalink / raw)
  To: catalin.marinas
  Cc: linux-kernel, linux-mediatek,
	Xiaobing Shi (史小兵),
	Chunhui Li (李春辉),
	linux-mm, Kuan-Ying Lee (李冠穎),
	Casper Li (李中榮),
	linux-arm-kernel

Hi ARM maintainers,
 
We met this issue under our internal test.
It seems that __pi_strncmp() reads out-of-bound.
 
[ 7445.268043][  T382] ueventd: [name:fault&]Unable to handle kernel
paging request at virtual address ffffff803fd3f000
[ 7445.268078][  T382] ueventd: [name:fault&]Mem abort info:
[ 7445.268084][  T382] ueventd: [name:fault&]  ESR = 0x96000007
[ 7445.268089][  T382] ueventd: [name:fault&]  EC = 0x25: DABT (current
EL), IL = 32 bits
[ 7445.268095][  T382] ueventd: [name:fault&]  SET = 0, FnV = 0
[ 7445.268100][  T382] ueventd: [name:fault&]  EA = 0, S1PTW = 0
[ 7445.268105][  T382] ueventd: [name:fault&]  FSC = 0x07: level 3
translation fault
[ 7445.268110][  T382] ueventd: [name:fault&]Data abort info:
[ 7445.268115][  T382] ueventd: [name:fault&]  ISV = 0, ISS =
0x00000007
[ 7445.268120][  T382] ueventd: [name:fault&]  CM = 0, WnR = 0
[ 7445.268126][  T382] ueventd: [name:fault&]swapper pgtable: 4k pages,
39-bit VAs, pgdp=00000000426c6000
[ 7445.268133][  T382] ueventd: [name:fault&][ffffff803fd3f000]
pgd=1800000327ff5003, p4d=1800000327ff5003, pud=1800000327ff5003,
pmd=1800000327fef003, pte=0000000000000000
[ 7445.268154][  T382] ueventd: [name:traps&]Internal error: Oops:
96000007 [#1] PREEMPT SMP
[ 7445.268278][  T382] ueventd: [name:mrdump&]Kernel Offset:
0x2825400000 from 0xffffffc008000000
[ 7445.268286][  T382] ueventd: [name:mrdump&]PHYS_OFFSET: 0x40000000
[ 7445.268294][  T382] ueventd: [name:mrdump&]pstate: 82400005 (Nzcv
daif +PAN -UAO)
[ 7445.268301][  T382] ueventd: [name:mrdump&]pc : [0xffffffe82d420210]
__pi_strncmp+0x1a0/0x1c4
[ 7445.268310][  T382] ueventd: [name:mrdump&]lr : [0xffffffe82dbe12c0]
__security_genfs_sid+0x100/0x168
[ 7445.268319][  T382] ueventd: [name:mrdump&]sp : ffffffc0097cb8b0
…
[ 7445.269337][  T382] ueventd: CPU: 0 PID: 382 Comm: ueventd Tainted:
G S      W  OE     5.15.41-android13-8-gb1f1ad628628 #1
[ 7445.269347][  T382] ueventd: Hardware name: MT6886(ENG) (DT)
[ 7445.269354][  T382] ueventd: Call trace:
[ 7445.269359][  T382] ueventd:  dump_backtrace+0x0/0x2a8
[ 7445.269374][  T382] ueventd:  dump_stack_lvl+0x74/0xa4
[ 7445.269384][  T382] ueventd:  dump_stack+0x14/0x1c
[ 7445.269391][  T382] ueventd:  mrdump_common_die+0x32c/0x5ac [mrdump]
[ 7445.269470][  T382] ueventd:  ipanic_die+0x1c/0x28 [mrdump]
[ 7445.269539][  T382] ueventd:  __die+0xbc/0x308
[ 7445.269548][  T382] ueventd:  die+0xd8/0x500
[ 7445.269556][  T382] ueventd:  die_kernel_fault+0x94/0xa8
[ 7445.269565][  T382] ueventd:  __do_kernel_fault+0x1d8/0x214
[ 7445.269571][  T382] ueventd:  do_bad_area+0x40/0x174
[ 7445.269579][  T382] ueventd:  do_translation_fault+0x48/0x54
[ 7445.269585][  T382] ueventd:  do_mem_abort+0x3c/0x100
[ 7445.269592][  T382] ueventd:  el1_abort+0x38/0x54
[ 7445.269602][  T382] ueventd:  el1h_64_sync_handler+0x54/0x88
[ 7445.269610][  T382] ueventd:  el1h_64_sync+0x78/0x7c
[ 7445.269618][  T382] ueventd:  __pi_strncmp+0x1a0/0x1c4
[ 7445.269626][  T382] ueventd:  selinux_genfs_get_sid+0x114/0x220
[ 7445.269636][  T382] ueventd:  inode_doinit_with_dentry+0x3d0/0x598
[ 7445.269644][  T382] ueventd:  selinux_d_instantiate+0x1c/0x24
[ 7445.269652][  T382] ueventd:  d_splice_alias+0x5c/0x280
[ 7445.269662][  T382] ueventd:  kernfs_iop_lookup+0xec/0x21c
[ 7445.269674][  T382] ueventd:  __lookup_slow+0xc4/0x150
[ 7445.269684][  T382] ueventd:  lookup_slow+0x40/0xf0
[ 7445.269690][  T382] ueventd:  walk_component+0x144/0x160
[ 7445.269696][  T382] ueventd:  link_path_walk+0x25c/0x344
[ 7445.269703][  T382] ueventd:  path_lookupat+0x64/0x120
[ 7445.269710][  T382] ueventd:  filename_lookup+0xc4/0x1b0
[ 7445.269718][  T382] ueventd:  user_path_at_empty+0x48/0xb4
[ 7445.269725][  T382] ueventd:  do_faccessat+0xa8/0x1f0
[ 7445.269732][  T382] ueventd:  __arm64_sys_faccessat+0x20/0x28
[ 7445.269738][  T382] ueventd:  invoke_syscall+0x3c/0xf0
[ 7445.269746][  T382] ueventd:  el0_svc_common+0x84/0xe8
[ 7445.269753][  T382] ueventd:  do_el0_svc+0x20/0x84
[ 7445.269759][  T382] ueventd:  el0_svc+0x1c/0x48
[ 7445.269766][  T382] ueventd:  el0t_64_sync_handler+0x7c/0xd8
[ 7445.269773][  T382] ueventd:  el0t_64_sync+0x15c/0x160
 
We found that we hit this issue when we compare these two strings.
 
________________address|_0__1__2__3__4__5__6__7__8__9__A__B__C__D__E__F
__   0123456789ABCDEF
   NSD:FFFFFF80089EDA00|>2F 64 65 76 69 63 65 73 2F 76 69 72 74 75 61
6C  /devices/virtual
   NSD:FFFFFF80089EDA10| 2F 62 6C 6F 63 6B 2F 00 E0 03 01 AA E1 03 02
AA  /block/.........
 
________________address|_0__1__2__3__4__5__6__7__8__9__A__B__C__D__E__F
__    0123456789ABCDEF
   NSD:FFFFFF803FD3EFE0| 00 00 00 00 00 00 00 00 00 00 2F 64 65 76 69
63    ........../devic
   NSD:FFFFFF803FD3EFF0| 65 73>2F 76 69 72 74 75 61 6C 2F 6D 69 73 63
00  es/virtual/misc.
   NSD:FFFFFF803FD3F000| ?? ?? ?? ?? ?? ?? ?? ?? ?? ?? ?? ?? ?? ?? ??
??
   NSD:FFFFFF803FD3F0E0| ?? ?? ?? ?? ?? ?? ?? ?? ?? ?? ?? ?? ?? ?? ??
??

We observe the second string is put at the tail of the first page and
the next page is unreadable.
Thus, we made a simple test as below and it can reproduce this issue.

static noinline void strncmp_ut(void)
{
    int ret = 0;
    int size = 4096;
    char *src1 = vmalloc(size);
    char *src2 = vmalloc(size);
    char *str1 = "/devices/virtual/block/";
    char *str2 = "/devices/virtual/misc";
    int len1 = strlen(str1);
    int len2 = strlen(str2);
    char *str1_start, *str2_start;

    pr_info("src1: %px\n", src1);
    pr_info("src2: %px\n", src2);
    pr_info("len1 :%d, len2: %d\n", len1, len2);

    memset(src1, 0, size);
    strncpy(&src1[size-len1-1], str1, len1);
    memset(src2, 0, size);
    strncpy(&src2[size-len2-1], str2, len2);

    str1_start = src1 + size - len1 - 1;
    pr_info("str1_start: %px", str1_start);
    str2_start = src2 + size - len2 - 1;
    pr_info("str2_start: %px", str2_start);
    ret = strncmp(str1_start, str2_start, len1);
    pr_info("ret: %d\n", ret);
}


Does any issue exist in __pi_strncmp in kernel-5.15?

Any suggestion is appreciated.

Thanks,
John Hsu

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [BUG kernel-5.15] aarch64: __pi_strncmp() out-of-bound error
  2023-08-07 12:31 [BUG kernel-5.15] aarch64: __pi_strncmp() out-of-bound error John Hsu (許永翰)
@ 2023-08-07 15:32 ` Mark Rutland
  2023-08-10 12:23   ` Robin Murphy
  0 siblings, 1 reply; 7+ messages in thread
From: Mark Rutland @ 2023-08-07 15:32 UTC (permalink / raw)
  To: John Hsu (許永翰)
  Cc: catalin.marinas, linux-kernel, linux-mediatek,
	Xiaobing Shi (史小兵),
	Chunhui Li (李春辉),
	linux-mm, Kuan-Ying Lee (李冠穎),
	Casper Li (李中榮),
	linux-arm-kernel, Will Deacon, Robin Murphy

On Mon, Aug 07, 2023 at 12:31:45PM +0000, John Hsu (許永翰) wrote:
> Hi ARM maintainers,
>  
> We met this issue under our internal test.
> It seems that __pi_strncmp() reads out-of-bound.
>  
> [ 7445.268043][  T382] ueventd: [name:fault&]Unable to handle kernel
> paging request at virtual address ffffff803fd3f000
> [ 7445.268078][  T382] ueventd: [name:fault&]Mem abort info:
> [ 7445.268084][  T382] ueventd: [name:fault&]  ESR = 0x96000007
> [ 7445.268089][  T382] ueventd: [name:fault&]  EC = 0x25: DABT (current
> EL), IL = 32 bits
> [ 7445.268095][  T382] ueventd: [name:fault&]  SET = 0, FnV = 0
> [ 7445.268100][  T382] ueventd: [name:fault&]  EA = 0, S1PTW = 0
> [ 7445.268105][  T382] ueventd: [name:fault&]  FSC = 0x07: level 3
> translation fault
> [ 7445.268110][  T382] ueventd: [name:fault&]Data abort info:
> [ 7445.268115][  T382] ueventd: [name:fault&]  ISV = 0, ISS =
> 0x00000007
> [ 7445.268120][  T382] ueventd: [name:fault&]  CM = 0, WnR = 0
> [ 7445.268126][  T382] ueventd: [name:fault&]swapper pgtable: 4k pages,
> 39-bit VAs, pgdp=00000000426c6000
> [ 7445.268133][  T382] ueventd: [name:fault&][ffffff803fd3f000]
> pgd=1800000327ff5003, p4d=1800000327ff5003, pud=1800000327ff5003,
> pmd=1800000327fef003, pte=0000000000000000
> [ 7445.268154][  T382] ueventd: [name:traps&]Internal error: Oops:
> 96000007 [#1] PREEMPT SMP
> [ 7445.268278][  T382] ueventd: [name:mrdump&]Kernel Offset:
> 0x2825400000 from 0xffffffc008000000
> [ 7445.268286][  T382] ueventd: [name:mrdump&]PHYS_OFFSET: 0x40000000
> [ 7445.268294][  T382] ueventd: [name:mrdump&]pstate: 82400005 (Nzcv
> daif +PAN -UAO)
> [ 7445.268301][  T382] ueventd: [name:mrdump&]pc : [0xffffffe82d420210]
> __pi_strncmp+0x1a0/0x1c4
> [ 7445.268310][  T382] ueventd: [name:mrdump&]lr : [0xffffffe82dbe12c0]
> __security_genfs_sid+0x100/0x168
> [ 7445.268319][  T382] ueventd: [name:mrdump&]sp : ffffffc0097cb8b0
> …
> [ 7445.269337][  T382] ueventd: CPU: 0 PID: 382 Comm: ueventd Tainted:
> G S      W  OE     5.15.41-android13-8-gb1f1ad628628 #1
> [ 7445.269347][  T382] ueventd: Hardware name: MT6886(ENG) (DT)
> [ 7445.269354][  T382] ueventd: Call trace:
> [ 7445.269359][  T382] ueventd:  dump_backtrace+0x0/0x2a8
> [ 7445.269374][  T382] ueventd:  dump_stack_lvl+0x74/0xa4
> [ 7445.269384][  T382] ueventd:  dump_stack+0x14/0x1c
> [ 7445.269391][  T382] ueventd:  mrdump_common_die+0x32c/0x5ac [mrdump]
> [ 7445.269470][  T382] ueventd:  ipanic_die+0x1c/0x28 [mrdump]
> [ 7445.269539][  T382] ueventd:  __die+0xbc/0x308
> [ 7445.269548][  T382] ueventd:  die+0xd8/0x500
> [ 7445.269556][  T382] ueventd:  die_kernel_fault+0x94/0xa8
> [ 7445.269565][  T382] ueventd:  __do_kernel_fault+0x1d8/0x214
> [ 7445.269571][  T382] ueventd:  do_bad_area+0x40/0x174
> [ 7445.269579][  T382] ueventd:  do_translation_fault+0x48/0x54
> [ 7445.269585][  T382] ueventd:  do_mem_abort+0x3c/0x100
> [ 7445.269592][  T382] ueventd:  el1_abort+0x38/0x54
> [ 7445.269602][  T382] ueventd:  el1h_64_sync_handler+0x54/0x88
> [ 7445.269610][  T382] ueventd:  el1h_64_sync+0x78/0x7c
> [ 7445.269618][  T382] ueventd:  __pi_strncmp+0x1a0/0x1c4
> [ 7445.269626][  T382] ueventd:  selinux_genfs_get_sid+0x114/0x220
> [ 7445.269636][  T382] ueventd:  inode_doinit_with_dentry+0x3d0/0x598
> [ 7445.269644][  T382] ueventd:  selinux_d_instantiate+0x1c/0x24
> [ 7445.269652][  T382] ueventd:  d_splice_alias+0x5c/0x280
> [ 7445.269662][  T382] ueventd:  kernfs_iop_lookup+0xec/0x21c
> [ 7445.269674][  T382] ueventd:  __lookup_slow+0xc4/0x150
> [ 7445.269684][  T382] ueventd:  lookup_slow+0x40/0xf0
> [ 7445.269690][  T382] ueventd:  walk_component+0x144/0x160
> [ 7445.269696][  T382] ueventd:  link_path_walk+0x25c/0x344
> [ 7445.269703][  T382] ueventd:  path_lookupat+0x64/0x120
> [ 7445.269710][  T382] ueventd:  filename_lookup+0xc4/0x1b0
> [ 7445.269718][  T382] ueventd:  user_path_at_empty+0x48/0xb4
> [ 7445.269725][  T382] ueventd:  do_faccessat+0xa8/0x1f0
> [ 7445.269732][  T382] ueventd:  __arm64_sys_faccessat+0x20/0x28
> [ 7445.269738][  T382] ueventd:  invoke_syscall+0x3c/0xf0
> [ 7445.269746][  T382] ueventd:  el0_svc_common+0x84/0xe8
> [ 7445.269753][  T382] ueventd:  do_el0_svc+0x20/0x84
> [ 7445.269759][  T382] ueventd:  el0_svc+0x1c/0x48
> [ 7445.269766][  T382] ueventd:  el0t_64_sync_handler+0x7c/0xd8
> [ 7445.269773][  T382] ueventd:  el0t_64_sync+0x15c/0x160
>  
> We found that we hit this issue when we compare these two strings.
>  
> ________________address|_0__1__2__3__4__5__6__7__8__9__A__B__C__D__E__F
> __   0123456789ABCDEF
>    NSD:FFFFFF80089EDA00|>2F 64 65 76 69 63 65 73 2F 76 69 72 74 75 61
> 6C  /devices/virtual
>    NSD:FFFFFF80089EDA10| 2F 62 6C 6F 63 6B 2F 00 E0 03 01 AA E1 03 02
> AA  /block/.........
>  
> ________________address|_0__1__2__3__4__5__6__7__8__9__A__B__C__D__E__F
> __    0123456789ABCDEF
>    NSD:FFFFFF803FD3EFE0| 00 00 00 00 00 00 00 00 00 00 2F 64 65 76 69
> 63    ........../devic
>    NSD:FFFFFF803FD3EFF0| 65 73>2F 76 69 72 74 75 61 6C 2F 6D 69 73 63
> 00  es/virtual/misc.
>    NSD:FFFFFF803FD3F000| ?? ?? ?? ?? ?? ?? ?? ?? ?? ?? ?? ?? ?? ?? ??
> ??
>    NSD:FFFFFF803FD3F0E0| ?? ?? ?? ?? ?? ?? ?? ?? ?? ?? ?? ?? ?? ?? ??
> ??
> 
> We observe the second string is put at the tail of the first page and
> the next page is unreadable.
> Thus, we made a simple test as below and it can reproduce this issue.
> 
> static noinline void strncmp_ut(void)
> {
>     int ret = 0;
>     int size = 4096;
>     char *src1 = vmalloc(size);
>     char *src2 = vmalloc(size);
>     char *str1 = "/devices/virtual/block/";
>     char *str2 = "/devices/virtual/misc";
>     int len1 = strlen(str1);
>     int len2 = strlen(str2);
>     char *str1_start, *str2_start;
> 
>     pr_info("src1: %px\n", src1);
>     pr_info("src2: %px\n", src2);
>     pr_info("len1 :%d, len2: %d\n", len1, len2);
> 
>     memset(src1, 0, size);
>     strncpy(&src1[size-len1-1], str1, len1);
>     memset(src2, 0, size);
>     strncpy(&src2[size-len2-1], str2, len2);
> 
>     str1_start = src1 + size - len1 - 1;
>     pr_info("str1_start: %px", str1_start);
>     str2_start = src2 + size - len2 - 1;
>     pr_info("str2_start: %px", str2_start);
>     ret = strncmp(str1_start, str2_start, len1);
>     pr_info("ret: %d\n", ret);
> }
> 
> 
> Does any issue exist in __pi_strncmp in kernel-5.15?

There's no known issue, but the implementation change substantially in commit:

  387d828adffcf1eb ("arm64: lib: Import latest version of Arm Optimized Routines' strncmp")

... which added additional alignment restrictions to accesses to avoid crossing
an MTE tag granule.

Using your test on v6.5-rc3 I could not trigger the issue, but I could trigger
the issue atop v5.15:

[    1.639268] src1: ffff800010005000
[    1.639982] src2: ffff80001000d000
[    1.640758] len1 :23, len2: 21
[    1.641098] str1_start: ffff800010005fe8
[    1.641202] str2_start: ffff80001000dfea
[    1.642312] Unable to handle kernel paging request at virtual address ffff80001000e000
[    1.643237] Mem abort info:
[    1.643858]   ESR = 0x96000007
[    1.644201]   EC = 0x25: DABT (current EL), IL = 32 bits
[    1.644453]   SET = 0, FnV = 0
[    1.644979]   EA = 0, S1PTW = 0
[    1.645212]   FSC = 0x07: level 3 translation fault
[    1.645492] Data abort info:
[    1.645638]   ISV = 0, ISS = 0x00000007
[    1.645818]   CM = 0, WnR = 0
[    1.646075] swapper pgtable: 4k pages, 48-bit VAs, pgdp=000000007a611000
[    1.646417] [ffff80001000e000] pgd=100000004001d003, p4d=100000004001d003, pud=100000004001e003, pmd=100000004001f003, pte=0000000000000000
[    1.648699] Internal error: Oops: 96000007 [#1] PREEMPT SMP
[    1.649681] Modules linked in:
[    1.650322] CPU: 0 PID: 1 Comm: swapper/0 Not tainted 5.15.0-dirty #1
[    1.650994] pstate: 80400005 (Nzcv daif +PAN -UAO -TCO -DIT -SSBS BTYPE=--)
[    1.651475] pc : __pi_strncmp+0x1a0/0x1c4
[    1.652860] lr : strncmp_ut+0xf4/0x118
[    1.653133] sp : ffff80001009bdd0
[    1.653680] x29: ffff80001009bdd0 x28: 0000000000000000 x27: 0000000000000000
[    1.654134] x26: 0000000000000000 x25: 0000000000000000 x24: 0000000000000000
[    1.654453] x23: ffffa050aff25000 x22: 0000000000001000 x21: ffff800010005fe8
[    1.654774] x20: ffff80001000dfea x19: ffff80001000dff7 x18: 0000000000000006
[    1.655138] x17: 687469726f676c61 x16: 2046454420504d49 x15: ffff80001009b950
[    1.655462] x14: 0000000000000008 x13: ffffffffffffffff x12: 00000000000002dc
[    1.656107] x11: 0101010101010101 x10: ffffa050afc9c540 x9 : 7f7f7f7f7f7f7f7f
[    1.656467] x8 : 6b6074737168752e x7 : ffffa050afc9ae60 x6 : 0000000000000000
[    1.656769] x5 : 0000000000000000 x4 : 6c6175747269762f x3 : 2f6b636f6c622f6c
[    1.657083] x2 : 0000000000000007 x1 : ffff80001000dff2 x0 : ffff800010005ff0
[    1.657576] Call trace:
[    1.657920]  __pi_strncmp+0x1a0/0x1c4
[    1.658179]  smp_cpus_done+0xb4/0xc0
[    1.658363]  smp_init+0x80/0x90
[    1.658516]  kernel_init_freeable+0x12c/0x290
[    1.658792]  kernel_init+0x28/0x130
[    1.658966]  ret_from_fork+0x10/0x20
[    1.659564] Code: b4fff762 d1002000 d1002021 f8626803 (f8626824) 
[    1.661848] ---[ end trace 5ce515dccfbd3da3 ]---
[    1.663291] Kernel panic - not syncing: Attempted to kill init! exitcode=0x0000000b
[    1.663861] SMP: stopping secondary CPUs
[    1.667296] ---[ end Kernel panic - not syncing: Attempted to kill init! exitcode=0x0000000b ]---

That __pi_strncmp+0x1a0/0x1c4 is right at the end of __pi_strcmp():

| L(done_loop):
|         /* We found a difference or a NULL before the limit was reached.  */
|         and     limit, limit, #7
|         cbz     limit, L(not_limit)
|         /* Read the last word.  */
|         sub     src1, src1, 8
|         sub     src2, src2, 8
|         ldr     data1, [src1, limit]
|         ldr     data2, [src2, limit]	// <----------- Faulting LDR here
|         sub     tmp1, data1, zeroones
|         orr     tmp2, data1, #REP8_7f
|         eor     diff, data1, data2      /* Non-zero if differences found.  */
|         bics    has_nul, tmp1, tmp2     /* Non-zero if NUL terminator.  */
|         ccmp    diff, #0, #0, eq
|         b.ne    L(not_limit)
| 
| L(ret0):
|         mov     result, #0
|         ret 

Thanks,
Mark.


^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [BUG kernel-5.15] aarch64: __pi_strncmp() out-of-bound error
  2023-08-07 15:32 ` Mark Rutland
@ 2023-08-10 12:23   ` Robin Murphy
  2023-08-10 14:31     ` Will Deacon
  0 siblings, 1 reply; 7+ messages in thread
From: Robin Murphy @ 2023-08-10 12:23 UTC (permalink / raw)
  To: Mark Rutland, John Hsu (許永翰)
  Cc: catalin.marinas, linux-kernel, linux-mediatek,
	Xiaobing Shi (史小兵),
	Chunhui Li (李春辉),
	linux-mm, Kuan-Ying Lee (李冠穎),
	Casper Li (李中榮),
	linux-arm-kernel, Will Deacon

On 2023-08-07 16:32, Mark Rutland wrote:
> On Mon, Aug 07, 2023 at 12:31:45PM +0000, John Hsu (許永翰) wrote:
>> Hi ARM maintainers,
>>   
>> We met this issue under our internal test.
>> It seems that __pi_strncmp() reads out-of-bound.
>>   
>> [ 7445.268043][  T382] ueventd: [name:fault&]Unable to handle kernel
>> paging request at virtual address ffffff803fd3f000
>> [ 7445.268078][  T382] ueventd: [name:fault&]Mem abort info:
>> [ 7445.268084][  T382] ueventd: [name:fault&]  ESR = 0x96000007
>> [ 7445.268089][  T382] ueventd: [name:fault&]  EC = 0x25: DABT (current
>> EL), IL = 32 bits
>> [ 7445.268095][  T382] ueventd: [name:fault&]  SET = 0, FnV = 0
>> [ 7445.268100][  T382] ueventd: [name:fault&]  EA = 0, S1PTW = 0
>> [ 7445.268105][  T382] ueventd: [name:fault&]  FSC = 0x07: level 3
>> translation fault
>> [ 7445.268110][  T382] ueventd: [name:fault&]Data abort info:
>> [ 7445.268115][  T382] ueventd: [name:fault&]  ISV = 0, ISS =
>> 0x00000007
>> [ 7445.268120][  T382] ueventd: [name:fault&]  CM = 0, WnR = 0
>> [ 7445.268126][  T382] ueventd: [name:fault&]swapper pgtable: 4k pages,
>> 39-bit VAs, pgdp=00000000426c6000
>> [ 7445.268133][  T382] ueventd: [name:fault&][ffffff803fd3f000]
>> pgd=1800000327ff5003, p4d=1800000327ff5003, pud=1800000327ff5003,
>> pmd=1800000327fef003, pte=0000000000000000
>> [ 7445.268154][  T382] ueventd: [name:traps&]Internal error: Oops:
>> 96000007 [#1] PREEMPT SMP
>> [ 7445.268278][  T382] ueventd: [name:mrdump&]Kernel Offset:
>> 0x2825400000 from 0xffffffc008000000
>> [ 7445.268286][  T382] ueventd: [name:mrdump&]PHYS_OFFSET: 0x40000000
>> [ 7445.268294][  T382] ueventd: [name:mrdump&]pstate: 82400005 (Nzcv
>> daif +PAN -UAO)
>> [ 7445.268301][  T382] ueventd: [name:mrdump&]pc : [0xffffffe82d420210]
>> __pi_strncmp+0x1a0/0x1c4
>> [ 7445.268310][  T382] ueventd: [name:mrdump&]lr : [0xffffffe82dbe12c0]
>> __security_genfs_sid+0x100/0x168
>> [ 7445.268319][  T382] ueventd: [name:mrdump&]sp : ffffffc0097cb8b0
>> …
>> [ 7445.269337][  T382] ueventd: CPU: 0 PID: 382 Comm: ueventd Tainted:
>> G S      W  OE     5.15.41-android13-8-gb1f1ad628628 #1
>> [ 7445.269347][  T382] ueventd: Hardware name: MT6886(ENG) (DT)
>> [ 7445.269354][  T382] ueventd: Call trace:
>> [ 7445.269359][  T382] ueventd:  dump_backtrace+0x0/0x2a8
>> [ 7445.269374][  T382] ueventd:  dump_stack_lvl+0x74/0xa4
>> [ 7445.269384][  T382] ueventd:  dump_stack+0x14/0x1c
>> [ 7445.269391][  T382] ueventd:  mrdump_common_die+0x32c/0x5ac [mrdump]
>> [ 7445.269470][  T382] ueventd:  ipanic_die+0x1c/0x28 [mrdump]
>> [ 7445.269539][  T382] ueventd:  __die+0xbc/0x308
>> [ 7445.269548][  T382] ueventd:  die+0xd8/0x500
>> [ 7445.269556][  T382] ueventd:  die_kernel_fault+0x94/0xa8
>> [ 7445.269565][  T382] ueventd:  __do_kernel_fault+0x1d8/0x214
>> [ 7445.269571][  T382] ueventd:  do_bad_area+0x40/0x174
>> [ 7445.269579][  T382] ueventd:  do_translation_fault+0x48/0x54
>> [ 7445.269585][  T382] ueventd:  do_mem_abort+0x3c/0x100
>> [ 7445.269592][  T382] ueventd:  el1_abort+0x38/0x54
>> [ 7445.269602][  T382] ueventd:  el1h_64_sync_handler+0x54/0x88
>> [ 7445.269610][  T382] ueventd:  el1h_64_sync+0x78/0x7c
>> [ 7445.269618][  T382] ueventd:  __pi_strncmp+0x1a0/0x1c4
>> [ 7445.269626][  T382] ueventd:  selinux_genfs_get_sid+0x114/0x220
>> [ 7445.269636][  T382] ueventd:  inode_doinit_with_dentry+0x3d0/0x598
>> [ 7445.269644][  T382] ueventd:  selinux_d_instantiate+0x1c/0x24
>> [ 7445.269652][  T382] ueventd:  d_splice_alias+0x5c/0x280
>> [ 7445.269662][  T382] ueventd:  kernfs_iop_lookup+0xec/0x21c
>> [ 7445.269674][  T382] ueventd:  __lookup_slow+0xc4/0x150
>> [ 7445.269684][  T382] ueventd:  lookup_slow+0x40/0xf0
>> [ 7445.269690][  T382] ueventd:  walk_component+0x144/0x160
>> [ 7445.269696][  T382] ueventd:  link_path_walk+0x25c/0x344
>> [ 7445.269703][  T382] ueventd:  path_lookupat+0x64/0x120
>> [ 7445.269710][  T382] ueventd:  filename_lookup+0xc4/0x1b0
>> [ 7445.269718][  T382] ueventd:  user_path_at_empty+0x48/0xb4
>> [ 7445.269725][  T382] ueventd:  do_faccessat+0xa8/0x1f0
>> [ 7445.269732][  T382] ueventd:  __arm64_sys_faccessat+0x20/0x28
>> [ 7445.269738][  T382] ueventd:  invoke_syscall+0x3c/0xf0
>> [ 7445.269746][  T382] ueventd:  el0_svc_common+0x84/0xe8
>> [ 7445.269753][  T382] ueventd:  do_el0_svc+0x20/0x84
>> [ 7445.269759][  T382] ueventd:  el0_svc+0x1c/0x48
>> [ 7445.269766][  T382] ueventd:  el0t_64_sync_handler+0x7c/0xd8
>> [ 7445.269773][  T382] ueventd:  el0t_64_sync+0x15c/0x160
>>   
>> We found that we hit this issue when we compare these two strings.
>>   
>> ________________address|_0__1__2__3__4__5__6__7__8__9__A__B__C__D__E__F
>> __   0123456789ABCDEF
>>     NSD:FFFFFF80089EDA00|>2F 64 65 76 69 63 65 73 2F 76 69 72 74 75 61
>> 6C  /devices/virtual
>>     NSD:FFFFFF80089EDA10| 2F 62 6C 6F 63 6B 2F 00 E0 03 01 AA E1 03 02
>> AA  /block/.........
>>   
>> ________________address|_0__1__2__3__4__5__6__7__8__9__A__B__C__D__E__F
>> __    0123456789ABCDEF
>>     NSD:FFFFFF803FD3EFE0| 00 00 00 00 00 00 00 00 00 00 2F 64 65 76 69
>> 63    ........../devic
>>     NSD:FFFFFF803FD3EFF0| 65 73>2F 76 69 72 74 75 61 6C 2F 6D 69 73 63
>> 00  es/virtual/misc.
>>     NSD:FFFFFF803FD3F000| ?? ?? ?? ?? ?? ?? ?? ?? ?? ?? ?? ?? ?? ?? ??
>> ??
>>     NSD:FFFFFF803FD3F0E0| ?? ?? ?? ?? ?? ?? ?? ?? ?? ?? ?? ?? ?? ?? ??
>> ??
>>
>> We observe the second string is put at the tail of the first page and
>> the next page is unreadable.
>> Thus, we made a simple test as below and it can reproduce this issue.

I'm not sure there's strictly a bug here. The C standard says:

"The strncmp function compares not more than n characters (characters 
that follow a null character are not compared) ..."

so although any characters between the first NULL and n must not be 
considered for the result of the comparison, there doesn't seem to be 
any explicit promise anywhere that they can't be *accessed*. AFAICT what 
happens here is in the request to compare at most 23 characters, it ends 
up in the do_misaligned case, loop_misaligned runs twice and finds no 
differences or NULLs in characters 0-7 and 8-15, so then done_loop loads 
characters 15-23 to compare the last 7, and is tripped up by 22-23 not 
actually existing in src2. Possibly the original intent was that this 
case should have ended up in page_end_loop, and the condition for that 
was slightly off, but I'm not sure, and this code is obsolete now anyway.

Thanks,
Robin.

>>
>> static noinline void strncmp_ut(void)
>> {
>>      int ret = 0;
>>      int size = 4096;
>>      char *src1 = vmalloc(size);
>>      char *src2 = vmalloc(size);
>>      char *str1 = "/devices/virtual/block/";
>>      char *str2 = "/devices/virtual/misc";
>>      int len1 = strlen(str1);
>>      int len2 = strlen(str2);
>>      char *str1_start, *str2_start;
>>
>>      pr_info("src1: %px\n", src1);
>>      pr_info("src2: %px\n", src2);
>>      pr_info("len1 :%d, len2: %d\n", len1, len2);
>>
>>      memset(src1, 0, size);
>>      strncpy(&src1[size-len1-1], str1, len1);
>>      memset(src2, 0, size);
>>      strncpy(&src2[size-len2-1], str2, len2);
>>
>>      str1_start = src1 + size - len1 - 1;
>>      pr_info("str1_start: %px", str1_start);
>>      str2_start = src2 + size - len2 - 1;
>>      pr_info("str2_start: %px", str2_start);
>>      ret = strncmp(str1_start, str2_start, len1);
>>      pr_info("ret: %d\n", ret);
>> }
>>
>>
>> Does any issue exist in __pi_strncmp in kernel-5.15?
> 
> There's no known issue, but the implementation change substantially in commit:
> 
>    387d828adffcf1eb ("arm64: lib: Import latest version of Arm Optimized Routines' strncmp")
> 
> ... which added additional alignment restrictions to accesses to avoid crossing
> an MTE tag granule.
> 
> Using your test on v6.5-rc3 I could not trigger the issue, but I could trigger
> the issue atop v5.15:
> 
> [    1.639268] src1: ffff800010005000
> [    1.639982] src2: ffff80001000d000
> [    1.640758] len1 :23, len2: 21
> [    1.641098] str1_start: ffff800010005fe8
> [    1.641202] str2_start: ffff80001000dfea
> [    1.642312] Unable to handle kernel paging request at virtual address ffff80001000e000
> [    1.643237] Mem abort info:
> [    1.643858]   ESR = 0x96000007
> [    1.644201]   EC = 0x25: DABT (current EL), IL = 32 bits
> [    1.644453]   SET = 0, FnV = 0
> [    1.644979]   EA = 0, S1PTW = 0
> [    1.645212]   FSC = 0x07: level 3 translation fault
> [    1.645492] Data abort info:
> [    1.645638]   ISV = 0, ISS = 0x00000007
> [    1.645818]   CM = 0, WnR = 0
> [    1.646075] swapper pgtable: 4k pages, 48-bit VAs, pgdp=000000007a611000
> [    1.646417] [ffff80001000e000] pgd=100000004001d003, p4d=100000004001d003, pud=100000004001e003, pmd=100000004001f003, pte=0000000000000000
> [    1.648699] Internal error: Oops: 96000007 [#1] PREEMPT SMP
> [    1.649681] Modules linked in:
> [    1.650322] CPU: 0 PID: 1 Comm: swapper/0 Not tainted 5.15.0-dirty #1
> [    1.650994] pstate: 80400005 (Nzcv daif +PAN -UAO -TCO -DIT -SSBS BTYPE=--)
> [    1.651475] pc : __pi_strncmp+0x1a0/0x1c4
> [    1.652860] lr : strncmp_ut+0xf4/0x118
> [    1.653133] sp : ffff80001009bdd0
> [    1.653680] x29: ffff80001009bdd0 x28: 0000000000000000 x27: 0000000000000000
> [    1.654134] x26: 0000000000000000 x25: 0000000000000000 x24: 0000000000000000
> [    1.654453] x23: ffffa050aff25000 x22: 0000000000001000 x21: ffff800010005fe8
> [    1.654774] x20: ffff80001000dfea x19: ffff80001000dff7 x18: 0000000000000006
> [    1.655138] x17: 687469726f676c61 x16: 2046454420504d49 x15: ffff80001009b950
> [    1.655462] x14: 0000000000000008 x13: ffffffffffffffff x12: 00000000000002dc
> [    1.656107] x11: 0101010101010101 x10: ffffa050afc9c540 x9 : 7f7f7f7f7f7f7f7f
> [    1.656467] x8 : 6b6074737168752e x7 : ffffa050afc9ae60 x6 : 0000000000000000
> [    1.656769] x5 : 0000000000000000 x4 : 6c6175747269762f x3 : 2f6b636f6c622f6c
> [    1.657083] x2 : 0000000000000007 x1 : ffff80001000dff2 x0 : ffff800010005ff0
> [    1.657576] Call trace:
> [    1.657920]  __pi_strncmp+0x1a0/0x1c4
> [    1.658179]  smp_cpus_done+0xb4/0xc0
> [    1.658363]  smp_init+0x80/0x90
> [    1.658516]  kernel_init_freeable+0x12c/0x290
> [    1.658792]  kernel_init+0x28/0x130
> [    1.658966]  ret_from_fork+0x10/0x20
> [    1.659564] Code: b4fff762 d1002000 d1002021 f8626803 (f8626824)
> [    1.661848] ---[ end trace 5ce515dccfbd3da3 ]---
> [    1.663291] Kernel panic - not syncing: Attempted to kill init! exitcode=0x0000000b
> [    1.663861] SMP: stopping secondary CPUs
> [    1.667296] ---[ end Kernel panic - not syncing: Attempted to kill init! exitcode=0x0000000b ]---
> 
> That __pi_strncmp+0x1a0/0x1c4 is right at the end of __pi_strcmp():
> 
> | L(done_loop):
> |         /* We found a difference or a NULL before the limit was reached.  */
> |         and     limit, limit, #7
> |         cbz     limit, L(not_limit)
> |         /* Read the last word.  */
> |         sub     src1, src1, 8
> |         sub     src2, src2, 8
> |         ldr     data1, [src1, limit]
> |         ldr     data2, [src2, limit]	// <----------- Faulting LDR here
> |         sub     tmp1, data1, zeroones
> |         orr     tmp2, data1, #REP8_7f
> |         eor     diff, data1, data2      /* Non-zero if differences found.  */
> |         bics    has_nul, tmp1, tmp2     /* Non-zero if NUL terminator.  */
> |         ccmp    diff, #0, #0, eq
> |         b.ne    L(not_limit)
> |
> | L(ret0):
> |         mov     result, #0
> |         ret
> 
> Thanks,
> Mark.


^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [BUG kernel-5.15] aarch64: __pi_strncmp() out-of-bound error
  2023-08-10 12:23   ` Robin Murphy
@ 2023-08-10 14:31     ` Will Deacon
  2023-08-10 15:00       ` Robin Murphy
  0 siblings, 1 reply; 7+ messages in thread
From: Will Deacon @ 2023-08-10 14:31 UTC (permalink / raw)
  To: Robin Murphy
  Cc: Mark Rutland, John Hsu (許永翰),
	catalin.marinas, linux-kernel, linux-mediatek,
	Xiaobing Shi (史小兵),
	Chunhui Li (李春辉),
	linux-mm, Kuan-Ying Lee (李冠穎),
	Casper Li (李中榮),
	linux-arm-kernel

On Thu, Aug 10, 2023 at 01:23:28PM +0100, Robin Murphy wrote:
> On 2023-08-07 16:32, Mark Rutland wrote:
> > On Mon, Aug 07, 2023 at 12:31:45PM +0000, John Hsu (許永翰) wrote:
> > > [ 7445.269347][  T382] ueventd: Hardware name: MT6886(ENG) (DT)
> > > [ 7445.269354][  T382] ueventd: Call trace:
> > > [ 7445.269359][  T382] ueventd:  dump_backtrace+0x0/0x2a8
> > > [ 7445.269374][  T382] ueventd:  dump_stack_lvl+0x74/0xa4
> > > [ 7445.269384][  T382] ueventd:  dump_stack+0x14/0x1c
> > > [ 7445.269391][  T382] ueventd:  mrdump_common_die+0x32c/0x5ac [mrdump]
> > > [ 7445.269470][  T382] ueventd:  ipanic_die+0x1c/0x28 [mrdump]
> > > [ 7445.269539][  T382] ueventd:  __die+0xbc/0x308
> > > [ 7445.269548][  T382] ueventd:  die+0xd8/0x500
> > > [ 7445.269556][  T382] ueventd:  die_kernel_fault+0x94/0xa8
> > > [ 7445.269565][  T382] ueventd:  __do_kernel_fault+0x1d8/0x214
> > > [ 7445.269571][  T382] ueventd:  do_bad_area+0x40/0x174
> > > [ 7445.269579][  T382] ueventd:  do_translation_fault+0x48/0x54
> > > [ 7445.269585][  T382] ueventd:  do_mem_abort+0x3c/0x100
> > > [ 7445.269592][  T382] ueventd:  el1_abort+0x38/0x54
> > > [ 7445.269602][  T382] ueventd:  el1h_64_sync_handler+0x54/0x88
> > > [ 7445.269610][  T382] ueventd:  el1h_64_sync+0x78/0x7c
> > > [ 7445.269618][  T382] ueventd:  __pi_strncmp+0x1a0/0x1c4
> > > [ 7445.269626][  T382] ueventd:  selinux_genfs_get_sid+0x114/0x220
> > > [ 7445.269636][  T382] ueventd:  inode_doinit_with_dentry+0x3d0/0x598
> > > [ 7445.269644][  T382] ueventd:  selinux_d_instantiate+0x1c/0x24
> > > [ 7445.269652][  T382] ueventd:  d_splice_alias+0x5c/0x280
> > > [ 7445.269662][  T382] ueventd:  kernfs_iop_lookup+0xec/0x21c
> > > [ 7445.269674][  T382] ueventd:  __lookup_slow+0xc4/0x150
> > > [ 7445.269684][  T382] ueventd:  lookup_slow+0x40/0xf0
> > > [ 7445.269690][  T382] ueventd:  walk_component+0x144/0x160
> > > [ 7445.269696][  T382] ueventd:  link_path_walk+0x25c/0x344
> > > [ 7445.269703][  T382] ueventd:  path_lookupat+0x64/0x120
> > > [ 7445.269710][  T382] ueventd:  filename_lookup+0xc4/0x1b0
> > > [ 7445.269718][  T382] ueventd:  user_path_at_empty+0x48/0xb4
> > > [ 7445.269725][  T382] ueventd:  do_faccessat+0xa8/0x1f0
> > > [ 7445.269732][  T382] ueventd:  __arm64_sys_faccessat+0x20/0x28
> > > [ 7445.269738][  T382] ueventd:  invoke_syscall+0x3c/0xf0
> > > [ 7445.269746][  T382] ueventd:  el0_svc_common+0x84/0xe8
> > > [ 7445.269753][  T382] ueventd:  do_el0_svc+0x20/0x84
> > > [ 7445.269759][  T382] ueventd:  el0_svc+0x1c/0x48
> > > [ 7445.269766][  T382] ueventd:  el0t_64_sync_handler+0x7c/0xd8
> > > [ 7445.269773][  T382] ueventd:  el0t_64_sync+0x15c/0x160
> > > We found that we hit this issue when we compare these two strings.
> > > ________________address|_0__1__2__3__4__5__6__7__8__9__A__B__C__D__E__F
> > > __   0123456789ABCDEF
> > >     NSD:FFFFFF80089EDA00|>2F 64 65 76 69 63 65 73 2F 76 69 72 74 75 61
> > > 6C  /devices/virtual
> > >     NSD:FFFFFF80089EDA10| 2F 62 6C 6F 63 6B 2F 00 E0 03 01 AA E1 03 02
> > > AA  /block/.........
> > > ________________address|_0__1__2__3__4__5__6__7__8__9__A__B__C__D__E__F
> > > __    0123456789ABCDEF
> > >     NSD:FFFFFF803FD3EFE0| 00 00 00 00 00 00 00 00 00 00 2F 64 65 76 69
> > > 63    ........../devic
> > >     NSD:FFFFFF803FD3EFF0| 65 73>2F 76 69 72 74 75 61 6C 2F 6D 69 73 63
> > > 00  es/virtual/misc.
> > >     NSD:FFFFFF803FD3F000| ?? ?? ?? ?? ?? ?? ?? ?? ?? ?? ?? ?? ?? ?? ??
> > > ??
> > >     NSD:FFFFFF803FD3F0E0| ?? ?? ?? ?? ?? ?? ?? ?? ?? ?? ?? ?? ?? ?? ??
> > > ??
> > > 
> > > We observe the second string is put at the tail of the first page and
> > > the next page is unreadable.
> > > Thus, we made a simple test as below and it can reproduce this issue.
> 
> I'm not sure there's strictly a bug here. The C standard says:
> 
> "The strncmp function compares not more than n characters (characters that
> follow a null character are not compared) ..."
> 
> so although any characters between the first NULL and n must not be
> considered for the result of the comparison, there doesn't seem to be any
> explicit promise anywhere that they can't be *accessed*. AFAICT what happens
> here is in the request to compare at most 23 characters, it ends up in the
> do_misaligned case, loop_misaligned runs twice and finds no differences or
> NULLs in characters 0-7 and 8-15, so then done_loop loads characters 15-23
> to compare the last 7, and is tripped up by 22-23 not actually existing in
> src2. Possibly the original intent was that this case should have ended up
> in page_end_loop, and the condition for that was slightly off, but I'm not
> sure, and this code is obsolete now anyway.

The long backtrace above worries me, as it suggests that you can trigger
this from userspace. In that case I think it's a bug regardless of what
the C standard says.

Perhaps we should just fallback to the generic strncmp() implementation
in the stable kernel? I couldn't spot any other architectures doing
anything particularly clever, and x86_64 looks like it doesn't select
__HAVE_ARCH_STRNCMP at all.

Will


^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [BUG kernel-5.15] aarch64: __pi_strncmp() out-of-bound error
  2023-08-10 14:31     ` Will Deacon
@ 2023-08-10 15:00       ` Robin Murphy
  2023-08-10 16:09         ` Will Deacon
  0 siblings, 1 reply; 7+ messages in thread
From: Robin Murphy @ 2023-08-10 15:00 UTC (permalink / raw)
  To: Will Deacon
  Cc: Mark Rutland, John Hsu (許永翰),
	catalin.marinas, linux-kernel, linux-mediatek,
	Xiaobing Shi (史小兵),
	Chunhui Li (李春辉),
	linux-mm, Kuan-Ying Lee (李冠穎),
	Casper Li (李中榮),
	linux-arm-kernel

On 10/08/2023 3:31 pm, Will Deacon wrote:
> On Thu, Aug 10, 2023 at 01:23:28PM +0100, Robin Murphy wrote:
>> On 2023-08-07 16:32, Mark Rutland wrote:
>>> On Mon, Aug 07, 2023 at 12:31:45PM +0000, John Hsu (許永翰) wrote:
>>>> [ 7445.269347][  T382] ueventd: Hardware name: MT6886(ENG) (DT)
>>>> [ 7445.269354][  T382] ueventd: Call trace:
>>>> [ 7445.269359][  T382] ueventd:  dump_backtrace+0x0/0x2a8
>>>> [ 7445.269374][  T382] ueventd:  dump_stack_lvl+0x74/0xa4
>>>> [ 7445.269384][  T382] ueventd:  dump_stack+0x14/0x1c
>>>> [ 7445.269391][  T382] ueventd:  mrdump_common_die+0x32c/0x5ac [mrdump]
>>>> [ 7445.269470][  T382] ueventd:  ipanic_die+0x1c/0x28 [mrdump]
>>>> [ 7445.269539][  T382] ueventd:  __die+0xbc/0x308
>>>> [ 7445.269548][  T382] ueventd:  die+0xd8/0x500
>>>> [ 7445.269556][  T382] ueventd:  die_kernel_fault+0x94/0xa8
>>>> [ 7445.269565][  T382] ueventd:  __do_kernel_fault+0x1d8/0x214
>>>> [ 7445.269571][  T382] ueventd:  do_bad_area+0x40/0x174
>>>> [ 7445.269579][  T382] ueventd:  do_translation_fault+0x48/0x54
>>>> [ 7445.269585][  T382] ueventd:  do_mem_abort+0x3c/0x100
>>>> [ 7445.269592][  T382] ueventd:  el1_abort+0x38/0x54
>>>> [ 7445.269602][  T382] ueventd:  el1h_64_sync_handler+0x54/0x88
>>>> [ 7445.269610][  T382] ueventd:  el1h_64_sync+0x78/0x7c
>>>> [ 7445.269618][  T382] ueventd:  __pi_strncmp+0x1a0/0x1c4
>>>> [ 7445.269626][  T382] ueventd:  selinux_genfs_get_sid+0x114/0x220
>>>> [ 7445.269636][  T382] ueventd:  inode_doinit_with_dentry+0x3d0/0x598
>>>> [ 7445.269644][  T382] ueventd:  selinux_d_instantiate+0x1c/0x24
>>>> [ 7445.269652][  T382] ueventd:  d_splice_alias+0x5c/0x280
>>>> [ 7445.269662][  T382] ueventd:  kernfs_iop_lookup+0xec/0x21c
>>>> [ 7445.269674][  T382] ueventd:  __lookup_slow+0xc4/0x150
>>>> [ 7445.269684][  T382] ueventd:  lookup_slow+0x40/0xf0
>>>> [ 7445.269690][  T382] ueventd:  walk_component+0x144/0x160
>>>> [ 7445.269696][  T382] ueventd:  link_path_walk+0x25c/0x344
>>>> [ 7445.269703][  T382] ueventd:  path_lookupat+0x64/0x120
>>>> [ 7445.269710][  T382] ueventd:  filename_lookup+0xc4/0x1b0
>>>> [ 7445.269718][  T382] ueventd:  user_path_at_empty+0x48/0xb4
>>>> [ 7445.269725][  T382] ueventd:  do_faccessat+0xa8/0x1f0
>>>> [ 7445.269732][  T382] ueventd:  __arm64_sys_faccessat+0x20/0x28
>>>> [ 7445.269738][  T382] ueventd:  invoke_syscall+0x3c/0xf0
>>>> [ 7445.269746][  T382] ueventd:  el0_svc_common+0x84/0xe8
>>>> [ 7445.269753][  T382] ueventd:  do_el0_svc+0x20/0x84
>>>> [ 7445.269759][  T382] ueventd:  el0_svc+0x1c/0x48
>>>> [ 7445.269766][  T382] ueventd:  el0t_64_sync_handler+0x7c/0xd8
>>>> [ 7445.269773][  T382] ueventd:  el0t_64_sync+0x15c/0x160
>>>> We found that we hit this issue when we compare these two strings.
>>>> ________________address|_0__1__2__3__4__5__6__7__8__9__A__B__C__D__E__F
>>>> __   0123456789ABCDEF
>>>>      NSD:FFFFFF80089EDA00|>2F 64 65 76 69 63 65 73 2F 76 69 72 74 75 61
>>>> 6C  /devices/virtual
>>>>      NSD:FFFFFF80089EDA10| 2F 62 6C 6F 63 6B 2F 00 E0 03 01 AA E1 03 02
>>>> AA  /block/.........
>>>> ________________address|_0__1__2__3__4__5__6__7__8__9__A__B__C__D__E__F
>>>> __    0123456789ABCDEF
>>>>      NSD:FFFFFF803FD3EFE0| 00 00 00 00 00 00 00 00 00 00 2F 64 65 76 69
>>>> 63    ........../devic
>>>>      NSD:FFFFFF803FD3EFF0| 65 73>2F 76 69 72 74 75 61 6C 2F 6D 69 73 63
>>>> 00  es/virtual/misc.
>>>>      NSD:FFFFFF803FD3F000| ?? ?? ?? ?? ?? ?? ?? ?? ?? ?? ?? ?? ?? ?? ??
>>>> ??
>>>>      NSD:FFFFFF803FD3F0E0| ?? ?? ?? ?? ?? ?? ?? ?? ?? ?? ?? ?? ?? ?? ??
>>>> ??
>>>>
>>>> We observe the second string is put at the tail of the first page and
>>>> the next page is unreadable.
>>>> Thus, we made a simple test as below and it can reproduce this issue.
>>
>> I'm not sure there's strictly a bug here. The C standard says:
>>
>> "The strncmp function compares not more than n characters (characters that
>> follow a null character are not compared) ..."
>>
>> so although any characters between the first NULL and n must not be
>> considered for the result of the comparison, there doesn't seem to be any
>> explicit promise anywhere that they can't be *accessed*. AFAICT what happens
>> here is in the request to compare at most 23 characters, it ends up in the
>> do_misaligned case, loop_misaligned runs twice and finds no differences or
>> NULLs in characters 0-7 and 8-15, so then done_loop loads characters 15-23
>> to compare the last 7, and is tripped up by 22-23 not actually existing in
>> src2. Possibly the original intent was that this case should have ended up
>> in page_end_loop, and the condition for that was slightly off, but I'm not
>> sure, and this code is obsolete now anyway.
> 
> The long backtrace above worries me, as it suggests that you can trigger
> this from userspace. In that case I think it's a bug regardless of what
> the C standard says.

Bleh, poor choice of words... obviously there is a bug overall, it just 
might arguably be in the caller's expectations rather than the strncmp() 
implementation itself. However I would concur that there's no way we're 
going over all ~3000 strncmp() callsites with the "well, actually" comb 
just for this. It was more to say I don't think it's worth digging much 
deeper into exactly why, and I agree the pragmatic thing to do is either 
rip it out or backport the newer MTE-safe implementation which should be 
more robust.

Cheers,
Robin.


^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [BUG kernel-5.15] aarch64: __pi_strncmp() out-of-bound error
  2023-08-10 15:00       ` Robin Murphy
@ 2023-08-10 16:09         ` Will Deacon
  2023-09-06 18:07           ` Will Deacon
  0 siblings, 1 reply; 7+ messages in thread
From: Will Deacon @ 2023-08-10 16:09 UTC (permalink / raw)
  To: Robin Murphy
  Cc: Mark Rutland, John Hsu (許永翰),
	catalin.marinas, linux-kernel, linux-mediatek,
	Xiaobing Shi (史小兵),
	Chunhui Li (李春辉),
	linux-mm, Kuan-Ying Lee (李冠穎),
	Casper Li (李中榮),
	linux-arm-kernel

On Thu, Aug 10, 2023 at 04:00:00PM +0100, Robin Murphy wrote:
> On 10/08/2023 3:31 pm, Will Deacon wrote:
> > On Thu, Aug 10, 2023 at 01:23:28PM +0100, Robin Murphy wrote:
> > > I'm not sure there's strictly a bug here. The C standard says:
> > > 
> > > "The strncmp function compares not more than n characters (characters that
> > > follow a null character are not compared) ..."
> > > 
> > > so although any characters between the first NULL and n must not be
> > > considered for the result of the comparison, there doesn't seem to be any
> > > explicit promise anywhere that they can't be *accessed*. AFAICT what happens
> > > here is in the request to compare at most 23 characters, it ends up in the
> > > do_misaligned case, loop_misaligned runs twice and finds no differences or
> > > NULLs in characters 0-7 and 8-15, so then done_loop loads characters 15-23
> > > to compare the last 7, and is tripped up by 22-23 not actually existing in
> > > src2. Possibly the original intent was that this case should have ended up
> > > in page_end_loop, and the condition for that was slightly off, but I'm not
> > > sure, and this code is obsolete now anyway.
> > 
> > The long backtrace above worries me, as it suggests that you can trigger
> > this from userspace. In that case I think it's a bug regardless of what
> > the C standard says.
> 
> Bleh, poor choice of words... obviously there is a bug overall, it just
> might arguably be in the caller's expectations rather than the strncmp()
> implementation itself. However I would concur that there's no way we're
> going over all ~3000 strncmp() callsites with the "well, actually" comb just
> for this. It was more to say I don't think it's worth digging much deeper
> into exactly why, and I agree the pragmatic thing to do is either rip it out
> or backport the newer MTE-safe implementation which should be more robust.

Heh, then we agree. I was worried you'd gone mad :)

Will


^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [BUG kernel-5.15] aarch64: __pi_strncmp() out-of-bound error
  2023-08-10 16:09         ` Will Deacon
@ 2023-09-06 18:07           ` Will Deacon
  0 siblings, 0 replies; 7+ messages in thread
From: Will Deacon @ 2023-09-06 18:07 UTC (permalink / raw)
  To: Robin Murphy
  Cc: Mark Rutland, John Hsu (許永翰),
	catalin.marinas, linux-kernel, linux-mediatek,
	Xiaobing Shi (史小兵),
	Chunhui Li (李春辉),
	linux-mm, Kuan-Ying Lee (李冠穎),
	Casper Li (李中榮),
	linux-arm-kernel

On Thu, Aug 10, 2023 at 05:09:07PM +0100, Will Deacon wrote:
> On Thu, Aug 10, 2023 at 04:00:00PM +0100, Robin Murphy wrote:
> > On 10/08/2023 3:31 pm, Will Deacon wrote:
> > > On Thu, Aug 10, 2023 at 01:23:28PM +0100, Robin Murphy wrote:
> > > > I'm not sure there's strictly a bug here. The C standard says:
> > > > 
> > > > "The strncmp function compares not more than n characters (characters that
> > > > follow a null character are not compared) ..."
> > > > 
> > > > so although any characters between the first NULL and n must not be
> > > > considered for the result of the comparison, there doesn't seem to be any
> > > > explicit promise anywhere that they can't be *accessed*. AFAICT what happens
> > > > here is in the request to compare at most 23 characters, it ends up in the
> > > > do_misaligned case, loop_misaligned runs twice and finds no differences or
> > > > NULLs in characters 0-7 and 8-15, so then done_loop loads characters 15-23
> > > > to compare the last 7, and is tripped up by 22-23 not actually existing in
> > > > src2. Possibly the original intent was that this case should have ended up
> > > > in page_end_loop, and the condition for that was slightly off, but I'm not
> > > > sure, and this code is obsolete now anyway.
> > > 
> > > The long backtrace above worries me, as it suggests that you can trigger
> > > this from userspace. In that case I think it's a bug regardless of what
> > > the C standard says.
> > 
> > Bleh, poor choice of words... obviously there is a bug overall, it just
> > might arguably be in the caller's expectations rather than the strncmp()
> > implementation itself. However I would concur that there's no way we're
> > going over all ~3000 strncmp() callsites with the "well, actually" comb just
> > for this. It was more to say I don't think it's worth digging much deeper
> > into exactly why, and I agree the pragmatic thing to do is either rip it out
> > or backport the newer MTE-safe implementation which should be more robust.
> 
> Heh, then we agree. I was worried you'd gone mad :)

In the end I cherry-picked the newer implementation rather than fall back to
the generic implementation:

https://lore.kernel.org/r/20230906180336.4973-1-will@kernel.org

Will


^ permalink raw reply	[flat|nested] 7+ messages in thread

end of thread, other threads:[~2023-09-06 18:07 UTC | newest]

Thread overview: 7+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2023-08-07 12:31 [BUG kernel-5.15] aarch64: __pi_strncmp() out-of-bound error John Hsu (許永翰)
2023-08-07 15:32 ` Mark Rutland
2023-08-10 12:23   ` Robin Murphy
2023-08-10 14:31     ` Will Deacon
2023-08-10 15:00       ` Robin Murphy
2023-08-10 16:09         ` Will Deacon
2023-09-06 18:07           ` Will Deacon

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox