* [linux-next:master] [netfs] a05b682d49: BUG:KASAN:slab-use-after-free_in_copy_from_iter
@ 2024-09-13 7:24 kernel test robot
2024-09-13 7:59 ` David Howells
` (2 more replies)
0 siblings, 3 replies; 15+ messages in thread
From: kernel test robot @ 2024-09-13 7:24 UTC (permalink / raw)
To: David Howells
Cc: oe-lkp, lkp, Linux Memory Management List, Christian Brauner,
Jeff Layton, netfs, linux-fsdevel, oliver.sang
Hello,
kernel test robot noticed "BUG:KASAN:slab-use-after-free_in_copy_from_iter" on:
commit: a05b682d498a81ca12f1dd964f06f3aec48af595 ("netfs: Use new folio_queue data type and iterator instead of xarray iter")
https://git.kernel.org/cgit/linux/kernel/git/next/linux-next.git master
[test failed on linux-next/master 32ffa5373540a8d1c06619f52d019c6cdc948bb4]
in testcase: xfstests
version: xfstests-x86_64-b1465280-1_20240909
with following parameters:
disk: 4HDD
fs: ext4
fs2: smbv2
test: generic-group-07
compiler: gcc-12
test machine: 4 threads Intel(R) Core(TM) i5-6500 CPU @ 3.20GHz (Skylake) with 32G memory
(please refer to attached dmesg/kmsg for entire log/backtrace)
If you fix the issue in a separate patch/commit (i.e. not just a new version of
the same patch/commit), kindly add the following tags
| Reported-by: kernel test robot <oliver.sang@intel.com>
| Closes: https://lore.kernel.org/oe-lkp/202409131438.3f225fbf-oliver.sang@intel.com
[ 364.731854][ T2434] BUG: KASAN: slab-use-after-free in _copy_from_iter (include/linux/iov_iter.h:157 include/linux/iov_iter.h:308 include/linux/iov_iter.h:328 lib/iov_iter.c:249 lib/iov_iter.c:260)
[ 364.739592][ T2434] Read of size 8 at addr ffff8881b2af7d20 by task fstest/2434
[ 364.746901][ T2434]
[ 364.749086][ T2434] CPU: 1 UID: 0 PID: 2434 Comm: fstest Not tainted 6.11.0-rc6-00065-ga05b682d498a #1
[ 364.758405][ T2434] Hardware name: Dell Inc. OptiPlex 7040/0Y7WYT, BIOS 1.8.1 12/05/2017
[ 364.766511][ T2434] Call Trace:
[ 364.769650][ T2434] <TASK>
[ 364.772441][ T2434] dump_stack_lvl (lib/dump_stack.c:122 (discriminator 1))
[ 364.776796][ T2434] print_address_description+0x2c/0x3a0
[ 364.783231][ T2434] ? _copy_from_iter (include/linux/iov_iter.h:157 include/linux/iov_iter.h:308 include/linux/iov_iter.h:328 lib/iov_iter.c:249 lib/iov_iter.c:260)
[ 364.788188][ T2434] print_report (mm/kasan/report.c:489)
[ 364.792453][ T2434] ? kasan_addr_to_slab (mm/kasan/common.c:37)
[ 364.797237][ T2434] ? _copy_from_iter (include/linux/iov_iter.h:157 include/linux/iov_iter.h:308 include/linux/iov_iter.h:328 lib/iov_iter.c:249 lib/iov_iter.c:260)
[ 364.802196][ T2434] kasan_report (mm/kasan/report.c:603)
[ 364.806461][ T2434] ? _copy_from_iter (include/linux/iov_iter.h:157 include/linux/iov_iter.h:308 include/linux/iov_iter.h:328 lib/iov_iter.c:249 lib/iov_iter.c:260)
[ 364.811420][ T2434] _copy_from_iter (include/linux/iov_iter.h:157 include/linux/iov_iter.h:308 include/linux/iov_iter.h:328 lib/iov_iter.c:249 lib/iov_iter.c:260)
[ 364.816205][ T2434] ? __pfx_try_charge_memcg (mm/memcontrol.c:2158)
[ 364.821438][ T2434] ? __pfx__copy_from_iter (lib/iov_iter.c:254)
[ 364.826569][ T2434] ? __mod_memcg_state (mm/memcontrol.c:555 mm/memcontrol.c:669)
[ 364.831529][ T2434] ? check_heap_object (arch/x86/include/asm/bitops.h:206 arch/x86/include/asm/bitops.h:238 include/asm-generic/bitops/instrumented-non-atomic.h:142 include/linux/page-flags.h:827 include/linux/page-flags.h:848 include/linux/mm.h:1126 include/linux/mm.h:2142 mm/usercopy.c:199)
[ 364.836485][ T2434] ? 0xffffffff81000000
[ 364.840490][ T2434] ? __check_object_size (mm/memremap.c:167)
[ 364.846143][ T2434] skb_do_copy_data_nocache (include/linux/uio.h:219 include/linux/uio.h:236 include/net/sock.h:2167)
[ 364.851533][ T2434] ? __pfx_skb_do_copy_data_nocache (include/net/sock.h:2158)
[ 364.857443][ T2434] ? __sk_mem_schedule (net/core/sock.c:3194)
[ 364.862229][ T2434] tcp_sendmsg_locked (include/net/sock.h:2195 net/ipv4/tcp.c:1218)
[ 364.867274][ T2434] ? __pfx_tcp_sendmsg_locked (net/ipv4/tcp.c:1049)
[ 364.872665][ T2434] ? _raw_spin_lock_bh (arch/x86/include/asm/atomic.h:107 include/linux/atomic/atomic-arch-fallback.h:2170 include/linux/atomic/atomic-instrumented.h:1302 include/asm-generic/qspinlock.h:111 include/linux/spinlock.h:187 include/linux/spinlock_api_smp.h:127 kernel/locking/spinlock.c:178)
[ 364.877447][ T2434] ? __pfx__raw_spin_lock_bh (kernel/locking/spinlock.c:177)
[ 364.882751][ T2434] tcp_sendmsg (net/ipv4/tcp.c:1355)
[ 364.886840][ T2434] sock_sendmsg (net/socket.c:730 net/socket.c:745 net/socket.c:768)
[ 364.891192][ T2434] ? __pfx__raw_spin_lock_bh (kernel/locking/spinlock.c:177)
[ 364.896495][ T2434] ? __pfx_sock_sendmsg (net/socket.c:757)
[ 364.901387][ T2434] ? recalc_sigpending (arch/x86/include/asm/bitops.h:75 include/asm-generic/bitops/instrumented-atomic.h:42 include/linux/thread_info.h:94 kernel/signal.c:178 kernel/signal.c:175)
[ 364.906379][ T2434] smb_send_kvec (fs/smb/client/transport.c:215) cifs
[ 364.911543][ T2434] __smb_send_rqst (fs/smb/client/transport.c:361) cifs
[ 364.916848][ T2434] ? __pfx___smb_send_rqst (fs/smb/client/transport.c:274) cifs
[ 364.922668][ T2434] ? __pfx_mempool_alloc_noprof (mm/mempool.c:385)
[ 364.928234][ T2434] ? __asan_memset (mm/kasan/shadow.c:84)
[ 364.932672][ T2434] ? _raw_spin_lock (arch/x86/include/asm/atomic.h:107 include/linux/atomic/atomic-arch-fallback.h:2170 include/linux/atomic/atomic-instrumented.h:1302 include/asm-generic/qspinlock.h:111 include/linux/spinlock.h:187 include/linux/spinlock_api_smp.h:134 kernel/locking/spinlock.c:154)
[ 364.937195][ T2434] ? __pfx__raw_spin_lock (kernel/locking/spinlock.c:153)
[ 364.942239][ T2434] ? smb2_setup_async_request (fs/smb/client/smb2transport.c:903) cifs
[ 364.948496][ T2434] cifs_call_async (fs/smb/client/transport.c:841) cifs
[ 364.953800][ T2434] ? __pfx_cifs_call_async (fs/smb/client/transport.c:787) cifs
[ 364.959623][ T2434] ? _raw_spin_lock (arch/x86/include/asm/atomic.h:107 include/linux/atomic/atomic-arch-fallback.h:2170 include/linux/atomic/atomic-instrumented.h:1302 include/asm-generic/qspinlock.h:111 include/linux/spinlock.h:187 include/linux/spinlock_api_smp.h:134 kernel/locking/spinlock.c:154)
[ 364.964148][ T2434] ? __asan_memset (mm/kasan/shadow.c:84)
[ 364.968586][ T2434] ? __smb2_plain_req_init (arch/x86/include/asm/atomic.h:53 include/linux/atomic/atomic-arch-fallback.h:992 include/linux/atomic/atomic-instrumented.h:436 fs/smb/client/smb2pdu.c:555) cifs
[ 364.974672][ T2434] smb2_async_writev (fs/smb/client/smb2pdu.c:5026) cifs
[ 364.980242][ T2434] ? __pfx_smb2_async_writev (fs/smb/client/smb2pdu.c:4894) cifs
[ 364.986252][ T2434] ? cifs_pick_channel (fs/smb/client/transport.c:1068) cifs
[ 364.991910][ T2434] ? cifs_prepare_write (fs/smb/client/file.c:77) cifs
[ 364.997652][ T2434] ? netfs_advance_write (fs/netfs/write_issue.c:300)
[ 365.002792][ T2434] netfs_advance_write (fs/netfs/write_issue.c:300)
[ 365.007758][ T2434] ? netfs_buffer_append_folio (arch/x86/include/asm/bitops.h:206 (discriminator 3) arch/x86/include/asm/bitops.h:238 (discriminator 3) include/asm-generic/bitops/instrumented-non-atomic.h:142 (discriminator 3) include/linux/page-flags.h:827 (discriminator 3) include/linux/page-flags.h:848 (discriminator 3) include/linux/mm.h:1126 (discriminator 3) include/linux/folio_queue.h:102 (discriminator 3) fs/netfs/misc.c:43 (discriminator 3))
[ 365.013434][ T2434] netfs_write_folio (fs/netfs/write_issue.c:468)
[ 365.018306][ T2434] ? writeback_iter (mm/page-writeback.c:2591)
[ 365.023007][ T2434] netfs_writepages (fs/netfs/write_issue.c:540)
[ 365.027705][ T2434] ? __pfx_netfs_writepages (fs/netfs/write_issue.c:499)
[ 365.032922][ T2434] do_writepages (mm/page-writeback.c:2683)
[ 365.037377][ T2434] ? rcu_segcblist_enqueue (arch/x86/include/asm/atomic64_64.h:25 include/linux/atomic/atomic-arch-fallback.h:2672 include/linux/atomic/atomic-long.h:121 include/linux/atomic/atomic-instrumented.h:3261 kernel/rcu/rcu_segcblist.c:214 kernel/rcu/rcu_segcblist.c:231 kernel/rcu/rcu_segcblist.c:343)
[ 365.042510][ T2434] ? __pfx_do_writepages (mm/page-writeback.c:2673)
[ 365.047466][ T2434] ? __call_rcu_common+0x321/0x9e0
[ 365.053466][ T2434] ? _raw_spin_lock (arch/x86/include/asm/atomic.h:107 include/linux/atomic/atomic-arch-fallback.h:2170 include/linux/atomic/atomic-instrumented.h:1302 include/asm-generic/qspinlock.h:111 include/linux/spinlock.h:187 include/linux/spinlock_api_smp.h:134 kernel/locking/spinlock.c:154)
[ 365.057988][ T2434] ? __pfx__raw_spin_lock (kernel/locking/spinlock.c:153)
[ 365.063030][ T2434] ? wbc_attach_and_unlock_inode (arch/x86/include/asm/jump_label.h:27 include/linux/backing-dev.h:176 fs/fs-writeback.c:737)
[ 365.068766][ T2434] filemap_fdatawrite_wbc (mm/filemap.c:398 mm/filemap.c:387)
[ 365.073983][ T2434] __filemap_fdatawrite_range (mm/filemap.c:422)
[ 365.079385][ T2434] ? __pfx___filemap_fdatawrite_range (mm/filemap.c:422)
[ 365.085489][ T2434] ? _raw_spin_lock (arch/x86/include/asm/atomic.h:107 include/linux/atomic/atomic-arch-fallback.h:2170 include/linux/atomic/atomic-instrumented.h:1302 include/asm-generic/qspinlock.h:111 include/linux/spinlock.h:187 include/linux/spinlock_api_smp.h:134 kernel/locking/spinlock.c:154)
[ 365.090015][ T2434] ? __pfx__raw_spin_lock (kernel/locking/spinlock.c:153)
[ 365.095058][ T2434] filemap_write_and_wait_range (mm/filemap.c:685 mm/filemap.c:676)
[ 365.100621][ T2434] cifs_flush (fs/smb/client/file.c:2763) cifs
[ 365.105493][ T2434] filp_flush (fs/open.c:1526)
[ 365.109586][ T2434] __x64_sys_close (fs/open.c:1566 fs/open.c:1551 fs/open.c:1551)
[ 365.114025][ T2434] do_syscall_64 (arch/x86/entry/common.c:52 arch/x86/entry/common.c:83)
[ 365.118385][ T2434] entry_SYSCALL_64_after_hwframe (arch/x86/entry/entry_64.S:130)
[ 365.124149][ T2434] RIP: 0033:0x7fc02c3878e0
[ 365.128414][ T2434] Code: 0d 00 00 00 eb b2 e8 ff f7 01 00 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 44 00 00 80 3d 01 1d 0e 00 00 74 17 b8 03 00 00 00 0f 05 <48> 3d 00 f0 ff ff 77 48 c3 0f 1f 80 00 00 00 00 48 83 ec 18 89 7c
All code
========
0: 0d 00 00 00 eb or $0xeb000000,%eax
5: b2 e8 mov $0xe8,%dl
7: ff f7 push %rdi
9: 01 00 add %eax,(%rax)
b: 66 2e 0f 1f 84 00 00 nopw %cs:0x0(%rax,%rax,1)
12: 00 00 00
15: 0f 1f 44 00 00 nopl 0x0(%rax,%rax,1)
1a: 80 3d 01 1d 0e 00 00 cmpb $0x0,0xe1d01(%rip) # 0xe1d22
21: 74 17 je 0x3a
23: b8 03 00 00 00 mov $0x3,%eax
28: 0f 05 syscall
2a:* 48 3d 00 f0 ff ff cmp $0xfffffffffffff000,%rax <-- trapping instruction
30: 77 48 ja 0x7a
32: c3 retq
33: 0f 1f 80 00 00 00 00 nopl 0x0(%rax)
3a: 48 83 ec 18 sub $0x18,%rsp
3e: 89 .byte 0x89
3f: 7c .byte 0x7c
Code starting with the faulting instruction
===========================================
0: 48 3d 00 f0 ff ff cmp $0xfffffffffffff000,%rax
6: 77 48 ja 0x50
8: c3 retq
9: 0f 1f 80 00 00 00 00 nopl 0x0(%rax)
10: 48 83 ec 18 sub $0x18,%rsp
14: 89 .byte 0x89
15: 7c .byte 0x7c
[ 365.147838][ T2434] RSP: 002b:00007fffcdbaed28 EFLAGS: 00000202 ORIG_RAX: 0000000000000003
[ 365.156089][ T2434] RAX: ffffffffffffffda RBX: 000055ea7a5142b0 RCX: 00007fc02c3878e0
[ 365.163904][ T2434] RDX: 000000000001dd50 RSI: 0000000000000000 RDI: 0000000000000004
[ 365.171718][ T2434] RBP: 0000000000000004 R08: 0000000000000004 R09: 0000000000000000
[ 365.179535][ T2434] R10: 0000000000000001 R11: 0000000000000202 R12: 000000000000000a
[ 365.187349][ T2434] R13: 0000000000a00000 R14: 0000000000a00000 R15: 0000000000002000
[ 365.195186][ T2434] </TASK>
[ 365.198063][ T2434]
[ 365.200249][ T2434] Allocated by task 2434:
[ 365.204436][ T2434] kasan_save_stack (mm/kasan/common.c:48)
[ 365.208958][ T2434] kasan_save_track (arch/x86/include/asm/current.h:49 mm/kasan/common.c:60 mm/kasan/common.c:69)
[ 365.213492][ T2434] __kasan_kmalloc (mm/kasan/common.c:370 mm/kasan/common.c:387)
[ 365.217927][ T2434] netfs_buffer_append_folio (include/linux/slab.h:681 fs/netfs/misc.c:25)
[ 365.223428][ T2434] netfs_write_folio (fs/netfs/write_issue.c:434)
[ 365.228306][ T2434] netfs_writepages (fs/netfs/write_issue.c:540)
[ 365.233013][ T2434] do_writepages (mm/page-writeback.c:2683)
[ 365.237456][ T2434] filemap_fdatawrite_wbc (mm/filemap.c:398 mm/filemap.c:387)
[ 365.242681][ T2434] __filemap_fdatawrite_range (mm/filemap.c:422)
[ 365.248079][ T2434] filemap_write_and_wait_range (mm/filemap.c:685 mm/filemap.c:676)
[ 365.253643][ T2434] cifs_flush (fs/smb/client/file.c:2763) cifs
[ 365.258510][ T2434] filp_flush (fs/open.c:1526)
[ 365.262604][ T2434] __x64_sys_close (fs/open.c:1566 fs/open.c:1551 fs/open.c:1551)
[ 365.267040][ T2434] do_syscall_64 (arch/x86/entry/common.c:52 arch/x86/entry/common.c:83)
[ 365.271391][ T2434] entry_SYSCALL_64_after_hwframe (arch/x86/entry/entry_64.S:130)
[ 365.277142][ T2434]
[ 365.279326][ T2434] Freed by task 11:
[ 365.282983][ T2434] kasan_save_stack (mm/kasan/common.c:48)
[ 365.287505][ T2434] kasan_save_track (arch/x86/include/asm/current.h:49 mm/kasan/common.c:60 mm/kasan/common.c:69)
[ 365.292028][ T2434] kasan_save_free_info (mm/kasan/generic.c:582)
[ 365.296899][ T2434] poison_slab_object (mm/kasan/common.c:242)
[ 365.301768][ T2434] __kasan_slab_free (mm/kasan/common.c:256)
[ 365.306399][ T2434] kfree (mm/slub.c:4478 mm/slub.c:4598)
[ 365.310057][ T2434] netfs_delete_buffer_head (fs/netfs/misc.c:60)
[ 365.315379][ T2434] netfs_writeback_unlock_folios (fs/netfs/write_collect.c:144)
[ 365.321202][ T2434] netfs_collect_write_results (fs/netfs/write_collect.c:558)
[ 365.326937][ T2434] netfs_write_collection_worker (include/linux/instrumented.h:68 include/asm-generic/bitops/instrumented-non-atomic.h:141 fs/netfs/write_collect.c:648)
[ 365.332759][ T2434] process_one_work (kernel/workqueue.c:3231)
[ 365.337542][ T2434] worker_thread (kernel/workqueue.c:3306 kernel/workqueue.c:3389)
[ 365.341980][ T2434] kthread (kernel/kthread.c:389)
[ 365.345895][ T2434] ret_from_fork (arch/x86/kernel/process.c:147)
[ 365.350158][ T2434] ret_from_fork_asm (arch/x86/entry/entry_64.S:257)
[ 365.354767][ T2434]
[ 365.356949][ T2434] The buggy address belongs to the object at ffff8881b2af7c00
The kernel config and materials to reproduce are available at:
https://download.01.org/0day-ci/archive/20240913/202409131438.3f225fbf-oliver.sang@intel.com
--
0-DAY CI Kernel Test Service
https://github.com/intel/lkp-tests/wiki
* Re: [linux-next:master] [netfs] a05b682d49: BUG:KASAN:slab-use-after-free_in_copy_from_iter
  2024-09-13  7:24 [linux-next:master] [netfs] a05b682d49: BUG:KASAN:slab-use-after-free_in_copy_from_iter kernel test robot
@ 2024-09-13  7:59 ` David Howells
  2024-09-13  8:11   ` Christian Brauner
  2024-09-18 14:03 ` David Howells
  2024-09-24 21:47 ` David Howells
  2 siblings, 1 reply; 15+ messages in thread

From: David Howells @ 2024-09-13 7:59 UTC (permalink / raw)
To: kernel test robot
Cc: dhowells, oe-lkp, lkp, Linux Memory Management List,
    Christian Brauner, Jeff Layton, netfs, linux-fsdevel

Can you try with the attached change?  It'll get folded into Christian's
vfs.netfs branch at some point.

David
---
diff --git a/lib/iov_iter.c b/lib/iov_iter.c
index 84a517a0189d..97003155bfac 100644
--- a/lib/iov_iter.c
+++ b/lib/iov_iter.c
@@ -1026,7 +1026,7 @@ static ssize_t iter_folioq_get_pages(struct iov_iter *iter,
 		iov_offset += part;
 		extracted += part;
 
-		*pages = folio_page(folio, offset % PAGE_SIZE);
+		*pages = folio_page(folio, offset / PAGE_SIZE);
 		get_page(*pages);
 		pages++;
 		maxpages--;
* Re: [linux-next:master] [netfs] a05b682d49: BUG:KASAN:slab-use-after-free_in_copy_from_iter
  2024-09-13  7:59 ` David Howells
@ 2024-09-13  8:11   ` Christian Brauner
  2024-09-18  2:24     ` Oliver Sang
  ` (2 more replies)
  0 siblings, 3 replies; 15+ messages in thread

From: Christian Brauner @ 2024-09-13 8:11 UTC (permalink / raw)
To: David Howells
Cc: kernel test robot, oe-lkp, lkp, Linux Memory Management List,
    Jeff Layton, netfs, linux-fsdevel

On Fri, Sep 13, 2024 at 08:59:19AM GMT, David Howells wrote:
> Can you try with the attached change?  It'll get folded into Christian's
> vfs.netfs branch at some point.

The fix you pasted below is already applied and folded into vfs.netfs.
But what the kernel test robot tested was an old version of that branch.

The commit hash that kernel test robot tested was:

  commit: a05b682d498a81ca12f1dd964f06f3aec48af595 ("netfs: Use new folio_queue data type and iterator instead of xarray iter")
  https://git.kernel.org/cgit/linux/kernel/git/next/linux-next.git master

but in vfs.netfs we have:

  cd0277ed0c188dd40e7744e89299af7b78831ca4 ("netfs: Use new folio_queue data type and iterator instead of xarray iter")

and the diff between the two is:

diff --git a/lib/iov_iter.c b/lib/iov_iter.c
index 84a517a0189d..97003155bfac 100644
--- a/lib/iov_iter.c
+++ b/lib/iov_iter.c
@@ -1026,7 +1026,7 @@ static ssize_t iter_folioq_get_pages(struct iov_iter *iter,
 		iov_offset += part;
 		extracted += part;
 
-		*pages = folio_page(folio, offset % PAGE_SIZE);
+		*pages = folio_page(folio, offset / PAGE_SIZE);
 		get_page(*pages);
 		pages++;
 		maxpages--;

So this is a bug report for an old version of vfs.netfs.

> 
> David
> ---
> diff --git a/lib/iov_iter.c b/lib/iov_iter.c
> index 84a517a0189d..97003155bfac 100644
> --- a/lib/iov_iter.c
> +++ b/lib/iov_iter.c
> @@ -1026,7 +1026,7 @@ static ssize_t iter_folioq_get_pages(struct iov_iter *iter,
> 		iov_offset += part;
> 		extracted += part;
> 
> -		*pages = folio_page(folio, offset % PAGE_SIZE);
> +		*pages = folio_page(folio, offset / PAGE_SIZE);
> 		get_page(*pages);
> 		pages++;
> 		maxpages--;
* Re: [linux-next:master] [netfs] a05b682d49: BUG:KASAN:slab-use-after-free_in_copy_from_iter
  2024-09-13  8:11 ` Christian Brauner
@ 2024-09-18  2:24   ` Oliver Sang
  2024-09-18 10:34 ` David Howells
  2024-09-18 11:27 ` David Howells
  2 siblings, 0 replies; 15+ messages in thread

From: Oliver Sang @ 2024-09-18 2:24 UTC (permalink / raw)
To: Christian Brauner
Cc: David Howells, oe-lkp, lkp, Linux Memory Management List,
    Jeff Layton, netfs, linux-fsdevel, oliver.sang

hi, Christian Brauner, hi, David Howells,

On Fri, Sep 13, 2024 at 10:11:25AM +0200, Christian Brauner wrote:
> On Fri, Sep 13, 2024 at 08:59:19AM GMT, David Howells wrote:
> > Can you try with the attached change?  It'll get folded into Christian's
> > vfs.netfs branch at some point.
> 
> The fix you pasted below is already applied and folded into vfs.netfs.
> But what the kernel test robot tested was an old version of that branch.
> 
> The commit hash that kernel test robot tested was:
> 
>   commit: a05b682d498a81ca12f1dd964f06f3aec48af595 ("netfs: Use new folio_queue data type and iterator instead of xarray iter")
>   https://git.kernel.org/cgit/linux/kernel/git/next/linux-next.git master
> 
> but in vfs.netfs we have:
>   cd0277ed0c188dd40e7744e89299af7b78831ca4 ("netfs: Use new folio_queue data type and iterator instead of xarray iter")

thanks for the information!

however, we noticed there are still similar issues upon cd0277ed0c which is now
in mainline. we reported them in the below link FYI.
https://lore.kernel.org/oe-lkp/202409180928.f20b5a08-oliver.sang@intel.com/

the issue was still reproduced on the mainline and linux-next/master tips when
the bot finished the bisect.

[test failed on linus/master a430d95c5efa2b545d26a094eb5f624e36732af0]
[test failed on linux-next/master 7083504315d64199a329de322fce989e1e10f4f7]

> 
> and the diff between the two is:
> 
> diff --git a/lib/iov_iter.c b/lib/iov_iter.c
> index 84a517a0189d..97003155bfac 100644
> --- a/lib/iov_iter.c
> +++ b/lib/iov_iter.c
> @@ -1026,7 +1026,7 @@ static ssize_t iter_folioq_get_pages(struct iov_iter *iter,
> 		iov_offset += part;
> 		extracted += part;
> 
> -		*pages = folio_page(folio, offset % PAGE_SIZE);
> +		*pages = folio_page(folio, offset / PAGE_SIZE);
> 		get_page(*pages);
> 		pages++;
> 		maxpages--;
> 
> So this is a bug report for an old version of vfs.netfs.
* Re: [linux-next:master] [netfs] a05b682d49: BUG:KASAN:slab-use-after-free_in_copy_from_iter
  2024-09-13  8:11 ` Christian Brauner
  2024-09-18  2:24 ` Oliver Sang
@ 2024-09-18 10:34   ` David Howells
  2024-09-18 11:27 ` David Howells
  2 siblings, 0 replies; 15+ messages in thread

From: David Howells @ 2024-09-18 10:34 UTC (permalink / raw)
To: Oliver Sang
Cc: dhowells, Christian Brauner, oe-lkp, lkp,
    Linux Memory Management List, Jeff Layton, netfs, linux-fsdevel

Does this:

    https://lore.kernel.org/linux-fsdevel/2280667.1726594254@warthog.procyon.org.uk/T/#u

    [PATCH] cifs: Fix reversion of the iter in cifs_readv_receive()

help?

David
* Re: [linux-next:master] [netfs] a05b682d49: BUG:KASAN:slab-use-after-free_in_copy_from_iter
  2024-09-13  8:11 ` Christian Brauner
  2024-09-18  2:24 ` Oliver Sang
  2024-09-18 10:34 ` David Howells
@ 2024-09-18 11:27   ` David Howells
  2024-09-19  2:23     ` Oliver Sang
  2024-09-19  7:14 ` David Howells
  2 siblings, 2 replies; 15+ messages in thread

From: David Howells @ 2024-09-18 11:27 UTC (permalink / raw)
To: Oliver Sang
Cc: dhowells, Christian Brauner, Steve French, oe-lkp, lkp,
    Linux Memory Management List, Jeff Layton, netfs, linux-fsdevel

David Howells <dhowells@redhat.com> wrote:

> Does this:
>
>     https://lore.kernel.org/linux-fsdevel/2280667.1726594254@warthog.procyon.org.uk/T/#u
>
>     [PATCH] cifs: Fix reversion of the iter in cifs_readv_receive()
>
> help?

Actually, it probably won't.  The issue seems to be one I'm already trying to
reproduce that Steve has flagged.

Can you tell me which SMB server you're using?  Samba, ksmbd, Windows, Azure?
I'm guessing one of the first two.

Also, will your reproducer really clobber four arbitrary partitions on sdb?

David
* Re: [linux-next:master] [netfs] a05b682d49: BUG:KASAN:slab-use-after-free_in_copy_from_iter
  2024-09-18 11:27 ` David Howells
@ 2024-09-19  2:23   ` Oliver Sang
  2024-09-19  7:14 ` David Howells
  1 sibling, 0 replies; 15+ messages in thread

From: Oliver Sang @ 2024-09-19 2:23 UTC (permalink / raw)
To: David Howells
Cc: Christian Brauner, Steve French, oe-lkp, lkp,
    Linux Memory Management List, Jeff Layton, netfs, linux-fsdevel,
    oliver.sang

[-- Attachment #1: Type: text/plain, Size: 3053 bytes --]

hi, David,

On Wed, Sep 18, 2024 at 12:27:48PM +0100, David Howells wrote:
> David Howells <dhowells@redhat.com> wrote:
> 
> > Does this:
> >
> >     https://lore.kernel.org/linux-fsdevel/2280667.1726594254@warthog.procyon.org.uk/T/#u
> >
> >     [PATCH] cifs: Fix reversion of the iter in cifs_readv_receive()
> >
> > help?
> 
> Actually, it probably won't.  The issue seems to be one I'm already trying to
> reproduce that Steve has flagged.
> 
> Can you tell me which SMB server you're using?  Samba, ksmbd, Windows, Azure?
> I'm guessing one of the first two.

we actually use local mount to simulate smb. I attached an output for details.
  2024-09-11 23:30:58 mkdir -p /cifs/sda1
  2024-09-11 23:30:58 timeout 5m mount -t cifs -o vers=2.0 -o user=root,password=pass //localhost/fs/sda1 /cifs/sda1
  mount cifs success
  2024-09-11 23:30:58 mkdir -p /cifs/sda2
  2024-09-11 23:30:58 timeout 5m mount -t cifs -o vers=2.0 -o user=root,password=pass //localhost/fs/sda2 /cifs/sda2
  mount cifs success
  2024-09-11 23:30:59 mkdir -p /cifs/sda3
  2024-09-11 23:30:59 timeout 5m mount -t cifs -o vers=2.0 -o user=root,password=pass //localhost/fs/sda3 /cifs/sda3
  mount cifs success
  2024-09-11 23:30:59 mkdir -p /cifs/sda4
  2024-09-11 23:30:59 timeout 5m mount -t cifs -o vers=2.0 -o user=root,password=pass //localhost/fs/sda4 /cifs/sda4
  mount cifs success
  2024-09-11 23:31:00 mount /dev/sda1 /fs/sda1
  2024-09-11 23:31:01 mkdir -p /smbv2//cifs/sda1
  2024-09-11 23:31:01 export FSTYP=cifs
  2024-09-11 23:31:01 export TEST_DEV=//localhost/fs/sda1
  2024-09-11 23:31:01 export TEST_DIR=/smbv2//cifs/sda1
  2024-09-11 23:31:01 export CIFS_MOUNT_OPTIONS=-ousername=root,password=pass,noperm,vers=2.0,mfsymlinks,actimeo=0
  2024-09-11 23:31:01 sed "s:^:generic/:" //lkp/benchmarks/xfstests/tests/generic-group-07
  2024-09-11 23:31:01 ./check -E tests/cifs/exclude.incompatible-smb2.txt -E tests/cifs/exclude.very-slow.txt generic/071 generic/072 generic/074 generic/075 generic/076 generic/078 generic/079

> 
> Also, will your reproducer really clobber four arbitrary partitions on sdb?

yeah, we setup dedicated hdd for tests on each test machine, e.g.
for the lkp-skl-d05 used in the test, it has:

  nr_hdd_partitions: 4
  hdd_partitions: /dev/disk/by-id/wwn-0x5000c50091e544de-part*

then in this 4HDD-ext4-smbv2-generic-group-07 test, also as in attached output

  2024-09-11 23:26:17 wipefs -a --force /dev/sda1
  /dev/sda1: 2 bytes were erased at offset 0x00000438 (ext4): 53 ef
  2024-09-11 23:26:17 wipefs -a --force /dev/sda2
  2024-09-11 23:26:17 wipefs -a --force /dev/sda3
  2024-09-11 23:26:17 wipefs -a --force /dev/sda4
  2024-09-11 23:26:17 mkfs -t ext4 -q -E lazy_itable_init=0,lazy_journal_init=0 -F /dev/sda1
  2024-09-11 23:26:17 mkfs -t ext4 -q -E lazy_itable_init=0,lazy_journal_init=0 -F /dev/sda3
  2024-09-11 23:26:17 mkfs -t ext4 -q -E lazy_itable_init=0,lazy_journal_init=0 -F /dev/sda2
  2024-09-11 23:26:17 mkfs -t ext4 -q -E lazy_itable_init=0,lazy_journal_init=0 -F /dev/sda4

I also attached 074.full. KASAN issue occurs while this 074 test in
generic-group-07.

> 
> David
> 

[-- Attachment #2: output --]
[-- Type: text/plain, Size: 5379 bytes --]

==> /tmp/stdout <==

==> /tmp/stderr <==

==> /tmp/stdout <==
RESULT_ROOT=/result/xfstests/4HDD-ext4-smbv2-generic-group-07/lkp-skl-d05/debian-12-x86_64-20240206.cgz/x86_64-rhel-8.3-func/gcc-12/a05b682d498a81ca12f1dd964f06f3aec48af595/0
job=/lkp/jobs/scheduled/lkp-skl-d05/xfstests-4HDD-ext4-smbv2-generic-group-07-debian-12-x86_64-20240206.cgz-a05b682d498a-20240912-365474-1kx9t2n-0.yaml
result_service: raw_upload, RESULT_MNT: /internal-lkp-server/result, RESULT_ROOT: /internal-lkp-server/result/xfstests/4HDD-ext4-smbv2-generic-group-07/lkp-skl-d05/debian-12-x86_64-20240206.cgz/x86_64-rhel-8.3-func/gcc-12/a05b682d498a81ca12f1dd964f06f3aec48af595/0, TMP_RESULT_ROOT: /tmp/lkp/result
run-job /lkp/jobs/scheduled/lkp-skl-d05/xfstests-4HDD-ext4-smbv2-generic-group-07-debian-12-x86_64-20240206.cgz-a05b682d498a-20240912-365474-1kx9t2n-0.yaml
/usr/bin/wget -q --timeout=3600 --tries=1 --local-encoding=UTF-8
http://internal-lkp-server:80/~lkp/cgi-bin/lkp-jobfile-append-var?job_file=/lkp/jobs/scheduled/lkp-skl-d05/xfstests-4HDD-ext4-smbv2-generic-group-07-debian-12-x86_64-20240206.cgz-a05b682d498a-20240912-365474-1kx9t2n-0.yaml&job_state=running -O /dev/null
target ucode: 0xf0
LKP: stdout: 1226: current_version: f0, target_version: f0
2024-09-11 23:26:16 dmsetup remove_all
2024-09-11 23:26:17 wipefs -a --force /dev/sda1
/dev/sda1: 2 bytes were erased at offset 0x00000438 (ext4): 53 ef
2024-09-11 23:26:17 wipefs -a --force /dev/sda2
2024-09-11 23:26:17 wipefs -a --force /dev/sda3
2024-09-11 23:26:17 wipefs -a --force /dev/sda4
2024-09-11 23:26:17 mkfs -t ext4 -q -E lazy_itable_init=0,lazy_journal_init=0 -F /dev/sda1
2024-09-11 23:26:17 mkfs -t ext4 -q -E lazy_itable_init=0,lazy_journal_init=0 -F /dev/sda3
2024-09-11 23:26:17 mkfs -t ext4 -q -E lazy_itable_init=0,lazy_journal_init=0 -F /dev/sda2
2024-09-11 23:26:17 mkfs -t ext4 -q -E lazy_itable_init=0,lazy_journal_init=0 -F /dev/sda4
2024-09-11 23:30:56 mkdir -p /fs/sda1
ext4
2024-09-11 23:30:56 mount -t ext4 /dev/sda1 /fs/sda1
2024-09-11 23:30:56 mkdir -p /fs/sda2
ext4
2024-09-11 23:30:56 mount -t ext4 /dev/sda2 /fs/sda2
2024-09-11 23:30:57 mkdir -p /fs/sda3
ext4
2024-09-11 23:30:57 mount -t ext4 /dev/sda3 /fs/sda3
2024-09-11 23:30:57 mkdir -p /fs/sda4
ext4
2024-09-11 23:30:57 mount -t ext4 /dev/sda4 /fs/sda4
Added user root.
2024-09-11 23:30:58 mkdir -p /cifs/sda1
2024-09-11 23:30:58 timeout 5m mount -t cifs -o vers=2.0 -o user=root,password=pass //localhost/fs/sda1 /cifs/sda1
mount cifs success
2024-09-11 23:30:58 mkdir -p /cifs/sda2
2024-09-11 23:30:58 timeout 5m mount -t cifs -o vers=2.0 -o user=root,password=pass //localhost/fs/sda2 /cifs/sda2
mount cifs success
2024-09-11 23:30:59 mkdir -p /cifs/sda3
2024-09-11 23:30:59 timeout 5m mount -t cifs -o vers=2.0 -o user=root,password=pass //localhost/fs/sda3 /cifs/sda3
mount cifs success
2024-09-11 23:30:59 mkdir -p /cifs/sda4
2024-09-11 23:30:59 timeout 5m mount -t cifs -o vers=2.0 -o user=root,password=pass //localhost/fs/sda4 /cifs/sda4
mount cifs success
check_nr_cpu
CPU(s):                4
On-line CPU(s) list:   0-3
Model name:            Intel(R) Core(TM) i5-6500 CPU @ 3.20GHz
BIOS Model name:       Intel(R) Core(TM) i5-6500 CPU @ 3.20GHz CPU @ 3.2GHz
Thread(s) per core:    1
Core(s) per socket:    4
Socket(s):             1
CPU(s) scaling MHz:    94%
NUMA node(s):          1
NUMA node0 CPU(s):     0-3

==> /tmp/stderr <==
512+0 records in
512+0 records out
262144 bytes (262 kB, 256 KiB) copied, 0.013281 s, 19.7 MB/s
512+0 records in
512+0 records out
262144 bytes (262 kB, 256 KiB) copied, 0.090282 s, 2.9 MB/s
512+0 records in
512+0 records out
262144 bytes (262 kB, 256 KiB) copied, 0.0451926 s, 5.8 MB/s

==> /tmp/stdout <==
2024-09-11 23:31:00 mount /dev/sda1 /fs/sda1
2024-09-11 23:31:01 mkdir -p /smbv2//cifs/sda1
2024-09-11 23:31:01 export FSTYP=cifs
2024-09-11 23:31:01 export TEST_DEV=//localhost/fs/sda1
2024-09-11 23:31:01 export TEST_DIR=/smbv2//cifs/sda1
2024-09-11 23:31:01 export CIFS_MOUNT_OPTIONS=-ousername=root,password=pass,noperm,vers=2.0,mfsymlinks,actimeo=0
2024-09-11 23:31:01 sed "s:^:generic/:" //lkp/benchmarks/xfstests/tests/generic-group-07
2024-09-11 23:31:01 ./check -E tests/cifs/exclude.incompatible-smb2.txt -E tests/cifs/exclude.very-slow.txt generic/071 generic/072 generic/074 generic/075 generic/076 generic/078 generic/079
IPMI BMC is not supported on this machine, skip
bmc-watchdog setup!
FSTYP         -- cifs
PLATFORM      -- Linux/x86_64 lkp-skl-d05 6.11.0-rc6-00065-ga05b682d498a #1 SMP PREEMPT_DYNAMIC Thu Sep 12 06:26:04 CST 2024

generic/071       [not run] this test requires a valid $SCRATCH_DEV
generic/072       [not run] xfs_io fcollapse failed (old kernel/wrong fs?)
generic/074       _check_dmesg: something found in dmesg (see /lkp/benchmarks/xfstests/results//generic/074.dmesg)
generic/075       95s
generic/076       [not run] this test requires a valid $SCRATCH_DEV
generic/078       [not run] kernel doesn't support renameat2 syscall
generic/079       [not run] file system doesn't support chattr +ia
Ran: generic/071 generic/072 generic/074 generic/075 generic/076 generic/078 generic/079
Not run: generic/071 generic/072 generic/076 generic/078 generic/079
Failures: generic/074
Failed 1 of 7 tests

[-- Attachment #3: 074.full --]
[-- Type: text/plain, Size: 4329 bytes --]

Params are for Linux SMP
Params: n = 3 l = 10 f = 5
num_children=1 file_size=1048576 num_files=1 loop_count=10 block_size=1024 mmap=0 sync=0 prealloc=0
Total data size 1.0 Mbyte
Child 0 loop 0
Child 0 loop 1
Child 0 loop 2
Child 0 loop 3
Child 0 loop 4
Child 0 loop 5
Child 0 loop 6
Child 0 loop 7
Child 0 loop 8
Child 0 loop 9
Child 0 cleaning up /smbv2/cifs/sda1/fstest.0/child0
num_children=1 file_size=1048576 num_files=1 loop_count=10 block_size=1024 mmap=0 sync=0 prealloc=0
Total data size 1.0 Mbyte
num_children=1 file_size=10485760 num_files=1 loop_count=10 block_size=8192 mmap=1 sync=0 prealloc=0
Total data size 10.5 Mbyte
Child 0 loop 0
Child 0 loop 1
Child 0 loop 2
Child 0 loop 3
Child 0 loop 4
Child 0 loop 5
Child 0 loop 6
Child 0 loop 7
Child 0 loop 8
Child 0 loop 9
Child 0 cleaning up /smbv2/cifs/sda1/fstest.1/child0
num_children=1 file_size=10485760 num_files=1 loop_count=10 block_size=8192 mmap=1 sync=0 prealloc=0
Total data size 10.5 Mbyte
num_children=3 file_size=31457280 num_files=5 loop_count=10 block_size=512 mmap=0 sync=0 prealloc=0
Total data size 471.9 Mbyte
Child 0 loop 0
Child 0 loop 1
Child 0 loop 2 Child 0 loop 3 Child 0 loop 4 Child 0 loop 5 Child 0 loop 6 Child 0 loop 7 Child 0 loop 8 Child 0 loop 9 Child 0 cleaning up /smbv2/cifs/sda1/fstest.2/child0 num_children=3 file_size=31457280 num_files=5 loop_count=10 block_size=512 mmap=0 sync=0 prealloc=0 Total data size 471.9 Mbyte Child 1 loop 0 Child 1 loop 1 Child 1 loop 2 Child 1 loop 3 Child 1 loop 4 Child 1 loop 5 Child 1 loop 6 Child 1 loop 7 Child 1 loop 8 Child 1 loop 9 Child 1 cleaning up /smbv2/cifs/sda1/fstest.2/child1 num_children=3 file_size=31457280 num_files=5 loop_count=10 block_size=512 mmap=0 sync=0 prealloc=0 Total data size 471.9 Mbyte Child 2 loop 0 Child 2 loop 1 Child 2 loop 2 Child 2 loop 3 Child 2 loop 4 Child 2 loop 5 Child 2 loop 6 Child 2 loop 7 Child 2 loop 8 Child 2 loop 9 Child 2 cleaning up /smbv2/cifs/sda1/fstest.2/child2 num_children=3 file_size=31457280 num_files=5 loop_count=10 block_size=512 mmap=0 sync=0 prealloc=0 Total data size 471.9 Mbyte num_children=3 file_size=31457280 num_files=5 loop_count=10 block_size=512 mmap=1 sync=0 prealloc=0 Total data size 471.9 Mbyte Child 0 loop 0 Child 0 loop 1 Child 0 loop 2 Child 0 loop 3 Child 0 loop 4 Child 0 loop 5 Child 0 loop 6 Child 0 loop 7 Child 0 loop 8 Child 0 loop 9 Child 0 cleaning up /smbv2/cifs/sda1/fstest.3/child0 num_children=3 file_size=31457280 num_files=5 loop_count=10 block_size=512 mmap=1 sync=0 prealloc=0 Total data size 471.9 Mbyte Child 2 loop 0 Child 2 loop 1 Child 2 loop 2 Child 2 loop 3 Child 2 loop 4 Child 2 loop 5 Child 2 loop 6 Child 2 loop 7 Child 2 loop 8 Child 2 loop 9 Child 2 cleaning up /smbv2/cifs/sda1/fstest.3/child2 num_children=3 file_size=31457280 num_files=5 loop_count=10 block_size=512 mmap=1 sync=0 prealloc=0 Total data size 471.9 Mbyte Child 1 loop 0 Child 1 loop 1 Child 1 loop 2 Child 1 loop 3 Child 1 loop 4 Child 1 loop 5 Child 1 loop 6 Child 1 loop 7 Child 1 loop 8 Child 1 loop 9 Child 1 cleaning up /smbv2/cifs/sda1/fstest.3/child1 num_children=3 file_size=31457280 
num_files=5 loop_count=10 block_size=512 mmap=1 sync=0 prealloc=0 Total data size 471.9 Mbyte num_children=3 file_size=10485760 num_files=5 loop_count=10 block_size=512 mmap=1 sync=1 prealloc=0 Total data size 157.3 Mbyte Child 2 loop 0 Child 2 loop 1 Child 2 loop 2 Child 2 loop 3 Child 2 loop 4 Child 2 loop 5 Child 2 loop 6 Child 2 loop 7 Child 2 loop 8 Child 2 loop 9 Child 2 cleaning up /smbv2/cifs/sda1/fstest.4/child2 num_children=3 file_size=10485760 num_files=5 loop_count=10 block_size=512 mmap=1 sync=1 prealloc=0 Total data size 157.3 Mbyte Child 0 loop 0 Child 0 loop 1 Child 0 loop 2 Child 0 loop 3 Child 0 loop 4 Child 0 loop 5 Child 0 loop 6 Child 0 loop 7 Child 0 loop 8 Child 0 loop 9 Child 0 cleaning up /smbv2/cifs/sda1/fstest.4/child0 num_children=3 file_size=10485760 num_files=5 loop_count=10 block_size=512 mmap=1 sync=1 prealloc=0 Total data size 157.3 Mbyte Child 1 loop 0 Child 1 loop 1 Child 1 loop 2 Child 1 loop 3 Child 1 loop 4 Child 1 loop 5 Child 1 loop 6 Child 1 loop 7 Child 1 loop 8 Child 1 loop 9 Child 1 cleaning up /smbv2/cifs/sda1/fstest.4/child1 num_children=3 file_size=10485760 num_files=5 loop_count=10 block_size=512 mmap=1 sync=1 prealloc=0 Total data size 157.3 Mbyte ^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [linux-next:master] [netfs] a05b682d49: BUG:KASAN:slab-use-after-free_in_copy_from_iter
  2024-09-18 11:27 ` David Howells
  2024-09-19  2:23   ` Oliver Sang
@ 2024-09-19  7:14 ` David Howells
  2024-09-20  6:36   ` Oliver Sang
  2024-09-20  7:55   ` David Howells
  1 sibling, 2 replies; 15+ messages in thread
From: David Howells @ 2024-09-19 7:14 UTC (permalink / raw)
To: Oliver Sang
Cc: dhowells, Christian Brauner, Steve French, oe-lkp, lkp,
	Linux Memory Management List, Jeff Layton, netfs, linux-fsdevel

Oliver Sang <oliver.sang@intel.com> wrote:

> > Can you tell me SMB server you're using?  Samba, ksmbd, Windows, Azure?  I'm
> > guessing one of the first two.
>
> we actually use local mount to simulate smb. I attached an output for details.
>
> 2024-09-11 23:30:58 mkdir -p /cifs/sda1
> 2024-09-11 23:30:58 timeout 5m mount -t cifs -o vers=2.0 -o user=root,password=pass //localhost/fs/sda1 /cifs/sda1
> mount cifs success

Does your mount command run up samba or something?  This doesn't seem to work
on my system.  I get:

andromeda32# mount -t cifs -o vers=2.0 -o user=root,password=pass //localhost/fs/sda6 /mnt
mount error(111): could not connect to ::1
mount error(111): could not connect to 127.0.0.1
Unable to find suitable address.

David

^ permalink raw reply	[flat|nested] 15+ messages in thread
* Re: [linux-next:master] [netfs] a05b682d49: BUG:KASAN:slab-use-after-free_in_copy_from_iter
  2024-09-19  7:14 ` David Howells
@ 2024-09-20  6:36   ` Oliver Sang
  2024-09-20  7:55   ` David Howells
  1 sibling, 0 replies; 15+ messages in thread
From: Oliver Sang @ 2024-09-20 6:36 UTC (permalink / raw)
To: David Howells
Cc: Christian Brauner, Steve French, oe-lkp, lkp,
	Linux Memory Management List, Jeff Layton, netfs, linux-fsdevel,
	oliver.sang

hi, David,

On Thu, Sep 19, 2024 at 08:14:50AM +0100, David Howells wrote:
> Oliver Sang <oliver.sang@intel.com> wrote:
>
> > > Can you tell me SMB server you're using?  Samba, ksmbd, Windows, Azure?  I'm
> > > guessing one of the first two.
> >
> > we actually use local mount to simulate smb. I attached an output for details.
> >
> > 2024-09-11 23:30:58 mkdir -p /cifs/sda1
> > 2024-09-11 23:30:58 timeout 5m mount -t cifs -o vers=2.0 -o user=root,password=pass //localhost/fs/sda1 /cifs/sda1
> > mount cifs success
>
> Does your mount command run up samba or something?  This doesn't seem to work
> on my system.  I get:
>
> andromeda32# mount -t cifs -o vers=2.0 -o user=root,password=pass //localhost/fs/sda6 /mnt
> mount error(111): could not connect to ::1
> mount error(111): could not connect to 127.0.0.1
> Unable to find suitable address.

have you enable the samba with a /fs path? such like:

start_smbd()
{
	# setup smb.conf
	cat >> /etc/samba/smb.conf <<EOF
[fs]
	path = /fs
	comment = lkp cifs
	browseable = yes
	read only = no
EOF

	# setup passwd
	(echo "pass"; echo "pass") | smbpasswd -s -a $(whoami)

	# restart service
	systemctl restart smb.service
}

(https://github.com/intel/lkp-tests/blob/48db85cbe0f249d075bc7eef263b485f02cb153d/lib/fs_ext.sh#L93C1-L107C2)

>
> David
>

^ permalink raw reply	[flat|nested] 15+ messages in thread
* Re: [linux-next:master] [netfs] a05b682d49: BUG:KASAN:slab-use-after-free_in_copy_from_iter
  2024-09-19  7:14 ` David Howells
  2024-09-20  6:36   ` Oliver Sang
@ 2024-09-20  7:55   ` David Howells
  1 sibling, 0 replies; 15+ messages in thread
From: David Howells @ 2024-09-20 7:55 UTC (permalink / raw)
To: Oliver Sang
Cc: dhowells, Christian Brauner, Steve French, oe-lkp, lkp,
	Linux Memory Management List, Jeff Layton, netfs, linux-fsdevel

Oliver Sang <oliver.sang@intel.com> wrote:

> have you enable the samba with a /fs path? such like:

Ahhh!  That's what you're doing!

> > > > Can you tell me SMB server you're using?  Samba, ksmbd, Windows,
> > > > Azure?  I'm guessing one of the first two.
> > >
> > > we actually use local mount to simulate smb. I attached an output for
> > > details.

I was misled by your answer here.

Okay, thanks.  It doesn't help, unfortunately.  I think my test machine isn't
oomphy enough to trigger the race, but I know someone else at RH who can
reproduce it, so I can work through them.

David

^ permalink raw reply	[flat|nested] 15+ messages in thread
* Re: [linux-next:master] [netfs] a05b682d49: BUG:KASAN:slab-use-after-free_in_copy_from_iter
  2024-09-13  7:24 [linux-next:master] [netfs] a05b682d49: BUG:KASAN:slab-use-after-free_in_copy_from_iter kernel test robot
  2024-09-13  7:59 ` David Howells
@ 2024-09-18 14:03 ` David Howells
  2024-09-19  2:50   ` Oliver Sang
  2024-09-24 21:47 ` David Howells
  2 siblings, 1 reply; 15+ messages in thread
From: David Howells @ 2024-09-18 14:03 UTC (permalink / raw)
To: kernel test robot
Cc: dhowells, oe-lkp, lkp, Linux Memory Management List,
	Christian Brauner, Jeff Layton, netfs, linux-fsdevel

Hi Oliver,

The reproducer script doesn't manage to build (I'm using Fedora 39):

+ /usr/lib/rpm/check-rpaths
*******************************************************************************
*
* WARNING: 'check-rpaths' detected a broken RPATH OR RUNPATH and will cause
*          'rpmbuild' to fail. To ignore these errors, you can set the
*          '$QA_RPATHS' environment variable which is a bitmask allowing the
*          values below. The current value of QA_RPATHS is 0x0000.
*
*   0x0001 ... standard RPATHs (e.g. /usr/lib); such RPATHs are a minor
*              issue but are introducing redundant searchpaths without
*              providing a benefit. They can also cause errors in multilib
*              environments.
*   0x0002 ... invalid RPATHs; these are RPATHs which are neither absolute
*              nor relative filenames and can therefore be a SECURITY risk
*   0x0004 ... insecure RPATHs; these are relative RPATHs which are a
*              SECURITY risk
*   0x0008 ... the special '$ORIGIN' RPATHs are appearing after other
*              RPATHs; this is just a minor issue but usually unwanted
*   0x0010 ... the RPATH is empty; there is no reason for such RPATHs
*              and they cause unneeded work while loading libraries
*   0x0020 ... an RPATH references '..' of an absolute path; this will break
*              the functionality when the path before '..' is a symlink
*
*
* Examples:
*   - to ignore standard and empty RPATHs, execute 'rpmbuild' like
*     $ QA_RPATHS=$(( 0x0001|0x0010 )) rpmbuild my-package.src.rpm
*   - to check existing files, set $RPM_BUILD_ROOT and execute check-rpaths like
*     $ RPM_BUILD_ROOT=<top-dir> /usr/lib/rpm/check-rpaths
*
*******************************************************************************
ERROR   0002: file '/usr/local/sbin/fsck.f2fs' contains an invalid runpath '/usr/local/lib' in [/usr/local/lib]
ERROR   0002: file '/usr/local/sbin/mkfs.f2fs' contains an invalid runpath '/usr/local/lib' in [/usr/local/lib]
ERROR   0002: file '/usr/local/lib/libf2fs_format.so.9.0.0' contains an invalid runpath '/usr/local/lib' in [/usr/local/lib]
error: Bad exit status from /var/tmp/rpm-tmp.ASUBws (%install)

RPM build warnings:
    source_date_epoch_from_changelog set but %changelog is missing

RPM build errors:
    Bad exit status from /var/tmp/rpm-tmp.ASUBws (%install)
error: open of /mnt2/lkp-tests/programs/xfstests/pkg/rpm_build/RPMS/xfstests-LKP.rpm failed: No such file or directory
==> WARNING: Failed to install built package(s).

^ permalink raw reply	[flat|nested] 15+ messages in thread
* Re: [linux-next:master] [netfs] a05b682d49: BUG:KASAN:slab-use-after-free_in_copy_from_iter
  2024-09-18 14:03 ` David Howells
@ 2024-09-19  2:50   ` Oliver Sang
  0 siblings, 0 replies; 15+ messages in thread
From: Oliver Sang @ 2024-09-19 2:50 UTC (permalink / raw)
To: David Howells
Cc: oe-lkp, lkp, Linux Memory Management List, Christian Brauner,
	Jeff Layton, netfs, linux-fsdevel, oliver.sang

hi, David,

sorry that we only have initial support for fedora
https://github.com/intel/lkp-tests?tab=readme-ov-file#supported-distributions

we will look this issue. however, due to resource constraint, we may not be
able to supply quick support.

btw, for this case, it really need 4 hdd partitions, so need refer to
https://github.com/intel/lkp-tests?tab=readme-ov-file#run-your-own-disk-partitions

On Wed, Sep 18, 2024 at 03:03:38PM +0100, David Howells wrote:
> Hi Oliver,
>
> The reproducer script doesn't manage to build (I'm using Fedora 39):
>
> + /usr/lib/rpm/check-rpaths
> [check-rpaths QA_RPATHS warning banner snipped; quoted in full upthread]
> ERROR   0002: file '/usr/local/sbin/fsck.f2fs' contains an invalid runpath '/usr/local/lib' in [/usr/local/lib]
> ERROR   0002: file '/usr/local/sbin/mkfs.f2fs' contains an invalid runpath '/usr/local/lib' in [/usr/local/lib]
> ERROR   0002: file '/usr/local/lib/libf2fs_format.so.9.0.0' contains an invalid runpath '/usr/local/lib' in [/usr/local/lib]
> error: Bad exit status from /var/tmp/rpm-tmp.ASUBws (%install)
>
> RPM build warnings:
>     source_date_epoch_from_changelog set but %changelog is missing
>
> RPM build errors:
>     Bad exit status from /var/tmp/rpm-tmp.ASUBws (%install)
> error: open of /mnt2/lkp-tests/programs/xfstests/pkg/rpm_build/RPMS/xfstests-LKP.rpm failed: No such file or directory
> ==> WARNING: Failed to install built package(s).

^ permalink raw reply	[flat|nested] 15+ messages in thread
* Re: [linux-next:master] [netfs] a05b682d49: BUG:KASAN:slab-use-after-free_in_copy_from_iter
  2024-09-13  7:24 [linux-next:master] [netfs] a05b682d49: BUG:KASAN:slab-use-after-free_in_copy_from_iter kernel test robot
  2024-09-13  7:59 ` David Howells
  2024-09-18 14:03 ` David Howells
@ 2024-09-24 21:47 ` David Howells
  2024-09-24 23:19   ` Steve French
  2024-09-26  2:20   ` Oliver Sang
  2 siblings, 2 replies; 15+ messages in thread
From: David Howells @ 2024-09-24 21:47 UTC (permalink / raw)
To: kernel test robot
Cc: dhowells, oe-lkp, lkp, Linux Memory Management List,
	Christian Brauner, Jeff Layton, netfs, linux-fsdevel

Does the attached fix the problem?

David
---
netfs: Fix write oops in generic/346 (9p) and maybe generic/074 (cifs)

In netfslib, a buffered writeback operation has a 'write queue' of folios
that are being written, held in a linear sequence of folio_queue structs.
The 'issuer' adds new folio_queues on the leading edge of the queue and
populates each one progressively; the 'collector' pops them off the
trailing edge and discards them and the folios they point to as they are
consumed.

The queue is required to always retain at least one folio_queue structure.
This allows the queue to be accessed without locking and with just a bit of
barriering.

When a new subrequest is prepared, its ->io_iter iterator is pointed at the
current end of the write queue and then the iterator is extended as more
data is added to the queue until the subrequest is committed.

Now, the problem is that the folio_queue at the leading edge of the write
queue when a subrequest is prepared might have been entirely consumed - but
not yet removed from the queue as it is the only remaining one and is
preventing the queue from collapsing.

So, what happens is that subreq->io_iter is pointed at the spent
folio_queue, then a new folio_queue is added, and, at that point, the
collector is entirely at liberty to immediately delete the spent
folio_queue.
This leaves the subreq->io_iter pointing at a freed object.  If the system
is lucky, iterate_folioq() sees ->io_iter, sees the as-yet uncorrupted
freed object and advances to the next folio_queue in the queue.

In the case seen, however, the freed object gets recycled and put back onto
the queue at the tail and filled to the end.  This confuses
iterate_folioq() and it tries to step ->next, which may be NULL - resulting
in an oops.

Fix this by the following means:

 (1) When preparing a write subrequest, make sure there's a folio_queue
     struct with space in it at the leading edge of the queue.  A function
     to make space is split out of the function to append a folio so that
     it can be called for this purpose.

 (2) If the request struct iterator is pointing to a completely spent
     folio_queue when we make space, then advance the iterator to the newly
     allocated folio_queue.  The subrequest's iterator will then be set
     from this.

Whilst we're at it, also split out the function to allocate a folio_queue,
initialise it and do the accounting.

The oops could be triggered using the generic/346 xfstest with a filesystem
on 9P over TCP with cache=loose.  The oops looked something like:

    BUG: kernel NULL pointer dereference, address: 0000000000000008
    #PF: supervisor read access in kernel mode
    #PF: error_code(0x0000) - not-present page
    ...
    RIP: 0010:_copy_from_iter+0x2db/0x530
    ...
    Call Trace:
     <TASK>
     ...
     p9pdu_vwritef+0x3d8/0x5d0
     p9_client_prepare_req+0xa8/0x140
     p9_client_rpc+0x81/0x280
     p9_client_write+0xcf/0x1c0
     v9fs_issue_write+0x87/0xc0
     netfs_advance_write+0xa0/0xb0
     netfs_write_folio.isra.0+0x42d/0x500
     netfs_writepages+0x15a/0x1f0
     do_writepages+0xd1/0x220
     filemap_fdatawrite_wbc+0x5c/0x80
     v9fs_mmap_vm_close+0x7d/0xb0
     remove_vma+0x35/0x70
     vms_complete_munmap_vmas+0x11a/0x170
     do_vmi_align_munmap+0x17d/0x1c0
     do_vmi_munmap+0x13e/0x150
     __vm_munmap+0x92/0xd0
     __x64_sys_munmap+0x17/0x20
     do_syscall_64+0x80/0xe0
     entry_SYSCALL_64_after_hwframe+0x71/0x79

This may also fix a similar-looking issue with cifs and generic/074.

| Reported-by: kernel test robot <oliver.sang@intel.com>
| Closes: https://lore.kernel.org/oe-lkp/202409180928.f20b5a08-oliver.sang@intel.com

Signed-off-by: David Howells <dhowells@redhat.com>
cc: Eric Van Hensbergen <ericvh@kernel.org>
cc: Latchesar Ionkov <lucho@ionkov.net>
cc: Dominique Martinet <asmadeus@codewreck.org>
cc: Christian Schoenebeck <linux_oss@crudebyte.com>
cc: Steve French <sfrench@samba.org>
cc: Paulo Alcantara <pc@manguebit.com>
cc: Jeff Layton <jlayton@kernel.org>
cc: v9fs@lists.linux.dev
cc: linux-cifs@vger.kernel.org
cc: netfs@lists.linux.dev
cc: linux-fsdevel@vger.kernel.org
---
 fs/netfs/internal.h    |  2 +
 fs/netfs/misc.c        | 72 ++++++++++++++++++++++++++++++++++---------------
 fs/netfs/objects.c     | 12 ++++++++
 fs/netfs/write_issue.c | 12 +++++++-
 4 files changed, 76 insertions(+), 22 deletions(-)

diff --git a/fs/netfs/internal.h b/fs/netfs/internal.h
index c7f23dd3556a..79c0ad89affb 100644
--- a/fs/netfs/internal.h
+++ b/fs/netfs/internal.h
@@ -58,6 +58,7 @@ static inline void netfs_proc_del_rreq(struct netfs_io_request *rreq) {}
 /*
  * misc.c
  */
+struct folio_queue *netfs_buffer_make_space(struct netfs_io_request *rreq);
 int netfs_buffer_append_folio(struct netfs_io_request *rreq, struct folio *folio,
			      bool needs_put);
 struct folio_queue *netfs_delete_buffer_head(struct netfs_io_request *wreq);
@@ -76,6 +77,7 @@ void netfs_clear_subrequests(struct netfs_io_request *rreq, bool was_async);
 void netfs_put_request(struct netfs_io_request *rreq, bool was_async,
		       enum netfs_rreq_ref_trace what);
 struct netfs_io_subrequest *netfs_alloc_subrequest(struct netfs_io_request *rreq);
+struct folio_queue *netfs_folioq_alloc(struct netfs_io_request *rreq, gfp_t gfp);
 
 static inline void netfs_see_request(struct netfs_io_request *rreq,
				     enum netfs_rreq_ref_trace what)
diff --git a/fs/netfs/misc.c b/fs/netfs/misc.c
index 0ad0982ce0e2..a743e8963247 100644
--- a/fs/netfs/misc.c
+++ b/fs/netfs/misc.c
@@ -9,34 +9,64 @@
 #include "internal.h"
 
 /*
- * Append a folio to the rolling queue.
+ * Make sure there's space in the rolling queue.
  */
-int netfs_buffer_append_folio(struct netfs_io_request *rreq, struct folio *folio,
-			      bool needs_put)
+struct folio_queue *netfs_buffer_make_space(struct netfs_io_request *rreq)
 {
-	struct folio_queue *tail = rreq->buffer_tail;
-	unsigned int slot, order = folio_order(folio);
+	struct folio_queue *tail = rreq->buffer_tail, *prev;
+	unsigned int prev_nr_slots = 0;
 
	if (WARN_ON_ONCE(!rreq->buffer && tail) ||
	    WARN_ON_ONCE(rreq->buffer && !tail))
-		return -EIO;
-
-	if (!tail || folioq_full(tail)) {
-		tail = kmalloc(sizeof(*tail), GFP_NOFS);
-		if (!tail)
-			return -ENOMEM;
-		netfs_stat(&netfs_n_folioq);
-		folioq_init(tail);
-		tail->prev = rreq->buffer_tail;
-		if (tail->prev)
-			tail->prev->next = tail;
-		rreq->buffer_tail = tail;
-		if (!rreq->buffer) {
-			rreq->buffer = tail;
-			iov_iter_folio_queue(&rreq->io_iter, ITER_SOURCE, tail, 0, 0, 0);
+		return ERR_PTR(-EIO);
+
+	prev = tail;
+	if (prev) {
+		if (!folioq_full(tail))
+			return tail;
+		prev_nr_slots = folioq_nr_slots(tail);
+	}
+
+	tail = netfs_folioq_alloc(rreq, GFP_NOFS);
+	if (!tail)
+		return ERR_PTR(-ENOMEM);
+	tail->prev = prev;
+	if (prev)
+		/* [!] NOTE: After we set prev->next, the consumer is entirely
+		 * at liberty to delete prev.
+		 */
+		WRITE_ONCE(prev->next, tail);
+
+	rreq->buffer_tail = tail;
+	if (!rreq->buffer) {
+		rreq->buffer = tail;
+		iov_iter_folio_queue(&rreq->io_iter, ITER_SOURCE, tail, 0, 0, 0);
+	} else {
+		/* Make sure we don't leave the master iterator pointing to a
+		 * block that might get immediately consumed.
+		 */
+		if (rreq->io_iter.folioq == prev &&
+		    rreq->io_iter.folioq_slot == prev_nr_slots) {
+			rreq->io_iter.folioq = tail;
+			rreq->io_iter.folioq_slot = 0;
		}
-		rreq->buffer_tail_slot = 0;
	}
+	rreq->buffer_tail_slot = 0;
+	return tail;
+}
+
+/*
+ * Append a folio to the rolling queue.
+ */
+int netfs_buffer_append_folio(struct netfs_io_request *rreq, struct folio *folio,
+			      bool needs_put)
+{
+	struct folio_queue *tail;
+	unsigned int slot, order = folio_order(folio);
+
+	tail = netfs_buffer_make_space(rreq);
+	if (IS_ERR(tail))
+		return PTR_ERR(tail);
 
	rreq->io_iter.count += PAGE_SIZE << order;
 
diff --git a/fs/netfs/objects.c b/fs/netfs/objects.c
index d32964e8ca5d..dd8241bc996b 100644
--- a/fs/netfs/objects.c
+++ b/fs/netfs/objects.c
@@ -250,3 +250,15 @@ void netfs_put_subrequest(struct netfs_io_subrequest *subreq, bool was_async,
	if (dead)
		netfs_free_subrequest(subreq, was_async);
 }
+
+struct folio_queue *netfs_folioq_alloc(struct netfs_io_request *rreq, gfp_t gfp)
+{
+	struct folio_queue *fq;
+
+	fq = kmalloc(sizeof(*fq), gfp);
+	if (fq) {
+		netfs_stat(&netfs_n_folioq);
+		folioq_init(fq);
+	}
+	return fq;
+}
diff --git a/fs/netfs/write_issue.c b/fs/netfs/write_issue.c
index 04e66d587f77..0929d9fd4ce7 100644
--- a/fs/netfs/write_issue.c
+++ b/fs/netfs/write_issue.c
@@ -153,12 +153,22 @@ static void netfs_prepare_write(struct netfs_io_request *wreq,
				loff_t start)
 {
	struct netfs_io_subrequest *subreq;
+	struct iov_iter *wreq_iter = &wreq->io_iter;
+
+	/* Make sure we don't point the iterator at a used-up folio_queue
+	 * struct being used as a placeholder to prevent the queue from
+	 * collapsing.  In such a case, extend the queue.
+	 */
+	if (iov_iter_is_folioq(wreq_iter) &&
+	    wreq_iter->folioq_slot >= folioq_nr_slots(wreq_iter->folioq)) {
+		netfs_buffer_make_space(wreq);
+	}
 
	subreq = netfs_alloc_subrequest(wreq);
	subreq->source	= stream->source;
	subreq->start	= start;
	subreq->stream_nr = stream->stream_nr;
-	subreq->io_iter	= wreq->io_iter;
+	subreq->io_iter	= *wreq_iter;
 
	_enter("R=%x[%x]", wreq->debug_id, subreq->debug_index);

^ permalink raw reply	[flat|nested] 15+ messages in thread
* Re: [linux-next:master] [netfs] a05b682d49: BUG:KASAN:slab-use-after-free_in_copy_from_iter 2024-09-24 21:47 ` David Howells @ 2024-09-24 23:19 ` Steve French 2024-09-26 2:20 ` Oliver Sang 1 sibling, 0 replies; 15+ messages in thread From: Steve French @ 2024-09-24 23:19 UTC (permalink / raw) To: David Howells Cc: kernel test robot, oe-lkp, lkp, Linux Memory Management List, Christian Brauner, Jeff Layton, netfs, linux-fsdevel Yes - I can confirm that this fixes the recent netfs regression in generic/075 http://smb311-linux-testing.southcentralus.cloudapp.azure.com/#/builders/3/builds/239 You can add: Tested-by: Steve French <stfrench@microsoft.com> On Tue, Sep 24, 2024 at 4:47 PM David Howells <dhowells@redhat.com> wrote: > > Does the attached fix the problem? > > David > --- > netfs: Fix write oops in generic/346 (9p) and maybe generic/074 (cifs) > > In netfslib, a buffered writeback operation has a 'write queue' of folios > that are being written, held in a linear sequence of folio_queue structs. > The 'issuer' adds new folio_queues on the leading edge of the queue and > populates each one progressively; the 'collector' pops them off the > trailing edge and discards them and the folios they point to as they are > consumed. > > The queue is required to always retain at least one folio_queue structure. > This allows the queue to be accessed without locking and with just a bit of > barriering. > > When a new subrequest is prepared, its ->io_iter iterator is pointed at the > current end of the write queue and then the iterator is extended as more > data is added to the queue until the subrequest is committed. > > Now, the problem is that the folio_queue at the leading edge of the write > queue when a subrequest is prepared might have been entirely consumed - but > not yet removed from the queue as it is the only remaining one and is > preventing the queue from collapsing. 
> > So, what happens is that subreq->io_iter is pointed at the spent > folio_queue, then a new folio_queue is added, and, at that point, the > collector is at entirely at liberty to immediately delete the spent > folio_queue. > > This leaves the subreq->io_iter pointing at a freed object. If the system > is lucky, iterate_folioq() sees ->io_iter, sees the as-yet uncorrupted > freed object and advances to the next folio_queue in the queue. > > In the case seen, however, the freed object gets recycled and put back onto > the queue at the tail and filled to the end. This confuses > iterate_folioq() and it tries to step ->next, which may be NULL - resulting > in an oops. > > Fix this by the following means: > > (1) When preparing a write subrequest, make sure there's a folio_queue > struct with space in it at the leading edge of the queue. A function > to make space is split out of the function to append a folio so that > it can be called for this purpose. > > (2) If the request struct iterator is pointing to a completely spent > folio_queue when we make space, then advance the iterator to the newly > allocated folio_queue. The subrequest's iterator will then be set > from this. > > Whilst we're at it, also split out the function to allocate a folio_queue, > initialise it and do the accounting. > > The oops could be triggered using the generic/346 xfstest with a filesystem > on9P over TCP with cache=loose. The oops looked something like: > > BUG: kernel NULL pointer dereference, address: 0000000000000008 > #PF: supervisor read access in kernel mode > #PF: error_code(0x0000) - not-present page > ... > RIP: 0010:_copy_from_iter+0x2db/0x530 > ... > Call Trace: > <TASK> > ... 
> p9pdu_vwritef+0x3d8/0x5d0 > p9_client_prepare_req+0xa8/0x140 > p9_client_rpc+0x81/0x280 > p9_client_write+0xcf/0x1c0 > v9fs_issue_write+0x87/0xc0 > netfs_advance_write+0xa0/0xb0 > netfs_write_folio.isra.0+0x42d/0x500 > netfs_writepages+0x15a/0x1f0 > do_writepages+0xd1/0x220 > filemap_fdatawrite_wbc+0x5c/0x80 > v9fs_mmap_vm_close+0x7d/0xb0 > remove_vma+0x35/0x70 > vms_complete_munmap_vmas+0x11a/0x170 > do_vmi_align_munmap+0x17d/0x1c0 > do_vmi_munmap+0x13e/0x150 > __vm_munmap+0x92/0xd0 > __x64_sys_munmap+0x17/0x20 > do_syscall_64+0x80/0xe0 > entry_SYSCALL_64_after_hwframe+0x71/0x79 > > This may also fix a similar-looking issue with cifs and generic/074. > > | Reported-by: kernel test robot <oliver.sang@intel.com> > | Closes: https://lore.kernel.org/oe-lkp/202409180928.f20b5a08-oliver.sang@intel.com > > Signed-off-by: David Howells <dhowells@redhat.com> > cc: Eric Van Hensbergen <ericvh@kernel.org> > cc: Latchesar Ionkov <lucho@ionkov.net> > cc: Dominique Martinet <asmadeus@codewreck.org> > cc: Christian Schoenebeck <linux_oss@crudebyte.com> > cc: Steve French <sfrench@samba.org> > cc: Paulo Alcantara <pc@manguebit.com> > cc: Jeff Layton <jlayton@kernel.org> > cc: v9fs@lists.linux.dev > cc: linux-cifs@vger.kernel.org > cc: netfs@lists.linux.dev > cc: linux-fsdevel@vger.kernel.org > --- > fs/netfs/internal.h | 2 + > fs/netfs/misc.c | 72 ++++++++++++++++++++++++++++++++++--------------- > fs/netfs/objects.c | 12 ++++++++ > fs/netfs/write_issue.c | 12 +++++++- > 4 files changed, 76 insertions(+), 22 deletions(-) > > diff --git a/fs/netfs/internal.h b/fs/netfs/internal.h > index c7f23dd3556a..79c0ad89affb 100644 > --- a/fs/netfs/internal.h > +++ b/fs/netfs/internal.h > @@ -58,6 +58,7 @@ static inline void netfs_proc_del_rreq(struct netfs_io_request *rreq) {} > /* > * misc.c > */ > +struct folio_queue *netfs_buffer_make_space(struct netfs_io_request *rreq); > int netfs_buffer_append_folio(struct netfs_io_request *rreq, struct folio *folio, > bool needs_put); > struct 
folio_queue *netfs_delete_buffer_head(struct netfs_io_request *wreq); > @@ -76,6 +77,7 @@ void netfs_clear_subrequests(struct netfs_io_request *rreq, bool was_async); > void netfs_put_request(struct netfs_io_request *rreq, bool was_async, > enum netfs_rreq_ref_trace what); > struct netfs_io_subrequest *netfs_alloc_subrequest(struct netfs_io_request *rreq); > +struct folio_queue *netfs_folioq_alloc(struct netfs_io_request *rreq, gfp_t gfp); > > static inline void netfs_see_request(struct netfs_io_request *rreq, > enum netfs_rreq_ref_trace what) > diff --git a/fs/netfs/misc.c b/fs/netfs/misc.c > index 0ad0982ce0e2..a743e8963247 100644 > --- a/fs/netfs/misc.c > +++ b/fs/netfs/misc.c > @@ -9,34 +9,64 @@ > #include "internal.h" > > /* > - * Append a folio to the rolling queue. > + * Make sure there's space in the rolling queue. > */ > -int netfs_buffer_append_folio(struct netfs_io_request *rreq, struct folio *folio, > - bool needs_put) > +struct folio_queue *netfs_buffer_make_space(struct netfs_io_request *rreq) > { > - struct folio_queue *tail = rreq->buffer_tail; > - unsigned int slot, order = folio_order(folio); > + struct folio_queue *tail = rreq->buffer_tail, *prev; > + unsigned int prev_nr_slots = 0; > > if (WARN_ON_ONCE(!rreq->buffer && tail) || > WARN_ON_ONCE(rreq->buffer && !tail)) > - return -EIO; > - > - if (!tail || folioq_full(tail)) { > - tail = kmalloc(sizeof(*tail), GFP_NOFS); > - if (!tail) > - return -ENOMEM; > - netfs_stat(&netfs_n_folioq); > - folioq_init(tail); > - tail->prev = rreq->buffer_tail; > - if (tail->prev) > - tail->prev->next = tail; > - rreq->buffer_tail = tail; > - if (!rreq->buffer) { > - rreq->buffer = tail; > - iov_iter_folio_queue(&rreq->io_iter, ITER_SOURCE, tail, 0, 0, 0); > + return ERR_PTR(-EIO); > + > + prev = tail; > + if (prev) { > + if (!folioq_full(tail)) > + return tail; > + prev_nr_slots = folioq_nr_slots(tail); > + } > + > + tail = netfs_folioq_alloc(rreq, GFP_NOFS); > + if (!tail) > + return ERR_PTR(-ENOMEM); > + 
tail->prev = prev;
> +	if (prev)
> +		/* [!] NOTE: After we set prev->next, the consumer is entirely
> +		 * at liberty to delete prev.
> +		 */
> +		WRITE_ONCE(prev->next, tail);
> +
> +	rreq->buffer_tail = tail;
> +	if (!rreq->buffer) {
> +		rreq->buffer = tail;
> +		iov_iter_folio_queue(&rreq->io_iter, ITER_SOURCE, tail, 0, 0, 0);
> +	} else {
> +		/* Make sure we don't leave the master iterator pointing to a
> +		 * block that might get immediately consumed.
> +		 */
> +		if (rreq->io_iter.folioq == prev &&
> +		    rreq->io_iter.folioq_slot == prev_nr_slots) {
> +			rreq->io_iter.folioq = tail;
> +			rreq->io_iter.folioq_slot = 0;
> 		}
> -		rreq->buffer_tail_slot = 0;
> 	}
> +	rreq->buffer_tail_slot = 0;
> +	return tail;
> +}
> +
> +/*
> + * Append a folio to the rolling queue.
> + */
> +int netfs_buffer_append_folio(struct netfs_io_request *rreq, struct folio *folio,
> +			      bool needs_put)
> +{
> +	struct folio_queue *tail;
> +	unsigned int slot, order = folio_order(folio);
> +
> +	tail = netfs_buffer_make_space(rreq);
> +	if (IS_ERR(tail))
> +		return PTR_ERR(tail);
>
> 	rreq->io_iter.count += PAGE_SIZE << order;
>
> diff --git a/fs/netfs/objects.c b/fs/netfs/objects.c
> index d32964e8ca5d..dd8241bc996b 100644
> --- a/fs/netfs/objects.c
> +++ b/fs/netfs/objects.c
> @@ -250,3 +250,15 @@ void netfs_put_subrequest(struct netfs_io_subrequest *subreq, bool was_async,
> 	if (dead)
> 		netfs_free_subrequest(subreq, was_async);
> }
> +
> +struct folio_queue *netfs_folioq_alloc(struct netfs_io_request *rreq, gfp_t gfp)
> +{
> +	struct folio_queue *fq;
> +
> +	fq = kmalloc(sizeof(*fq), gfp);
> +	if (fq) {
> +		netfs_stat(&netfs_n_folioq);
> +		folioq_init(fq);
> +	}
> +	return fq;
> +}
> diff --git a/fs/netfs/write_issue.c b/fs/netfs/write_issue.c
> index 04e66d587f77..0929d9fd4ce7 100644
> --- a/fs/netfs/write_issue.c
> +++ b/fs/netfs/write_issue.c
> @@ -153,12 +153,22 @@ static void netfs_prepare_write(struct netfs_io_request *wreq,
> 				loff_t start)
> {
> 	struct netfs_io_subrequest *subreq;
> +	struct iov_iter *wreq_iter = &wreq->io_iter;
> +
> +	/* Make sure we don't point the iterator at a used-up folio_queue
> +	 * struct being used as a placeholder to prevent the queue from
> +	 * collapsing. In such a case, extend the queue.
> +	 */
> +	if (iov_iter_is_folioq(wreq_iter) &&
> +	    wreq_iter->folioq_slot >= folioq_nr_slots(wreq_iter->folioq)) {
> +		netfs_buffer_make_space(wreq);
> +	}
>
> 	subreq = netfs_alloc_subrequest(wreq);
> 	subreq->source = stream->source;
> 	subreq->start = start;
> 	subreq->stream_nr = stream->stream_nr;
> -	subreq->io_iter = wreq->io_iter;
> +	subreq->io_iter = *wreq_iter;
>
> 	_enter("R=%x[%x]", wreq->debug_id, subreq->debug_index);
>

-- 
Thanks,

Steve

^ permalink raw reply	[flat|nested] 15+ messages in thread
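The interplay the patch sets up between netfs_buffer_make_space() and the request iterator can be modelled in plain userspace C. The sketch below is illustrative only: struct names, NR_SLOTS and the toy_* helpers are invented stand-ins for the netfs types, not kernel API, and the single-threaded model omits the WRITE_ONCE() publication barrier the real code needs.

```c
#include <assert.h>
#include <stdlib.h>

#define NR_SLOTS 4   /* stand-in for folioq_nr_slots() */

/* Toy stand-in for struct folio_queue: one fixed-size segment of a list. */
struct fq_seg {
	struct fq_seg *next;
	struct fq_seg *prev;
	int nr_used;                 /* slots filled by the issuer */
};

/* Toy stand-in for the request: buffer head/tail plus a cursor
 * modelling rreq->io_iter (segment + slot). */
struct toy_rreq {
	struct fq_seg *buffer;       /* trailing edge (consumer side) */
	struct fq_seg *buffer_tail;  /* leading edge (issuer side) */
	struct fq_seg *iter_seg;     /* io_iter position */
	int iter_slot;
};

/* Mirror of the patch's make-space logic: return a tail segment with
 * room in it, allocating a new one when the current tail is full, and
 * move the cursor off a fully-spent old tail so it can never reference
 * a segment the consumer is now free to delete. */
static struct fq_seg *toy_make_space(struct toy_rreq *r)
{
	struct fq_seg *prev = r->buffer_tail, *tail;
	int prev_nr_slots = 0;

	if (prev) {
		if (prev->nr_used < NR_SLOTS)
			return prev;          /* still has space */
		prev_nr_slots = NR_SLOTS;
	}

	tail = calloc(1, sizeof(*tail));
	if (!tail)
		return NULL;
	tail->prev = prev;
	if (prev)
		prev->next = tail;    /* the kernel publishes with WRITE_ONCE() */

	r->buffer_tail = tail;
	if (!r->buffer) {
		r->buffer = tail;
		r->iter_seg = tail;
		r->iter_slot = 0;
	} else if (r->iter_seg == prev && r->iter_slot == prev_nr_slots) {
		/* Cursor sat at the end of the spent segment: advance it. */
		r->iter_seg = tail;
		r->iter_slot = 0;
	}
	return tail;
}
```

Filling the tail and calling toy_make_space() again shows the cursor being bumped onto the fresh segment, which is exactly the step the buggy code skipped before taking a snapshot for subreq->io_iter.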
* Re: [linux-next:master] [netfs] a05b682d49: BUG:KASAN:slab-use-after-free_in_copy_from_iter
  2024-09-24 21:47   ` David Howells
  2024-09-24 23:19     ` Steve French
@ 2024-09-26  2:20   ` Oliver Sang
  1 sibling, 0 replies; 15+ messages in thread
From: Oliver Sang @ 2024-09-26  2:20 UTC (permalink / raw)
  To: David Howells
  Cc: oe-lkp, lkp, Linux Memory Management List, Christian Brauner,
	Jeff Layton, netfs, linux-fsdevel, oliver.sang

hi, David,

On Tue, Sep 24, 2024 at 10:47:30PM +0100, David Howells wrote:
> Does the attached fix the problem?

yes, as I've replied in
https://lore.kernel.org/all/ZvTD2t5s8lwQyphN@xsang-OptiPlex-9020/

thanks!

>
> David
> ---
> netfs: Fix write oops in generic/346 (9p) and maybe generic/074 (cifs)
>
> In netfslib, a buffered writeback operation has a 'write queue' of folios
> that are being written, held in a linear sequence of folio_queue structs.
> The 'issuer' adds new folio_queues on the leading edge of the queue and
> populates each one progressively; the 'collector' pops them off the
> trailing edge and discards them and the folios they point to as they are
> consumed.
>
> The queue is required to always retain at least one folio_queue
> structure.  This allows the queue to be accessed without locking and
> with just a bit of barriering.
>
> When a new subrequest is prepared, its ->io_iter iterator is pointed at
> the current end of the write queue and then the iterator is extended as
> more data is added to the queue until the subrequest is committed.
>
> Now, the problem is that the folio_queue at the leading edge of the
> write queue when a subrequest is prepared might have been entirely
> consumed - but not yet removed from the queue as it is the only
> remaining one and is preventing the queue from collapsing.
>
> So, what happens is that subreq->io_iter is pointed at the spent
> folio_queue, then a new folio_queue is added, and, at that point, the
> collector is entirely at liberty to immediately delete the spent
> folio_queue.
>
> This leaves the subreq->io_iter pointing at a freed object.  If the
> system is lucky, iterate_folioq() sees ->io_iter, sees the as-yet
> uncorrupted freed object and advances to the next folio_queue in the
> queue.
>
> In the case seen, however, the freed object gets recycled and put back
> onto the queue at the tail and filled to the end.  This confuses
> iterate_folioq() and it tries to step ->next, which may be NULL -
> resulting in an oops.
>
> Fix this by the following means:
>
>  (1) When preparing a write subrequest, make sure there's a folio_queue
>      struct with space in it at the leading edge of the queue.  A
>      function to make space is split out of the function to append a
>      folio so that it can be called for this purpose.
>
>  (2) If the request struct iterator is pointing to a completely spent
>      folio_queue when we make space, then advance the iterator to the
>      newly allocated folio_queue.  The subrequest's iterator will then
>      be set from this.
>
> Whilst we're at it, also split out the function to allocate a
> folio_queue, initialise it and do the accounting.
>
> The oops could be triggered using the generic/346 xfstest with a
> filesystem on 9P over TCP with cache=loose.  The oops looked something
> like:
>
>  BUG: kernel NULL pointer dereference, address: 0000000000000008
>  #PF: supervisor read access in kernel mode
>  #PF: error_code(0x0000) - not-present page
>  ...
>  RIP: 0010:_copy_from_iter+0x2db/0x530
>  ...
>  Call Trace:
>   <TASK>
>  ...
>   p9pdu_vwritef+0x3d8/0x5d0
>   p9_client_prepare_req+0xa8/0x140
>   p9_client_rpc+0x81/0x280
>   p9_client_write+0xcf/0x1c0
>   v9fs_issue_write+0x87/0xc0
>   netfs_advance_write+0xa0/0xb0
>   netfs_write_folio.isra.0+0x42d/0x500
>   netfs_writepages+0x15a/0x1f0
>   do_writepages+0xd1/0x220
>   filemap_fdatawrite_wbc+0x5c/0x80
>   v9fs_mmap_vm_close+0x7d/0xb0
>   remove_vma+0x35/0x70
>   vms_complete_munmap_vmas+0x11a/0x170
>   do_vmi_align_munmap+0x17d/0x1c0
>   do_vmi_munmap+0x13e/0x150
>   __vm_munmap+0x92/0xd0
>   __x64_sys_munmap+0x17/0x20
>   do_syscall_64+0x80/0xe0
>   entry_SYSCALL_64_after_hwframe+0x71/0x79
>
> This may also fix a similar-looking issue with cifs and generic/074.
>
> | Reported-by: kernel test robot <oliver.sang@intel.com>
> | Closes: https://lore.kernel.org/oe-lkp/202409180928.f20b5a08-oliver.sang@intel.com
>
> Signed-off-by: David Howells <dhowells@redhat.com>
> cc: Eric Van Hensbergen <ericvh@kernel.org>
> cc: Latchesar Ionkov <lucho@ionkov.net>
> cc: Dominique Martinet <asmadeus@codewreck.org>
> cc: Christian Schoenebeck <linux_oss@crudebyte.com>
> cc: Steve French <sfrench@samba.org>
> cc: Paulo Alcantara <pc@manguebit.com>
> cc: Jeff Layton <jlayton@kernel.org>
> cc: v9fs@lists.linux.dev
> cc: linux-cifs@vger.kernel.org
> cc: netfs@lists.linux.dev
> cc: linux-fsdevel@vger.kernel.org
> ---
>  fs/netfs/internal.h    |  2 +
>  fs/netfs/misc.c        | 72 ++++++++++++++++++++++++++++++++++---------------
>  fs/netfs/objects.c     | 12 ++++++++
>  fs/netfs/write_issue.c | 12 +++++++-
>  4 files changed, 76 insertions(+), 22 deletions(-)
>
> diff --git a/fs/netfs/internal.h b/fs/netfs/internal.h
> index c7f23dd3556a..79c0ad89affb 100644
> --- a/fs/netfs/internal.h
> +++ b/fs/netfs/internal.h
> @@ -58,6 +58,7 @@ static inline void netfs_proc_del_rreq(struct netfs_io_request *rreq) {}
>  /*
>   * misc.c
>   */
> +struct folio_queue *netfs_buffer_make_space(struct netfs_io_request *rreq);
>  int netfs_buffer_append_folio(struct netfs_io_request *rreq, struct folio *folio,
>  			      bool needs_put);
>  struct folio_queue *netfs_delete_buffer_head(struct netfs_io_request *wreq);
> @@ -76,6 +77,7 @@ void netfs_clear_subrequests(struct netfs_io_request *rreq, bool was_async);
>  void netfs_put_request(struct netfs_io_request *rreq, bool was_async,
>  		       enum netfs_rreq_ref_trace what);
>  struct netfs_io_subrequest *netfs_alloc_subrequest(struct netfs_io_request *rreq);
> +struct folio_queue *netfs_folioq_alloc(struct netfs_io_request *rreq, gfp_t gfp);
>
>  static inline void netfs_see_request(struct netfs_io_request *rreq,
>  				     enum netfs_rreq_ref_trace what)
> diff --git a/fs/netfs/misc.c b/fs/netfs/misc.c
> index 0ad0982ce0e2..a743e8963247 100644
> --- a/fs/netfs/misc.c
> +++ b/fs/netfs/misc.c
> @@ -9,34 +9,64 @@
>  #include "internal.h"
>
>  /*
> - * Append a folio to the rolling queue.
> + * Make sure there's space in the rolling queue.
>   */
> -int netfs_buffer_append_folio(struct netfs_io_request *rreq, struct folio *folio,
> -			      bool needs_put)
> +struct folio_queue *netfs_buffer_make_space(struct netfs_io_request *rreq)
>  {
> -	struct folio_queue *tail = rreq->buffer_tail;
> -	unsigned int slot, order = folio_order(folio);
> +	struct folio_queue *tail = rreq->buffer_tail, *prev;
> +	unsigned int prev_nr_slots = 0;
>
>  	if (WARN_ON_ONCE(!rreq->buffer && tail) ||
>  	    WARN_ON_ONCE(rreq->buffer && !tail))
> -		return -EIO;
> -
> -	if (!tail || folioq_full(tail)) {
> -		tail = kmalloc(sizeof(*tail), GFP_NOFS);
> -		if (!tail)
> -			return -ENOMEM;
> -		netfs_stat(&netfs_n_folioq);
> -		folioq_init(tail);
> -		tail->prev = rreq->buffer_tail;
> -		if (tail->prev)
> -			tail->prev->next = tail;
> -		rreq->buffer_tail = tail;
> -		if (!rreq->buffer) {
> -			rreq->buffer = tail;
> -			iov_iter_folio_queue(&rreq->io_iter, ITER_SOURCE, tail, 0, 0, 0);
> +		return ERR_PTR(-EIO);
> +
> +	prev = tail;
> +	if (prev) {
> +		if (!folioq_full(tail))
> +			return tail;
> +		prev_nr_slots = folioq_nr_slots(tail);
> +	}
> +
> +	tail = netfs_folioq_alloc(rreq, GFP_NOFS);
> +	if (!tail)
> +		return ERR_PTR(-ENOMEM);
> +	tail->prev = prev;
> +	if (prev)
> +		/* [!] NOTE: After we set prev->next, the consumer is entirely
> +		 * at liberty to delete prev.
> +		 */
> +		WRITE_ONCE(prev->next, tail);
> +
> +	rreq->buffer_tail = tail;
> +	if (!rreq->buffer) {
> +		rreq->buffer = tail;
> +		iov_iter_folio_queue(&rreq->io_iter, ITER_SOURCE, tail, 0, 0, 0);
> +	} else {
> +		/* Make sure we don't leave the master iterator pointing to a
> +		 * block that might get immediately consumed.
> +		 */
> +		if (rreq->io_iter.folioq == prev &&
> +		    rreq->io_iter.folioq_slot == prev_nr_slots) {
> +			rreq->io_iter.folioq = tail;
> +			rreq->io_iter.folioq_slot = 0;
> 		}
> -		rreq->buffer_tail_slot = 0;
> 	}
> +	rreq->buffer_tail_slot = 0;
> +	return tail;
> +}
> +
> +/*
> + * Append a folio to the rolling queue.
> + */
> +int netfs_buffer_append_folio(struct netfs_io_request *rreq, struct folio *folio,
> +			      bool needs_put)
> +{
> +	struct folio_queue *tail;
> +	unsigned int slot, order = folio_order(folio);
> +
> +	tail = netfs_buffer_make_space(rreq);
> +	if (IS_ERR(tail))
> +		return PTR_ERR(tail);
>
> 	rreq->io_iter.count += PAGE_SIZE << order;
>
> diff --git a/fs/netfs/objects.c b/fs/netfs/objects.c
> index d32964e8ca5d..dd8241bc996b 100644
> --- a/fs/netfs/objects.c
> +++ b/fs/netfs/objects.c
> @@ -250,3 +250,15 @@ void netfs_put_subrequest(struct netfs_io_subrequest *subreq, bool was_async,
> 	if (dead)
> 		netfs_free_subrequest(subreq, was_async);
> }
> +
> +struct folio_queue *netfs_folioq_alloc(struct netfs_io_request *rreq, gfp_t gfp)
> +{
> +	struct folio_queue *fq;
> +
> +	fq = kmalloc(sizeof(*fq), gfp);
> +	if (fq) {
> +		netfs_stat(&netfs_n_folioq);
> +		folioq_init(fq);
> +	}
> +	return fq;
> +}
> diff --git a/fs/netfs/write_issue.c b/fs/netfs/write_issue.c
> index 04e66d587f77..0929d9fd4ce7 100644
> --- a/fs/netfs/write_issue.c
> +++ b/fs/netfs/write_issue.c
> @@ -153,12 +153,22 @@ static void netfs_prepare_write(struct netfs_io_request *wreq,
> 				loff_t start)
> {
> 	struct netfs_io_subrequest *subreq;
> +	struct iov_iter *wreq_iter = &wreq->io_iter;
> +
> +	/* Make sure we don't point the iterator at a used-up folio_queue
> +	 * struct being used as a placeholder to prevent the queue from
> +	 * collapsing. In such a case, extend the queue.
> +	 */
> +	if (iov_iter_is_folioq(wreq_iter) &&
> +	    wreq_iter->folioq_slot >= folioq_nr_slots(wreq_iter->folioq)) {
> +		netfs_buffer_make_space(wreq);
> +	}
>
> 	subreq = netfs_alloc_subrequest(wreq);
> 	subreq->source = stream->source;
> 	subreq->start = start;
> 	subreq->stream_nr = stream->stream_nr;
> -	subreq->io_iter = wreq->io_iter;
> +	subreq->io_iter = *wreq_iter;
>
> 	_enter("R=%x[%x]", wreq->debug_id, subreq->debug_index);
>

^ permalink raw reply	[flat|nested] 15+ messages in thread
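One small idiom worth noting in the patch above: netfs_buffer_make_space() now returns either a valid folio_queue pointer or an errno encoded in the pointer via ERR_PTR(), which the caller unpacks with IS_ERR()/PTR_ERR(). The kernel's real helpers live in include/linux/err.h; the lowercase functions below are a userspace re-creation for illustration, exploiting the fact that the top 4095 pointer values are never valid kernel addresses:

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>

/* Userspace sketch of the kernel's ERR_PTR()/PTR_ERR()/IS_ERR()
 * helpers. Negative errno values (-1 .. -4095) are stuffed into the
 * top of the address space, a range no valid allocation occupies. */
#define MAX_ERRNO 4095

static inline void *err_ptr(long error)
{
	/* e.g. err_ptr(-ENOMEM) yields an "impossible" pointer value */
	return (void *)(intptr_t)error;
}

static inline long ptr_err(const void *ptr)
{
	/* recover the negative errno from an error pointer */
	return (long)(intptr_t)ptr;
}

static inline int is_err(const void *ptr)
{
	/* error pointers are the last MAX_ERRNO values of the space */
	return (uintptr_t)ptr >= (uintptr_t)-MAX_ERRNO;
}
```

This is why the patch can keep netfs_buffer_append_folio()'s int return type unchanged while the split-out make-space function returns a pointer: the caller simply does `if (IS_ERR(tail)) return PTR_ERR(tail);`.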
end of thread, other threads:[~2024-09-26  2:21 UTC | newest]

Thread overview: 15+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2024-09-13  7:24 [linux-next:master] [netfs] a05b682d49: BUG:KASAN:slab-use-after-free_in_copy_from_iter kernel test robot
2024-09-13  7:59 ` David Howells
2024-09-13  8:11   ` Christian Brauner
2024-09-18  2:24   ` Oliver Sang
2024-09-18 10:34     ` David Howells
2024-09-18 11:27     ` David Howells
2024-09-19  2:23       ` Oliver Sang
2024-09-19  7:14       ` David Howells
2024-09-20  6:36         ` Oliver Sang
2024-09-20  7:55         ` David Howells
2024-09-18 14:03   ` David Howells
2024-09-19  2:50     ` Oliver Sang
2024-09-24 21:47 ` David Howells
2024-09-24 23:19   ` Steve French
2024-09-26  2:20   ` Oliver Sang