[syzbot] [mm?] WARNING in fsnotify_file_area_perm

* [syzbot] [mm?] WARNING in fsnotify_file_area_perm
@ 2025-02-06  9:59 syzbot
  2025-02-07  0:54 ` Andrew Morton
  2025-03-02 16:32 ` [syzbot] [xfs?] " syzbot
  0 siblings, 2 replies; 20+ messages in thread
From: syzbot @ 2025-02-06  9:59 UTC
  To: akpm, linux-kernel, linux-mm, syzkaller-bugs

Hello,

syzbot found the following issue on:

HEAD commit:    69e858e0b8b2 Merge tag 'uml-for-linus-6.14-rc1' of git://g..
git tree:       upstream
console output: https://syzkaller.appspot.com/x/log.txt?x=135c1724580000
kernel config:  https://syzkaller.appspot.com/x/.config?x=d033b14aeef39158
dashboard link: https://syzkaller.appspot.com/bug?extid=7229071b47908b19d5b7
compiler:       Debian clang version 15.0.6, GNU ld (GNU Binutils for Debian) 2.40

Unfortunately, I don't have any reproducer for this issue yet.

Downloadable assets:
disk image (non-bootable): https://storage.googleapis.com/syzbot-assets/7feb34a89c2a/non_bootable_disk-69e858e0.raw.xz
vmlinux: https://storage.googleapis.com/syzbot-assets/a53b888c1f3f/vmlinux-69e858e0.xz
kernel image: https://storage.googleapis.com/syzbot-assets/6b5e17edafc0/bzImage-69e858e0.xz

IMPORTANT: if you fix the issue, please add the following tag to the commit:
Reported-by: syzbot+7229071b47908b19d5b7@syzkaller.appspotmail.com

loop0: detected capacity change from 0 to 32768
XFS: ikeep mount option is deprecated.
XFS (loop0): Mounting V5 Filesystem a2f82aab-77f8-4286-afd4-a8f747a74bab
XFS (loop0): Ending clean mount
XFS (loop0): Quotacheck needed: Please wait.
XFS (loop0): Quotacheck: Done.
------------[ cut here ]------------
WARNING: CPU: 0 PID: 5321 at ./include/linux/fsnotify.h:145 fsnotify_file_area_perm+0x1e5/0x250 include/linux/fsnotify.h:145
Modules linked in:
CPU: 0 UID: 0 PID: 5321 Comm: syz.0.0 Not tainted 6.13.0-syzkaller-09760-g69e858e0b8b2 #0
Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.16.3-debian-1.16.3-2~bpo12+1 04/01/2014
RIP: 0010:fsnotify_file_area_perm+0x1e5/0x250 include/linux/fsnotify.h:145
Code: c3 cc cc cc cc e8 fb 8f c6 ff 49 83 ec 80 4c 89 e7 48 83 c4 08 5b 41 5c 41 5d 41 5e 41 5f 5d e9 01 9f 00 00 e8 dc 8f c6 ff 90 <0f> 0b 90 e9 0a ff ff ff 48 c7 c1 10 73 1b 90 80 e1 07 80 c1 03 38
RSP: 0018:ffffc9000d416320 EFLAGS: 00010283
RAX: ffffffff81f8dce4 RBX: 0000000000000001 RCX: 0000000000100000
RDX: ffffc9000e5c2000 RSI: 00000000000008fa RDI: 00000000000008fb
RBP: 0000000000008000 R08: ffffffff81f8dbdc R09: 1ffff110087dca2e
R10: dffffc0000000000 R11: ffffed10087dca2f R12: ffff888033d4b1c0
R13: 0000000000000010 R14: dffffc0000000000 R15: ffffc9000d416460
FS:  00007f5bca7346c0(0000) GS:ffff88801fc00000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 0000000020000100 CR3: 0000000033fd2000 CR4: 0000000000352ef0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
Call Trace:
 <TASK>
 filemap_fault+0x14a9/0x16c0 mm/filemap.c:3509
 __do_fault+0x135/0x390 mm/memory.c:4977
 do_read_fault mm/memory.c:5392 [inline]
 do_fault mm/memory.c:5526 [inline]
 do_pte_missing mm/memory.c:4047 [inline]
 handle_pte_fault mm/memory.c:5889 [inline]
 __handle_mm_fault+0x4c44/0x70f0 mm/memory.c:6032
 handle_mm_fault+0x3e5/0x8d0 mm/memory.c:6201
 faultin_page mm/gup.c:1196 [inline]
 __get_user_pages+0x1a92/0x4140 mm/gup.c:1491
 __get_user_pages_locked mm/gup.c:1757 [inline]
 __gup_longterm_locked+0xe64/0x17f0 mm/gup.c:2529
 gup_fast_fallback+0x2266/0x29c0 mm/gup.c:3430
 pin_user_pages_fast+0xcc/0x160 mm/gup.c:3536
 iov_iter_extract_user_pages lib/iov_iter.c:1844 [inline]
 iov_iter_extract_pages+0x3bb/0x5c0 lib/iov_iter.c:1907
 __bio_iov_iter_get_pages block/bio.c:1181 [inline]
 bio_iov_iter_get_pages+0x4f1/0x1460 block/bio.c:1263
 iomap_dio_bio_iter+0xc9c/0x1740 fs/iomap/direct-io.c:406
 __iomap_dio_rw+0x13b7/0x25b0 fs/iomap/direct-io.c:703
 iomap_dio_rw+0x46/0xa0 fs/iomap/direct-io.c:792
 xfs_file_dio_write_unaligned+0x2ef/0x6f0 fs/xfs/xfs_file.c:692
 xfs_file_dio_write fs/xfs/xfs_file.c:725 [inline]
 xfs_file_write_iter+0x5c6/0x720 fs/xfs/xfs_file.c:876
 do_iter_readv_writev+0x71a/0x9d0
 vfs_writev+0x38b/0xbc0 fs/read_write.c:1050
 do_pwritev fs/read_write.c:1146 [inline]
 __do_sys_pwritev2 fs/read_write.c:1204 [inline]
 __se_sys_pwritev2+0x196/0x2b0 fs/read_write.c:1195
 do_syscall_x64 arch/x86/entry/common.c:52 [inline]
 do_syscall_64+0xf3/0x230 arch/x86/entry/common.c:83
 entry_SYSCALL_64_after_hwframe+0x77/0x7f
RIP: 0033:0x7f5bc998cda9
Code: ff ff c3 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 40 00 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 c7 c1 a8 ff ff ff f7 d8 64 89 01 48
RSP: 002b:00007f5bca734038 EFLAGS: 00000246 ORIG_RAX: 0000000000000148
RAX: ffffffffffffffda RBX: 00007f5bc9ba5fa0 RCX: 00007f5bc998cda9
RDX: 0000000000000001 RSI: 0000000020000240 RDI: 0000000000000007
RBP: 00007f5bc9a0e2a0 R08: 0000000000000000 R09: 0000000000000003
R10: 0000000000007c00 R11: 0000000000000246 R12: 0000000000000000
R13: 0000000000000000 R14: 00007f5bc9ba5fa0 R15: 00007fff90caf808
 </TASK>


---
This report is generated by a bot. It may contain errors.
See https://goo.gl/tpsmEJ for more information about syzbot.
syzbot engineers can be reached at syzkaller@googlegroups.com.

syzbot will keep track of this issue. See:
https://goo.gl/tpsmEJ#status for how to communicate with syzbot.

If the report is already addressed, let syzbot know by replying with:
#syz fix: exact-commit-title

If you want to overwrite report's subsystems, reply with:
#syz set subsystems: new-subsystem
(See the list of subsystem names on the web dashboard)

If the report is a duplicate of another one, reply with:
#syz dup: exact-subject-of-another-report

If you want to undo deduplication, reply with:
#syz undup


* Re: [syzbot] [mm?] WARNING in fsnotify_file_area_perm
  2025-02-06  9:59 [syzbot] [mm?] WARNING in fsnotify_file_area_perm syzbot
@ 2025-02-07  0:54 ` Andrew Morton
  2025-02-07  8:45   ` Christian Brauner
  2025-03-02 16:32 ` [syzbot] [xfs?] " syzbot
  1 sibling, 1 reply; 20+ messages in thread
From: Andrew Morton @ 2025-02-07  0:54 UTC
  To: syzbot
  Cc: linux-kernel, linux-mm, syzkaller-bugs, linux-fsdevel,
	Jens Axboe, Amir Goldstein


Thanks.  Let me cc linux-fsdevel and a few others who might help with
this.




* Re: [syzbot] [mm?] WARNING in fsnotify_file_area_perm
  2025-02-07  0:54 ` Andrew Morton
@ 2025-02-07  8:45   ` Christian Brauner
  2025-02-07 19:33     ` Amir Goldstein
  0 siblings, 1 reply; 20+ messages in thread
From: Christian Brauner @ 2025-02-07  8:45 UTC
  To: Andrew Morton
  Cc: syzbot, linux-kernel, linux-mm, syzkaller-bugs, linux-fsdevel,
	Jens Axboe, Amir Goldstein

On Thu, Feb 06, 2025 at 04:54:04PM -0800, Andrew Morton wrote:
> 
> Thanks.  Let me cc linux-fsdevel and a few others who might help with
> this.

Thanks! I already have a fix for this in vfs.fixes:

https://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs.git/commit/?h=vfs.fixes&id=b13036454697d83e53bf754efbcaaedf431b7a8a

I'll get that out today.



* Re: [syzbot] [mm?] WARNING in fsnotify_file_area_perm
  2025-02-07  8:45   ` Christian Brauner
@ 2025-02-07 19:33     ` Amir Goldstein
  0 siblings, 0 replies; 20+ messages in thread
From: Amir Goldstein @ 2025-02-07 19:33 UTC
  To: Christian Brauner, Jan Kara
  Cc: Andrew Morton, syzbot, linux-kernel, linux-mm, syzkaller-bugs,
	linux-fsdevel, Jens Axboe, Josef Bacik

On Fri, Feb 7, 2025 at 9:46 AM Christian Brauner <brauner@kernel.org> wrote:
>
> On Thu, Feb 06, 2025 at 04:54:04PM -0800, Andrew Morton wrote:
> >
> > Thanks.  Let me cc linux-fsdevel and a few others who might help with
> > this.
>
> Thanks! I already have a fix for this in vfs.fixes:
>
> https://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs.git/commit/?h=vfs.fixes&id=b13036454697d83e53bf754efbcaaedf431b7a8a
>

Yes, I hope that it is fixed.
The assertion fired in the fsnotify_file_area_perm() hook on a page read
fault on the blockdev, which shouldn't have FMODE_FSNOTIFY_HSM set.
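
For anyone reading the trace without the context: the access pattern behind
it is a write(2)-family call whose source buffer is a file mapping that has
not been faulted in yet, so the page read fault (filemap_fault) happens from
inside the write path. A minimal user-space sketch of that pattern follows.
It is an illustration only, not the syzbot reproducer: the function name and
the file paths are made up, it uses regular files rather than a block device,
and it sets up no HSM mark, so it is not expected to trigger the warning.

```c
#include <fcntl.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper: copy src_path into dst_path, using a read-only
 * mapping of src_path as the source buffer for pwrite().
 * Returns the number of bytes written, or -1 on error.
 */
ssize_t write_from_mapping(const char *src_path, const char *dst_path)
{
	struct stat st;
	ssize_t ret = -1;
	size_t len;
	void *buf;
	int src, dst;

	src = open(src_path, O_RDONLY);
	if (src < 0)
		return -1;
	if (fstat(src, &st) < 0 || st.st_size == 0)
		goto out_src;
	len = (size_t)st.st_size;

	/* No MAP_POPULATE: the pages of the mapping are not faulted in here. */
	buf = mmap(NULL, len, PROT_READ, MAP_PRIVATE, src, 0);
	if (buf == MAP_FAILED)
		goto out_src;

	dst = open(dst_path, O_WRONLY | O_CREAT | O_TRUNC, 0600);
	if (dst < 0)
		goto out_map;

	/*
	 * The kernel reads the source data through the mapping, so any page
	 * that is not yet present is faulted in from within the write path.
	 */
	ret = pwrite(dst, buf, len, 0);

	close(dst);
out_map:
	munmap(buf, len);
out_src:
	close(src);
	return ret;
}
```

Calling write_from_mapping("/tmp/src.bin", "/tmp/dst.bin") on an existing,
non-empty source file returns the source size on success. The report's first
trace reaches the same fault through pwritev2() with direct I/O instead.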


> I'll get that out today.
>

Thanks,
Amir.




* Re: [syzbot] [xfs?] WARNING in fsnotify_file_area_perm
  2025-02-06  9:59 [syzbot] [mm?] WARNING in fsnotify_file_area_perm syzbot
  2025-02-07  0:54 ` Andrew Morton
@ 2025-03-02 16:32 ` syzbot
  2025-03-04 11:06   ` Jan Kara
  1 sibling, 1 reply; 20+ messages in thread
From: syzbot @ 2025-03-02 16:32 UTC
  To: akpm, amir73il, axboe, brauner, cem, chandan.babu, djwong, jack,
	josef, linux-fsdevel, linux-kernel, linux-mm, linux-xfs,
	syzkaller-bugs

syzbot has found a reproducer for the following issue on:

HEAD commit:    e056da87c780 Merge remote-tracking branch 'will/for-next/p..
git tree:       git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux.git for-kernelci
console output: https://syzkaller.appspot.com/x/log.txt?x=11f61864580000
kernel config:  https://syzkaller.appspot.com/x/.config?x=d6b7e15dc5b5e776
dashboard link: https://syzkaller.appspot.com/bug?extid=7229071b47908b19d5b7
compiler:       Debian clang version 15.0.6, GNU ld (GNU Binutils for Debian) 2.40
userspace arch: arm64
syz repro:      https://syzkaller.appspot.com/x/repro.syz?x=162aba97980000
C reproducer:   https://syzkaller.appspot.com/x/repro.c?x=15f61864580000

Downloadable assets:
disk image: https://storage.googleapis.com/syzbot-assets/3d8b1b7cc4c0/disk-e056da87.raw.xz
vmlinux: https://storage.googleapis.com/syzbot-assets/b84c04cff235/vmlinux-e056da87.xz
kernel image: https://storage.googleapis.com/syzbot-assets/2ae4d0525881/Image-e056da87.gz.xz
mounted in repro: https://storage.googleapis.com/syzbot-assets/4ea12659f0c0/mount_0.gz
  fsck result: failed (log: https://syzkaller.appspot.com/x/fsck.log?x=1584cfb8580000)

IMPORTANT: if you fix the issue, please add the following tag to the commit:
Reported-by: syzbot+7229071b47908b19d5b7@syzkaller.appspotmail.com

XFS (loop0): Mounting V5 Filesystem bfdc47fc-10d8-4eed-a562-11a831b3f791
XFS (loop0): Ending clean mount
XFS (loop0): Quotacheck needed: Please wait.
XFS (loop0): Quotacheck: Done.
------------[ cut here ]------------
WARNING: CPU: 1 PID: 6440 at ./include/linux/fsnotify.h:145 fsnotify_file_area_perm+0x20c/0x25c include/linux/fsnotify.h:145
Modules linked in:
CPU: 1 UID: 0 PID: 6440 Comm: syz-executor370 Not tainted 6.14.0-rc4-syzkaller-ge056da87c780 #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 12/27/2024
pstate: 80400005 (Nzcv daif +PAN -UAO -TCO -DIT -SSBS BTYPE=--)
pc : fsnotify_file_area_perm+0x20c/0x25c include/linux/fsnotify.h:145
lr : fsnotify_file_area_perm+0x20c/0x25c include/linux/fsnotify.h:145
sp : ffff8000a42569d0
x29: ffff8000a42569d0 x28: ffff0000dcec1b48 x27: ffff0000d68a1708
x26: ffff0000d68a16c0 x25: dfff800000000000 x24: 0000000000008000
x23: 0000000000000001 x22: ffff8000a4256b00 x21: 0000000000001000
x20: 0000000000000010 x19: ffff0000d68a16c0 x18: ffff8000a42566e0
x17: 000000000000e388 x16: ffff800080466c24 x15: 0000000000000001
x14: 1fffe0001b31513c x13: 0000000000000000 x12: 0000000000000000
x11: 0000000000000001 x10: 0000000000ff0100 x9 : 0000000000000000
x8 : ffff0000c6d98000 x7 : 0000000000000000 x6 : 0000000000000000
x5 : 0000000000000020 x4 : 0000000000000000 x3 : 0000000000001000
x2 : ffff8000a4256b00 x1 : 0000000000000001 x0 : 0000000000000000
Call trace:
 fsnotify_file_area_perm+0x20c/0x25c include/linux/fsnotify.h:145 (P)
 filemap_fault+0x12b0/0x1518 mm/filemap.c:3509
 xfs_filemap_fault+0xc4/0x194 fs/xfs/xfs_file.c:1543
 __do_fault+0xf8/0x498 mm/memory.c:4988
 do_read_fault mm/memory.c:5403 [inline]
 do_fault mm/memory.c:5537 [inline]
 do_pte_missing mm/memory.c:4058 [inline]
 handle_pte_fault+0x3504/0x57b0 mm/memory.c:5900
 __handle_mm_fault mm/memory.c:6043 [inline]
 handle_mm_fault+0xfa8/0x188c mm/memory.c:6212
 do_page_fault+0x570/0x10a8 arch/arm64/mm/fault.c:690
 do_translation_fault+0xc4/0x114 arch/arm64/mm/fault.c:783
 do_mem_abort+0x74/0x200 arch/arm64/mm/fault.c:919
 el1_abort+0x3c/0x5c arch/arm64/kernel/entry-common.c:432
 el1h_64_sync_handler+0x60/0xcc arch/arm64/kernel/entry-common.c:510
 el1h_64_sync+0x6c/0x70 arch/arm64/kernel/entry.S:595
 __uaccess_mask_ptr arch/arm64/include/asm/uaccess.h:169 [inline] (P)
 fault_in_readable+0x168/0x310 mm/gup.c:2234 (P)
 fault_in_iov_iter_readable+0x1dc/0x22c lib/iov_iter.c:94
 iomap_write_iter fs/iomap/buffered-io.c:950 [inline]
 iomap_file_buffered_write+0x490/0xd54 fs/iomap/buffered-io.c:1039
 xfs_file_buffered_write+0x2dc/0xac8 fs/xfs/xfs_file.c:792
 xfs_file_write_iter+0x2c4/0x6ac fs/xfs/xfs_file.c:881
 new_sync_write fs/read_write.c:586 [inline]
 vfs_write+0x704/0xa9c fs/read_write.c:679
 ksys_pwrite64 fs/read_write.c:786 [inline]
 __do_sys_pwrite64 fs/read_write.c:794 [inline]
 __se_sys_pwrite64 fs/read_write.c:791 [inline]
 __arm64_sys_pwrite64+0x188/0x220 fs/read_write.c:791
 __invoke_syscall arch/arm64/kernel/syscall.c:35 [inline]
 invoke_syscall+0x98/0x2b8 arch/arm64/kernel/syscall.c:49
 el0_svc_common+0x130/0x23c arch/arm64/kernel/syscall.c:132
 do_el0_svc+0x48/0x58 arch/arm64/kernel/syscall.c:151
 el0_svc+0x54/0x168 arch/arm64/kernel/entry-common.c:744
 el0t_64_sync_handler+0x84/0x108 arch/arm64/kernel/entry-common.c:762
 el0t_64_sync+0x198/0x19c arch/arm64/kernel/entry.S:600
irq event stamp:


---
If you want syzbot to run the reproducer, reply with:
#syz test: git://repo/address.git branch-or-commit-hash
If you attach or paste a git patch, syzbot will apply it before testing.


* Re: [syzbot] [xfs?] WARNING in fsnotify_file_area_perm
  2025-03-02 16:32 ` [syzbot] [xfs?] " syzbot
@ 2025-03-04 11:06   ` Jan Kara
  2025-03-04 15:09     ` Amir Goldstein
  0 siblings, 1 reply; 20+ messages in thread
From: Jan Kara @ 2025-03-04 11:06 UTC
  To: syzbot
  Cc: akpm, amir73il, axboe, brauner, cem, chandan.babu, djwong, jack,
	josef, linux-fsdevel, linux-kernel, linux-mm, linux-xfs,
	syzkaller-bugs

Josef, Amir,

this is indeed an interesting case:

On Sun 02-03-25 08:32:30, syzbot wrote:
> syzbot has found a reproducer for the following issue on:
...
> ------------[ cut here ]------------
> WARNING: CPU: 1 PID: 6440 at ./include/linux/fsnotify.h:145 fsnotify_file_area_perm+0x20c/0x25c include/linux/fsnotify.h:145
> Modules linked in:
> CPU: 1 UID: 0 PID: 6440 Comm: syz-executor370 Not tainted 6.14.0-rc4-syzkaller-ge056da87c780 #0
> Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 12/27/2024
> pstate: 80400005 (Nzcv daif +PAN -UAO -TCO -DIT -SSBS BTYPE=--)
> pc : fsnotify_file_area_perm+0x20c/0x25c include/linux/fsnotify.h:145
> lr : fsnotify_file_area_perm+0x20c/0x25c include/linux/fsnotify.h:145
> sp : ffff8000a42569d0
> x29: ffff8000a42569d0 x28: ffff0000dcec1b48 x27: ffff0000d68a1708
> x26: ffff0000d68a16c0 x25: dfff800000000000 x24: 0000000000008000
> x23: 0000000000000001 x22: ffff8000a4256b00 x21: 0000000000001000
> x20: 0000000000000010 x19: ffff0000d68a16c0 x18: ffff8000a42566e0
> x17: 000000000000e388 x16: ffff800080466c24 x15: 0000000000000001
> x14: 1fffe0001b31513c x13: 0000000000000000 x12: 0000000000000000
> x11: 0000000000000001 x10: 0000000000ff0100 x9 : 0000000000000000
> x8 : ffff0000c6d98000 x7 : 0000000000000000 x6 : 0000000000000000
> x5 : 0000000000000020 x4 : 0000000000000000 x3 : 0000000000001000
> x2 : ffff8000a4256b00 x1 : 0000000000000001 x0 : 0000000000000000
> Call trace:
>  fsnotify_file_area_perm+0x20c/0x25c include/linux/fsnotify.h:145 (P)
>  filemap_fault+0x12b0/0x1518 mm/filemap.c:3509
>  xfs_filemap_fault+0xc4/0x194 fs/xfs/xfs_file.c:1543
>  __do_fault+0xf8/0x498 mm/memory.c:4988
>  do_read_fault mm/memory.c:5403 [inline]
>  do_fault mm/memory.c:5537 [inline]
>  do_pte_missing mm/memory.c:4058 [inline]
>  handle_pte_fault+0x3504/0x57b0 mm/memory.c:5900
>  __handle_mm_fault mm/memory.c:6043 [inline]
>  handle_mm_fault+0xfa8/0x188c mm/memory.c:6212
>  do_page_fault+0x570/0x10a8 arch/arm64/mm/fault.c:690
>  do_translation_fault+0xc4/0x114 arch/arm64/mm/fault.c:783
>  do_mem_abort+0x74/0x200 arch/arm64/mm/fault.c:919
>  el1_abort+0x3c/0x5c arch/arm64/kernel/entry-common.c:432
>  el1h_64_sync_handler+0x60/0xcc arch/arm64/kernel/entry-common.c:510
>  el1h_64_sync+0x6c/0x70 arch/arm64/kernel/entry.S:595
>  __uaccess_mask_ptr arch/arm64/include/asm/uaccess.h:169 [inline] (P)
>  fault_in_readable+0x168/0x310 mm/gup.c:2234 (P)
>  fault_in_iov_iter_readable+0x1dc/0x22c lib/iov_iter.c:94
>  iomap_write_iter fs/iomap/buffered-io.c:950 [inline]
>  iomap_file_buffered_write+0x490/0xd54 fs/iomap/buffered-io.c:1039
>  xfs_file_buffered_write+0x2dc/0xac8 fs/xfs/xfs_file.c:792
>  xfs_file_write_iter+0x2c4/0x6ac fs/xfs/xfs_file.c:881
>  new_sync_write fs/read_write.c:586 [inline]
>  vfs_write+0x704/0xa9c fs/read_write.c:679

The backtrace actually explains it all. We had a buffered write whose
buffer was an mmapped file on a filesystem with an HSM mark. Now the
prefaulting of the buffer already happens (quite deep) under the filesystem
freeze protection (obtained in vfs_write()), which breaks the assumptions of
the HSM code and introduces a potential deadlock between the HSM handler in
userspace and filesystem freezing. So we need to think about how to deal with
this case...

								Honza
-- 
Jan Kara <jack@suse.com>
SUSE Labs, CR
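
For illustration, the access pattern Jan describes above (a buffered write
whose source buffer is a not-yet-faulted mapping of another file) can be
sketched in userspace C. This is not the syzbot reproducer. The helper names
make_temp_file() and write_from_mapping() are made up for this sketch, and it
uses scratch files under /tmp rather than an XFS filesystem carrying a
fanotify pre-content (HSM) mark, so on its own it is not expected to trigger
the warning.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L
#endif
#include <stdlib.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: create an unlinked scratch file of len bytes. */
static int make_temp_file(size_t len)
{
	char path[] = "/tmp/hsm-sketch-XXXXXX";
	int fd = mkstemp(path);

	if (fd < 0)
		return -1;
	unlink(path);			/* the open fd keeps the file alive */
	if (ftruncate(fd, (off_t)len) != 0) {
		close(fd);
		return -1;
	}
	return fd;
}

/*
 * Hypothetical helper: write len bytes to dst_fd, using a fresh mapping of
 * src_fd as the source buffer. The mapping has not been touched before the
 * write(), so the kernel has to fault the pages in while it is already
 * inside the write path, which is the situation shown in the call trace.
 * Returns the number of bytes written, or -1 on error.
 */
static ssize_t write_from_mapping(int src_fd, int dst_fd, size_t len)
{
	ssize_t ret;
	void *buf = mmap(NULL, len, PROT_READ, MAP_PRIVATE, src_fd, 0);

	if (buf == MAP_FAILED)
		return -1;
	ret = write(dst_fd, buf, len);	/* page fault on buf happens in here */
	munmap(buf, len);
	return ret;
}
```

A caller would create two files with make_temp_file() and pass their
descriptors to write_from_mapping(). To reach the path in the trace, the
mapped file would presumably have to live on a filesystem with a pre-content
mark whose content is not yet in the page cache.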


^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [syzbot] [xfs?] WARNING in fsnotify_file_area_perm
  2025-03-04 11:06   ` Jan Kara
@ 2025-03-04 15:09     ` Amir Goldstein
  2025-03-04 16:15       ` Josef Bacik
  0 siblings, 1 reply; 20+ messages in thread
From: Amir Goldstein @ 2025-03-04 15:09 UTC (permalink / raw)
  To: Jan Kara
  Cc: syzbot, akpm, axboe, brauner, cem, chandan.babu, djwong, josef,
	linux-fsdevel, linux-kernel, linux-mm, linux-xfs, syzkaller-bugs

On Tue, Mar 4, 2025 at 12:06 PM Jan Kara <jack@suse.cz> wrote:
>
> Josef, Amir,
>
> this is indeed an interesting case:
>
> On Sun 02-03-25 08:32:30, syzbot wrote:
> > syzbot has found a reproducer for the following issue on:
> ...
>
> The backtrace actually explains it all. We had a buffered write whose
> buffer was mmapped file on a filesystem with an HSM mark. Now the prefaulting
> of the buffer happens already (quite deep) under the filesystem freeze
> protection (obtained in vfs_write()) which breaks assumptions of HSM code
> and introduces potential deadlock of HSM handler in userspace with filesystem
> freezing. So we need to think how to deal with this case...

Ouch. It's like the splice mess all over again.
Except we do not really care to make this use case work with HSM,
in the sense that we do not need to fill in the mmapped file content
in this corner case - we just need to let HSM fail the access if the content
is not available.

If you remember, in one of my very early versions of pre-content events,
the pre-content event (or maybe it was FAN_ACCESS_PERM itself)
carried a flag (I think it was called FAN_PRE_VFS) to communicate to the
HSM service whether it was safe to write to the fs in the context of event
handling.

At the moment, I cannot think of any elegant way out of this use case
except annotating the event from fault_in_readable() as "unsafe-for-write".
This would relax the debugging code assertion and notify the HSM service
(via an event flag) that it can ALLOW/DENY, but it cannot fill the file.
Maybe we can reuse the FAN_ACCESS_PERM event to communicate
this case to the HSM service.

WDYT?

Thanks,
Amir.
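
The policy Amir proposes above can be modelled as a small decision function.
The sketch below is plain C, not fanotify code: the unsafe_for_write input
stands in for the proposed event flag, which is only an idea in this thread
and not part of the fanotify API, and enum hsm_verdict and hsm_decide() are
hypothetical names chosen for illustration.

```c
#include <stdbool.h>

/* Hypothetical verdicts an HSM service could reach for one event. */
enum hsm_verdict {
	HSM_ALLOW,		/* content already present, let the access go */
	HSM_DENY,		/* content missing and filling is not safe */
	HSM_FILL_THEN_ALLOW,	/* content missing, safe to fill the file first */
};

/*
 * content_present:  the requested range is already populated.
 * unsafe_for_write: stand-in for the proposed "unsafe-for-write" annotation,
 *                   i.e. the event was raised while freeze protection is
 *                   held, so the service must not write to the filesystem.
 */
static enum hsm_verdict hsm_decide(bool content_present, bool unsafe_for_write)
{
	if (content_present)
		return HSM_ALLOW;
	if (unsafe_for_write)
		return HSM_DENY;
	return HSM_FILL_THEN_ALLOW;
}
```

With this shape, the annotated event only changes the outcome when the file
content is missing, which matches the point that such files may be rare.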


^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [syzbot] [xfs?] WARNING in fsnotify_file_area_perm
  2025-03-04 15:09     ` Amir Goldstein
@ 2025-03-04 16:15       ` Josef Bacik
  2025-03-04 20:27         ` Amir Goldstein
  0 siblings, 1 reply; 20+ messages in thread
From: Josef Bacik @ 2025-03-04 16:15 UTC (permalink / raw)
  To: Amir Goldstein
  Cc: Jan Kara, syzbot, akpm, axboe, brauner, cem, chandan.babu,
	djwong, linux-fsdevel, linux-kernel, linux-mm, linux-xfs,
	syzkaller-bugs

On Tue, Mar 04, 2025 at 04:09:16PM +0100, Amir Goldstein wrote:
> On Tue, Mar 4, 2025 at 12:06 PM Jan Kara <jack@suse.cz> wrote:
> >
> > Josef, Amir,
> >
> > this is indeed an interesting case:
> >
> > On Sun 02-03-25 08:32:30, syzbot wrote:
> > > syzbot has found a reproducer for the following issue on:
> > ...
> >
> > The backtrace actually explains it all. We had a buffered write whose
> > buffer was mmapped file on a filesystem with an HSM mark. Now the prefaulting
> > of the buffer happens already (quite deep) under the filesystem freeze
> > protection (obtained in vfs_write()) which breaks assumptions of HSM code
> > and introduces potential deadlock of HSM handler in userspace with filesystem
> > freezing. So we need to think how to deal with this case...
> 
> Ouch. It's like the splice mess all over again.
> Except we do not really care to make this use case work with HSM
> in the sense that we do not care to have to fill in the mmaped file content
> in this corner case - we just need to let HSM fail the access if content is
> not available.
> 
> If you remember, in one of my very early version of pre-content events,
> the pre-content event (or maybe it was FAN_ACCESS_PERM itself)
> carried a flag (I think it was called FAN_PRE_VFS) to communicate to
> HSM service if it was safe to write to fs in the context of event handling.
> 
> At the moment, I cannot think of any elegant way out of this use case
> except annotating the event from fault_in_readable() as "unsafe-for-write".
> This will relax the debugging code assertion and notify the HSM service
> (via an event flag) that it can ALLOW/DENY, but it cannot fill the file.
> Maybe we can reuse the FAN_ACCESS_PERM event to communicate
> this case to HSM service.
> 
> WDYT?

I think that mmap was a mistake.

Is there a way to tell if we're currently in a path that is under fsfreeze
protection?  Just denying this case would be a simpler short term solution while
we come up with a long term solution. I think your solution is fine, but I'd be
just as happy with a simpler "this isn't allowed" solution. Thanks,

Josef


^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [syzbot] [xfs?] WARNING in fsnotify_file_area_perm
  2025-03-04 16:15       ` Josef Bacik
@ 2025-03-04 20:27         ` Amir Goldstein
  2025-03-04 20:36           ` Josef Bacik
  0 siblings, 1 reply; 20+ messages in thread
From: Amir Goldstein @ 2025-03-04 20:27 UTC (permalink / raw)
  To: Josef Bacik
  Cc: Jan Kara, syzbot, akpm, axboe, brauner, cem, chandan.babu,
	djwong, linux-fsdevel, linux-kernel, linux-mm, linux-xfs,
	syzkaller-bugs

On Tue, Mar 4, 2025 at 5:15 PM Josef Bacik <josef@toxicpanda.com> wrote:
>
> On Tue, Mar 04, 2025 at 04:09:16PM +0100, Amir Goldstein wrote:
> > On Tue, Mar 4, 2025 at 12:06 PM Jan Kara <jack@suse.cz> wrote:
> > >
> > > Josef, Amir,
> > >
> > > this is indeed an interesting case:
> > >
> > > On Sun 02-03-25 08:32:30, syzbot wrote:
> > > > syzbot has found a reproducer for the following issue on:
> > > ...
> > >
> > > The backtrace actually explains it all. We had a buffered write whose
> > > buffer was mmapped file on a filesystem with an HSM mark. Now the prefaulting
> > > of the buffer happens already (quite deep) under the filesystem freeze
> > > protection (obtained in vfs_write()) which breaks assumptions of HSM code
> > > and introduces potential deadlock of HSM handler in userspace with filesystem
> > > freezing. So we need to think how to deal with this case...
> >
> > Ouch. It's like the splice mess all over again.
> > Except we do not really care to make this use case work with HSM
> > in the sense that we do not care to have to fill in the mmaped file content
> > in this corner case - we just need to let HSM fail the access if content is
> > not available.
> >
> > If you remember, in one of my very early version of pre-content events,
> > the pre-content event (or maybe it was FAN_ACCESS_PERM itself)
> > carried a flag (I think it was called FAN_PRE_VFS) to communicate to
> > HSM service if it was safe to write to fs in the context of event handling.
> >
> > At the moment, I cannot think of any elegant way out of this use case
> > except annotating the event from fault_in_readable() as "unsafe-for-write".
> > This will relax the debugging code assertion and notify the HSM service
> > (via an event flag) that it can ALLOW/DENY, but it cannot fill the file.
> > Maybe we can reuse the FAN_ACCESS_PERM event to communicate
> > this case to HSM service.
> >
> > WDYT?
>
> I think that mmap was a mistake.

What do you mean?
Isn't the fault hook required for your large executables use case?

>
> Is there a way to tell if we're currently in a path that is under fsfreeze
> protection?

Not at the moment.
At the moment, file_write_not_started() is not a reliable check
(has false positives) without CONFIG_LOCKDEP.

> Just denying this case would be a simpler short term solution while
> we come up with a long term solution. I think your solution is fine, but I'd be
> just as happy with a simpler "this isn't allowed" solution. Thanks,

Yeah, I don't mind that, but it's a bit of overkill considering that
files with no content may in fact be rare.

Thanks,
Amir.


^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [syzbot] [xfs?] WARNING in fsnotify_file_area_perm
  2025-03-04 20:27         ` Amir Goldstein
@ 2025-03-04 20:36           ` Josef Bacik
  2025-03-04 21:13             ` Amir Goldstein
  0 siblings, 1 reply; 20+ messages in thread
From: Josef Bacik @ 2025-03-04 20:36 UTC (permalink / raw)
  To: Amir Goldstein
  Cc: Jan Kara, syzbot, akpm, axboe, brauner, cem, chandan.babu,
	djwong, linux-fsdevel, linux-kernel, linux-mm, linux-xfs,
	syzkaller-bugs

On Tue, Mar 04, 2025 at 09:27:20PM +0100, Amir Goldstein wrote:
> On Tue, Mar 4, 2025 at 5:15 PM Josef Bacik <josef@toxicpanda.com> wrote:
> >
> > On Tue, Mar 04, 2025 at 04:09:16PM +0100, Amir Goldstein wrote:
> > > On Tue, Mar 4, 2025 at 12:06 PM Jan Kara <jack@suse.cz> wrote:
> > > >
> > > > Josef, Amir,
> > > >
> > > > this is indeed an interesting case:
> > > >
> > > > On Sun 02-03-25 08:32:30, syzbot wrote:
> > > > > syzbot has found a reproducer for the following issue on:
> > > > ...
> > > >
> > > > The backtrace actually explains it all. We had a buffered write whose
> > > > buffer was mmapped file on a filesystem with an HSM mark. Now the prefaulting
> > > > of the buffer happens already (quite deep) under the filesystem freeze
> > > > protection (obtained in vfs_write()) which breaks assumptions of HSM code
> > > > and introduces potential deadlock of HSM handler in userspace with filesystem
> > > > freezing. So we need to think how to deal with this case...
> > >
> > > Ouch. It's like the splice mess all over again.
> > > Except we do not really care to make this use case work with HSM
> > > in the sense that we do not care to have to fill in the mmaped file content
> > > in this corner case - we just need to let HSM fail the access if content is
> > > not available.
> > >
> > > If you remember, in one of my very early version of pre-content events,
> > > the pre-content event (or maybe it was FAN_ACCESS_PERM itself)
> > > carried a flag (I think it was called FAN_PRE_VFS) to communicate to
> > > HSM service if it was safe to write to fs in the context of event handling.
> > >
> > > At the moment, I cannot think of any elegant way out of this use case
> > > except annotating the event from fault_in_readable() as "unsafe-for-write".
> > > This will relax the debugging code assertion and notify the HSM service
> > > (via an event flag) that it can ALLOW/DENY, but it cannot fill the file.
> > > Maybe we can reuse the FAN_ACCESS_PERM event to communicate
> > > this case to HSM service.
> > >
> > > WDYT?
> >
> > I think that mmap was a mistake.
> 
> What do you mean?
> Isn't the fault hook required for your large executables use case?

I mean the mmap syscall was a mistake ;).

> 
> >
> > Is there a way to tell if we're currently in a path that is under fsfreeze
> > protection?
> 
> Not at the moment.
> At the moment, file_write_not_started() is not a reliable check
> (has false positives) without CONFIG_LOCKDEP.
> 
> > Just denying this case would be a simpler short term solution while
> > we come up with a long term solution. I think your solution is fine, but I'd be
> > just as happy with a simpler "this isn't allowed" solution. Thanks,
> 
> Yeh, I don't mind that, but it's a bit of an overkill considering that
> file with no content may in fact be rare.

Agreed, I'm fine with your solution.  Thanks,

Josef


^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [syzbot] [xfs?] WARNING in fsnotify_file_area_perm
  2025-03-04 20:36           ` Josef Bacik
@ 2025-03-04 21:13             ` Amir Goldstein
  2025-03-07 15:46               ` Josef Bacik
  0 siblings, 1 reply; 20+ messages in thread
From: Amir Goldstein @ 2025-03-04 21:13 UTC (permalink / raw)
  To: Josef Bacik
  Cc: Jan Kara, syzbot, akpm, axboe, brauner, cem, chandan.babu,
	djwong, linux-fsdevel, linux-kernel, linux-mm, linux-xfs,
	syzkaller-bugs

On Tue, Mar 4, 2025 at 9:37 PM Josef Bacik <josef@toxicpanda.com> wrote:
>
> On Tue, Mar 04, 2025 at 09:27:20PM +0100, Amir Goldstein wrote:
> > On Tue, Mar 4, 2025 at 5:15 PM Josef Bacik <josef@toxicpanda.com> wrote:
> > >
> > > On Tue, Mar 04, 2025 at 04:09:16PM +0100, Amir Goldstein wrote:
> > > > On Tue, Mar 4, 2025 at 12:06 PM Jan Kara <jack@suse.cz> wrote:
> > > > >
> > > > > Josef, Amir,
> > > > >
> > > > > this is indeed an interesting case:
> > > > >
> > > > > On Sun 02-03-25 08:32:30, syzbot wrote:
> > > > > > syzbot has found a reproducer for the following issue on:
> > > > > ...
> > > > >
> > > > > The backtrace actually explains it all. We had a buffered write whose
> > > > > buffer was mmapped file on a filesystem with an HSM mark. Now the prefaulting
> > > > > of the buffer happens already (quite deep) under the filesystem freeze
> > > > > protection (obtained in vfs_write()) which breaks assumptions of HSM code
> > > > > and introduces potential deadlock of HSM handler in userspace with filesystem
> > > > > freezing. So we need to think how to deal with this case...
> > > >
> > > > Ouch. It's like the splice mess all over again.
> > > > Except we do not really care to make this use case work with HSM
> > > > in the sense that we do not care to have to fill in the mmaped file content
> > > > in this corner case - we just need to let HSM fail the access if content is
> > > > not available.
> > > >
> > > > If you remember, in one of my very early version of pre-content events,
> > > > the pre-content event (or maybe it was FAN_ACCESS_PERM itself)
> > > > carried a flag (I think it was called FAN_PRE_VFS) to communicate to
> > > > HSM service if it was safe to write to fs in the context of event handling.
> > > >
> > > > At the moment, I cannot think of any elegant way out of this use case
> > > > except annotating the event from fault_in_readable() as "unsafe-for-write".
> > > > This will relax the debugging code assertion and notify the HSM service
> > > > (via an event flag) that it can ALLOW/DENY, but it cannot fill the file.
> > > > Maybe we can reuse the FAN_ACCESS_PERM event to communicate
> > > > this case to HSM service.
> > > >
> > > > WDYT?
> > >
> > > I think that mmap was a mistake.
> >
> > What do you mean?
> > Isn't the fault hook required for your large executables use case?
>
> I mean the mmap syscall was a mistake ;).
>

ah :)

> >
> > >
> > > Is there a way to tell if we're currently in a path that is under fsfreeze
> > > protection?
> >
> > Not at the moment.
> > At the moment, file_write_not_started() is not a reliable check
> > (has false positives) without CONFIG_LOCKDEP.
> >

One very ugly solution is to require CONFIG_LOCKDEP for
pre-content events.

> > > Just denying this case would be a simpler short term solution while
> > > we come up with a long term solution. I think your solution is fine, but I'd be
> > > just as happy with a simpler "this isn't allowed" solution. Thanks,
> >
> > Yeh, I don't mind that, but it's a bit of an overkill considering that
> > file with no content may in fact be rare.
>
> Agreed, I'm fine with your solution.

Well, my "solution" was quite hand-wavy - it did not really say how to
propagate the fact that a fault was initiated from fault_in_readable().
Do you guys have any ideas for a simple solution?

Thanks,
Amir.


^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [syzbot] [xfs?] WARNING in fsnotify_file_area_perm
  2025-03-04 21:13             ` Amir Goldstein
@ 2025-03-07 15:46               ` Josef Bacik
  2025-03-07 16:07                 ` Amir Goldstein
  0 siblings, 1 reply; 20+ messages in thread
From: Josef Bacik @ 2025-03-07 15:46 UTC (permalink / raw)
  To: Amir Goldstein
  Cc: Jan Kara, syzbot, akpm, axboe, brauner, cem, chandan.babu,
	djwong, linux-fsdevel, linux-kernel, linux-mm, linux-xfs,
	syzkaller-bugs

On Tue, Mar 04, 2025 at 10:13:39PM +0100, Amir Goldstein wrote:
> On Tue, Mar 4, 2025 at 9:37 PM Josef Bacik <josef@toxicpanda.com> wrote:
> >
> > On Tue, Mar 04, 2025 at 09:27:20PM +0100, Amir Goldstein wrote:
> > > On Tue, Mar 4, 2025 at 5:15 PM Josef Bacik <josef@toxicpanda.com> wrote:
> > > >
> > > > On Tue, Mar 04, 2025 at 04:09:16PM +0100, Amir Goldstein wrote:
> > > > > On Tue, Mar 4, 2025 at 12:06 PM Jan Kara <jack@suse.cz> wrote:
> > > > > >
> > > > > > Josef, Amir,
> > > > > >
> > > > > > this is indeed an interesting case:
> > > > > >
> > > > > > On Sun 02-03-25 08:32:30, syzbot wrote:
> > > > > > > syzbot has found a reproducer for the following issue on:
> > > > > > ...
> > > > > > > ------------[ cut here ]------------
> > > > > > > WARNING: CPU: 1 PID: 6440 at ./include/linux/fsnotify.h:145 fsnotify_file_area_perm+0x20c/0x25c include/linux/fsnotify.h:145
> > > > > > > Modules linked in:
> > > > > > > CPU: 1 UID: 0 PID: 6440 Comm: syz-executor370 Not tainted 6.14.0-rc4-syzkaller-ge056da87c780 #0
> > > > > > > Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 12/27/2024
> > > > > > > pstate: 80400005 (Nzcv daif +PAN -UAO -TCO -DIT -SSBS BTYPE=--)
> > > > > > > pc : fsnotify_file_area_perm+0x20c/0x25c include/linux/fsnotify.h:145
> > > > > > > lr : fsnotify_file_area_perm+0x20c/0x25c include/linux/fsnotify.h:145
> > > > > > > sp : ffff8000a42569d0
> > > > > > > x29: ffff8000a42569d0 x28: ffff0000dcec1b48 x27: ffff0000d68a1708
> > > > > > > x26: ffff0000d68a16c0 x25: dfff800000000000 x24: 0000000000008000
> > > > > > > x23: 0000000000000001 x22: ffff8000a4256b00 x21: 0000000000001000
> > > > > > > x20: 0000000000000010 x19: ffff0000d68a16c0 x18: ffff8000a42566e0
> > > > > > > x17: 000000000000e388 x16: ffff800080466c24 x15: 0000000000000001
> > > > > > > x14: 1fffe0001b31513c x13: 0000000000000000 x12: 0000000000000000
> > > > > > > x11: 0000000000000001 x10: 0000000000ff0100 x9 : 0000000000000000
> > > > > > > x8 : ffff0000c6d98000 x7 : 0000000000000000 x6 : 0000000000000000
> > > > > > > x5 : 0000000000000020 x4 : 0000000000000000 x3 : 0000000000001000
> > > > > > > x2 : ffff8000a4256b00 x1 : 0000000000000001 x0 : 0000000000000000
> > > > > > > Call trace:
> > > > > > >  fsnotify_file_area_perm+0x20c/0x25c include/linux/fsnotify.h:145 (P)
> > > > > > >  filemap_fault+0x12b0/0x1518 mm/filemap.c:3509
> > > > > > >  xfs_filemap_fault+0xc4/0x194 fs/xfs/xfs_file.c:1543
> > > > > > >  __do_fault+0xf8/0x498 mm/memory.c:4988
> > > > > > >  do_read_fault mm/memory.c:5403 [inline]
> > > > > > >  do_fault mm/memory.c:5537 [inline]
> > > > > > >  do_pte_missing mm/memory.c:4058 [inline]
> > > > > > >  handle_pte_fault+0x3504/0x57b0 mm/memory.c:5900
> > > > > > >  __handle_mm_fault mm/memory.c:6043 [inline]
> > > > > > >  handle_mm_fault+0xfa8/0x188c mm/memory.c:6212
> > > > > > >  do_page_fault+0x570/0x10a8 arch/arm64/mm/fault.c:690
> > > > > > >  do_translation_fault+0xc4/0x114 arch/arm64/mm/fault.c:783
> > > > > > >  do_mem_abort+0x74/0x200 arch/arm64/mm/fault.c:919
> > > > > > >  el1_abort+0x3c/0x5c arch/arm64/kernel/entry-common.c:432
> > > > > > >  el1h_64_sync_handler+0x60/0xcc arch/arm64/kernel/entry-common.c:510
> > > > > > >  el1h_64_sync+0x6c/0x70 arch/arm64/kernel/entry.S:595
> > > > > > >  __uaccess_mask_ptr arch/arm64/include/asm/uaccess.h:169 [inline] (P)
> > > > > > >  fault_in_readable+0x168/0x310 mm/gup.c:2234 (P)
> > > > > > >  fault_in_iov_iter_readable+0x1dc/0x22c lib/iov_iter.c:94
> > > > > > >  iomap_write_iter fs/iomap/buffered-io.c:950 [inline]
> > > > > > >  iomap_file_buffered_write+0x490/0xd54 fs/iomap/buffered-io.c:1039
> > > > > > >  xfs_file_buffered_write+0x2dc/0xac8 fs/xfs/xfs_file.c:792
> > > > > > >  xfs_file_write_iter+0x2c4/0x6ac fs/xfs/xfs_file.c:881
> > > > > > >  new_sync_write fs/read_write.c:586 [inline]
> > > > > > >  vfs_write+0x704/0xa9c fs/read_write.c:679
> > > > > >
> > > > > > The backtrace actually explains it all. We had a buffered write whose
> > > > > > buffer was mmapped file on a filesystem with an HSM mark. Now the prefaulting
> > > > > > of the buffer happens already (quite deep) under the filesystem freeze
> > > > > > protection (obtained in vfs_write()) which breaks assumptions of HSM code
> > > > > > and introduces potential deadlock of HSM handler in userspace with filesystem
> > > > > > freezing. So we need to think how to deal with this case...
> > > > >
> > > > > Ouch. It's like the splice mess all over again.
> > > > > Except we do not really care to make this use case work with HSM
> > > > > in the sense that we do not care to have to fill in the mmaped file content
> > > > > in this corner case - we just need to let HSM fail the access if content is
> > > > > not available.
> > > > >
> > > > > If you remember, in one of my very early version of pre-content events,
> > > > > the pre-content event (or maybe it was FAN_ACCESS_PERM itself)
> > > > > carried a flag (I think it was called FAN_PRE_VFS) to communicate to
> > > > > HSM service if it was safe to write to fs in the context of event handling.
> > > > >
> > > > > At the moment, I cannot think of any elegant way out of this use case
> > > > > except annotating the event from fault_in_readable() as "unsafe-for-write".
> > > > > This will relax the debugging code assertion and notify the HSM service
> > > > > (via an event flag) that it can ALLOW/DENY, but it cannot fill the file.
> > > > > Maybe we can reuse the FAN_ACCESS_PERM event to communicate
> > > > > this case to HSM service.
> > > > >
> > > > > WDYT?
> > > >
> > > > I think that mmap was a mistake.
> > >
> > > What do you mean?
> > > Isn't the fault hook required for your large executables use case?
> >
> > I mean the mmap syscall was a mistake ;).
> >
> 
> ah :)
> 
> > >
> > > >
> > > > Is there a way to tell if we're currently in a path that is under fsfreeze
> > > > protection?
> > >
> > > Not at the moment.
> > > At the moment, file_write_not_started() is not a reliable check
> > > (has false positives) without CONFIG_LOCKDEP.
> > >
> 
> One very ugly solution is to require CONFIG_LOCKDEP for
> pre-content events.
> 
> > > > Just denying this case would be a simpler short term solution while
> > > > we come up with a long term solution. I think your solution is fine, but I'd be
> > > > just as happy with a simpler "this isn't allowed" solution. Thanks,
> > >
> > > Yeh, I don't mind that, but it's a bit of an overkill considering that
> > > file with no content may in fact be rare.
> >
> > Agreed, I'm fine with your solution.
> 
> Well, my "solution" was quite hand-wavy - it did not really say how to
> propagate the fact that faults initiated from fault_in_readable().
> Do you guys have any ideas for a simple solution?

Sorry, I've been elbow deep in helping get our machine replacements working
faster.

I've been thinking about this, it's not like we can carry context from the reason
we are faulting in, at least not simply, so I think the best thing to do is
either

1) Emit a precontent event at mmap() time for the whole file, since really all I
care about is faulting at exec time, and then we can just skip the precontent
event if we're not exec.

2) Revert the page fault stuff, put back your thing to fault the whole file, and
wait until we think of a better way to deal with this.

Obviously I'd prefer not #2, but I'd really, really rather not chuck all of HSM
because my page fault thing is silly.  I'll carry what I need internally while
we figure out what to do upstream.  #1 doesn't seem bad, but I haven't thought
about it that hard.  Thanks,

Josef


^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [syzbot] [xfs?] WARNING in fsnotify_file_area_perm
  2025-03-07 15:46               ` Josef Bacik
@ 2025-03-07 16:07                 ` Amir Goldstein
  2025-03-07 16:21                   ` syzbot
  2025-03-07 17:45                   ` Amir Goldstein
  0 siblings, 2 replies; 20+ messages in thread
From: Amir Goldstein @ 2025-03-07 16:07 UTC (permalink / raw)
  To: Josef Bacik
  Cc: Jan Kara, syzbot, akpm, axboe, brauner, cem, chandan.babu,
	djwong, linux-fsdevel, linux-kernel, linux-mm, linux-xfs,
	syzkaller-bugs

On Fri, Mar 7, 2025 at 4:46 PM Josef Bacik <josef@toxicpanda.com> wrote:
>
> On Tue, Mar 04, 2025 at 10:13:39PM +0100, Amir Goldstein wrote:
> > On Tue, Mar 4, 2025 at 9:37 PM Josef Bacik <josef@toxicpanda.com> wrote:
> > >
> > > On Tue, Mar 04, 2025 at 09:27:20PM +0100, Amir Goldstein wrote:
> > > > On Tue, Mar 4, 2025 at 5:15 PM Josef Bacik <josef@toxicpanda.com> wrote:
> > > > >
> > > > > On Tue, Mar 04, 2025 at 04:09:16PM +0100, Amir Goldstein wrote:
> > > > > > On Tue, Mar 4, 2025 at 12:06 PM Jan Kara <jack@suse.cz> wrote:
> > > > > > >
> > > > > > > Josef, Amir,
> > > > > > >
> > > > > > > this is indeed an interesting case:
> > > > > > >
> > > > > > > On Sun 02-03-25 08:32:30, syzbot wrote:
> > > > > > > > syzbot has found a reproducer for the following issue on:
> > > > > > > ...
> > > > > > > > ------------[ cut here ]------------
> > > > > > > > WARNING: CPU: 1 PID: 6440 at ./include/linux/fsnotify.h:145 fsnotify_file_area_perm+0x20c/0x25c include/linux/fsnotify.h:145
> > > > > > > > Modules linked in:
> > > > > > > > CPU: 1 UID: 0 PID: 6440 Comm: syz-executor370 Not tainted 6.14.0-rc4-syzkaller-ge056da87c780 #0
> > > > > > > > Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 12/27/2024
> > > > > > > > pstate: 80400005 (Nzcv daif +PAN -UAO -TCO -DIT -SSBS BTYPE=--)
> > > > > > > > pc : fsnotify_file_area_perm+0x20c/0x25c include/linux/fsnotify.h:145
> > > > > > > > lr : fsnotify_file_area_perm+0x20c/0x25c include/linux/fsnotify.h:145
> > > > > > > > sp : ffff8000a42569d0
> > > > > > > > x29: ffff8000a42569d0 x28: ffff0000dcec1b48 x27: ffff0000d68a1708
> > > > > > > > x26: ffff0000d68a16c0 x25: dfff800000000000 x24: 0000000000008000
> > > > > > > > x23: 0000000000000001 x22: ffff8000a4256b00 x21: 0000000000001000
> > > > > > > > x20: 0000000000000010 x19: ffff0000d68a16c0 x18: ffff8000a42566e0
> > > > > > > > x17: 000000000000e388 x16: ffff800080466c24 x15: 0000000000000001
> > > > > > > > x14: 1fffe0001b31513c x13: 0000000000000000 x12: 0000000000000000
> > > > > > > > x11: 0000000000000001 x10: 0000000000ff0100 x9 : 0000000000000000
> > > > > > > > x8 : ffff0000c6d98000 x7 : 0000000000000000 x6 : 0000000000000000
> > > > > > > > x5 : 0000000000000020 x4 : 0000000000000000 x3 : 0000000000001000
> > > > > > > > x2 : ffff8000a4256b00 x1 : 0000000000000001 x0 : 0000000000000000
> > > > > > > > Call trace:
> > > > > > > >  fsnotify_file_area_perm+0x20c/0x25c include/linux/fsnotify.h:145 (P)
> > > > > > > >  filemap_fault+0x12b0/0x1518 mm/filemap.c:3509
> > > > > > > >  xfs_filemap_fault+0xc4/0x194 fs/xfs/xfs_file.c:1543
> > > > > > > >  __do_fault+0xf8/0x498 mm/memory.c:4988
> > > > > > > >  do_read_fault mm/memory.c:5403 [inline]
> > > > > > > >  do_fault mm/memory.c:5537 [inline]
> > > > > > > >  do_pte_missing mm/memory.c:4058 [inline]
> > > > > > > >  handle_pte_fault+0x3504/0x57b0 mm/memory.c:5900
> > > > > > > >  __handle_mm_fault mm/memory.c:6043 [inline]
> > > > > > > >  handle_mm_fault+0xfa8/0x188c mm/memory.c:6212
> > > > > > > >  do_page_fault+0x570/0x10a8 arch/arm64/mm/fault.c:690
> > > > > > > >  do_translation_fault+0xc4/0x114 arch/arm64/mm/fault.c:783
> > > > > > > >  do_mem_abort+0x74/0x200 arch/arm64/mm/fault.c:919
> > > > > > > >  el1_abort+0x3c/0x5c arch/arm64/kernel/entry-common.c:432
> > > > > > > >  el1h_64_sync_handler+0x60/0xcc arch/arm64/kernel/entry-common.c:510
> > > > > > > >  el1h_64_sync+0x6c/0x70 arch/arm64/kernel/entry.S:595
> > > > > > > >  __uaccess_mask_ptr arch/arm64/include/asm/uaccess.h:169 [inline] (P)
> > > > > > > >  fault_in_readable+0x168/0x310 mm/gup.c:2234 (P)
> > > > > > > >  fault_in_iov_iter_readable+0x1dc/0x22c lib/iov_iter.c:94
> > > > > > > >  iomap_write_iter fs/iomap/buffered-io.c:950 [inline]
> > > > > > > >  iomap_file_buffered_write+0x490/0xd54 fs/iomap/buffered-io.c:1039
> > > > > > > >  xfs_file_buffered_write+0x2dc/0xac8 fs/xfs/xfs_file.c:792
> > > > > > > >  xfs_file_write_iter+0x2c4/0x6ac fs/xfs/xfs_file.c:881
> > > > > > > >  new_sync_write fs/read_write.c:586 [inline]
> > > > > > > >  vfs_write+0x704/0xa9c fs/read_write.c:679
> > > > > > >
> > > > > > > The backtrace actually explains it all. We had a buffered write whose
> > > > > > > buffer was mmapped file on a filesystem with an HSM mark. Now the prefaulting
> > > > > > > of the buffer happens already (quite deep) under the filesystem freeze
> > > > > > > protection (obtained in vfs_write()) which breaks assumptions of HSM code
> > > > > > > and introduces potential deadlock of HSM handler in userspace with filesystem
> > > > > > > freezing. So we need to think how to deal with this case...
> > > > > >
> > > > > > Ouch. It's like the splice mess all over again.
> > > > > > Except we do not really care to make this use case work with HSM
> > > > > > in the sense that we do not care to have to fill in the mmaped file content
> > > > > > in this corner case - we just need to let HSM fail the access if content is
> > > > > > not available.
> > > > > >
> > > > > > If you remember, in one of my very early version of pre-content events,
> > > > > > the pre-content event (or maybe it was FAN_ACCESS_PERM itself)
> > > > > > carried a flag (I think it was called FAN_PRE_VFS) to communicate to
> > > > > > HSM service if it was safe to write to fs in the context of event handling.
> > > > > >
> > > > > > At the moment, I cannot think of any elegant way out of this use case
> > > > > > except annotating the event from fault_in_readable() as "unsafe-for-write".
> > > > > > This will relax the debugging code assertion and notify the HSM service
> > > > > > (via an event flag) that it can ALLOW/DENY, but it cannot fill the file.
> > > > > > Maybe we can reuse the FAN_ACCESS_PERM event to communicate
> > > > > > this case to HSM service.
> > > > > >
> > > > > > WDYT?
> > > > >
> > > > > I think that mmap was a mistake.
> > > >
> > > > What do you mean?
> > > > Isn't the fault hook required for your large executables use case?
> > >
> > > I mean the mmap syscall was a mistake ;).
> > >
> >
> > ah :)
> >
> > > >
> > > > >
> > > > > Is there a way to tell if we're currently in a path that is under fsfreeze
> > > > > protection?
> > > >
> > > > Not at the moment.
> > > > At the moment, file_write_not_started() is not a reliable check
> > > > (has false positives) without CONFIG_LOCKDEP.
> > > >
> >
> > One very ugly solution is to require CONFIG_LOCKDEP for
> > pre-content events.
> >
> > > > > Just denying this case would be a simpler short term solution while
> > > > > we come up with a long term solution. I think your solution is fine, but I'd be
> > > > > just as happy with a simpler "this isn't allowed" solution. Thanks,
> > > >
> > > > Yeh, I don't mind that, but it's a bit of an overkill considering that
> > > > file with no content may in fact be rare.
> > >
> > > Agreed, I'm fine with your solution.
> >
> > Well, my "solution" was quite hand-wavy - it did not really say how to
> > propagate the fact that faults initiated from fault_in_readable().
> > Do you guys have any ideas for a simple solution?
>
> Sorry I've been elbow deep in helping getting our machine replacements working
> faster.
>
> I've been thinking about this, it's not like we can carry context from the reason
> we are faulting in, at least not simply, so I think the best thing to do is
> either
>
> 1) Emit a precontent event at mmap() time for the whole file, since really all I
> care about is faulting at exec time, and then we can just skip the precontent
> event if we're not exec.

Sorry, not that familiar with exec code. Do you mean to issue pre-content
for page fault only if memory is mapped executable or is there another way
of knowing that we are in exec context?

If the former, then syzbot will catch up with us and write a buffer which is
mapped readable and exec.

>
> 2) Revert the page fault stuff, put back your thing to fault the whole file, and
> wait until we think of a better way to deal with this.
>
> Obviously I'd prefer not #2, but I'd really, really rather not chuck all of HSM
> because my page fault thing is silly.  I'll carry what I need internally while
> we figure out what to do upstream.  #1 doesn't seem bad, but I haven't thought
> about it that hard.  Thanks,
>

So I started to test this patch, but I may be doing something terribly
wrong with it. Q: What is it that is terribly wrong?

So far it did not explode, so let's at least see if that fixed the
reported issue:

#syz test: https://github.com/amir73il/linux fsnotify-fixes

Thanks,
Amir.

diff --git a/include/linux/fs.h b/include/linux/fs.h
index 2788df98080f8..a8822b44d4967 100644
--- a/include/linux/fs.h
+++ b/include/linux/fs.h
@@ -3033,13 +3033,27 @@ static inline void file_start_write(struct file *file)
        if (!S_ISREG(file_inode(file)->i_mode))
                return;
        sb_start_write(file_inode(file)->i_sb);
+       /*
+        * Prevent fault-in user pages that may call HSM hooks with
+        * sb_writers held.
+        */
+       if (unlikely(FMODE_FSNOTIFY_HSM(file->f_mode)))
+               pagefault_disable();
 }

 static inline bool file_start_write_trylock(struct file *file)
 {
        if (!S_ISREG(file_inode(file)->i_mode))
                return true;
-       return sb_start_write_trylock(file_inode(file)->i_sb);
+       if (!sb_start_write_trylock(file_inode(file)->i_sb))
+               return false;
+       /*
+        * Prevent fault-in user pages that may call HSM hooks with
+        * sb_writers held.
+        */
+       if (unlikely(FMODE_FSNOTIFY_HSM(file->f_mode)))
+               pagefault_disable();
+       return true;
 }

 /**
@@ -3053,6 +3067,8 @@ static inline void file_end_write(struct file *file)
        if (!S_ISREG(file_inode(file)->i_mode))
                return;
        sb_end_write(file_inode(file)->i_sb);
+       if (unlikely(FMODE_FSNOTIFY_HSM(file->f_mode)))
+               pagefault_enable();
 }


^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [syzbot] [xfs?] WARNING in fsnotify_file_area_perm
  2025-03-07 16:07                 ` Amir Goldstein
@ 2025-03-07 16:21                   ` syzbot
  2025-03-07 16:22                     ` Amir Goldstein
  2025-03-07 17:45                   ` Amir Goldstein
  1 sibling, 1 reply; 20+ messages in thread
From: syzbot @ 2025-03-07 16:21 UTC (permalink / raw)
  To: akpm, amir73il, axboe, brauner, cem, chandan.babu, djwong, jack,
	josef, linux-fsdevel, linux-kernel, linux-mm, linux-xfs,
	syzkaller-bugs

Hello,

syzbot tried to test the proposed patch but the build/boot failed:

failed to apply patch:
checking file include/linux/fs.h
patch: **** unexpected end of file in patch

Tested on:

commit:         ea33db4d fsnotify: avoid possible deadlock with HSM ho..
git tree:       https://github.com/amir73il/linux fsnotify-fixes
kernel config:  https://syzkaller.appspot.com/x/.config?x=d6b7e15dc5b5e776
dashboard link: https://syzkaller.appspot.com/bug?extid=7229071b47908b19d5b7
compiler:       
userspace arch: arm64
patch:          https://syzkaller.appspot.com/x/patch.diff?x=16fe4878580000

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [syzbot] [xfs?] WARNING in fsnotify_file_area_perm
  2025-03-07 16:21                   ` syzbot
@ 2025-03-07 16:22                     ` Amir Goldstein
  2025-03-07 16:49                       ` syzbot
  0 siblings, 1 reply; 20+ messages in thread
From: Amir Goldstein @ 2025-03-07 16:22 UTC (permalink / raw)
  To: syzbot
  Cc: akpm, axboe, brauner, cem, chandan.babu, djwong, jack, josef,
	linux-fsdevel, linux-kernel, linux-mm, linux-xfs, syzkaller-bugs

On Fri, Mar 7, 2025 at 5:21 PM syzbot
<syzbot+7229071b47908b19d5b7@syzkaller.appspotmail.com> wrote:
>
> Hello,
>
> syzbot tried to test the proposed patch but the build/boot failed:
>
> failed to apply patch:
> checking file include/linux/fs.h
> patch: **** unexpected end of file in patch
>
>
>
> Tested on:
>
> commit:         ea33db4d fsnotify: avoid possible deadlock with HSM ho..
> git tree:       https://github.com/amir73il/linux fsnotify-fixes
> kernel config:  https://syzkaller.appspot.com/x/.config?x=d6b7e15dc5b5e776
> dashboard link: https://syzkaller.appspot.com/bug?extid=7229071b47908b19d5b7
> compiler:
> userspace arch: arm64
> patch:          https://syzkaller.appspot.com/x/patch.diff?x=16fe4878580000
>

Let's try again - just the branch - no extra patch:

#syz test: https://github.com/amir73il/linux fsnotify-fixes

Thanks,
Amir


^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [syzbot] [xfs?] WARNING in fsnotify_file_area_perm
  2025-03-07 16:22                     ` Amir Goldstein
@ 2025-03-07 16:49                       ` syzbot
  0 siblings, 0 replies; 20+ messages in thread
From: syzbot @ 2025-03-07 16:49 UTC (permalink / raw)
  To: akpm, amir73il, axboe, brauner, cem, chandan.babu, djwong, jack,
	josef, linux-fsdevel, linux-kernel, linux-mm, linux-xfs,
	syzkaller-bugs

Hello,

syzbot has tested the proposed patch and the reproducer did not trigger any issue:

Reported-by: syzbot+7229071b47908b19d5b7@syzkaller.appspotmail.com
Tested-by: syzbot+7229071b47908b19d5b7@syzkaller.appspotmail.com

Tested on:

commit:         ea33db4d fsnotify: avoid possible deadlock with HSM ho..
git tree:       https://github.com/amir73il/linux fsnotify-fixes
console output: https://syzkaller.appspot.com/x/log.txt?x=12494878580000
kernel config:  https://syzkaller.appspot.com/x/.config?x=afb3000d0159783f
dashboard link: https://syzkaller.appspot.com/bug?extid=7229071b47908b19d5b7
compiler:       Debian clang version 15.0.6, GNU ld (GNU Binutils for Debian) 2.40
userspace arch: arm64

Note: no patches were applied.
Note: testing is done by a robot and is best-effort only.

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [syzbot] [xfs?] WARNING in fsnotify_file_area_perm
  2025-03-07 16:07                 ` Amir Goldstein
  2025-03-07 16:21                   ` syzbot
@ 2025-03-07 17:45                   ` Amir Goldstein
  2025-03-09 12:09                     ` Amir Goldstein
  1 sibling, 1 reply; 20+ messages in thread
From: Amir Goldstein @ 2025-03-07 17:45 UTC (permalink / raw)
  To: Josef Bacik
  Cc: Jan Kara, syzbot, akpm, axboe, brauner, cem, chandan.babu,
	djwong, linux-fsdevel, linux-kernel, linux-mm, linux-xfs,
	syzkaller-bugs

On Fri, Mar 7, 2025 at 5:07 PM Amir Goldstein <amir73il@gmail.com> wrote:
>
> On Fri, Mar 7, 2025 at 4:46 PM Josef Bacik <josef@toxicpanda.com> wrote:
> >
> > On Tue, Mar 04, 2025 at 10:13:39PM +0100, Amir Goldstein wrote:
> > > On Tue, Mar 4, 2025 at 9:37 PM Josef Bacik <josef@toxicpanda.com> wrote:
> > > >
> > > > On Tue, Mar 04, 2025 at 09:27:20PM +0100, Amir Goldstein wrote:
> > > > > On Tue, Mar 4, 2025 at 5:15 PM Josef Bacik <josef@toxicpanda.com> wrote:
> > > > > >
> > > > > > On Tue, Mar 04, 2025 at 04:09:16PM +0100, Amir Goldstein wrote:
> > > > > > > On Tue, Mar 4, 2025 at 12:06 PM Jan Kara <jack@suse.cz> wrote:
> > > > > > > >
> > > > > > > > Josef, Amir,
> > > > > > > >
> > > > > > > > this is indeed an interesting case:
> > > > > > > >
> > > > > > > > On Sun 02-03-25 08:32:30, syzbot wrote:
> > > > > > > > > syzbot has found a reproducer for the following issue on:
> > > > > > > > ...
> > > > > > > > > ------------[ cut here ]------------
> > > > > > > > > WARNING: CPU: 1 PID: 6440 at ./include/linux/fsnotify.h:145 fsnotify_file_area_perm+0x20c/0x25c include/linux/fsnotify.h:145
> > > > > > > > > Modules linked in:
> > > > > > > > > CPU: 1 UID: 0 PID: 6440 Comm: syz-executor370 Not tainted 6.14.0-rc4-syzkaller-ge056da87c780 #0
> > > > > > > > > Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 12/27/2024
> > > > > > > > > pstate: 80400005 (Nzcv daif +PAN -UAO -TCO -DIT -SSBS BTYPE=--)
> > > > > > > > > pc : fsnotify_file_area_perm+0x20c/0x25c include/linux/fsnotify.h:145
> > > > > > > > > lr : fsnotify_file_area_perm+0x20c/0x25c include/linux/fsnotify.h:145
> > > > > > > > > sp : ffff8000a42569d0
> > > > > > > > > x29: ffff8000a42569d0 x28: ffff0000dcec1b48 x27: ffff0000d68a1708
> > > > > > > > > x26: ffff0000d68a16c0 x25: dfff800000000000 x24: 0000000000008000
> > > > > > > > > x23: 0000000000000001 x22: ffff8000a4256b00 x21: 0000000000001000
> > > > > > > > > x20: 0000000000000010 x19: ffff0000d68a16c0 x18: ffff8000a42566e0
> > > > > > > > > x17: 000000000000e388 x16: ffff800080466c24 x15: 0000000000000001
> > > > > > > > > x14: 1fffe0001b31513c x13: 0000000000000000 x12: 0000000000000000
> > > > > > > > > x11: 0000000000000001 x10: 0000000000ff0100 x9 : 0000000000000000
> > > > > > > > > x8 : ffff0000c6d98000 x7 : 0000000000000000 x6 : 0000000000000000
> > > > > > > > > x5 : 0000000000000020 x4 : 0000000000000000 x3 : 0000000000001000
> > > > > > > > > x2 : ffff8000a4256b00 x1 : 0000000000000001 x0 : 0000000000000000
> > > > > > > > > Call trace:
> > > > > > > > >  fsnotify_file_area_perm+0x20c/0x25c include/linux/fsnotify.h:145 (P)
> > > > > > > > >  filemap_fault+0x12b0/0x1518 mm/filemap.c:3509
> > > > > > > > >  xfs_filemap_fault+0xc4/0x194 fs/xfs/xfs_file.c:1543
> > > > > > > > >  __do_fault+0xf8/0x498 mm/memory.c:4988
> > > > > > > > >  do_read_fault mm/memory.c:5403 [inline]
> > > > > > > > >  do_fault mm/memory.c:5537 [inline]
> > > > > > > > >  do_pte_missing mm/memory.c:4058 [inline]
> > > > > > > > >  handle_pte_fault+0x3504/0x57b0 mm/memory.c:5900
> > > > > > > > >  __handle_mm_fault mm/memory.c:6043 [inline]
> > > > > > > > >  handle_mm_fault+0xfa8/0x188c mm/memory.c:6212
> > > > > > > > >  do_page_fault+0x570/0x10a8 arch/arm64/mm/fault.c:690
> > > > > > > > >  do_translation_fault+0xc4/0x114 arch/arm64/mm/fault.c:783
> > > > > > > > >  do_mem_abort+0x74/0x200 arch/arm64/mm/fault.c:919
> > > > > > > > >  el1_abort+0x3c/0x5c arch/arm64/kernel/entry-common.c:432
> > > > > > > > >  el1h_64_sync_handler+0x60/0xcc arch/arm64/kernel/entry-common.c:510
> > > > > > > > >  el1h_64_sync+0x6c/0x70 arch/arm64/kernel/entry.S:595
> > > > > > > > >  __uaccess_mask_ptr arch/arm64/include/asm/uaccess.h:169 [inline] (P)
> > > > > > > > >  fault_in_readable+0x168/0x310 mm/gup.c:2234 (P)
> > > > > > > > >  fault_in_iov_iter_readable+0x1dc/0x22c lib/iov_iter.c:94
> > > > > > > > >  iomap_write_iter fs/iomap/buffered-io.c:950 [inline]
> > > > > > > > >  iomap_file_buffered_write+0x490/0xd54 fs/iomap/buffered-io.c:1039
> > > > > > > > >  xfs_file_buffered_write+0x2dc/0xac8 fs/xfs/xfs_file.c:792
> > > > > > > > >  xfs_file_write_iter+0x2c4/0x6ac fs/xfs/xfs_file.c:881
> > > > > > > > >  new_sync_write fs/read_write.c:586 [inline]
> > > > > > > > >  vfs_write+0x704/0xa9c fs/read_write.c:679
> > > > > > > >
> > > > > > > > The backtrace actually explains it all. We had a buffered write whose
> > > > > > > > buffer was mmapped file on a filesystem with an HSM mark. Now the prefaulting
> > > > > > > > of the buffer happens already (quite deep) under the filesystem freeze
> > > > > > > > protection (obtained in vfs_write()) which breaks assumptions of HSM code
> > > > > > > > and introduces potential deadlock of HSM handler in userspace with filesystem
> > > > > > > > freezing. So we need to think how to deal with this case...
> > > > > > >
> > > > > > > Ouch. It's like the splice mess all over again.
> > > > > > > Except we do not really care to make this use case work with HSM
> > > > > > > in the sense that we do not care to have to fill in the mmaped file content
> > > > > > > in this corner case - we just need to let HSM fail the access if content is
> > > > > > > not available.
> > > > > > >
> > > > > > > If you remember, in one of my very early version of pre-content events,
> > > > > > > the pre-content event (or maybe it was FAN_ACCESS_PERM itself)
> > > > > > > carried a flag (I think it was called FAN_PRE_VFS) to communicate to
> > > > > > > HSM service if it was safe to write to fs in the context of event handling.
> > > > > > >
> > > > > > > At the moment, I cannot think of any elegant way out of this use case
> > > > > > > except annotating the event from fault_in_readable() as "unsafe-for-write".
> > > > > > > This will relax the debugging code assertion and notify the HSM service
> > > > > > > (via an event flag) that it can ALLOW/DENY, but it cannot fill the file.
> > > > > > > Maybe we can reuse the FAN_ACCESS_PERM event to communicate
> > > > > > > this case to HSM service.
> > > > > > >
> > > > > > > WDYT?
> > > > > >
> > > > > > I think that mmap was a mistake.
> > > > >
> > > > > What do you mean?
> > > > > Isn't the fault hook required for your large executables use case?
> > > >
> > > > I mean the mmap syscall was a mistake ;).
> > > >
> > >
> > > ah :)
> > >
> > > > >
> > > > > >
> > > > > > Is there a way to tell if we're currently in a path that is under fsfreeze
> > > > > > protection?
> > > > >
> > > > > Not at the moment.
> > > > > At the moment, file_write_not_started() is not a reliable check
> > > > > (has false positives) without CONFIG_LOCKDEP.
> > > > >
> > >
> > > One very ugly solution is to require CONFIG_LOCKDEP for
> > > pre-content events.
> > >
> > > > > > Just denying this case would be a simpler short term solution while
> > > > > > we come up with a long term solution. I think your solution is fine, but I'd be
> > > > > > just as happy with a simpler "this isn't allowed" solution. Thanks,
> > > > >
> > > > > Yeh, I don't mind that, but it's a bit of an overkill considering that
> > > > > file with no content may in fact be rare.
> > > >
> > > > Agreed, I'm fine with your solution.
> > >
> > > Well, my "solution" was quite hand-wavy - it did not really say how to
> > > propagate the fact that faults initiated from fault_in_readable().
> > > Do you guys have any ideas for a simple solution?
> >
> > Sorry I've been elbow deep in helping getting our machine replacements working
> > faster.
> >
> > I've been thinking about this, it's not like we can carry context from the reason
> > we are faulting in, at least not simply, so I think the best thing to do is
> > either
> >
> > 1) Emit a precontent event at mmap() time for the whole file, since really all I
> > care about is faulting at exec time, and then we can just skip the precontent
> > event if we're not exec.
>
> Sorry, not that familiar with exec code. Do you mean to issue pre-content
> for page fault only if memory is mapped executable or is there another way
> of knowing that we are in exec context?
>
> If the former, then syzbot will catch up with us and write a buffer which is
> mapped readable and exec.
>
> >
> > 2) Revert the page fault stuff, put back your thing to fault the whole file, and
> > wait until we think of a better way to deal with this.
> >
> > Obviously I'd prefer not #2, but I'd really, really rather not chuck all of HSM
> > because my page fault thing is silly.  I'll carry what I need internally while
> > we figure out what to do upstream.  #1 doesn't seem bad, but I haven't thought
> > about it that hard.  Thanks,
> >
>
> So I started to test this patch, but I may be doing something very
> terribly wrong with this. Q: What is this something that is terribly wrong?
>
>
> diff --git a/include/linux/fs.h b/include/linux/fs.h
> index 2788df98080f8..a8822b44d4967 100644
> --- a/include/linux/fs.h
> +++ b/include/linux/fs.h
> @@ -3033,13 +3033,27 @@ static inline void file_start_write(struct file *file)
>         if (!S_ISREG(file_inode(file)->i_mode))
>                 return;
>         sb_start_write(file_inode(file)->i_sb);
> +       /*
> +        * Prevent fault-in user pages that may call HSM hooks with
> +        * sb_writers held.
> +        */
> +       if (unlikely(FMODE_FSNOTIFY_HSM(file->f_mode)))
> +               pagefault_disable();
>  }
>
>  static inline bool file_start_write_trylock(struct file *file)
>  {
>         if (!S_ISREG(file_inode(file)->i_mode))
>                 return true;
> -       return sb_start_write_trylock(file_inode(file)->i_sb);
> +       if (!sb_start_write_trylock(file_inode(file)->i_sb))
> +               return false;
> +       /*
> +        * Prevent fault-in user pages that may call HSM hooks with
> +        * sb_writers held.
> +        */
> +       if (unlikely(FMODE_FSNOTIFY_HSM(file->f_mode)))
> +               pagefault_disable();
> +       return true;
>  }
>
>  /**
> @@ -3053,6 +3067,8 @@ static inline void file_end_write(struct file *file)
>         if (!S_ISREG(file_inode(file)->i_mode))
>                 return;
>         sb_end_write(file_inode(file)->i_sb);
> +       if (unlikely(FMODE_FSNOTIFY_HSM(file->f_mode)))
> +               pagefault_enable();
>  }

One thing that is wrong is that this is checking if the written file is
marked for pre-content events, not the input buffer mmaped file.

What we would have needed here is a check of
  unlikely(fsnotify_sb_has_priority_watchers(sb,
                                             FSNOTIFY_PRIO_PRE_CONTENT))

But Linus will not like that...

Do we even care about optimizing the pre-content hooks of sporadic files
that are not marked for pre-content events when there are pre-content
watches on the filesystem?

I think all of our use cases mark the sb for pre-content events anyway
and do not care about a bit of overhead for non-marked files.
If that is the case we can do away with the extra optimization
and then the changes above will really solve the issue.
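
To make the mismatch concrete, here is a small userspace model of the three
cases (per-file gate, sb-level gate, per-file gate after dropping the
optimization). It is only a sketch: the struct and function names are
hypothetical, it assumes the written file and the mmapped buffer file live on
the same sb, and it assumes that a disabled page fault simply means the
pre-content hook never runs. None of this is kernel code.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical userspace model; none of these types are kernel API. */
struct model_sb {
	bool has_pre_content_watchers;	/* some pre-content mark exists on this fs */
};

struct model_file {
	struct model_sb *sb;
	bool hsm_mode;	/* stands in for FMODE_FSNOTIFY_HSM(file->f_mode) */
};

/* Gate as in the patch above: looks only at the file being written. */
static inline bool gate_on_written_file(const struct model_file *dst)
{
	return dst->hsm_mode;
}

/* Alternative gate: looks at the sb of the file being written. */
static inline bool gate_on_sb(const struct model_file *dst)
{
	return dst->sb->has_pre_content_watchers;
}

/*
 * dst is the file being written, src is the file mmapped as the write buffer.
 * Returns true if a pre-content hook on src would run while the writer holds
 * freeze protection, i.e. the situation the WARNING complains about.
 */
static inline bool hook_runs_under_sb_writers(const struct model_file *dst,
					      const struct model_file *src,
					      bool (*gate)(const struct model_file *))
{
	bool faults_disabled;

	assert(dst->sb && src->sb);
	faults_disabled = gate(dst);
	return src->hsm_mode && !faults_disabled;
}
```

With an unmarked dst and a marked src, the per-file gate still lets the hook
run, while the sb-level gate does not. Once every file on a watched sb gets
hsm_mode set, the per-file gate gives the same answer as the sb-level one.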

I've squashed the followup change to the fsnotify-fixes branch.

One thing that this patch does not address is aio and io_uring,
but the comment above fault_in_iov_iter_readable() says:
   "...For async buffered writes the assumption is that the user
   page has already been faulted in."

IDK. Let me know what you think.

Thanks,
Amir.

--- a/fs/notify/fsnotify.c
+++ b/fs/notify/fsnotify.c
@@ -652,7 +652,6 @@ void file_set_fsnotify_mode_from_watchers(struct file *file)
 {
        struct dentry *dentry = file->f_path.dentry, *parent;
        struct super_block *sb = dentry->d_sb;
-       __u32 mnt_mask, p_mask;

        /* Is it a file opened by fanotify? */
        if (FMODE_FSNOTIFY_NONE(file->f_mode))
@@ -681,30 +680,10 @@ void file_set_fsnotify_mode_from_watchers(struct file *file)
        }

        /*
-        * OK, there are some pre-content watchers. Check if anybody is
-        * watching for pre-content events on *this* file.
+        * OK, there are some pre-content watchers on this fs, so
+        * Enable pre-content events.
         */
-       mnt_mask = READ_ONCE(real_mount(file->f_path.mnt)->mnt_fsnotify_mask);
-       if (unlikely(fsnotify_object_watched(d_inode(dentry), mnt_mask,
-                                    FSNOTIFY_PRE_CONTENT_EVENTS))) {
-               /* Enable pre-content events */
-               file_set_fsnotify_mode(file, 0);
-               return;
-       }
-
-       /* Is parent watching for pre-content events on this file? */
-       if (dentry->d_flags & DCACHE_FSNOTIFY_PARENT_WATCHED) {
-               parent = dget_parent(dentry);
-               p_mask = fsnotify_inode_watches_children(d_inode(parent));
-               dput(parent);
-               if (p_mask & FSNOTIFY_PRE_CONTENT_EVENTS) {
-                       /* Enable pre-content events */
-                       file_set_fsnotify_mode(file, 0);
-                       return;
-               }
-       }
-       /* Nobody watching for pre-content events from this file */
-       file_set_fsnotify_mode(file, FMODE_NONOTIFY | FMODE_NONOTIFY_PERM);
+       file_set_fsnotify_mode(file, 0);
 }
 #endif



* Re: [syzbot] [xfs?] WARNING in fsnotify_file_area_perm
  2025-03-07 17:45                   ` Amir Goldstein
@ 2025-03-09 12:09                     ` Amir Goldstein
  2025-03-09 15:03                       ` Amir Goldstein
  0 siblings, 1 reply; 20+ messages in thread
From: Amir Goldstein @ 2025-03-09 12:09 UTC (permalink / raw)
  To: Josef Bacik
  Cc: Jan Kara, syzbot, akpm, axboe, brauner, cem, chandan.babu,
	djwong, linux-fsdevel, linux-kernel, linux-mm, linux-xfs,
	syzkaller-bugs

On Fri, Mar 7, 2025 at 6:45 PM Amir Goldstein <amir73il@gmail.com> wrote:
>
> On Fri, Mar 7, 2025 at 5:07 PM Amir Goldstein <amir73il@gmail.com> wrote:
> >
> > On Fri, Mar 7, 2025 at 4:46 PM Josef Bacik <josef@toxicpanda.com> wrote:
> > >
> > > On Tue, Mar 04, 2025 at 10:13:39PM +0100, Amir Goldstein wrote:
> > > > On Tue, Mar 4, 2025 at 9:37 PM Josef Bacik <josef@toxicpanda.com> wrote:
> > > > >
> > > > > On Tue, Mar 04, 2025 at 09:27:20PM +0100, Amir Goldstein wrote:
> > > > > > On Tue, Mar 4, 2025 at 5:15 PM Josef Bacik <josef@toxicpanda.com> wrote:
> > > > > > >
> > > > > > > On Tue, Mar 04, 2025 at 04:09:16PM +0100, Amir Goldstein wrote:
> > > > > > > > On Tue, Mar 4, 2025 at 12:06 PM Jan Kara <jack@suse.cz> wrote:
> > > > > > > > >
> > > > > > > > > Josef, Amir,
> > > > > > > > >
> > > > > > > > > this is indeed an interesting case:
> > > > > > > > >
> > > > > > > > > On Sun 02-03-25 08:32:30, syzbot wrote:
> > > > > > > > > > syzbot has found a reproducer for the following issue on:
> > > > > > > > > ...
> > > > > > > > > > ------------[ cut here ]------------
> > > > > > > > > > WARNING: CPU: 1 PID: 6440 at ./include/linux/fsnotify.h:145 fsnotify_file_area_perm+0x20c/0x25c include/linux/fsnotify.h:145
> > > > > > > > > > Modules linked in:
> > > > > > > > > > CPU: 1 UID: 0 PID: 6440 Comm: syz-executor370 Not tainted 6.14.0-rc4-syzkaller-ge056da87c780 #0
> > > > > > > > > > Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 12/27/2024
> > > > > > > > > > pstate: 80400005 (Nzcv daif +PAN -UAO -TCO -DIT -SSBS BTYPE=--)
> > > > > > > > > > pc : fsnotify_file_area_perm+0x20c/0x25c include/linux/fsnotify.h:145
> > > > > > > > > > lr : fsnotify_file_area_perm+0x20c/0x25c include/linux/fsnotify.h:145
> > > > > > > > > > sp : ffff8000a42569d0
> > > > > > > > > > x29: ffff8000a42569d0 x28: ffff0000dcec1b48 x27: ffff0000d68a1708
> > > > > > > > > > x26: ffff0000d68a16c0 x25: dfff800000000000 x24: 0000000000008000
> > > > > > > > > > x23: 0000000000000001 x22: ffff8000a4256b00 x21: 0000000000001000
> > > > > > > > > > x20: 0000000000000010 x19: ffff0000d68a16c0 x18: ffff8000a42566e0
> > > > > > > > > > x17: 000000000000e388 x16: ffff800080466c24 x15: 0000000000000001
> > > > > > > > > > x14: 1fffe0001b31513c x13: 0000000000000000 x12: 0000000000000000
> > > > > > > > > > x11: 0000000000000001 x10: 0000000000ff0100 x9 : 0000000000000000
> > > > > > > > > > x8 : ffff0000c6d98000 x7 : 0000000000000000 x6 : 0000000000000000
> > > > > > > > > > x5 : 0000000000000020 x4 : 0000000000000000 x3 : 0000000000001000
> > > > > > > > > > x2 : ffff8000a4256b00 x1 : 0000000000000001 x0 : 0000000000000000
> > > > > > > > > > Call trace:
> > > > > > > > > >  fsnotify_file_area_perm+0x20c/0x25c include/linux/fsnotify.h:145 (P)
> > > > > > > > > >  filemap_fault+0x12b0/0x1518 mm/filemap.c:3509
> > > > > > > > > >  xfs_filemap_fault+0xc4/0x194 fs/xfs/xfs_file.c:1543
> > > > > > > > > >  __do_fault+0xf8/0x498 mm/memory.c:4988
> > > > > > > > > >  do_read_fault mm/memory.c:5403 [inline]
> > > > > > > > > >  do_fault mm/memory.c:5537 [inline]
> > > > > > > > > >  do_pte_missing mm/memory.c:4058 [inline]
> > > > > > > > > >  handle_pte_fault+0x3504/0x57b0 mm/memory.c:5900
> > > > > > > > > >  __handle_mm_fault mm/memory.c:6043 [inline]
> > > > > > > > > >  handle_mm_fault+0xfa8/0x188c mm/memory.c:6212
> > > > > > > > > >  do_page_fault+0x570/0x10a8 arch/arm64/mm/fault.c:690
> > > > > > > > > >  do_translation_fault+0xc4/0x114 arch/arm64/mm/fault.c:783
> > > > > > > > > >  do_mem_abort+0x74/0x200 arch/arm64/mm/fault.c:919
> > > > > > > > > >  el1_abort+0x3c/0x5c arch/arm64/kernel/entry-common.c:432
> > > > > > > > > >  el1h_64_sync_handler+0x60/0xcc arch/arm64/kernel/entry-common.c:510
> > > > > > > > > >  el1h_64_sync+0x6c/0x70 arch/arm64/kernel/entry.S:595
> > > > > > > > > >  __uaccess_mask_ptr arch/arm64/include/asm/uaccess.h:169 [inline] (P)
> > > > > > > > > >  fault_in_readable+0x168/0x310 mm/gup.c:2234 (P)
> > > > > > > > > >  fault_in_iov_iter_readable+0x1dc/0x22c lib/iov_iter.c:94
> > > > > > > > > >  iomap_write_iter fs/iomap/buffered-io.c:950 [inline]
> > > > > > > > > >  iomap_file_buffered_write+0x490/0xd54 fs/iomap/buffered-io.c:1039
> > > > > > > > > >  xfs_file_buffered_write+0x2dc/0xac8 fs/xfs/xfs_file.c:792
> > > > > > > > > >  xfs_file_write_iter+0x2c4/0x6ac fs/xfs/xfs_file.c:881
> > > > > > > > > >  new_sync_write fs/read_write.c:586 [inline]
> > > > > > > > > >  vfs_write+0x704/0xa9c fs/read_write.c:679
> > > > > > > > >
> > > > > > > > > The backtrace actually explains it all. We had a buffered write whose
> > > > > > > > > buffer was mmapped file on a filesystem with an HSM mark. Now the prefaulting
> > > > > > > > > of the buffer happens already (quite deep) under the filesystem freeze
> > > > > > > > > protection (obtained in vfs_write()) which breaks assumptions of HSM code
> > > > > > > > > and introduces potential deadlock of HSM handler in userspace with filesystem
> > > > > > > > > freezing. So we need to think how to deal with this case...
> > > > > > > >
> > > > > > > > Ouch. It's like the splice mess all over again.
> > > > > > > > Except we do not really care to make this use case work with HSM
> > > > > > > > in the sense that we do not care to have to fill in the mmaped file content
> > > > > > > > in this corner case - we just need to let HSM fail the access if content is
> > > > > > > > not available.
> > > > > > > >
> > > > > > > > If you remember, in one of my very early version of pre-content events,
> > > > > > > > the pre-content event (or maybe it was FAN_ACCESS_PERM itself)
> > > > > > > > carried a flag (I think it was called FAN_PRE_VFS) to communicate to
> > > > > > > > HSM service if it was safe to write to fs in the context of event handling.
> > > > > > > >
> > > > > > > > At the moment, I cannot think of any elegant way out of this use case
> > > > > > > > except annotating the event from fault_in_readable() as "unsafe-for-write".
> > > > > > > > This will relax the debugging code assertion and notify the HSM service
> > > > > > > > (via an event flag) that it can ALLOW/DENY, but it cannot fill the file.
> > > > > > > > Maybe we can reuse the FAN_ACCESS_PERM event to communicate
> > > > > > > > this case to HSM service.
> > > > > > > >
> > > > > > > > WDYT?
> > > > > > >
> > > > > > > I think that mmap was a mistake.
> > > > > >
> > > > > > What do you mean?
> > > > > > Isn't the fault hook required for your large executables use case?
> > > > >
> > > > > I mean the mmap syscall was a mistake ;).
> > > > >
> > > >
> > > > ah :)
> > > >
> > > > > >
> > > > > > >
> > > > > > > Is there a way to tell if we're currently in a path that is under fsfreeze
> > > > > > > protection?
> > > > > >
> > > > > > Not at the moment.
> > > > > > At the moment, file_write_not_started() is not a reliable check
> > > > > > (has false positives) without CONFIG_LOCKDEP.
> > > > > >
> > > >
> > > > One very ugly solution is to require CONFIG_LOCKDEP for
> > > > pre-content events.
> > > >
> > > > > > > Just denying this case would be a simpler short term solution while
> > > > > > > we come up with a long term solution. I think your solution is fine, but I'd be
> > > > > > > just as happy with a simpler "this isn't allowed" solution. Thanks,
> > > > > >
> > > > > > Yeh, I don't mind that, but it's a bit of an overkill considering that
> > > > > > file with no content may in fact be rare.
> > > > >
> > > > > Agreed, I'm fine with your solution.
> > > >
> > > > Well, my "solution" was quite hand-wavy - it did not really say how to
> > > > propagate the fact that faults initiated from fault_in_readable().
> > > > Do you guys have any ideas for a simple solution?
> > >
> > > Sorry I've been elbow deep in helping getting our machine replacements working
> > > faster.
> > >
> > > I've been thinking about this, it's not like we can carry context from the reason
> > > we are faulting in, at least not simply, so I think the best thing to do is
> > > either
> > >
> > > 1) Emit a precontent event at mmap() time for the whole file, since really all I
> > > care about is faulting at exec time, and then we can just skip the precontent
> > > event if we're not exec.
> >
> > Sorry, not that familiar with exec code. Do you mean to issue pre-content
> > for page fault only if memory is mapped executable or is there another way
> > of knowing that we are in exec context?
> >
> > If the former, then syzbot will catch up with us and write a buffer which is
> > mapped readable and exec.
> >

Oh, I was being silly.
You meant to call the hook from page fault only for FMODE_EXEC.
This makes sense to me. I will try to write it up.
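
Roughly, the decision would look like the userspace sketch below. The names
and bit values are placeholders made up for this model (they are not taken
from kernel headers), and this is not the code that ended up in any branch.
It only encodes the idea from option 1: emit the event at mmap() time, and
from the page fault path only when the file was opened for exec.

```c
#include <assert.h>
#include <stdbool.h>

/* Placeholder bit values for this model only; not copied from kernel headers. */
#define MODEL_FMODE_EXEC	(1u << 0)	/* file was opened for execution */
#define MODEL_FMODE_HSM		(1u << 1)	/* pre-content events enabled for file */

/* Where the would-be pre-content event originates (hypothetical enum). */
enum model_origin {
	MODEL_ORIGIN_READ,	/* read()/write() style access */
	MODEL_ORIGIN_MMAP,	/* mmap() time, covering the mapped range */
	MODEL_ORIGIN_FAULT,	/* page fault on an existing mapping */
};

/* Returns true if a pre-content event should be emitted in this model. */
static inline bool model_emit_pre_content(unsigned int f_mode,
					  enum model_origin origin)
{
	assert(origin == MODEL_ORIGIN_READ || origin == MODEL_ORIGIN_MMAP ||
	       origin == MODEL_ORIGIN_FAULT);

	/* No pre-content events enabled for this file: nothing to emit. */
	if (!(f_mode & MODEL_FMODE_HSM))
		return false;

	/* Page faults only emit the event for files opened for exec. */
	if (origin == MODEL_ORIGIN_FAULT)
		return (f_mode & MODEL_FMODE_EXEC) != 0;

	/* read/write and mmap() time accesses emit the event as before. */
	return true;
}
```

In this model a fault on a plain mmapped buffer (no exec bit) never reaches
the hook, so the write() path in the report would not call into HSM with
sb_writers held.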

> > >
> > > 2) Revert the page fault stuff, put back your thing to fault the whole file, and
> > > wait until we think of a better way to deal with this.
> > >
> > > Obviously I'd prefer not #2, but I'd really, really rather not chuck all of HSM
> > > because my page fault thing is silly.  I'll carry what I need internally while
> > > we figure out what to do upstream.  #1 doesn't seem bad, but I haven't thought
> > > about it that hard.  Thanks,
> > >
> >
> > So I started to test this patch, but I may be doing something very terribly
> > wrong with this. Q: What is this something that is terribly wrong?
> >
> >
> > diff --git a/include/linux/fs.h b/include/linux/fs.h
> > index 2788df98080f8..a8822b44d4967 100644
> > --- a/include/linux/fs.h
> > +++ b/include/linux/fs.h
> > @@ -3033,13 +3033,27 @@ static inline void file_start_write(struct file *file)
> >         if (!S_ISREG(file_inode(file)->i_mode))
> >                 return;
> >         sb_start_write(file_inode(file)->i_sb);
> > +       /*
> > +        * Prevent fault-in user pages that may call HSM hooks with
> > +        * sb_writers held.
> > +        */
> > +       if (unlikely(FMODE_FSNOTIFY_HSM(file->f_mode)))
> > +               pagefault_disable();
> >  }
> >
> >  static inline bool file_start_write_trylock(struct file *file)
> >  {
> >         if (!S_ISREG(file_inode(file)->i_mode))
> >                 return true;
> > -       return sb_start_write_trylock(file_inode(file)->i_sb);
> > +       if (!sb_start_write_trylock(file_inode(file)->i_sb))
> > +               return false;
> > +       /*
> > +        * Prevent fault-in user pages that may call HSM hooks with
> > +        * sb_writers held.
> > +        */
> > +       if (unlikely(FMODE_FSNOTIFY_HSM(file->f_mode)))
> > +               pagefault_disable();
> > +       return true;
> >  }
> >
> >  /**
> > @@ -3053,6 +3067,8 @@ static inline void file_end_write(struct file *file)
> >         if (!S_ISREG(file_inode(file)->i_mode))
> >                 return;
> >         sb_end_write(file_inode(file)->i_sb);
> > +       if (unlikely(FMODE_FSNOTIFY_HSM(file->f_mode)))
> > +               pagefault_enable();
> >  }
>
> One thing that is wrong is that this is checking if the written file is
> marked for pre-content events, not the input buffer mmaped file.
>
> What we would have needed here is a check of
>   unlikely(fsnotify_sb_has_priority_watchers(sb,
>                                              FSNOTIFY_PRIO_PRE_CONTENT))
>
> But Linus will not like that...
>
> Do we even care about optimizing the pre-content hooks of sporadic files
> that are not marked for pre-content events when there are pre-content
> watches on the filesystem?
>
> I think all of our use cases mark the sb for pre-content events anyway
> and do not care about a bit of overhead for non-marked files.
> If that is the case we can do away with the extra optimization
> and then the changes above will really solve the issue.
>
> I've squashed the followup change to the fsnotify-fixes branch.

This was actually a partial revert of commit 318652e07fa5b ("fsnotify:
check if file is actually being watched for pre-content events on open"),
so posted it as a separate patch.

I am not sure if we need this if we go the route of event on mmap(),
but posted the patches so we have them if we decide that they are useful.

Thanks,
Amir.



* Re: [syzbot] [xfs?] WARNING in fsnotify_file_area_perm
  2025-03-09 12:09                     ` Amir Goldstein
@ 2025-03-09 15:03                       ` Amir Goldstein
  2025-03-09 16:20                         ` syzbot
  0 siblings, 1 reply; 20+ messages in thread
From: Amir Goldstein @ 2025-03-09 15:03 UTC (permalink / raw)
  To: Josef Bacik
  Cc: Jan Kara, syzbot, akpm, axboe, brauner, cem, chandan.babu,
	djwong, linux-fsdevel, linux-kernel, linux-mm, linux-xfs,
	syzkaller-bugs

On Sun, Mar 9, 2025 at 1:09 PM Amir Goldstein <amir73il@gmail.com> wrote:
>
> On Fri, Mar 7, 2025 at 6:45 PM Amir Goldstein <amir73il@gmail.com> wrote:
> >
> > On Fri, Mar 7, 2025 at 5:07 PM Amir Goldstein <amir73il@gmail.com> wrote:
> > >
> > > On Fri, Mar 7, 2025 at 4:46 PM Josef Bacik <josef@toxicpanda.com> wrote:
> > > >
> > > > On Tue, Mar 04, 2025 at 10:13:39PM +0100, Amir Goldstein wrote:
> > > > > On Tue, Mar 4, 2025 at 9:37 PM Josef Bacik <josef@toxicpanda.com> wrote:
> > > > > >
> > > > > > On Tue, Mar 04, 2025 at 09:27:20PM +0100, Amir Goldstein wrote:
> > > > > > > On Tue, Mar 4, 2025 at 5:15 PM Josef Bacik <josef@toxicpanda.com> wrote:
> > > > > > > >
> > > > > > > > On Tue, Mar 04, 2025 at 04:09:16PM +0100, Amir Goldstein wrote:
> > > > > > > > > On Tue, Mar 4, 2025 at 12:06 PM Jan Kara <jack@suse.cz> wrote:
> > > > > > > > > >
> > > > > > > > > > Josef, Amir,
> > > > > > > > > >
> > > > > > > > > > this is indeed an interesting case:
> > > > > > > > > >
> > > > > > > > > > On Sun 02-03-25 08:32:30, syzbot wrote:
> > > > > > > > > > > syzbot has found a reproducer for the following issue on:
> > > > > > > > > > ...
> > > > > > > > > > > ------------[ cut here ]------------
> > > > > > > > > > > WARNING: CPU: 1 PID: 6440 at ./include/linux/fsnotify.h:145 fsnotify_file_area_perm+0x20c/0x25c include/linux/fsnotify.h:145
> > > > > > > > > > > Modules linked in:
> > > > > > > > > > > CPU: 1 UID: 0 PID: 6440 Comm: syz-executor370 Not tainted 6.14.0-rc4-syzkaller-ge056da87c780 #0
> > > > > > > > > > > Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 12/27/2024
> > > > > > > > > > > pstate: 80400005 (Nzcv daif +PAN -UAO -TCO -DIT -SSBS BTYPE=--)
> > > > > > > > > > > pc : fsnotify_file_area_perm+0x20c/0x25c include/linux/fsnotify.h:145
> > > > > > > > > > > lr : fsnotify_file_area_perm+0x20c/0x25c include/linux/fsnotify.h:145
> > > > > > > > > > > sp : ffff8000a42569d0
> > > > > > > > > > > x29: ffff8000a42569d0 x28: ffff0000dcec1b48 x27: ffff0000d68a1708
> > > > > > > > > > > x26: ffff0000d68a16c0 x25: dfff800000000000 x24: 0000000000008000
> > > > > > > > > > > x23: 0000000000000001 x22: ffff8000a4256b00 x21: 0000000000001000
> > > > > > > > > > > x20: 0000000000000010 x19: ffff0000d68a16c0 x18: ffff8000a42566e0
> > > > > > > > > > > x17: 000000000000e388 x16: ffff800080466c24 x15: 0000000000000001
> > > > > > > > > > > x14: 1fffe0001b31513c x13: 0000000000000000 x12: 0000000000000000
> > > > > > > > > > > x11: 0000000000000001 x10: 0000000000ff0100 x9 : 0000000000000000
> > > > > > > > > > > x8 : ffff0000c6d98000 x7 : 0000000000000000 x6 : 0000000000000000
> > > > > > > > > > > x5 : 0000000000000020 x4 : 0000000000000000 x3 : 0000000000001000
> > > > > > > > > > > x2 : ffff8000a4256b00 x1 : 0000000000000001 x0 : 0000000000000000
> > > > > > > > > > > Call trace:
> > > > > > > > > > >  fsnotify_file_area_perm+0x20c/0x25c include/linux/fsnotify.h:145 (P)
> > > > > > > > > > >  filemap_fault+0x12b0/0x1518 mm/filemap.c:3509
> > > > > > > > > > >  xfs_filemap_fault+0xc4/0x194 fs/xfs/xfs_file.c:1543
> > > > > > > > > > >  __do_fault+0xf8/0x498 mm/memory.c:4988
> > > > > > > > > > >  do_read_fault mm/memory.c:5403 [inline]
> > > > > > > > > > >  do_fault mm/memory.c:5537 [inline]
> > > > > > > > > > >  do_pte_missing mm/memory.c:4058 [inline]
> > > > > > > > > > >  handle_pte_fault+0x3504/0x57b0 mm/memory.c:5900
> > > > > > > > > > >  __handle_mm_fault mm/memory.c:6043 [inline]
> > > > > > > > > > >  handle_mm_fault+0xfa8/0x188c mm/memory.c:6212
> > > > > > > > > > >  do_page_fault+0x570/0x10a8 arch/arm64/mm/fault.c:690
> > > > > > > > > > >  do_translation_fault+0xc4/0x114 arch/arm64/mm/fault.c:783
> > > > > > > > > > >  do_mem_abort+0x74/0x200 arch/arm64/mm/fault.c:919
> > > > > > > > > > >  el1_abort+0x3c/0x5c arch/arm64/kernel/entry-common.c:432
> > > > > > > > > > >  el1h_64_sync_handler+0x60/0xcc arch/arm64/kernel/entry-common.c:510
> > > > > > > > > > >  el1h_64_sync+0x6c/0x70 arch/arm64/kernel/entry.S:595
> > > > > > > > > > >  __uaccess_mask_ptr arch/arm64/include/asm/uaccess.h:169 [inline] (P)
> > > > > > > > > > >  fault_in_readable+0x168/0x310 mm/gup.c:2234 (P)
> > > > > > > > > > >  fault_in_iov_iter_readable+0x1dc/0x22c lib/iov_iter.c:94
> > > > > > > > > > >  iomap_write_iter fs/iomap/buffered-io.c:950 [inline]
> > > > > > > > > > >  iomap_file_buffered_write+0x490/0xd54 fs/iomap/buffered-io.c:1039
> > > > > > > > > > >  xfs_file_buffered_write+0x2dc/0xac8 fs/xfs/xfs_file.c:792
> > > > > > > > > > >  xfs_file_write_iter+0x2c4/0x6ac fs/xfs/xfs_file.c:881
> > > > > > > > > > >  new_sync_write fs/read_write.c:586 [inline]
> > > > > > > > > > >  vfs_write+0x704/0xa9c fs/read_write.c:679
> > > > > > > > > >
> > > > > > > > > > The backtrace actually explains it all. We had a buffered write whose
> > > > > > > > > > buffer was mmapped file on a filesystem with an HSM mark. Now the prefaulting
> > > > > > > > > > of the buffer happens already (quite deep) under the filesystem freeze
> > > > > > > > > > protection (obtained in vfs_write()) which breaks assumptions of HSM code
> > > > > > > > > > and introduces potential deadlock of HSM handler in userspace with filesystem
> > > > > > > > > > freezing. So we need to think how to deal with this case...
> > > > > > > > >
> > > > > > > > > Ouch. It's like the splice mess all over again.
> > > > > > > > > Except we do not really care to make this use case work with HSM
> > > > > > > > > in the sense that we do not care to have to fill in the mmaped file content
> > > > > > > > > in this corner case - we just need to let HSM fail the access if content is
> > > > > > > > > not available.
> > > > > > > > >
> > > > > > > > > If you remember, in one of my very early version of pre-content events,
> > > > > > > > > the pre-content event (or maybe it was FAN_ACCESS_PERM itself)
> > > > > > > > > carried a flag (I think it was called FAN_PRE_VFS) to communicate to
> > > > > > > > > HSM service if it was safe to write to fs in the context of event handling.
> > > > > > > > >
> > > > > > > > > At the moment, I cannot think of any elegant way out of this use case
> > > > > > > > > except annotating the event from fault_in_readable() as "unsafe-for-write".
> > > > > > > > > This will relax the debugging code assertion and notify the HSM service
> > > > > > > > > (via an event flag) that it can ALLOW/DENY, but it cannot fill the file.
> > > > > > > > > Maybe we can reuse the FAN_ACCESS_PERM event to communicate
> > > > > > > > > this case to HSM service.
> > > > > > > > >
> > > > > > > > > WDYT?
> > > > > > > >
> > > > > > > > I think that mmap was a mistake.
> > > > > > >
> > > > > > > What do you mean?
> > > > > > > Isn't the fault hook required for your large executables use case?
> > > > > >
> > > > > > I mean the mmap syscall was a mistake ;).
> > > > > >
> > > > >
> > > > > ah :)
> > > > >
> > > > > > >
> > > > > > > >
> > > > > > > > Is there a way to tell if we're currently in a path that is under fsfreeze
> > > > > > > > protection?
> > > > > > >
> > > > > > > Not at the moment.
> > > > > > > At the moment, file_write_not_started() is not a reliable check
> > > > > > > (has false positives) without CONFIG_LOCKDEP.
> > > > > > >
> > > > >
> > > > > One very ugly solution is to require CONFIG_LOCKDEP for
> > > > > pre-content events.
> > > > >
> > > > > > > > Just denying this case would be a simpler short term solution while
> > > > > > > > we come up with a long term solution. I think your solution is fine, but I'd be
> > > > > > > > just as happy with a simpler "this isn't allowed" solution. Thanks,
> > > > > > >
> > > > > > > Yeh, I don't mind that, but it's a bit of an overkill considering that
> > > > > > > file with no content may in fact be rare.
> > > > > >
> > > > > > Agreed, I'm fine with your solution.
> > > > >
> > > > > Well, my "solution" was quite hand-wavy - it did not really say how to
> > > > > propagate the fact that faults initiated from fault_in_readable().
> > > > > Do you guys have any ideas for a simple solution?
> > > >
> > > > Sorry I've been elbow deep in helping getting our machine replacements working
> > > > faster.
> > > >
> > > > I've been thinking about this, it's not like we can carry context from the reason
> > > > we are faulting in, at least not simply, so I think the best thing to do is
> > > > either
> > > >
> > > > 1) Emit a precontent event at mmap() time for the whole file, since really all I
> > > > care about is faulting at exec time, and then we can just skip the precontent
> > > > event if we're not exec.
> > >
> > > Sorry, not that familiar with exec code. Do you mean to issue pre-content
> > > for page fault only if memory is mapped executable or is there another way
> > > of knowing that we are in exec context?
> > >
> > > If the former, then syzbot will catch up with us and write a buffer which is
> > > mapped readable and exec.
> > >
>
> Oh, I was being silly.
> You meant to call the hook from page fault only for FMODE_EXEC.
> This makes sense to me. I will try to write it up.
>

Let's see if that works:

#syz test: https://github.com/amir73il/linux fsnotify-mmap

So far only compile and sanity tested.

Thanks,
Amir.



* Re: [syzbot] [xfs?] WARNING in fsnotify_file_area_perm
  2025-03-09 15:03                       ` Amir Goldstein
@ 2025-03-09 16:20                         ` syzbot
  0 siblings, 0 replies; 20+ messages in thread
From: syzbot @ 2025-03-09 16:20 UTC (permalink / raw)
  To: akpm, amir73il, axboe, brauner, cem, chandan.babu, djwong, jack,
	josef, linux-fsdevel, linux-kernel, linux-mm, linux-xfs,
	syzkaller-bugs

Hello,

syzbot has tested the proposed patch and the reproducer did not trigger any issue:

Reported-by: syzbot+7229071b47908b19d5b7@syzkaller.appspotmail.com
Tested-by: syzbot+7229071b47908b19d5b7@syzkaller.appspotmail.com

Tested on:

commit:         b63f532f fsnotify: avoid pre-content events when fault..
git tree:       https://github.com/amir73il/linux fsnotify-mmap
console output: https://syzkaller.appspot.com/x/log.txt?x=11fd1fa0580000
kernel config:  https://syzkaller.appspot.com/x/.config?x=afb3000d0159783f
dashboard link: https://syzkaller.appspot.com/bug?extid=7229071b47908b19d5b7
compiler:       Debian clang version 15.0.6, GNU ld (GNU Binutils for Debian) 2.40
userspace arch: arm64

Note: no patches were applied.
Note: testing is done by a robot and is best-effort only.


end of thread, other threads:[~2025-03-09 16:20 UTC | newest]

Thread overview: 20+ messages
2025-02-06  9:59 [syzbot] [mm?] WARNING in fsnotify_file_area_perm syzbot
2025-02-07  0:54 ` Andrew Morton
2025-02-07  8:45   ` Christian Brauner
2025-02-07 19:33     ` Amir Goldstein
2025-03-02 16:32 ` [syzbot] [xfs?] " syzbot
2025-03-04 11:06   ` Jan Kara
2025-03-04 15:09     ` Amir Goldstein
2025-03-04 16:15       ` Josef Bacik
2025-03-04 20:27         ` Amir Goldstein
2025-03-04 20:36           ` Josef Bacik
2025-03-04 21:13             ` Amir Goldstein
2025-03-07 15:46               ` Josef Bacik
2025-03-07 16:07                 ` Amir Goldstein
2025-03-07 16:21                   ` syzbot
2025-03-07 16:22                     ` Amir Goldstein
2025-03-07 16:49                       ` syzbot
2025-03-07 17:45                   ` Amir Goldstein
2025-03-09 12:09                     ` Amir Goldstein
2025-03-09 15:03                       ` Amir Goldstein
2025-03-09 16:20                         ` syzbot
