* HMM related use-after-free with amdgpu @ 2019-07-15 16:51 Michel Dänzer 2019-07-15 17:25 ` Jason Gunthorpe 0 siblings, 1 reply; 9+ messages in thread From: Michel Dänzer @ 2019-07-15 16:51 UTC (permalink / raw) To: amd-gfx; +Cc: Jérôme Glisse, linux-mm, Jason Gunthorpe [-- Attachment #1: Type: text/plain, Size: 534 bytes --] With a KASAN enabled kernel built from amd-staging-drm-next, the attached use-after-free is pretty reliably detected during a piglit gpu run. Any ideas? P.S. With my standard kernels without KASAN (currently 5.2.y + drm-next changes for 5.3), I'm having trouble lately completing a piglit run, running into various issues which look like memory corruption, so might be related. -- Earthling Michel Dänzer | https://www.amd.com Libre software enthusiast | Mesa and X developer [-- Attachment #2: kern.log --] [-- Type: text/x-log, Size: 8929 bytes --] Jul 15 18:09:29 kaveri kernel: [ 560.388751][T12568] ================================================================== Jul 15 18:09:29 kaveri kernel: [ 560.389063][T12568] BUG: KASAN: use-after-free in __mmu_notifier_release+0x286/0x3e0 Jul 15 18:09:29 kaveri kernel: [ 560.389068][T12568] Read of size 8 at addr ffff88835e1c7cb0 by task amd_pinned_memo/12568 Jul 15 18:09:29 kaveri kernel: [ 560.389071][T12568] Jul 15 18:09:29 kaveri kernel: [ 560.389077][T12568] CPU: 9 PID: 12568 Comm: amd_pinned_memo Tainted: G OE 5.2.0-rc1-00811-g2ad5a7d31bdf #125 Jul 15 18:09:29 kaveri kernel: [ 560.389080][T12568] Hardware name: Micro-Star International Co., Ltd. MS-7A34/B350 TOMAHAWK (MS-7A34), BIOS 1.80 09/13/2017 Jul 15 18:09:29 kaveri kernel: [ 560.389084][T12568] Call Trace: Jul 15 18:09:29 kaveri kernel: [ 560.389091][T12568] dump_stack+0x7c/0xc0 Jul 15 18:09:29 kaveri kernel: [ 560.389097][T12568] ? __mmu_notifier_release+0x286/0x3e0 Jul 15 18:09:29 kaveri kernel: [ 560.389101][T12568] print_address_description+0x65/0x22e Jul 15 18:09:29 kaveri kernel: [ 560.389106][T12568] ? 
__mmu_notifier_release+0x286/0x3e0 Jul 15 18:09:29 kaveri kernel: [ 560.389110][T12568] ? __mmu_notifier_release+0x286/0x3e0 Jul 15 18:09:29 kaveri kernel: [ 560.389115][T12568] __kasan_report.cold.3+0x1a/0x3d Jul 15 18:09:29 kaveri kernel: [ 560.389122][T12568] ? __mmu_notifier_release+0x286/0x3e0 Jul 15 18:09:29 kaveri kernel: [ 560.389128][T12568] kasan_report+0xe/0x20 Jul 15 18:09:29 kaveri kernel: [ 560.389132][T12568] __mmu_notifier_release+0x286/0x3e0 Jul 15 18:09:29 kaveri kernel: [ 560.389142][T12568] exit_mmap+0x93/0x400 Jul 15 18:09:29 kaveri kernel: [ 560.389146][T12568] ? quarantine_put+0xb7/0x150 Jul 15 18:09:29 kaveri kernel: [ 560.389151][T12568] ? do_munmap+0x10/0x10 Jul 15 18:09:29 kaveri kernel: [ 560.389156][T12568] ? lockdep_hardirqs_on+0x37f/0x560 Jul 15 18:09:29 kaveri kernel: [ 560.389165][T12568] ? __khugepaged_exit+0x2af/0x3e0 Jul 15 18:09:29 kaveri kernel: [ 560.389169][T12568] ? __khugepaged_exit+0x2af/0x3e0 Jul 15 18:09:29 kaveri kernel: [ 560.389174][T12568] ? rcu_read_lock_sched_held+0xd8/0x110 Jul 15 18:09:29 kaveri kernel: [ 560.389179][T12568] ? kmem_cache_free+0x279/0x2c0 Jul 15 18:09:29 kaveri kernel: [ 560.389185][T12568] ? __khugepaged_exit+0x2be/0x3e0 Jul 15 18:09:29 kaveri kernel: [ 560.389192][T12568] mmput+0xb2/0x390 Jul 15 18:09:29 kaveri kernel: [ 560.389199][T12568] do_exit+0x880/0x2a70 Jul 15 18:09:29 kaveri kernel: [ 560.389207][T12568] ? find_held_lock+0x33/0x1c0 Jul 15 18:09:29 kaveri kernel: [ 560.389213][T12568] ? mm_update_next_owner+0x5d0/0x5d0 Jul 15 18:09:29 kaveri kernel: [ 560.389218][T12568] ? __do_page_fault+0x41d/0xa20 Jul 15 18:09:29 kaveri kernel: [ 560.389226][T12568] ? lock_downgrade+0x620/0x620 Jul 15 18:09:29 kaveri kernel: [ 560.389232][T12568] ? 
handle_mm_fault+0x4ab/0x6a0 Jul 15 18:09:29 kaveri kernel: [ 560.389242][T12568] do_group_exit+0xf0/0x2e0 Jul 15 18:09:29 kaveri kernel: [ 560.389249][T12568] __x64_sys_exit_group+0x3a/0x50 Jul 15 18:09:29 kaveri kernel: [ 560.389255][T12568] do_syscall_64+0x9c/0x430 Jul 15 18:09:29 kaveri kernel: [ 560.389261][T12568] entry_SYSCALL_64_after_hwframe+0x49/0xbe Jul 15 18:09:29 kaveri kernel: [ 560.389266][T12568] RIP: 0033:0x7fc23d8ed9d6 Jul 15 18:09:29 kaveri kernel: [ 560.389271][T12568] Code: 00 4c 8b 0d bc 44 0f 00 eb 19 66 2e 0f 1f 84 00 00 00 00 00 89 d7 89 f0 0f 05 48 3d 00 f0 ff ff 77 22 f4 89 d7 44 89 c0 0f 05 <48> 3d 00 f0 ff ff 76 e2 f7 d8 64 41 89 01 eb da 66 2e 0f 1f 84 00 Jul 15 18:09:29 kaveri kernel: [ 560.389275][T12568] RSP: 002b:00007fff8c3bcfa8 EFLAGS: 00000246 ORIG_RAX: 00000000000000e7 Jul 15 18:09:29 kaveri kernel: [ 560.389280][T12568] RAX: ffffffffffffffda RBX: 00007fc23d9de760 RCX: 00007fc23d8ed9d6 Jul 15 18:09:29 kaveri kernel: [ 560.389283][T12568] RDX: 0000000000000000 RSI: 000000000000003c RDI: 0000000000000000 Jul 15 18:09:29 kaveri kernel: [ 560.389287][T12568] RBP: 0000000000000000 R08: 00000000000000e7 R09: ffffffffffffff48 Jul 15 18:09:29 kaveri kernel: [ 560.389290][T12568] R10: 0000000000000000 R11: 0000000000000246 R12: 00007fc23d9de760 Jul 15 18:09:29 kaveri kernel: [ 560.389293][T12568] R13: 00000000000004f0 R14: 00007fc23d9e7428 R15: 0000000000000000 Jul 15 18:09:29 kaveri kernel: [ 560.389306][T12568] Jul 15 18:09:29 kaveri kernel: [ 560.389309][T12568] Allocated by task 12568: Jul 15 18:09:29 kaveri kernel: [ 560.389314][T12568] save_stack+0x19/0x80 Jul 15 18:09:29 kaveri kernel: [ 560.389318][T12568] __kasan_kmalloc.constprop.8+0xc1/0xd0 Jul 15 18:09:29 kaveri kernel: [ 560.389323][T12568] hmm_get_or_create+0x8f/0x3f0 Jul 15 18:09:29 kaveri kernel: [ 560.389327][T12568] hmm_mirror_register+0x58/0x240 Jul 15 18:09:29 kaveri kernel: [ 560.389425][T12568] amdgpu_mn_get+0x37b/0x6c0 [amdgpu] Jul 15 18:09:29 kaveri kernel: [ 
560.389554][T12568] amdgpu_mn_register+0xf6/0x710 [amdgpu] Jul 15 18:09:29 kaveri kernel: [ 560.389656][T12568] amdgpu_gem_userptr_ioctl+0x6a3/0x8b0 [amdgpu] Jul 15 18:09:29 kaveri kernel: [ 560.389678][T12568] drm_ioctl_kernel+0x1c9/0x260 [drm] Jul 15 18:09:29 kaveri kernel: [ 560.389701][T12568] drm_ioctl+0x436/0x930 [drm] Jul 15 18:09:29 kaveri kernel: [ 560.389830][T12568] amdgpu_drm_ioctl+0xd0/0x1b0 [amdgpu] Jul 15 18:09:29 kaveri kernel: [ 560.389836][T12568] do_vfs_ioctl+0x193/0xfd0 Jul 15 18:09:29 kaveri kernel: [ 560.389839][T12568] ksys_ioctl+0x60/0x90 Jul 15 18:09:29 kaveri kernel: [ 560.389843][T12568] __x64_sys_ioctl+0x6f/0xb0 Jul 15 18:09:29 kaveri kernel: [ 560.389847][T12568] do_syscall_64+0x9c/0x430 Jul 15 18:09:29 kaveri kernel: [ 560.389851][T12568] entry_SYSCALL_64_after_hwframe+0x49/0xbe Jul 15 18:09:29 kaveri kernel: [ 560.389853][T12568] Jul 15 18:09:29 kaveri kernel: [ 560.389857][T12568] Freed by task 12568: Jul 15 18:09:29 kaveri kernel: [ 560.389860][T12568] save_stack+0x19/0x80 Jul 15 18:09:29 kaveri kernel: [ 560.389864][T12568] __kasan_slab_free+0x125/0x170 Jul 15 18:09:29 kaveri kernel: [ 560.389867][T12568] kfree+0xe2/0x290 Jul 15 18:09:29 kaveri kernel: [ 560.389871][T12568] __mmu_notifier_release+0xef/0x3e0 Jul 15 18:09:29 kaveri kernel: [ 560.389875][T12568] exit_mmap+0x93/0x400 Jul 15 18:09:29 kaveri kernel: [ 560.389879][T12568] mmput+0xb2/0x390 Jul 15 18:09:29 kaveri kernel: [ 560.389883][T12568] do_exit+0x880/0x2a70 Jul 15 18:09:29 kaveri kernel: [ 560.389886][T12568] do_group_exit+0xf0/0x2e0 Jul 15 18:09:29 kaveri kernel: [ 560.389890][T12568] __x64_sys_exit_group+0x3a/0x50 Jul 15 18:09:29 kaveri kernel: [ 560.389893][T12568] do_syscall_64+0x9c/0x430 Jul 15 18:09:29 kaveri kernel: [ 560.389897][T12568] entry_SYSCALL_64_after_hwframe+0x49/0xbe Jul 15 18:09:29 kaveri kernel: [ 560.389900][T12568] Jul 15 18:09:29 kaveri kernel: [ 560.389903][T12568] The buggy address belongs to the object at ffff88835e1c7c00 Jul 15 18:09:29 
kaveri kernel: [ 560.389903][T12568] which belongs to the cache kmalloc-512 of size 512 Jul 15 18:09:29 kaveri kernel: [ 560.389908][T12568] The buggy address is located 176 bytes inside of Jul 15 18:09:29 kaveri kernel: [ 560.389908][T12568] 512-byte region [ffff88835e1c7c00, ffff88835e1c7e00) Jul 15 18:09:29 kaveri kernel: [ 560.389911][T12568] The buggy address belongs to the page: Jul 15 18:09:29 kaveri kernel: [ 560.389915][T12568] page:ffffea000d787100 refcount:1 mapcount:0 mapping:ffff88837d80ec00 index:0x0 compound_mapcount: 0 Jul 15 18:09:29 kaveri kernel: [ 560.389921][T12568] flags: 0x17fffc000010200(slab|head) Jul 15 18:09:29 kaveri kernel: [ 560.389929][T12568] raw: 017fffc000010200 0000000000000000 0000000100000001 ffff88837d80ec00 Jul 15 18:09:29 kaveri kernel: [ 560.389933][T12568] raw: 0000000000000000 0000000000190019 00000001ffffffff 0000000000000000 Jul 15 18:09:29 kaveri kernel: [ 560.389936][T12568] page dumped because: kasan: bad access detected Jul 15 18:09:29 kaveri kernel: [ 560.389939][T12568] Jul 15 18:09:29 kaveri kernel: [ 560.389942][T12568] Memory state around the buggy address: Jul 15 18:09:29 kaveri kernel: [ 560.389946][T12568] ffff88835e1c7b80: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc Jul 15 18:09:29 kaveri kernel: [ 560.389949][T12568] ffff88835e1c7c00: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb Jul 15 18:09:29 kaveri kernel: [ 560.389953][T12568] >ffff88835e1c7c80: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb Jul 15 18:09:29 kaveri kernel: [ 560.389956][T12568] ^ Jul 15 18:09:29 kaveri kernel: [ 560.389960][T12568] ffff88835e1c7d00: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb Jul 15 18:09:29 kaveri kernel: [ 560.389963][T12568] ffff88835e1c7d80: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb Jul 15 18:09:29 kaveri kernel: [ 560.389966][T12568] ================================================================== ^ permalink raw reply [flat|nested] 9+ messages in thread
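The figures in the KASAN report above can be cross-checked by hand. The following is a minimal sketch of that arithmetic, using only the addresses printed in the log; the helper name `object_offset` is hypothetical and not part of KASAN or the kernel.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper: distance in bytes between an address and the
 * start of the slab object that KASAN reports it belongs to.
 */
static uint64_t object_offset(uint64_t addr, uint64_t object_start)
{
    return addr - object_start;
}

/* Values copied from the report: faulting address, object start, object end. */
static const uint64_t kasan_access = 0xffff88835e1c7cb0ULL;
static const uint64_t kasan_object = 0xffff88835e1c7c00ULL;
static const uint64_t kasan_end    = 0xffff88835e1c7e00ULL;
```

Calling `object_offset(kasan_access, kasan_object)` gives 0xb0, matching the "176 bytes inside of" line, and `object_offset(kasan_end, kasan_object)` gives 0x200, matching the 512-byte kmalloc-512 region.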
* Re: HMM related use-after-free with amdgpu 2019-07-15 16:51 HMM related use-after-free with amdgpu Michel Dänzer @ 2019-07-15 17:25 ` Jason Gunthorpe 2019-07-16 16:31 ` Michel Dänzer 0 siblings, 1 reply; 9+ messages in thread From: Jason Gunthorpe @ 2019-07-15 17:25 UTC (permalink / raw) To: Michel Dänzer; +Cc: amd-gfx, Jérôme Glisse, linux-mm On Mon, Jul 15, 2019 at 06:51:06PM +0200, Michel Dänzer wrote: > > With a KASAN enabled kernel built from amd-staging-drm-next, the > attached use-after-free is pretty reliably detected during a piglit gpu run. Does this branch you are testing have the hmm.git merged? I think from the name it does not? Use after free's of this nature were something that was fixed in hmm.git.. I don't see an obvious way you can hit something like this with the new code arrangement.. > P.S. With my standard kernels without KASAN (currently 5.2.y + drm-next > changes for 5.3), I'm having trouble lately completing a piglit run, > running into various issues which look like memory corruption, so might > be related. I'm skeptical that the AMDGPU implementation of the locking around the hmm_range & mirror is working, it doesn't follow the prescribed pattern at least. > Jul 15 18:09:29 kaveri kernel: [ 560.388751][T12568] ================================================================== > Jul 15 18:09:29 kaveri kernel: [ 560.389063][T12568] BUG: KASAN: use-after-free in __mmu_notifier_release+0x286/0x3e0 > Jul 15 18:09:29 kaveri kernel: [ 560.389068][T12568] Read of size 8 at addr ffff88835e1c7cb0 by task amd_pinned_memo/12568 > Jul 15 18:09:29 kaveri kernel: [ 560.389071][T12568] > Jul 15 18:09:29 kaveri kernel: [ 560.389077][T12568] CPU: 9 PID: 12568 Comm: amd_pinned_memo Tainted: G OE 5.2.0-rc1-00811-g2ad5a7d31bdf #125 > Jul 15 18:09:29 kaveri kernel: [ 560.389080][T12568] Hardware name: Micro-Star International Co., Ltd. 
MS-7A34/B350 TOMAHAWK (MS-7A34), BIOS 1.80 09/13/2017 > Jul 15 18:09:29 kaveri kernel: [ 560.389084][T12568] Call Trace: > Jul 15 18:09:29 kaveri kernel: [ 560.389091][T12568] dump_stack+0x7c/0xc0 > Jul 15 18:09:29 kaveri kernel: [ 560.389097][T12568] ? __mmu_notifier_release+0x286/0x3e0 > Jul 15 18:09:29 kaveri kernel: [ 560.389101][T12568] print_address_description+0x65/0x22e > Jul 15 18:09:29 kaveri kernel: [ 560.389106][T12568] ? __mmu_notifier_release+0x286/0x3e0 > Jul 15 18:09:29 kaveri kernel: [ 560.389110][T12568] ? __mmu_notifier_release+0x286/0x3e0 > Jul 15 18:09:29 kaveri kernel: [ 560.389115][T12568] __kasan_report.cold.3+0x1a/0x3d > Jul 15 18:09:29 kaveri kernel: [ 560.389122][T12568] ? __mmu_notifier_release+0x286/0x3e0 > Jul 15 18:09:29 kaveri kernel: [ 560.389128][T12568] kasan_report+0xe/0x20 > Jul 15 18:09:29 kaveri kernel: [ 560.389132][T12568] __mmu_notifier_release+0x286/0x3e0 So we are iterating over the mn list and touched free'd memory > Jul 15 18:09:29 kaveri kernel: [ 560.389309][T12568] Allocated by task 12568: > Jul 15 18:09:29 kaveri kernel: [ 560.389314][T12568] save_stack+0x19/0x80 > Jul 15 18:09:29 kaveri kernel: [ 560.389318][T12568] __kasan_kmalloc.constprop.8+0xc1/0xd0 > Jul 15 18:09:29 kaveri kernel: [ 560.389323][T12568] hmm_get_or_create+0x8f/0x3f0 The memory is probably a struct hmm > Jul 15 18:09:29 kaveri kernel: [ 560.389857][T12568] Freed by task 12568: > Jul 15 18:09:29 kaveri kernel: [ 560.389860][T12568] save_stack+0x19/0x80 > Jul 15 18:09:29 kaveri kernel: [ 560.389864][T12568] __kasan_slab_free+0x125/0x170 > Jul 15 18:09:29 kaveri kernel: [ 560.389867][T12568] kfree+0xe2/0x290 > Jul 15 18:09:29 kaveri kernel: [ 560.389871][T12568] __mmu_notifier_release+0xef/0x3e0 > Jul 15 18:09:29 kaveri kernel: [ 560.389875][T12568] exit_mmap+0x93/0x400 And the free was also done in notifier_release (presumably the backtrace is corrupt and this is really in the old hmm_release -> hmm_put -> hmm_free -> kfree call chain) Which was 
not OK, as __mmu_notifier_release doesn't use a 'safe' hlist iterator, so the release callback can never trigger kfree of a struct mmu_notifier. The new hmm.git code does not call kfree from release, it schedules that through a SRCU which won't run until __mmu_notifier_release returns, by definition. So should be fixed. Jason ^ permalink raw reply [flat|nested] 9+ messages in thread
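The hazard Jason describes can be illustrated outside the kernel. The following is a minimal userspace sketch, not the hmm.git or mmu_notifier code: the `struct notifier` type, the `defer_next` field, and every function name are hypothetical, and a simple queue stands in for the SRCU-deferred free. It shows a plain (non-"safe") list walk that reads `n->next` after the callback returns, and a release callback that therefore queues the object instead of freeing it.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for an object linked on a notifier list. */
struct notifier {
    struct notifier *next;        /* link read by the list walk */
    struct notifier *defer_next;  /* separate link for the deferred-free queue */
    void (*release)(struct notifier *n);
};

/* Objects whose free is postponed until the walk has returned. */
static struct notifier *deferred_head;

/* Release callback: queue the object rather than freeing it here. */
static void release_deferred(struct notifier *n)
{
    n->defer_next = deferred_head;
    deferred_head = n;
}

/*
 * Plain walk: n->next is read AFTER the callback returns, so a callback
 * that called free(n) would make this loop touch freed memory.
 */
static int walk_and_release(struct notifier *head)
{
    int visited = 0;

    for (struct notifier *n = head; n != NULL; n = n->next) {
        n->release(n);
        visited++;
    }
    return visited;
}

/* Runs only once the walk is finished; returns how many objects it freed. */
static int flush_deferred(void)
{
    int freed = 0;

    while (deferred_head != NULL) {
        struct notifier *n = deferred_head;

        deferred_head = n->defer_next;
        free(n);
        freed++;
    }
    return freed;
}

/* Build a list of `count` notifiers that all use the deferring callback. */
static struct notifier *make_list(int count)
{
    struct notifier *head = NULL;

    for (int i = 0; i < count; i++) {
        struct notifier *n = malloc(sizeof(*n));

        if (n == NULL)
            abort();
        n->next = head;
        n->defer_next = NULL;
        n->release = release_deferred;
        head = n;
    }
    return head;
}
```

A caller would build a list with `make_list()`, run `walk_and_release()` on it, and call `flush_deferred()` only after the walk has returned. The separate `defer_next` link matters: reusing `next` for the queue would rewrite the pointer the walk is about to follow.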
* Re: HMM related use-after-free with amdgpu 2019-07-15 17:25 ` Jason Gunthorpe @ 2019-07-16 16:31 ` Michel Dänzer 2019-07-16 16:35 ` Jason Gunthorpe 0 siblings, 1 reply; 9+ messages in thread From: Michel Dänzer @ 2019-07-16 16:31 UTC (permalink / raw) To: Jason Gunthorpe; +Cc: linux-mm, Jérôme Glisse, amd-gfx On 2019-07-15 7:25 p.m., Jason Gunthorpe wrote: > On Mon, Jul 15, 2019 at 06:51:06PM +0200, Michel Dänzer wrote: >> >> With a KASAN enabled kernel built from amd-staging-drm-next, the >> attached use-after-free is pretty reliably detected during a piglit gpu run. > > Does this branch you are testing have the hmm.git merged? I think from > the name it does not? Indeed, no. > Use after free's of this nature were something that was fixed in > hmm.git.. > > I don't see an obvious way you can hit something like this with the > new code arrangement.. I tried merging the hmm-devmem-cleanup.4 changes[0] into my 5.2.y + drm-next for 5.3 kernel. While the result didn't hit the problem, all GL_AMD_pinned_memory piglit tests failed, so I suspect the problem was simply avoided by not actually hitting the HMM related functionality. It's possible that I made a mistake in merging the changes, or that I missed some other required changes. But it's also possible that the HMM changes broke the corresponding user-pointer functionality in amdgpu. [0] Specifically, the following (ranges of) commits: 9ffbe8ac05dbb4ab4a4836a55a47fc6be945a38f (-> lockdep_assert_held_write) e1bfa87399e372446454ecbaeba2800f0a385733..5da04cc86d1215fd9fe0e5c88ead6e8428a75e56 fec88ab0af9706b2201e5daf377c5031c62d11f7^..fec88ab0af9706b2201e5daf377c5031c62d11f7 -- Earthling Michel Dänzer | https://www.amd.com Libre software enthusiast | Mesa and X developer ^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: HMM related use-after-free with amdgpu 2019-07-16 16:31 ` Michel Dänzer @ 2019-07-16 16:35 ` Jason Gunthorpe 2019-07-16 17:04 ` Michel Dänzer 0 siblings, 1 reply; 9+ messages in thread From: Jason Gunthorpe @ 2019-07-16 16:35 UTC (permalink / raw) To: Michel Dänzer; +Cc: linux-mm, Jérôme Glisse, amd-gfx On Tue, Jul 16, 2019 at 06:31:09PM +0200, Michel Dänzer wrote: > On 2019-07-15 7:25 p.m., Jason Gunthorpe wrote: > > On Mon, Jul 15, 2019 at 06:51:06PM +0200, Michel Dänzer wrote: > >> > >> With a KASAN enabled kernel built from amd-staging-drm-next, the > >> attached use-after-free is pretty reliably detected during a piglit gpu run. > > > > Does this branch you are testing have the hmm.git merged? I think from > > the name it does not? > > Indeed, no. > > > > Use after free's of this nature were something that was fixed in > > hmm.git.. > > > > I don't see an obvious way you can hit something like this with the > > new code arrangement.. > > I tried merging the hmm-devmem-cleanup.4 changes[0] into my 5.2.y + > drm-next for 5.3 kernel. While the result didn't hit the problem, all > GL_AMD_pinned_memory piglit tests failed, so I suspect the problem was > simply avoided by not actually hitting the HMM related functionality. > > It's possible that I made a mistake in merging the changes, or that I > missed some other required changes. But it's also possible that the HMM > changes broke the corresponding user-pointer functionality in amdgpu. Not sure, this was all Tested by the AMD team so it should work, I hope. It should all be sorted out in rc1, try again then? Jason ^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: HMM related use-after-free with amdgpu 2019-07-16 16:35 ` Jason Gunthorpe @ 2019-07-16 17:04 ` Michel Dänzer 2019-07-16 17:20 ` Jason Gunthorpe 2019-07-16 22:10 ` Kuehling, Felix 0 siblings, 2 replies; 9+ messages in thread From: Michel Dänzer @ 2019-07-16 17:04 UTC (permalink / raw) To: Jason Gunthorpe; +Cc: linux-mm, Jérôme Glisse, amd-gfx On 2019-07-16 6:35 p.m., Jason Gunthorpe wrote: > On Tue, Jul 16, 2019 at 06:31:09PM +0200, Michel Dänzer wrote: >> On 2019-07-15 7:25 p.m., Jason Gunthorpe wrote: >>> On Mon, Jul 15, 2019 at 06:51:06PM +0200, Michel Dänzer wrote: >>>> >>>> With a KASAN enabled kernel built from amd-staging-drm-next, the >>>> attached use-after-free is pretty reliably detected during a piglit gpu run. >>> >>> Does this branch you are testing have the hmm.git merged? I think from >>> the name it does not? >> >> Indeed, no. >> >> >>> Use after free's of this nature were something that was fixed in >>> hmm.git.. >>> >>> I don't see an obvious way you can hit something like this with the >>> new code arrangement.. >> >> I tried merging the hmm-devmem-cleanup.4 changes[0] into my 5.2.y + >> drm-next for 5.3 kernel. While the result didn't hit the problem, all >> GL_AMD_pinned_memory piglit tests failed, so I suspect the problem was >> simply avoided by not actually hitting the HMM related functionality. >> >> It's possible that I made a mistake in merging the changes, or that I >> missed some other required changes. But it's also possible that the HMM >> changes broke the corresponding user-pointer functionality in amdgpu. > > Not sure, this was all Tested by the AMD team so it should work, I > hope. It can't, due to the issue pointed out by Linus in the "drm pull for 5.3-rc1" thread: DRM_AMDGPU_USERPTR still depends on ARCH_HAS_HMM, which no longer exists, so it can't be enabled. Fixing that up manually, it successfully finished a piglit run with that functionality enabled as well. 
-- Earthling Michel Dänzer | https://www.amd.com Libre software enthusiast | Mesa and X developer ^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: HMM related use-after-free with amdgpu 2019-07-16 17:04 ` Michel Dänzer @ 2019-07-16 17:20 ` Jason Gunthorpe 2019-07-16 22:10 ` Kuehling, Felix 1 sibling, 0 replies; 9+ messages in thread From: Jason Gunthorpe @ 2019-07-16 17:20 UTC (permalink / raw) To: Michel Dänzer; +Cc: linux-mm, Jérôme Glisse, amd-gfx On Tue, Jul 16, 2019 at 07:04:52PM +0200, Michel Dänzer wrote: > On 2019-07-16 6:35 p.m., Jason Gunthorpe wrote: > > On Tue, Jul 16, 2019 at 06:31:09PM +0200, Michel Dänzer wrote: > >> On 2019-07-15 7:25 p.m., Jason Gunthorpe wrote: > >>> On Mon, Jul 15, 2019 at 06:51:06PM +0200, Michel Dänzer wrote: > >>>> > >>>> With a KASAN enabled kernel built from amd-staging-drm-next, the > >>>> attached use-after-free is pretty reliably detected during a piglit gpu run. > >>> > >>> Does this branch you are testing have the hmm.git merged? I think from > >>> the name it does not? > >> > >> Indeed, no. > >> > >> > >>> Use after free's of this nature were something that was fixed in > >>> hmm.git.. > >>> > >>> I don't see an obvious way you can hit something like this with the > >>> new code arrangement.. > >> > >> I tried merging the hmm-devmem-cleanup.4 changes[0] into my 5.2.y + > >> drm-next for 5.3 kernel. While the result didn't hit the problem, all > >> GL_AMD_pinned_memory piglit tests failed, so I suspect the problem was > >> simply avoided by not actually hitting the HMM related functionality. > >> > >> It's possible that I made a mistake in merging the changes, or that I > >> missed some other required changes. But it's also possible that the HMM > >> changes broke the corresponding user-pointer functionality in amdgpu. > > > > Not sure, this was all Tested by the AMD team so it should work, I > > hope. > > It can't, due to the issue pointed out by Linus in the "drm pull for > 5.3-rc1" thread: DRM_AMDGPU_USERPTR still depends on ARCH_HAS_HMM, which > no longer exists, so it can't be enabled. 
Somehow that merge resolution got missed, but I think the AMD folks must have included it when they did their merge & test. Jason ^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: HMM related use-after-free with amdgpu 2019-07-16 17:04 ` Michel Dänzer 2019-07-16 17:20 ` Jason Gunthorpe @ 2019-07-16 22:10 ` Kuehling, Felix 2019-07-17 7:47 ` Michel Dänzer 2019-07-17 11:34 ` Jason Gunthorpe 1 sibling, 2 replies; 9+ messages in thread From: Kuehling, Felix @ 2019-07-16 22:10 UTC (permalink / raw) To: Michel Dänzer, Jason Gunthorpe Cc: linux-mm, Jérôme Glisse, amd-gfx On 2019-07-16 1:04 p.m., Michel Dänzer wrote: > On 2019-07-16 6:35 p.m., Jason Gunthorpe wrote: >> On Tue, Jul 16, 2019 at 06:31:09PM +0200, Michel Dänzer wrote: >>> On 2019-07-15 7:25 p.m., Jason Gunthorpe wrote: >>>> On Mon, Jul 15, 2019 at 06:51:06PM +0200, Michel Dänzer wrote: >>>>> With a KASAN enabled kernel built from amd-staging-drm-next, the >>>>> attached use-after-free is pretty reliably detected during a piglit gpu run. >>>> Does this branch you are testing have the hmm.git merged? I think from >>>> the name it does not? >>> Indeed, no. >>> >>> >>>> Use after free's of this nature were something that was fixed in >>>> hmm.git.. >>>> >>>> I don't see an obvious way you can hit something like this with the >>>> new code arrangement.. >>> I tried merging the hmm-devmem-cleanup.4 changes[0] into my 5.2.y + >>> drm-next for 5.3 kernel. While the result didn't hit the problem, all >>> GL_AMD_pinned_memory piglit tests failed, so I suspect the problem was >>> simply avoided by not actually hitting the HMM related functionality. >>> >>> It's possible that I made a mistake in merging the changes, or that I >>> missed some other required changes. But it's also possible that the HMM >>> changes broke the corresponding user-pointer functionality in amdgpu. >> Not sure, this was all Tested by the AMD team so it should work, I >> hope. > It can't, due to the issue pointed out by Linus in the "drm pull for > 5.3-rc1" thread: DRM_AMDGPU_USERPTR still depends on ARCH_HAS_HMM, which > no longer exists, so it can't be enabled. 
As far as I can tell, Linus fixed this up in his merge commit be8454afc50f43016ca8b6130d9673bdd0bd56ec. Jason, is hmm.git going to get rebased or merged to pick up the amdgpu changes for HMM from master? Regards, Felix > > Fixing that up manually, it successfully finished a piglit run with that > functionality enabled as well. > > ^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: HMM related use-after-free with amdgpu 2019-07-16 22:10 ` Kuehling, Felix @ 2019-07-17 7:47 ` Michel Dänzer 2019-07-17 11:34 ` Jason Gunthorpe 1 sibling, 0 replies; 9+ messages in thread From: Michel Dänzer @ 2019-07-17 7:47 UTC (permalink / raw) To: Kuehling, Felix, Jason Gunthorpe Cc: linux-mm, Jérôme Glisse, amd-gfx On 2019-07-17 12:10 a.m., Kuehling, Felix wrote: > On 2019-07-16 1:04 p.m., Michel Dänzer wrote: >> On 2019-07-16 6:35 p.m., Jason Gunthorpe wrote: >>> On Tue, Jul 16, 2019 at 06:31:09PM +0200, Michel Dänzer wrote: >>>> On 2019-07-15 7:25 p.m., Jason Gunthorpe wrote: >>>>> On Mon, Jul 15, 2019 at 06:51:06PM +0200, Michel Dänzer wrote: >>>>>> With a KASAN enabled kernel built from amd-staging-drm-next, the >>>>>> attached use-after-free is pretty reliably detected during a piglit gpu run. >>>>> Does this branch you are testing have the hmm.git merged? I think from >>>>> the name it does not? >>>> Indeed, no. >>>> >>>> >>>>> Use after free's of this nature were something that was fixed in >>>>> hmm.git.. >>>>> >>>>> I don't see an obvious way you can hit something like this with the >>>>> new code arrangement.. >>>> I tried merging the hmm-devmem-cleanup.4 changes[0] into my 5.2.y + >>>> drm-next for 5.3 kernel. While the result didn't hit the problem, all >>>> GL_AMD_pinned_memory piglit tests failed, so I suspect the problem was >>>> simply avoided by not actually hitting the HMM related functionality. >>>> >>>> It's possible that I made a mistake in merging the changes, or that I >>>> missed some other required changes. But it's also possible that the HMM >>>> changes broke the corresponding user-pointer functionality in amdgpu. >>> Not sure, this was all Tested by the AMD team so it should work, I >>> hope. >> It can't, due to the issue pointed out by Linus in the "drm pull for >> 5.3-rc1" thread: DRM_AMDGPU_USERPTR still depends on ARCH_HAS_HMM, which >> no longer exists, so it can't be enabled. 
> > As far as I can tell, Linus fixed this up in his merge commit > be8454afc50f43016ca8b6130d9673bdd0bd56ec. Ah! That's the piece I was missing, since I had merged the drm-next changes before Linus did. Thanks Felix. Note that AFAICT it was basically luck that Linus noticed this and fixed it up. It would be better not to push our luck like this. :) -- Earthling Michel Dänzer | https://www.amd.com Libre software enthusiast | Mesa and X developer ^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: HMM related use-after-free with amdgpu 2019-07-16 22:10 ` Kuehling, Felix 2019-07-17 7:47 ` Michel Dänzer @ 2019-07-17 11:34 ` Jason Gunthorpe 1 sibling, 0 replies; 9+ messages in thread From: Jason Gunthorpe @ 2019-07-17 11:34 UTC (permalink / raw) To: Kuehling, Felix Cc: Michel Dänzer, linux-mm, Jérôme Glisse, amd-gfx On Tue, Jul 16, 2019 at 10:10:46PM +0000, Kuehling, Felix wrote: > On 2019-07-16 1:04 p.m., Michel Dänzer wrote: > > On 2019-07-16 6:35 p.m., Jason Gunthorpe wrote: > >> On Tue, Jul 16, 2019 at 06:31:09PM +0200, Michel Dänzer wrote: > >>> On 2019-07-15 7:25 p.m., Jason Gunthorpe wrote: > >>>> On Mon, Jul 15, 2019 at 06:51:06PM +0200, Michel Dänzer wrote: > >>>>> With a KASAN enabled kernel built from amd-staging-drm-next, the > >>>>> attached use-after-free is pretty reliably detected during a piglit gpu run. > >>>> Does this branch you are testing have the hmm.git merged? I think from > >>>> the name it does not? > >>> Indeed, no. > >>> > >>> > >>>> Use after free's of this nature were something that was fixed in > >>>> hmm.git.. > >>>> > >>>> I don't see an obvious way you can hit something like this with the > >>>> new code arrangement.. > >>> I tried merging the hmm-devmem-cleanup.4 changes[0] into my 5.2.y + > >>> drm-next for 5.3 kernel. While the result didn't hit the problem, all > >>> GL_AMD_pinned_memory piglit tests failed, so I suspect the problem was > >>> simply avoided by not actually hitting the HMM related functionality. > >>> > >>> It's possible that I made a mistake in merging the changes, or that I > >>> missed some other required changes. But it's also possible that the HMM > >>> changes broke the corresponding user-pointer functionality in amdgpu. > >> Not sure, this was all Tested by the AMD team so it should work, I > >> hope. > > It can't, due to the issue pointed out by Linus in the "drm pull for > > 5.3-rc1" thread: DRM_AMDGPU_USERPTR still depends on ARCH_HAS_HMM, which > > no longer exists, so it can't be enabled. 
> > As far as I can tell, Linus fixed this up in his merge commit > be8454afc50f43016ca8b6130d9673bdd0bd56ec. Jason, is hmm.git going to get > rebased or merged to pick up the amdgpu changes for HMM from master? It will be reset to -rc1 when it comes out, then we start all over again. Jason ^ permalink raw reply [flat|nested] 9+ messages in thread