* Re: A possible bug: Calling mutex_lock while holding spinlock
       [not found] ` <20170803153902.71ceaa3b435083fc2e112631@linux-foundation.org>
@ 2017-08-04 13:49 ` Kirill A. Shutemov
  2017-08-04 14:03   ` axie
  0 siblings, 1 reply; 5+ messages in thread
From: Kirill A. Shutemov @ 2017-08-04 13:49 UTC
  To: Andrew Morton; +Cc: axie, Alex Deucher, Writer, Tim, linux-mm

On Thu, Aug 03, 2017 at 03:39:02PM -0700, Andrew Morton wrote:
>
> (cc Kirill)
>
> On Thu, 3 Aug 2017 12:35:28 -0400 axie <axie@amd.com> wrote:
>
> > Hi Andrew,
> >
> > I got a report yesterday of "BUG: sleeping function called from
> > invalid context at kernel/locking/mutex.c".
> >
> > I checked the relevant functions. page_vma_mapped_walk() acquires a
> > spinlock; later, in the MMU notifier, amdgpu_mn_invalidate_page()
> > calls mutex_lock(), which triggers the BUG.
> >
> > page_vma_mapped_walk() was introduced recently by you in commits
> > c7ab0d2fdc840266b39db94538f74207ec2afbf6 and
> > ace71a19cec5eb430207c3269d8a2683f0574306.
> >
> > Would you advise how to proceed with this bug? Change
> > page_vma_mapped_walk() not to take the spinlock? Change
> > amdgpu_mn_invalidate_page() to use a spinlock instead? Or something
> > else?
>
> hm, as far as I can tell this was an unintended side-effect of
> c7ab0d2fd ("mm: convert try_to_unmap_one() to use
> page_vma_mapped_walk()"). Before that patch,
> mmu_notifier_invalidate_page() was not called under page_table_lock.
> After that patch, it is.
>
> Perhaps Kirill can suggest a fix?

Sorry for this.

What about the patch below?
* Re: A possible bug: Calling mutex_lock while holding spinlock
  2017-08-04 13:49 ` A possible bug: Calling mutex_lock while holding spinlock Kirill A. Shutemov
@ 2017-08-04 14:03 ` axie
  2017-08-08 16:51   ` axie
  0 siblings, 1 reply; 5+ messages in thread
From: axie @ 2017-08-04 14:03 UTC
  To: Kirill A. Shutemov, Andrew Morton
  Cc: Alex Deucher, Writer, Tim, linux-mm, Xie, AlexBin

Hi Kirill,

Thanks for the patch. I have sent the patch to the user asking whether
he can give it a try.

Regards,

Alex (Bin) Xie

On 2017-08-04 09:49 AM, Kirill A. Shutemov wrote:
> On Thu, Aug 03, 2017 at 03:39:02PM -0700, Andrew Morton wrote:
>> (cc Kirill)
>>
>> On Thu, 3 Aug 2017 12:35:28 -0400 axie <axie@amd.com> wrote:
>>
>>> Hi Andrew,
>>>
>>> I got a report yesterday of "BUG: sleeping function called from
>>> invalid context at kernel/locking/mutex.c".
>>>
>>> I checked the relevant functions. page_vma_mapped_walk() acquires a
>>> spinlock; later, in the MMU notifier, amdgpu_mn_invalidate_page()
>>> calls mutex_lock(), which triggers the BUG.
>>>
>>> page_vma_mapped_walk() was introduced recently by you in commits
>>> c7ab0d2fdc840266b39db94538f74207ec2afbf6 and
>>> ace71a19cec5eb430207c3269d8a2683f0574306.
>>>
>>> Would you advise how to proceed with this bug? Change
>>> page_vma_mapped_walk() not to take the spinlock? Change
>>> amdgpu_mn_invalidate_page() to use a spinlock instead? Or something
>>> else?
>>>
>> hm, as far as I can tell this was an unintended side-effect of
>> c7ab0d2fd ("mm: convert try_to_unmap_one() to use
>> page_vma_mapped_walk()"). Before that patch,
>> mmu_notifier_invalidate_page() was not called under page_table_lock.
>> After that patch, it is.
>>
>> Perhaps Kirill can suggest a fix?
> Sorry for this.
>
> What about the patch below?
>
> From f48dbcdd0ed83dee9a157062b7ca1e2915172678 Mon Sep 17 00:00:00 2001
> From: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>
> Date: Fri, 4 Aug 2017 16:37:26 +0300
> Subject: [PATCH] rmap: do not call mmu_notifier_invalidate_page() under ptl
>
> MMU notifiers can sleep, but in page_mkclean_one() we call
> mmu_notifier_invalidate_page() under the page table lock.
>
> Let's instead use mmu_notifier_invalidate_range() outside the
> page_vma_mapped_walk() loop.
>
> Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
> Fixes: c7ab0d2fdc84 ("mm: convert try_to_unmap_one() to use page_vma_mapped_walk()")
> ---
>  mm/rmap.c | 21 +++++++++++++--------
>  1 file changed, 13 insertions(+), 8 deletions(-)
>
> diff --git a/mm/rmap.c b/mm/rmap.c
> index ced14f1af6dc..b4b711a82c01 100644
> --- a/mm/rmap.c
> +++ b/mm/rmap.c
> @@ -852,10 +852,10 @@ static bool page_mkclean_one(struct page *page, struct vm_area_struct *vma,
>  		.flags = PVMW_SYNC,
>  	};
>  	int *cleaned = arg;
> +	bool invalidation_needed = false;
>
>  	while (page_vma_mapped_walk(&pvmw)) {
>  		int ret = 0;
> -		address = pvmw.address;
>  		if (pvmw.pte) {
>  			pte_t entry;
>  			pte_t *pte = pvmw.pte;
> @@ -863,11 +863,11 @@ static bool page_mkclean_one(struct page *page, struct vm_area_struct *vma,
>  			if (!pte_dirty(*pte) && !pte_write(*pte))
>  				continue;
>
> -			flush_cache_page(vma, address, pte_pfn(*pte));
> -			entry = ptep_clear_flush(vma, address, pte);
> +			flush_cache_page(vma, pvmw.address, pte_pfn(*pte));
> +			entry = ptep_clear_flush(vma, pvmw.address, pte);
>  			entry = pte_wrprotect(entry);
>  			entry = pte_mkclean(entry);
> -			set_pte_at(vma->vm_mm, address, pte, entry);
> +			set_pte_at(vma->vm_mm, pvmw.address, pte, entry);
>  			ret = 1;
>  		} else {
>  #ifdef CONFIG_TRANSPARENT_HUGE_PAGECACHE
> @@ -877,11 +877,11 @@ static bool page_mkclean_one(struct page *page, struct vm_area_struct *vma,
>  			if (!pmd_dirty(*pmd) && !pmd_write(*pmd))
>  				continue;
>
> -			flush_cache_page(vma, address, page_to_pfn(page));
> -			entry = pmdp_huge_clear_flush(vma, address, pmd);
> +			flush_cache_page(vma, pvmw.address, page_to_pfn(page));
> +			entry = pmdp_huge_clear_flush(vma, pvmw.address, pmd);
>  			entry = pmd_wrprotect(entry);
>  			entry = pmd_mkclean(entry);
> -			set_pmd_at(vma->vm_mm, address, pmd, entry);
> +			set_pmd_at(vma->vm_mm, pvmw.address, pmd, entry);
>  			ret = 1;
>  #else
>  			/* unexpected pmd-mapped page? */
> @@ -890,11 +890,16 @@ static bool page_mkclean_one(struct page *page, struct vm_area_struct *vma,
>  		}
>
>  		if (ret) {
> -			mmu_notifier_invalidate_page(vma->vm_mm, address);
>  			(*cleaned)++;
> +			invalidation_needed = true;
>  		}
>  	}
>
> +	if (invalidation_needed) {
> +		mmu_notifier_invalidate_range(vma->vm_mm, address,
> +			address + (1UL << compound_order(page)));
> +	}
> +
>  	return true;
>  }

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to
majordomo@kvack.org. For more info on Linux MM, see http://www.linux-mm.org/ .
Don't email: email@kvack.org
* Re: A possible bug: Calling mutex_lock while holding spinlock
  2017-08-04 14:03 ` axie
@ 2017-08-08 16:51 ` axie
  2017-08-08 17:01   ` Kirill A. Shutemov
  0 siblings, 1 reply; 5+ messages in thread
From: axie @ 2017-08-08 16:51 UTC
  To: Kirill A. Shutemov, Andrew Morton
  Cc: Alex Deucher, Writer, Tim, linux-mm, Xie, AlexBin

Hi Kirill,

Here is the result from the user: "This patch does appear to fix the issue."

Thanks,

Alex (Bin) Xie

On 2017-08-04 10:03 AM, axie wrote:
> Hi Kirill,
>
> Thanks for the patch. I have sent the patch to the user asking whether
> he can give it a try.
>
> Regards,
>
> Alex (Bin) Xie
>
> [snip: full quote of the patch, unchanged from the previous message]
* Re: A possible bug: Calling mutex_lock while holding spinlock
  2017-08-08 16:51 ` axie
@ 2017-08-08 17:01 ` Kirill A. Shutemov
  2017-08-08 20:29   ` Kirill A. Shutemov
  0 siblings, 1 reply; 5+ messages in thread
From: Kirill A. Shutemov @ 2017-08-08 17:01 UTC
  To: axie
  Cc: Kirill A. Shutemov, Andrew Morton, Alex Deucher, Writer, Tim,
      linux-mm, Xie, AlexBin

On Tue, Aug 08, 2017 at 12:51:15PM -0400, axie wrote:
> Hi Kirill,
>
> Here is the result from the user: "This patch does appear to fix the issue."

Hm. Could you get logs from the failure on the patched kernel?

-- 
 Kirill A. Shutemov
* Re: A possible bug: Calling mutex_lock while holding spinlock
  2017-08-08 17:01 ` Kirill A. Shutemov
@ 2017-08-08 20:29 ` Kirill A. Shutemov
  0 siblings, 0 replies; 5+ messages in thread
From: Kirill A. Shutemov @ 2017-08-08 20:29 UTC
  To: axie
  Cc: Kirill A. Shutemov, Andrew Morton, Alex Deucher, Writer, Tim,
      linux-mm, Xie, AlexBin

On Tue, Aug 08, 2017 at 08:01:27PM +0300, Kirill A. Shutemov wrote:
> On Tue, Aug 08, 2017 at 12:51:15PM -0400, axie wrote:
> > Hi Kirill,
> >
> > Here is the result from the user: "This patch does appear to fix the issue."
>
> Hm. Could you get logs from the failure on the patched kernel?

Please ignore. I've misread what you wrote. %)

-- 
 Kirill A. Shutemov
end of thread, newest: ~2017-08-08 20:29 UTC
Thread overview: 5+ messages
[not found] <2d442de2-c5d4-ecce-2345-4f8f34314247@amd.com>
[not found] ` <20170803153902.71ceaa3b435083fc2e112631@linux-foundation.org>
2017-08-04 13:49 ` A possible bug: Calling mutex_lock while holding spinlock Kirill A. Shutemov
2017-08-04 14:03 ` axie
2017-08-08 16:51 ` axie
2017-08-08 17:01 ` Kirill A. Shutemov
2017-08-08 20:29 ` Kirill A. Shutemov