* Re: A possible bug: Calling mutex_lock while holding spinlock
       [not found] ` <20170803153902.71ceaa3b435083fc2e112631@linux-foundation.org>
@ 2017-08-04 13:49 ` Kirill A. Shutemov
  2017-08-04 14:03   ` axie
  0 siblings, 1 reply; 5+ messages in thread
From: Kirill A. Shutemov @ 2017-08-04 13:49 UTC
  To: Andrew Morton; +Cc: axie, Alex Deucher, Writer, Tim, linux-mm

On Thu, Aug 03, 2017 at 03:39:02PM -0700, Andrew Morton wrote:
>
> (cc Kirill)
>
> On Thu, 3 Aug 2017 12:35:28 -0400 axie <axie@amd.com> wrote:
>
> > Hi Andrew,
> >
> > I got a report yesterday of "BUG: sleeping function called from
> > invalid context at kernel/locking/mutex.c".
> >
> > I checked the relevant functions. page_vma_mapped_walk() acquires a
> > spinlock; later, in the MMU notifier, amdgpu_mn_invalidate_page()
> > calls mutex_lock(), which triggers the BUG.
> >
> > page_vma_mapped_walk() was introduced recently by you in commits
> > c7ab0d2fdc840266b39db94538f74207ec2afbf6 and
> > ace71a19cec5eb430207c3269d8a2683f0574306.
> >
> > Would you advise how to proceed with this bug? Change
> > page_vma_mapped_walk() not to take the spinlock? Change
> > amdgpu_mn_invalidate_page() to use a spinlock instead? Or something
> > else?
>
> hm, as far as I can tell this was an unintended side-effect of
> c7ab0d2fd ("mm: convert try_to_unmap_one() to use
> page_vma_mapped_walk()"). Before that patch,
> mmu_notifier_invalidate_page() was not called under page_table_lock.
> After that patch, it is.
>
> Perhaps Kirill can suggest a fix?

Sorry for this.

What about the patch below?
* Re: A possible bug: Calling mutex_lock while holding spinlock
  2017-08-04 13:49 ` A possible bug: Calling mutex_lock while holding spinlock Kirill A. Shutemov
@ 2017-08-04 14:03 ` axie
  2017-08-08 16:51   ` axie
  0 siblings, 1 reply; 5+ messages in thread
From: axie @ 2017-08-04 14:03 UTC
  To: Kirill A. Shutemov, Andrew Morton
  Cc: Alex Deucher, Writer, Tim, linux-mm, Xie, AlexBin

Hi Kirill,

Thanks for the patch. I have sent the patch to the user asking whether
he can give it a try.

Regards,

Alex (Bin) Xie

On 2017-08-04 09:49 AM, Kirill A. Shutemov wrote:
> On Thu, Aug 03, 2017 at 03:39:02PM -0700, Andrew Morton wrote:
>> (cc Kirill)
>>
>> On Thu, 3 Aug 2017 12:35:28 -0400 axie <axie@amd.com> wrote:
>>
>>> Hi Andrew,
>>>
>>> I got a report yesterday of "BUG: sleeping function called from
>>> invalid context at kernel/locking/mutex.c".
>>>
>>> I checked the relevant functions. page_vma_mapped_walk() acquires a
>>> spinlock; later, in the MMU notifier, amdgpu_mn_invalidate_page()
>>> calls mutex_lock(), which triggers the BUG.
>>>
>>> page_vma_mapped_walk() was introduced recently by you in commits
>>> c7ab0d2fdc840266b39db94538f74207ec2afbf6 and
>>> ace71a19cec5eb430207c3269d8a2683f0574306.
>>>
>>> Would you advise how to proceed with this bug? Change
>>> page_vma_mapped_walk() not to take the spinlock? Change
>>> amdgpu_mn_invalidate_page() to use a spinlock instead? Or something
>>> else?
>>>
>> hm, as far as I can tell this was an unintended side-effect of
>> c7ab0d2fd ("mm: convert try_to_unmap_one() to use
>> page_vma_mapped_walk()"). Before that patch,
>> mmu_notifier_invalidate_page() was not called under page_table_lock.
>> After that patch, it is.
>>
>> Perhaps Kirill can suggest a fix?
> Sorry for this.
>
> What about the patch below?
>
> From f48dbcdd0ed83dee9a157062b7ca1e2915172678 Mon Sep 17 00:00:00 2001
> From: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>
> Date: Fri, 4 Aug 2017 16:37:26 +0300
> Subject: [PATCH] rmap: do not call mmu_notifier_invalidate_page() under ptl
>
> MMU notifiers can sleep, but in page_mkclean_one() we call
> mmu_notifier_invalidate_page() under the page table lock.
>
> Let's instead use mmu_notifier_invalidate_range() outside the
> page_vma_mapped_walk() loop.
>
> Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
> Fixes: c7ab0d2fdc84 ("mm: convert try_to_unmap_one() to use page_vma_mapped_walk()")
> ---
>  mm/rmap.c | 21 +++++++++++++--------
>  1 file changed, 13 insertions(+), 8 deletions(-)
>
> diff --git a/mm/rmap.c b/mm/rmap.c
> index ced14f1af6dc..b4b711a82c01 100644
> --- a/mm/rmap.c
> +++ b/mm/rmap.c
> @@ -852,10 +852,10 @@ static bool page_mkclean_one(struct page *page, struct vm_area_struct *vma,
>  		.flags = PVMW_SYNC,
>  	};
>  	int *cleaned = arg;
> +	bool invalidation_needed = false;
>
>  	while (page_vma_mapped_walk(&pvmw)) {
>  		int ret = 0;
> -		address = pvmw.address;
>  		if (pvmw.pte) {
>  			pte_t entry;
>  			pte_t *pte = pvmw.pte;
> @@ -863,11 +863,11 @@ static bool page_mkclean_one(struct page *page, struct vm_area_struct *vma,
>  			if (!pte_dirty(*pte) && !pte_write(*pte))
>  				continue;
>
> -			flush_cache_page(vma, address, pte_pfn(*pte));
> -			entry = ptep_clear_flush(vma, address, pte);
> +			flush_cache_page(vma, pvmw.address, pte_pfn(*pte));
> +			entry = ptep_clear_flush(vma, pvmw.address, pte);
>  			entry = pte_wrprotect(entry);
>  			entry = pte_mkclean(entry);
> -			set_pte_at(vma->vm_mm, address, pte, entry);
> +			set_pte_at(vma->vm_mm, pvmw.address, pte, entry);
>  			ret = 1;
>  		} else {
>  #ifdef CONFIG_TRANSPARENT_HUGE_PAGECACHE
> @@ -877,11 +877,11 @@ static bool page_mkclean_one(struct page *page, struct vm_area_struct *vma,
>  			if (!pmd_dirty(*pmd) && !pmd_write(*pmd))
>  				continue;
>
> -			flush_cache_page(vma, address, page_to_pfn(page));
> -			entry = pmdp_huge_clear_flush(vma, address, pmd);
> +			flush_cache_page(vma, pvmw.address, page_to_pfn(page));
> +			entry = pmdp_huge_clear_flush(vma, pvmw.address, pmd);
>  			entry = pmd_wrprotect(entry);
>  			entry = pmd_mkclean(entry);
> -			set_pmd_at(vma->vm_mm, address, pmd, entry);
> +			set_pmd_at(vma->vm_mm, pvmw.address, pmd, entry);
>  			ret = 1;
>  #else
>  			/* unexpected pmd-mapped page? */
> @@ -890,11 +890,16 @@ static bool page_mkclean_one(struct page *page, struct vm_area_struct *vma,
>  		}
>
>  		if (ret) {
> -			mmu_notifier_invalidate_page(vma->vm_mm, address);
>  			(*cleaned)++;
> +			invalidation_needed = true;
>  		}
>  	}
>
> +	if (invalidation_needed) {
> +		mmu_notifier_invalidate_range(vma->vm_mm, address,
> +			address + (1UL << compound_order(page)));
> +	}
> +
>  	return true;
>  }

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to
majordomo@kvack.org. For more info on Linux MM, see http://www.linux-mm.org/ .
Don't email: email@kvack.org
* Re: A possible bug: Calling mutex_lock while holding spinlock
  2017-08-04 14:03 ` axie
@ 2017-08-08 16:51 ` axie
  2017-08-08 17:01   ` Kirill A. Shutemov
  0 siblings, 1 reply; 5+ messages in thread
From: axie @ 2017-08-08 16:51 UTC
  To: Kirill A. Shutemov, Andrew Morton
  Cc: Alex Deucher, Writer, Tim, linux-mm, Xie, AlexBin

Hi Kirill,

Here is the result from the user: "This patch does appear to fix the issue."

Thanks,

Alex (Bin) Xie

On 2017-08-04 10:03 AM, axie wrote:
> Hi Kirill,
>
> Thanks for the patch. I have sent the patch to the user asking whether
> he can give it a try.
>
> Regards,
>
> Alex (Bin) Xie
>
> [snip: full quote of the patch, unchanged from the previous message]
* Re: A possible bug: Calling mutex_lock while holding spinlock
  2017-08-08 16:51 ` axie
@ 2017-08-08 17:01 ` Kirill A. Shutemov
  2017-08-08 20:29   ` Kirill A. Shutemov
  0 siblings, 1 reply; 5+ messages in thread
From: Kirill A. Shutemov @ 2017-08-08 17:01 UTC
  To: axie
  Cc: Kirill A. Shutemov, Andrew Morton, Alex Deucher, Writer, Tim,
      linux-mm, Xie, AlexBin

On Tue, Aug 08, 2017 at 12:51:15PM -0400, axie wrote:
> Hi Kirill,
>
> Here is the result from the user: "This patch does appear to fix the issue."

Hm. Could you get logs from the failure on the patched kernel?

-- 
 Kirill A. Shutemov
* Re: A possible bug: Calling mutex_lock while holding spinlock
  2017-08-08 17:01 ` Kirill A. Shutemov
@ 2017-08-08 20:29 ` Kirill A. Shutemov
  0 siblings, 0 replies; 5+ messages in thread
From: Kirill A. Shutemov @ 2017-08-08 20:29 UTC
  To: axie
  Cc: Kirill A. Shutemov, Andrew Morton, Alex Deucher, Writer, Tim,
      linux-mm, Xie, AlexBin

On Tue, Aug 08, 2017 at 08:01:27PM +0300, Kirill A. Shutemov wrote:
> On Tue, Aug 08, 2017 at 12:51:15PM -0400, axie wrote:
> > Hi Kirill,
> >
> > Here is the result from the user: "This patch does appear to fix the issue."
>
> Hm. Could you get logs from the failure on the patched kernel?

Please ignore. I've misread what you wrote. %)

-- 
 Kirill A. Shutemov
end of thread, newest: ~2017-08-08 20:29 UTC
Thread overview: 5+ messages
[not found] <2d442de2-c5d4-ecce-2345-4f8f34314247@amd.com>
[not found] ` <20170803153902.71ceaa3b435083fc2e112631@linux-foundation.org>
2017-08-04 13:49 ` A possible bug: Calling mutex_lock while holding spinlock Kirill A. Shutemov
2017-08-04 14:03 ` axie
2017-08-08 16:51 ` axie
2017-08-08 17:01 ` Kirill A. Shutemov
2017-08-08 20:29 ` Kirill A. Shutemov