linux-mm.kvack.org archive mirror
* [PATCH -mm] mm, gup: prevent pmd checking race in follow_pmd_mask()
@ 2018-04-04  3:22 Huang, Ying
  2018-04-04 15:02 ` Zi Yan
  0 siblings, 1 reply; 4+ messages in thread
From: Huang, Ying @ 2018-04-04  3:22 UTC
  To: Andrew Morton
  Cc: linux-mm, linux-kernel, Huang Ying, Al Viro, Aneesh Kumar K.V,
	Dan Williams, Zi Yan, Kirill A. Shutemov

From: Huang Ying <ying.huang@intel.com>

mmap_sem is read-locked when follow_pmd_mask() is called.  But this
does not prevent the PMD from being changed in all cases while the PTL
is unlocked, for example, from pmd_trans_huge() to pmd_none() via
MADV_DONTNEED.  So it is possible for the pmd_present() check in
follow_pmd_mask() to encounter a none PMD.  This may trigger an
incorrect VM_BUG_ON() or an infinite loop.  Fix this by reading the
PMD entry again, but only once, and by checking the local variable and
pmd_none() in the retry loop.

As Kirill pointed out, with the PTL unlocked, *pmd may be changed
under us, so reading it directly again and again may incur weird bugs.
So although using *pmd directly for checks other than pmd_present()
may be safe, it is still better to read *pmd once and check the local
variable multiple times.

Signed-off-by: "Huang, Ying" <ying.huang@intel.com>
 # When PTL unlocked, replace all *pmd with local variable
Suggested-by: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: "Aneesh Kumar K.V" <aneesh.kumar@linux.vnet.ibm.com>
Cc: Dan Williams <dan.j.williams@intel.com>
Cc: Zi Yan <zi.yan@cs.rutgers.edu>
---
 mm/gup.c | 30 +++++++++++++++++++-----------
 1 file changed, 19 insertions(+), 11 deletions(-)

diff --git a/mm/gup.c b/mm/gup.c
index 2e2df7f3e92d..51734292839b 100644
--- a/mm/gup.c
+++ b/mm/gup.c
@@ -213,53 +213,61 @@ static struct page *follow_pmd_mask(struct vm_area_struct *vma,
 				    unsigned long address, pud_t *pudp,
 				    unsigned int flags, unsigned int *page_mask)
 {
-	pmd_t *pmd;
+	pmd_t *pmd, pmdval;
 	spinlock_t *ptl;
 	struct page *page;
 	struct mm_struct *mm = vma->vm_mm;
 
 	pmd = pmd_offset(pudp, address);
-	if (pmd_none(*pmd))
+	pmdval = READ_ONCE(*pmd);
+	if (pmd_none(pmdval))
 		return no_page_table(vma, flags);
-	if (pmd_huge(*pmd) && vma->vm_flags & VM_HUGETLB) {
+	if (pmd_huge(pmdval) && vma->vm_flags & VM_HUGETLB) {
 		page = follow_huge_pmd(mm, address, pmd, flags);
 		if (page)
 			return page;
 		return no_page_table(vma, flags);
 	}
-	if (is_hugepd(__hugepd(pmd_val(*pmd)))) {
+	if (is_hugepd(__hugepd(pmd_val(pmdval)))) {
 		page = follow_huge_pd(vma, address,
-				      __hugepd(pmd_val(*pmd)), flags,
+				      __hugepd(pmd_val(pmdval)), flags,
 				      PMD_SHIFT);
 		if (page)
 			return page;
 		return no_page_table(vma, flags);
 	}
 retry:
-	if (!pmd_present(*pmd)) {
+	if (!pmd_present(pmdval)) {
 		if (likely(!(flags & FOLL_MIGRATION)))
 			return no_page_table(vma, flags);
 		VM_BUG_ON(thp_migration_supported() &&
-				  !is_pmd_migration_entry(*pmd));
-		if (is_pmd_migration_entry(*pmd))
+				  !is_pmd_migration_entry(pmdval));
+		if (is_pmd_migration_entry(pmdval))
 			pmd_migration_entry_wait(mm, pmd);
+		pmdval = READ_ONCE(*pmd);
+		if (pmd_none(pmdval))
+			return no_page_table(vma, flags);
 		goto retry;
 	}
-	if (pmd_devmap(*pmd)) {
+	if (pmd_devmap(pmdval)) {
 		ptl = pmd_lock(mm, pmd);
 		page = follow_devmap_pmd(vma, address, pmd, flags);
 		spin_unlock(ptl);
 		if (page)
 			return page;
 	}
-	if (likely(!pmd_trans_huge(*pmd)))
+	if (likely(!pmd_trans_huge(pmdval)))
 		return follow_page_pte(vma, address, pmd, flags);
 
-	if ((flags & FOLL_NUMA) && pmd_protnone(*pmd))
+	if ((flags & FOLL_NUMA) && pmd_protnone(pmdval))
 		return no_page_table(vma, flags);
 
 retry_locked:
 	ptl = pmd_lock(mm, pmd);
+	if (unlikely(pmd_none(*pmd))) {
+		spin_unlock(ptl);
+		return no_page_table(vma, flags);
+	}
 	if (unlikely(!pmd_present(*pmd))) {
 		spin_unlock(ptl);
 		if (likely(!(flags & FOLL_MIGRATION)))
-- 
2.15.1

* Re: [PATCH -mm] mm, gup: prevent pmd checking race in follow_pmd_mask()
  2018-04-04  3:22 [PATCH -mm] mm, gup: prevent pmd checking race in follow_pmd_mask() Huang, Ying
@ 2018-04-04 15:02 ` Zi Yan
  2018-04-06  1:57   ` huang ying
  0 siblings, 1 reply; 4+ messages in thread
From: Zi Yan @ 2018-04-04 15:02 UTC
  To: Huang, Ying
  Cc: Andrew Morton, linux-mm, linux-kernel, Al Viro, Aneesh Kumar K.V,
	Dan Williams, Kirill A. Shutemov

On 3 Apr 2018, at 23:22, Huang, Ying wrote:

> From: Huang Ying <ying.huang@intel.com>
>
> mmap_sem is read-locked when follow_pmd_mask() is called.  But this
> does not prevent the PMD from being changed in all cases while the PTL
> is unlocked, for example, from pmd_trans_huge() to pmd_none() via
> MADV_DONTNEED.  So it is possible for the pmd_present() check in
> follow_pmd_mask() to encounter a none PMD.  This may trigger an
> incorrect VM_BUG_ON() or an infinite loop.  Fix this by reading the
> PMD entry again, but only once, and by checking the local variable and
> pmd_none() in the retry loop.
>
> As Kirill pointed out, with the PTL unlocked, *pmd may be changed
> under us, so reading it directly again and again may incur weird bugs.
> So although using *pmd directly for checks other than pmd_present()
> may be safe, it is still better to read *pmd once and check the local
> variable multiple times.

I see your point there. The patch wants to provide a consistent value
for all race checks. Specifically, this patch is trying to avoid inconsistent
reads of *pmd in if-statements, which cause problems when the if-condition reads *pmd,
the statements inside the "if" read *pmd again, and the two reads give different values.
Am I right about this?

If yes, the problem can be solved by something like:

if (!pmd_present(tmpval = *pmd)) {
    check tmpval instead of *pmd;
}

Right?

I just wonder if we need some general code for all race checks.

Thanks.

--
Best Regards
Yan Zi

* Re: [PATCH -mm] mm, gup: prevent pmd checking race in follow_pmd_mask()
  2018-04-04 15:02 ` Zi Yan
@ 2018-04-06  1:57   ` huang ying
  2018-04-06  2:16     ` Zi Yan
  0 siblings, 1 reply; 4+ messages in thread
From: huang ying @ 2018-04-06  1:57 UTC
  To: Zi Yan
  Cc: Huang, Ying, Andrew Morton, linux-mm, LKML, Al Viro,
	Aneesh Kumar K.V, Dan Williams, Kirill A. Shutemov

On Wed, Apr 4, 2018 at 11:02 PM, Zi Yan <zi.yan@cs.rutgers.edu> wrote:
> On 3 Apr 2018, at 23:22, Huang, Ying wrote:
>
>> From: Huang Ying <ying.huang@intel.com>
>>
>> mmap_sem is read-locked when follow_pmd_mask() is called.  But this
>> does not prevent the PMD from being changed in all cases while the PTL
>> is unlocked, for example, from pmd_trans_huge() to pmd_none() via
>> MADV_DONTNEED.  So it is possible for the pmd_present() check in
>> follow_pmd_mask() to encounter a none PMD.  This may trigger an
>> incorrect VM_BUG_ON() or an infinite loop.  Fix this by reading the
>> PMD entry again, but only once, and by checking the local variable and
>> pmd_none() in the retry loop.
>>
>> As Kirill pointed out, with the PTL unlocked, *pmd may be changed
>> under us, so reading it directly again and again may incur weird bugs.
>> So although using *pmd directly for checks other than pmd_present()
>> may be safe, it is still better to read *pmd once and check the local
>> variable multiple times.
>
> I see your point there. The patch wants to provide a consistent value
> for all race checks. Specifically, this patch is trying to avoid inconsistent
> reads of *pmd in if-statements, which cause problems when the if-condition reads *pmd,
> the statements inside the "if" read *pmd again, and the two reads give different values.
> Am I right about this?

Yes.

> If yes, the problem can be solved by something like:
>
> if (!pmd_present(tmpval = *pmd)) {
>     check tmpval instead of *pmd;
> }
>
> Right?

I think this isn't enough yet.  We need

tmpval = READ_ONCE(*pmd);

to prevent the compiler from generating code that reads *pmd again and
again.  Please check the comments of pmd_none_or_trans_huge_or_clear_bad()
about the barrier.
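
A minimal userspace sketch of the read-once pattern described above,
under stated assumptions: entry_t, the ENTRY_* bits, READ_ONCE_SKETCH()
and classify_entry() are hypothetical names for illustration, not
kernel APIs, and the macro only approximates READ_ONCE() for scalars
using the GNU __typeof__ extension (gcc/clang).

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for a PMD entry; not the kernel's pmd_t. */
typedef uint64_t entry_t;

#define ENTRY_PRESENT 0x1u	/* illustrative bit layout only */
#define ENTRY_HUGE    0x2u

/*
 * Approximation of READ_ONCE() for scalars: the volatile access makes
 * the compiler emit exactly one load here and keeps it from silently
 * re-reading the location for later checks.
 */
#define READ_ONCE_SKETCH(x) (*(const volatile __typeof__(x) *)&(x))

enum entry_kind { KIND_NONE, KIND_NOT_PRESENT, KIND_HUGE, KIND_TABLE };

/* Take one snapshot, then run every check against the local copy. */
static enum entry_kind classify_entry(const entry_t *slot)
{
	entry_t val;

	assert(slot);	/* caller must pass a valid slot */
	val = READ_ONCE_SKETCH(*slot);

	if (val == 0)
		return KIND_NONE;
	if (!(val & ENTRY_PRESENT))
		return KIND_NOT_PRESENT;
	if (val & ENTRY_HUGE)
		return KIND_HUGE;
	return KIND_TABLE;
}
```

Because all checks use val, a concurrent writer cannot make two checks
disagree with each other, even if *slot changes between them.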

Best Regards,
Huang, Ying

> I just wonder if we need some general code for all race checks.
>
> Thanks.
>
> --
> Best Regards
> Yan Zi

* Re: [PATCH -mm] mm, gup: prevent pmd checking race in follow_pmd_mask()
  2018-04-06  1:57   ` huang ying
@ 2018-04-06  2:16     ` Zi Yan
  0 siblings, 0 replies; 4+ messages in thread
From: Zi Yan @ 2018-04-06  2:16 UTC
  To: huang ying
  Cc: Huang, Ying, Andrew Morton, linux-mm, LKML, Al Viro,
	Aneesh Kumar K.V, Dan Williams, Kirill A. Shutemov

On 5 Apr 2018, at 21:57, huang ying wrote:

> On Wed, Apr 4, 2018 at 11:02 PM, Zi Yan <zi.yan@cs.rutgers.edu> wrote:
>> On 3 Apr 2018, at 23:22, Huang, Ying wrote:
>>
>>> From: Huang Ying <ying.huang@intel.com>
>>>
>>> mmap_sem is read-locked when follow_pmd_mask() is called.  But this
>>> does not prevent the PMD from being changed in all cases while the PTL
>>> is unlocked, for example, from pmd_trans_huge() to pmd_none() via
>>> MADV_DONTNEED.  So it is possible for the pmd_present() check in
>>> follow_pmd_mask() to encounter a none PMD.  This may trigger an
>>> incorrect VM_BUG_ON() or an infinite loop.  Fix this by reading the
>>> PMD entry again, but only once, and by checking the local variable and
>>> pmd_none() in the retry loop.
>>>
>>> As Kirill pointed out, with the PTL unlocked, *pmd may be changed
>>> under us, so reading it directly again and again may incur weird bugs.
>>> So although using *pmd directly for checks other than pmd_present()
>>> may be safe, it is still better to read *pmd once and check the local
>>> variable multiple times.
>>
>> I see your point there. The patch wants to provide a consistent value
>> for all race checks. Specifically, this patch is trying to avoid inconsistent
>> reads of *pmd in if-statements, which cause problems when the if-condition reads *pmd,
>> the statements inside the "if" read *pmd again, and the two reads give different values.
>> Am I right about this?
>
> Yes.
>
>> If yes, the problem can be solved by something like:
>>
>> if (!pmd_present(tmpval = *pmd)) {
>>     check tmpval instead of *pmd;
>> }
>>
>> Right?
>
> I think this isn't enough yet.  We need
>
> tmpval = READ_ONCE(*pmd);
>
> to prevent the compiler from generating code that reads *pmd again and
> again.  Please check the comments of pmd_none_or_trans_huge_or_clear_bad()
> about the barrier.

Got it. And if there is a barrier (implicit or explicit) inside the if-statement, like
pmd_migration_entry_wait(mm, pmd), we need to update tmpval with READ_ONCE() after the barrier.
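
A rough userspace sketch of that re-read-after-wait rule, modelled
loosely on the retry loop in the patch. Everything here is an
assumption for illustration: entry_t, the ENTRY_* bits,
READ_ONCE_SKETCH(), wait_fn and lookup_with_retry() are hypothetical,
and the two wait helpers merely simulate a concurrent writer changing
the slot while the caller waits.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for a PMD entry; not the kernel's pmd_t. */
typedef uint64_t entry_t;

#define ENTRY_PRESENT   0x1u	/* illustrative bit layout only */
#define ENTRY_MIGRATION 0x4u

/* Approximation of READ_ONCE() for scalars (GNU __typeof__). */
#define READ_ONCE_SKETCH(x) (*(const volatile __typeof__(x) *)&(x))

enum lookup_result { LOOKUP_NO_TABLE, LOOKUP_PRESENT };

/* Hypothetical wait hook: the slot may change while it runs. */
typedef void (*wait_fn)(entry_t *slot);

static enum lookup_result lookup_with_retry(entry_t *slot, wait_fn wait)
{
	entry_t val = READ_ONCE_SKETCH(*slot);

	if (val == 0)
		return LOOKUP_NO_TABLE;
	while (!(val & ENTRY_PRESENT)) {
		/* Plays the role of the VM_BUG_ON() in the patch. */
		assert(val & ENTRY_MIGRATION);
		wait(slot);
		/* The snapshot is stale after the wait: re-read once. */
		val = READ_ONCE_SKETCH(*slot);
		/* Without this check, a cleared slot would trip the assert. */
		if (val == 0)
			return LOOKUP_NO_TABLE;
	}
	return LOOKUP_PRESENT;
}

/* Simulated outcomes of the wait: entry cleared, or entry mapped. */
static void wait_then_cleared(entry_t *slot) { *slot = 0; }
static void wait_then_mapped(entry_t *slot) { *slot = ENTRY_PRESENT; }
```

Calling lookup_with_retry() on a migration entry with
wait_then_cleared returns LOOKUP_NO_TABLE instead of asserting or
spinning; with wait_then_mapped it returns LOOKUP_PRESENT.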

The patch looks good to me. Thanks.

Reviewed-by: Zi Yan <zi.yan@cs.rutgers.edu>

--
Best Regards,
Yan Zi
