linux-mm.kvack.org archive mirror
* [PATCH v3] mm: thp: fix flags for pmd migration when split
@ 2018-12-13  5:15 Peter Xu
  2018-12-13  7:55 ` Konstantin Khlebnikov
                   ` (2 more replies)
  0 siblings, 3 replies; 5+ messages in thread
From: Peter Xu @ 2018-12-13  5:15 UTC (permalink / raw)
  To: linux-kernel
  Cc: peterx, Andrea Arcangeli, Andrew Morton, Kirill A. Shutemov,
	Matthew Wilcox, Michal Hocko, Dave Jiang, Aneesh Kumar K.V,
	Souptick Joarder, Konstantin Khlebnikov, Zi Yan, linux-mm

When splitting a huge migrating PMD, we'll transfer all the existing
PMD bits and apply them again onto the small PTEs.  However we are
fetching the bits unconditionally via pmd_soft_dirty(), pmd_write()
or pmd_young() while actually they don't make sense at all when it's
a migration entry.  Fix them up.  While at it, drop the ifdef as well
since it is not needed.

Note that, if my understanding of the problem is correct, then without
this patch there is a chance of losing some of the dirty bits in the
migrating pmd pages (on x86_64 we're fetching bit 11 which is part of
swap offset instead of bit 2), which could potentially corrupt the
memory of a userspace program that depends on the dirty bit.

CC: Andrea Arcangeli <aarcange@redhat.com>
CC: Andrew Morton <akpm@linux-foundation.org>
CC: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>
CC: Matthew Wilcox <willy@infradead.org>
CC: Michal Hocko <mhocko@suse.com>
CC: Dave Jiang <dave.jiang@intel.com>
CC: "Aneesh Kumar K.V" <aneesh.kumar@linux.vnet.ibm.com>
CC: Souptick Joarder <jrdr.linux@gmail.com>
CC: Konstantin Khlebnikov <khlebnikov@yandex-team.ru>
CC: Zi Yan <zi.yan@cs.rutgers.edu>
CC: linux-mm@kvack.org
CC: linux-kernel@vger.kernel.org
Signed-off-by: Peter Xu <peterx@redhat.com>
---
v2:
- fix it up for young/write/dirty bits too [Konstantin]
v3:
- fetch write correctly for migration entry; drop macro [Konstantin]
---
 mm/huge_memory.c | 20 +++++++++++---------
 1 file changed, 11 insertions(+), 9 deletions(-)

diff --git a/mm/huge_memory.c b/mm/huge_memory.c
index f2d19e4fe854..aebade83cec9 100644
--- a/mm/huge_memory.c
+++ b/mm/huge_memory.c
@@ -2145,23 +2145,25 @@ static void __split_huge_pmd_locked(struct vm_area_struct *vma, pmd_t *pmd,
 	 */
 	old_pmd = pmdp_invalidate(vma, haddr, pmd);
 
-#ifdef CONFIG_ARCH_ENABLE_THP_MIGRATION
 	pmd_migration = is_pmd_migration_entry(old_pmd);
-	if (pmd_migration) {
+	if (unlikely(pmd_migration)) {
 		swp_entry_t entry;
 
 		entry = pmd_to_swp_entry(old_pmd);
 		page = pfn_to_page(swp_offset(entry));
-	} else
-#endif
+		write = is_write_migration_entry(entry);
+		young = false;
+		soft_dirty = pmd_swp_soft_dirty(old_pmd);
+	} else {
 		page = pmd_page(old_pmd);
+		if (pmd_dirty(old_pmd))
+			SetPageDirty(page);
+		write = pmd_write(old_pmd);
+		young = pmd_young(old_pmd);
+		soft_dirty = pmd_soft_dirty(old_pmd);
+	}
 	VM_BUG_ON_PAGE(!page_count(page), page);
 	page_ref_add(page, HPAGE_PMD_NR - 1);
-	if (pmd_dirty(old_pmd))
-		SetPageDirty(page);
-	write = pmd_write(old_pmd);
-	young = pmd_young(old_pmd);
-	soft_dirty = pmd_soft_dirty(old_pmd);
 
 	/*
 	 * Withdraw the table only after we mark the pmd entry invalid.
-- 
2.17.1
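
For readers following along, the aliasing described in the commit
message can be modeled in user space.  The sketch below is a minimal
illustration, not kernel code: every name with a toy_/TOY_ prefix is
hypothetical, and the bit layout is an assumption that only borrows the
two numbers quoted in the message (bit 11 for the soft-dirty flag of a
present entry, bit 2 for that of a migration entry).  It contrasts
reading the flags unconditionally with reading them according to the
entry type, which is what the patch does.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical toy layout (an assumption for illustration only; it is
 * not taken from any kernel header).
 */
#define TOY_PRESENT		(1ULL << 0)	/* entry maps a page */
#define TOY_RW			(1ULL << 1)	/* present: writable */
#define TOY_ACCESSED		(1ULL << 5)	/* present: young */
#define TOY_SOFT_DIRTY		(1ULL << 11)	/* present: soft-dirty */
#define TOY_SWP_SOFT_DIRTY	(1ULL << 2)	/* migration: soft-dirty */
#define TOY_SWP_WRITE		(1ULL << 3)	/* migration: write entry */
#define TOY_SWP_OFFSET_SHIFT	9		/* migration: offset in bits 9+ */

struct toy_flags {
	bool write;
	bool young;
	bool soft_dirty;
};

/* Build a non-present (migration-like) entry from its parts. */
static inline uint64_t toy_make_migration(uint64_t offset, bool write,
					  bool soft_dirty)
{
	uint64_t e;

	/* The offset must fit above the flag bits. */
	assert(offset < (1ULL << (64 - TOY_SWP_OFFSET_SHIFT)));
	e = offset << TOY_SWP_OFFSET_SHIFT;
	if (write)
		e |= TOY_SWP_WRITE;
	if (soft_dirty)
		e |= TOY_SWP_SOFT_DIRTY;
	return e;
}

/* Old behaviour: always use the present-entry bit positions. */
static inline struct toy_flags toy_flags_unconditional(uint64_t e)
{
	struct toy_flags f = {
		.write = (e & TOY_RW) != 0,
		.young = (e & TOY_ACCESSED) != 0,
		.soft_dirty = (e & TOY_SOFT_DIRTY) != 0,
	};
	return f;
}

/* Patched behaviour: pick the accessors that match the entry type. */
static inline struct toy_flags toy_flags_by_type(uint64_t e)
{
	struct toy_flags f;

	if (!(e & TOY_PRESENT)) {
		/* Migration entry: no accessed bit exists, so not young. */
		f.write = (e & TOY_SWP_WRITE) != 0;
		f.young = false;
		f.soft_dirty = (e & TOY_SWP_SOFT_DIRTY) != 0;
		return f;
	}
	return toy_flags_unconditional(e);
}
```

In this model, an entry built with toy_make_migration(4, false, false)
has bit 11 set purely because of its offset, so the unconditional
reader reports it as soft-dirty while the per-type reader does not.
Conversely, toy_make_migration(0, true, true) loses both its write and
soft-dirty state under the unconditional reader.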


* Re: [PATCH v3] mm: thp: fix flags for pmd migration when split
  2018-12-13  5:15 [PATCH v3] mm: thp: fix flags for pmd migration when split Peter Xu
@ 2018-12-13  7:55 ` Konstantin Khlebnikov
  2018-12-13  8:24 ` William Kucharski
  2018-12-13  9:59 ` Kirill A. Shutemov
  2 siblings, 0 replies; 5+ messages in thread
From: Konstantin Khlebnikov @ 2018-12-13  7:55 UTC (permalink / raw)
  To: peterx
  Cc: Linux Kernel Mailing List, Andrea Arcangeli, Andrew Morton,
	Kirill A. Shutemov, Matthew Wilcox, Michal Hocko, dave.jiang,
	Aneesh Kumar K.V, Souptick Joarder, Константин Хлебников,
	zi.yan, linux-mm

On Thu, Dec 13, 2018 at 8:15 AM Peter Xu <peterx@redhat.com> wrote:
>
> When splitting a huge migrating PMD, we'll transfer all the existing
> PMD bits and apply them again onto the small PTEs.  However we are
> fetching the bits unconditionally via pmd_soft_dirty(), pmd_write()
> or pmd_young() while actually they don't make sense at all when it's
> a migration entry.  Fix them up.  While at it, drop the ifdef as well
> since it is not needed.
>
> Note that, if my understanding of the problem is correct, then without
> this patch there is a chance of losing some of the dirty bits in the
> migrating pmd pages (on x86_64 we're fetching bit 11 which is part of
> swap offset instead of bit 2), which could potentially corrupt the
> memory of a userspace program that depends on the dirty bit.
>

Looks good to me

Reviewed-by: Konstantin Khlebnikov <khlebnikov@yandex-team.ru>

> CC: Andrea Arcangeli <aarcange@redhat.com>
> CC: Andrew Morton <akpm@linux-foundation.org>
> CC: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>
> CC: Matthew Wilcox <willy@infradead.org>
> CC: Michal Hocko <mhocko@suse.com>
> CC: Dave Jiang <dave.jiang@intel.com>
> CC: "Aneesh Kumar K.V" <aneesh.kumar@linux.vnet.ibm.com>
> CC: Souptick Joarder <jrdr.linux@gmail.com>
> CC: Konstantin Khlebnikov <khlebnikov@yandex-team.ru>
> CC: Zi Yan <zi.yan@cs.rutgers.edu>
> CC: linux-mm@kvack.org
> CC: linux-kernel@vger.kernel.org
> Signed-off-by: Peter Xu <peterx@redhat.com>
> ---
> v2:
> - fix it up for young/write/dirty bits too [Konstantin]
> v3:
> - fetch write correctly for migration entry; drop macro [Konstantin]
> ---
>  mm/huge_memory.c | 20 +++++++++++---------
>  1 file changed, 11 insertions(+), 9 deletions(-)
>
> diff --git a/mm/huge_memory.c b/mm/huge_memory.c
> index f2d19e4fe854..aebade83cec9 100644
> --- a/mm/huge_memory.c
> +++ b/mm/huge_memory.c
> @@ -2145,23 +2145,25 @@ static void __split_huge_pmd_locked(struct vm_area_struct *vma, pmd_t *pmd,
>          */
>         old_pmd = pmdp_invalidate(vma, haddr, pmd);
>
> -#ifdef CONFIG_ARCH_ENABLE_THP_MIGRATION
>         pmd_migration = is_pmd_migration_entry(old_pmd);
> -       if (pmd_migration) {
> +       if (unlikely(pmd_migration)) {
>                 swp_entry_t entry;
>
>                 entry = pmd_to_swp_entry(old_pmd);
>                 page = pfn_to_page(swp_offset(entry));
> -       } else
> -#endif
> +               write = is_write_migration_entry(entry);
> +               young = false;
> +               soft_dirty = pmd_swp_soft_dirty(old_pmd);
> +       } else {
>                 page = pmd_page(old_pmd);
> +               if (pmd_dirty(old_pmd))
> +                       SetPageDirty(page);
> +               write = pmd_write(old_pmd);
> +               young = pmd_young(old_pmd);
> +               soft_dirty = pmd_soft_dirty(old_pmd);
> +       }
>         VM_BUG_ON_PAGE(!page_count(page), page);
>         page_ref_add(page, HPAGE_PMD_NR - 1);
> -       if (pmd_dirty(old_pmd))
> -               SetPageDirty(page);
> -       write = pmd_write(old_pmd);
> -       young = pmd_young(old_pmd);
> -       soft_dirty = pmd_soft_dirty(old_pmd);
>
>         /*
>          * Withdraw the table only after we mark the pmd entry invalid.
> --
> 2.17.1
>


* Re: [PATCH v3] mm: thp: fix flags for pmd migration when split
  2018-12-13  5:15 [PATCH v3] mm: thp: fix flags for pmd migration when split Peter Xu
  2018-12-13  7:55 ` Konstantin Khlebnikov
@ 2018-12-13  8:24 ` William Kucharski
  2018-12-13  9:59 ` Kirill A. Shutemov
  2 siblings, 0 replies; 5+ messages in thread
From: William Kucharski @ 2018-12-13  8:24 UTC (permalink / raw)
  To: Peter Xu
  Cc: LKML, Andrea Arcangeli, Andrew Morton, Kirill A. Shutemov,
	Matthew Wilcox, Michal Hocko, Dave Jiang, Aneesh Kumar K.V,
	Souptick Joarder, Konstantin Khlebnikov, Zi Yan, linux-mm



> diff --git a/mm/huge_memory.c b/mm/huge_memory.c
> index f2d19e4fe854..aebade83cec9 100644
> --- a/mm/huge_memory.c
> +++ b/mm/huge_memory.c
> @@ -2145,23 +2145,25 @@ static void __split_huge_pmd_locked(struct vm_area_struct *vma, pmd_t *pmd,
> 	 */
> 	old_pmd = pmdp_invalidate(vma, haddr, pmd);
> 
> -#ifdef CONFIG_ARCH_ENABLE_THP_MIGRATION
> 	pmd_migration = is_pmd_migration_entry(old_pmd);
> -	if (pmd_migration) {
> +	if (unlikely(pmd_migration)) {
> 		swp_entry_t entry;
> 
> 		entry = pmd_to_swp_entry(old_pmd);
> 		page = pfn_to_page(swp_offset(entry));
> -	} else
> -#endif
> +		write = is_write_migration_entry(entry);
> +		young = false;
> +		soft_dirty = pmd_swp_soft_dirty(old_pmd);
> +	} else {
> 		page = pmd_page(old_pmd);
> +		if (pmd_dirty(old_pmd))
> +			SetPageDirty(page);
> +		write = pmd_write(old_pmd);
> +		young = pmd_young(old_pmd);
> +		soft_dirty = pmd_soft_dirty(old_pmd);
> +	}
> 	VM_BUG_ON_PAGE(!page_count(page), page);
> 	page_ref_add(page, HPAGE_PMD_NR - 1);
> -	if (pmd_dirty(old_pmd))
> -		SetPageDirty(page);
> -	write = pmd_write(old_pmd);
> -	young = pmd_young(old_pmd);
> -	soft_dirty = pmd_soft_dirty(old_pmd);
> 
> 	/*
> 	 * Withdraw the table only after we mark the pmd entry invalid.
> -- 

Looks good.

Reviewed-by: William Kucharski <william.kucharski@oracle.com>


* Re: [PATCH v3] mm: thp: fix flags for pmd migration when split
  2018-12-13  5:15 [PATCH v3] mm: thp: fix flags for pmd migration when split Peter Xu
  2018-12-13  7:55 ` Konstantin Khlebnikov
  2018-12-13  8:24 ` William Kucharski
@ 2018-12-13  9:59 ` Kirill A. Shutemov
  2018-12-13 10:24   ` Peter Xu
  2 siblings, 1 reply; 5+ messages in thread
From: Kirill A. Shutemov @ 2018-12-13  9:59 UTC (permalink / raw)
  To: Peter Xu
  Cc: linux-kernel, Andrea Arcangeli, Andrew Morton,
	Kirill A. Shutemov, Matthew Wilcox, Michal Hocko, Dave Jiang,
	Aneesh Kumar K.V, Souptick Joarder, Konstantin Khlebnikov,
	Zi Yan, linux-mm

On Thu, Dec 13, 2018 at 01:15:10PM +0800, Peter Xu wrote:
> When splitting a huge migrating PMD, we'll transfer all the existing
> PMD bits and apply them again onto the small PTEs.  However we are
> fetching the bits unconditionally via pmd_soft_dirty(), pmd_write()
> or pmd_young() while actually they don't make sense at all when it's
> a migration entry.  Fix them up.  While at it, drop the ifdef as well
> since it is not needed.
> 
> Note that, if my understanding of the problem is correct, then without
> this patch there is a chance of losing some of the dirty bits in the
> migrating pmd pages (on x86_64 we're fetching bit 11 which is part of
> swap offset instead of bit 2), which could potentially corrupt the
> memory of a userspace program that depends on the dirty bit.
> 
> CC: Andrea Arcangeli <aarcange@redhat.com>
> CC: Andrew Morton <akpm@linux-foundation.org>
> CC: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>
> CC: Matthew Wilcox <willy@infradead.org>
> CC: Michal Hocko <mhocko@suse.com>
> CC: Dave Jiang <dave.jiang@intel.com>
> CC: "Aneesh Kumar K.V" <aneesh.kumar@linux.vnet.ibm.com>
> CC: Souptick Joarder <jrdr.linux@gmail.com>
> CC: Konstantin Khlebnikov <khlebnikov@yandex-team.ru>
> CC: Zi Yan <zi.yan@cs.rutgers.edu>
> CC: linux-mm@kvack.org
> CC: linux-kernel@vger.kernel.org
> Signed-off-by: Peter Xu <peterx@redhat.com>

Acked-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>

Stable?

-- 
 Kirill A. Shutemov


* Re: [PATCH v3] mm: thp: fix flags for pmd migration when split
  2018-12-13  9:59 ` Kirill A. Shutemov
@ 2018-12-13 10:24   ` Peter Xu
  0 siblings, 0 replies; 5+ messages in thread
From: Peter Xu @ 2018-12-13 10:24 UTC (permalink / raw)
  To: Kirill A. Shutemov
  Cc: linux-kernel, Andrea Arcangeli, Andrew Morton,
	Kirill A. Shutemov, Matthew Wilcox, Michal Hocko, Dave Jiang,
	Aneesh Kumar K.V, Souptick Joarder, Konstantin Khlebnikov,
	Zi Yan, linux-mm, stable

On Thu, Dec 13, 2018 at 12:59:42PM +0300, Kirill A. Shutemov wrote:
> On Thu, Dec 13, 2018 at 01:15:10PM +0800, Peter Xu wrote:
> > When splitting a huge migrating PMD, we'll transfer all the existing
> > PMD bits and apply them again onto the small PTEs.  However we are
> > fetching the bits unconditionally via pmd_soft_dirty(), pmd_write()
> > or pmd_young() while actually they don't make sense at all when it's
> > a migration entry.  Fix them up.  While at it, drop the ifdef as well
> > since it is not needed.
> > 
> > Note that, if my understanding of the problem is correct, then without
> > this patch there is a chance of losing some of the dirty bits in the
> > migrating pmd pages (on x86_64 we're fetching bit 11 which is part of
> > swap offset instead of bit 2), which could potentially corrupt the
> > memory of a userspace program that depends on the dirty bit.
> > 
> > CC: Andrea Arcangeli <aarcange@redhat.com>
> > CC: Andrew Morton <akpm@linux-foundation.org>
> > CC: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>
> > CC: Matthew Wilcox <willy@infradead.org>
> > CC: Michal Hocko <mhocko@suse.com>
> > CC: Dave Jiang <dave.jiang@intel.com>
> > CC: "Aneesh Kumar K.V" <aneesh.kumar@linux.vnet.ibm.com>
> > CC: Souptick Joarder <jrdr.linux@gmail.com>
> > CC: Konstantin Khlebnikov <khlebnikov@yandex-team.ru>
> > CC: Zi Yan <zi.yan@cs.rutgers.edu>
> > CC: linux-mm@kvack.org
> > CC: linux-kernel@vger.kernel.org
> > Signed-off-by: Peter Xu <peterx@redhat.com>
> 
> Acked-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
> 
> Stable?

Sorry I missed the reply from Zi.  I think it should be:

CC: linux-stable <stable@vger.kernel.org> # 4.14+

Thanks,

-- 
Peter Xu

