* [PATCH] dax: Split pmd map when fallback on COW
@ 2015-11-23 20:05 Toshi Kani
2015-11-23 20:45 ` Dan Williams
` (2 more replies)
0 siblings, 3 replies; 7+ messages in thread
From: Toshi Kani @ 2015-11-23 20:05 UTC (permalink / raw)
To: dan.j.williams
Cc: kirill.shutemov, willy, ross.zwisler, linux-mm, linux-fsdevel,
linux-nvdimm, linux-kernel, Toshi Kani
An infinite loop of PMD faults was observed when attempting to
mlock() a private read-only PMD mmap'd range of a DAX file.
__dax_pmd_fault() simply returns with VM_FAULT_FALLBACK when
falling back to PTE on COW. However, __handle_mm_fault()
returns without falling back to handle_pte_fault() because
a PMD map is present in this case.
Change __dax_pmd_fault() to split the PMD map, if present,
before returning with VM_FAULT_FALLBACK.
Signed-off-by: Toshi Kani <toshi.kani@hpe.com>
Cc: Dan Williams <dan.j.williams@intel.com>
Cc: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Cc: Matthew Wilcox <willy@linux.intel.com>
Cc: Ross Zwisler <ross.zwisler@linux.intel.com>
---
fs/dax.c | 4 +++-
1 file changed, 3 insertions(+), 1 deletion(-)
diff --git a/fs/dax.c b/fs/dax.c
index 43671b6..3405583 100644
--- a/fs/dax.c
+++ b/fs/dax.c
@@ -546,8 +546,10 @@ int __dax_pmd_fault(struct vm_area_struct *vma, unsigned long address,
 		return VM_FAULT_FALLBACK;
 
 	/* Fall back to PTEs if we're going to COW */
-	if (write && !(vma->vm_flags & VM_SHARED))
+	if (write && !(vma->vm_flags & VM_SHARED)) {
+		split_huge_page_pmd(vma, address, pmd);
 		return VM_FAULT_FALLBACK;
+	}
 	/* If the PMD would extend outside the VMA */
 	if (pmd_addr < vma->vm_start)
 		return VM_FAULT_FALLBACK;
--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org. For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>
^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: [PATCH] dax: Split pmd map when fallback on COW
2015-11-23 20:05 [PATCH] dax: Split pmd map when fallback on COW Toshi Kani
@ 2015-11-23 20:45 ` Dan Williams
2015-11-23 20:45 ` Toshi Kani
2015-11-23 22:58 ` Toshi Kani
2015-11-24 17:08 ` Dan Williams
2 siblings, 1 reply; 7+ messages in thread
From: Dan Williams @ 2015-11-23 20:45 UTC (permalink / raw)
To: Toshi Kani
Cc: Kirill A. Shutemov, Matthew Wilcox, Ross Zwisler, Linux MM,
linux-fsdevel, linux-nvdimm, linux-kernel
On Mon, Nov 23, 2015 at 12:05 PM, Toshi Kani <toshi.kani@hpe.com> wrote:
> An infinite loop of PMD faults was observed when attempting to
> mlock() a private read-only PMD mmap'd range of a DAX file.
>
> __dax_pmd_fault() simply returns with VM_FAULT_FALLBACK when
> falling back to PTE on COW. However, __handle_mm_fault()
> returns without falling back to handle_pte_fault() because
> a PMD map is present in this case.
>
> Change __dax_pmd_fault() to split the PMD map, if present,
> before returning with VM_FAULT_FALLBACK.
>
> Signed-off-by: Toshi Kani <toshi.kani@hpe.com>
> Cc: Dan Williams <dan.j.williams@intel.com>
> Cc: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
> Cc: Matthew Wilcox <willy@linux.intel.com>
> Cc: Ross Zwisler <ross.zwisler@linux.intel.com>
I thought the patch from Ross already addressed the infinite loop:
https://patchwork.kernel.org/patch/7653731/
> ---
> fs/dax.c | 4 +++-
> 1 file changed, 3 insertions(+), 1 deletion(-)
>
> diff --git a/fs/dax.c b/fs/dax.c
> index 43671b6..3405583 100644
> --- a/fs/dax.c
> +++ b/fs/dax.c
> @@ -546,8 +546,10 @@ int __dax_pmd_fault(struct vm_area_struct *vma, unsigned long address,
> return VM_FAULT_FALLBACK;
>
> /* Fall back to PTEs if we're going to COW */
> - if (write && !(vma->vm_flags & VM_SHARED))
> + if (write && !(vma->vm_flags & VM_SHARED)) {
> + split_huge_page_pmd(vma, address, pmd);
> return VM_FAULT_FALLBACK;
> + }
> /* If the PMD would extend outside the VMA */
> if (pmd_addr < vma->vm_start)
> return VM_FAULT_FALLBACK;
This is a nop if CONFIG_TRANSPARENT_HUGEPAGE=n, so I don't think it's
a complete fix.
^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: [PATCH] dax: Split pmd map when fallback on COW
2015-11-23 20:45 ` Dan Williams
@ 2015-11-23 20:45 ` Toshi Kani
2015-11-23 20:56 ` Dan Williams
0 siblings, 1 reply; 7+ messages in thread
From: Toshi Kani @ 2015-11-23 20:45 UTC (permalink / raw)
To: Dan Williams
Cc: Kirill A. Shutemov, Matthew Wilcox, Ross Zwisler, Linux MM,
linux-fsdevel, linux-nvdimm, linux-kernel
On Mon, 2015-11-23 at 12:45 -0800, Dan Williams wrote:
> On Mon, Nov 23, 2015 at 12:05 PM, Toshi Kani <toshi.kani@hpe.com> wrote:
> > An infinite loop of PMD faults was observed when attempting to
> > mlock() a private read-only PMD mmap'd range of a DAX file.
> >
> > __dax_pmd_fault() simply returns with VM_FAULT_FALLBACK when
> > falling back to PTE on COW. However, __handle_mm_fault()
> > returns without falling back to handle_pte_fault() because
> > a PMD map is present in this case.
> >
> > Change __dax_pmd_fault() to split the PMD map, if present,
> > before returning with VM_FAULT_FALLBACK.
> >
> > Signed-off-by: Toshi Kani <toshi.kani@hpe.com>
> > Cc: Dan Williams <dan.j.williams@intel.com>
> > Cc: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
> > Cc: Matthew Wilcox <willy@linux.intel.com>
> > Cc: Ross Zwisler <ross.zwisler@linux.intel.com>
>
> I thought the patch from Ross already addressed the infinite loop:
>
> https://patchwork.kernel.org/patch/7653731/
This fixes a different issue. I hit this one while testing my other patch
together with Ross's patch.
> > ---
> > fs/dax.c | 4 +++-
> > 1 file changed, 3 insertions(+), 1 deletion(-)
> >
> > diff --git a/fs/dax.c b/fs/dax.c
> > index 43671b6..3405583 100644
> > --- a/fs/dax.c
> > +++ b/fs/dax.c
> > @@ -546,8 +546,10 @@ int __dax_pmd_fault(struct vm_area_struct *vma,
> > unsigned long address,
> > return VM_FAULT_FALLBACK;
> >
> > /* Fall back to PTEs if we're going to COW */
> > - if (write && !(vma->vm_flags & VM_SHARED))
> > + if (write && !(vma->vm_flags & VM_SHARED)) {
> > + split_huge_page_pmd(vma, address, pmd);
> > return VM_FAULT_FALLBACK;
> > + }
> > /* If the PMD would extend outside the VMA */
> > if (pmd_addr < vma->vm_start)
> > return VM_FAULT_FALLBACK;
>
> This is a nop if CONFIG_TRANSPARENT_HUGEPAGE=n, so I don't think it's
> a complete fix.
Well, __dax_pmd_fault() itself depends on CONFIG_TRANSPARENT_HUGEPAGE.
Thanks,
-Toshi
^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: [PATCH] dax: Split pmd map when fallback on COW
2015-11-23 20:45 ` Toshi Kani
@ 2015-11-23 20:56 ` Dan Williams
2015-11-23 21:04 ` Toshi Kani
0 siblings, 1 reply; 7+ messages in thread
From: Dan Williams @ 2015-11-23 20:56 UTC (permalink / raw)
To: Toshi Kani
Cc: Kirill A. Shutemov, Matthew Wilcox, Ross Zwisler, Linux MM,
linux-fsdevel, linux-nvdimm, linux-kernel
On Mon, Nov 23, 2015 at 12:45 PM, Toshi Kani <toshi.kani@hpe.com> wrote:
> On Mon, 2015-11-23 at 12:45 -0800, Dan Williams wrote:
>> On Mon, Nov 23, 2015 at 12:05 PM, Toshi Kani <toshi.kani@hpe.com> wrote:
[..]
>> This is a nop if CONFIG_TRANSPARENT_HUGEPAGE=n, so I don't think it's
>> a complete fix.
>
> Well, __dax_pmd_fault() itself depends on CONFIG_TRANSPARENT_HUGEPAGE.
>
Indeed it is... I think that's wrong because transparent huge pages
rely on struct page??
^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: [PATCH] dax: Split pmd map when fallback on COW
2015-11-23 20:56 ` Dan Williams
@ 2015-11-23 21:04 ` Toshi Kani
0 siblings, 0 replies; 7+ messages in thread
From: Toshi Kani @ 2015-11-23 21:04 UTC (permalink / raw)
To: Dan Williams
Cc: Kirill A. Shutemov, Matthew Wilcox, Ross Zwisler, Linux MM,
linux-fsdevel, linux-nvdimm, linux-kernel
On Mon, 2015-11-23 at 12:56 -0800, Dan Williams wrote:
> On Mon, Nov 23, 2015 at 12:45 PM, Toshi Kani <toshi.kani@hpe.com> wrote:
> > On Mon, 2015-11-23 at 12:45 -0800, Dan Williams wrote:
> > > On Mon, Nov 23, 2015 at 12:05 PM, Toshi Kani <toshi.kani@hpe.com> wrote:
> [..]
> > > This is a nop if CONFIG_TRANSPARENT_HUGEPAGE=n, so I don't think it's
> > > a complete fix.
> >
> > Well, __dax_pmd_fault() itself depends on CONFIG_TRANSPARENT_HUGEPAGE.
> >
>
> Indeed it is... I think that's wrong because transparent huge pages
> rely on struct page??
I do not think this issue is related to struct page. wp_huge_pmd() calls
either do_huge_pmd_wp_page() or dax_pmd_fault(). do_huge_pmd_wp_page() splits a
pmd page when it returns with VM_FAULT_FALLBACK. So, this change keeps them
consistent on VM_FAULT_FALLBACK.
Thanks,
-Toshi
^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: [PATCH] dax: Split pmd map when fallback on COW
2015-11-23 20:05 [PATCH] dax: Split pmd map when fallback on COW Toshi Kani
2015-11-23 20:45 ` Dan Williams
@ 2015-11-23 22:58 ` Toshi Kani
2015-11-24 17:08 ` Dan Williams
2 siblings, 0 replies; 7+ messages in thread
From: Toshi Kani @ 2015-11-23 22:58 UTC (permalink / raw)
To: dan.j.williams
Cc: kirill.shutemov, willy, ross.zwisler, linux-mm, linux-fsdevel,
linux-nvdimm, linux-kernel
On Mon, 2015-11-23 at 13:05 -0700, Toshi Kani wrote:
> An infinite loop of PMD faults was observed when attempting to
> mlock() a private read-only PMD mmap'd range of a DAX file.
Typo: the above description should be (remove "read-only"):
An infinite loop of PMD faults was observed when attempting to mlock() a private
PMD mmap'd range of a DAX file.
-Toshi
> __dax_pmd_fault() simply returns with VM_FAULT_FALLBACK when
> falling back to PTE on COW. However, __handle_mm_fault()
> returns without falling back to handle_pte_fault() because
> a PMD map is present in this case.
>
> Change __dax_pmd_fault() to split the PMD map, if present,
> before returning with VM_FAULT_FALLBACK.
^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: [PATCH] dax: Split pmd map when fallback on COW
2015-11-23 20:05 [PATCH] dax: Split pmd map when fallback on COW Toshi Kani
2015-11-23 20:45 ` Dan Williams
2015-11-23 22:58 ` Toshi Kani
@ 2015-11-24 17:08 ` Dan Williams
2 siblings, 0 replies; 7+ messages in thread
From: Dan Williams @ 2015-11-24 17:08 UTC (permalink / raw)
To: Toshi Kani
Cc: Kirill A. Shutemov, Matthew Wilcox, Ross Zwisler, Linux MM,
linux-fsdevel, linux-nvdimm, linux-kernel
On Mon, Nov 23, 2015 at 12:05 PM, Toshi Kani <toshi.kani@hpe.com> wrote:
> An infinite loop of PMD faults was observed when attempting to
> mlock() a private read-only PMD mmap'd range of a DAX file.
>
> __dax_pmd_fault() simply returns with VM_FAULT_FALLBACK when
> falling back to PTE on COW. However, __handle_mm_fault()
> returns without falling back to handle_pte_fault() because
> a PMD map is present in this case.
>
> Change __dax_pmd_fault() to split the PMD map, if present,
> before returning with VM_FAULT_FALLBACK.
>
> Signed-off-by: Toshi Kani <toshi.kani@hpe.com>
> Cc: Dan Williams <dan.j.williams@intel.com>
> Cc: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
> Cc: Matthew Wilcox <willy@linux.intel.com>
> Cc: Ross Zwisler <ross.zwisler@linux.intel.com>
> ---
> fs/dax.c | 4 +++-
> 1 file changed, 3 insertions(+), 1 deletion(-)
>
> diff --git a/fs/dax.c b/fs/dax.c
> index 43671b6..3405583 100644
> --- a/fs/dax.c
> +++ b/fs/dax.c
> @@ -546,8 +546,10 @@ int __dax_pmd_fault(struct vm_area_struct *vma, unsigned long address,
> return VM_FAULT_FALLBACK;
>
> /* Fall back to PTEs if we're going to COW */
> - if (write && !(vma->vm_flags & VM_SHARED))
> + if (write && !(vma->vm_flags & VM_SHARED)) {
> + split_huge_page_pmd(vma, address, pmd);
> return VM_FAULT_FALLBACK;
> + }
Reviewed-by: Dan Williams <dan.j.williams@intel.com>
I took a closer look at dax's CONFIG_TRANSPARENT_HUGEPAGE interactions
and it turns out THP is a performance enhancement not a functional
dependency. I.e. a performance enhancement to use a huge_zero_page
where available, but not a requirement.
I'll fold this in with my series to make pmd_trans_huge() return false
for non-huge_zero_page dax mappings, and in that case I'll need to
up-level the call to pmdp_huge_clear_flush_notify() from
__split_huge_page_pmd.
^ permalink raw reply [flat|nested] 7+ messages in thread
end of thread, other threads:[~2015-11-24 17:08 UTC | newest]
Thread overview: 7+ messages
2015-11-23 20:05 [PATCH] dax: Split pmd map when fallback on COW Toshi Kani
2015-11-23 20:45 ` Dan Williams
2015-11-23 20:45 ` Toshi Kani
2015-11-23 20:56 ` Dan Williams
2015-11-23 21:04 ` Toshi Kani
2015-11-23 22:58 ` Toshi Kani
2015-11-24 17:08 ` Dan Williams