* [PATCH 0/2] mm/hmm: more HMM clean up
From: Ralph Campbell @ 2019-07-23 23:30 UTC
To: linux-mm; +Cc: linux-kernel, Ralph Campbell

Here are two more patches for things I found to clean up.
I assume this will go into Jason's tree since there will likely be
more HMM changes in this cycle.

Ralph Campbell (2):
  mm/hmm: a few more C style and comment clean ups
  mm/hmm: make full use of walk_page_range()

 mm/hmm.c | 231 +++++++++++++++++++++----------------------------------
 1 file changed, 86 insertions(+), 145 deletions(-)

-- 
2.20.1
* [PATCH 1/2] mm/hmm: a few more C style and comment clean ups 2019-07-23 23:30 [PATCH 0/2] mm/hmm: more HMM clean up Ralph Campbell @ 2019-07-23 23:30 ` Ralph Campbell 2019-07-23 23:57 ` Jason Gunthorpe 2019-07-23 23:30 ` [PATCH 2/2] mm/hmm: make full use of walk_page_range() Ralph Campbell 1 sibling, 1 reply; 8+ messages in thread From: Ralph Campbell @ 2019-07-23 23:30 UTC (permalink / raw) To: linux-mm Cc: linux-kernel, Ralph Campbell, Jérôme Glisse, Jason Gunthorpe, Christoph Hellwig A few more comments and minor programming style clean ups. There should be no functional changes. Signed-off-by: Ralph Campbell <rcampbell@nvidia.com> Cc: "Jérôme Glisse" <jglisse@redhat.com> Cc: Jason Gunthorpe <jgg@mellanox.com> Cc: Christoph Hellwig <hch@lst.de> --- mm/hmm.c | 34 ++++++++++++++++------------------ 1 file changed, 16 insertions(+), 18 deletions(-) diff --git a/mm/hmm.c b/mm/hmm.c index b810a4fa3de9..8271f110c243 100644 --- a/mm/hmm.c +++ b/mm/hmm.c @@ -32,7 +32,7 @@ static const struct mmu_notifier_ops hmm_mmu_notifier_ops; * hmm_get_or_create - register HMM against an mm (HMM internal) * * @mm: mm struct to attach to - * Returns: returns an HMM object, either by referencing the existing + * Return: an HMM object, either by referencing the existing * (per-process) object, or by creating a new one. * * This is not intended to be used directly by device drivers. If mm already @@ -323,8 +323,8 @@ static int hmm_pfns_bad(unsigned long addr, } /* - * hmm_vma_walk_hole() - handle a range lacking valid pmd or pte(s) - * @start: range virtual start address (inclusive) + * hmm_vma_walk_hole_() - handle a range lacking valid pmd or pte(s) + * @addr: range virtual start address (inclusive) * @end: range virtual end address (exclusive) * @fault: should we fault or not ? * @write_fault: write fault ? 
@@ -374,9 +374,9 @@ static inline void hmm_pte_need_fault(const struct hmm_vma_walk *hmm_vma_walk, /* * So we not only consider the individual per page request we also * consider the default flags requested for the range. The API can - * be use in 2 fashions. The first one where the HMM user coalesce - * multiple page fault into one request and set flags per pfns for - * of those faults. The second one where the HMM user want to pre- + * be used 2 ways. The first one where the HMM user coalesces + * multiple page faults into one request and sets flags per pfn for + * those faults. The second one where the HMM user wants to pre- * fault a range with specific flags. For the latter one it is a * waste to have the user pre-fill the pfn arrays with a default * flags value. @@ -386,7 +386,7 @@ static inline void hmm_pte_need_fault(const struct hmm_vma_walk *hmm_vma_walk, /* We aren't ask to do anything ... */ if (!(pfns & range->flags[HMM_PFN_VALID])) return; - /* If this is device memory than only fault if explicitly requested */ + /* If this is device memory then only fault if explicitly requested */ if ((cpu_flags & range->flags[HMM_PFN_DEVICE_PRIVATE])) { /* Do we fault on device memory ? */ if (pfns & range->flags[HMM_PFN_DEVICE_PRIVATE]) { @@ -500,7 +500,7 @@ static int hmm_vma_handle_pmd(struct mm_walk *walk, hmm_vma_walk->last = end; return 0; #else - /* If THP is not enabled then we should never reach that code ! */ + /* If THP is not enabled then we should never reach this code ! 
*/ return -EINVAL; #endif } @@ -624,13 +624,12 @@ static int hmm_vma_walk_pmd(pmd_t *pmdp, pte_t *ptep; pmd_t pmd; - again: pmd = READ_ONCE(*pmdp); if (pmd_none(pmd)) return hmm_vma_walk_hole(start, end, walk); - if (pmd_huge(pmd) && (range->vma->vm_flags & VM_HUGETLB)) + if (pmd_huge(pmd) && is_vm_hugetlb_page(vma)) return hmm_pfns_bad(start, end, walk); if (thp_migration_supported() && is_pmd_migration_entry(pmd)) { @@ -655,11 +654,11 @@ static int hmm_vma_walk_pmd(pmd_t *pmdp, if (pmd_devmap(pmd) || pmd_trans_huge(pmd)) { /* - * No need to take pmd_lock here, even if some other threads + * No need to take pmd_lock here, even if some other thread * is splitting the huge pmd we will get that event through * mmu_notifier callback. * - * So just read pmd value and check again its a transparent + * So just read pmd value and check again it's a transparent * huge or device mapping one and compute corresponding pfn * values. */ @@ -673,7 +672,7 @@ static int hmm_vma_walk_pmd(pmd_t *pmdp, } /* - * We have handled all the valid case above ie either none, migration, + * We have handled all the valid cases above ie either none, migration, * huge or transparent huge. At this point either it is a valid pmd * entry pointing to pte directory or it is a bad pmd that will not * recover. @@ -793,10 +792,10 @@ static int hmm_vma_walk_hugetlb_entry(pte_t *pte, unsigned long hmask, pte_t entry; int ret = 0; - size = 1UL << huge_page_shift(h); + size = huge_page_size(h); mask = size - 1; if (range->page_shift != PAGE_SHIFT) { - /* Make sure we are looking at full page. */ + /* Make sure we are looking at a full page. 
*/ if (start & mask) return -EINVAL; if (end < (start + size)) @@ -807,8 +806,7 @@ static int hmm_vma_walk_hugetlb_entry(pte_t *pte, unsigned long hmask, size = PAGE_SIZE; } - - ptl = huge_pte_lock(hstate_vma(walk->vma), walk->mm, pte); + ptl = huge_pte_lock(hstate_vma(vma), walk->mm, pte); entry = huge_ptep_get(pte); i = (start - range->start) >> range->page_shift; @@ -857,7 +855,7 @@ static void hmm_pfns_clear(struct hmm_range *range, * @start: start virtual address (inclusive) * @end: end virtual address (exclusive) * @page_shift: expect page shift for the range - * Returns 0 on success, -EFAULT if the address space is no longer valid + * Return: 0 on success, -EFAULT if the address space is no longer valid * * Track updates to the CPU page table see include/linux/hmm.h */ -- 2.20.1 ^ permalink raw reply [flat|nested] 8+ messages in thread
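For readers following the hmm_vma_walk_hugetlb_entry() hunk above, here is a minimal user-space sketch of the "full page" check that the patch touches. It assumes a huge page of 2^shift bytes; the helper name check_full_huge_page() and its parameters are hypothetical and are not kernel API. It only illustrates that 1UL << shift and a size-based mask yield the same alignment test, which is why swapping in huge_page_size(h) should not change behavior.

```c
#include <assert.h>
#include <errno.h>

/*
 * Hypothetical helper, not part of mm/hmm.c: returns 0 when
 * [start, end) begins on a huge page boundary and covers at least
 * one full huge page of 2^shift bytes, -EINVAL otherwise.
 */
static int check_full_huge_page(unsigned long start, unsigned long end,
				unsigned int shift)
{
	unsigned long size, mask;

	/* Guard against an out-of-range shift in this model. */
	assert(shift < sizeof(unsigned long) * 8);

	size = 1UL << shift;	/* the value huge_page_size() stands for */
	mask = size - 1;

	/* Start must be aligned to the huge page size. */
	if (start & mask)
		return -EINVAL;
	/* The range must span the whole huge page. */
	if (end < start + size)
		return -EINVAL;
	return 0;
}
```

With shift = 21 (a 2 MiB page), a range starting at 0x200000 and ending at 0x400000 passes, while a start of 0x201000 or an end of 0x300000 is rejected.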
* Re: [PATCH 1/2] mm/hmm: a few more C style and comment clean ups
From: Jason Gunthorpe @ 2019-07-23 23:57 UTC
To: Ralph Campbell
Cc: linux-mm, linux-kernel, Jérôme Glisse, Christoph Hellwig

On Tue, Jul 23, 2019 at 04:30:15PM -0700, Ralph Campbell wrote:

> -	if (pmd_huge(pmd) && (range->vma->vm_flags & VM_HUGETLB))
> +	if (pmd_huge(pmd) && is_vm_hugetlb_page(vma))
>  		return hmm_pfns_bad(start, end, walk);

This one is not a minor cleanup. I think it should be done in its own
commit, and more completely, maybe based on the below.

If vma is always the same as the first vma, then your hunk above
is much better than introducing a hugetlb flag as I did below.

Although I don't understand why we have this test when it does seem to
support huge pages, and the commit log suggests hugetlbfs was
deliberately supported. So a comment (or deletion) sure would be nice.

So maybe sequence this into your series?

Jason

From 6ea7cd2565b5b660d22a659b71b62614e66bc345 Mon Sep 17 00:00:00 2001
From: Jason Gunthorpe <jgg@mellanox.com>
Date: Tue, 23 Jul 2019 12:28:32 -0300
Subject: [PATCH] mm/hmm: remove hmm_range vma

This value is only read inside hmm_vma_walk_pmd() and all the callers,
through walk_page_range(), always set the value.

The proper place for per-walk data is in hmm_vma_walk, and since the only
usage is a vm_flags test just precompute and store that.
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com> --- drivers/gpu/drm/nouveau/nouveau_svm.c | 7 +++---- include/linux/hmm.h | 1 - mm/hmm.c | 11 ++++++----- 3 files changed, 9 insertions(+), 10 deletions(-) diff --git a/drivers/gpu/drm/nouveau/nouveau_svm.c b/drivers/gpu/drm/nouveau/nouveau_svm.c index a9c5c58d425b3d..4f4bec40b887a6 100644 --- a/drivers/gpu/drm/nouveau/nouveau_svm.c +++ b/drivers/gpu/drm/nouveau/nouveau_svm.c @@ -495,12 +495,12 @@ nouveau_range_fault(struct hmm_mirror *mirror, struct hmm_range *range) range->start, range->end, PAGE_SHIFT); if (ret) { - up_read(&range->vma->vm_mm->mmap_sem); + up_read(&range->hmm->mm->mmap_sem); return (int)ret; } if (!hmm_range_wait_until_valid(range, HMM_RANGE_DEFAULT_TIMEOUT)) { - up_read(&range->vma->vm_mm->mmap_sem); + up_read(&range->hmm->mm->mmap_sem); return -EBUSY; } @@ -508,7 +508,7 @@ nouveau_range_fault(struct hmm_mirror *mirror, struct hmm_range *range) if (ret <= 0) { if (ret == 0) ret = -EBUSY; - up_read(&range->vma->vm_mm->mmap_sem); + up_read(&range->hmm->mm->mmap_sem); hmm_range_unregister(range); return ret; } @@ -681,7 +681,6 @@ nouveau_svm_fault(struct nvif_notify *notify) args.i.p.addr + args.i.p.size, fn - fi); /* Have HMM fault pages within the fault window to the GPU. 
*/ - range.vma = vma; range.start = args.i.p.addr; range.end = args.i.p.addr + args.i.p.size; range.pfns = args.phys; diff --git a/include/linux/hmm.h b/include/linux/hmm.h index 9f32586684c9c3..d4b89f655817cd 100644 --- a/include/linux/hmm.h +++ b/include/linux/hmm.h @@ -164,7 +164,6 @@ enum hmm_pfn_value_e { */ struct hmm_range { struct hmm *hmm; - struct vm_area_struct *vma; struct list_head list; unsigned long start; unsigned long end; diff --git a/mm/hmm.c b/mm/hmm.c index 16b6731a34db79..3d8cdfb67a6ab8 100644 --- a/mm/hmm.c +++ b/mm/hmm.c @@ -285,8 +285,9 @@ struct hmm_vma_walk { struct hmm_range *range; struct dev_pagemap *pgmap; unsigned long last; - bool fault; - bool block; + bool fault : 1; + bool block : 1; + bool hugetlb : 1; }; static int hmm_vma_do_fault(struct mm_walk *walk, unsigned long addr, @@ -635,7 +636,7 @@ static int hmm_vma_walk_pmd(pmd_t *pmdp, if (pmd_none(pmd)) return hmm_vma_walk_hole(start, end, walk); - if (pmd_huge(pmd) && (range->vma->vm_flags & VM_HUGETLB)) + if (pmd_huge(pmd) && hmm_vma_walk->hugetlb) return hmm_pfns_bad(start, end, walk); if (thp_migration_supported() && is_pmd_migration_entry(pmd)) { @@ -994,7 +995,7 @@ long hmm_range_snapshot(struct hmm_range *range) return -EPERM; } - range->vma = vma; + hmm_vma_walk.hugetlb = vma->vm_flags & VM_HUGETLB; hmm_vma_walk.pgmap = NULL; hmm_vma_walk.last = start; hmm_vma_walk.fault = false; @@ -1090,7 +1091,7 @@ long hmm_range_fault(struct hmm_range *range, bool block) return -EPERM; } - range->vma = vma; + hmm_vma_walk.hugetlb = vma->vm_flags & VM_HUGETLB; hmm_vma_walk.pgmap = NULL; hmm_vma_walk.last = start; hmm_vma_walk.fault = true; -- 2.22.0 ^ permalink raw reply [flat|nested] 8+ messages in thread
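A side note on the `bool hugetlb : 1` field in the patch above, as a small user-space sketch rather than kernel code. The assignment `hmm_vma_walk.hugetlb = vma->vm_flags & VM_HUGETLB` stores a masked flag word into a one-bit field. In standard C this works for a bool bit-field because the value is converted to bool first (any nonzero value becomes 1), whereas an unsigned one-bit field would keep only bit 0. The struct names and the flag value below are assumptions made for illustration only.

```c
#include <assert.h>
#include <stdbool.h>

/* Illustrative flag bit above bit 0; chosen for this sketch only. */
#define MODEL_FLAG_HUGETLB 0x00400000UL

/* Hypothetical stand-in for the per-walk state with bool bit-fields. */
struct walk_state_bool_model {
	bool fault : 1;
	bool block : 1;
	bool hugetlb : 1;
};

/* Same idea with an unsigned bit-field, to show the difference. */
struct walk_state_uint_model {
	unsigned int hugetlb : 1;
};

static bool hugetlb_as_bool_bitfield(unsigned long vm_flags)
{
	struct walk_state_bool_model s = { 0 };

	/* Conversion to bool happens first: nonzero becomes 1. */
	s.hugetlb = vm_flags & MODEL_FLAG_HUGETLB;
	return s.hugetlb;
}

static unsigned int hugetlb_as_uint_bitfield(unsigned long vm_flags)
{
	struct walk_state_uint_model s = { 0 };

	/* Value is reduced to its lowest bit, so a high flag bit is lost. */
	s.hugetlb = vm_flags & MODEL_FLAG_HUGETLB;
	return s.hugetlb;
}

/* Sanity check usable from any caller of this sketch. */
static void bitfield_model_selfcheck(void)
{
	assert(hugetlb_as_bool_bitfield(0) == false);
}
```

So the precomputed flag in the patch relies on the fields being declared bool; changing them to an integer bit-field later would need an explicit `!!` or comparison.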
* Re: [PATCH 1/2] mm/hmm: a few more C style and comment clean ups
From: Christoph Hellwig @ 2019-07-24 5:41 UTC
To: Jason Gunthorpe
Cc: Ralph Campbell, linux-mm, linux-kernel, Jérôme Glisse, Christoph Hellwig

On Tue, Jul 23, 2019 at 11:57:52PM +0000, Jason Gunthorpe wrote:
> diff --git a/mm/hmm.c b/mm/hmm.c
> index 16b6731a34db79..3d8cdfb67a6ab8 100644
> --- a/mm/hmm.c
> +++ b/mm/hmm.c
> @@ -285,8 +285,9 @@ struct hmm_vma_walk {
>  	struct hmm_range *range;
>  	struct dev_pagemap *pgmap;
>  	unsigned long last;
> -	bool fault;
> -	bool block;
> +	bool fault : 1;
> +	bool block : 1;
> +	bool hugetlb : 1;

I don't think we should even keep these bools around. I have something
like this hiding in a branch, which properly cleans much of this up:

http://git.infradead.org/users/hch/misc.git/shortlog/refs/heads/hmm-dma-cleanup

Notably:

http://git.infradead.org/users/hch/misc.git/commitdiff/2abdc0ac8f9f32149246957121ebccbe5c0a729d
http://git.infradead.org/users/hch/misc.git/commitdiff/a34ccd30ee8a8a3111d9e91711c12901ed7dea74
http://git.infradead.org/users/hch/misc.git/commitdiff/81f442ebac7170815af7770a1efa9c4ab662137e

This doesn't go all the way yet - the page_walk infrastructure is
built around the idea of doing its own vma lookups, and we should
eventually kill the lookup inside hmm entirely.
* [PATCH 2/2] mm/hmm: make full use of walk_page_range() 2019-07-23 23:30 [PATCH 0/2] mm/hmm: more HMM clean up Ralph Campbell 2019-07-23 23:30 ` [PATCH 1/2] mm/hmm: a few more C style and comment clean ups Ralph Campbell @ 2019-07-23 23:30 ` Ralph Campbell 2019-07-24 6:51 ` Christoph Hellwig 1 sibling, 1 reply; 8+ messages in thread From: Ralph Campbell @ 2019-07-23 23:30 UTC (permalink / raw) To: linux-mm Cc: linux-kernel, Ralph Campbell, Jérôme Glisse, Jason Gunthorpe, Christoph Hellwig hmm_range_snapshot() and hmm_range_fault() both call find_vma() and walk_page_range() in a loop. This is unnecessary duplication since walk_page_range() calls find_vma() in a loop already. Simplify hmm_range_snapshot() and hmm_range_fault() by defining a walk_test() callback function to filter unhandled vmas. Signed-off-by: Ralph Campbell <rcampbell@nvidia.com> Cc: "Jérôme Glisse" <jglisse@redhat.com> Cc: Jason Gunthorpe <jgg@mellanox.com> Cc: Christoph Hellwig <hch@lst.de> --- mm/hmm.c | 197 ++++++++++++++++++++----------------------------------- 1 file changed, 70 insertions(+), 127 deletions(-) diff --git a/mm/hmm.c b/mm/hmm.c index 8271f110c243..218557451070 100644 --- a/mm/hmm.c +++ b/mm/hmm.c @@ -839,13 +839,40 @@ static int hmm_vma_walk_hugetlb_entry(pte_t *pte, unsigned long hmask, #endif } -static void hmm_pfns_clear(struct hmm_range *range, - uint64_t *pfns, - unsigned long addr, - unsigned long end) +static int hmm_vma_walk_test(unsigned long start, + unsigned long end, + struct mm_walk *walk) { - for (; addr < end; addr += PAGE_SIZE, pfns++) - *pfns = range->values[HMM_PFN_NONE]; + const unsigned long device_vma = VM_IO | VM_PFNMAP | VM_MIXEDMAP; + struct hmm_vma_walk *hmm_vma_walk = walk->private; + struct hmm_range *range = hmm_vma_walk->range; + struct vm_area_struct *vma = walk->vma; + + /* If range is no longer valid, force retry. 
*/ + if (!range->valid) + return -EBUSY; + + if (vma->vm_flags & device_vma) + return -EFAULT; + + if (is_vm_hugetlb_page(vma)) { + if (huge_page_shift(hstate_vma(vma)) != range->page_shift && + range->page_shift != PAGE_SHIFT) + return -EINVAL; + } else { + if (range->page_shift != PAGE_SHIFT) + return -EINVAL; + } + + /* + * If vma does not allow read access, then assume that it does not + * allow write access, either. HMM does not support architectures + * that allow write without read. + */ + if (!(vma->vm_flags & VM_READ)) + return -EPERM; + + return 0; } /* @@ -949,65 +976,28 @@ EXPORT_SYMBOL(hmm_range_unregister); */ long hmm_range_snapshot(struct hmm_range *range) { - const unsigned long device_vma = VM_IO | VM_PFNMAP | VM_MIXEDMAP; - unsigned long start = range->start, end; - struct hmm_vma_walk hmm_vma_walk; + unsigned long start = range->start; + struct hmm_vma_walk hmm_vma_walk = {}; struct hmm *hmm = range->hmm; - struct vm_area_struct *vma; - struct mm_walk mm_walk; + struct mm_walk mm_walk = {}; + int ret; lockdep_assert_held(&hmm->mm->mmap_sem); - do { - /* If range is no longer valid force retry. */ - if (!range->valid) - return -EBUSY; - vma = find_vma(hmm->mm, start); - if (vma == NULL || (vma->vm_flags & device_vma)) - return -EFAULT; - - if (is_vm_hugetlb_page(vma)) { - if (huge_page_shift(hstate_vma(vma)) != - range->page_shift && - range->page_shift != PAGE_SHIFT) - return -EINVAL; - } else { - if (range->page_shift != PAGE_SHIFT) - return -EINVAL; - } + hmm_vma_walk.range = range; + hmm_vma_walk.last = start; + mm_walk.private = &hmm_vma_walk; - if (!(vma->vm_flags & VM_READ)) { - /* - * If vma do not allow read access, then assume that it - * does not allow write access, either. HMM does not - * support architecture that allow write without read. 
- */ - hmm_pfns_clear(range, range->pfns, - range->start, range->end); - return -EPERM; - } + mm_walk.mm = hmm->mm; + mm_walk.pud_entry = hmm_vma_walk_pud; + mm_walk.pmd_entry = hmm_vma_walk_pmd; + mm_walk.pte_hole = hmm_vma_walk_hole; + mm_walk.hugetlb_entry = hmm_vma_walk_hugetlb_entry; + mm_walk.test_walk = hmm_vma_walk_test; - range->vma = vma; - hmm_vma_walk.pgmap = NULL; - hmm_vma_walk.last = start; - hmm_vma_walk.fault = false; - hmm_vma_walk.range = range; - mm_walk.private = &hmm_vma_walk; - end = min(range->end, vma->vm_end); - - mm_walk.vma = vma; - mm_walk.mm = vma->vm_mm; - mm_walk.pte_entry = NULL; - mm_walk.test_walk = NULL; - mm_walk.hugetlb_entry = NULL; - mm_walk.pud_entry = hmm_vma_walk_pud; - mm_walk.pmd_entry = hmm_vma_walk_pmd; - mm_walk.pte_hole = hmm_vma_walk_hole; - mm_walk.hugetlb_entry = hmm_vma_walk_hugetlb_entry; - - walk_page_range(start, end, &mm_walk); - start = end; - } while (start < range->end); + ret = walk_page_range(start, range->end, &mm_walk); + if (ret) + return ret; return (hmm_vma_walk.last - range->start) >> PAGE_SHIFT; } @@ -1043,83 +1033,36 @@ EXPORT_SYMBOL(hmm_range_snapshot); */ long hmm_range_fault(struct hmm_range *range, bool block) { - const unsigned long device_vma = VM_IO | VM_PFNMAP | VM_MIXEDMAP; - unsigned long start = range->start, end; - struct hmm_vma_walk hmm_vma_walk; + unsigned long start = range->start; + struct hmm_vma_walk hmm_vma_walk = {}; struct hmm *hmm = range->hmm; - struct vm_area_struct *vma; - struct mm_walk mm_walk; + struct mm_walk mm_walk = {}; int ret; lockdep_assert_held(&hmm->mm->mmap_sem); - do { - /* If range is no longer valid force retry. 
*/ - if (!range->valid) - return -EBUSY; + hmm_vma_walk.range = range; + hmm_vma_walk.last = start; + hmm_vma_walk.fault = true; + hmm_vma_walk.block = block; + mm_walk.private = &hmm_vma_walk; - vma = find_vma(hmm->mm, start); - if (vma == NULL || (vma->vm_flags & device_vma)) - return -EFAULT; - - if (is_vm_hugetlb_page(vma)) { - if (huge_page_shift(hstate_vma(vma)) != - range->page_shift && - range->page_shift != PAGE_SHIFT) - return -EINVAL; - } else { - if (range->page_shift != PAGE_SHIFT) - return -EINVAL; - } + mm_walk.mm = hmm->mm; + mm_walk.pud_entry = hmm_vma_walk_pud; + mm_walk.pmd_entry = hmm_vma_walk_pmd; + mm_walk.pte_hole = hmm_vma_walk_hole; + mm_walk.hugetlb_entry = hmm_vma_walk_hugetlb_entry; + mm_walk.test_walk = hmm_vma_walk_test; - if (!(vma->vm_flags & VM_READ)) { - /* - * If vma do not allow read access, then assume that it - * does not allow write access, either. HMM does not - * support architecture that allow write without read. - */ - hmm_pfns_clear(range, range->pfns, - range->start, range->end); - return -EPERM; - } + do { + ret = walk_page_range(start, range->end, &mm_walk); + start = hmm_vma_walk.last; - range->vma = vma; - hmm_vma_walk.pgmap = NULL; - hmm_vma_walk.last = start; - hmm_vma_walk.fault = true; - hmm_vma_walk.block = block; - hmm_vma_walk.range = range; - mm_walk.private = &hmm_vma_walk; - end = min(range->end, vma->vm_end); - - mm_walk.vma = vma; - mm_walk.mm = vma->vm_mm; - mm_walk.pte_entry = NULL; - mm_walk.test_walk = NULL; - mm_walk.hugetlb_entry = NULL; - mm_walk.pud_entry = hmm_vma_walk_pud; - mm_walk.pmd_entry = hmm_vma_walk_pmd; - mm_walk.pte_hole = hmm_vma_walk_hole; - mm_walk.hugetlb_entry = hmm_vma_walk_hugetlb_entry; - - do { - ret = walk_page_range(start, end, &mm_walk); - start = hmm_vma_walk.last; - - /* Keep trying while the range is valid. 
*/ - } while (ret == -EBUSY && range->valid); - - if (ret) { - unsigned long i; - - i = (hmm_vma_walk.last - range->start) >> PAGE_SHIFT; - hmm_pfns_clear(range, &range->pfns[i], - hmm_vma_walk.last, range->end); - return ret; - } - start = end; + /* Keep trying while the range is valid. */ + } while (ret == -EBUSY && range->valid); - } while (start < range->end); + if (ret) + return ret; return (hmm_vma_walk.last - range->start) >> PAGE_SHIFT; } -- 2.20.1 ^ permalink raw reply [flat|nested] 8+ messages in thread
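The idea behind the patch above, moving the per-vma checks out of a hand-written find_vma() loop and into a test_walk callback, can be sketched in user space roughly as follows. This is a simplified model under stated assumptions, not the kernel's walk_page_range(): every type, function and flag value here (vma_model, walk_range_model(), M_VM_*, and so on) is hypothetical, holes between vmas are silently skipped, and page table levels are collapsed into a single entry callback. It keeps the real callback convention that a negative return aborts the walk, zero walks the vma, and a positive value skips it.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical flag values for this model; not the kernel's VM_* bits. */
#define M_VM_READ     0x1UL
#define M_VM_IO       0x2UL
#define M_VM_PFNMAP   0x4UL
#define M_VM_MIXEDMAP 0x8UL
#define M_PAGE_SHIFT  12

struct vma_model { unsigned long start, end, flags; };
struct range_model { unsigned long start, end; bool valid; };
struct state_model { struct range_model *range; unsigned long last; };

struct walk_model {
	const struct vma_model *vma;	/* vma currently being visited */
	void *private;			/* per-walk state */
	int (*test_walk)(unsigned long, unsigned long, struct walk_model *);
	int (*entry)(unsigned long, unsigned long, struct walk_model *);
};

/* Visit each vma overlapping [start, end); vmas must be sorted. */
static int walk_range_model(const struct vma_model *vmas, size_t n,
			    unsigned long start, unsigned long end,
			    struct walk_model *walk)
{
	assert(walk && walk->entry);
	for (size_t i = 0; i < n && start < end; i++) {
		unsigned long s, e;
		int ret;

		if (vmas[i].end <= start)
			continue;
		if (vmas[i].start >= end)
			break;
		s = start > vmas[i].start ? start : vmas[i].start;
		e = end < vmas[i].end ? end : vmas[i].end;
		walk->vma = &vmas[i];
		ret = walk->test_walk ? walk->test_walk(s, e, walk) : 0;
		if (ret < 0)
			return ret;	/* filter rejected the vma: abort */
		if (ret == 0) {
			ret = walk->entry(s, e, walk);
			if (ret)
				return ret;
		}
		start = e;		/* ret > 0 means skip this vma */
	}
	return 0;
}

/* Per-vma filter, mirroring the order of checks in the patch. */
static int test_vma_model(unsigned long start, unsigned long end,
			  struct walk_model *walk)
{
	struct state_model *st = walk->private;

	(void)start;
	(void)end;
	if (!st->range->valid)
		return -EBUSY;
	if (walk->vma->flags & (M_VM_IO | M_VM_PFNMAP | M_VM_MIXEDMAP))
		return -EFAULT;
	if (!(walk->vma->flags & M_VM_READ))
		return -EPERM;
	return 0;
}

/* Stand-in for the pmd/pte callbacks: just record progress. */
static int entry_model(unsigned long start, unsigned long end,
		       struct walk_model *walk)
{
	struct state_model *st = walk->private;

	(void)start;
	st->last = end;
	return 0;
}

/* Returns the number of pages walked, or a negative errno. */
static long snapshot_model(const struct vma_model *vmas, size_t n,
			   struct range_model *range)
{
	struct state_model st = { .range = range, .last = range->start };
	struct walk_model walk = {
		.private = &st,
		.test_walk = test_vma_model,
		.entry = entry_model,
	};
	int ret = walk_range_model(vmas, n, range->start, range->end, &walk);

	if (ret)
		return ret;
	return (long)((st.last - range->start) >> M_PAGE_SHIFT);
}
```

In this model the caller no longer loops over vmas itself: it fills in the callbacks once and issues a single walk over the whole range, which is the shape hmm_range_snapshot() takes after the patch.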
* Re: [PATCH 2/2] mm/hmm: make full use of walk_page_range()
From: Christoph Hellwig @ 2019-07-24 6:51 UTC
To: Ralph Campbell
Cc: linux-mm, linux-kernel, Jérôme Glisse, Jason Gunthorpe, Christoph Hellwig

On Tue, Jul 23, 2019 at 04:30:16PM -0700, Ralph Campbell wrote:
> hmm_range_snapshot() and hmm_range_fault() both call find_vma() and
> walk_page_range() in a loop. This is unnecessary duplication since
> walk_page_range() calls find_vma() in a loop already.
> Simplify hmm_range_snapshot() and hmm_range_fault() by defining a
> walk_test() callback function to filter unhandled vmas.

I like the approach a lot!

But we really need to sort out the duplication between hmm_range_fault
and hmm_range_snapshot first, as they are basically the same code. I
have patches here:

http://git.infradead.org/users/hch/misc.git/commitdiff/a34ccd30ee8a8a3111d9e91711c12901ed7dea74
http://git.infradead.org/users/hch/misc.git/commitdiff/81f442ebac7170815af7770a1efa9c4ab662137e

That being said we don't really have any users for the snapshot mode
or non-blocking faults, and I don't see any in the immediate pipeline
either. It might actually be a better idea to just kill that stuff off
for now until we have a user, as code without users is by definition
untested and will just bitrot and break.

> +	const unsigned long device_vma = VM_IO | VM_PFNMAP | VM_MIXEDMAP;
> +	struct hmm_vma_walk *hmm_vma_walk = walk->private;
> +	struct hmm_range *range = hmm_vma_walk->range;
> +	struct vm_area_struct *vma = walk->vma;
> +
> +	/* If range is no longer valid, force retry. */
> +	if (!range->valid)
> +		return -EBUSY;
> +
> +	if (vma->vm_flags & device_vma)

Can we just kill off this odd device_vma variable?

	if (vma->vm_flags & (VM_IO | VM_PFNMAP | VM_MIXEDMAP))

and maybe add a comment on why we are skipping them (because they
don't have struct page backing, I guess).
* Re: [PATCH 2/2] mm/hmm: make full use of walk_page_range()
From: Jason Gunthorpe @ 2019-07-24 11:53 UTC
To: Christoph Hellwig
Cc: Ralph Campbell, linux-mm, linux-kernel, Jérôme Glisse

On Wed, Jul 24, 2019 at 08:51:46AM +0200, Christoph Hellwig wrote:
> On Tue, Jul 23, 2019 at 04:30:16PM -0700, Ralph Campbell wrote:
> > hmm_range_snapshot() and hmm_range_fault() both call find_vma() and
> > walk_page_range() in a loop. This is unnecessary duplication since
> > walk_page_range() calls find_vma() in a loop already.
> > Simplify hmm_range_snapshot() and hmm_range_fault() by defining a
> > walk_test() callback function to filter unhandled vmas.
>
> I like the approach a lot!
>
> But we really need to sort out the duplication between hmm_range_fault
> and hmm_range_snapshot first, as they are basically the same code. I
> have patches here:
>
> http://git.infradead.org/users/hch/misc.git/commitdiff/a34ccd30ee8a8a3111d9e91711c12901ed7dea74
>
> http://git.infradead.org/users/hch/misc.git/commitdiff/81f442ebac7170815af7770a1efa9c4ab662137e

Yeah, that is a straightforward improvement, maybe Ralph should grab
these two as part of his series?

> That being said we don't really have any users for the snapshot mode
> or non-blocking faults, and I don't see any in the immediate pipeline
> either.

If this code was production ready I'd use it in ODP right away.

When we first create an ODP MR we'd want to snapshot to pre-load the
NIC tables with something other than page fault, but not fault
anything.

This would be a big performance improvement for ODP.

Jason
* Re: [PATCH 2/2] mm/hmm: make full use of walk_page_range()
From: Ralph Campbell @ 2019-07-24 19:50 UTC
To: Jason Gunthorpe, Christoph Hellwig
Cc: linux-mm, linux-kernel, Jérôme Glisse

On 7/24/19 4:53 AM, Jason Gunthorpe wrote:
> On Wed, Jul 24, 2019 at 08:51:46AM +0200, Christoph Hellwig wrote:
>> On Tue, Jul 23, 2019 at 04:30:16PM -0700, Ralph Campbell wrote:
>>> hmm_range_snapshot() and hmm_range_fault() both call find_vma() and
>>> walk_page_range() in a loop. This is unnecessary duplication since
>>> walk_page_range() calls find_vma() in a loop already.
>>> Simplify hmm_range_snapshot() and hmm_range_fault() by defining a
>>> walk_test() callback function to filter unhandled vmas.
>>
>> I like the approach a lot!
>>
>> But we really need to sort out the duplication between hmm_range_fault
>> and hmm_range_snapshot first, as they are basically the same code. I
>> have patches here:
>>
>> http://git.infradead.org/users/hch/misc.git/commitdiff/a34ccd30ee8a8a3111d9e91711c12901ed7dea74
>>
>> http://git.infradead.org/users/hch/misc.git/commitdiff/81f442ebac7170815af7770a1efa9c4ab662137e
>
> Yeah, that is a straightforward improvement, maybe Ralph should grab
> these two as part of his series?

Sure, no problem.
I'll add them in v2 when I fix the other issues in the series.

>> That being said we don't really have any users for the snapshot mode
>> or non-blocking faults, and I don't see any in the immediate pipeline
>> either.
>
> If this code was production ready I'd use it in ODP right away.
>
> When we first create a ODP MR we'd want to snapshot to pre-load the
> NIC tables with something other than page fault, but not fault
> anything.
>
> This would be a big performance improvement for ODP.
>
> Jason
>
Thread overview: 8+ messages (newest: 2019-07-24 19:50 UTC)

2019-07-23 23:30 [PATCH 0/2] mm/hmm: more HMM clean up Ralph Campbell
2019-07-23 23:30 ` [PATCH 1/2] mm/hmm: a few more C style and comment clean ups Ralph Campbell
2019-07-23 23:57   ` Jason Gunthorpe
2019-07-24  5:41     ` Christoph Hellwig
2019-07-23 23:30 ` [PATCH 2/2] mm/hmm: make full use of walk_page_range() Ralph Campbell
2019-07-24  6:51   ` Christoph Hellwig
2019-07-24 11:53     ` Jason Gunthorpe
2019-07-24 19:50       ` Ralph Campbell