linux-mm.kvack.org archive mirror
* [PATCH 1/2] mm: implement remap_pfn_range with apply_to_page_range
@ 2008-11-13 17:19 Jeremy Fitzhardinge
  2008-11-13 19:53 ` Johannes Weiner
  2008-11-14  2:19 ` [PATCH 1/2] mm: implement remap_pfn_range with apply_to_page_range Nick Piggin
  0 siblings, 2 replies; 12+ messages in thread
From: Jeremy Fitzhardinge @ 2008-11-13 17:19 UTC (permalink / raw)
  To: Andrew Morton
  Cc: Nick Piggin, Linux Memory Management List,
	Linux Kernel Mailing List, Pallipadi, Venkatesh

remap_pte_range() just wants to apply a function over a range of ptes
corresponding to a virtual address range.  That's exactly what
apply_to_page_range() does, so use it.

Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
---
 mm/memory.c |   92 ++++++++++++-----------------------------------------------
 1 file changed, 20 insertions(+), 72 deletions(-)

===================================================================
--- a/mm/memory.c
+++ b/mm/memory.c
@@ -1472,69 +1472,20 @@
 }
 EXPORT_SYMBOL(vm_insert_mixed);
 
-/*
- * maps a range of physical memory into the requested pages. the old
- * mappings are removed. any references to nonexistent pages results
- * in null mappings (currently treated as "copy-on-access")
- */
-static int remap_pte_range(struct mm_struct *mm, pmd_t *pmd,
-			unsigned long addr, unsigned long end,
-			unsigned long pfn, pgprot_t prot)
+struct remap_data {
+	struct mm_struct *mm;
+	unsigned long pfn;
+	pgprot_t prot;
+};
+
+static int remap_area_pte_fn(pte_t *ptep, pgtable_t token,
+			     unsigned long addr, void *data)
 {
-	pte_t *pte;
-	spinlock_t *ptl;
+	struct remap_data *rmd = data;
+	pte_t pte = pte_mkspecial(pfn_pte(rmd->pfn++, rmd->prot));
 
-	pte = pte_alloc_map_lock(mm, pmd, addr, &ptl);
-	if (!pte)
-		return -ENOMEM;
-	arch_enter_lazy_mmu_mode();
-	do {
-		BUG_ON(!pte_none(*pte));
-		set_pte_at(mm, addr, pte, pte_mkspecial(pfn_pte(pfn, prot)));
-		pfn++;
-	} while (pte++, addr += PAGE_SIZE, addr != end);
-	arch_leave_lazy_mmu_mode();
-	pte_unmap_unlock(pte - 1, ptl);
-	return 0;
-}
+	set_pte_at(rmd->mm, addr, ptep, pte);
 
-static inline int remap_pmd_range(struct mm_struct *mm, pud_t *pud,
-			unsigned long addr, unsigned long end,
-			unsigned long pfn, pgprot_t prot)
-{
-	pmd_t *pmd;
-	unsigned long next;
-
-	pfn -= addr >> PAGE_SHIFT;
-	pmd = pmd_alloc(mm, pud, addr);
-	if (!pmd)
-		return -ENOMEM;
-	do {
-		next = pmd_addr_end(addr, end);
-		if (remap_pte_range(mm, pmd, addr, next,
-				pfn + (addr >> PAGE_SHIFT), prot))
-			return -ENOMEM;
-	} while (pmd++, addr = next, addr != end);
-	return 0;
-}
-
-static inline int remap_pud_range(struct mm_struct *mm, pgd_t *pgd,
-			unsigned long addr, unsigned long end,
-			unsigned long pfn, pgprot_t prot)
-{
-	pud_t *pud;
-	unsigned long next;
-
-	pfn -= addr >> PAGE_SHIFT;
-	pud = pud_alloc(mm, pgd, addr);
-	if (!pud)
-		return -ENOMEM;
-	do {
-		next = pud_addr_end(addr, end);
-		if (remap_pmd_range(mm, pud, addr, next,
-				pfn + (addr >> PAGE_SHIFT), prot))
-			return -ENOMEM;
-	} while (pud++, addr = next, addr != end);
 	return 0;
 }
 
@@ -1551,10 +1502,9 @@
 int remap_pfn_range(struct vm_area_struct *vma, unsigned long addr,
 		    unsigned long pfn, unsigned long size, pgprot_t prot)
 {
-	pgd_t *pgd;
-	unsigned long next;
 	unsigned long end = addr + PAGE_ALIGN(size);
 	struct mm_struct *mm = vma->vm_mm;
+	struct remap_data rmd;
 	int err;
 
 	/*
@@ -1584,16 +1534,14 @@
 	vma->vm_flags |= VM_IO | VM_RESERVED | VM_PFNMAP;
 
 	BUG_ON(addr >= end);
-	pfn -= addr >> PAGE_SHIFT;
-	pgd = pgd_offset(mm, addr);
-	flush_cache_range(vma, addr, end);
-	do {
-		next = pgd_addr_end(addr, end);
-		err = remap_pud_range(mm, pgd, addr, next,
-				pfn + (addr >> PAGE_SHIFT), prot);
-		if (err)
-			break;
-	} while (pgd++, addr = next, addr != end);
+
+	rmd.mm = mm;
+	rmd.pfn = pfn;
+	rmd.prot = prot;
+
+	err = apply_to_page_range(mm, addr, end - addr,
+				  remap_area_pte_fn, &rmd);
+
 	return err;
 }
 EXPORT_SYMBOL(remap_pfn_range);


--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <dont@kvack.org>


* Re: [PATCH 1/2] mm: implement remap_pfn_range with apply_to_page_range
  2008-11-13 17:19 [PATCH 1/2] mm: implement remap_pfn_range with apply_to_page_range Jeremy Fitzhardinge
@ 2008-11-13 19:53 ` Johannes Weiner
  2008-11-13 20:12   ` Jeremy Fitzhardinge
  2008-11-13 20:13   ` [PATCH 3/2] mm/remap_pfn_range: restore missing flush Jeremy Fitzhardinge
  2008-11-14  2:19 ` [PATCH 1/2] mm: implement remap_pfn_range with apply_to_page_range Nick Piggin
  1 sibling, 2 replies; 12+ messages in thread
From: Johannes Weiner @ 2008-11-13 19:53 UTC (permalink / raw)
  To: Jeremy Fitzhardinge
  Cc: Andrew Morton, Nick Piggin, Linux Memory Management List,
	Linux Kernel Mailing List, Pallipadi, Venkatesh

Hi Jeremy,

On Thu, Nov 13, 2008 at 09:19:45AM -0800, Jeremy Fitzhardinge wrote:
> remap_pte_range() just wants to apply a function over a range of ptes
> corresponding to a virtual address range.  That's exactly what
> apply_to_page_range() does, so use it.
> 
> Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
> ---
> mm/memory.c |   92 
> ++++++++++++-----------------------------------------------
> 1 file changed, 20 insertions(+), 72 deletions(-)
> 
> ===================================================================
> --- a/mm/memory.c
> +++ b/mm/memory.c
> @@ -1472,69 +1472,20 @@
> }
> EXPORT_SYMBOL(vm_insert_mixed);
> 
> -/*
> - * maps a range of physical memory into the requested pages. the old
> - * mappings are removed. any references to nonexistent pages results
> - * in null mappings (currently treated as "copy-on-access")
> - */
> -static int remap_pte_range(struct mm_struct *mm, pmd_t *pmd,
> -			unsigned long addr, unsigned long end,
> -			unsigned long pfn, pgprot_t prot)
> +struct remap_data {
> +	struct mm_struct *mm;
> +	unsigned long pfn;
> +	pgprot_t prot;
> +};
> +
> +static int remap_area_pte_fn(pte_t *ptep, pgtable_t token,
> +			     unsigned long addr, void *data)
> {
> -	pte_t *pte;
> -	spinlock_t *ptl;
> +	struct remap_data *rmd = data;
> +	pte_t pte = pte_mkspecial(pfn_pte(rmd->pfn++, rmd->prot));
> 
> -	pte = pte_alloc_map_lock(mm, pmd, addr, &ptl);
> -	if (!pte)
> -		return -ENOMEM;
> -	arch_enter_lazy_mmu_mode();
> -	do {
> -		BUG_ON(!pte_none(*pte));

Dropping by intention?

> -		set_pte_at(mm, addr, pte, pte_mkspecial(pfn_pte(pfn, prot)));
> -		pfn++;
> -	} while (pte++, addr += PAGE_SIZE, addr != end);
> -	arch_leave_lazy_mmu_mode();
> -	pte_unmap_unlock(pte - 1, ptl);
> -	return 0;
> -}
> +	set_pte_at(rmd->mm, addr, ptep, pte);
> 
> -static inline int remap_pmd_range(struct mm_struct *mm, pud_t *pud,
> -			unsigned long addr, unsigned long end,
> -			unsigned long pfn, pgprot_t prot)
> -{
> -	pmd_t *pmd;
> -	unsigned long next;
> -
> -	pfn -= addr >> PAGE_SHIFT;
> -	pmd = pmd_alloc(mm, pud, addr);
> -	if (!pmd)
> -		return -ENOMEM;
> -	do {
> -		next = pmd_addr_end(addr, end);
> -		if (remap_pte_range(mm, pmd, addr, next,
> -				pfn + (addr >> PAGE_SHIFT), prot))
> -			return -ENOMEM;
> -	} while (pmd++, addr = next, addr != end);
> -	return 0;
> -}
> -
> -static inline int remap_pud_range(struct mm_struct *mm, pgd_t *pgd,
> -			unsigned long addr, unsigned long end,
> -			unsigned long pfn, pgprot_t prot)
> -{
> -	pud_t *pud;
> -	unsigned long next;
> -
> -	pfn -= addr >> PAGE_SHIFT;
> -	pud = pud_alloc(mm, pgd, addr);
> -	if (!pud)
> -		return -ENOMEM;
> -	do {
> -		next = pud_addr_end(addr, end);
> -		if (remap_pmd_range(mm, pud, addr, next,
> -				pfn + (addr >> PAGE_SHIFT), prot))
> -			return -ENOMEM;
> -	} while (pud++, addr = next, addr != end);
> 	return 0;
> }
> 
> @@ -1551,10 +1502,9 @@
> int remap_pfn_range(struct vm_area_struct *vma, unsigned long addr,
> 		    unsigned long pfn, unsigned long size, pgprot_t prot)
> {
> -	pgd_t *pgd;
> -	unsigned long next;
> 	unsigned long end = addr + PAGE_ALIGN(size);
> 	struct mm_struct *mm = vma->vm_mm;
> +	struct remap_data rmd;
> 	int err;
> 
> 	/*
> @@ -1584,16 +1534,14 @@
> 	vma->vm_flags |= VM_IO | VM_RESERVED | VM_PFNMAP;
> 
> 	BUG_ON(addr >= end);
> -	pfn -= addr >> PAGE_SHIFT;
> -	pgd = pgd_offset(mm, addr);
> -	flush_cache_range(vma, addr, end);

Was the flushing redundant?  I can't spot it reappearing anywhere.

	Hannes


* Re: [PATCH 1/2] mm: implement remap_pfn_range with apply_to_page_range
  2008-11-13 19:53 ` Johannes Weiner
@ 2008-11-13 20:12   ` Jeremy Fitzhardinge
  2008-11-13 20:13   ` [PATCH 3/2] mm/remap_pfn_range: restore missing flush Jeremy Fitzhardinge
  1 sibling, 0 replies; 12+ messages in thread
From: Jeremy Fitzhardinge @ 2008-11-13 20:12 UTC (permalink / raw)
  To: Johannes Weiner
  Cc: Andrew Morton, Nick Piggin, Linux Memory Management List,
	Linux Kernel Mailing List, Pallipadi, Venkatesh

Johannes Weiner wrote:
>> -	do {
>> -		BUG_ON(!pte_none(*pte));
>>     
>
> Dropping by intention?
>   

Hm, I couldn't really see the point.   But I didn't really want to 
introduce any functional changes with this patch, so I'll add it back.

>> 	BUG_ON(addr >= end);
>> -	pfn -= addr >> PAGE_SHIFT;
>> -	pgd = pgd_offset(mm, addr);
>> -	flush_cache_range(vma, addr, end);
>>     
>
> Was the flushing redundant?  I can't spot it reappearing anywhere.

I guess it's needed for virtually indexed cache architectures; I'll add 
it back.

Thanks for reviewing.

    J


* [PATCH 3/2] mm/remap_pfn_range: restore missing flush
  2008-11-13 19:53 ` Johannes Weiner
  2008-11-13 20:12   ` Jeremy Fitzhardinge
@ 2008-11-13 20:13   ` Jeremy Fitzhardinge
  1 sibling, 0 replies; 12+ messages in thread
From: Jeremy Fitzhardinge @ 2008-11-13 20:13 UTC (permalink / raw)
  To: Johannes Weiner
  Cc: Andrew Morton, Nick Piggin, Linux Memory Management List,
	Linux Kernel Mailing List, Pallipadi, Venkatesh

Restore the cache flush and BUG_ON removed in the conversion to using
apply_to_page_range().

Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
---
 mm/memory.c |    4 ++++
 1 file changed, 4 insertions(+)

===================================================================
--- a/mm/memory.c
+++ b/mm/memory.c
@@ -1484,6 +1484,8 @@
 	struct remap_data *rmd = data;
 	pte_t pte = pte_mkspecial(pfn_pte(rmd->pfn++, rmd->prot));
 
+	BUG_ON(!pte_none(*ptep));
+
 	set_pte_at(rmd->mm, addr, ptep, pte);
 
 	return 0;
@@ -1535,6 +1537,8 @@
 
 	BUG_ON(addr >= end);
 
+	flush_cache_range(vma, addr, end);
+
 	rmd.mm = mm;
 	rmd.pfn = pfn;
 	rmd.prot = prot;



* Re: [PATCH 1/2] mm: implement remap_pfn_range with apply_to_page_range
  2008-11-13 17:19 [PATCH 1/2] mm: implement remap_pfn_range with apply_to_page_range Jeremy Fitzhardinge
  2008-11-13 19:53 ` Johannes Weiner
@ 2008-11-14  2:19 ` Nick Piggin
  2008-11-14  2:56   ` Jeremy Fitzhardinge
  1 sibling, 1 reply; 12+ messages in thread
From: Nick Piggin @ 2008-11-14  2:19 UTC (permalink / raw)
  To: Jeremy Fitzhardinge
  Cc: Andrew Morton, Linux Memory Management List,
	Linux Kernel Mailing List, Pallipadi, Venkatesh

On Friday 14 November 2008 04:19, Jeremy Fitzhardinge wrote:
> remap_pte_range() just wants to apply a function over a range of ptes
> corresponding to a virtual address range.  That's exactly what
> apply_to_page_range() does, so use it.

This isn't performance critical to anyone?

I see DRM, IB, GRU, and other media and video drivers use it.

It IS exactly what apply_to_page_range does, I grant you. But so does
our traditional set of nested loops. So is there any particular reason
to change it? You're not planning to change fork/exit next, are you? :)




* Re: [PATCH 1/2] mm: implement remap_pfn_range with apply_to_page_range
  2008-11-14  2:19 ` [PATCH 1/2] mm: implement remap_pfn_range with apply_to_page_range Nick Piggin
@ 2008-11-14  2:56   ` Jeremy Fitzhardinge
  2008-11-14  3:17     ` Nick Piggin
  0 siblings, 1 reply; 12+ messages in thread
From: Jeremy Fitzhardinge @ 2008-11-14  2:56 UTC (permalink / raw)
  To: Nick Piggin
  Cc: Andrew Morton, Linux Memory Management List,
	Linux Kernel Mailing List, Pallipadi, Venkatesh

Nick Piggin wrote:
>
> This isn't performance critical to anyone?
>   

The only difference should be between having the specialized code and an 
indirect function call, no?

> I see DRM, IB, GRU, other media and video drivers use it.
>
> It IS exactly what apply_to_page_range does, I grant you. But so does
> our traditional set of nested loops. So is there any particular reason
> to change it? You're not planning to change fork/exit next, are you? :)
>   

No ;)  But I need to have a more Xen-specific version of 
remap_pfn_range, and I wanted to 1) have the two versions look as 
similar as possible, and 2) not have a pile of duplicate code.

    J


* Re: [PATCH 1/2] mm: implement remap_pfn_range with apply_to_page_range
  2008-11-14  2:56   ` Jeremy Fitzhardinge
@ 2008-11-14  3:17     ` Nick Piggin
  2008-11-14  5:22       ` Jeremy Fitzhardinge
  0 siblings, 1 reply; 12+ messages in thread
From: Nick Piggin @ 2008-11-14  3:17 UTC (permalink / raw)
  To: Jeremy Fitzhardinge
  Cc: Andrew Morton, Linux Memory Management List,
	Linux Kernel Mailing List, Pallipadi, Venkatesh

On Friday 14 November 2008 13:56, Jeremy Fitzhardinge wrote:
> Nick Piggin wrote:
> > This isn't performance critical to anyone?
>
> The only difference should be between having the specialized code and an
> indirect function call, no?

Indirect function call per pte. It's going to be slower surely.


> > I see DRM, IB, GRU, other media and video drivers use it.
> >
> > It IS exactly what apply_to_page_range does, I grant you. But so does
> > our traditional set of nested loops. So is there any particular reason
> > to change it? You're not planning to change fork/exit next, are you? :)
>
> No ;)  But I need to have a more Xen-specific version of
> remap_pfn_range, and I wanted to 1) have the two versions look as
> similar as possible, and 2) not have a pile of duplicate code.

It is accepted practice to (carefully) duplicate the page table walking
functions in memory management code. I don't think that's a problem;
there are already so many instances of them (just be sure to stick to
exactly the same form and variable names, and any update or bugfix to
any of them is trivially applicable to all).


* Re: [PATCH 1/2] mm: implement remap_pfn_range with apply_to_page_range
  2008-11-14  3:17     ` Nick Piggin
@ 2008-11-14  5:22       ` Jeremy Fitzhardinge
  2008-11-14  7:35         ` Nick Piggin
  0 siblings, 1 reply; 12+ messages in thread
From: Jeremy Fitzhardinge @ 2008-11-14  5:22 UTC (permalink / raw)
  To: Nick Piggin
  Cc: Andrew Morton, Linux Memory Management List,
	Linux Kernel Mailing List, Pallipadi, Venkatesh

Nick Piggin wrote:
> On Friday 14 November 2008 13:56, Jeremy Fitzhardinge wrote:
>   
>> Nick Piggin wrote:
>>     
>>> This isn't performance critical to anyone?
>>>       
>> The only difference should be between having the specialized code and an
>> indirect function call, no?
>>     
>
> Indirect function call per pte. It's going to be slower surely.
>   

Yes, though changing the calling convention to handle (up to) a whole 
page worth of ptes in one call would be fairly simple I think.

> It is accepted practice to (carefully) duplicate the page table walking
> functions in memory management code. I don't think that's a problem;
> there are already so many instances of them (just be sure to stick to
> exactly the same form and variable names, and any update or bugfix to
> any of them is trivially applicable to all).
>   

I think that's pretty awful practice, frankly, and I'd much prefer there 
to be a single iterator function which everyone uses.  The open-coded 
iterators everywhere just make it completely impractical to even think 
about other kinds of pagetable structures.  (Of course we have at least 
two "general purpose" pagetable walkers now...)

    J


* Re: [PATCH 1/2] mm: implement remap_pfn_range with apply_to_page_range
  2008-11-14  5:22       ` Jeremy Fitzhardinge
@ 2008-11-14  7:35         ` Nick Piggin
  2008-11-14 18:04           ` Jeremy Fitzhardinge
  2008-11-17  3:03           ` Peter Chubb
  0 siblings, 2 replies; 12+ messages in thread
From: Nick Piggin @ 2008-11-14  7:35 UTC (permalink / raw)
  To: Jeremy Fitzhardinge
  Cc: Andrew Morton, Linux Memory Management List,
	Linux Kernel Mailing List, Pallipadi, Venkatesh

On Friday 14 November 2008 16:22, Jeremy Fitzhardinge wrote:
> Nick Piggin wrote:
> > On Friday 14 November 2008 13:56, Jeremy Fitzhardinge wrote:
> >> Nick Piggin wrote:
> >>> This isn't performance critical to anyone?
> >>
> >> The only difference should be between having the specialized code and an
> >> indirect function call, no?
> >
> > Indirect function call per pte. It's going to be slower surely.
>
> Yes, though changing the calling convention to handle (up to) a whole
> page worth of ptes in one call would be fairly simple I think.

Yep. And leaving it alone is even simpler and still faster :)


> > It is accepted practice to (carefully) duplicate the page table walking
> > functions in memory management code. I don't think that's a problem;
> > there are already so many instances of them (just be sure to stick to
> > exactly the same form and variable names, and any update or bugfix to
> > any of them is trivially applicable to all).
>
> I think that's pretty awful practice, frankly, and I'd much prefer there
> to be a single iterator function which everyone uses.

I think it's pretty nice. It means you can make the loops fairly
optimal even if they might have slightly different requirements
(different arguments, latency breaks, copy_page_range, etc.).


> The open-coded 
> iterators everywhere just makes it completely impractical to even think
> about other kinds of pagetable structures.  (Of course we have at least
> two "general purpose" pagetable walkers now...)

I think that's being way over dramatic. When switching to a
different page table structure, I assure you that copying and
pasting your new walking algorithm a few times will be the least
of your worries :)

It's not meant to be pluggable. Actually this came up last, I think,
when UNSW wanted to add page table accessors to abstract this.
They came up with a good set of things, but in the end you can't
justify slowing things down in these paths unless you actually have
a replacement page table structure that gets you a *net win*. So
far, I haven't heard from them again.

No, adding a cycle here or an indirect function call there IMO is
not acceptable in core mm/ code without a good reason.


* Re: [PATCH 1/2] mm: implement remap_pfn_range with apply_to_page_range
  2008-11-14  7:35         ` Nick Piggin
@ 2008-11-14 18:04           ` Jeremy Fitzhardinge
  2008-11-15  9:28             ` Hugh Dickins
  2008-11-17  3:03           ` Peter Chubb
  1 sibling, 1 reply; 12+ messages in thread
From: Jeremy Fitzhardinge @ 2008-11-14 18:04 UTC (permalink / raw)
  To: Nick Piggin
  Cc: Andrew Morton, Linux Memory Management List,
	Linux Kernel Mailing List, Pallipadi, Venkatesh

Nick Piggin wrote:
> No, adding a cycle here or an indirect function call there IMO is
> not acceptable in core mm/ code without a good reason.
>   

<shrug> OK.

    J


* Re: [PATCH 1/2] mm: implement remap_pfn_range with apply_to_page_range
  2008-11-14 18:04           ` Jeremy Fitzhardinge
@ 2008-11-15  9:28             ` Hugh Dickins
  0 siblings, 0 replies; 12+ messages in thread
From: Hugh Dickins @ 2008-11-15  9:28 UTC (permalink / raw)
  To: Andrew Morton
  Cc: Jeremy Fitzhardinge, Nick Piggin, Linux Memory Management List,
	Linux Kernel Mailing List, Pallipadi, Venkatesh, Johannes Weiner

On Fri, 14 Nov 2008, Jeremy Fitzhardinge wrote:
> Nick Piggin wrote:
> > No, adding a cycle here or an indirect function call there IMO is
> > not acceptable in core mm/ code without a good reason.
> 
> <shrug> OK.

I'm with Nick on this: admittedly remap_pfn_range() is a borderline
case (since it has no latency breaks at present), but it is a core
mm function, and I'd prefer we leave it as is unless good reason.

So, no hurry, but I'd prefer 

mm-implement-remap_pfn_range-with-apply_to_page_range.patch
mm-remap_pfn_range-restore-missing-flush.patch

to be removed from mmotm - and don't I deserve that,
just for actually reading the mm-commits boilerplate ;-?

Hugh


* Re: [PATCH 1/2] mm: implement remap_pfn_range with apply_to_page_range
  2008-11-14  7:35         ` Nick Piggin
  2008-11-14 18:04           ` Jeremy Fitzhardinge
@ 2008-11-17  3:03           ` Peter Chubb
  1 sibling, 0 replies; 12+ messages in thread
From: Peter Chubb @ 2008-11-17  3:03 UTC (permalink / raw)
  To: Nick Piggin
  Cc: Jeremy Fitzhardinge, Andrew Morton, Linux Memory Management List,
	Linux Kernel Mailing List, Pallipadi, Venkatesh

>>>>> "Nick" == Nick Piggin <nickpiggin@yahoo.com.au> writes:


Nick> It's not meant to be pluggable. Actually this came up last I
Nick> think when the UNSW wanted to add page table accessors to
Nick> abstract this.  They came up with a good set of things, but in
Nick> the end you can't justify slowing things down in these paths
Nick> unless you actually have a replacement page table structure that
Nick> gets you a *net win*. So far, I haven't heard from them again.

We tried hard.  The slowdown wasn't all that much on system
benchmarks, but you could see it on the microbenchmarks.  And it made
stuff a LOT cleaner.

We wanted it to put super-page friendly pagetables in for
architectures not wedded to the 3/4 level x86 pagetable hardware
format. 

The people I had working on this left (finished a masters,
finished a PhD, contract ran out and no more money), and I haven't had
the manpower to maintain the patchset, especially after the negative
responses we got from linux-MM.

--
Dr Peter Chubb  http://www.gelato.unsw.edu.au  peterc AT gelato.unsw.edu.au
http://www.ertos.nicta.com.au           ERTOS within National ICT Australia


end of thread, other threads:[~2008-11-17  3:03 UTC | newest]

Thread overview: 12+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2008-11-13 17:19 [PATCH 1/2] mm: implement remap_pfn_range with apply_to_page_range Jeremy Fitzhardinge
2008-11-13 19:53 ` Johannes Weiner
2008-11-13 20:12   ` Jeremy Fitzhardinge
2008-11-13 20:13   ` [PATCH 3/2] mm/remap_pfn_range: restore missing flush Jeremy Fitzhardinge
2008-11-14  2:19 ` [PATCH 1/2] mm: implement remap_pfn_range with apply_to_page_range Nick Piggin
2008-11-14  2:56   ` Jeremy Fitzhardinge
2008-11-14  3:17     ` Nick Piggin
2008-11-14  5:22       ` Jeremy Fitzhardinge
2008-11-14  7:35         ` Nick Piggin
2008-11-14 18:04           ` Jeremy Fitzhardinge
2008-11-15  9:28             ` Hugh Dickins
2008-11-17  3:03           ` Peter Chubb
