From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: from mail191.messagelabs.com (mail191.messagelabs.com [216.82.242.19])
	by kanga.kvack.org (Postfix) with ESMTP id DBF7F6B003D
	for ; Tue, 10 Mar 2009 10:50:10 -0400 (EDT)
Date: Tue, 10 Mar 2009 14:49:55 +0000 (GMT)
From: Hugh Dickins
Subject: Re: [PATCH] [ARM] Flush only the needed range when unmapping a VMA
In-Reply-To: <1236690093-3037-1-git-send-email-Aaro.Koskinen@nokia.com>
Message-ID:
References: <49B54B2A.9090408@nokia.com> <1236690093-3037-1-git-send-email-Aaro.Koskinen@nokia.com>
MIME-Version: 1.0
Content-Type: TEXT/PLAIN; charset=US-ASCII
Sender: owner-linux-mm@kvack.org
To: Aaro Koskinen
Cc: linux-arm-kernel@lists.arm.linux.org.uk, linux-mm@kvack.org
List-ID:

On Tue, 10 Mar 2009, Aaro Koskinen wrote:

> When unmapping N pages (e.g. shared memory), the number of TLB flushes
> done can be (N*PAGE_SIZE/ZAP_BLOCK_SIZE)*N, although it should be at
> most N. With a PREEMPT kernel, ZAP_BLOCK_SIZE is 8 pages, so there is a
> noticeable performance penalty when unmapping a large VMA, with the
> system spending its time in flush_tlb_range().
>
> The problem is that tlb_end_vma() always flushes the full VMA range.
> The subrange that actually needs to be flushed can be recorded by
> tlb_remove_tlb_entry(). This approach was suggested by Hugh Dickins,
> and is also used by other arches.
>
> Signed-off-by: Aaro Koskinen
> Cc: Hugh Dickins
> Cc: linux-mm@kvack.org
> ---

Looks good to me:
Acked-by: Hugh Dickins

>
> For earlier discussion, see:
> http://marc.info/?t=123609820900002&r=1&w=2
> http://marc.info/?t=123660375800003&r=1&w=2
>
>  arch/arm/include/asm/tlb.h |   25 +++++++++++++++++++++----
>  1 files changed, 21 insertions(+), 4 deletions(-)
>
> diff --git a/arch/arm/include/asm/tlb.h b/arch/arm/include/asm/tlb.h
> index 857f1df..321c83e 100644
> --- a/arch/arm/include/asm/tlb.h
> +++ b/arch/arm/include/asm/tlb.h
> @@ -36,6 +36,8 @@
>  struct mmu_gather {
>          struct mm_struct *mm;
>          unsigned int fullmm;
> +        unsigned long range_start;
> +        unsigned long range_end;
>  };
>
>  DECLARE_PER_CPU(struct mmu_gather, mmu_gathers);
> @@ -63,7 +65,19 @@ tlb_finish_mmu(struct mmu_gather *tlb, unsigned long start, unsigned long end)
>          put_cpu_var(mmu_gathers);
>  }
>
> -#define tlb_remove_tlb_entry(tlb,ptep,address) do { } while (0)
> +/*
> + * Memorize the range for the TLB flush.
> + */
> +static inline void
> +tlb_remove_tlb_entry(struct mmu_gather *tlb, pte_t *ptep, unsigned long addr)
> +{
> +        if (!tlb->fullmm) {
> +                if (addr < tlb->range_start)
> +                        tlb->range_start = addr;
> +                if (addr + PAGE_SIZE > tlb->range_end)
> +                        tlb->range_end = addr + PAGE_SIZE;
> +        }
> +}
>
>  /*
>   * In the case of tlb vma handling, we can optimise these away in the
> @@ -73,15 +87,18 @@ tlb_finish_mmu(struct mmu_gather *tlb, unsigned long start, unsigned long end)
>  static inline void
>  tlb_start_vma(struct mmu_gather *tlb, struct vm_area_struct *vma)
>  {
> -        if (!tlb->fullmm)
> +        if (!tlb->fullmm) {
>                  flush_cache_range(vma, vma->vm_start, vma->vm_end);
> +                tlb->range_start = TASK_SIZE;
> +                tlb->range_end = 0;
> +        }
>  }
>
>  static inline void
>  tlb_end_vma(struct mmu_gather *tlb, struct vm_area_struct *vma)
>  {
> -        if (!tlb->fullmm)
> -                flush_tlb_range(vma, vma->vm_start, vma->vm_end);
> +        if (!tlb->fullmm && tlb->range_end > 0)
> +                flush_tlb_range(vma, tlb->range_start, tlb->range_end);
>  }
>
>  #define tlb_remove_page(tlb,page) free_page_and_swap_cache(page)
> --
> 1.5.4.3
>

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to
majordomo@kvack.org.
For more info on Linux MM, see: http://www.linux-mm.org/ .
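
To put numbers on the commit message's formula: with a PREEMPT kernel
(ZAP_BLOCK_SIZE = 8 pages), unmapping a 1024-page VMA proceeds in
1024/8 = 128 zap blocks, and since the old tlb_end_vma() flushed the
whole VMA after each block, that is on the order of 128 * 1024 = 131072
per-page flushes where roughly 1024 would do.

The sketch below illustrates the range-accumulation bookkeeping from the
patch as a stand-alone user-space C program; it is only an illustration,
not kernel code. struct vm_area, the flush_tlb_range() printf() stub,
the fixed PAGE_SIZE/TASK_SIZE constants and the main() driver are toy
stand-ins, and tlb_remove_tlb_entry() drops the pte_t * argument it
takes in the real header.

/*
 * Minimal user-space sketch of the range tracking added by the patch.
 * Everything below is a stand-in so the accumulation logic can be run
 * and checked in isolation; it is not the kernel's code.
 */
#include <stdio.h>

#define PAGE_SIZE 4096UL
#define TASK_SIZE (~0UL)                /* stand-in for the real TASK_SIZE */

struct vm_area {                        /* toy vm_area_struct */
        unsigned long vm_start;
        unsigned long vm_end;
};

struct mmu_gather {                     /* mirrors the fields added by the patch */
        unsigned int fullmm;
        unsigned long range_start;
        unsigned long range_end;
};

/* Stub: the real kernel issues TLB invalidations for [start, end) here. */
static void flush_tlb_range(struct vm_area *vma,
                            unsigned long start, unsigned long end)
{
        printf("flush [%#lx, %#lx): %lu page(s)\n",
               start, end, (end - start) / PAGE_SIZE);
        (void)vma;
}

static void tlb_start_vma(struct mmu_gather *tlb, struct vm_area *vma)
{
        if (!tlb->fullmm) {
                /* Reset to an empty range; cache flushing omitted in this sketch. */
                tlb->range_start = TASK_SIZE;
                tlb->range_end = 0;
        }
        (void)vma;
}

/* Grow the pending flush range to cover one unmapped page. */
static void tlb_remove_tlb_entry(struct mmu_gather *tlb, unsigned long addr)
{
        if (!tlb->fullmm) {
                if (addr < tlb->range_start)
                        tlb->range_start = addr;
                if (addr + PAGE_SIZE > tlb->range_end)
                        tlb->range_end = addr + PAGE_SIZE;
        }
}

static void tlb_end_vma(struct mmu_gather *tlb, struct vm_area *vma)
{
        /* Flush only the accumulated subrange, and only if anything was unmapped. */
        if (!tlb->fullmm && tlb->range_end > 0)
                flush_tlb_range(vma, tlb->range_start, tlb->range_end);
}

int main(void)
{
        struct vm_area vma = { .vm_start = 0x40000000UL,
                               .vm_end   = 0x40000000UL + 64 * PAGE_SIZE };
        struct mmu_gather tlb = { .fullmm = 0 };

        /* Unmap only pages 4..7 of the 64-page VMA. */
        tlb_start_vma(&tlb, &vma);
        for (unsigned long a = vma.vm_start + 4 * PAGE_SIZE;
             a < vma.vm_start + 8 * PAGE_SIZE; a += PAGE_SIZE)
                tlb_remove_tlb_entry(&tlb, a);
        tlb_end_vma(&tlb, &vma);        /* flushes 4 pages, not 64 */

        return 0;
}

Compiled and run, this prints a flush of the 4-page subrange
[0x40004000, 0x40008000) rather than the whole 64-page toy VMA, which is
the saving the patch provides in the kernel.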