Re: [PATCH 06/17] arm: mmu_gather rework

From: Chris Metcalf <cmetcalf@tilera.com>
To: Peter Zijlstra <peterz@infradead.org>
Cc: Russell King <rmk@arm.linux.org.uk>,
	Andrea Arcangeli <aarcange@redhat.com>,
	Avi Kivity <avi@redhat.com>, Thomas Gleixner <tglx@linutronix.de>,
	Rik van Riel <riel@redhat.com>, Ingo Molnar <mingo@elte.hu>,
	akpm@linux-foundation.org,
	Linus Torvalds <torvalds@linux-foundation.org>,
	linux-kernel@vger.kernel.org, linux-arch@vger.kernel.org,
	linux-mm@kvack.org,
	Benjamin Herrenschmidt <benh@kernel.crashing.org>,
	David Miller <davem@davemloft.net>,
	Hugh Dickins <hugh.dickins@tiscali.co.uk>,
	Mel Gorman <mel@csn.ul.ie>, Nick Piggin <npiggin@kernel.dk>,
	Paul McKenney <paulmck@linux.vnet.ibm.com>,
	Yanmin Zhang <yanmin_zhang@linux.intel.com>,
	"Luck,Tony" <tony.luck@intel.com>,
	PaulMundt <lethal@linux-sh.org>
Subject: Re: [PATCH 06/17] arm: mmu_gather rework
Date: Tue, 1 Mar 2011 17:05:40 -0500
Message-ID: <4D6D6DB4.5020603@tilera.com>
In-Reply-To: <1298893487.2428.10537.camel@twins>

On 2/28/2011 6:44 AM, Peter Zijlstra wrote:
> [...]
> Now, if we want to track ranges _and_ have hardware page table walkers
> (ARM seems to be one such), we must flush TLBs at tlb_end_vma() because
> flush_tlb_range() requires a vma pointer (ARM and TILE actually use more
> than ->vm_mm), and on tlb_finish_mmu() issue a full mm wide invalidate
> because the hardware walker could have re-populated the cache after
> clearing the PTEs but before freeing the page tables.
>
> What ARM does is it retains the last vma pointer and tracks
> pte_free_tlb() range and uses that in tlb_finish_mmu(), which is a tad
> hacky.
>
> Mostly because of shift_arg_pages(), where we have:
>
>   shift_arg_pages()
>     tlb_gather_mmu()
>     free_*_range()
>     tlb_finish_mmu()
>
> For which ARM now punts and does a full tlb invalidate (no vma pointer).
> But also because it drags along that vma pointer, which might not at all
> match the range it's actually going to invalidate (and hence its vm_flags
> might not accurately reflect things -- at worst more expensive than
> needed).
>
> The reason I wanted flush_tlb_range() to take an mm_struct and not the
> current vm_area_struct is because we can avoid doing the
> flush_tlb_range() from tlb_end_vma() and delay the thing until
> tlb_finish_mmu() without having to resort to such games as above. We
> could simply track the full range over all VMAs and free'd page-tables
> and do one range invalidate.
>
> ARM uses vm_flags & VM_EXEC to see if it also needs to invalidate
> I-TLBs, and TILE uses VM_EXEC and VM_HUGETLB.
>
> For the I-TLBs we could easily use
> ptep_get_and_clear_full()/tlb_remove_tlb_entry() and see if any of the
> cleared pte's had its executable bit set (both ARM and TILE seem to have
> such a PTE bit).

For Tile, the concern is invalidating the i-cache, not the I-TLB.  The
I-TLB is handled by the regular TLB flush just fine, as on the other
architectures.  But once we have cleared the page table entries and
invalidated the TLBs, we still have to deal with i-cache lines in any core
that may have run code from that page.  The risk is that the kernel might
free, reallocate, and then run code from one of those pages, all before the
stale i-cache lines happen to be evicted.

The current Tile code flushes the i-cache explicitly at two different times:

1. Whenever we flush the TLB, since this is a point at which we know which
cpus might currently be using the page (via cpu_vm_mask) and we can flush
all of them easily, piggybacking on the infrastructure we use to flush
remote TLBs.

2. Whenever we context switch, to handle the case where cpu 1 is running
process A, then switches to B, while another cpu still running process A
unmaps an executable page that was in cpu 1's i-cache.  This way, when
cpu 1 switches back to A, it doesn't have to worry about any unmaps that
occurred while it was switched out.
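
To make the interaction between the two flush points concrete, here is a
minimal userspace model, not the actual Tile code.  Everything in it is
hypothetical and for illustration only: the cpu_model struct, the single
tracked executable page, the flush_on_switch toggle, and the assumption
that cpu_vm_mask holds only the cpus currently running a given mm.

```c
#include <stdbool.h>

#define NR_CPUS 4
enum { MM_IDLE, MM_A, MM_B, NR_MMS };

/* Hypothetical per-cpu model; tracks a single executable page. */
struct cpu_model {
	int mm;                  /* mm currently running on this cpu */
	bool icache_holds_page;  /* i-cache may hold lines from that page */
};

static struct cpu_model cpus[NR_CPUS];
/* Assumption: bit n is set only while cpu n is running that mm. */
static unsigned int cpu_vm_mask[NR_MMS];
static bool flush_on_switch = true;  /* toggles flush point 2 */

static void reset_model(void)
{
	for (int mm = 0; mm < NR_MMS; mm++)
		cpu_vm_mask[mm] = 0;
	for (int cpu = 0; cpu < NR_CPUS; cpu++) {
		cpus[cpu].mm = MM_IDLE;
		cpus[cpu].icache_holds_page = false;
		cpu_vm_mask[MM_IDLE] |= 1u << cpu;
	}
}

static void run_code_from_page(int cpu)
{
	cpus[cpu].icache_holds_page = true;
}

/* Flush point 2: drop the local i-cache on every context switch. */
static void context_switch(int cpu, int next_mm)
{
	cpu_vm_mask[cpus[cpu].mm] &= ~(1u << cpu);
	cpu_vm_mask[next_mm] |= 1u << cpu;
	cpus[cpu].mm = next_mm;
	if (flush_on_switch)
		cpus[cpu].icache_holds_page = false;
}

/* Flush point 1: at unmap/TLB-flush time, flush every cpu in the mask. */
static void unmap_exec_page(int mm)
{
	for (int cpu = 0; cpu < NR_CPUS; cpu++)
		if (cpu_vm_mask[mm] & (1u << cpu))
			cpus[cpu].icache_holds_page = false;
}

/* After an unmap, any cpu still holding lines from the page is stale. */
static bool any_stale_line(void)
{
	for (int cpu = 0; cpu < NR_CPUS; cpu++)
		if (cpus[cpu].icache_holds_page)
			return true;
	return false;
}
```

In this model, with flush_on_switch disabled, a cpu that ran A, switched
to B, and missed the unmap keeps its stale lines, which is the case the
second flush point is there to cover.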

> I'm not sure what we can do about TILE's VM_HUGETLB usage though, if it
> needs explicit flushes for huge ptes it might just have to issue
> multiple tlb invalidates and do them from tlb_start_vma()/tlb_end_vma().

I'm not too concerned about this.  We can make the flush code check both
page sizes; the extra cost is small relative to the overall cost of
global TLB invalidation.
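
As a rough sketch of what "check both page sizes" could look like, the
fragment below walks the range once per page size instead of consulting
VM_HUGETLB.  The function names, the invalidate counter, and the two page
sizes are illustrative assumptions, not the real Tile implementation, and
the loop ignores address wrap-around at the top of the address space.

```c
/* Illustrative sizes only; the real values are configuration dependent. */
#define SMALL_PAGE_SIZE (64UL * 1024)
#define HUGE_PAGE_SIZE  (16UL * 1024 * 1024)

/* Stand-in for the hardware operation: just count invalidates. */
static unsigned long invalidates_issued;

/* Invalidate every page of the given size that overlaps [start, end). */
static void flush_one_size(unsigned long start, unsigned long end,
			   unsigned long page_size)
{
	unsigned long addr = start & ~(page_size - 1);  /* round down */

	for (; addr < end; addr += page_size)
		invalidates_issued++;
}

/* No vma needed: cover both sizes rather than testing VM_HUGETLB. */
static void flush_range_any_size(unsigned long start, unsigned long end)
{
	flush_one_size(start, end, SMALL_PAGE_SIZE);
	flush_one_size(start, end, HUGE_PAGE_SIZE);
}
```

The huge-page pass adds only one extra invalidate per huge page spanned by
the range, which is why the overhead stays small next to the small-page
pass.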

>   CONFIG_HAVE_MMU_GATHER_ITLB - will use
> ptep_get_and_clear_full()/tlb_remove_tlb_entry() to test pte_exec() and
> issue flush_itlb_range(mm,start,end).

So it sounds like the proposal for tile would be to piggy-back on
flush_itlb_range() and use it to flush the i-cache?  It does seem like
there must be other Linux architectures with an incoherent i-cache out
there, and some existing solution we could just repurpose.
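
For reference, the scheme under discussion might be modelled roughly as
below: accumulate one range across all cleared entries, remember whether
any cleared pte was executable, and issue the instruction-side flush only
in that case.  This is a userspace sketch under stated assumptions; the
struct gather type, the gather_*() helpers, and the two counters are
hypothetical stand-ins and do not correspond to the real mmu_gather API.

```c
#include <stdbool.h>

/* Hypothetical stand-in for the range-tracking part of a gather. */
struct gather {
	unsigned long start, end;  /* accumulated range; empty if start >= end */
	bool saw_exec;             /* any cleared pte had its exec bit set */
};

/* Stand-ins for the real flush operations: just count calls. */
static int tlb_range_flushes;
static int icache_range_flushes;

static void gather_init(struct gather *g)
{
	g->start = ~0UL;
	g->end = 0;
	g->saw_exec = false;
}

/* Called once per cleared pte; widens the range and notes exec ptes. */
static void gather_remove_entry(struct gather *g, unsigned long addr,
				unsigned long size, bool pte_was_exec)
{
	if (addr < g->start)
		g->start = addr;
	if (addr + size > g->end)
		g->end = addr + size;
	if (pte_was_exec)
		g->saw_exec = true;
}

/* One range invalidate at the end; instruction-side flush only if needed. */
static void gather_finish(struct gather *g)
{
	if (g->start >= g->end)
		return;  /* nothing was gathered */
	tlb_range_flushes++;
	if (g->saw_exec)
		icache_range_flushes++;
	gather_init(g);
}
```

On Tile the second counter would correspond to an i-cache flush rather
than an I-TLB flush, which is the repurposing asked about above.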

-- 
Chris Metcalf, Tilera Corp.
http://www.tilera.com
