From: Nick Piggin <nickpiggin@yahoo.com.au>
To: Jeremy Fitzhardinge <jeremy@goop.org>
Cc: Andrew Morton <akpm@linux-foundation.org>,
Linux Kernel Mailing List <linux-kernel@vger.kernel.org>,
Linux Memory Management List <linux-mm@kvack.org>,
the arch/x86 maintainers <x86@kernel.org>,
Arjan van de Ven <arjan@linux.intel.com>
Subject: Re: [PATCH RFC] vm_unmap_aliases: allow callers to inhibit TLB flush
Date: Mon, 23 Feb 2009 20:13:42 +1100
Message-ID: <200902232013.43054.nickpiggin@yahoo.com.au>
In-Reply-To: <49A25086.30606@goop.org>

On Monday 23 February 2009 18:30:14 Jeremy Fitzhardinge wrote:
> Nick Piggin wrote:
> > On Friday 20 February 2009 06:11:32 Jeremy Fitzhardinge wrote:
> >> Nick Piggin wrote:
> >>> Then what is the point of vm_unmap_aliases? If you are doing it
> >>> for security it won't work, because other CPUs might still be able
> >>> to write through dangling TLB entries. If you are not doing it for
> >>> security then it does not need to be done at all.
> >>
> >> Xen will make sure any dangling tlb entries are flushed before handing
> >> the page out to anyone else.
> >>
> >>> Unless it is something strange that Xen does with the page table
> >>> structure and you just need to get rid of those?
> >>
> >> Yeah. A pte pointing at a page holds a reference on it, saying that it
> >> belongs to the domain. You can't return it to Xen until the refcount is
> >> 0.
> >
> > OK. Then I will remember to find some time to get the interrupt-safe
> > patches working. I wonder why you can't just return it to
> > Xen when (or have Xen hold it somewhere until) the refcount
> > reaches 0?
>
> It would still need to allocate a page in the meantime, which could fail
> because the domain has hit its hard memory limit (which will be the
> common case, because a domain generally starts with its full complement
> of memory). The nice thing about the exchange is that there's no
> accounting to deal with.
OK, well I don't really understand the details but I trust you if
you say it's hard :)
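
To spell out my understanding for the archives (a sketch only;
give_page_back_to_xen() and xen_return_page() are made-up names, not
real interfaces), the reason the alias flush has to happen before a
page can leave the domain would be something like:

#include <linux/mm.h>
#include <linux/vmalloc.h>

/* Made-up stand-in for whatever hypercall path hands the page back. */
extern void xen_return_page(struct page *page);

/*
 * A lazily-unmapped vmap area leaves kernel ptes pointing at its
 * pages, and under Xen each such pte holds a reference, so a page
 * cannot be handed back to the hypervisor while an alias survives.
 */
static void give_page_back_to_xen(struct page *page)
{
	/* Tear down any dangling lazy-vmap aliases and their ptes. */
	vm_unmap_aliases();

	/* No kernel pte references the page now; Xen will accept it. */
	xen_return_page(page);
}
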
> >>> Or... what if we just allow a compile- and/or boot-time flag saying
> >>> the arch does not want lazy vmap unmapping, so it just reverts to
> >>> synchronous unmapping? If Xen needs lots of flushing anyway, it
> >>> might not be a win.
> >>
> >> That may be worth considering.
> >
> > ... in the meantime, shall we just do this for Xen? It is probably
> > safer and may end up with no worse performance on Xen anyway. If
> > we get more vmap users and it becomes important, you could look at
> > more sophisticated ways of doing this. Eg. a page could be flagged
> > if it potentially has lazy vmaps.
>
> OK. Do you want to do the patch, or shall I?
Here's a start for you. I think it gets rid of all the dead code and
data without introducing any actual conditional compilation (it leans
on the usual if (CONSTANT) folding; there's a small illustration of
that idiom after the patch)...
---
mm/vmalloc.c | 66 ++++++++++++++++++++++++++++++++++++++++++-----------------
1 file changed, 48 insertions(+), 18 deletions(-)
Index: linux-2.6/mm/vmalloc.c
===================================================================
--- linux-2.6.orig/mm/vmalloc.c
+++ linux-2.6/mm/vmalloc.c
@@ -29,6 +29,11 @@
#include <asm/uaccess.h>
#include <asm/tlbflush.h>
+#ifdef CONFIG_VMAP_NO_LAZY_FLUSH
+#define VMAP_LAZY_FLUSHES 0
+#else
+#define VMAP_LAZY_FLUSHES 1
+#endif
/*** Page table manipulation functions ***/
@@ -376,7 +381,7 @@ retry:
found:
if (addr + size > vend) {
spin_unlock(&vmap_area_lock);
- if (!purged) {
+ if (VMAP_LAZY_FLUSHES && !purged) {
purge_vmap_area_lazy();
purged = 1;
goto retry;
@@ -413,7 +418,10 @@ static void __free_vmap_area(struct vmap
RB_CLEAR_NODE(&va->rb_node);
list_del_rcu(&va->list);
- call_rcu(&va->rcu_head, rcu_free_va);
+ if (VMAP_LAZY_FLUSHES)
+ call_rcu(&va->rcu_head, rcu_free_va);
+ else
+ kfree(va);
}
/*
@@ -450,8 +458,10 @@ static void vmap_debug_free_range(unsign
* faster).
*/
#ifdef CONFIG_DEBUG_PAGEALLOC
- vunmap_page_range(start, end);
- flush_tlb_kernel_range(start, end);
+ if (VMAP_LAZY_FLUSHES) {
+ vunmap_page_range(start, end);
+ flush_tlb_kernel_range(start, end);
+ }
#endif
}
@@ -571,10 +581,16 @@ static void purge_vmap_area_lazy(void)
*/
static void free_unmap_vmap_area_noflush(struct vmap_area *va)
{
- va->flags |= VM_LAZY_FREE;
- atomic_add((va->va_end - va->va_start) >> PAGE_SHIFT, &vmap_lazy_nr);
- if (unlikely(atomic_read(&vmap_lazy_nr) > lazy_max_pages()))
- try_purge_vmap_area_lazy();
+ if (VMAP_LAZY_FLUSHES) {
+ va->flags |= VM_LAZY_FREE;
+ atomic_add((va->va_end - va->va_start) >> PAGE_SHIFT,
+ &vmap_lazy_nr);
+ if (unlikely(atomic_read(&vmap_lazy_nr) > lazy_max_pages()))
+ try_purge_vmap_area_lazy();
+ } else {
+ vunmap_page_range(va->va_start, va->va_end);
+ flush_tlb_kernel_range(va->va_start, va->va_end);
+ }
}
/*
@@ -610,6 +626,15 @@ static void free_unmap_vmap_area_addr(un
/*** Per cpu kva allocator ***/
/*
+ * This does lazy flushing as well, so don't call it if the arch doesn't want
+ * lazy vmap kva flushes... The scalability aspect should be less important
+ * in that case anyway seeing as kernel tlb flushing tends not to be scalable.
+ * It would be possible to make this work without lazy tlb flushing if it
+ * was really a big deal.
+ */
+
+
+/*
* vmap space is limited especially on 32 bit architectures. Ensure there is
* room for at least 16 percpu vmap blocks per CPU.
*/
@@ -877,6 +902,9 @@ void vm_unmap_aliases(void)
int cpu;
int flush = 0;
+ if (!VMAP_LAZY_FLUSHES)
+ return;
+
if (unlikely(!vmap_initialized))
return;
@@ -937,7 +965,7 @@ void vm_unmap_ram(const void *mem, unsig
debug_check_no_locks_freed(mem, size);
vmap_debug_free_range(addr, addr+size);
- if (likely(count <= VMAP_MAX_ALLOC))
+ if (VMAP_LAZY_FLUSHES && likely(count <= VMAP_MAX_ALLOC))
vb_free(mem, size);
else
free_unmap_vmap_area_addr(addr);
@@ -959,7 +987,7 @@ void *vm_map_ram(struct page **pages, un
unsigned long addr;
void *mem;
- if (likely(count <= VMAP_MAX_ALLOC)) {
+ if (VMAP_LAZY_FLUSHES && likely(count <= VMAP_MAX_ALLOC)) {
mem = vb_alloc(size, GFP_KERNEL);
if (IS_ERR(mem))
return NULL;
@@ -988,14 +1016,16 @@ void __init vmalloc_init(void)
struct vm_struct *tmp;
int i;
- for_each_possible_cpu(i) {
- struct vmap_block_queue *vbq;
-
- vbq = &per_cpu(vmap_block_queue, i);
- spin_lock_init(&vbq->lock);
- INIT_LIST_HEAD(&vbq->free);
- INIT_LIST_HEAD(&vbq->dirty);
- vbq->nr_dirty = 0;
+ if (VMAP_LAZY_FLUSHES) {
+ for_each_possible_cpu(i) {
+ struct vmap_block_queue *vbq;
+
+ vbq = &per_cpu(vmap_block_queue, i);
+ spin_lock_init(&vbq->lock);
+ INIT_LIST_HEAD(&vbq->free);
+ INIT_LIST_HEAD(&vbq->dirty);
+ vbq->nr_dirty = 0;
+ }
}
/* Import existing vmlist entries. */
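
In case the trick isn't obvious: with VMAP_LAZY_FLUSHES defined to a
compile-time 0, the compiler folds every if (VMAP_LAZY_FLUSHES) branch
and discards the dead arm, so the lazy-path code and data disappear
without any #ifdef'd bodies. A stand-alone illustration (made-up helper
names, deliberately not kernel code):

#include <stdio.h>

#define VMAP_LAZY_FLUSHES 0	/* as if CONFIG_VMAP_NO_LAZY_FLUSH were set */

static void lazy_flush(void)	/* stand-in for the batched lazy path */
{
	printf("queued for a batched TLB flush\n");
}

static void sync_flush(void)	/* stand-in for the synchronous path */
{
	printf("unmapped and flushed immediately\n");
}

static void free_area(void)
{
	/*
	 * The condition is a compile-time constant, so the compiler
	 * removes the dead arm entirely; anything reachable only from
	 * there is dropped as well, with no conditional compilation.
	 */
	if (VMAP_LAZY_FLUSHES)
		lazy_flush();
	else
		sync_flush();
}

int main(void)
{
	free_area();
	return 0;
}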