[PATCH] Avoid unecessary zone spinlocking on refill_inactive_zone()

linux-mm.kvack.org archive mirror
 help / color / mirror / Atom feed

From: Marcelo Tosatti <marcelo.tosatti@cyclades.com>
To: linux-mm@kvack.org, akpm@osdl.org
Subject: [PATCH] Avoid unecessary zone spinlocking on refill_inactive_zone()
Date: Fri, 27 Aug 2004 21:55:50 -0300	[thread overview]
Message-ID: <20040828005550.GC4482@logos.cnet> (raw)

Hi, 

As I've noticed in previous to the list that refill_inactive_zone does 
unecessary spinlocking by calling __release_pages() twice (while processing
l_active and later on l_inactive).

The waste happens:

1) It drops the zone lock, calls __release_pages(), which in turn call release_page(), which 
locks the same lock again. 
2) It unlocks the zone lock at the end of the function even if there is no page to be freed (which
needs to be pagevec_free'd).

Atomic operations which lock the bus are expensive, and we can avoid those.

The following patch implements a release_pages_nolock/release_pages_nl pair
of function (not the best names in the world) to take that into account 
and avoid these unecessary atomic operations. 

This is a fast path on pagecache intensive workloads (as the comment on top
of the function says), so I believe such optimization is worth.

Its important to benchmark it carefully (it survives my stress tests on multiprocessor box).

On a side note, the current accounting of inactive/active pages is broken 
in refill_inactive_zone (due to pages being freed in __release_pages). 
I plan to fix that tomorrow - should be easy as returning the number of pages
freed in __release_pages and take that into account.

Comments are appreciated.

This is against 2.6.8.1. 

diff -Nur --exclude='*.cmd' linux-2.6.7/mm/swap.c linux-2.6.7-vmscan/mm/swap.c
--- linux-2.6.8/mm/swap.c	2004-08-27 23:32:47.708904528 -0300
+++ linux-2.6.8-vmscan/mm/swap.c	2004-08-27 23:35:34.152601240 -0300
@@ -192,6 +192,49 @@
 
 EXPORT_SYMBOL(__page_cache_release);
 
+
+void release_pages_nolock(struct page **pages, int nr, struct zone *zone)
+{
+	int i;
+	struct pagevec pages_to_free;
+	int to_free = 0;
+
+
+	pagevec_init(&pages_to_free, 1);
+	for (i = 0; i < nr; i++) {
+		struct page *page = pages[i];
+
+		if (PageReserved(page) || !put_page_testzero(page))
+			continue;
+
+		if (TestClearPageLRU(page))
+			del_page_from_lru(zone, page);
+
+		if (page_count(page) == 0) {
+			if (!pagevec_add(&pages_to_free, page)) {
+				spin_unlock_irq(&zone->lru_lock);
+				to_free = 0;
+				__pagevec_free(&pages_to_free);
+				pagevec_reinit(&pages_to_free);
+				spin_lock_irq(&zone->lru_lock);
+			}
+			to_free = 1;
+		}
+			
+	}
+
+	if (to_free) {
+		spin_unlock_irq(&zone->lru_lock);
+		pagevec_free(&pages_to_free);
+		spin_lock_irq(&zone->lru_lock);
+	}
+}
+
+void release_pages_nl(struct pagevec *pvec, struct zone *zone) {
+	release_pages_nolock(pvec->pages, pagevec_count(pvec), zone);
+	pagevec_reinit(pvec);
+}
+
 /*
  * Batched page_cache_release().  Decrement the reference count on all the
  * passed pages.  If it fell to zero then remove the page from the LRU and
diff -Nur --exclude='*.cmd' linux-2.6.7/mm/vmscan.c linux-2.6.7-vmscan/mm/vmscan.c
--- linux-2.6.8/mm/vmscan.c	2004-08-27 23:32:47.714903616 -0300
+++ linux-2.6.8-vmscan/mm/vmscan.c	2004-08-27 23:31:16.410783952 -0300
@@ -74,6 +74,8 @@
 	int may_writepage;
 };
 
+void release_pages_nl(struct pagevec *, struct zone *zone);
+
 /*
  * The list of shrinker callbacks used by to apply pressure to
  * ageable caches.
@@ -753,13 +755,14 @@
 		pgmoved++;
 		if (!pagevec_add(&pvec, page)) {
 			zone->nr_inactive += pgmoved;
-			spin_unlock_irq(&zone->lru_lock);
 			pgdeactivate += pgmoved;
 			pgmoved = 0;
-			if (buffer_heads_over_limit)
+			if (buffer_heads_over_limit) {
+				spin_unlock_irq(&zone->lock);
 				pagevec_strip(&pvec);
-			__pagevec_release(&pvec);
-			spin_lock_irq(&zone->lru_lock);
+				spin_lock_irq(&zone->lock);
+			}
+			release_pages_nl(&pvec, zone);
 		}
 	}
 	zone->nr_inactive += pgmoved;
@@ -782,9 +785,7 @@
 		if (!pagevec_add(&pvec, page)) {
 			zone->nr_active += pgmoved;
 			pgmoved = 0;
-			spin_unlock_irq(&zone->lru_lock);
-			__pagevec_release(&pvec);
-			spin_lock_irq(&zone->lru_lock);
+			release_pages_nl(&pvec, zone);
 		}
 	}
 	zone->nr_active += pgmoved;
--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"aart@kvack.org"> aart@kvack.org </a>

next             reply	other threads:[~2004-08-28  0:55 UTC|newest]

Thread overview: 3+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2004-08-28  0:55 Marcelo Tosatti [this message]
2004-08-28  5:14 ` Nick Piggin
2004-08-28 12:35   ` Marcelo Tosatti

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20040828005550.GC4482@logos.cnet \
    --to=marcelo.tosatti@cyclades.com \
    --cc=akpm@osdl.org \
    --cc=linux-mm@kvack.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Be sure your reply has a Subject: header at the top and a blank line before the message body.

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox