From: Marcelo Tosatti <marcelo.tosatti@cyclades.com>
To: linux-mm@kvack.org, akpm@osdl.org
Subject: [PATCH] Avoid unecessary zone spinlocking on refill_inactive_zone()
Date: Fri, 27 Aug 2004 21:55:50 -0300 [thread overview]
Message-ID: <20040828005550.GC4482@logos.cnet> (raw)
Hi,
As I've noticed in previous to the list that refill_inactive_zone does
unecessary spinlocking by calling __release_pages() twice (while processing
l_active and later on l_inactive).
The waste happens:
1) It drops the zone lock, calls __release_pages(), which in turn call release_page(), which
locks the same lock again.
2) It unlocks the zone lock at the end of the function even if there is no page to be freed (which
needs to be pagevec_free'd).
Atomic operations which lock the bus are expensive, and we can avoid those.
The following patch implements a release_pages_nolock/release_pages_nl pair
of function (not the best names in the world) to take that into account
and avoid these unecessary atomic operations.
This is a fast path on pagecache intensive workloads (as the comment on top
of the function says), so I believe such optimization is worth.
Its important to benchmark it carefully (it survives my stress tests on multiprocessor box).
On a side note, the current accounting of inactive/active pages is broken
in refill_inactive_zone (due to pages being freed in __release_pages).
I plan to fix that tomorrow - should be easy as returning the number of pages
freed in __release_pages and take that into account.
Comments are appreciated.
This is against 2.6.8.1.
diff -Nur --exclude='*.cmd' linux-2.6.7/mm/swap.c linux-2.6.7-vmscan/mm/swap.c
--- linux-2.6.8/mm/swap.c 2004-08-27 23:32:47.708904528 -0300
+++ linux-2.6.8-vmscan/mm/swap.c 2004-08-27 23:35:34.152601240 -0300
@@ -192,6 +192,49 @@
EXPORT_SYMBOL(__page_cache_release);
+
+void release_pages_nolock(struct page **pages, int nr, struct zone *zone)
+{
+ int i;
+ struct pagevec pages_to_free;
+ int to_free = 0;
+
+
+ pagevec_init(&pages_to_free, 1);
+ for (i = 0; i < nr; i++) {
+ struct page *page = pages[i];
+
+ if (PageReserved(page) || !put_page_testzero(page))
+ continue;
+
+ if (TestClearPageLRU(page))
+ del_page_from_lru(zone, page);
+
+ if (page_count(page) == 0) {
+ if (!pagevec_add(&pages_to_free, page)) {
+ spin_unlock_irq(&zone->lru_lock);
+ to_free = 0;
+ __pagevec_free(&pages_to_free);
+ pagevec_reinit(&pages_to_free);
+ spin_lock_irq(&zone->lru_lock);
+ }
+ to_free = 1;
+ }
+
+ }
+
+ if (to_free) {
+ spin_unlock_irq(&zone->lru_lock);
+ pagevec_free(&pages_to_free);
+ spin_lock_irq(&zone->lru_lock);
+ }
+}
+
+void release_pages_nl(struct pagevec *pvec, struct zone *zone) {
+ release_pages_nolock(pvec->pages, pagevec_count(pvec), zone);
+ pagevec_reinit(pvec);
+}
+
/*
* Batched page_cache_release(). Decrement the reference count on all the
* passed pages. If it fell to zero then remove the page from the LRU and
diff -Nur --exclude='*.cmd' linux-2.6.7/mm/vmscan.c linux-2.6.7-vmscan/mm/vmscan.c
--- linux-2.6.8/mm/vmscan.c 2004-08-27 23:32:47.714903616 -0300
+++ linux-2.6.8-vmscan/mm/vmscan.c 2004-08-27 23:31:16.410783952 -0300
@@ -74,6 +74,8 @@
int may_writepage;
};
+void release_pages_nl(struct pagevec *, struct zone *zone);
+
/*
* The list of shrinker callbacks used by to apply pressure to
* ageable caches.
@@ -753,13 +755,14 @@
pgmoved++;
if (!pagevec_add(&pvec, page)) {
zone->nr_inactive += pgmoved;
- spin_unlock_irq(&zone->lru_lock);
pgdeactivate += pgmoved;
pgmoved = 0;
- if (buffer_heads_over_limit)
+ if (buffer_heads_over_limit) {
+ spin_unlock_irq(&zone->lock);
pagevec_strip(&pvec);
- __pagevec_release(&pvec);
- spin_lock_irq(&zone->lru_lock);
+ spin_lock_irq(&zone->lock);
+ }
+ release_pages_nl(&pvec, zone);
}
}
zone->nr_inactive += pgmoved;
@@ -782,9 +785,7 @@
if (!pagevec_add(&pvec, page)) {
zone->nr_active += pgmoved;
pgmoved = 0;
- spin_unlock_irq(&zone->lru_lock);
- __pagevec_release(&pvec);
- spin_lock_irq(&zone->lru_lock);
+ release_pages_nl(&pvec, zone);
}
}
zone->nr_active += pgmoved;
--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org. For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"aart@kvack.org"> aart@kvack.org </a>
next reply other threads:[~2004-08-28 0:55 UTC|newest]
Thread overview: 3+ messages / expand[flat|nested] mbox.gz Atom feed top
2004-08-28 0:55 Marcelo Tosatti [this message]
2004-08-28 5:14 ` Nick Piggin
2004-08-28 12:35 ` Marcelo Tosatti
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20040828005550.GC4482@logos.cnet \
--to=marcelo.tosatti@cyclades.com \
--cc=akpm@osdl.org \
--cc=linux-mm@kvack.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox