[RFC][PATCH] shrink_mmap avoid list_del (Was: Re: [PATCH] Recent VM fiasco - fixed)

linux-mm.kvack.org archive mirror
 help / color / mirror / Atom feed

From: Roger Larsson <roger.larsson@norran.net>
To: Linus Torvalds <torvalds@transmeta.com>,
	Rik van Riel <riel@conectiva.com.br>
Cc: linux-mm@kvack.org
Subject: [RFC][PATCH] shrink_mmap avoid list_del (Was: Re: [PATCH] Recent VM fiasco - fixed)
Date: Fri, 12 May 2000 04:51:42 +0200	[thread overview]
Message-ID: <391B71BE.6302F9BD@norran.net> (raw)
In-Reply-To: <Pine.LNX.4.10.10005111700520.1319-100000@penguin.transmeta.com>

[-- Attachment #1: Type: text/plain, Size: 2210 bytes --]

Hi,

I tried to find a way to walk the lru list without list_del.

Here is my patch:
- not compiled nor run (low on HD...)

Could something like this be used?
If no, why not?

/RogerL


Linus Torvalds wrote:
> 
> On Thu, 11 May 2000, Simon Kirby wrote:
> >
> > Hrm!  pre7 release seems to be even better.  113 vmstat-line-seconds now
> > (yes, I know this isn't a very scientific testing method :)).  Second try
> > was 114 vmstat-line-seconds.  classzone-27 did it in 107, so that's not
> > very far off!  Also, it swapped much less this time, and used less CPU.
> > vmstat output attached.
> 
> The final pre7 did something that I'm not entirely excited about, but that
> kind of makes sense at least from a CPU standpoint (as the SGI people have
> repeated multiple times). What the real pre7 does is to just move any page
> that has problems getting free'd to the head of the LRU list, so that we
> won't try it immediately the next time. This way we don't test the same
> pages over and over again when they are either shared, in the wrong zone,
> or have dirty/locked buffers.
> 
> It means that the "LRU" is less LRU, but you could see it as a "how hard
> do we want to free this" pressure-based system that really a least
> recently _used_ system. And it avoids the "repeat the whole thing on the
> same page" issue. And it looks like it behaves reasonably well, while
> saving a lot of CPU.
> 
> Knock wood.
> 
> I'm still considering the pre7 as more a "ok, I tried to get rid of the
> cruft" thing. Most of the special case code that has accumulated lately is
> gone. We can start adding stuff back now, I'm happy that the basics are
> reasonably clean.
> 
> I think Ingo already posted a very valid concern about high-memory
> machines, and there are other issues we should look at. I just want to be
> in a position where we can look at the code and say "we do X because Y",
> rather than a collection of random tweaks that just happens to work.
> 
>                 Linus
> 
> --
> To unsubscribe, send a message with 'unsubscribe linux-mm' in
> the body to majordomo@kvack.org.  For more info on Linux MM,
> see: http://www.linux.eu.org/Linux-MM/

--
Home page:
  http://www.norran.net/nra02596/

[-- Attachment #2: patch-2.3.99-pre7-9-shrink_mmap.1 --]
[-- Type: text/plain, Size: 3624 bytes --]

diff -Naur linux-2.3-pre9--/mm/filemap.c linux-2.3/mm/filemap.c
--- linux-2.3-pre9--/mm/filemap.c	Fri May 12 02:42:19 2000
+++ linux-2.3/mm/filemap.c	Fri May 12 04:28:30 2000
@@ -236,7 +236,6 @@
 int shrink_mmap(int priority, int gfp_mask)
 {
 	int ret = 0, count;
-	LIST_HEAD(old);
 	struct list_head * page_lru, * dispose;
 	struct page * page = NULL;
 	
@@ -244,26 +243,29 @@
 
 	/* we need pagemap_lru_lock for list_del() ... subtle code below */
 	spin_lock(&pagemap_lru_lock);
-	while (count > 0 && (page_lru = lru_cache.prev) != &lru_cache) {
+	page_lru = &lru_cache;
+	while (count > 0 && (page_lru = page_lru->prev) != &lru_cache) {
 		page = list_entry(page_lru, struct page, lru);
-		list_del(page_lru);
 
 		dispose = &lru_cache;
 		if (PageTestandClearReferenced(page))
 			goto dispose_continue;
 
 		count--;
-		dispose = &old;
+
+		dispose = NULL;
 
 		/*
 		 * Avoid unscalable SMP locking for pages we can
 		 * immediate tell are untouchable..
 		 */
 		if (!page->buffers && page_count(page) > 1)
-			goto dispose_continue;
+			continue;
 
+		/* Lock this lru page, reentrant
+		 * will be disposed correctly when unlocked */
 		if (TryLockPage(page))
-			goto dispose_continue;
+			continue;
 
 		/* Release the pagemap_lru lock even if the page is not yet
 		   queued in any lru queue since we have just locked down
@@ -281,7 +283,7 @@
 		 */
 		if (page->buffers) {
 			if (!try_to_free_buffers(page))
-				goto unlock_continue;
+				goto page_unlock_continue;
 			/* page was locked, inode can't go away under us */
 			if (!page->mapping) {
 				atomic_dec(&buffermem_pages);
@@ -336,27 +338,43 @@
 
 cache_unlock_continue:
 		spin_unlock(&pagecache_lock);
-unlock_continue:
+page_unlock_continue:
 		spin_lock(&pagemap_lru_lock);
 		UnlockPage(page);
 		put_page(page);
+		continue;
+
 dispose_continue:
-		list_add(page_lru, dispose);
-	}
-	goto out;
+		/* have the pagemap_lru_lock, lru cannot change */
+		{
+		  struct list_head * page_lru_to_move = page_lru; 
+		  page_lru = page_lru->next; /* continues with page_lru.prev */
+		  list_del(page_lru_to_move);
+		  list_add(page_lru_to_move, dispose);
+		}
+		continue;
 
 made_inode_progress:
-	page_cache_release(page);
+		page_cache_release(page);
 made_buffer_progress:
-	UnlockPage(page);
-	put_page(page);
-	ret = 1;
-	spin_lock(&pagemap_lru_lock);
-	/* nr_lru_pages needs the spinlock */
-	nr_lru_pages--;
+		/* like to have the lru lock before UnlockPage */
+		spin_lock(&pagemap_lru_lock);
 
-out:
-	list_splice(&old, lru_cache.prev);
+		UnlockPage(page);
+		put_page(page);
+		ret++;
+
+		/* lru manipulation needs the spin lock */
+		{
+		  struct list_head * page_lru_to_free = page_lru; 
+		  page_lru = page_lru->next; /* continues with page_lru.prev */
+		  list_del(page_lru_to_free);
+		}
+
+		/* nr_lru_pages needs the spinlock */
+		nr_lru_pages--;
+
+	}
 
 	spin_unlock(&pagemap_lru_lock);
 
diff -Naur linux-2.3-pre9--/mm/vmscan.c linux-2.3/mm/vmscan.c
--- linux-2.3-pre9--/mm/vmscan.c	Fri May 12 02:42:19 2000
+++ linux-2.3/mm/vmscan.c	Fri May 12 04:32:16 2000
@@ -443,10 +443,9 @@
 
 	priority = 6;
 	do {
-		while (shrink_mmap(priority, gfp_mask)) {
-			if (!--count)
-				goto done;
-		}
+	        count -= shrink_mmap(priority, gfp_mask);
+		if (count <= 0)
+		  goto done;
 
 		/* Try to get rid of some shared memory pages.. */
 		if (gfp_mask & __GFP_IO) {
@@ -481,10 +480,9 @@
 	} while (--priority >= 0);
 
 	/* Always end on a shrink_mmap.. */
-	while (shrink_mmap(0, gfp_mask)) {
-		if (!--count)
-			goto done;
-	}
+	count -= shrink_mmap(priority, gfp_mask);
+	if (count <= 0)
+	  goto done;
 
 	return 0;

next prev parent reply	other threads:[~2000-05-12  2:51 UTC|newest]

Thread overview: 65+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2000-05-08 17:21 [PATCH] Recent VM fiasco - fixed Zlatko Calusic
2000-05-08 17:43 ` Rik van Riel
2000-05-08 18:16   ` Zlatko Calusic
2000-05-08 18:20     ` Linus Torvalds
2000-05-08 18:46     ` Rik van Riel
2000-05-08 18:53       ` Zlatko Calusic
2000-05-08 19:04         ` Rik van Riel
2000-05-09  7:56   ` Daniel Stone
2000-05-09  8:25     ` Christoph Rohland
2000-05-09 15:44       ` Linus Torvalds
2000-05-09 16:12         ` Simon Kirby
2000-05-09 17:42         ` Christoph Rohland
2000-05-09 19:50           ` Linus Torvalds
2000-05-10 11:25             ` Christoph Rohland
2000-05-10 11:50               ` Zlatko Calusic
2000-05-11 23:40                 ` Mark Hahn
2000-05-10  4:05         ` James H. Cloos Jr.
2000-05-10  7:29           ` James H. Cloos Jr.
2000-05-11  0:16             ` Linus Torvalds
2000-05-11  0:32               ` Linus Torvalds
2000-05-11 16:36                 ` [PATCH] Recent VM fiasco - fixed (pre7-9) Rajagopal Ananthanarayanan
2000-05-11  1:04               ` [PATCH] Recent VM fiasco - fixed Juan J. Quintela
2000-05-11  1:53                 ` Simon Kirby
2000-05-11  7:23                   ` Linus Torvalds
2000-05-11 14:17                     ` Simon Kirby
2000-05-11 23:38                       ` Simon Kirby
2000-05-12  0:09                         ` Linus Torvalds
2000-05-12  2:51                           ` Roger Larsson [this message]
2000-05-11 11:15                   ` Rik van Riel
2000-05-11  5:10                 ` Linus Torvalds
2000-05-11 10:09                   ` James H. Cloos Jr.
2000-05-11 17:25                   ` Juan J. Quintela
2000-05-11 23:25                   ` [patch] balanced highmem subsystem under pre7-9 Ingo Molnar
2000-05-11 23:46                     ` Linus Torvalds
2000-05-12  0:08                       ` Ingo Molnar
2000-05-12  0:15                         ` Ingo Molnar
2000-05-12  9:02                     ` Christoph Rohland
2000-05-12  9:56                       ` Ingo Molnar
2000-05-12 11:49                         ` Christoph Rohland
2000-05-12 16:12                       ` Linus Torvalds
2000-05-12 10:57                     ` Andrea Arcangeli
2000-05-12 12:11                       ` Ingo Molnar
2000-05-12 12:57                         ` Andrea Arcangeli
2000-05-12 13:20                           ` Rik van Riel
2000-05-12 16:40                             ` Ingo Molnar
2000-05-12 17:15                               ` Rik van Riel
2000-05-12 18:15                               ` Linus Torvalds
2000-05-12 18:53                                 ` Ingo Molnar
2000-05-12 19:06                                   ` Linus Torvalds
2000-05-12 19:36                                     ` Ingo Molnar
2000-05-12 19:40                                     ` Ingo Molnar
2000-05-12 19:54                                     ` Ingo Molnar
2000-05-12 22:48                                       ` Rik van Riel
2000-05-13 11:57                                         ` Stephen C. Tweedie
2000-05-13 12:03                                           ` Rik van Riel
2000-05-13 12:14                                             ` Ingo Molnar
2000-05-13 14:23                                               ` Ingo Molnar
2000-05-19  1:58                               ` Andrea Arcangeli
2000-05-19 15:03                                 ` Rik van Riel
2000-05-19 16:08                                   ` Andrea Arcangeli
2000-05-19 17:05                                     ` Rik van Riel
2000-05-19 22:28                                     ` Linus Torvalds
2000-05-11 11:12               ` [PATCH] Recent VM fiasco - fixed Christoph Rohland
2000-05-11 17:38               ` Steve Dodd
2000-05-09 10:21     ` Rik van Riel

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=391B71BE.6302F9BD@norran.net \
    --to=roger.larsson@norran.net \
    --cc=linux-mm@kvack.org \
    --cc=riel@conectiva.com.br \
    --cc=torvalds@transmeta.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Be sure your reply has a Subject: header at the top and a blank line before the message body.

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox