* A possible winner in pre7-8
[not found] <Pine.LNX.4.10.10005082332560.773-100000@penguin.transmeta.com>
@ 2000-05-09 7:50 ` Rajagopal Ananthanarayanan
2000-05-09 17:33 ` Juan J. Quintela
0 siblings, 1 reply; 10+ messages in thread
From: Rajagopal Ananthanarayanan @ 2000-05-09 7:50 UTC
To: Linus Torvalds; +Cc: linux-mm
Linus Torvalds wrote:
>
> On Mon, 8 May 2000, Rajagopal Ananthanarayanan wrote:
> >
> > Not sure entirely what effect this has, except for freeing underlying
> > buffer_head's. The page itself is still skipped. Anyway, brief examination
> > shows that you've changed several things here (in 7-7), so I'll have to go
> > at it some more time to get a full picture.
>
> Actually, look at pre7-8 instead.
>
> pre7-7 was rather useful to me - I tested the exact same kernel with the
> only difference being the order of the "zone free" and the
> "try_to_free_buffers()" tests, and that's what I then released as pre7-7.
> But pre7-8 has what I believe to be a saner order when it comes to the
> other tests.
Interesting! This stuff is coming out faster than I can patch.
In any case, good news about pre7-8: not only does dbench run without
errors, but it runs well. Let's hope that others (Juan & Benjamin to name two)
see similar results.
>
> > Unfortunately my dbench test runs really badly with pre7-7.
> > Quantitatively, the amount of memory in "cache" of vmstat
> > is higher than before. write() calls start failing.
>
> Can you tell me how they fail? Is it with a ENOMEM, or is there something
> more insidious going on?
>
> I tested pre7-7 with 20MB of RAM, and it was fine. But I didn't run
> dbench: instead I tested it with X and netscape and a kernel recursive
> diff - really more to test that it works ok under real load. Something
> which previous pre7's definitely did not do well on at all. pre7-8 should
> be better, because it has the LRU enabled on the buffer cache too,
> something that pre7-7 lost due to the ordering changes.
>
pre7-8 is definitely better; 7-7 was really bad. I don't know for
sure but the write failure was similar to what I've seen earlier with ENOMEM.
More after looking at your changes in 7-6 -> 7-7 and 7-7 -> 7-8 ...
--
--------------------------------------------------------------------------
Rajagopal Ananthanarayanan ("ananth")
Member Technical Staff, SGI.
--------------------------------------------------------------------------
--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org. For more info on Linux MM,
see: http://www.linux.eu.org/Linux-MM/
* Re: A possible winner in pre7-8
2000-05-09 7:50 ` A possible winner in pre7-8 Rajagopal Ananthanarayanan
@ 2000-05-09 17:33 ` Juan J. Quintela
2000-05-10 1:59 ` Roger Larsson
2000-05-10 3:29 ` Juan J. Quintela
0 siblings, 2 replies; 10+ messages in thread
From: Juan J. Quintela @ 2000-05-09 17:33 UTC
To: Rajagopal Ananthanarayanan; +Cc: Linus Torvalds, linux-mm
>>>>> "rajagopal" == Rajagopal Ananthanarayanan <ananth@sgi.com> writes:
Hi
rajagopal> Interesting! This stuff is coming out faster than I can patch.
rajagopal> In any case, good news about pre7-8: not only does dbench run without
rajagopal> errors, but it runs well. Let's hope that others (Juan & Benjamin to name two)
rajagopal> see similar results.
No way: here my tests ran two iterations, and in the second iteration
init was killed and the system became unresponsive (headless machine,
you know....). I have no time now for a more detailed report; more
information later today.
Later, Juan.
--
In theory, practice and theory are the same, but in practice they
are different -- Larry McVoy
* Re: A possible winner in pre7-8
2000-05-09 17:33 ` Juan J. Quintela
@ 2000-05-10 1:59 ` Roger Larsson
2000-05-10 22:13 ` [plastic bag] " Roger Larsson
2000-05-10 3:29 ` Juan J. Quintela
1 sibling, 1 reply; 10+ messages in thread
From: Roger Larsson @ 2000-05-10 1:59 UTC
To: Juan J. Quintela; +Cc: Rajagopal Ananthanarayanan, Linus Torvalds, linux-mm
Hi all,
Since everyone is testing shrink_mmap...
Here is my latest version.
(Currently I have some problems with pre-version
I am kind of out of synch...)
It should compile, but it is not tested:
- lack of HD, courage, backups...
/RogerL
"Juan J. Quintela" wrote:
>
> >>>>> "rajagopal" == Rajagopal Ananthanarayanan <ananth@sgi.com> writes:
>
> Hi
>
> rajagopal> Interesting! This stuff is coming out faster than I can patch.
> rajagopal> In any case, good news about pre7-8: not only does dbench run without
> rajagopal> errors, but it runs well. Let's hope that others (Juan & Benjamin to name two)
> rajagopal> see similar results.
>
> No way: here my tests ran two iterations, and in the second iteration
> init was killed and the system became unresponsive (headless machine,
> you know....). I have no time now for a more detailed report; more
> information later today.
>
> Later, Juan.
>
> --
> In theory, practice and theory are the same, but in practice they
> are different -- Larry McVoy
--
Home page:
http://www.norran.net/nra02596/
* Re: A possible winner in pre7-8
2000-05-09 17:33 ` Juan J. Quintela
2000-05-10 1:59 ` Roger Larsson
@ 2000-05-10 3:29 ` Juan J. Quintela
2000-05-10 15:31 ` Linus Torvalds
1 sibling, 1 reply; 10+ messages in thread
From: Juan J. Quintela @ 2000-05-10 3:29 UTC
To: Rajagopal Ananthanarayanan; +Cc: Linus Torvalds, linux-mm
>>>>> "juan" == Juan J Quintela <quintela@fi.udc.es> writes:
Hi
juan> No way: here my tests ran two iterations, and in the second iteration
juan> init was killed and the system became unresponsive (headless machine,
juan> you know....). I have no time now for a more detailed report; more
juan> information later today.
Today I have been testing pre7-8 + Manfred's patch
(the test, as always: while (true); do time ./mmap002; done).
Things have improved a lot from pre7-6, but they are not perfect.
With that patch I have obtained the following times:
real 2m41.772s
user 0m16.610s
sys 0m12.470s
(this is a typical value, there are fluctuations between 2m35 and
2m54).
It begins to kill processes after the 10th iteration. After that, the
machine freezes.
The results for pre7-8 + Manfred's patch + Andrea's classzone 27 are:
real 2m7.622s
user 0m15.480s
sys 0m8.240s
(almost no variation between runs, +-1 second). And it is rock solid
here, no freezes at all.
The results for 2.2.15 are:
real 1m57.619s
user 0m16.320s
sys 0m11.820s
but it kills processes after 10/12 iterations.
I hope this helps.
Later, Juan.
--
In theory, practice and theory are the same, but in practice they
are different -- Larry McVoy
* Re: A possible winner in pre7-8
2000-05-10 3:29 ` Juan J. Quintela
@ 2000-05-10 15:31 ` Linus Torvalds
2000-05-10 16:04 ` Juan J. Quintela
2000-05-10 18:11 ` Rik van Riel
0 siblings, 2 replies; 10+ messages in thread
From: Linus Torvalds @ 2000-05-10 15:31 UTC
To: Juan J. Quintela; +Cc: Rajagopal Ananthanarayanan, linux-mm
On 10 May 2000, Juan J. Quintela wrote:
>
> It begins to kill processes after the 10th iteration. After that, the
> machine freezes.
Do you have a SMP machine? If so, I think I found this one.
And it's been there for ages.
The bug is that GFP_ATOMIC _really_ must not try to page stuff out,
even the stuff that doesn't need IO to be dropped.
Why? Because GFP_ATOMIC can be (and mostly is) called from interrupts, and
even when we don't do IO we _do_ access a number of spinlocks in order to
see whether we can even just drop it.
For example, in order to scan the page tables we take the page_table_lock
("vmlist_access_lock") which is not irq-safe.
So the lockup will occur if you take an interrupt that does an allocation
(usually networking-related) while you hold the page_table_lock (which can
be due to a swapout, for example).
The reason it has been there for long is that usually SMP machines have
enough memory that this condition is really hard to trigger in normal use.
And on UP machines you'd never see the problem (except, possibly, as page
table double-freeing, but the window for that looks extremely small
indeed, much smaller than the double-spinlock window).
Linus
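A minimal user-space sketch of the hazard described above, not kernel code. It models a lock that is not irq-safe with a C11 atomic_flag and shows that an "interrupt" arriving while the lock is already held could never acquire it by spinning. All demo_* names are hypothetical, and a trylock is used so that the sketch reports the condition instead of hanging.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for a spinlock that is not irq-safe. */
static atomic_flag demo_lock = ATOMIC_FLAG_INIT;

/* Returns true if the lock was free and is now held by the caller. */
static bool demo_trylock(void)
{
	return !atomic_flag_test_and_set(&demo_lock);
}

static void demo_unlock(void)
{
	atomic_flag_clear(&demo_lock);
}

/*
 * Simulated interrupt handler doing an allocation that needs the lock.
 * A real handler would spin; here a failed trylock stands for "would
 * spin forever", because the interrupted lock holder cannot run again
 * until the handler returns.
 */
static bool demo_irq_alloc_would_deadlock(void)
{
	if (!demo_trylock())
		return true;
	demo_unlock();
	return false;
}

/* Walks through the two cases: lock free, then lock held. */
static void demo_selfcheck(void)
{
	bool got;

	/* Nobody holds the lock: the handler gets it and releases it. */
	assert(!demo_irq_alloc_would_deadlock());

	/* Process context takes the lock, then the "interrupt" arrives. */
	got = demo_trylock();
	assert(got);
	(void)got;
	assert(demo_irq_alloc_would_deadlock());

	demo_unlock();
	assert(!demo_irq_alloc_would_deadlock());
}
```

Calling demo_selfcheck() from a main() exercises both cases. The usual ways out are to make the lock irq-safe or to keep interrupt-context allocations away from any path that takes it.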
* Re: A possible winner in pre7-8
2000-05-10 15:31 ` Linus Torvalds
@ 2000-05-10 16:04 ` Juan J. Quintela
2000-05-10 18:11 ` Rik van Riel
1 sibling, 0 replies; 10+ messages in thread
From: Juan J. Quintela @ 2000-05-10 16:04 UTC
To: Linus Torvalds; +Cc: Rajagopal Ananthanarayanan, linux-mm
>>>>> "linus" == Linus Torvalds <torvalds@transmeta.com> writes:
linus> On 10 May 2000, Juan J. Quintela wrote:
>>
>> It begins to kill processes after the 10th iteration. After that, the
>> machine freezes.
linus> Do you have a SMP machine? If so, I think I found this one.
My machine here is UP, so it must be another problem.
Later, Juan.
--
In theory, practice and theory are the same, but in practice they
are different -- Larry McVoy
* Re: A possible winner in pre7-8
2000-05-10 15:31 ` Linus Torvalds
2000-05-10 16:04 ` Juan J. Quintela
@ 2000-05-10 18:11 ` Rik van Riel
2000-05-10 18:21 ` Linus Torvalds
1 sibling, 1 reply; 10+ messages in thread
From: Rik van Riel @ 2000-05-10 18:11 UTC
To: Linus Torvalds; +Cc: Juan J. Quintela, Rajagopal Ananthanarayanan, linux-mm
On Wed, 10 May 2000, Linus Torvalds wrote:
> Do you have a SMP machine? If so, I think I found this one.
> And it's been there for ages.
>
> The bug is that GFP_ATOMIC _really_ must not try to page stuff out,
> even the stuff that doesn't need IO to be dropped.
>
> Why? Because GFP_ATOMIC can be (and mostly is) called from
> interrupts, and even when we don't do IO we _do_ access a number
> of spinlocks in order to see whether we can even just drop it.
I'm sorry to disappoint you, but I'm afraid this isn't
the bug. Please look at this code from vmscan.c...
int try_to_free_pages(unsigned int gfp_mask, zone_t *zone)
{
	int retval = 1;

	if (gfp_mask & __GFP_WAIT) {
		current->flags |= PF_MEMALLOC;
		retval = do_try_to_free_pages(gfp_mask, zone);
		current->flags &= ~PF_MEMALLOC;
	}
	return retval;
}
As you see, we never call do_try_to_free_pages() if we don't
have __GFP_WAIT set. And GFP_ATOMIC doesn't include __GFP_WAIT.
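The effect of that mask test can be illustrated outside the kernel with a small sketch. The DEMO_* bit values below are illustrative assumptions, not copied from the kernel headers, and demo_do_try_to_free_pages() is a hypothetical stub that only counts how often it is reached.

```c
#include <assert.h>

/* Illustrative flag bits (assumed values, for this sketch only). */
#define DEMO_GFP_WAIT	0x01u
#define DEMO_GFP_HIGH	0x02u
#define DEMO_GFP_IO	0x04u

/* An atomic request carries no WAIT bit; a normal one does. */
#define DEMO_GFP_ATOMIC	(DEMO_GFP_HIGH)
#define DEMO_GFP_KERNEL	(DEMO_GFP_HIGH | DEMO_GFP_WAIT | DEMO_GFP_IO)

/* Counts how many times the reclaim path was actually entered. */
static int demo_reclaim_calls;

/* Hypothetical stub for the real reclaim work. */
static int demo_do_try_to_free_pages(unsigned int gfp_mask)
{
	(void)gfp_mask;
	demo_reclaim_calls++;
	return 1;
}

/* Same shape as the quoted function: reclaim only if waiting is allowed. */
static int demo_try_to_free_pages(unsigned int gfp_mask)
{
	int retval = 1;

	if (gfp_mask & DEMO_GFP_WAIT)
		retval = demo_do_try_to_free_pages(gfp_mask);
	return retval;
}

/* Shows that only the request with the WAIT bit reaches the stub. */
static void demo_selfcheck(void)
{
	demo_reclaim_calls = 0;
	demo_try_to_free_pages(DEMO_GFP_ATOMIC);
	assert(demo_reclaim_calls == 0);	/* atomic: reclaim skipped */
	demo_try_to_free_pages(DEMO_GFP_KERNEL);
	assert(demo_reclaim_calls == 1);	/* may wait: reclaim entered */
}
```

With these assumed values, a request without the WAIT bit returns immediately and never touches the reclaim path, which is the point being made above.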
regards,
Rik
--
The Internet is not a network of computers. It is a network
of people. That is its real strength.
Wanna talk about the kernel? irc.openprojects.net / #kernelnewbies
http://www.conectiva.com/ http://www.surriel.com/
* Re: A possible winner in pre7-8
2000-05-10 18:11 ` Rik van Riel
@ 2000-05-10 18:21 ` Linus Torvalds
0 siblings, 0 replies; 10+ messages in thread
From: Linus Torvalds @ 2000-05-10 18:21 UTC
To: Rik van Riel; +Cc: Juan J. Quintela, Rajagopal Ananthanarayanan, linux-mm
On Wed, 10 May 2000, Rik van Riel wrote:
>
> I'm sorry to disappoint you, but I'm afraid this isn't
> the bug. Please look at this code from vmscan.c...
Oh, I overlooked that. And I'm definitely not disappointed: that would
have been a brown-paper-bag bug indeed.
I started trying mmap002 again, and can easily reproduce the failures, and
also see the performance problems. I think I've fixed the performance
issue, now I just need to fix the failure ;)
Linus
* Re: [plastic bag] Re: A possible winner in pre7-8
2000-05-10 22:13 ` [plastic bag] " Roger Larsson
@ 2000-05-10 21:23 ` Rik van Riel
0 siblings, 0 replies; 10+ messages in thread
From: Rik van Riel @ 2000-05-10 21:23 UTC
To: Roger Larsson; +Cc: linux-mm
On Thu, 11 May 2000, Roger Larsson wrote:
> Here is the file too...
Have you tried to boot this?
regards,
Rik
--
The Internet is not a network of computers. It is a network
of people. That is its real strength.
Wanna talk about the kernel? irc.openprojects.net / #kernelnewbies
http://www.conectiva.com/ http://www.surriel.com/
* [plastic bag] Re: A possible winner in pre7-8
2000-05-10 1:59 ` Roger Larsson
@ 2000-05-10 22:13 ` Roger Larsson
2000-05-10 21:23 ` Rik van Riel
0 siblings, 1 reply; 10+ messages in thread
From: Roger Larsson @ 2000-05-10 22:13 UTC
To: Juan J. Quintela, Rajagopal Ananthanarayanan, Linus Torvalds, linux-mm
[-- Attachment #1: Type: text/plain, Size: 1603 bytes --]
Ok,
Here is the file too...
/RogerL
Roger Larsson wrote:
>
> Hi all,
>
> Since everyone is testing shrink_mmap...
>
> Here is my latest version.
>
> (Currently I have some problems with pre-version
> I am kind of out of synch...)
>
> It should compile, but it is not tested:
> - lack of HD, courage, backups...
>
> /RogerL
>
> "Juan J. Quintela" wrote:
> >
> > >>>>> "rajagopal" == Rajagopal Ananthanarayanan <ananth@sgi.com> writes:
> >
> > Hi
> >
> > rajagopal> Interesting! This stuff is coming out faster than I can patch.
> > rajagopal> In any case, good news about pre7-8: not only does dbench run without
> > rajagopal> errors, but it runs well. Let's hope that others (Juan & Benjamin to name two)
> > rajagopal> see similar results.
> >
> > > No way: here my tests ran two iterations, and in the second iteration
> > > init was killed and the system became unresponsive (headless machine,
> > > you know....). I have no time now for a more detailed report; more
> > > information later today.
> >
> > Later, Juan.
> >
> > --
> > In theory, practice and theory are the same, but in practice they
> > are different -- Larry McVoy
>
> --
> Home page:
> http://www.norran.net/nra02596/
--
Home page:
http://www.norran.net/nra02596/
[-- Attachment #2: patch-2.3-shrink_mmap.2 --]
[-- Type: text/plain, Size: 10493 bytes --]
--- linux-2.3-pre6/mm/filemap.c Sat May 6 02:20:17 2000
+++ linux-2.3/mm/filemap.c Tue May 9 02:35:04 2000
@@ -236,153 +233,228 @@
spin_unlock(&pagecache_lock);
}
+
+static zone_t null_zone;
+
+/*
+ * Precondition:
+ * lru sorted as least recently used
+ * PG_referenced updated from pte_young(pte) to PG_referenced
+ * pages continuously scanned and resorted due to PG_referenced
+ * Parameters:
+ * zone==NULL
+ * try to free pages belonging to any zone with zone_wake_kswapd
+ * zone!=NULL
+ * try harder (x2) when this zone is low_on_memory (>priority)
+ * relax when this zone has not zone_wake_kswapd (<priority)
+ */
int shrink_mmap(int priority, int gfp_mask, zone_t *zone)
{
- int ret = 0, loop = 0, count;
+ int ret = 0, zone_ret = 0;
+ int attempt = 0, count;
LIST_HEAD(young);
- LIST_HEAD(old);
LIST_HEAD(forget);
- struct list_head * page_lru, * dispose;
+ struct list_head * page_lru, * cursor, * dispose;
struct page * page = NULL;
struct zone_struct * p_zone;
- int maxloop = 256 >> priority;
-
- if (!zone)
- BUG();
-
- count = nr_lru_pages >> priority;
- if (!count)
- return ret;
-
- spin_lock(&pagemap_lru_lock);
-again:
- /* we need pagemap_lru_lock for list_del() ... subtle code below */
- while (count > 0 && (page_lru = lru_cache.prev) != &lru_cache) {
- page = list_entry(page_lru, struct page, lru);
- list_del(page_lru);
- p_zone = page->zone;
-
- /*
- * These two tests are there to make sure we don't free too
- * many pages from the "wrong" zone. We free some anyway,
- * they are the least recently used pages in the system.
- * When we don't free them, leave them in &old.
- */
- dispose = &old;
- if (p_zone != zone && (loop > (maxloop / 4) ||
- p_zone->free_pages > p_zone->pages_high))
- goto dispose_continue;
-
- /* The page is in use, or was used very recently, put it in
- * &young to make sure that we won't try to free it the next
- * time */
- dispose = &young;
-
- if (test_and_clear_bit(PG_referenced, &page->flags))
- goto dispose_continue;
-
- count--;
- if (!page->buffers && page_count(page) > 1)
- goto dispose_continue;
-
- /* Page not used -> free it; if that fails -> &old */
- dispose = &old;
- if (TryLockPage(page))
- goto dispose_continue;
-
- /* Release the pagemap_lru lock even if the page is not yet
- queued in any lru queue since we have just locked down
- the page so nobody else may SMP race with us running
- a lru_cache_del() (lru_cache_del() always run with the
- page locked down ;). */
- spin_unlock(&pagemap_lru_lock);
-
- /* avoid freeing the page while it's locked */
- get_page(page);
-
- /* Is it a buffer page? */
- if (page->buffers) {
- if (!try_to_free_buffers(page))
- goto unlock_continue;
- /* page was locked, inode can't go away under us */
- if (!page->mapping) {
- atomic_dec(&buffermem_pages);
- goto made_buffer_progress;
- }
- }
-
- /* Take the pagecache_lock spinlock held to avoid
- other tasks to notice the page while we are looking at its
- page count. If it's a pagecache-page we'll free it
- in one atomic transaction after checking its page count. */
- spin_lock(&pagecache_lock);
+ struct page cursor_page; /* unique by thread, too much on stack? */
- /*
- * We can't free pages unless there's just one user
- * (count == 2 because we added one ourselves above).
- */
- if (page_count(page) != 2)
- goto cache_unlock_continue;
-
- /*
- * Is it a page swap page? If so, we want to
- * drop it if it is no longer used, even if it
- * were to be marked referenced..
- */
- if (PageSwapCache(page)) {
- spin_unlock(&pagecache_lock);
- __delete_from_swap_cache(page);
- goto made_inode_progress;
- }
-
- /* is it a page-cache page? */
- if (page->mapping) {
- if (!PageDirty(page) && !pgcache_under_min()) {
- remove_page_from_inode_queue(page);
- remove_page_from_hash_queue(page);
- page->mapping = NULL;
- spin_unlock(&pagecache_lock);
- goto made_inode_progress;
- }
- goto cache_unlock_continue;
- }
-
- dispose = &forget;
- printk(KERN_ERR "shrink_mmap: unknown LRU page!\n");
+ /* Initialize the cursor (fake) page */
+ cursor = &cursor_page.lru;
+ cursor_page.zone = &null_zone;
+ spin_lock(&pagemap_lru_lock);
+ /* cursor always part of the list, but not a real page...
+ * make a special page that points to a special zone
+ * with zone_wake_kswapd always 0
+ * - some more thoughts required... */
+ list_add_tail(cursor, &lru_cache);
+
+ again:
+ attempt++;
+
+ if (priority == 0)
+ count = (int)(~0U >> 1); /* INT_MAX => do not count, search to end of list */
+ else
+ count = nr_lru_pages >> priority;
+
+ for (page_lru = lru_cache.prev;
+ count-- && page_lru != &lru_cache;
+ page_lru = page_lru->prev) {
+
+ /* Avoid processing our own cursor...
+ * Note: check not needed with page cursor.
+ * if (page_lru == cursor)
+ * continue;
+ */
+
+ page = list_entry(page_lru, struct page, lru);
+ p_zone = page->zone;
+
+
+ /* Check if zone has pressure, most pages would continue here.
+ * Also pages from zones that initially were under pressure */
+ if (!p_zone->zone_wake_kswapd)
+ continue;
+
+ /* Can't do anything about this... */
+ if (!page->buffers && page_count(page) > 1)
+ continue;
+
+ /* Page not used -> free it
+ * If it could not be locked it is somehow in use
+ * try another time */
+ if (TryLockPage(page))
+ continue;
+
+ /* Ok, a possible page.
+ * Note: can't unlock lru if we do we will have
+ * to restart this loop */
+
+ /* The page is in use, or was used very recently, put it in
+ * &young to make it unlikely that we will try to free it the next
+ * time.
+ * Note 1: Currently only try_to_swap and __find_page_nolock
+ * will set this bit - how does mmaped pages get referenced?
+ * [not in lru? - I do not know enough :-( ... yet :-) ]
+ * Note 2: all pages need to be searched at once to get
+ * a better lru approximation.
+ */
+ dispose = &young;
+ if (test_and_clear_bit(PG_referenced, &page->flags))
+ goto dispose_continue;
+
+
+ /* cursor takes page_lru's place in lru_list
+ * if disposed later it ends up at the same place!
+ * Note: compilers should be able to optimize this a bit... */
+ list_del(cursor);
+ list_add_tail(cursor, page_lru);
+ list_del(page_lru);
+ spin_unlock(&pagemap_lru_lock);
+
+ /* Spinlock is released, anything might happen to the list!
+ * But the cursor will remain on spot.
+ * - it will not be deleted from outside,
+ * no one knows about it.
+ * - it will not be deleted by another shrink_mmap,
+ * zone_wake_kswapd == 0
+ */
+
+ /* If page is redisposed after attempt, place it at the same spot */
+ dispose = cursor;
+
+ /* avoid freeing the page while it's locked */
+ get_page(page);
+
+ /* Is it a buffer page? */
+ if (page->buffers) {
+ if (!try_to_free_buffers(page))
+ goto unlock_continue;
+ /* page was locked, inode can't go away under us */
+ if (!page->mapping) {
+ atomic_dec(&buffermem_pages);
+ goto made_buffer_progress;
+ }
+ }
+
+ /* Take the pagecache_lock spinlock held to avoid
+ other tasks to notice the page while we are looking at its
+ page count. If it's a pagecache-page we'll free it
+ in one atomic transaction after checking its page count. */
+ spin_lock(&pagecache_lock);
+
+ /*
+ * We can't free pages unless there's just one user
+ * (count == 2 because we added one ourselves above).
+ */
+ if (page_count(page) != 2)
+ goto cache_unlock_continue;
+
+ /*
+ * Is it a page swap page? If so, we want to
+ * drop it if it is no longer used, even if it
+ * were to be marked referenced..
+ */
+ if (PageSwapCache(page)) {
+ spin_unlock(&pagecache_lock);
+ __delete_from_swap_cache(page);
+ goto made_inode_progress;
+ }
+
+ /* is it a page-cache page? */
+ if (page->mapping) {
+ if (!PageDirty(page) && !pgcache_under_min()) {
+ remove_page_from_inode_queue(page);
+ remove_page_from_hash_queue(page);
+ page->mapping = NULL;
+ spin_unlock(&pagecache_lock);
+ goto made_inode_progress;
+ }
+ goto cache_unlock_continue;
+ }
+
+ dispose = &forget;
+ printk(KERN_ERR "shrink_mmap: unknown LRU page!\n");
+
cache_unlock_continue:
- spin_unlock(&pagecache_lock);
+ spin_unlock(&pagecache_lock);
unlock_continue:
- spin_lock(&pagemap_lru_lock);
- UnlockPage(page);
- put_page(page);
- list_add(page_lru, dispose);
- continue;
+ spin_lock(&pagemap_lru_lock);
+ UnlockPage(page);
+ put_page(page);
- /* we're holding pagemap_lru_lock, so we can just loop again */
dispose_continue:
- list_add(page_lru, dispose);
- }
- goto out;
+ list_add(page_lru, dispose);
+ /* final disposition to other list than lru? */
+ /* then return list index to old lru-list position */
+ if (dispose != cursor)
+ page_lru = cursor;
+ continue;
made_inode_progress:
- page_cache_release(page);
+ page_cache_release(page);
made_buffer_progress:
- UnlockPage(page);
- put_page(page);
- ret = 1;
- spin_lock(&pagemap_lru_lock);
- /* nr_lru_pages needs the spinlock */
- nr_lru_pages--;
+ UnlockPage(page);
+ put_page(page);
+ ret++;
+ spin_lock(&pagemap_lru_lock);
+ /* nr_lru_pages needs the spinlock */
+ nr_lru_pages--;
+
+ /* Might (and should) have been done by free calls
+ * p_zone->zone_wake_kswapd = 0;
+ */
+
+ /* If no more pages are needed to release on specifically
+ requested zone consider it done!
+ Note: zone might be NULL to make all requests fulfilled */
+ if (p_zone == zone) {
+ zone_ret++;
+ if (!p_zone->zone_wake_kswapd)
+ break;
+ }
- loop++;
- /* wrong zone? not looped too often? roll again... */
- if (page->zone != zone && loop < maxloop)
- goto again;
+ /* Back to cursor position to ensure correct next step */
+ page_lru = cursor;
+ }
-out:
+ /* cursor may be at top of lru list, insert young
+ * pages at top - may be scanned next turn...
+ */
list_splice(&young, &lru_cache);
- list_splice(&old, lru_cache.prev);
+
+ /* if zone request not fulfilled, try harder */
+ if (zone) {
+ if (zone->low_on_memory) {
+ if (attempt < 2)
+ goto again;
+ }
+ ret = zone_ret;
+ }
+
+
+ list_del(cursor);
spin_unlock(&pagemap_lru_lock);
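The cursor idea in the patch above can be sketched in isolation. This is a simplified user-space illustration under stated assumptions, not the patch itself: struct node and the node_* and demo_* helpers are hypothetical stand-ins for struct list_head and the LRU scan, "even id" stands in for "page can be freed", and the concurrent change that could happen while the lock is dropped is simulated by unlinking one chosen neighbour.

```c
#include <assert.h>
#include <stddef.h>

/* Minimal circular doubly linked list node (hypothetical). */
struct node {
	struct node *prev, *next;
	int id;
};

static void node_init(struct node *n)
{
	n->prev = n->next = n;
}

/* Insert n right after pos. */
static void node_add(struct node *n, struct node *pos)
{
	n->next = pos->next;
	n->prev = pos;
	pos->next->prev = n;
	pos->next = n;
}

/* Insert n right before pos (at the tail when pos is the list head). */
static void node_add_tail(struct node *n, struct node *pos)
{
	node_add(n, pos->prev);
}

/* Unlink n and leave it self-linked, so a second unlink is harmless. */
static void node_del(struct node *n)
{
	n->prev->next = n->next;
	n->next->prev = n->prev;
	n->prev = n->next = n;
}

/*
 * Scan from the tail towards the head and "free" every node with an even
 * id. Before a node is unlinked, a private cursor is parked in its place,
 * so the scan can resume from the cursor even if neighbours vanish while
 * the list lock would be dropped. Returns the number of freed nodes.
 */
static int demo_scan(struct node *head, struct node *concurrent_victim)
{
	struct node cursor;
	struct node *n;
	int freed = 0;

	node_init(&cursor);
	cursor.id = -1;

	for (n = head->prev; n != head; n = n->prev) {
		if (n->id % 2 != 0)
			continue;	/* not a candidate, keep walking */

		node_del(&cursor);
		node_add_tail(&cursor, n);	/* cursor takes n's place */
		node_del(n);

		/* The list lock would be dropped here: simulate another
		 * actor unlinking the node the scan would visit next. */
		if (concurrent_victim) {
			node_del(concurrent_victim);
			concurrent_victim = NULL;
		}

		freed++;
		n = &cursor;	/* resume from the parked position */
	}
	node_del(&cursor);
	return freed;
}

/* Builds 1..5, frees 4 and 2 while 3 disappears during the first window. */
static void demo_selfcheck(void)
{
	struct node head, a[5];
	int i, freed;

	node_init(&head);
	head.id = 0;
	for (i = 0; i < 5; i++) {
		a[i].id = i + 1;
		node_add_tail(&a[i], &head);
	}
	freed = demo_scan(&head, &a[2]);
	assert(freed == 2);
	assert(head.next == &a[0] && a[0].next == &a[4] && a[4].next == &head);
	(void)freed;
}
```

A saved pointer to the previous element would have been left dangling once that element was unlinked by someone else. The cursor is a real list member, so its neighbours are kept up to date by every list operation and the scan can continue from it.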
end of thread
Thread overview: 10+ messages
[not found] <Pine.LNX.4.10.10005082332560.773-100000@penguin.transmeta.com>
2000-05-09 7:50 ` A possible winner in pre7-8 Rajagopal Ananthanarayanan
2000-05-09 17:33 ` Juan J. Quintela
2000-05-10 1:59 ` Roger Larsson
2000-05-10 22:13 ` [plastic bag] " Roger Larsson
2000-05-10 21:23 ` Rik van Riel
2000-05-10 3:29 ` Juan J. Quintela
2000-05-10 15:31 ` Linus Torvalds
2000-05-10 16:04 ` Juan J. Quintela
2000-05-10 18:11 ` Rik van Riel
2000-05-10 18:21 ` Linus Torvalds