From: Linus Torvalds <torvalds@transmeta.com>
To: "Stephen C. Tweedie" <sct@redhat.com>
Cc: riel@nl.linux.org, Andrea Arcangeli <andrea@suse.de>, linux-mm@kvack.org
Subject: Re: 2.3.x mem balancing
Date: Wed, 26 Apr 2000 09:44:28 -0700 (PDT)
Message-ID: <Pine.LNX.4.10.10004260929340.1492-100000@penguin.transmeta.com>
In-Reply-To: <20000426122448.G3792@redhat.com>
On Wed, 26 Apr 2000, Stephen C. Tweedie wrote:
>
> We just shouldn't need to keep much memory free.
>
> I'd much rather see a scheme in which we have two separate goals for
> the VM. Goal one would be to keep a certain number of free pages in
> each class, for use by atomic allocations. Goal two would be to have
> a minimum number of pages in each class either free or on a global LRU
> list which contains only pages known to be clean and unmapped (and
> hence available for instant freeing without IO).
This would work. However, there is a rather subtle issue with allocating
contiguous chunks of memory - something that is frowned upon, but however
hard we've tried there have always been people that really need to do it.
And that subtle issue is that in order for the buddy system to work for
contiguous areas, you cannot have "free" pages _outside_ the buddy system.
The reason the buddy system works for contiguous allocations >1 pages is
_not_ simply that it has the data structures to keep track of power-of-
two pages. The bigger reason for why the buddy system works at all is that
it is inherently anti-fragmenting - whenever there are free pages, the
buddy system coalesces them, and has a very strong bias to returning
already-fragmented areas over contiguous areas on new allocations.
This advantage of the buddy system is also why keeping a "free list" is
not actually necessarily that great of an idea. Because the free list will
make fragmentation much worse by not allowing the coalescing - which in
turn is needed in order to try to keep future allocations from fragmenting
the heap more.
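
To make the coalescing argument above concrete, here is a toy buddy
allocator in C. It is only a sketch under stated assumptions, not the
kernel's implementation: a single 16-page zone, a flat free_order[] array
instead of per-order lists, and the hypothetical names buddy_init(),
buddy_alloc(), buddy_free() and largest_free_order(). It shows the two
properties discussed: freeing merges a block with its buddy whenever the
buddy is also free, and allocation prefers the smallest (already
fragmented) block that fits.

```c
#include <string.h>

#define MAX_ORDER 4                    /* toy zone: 2^4 = 16 pages */
#define NPAGES    (1u << MAX_ORDER)

/* free_order[i] == k  : a free block of 2^k pages starts at page i
 * free_order[i] == -1 : no free block starts at page i            */
static int free_order[NPAGES];

static void buddy_init(void)
{
    memset(free_order, -1, sizeof(free_order));
    free_order[0] = MAX_ORDER;         /* one fully coalesced block */
}

/* Allocate 2^order pages; returns the first page index or -1.
 * Smaller blocks are searched first, so big contiguous blocks are
 * only split when nothing already-fragmented fits. */
static int buddy_alloc(int order)
{
    for (int k = order; k <= MAX_ORDER; k++) {
        for (unsigned i = 0; i < NPAGES; i += 1u << k) {
            if (free_order[i] != k)
                continue;
            free_order[i] = -1;
            while (k > order) {        /* split, freeing the upper half */
                k--;
                free_order[i + (1u << k)] = k;
            }
            return (int)i;
        }
    }
    return -1;
}

/* Free 2^order pages starting at 'page', merging with the buddy
 * block for as long as that buddy is free at the same order. */
static void buddy_free(unsigned page, int order)
{
    while (order < MAX_ORDER) {
        unsigned buddy = page ^ (1u << order);
        if (free_order[buddy] != order)
            break;                     /* buddy busy: cannot coalesce */
        free_order[buddy] = -1;
        page &= ~(1u << order);        /* merged block starts at the lower half */
        order++;
    }
    free_order[page] = order;
}

/* Largest contiguous block currently available, as an order (-1 if none). */
static int largest_free_order(void)
{
    int best = -1;
    for (unsigned i = 0; i < NPAGES; i++)
        if (free_order[i] > best)
            best = free_order[i];
    return best;
}
```

In this toy, if pages 0 and 1 are allocated and only page 0 goes back
through buddy_free() while page 1 sits on some separate "almost free"
list, largest_free_order() reports 3: the 16-page block cannot be rebuilt.
Once page 1 is also handed to buddy_free(), the blocks merge all the way
up and it reports 4 again.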
And yes, part of having memory free is to have low latency - one of the
huge advantages of kswapd is that it allows us to do background freeing so
that the perceived latency to the occasional page allocator is great. And
that is important, and the "almost free" list would work quite well for
that.
However, the contiguous area concern is also a real concern. That is why I
want to keep "alloc_page()" and "free_page()" as the main memory
allocators: the buddy system is certainly not the fastest memory allocator
around, but it's so far the only one I've seen that has reasonable
behaviour wrt contiguous areas without excessive overhead.
[ Side comment: maybe somebody remembers the _original_ page allocator in
Linux. It was based on a very very simple linked list of free pages -
and it was fast as hell. There is absolutely no allocator that does it
faster: getting a new page was not just constant time, but it was just a
few cycles. FAST. The reason I moved to the buddy allocator was that the
flexibility of being able to allocate two or four pages at a time
outweighed the speed disadvantage. I'd hate for people to unwittingly
lose that advantage by just not thinking about these issues.. ]
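
For comparison, a linked list of free pages of the kind described in the
side comment can be sketched as below. This is an illustration only, not
the historical Linux code: the pool size and the names toy_pool_init(),
toy_get_page() and toy_put_page() are made up for the example. Each free
page stores the pointer to the next free page in its own first bytes, so
getting or freeing a page is a couple of pointer moves, but nothing in the
structure can hand out two adjacent pages as a unit.

```c
#include <stddef.h>

#define TOY_PAGE_SIZE 4096
#define TOY_NPAGES    8

/* A free page holds the list link in its own first bytes. */
struct free_page {
    struct free_page *next;
};

/* The union keeps every page suitably aligned for the link. */
static union {
    struct free_page link;
    unsigned char bytes[TOY_PAGE_SIZE];
} toy_pool[TOY_NPAGES];

static struct free_page *toy_free_list;

/* Put every page of the pool on the free list. */
static void toy_pool_init(void)
{
    toy_free_list = NULL;
    for (int i = TOY_NPAGES - 1; i >= 0; i--) {
        toy_pool[i].link.next = toy_free_list;
        toy_free_list = &toy_pool[i].link;
    }
}

/* Constant time: pop the head of the list (NULL when empty). */
static void *toy_get_page(void)
{
    struct free_page *p = toy_free_list;
    if (p)
        toy_free_list = p->next;
    return p;
}

/* Constant time: push the page back on the head of the list. */
static void toy_put_page(void *page)
{
    struct free_page *p = page;
    p->next = toy_free_list;
    toy_free_list = p;
}
```

The list is LIFO, so the order of pages on it depends entirely on the
order of frees; after some churn, neighbouring list entries have no
relation to neighbouring physical pages.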
However, it's certainly true that our memory freeing machinery could be
cleaned up a bit, and having the "two phase" thing encoded explicitly in
the page freeing logic might not be a bad idea. I just wanted to point out
some reasons why it might not be all that sensible to count the "easily
freed queue" as real free memory..
Linus