* new allocation algorithm
@ 1998-03-27 9:03 Rik van Riel
1998-03-27 17:30 ` Linus Torvalds
0 siblings, 1 reply; 5+ messages in thread
From: Rik van Riel @ 1998-03-27 9:03 UTC (permalink / raw)
To: Linus Torvalds; +Cc: Stephen C. Tweedie, linux-mm
Hi Linus and Stephen,
I just came up with the idea of using an ext2 like algorithm
for memory allocation, in which we:
- group memory in 128 PAGE groups
- have one unsigned char counter per group, counting the number
of used pages
- allocate with the goal to fill the busiest group
- penalize pages (page->age) when the group is relatively empty
- thereby shift memory usage from empty groups to full groups, making
larger allocation in empty groups easier
- PLUS we get rid of the 'randomness' in the current system, where we
sometimes swap out over half of total memory in one sweep while
still not reaching our goals
Of course, such an algorithm would be relatively expensive...
But could it be worth it?
Trading off the tradeoffs,
Rik.
+-------------------------------------------+--------------------------+
| Linux: - LinuxHQ MM-patches page | Scouting webmaster |
| - kswapd ask-him & complain-to guy | Vries cubscout leader |
| http://www.fys.ruu.nl/~riel/ | <H.H.vanRiel@fys.ruu.nl> |
+-------------------------------------------+--------------------------+
^ permalink raw reply [flat|nested] 5+ messages in thread
* Re: new allocation algorithm
1998-03-27 9:03 new allocation algorithm Rik van Riel
@ 1998-03-27 17:30 ` Linus Torvalds
1998-03-27 18:33 ` Benjamin C.R. LaHaise
` (2 more replies)
0 siblings, 3 replies; 5+ messages in thread
From: Linus Torvalds @ 1998-03-27 17:30 UTC (permalink / raw)
To: Rik van Riel; +Cc: Stephen C. Tweedie, linux-mm
On Fri, 27 Mar 1998, Rik van Riel wrote:
>
> I just came up with the idea of using an ext2 like algorithm
> for memory allocation, in which we:
> - group memory in 128 PAGE groups
> - have one unsigned char counter per group, counting the number
> of used pages
Let's wait with how well the current setup works. It seems to perform
reasonably well even on smaller machines (modulo your patch), and I think
we'd better more-or-less freeze it waiting for further info on what people
actually think.
The current scheme is fairly efficient and extremely stable, and gives
good behaviour for the cases we _really_ care about (pageorders 0, 1 and
to some degree 2). It comes reasonably close to working for the higher
orders too, but they really aren't as critical..
Linus
^ permalink raw reply [flat|nested] 5+ messages in thread
* Re: new allocation algorithm
1998-03-27 17:30 ` Linus Torvalds
@ 1998-03-27 18:33 ` Benjamin C.R. LaHaise
1998-03-30 15:07 ` Rik van Riel
1998-04-01 20:28 ` Stephen C. Tweedie
2 siblings, 0 replies; 5+ messages in thread
From: Benjamin C.R. LaHaise @ 1998-03-27 18:33 UTC (permalink / raw)
To: Linus Torvalds; +Cc: Rik van Riel, Stephen C. Tweedie, linux-mm
On Fri, 27 Mar 1998, Linus Torvalds wrote:
> The current scheme is fairly efficient and extremely stable, and gives
> good behaviour for the cases we _really_ care about (pageorders 0, 1 and
> to some degree 2). It comes reasonably close to working for the higher
> orders too, but they really aren't as critical..
As a 'useful gathering of statistics' measure, could something along the
lines of the following pseudo-patch be added to 2.1.92? This way we'll
learn of any memory allocation failures, rather than syscalls failing or
SIGSEGVs occurring and people not knowing what's up.
in __get_free_pages:
nopage:
+{ static long last_nomem_jiffies, nomem_fails;
+ if ((jiffies - last_nomem_jiffies) >= 2*HZ) {
+ printk("__get_free_pages(%x, %lu) failed (%ld)\n",
+ gfp_mask, order, ++nomem_fails);
+ last_nomem_jiffies = jiffies;
+ }
+}
return 0;
-ben
^ permalink raw reply [flat|nested] 5+ messages in thread
* Re: new allocation algorithm
1998-03-27 17:30 ` Linus Torvalds
1998-03-27 18:33 ` Benjamin C.R. LaHaise
@ 1998-03-30 15:07 ` Rik van Riel
1998-04-01 20:28 ` Stephen C. Tweedie
2 siblings, 0 replies; 5+ messages in thread
From: Rik van Riel @ 1998-03-30 15:07 UTC (permalink / raw)
To: Linus Torvalds; +Cc: Stephen C. Tweedie, linux-mm
On Fri, 27 Mar 1998, Linus Torvalds wrote:
> On Fri, 27 Mar 1998, Rik van Riel wrote:
> >
> > I just came up with the idea of using an ext2 like algorithm
> > for memory allocation, in which we:
> > - group memory in 128 PAGE groups
> > - have one unsigned char counter per group, counting the number
> > of used pages
>
> Let's wait with how well the current setup works. It seems to perform
> reasonably well even on smaller machines (modulo your patch), and I think
> we'd better more-or-less freeze it waiting for further info on what people
> actually think.
At the moment, all that's freezing are the small-memory
machines :-(
> The current scheme is fairly efficient and extremely stable, and gives
> good behaviour for the cases we _really_ care about (pageorders 0, 1 and
> to some degree 2). It comes reasonably close to working for the higher
> orders too, but they really aren't as critical..
However, when we:
- allocate one page out of a 128 area
- we continue allocating pages out of that area, even when
- several 16k area's are freed and
- we are not able to free another large area again, so:
- the system swaps to death
This is not hypothetical, I've seen it happen :-(
Another way to do it, is to have machines exit with different
kswapd tests, as in:
if (num_physpages < arbitrary_limit && free_memory_available(2))
break;
Rik.
+-------------------------------------------+--------------------------+
| Linux: - LinuxHQ MM-patches page | Scouting webmaster |
| - kswapd ask-him & complain-to guy | Vries cubscout leader |
| http://www.fys.ruu.nl/~riel/ | <H.H.vanRiel@fys.ruu.nl> |
+-------------------------------------------+--------------------------+
^ permalink raw reply [flat|nested] 5+ messages in thread
* Re: new allocation algorithm
1998-03-27 17:30 ` Linus Torvalds
1998-03-27 18:33 ` Benjamin C.R. LaHaise
1998-03-30 15:07 ` Rik van Riel
@ 1998-04-01 20:28 ` Stephen C. Tweedie
2 siblings, 0 replies; 5+ messages in thread
From: Stephen C. Tweedie @ 1998-04-01 20:28 UTC (permalink / raw)
To: Linus Torvalds; +Cc: Rik van Riel, Stephen C. Tweedie, linux-mm
Hi,
On Fri, 27 Mar 1998 09:30:40 -0800 (PST), Linus Torvalds
<torvalds@transmeta.com> said:
> The current scheme is fairly efficient and extremely stable, and gives
> good behaviour for the cases we _really_ care about (pageorders 0, 1 and
> to some degree 2). It comes reasonably close to working for the higher
> orders too, but they really aren't as critical..
Sorry to put a spanner in the works at this stage, but there's
something we haven't really considered yet in the page balancing. The
aim I'm currently working towards is to eliminate free memory as much
as possible, by replacing "free" space with reserved, lazy-reclaimed
cache memory. We ought to be able to maintain 5-10% memory in this
form with much less performance impact than we would have if that
memory was truly free, but the downside is that this nearly-free
memory is not on our free page lists and therefore we have no simple
way of assessing the fragmentation of the lazy-reclaimable pages.
Now, this is both a blessing and a curse. The positive side is that
we can do what to some extent happens today, and keep as much memory
as possible on the lazy list in the blind hope that we will be able to
find free higher order pages when we need them by returning lazy pages
to the free list one by one. The drawback is that we don't have an
easy way of suspending kswapd when we get enough free higher order
pages.
This is just an observation --- we cannot tune this stuff until it is
stable enough to integrate, but the impact on our free space
heuristics may be worth thinking about now.
--Stephen
^ permalink raw reply [flat|nested] 5+ messages in thread
end of thread, other threads:[~1998-04-02 19:23 UTC | newest]
Thread overview: 5+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
1998-03-27 9:03 new allocation algorithm Rik van Riel
1998-03-27 17:30 ` Linus Torvalds
1998-03-27 18:33 ` Benjamin C.R. LaHaise
1998-03-30 15:07 ` Rik van Riel
1998-04-01 20:28 ` Stephen C. Tweedie
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox