linux-mm.kvack.org archive mirror
 help / color / mirror / Atom feed
From: Linus Torvalds <torvalds@osdl.org>
To: Nick Piggin <nickpiggin@yahoo.com.au>
Cc: Andrew Morton <akpm@osdl.org>,
	Linux Memory Management <linux-mm@kvack.org>,
	linux-kernel <linux-kernel@vger.kernel.org>
Subject: Re: [RFC][PATCH 0/3] beat kswapd with the proverbial clue-bat
Date: Sun, 5 Sep 2004 09:49:31 -0700 (PDT)	[thread overview]
Message-ID: <Pine.LNX.4.58.0409050911450.2331@ppc970.osdl.org> (raw)
In-Reply-To: <413AA7B2.4000907@yahoo.com.au>


On Sun, 5 Sep 2004, Nick Piggin wrote:
> 
> So my solution? Just teach kswapd and the watermark code about higher
> order allocations in a fairly simple way. If pages_low is (say), 1024KB,
> we now also require 512KB of order-1 and above pages, 256K of order-2
> and up, 128K of order 3, etc. (perhaps we should stop at about order-3?)

I'm pretty sure we did something like this (where we not only required a
certain amount of memory free, but also a certain buddy availability) for
a short while in the 2.3.x timeframe or something like that, and that it
caused problems for some machines that just kept on trying to free memory.

I wrote a program at the time to check why, but I've long since lost it, 
so I re-created it just now (probably buggy as hell), and append it here 
for you to debug the dang thing.

In it's first rough draft (ie "it might be totally wrong") using this 
script:

	for i in 4 64 128 256 512 1024 2048 4096; do
		echo $i megabytes:
		./a.out $(($i*256)) pages
		echo
	done

I get

	4 megabytes:
	Got 2**1 buddy in 63 iterations (6%)
	Got 2**2 buddy in 199 iterations (19%)
	Got 2**3 buddy in 623 iterations (60%)

	64 megabytes:
	Got 2**1 buddy in 66 iterations (0%)
	Got 2**2 buddy in 1541 iterations (9%)
	Got 2**3 buddy in 4872 iterations (29%)

	128 megabytes:
	Got 2**1 buddy in 66 iterations (0%)
	Got 2**2 buddy in 2865 iterations (8%)
	Got 2**3 buddy in 7247 iterations (22%)

	256 megabytes:
	Got 2**1 buddy in 66 iterations (0%)
	Got 2**2 buddy in 5593 iterations (8%)
	Got 2**3 buddy in 13163 iterations (20%)

	512 megabytes:
	Got 2**1 buddy in 66 iterations (0%)
	Got 2**2 buddy in 10897 iterations (8%)
	Got 2**3 buddy in 30265 iterations (23%)

	1024 megabytes:
	Got 2**1 buddy in 1539 iterations (0%)
	Got 2**2 buddy in 20664 iterations (7%)
	Got 2**3 buddy in 67737 iterations (25%)

	2048 megabytes:
	Got 2**1 buddy in 1539 iterations (0%)
	Got 2**2 buddy in 32541 iterations (6%)
	Got 2**3 buddy in 122437 iterations (23%)

	4096 megabytes:
	Got 2**1 buddy in 1539 iterations (0%)
	Got 2**2 buddy in 38852 iterations (3%)
	Got 2**3 buddy in 220501 iterations (21%)

ie it shows how it's _really_ hard to get high-order buddies if you don't 
have a lot of memory.

Now, this test-result is not "real life", in that the whole point of the 
buddy allocator is that it is good at trying to keep higher orders free, 
and as such real life should behave much better. But this _is_ pretty 
accurate for the case where we totally ran out of memory and are trying
to randomly free one page here and one page there. That's where buddy 
doesn't much help us, so this kind of shows the worst case.

(Rule: you should not allow the machine to get to this point, exactly 
because at the < ~5% free point, buddy stops being very effective).

Notice how you may need to free 20% of memory to get a 2**3 allocation, if 
you have totally depleted your pages. And it's much worse if you have very 
little memory to begin with.

Anyway. I haven't debugged this program, so it may have serious bugs, and 
be off by an order of magnitude or two. Whatever. If I'm wrong, somebody 
can fix the program/script and see what the real numbers are.

		Linus

----
#include <stdlib.h>

#define MAX_NR_PAGES (1024*1024)

#define DECLARE_BITMAP(x,size) \
	unsigned int x[((size)+31)/32];


DECLARE_BITMAP(first, MAX_NR_PAGES);
DECLARE_BITMAP(second, MAX_NR_PAGES/2);
DECLARE_BITMAP(third, MAX_NR_PAGES/4);
DECLARE_BITMAP(fourth, MAX_NR_PAGES/8);

static int switch_bit(int nr, unsigned int *bitmap)
{
	unsigned int mask = 1U << (nr & 31);
	unsigned int *map = bitmap + (nr >> 5);
	unsigned int old = *map;

	*map = old ^ mask;
	return (old & mask) != 0;
}

static int set_bit(int nr, unsigned int *bitmap)
{
	unsigned int mask = 1U << (nr & 31);
	unsigned int *map = bitmap + (nr >> 5);
	unsigned int old = *map;

	*map = old | mask;
	return (old & mask) != 0;
}

static void fill_one(int nr, int max, int random)
{
	random >>= 1;
	if (switch_bit(random, second)) {
		static int hit = 0;

		if (!hit) {
			hit = 1;
			printf("Got 2**1 buddy in %d iterations (%d%%)\n", nr, nr*100/max);
		}
		random >>= 1;
		if (switch_bit(random, third)) {
			static int hit = 0;

			if (!hit) {
				hit = 1;
				printf("Got 2**2 buddy in %d iterations (%d%%)\n", nr, nr*100/max);
			}
			random >>= 1;
			if (switch_bit(random, fourth)) {
				printf("Got 2**3 buddy in %d iterations (%d%%)\n", nr, nr*100/max);
				exit(0);
			}
		}
	}
}

int main(int argc, char **argv)
{
	int i;
	int nrpages;

	if (argc < 2)
		return -1;
	nrpages = atoi(argv[1]);
	if (nrpages < 1 || nrpages > MAX_NR_PAGES)
		return -1;
	srandom(time(NULL));

	i = 0;
	do {
		unsigned r = random() % nrpages;
		if (set_bit(r, first))
			continue;
		i++;
		fill_one(i, nrpages, r);
	} while (i < nrpages * 3 / 4);
	printf("Filled 75% (%d)\n", i);
	return -1;
}
--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"aart@kvack.org"> aart@kvack.org </a>

  parent reply	other threads:[~2004-09-05 16:49 UTC|newest]

Thread overview: 32+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2004-09-05  5:44 Nick Piggin
2004-09-05  5:45 ` [RFC][PATCH 1/3] account free buddy areas Nick Piggin
2004-09-05  5:46   ` [RFC][PATCH 2/3] alloc-order watermarks Nick Piggin
2004-09-05  5:47     ` [RFC][PATCH 3/3] teach kswapd about watermarks Nick Piggin
2004-09-05  6:04       ` David S. Miller
2004-09-05  6:20         ` Nick Piggin
2004-09-05  5:50     ` [RFC][PATCH 2/3] alloc-order watermarks Nick Piggin
2004-09-05  6:13   ` [RFC][PATCH 1/3] account free buddy areas Nick Piggin
2004-09-05  6:02 ` [RFC][PATCH 0/3] beat kswapd with the proverbial clue-bat David S. Miller
2004-09-05  6:16   ` Nick Piggin
2004-09-05 10:13     ` Nick Piggin
2004-09-05 17:24       ` Linus Torvalds
2004-09-05 17:36         ` Martin J. Bligh
2004-09-05 17:37         ` Arjan van de Ven
2004-09-05 17:58           ` Linus Torvalds
2004-09-05 18:41             ` Arjan van de Ven
2004-09-06  1:35             ` Nick Piggin
2004-09-15 13:27             ` Jörn Engel
2004-09-15 13:29               ` Arjan van de Ven
2004-09-15 13:34                 ` Jörn Engel
2004-09-15 13:39                   ` Arjan van de Ven
2004-09-15 14:18                     ` Jörn Engel
2004-09-06  1:09         ` Nick Piggin
2004-09-05  6:09 ` Andrew Morton
2004-09-05  6:26   ` Nick Piggin
2004-09-05  6:27   ` Anton Blanchard
2004-09-05 10:09     ` Nick Piggin
2004-09-06  3:33       ` David S. Miller
2004-09-06  8:55         ` Nick Piggin
2004-09-05 16:49 ` Linus Torvalds [this message]
2004-09-06  0:54   ` Nick Piggin
2004-09-06  1:49     ` Nick Piggin

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=Pine.LNX.4.58.0409050911450.2331@ppc970.osdl.org \
    --to=torvalds@osdl.org \
    --cc=akpm@osdl.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-mm@kvack.org \
    --cc=nickpiggin@yahoo.com.au \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox