[RFC][PATCH] __alloc_pages_limit & order > 0

linux-mm.kvack.org archive mirror

* [RFC][PATCH] __alloc_pages_limit & order > 0
@ 2001-08-24 22:49 Roger Larsson
  2001-08-25  3:20 ` Rik van Riel
  0 siblings, 1 reply; 2+ messages in thread
From: Roger Larsson @ 2001-08-24 22:49 UTC
  To: linux-mm


Hi again,

[it compiles, but I have not run it yet]

I read through __alloc_pages again and found out that allocs with order > 0
are not treated nicely.

To begin with, if order > 0 then direct_reclaim will be false even if the
allocation is allowed to wait...

This version allows direct_reclaim with order > 0 !

How?

The same way we finally end up doing it anyway:
by reclaiming pages and freeing them.

While adding this I thought: why not always do it like this,
even with order == 0?
It gives pages a chance to merge to higher orders
before we return a page that was not mergeable...
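
A minimal sketch of the merging argument, outside the kernel and not part of the posted patch. It assumes a toy zone of four pages with a simplified buddy scheme; toy_zone, toy_free and toy_alloc are hypothetical names that only stand in for the roles of __free_page and rmqueue. Freeing a reclaimed page next to an already free buddy produces an order-1 block that an order-1 allocation can then take.

```c
#include <assert.h>
#include <stdbool.h>

#define TOY_MAX_ORDER 3   /* orders 0..2: blocks of 1, 2 or 4 pages */
#define TOY_NPAGES    4   /* the whole toy zone is one order-2 block */

/* free_blk[o][i] is true when the order-o block starting at page i is free. */
struct toy_zone {
	bool free_blk[TOY_MAX_ORDER][TOY_NPAGES];
};

/* Free a block, merging with its buddy for as long as the buddy is free. */
static void toy_free(struct toy_zone *z, unsigned start, unsigned order)
{
	assert(start < TOY_NPAGES && order < TOY_MAX_ORDER);
	assert((start & ((1u << order) - 1)) == 0);   /* naturally aligned */

	while (order + 1 < TOY_MAX_ORDER) {
		unsigned buddy = start ^ (1u << order);

		if (!z->free_blk[order][buddy])
			break;
		z->free_blk[order][buddy] = false;
		start &= ~(1u << order);   /* merged block starts at the lower half */
		order++;
	}
	z->free_blk[order][start] = true;
}

/* Take a block of the requested order, splitting a larger one if needed.
 * Returns the first page of the block, or -1 when nothing fits. */
static int toy_alloc(struct toy_zone *z, unsigned order)
{
	for (unsigned o = order; o < TOY_MAX_ORDER; o++) {
		for (unsigned i = 0; i < TOY_NPAGES; i += 1u << o) {
			if (!z->free_blk[o][i])
				continue;
			z->free_blk[o][i] = false;
			/* Hand the unused upper halves back to the lower orders. */
			while (o > order) {
				o--;
				z->free_blk[o][i + (1u << o)] = true;
			}
			return (int)i;
		}
	}
	return -1;
}
```

With only page 0 free, toy_alloc(&z, 1) fails; after toy_free(&z, 1, 0) the two pages merge and the same call returns page 0.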

Doing this, the code started to collapse...
__alloc_pages_limit could suddenly handle all the special cases!
(with small functional differences)
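
As a rough illustration of how one function can cover the special cases, here is a standalone sketch of the two thresholds the patch below derives from each limit. The field names pages_min, pages_low, pages_high, free_pages and inactive_clean_pages come from the patch; the toy_ types and helper names are assumptions made for this example, and nothing here is kernel code.

```c
#include <assert.h>
#include <stdbool.h>

/* Mirrors the PAGES_* limits in the patch; the names here are hypothetical. */
enum toy_limit {
	TOY_MEMALLOC, TOY_CRITICAL, TOY_MIN, TOY_LOW, TOY_HIGH, TOY_LOW_FREE
};

struct toy_zone_marks {
	unsigned long free_pages, inactive_clean_pages;
	unsigned long pages_min, pages_low, pages_high;
};

struct toy_thresholds {
	unsigned long water_mark;   /* free + inactive_clean must reach this */
	unsigned long free_min;     /* free alone must reach this to allocate */
};

static struct toy_thresholds
toy_thresholds_for(const struct toy_zone_marks *z, enum toy_limit limit)
{
	/* Default matches PAGES_MIN: both thresholds are pages_min. */
	struct toy_thresholds t = { z->pages_min, z->pages_min };

	assert(z);
	switch (limit) {
	case TOY_MEMALLOC:
		t.water_mark = t.free_min = 1;
		break;
	case TOY_CRITICAL:
		t.water_mark = t.free_min = z->pages_min / 4;
		break;
	case TOY_LOW_FREE:
		t.water_mark = t.free_min = z->pages_low;
		break;
	case TOY_LOW:
		t.water_mark = z->pages_low;
		break;
	case TOY_HIGH:
		t.water_mark = z->pages_high;
		break;
	case TOY_MIN:
	default:
		break;
	}
	return t;
}

/* The zone is considered at all only above the watermark. */
static bool toy_zone_may_try(const struct toy_zone_marks *z, enum toy_limit l)
{
	return z->free_pages + z->inactive_clean_pages >=
	       toy_thresholds_for(z, l).water_mark;
}

/* An allocation from the free lists additionally needs free_min free pages. */
static bool toy_zone_may_alloc(const struct toy_zone_marks *z, enum toy_limit l)
{
	return z->free_pages >= toy_thresholds_for(z, l).free_min;
}
```

For a zone with 10 free and 50 inactive clean pages and marks 20/40/60, the LOW_FREE pass may look at the zone but not allocate from it, while the CRITICAL pass (free_min of 5) may allocate.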

Comments?

/RogerL

-- 
Roger Larsson
Skelleftea
Sweden

[-- Attachment #2: patch-2.4.8-pre3-alloc_pages_limit-R1 --]
[-- Type: text/x-diff, Size: 8691 bytes --]

*******************************************
Patch prepared by: roger.larsson@norran.net
Name of file: /home/roger/patches/patch-2.4.8-pre3-alloc_pages_limit-R1

--- linux/mm/page_alloc.c.orig	Thu Aug 23 22:02:04 2001
+++ linux/mm/page_alloc.c	Sat Aug 25 00:34:25 2001
@@ -212,9 +212,12 @@
 	return NULL;
 }
 
-#define PAGES_MIN	0
-#define PAGES_LOW	1
-#define PAGES_HIGH	2
+#define PAGES_MEMALLOC  0
+#define PAGES_CRITICAL  1
+#define PAGES_MIN	2
+#define PAGES_LOW	3
+#define PAGES_HIGH	4
+#define PAGES_LOW_FREE  5
 
 /*
  * This function does the dirty work for __alloc_pages
@@ -228,7 +231,7 @@
 
 	for (;;) {
 		zone_t *z = *(zone++);
-		unsigned long water_mark;
+		unsigned long water_mark, free_min;
 
 		if (!z)
 			break;
@@ -239,10 +242,25 @@
 		 * We allocate if the number of free + inactive_clean
 		 * pages is above the watermark.
 		 */
+
+		free_min = z->pages_min;
+
 		switch (limit) {
+			case PAGES_MEMALLOC:
+				free_min = water_mark = 1;
+				break;
+			case PAGES_CRITICAL:
+				/* XXX: is pages_min/4 a good amount to reserve for this? */
+				free_min = water_mark = z->pages_min / 4;
+				break;
 			default:
 			case PAGES_MIN:
-				water_mark = z->pages_min;
+				water_mark = z->pages_min; /*  + (1 << order) - 1; */
+				break;
+			case PAGES_LOW_FREE:
+				free_min = water_mark = z->pages_low;
+				if (direct_reclaim)
+					printk(KERN_WARNING "__alloc_pages_limit(PAGES_LOW_FREE && direct_reclaim = 1)\n");
 				break;
 			case PAGES_LOW:
 				water_mark = z->pages_low;
@@ -251,23 +269,44 @@
 				water_mark = z->pages_high;
 		}
 
-		if (z->free_pages + z->inactive_clean_pages >= water_mark) {
-			struct page *page = NULL;
-			/* If possible, reclaim a page directly. */
-			if (direct_reclaim)
-				page = reclaim_page(z);
-			/* If that fails, fall back to rmqueue. */
-			if (!page)
-				page = rmqueue(z, order);
-			if (page)
-				return page;
-		}
+
+
+		if (z->free_pages + z->inactive_clean_pages < water_mark) 
+			continue;
+
+		do {
+			/*
+			 * Reclaim a page from the inactive_clean list and
+			 * free it right away, to give reclaimed pages a
+			 * chance to merge to higher orders.
+			 */
+			if (direct_reclaim) {
+				struct page *reclaim = reclaim_page(z);
+				if (reclaim) {
+					__free_page(reclaim);
+				} else if (z->inactive_clean_pages > 0) {
+					printk(KERN_ERR "reclaim_page failed but there are inactive_clean_pages\n");
+					break;
+				}
+			}
+				
+			/* Always alloc via rmqueue */
+			if (z->free_pages >= free_min)
+			{
+				struct page *page = rmqueue(z, order);
+				if (page)
+					return page;
+			}
+
+			/* if it is possible to make progress by retrying - do it */
+		} while (direct_reclaim && z->inactive_clean_pages);
 	}
 
 	/* Found nothing. */
 	return NULL;
 }
 
+
 #ifndef CONFIG_DISCONTIGMEM
 struct page *_alloc_pages(unsigned int gfp_mask, unsigned long order)
 {
@@ -281,7 +320,6 @@
  */
 struct page * __alloc_pages(unsigned int gfp_mask, unsigned long order, zonelist_t *zonelist)
 {
-	zone_t **zone;
 	int direct_reclaim = 0;
 	struct page * page;
 
@@ -300,34 +338,24 @@
 
 	/*
 	 * Can we take pages directly from the inactive_clean
-	 * list?
+	 * list? __alloc_pages_limit now handles any 'order'.
 	 */
-	if (order == 0 && (gfp_mask & __GFP_WAIT))
+	if (gfp_mask & __GFP_WAIT)
 		direct_reclaim = 1;
 
-try_again:
 	/*
 	 * First, see if we have any zones with lots of free memory.
 	 *
 	 * We allocate free memory first because it doesn't contain
 	 * any data ... DUH!
 	 */
-	zone = zonelist->zones;
-	for (;;) {
-		zone_t *z = *(zone++);
-		if (!z)
-			break;
-		if (!z->size)
-			BUG();
+	page = __alloc_pages_limit(zonelist, order, PAGES_LOW_FREE, 0);
+	if (page)
+		return page;
 
-		if (z->free_pages >= z->pages_low) {
-			page = rmqueue(z, order);
-			if (page)
-				return page;
-		} else if (z->free_pages < z->pages_min &&
-					waitqueue_active(&kreclaimd_wait)) {
-				wake_up_interruptible(&kreclaimd_wait);
-		}
+	/* "all" requested zones have less than LOW free memory, start kreclaimd */
+	if (waitqueue_active(&kreclaimd_wait)) {
+		wake_up_interruptible(&kreclaimd_wait);
 	}
 
 	/*
@@ -356,7 +384,7 @@
 
 	/*
 	 * OK, none of the zones on our zonelist has lots
-	 * of pages free.
+	 * of pages free, or a higher-order alloc did not succeed.
 	 *
 	 * We wake up kswapd, in the hope that kswapd will
 	 * resolve this situation before memory gets tight.
@@ -371,6 +399,8 @@
 	 * - if we don't have __GFP_IO set, kswapd may be
 	 *   able to free some memory we can't free ourselves
 	 */
+
+
 	wakeup_kswapd();
 	if (gfp_mask & __GFP_WAIT) {
 		__set_current_state(TASK_RUNNING);
@@ -385,6 +415,7 @@
 	 * Kswapd should, in most situations, bring the situation
 	 * back to normal in no time.
 	 */
+try_again:
 	page = __alloc_pages_limit(zonelist, order, PAGES_MIN, direct_reclaim);
 	if (page)
 		return page;
@@ -398,40 +429,21 @@
 	 * - we're /really/ tight on memory
 	 * 	--> try to free pages ourselves with page_launder
 	 */
-	if (!(current->flags & PF_MEMALLOC)) {
+	if (!(current->flags & PF_MEMALLOC) &&
+	    (gfp_mask & __GFP_WAIT)) {
 		/*
-		 * Are we dealing with a higher order allocation?
-		 *
-		 * Move pages from the inactive_clean to the free list
-		 * in the hope of creating a large, physically contiguous
-		 * piece of free memory.
+		 * Move pages from the inactive_dirty to the inactive_clean list.
 		 */
-		if (order > 0 && (gfp_mask & __GFP_WAIT)) {
-			zone = zonelist->zones;
-			/* First, clean some dirty pages. */
-			current->flags |= PF_MEMALLOC;
-			page_launder(gfp_mask, 1);
-			current->flags &= ~PF_MEMALLOC;
-			for (;;) {
-				zone_t *z = *(zone++);
-				if (!z)
-					break;
-				if (!z->size)
-					continue;
-				while (z->inactive_clean_pages) {
-					struct page * page;
-					/* Move one page to the free list. */
-					page = reclaim_page(z);
-					if (!page)
-						break;
-					__free_page(page);
-					/* Try if the allocation succeeds. */
-					page = rmqueue(z, order);
-					if (page)
-						return page;
-				}
-			}
-		}
+
+		/* First, clean some dirty pages. */
+		current->flags |= PF_MEMALLOC;
+		page_launder(gfp_mask, 1);
+		current->flags &= ~PF_MEMALLOC;
+
+		page = __alloc_pages_limit(zonelist, order, PAGES_MIN, direct_reclaim); 
+		if (page)
+			return page;
+
 		/*
 		 * When we arrive here, we are really tight on memory.
 		 * Since kswapd didn't succeed in freeing pages for us,
@@ -447,17 +459,15 @@
 		 * any progress freeing pages, in that case it's better
 		 * to give up than to deadlock the kernel looping here.
 		 */
-		if (gfp_mask & __GFP_WAIT) {
-			if (!order || total_free_shortage()) {
-				int progress = try_to_free_pages(gfp_mask);
-				if (progress || (gfp_mask & __GFP_FS))
-					goto try_again;
-				/*
-				 * Fail in case no progress was made and the
-				 * allocation may not be able to block on IO.
-				 */
-				return NULL;
-			}
+		if (!order || total_free_shortage()) {
+			int progress = try_to_free_pages(gfp_mask);
+			if (progress || (gfp_mask & __GFP_FS))
+				goto try_again;
+			/*
+			 * Fail in case no progress was made and the
+			 * allocation may not be able to block on IO.
+			 */
+			return NULL;
 		}
 	}
 
@@ -471,35 +481,11 @@
 	 * in the system, otherwise it would be just too easy to
 	 * deadlock the system...
 	 */
-	zone = zonelist->zones;
-	for (;;) {
-		zone_t *z = *(zone++);
-		struct page * page = NULL;
-		if (!z)
-			break;
-		if (!z->size)
-			BUG();
-
-		/*
-		 * SUBTLE: direct_reclaim is only possible if the task
-		 * becomes PF_MEMALLOC while looping above. This will
-		 * happen when the OOM killer selects this task for
-		 * instant execution...
-		 */
-		if (direct_reclaim) {
-			page = reclaim_page(z);
-			if (page)
-				return page;
-		}
-
-		/* XXX: is pages_min/4 a good amount to reserve for this? */
-		if (z->free_pages < z->pages_min / 4 &&
-				!(current->flags & PF_MEMALLOC))
-			continue;
-		page = rmqueue(z, order);
-		if (page)
-			return page;
-	}
+	page = __alloc_pages_limit(zonelist, order,
+				   current->flags & PF_MEMALLOC ? PAGES_MEMALLOC : PAGES_CRITICAL,
+				   direct_reclaim); 
+	if (page)
+		return page;
 
 	/* No luck.. */
 	printk(KERN_ERR "__alloc_pages: %lu-order allocation failed.\n", order);


* Re: [RFC][PATCH] __alloc_pages_limit & order > 0
  2001-08-24 22:49 [RFC][PATCH] __alloc_pages_limit & order > 0 Roger Larsson
@ 2001-08-25  3:20 ` Rik van Riel
  0 siblings, 0 replies; 2+ messages in thread
From: Rik van Riel @ 2001-08-25  3:20 UTC
  To: Roger Larsson; +Cc: linux-mm

On Sat, 25 Aug 2001, Roger Larsson wrote:

> To begin with if order > 0 then direct_reclaim will be false even if
> it is allowed to wait...

That's because direct_reclaim can only reclaim 1 page from
the page cache at a time, while a higher-order alloc
needs _multiple_ pages.

Thus, by definition, a direct-reclaim won't satisfy a higher
order allocation.
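
The arithmetic behind this can be sketched in a few lines. This is an illustration only: toy_pages_for_order and toy_single_page_satisfies are hypothetical helpers, not kernel functions, and the only assumption is the usual buddy convention that an order-n allocation is a block of 2^n contiguous pages.

```c
#include <assert.h>
#include <stdbool.h>

/* An order-n allocation asks for 2^n contiguous pages. */
static unsigned long toy_pages_for_order(unsigned int order)
{
	assert(order < 8 * sizeof(unsigned long));
	return 1UL << order;
}

/* A single reclaimed page can be handed out directly only for order 0. */
static bool toy_single_page_satisfies(unsigned int order)
{
	return toy_pages_for_order(order) == 1;
}
```

For order 3 the request is 8 pages, so one page taken from the inactive_clean list can contribute to the allocation only if it is first freed and merges with free neighbours.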

> This version allows direct_reclaim with order > 0 !

The old code already did this, albeit in a very ugly way.

I'd like to see the old code cleaned up, but I'm not too happy
about the main loop being complicated because of these (very rare)
higher-order allocations.

IIRC somebody measured his system one day and 99.5% of the allocs
were 0-order GFP_USER or GFP_KERNEL, so I guess we really want to
keep the multi-order allocs from messing with the main allocation
loop.

Then again, please do clean up the multi-order allocation page
cleaning loop, the way I coded it originally is just plain ugly ;)

regards,

Rik -- after a few drinks, so apply a grain of salt ;)
-- 
IA64: a worthy successor to i860.

http://www.surriel.com/		http://distro.conectiva.com/

Send all your spam to aardvark@nl.linux.org (spam digging piggy)

