linux-mm.kvack.org archive mirror
 help / color / mirror / Atom feed
* [patch] 2.2.15-pre3 kswapd fix
@ 2000-01-21  1:55 Rik van Riel
  2000-01-21  2:18 ` Andrea Arcangeli
  0 siblings, 1 reply; 4+ messages in thread
From: Rik van Riel @ 2000-01-21  1:55 UTC (permalink / raw)
  To: Alan Cox; +Cc: Linux MM, Linux Kernel, Andrea Arcangeli

Hi Alan,

here's a much cleaner (and hopefully slightly higher
performance) fix of the kswapd problem. The code is
also more readable now...

About Andrea's freepages.low vs. freepages.min problem,
I propose we chose .low here since it's value is higher
and we're trying to solve a reliability problem here.
We'll find out what to do with the freepages.min once
we need it ... in 2.3 it _will_ be used.

regards,

Rik
--
The Internet is not a network of computers. It is a network
of people. That is its real strength.




--- linux-2.2.15-pre3/mm/vmscan.c.orig	Wed Jan 19 21:18:54 2000
+++ linux-2.2.15-pre3/mm/vmscan.c	Fri Jan 21 02:46:42 2000
@@ -485,18 +485,16 @@
 		 * the processes needing more memory will wake us
 		 * up on a more timely basis.
 		 */
-		interruptible_sleep_on_timeout(&kswapd_wait, HZ);
 		while (nr_free_pages < freepages.high)
 		{
-			if (do_try_to_free_pages(GFP_KSWAPD))
-			{
-				if (tsk->need_resched)
-					schedule();
-				continue;
-			}
-			tsk->state = TASK_INTERRUPTIBLE;
-			schedule_timeout(10*HZ);
+			if (!do_try_to_free_pages(GFP_KSWAPD))
+				break;
+			if (tsk->need_resched)
+				schedule();
 		}
+		run_task_queue(&tq_disk);
+		tsk->state = TASK_INTERRUPTIBLE;
+		schedule_timeout(HZ);
 	}
 }
 
@@ -509,18 +507,16 @@
  * from user processes, because the locking issues are
  * nasty to the extreme (file write locks, and MM locking)
  *
- * One option might be to let kswapd do all the page-out
- * and VM page table scanning that needs locking, and this
- * process thread could do just the mmap shrink stage that
- * can be done by just dropping cached pages without having
- * any deadlock issues.
+ * If we're on or just slighly below freepages.low, kswapd
+ * should manage on its own, we just give it a nudge. This
+ * should also reduce contention for the kernel lock above.
  */
 int try_to_free_pages(unsigned int gfp_mask)
 {
 	int retval = 1;
 
 	wake_up_interruptible(&kswapd_wait);
-	if (gfp_mask & __GFP_WAIT)
+	if ((gfp_mask & __GFP_WAIT) && (nr_free_pages < (freepages.low - 4)))
 		retval = do_try_to_free_pages(gfp_mask);
 	return retval;
 }
--- linux-2.2.15-pre3/mm/page_alloc.c.orig	Wed Jan 19 21:32:05 2000
+++ linux-2.2.15-pre3/mm/page_alloc.c	Wed Jan 19 21:42:00 2000
@@ -212,7 +212,7 @@
 	if (!(current->flags & PF_MEMALLOC)) {
 		int freed;
 
-		if (nr_free_pages > freepages.min) {
+		if (nr_free_pages > freepages.low) {
 			if (!low_on_memory)
 				goto ok_to_allocate;
 			if (nr_free_pages >= freepages.high) {

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.nl.linux.org/Linux-MM/

^ permalink raw reply	[flat|nested] 4+ messages in thread

* Re: [patch] 2.2.15-pre3 kswapd fix
  2000-01-21  1:55 [patch] 2.2.15-pre3 kswapd fix Rik van Riel
@ 2000-01-21  2:18 ` Andrea Arcangeli
  2000-01-21  2:29   ` Rik van Riel
  0 siblings, 1 reply; 4+ messages in thread
From: Andrea Arcangeli @ 2000-01-21  2:18 UTC (permalink / raw)
  To: Rik van Riel; +Cc: Alan Cox, Linux MM, Linux Kernel

On Fri, 21 Jan 2000, Rik van Riel wrote:

>About Andrea's freepages.low vs. freepages.min problem,

It's not a problem. It's just that low is unused and you are now making
min unused. You are simply increasing the min value.

>I propose we chose .low here since it's value is higher
>and we're trying to solve a reliability problem here.

So just increase freepages.min as somebody does with sysctl.

>+++ linux-2.2.15-pre3/mm/vmscan.c	Fri Jan 21 02:46:42 2000
>@@ -485,18 +485,16 @@
> 		 * the processes needing more memory will wake us
> 		 * up on a more timely basis.
> 		 */
>-		interruptible_sleep_on_timeout(&kswapd_wait, HZ);
> 		while (nr_free_pages < freepages.high)
> 		{
>-			if (do_try_to_free_pages(GFP_KSWAPD))
>-			{
>-				if (tsk->need_resched)
>-					schedule();
>-				continue;
>-			}
>-			tsk->state = TASK_INTERRUPTIBLE;
>-			schedule_timeout(10*HZ);
>+			if (!do_try_to_free_pages(GFP_KSWAPD))
>+				break;
>+			if (tsk->need_resched)
>+				schedule();
> 		}
>+		run_task_queue(&tq_disk);
>+		tsk->state = TASK_INTERRUPTIBLE;
>+		schedule_timeout(HZ);

How do you get a wakeup now? :) now it's pure too slow polling.

> 	wake_up_interruptible(&kswapd_wait);
>-	if (gfp_mask & __GFP_WAIT)
>+	if ((gfp_mask & __GFP_WAIT) && (nr_free_pages < (freepages.low - 4)))

Again treshing_mem heuristic broken...

Andrea

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux.eu.org/Linux-MM/

^ permalink raw reply	[flat|nested] 4+ messages in thread

* Re: [patch] 2.2.15-pre3 kswapd fix
  2000-01-21  2:18 ` Andrea Arcangeli
@ 2000-01-21  2:29   ` Rik van Riel
  2000-01-21 13:02     ` Andrea Arcangeli
  0 siblings, 1 reply; 4+ messages in thread
From: Rik van Riel @ 2000-01-21  2:29 UTC (permalink / raw)
  To: Andrea Arcangeli; +Cc: Alan Cox, Linux MM, Linux Kernel

On Fri, 21 Jan 2000, Andrea Arcangeli wrote:
> On Fri, 21 Jan 2000, Rik van Riel wrote:
> 
> >About Andrea's freepages.low vs. freepages.min problem,
> 
> It's not a problem. It's just that low is unused and you are now making
> min unused. You are simply increasing the min value.

More than that. Min is supposed to be the absolute boundary
below which nothing can allocate memory (except for ATOMIC
allocations). It's not with the old code, that's broken.

> >+++ linux-2.2.15-pre3/mm/vmscan.c	Fri Jan 21 02:46:42 2000
> >@@ -485,18 +485,16 @@
> > 		 * the processes needing more memory will wake us
> > 		 * up on a more timely basis.
> > 		 */
> >-		interruptible_sleep_on_timeout(&kswapd_wait, HZ);
> > 		while (nr_free_pages < freepages.high)
> > 		{
> >-			if (do_try_to_free_pages(GFP_KSWAPD))
> >-			{
> >-				if (tsk->need_resched)
> >-					schedule();
> >-				continue;
> >-			}
> >-			tsk->state = TASK_INTERRUPTIBLE;
> >-			schedule_timeout(10*HZ);
> >+			if (!do_try_to_free_pages(GFP_KSWAPD))
> >+				break;
> >+			if (tsk->need_resched)
> >+				schedule();
> > 		}
> >+		run_task_queue(&tq_disk);
> >+		tsk->state = TASK_INTERRUPTIBLE;
> >+		schedule_timeout(HZ);
> 
> How do you get a wakeup now? :) now it's pure too slow polling.

I copied this from the 2.3 code in the expectation that
schedule_timeout(HZ); is interruptible. If it's not, then
we've just found a 2.3 bug :)

> > 	wake_up_interruptible(&kswapd_wait);
> >-	if (gfp_mask & __GFP_WAIT)
> >+	if ((gfp_mask & __GFP_WAIT) && (nr_free_pages < (freepages.low - 4)))
> 
> Again treshing_mem heuristic broken...

Please explain to me why this is broken? I've carefully
explained why this makes sense and you haven't given
any practical explanation on your point of view...

regards,

Rik
--
The Internet is not a network of computers. It is a network
of people. That is its real strength.

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux.eu.org/Linux-MM/

^ permalink raw reply	[flat|nested] 4+ messages in thread

* Re: [patch] 2.2.15-pre3 kswapd fix
  2000-01-21  2:29   ` Rik van Riel
@ 2000-01-21 13:02     ` Andrea Arcangeli
  0 siblings, 0 replies; 4+ messages in thread
From: Andrea Arcangeli @ 2000-01-21 13:02 UTC (permalink / raw)
  To: Rik van Riel; +Cc: Alan Cox, Linux MM, Linux Kernel

On Fri, 21 Jan 2000, Rik van Riel wrote:

>More than that. Min is supposed to be the absolute boundary
>below which nothing can allocate memory (except for ATOMIC
>allocations).

That's definitely not true. min is supposed to be the watermark that force
the task to block but if you succeed freeing some memory, then you go
uncoditionally ahaead also in the GFP_USER case if you succeed to free
some memory despite of the nr_free_pages level. GFP_KERNEL go ahead even
if it's not been possible to free some memory.

> It's not with the old code, that's broken.

The old code is been implemented in mid 2.1.x and it's still the same in
2.2.14.

>> How do you get a wakeup now? :) now it's pure too slow polling.
>
>I copied this from the 2.3 code in the expectation that
>schedule_timeout(HZ); is interruptible. If it's not, then
>we've just found a 2.3 bug :)

Nooo 2.3 has a pointer to the task struct so you wakeup the task directly
but that's very wrong (and I had not yet time to fix it in 2.3.x) because
you'll wakeup kswapd even when it's waiting for I/O completation so
causing not necessary scheduling and so on. It was buggy also in 2.2.13
and I fixed it in 2.2.14.

>> > 	wake_up_interruptible(&kswapd_wait);
>> >-	if (gfp_mask & __GFP_WAIT)
>> >+	if ((gfp_mask & __GFP_WAIT) && (nr_free_pages < (freepages.low - 4)))
>> 
>> Again treshing_mem heuristic broken...
>
>Please explain to me why this is broken? I've carefully
>explained why this makes sense and you haven't given
>any practical explanation on your point of view...

I given the explanation in my first comment about it.

You must _block_ the task until the watermark returns to _high_
again. This ensure that a trashing program will do all the work to free
memory and thus ensure the trashing program won't be able to trash the
machine because it will have to do all the hard work.

Andrea

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux.eu.org/Linux-MM/

^ permalink raw reply	[flat|nested] 4+ messages in thread

end of thread, other threads:[~2000-01-21 13:02 UTC | newest]

Thread overview: 4+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2000-01-21  1:55 [patch] 2.2.15-pre3 kswapd fix Rik van Riel
2000-01-21  2:18 ` Andrea Arcangeli
2000-01-21  2:29   ` Rik van Riel
2000-01-21 13:02     ` Andrea Arcangeli

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox