* Re: Break 2.4 VM in five easy steps
[not found] ` <3B1E7ABA.EECCBFE0@illusionary.com>
@ 2001-06-06 18:52 ` Eric W. Biederman
2001-06-06 19:06 ` Mike Galbraith
` (2 more replies)
0 siblings, 3 replies; 20+ messages in thread
From: Eric W. Biederman @ 2001-06-06 18:52 UTC (permalink / raw)
To: Derek Glidden; +Cc: linux-kernel, linux-mm
Derek Glidden <dglidden@illusionary.com> writes:
> The problem I reported is not that 2.4 uses huge amounts of swap but
> that trying to recover that swap off of disk under 2.4 can leave the
> machine in an entirely unresponsive state, while 2.2 handles identical
> situations gracefully.
>
The interesting thing from other reports is that it appears to be kswapd
using up CPU resources, not the swapout code at all. So it appears
to be a fundamental VM issue, and calling swapoff is just a good way
to trigger it.
If you could confirm this by calling swapoff at some time other than
reboot, say while running top on the console, that would help.
Eric
--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org. For more info on Linux MM,
see: http://www.linux-mm.org/
^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: Break 2.4 VM in five easy steps
2001-06-06 18:52 ` Break 2.4 VM in five easy steps Eric W. Biederman
@ 2001-06-06 19:06 ` Mike Galbraith
2001-06-06 19:28 ` Eric W. Biederman
2001-06-06 19:28 ` Break 2.4 VM in five easy steps Derek Glidden
2001-06-09 7:55 ` Rik van Riel
2 siblings, 1 reply; 20+ messages in thread
From: Mike Galbraith @ 2001-06-06 19:06 UTC (permalink / raw)
To: Eric W. Biederman; +Cc: Derek Glidden, linux-kernel, linux-mm
On 6 Jun 2001, Eric W. Biederman wrote:
> Derek Glidden <dglidden@illusionary.com> writes:
>
>
> > The problem I reported is not that 2.4 uses huge amounts of swap but
> > that trying to recover that swap off of disk under 2.4 can leave the
> > machine in an entirely unresponsive state, while 2.2 handles identical
> > situations gracefully.
> >
>
> The interesting thing from other reports is that it appears to be kswapd
> using up CPU resources. Not the swapout code at all. So it appears
> to be a fundamental VM issue. And calling swapoff is just a good way
> to trigger it.
>
> If you could confirm this by calling swapoff sometime other than at
> reboot time. That might help. Say by running top on the console.
The thing goes comatose here too. SCHED_RR vmstat doesn't run, console
switch is nogo...
After running his memory hog, swapoff took 18 seconds. I hacked a
bleeder valve for dead swap pages, and it dropped to 4 seconds.. still
utterly comatose for those 4 seconds though.
-Mike
^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: Break 2.4 VM in five easy steps
2001-06-06 19:06 ` Mike Galbraith
@ 2001-06-06 19:28 ` Eric W. Biederman
2001-06-07 4:32 ` Mike Galbraith
0 siblings, 1 reply; 20+ messages in thread
From: Eric W. Biederman @ 2001-06-06 19:28 UTC (permalink / raw)
To: Mike Galbraith; +Cc: Derek Glidden, linux-kernel, linux-mm
Mike Galbraith <mikeg@wen-online.de> writes:
> On 6 Jun 2001, Eric W. Biederman wrote:
>
> > Derek Glidden <dglidden@illusionary.com> writes:
> >
> >
> > > The problem I reported is not that 2.4 uses huge amounts of swap but
> > > that trying to recover that swap off of disk under 2.4 can leave the
> > > machine in an entirely unresponsive state, while 2.2 handles identical
> > > situations gracefully.
> > >
> >
> > The interesting thing from other reports is that it appears to be kswapd
> > using up CPU resources. Not the swapout code at all. So it appears
> > to be a fundamental VM issue. And calling swapoff is just a good way
> > to trigger it.
> >
> > If you could confirm this by calling swapoff sometime other than at
> > reboot time. That might help. Say by running top on the console.
>
> The thing goes comatose here too. SCHED_RR vmstat doesn't run, console
> switch is nogo...
>
> After running his memory hog, swapoff took 18 seconds. I hacked a
> bleeder valve for dead swap pages, and it dropped to 4 seconds.. still
> utterly comatose for those 4 seconds though.
What happens if you put this at the top of the while(1) loop in try_to_unuse?
if (current->need_resched) schedule();
It should be outside all of the locks. It might just be a matter of everything
serializing on the SMP locks, and the kernel refusing to preempt itself.
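For anyone who wants to see the shape of that suggestion outside the kernel, here is a minimal user-space sketch. It is only an analogy: table_lock, resched_requested and drain_entries() are hypothetical names, a pthread mutex stands in for the kernel locks, and sched_yield() stands in for schedule(). The point it tries to show is that the yield check sits at the top of the loop, before any lock is taken.

```c
#include <assert.h>
#include <pthread.h>
#include <sched.h>
#include <stddef.h>

/* Hypothetical stand-ins: a lock guarding the table and a "please yield" flag. */
pthread_mutex_t table_lock = PTHREAD_MUTEX_INITIALIZER;
volatile int resched_requested;

/*
 * Clear every entry, one per loop iteration. The yield check sits at the
 * top of the loop, before the lock is taken, so the thread never gives up
 * the CPU while holding table_lock. Returns the number of yields performed.
 */
size_t drain_entries(int *entries, size_t n_entries)
{
	size_t yields = 0;
	size_t next = 0;

	assert(entries != NULL || n_entries == 0);
	while (1) {
		if (resched_requested) {	/* no lock held here */
			resched_requested = 0;
			sched_yield();
			yields++;
		}

		pthread_mutex_lock(&table_lock);
		if (next == n_entries) {
			pthread_mutex_unlock(&table_lock);
			break;
		}
		entries[next++] = 0;		/* the "work" for one entry */
		pthread_mutex_unlock(&table_lock);
	}
	return yields;
}
```

Calling drain_entries() on a small array with resched_requested set beforehand should yield once and then run to completion. Whether this maps cleanly onto try_to_unuse depends on what is held at that point in the real loop, which this sketch does not model.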
Eric
^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: Break 2.4 VM in five easy steps
2001-06-06 19:28 ` Eric W. Biederman
@ 2001-06-07 4:32 ` Mike Galbraith
2001-06-07 6:38 ` Eric W. Biederman
2001-06-07 17:10 ` Marcelo Tosatti
0 siblings, 2 replies; 20+ messages in thread
From: Mike Galbraith @ 2001-06-07 4:32 UTC (permalink / raw)
To: Eric W. Biederman; +Cc: Derek Glidden, linux-kernel, linux-mm
On 6 Jun 2001, Eric W. Biederman wrote:
> Mike Galbraith <mikeg@wen-online.de> writes:
>
> > > If you could confirm this by calling swapoff sometime other than at
> > > reboot time. That might help. Say by running top on the console.
> >
> > The thing goes comatose here too. SCHED_RR vmstat doesn't run, console
> > switch is nogo...
> >
> > After running his memory hog, swapoff took 18 seconds. I hacked a
> > bleeder valve for dead swap pages, and it dropped to 4 seconds.. still
> > utterly comatose for those 4 seconds though.
>
> At the top of the while(1) loop in try_to_unuse what happens if you put in.
> if (need_resched) schedule();
> It should be outside all of the locks. It might just be a matter of everything
> serializing on the SMP locks, and the kernel refusing to preempt itself.
That did it.
-Mike
^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: Break 2.4 VM in five easy steps
2001-06-07 4:32 ` Mike Galbraith
@ 2001-06-07 6:38 ` Eric W. Biederman
2001-06-07 7:28 ` Mike Galbraith
2001-06-07 17:10 ` Marcelo Tosatti
1 sibling, 1 reply; 20+ messages in thread
From: Eric W. Biederman @ 2001-06-07 6:38 UTC (permalink / raw)
To: Mike Galbraith; +Cc: Derek Glidden, linux-kernel, linux-mm
Mike Galbraith <mikeg@wen-online.de> writes:
> On 6 Jun 2001, Eric W. Biederman wrote:
>
> > Mike Galbraith <mikeg@wen-online.de> writes:
> >
> > > > If you could confirm this by calling swapoff sometime other than at
> > > > reboot time. That might help. Say by running top on the console.
> > >
> > > The thing goes comatose here too. SCHED_RR vmstat doesn't run, console
> > > switch is nogo...
> > >
> > > After running his memory hog, swapoff took 18 seconds. I hacked a
> > > bleeder valve for dead swap pages, and it dropped to 4 seconds.. still
> > > utterly comatose for those 4 seconds though.
> >
> > At the top of the while(1) loop in try_to_unuse what happens if you put in.
> > if (need_resched) schedule();
> > It should be outside all of the locks. It might just be a matter of everything
> > serializing on the SMP locks, and the kernel refusing to preempt itself.
>
> That did it.
Does this improve the swapoff speed or just allow other programs to
run at the same time? If it is still slow under that kind of load it
would be interesting to know what is taking up all the time.
If it is no longer slow, a patch should be made and sent to Linus.
Eric
^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: Break 2.4 VM in five easy steps
2001-06-07 6:38 ` Eric W. Biederman
@ 2001-06-07 7:28 ` Mike Galbraith
2001-06-07 7:59 ` Eric W. Biederman
0 siblings, 1 reply; 20+ messages in thread
From: Mike Galbraith @ 2001-06-07 7:28 UTC (permalink / raw)
To: Eric W. Biederman; +Cc: Derek Glidden, linux-kernel, linux-mm
On 7 Jun 2001, Eric W. Biederman wrote:
> Mike Galbraith <mikeg@wen-online.de> writes:
>
> > On 6 Jun 2001, Eric W. Biederman wrote:
> >
> > > Mike Galbraith <mikeg@wen-online.de> writes:
> > >
> > > > > If you could confirm this by calling swapoff sometime other than at
> > > > > reboot time. That might help. Say by running top on the console.
> > > >
> > > > The thing goes comatose here too. SCHED_RR vmstat doesn't run, console
> > > > switch is nogo...
> > > >
> > > > After running his memory hog, swapoff took 18 seconds. I hacked a
> > > > bleeder valve for dead swap pages, and it dropped to 4 seconds.. still
> > > > utterly comatose for those 4 seconds though.
> > >
> > > At the top of the while(1) loop in try_to_unuse what happens if you put in.
> > > if (need_resched) schedule();
> > > It should be outside all of the locks. It might just be a matter of everything
> > > serializing on the SMP locks, and the kernel refusing to preempt itself.
> >
> > That did it.
>
> Does this improve the swapoff speed or just allow other programs to
> run at the same time? If it is still slow under that kind of load it
> would be interesting to know what is taking up all time.
>
> If it is no longer slow a patch should be made and sent to Linus.
No, it only cures the freeze. The other appears to be the slow code
pointed out by Andrew Morton being tickled by dead swap pages.
-Mike
^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: Break 2.4 VM in five easy steps
2001-06-07 7:28 ` Mike Galbraith
@ 2001-06-07 7:59 ` Eric W. Biederman
2001-06-07 8:15 ` Mike Galbraith
0 siblings, 1 reply; 20+ messages in thread
From: Eric W. Biederman @ 2001-06-07 7:59 UTC (permalink / raw)
To: Mike Galbraith; +Cc: Derek Glidden, linux-kernel, linux-mm
Mike Galbraith <mikeg@wen-online.de> writes:
> On 7 Jun 2001, Eric W. Biederman wrote:
>
> > Does this improve the swapoff speed or just allow other programs to
> > run at the same time? If it is still slow under that kind of load it
> > would be interesting to know what is taking up all time.
> >
> > If it is no longer slow a patch should be made and sent to Linus.
>
> No, it only cures the freeze. The other appears to be the slow code
> pointed out by Andrew Morton being tickled by dead swap pages.
O.k. I think I'm ready to nominate the dead swap pages for the big
2.4.x VM bug award. So we are burning cpu cycles in sys_swapoff
instead of being IO bound? Just wanting to understand this the cheap way :)
Eric
^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: Break 2.4 VM in five easy steps
2001-06-07 7:59 ` Eric W. Biederman
@ 2001-06-07 8:15 ` Mike Galbraith
0 siblings, 0 replies; 20+ messages in thread
From: Mike Galbraith @ 2001-06-07 8:15 UTC (permalink / raw)
To: Eric W. Biederman; +Cc: Derek Glidden, linux-kernel, linux-mm
On 7 Jun 2001, Eric W. Biederman wrote:
> Mike Galbraith <mikeg@wen-online.de> writes:
>
> > On 7 Jun 2001, Eric W. Biederman wrote:
> >
> > > Does this improve the swapoff speed or just allow other programs to
> > > run at the same time? If it is still slow under that kind of load it
> > > would be interesting to know what is taking up all time.
> > >
> > > If it is no longer slow a patch should be made and sent to Linus.
> >
> > No, it only cures the freeze. The other appears to be the slow code
> > pointed out by Andrew Morton being tickled by dead swap pages.
>
> O.k. I think I'm ready to nominate the dead swap pages for the big
> 2.4.x VM bug award. So we are burning cpu cycles in sys_swapoff
> instead of being IO bound? Just wanting to understand this the cheap way :)
There's no IO being done whatsoever (that I can see with only a blinky).
I can fire up ktrace and find out exactly what's going on if that would
be helpful. Eating the dead swap pages from the active page list prior
to swapoff cures all but a short freeze. Eating the rest (few of those)
might cure the rest, but I doubt it.
-Mike
^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: Break 2.4 VM in five easy steps
2001-06-07 4:32 ` Mike Galbraith
2001-06-07 6:38 ` Eric W. Biederman
@ 2001-06-07 17:10 ` Marcelo Tosatti
2001-06-07 17:43 ` Please test: workaround to help swapoff behaviour Marcelo Tosatti
1 sibling, 1 reply; 20+ messages in thread
From: Marcelo Tosatti @ 2001-06-07 17:10 UTC (permalink / raw)
To: Mike Galbraith; +Cc: Eric W. Biederman, Derek Glidden, linux-kernel, linux-mm
On Thu, 7 Jun 2001, Mike Galbraith wrote:
> On 6 Jun 2001, Eric W. Biederman wrote:
>
> > Mike Galbraith <mikeg@wen-online.de> writes:
> >
> > > > If you could confirm this by calling swapoff sometime other than at
> > > > reboot time. That might help. Say by running top on the console.
> > >
> > > The thing goes comatose here too. SCHED_RR vmstat doesn't run, console
> > > switch is nogo...
> > >
> > > After running his memory hog, swapoff took 18 seconds. I hacked a
> > > bleeder valve for dead swap pages, and it dropped to 4 seconds.. still
> > > utterly comatose for those 4 seconds though.
> >
> > At the top of the while(1) loop in try_to_unuse what happens if you put in.
> > if (need_resched) schedule();
> > It should be outside all of the locks. It might just be a matter of everything
> > serializing on the SMP locks, and the kernel refusing to preempt itself.
>
> That did it.
What about including this workaround in the kernel?
^ permalink raw reply [flat|nested] 20+ messages in thread
* Please test: workaround to help swapoff behaviour
2001-06-07 17:10 ` Marcelo Tosatti
@ 2001-06-07 17:43 ` Marcelo Tosatti
0 siblings, 0 replies; 20+ messages in thread
From: Marcelo Tosatti @ 2001-06-07 17:43 UTC (permalink / raw)
To: Mike Galbraith; +Cc: Eric W. Biederman, Derek Glidden, lkml, linux-mm
On Thu, 7 Jun 2001, Marcelo Tosatti wrote:
>
> On Thu, 7 Jun 2001, Mike Galbraith wrote:
>
> > On 6 Jun 2001, Eric W. Biederman wrote:
> >
> > > Mike Galbraith <mikeg@wen-online.de> writes:
> > >
> > > > > If you could confirm this by calling swapoff sometime other than at
> > > > > reboot time. That might help. Say by running top on the console.
> > > >
> > > > The thing goes comatose here too. SCHED_RR vmstat doesn't run, console
> > > > switch is nogo...
> > > >
> > > > After running his memory hog, swapoff took 18 seconds. I hacked a
> > > > bleeder valve for dead swap pages, and it dropped to 4 seconds.. still
> > > > utterly comatose for those 4 seconds though.
> > >
> > > At the top of the while(1) loop in try_to_unuse what happens if you put in.
> > > if (need_resched) schedule();
> > > It should be outside all of the locks. It might just be a matter of everything
> > > serializing on the SMP locks, and the kernel refusing to preempt itself.
> >
> > That did it.
>
> What about including this workaround in the kernel ?
Well,
This is for the people who have been experiencing lockups while running
swapoff.
Please test. (against 2.4.6-pre1)
Thanks for the suggestion, Eric.
--- linux.orig/mm/swapfile.c Wed Jun 6 18:16:45 2001
+++ linux/mm/swapfile.c Thu Jun 7 16:06:11 2001
@@ -345,6 +345,8 @@
 		/*
 		 * Find a swap page in use and read it in.
 		 */
+		if (current->need_resched)
+			schedule();
 		swap_device_lock(si);
 		for (i = 1; i < si->max ; i++) {
 			if (si->swap_map[i] > 0 && si->swap_map[i] != SWAP_MAP_BAD) {
^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: Break 2.4 VM in five easy steps
2001-06-06 18:52 ` Break 2.4 VM in five easy steps Eric W. Biederman
2001-06-06 19:06 ` Mike Galbraith
@ 2001-06-06 19:28 ` Derek Glidden
2001-06-09 7:55 ` Rik van Riel
2 siblings, 0 replies; 20+ messages in thread
From: Derek Glidden @ 2001-06-06 19:28 UTC (permalink / raw)
To: Eric W. Biederman; +Cc: linux-kernel, linux-mm
"Eric W. Biederman" wrote:
>
> Derek Glidden <dglidden@illusionary.com> writes:
>
> > The problem I reported is not that 2.4 uses huge amounts of swap but
> > that trying to recover that swap off of disk under 2.4 can leave the
> > machine in an entirely unresponsive state, while 2.2 handles identical
> > situations gracefully.
> >
>
> The interesting thing from other reports is that it appears to be kswapd
> using up CPU resources. Not the swapout code at all. So it appears
> to be a fundamental VM issue. And calling swapoff is just a good way
> to trigger it.
>
> If you could confirm this by calling swapoff sometime other than at
> reboot time. That might help. Say by running top on the console.
That's exactly what my original test was doing. I think it was Jeffrey
Baker complaining about "swapoff" at reboot. See my original post that
started this thread and follow the "five easy steps." :) I'm sucking
down a lot of swap, although not all that's available which is something
I am specifically trying to avoid - I wanted to stress the VM/swap
recovery procedure, not "out of RAM and swap" memory pressure - and then
running 'swapoff' from an xterm or a console.
The problem with seeing what's eating up CPU resources is that the
whole machine stops responding before I can tell. Consoles stop
updating, the X display freezes, keyboard input is locked out, etc. As
far as anyone can tell, for several minutes, the whole machine is locked
up. (Except, strangely enough, the machine will still respond to ping.)
I've tried running 'top' to see what task is taking up all the CPU time,
but the system hangs before it shows anything meaningful. I have been
able to tell that it hits 100% "system" utilization very quickly though.
I did notice that the first thing sys_swapoff() does is call
lock_kernel() ... so if sys_swapoff() takes a long time, I imagine
things will get very unresponsive quickly. (But I'm not intimately
familiar with the various kernel locks, so I don't know what
granularity/atomicity/whatever lock_kernel() enforces.)
--
-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-
#!/usr/bin/perl -w
$_='while(read+STDIN,$_,2048){$a=29;$b=73;$c=142;$t=255;@t=map
{$_%16or$t^=$c^=($m=(11,10,116,100,11,122,20,100)[$_/16%8])&110;
$t^=(72,@z=(64,72,$a^=12*($_%16-2?0:$m&17)),$b^=$_%64?12:0,@z)
[$_%8]}(16..271);if((@a=unx"C*",$_)[20]&48){$h=5;$_=unxb24,join
"",@b=map{xB8,unxb8,chr($_^$a[--$h+84])}@ARGV;s/...$/1$&/;$d=
unxV,xb25,$_;$e=256|(ord$b[4])<<9|ord$b[3];$d=$d>>8^($f=$t&($d
>>12^$d>>4^$d^$d/8))<<17,$e=$e>>8^($t&($g=($q=$e>>14&7^$e)^$q*
8^$q<<6))<<9,$_=$t[$_]^(($h>>=8)+=$f+(~$g&$t))for@a[128..$#a]}
print+x"C*",@a}';s/x/pack+/g;eval
usage: qrpff 153 2 8 105 225 < /mnt/dvd/VOB_FILENAME \
| extract_mpeg2 | mpeg2dec -
http://www.eff.org/ http://www.opendvd.org/
http://www.cs.cmu.edu/~dst/DeCSS/Gallery/
^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: Break 2.4 VM in five easy steps
2001-06-06 18:52 ` Break 2.4 VM in five easy steps Eric W. Biederman
2001-06-06 19:06 ` Mike Galbraith
2001-06-06 19:28 ` Break 2.4 VM in five easy steps Derek Glidden
@ 2001-06-09 7:55 ` Rik van Riel
2 siblings, 0 replies; 20+ messages in thread
From: Rik van Riel @ 2001-06-09 7:55 UTC (permalink / raw)
To: Eric W. Biederman; +Cc: Derek Glidden, linux-kernel, linux-mm
On 6 Jun 2001, Eric W. Biederman wrote:
> Derek Glidden <dglidden@illusionary.com> writes:
>
> > The problem I reported is not that 2.4 uses huge amounts of swap but
> > that trying to recover that swap off of disk under 2.4 can leave the
> > machine in an entirely unresponsive state, while 2.2 handles identical
> > situations gracefully.
>
> The interesting thing from other reports is that it appears to be
> kswapd using up CPU resources.
This part is being worked on, expect a solution for this thing
soon...
Rik
--
Virtual memory is like a game you can't win;
However, without VM there's truly nothing to lose...
http://www.surriel.com/ http://distro.conectiva.com/
Send all your spam to aardvark@nl.linux.org (spam digging piggy)
^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: Please test: workaround to help swapoff behaviour
@ 2001-06-07 20:33 Bulent Abali
2001-06-07 19:40 ` Marcelo Tosatti
2001-06-08 21:11 ` Marcelo Tosatti
0 siblings, 2 replies; 20+ messages in thread
From: Bulent Abali @ 2001-06-07 20:33 UTC (permalink / raw)
To: Marcelo Tosatti
Cc: Mike Galbraith, Eric W. Biederman, Derek Glidden, lkml, linux-mm
>This is for the people who has been experiencing the lockups while running
>swapoff.
>
>Please test. (against 2.4.6-pre1)
>
>
>--- linux.orig/mm/swapfile.c Wed Jun 6 18:16:45 2001
>+++ linux/mm/swapfile.c Thu Jun 7 16:06:11 2001
>@@ -345,6 +345,8 @@
> /*
> * Find a swap page in use and read it in.
> */
>+ if (current->need_resched)
>+ schedule();
> swap_device_lock(si);
> for (i = 1; i < si->max ; i++) {
> if (si->swap_map[i] > 0 && si->swap_map[i] != SWAP_MAP_BAD) {
I tested your patch against 2.4.5. It works. No more lockups. Without the
patch it took 14 minutes 51 seconds to complete swapoff (this is to recover
1.5GB of swap space). During this time the system was frozen: no keyboard,
no screen, etc. Practically locked up.
With the patch there are no more lockups. Swapoff kept running in the
background. This is a winner.
But here is the caveat: swapoff keeps burning 100% of the cycles until it
completes. This is not going to be a big deal during shutdowns. It is only
going to be a problem when you run swapoff from the command line.
I looked at try_to_unuse in swapfile.c. I believe that the algorithm is
broken. For each and every swap entry it is walking the entire process list
(for_each_task(p)). It is also grabbing a whole bunch of locks
for each swap entry. It might be worthwhile processing swap entries in
batches instead of one entry at a time.
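To put rough numbers on the batching idea, here is a small stand-alone model. It is not the kernel code: struct task_model, struct walk_stats and unuse_in_batches() are hypothetical names, and a "task" is reduced to the list of swap entries it refers to. A batch size of 1 corresponds to the one-entry-at-a-time walk described above.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical model of a task: the swap entries its page tables refer to. */
struct task_model {
	const unsigned *entries;
	size_t n_entries;
};

struct walk_stats {
	size_t task_visits;	/* how many times a task was examined */
	size_t refs_found;	/* references to wanted entries that were seen */
};

/* Count how many of this task's entries appear in wanted[0..n_wanted). */
static size_t count_refs(const struct task_model *t,
			 const unsigned *wanted, size_t n_wanted)
{
	size_t found = 0;

	for (size_t i = 0; i < t->n_entries; i++)
		for (size_t w = 0; w < n_wanted; w++)
			if (t->entries[i] == wanted[w])
				found++;
	return found;
}

/* Walk the whole task list once per batch of `batch` in-use swap entries. */
struct walk_stats unuse_in_batches(const struct task_model *tasks, size_t n_tasks,
				   const unsigned *in_use, size_t n_in_use,
				   size_t batch)
{
	struct walk_stats s = {0, 0};

	assert(batch > 0);
	for (size_t start = 0; start < n_in_use; start += batch) {
		size_t left = n_in_use - start;
		size_t len = left < batch ? left : batch;

		for (size_t t = 0; t < n_tasks; t++) {
			s.task_visits++;
			s.refs_found += count_refs(&tasks[t], in_use + start, len);
		}
	}
	return s;
}
```

In this model the same references are found either way, but the number of task-list walks drops from one per swap entry to one per batch. Since each walk is also where the locks would be taken, the lock traffic should shrink by about the same factor. The comparison work inside each task does not shrink, so batching alone is probably not the whole answer.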
In any case, I think having this patch is worthwhile as a quick and dirty
remedy.
Bulent Abali
^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: Please test: workaround to help swapoff behaviour
2001-06-07 20:33 Please test: workaround to help swapoff behaviour Bulent Abali
@ 2001-06-07 19:40 ` Marcelo Tosatti
2001-06-08 21:11 ` Marcelo Tosatti
1 sibling, 0 replies; 20+ messages in thread
From: Marcelo Tosatti @ 2001-06-07 19:40 UTC (permalink / raw)
To: Bulent Abali
Cc: Mike Galbraith, Eric W. Biederman, Derek Glidden, lkml, linux-mm
On Thu, 7 Jun 2001, Bulent Abali wrote:
>
> I tested your patch against 2.4.5. It works. No more lockups. Without the
> patch it took 14 minutes 51 seconds to complete swapoff (this is to recover
> 1.5GB of swap space). During this time the system was frozen. No keyboard, no
> screen, etc. Practically locked-up.
>
> With the patch there are no more lockups. Swapoff kept running in the
> background. This is a winner.
>
> But here is the caveat: swapoff keeps burning 100% of the cycles until it
> completes.
Yup. Wait a while until the dead swap cache issue is sorted out.
When that finally happens, the time spent in swapoff will probably be
"acceptable".
> This is not going to be a big deal during shutdowns. Only when you enter
> swapoff from the command line it is going to be a problem.
>
> I looked at try_to_unuse in swapfile.c. I believe that the algorithm is
> broken.
Yes.
> For each and every swap entry it is walking the entire process list
> (for_each_task(p)). It is also grabbing a whole bunch of locks
> for each swap entry. It might be worthwhile processing swap entries in
> batches instead of one entry at a time.
The real fix is to make the processing the other way around --- go looking
into the pte's and from there do the swapins.
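A rough sketch of what "the other way around" could look like, in a toy model rather than real kernel code. struct pte_model and unuse_by_pte_scan() are hypothetical; a real implementation would walk the page tables of every mm and go through the swap cache, and this sketch deliberately ignores shared swap entries.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified pte: either present in RAM or pointing at swap. */
struct pte_model {
	int present;
	unsigned swap_type;	/* which swap device, meaningful if !present */
	unsigned swap_offset;	/* slot on that device, meaningful if !present */
};

/*
 * Single pass over the ptes: every non-present pte that points at swap
 * device `type` is "swapped in" (marked present). Returns the number of
 * swap-ins performed.
 */
size_t unuse_by_pte_scan(struct pte_model *ptes, size_t n_ptes, unsigned type)
{
	size_t swapins = 0;

	assert(ptes != NULL || n_ptes == 0);
	for (size_t i = 0; i < n_ptes; i++) {
		if (ptes[i].present || ptes[i].swap_type != type)
			continue;
		ptes[i].present = 1;	/* stand-in for reading the page back in */
		swapins++;
	}
	return swapins;
}
```

The cost here is one pass over the ptes, however many swap entries are in use, instead of one walk of the process list per entry.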
Don't have the time to do everything, though. :)
^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: Please test: workaround to help swapoff behaviour
2001-06-07 20:33 Please test: workaround to help swapoff behaviour Bulent Abali
2001-06-07 19:40 ` Marcelo Tosatti
@ 2001-06-08 21:11 ` Marcelo Tosatti
1 sibling, 0 replies; 20+ messages in thread
From: Marcelo Tosatti @ 2001-06-08 21:11 UTC (permalink / raw)
To: Bulent Abali
Cc: Mike Galbraith, Eric W. Biederman, Derek Glidden, lkml, linux-mm
On Thu, 7 Jun 2001, Bulent Abali wrote:
>
>
>
>
> >This is for the people who has been experiencing the lockups while running
> >swapoff.
> >
> >Please test. (against 2.4.6-pre1)
> >
> >
> >--- linux.orig/mm/swapfile.c Wed Jun 6 18:16:45 2001
> >+++ linux/mm/swapfile.c Thu Jun 7 16:06:11 2001
> >@@ -345,6 +345,8 @@
> > /*
> > * Find a swap page in use and read it in.
> > */
> >+ if (current->need_resched)
> >+ schedule();
> > swap_device_lock(si);
> > for (i = 1; i < si->max ; i++) {
> > if (si->swap_map[i] > 0 && si->swap_map[i] != SWAP_MAP_BAD) {
>
>
> I tested your patch against 2.4.5. It works. No more lockups. Without the
> patch it took 14 minutes 51 seconds to complete swapoff (this is to recover
> 1.5GB of swap space). During this time the system was frozen. No keyboard, no
> screen, etc. Practically locked-up.
>
> With the patch there are no more lockups. Swapoff kept running in the
> background. This is a winner.
>
> But here is the caveat: swapoff keeps burning 100% of the cycles until it
> completes. This is not going to be a big deal during shutdowns. Only when
> you enter swapoff from the command line it is going to be a problem.
>
> I looked at try_to_unuse in swapfile.c. I believe that the algorithm is
> broken. For each and every swap entry it is walking the entire process list
> (for_each_task(p)). It is also grabbing a whole bunch of locks
> for each swap entry. It might be worthwhile processing swap entries in
> batches instead of one entry at a time.
>
> In any case, I think having this patch is worthwhile as a quick and dirty
> remedy.
Bulent,
Could you please check if 2.4.6-pre2+the schedule patch has better
swapoff behaviour for you?
Thanks
^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: Please test: workaround to help swapoff behaviour
@ 2001-06-08 23:53 Bulent Abali
0 siblings, 0 replies; 20+ messages in thread
From: Bulent Abali @ 2001-06-08 23:53 UTC (permalink / raw)
To: Marcelo Tosatti
Cc: Mike Galbraith, Eric W. Biederman, Derek Glidden, lkml, linux-mm
>> I looked at try_to_unuse in swapfile.c. I believe that the algorithm is
>> broken.
>> For each and every swap entry it is walking the entire process list
>> (for_each_task(p)). It is also grabbing a whole bunch of locks
>> for each swap entry. It might be worthwhile processing swap entries in
>> batches instead of one entry at a time.
>>
>> In any case, I think having this patch is worthwhile as a quick and dirty
>> remedy.
>
>Bulent,
>
>Could you please check if 2.4.6-pre2+the schedule patch has better
>swapoff behaviour for you?
No problem. I will check it tomorrow. I don't think it can be any worse
than it is now. The patch looks correct in principle.
I believe it should go into 2.4.6. But I will test it.
On small machines people don't notice it, but if you have a few
GB of memory it really hurts. Shutdowns take forever since swapoff takes
forever.
--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org. For more info on Linux MM,
see: http://www.linux-mm.org/
^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: Please test: workaround to help swapoff behaviour
@ 2001-06-09 20:32 Bulent Abali
2001-06-10 2:12 ` Eric W. Biederman
2001-06-10 5:54 ` Mikael Abrahamsson
0 siblings, 2 replies; 20+ messages in thread
From: Bulent Abali @ 2001-06-09 20:32 UTC (permalink / raw)
To: Marcelo Tosatti
Cc: Mike Galbraith, Eric W. Biederman, Derek Glidden, lkml, linux-mm,
Stephen Tweedie
>Bulent,
>
>Could you please check if 2.4.6-pre2+the schedule patch has better
>swapoff behaviour for you?
Marcelo,
It works as expected. It doesn't lock up the box; however, swapoff keeps burning
CPU cycles. It took 4 1/2 minutes to swapoff about 256MB of swap
content. Shutdown took just as long. I was hoping that shutdown would
kill the swapoff process but it doesn't. It just hangs there. Shutdown
is the common case. Therefore, swapoff needs to be optimized for
shutdowns.
You could imagine users' frustration waiting for a shutdown when there are
gigabytes in the swap.
So to summarize, the schedule patch is better than nothing but falls far short.
I would put it in 2.4.6. Read on.
----------
The problem is with the try_to_unuse() algorithm, which is very inefficient.
I searched the linux-mm archives and Tweedie was on to this. This is what
he wrote: "it is much cheaper to find a swap entry for a given page than
to find the swap cache page for a given swap entry." And he posted a
patch: http://mail.nl.linux.org/linux-mm/2001-03/msg00224.html
His patch is in the Redhat 7.1 kernel 2.4.2-2 and not in 2.4.5.
But in any case I believe the patch will not work as expected.
It seems to me that he is calling the function check_orphaned_swap(page)
in the wrong place. He is calling the function while scanning the
active_list in refill_inactive_scan(). The problem with that is that if you
wait 60 seconds or longer, the orphaned swap pages will move from the active
to the inactive lists. Therefore the function will miss the orphans on the
inactive lists. Any comments?
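The concern can be illustrated with a toy model. This is not Tweedie's patch and not the kernel's data structures: struct page_model, enum lru_list and reap_orphans() are made-up names, and "orphan" here simply means a page that is in the swap cache with no mappings left.

```c
#include <assert.h>
#include <stddef.h>

enum lru_list { LRU_ACTIVE, LRU_INACTIVE };

/* Hypothetical page model; field names are illustrative, not the kernel's. */
struct page_model {
	enum lru_list list;
	int in_swap_cache;
	unsigned mappings;	/* number of ptes still referring to the page */
};

/*
 * Free orphaned swap-cache pages (in the swap cache, no mappings left).
 * If scan_inactive is zero only the active list is examined.
 * Returns the number of pages freed.
 */
size_t reap_orphans(struct page_model *pages, size_t n_pages, int scan_inactive)
{
	size_t freed = 0;

	assert(pages != NULL || n_pages == 0);
	for (size_t i = 0; i < n_pages; i++) {
		struct page_model *p = &pages[i];

		if (p->list == LRU_INACTIVE && !scan_inactive)
			continue;
		if (p->in_swap_cache && p->mappings == 0) {
			p->in_swap_cache = 0;	/* dropped from the swap cache */
			freed++;
		}
	}
	return freed;
}
```

With one orphan on the active list and two that have already aged onto the inactive list, an active-only scan reaps just the first one. The other two stay behind until something looks at the inactive list as well.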
^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: Please test: workaround to help swapoff behaviour
2001-06-09 20:32 Bulent Abali
@ 2001-06-10 2:12 ` Eric W. Biederman
2001-06-10 5:54 ` Mikael Abrahamsson
1 sibling, 0 replies; 20+ messages in thread
From: Eric W. Biederman @ 2001-06-10 2:12 UTC (permalink / raw)
To: Bulent Abali
Cc: Marcelo Tosatti, Mike Galbraith, Derek Glidden, lkml, linux-mm,
Stephen Tweedie
"Bulent Abali" <abali@us.ibm.com> writes:
> >Bulent,
> >
> >Could you please check if 2.4.6-pre2+the schedule patch has better
> >swapoff behaviour for you?
>
> Marcelo,
>
> It works as expected. Doesn't lockup the box however swapoff keeps burning
> the CPU cycles. It took 4 1/2 minutes to swapoff about 256MB of swap
> content. Shutdown took just as long. I was hoping that shutdown would
> kill the swapoff process but it doesn't. It just hangs there. Shutdown
> is the common case. Therefore, swapoff needs to be optimized for
> shutdowns.
> You could imagine users frustration waiting for a shutdown when there are
> gigabytes in the swap.
>
> So to summarize, schedule patch is better than nothing but falls far short.
> I would put it in 2.4.6. Read on.
The fix is to kill the dead/orphaned swap pages before we get to
swapoff. At shutdown time there is practically nothing active in
swap, so this should speed things up tremendously. The dead swap
pages need to be killed as soon as possible to keep us from wasting
RAM and swap, and totally aggravating whatever swapping situation is
present.
Once the dead swap pages problem is fixed it is time to optimize
swapoff.
> ----------
>
> The problem is with the try_to_unuse() algorithm which is very inefficient.
> I searched the linux-mm archives and Tweedie was on to this. This is what
> he wrote: "it is much cheaper to find a swap entry for a given page than
> to find the swap cache page for a given swap entry." And he posted a
> patch http://mail.nl.linux.org/linux-mm/2001-03/msg00224.html
> His patch is in the Redhat 7.1 kernel 2.4.2-2 and not in 2.4.5.
>
> But in any case I believe the patch will not work as expected.
> It seems to me that he is calling the function check_orphaned_swap(page)
> in the wrong place. He is calling the function while scanning the
> active_list in refill_inactive_scan(). The problem with that is if you
> wait 60 seconds or longer the orphaned swap pages will move from active
> to inactive lists. Therefore the function will miss the orphans in inactive
> lists. Any comments?
The analysis sounds about right.
We should be killing most of these pages in free_pte, or at the very
least putting them on their own list so that we can scan them
effectively. Someone was creating a patch to that effect earlier.
try_to_unuse is inefficient with respect to CPU usage but it is
efficient with respect to swap usage. If you are doing this on a
running machine where you are removing a swap area, you don't want an
algorithm that increases your need for swap. All of the trivial
transformations of try_to_unuse have the property of breaking the
sharing of swap pages.
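
The trade-off can be sketched with a user-space toy model. This is an
illustration under stated assumptions, not the kernel's try_to_unuse: the
"page table" is a flat array of swap entry numbers, TOY_NO_SWAP marks a pte
that is not swapped out, and all toy_* names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

#define TOY_NO_SWAP (-1)    /* pte slot that does not refer to swap */

struct toy_stats {
    size_t comparisons;     /* pte slots examined */
    size_t pages_allocated; /* RAM pages needed for swapped-in data */
};

/*
 * Per-entry strategy: for each swap entry, bring the data in once and
 * scan every pte for users of that entry. All users end up sharing one
 * page, but the scan cost is n_entries * n_ptes.
 */
struct toy_stats toy_unuse_per_entry(const int *ptes, size_t n_ptes,
                                     int n_entries)
{
    struct toy_stats s = { 0, 0 };

    assert(n_entries >= 0);
    for (int e = 0; e < n_entries; e++) {
        int used = 0;

        for (size_t i = 0; i < n_ptes; i++) {
            s.comparisons++;
            if (ptes[i] == e)
                used = 1;   /* every user maps the same page */
        }
        s.pages_allocated += (size_t)used;
    }
    return s;
}

/*
 * Naive per-pte strategy: a single pass over the ptes, giving each
 * swapped-out pte its own private copy. Cheap to scan, but sharing
 * is lost and memory demand grows.
 */
struct toy_stats toy_unuse_per_pte(const int *ptes, size_t n_ptes)
{
    struct toy_stats s = { 0, 0 };

    for (size_t i = 0; i < n_ptes; i++) {
        s.comparisons++;
        if (ptes[i] != TOY_NO_SWAP)
            s.pages_allocated++;
    }
    return s;
}
```

With six ptes of which three share entry 0 and two share entry 1, the
per-entry walk examines twelve slots and needs two pages, while the per-pte
walk examines six slots and needs five pages.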
Eric
^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: Please test: workaround to help swapoff behaviour
2001-06-09 20:32 Bulent Abali
2001-06-10 2:12 ` Eric W. Biederman
@ 2001-06-10 5:54 ` Mikael Abrahamsson
1 sibling, 0 replies; 20+ messages in thread
From: Mikael Abrahamsson @ 2001-06-10 5:54 UTC (permalink / raw)
To: linux-mm; +Cc: Stephen Tweedie
On Sat, 9 Jun 2001, Bulent Abali wrote:
> to find the swap cache page for a given swap entry." And he posted a
> patch http://mail.nl.linux.org/linux-mm/2001-03/msg00224.html
> His patch is in the Redhat 7.1 kernel 2.4.2-2 and not in 2.4.5.
>
> But in any case I believe the patch will not work as expected.
I second this. I have followed the discussion and I tried to swapoff a
vanilla Red Hat 7.1 machine with the vanilla Red Hat kernel (Celeron 500
with IDE disks) with approx 100 megs in swap, and it took over a minute
with swapoff using 100% CPU.
--
Mikael Abrahamsson email: swmike@swm.pp.se
^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: Please test: workaround to help swapoff behaviour
@ 2001-06-10 13:56 Bulent Abali
0 siblings, 0 replies; 20+ messages in thread
From: Bulent Abali @ 2001-06-10 13:56 UTC (permalink / raw)
To: Eric W. Biederman
Cc: Marcelo Tosatti, Mike Galbraith, Derek Glidden, lkml, linux-mm,
Stephen Tweedie
>The fix is to kill the dead/orphaned swap pages before we get to
>swapoff. At shutdown time there is practically nothing active in
> ...
>Once the dead swap pages problem is fixed it is time to optimize
>swapoff.
I think fixing the orphaned swap pages problem will eliminate the
problem altogether. There is probably no need to optimize
swapoff, because as the system is shutting down all the processes will
be killed and their pages in swap will be orphaned. If those pages
were reaped in a timely manner there wouldn't be any work
left for swapoff.
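
A minimal sketch of that argument, as a user-space toy model rather than
kernel code. It assumes a swap slot can be pinned both by process ptes and
by a swap cache page; the struct and the toy_* functions are hypothetical
names used only for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Toy swap slot; not the kernel's swap_map representation. */
struct toy_swap_entry {
    int process_refs;   /* ptes that still point at this slot */
    int in_swap_cache;  /* 1 if a swap cache page also pins the slot */
};

/* A slot is orphaned when only the swap cache still pins it. */
int toy_is_orphan(const struct toy_swap_entry *e)
{
    return e->process_refs == 0 && e->in_swap_cache;
}

/*
 * Drop one process reference, as happens when a process exits.
 * With reap_eagerly set, an orphaned slot is released immediately.
 */
void toy_drop_ref(struct toy_swap_entry *e, int reap_eagerly)
{
    assert(e->process_refs > 0);
    e->process_refs--;
    if (reap_eagerly && toy_is_orphan(e))
        e->in_swap_cache = 0;
}

/* Number of slots that swapoff would still have to process. */
size_t toy_swapoff_work(const struct toy_swap_entry *map, size_t n)
{
    size_t busy = 0;

    for (size_t i = 0; i < n; i++)
        if (map[i].process_refs > 0 || map[i].in_swap_cache)
            busy++;
    return busy;
}
```

In this model, dropping every process reference without eager reaping
leaves each cached slot behind as work for swapoff, while eager reaping
leaves none.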
Bulent
^ permalink raw reply [flat|nested] 20+ messages in thread
end of thread, other threads:[~2001-06-10 13:56 UTC | newest]
Thread overview: 20+ messages
[not found] <3B1E4CD0.D16F58A8@illusionary.com>
[not found] ` <3b204fe5.4014698@mail.mbay.net>
[not found] ` <3B1E5316.F4B10172@illusionary.com>
[not found] ` <m1wv6p5uqp.fsf@frodo.biederman.org>
[not found] ` <3B1E7ABA.EECCBFE0@illusionary.com>
2001-06-06 18:52 ` Break 2.4 VM in five easy steps Eric W. Biederman
2001-06-06 19:06 ` Mike Galbraith
2001-06-06 19:28 ` Eric W. Biederman
2001-06-07 4:32 ` Mike Galbraith
2001-06-07 6:38 ` Eric W. Biederman
2001-06-07 7:28 ` Mike Galbraith
2001-06-07 7:59 ` Eric W. Biederman
2001-06-07 8:15 ` Mike Galbraith
2001-06-07 17:10 ` Marcelo Tosatti
2001-06-07 17:43 ` Please test: workaround to help swapoff behaviour Marcelo Tosatti
2001-06-06 19:28 ` Break 2.4 VM in five easy steps Derek Glidden
2001-06-09 7:55 ` Rik van Riel
2001-06-07 20:33 Please test: workaround to help swapoff behaviour Bulent Abali
2001-06-07 19:40 ` Marcelo Tosatti
2001-06-08 21:11 ` Marcelo Tosatti
2001-06-08 23:53 Bulent Abali
2001-06-09 20:32 Bulent Abali
2001-06-10 2:12 ` Eric W. Biederman
2001-06-10 5:54 ` Mikael Abrahamsson
2001-06-10 13:56 Bulent Abali