* 2.4.0-test9-pre4: __alloc_pages(...) try_again:
@ 2000-09-20 20:23 Roger Larsson
2000-09-21 5:21 ` Mike Galbraith
2000-09-21 16:26 ` 2.4.0-test9-pre4: __alloc_pages(...) try_again: Rik van Riel
0 siblings, 2 replies; 9+ messages in thread
From: Roger Larsson @ 2000-09-20 20:23 UTC (permalink / raw)
To: linux-kernel, linux-mm, Rik van Riel, Juan J. Quintela
Hi,
Trying to find out why test9-pre4 freezes with mmap002
I added a counter for try_again loops.
... __alloc_pages(...)
int direct_reclaim = 0;
unsigned int gfp_mask = zonelist->gfp_mask;
struct page * page = NULL;
+ int try_again_loops = 0;
- - -
+ printk("VM: sync kswapd (direct_reclaim: %d) try_again #
%d\n",
+ direct_reclaim, ++try_again_loops);
wakeup_kswapd(1);
goto try_again;
Result was surprising:
direct_reclaim was 1.
try_again_loops did never stop increasing (note: it is not static,
and should restart from zero after each success)
Why does this happen?
a) kswapd did not succeed in freeing a suitable page?
b) __alloc_pages did not succeed in grabbing the page?
/RogerL
--
Home page:
http://www.norran.net/nra02596/
--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org. For more info on Linux MM,
see: http://www.linux.eu.org/Linux-MM/
^ permalink raw reply [flat|nested] 9+ messages in thread* Re: 2.4.0-test9-pre4: __alloc_pages(...) try_again: 2000-09-20 20:23 2.4.0-test9-pre4: __alloc_pages(...) try_again: Roger Larsson @ 2000-09-21 5:21 ` Mike Galbraith 2000-09-21 18:17 ` Roger Larsson 2000-09-21 16:26 ` 2.4.0-test9-pre4: __alloc_pages(...) try_again: Rik van Riel 1 sibling, 1 reply; 9+ messages in thread From: Mike Galbraith @ 2000-09-21 5:21 UTC (permalink / raw) To: Roger Larsson; +Cc: linux-kernel, linux-mm, Rik van Riel, Juan J. Quintela On Wed, 20 Sep 2000, Roger Larsson wrote: > Hi, > > > Trying to find out why test9-pre4 freezes with mmap002 > I added a counter for try_again loops. > > ... __alloc_pages(...) > > int direct_reclaim = 0; > unsigned int gfp_mask = zonelist->gfp_mask; > struct page * page = NULL; > + int try_again_loops = 0; > > - - - > > + printk("VM: sync kswapd (direct_reclaim: %d) try_again # > %d\n", > + direct_reclaim, ++try_again_loops); > wakeup_kswapd(1); > goto try_again; > > > Result was surprising: > direct_reclaim was 1. > try_again_loops did never stop increasing (note: it is not static, > and should restart from zero after each success) > > Why does this happen? > a) kswapd did not succeed in freeing a suitable page? > b) __alloc_pages did not succeed in grabbing the page? Hi Roger, A trace of locked up box shows endless repetitions of kswapd aparantly failing to free anything. What I don't see in the trace snippet below is reclaim_page(). I wonder if this test in __alloc_pages_limit() should include an || direct_reclaim. if (z->free_pages + z->inactive_clean_pages > water_mark) { struct page *page = NULL; /* If possible, reclaim a page directly. */ if (direct_reclaim && z->free_pages < z->pages_min + 8) page = reclaim_page(z); /* If that fails, fall back to rmqueue. */ if (!page) page = rmqueue(z, order); if (page) return page; } dmesg log after breakout: SysRq: Suspending trace SysRq: Show Memory Mem-info: Free pages: 1404kB ( 0kB HighMem) ( Active: 274, inactive_dirty: 49, inactive_clean: 0, free: 351 (255 510 765) ) 0*4kB 0*8kB 0*16kB 0*32kB 0*64kB 0*128kB 0*256kB 1*512kB 0*1024kB 0*2048kB = 512kB) 1*4kB 1*8kB 1*16kB 1*32kB 1*64kB 0*128kB 1*256kB 1*512kB 0*1024kB 0*2048kB = 892kB) = 0kB) Swap cache: add 470291, delete 470291, find 212689/658406 Free swap: 199024kB 32752 pages of RAM 0 pages of HIGHMEM 4321 reserved pages 296 pages shared 0 pages swap cached 0 pages in page table cache Buffer memory: 120kB SysRq: Terminate All Tasks trace snippet: ........ c0134aa7 try_to_free_buffers +<13/130> (0.18) pid(3) c0134a47 sync_page_buffers +<13/60> (0.27) pid(3) c0117fb3 __wake_up +<13/144> (0.25) pid(3) c012dea9 __free_pages +<d/28> (0.26) pid(3) c0134aa7 try_to_free_buffers +<13/130> (0.16) pid(3) c0134a47 sync_page_buffers +<13/60> (0.33) pid(3) c0117fb3 __wake_up +<13/144> (0.25) pid(3) c012dea9 __free_pages +<d/28> (0.20) pid(3) c012cce7 free_shortage +<13/ac> (0.13) pid(3) c012df1e nr_free_pages +<e/3c> (0.16) pid(3) c012df5a nr_inactive_clean_pages +<e/40> (0.16) pid(3) c012df1e nr_free_pages +<e/3c> (0.21) pid(3) c014261d shrink_dcache_memory +<d/38> (0.24) pid(3) c0142280 prune_dcache +<10/120> (0.18) pid(3) c012a8ea kmem_cache_shrink +<e/58> (0.14) pid(3) c012a832 is_chained_kmem_cache +<e/60> (0.96) pid(3) c012a892 __kmem_cache_shrink +<e/58> (0.36) pid(3) c0143849 shrink_icache_memory +<d/38> (0.24) pid(3) c014375f prune_icache +<13/f0> (0.14) pid(3) c01432e4 sync_all_inodes +<10/120> (1.12) pid(3) c01435ef dispose_list +<f/68> (0.21) pid(3) c012a8ea kmem_cache_shrink +<e/58> (0.14) pid(3) c012a832 is_chained_kmem_cache +<e/60> (0.29) pid(3) c012a892 __kmem_cache_shrink +<e/58> (0.26) pid(3) c012cde3 refill_inactive +<13/128> (0.12) pid(3) c012cd8e inactive_shortage +<e/50> (0.14) pid(3) c012df1e nr_free_pages +<e/3c> (0.18) pid(3) c012df5a nr_inactive_clean_pages +<e/40> (0.23) pid(3) c012cce7 free_shortage +<13/ac> (0.13) pid(3) c012df1e nr_free_pages +<e/3c> (0.21) pid(3) c012df5a nr_inactive_clean_pages +<e/40> (0.16) pid(3) c012df1e nr_free_pages +<e/3c> (0.19) pid(3) c012af3b kmem_cache_reap +<13/1f0> (2.17) pid(3) c0117bb3 schedule +<13/400> (1.20) pid(3->5) c0109263 __switch_to +<13/cc> (0.69) pid(5) c0134c9f flush_dirty_buffers +<13/cc> (0.19) pid(5) c012df1e nr_free_pages +<e/3c> (0.16) pid(5) c012df5a nr_inactive_clean_pages +<e/40> (0.20) pid(5) c0117fb3 __wake_up +<13/144> (0.31) pid(5) c0117bb3 schedule +<13/400> (1.20) pid(5->141) c0109263 __switch_to +<13/cc> (0.41) pid(141) c0118d7d remove_wait_queue +<d/24> (0.40) pid(141) c012db24 __alloc_pages_limit +<10/b8> (0.35) pid(141) c012db24 __alloc_pages_limit +<10/b8> (0.23) pid(141) c012d0d2 wakeup_kswapd +<12/cc> (0.17) pid(141) c012db24 __alloc_pages_limit +<10/b8> (0.25) pid(141) c012d0d2 wakeup_kswapd +<12/cc> (0.16) pid(141) c0118d13 add_wait_queue +<f/38> (0.22) pid(141) c0117bb3 schedule +<13/400> (1.12) pid(141->144) c0109263 __switch_to +<13/cc> (0.38) pid(144) c0118d7d remove_wait_queue +<d/24> (0.33) pid(144) c012db24 __alloc_pages_limit +<10/b8> (0.22) pid(144) c012db24 __alloc_pages_limit +<10/b8> (0.20) pid(144) c012d0d2 wakeup_kswapd +<12/cc> (0.17) pid(144) c012db24 __alloc_pages_limit +<10/b8> (0.24) pid(144) c012d0d2 wakeup_kswapd +<12/cc> (0.16) pid(144) c0118d13 add_wait_queue +<f/38> (0.22) pid(144) c0117bb3 schedule +<13/400> (1.20) pid(144->1237) c0109263 __switch_to +<13/cc> (0.36) pid(1237) c0118d7d remove_wait_queue +<d/24> (0.33) pid(1237) c012db24 __alloc_pages_limit +<10/b8> (0.22) pid(1237) c012db24 __alloc_pages_limit +<10/b8> (0.20) pid(1237) c012d0d2 wakeup_kswapd +<12/cc> (0.17) pid(1237) c012db24 __alloc_pages_limit +<10/b8> (0.24) pid(1237) c012d0d2 wakeup_kswapd +<12/cc> (0.19) pid(1237) c0118d13 add_wait_queue +<f/38> (0.22) pid(1237) c0117bb3 schedule +<13/400> (1.08) pid(1237->194) c0109263 __switch_to +<13/cc> (0.37) pid(194) c0118d7d remove_wait_queue +<d/24> (0.31) pid(194) c012db24 __alloc_pages_limit +<10/b8> (0.22) pid(194) c012db24 __alloc_pages_limit +<10/b8> (0.20) pid(194) c012d0d2 wakeup_kswapd +<12/cc> (0.17) pid(194) c012db24 __alloc_pages_limit +<10/b8> (0.24) pid(194) c012d0d2 wakeup_kswapd +<12/cc> (0.16) pid(194) c0118d13 add_wait_queue +<f/38> (0.22) pid(194) c0117bb3 schedule +<13/400> (0.99) pid(194->99) c0109263 __switch_to +<13/cc> (0.38) pid(99) c0118d7d remove_wait_queue +<d/24> (0.31) pid(99) c012db24 __alloc_pages_limit +<10/b8> (0.22) pid(99) c012db24 __alloc_pages_limit +<10/b8> (0.20) pid(99) c012d0d2 wakeup_kswapd +<12/cc> (0.22) pid(99) c012db24 __alloc_pages_limit +<10/b8> (0.24) pid(99) c012d0d2 wakeup_kswapd +<12/cc> (0.16) pid(99) c0118d13 add_wait_queue +<f/38> (0.22) pid(99) c0117bb3 schedule +<13/400> (0.89) pid(99->218) c0109263 __switch_to +<13/cc> (0.38) pid(218) c0118d7d remove_wait_queue +<d/24> (0.31) pid(218) c012db24 __alloc_pages_limit +<10/b8> (0.22) pid(218) c012db24 __alloc_pages_limit +<10/b8> (0.20) pid(218) c012d0d2 wakeup_kswapd +<12/cc> (0.17) pid(218) c012db24 __alloc_pages_limit +<10/b8> (0.24) pid(218) c012d0d2 wakeup_kswapd +<12/cc> (0.18) pid(218) c0118d13 add_wait_queue +<f/38> (0.22) pid(218) c0117bb3 schedule +<13/400> (0.90) pid(218->1238) c0109263 __switch_to +<13/cc> (0.48) pid(1238) c0118d7d remove_wait_queue +<d/24> (0.31) pid(1238) c012db24 __alloc_pages_limit +<10/b8> (0.22) pid(1238) c012db24 __alloc_pages_limit +<10/b8> (0.20) pid(1238) c012d0d2 wakeup_kswapd +<12/cc> (0.17) pid(1238) c012db24 __alloc_pages_limit +<10/b8> (0.24) pid(1238) c012d0d2 wakeup_kswapd +<12/cc> (0.16) pid(1238) c0118d13 add_wait_queue +<f/38> (0.22) pid(1238) c0117bb3 schedule +<13/400> (0.53) pid(1238->3) c0109263 __switch_to +<13/cc> (0.35) pid(3) c012cbf3 refill_inactive_scan +<13/f4> (0.18) pid(3) c012b30d age_page_down_nolock +<d/28> (0.16) pid(3) c012b39b deactivate_page_nolock +<f/254> (0.40) pid(3) c012cbf3 refill_inactive_scan +<13/f4> (0.16) pid(3) c012b30d age_page_down_nolock +<d/28> (0.16) pid(3) c012b39b deactivate_page_nolock +<f/254> (0.32) pid(3) .......... -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@kvack.org. For more info on Linux MM, see: http://www.linux.eu.org/Linux-MM/ ^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: 2.4.0-test9-pre4: __alloc_pages(...) try_again: 2000-09-21 5:21 ` Mike Galbraith @ 2000-09-21 18:17 ` Roger Larsson 2000-09-22 6:20 ` Mike Galbraith 0 siblings, 1 reply; 9+ messages in thread From: Roger Larsson @ 2000-09-21 18:17 UTC (permalink / raw) To: Mike Galbraith; +Cc: linux-mm Mike Galbraith wrote: > > On Wed, 20 Sep 2000, Roger Larsson wrote: > > > Hi, > > > > > > Trying to find out why test9-pre4 freezes with mmap002 > > I added a counter for try_again loops. > > > > ... __alloc_pages(...) > > > > int direct_reclaim = 0; > > unsigned int gfp_mask = zonelist->gfp_mask; > > struct page * page = NULL; > > + int try_again_loops = 0; > > > > - - - > > > > + printk("VM: sync kswapd (direct_reclaim: %d) try_again # > > %d\n", > > + direct_reclaim, ++try_again_loops); > > wakeup_kswapd(1); > > goto try_again; > > > > > > Result was surprising: > > direct_reclaim was 1. > > try_again_loops did never stop increasing (note: it is not static, > > and should restart from zero after each success) > > > > Why does this happen? > > a) kswapd did not succeed in freeing a suitable page? > > b) __alloc_pages did not succeed in grabbing the page? > > Hi Roger, > > A trace of locked up box shows endless repetitions of kswapd aparantly > failing to free anything. What I don't see in the trace snippet below > is reclaim_page(). I wonder if this test in __alloc_pages_limit() > should include an || direct_reclaim. When I have run into this problem I have had no inactive_clean pages => reclaim_page() should not work... :-( That is your situation too... > > if (z->free_pages + z->inactive_clean_pages > water_mark) { > struct page *page = NULL; > /* If possible, reclaim a page directly. */ > if (direct_reclaim && z->free_pages < z->pages_min + 8) > page = reclaim_page(z); > /* If that fails, fall back to rmqueue. */ > if (!page) > page = rmqueue(z, order); > if (page) > return page; > } > > dmesg log after breakout: > SysRq: Suspending trace > SysRq: Show Memory > Mem-info: > Free pages: 1404kB ( 0kB HighMem) > ( Active: 274, inactive_dirty: 49, inactive_clean: 0, free: 351 (255 510 765) ) > 0*4kB 0*8kB 0*16kB 0*32kB 0*64kB 0*128kB 0*256kB 1*512kB 0*1024kB 0*2048kB = 512kB) > 1*4kB 1*8kB 1*16kB 1*32kB 1*64kB 0*128kB 1*256kB 1*512kB 0*1024kB 0*2048kB = 892kB) > = 0kB) > Swap cache: add 470291, delete 470291, find 212689/658406 > Free swap: 199024kB > 32752 pages of RAM > 0 pages of HIGHMEM > 4321 reserved pages > 296 pages shared > 0 pages swap cached > 0 pages in page table cache > Buffer memory: 120kB > SysRq: Terminate All Tasks > -- Home page: http://www.norran.net/nra02596/ -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@kvack.org. For more info on Linux MM, see: http://www.linux.eu.org/Linux-MM/ ^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: 2.4.0-test9-pre4: __alloc_pages(...) try_again: 2000-09-21 18:17 ` Roger Larsson @ 2000-09-22 6:20 ` Mike Galbraith 2000-09-22 8:49 ` Rik van Riel 0 siblings, 1 reply; 9+ messages in thread From: Mike Galbraith @ 2000-09-22 6:20 UTC (permalink / raw) To: Roger Larsson; +Cc: linux-mm On Thu, 21 Sep 2000, Roger Larsson wrote: > Mike Galbraith wrote: > > > > On Wed, 20 Sep 2000, Roger Larsson wrote: > > > > > Hi, > > > > > > > > > Trying to find out why test9-pre4 freezes with mmap002 > > > I added a counter for try_again loops. > > > > > > ... __alloc_pages(...) > > > > > > int direct_reclaim = 0; > > > unsigned int gfp_mask = zonelist->gfp_mask; > > > struct page * page = NULL; > > > + int try_again_loops = 0; > > > > > > - - - > > > > > > + printk("VM: sync kswapd (direct_reclaim: %d) try_again # > > > %d\n", > > > + direct_reclaim, ++try_again_loops); > > > wakeup_kswapd(1); > > > goto try_again; > > > > > > > > > Result was surprising: > > > direct_reclaim was 1. > > > try_again_loops did never stop increasing (note: it is not static, > > > and should restart from zero after each success) > > > > > > Why does this happen? > > > a) kswapd did not succeed in freeing a suitable page? > > > b) __alloc_pages did not succeed in grabbing the page? > > > > Hi Roger, > > > > A trace of locked up box shows endless repetitions of kswapd aparantly > > failing to free anything. What I don't see in the trace snippet below > > is reclaim_page(). I wonder if this test in __alloc_pages_limit() > > should include an || direct_reclaim. > > When I have run into this problem I have had no inactive_clean pages => > reclaim_page() should not work... :-( > That is your situation too... (Yup, after further reading I quickly ceased wondering about that;) Much more interesting (i hope) is that refill_inactive() _is_ present 2923 times, we're oom as heck, and neither shm_swap() nor swap_out() is ever reached in 1048533 lines of trace. The only way I can see that this can happen is if refill_inactive_scan() eats all count. :-) I'm currently wo^Handering what count is and if I shouldn't try checking nr_inactive_clean_pages() before exiting the loop. -Mike -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@kvack.org. For more info on Linux MM, see: http://www.linux.eu.org/Linux-MM/ ^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: 2.4.0-test9-pre4: __alloc_pages(...) try_again: 2000-09-22 6:20 ` Mike Galbraith @ 2000-09-22 8:49 ` Rik van Riel 2000-09-22 16:51 ` Mike Galbraith 0 siblings, 1 reply; 9+ messages in thread From: Rik van Riel @ 2000-09-22 8:49 UTC (permalink / raw) To: Mike Galbraith; +Cc: Roger Larsson, linux-mm On Fri, 22 Sep 2000, Mike Galbraith wrote: > Much more interesting (i hope) is that refill_inactive() _is_ > present 2923 times, we're oom as heck, and neither shm_swap() > nor swap_out() is ever reached in 1048533 lines of trace. The > only way I can see that this can happen is if > refill_inactive_scan() eats all count. > :-) I'm currently wo^Handering what count is and if I shouldn't try > checking nr_inactive_clean_pages() before exiting the loop. This means that refill_inactive_scan() has moved so many pages to the inactive_dirty/clean list we have satisfied both the inactive_target and the free_target ... Maybe, however, these pages are not freeable and page_launder() moves them back to the active list and we end up moving pages from one list to another ???? With the latest change to get all pages properly deactivated when needed, maybe this is possible to happen. It /seems/ possible now that I think about it, very very unlikely, but possible ;( Btw, was there any swap free when you got into this situation ? regards, Rik -- "What you're running that piece of shit Gnome?!?!" -- Miguel de Icaza, UKUUG 2000 http://www.conectiva.com/ http://www.surriel.com/ -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@kvack.org. For more info on Linux MM, see: http://www.linux.eu.org/Linux-MM/ ^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: 2.4.0-test9-pre4: __alloc_pages(...) try_again: 2000-09-22 8:49 ` Rik van Riel @ 2000-09-22 16:51 ` Mike Galbraith 2000-09-23 11:28 ` 2.4.0-test9-pre6: __alloc_pages(...) datapoint Mike Galbraith 0 siblings, 1 reply; 9+ messages in thread From: Mike Galbraith @ 2000-09-22 16:51 UTC (permalink / raw) To: Rik van Riel; +Cc: Roger Larsson, linux-mm On Fri, 22 Sep 2000, Rik van Riel wrote: > On Fri, 22 Sep 2000, Mike Galbraith wrote: > > > Much more interesting (i hope) is that refill_inactive() _is_ > > present 2923 times, we're oom as heck, and neither shm_swap() > > nor swap_out() is ever reached in 1048533 lines of trace. The > > only way I can see that this can happen is if > > refill_inactive_scan() eats all count. > > > :-) I'm currently wo^Handering what count is and if I shouldn't try > > checking nr_inactive_clean_pages() before exiting the loop. > > This means that refill_inactive_scan() has moved > so many pages to the inactive_dirty/clean list we > have satisfied both the inactive_target and the > free_target ... > > Maybe, however, these pages are not freeable and > page_launder() moves them back to the active list > and we end up moving pages from one list to another ???? I do have page_launder() failing permanently. I also (sysrq-m) do see ;-) list ping-pong being played. I tried 2.4.0-t9p2-vmpatch and had locks with no scheduling happening, always looping in swptst task. Then, I saw the thinko correction and tried that. I still get lockups, though now scheduling happens again. The magnatude of list ping-pong active->inactive_dirty->active has changed from few pages to thousands of pages, but inactive_clean stays 0, free stays constant at 350.. box is as oom as they come. > With the latest change to get all pages properly > deactivated when needed, maybe this is possible to > happen. It /seems/ possible now that I think about > it, very very unlikely, but possible ;( > > Btw, was there any swap free when you got into > this situation ? Yes, the locks always happened (and still do) with swap_used being some random number always far above zero. (question: __get_free_pages below is GFP_USER, is that ok there?) -Mike -----------------------------datapoint--------------------------- dmesg log retrieved after breakout: ... SysRq: Suspending trace (that freezes 1000000+ line 334590.00 usec ktrace buffer during lock.. seperated profiles of kswapd and all_others below) kdb> bt EBP EIP Function(args) 0xc3053d0c 0xc01355f0 sync_page_buffers+0x3c (0xc6f113e0, 0x2) 0xc3053d30 0xc013571b try_to_free_buffers+0x107 (0xc11d61c4, 0x2) 0xc3053d60 0xc012cde0 page_launder+0x294 (0x5, 0x1) 0xc3053d7c 0xc012d60f do_try_to_free_pages+0x47 (0x5, 0x1) 0xc3053d8c 0xc012d88e try_to_free_pages+0x22 (0x5) 0xc3053dac 0xc012e51a __alloc_pages_wrap+0x1fe (0xc02fe8ac, 0x0, 0xc02fe55c) 0xc3053dc0 0xc012e5d3 __get_free_pages_wrap+0x2f (0x5, 0x0, 0xc02fe55c) 0xc3053de0 0xc012ec1b read_swap_cache_async+0x3f (0x10e000, 0x0) 0xc3053e08 0xc0123f68 swapin_readahead+0x90 (0x10e900) 0xc3053e20 0xc0123fcf do_swap_page+0x2b (0xc712e620, 0xc6e4b9e0, 0x804de40, 0xc3050134, 0x10e900) 0xc3053e58 0xc01243b7 handle_mm_fault+0x19b (0xc712e620, 0xc6e4b9e0, 0x804de40, 0x1) more> 0xc3053f0c 0xc0113d21 do_page_fault+0x151 (0xc3053f1c, 0x2, 0x804de40, 0xc02fcc3c, 0x1249) 0xc010abac error_code+0x2c Interrupt registers: eax = 0x00000000 ebx = 0x0804de40 ecx = 0xc02fcc3c edx = 0x00001249 esi = 0xc0354b40 edi = 0x00000000 esp = 0xc3053f50 eip = 0xc011a2b4 ebp = 0xc3053f84 ss = 0x00000018 cs = 0x00000010 eflags = 0x00010246 ds = 0x00000018 es = 0x00000018 origeax = 0xffffffff ®s = 0xc3053f1c 0xc011a2b4 do_syslog+0x140 (0x2, 0x804de40, 0x1000) 0xc3053f98 0xc014c0cc kmsg_read+0x1c (0xc30a4740, 0x804de40, 0x1000, 0xc30a4760) 0xc3053fbc 0xc0131e2f sys_read+0x9b (0x0, 0x804de40, 0x1000, 0x400, 0x804ee40) 0xc010aa84 system_call+0x3c kdb> bp page_launder+0x700 Instruction(i) BP #0 at 0xc012d24c (page_launder+0x700) is enabled globally adjust 1 kdb> go Instruction(i) breakpoint #0 at 0xc012d24c (adjusted) 0xc012d24c page_launder+0x700: ret Entering kdb (0xc3052000) due to Breakpoint @ 0xc012d24c kdb> bt EBP EIP Function(args) 0xc3053d7c 0xc012d24c page_launder+0x700 (0x5, 0x1) 0xc012d60f do_try_to_free_pages+0x47 (0x5, 0x1) 0xc3053d8c 0xc012d88e try_to_free_pages+0x22 (0x5) 0xc3053dac 0xc012e51a __alloc_pages_wrap+0x1fe (0xc02fe8ac, 0x0, 0xc02fe55c) 0xc3053dc0 0xc012e5d3 __get_free_pages_wrap+0x2f (0x5, 0x0, 0xc02fe55c) 0xc3053de0 0xc012ec1b read_swap_cache_async+0x3f (0x10e000, 0x0) 0xc3053e08 0xc0123f68 swapin_readahead+0x90 (0x10e900) 0xc3053e20 0xc0123fcf do_swap_page+0x2b (0xc712e620, 0xc6e4b9e0, 0x804de40, 0xc3050134, 0x10e900) 0xc3053e58 0xc01243b7 handle_mm_fault+0x19b (0xc712e620, 0xc6e4b9e0, 0x804de40, 0x1) 0xc3053f0c 0xc0113d21 do_page_fault+0x151 (0xc3053f1c, 0x2, 0x804de40, 0xc02fcc3c, 0x1249) 0xc010abac error_code+0x2c more> Interrupt registers: eax = 0x00000000 ebx = 0x0804de40 ecx = 0xc02fcc3c edx = 0x00001249 esi = 0xc0354b40 edi = 0x00000000 esp = 0xc3053f50 eip = 0xc011a2b4 ebp = 0xc3053f84 ss = 0x00000018 cs = 0x00000010 eflags = 0x00010246 ds = 0x00000018 es = 0x00000018 origeax = 0xffffffff ®s = 0xc3053f1c 0xc011a2b4 do_syslog+0x140 (0x2, 0x804de40, 0x1000) 0xc3053f98 0xc014c0cc kmsg_read+0x1c (0xc30a4740, 0x804de40, 0x1000, 0xc30a4760) 0xc3053fbc 0xc0131e2f sys_read+0x9b (0x0, 0x804de40, 0x1000, 0x400, 0x804ee40) 0xc010aa84 system_call+0x3c kdb> rd eax = 0x00000000 ebx = 0x00000000 ecx = 0x00000001 edx = 0xc11d558c esi = 0x00000000 edi = 0x00000005 esp = 0xc3053d64 eip = 0xc012d24c ebp = 0xc3053d7c ss = 0x00000018 cs = 0x00000010 eflags = 0x00000202 ds = 0x00000018 es = 0x00000018 origeax = 0xffffffff ®s = 0xc3053d30 kdb> go Instruction(i) breakpoint #0 at 0xc012d24c (adjusted) 0xc012d24c page_launder+0x700: ret Entering kdb (0xc2d5e000) due to Breakpoint @ 0xc012d24c kdb> bt EBP EIP Function(args) 0xc2d5fdb0 0xc012d24c page_launder+0x700 (0x5, 0x1) 0xc012d60f do_try_to_free_pages+0x47 (0x5, 0x1) 0xc2d5fdc0 0xc012d88e try_to_free_pages+0x22 (0x5) 0xc2d5fde0 0xc012e51a __alloc_pages_wrap+0x1fe (0xc02fe8ac, 0x0, 0xc02fe55c) 0xc2d5fdf4 0xc012e5d3 __get_free_pages_wrap+0x2f (0x5, 0x0, 0xc02fe55c) 0xc2d5fe14 0xc012ec1b read_swap_cache_async+0x3f (0x453100, 0x0) 0xc2d5fe3c 0xc0123f68 swapin_readahead+0x90 (0x453400) 0xc2d5fe54 0xc0123fcf do_swap_page+0x2b (0xc712e6c0, 0xc2f4dee0, 0xbfffe1a8, 0xc2d5aff8, 0x453400) 0xc2d5fe8c 0xc01243b7 handle_mm_fault+0x19b (0xc712e6c0, 0xc2f4dee0, 0xbfffe1a8, 0x1) 0xc2d5ff40 0xc0113d21 do_page_fault+0x151 (0xc2d5ff50, 0x2, 0xc2d5e000, 0xbfffe1a8, 0x0) 0xc010abac error_code+0x2c more> Interrupt registers: eax = 0xbfffe1ac ebx = 0xc2d5e000 ecx = 0xbfffe1a8 edx = 0x00000000 esi = 0x00000000 edi = 0x00000000 esp = 0xc2d5ff84 eip = 0xc0140799 ebp = 0xc2d5ffbc ss = 0x00000018 cs = 0x00000010 eflags = 0x00010246 ds = 0x00000018 es = 0x00000018 origeax = 0xffffffff ®s = 0xc2d5ff50 0xc0140799 sys_select+0x395 (0x5, 0xbfffe22c, 0x0, 0x0, 0xbfffe1a8) 0xc010aa84 system_call+0x3c kdb> rd eax = 0x00000000 ebx = 0x00000000 ecx = 0x00000001 edx = 0xc11d558c esi = 0x00000000 edi = 0x00000005 esp = 0xc2d5fd98 eip = 0xc012d24c ebp = 0xc2d5fdb0 ss = 0x00000018 cs = 0x00000010 eflags = 0x00000202 ds = 0x00000018 es = 0x00000018 origeax = 0xffffffff ®s = 0xc2d5fd64 kdb> set LOGGING 0 SysRq: Terminate All Tasks trimmed profile of kswapd portion of trace: %TOTAL-TIME TOTAL-USECS AVG/CALL CALLS ADDR FUNC 0.0706% 51.40 0.99 52 c01092c3 __switch_to 0.2707% 197.09 3.79 52 c012b58b kmem_cache_reap 0.0211% 15.38 0.19 79 c012d3fe inactive_shortage 0.0216% 15.76 0.12 131 c012d357 free_shortage 0.0659% 48.01 0.23 210 c012e6fa nr_inactive_clean_pages 0.1056% 76.88 0.23 341 c012e6b2 nr_free_pages 0.2061% 150.08 0.32 468 c01355c7 sync_page_buffers 0.3044% 221.68 0.24 936 c0135627 try_to_free_buffers 0.3390% 246.86 0.26 936 c012e63d __free_pages 0.3239% 235.86 0.24 963 c0117fb3 __wake_up 21.7178% 15813.72 0.20 80467 c012d263 refill_inactive_scan 24.2149% 17631.96 0.22 81400 c012b95d age_page_down_nolock 51.7663% 37693.43 0.46 81400 c012b9eb deactivate_page_nolock Total entries: 248146 Total usecs: 72814.65 trimmed profile of all_others: %TOTAL-TIME TOTAL-USECS AVG/CALL CALLS ADDR FUNC 0.1098% 287.37 7.77 37 c010f2e2 timer_interrupt 0.0035% 9.26 0.20 46 c010c351 do_IRQ 0.0104% 27.32 0.59 46 c010e5b1 end_8259A_irq 0.0111% 29.00 0.63 46 c011dd14 do_softirq 0.0125% 32.62 0.71 46 c010e631 enable_8259A_irq 0.0108% 28.20 0.60 47 c010c168 handle_IRQ_event 0.0163% 42.65 0.91 47 c010c2b7 do_IRQ 0.0406% 106.41 2.26 47 c010e6e8 mask_and_ack_8259A 0.1509% 395.04 0.12 3288 c012d453 refill_inactive 0.1760% 460.79 0.14 3288 c012d3fe inactive_shortage 0.1849% 483.96 0.15 3288 c01443ef prune_icache 0.1993% 521.67 0.16 3288 c013577a wakeup_bdflush 0.2058% 538.68 0.16 3288 c0142ef0 prune_dcache 0.2582% 675.87 0.21 3288 c01444d9 shrink_icache_memory 0.2679% 701.22 0.21 3288 c014328d shrink_dcache_memory 0.2843% 744.17 0.23 3288 c0118bc6 wake_up_process 0.3264% 854.55 0.26 3288 c014427f dispose_list 1.0448% 2734.97 0.83 3288 c0143f74 sync_all_inodes 0.1619% 423.73 0.13 3289 c012d879 try_to_free_pages 0.1885% 493.39 0.15 3289 c012d5d8 do_try_to_free_pages 0.2297% 601.39 0.18 3289 c012d7b2 wakeup_kswapd 0.4406% 1153.40 0.35 3289 c012cb5f page_launder 0.2189% 572.93 0.17 3314 c013581f flush_dirty_buffers 0.5468% 1431.39 0.43 3315 c0117a07 reschedule_idle 0.3532% 924.58 0.14 6576 c012adca kmem_cache_shrink 0.7292% 1908.83 0.29 6576 c012ad72 __kmem_cache_shrink 1.5872% 4154.79 0.63 6576 c012ad12 is_chained_kmem_cache 6.2938% 16475.60 2.51 6577 c012b58b kmem_cache_reap 1.2516% 3276.47 0.50 6603 c01092c3 __switch_to 2.4868% 6509.88 0.98 6655 c0117ba3 schedule 0.9808% 2567.60 0.26 9867 c012e26c __alloc_pages_limit_wrap 0.6042% 1581.76 0.12 13153 c012d357 free_shortage 1.3030% 3410.99 0.17 19755 c012e6fa nr_inactive_clean_pages 2.3829% 6237.93 0.19 32908 c012e6b2 nr_free_pages 4.3462% 11377.38 0.22 52624 c012d263 refill_inactive_scan 4.6854% 12265.25 0.23 52767 c012b95d age_page_down_nolock 10.7973% 28264.67 0.54 52767 c012b9eb deactivate_page_nolock 13.1619% 34454.59 0.31 111792 c01355c7 sync_page_buffers 11.2581% 29470.95 0.25 118370 c0135627 try_to_free_buffers 20.5987% 53922.25 0.46 118370 c012e63d __free_pages 11.7535% 30767.65 0.25 124974 c0117fb3 __wake_up Total entries: 800409 Total usecs: 261775.35 -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@kvack.org. For more info on Linux MM, see: http://www.linux.eu.org/Linux-MM/ ^ permalink raw reply [flat|nested] 9+ messages in thread
* 2.4.0-test9-pre6: __alloc_pages(...) datapoint 2000-09-22 16:51 ` Mike Galbraith @ 2000-09-23 11:28 ` Mike Galbraith 0 siblings, 0 replies; 9+ messages in thread From: Mike Galbraith @ 2000-09-23 11:28 UTC (permalink / raw) To: linux-mm; +Cc: Rik van Riel [-- Attachment #1: Type: TEXT/PLAIN, Size: 290 bytes --] Hi, I just hit a 'no-schedule' VM loop with test9-6. Datapoint attached. schedule() is being called, but __switch_to isn't happening. Box is oom, and page_launder() fails permanently. inactive_clean is pegged 0. -Mike (trying to prevent inactive_clean from hitting 0.. nil progress;) [-- Attachment #2: Type: TEXT/PLAIN, Size: 11304 bytes --] kdb> ps Task Addr Pid Parent [*] cpu State Thread Command 0xc1e3c000 00000001 00000000 0 000 stop 0xc1e3c260 init 0xc7f9c000 00000002 00000001 0 000 run 0xc7f9c260 kapmd 0xc7f9a000 00000003 00000001 0 000 run 0xc7f9a260 kswapd 0xc7f98000 00000004 00000001 0 000 run 0xc7f98260 kreclaimd 0xc7f96000 00000005 00000001 0 000 run 0xc7f96260 kflushd 0xc7f94000 00000006 00000001 0 000 run 0xc7f94260 kupdate 0xc3e1e000 00000082 00000001 0 000 stop 0xc3e1e260 portmap 0xc3da8000 00000086 00000001 0 000 stop 0xc3da8260 rpc.ugidd 0xc3d8c000 00000093 00000001 0 000 stop 0xc3d8c260 scanlogd 0xc3d58000 00000099 00000001 0 000 run 0xc3d58260 syslogd 0xc3d2a000 00000103 00000001 0 000 run 0xc3d2a260 klogd 0xc3cea000 00000122 00000001 0 000 stop 0xc3cea260 keyserv 0xc3c8e000 00000141 00000001 0 000 run 0xc3c8e260 rpc.mountd 0xc3c54000 00000144 00000001 0 000 run 0xc3c54260 rpc.nfsd 0xc3c28000 00000154 00000001 0 000 stop 0xc3c28260 atd 0xc3b78000 00000163 00000001 0 000 stop 0xc3b78260 inetd 0xc3c4a000 00000165 00000001 0 000 stop 0xc3c4a260 gpm 0xc3ad0000 00000181 00000001 0 000 stop 0xc3ad0260 lpd 0xc3b5a000 00000194 00000001 0 000 stop 0xc3b5a260 sendmail 0xc3e5e000 00000201 00000001 0 000 stop 0xc3e5e260 sshd 0xc3a16000 00000205 00000001 0 000 stop 0xc3a16260 cron more> 0xc396a000 00000212 00000001 0 000 run 0xc396a260 nscd 0xc396e000 00000213 00000212 0 000 run 0xc396e260 nscd 0xc393e000 00000214 00000213 0 000 run 0xc393e260 nscd 0xc3938000 00000215 00000213 0 000 run 0xc3938260 nscd 0xc3a44000 00000217 00000213 0 000 stop 0xc3a44260 nscd 0xc3b40000 00000218 00000213 0 000 stop 0xc3b40260 nscd 0xc39f2000 00000219 00000213 0 000 stop 0xc39f2260 nscd 0xc1ef0000 00000235 00000001 0 000 stop 0xc1ef0260 login 0xc1ec2000 00000236 00000001 0 000 stop 0xc1ec2260 login 0xc7db4000 00000237 00000001 0 000 stop 0xc7db4260 mingetty 0xc7e90000 00000238 00000001 0 000 stop 0xc7e90260 mingetty 0xc3e86000 00000239 00000001 0 000 stop 0xc3e86260 mingetty 0xc7dc8000 00000240 00000001 0 000 stop 0xc7dc8260 mingetty 0xc3882000 00000241 00000235 0 000 stop 0xc3882260 bash 0xc3f1c000 00000250 00000236 0 000 stop 0xc3f1c260 bash 0xc4d54000 00000257 00000250 0 000 run 0xc4d54260*vmstat 0xc4c4c000 00000258 00000241 0 000 run 0xc4c4c260 swptst 0xc4ea4000 00000259 00000258 0 000 run 0xc4ea4260 swptst kdb> btp 257 EBP EIP Function(args) 0xc012c720 page_launder+0xa8 (0x7, 0x1) 0xc4d55f00 0xc012d15f do_try_to_free_pages+0x47 (0x7, 0x1) 0xc4d55f10 0xc012d3d6 try_to_free_pages+0x22 (0x7) 0xc4d55f30 0xc012dfe7 __alloc_pages+0x1db (0xc02fc5f4, 0x0) 0xc4d55f40 0xc012e099 __get_free_pages+0x29 (0x7, 0x0) 0xc4d55f68 0xc012ac40 kmem_cache_grow+0xc4 (0xc1e25d40, 0x7) 0xc4d55f84 0xc012adef kmem_cache_alloc+0x4f (0xc1e25d40, 0x7) 0xc4d55fa0 0xc013b5b9 getname+0x25 (0x8049972) 0xc4d55fbc 0xc0131104 sys_open+0x1c (0x8049972, 0x10800, 0xbffff62c, 0x8049972, 0x3cb) 0xc010aa74 system_call+0x3c kdb> go Entering kdb (0xc4d54000) due to Keyboard Entry kdb> bt EBP EIP Function(args) 0xc4d55ee4 0xc012c889 page_launder+0x211 (0x7, 0x1) 0xc4d55f00 0xc012d15f do_try_to_free_pages+0x47 (0x7, 0x1) 0xc4d55f10 0xc012d3d6 try_to_free_pages+0x22 (0x7) 0xc4d55f30 0xc012dfe7 __alloc_pages+0x1db (0xc02fc5f4, 0x0) 0xc4d55f40 0xc012e099 __get_free_pages+0x29 (0x7, 0x0) 0xc4d55f68 0xc012ac40 kmem_cache_grow+0xc4 (0xc1e25d40, 0x7) 0xc4d55f84 0xc012adef kmem_cache_alloc+0x4f (0xc1e25d40, 0x7) 0xc4d55fa0 0xc013b5b9 getname+0x25 (0x8049972) 0xc4d55fbc 0xc0131104 sys_open+0x1c (0x8049972, 0x10800, 0xbffff62c, 0x8049972, 0x3cb) 0xc010aa74 system_call+0x3c kdb> set LOGGING 0 SysRq: Show Memory Mem-info: Free pages: 1400kB ( 0kB HighMem) ( Active: 2383, inactive_dirty: 17, inactive_clean: 0, free: 350 (255 510 765) ) 0*4kB 0*8kB 0*16kB 0*32kB 0*64kB 0*128kB 0*256kB 1*512kB 0*1024kB 0*2048kB = 512kB) 0*4kB 1*8kB 1*16kB 1*32kB 1*64kB 0*128kB 1*256kB 1*512kB 0*1024kB 0*2048kB = 888kB) = 0kB) Swap cache: add 27827, delete 25738, find 2485/8960 Free swap: 166256kB 32752 pages of RAM 0 pages of HIGHMEM 4325 reserved pages 295 pages shared 2089 pages swap cached 0 pages in page table cache Buffer memory: 68kB SysRq: Show Memory Mem-info: Free pages: 1400kB ( 0kB HighMem) ( Active: 2383, inactive_dirty: 17, inactive_clean: 0, free: 350 (255 510 765) ) 0*4kB 0*8kB 0*16kB 0*32kB 0*64kB 0*128kB 0*256kB 1*512kB 0*1024kB 0*2048kB = 512kB) 0*4kB 1*8kB 1*16kB 1*32kB 1*64kB 0*128kB 1*256kB 1*512kB 0*1024kB 0*2048kB = 888kB) = 0kB) Swap cache: add 27827, delete 25738, find 2485/8960 Free swap: 166256kB 32752 pages of RAM 0 pages of HIGHMEM 4325 reserved pages 295 pages shared 2089 pages swap cached 0 pages in page table cache Buffer memory: 68kB SysRq: Show Memory Mem-info: Free pages: 1400kB ( 0kB HighMem) ( Active: 2367, inactive_dirty: 33, inactive_clean: 0, free: 350 (255 510 765) ) 0*4kB 0*8kB 0*16kB 0*32kB 0*64kB 0*128kB 0*256kB 1*512kB 0*1024kB 0*2048kB = 512kB) 0*4kB 1*8kB 1*16kB 1*32kB 1*64kB 0*128kB 1*256kB 1*512kB 0*1024kB 0*2048kB = 888kB) = 0kB) Swap cache: add 27827, delete 25738, find 2485/8960 Free swap: 166256kB 32752 pages of RAM 0 pages of HIGHMEM 4325 reserved pages 295 pages shared 2089 pages swap cached 0 pages in page table cache Buffer memory: 68kB SysRq: Show Memory Mem-info: Free pages: 1400kB ( 0kB HighMem) ( Active: 2375, inactive_dirty: 24, inactive_clean: 0, free: 350 (255 510 765) ) 0*4kB 0*8kB 0*16kB 0*32kB 0*64kB 0*128kB 0*256kB 1*512kB 0*1024kB 0*2048kB = 512kB) 0*4kB 1*8kB 1*16kB 1*32kB 1*64kB 0*128kB 1*256kB 1*512kB 0*1024kB 0*2048kB = 888kB) = 0kB) Swap cache: add 27827, delete 25738, find 2485/8960 Free swap: 166256kB 32752 pages of RAM 0 pages of HIGHMEM 4325 reserved pages 295 pages shared 2089 pages swap cached 0 pages in page table cache Buffer memory: 68kB SysRq: Show Memory Mem-info: Free pages: 1400kB ( 0kB HighMem) ( Active: 2380, inactive_dirty: 20, inactive_clean: 0, free: 350 (255 510 765) ) 0*4kB 0*8kB 0*16kB 0*32kB 0*64kB 0*128kB 0*256kB 1*512kB 0*1024kB 0*2048kB = 512kB) 0*4kB 1*8kB 1*16kB 1*32kB 1*64kB 0*128kB 1*256kB 1*512kB 0*1024kB 0*2048kB = 888kB) = 0kB) Swap cache: add 27827, delete 25738, find 2485/8960 Free swap: 166256kB 32752 pages of RAM 0 pages of HIGHMEM 4325 reserved pages 295 pages shared 2089 pages swap cached 0 pages in page table cache Buffer memory: 68kB SysRq: Show Memory Mem-info: Free pages: 1400kB ( 0kB HighMem) ( Active: 2367, inactive_dirty: 32, inactive_clean: 0, free: 350 (255 510 765) ) 0*4kB 0*8kB 0*16kB 0*32kB 0*64kB 0*128kB 0*256kB 1*512kB 0*1024kB 0*2048kB = 512kB) 0*4kB 1*8kB 1*16kB 1*32kB 1*64kB 0*128kB 1*256kB 1*512kB 0*1024kB 0*2048kB = 888kB) = 0kB) Swap cache: add 27827, delete 25738, find 2485/8960 Free swap: 166256kB 32752 pages of RAM 0 pages of HIGHMEM 4325 reserved pages 296 pages shared 2089 pages swap cached 0 pages in page table cache Buffer memory: 68kB SysRq: Terminate All Tasks SysRq: Terminate All Tasks <didn't work. fortunately for me current is vmstat:> Unable to handle kernel NULL pointer dereference at virtual address 00000000 c02fc18c *pde = 00000000 Oops: 0002 CPU: 0 EIP: 0010:[<c02fc18c>] EFLAGS: 00010246 eax: 00000000 ebx: c4d54000 ecx: c034f540 edx: 00000000 esi: c02fc600 edi: 00000001 ebp: c4d55f10 esp: c4d55ee4 ds: 0018 es: 0018 ss: 0018 Process vmstat (pid: 257, stackpage=c4d55000) Stack: 00000023 00000000 c02fc600 01234567 c4d54000 00000000 00000000 01234567 c4d54000 00000000 00000000 c4d55f30 c012df42 00000000 00000000 00000007 c1e25d40 00000007 00000001 c4d55f40 c012e099 c02fc5f4 00000000 c4d55f68 Call Trace: [<c012df42>] (0) [<c012e099>] (32) [<c012ac40>] (16) [<c012adef>] (40) [<c013b5b9>] (28) [<c0131104>] (28) [<c010aa74>] (28) [<ffffffff>] Code: 00 00 00 00 c8 bf f9 c7 c8 bf f9 c7 00 00 00 00 9c c1 2f c0 >>EIP; c02fc18c <kswapd_wait+0/c> <===== Trace; c012df42 <__alloc_pages+136/264> Trace; c012e099 <__get_free_pages+29/38> Trace; c012ac40 <kmem_cache_grow+c4/224> Trace; c012adef <kmem_cache_alloc+4f/64> Trace; c013b5b9 <getname+25/a8> Trace; c0131104 <sys_open+1c/d4> Trace; c010aa74 <system_call+3c/40> <snip decode of long ago frozen trace buffer> trimmed profile of frozen trace buffer: Total entries: 1048555 Total usecs: 306733.07 %TOTAL-TIME TOTAL-USECS AVG/CALL CALLS ADDR FUNC 0.0169% 51.97 7.42 7 c0117bf3 schedule 0.0260% 79.87 1.90 42 c012113c update_one_process 0.0361% 110.77 2.64 42 c012102d update_wall_time_one_tick 0.0389% 119.33 2.44 49 c010e6a8 mask_and_ack_8259A 0.1040% 319.13 7.60 42 c010f2a2 timer_interrupt 0.2113% 648.25 0.12 5400 c012cfa3 refill_inactive 0.2427% 744.51 0.14 5400 c012d3c1 try_to_free_pages 0.2476% 759.32 0.14 5400 c012cf4e inactive_shortage 0.2601% 797.69 0.15 5400 c014495f prune_icache 0.2648% 812.23 0.15 5400 c012d128 do_try_to_free_pages 0.2667% 818.12 0.15 5400 c0143444 prune_dcache 0.2903% 890.30 0.16 5400 c01447ef dispose_list 0.2970% 911.15 0.17 5400 c013500a wakeup_bdflush 0.3303% 1013.14 0.19 5400 c012d2fe wakeup_kswapd 0.3831% 1175.24 0.22 5400 c0144a49 shrink_icache_memory 0.4028% 1235.43 0.23 5400 c01437ed shrink_dcache_memory 0.4935% 1513.78 0.14 10800 c012aa3a kmem_cache_shrink 0.5372% 1647.67 0.31 5400 c012c68b page_launder 0.7191% 2205.76 0.41 5400 c0118c16 wake_up_process 0.8518% 2612.77 0.24 10800 c012a9e2 __kmem_cache_shrink 0.9255% 2838.67 0.13 21600 c012cea7 free_shortage 1.1119% 3410.50 0.21 16200 c012dd64 __alloc_pages_limit 1.4108% 4327.35 0.40 10802 c0117fff __wake_up 1.4517% 4452.75 0.16 27000 c012e1aa nr_inactive_clean_pages 1.5890% 4874.09 0.90 5400 c01444db sync_all_inodes 1.8562% 5693.61 0.53 10800 c012a982 is_chained_kmem_cache 2.8923% 8871.67 0.18 48600 c012e16e nr_free_pages 5.3308% 16351.35 0.19 86400 c012cdb3 refill_inactive_scan 5.7669% 17689.02 0.20 86436 c012b45d age_page_down_nolock 8.2360% 25262.57 2.34 10800 c012b08b kmem_cache_reap 11.9917% 36782.49 0.43 86436 c012b4eb deactivate_page_nolock 14.7300% 45181.80 0.25 183604 c0134ebb try_to_free_buffers 17.3333% 53167.09 0.30 178200 c0134e5b sync_page_buffers 19.2186% 58949.66 0.32 183604 c012e0f9 __free_pages ^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: 2.4.0-test9-pre4: __alloc_pages(...) try_again: 2000-09-20 20:23 2.4.0-test9-pre4: __alloc_pages(...) try_again: Roger Larsson 2000-09-21 5:21 ` Mike Galbraith @ 2000-09-21 16:26 ` Rik van Riel 2000-09-21 17:58 ` Roger Larsson 1 sibling, 1 reply; 9+ messages in thread From: Rik van Riel @ 2000-09-21 16:26 UTC (permalink / raw) To: Roger Larsson; +Cc: linux-kernel, linux-mm, Juan J. Quintela On Wed, 20 Sep 2000, Roger Larsson wrote: > I added a counter for try_again loops. > > ... __alloc_pages(...) > > int direct_reclaim = 0; > unsigned int gfp_mask = zonelist->gfp_mask; > struct page * page = NULL; > + int try_again_loops = 0; > > - - - > > + printk("VM: sync kswapd (direct_reclaim: %d) try_again # > %d\n", > + direct_reclaim, ++try_again_loops); > wakeup_kswapd(1); > goto try_again; > > > Result was surprising: > direct_reclaim was 1. > try_again_loops did never stop increasing (note: it is not static, > and should restart from zero after each success) > > Why does this happen? > a) kswapd did not succeed in freeing a suitable page? > b) __alloc_pages did not succeed in grabbing the page? No. It's a locking issue. In __alloc_pages we may be holding some IO locks (gfp_mask & __GFP_IO == 0), then we wait on kswapd and schedule out. The result is that kswapd will be spinning on a lock it can never get. The fix for this was included in the patch I sent to Linus for 2.4.0-tes9-pre2, but unfortunately not included. Ia'll send it agani soon. There is anathor, more subtle, case like this in buffer.c, which I have also fixed in my local tree here. The patch for this will be online on my home paeg soon. (connectivity isn't that great as you can see from mytyping) cheers, Rik -- "What you're running that piece of shit Gnome?!?!" -- Miguel de Icaza, UKUUG 2000 http://www.conectiva.com/ http://www.surriel.com/ -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@kvack.org. For more info on Linux MM, see: http://www.linux.eu.org/Linux-MM/ ^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: 2.4.0-test9-pre4: __alloc_pages(...) try_again: 2000-09-21 16:26 ` 2.4.0-test9-pre4: __alloc_pages(...) try_again: Rik van Riel @ 2000-09-21 17:58 ` Roger Larsson 0 siblings, 0 replies; 9+ messages in thread From: Roger Larsson @ 2000-09-21 17:58 UTC (permalink / raw) To: Rik van Riel; +Cc: linux-kernel, linux-mm, Juan J. Quintela Rik van Riel wrote: > > On Wed, 20 Sep 2000, Roger Larsson wrote: > > > I added a counter for try_again loops. > > > > ... __alloc_pages(...) > > > > int direct_reclaim = 0; > > unsigned int gfp_mask = zonelist->gfp_mask; > > struct page * page = NULL; > > + int try_again_loops = 0; > > > > - - - > > > > + printk("VM: sync kswapd (direct_reclaim: %d) try_again # > > %d\n", > > + direct_reclaim, ++try_again_loops); > > wakeup_kswapd(1); > > goto try_again; > > > > > > Result was surprising: > > direct_reclaim was 1. > > try_again_loops did never stop increasing (note: it is not static, > > and should restart from zero after each success) > > > > Why does this happen? > > a) kswapd did not succeed in freeing a suitable page? > > b) __alloc_pages did not succeed in grabbing the page? > > No. > > It's a locking issue. In __alloc_pages we may be > holding some IO locks (gfp_mask & __GFP_IO == 0), > then we wait on kswapd and schedule out. The result is that > kswapd will be spinning on a lock it can never get. Hmm... If it did how could the __alloc_pages wakeup_kswapd(1) return, it should be synchronous... (counter is incremented past 1, a lot past...) In addition to this I have seen that kswapd really loops with a printk in the loop. (first in do_try_to_free_pages to be exact) > > The fix for this was included in the patch I sent to Linus > for 2.4.0-tes9-pre2, but unfortunately not included. > Ia'll send it agani soon. > Ok, I will try the patch. But I really think that the problem is something different... > There is anathor, more subtle, case like this in > buffer.c, which I have also fixed in my local tree here. > > The patch for this will be online on my home paeg > soon. (connectivity isn't that great as you can see from mytyping) > > cheers, > > Rik > -- > "What you're running that piece of shit Gnome?!?!" > -- Miguel de Icaza, UKUUG 2000 > > http://www.conectiva.com/ http://www.surriel.com/ -- Home page: http://www.norran.net/nra02596/ -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@kvack.org. For more info on Linux MM, see: http://www.linux.eu.org/Linux-MM/ ^ permalink raw reply [flat|nested] 9+ messages in thread
end of thread, other threads:[~2000-09-23 11:28 UTC | newest] Thread overview: 9+ messages (download: mbox.gz / follow: Atom feed) -- links below jump to the message on this page -- 2000-09-20 20:23 2.4.0-test9-pre4: __alloc_pages(...) try_again: Roger Larsson 2000-09-21 5:21 ` Mike Galbraith 2000-09-21 18:17 ` Roger Larsson 2000-09-22 6:20 ` Mike Galbraith 2000-09-22 8:49 ` Rik van Riel 2000-09-22 16:51 ` Mike Galbraith 2000-09-23 11:28 ` 2.4.0-test9-pre6: __alloc_pages(...) datapoint Mike Galbraith 2000-09-21 16:26 ` 2.4.0-test9-pre4: __alloc_pages(...) try_again: Rik van Riel 2000-09-21 17:58 ` Roger Larsson
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox