* Re: Ideas for memory management hackers.
From: Rik van Riel @ 1997-12-09 14:53 UTC
  To: Stephen Thomas; +Cc: linux-mm

On Tue, 9 Dec 1997, Stephen Thomas wrote:

> Should vhand have any effect on memory utilisation figures,
> as reported by /proc/meminfo? If so, then vhand did not seem
> to be achieving much, for all its hard work ...

I have integrated mmap aging in kswapd, without the need for
vhand, in 2.1.71 (experimental). As ppp isn't working in 2.1.71
I'm back to 2.1.66 now, but I have seen kswapd use over 10% of
CPU for short times now :(

But it doesn't have the disadvantage of having to scan
constantly, and it seemed to work better than vhand (it seems
that page->accessed isn't updated automatically, and has to be
done via pte->flags in the page-table scanning done by kswapd)...

This would mean vhand has a fundamental design flaw, which would
explain why some people saw a boost in performance, while
others saw performance worsen...

I think I'll send it to Linus (together with Zlatko's
big-order hack) as a bug-fix (we're on feature-freeze after all:)
for inclusion in 2.1.72...

opinions please,

Rik.
--
Send Linux memory-management wishes to me: I'm currently looking
for something to hack...

* Re: Ideas for memory management hackers.
From: Dr. Werner Fink @ 1997-12-09 16:11 UTC
  To: H.H.vanRiel; +Cc: linux-mm

>
> I have integrated mmap aging in kswapd, without the need for
> vhand, in 2.1.71 (experimental). As ppp isn't working in 2.1.71
> I'm back to 2.1.66 now, but I have seen kswapd use over 10% of
> CPU for short times now :(

Q: if ageing is now a separate part the CPU usage of freeing a page
   in kswapd and __get_free_pages should drop, shouldn't it?

> I think I'll send it to Linus (together with Zlatko's
> big-order hack) as a bug-fix (we're on feature-freeze after all:)
> for inclusion in 2.1.72...
>
> opinions please,

Q2: Is the patch available (ftp/http) for testing/reading?

          Werner

* Re: Ideas for memory management hackers.
From: Rik van Riel @ 1997-12-09 17:10 UTC
  To: Dr. Werner Fink; +Cc: linux-mm

On Tue, 9 Dec 1997, Dr. Werner Fink wrote:

> >
> > I have integrated mmap aging in kswapd, without the need for
> > vhand, in 2.1.71 (experimental). As ppp isn't working in 2.1.71
> > I'm back to 2.1.66 now, but I have seen kswapd use over 10% of
> > CPU for short times now :(
>
> Q: if ageing is now a separate part the CPU usage of freeing a page
>    in kswapd and __get_free_pages should drop, shouldn't it?

In this new patch, aging is not a separate process, because
vhand has a design flaw in it :(( (I think).
The page->referenced flag is not updated by the mmu, instead
it updates the pte->accessed flag...
Now vhand can't handle normal user pages (this explains the
higher swap usage) and they are swapped more often, actually,
they are swapped by a second chance fifo algorithm now, so
nobody noticed decreased performance...

>
> > I think I'll send it to Linus (together with Zlatko's
> > big-order hack) as a bug-fix (we're on feature-freeze after all:)
> > for inclusion in 2.1.72...
> >
> > opinions please,
>
> Q2: Is the patch available (ftp/http) for testing/reading?

RSN...

Rik.
--
Send Linux memory-management wishes to me: I'm currently looking
for something to hack...

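[Editor's sketch, not part of the original message.] The "second chance
fifo" behaviour mentioned above can be illustrated in user space. This is
only a toy model under stated assumptions: struct toy_pte and
second_chance_pick() are hypothetical names, the accessed bit is set by
hand rather than by an MMU, and nothing here is taken from Rik's actual
patch, which is not shown in the thread.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a page-table entry; only the accessed bit
 * matters for this sketch. */
struct toy_pte {
    bool accessed;  /* set by the MMU on real hardware; set by hand here */
};

/*
 * Second-chance scan over n entries (n must be > 0), starting at *hand.
 * An entry whose accessed bit is set gets the bit cleared and is skipped;
 * the first entry found with the bit already clear is the eviction victim.
 * The loop ends within two sweeps because the first sweep clears every bit.
 */
size_t second_chance_pick(struct toy_pte *ptes, size_t n, size_t *hand)
{
    for (;;) {
        size_t i = *hand;
        *hand = (*hand + 1) % n;
        if (ptes[i].accessed)
            ptes[i].accessed = false;  /* used recently: one more round */
        else
            return i;                  /* untouched since last sweep */
    }
}
```

The point of the model is that the scanner itself has to read and clear
the per-pte bit; a flag that nobody updates would never give a page its
second chance.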
* Re: Ideas for memory management hackers.
From: Zlatko Calusic @ 1997-12-10 13:13 UTC
  To: Dr. Werner Fink; +Cc: linux-mm

"Dr. Werner Fink" <werner@suse.de> writes:

> >
> > I have integrated mmap aging in kswapd, without the need for
> > vhand, in 2.1.71 (experimental). As ppp isn't working in 2.1.71
> > I'm back to 2.1.66 now, but I have seen kswapd use over 10% of
> > CPU for short times now :(
>
> Q: if ageing is now a separate part the CPU usage of freeing a page
>    in kswapd and __get_free_pages should drop, shouldn't it?
>
> > I think I'll send it to Linus (together with Zlatko's
> > big-order hack) as a bug-fix (we're on feature-freeze after all:)
> > for inclusion in 2.1.72...
> >
> > opinions please,
>
> Q2: Is the patch available (ftp/http) for testing/reading?
>

Due to a heavy demand... :)

Here it comes (comments below):

diff -urN linux-2.1.61/include/linux/swap.h linux/include/linux/swap.h
--- linux-2.1.61/include/linux/swap.h	Sat Oct 18 00:49:15 1997
+++ linux/include/linux/swap.h	Fri Oct 31 19:42:09 1997
@@ -34,6 +34,7 @@
 
 extern int nr_swap_pages;
 extern int nr_free_pages;
+extern int nr_free_pages_bigorder;
 extern atomic_t nr_async_pages;
 extern int min_free_pages;
 extern int free_pages_low;
diff -urN linux-2.1.61/mm/page_alloc.c linux/mm/page_alloc.c
--- linux-2.1.61/mm/page_alloc.c	Tue Jun 17 01:36:01 1997
+++ linux/mm/page_alloc.c	Fri Oct 31 19:42:10 1997
@@ -30,6 +30,9 @@
 int nr_swap_pages = 0;
 int nr_free_pages = 0;
 
+/* Number of the free pages in chunks of order 2 and bigger */
+int nr_free_pages_bigorder = 0;
+
 /*
  * Free area management
  *
@@ -118,12 +121,17 @@
 		if (!test_and_change_bit(index, area->map))
 			break;
 		remove_mem_queue(list(map_nr ^ -mask));
+		if (order >= 2)
+			nr_free_pages_bigorder -= 1 << order;
 		mask <<= 1;
+		order++;
 		area++;
 		index >>= 1;
 		map_nr &= mask;
 	}
 	add_mem_queue(area, list(map_nr));
+	if (order >= 2)
+		nr_free_pages_bigorder += 1 << order;
 
 #undef list
 
@@ -171,6 +179,8 @@
 				(prev->next = ret->next)->prev = prev; \
 				MARK_USED(map_nr, new_order, area); \
 				nr_free_pages -= 1 << order; \
+				if (new_order >= 2) \
+					nr_free_pages_bigorder -= 1 << new_order; \
 				EXPAND(ret, map_nr, order, new_order, area); \
 				spin_unlock_irqrestore(&page_alloc_lock, flags); \
 				return ADDRESS(map_nr); \
@@ -187,6 +197,8 @@
 		area--; high--; size >>= 1; \
 		add_mem_queue(area, map); \
 		MARK_USED(index, high, area); \
+		if (high >= 2) \
+			nr_free_pages_bigorder += 1 << high; \
 		index += size; \
 		map += size; \
 	} \
diff -urN linux-2.1.61/mm/vmscan.c linux/mm/vmscan.c
--- linux-2.1.61/mm/vmscan.c	Thu Oct 23 22:30:25 1997
+++ linux/mm/vmscan.c	Fri Oct 31 19:42:10 1997
@@ -465,7 +465,8 @@
 			pages = nr_free_pages;
 			if (nr_free_pages >= min_free_pages)
 				pages += atomic_read(&nr_async_pages);
-			if (pages >= free_pages_high)
+			if (pages >= free_pages_high &&
+			    nr_free_pages_bigorder >= min_free_pages / 2)
 				break;
 			wait = (pages < free_pages_low);
 			if (try_to_free_page(GFP_KERNEL, 0, wait))
@@ -489,7 +490,7 @@
 	int want_wakeup = 0, memory_low = 0;
 	int pages = nr_free_pages + atomic_read(&nr_async_pages);
 
-	if (pages < free_pages_low)
+	if (pages < free_pages_low || nr_free_pages_bigorder < min_free_pages / 2)
 		memory_low = want_wakeup = 1;
 	else if (pages < free_pages_high && jiffies >= next_swap_jiffies)
 		want_wakeup = 1;

It was originally developed for 2.1.61, but it works perfectly on
2.1.71 (I just checked). It was posted on linux-kernel list during
recent problems (massive unsubscribe), so many people missed it.

Now some comments on the patch:

I had nasty lockups with all 2.1 kernels. I traced the problem down to
the network stuff, which was trying to allocate pages of order 2, which
was constantly failing. The problem was (and still is!) that Linux
doesn't swap pages out to get more free memory if it already has
free_pages_high or more free pages. Of course, that is correct
behaviour, but... sometimes memory is completely fragmented, and all
free chunks are of one or two pages, so there's no way you could get
16KB of contiguous memory (even if you have 512KB free!). Networking
can't proceed without that and if you're logged in remotely you're in
fact completely disconnected.

The patch was my initial attempt to solve that problem, but in the
end I found that it had some other problems which I didn't like. Many
people who tried it reported that their machines swapped much more
with the patch applied. And I noticed it for myself, too. It is true
that in exactly those cases where Linux swaps out heavily to get bigger
chunks of memory, it would lock up without the patch, but in the end I
didn't like the idea and abandoned work on it.

My opinion is that the problem is much bigger, and we will need much
more hard work to resolve it in the future.

That shouldn't stop anybody from experimenting with the patch, since
it is simple enough and thoroughly tested, so you won't have any
problems with it. If you don't count heavier swapping, that is. :)

My current workaround against network blockups is in mm/slab.c, where I
explicitly ask the slab allocator to stop using such big memory
chunks for small network buffers (mostly of ~1700 bytes in size or
less). It works perfectly and nobody knows how (and if) it affects
performance, but (so?) I'm happy. :)

Regards,
--
Posted by Zlatko Calusic           E-mail: <Zlatko.Calusic@CARNet.hr>
---------------------------------------------------------------------
	  What has four legs and an arm?  A happy pitbull.

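[Editor's sketch, not part of the original message.] The fragmentation
case described above (512KB free, yet no 16KB chunk) can be reproduced
with a few lines of arithmetic over per-order free counts. This is a
user-space toy, assuming 4KB pages so that 16KB is an order-2 chunk;
NR_ORDERS, the array layout and all three function names are made up for
the illustration and are not kernel interfaces.

```c
#include <stdbool.h>

#define NR_ORDERS 6  /* assumption for this sketch only */

/* free_chunks[k] holds the number of free chunks of 2^k contiguous pages. */
unsigned long total_free_pages(const unsigned long free_chunks[NR_ORDERS])
{
    unsigned long pages = 0;
    for (int k = 0; k < NR_ORDERS; k++)
        pages += free_chunks[k] << k;
    return pages;
}

/* Free pages that sit in chunks of order 2 and bigger, which is the
 * quantity the patch tracks in nr_free_pages_bigorder. */
unsigned long bigorder_free_pages(const unsigned long free_chunks[NR_ORDERS])
{
    unsigned long pages = 0;
    for (int k = 2; k < NR_ORDERS; k++)
        pages += free_chunks[k] << k;
    return pages;
}

/* A request of a given order can be met by any free chunk of that order
 * or larger (a larger chunk would be split). */
bool can_allocate(const unsigned long free_chunks[NR_ORDERS], int order)
{
    for (int k = order; k < NR_ORDERS; k++)
        if (free_chunks[k] > 0)
            return true;
    return false;
}
```

With 64 single pages and 32 two-page chunks free, total_free_pages()
reports 128 pages (512KB at 4KB per page) while can_allocate(..., 2)
is false, which is the situation the networking code ran into.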
* Re: Ideas for memory management hackers.
From: Dr. Werner Fink @ 1997-12-10 15:21 UTC
  To: Zlatko.Calusic; +Cc: linux-mm

[patch deleted ... found at http://www.fys.ruu.nl/~riel/ :-)]

>
> It was originally developed for 2.1.61, but it works perfectly on
> 2.1.71 (I just checked). It was posted on linux-kernel list during
> recent problems (massive unsubscribe), so many people missed it.
>
> Now some comments on the patch:
>
> I had nasty lockups with all 2.1 kernels. I traced the problem down to
> the network stuff, which was trying to allocate pages of order 2, which
> was constantly failing. The problem was (and still is!) that Linux
> doesn't swap pages out to get more free memory if it already has
> free_pages_high or more free pages. Of course, that is correct
> behaviour, but... sometimes memory is completely fragmented, and all
> free chunks are of one or two pages, so there's no way you could get
> 16KB of contiguous memory (even if you have 512KB free!). Networking
> can't proceed without that and if you're logged in remotely you're in
> fact completely disconnected.

In other words, better memory defragmentation is needed for 2.2, isn't it?
A simple approach could be an additional address check during the scans
in shrink_mmap (mm/filemap.c) instead of freeing the first unused
(random) page. This could be used in the first few priorities to free pages
mostly useful for defragmentation.

Another approach is Ben's anonymous ageing of physical task pages
found in http://www.kvack.org/~blah/patches/v2_1_47_ben1.gz ...
this approach gives a link to the pte of a page, which is needed for
ageing the page.


          Werner

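[Editor's sketch, not part of the original message.] One possible reading
of the "address check" idea is to prefer a victim whose buddy page is
already free, so that freeing it produces a larger contiguous chunk. The
message does not spell out the check, so this is an assumption, and the
function names and the boolean free map below are hypothetical; the real
shrink_mmap works on the kernel's own page structures.

```c
#include <stdbool.h>
#include <stddef.h>

/* The buddy of an order-0 page is the page whose index differs only in
 * the lowest bit. */
static size_t buddy_of(size_t page)
{
    return page ^ 1;
}

/*
 * Among n candidate page indices (n must be > 0), return the first one
 * whose buddy is already free, since freeing it would merge two single
 * pages into one order-1 chunk. If no candidate qualifies, fall back to
 * the first candidate, which models the "free the first unused page"
 * behaviour.
 */
size_t pick_victim(const bool *page_free, const size_t *candidates, size_t n)
{
    for (size_t i = 0; i < n; i++)
        if (page_free[buddy_of(candidates[i])])
            return candidates[i];
    return candidates[0];
}
```

The fallback keeps the scan from failing outright when no
defragmentation-friendly page exists, in the spirit of applying the check
only in "the first few priorities".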
* Re: Ideas for memory management hackers.
From: Benjamin C.R. LaHaise @ 1997-12-10 19:13 UTC
  To: Dr. Werner Fink; +Cc: Zlatko.Calusic, linux-mm

On Wed, 10 Dec 1997, Dr. Werner Fink wrote:

> In other words, better memory defragmentation is needed for 2.2, isn't it?
> A simple approach could be an additional address check during the scans
> in shrink_mmap (mm/filemap.c) instead of freeing the first unused
> (random) page. This could be used in the first few priorities to free pages
> mostly useful for defragmentation.
>
> Another approach is Ben's anonymous ageing of physical task pages
> found in http://www.kvack.org/~blah/patches/v2_1_47_ben1.gz ...
> this approach gives a link to the pte of a page, which is needed for
> ageing the page.

The past few times this has come up, the general argument from a few core
people is that if one *really* cares to find the pte's pointing to a page,
traversing the list of vma's attached to the inode, for which a pointer
already exists, would be sufficient. Until I come up with something
really kick-ass, I really doubt the pte-list stuff will be included.

		-ben

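[Editor's sketch, not part of the original message.] The "traverse the
vma's attached to the inode" argument boils down to an address
calculation per mapping. The toy below assumes each mapping records its
virtual range and the file offset mapped at its start; struct toy_vma,
its field names and addresses_mapping() are hypothetical and only
loosely modelled on what a kernel would keep. Looking up the pte at each
resulting address is left out.

```c
#include <stddef.h>

/* Hypothetical, simplified mapping descriptor. */
struct toy_vma {
    unsigned long vm_start, vm_end;  /* virtual range [vm_start, vm_end) */
    unsigned long vm_offset;         /* file offset mapped at vm_start */
    struct toy_vma *next_share;      /* next mapping of the same file */
};

/*
 * Walk every mapping of one file and store, in out[], the virtual address
 * at which file offset `off` appears in each mapping that covers it.
 * At most `max` addresses are stored; the number stored is returned.
 */
size_t addresses_mapping(const struct toy_vma *share_list, unsigned long off,
                         unsigned long *out, size_t max)
{
    size_t n = 0;
    for (const struct toy_vma *v = share_list; v && n < max; v = v->next_share) {
        unsigned long len = v->vm_end - v->vm_start;
        if (off >= v->vm_offset && off - v->vm_offset < len)
            out[n++] = v->vm_start + (off - v->vm_offset);
    }
    return n;
}
```

The cost of this approach grows with the number of mappings of the file,
which is the trade-off against keeping a direct pte list per page.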
* Re: Ideas for memory management hackers.
From: Rik van Riel @ 1997-12-10 21:55 UTC
  To: Benjamin C.R. LaHaise; +Cc: Dr. Werner Fink, Zlatko.Calusic, linux-mm

On Wed, 10 Dec 1997, Benjamin C.R. LaHaise wrote:

> The past few times this has come up, the general argument from a few core
> people is that if one *really* cares to find the pte's pointing to a page,
> traversing the list of vma's attached to the inode, for which a pointer

Well then, it looks like we've become the core people :)
Let's just do what we want, and if the quality is good
enough, Linus will include it.

Rik.
--
Send Linux memory-management wishes to me: I'm currently looking
for something to hack...

* Re: Ideas for memory management hackers.
From: Benjamin C.R. LaHaise @ 1997-12-09 17:45 UTC
  To: Rik van Riel; +Cc: Stephen Thomas, linux-mm

...
> I think I'll send it to Linus (together with Zlatko's
> big-order hack) as a bug-fix (we're on feature-freeze after all:)
> for inclusion in 2.1.72...

Hrmpf - what is Zlatko's big-order hack?

		-ben

* Re: Ideas for memory management hackers.
From: Rik van Riel @ 1997-12-09 17:53 UTC
  To: Benjamin C.R. LaHaise; +Cc: Stephen Thomas, linux-mm

On Tue, 9 Dec 1997, Benjamin C.R. LaHaise wrote:

> ...
> > I think I'll send it to Linus (together with Zlatko's
> > big-order hack) as a bug-fix (we're on feature-freeze after all:)
> > for inclusion in 2.1.72...
>
> Hrmpf - what is Zlatko's big-order hack?

You can (at least) get it from my homepage at
http://www.fys.ruu.nl/~riel, and maybe from
Zlatko's homepage (if he has one).

It is a patch that makes sure that there are at least
min_free_pages/2 pages in 4-page chunks available...
This way kernel functions (network, soundcard) can
always allocate (at least) 16kb chunks of memory.
This solves quite some crashes...

Rik.
--
Send Linux memory-management wishes to me: I'm currently looking
for something to hack...

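[Editor's sketch, not part of the original message.] The rule described
above matches the two conditions changed in mm/vmscan.c by the patch
posted in this thread. They are restated here as stand-alone predicates
so they can be tried outside the kernel. The struct and function names
are hypothetical wrappers; only the comparisons follow the posted diff,
and atomic_read() is replaced by a plain int.

```c
#include <stdbool.h>

/* Hypothetical bundle of the counters the patch reads. */
struct mm_counters {
    int nr_free_pages;
    int nr_async_pages;
    int nr_free_pages_bigorder;  /* free pages in chunks of order >= 2 */
    int min_free_pages;
    int free_pages_low;
    int free_pages_high;
};

/* kswapd loop exit test: enough free pages overall AND enough of them in
 * big chunks. */
bool kswapd_may_stop(const struct mm_counters *c)
{
    int pages = c->nr_free_pages;
    if (c->nr_free_pages >= c->min_free_pages)
        pages += c->nr_async_pages;
    return pages >= c->free_pages_high &&
           c->nr_free_pages_bigorder >= c->min_free_pages / 2;
}

/* Wakeup test: memory counts as low when either the total or the
 * big-chunk share falls short. */
bool memory_is_low(const struct mm_counters *c)
{
    int pages = c->nr_free_pages + c->nr_async_pages;
    return pages < c->free_pages_low ||
           c->nr_free_pages_bigorder < c->min_free_pages / 2;
}
```

Because the big-chunk test is combined with the total, a machine with
plenty of free but fragmented memory keeps swapping until larger chunks
appear, which is consistent with the heavier swapping reported for the
patch.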
Thread overview: 9+ messages
[not found] <348D3B36.673BEE82@nospam.isltd.insignia.co.uk>
1997-12-09 14:53 ` Ideas for memory management hackers Rik van Riel
1997-12-09 16:11   ` Dr. Werner Fink
1997-12-09 17:10     ` Rik van Riel
1997-12-10 13:13     ` Zlatko Calusic
1997-12-10 15:21       ` Dr. Werner Fink
1997-12-10 19:13         ` Benjamin C.R. LaHaise
1997-12-10 21:55           ` Rik van Riel
1997-12-09 17:45   ` Benjamin C.R. LaHaise
1997-12-09 17:53     ` Rik van Riel