* More info: 2.1.108 page cache performance on low memory
From: Stephen C. Tweedie @ 1998-07-13 16:53 UTC
To: linux-mm
Cc: Rik van Riel, Ingo Molnar, Benjamin LaHaise, Alan Cox, Linus Torvalds, Stephen Tweedie

Hi all,

OK, a bit more benchmarking is showing bad problems with page ageing.
I've been running 2.1 with a big ramdisk and without, with page ageing
and without.  The results for a simple compile job (make a few
dependency files, then compile four .c files) look like this:

2.0.34, 6MB RAM:                  1:22

2.1.108, 16MB RAM, 10MB ramdisk:
  With page cache ageing:         not usable (swap death during boot)
  Without cache ageing:           8:47

2.1.108, 6MB RAM:
  With page cache ageing:         4:14
  Without cache ageing:           3:22

So we can see that on these low memory configurations, the page cache
ageing is a definite performance loss.  The situation with the ramdisk
is VERY markedly worse, which I think we can attribute to an overly
large page cache caused by the %age-of-physical-memory tuning
parameters; I'll be following this up to check (that's easy, since
those parameters are sysctl-able).  This is not an artificial
situation: fixing the page cache limits as a percentage of physical
pages is just not going to work if large numbers of those pages can be
locked down for particular purposes.  Effectively we're reducing the
size of the page pool without the VM taking it into account.

Performance sucks overall compared to 2.0.  That may well be due to
the extra memory lost to the inode and dirent caches on 2.1, which
tend to grow much more than they did before; it may be that we can
address that without too much pain.  It is certainly possible to trim
back the kernel's caching of unused inodes/dirents, and although a
self-tuning system will be necessary in the long term, putting bounds
on these caches will at least let us see if this is where things are
going wrong.

I'll be experimenting a bit more to try to identify just where the
performance is disappearing here.  However you look at it, things look
pretty grim on 2.1 right now on low memory machines.

--Stephen
* Re: More info: 2.1.108 page cache performance on low memory
From: Eric W. Biederman @ 1998-07-13 18:08 UTC
To: Stephen C. Tweedie
Cc: linux-mm

>>>>> "ST" == Stephen C Tweedie <sct@redhat.com> writes:

ST> OK, a bit more benchmarking is showing bad problems with page ageing.
ST> I've been running 2.1 with a big ramdisk and without, with page ageing
ST> and without.  The results for a simple compile job (make a few
ST> dependency files, then compile four .c files) look like this:

ST> 2.0.34, 6MB RAM:                  1:22

ST> 2.1.108, 16MB RAM, 10MB ramdisk:
ST>   With page cache ageing:         not usable (swap death during boot)
ST>   Without cache ageing:           8:47

ST> 2.1.108, 6MB RAM:
ST>   With page cache ageing:         4:14
ST>   Without cache ageing:           3:22

O.k.  Just a few thoughts.

1) We have a minimum size for the buffer cache as a percentage of
   physical pages.  Setting the minimum to 0% may help.

2) If we play with an LRU list, it may be most practical to use the
   page->next and page->prev fields for the list, and for
   truncate_inode_pages && invalidate_inode_pages do something like:

	for (i = 0; i < inode->i_size; i += PAGE_SIZE) {
		page = find_in_page_cache(inode, i);
		if (page)
			/* remove it */ ;
	}

   and remove the inode->i_pages list.  This should be roughly
   equivalent to the bforgets needed by truncate anyway, so it should
   impose no large performance penalty.

Personally I think it is broken to set the limits of cache sizes
(buffer & page) to anything besides max=100%, min=0% by default.
But now that we have this hand-tuning option in addition to auto
tuning, we should experiment with it as well.

Eric
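For reference, the buffer cache minimum Eric mentions in (1) is
enforced in mm/filemap.c's reclaim loop; this is the check as it
stands in 2.1 (quoted from the mm/filemap.c hunk of the patch posted
later in this thread; buffermem counts bytes, num_physpages counts
pages):

	/* Refuse to swap out all buffer pages */
	if ((buffermem >> PAGE_SHIFT) * 100 <
	    (buffer_mem.min_percent * num_physpages))
		goto next;

Setting min_percent to 0 through /proc/sys/vm/buffermem (the sysctl is
a plain three-integer vector: min, borrow, max percent) disables that
floor without recompiling.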
* Re: More info: 2.1.108 page cache performance on low memory
From: Zlatko Calusic @ 1998-07-13 18:29 UTC
To: Eric W. Biederman
Cc: Stephen C. Tweedie, linux-mm

ebiederm+eric@npwt.net (Eric W. Biederman) writes:

> >>>>> "ST" == Stephen C Tweedie <sct@redhat.com> writes:
>
> ST> 2.0.34, 6MB RAM:                  1:22
>
> ST> 2.1.108, 16MB RAM, 10MB ramdisk:
> ST>   With page cache ageing:         not usable (swap death during boot)
> ST>   Without cache ageing:           8:47
>
> ST> 2.1.108, 6MB RAM:
> ST>   With page cache ageing:         4:14
> ST>   Without cache ageing:           3:22

I agree that ageing of the page cache has a bad impact on performance.
Benchmarking disks reveals a much lower read speed, mostly thanks to
needless, excessive swapping: pages get swapped out only to be swapped
back in a few seconds later (the page cache likes to take 90% of
memory when copying large files).  This produces lots of redundant
head movement (not to mention copying pages...) which effectively cuts
read speed in half.

I have personally been running a system with a heavily patched VM
subsystem for at least the last three months.  The sad thing is that
my patch mostly undoes the latest changes. :(

Just to mention: I have 64MB of physical memory, and my machine is
definitely not memory starved, but it also suffers from some of the
recent VM changes.

> O.k.  Just a few thoughts.
> 1) We have a minimum size for the buffer cache as a percentage of
>    physical pages.  Setting the minimum to 0% may help.
[...]
> Personally I think it is broken to set the limits of cache sizes
> (buffer & page) to anything besides max=100%, min=0% by default.

Exactly.  That (removing the cache limits) is one of my favourite
changes.  Free memory == unused memory == bad policy!  There is no
reason why any of the caches should not utilize all of the free memory
at any given moment.  But we must be very careful to swap out only
unneeded pages if we decide to enlarge the cache at the expense of
text and data pages.

> But now that we have this hand-tuning option in addition to auto
> tuning, we should experiment with it as well.

If anybody wants to see, I can provide benchmark results, but I'm not
prepared to compile another kernel image if nobody's interested. :)

Regards!
--
Posted by Zlatko Calusic        E-mail: <Zlatko.Calusic@CARNet.hr>
---------------------------------------------------------------------
10 out of 5 doctors feel it's OK to be skitzo!
* Re: More info: 2.1.108 page cache performance on low memory
From: Stephen C. Tweedie @ 1998-07-14 17:32 UTC
To: Zlatko.Calusic
Cc: Eric W. Biederman, Stephen C. Tweedie, linux-mm

Hi,

On 13 Jul 1998 20:29:33 +0200, Zlatko Calusic <Zlatko.Calusic@CARNet.hr>
said:

> I agree that ageing of the page cache has a bad impact on
> performance.

> Just to mention: I have 64MB of physical memory, and my machine is
> definitely not memory starved, but it also suffers from some of the
> recent VM changes.

Yep.  Has anybody else got observations about what sort of
configurations are helped or hindered by the current 2.1 changes?

> That (removing the cache limits) is one of my favourite changes.
> Free memory == unused memory == bad policy!
> There is no reason why any of the caches should not utilize all of
> the free memory at any given moment.

The existing limits don't affect the ability of the cache to grow;
they just give a target bound for the cache when we start trying to
get pages back for something else.

> If anybody wants to see, I can provide benchmark results, but I'm
> not prepared to compile another kernel image if nobody's
> interested. :)

Well, I've been compiling kernels all day for this. :)  Any
information you can give will help, but for now it does look as if
backing out the cache ageing is a necessary first step.

--Stephen
* Re: More info: 2.1.108 page cache performance on low memory
From: Zlatko Calusic @ 1998-07-16 12:31 UTC
To: Stephen C. Tweedie
Cc: Eric W. Biederman, linux-mm

"Stephen C. Tweedie" <sct@redhat.com> writes:

> Well, I've been compiling kernels all day for this. :)  Any
> information you can give will help, but for now it does look as if
> backing out the cache ageing is a necessary first step.

OK, here we go:

Official 2.1.108:

              -------Sequential Output-------- ---Sequential Input-- --Random--
              -Per Char- --Block--- -Rewrite-- -Per Char- --Block--- --Seeks---
Machine    MB K/sec %CPU K/sec %CPU K/sec %CPU K/sec %CPU K/sec %CPU  /sec %CPU
          200  4552 65.5  5011 20.5  2570 21.1  5643 74.4  4077 14.9  84.8  2.9
                           ^^^^                             ^^^^

Patched 2.1.108 (no page aging, no cache limits, modified slab, etc...
see below):

              -------Sequential Output-------- ---Sequential Input-- --Random--
              -Per Char- --Block--- -Rewrite-- -Per Char- --Block--- --Seeks---
Machine    MB K/sec %CPU K/sec %CPU K/sec %CPU K/sec %CPU K/sec %CPU  /sec %CPU
          200  6449 89.7  7450 31.4  2605 22.2  6052 80.5  7269 27.3 105.4  2.9
                           ^^^^                             ^^^^

I'm appending the patch that produced the results above.  I don't
claim my work is suitable for anything; it is just part of my Linux MM
exploration, testing and simplifying things.  But it has worked stably
and fast for me for the last few months, and has survived all the
torture testing I've put on it.  YMMV, of course.

Test platform is a P166MMX, 64MB RAM, aic7xxx, Fujitsu M2954ESP.  The
results are completely reproducible.

Regards,
------------------------------------------------------------
diff -urN --exclude-from=exclude linux-old/Documentation/sysctl/vm.txt linux/Documentation/sysctl/vm.txt
--- linux-old/Documentation/sysctl/vm.txt	Fri Jun 26 19:44:26 1998
+++ linux/Documentation/sysctl/vm.txt	Tue Jul 14 21:32:56 1998
@@ -15,13 +15,9 @@
 Currently, these files are in /proc/sys/vm:
 - bdflush
-- buffermem
 - freepages
-- kswapd
 - overcommit_memory
-- pagecache
 - swapctl
-- swapout_interval
 
 ==============================================================
@@ -90,80 +86,23 @@
 age_super is for filesystem metadata.
 
 ==============================================================
-buffermem:
-
-The three values in this file correspond to the values in
-the struct buffer_mem. It controls how much memory should
-be used for buffer memory. The percentage is calculated
-as a percentage of total system memory.
-
-The values are:
-min_percent    -- this is the minimum percentage of memory
-                  that should be spent on buffer memory
-borrow_percent -- when Linux is short on memory, and the
-                  buffer cache uses more memory, free pages
-                  are stolen from it
-max_percent    -- this is the maximum amount of memory that
-                  can be used for buffer memory
-
-==============================================================
 
 freepages:
 
 This file contains the values in the struct freepages.
 That struct contains three members: min, low and high.
 
-Although the goal of the Linux memory management subsystem
-is to avoid fragmentation and make large chunks of free
-memory (so that we can hand out DMA buffers and such), there
-still are some page-based limits in the system, mainly to
-make sure we don't waste too much memory trying to get large
-free area's.
-
 The meaning of the numbers is:
 
 freepages.min	When the number of free pages in the system
 		reaches this number, only the kernel can
 		allocate more memory.
-freepages.low	If memory is too fragmented, the swapout
-		daemon is started, except when the number
-		of free pages is larger than freepages.low.
-freepages.high	The swapping daemon exits when memory is
-		sufficiently defragmented, when the number
-		of free pages reaches freepages.high or when
-		it has tried the maximum number of times.
-
-==============================================================
-
-kswapd:
-
-Kswapd is the kernel swapout daemon. That is, kswapd is that
-piece of the kernel that frees memory when it gets fragmented
-or full. Since every system is different, you'll probably want
-some control over this piece of the system.
-
-The numbers in this page correspond to the numbers in the
-struct pager_daemon {tries_base, tries_min, swap_cluster
-};
-The tries_base and swap_cluster probably have the
-largest influence on system performance.
-
-tries_base	The maximum number of pages kswapd tries to
-		free in one round is calculated from this
-		number. Usually this number will be divided
-		by 4 or 8 (see mm/vmscan.c), so it isn't as
-		big as it looks.
-		When you need to increase the bandwidth to/from
-		swap, you'll want to increase this number.
-tries_min	This is the minimum number of times kswapd
-		tries to free a page each time it is called.
-		Basically it's just there to make sure that
-		kswapd frees some pages even when it's being
-		called with minimum priority.
-swap_cluster	This is the number of pages kswapd writes in
-		one turn. You want this large so that kswapd
-		does it's I/O in large chunks and the disk
-		doesn't have to seek often, but you don't want
-		it to be too large since that would flood the
-		request queue.
+freepages.low	When the number of free pages drops below
+		this number, swapping daemon (kswapd) is
+		woken up.
+freepages.high	This is kswapd's target, when there are more
+		free pages than this number, kswapd will stop
+		running.
 
 ==============================================================
@@ -206,18 +145,6 @@
 
 ==============================================================
 
-pagecache:
-
-This file does exactly the same as buffermem, only this
-file controls the struct page_cache, and thus controls
-the amount of memory allowed for memory mapping of files.
-
-You don't want the minimum level to be too low, otherwise
-your system might thrash when memory is tight or fragmentation
-is high...
-
-==============================================================
-
 swapctl:
 
 This file contains no less than 8 variables.
@@ -273,15 +200,3 @@
 process pages in order to satisfy buffer memory demands,
 you might want to either increase sc_bufferout_weight, or
 decrease the value of sc_pageout_weight.
-
-==============================================================
-
-swapout_interval:
-
-The single value in this file controls the amount of time
-between successive wakeups of kswapd when nr_free_pages is
-between free_pages_low and free_pages_high. The default value
-of HZ/4 is usually right, but when kswapd can't keep up with
-the number of allocations in your system, you might want to
-decrease this number.
-
diff -urN --exclude-from=exclude linux-old/fs/buffer.c linux/fs/buffer.c
--- linux-old/fs/buffer.c	Fri Jun 26 19:44:35 1998
+++ linux/fs/buffer.c	Tue Jul 14 21:32:56 1998
@@ -704,7 +704,7 @@
 		 * of other sizes, this is necessary now that we
 		 * no longer have the lav code.
 		 */
-		try_to_free_buffer(bh,&bh,1);
+		try_to_free_buffer(bh, &bh);
 		if (!bh)
 			break;
 		continue;
@@ -733,9 +733,7 @@
 	/* We are going to try to locate this much memory. */
 	needed = bdf_prm.b_un.nrefill * size;
 
-	while ((nr_free_pages > freepages.min*2) &&
-	       (buffermem >> PAGE_SHIFT) * 100 < (buffer_mem.max_percent * num_physpages) &&
-	       grow_buffers(GFP_BUFFER, size)) {
+	while ((nr_free_pages > freepages.low) && grow_buffers(GFP_BUFFER, size)) {
 		obtained += PAGE_SIZE;
 		if (obtained >= needed)
 			return;
@@ -1646,8 +1644,7 @@
 * try_to_free_buffer() checks if all the buffers on this particular page
 * are unused, and free's the page if so.
 */
-int try_to_free_buffer(struct buffer_head * bh, struct buffer_head ** bhp,
-		       int priority)
+int try_to_free_buffer(struct buffer_head * bh, struct buffer_head ** bhp)
 {
 	unsigned long page;
 	struct buffer_head * tmp, * p;
@@ -1659,11 +1656,9 @@
 	do {
 		if (!tmp)
 			return 0;
-		if (tmp->b_count || buffer_protected(tmp) ||
-		    buffer_dirty(tmp) || buffer_locked(tmp) ||
-		    buffer_waiting(tmp))
-			return 0;
-		if (priority && buffer_touched(tmp))
+		if (tmp->b_count || buffermem < PAGE_SIZE * freepages.low ||
+		    buffer_protected(tmp) || buffer_dirty(tmp) || buffer_locked(tmp)
+		    || buffer_waiting(tmp) || buffer_touched(tmp))
 			return 0;
 		tmp = tmp->b_this_page;
 	} while (tmp != bh);
diff -urN --exclude-from=exclude linux-old/include/linux/fs.h linux/include/linux/fs.h
--- linux-old/include/linux/fs.h	Thu May 21 01:21:42 1998
+++ linux/include/linux/fs.h	Tue Jul 14 21:32:56 1998
@@ -707,7 +707,7 @@
 extern void refile_buffer(struct buffer_head * buf);
 extern void set_writetime(struct buffer_head * buf, int flag);
-extern int try_to_free_buffer(struct buffer_head*, struct buffer_head**, int);
+extern int try_to_free_buffer(struct buffer_head*, struct buffer_head**);
 
 extern int nr_buffers;
 extern int buffermem;
diff -urN --exclude-from=exclude linux-old/include/linux/mm.h linux/include/linux/mm.h
--- linux-old/include/linux/mm.h	Thu Jul 2 20:07:56 1998
+++ linux/include/linux/mm.h	Tue Jul 14 21:32:56 1998
@@ -253,23 +253,6 @@
 
 /* memory.c & swap.c*/
 
-/*
- * This traverses "nr" memory size lists,
- * and returns true if there is enough memory.
- *
- * For example, we want to keep on waking up
- * kswapd every once in a while until the highest
- * memory order has an entry (ie nr == 0), but
- * we want to do it in the background.
- *
- * We want to do it in the foreground only if
- * none of the three highest lists have enough
- * memory. Random number.
- */
-extern int free_memory_available(int nr);
-#define kswapd_continue()	(!free_memory_available(3))
-#define kswapd_wakeup()		(!free_memory_available(0))
-
 #define free_page(addr) free_pages((addr),0)
 extern void FASTCALL(free_pages(unsigned long addr, unsigned long order));
 extern void FASTCALL(__free_page(struct page *));
diff -urN --exclude-from=exclude linux-old/include/linux/swap.h linux/include/linux/swap.h
--- linux-old/include/linux/swap.h	Tue Jun 16 23:29:10 1998
+++ linux/include/linux/swap.h	Tue Jul 14 21:32:56 1998
@@ -50,7 +50,7 @@
 extern int shm_swap (int, int);
 
 /* linux/mm/vmscan.c */
-extern int try_to_free_page(int);
+extern void try_to_free_page(int);
 
 /* linux/mm/page_io.c */
 extern void rw_swap_page(int, unsigned long, char *, int);
@@ -92,17 +92,6 @@
 * swap cache stuff (in linux/mm/swap_state.c)
 */
 
-#define SWAP_CACHE_INFO
-
-#ifdef SWAP_CACHE_INFO
-extern unsigned long swap_cache_add_total;
-extern unsigned long swap_cache_add_success;
-extern unsigned long swap_cache_del_total;
-extern unsigned long swap_cache_del_success;
-extern unsigned long swap_cache_find_total;
-extern unsigned long swap_cache_find_success;
-#endif
-
 extern inline unsigned long in_swap_cache(struct page *page)
 {
 	if (PageSwapCache(page))
@@ -126,21 +115,6 @@
 	if (PageFreeAfter(page))
 		count--;
 	return (count > 1);
-}
-
-/*
- * When we're freeing pages from a user application, we want
- * to cluster swapouts too. -- Rik.
- * linux/mm/page_alloc.c
- */
-static inline int try_to_free_pages(int gfp_mask, int count)
-{
-	int retval = 0;
-	while (count--) {
-		if (try_to_free_page(gfp_mask))
-			retval = 1;
-	}
-	return retval;
 }
 
 /*
diff -urN --exclude-from=exclude linux-old/include/linux/swapctl.h linux/include/linux/swapctl.h
--- linux-old/include/linux/swapctl.h	Thu May 21 01:21:43 1998
+++ linux/include/linux/swapctl.h	Tue Jul 14 21:32:56 1998
@@ -31,16 +31,6 @@
 typedef swapstat_v1 swapstat_t;
 extern swapstat_t swapstats;
 
-typedef struct buffer_mem_v1
-{
-	unsigned int	min_percent;
-	unsigned int	borrow_percent;
-	unsigned int	max_percent;
-} buffer_mem_v1;
-typedef buffer_mem_v1 buffer_mem_t;
-extern buffer_mem_t buffer_mem;
-extern buffer_mem_t page_cache;
-
 typedef struct freepages_v1
 {
 	unsigned int	min;
@@ -49,15 +39,6 @@
 } freepages_v1;
 typedef freepages_v1 freepages_t;
 extern freepages_t freepages;
-
-typedef struct pager_daemon_v1
-{
-	unsigned int	tries_base;
-	unsigned int	tries_min;
-	unsigned int	swap_cluster;
-} pager_daemon_v1;
-typedef pager_daemon_v1 pager_daemon_t;
-extern pager_daemon_t pager_daemon;
 
 #define SC_VERSION	1
 #define SC_MAX_VERSION	1
diff -urN --exclude-from=exclude linux-old/include/linux/sysctl.h linux/include/linux/sysctl.h
--- linux-old/include/linux/sysctl.h	Tue Jun 16 23:29:10 1998
+++ linux/include/linux/sysctl.h	Tue Jul 14 21:32:56 1998
@@ -74,13 +74,9 @@
 enum
 {
 	VM_SWAPCTL=1,		/* struct: Set vm swapping control */
-	VM_SWAPOUT,		/* int: Background pageout interval */
 	VM_FREEPG,		/* struct: Set free page thresholds */
 	VM_BDFLUSH,		/* struct: Control buffer cache flushing */
 	VM_OVERCOMMIT_MEMORY,	/* Turn off the virtual memory safety limit */
-	VM_BUFFERMEM,		/* struct: Set buffer memory thresholds */
-	VM_PAGECACHE,		/* struct: Set cache memory thresholds */
-	VM_PAGERDAEMON,		/* struct: Control kswapd behaviour */
 	VM_PGT_CACHE		/* struct: Set page table cache parameters */
 };
diff -urN --exclude-from=exclude linux-old/kernel/sysctl.c linux/kernel/sysctl.c
--- linux-old/kernel/sysctl.c	Tue Jun 16 23:29:11 1998
+++ linux/kernel/sysctl.c	Tue Jul 14 21:32:56 1998
@@ -7,7 +7,7 @@
 * Added hooks for /proc/sys/net (minor, minor patch), 96/4/1, Mike Shaver.
 * Added kernel/java-{interpreter,appletviewer}, 96/5/10, Mike Shaver.
 * Dynamic registration fixes, Stephen Tweedie.
- * Added kswapd-interval, ctrl-alt-del, printk stuff, 1/8/97, Chris Horn.
+ * Added ctrl-alt-del, printk stuff, 1/8/97, Chris Horn.
 * Made sysctl support optional via CONFIG_SYSCTL, 1/10/97, Chris Horn.
 */
@@ -37,7 +37,7 @@
 
 /* External variables not in a header file. */
 extern int panic_timeout;
-extern int console_loglevel, C_A_D, swapout_interval;
+extern int console_loglevel, C_A_D;
 extern int bdf_prm[], bdflush_min[], bdflush_max[];
 extern char binfmt_java_interpreter[], binfmt_java_appletviewer[];
 extern int sysctl_overcommit_memory;
@@ -191,21 +191,12 @@
 static ctl_table vm_table[] = {
 	{VM_SWAPCTL, "swapctl", &swap_control, sizeof(swap_control_t),
 	 0644, NULL, &proc_dointvec},
-	{VM_SWAPOUT, "swapout_interval",
-	 &swapout_interval, sizeof(int), 0644, NULL, &proc_dointvec},
 	{VM_FREEPG, "freepages",
 	 &freepages, sizeof(freepages_t), 0644, NULL, &proc_dointvec},
 	{VM_BDFLUSH, "bdflush", &bdf_prm, 9*sizeof(int), 0600, NULL,
-	 &proc_dointvec_minmax, &sysctl_intvec, NULL,
-	 &bdflush_min, &bdflush_max},
+	 &proc_dointvec_minmax, &sysctl_intvec, NULL, &bdflush_min, &bdflush_max},
 	{VM_OVERCOMMIT_MEMORY, "overcommit_memory", &sysctl_overcommit_memory,
 	 sizeof(sysctl_overcommit_memory), 0644, NULL, &proc_dointvec},
-	{VM_BUFFERMEM, "buffermem",
-	 &buffer_mem, sizeof(buffer_mem_t), 0644, NULL, &proc_dointvec},
-	{VM_PAGECACHE, "pagecache",
-	 &page_cache, sizeof(buffer_mem_t), 0644, NULL, &proc_dointvec},
-	{VM_PAGERDAEMON, "kswapd",
-	 &pager_daemon, sizeof(pager_daemon_t), 0644, NULL, &proc_dointvec},
 	{VM_PGT_CACHE, "pagetable_cache",
 	 &pgt_cache_water, 2*sizeof(int), 0600, NULL, &proc_dointvec},
 	{0}
diff -urN --exclude-from=exclude linux-old/mm/filemap.c linux/mm/filemap.c
--- linux-old/mm/filemap.c	Thu Jul 2 20:07:56 1998
+++ linux/mm/filemap.c	Tue Jul 14 21:32:56 1998
@@ -150,10 +150,6 @@
 			}
 			tmp = tmp->b_this_page;
 		} while (tmp != bh);
-
-		/* Refuse to swap out all buffer pages */
-		if ((buffermem >> PAGE_SHIFT) * 100 < (buffer_mem.min_percent * num_physpages))
-			goto next;
 	}
 
 	/* We can't throw away shared pages, but we do mark
@@ -164,15 +160,11 @@
 
 	switch (atomic_read(&page->count)) {
 	case 1:
+		/* If it has been referenced recently, don't free it */
+		if (test_and_clear_bit(PG_referenced, &page->flags))
+			break;
 		/* is it a swap-cache or page-cache page? */
 		if (page->inode) {
-			if (test_and_clear_bit(PG_referenced, &page->flags)) {
-				touch_page(page);
-				break;
-			}
-			age_page(page);
-			if (page->age || page_cache_size * 100 < (page_cache.min_percent * num_physpages))
-				break;
 			if (PageSwapCache(page)) {
 				delete_from_swap_cache(page);
 				return 1;
@@ -182,13 +174,8 @@
 			__free_page(page);
 			return 1;
 		}
-		/* It's not a cache page, so we don't do aging.
-		 * If it has been referenced recently, don't free it */
-		if (test_and_clear_bit(PG_referenced, &page->flags))
-			break;
-
 		/* is it a buffer cache page? */
-		if ((gfp_mask & __GFP_IO) && bh && try_to_free_buffer(bh, &bh, 6))
+		if ((gfp_mask & __GFP_IO) && bh && try_to_free_buffer(bh, &bh))
 			return 1;
 		break;
diff -urN --exclude-from=exclude linux-old/mm/page_alloc.c linux/mm/page_alloc.c
--- linux-old/mm/page_alloc.c	Fri Jun 26 19:44:38 1998
+++ linux/mm/page_alloc.c	Tue Jul 14 21:32:56 1998
@@ -100,53 +100,6 @@
 */
 spinlock_t page_alloc_lock = SPIN_LOCK_UNLOCKED;
 
-/*
- * This routine is used by the kernel swap daemon to determine
- * whether we have "enough" free pages. It is fairly arbitrary,
- * but this had better return false if any reasonable "get_free_page()"
- * allocation could currently fail..
- *
- * This will return zero if no list was found, non-zero
- * if there was memory (the bigger, the better).
- */
-int free_memory_available(int nr)
-{
-	int retval = 0;
-	unsigned long flags;
-	struct free_area_struct * list;
-
-	/*
-	 * If we have more than about 3% to 5% of all memory free,
-	 * consider it to be good enough for anything.
-	 * It may not be, due to fragmentation, but we
-	 * don't want to keep on forever trying to find
-	 * free unfragmented memory.
-	 * Added low/high water marks to avoid thrashing -- Rik.
-	 */
-	if (nr_free_pages > (nr ? freepages.low : freepages.high))
-		return nr+1;
-
-	list = free_area + NR_MEM_LISTS;
-	spin_lock_irqsave(&page_alloc_lock, flags);
-	/* We fall through the loop if the list contains one
-	 * item. -- thanks to Colin Plumb <colin@nyx.net>
-	 */
-	do {
-		list--;
-		/* Empty list? Bad - we need more memory */
-		if (list->next == memory_head(list))
-			break;
-		/* One item on the list? Look further */
-		if (list->next->next == memory_head(list))
-			continue;
-		/* More than one item? We're ok */
-		retval = nr + 1;
-		break;
-	} while (--nr >= 0);
-	spin_unlock_irqrestore(&page_alloc_lock, flags);
-	return retval;
-}
-
 static inline void free_pages_ok(unsigned long map_nr, unsigned long order)
 {
 	struct free_area_struct *area = free_area + order;
@@ -215,30 +168,6 @@
 */
 #define MARK_USED(index, order, area) \
 	change_bit((index) >> (1+(order)), (area)->map)
-#define CAN_DMA(x) (PageDMA(x))
-#define ADDRESS(x) (PAGE_OFFSET + ((x) << PAGE_SHIFT))
-#define RMQUEUE(order, maxorder, dma) \
-do { struct free_area_struct * area = free_area+order; \
-     unsigned long new_order = order; \
-	do { struct page *prev = memory_head(area), *ret = prev->next; \
-		while (memory_head(area) != ret) { \
-			if (new_order >= maxorder && ret->next == prev) \
-				break; \
-			if (!dma || CAN_DMA(ret)) { \
-				unsigned long map_nr = ret->map_nr; \
-				(prev->next = ret->next)->prev = prev; \
-				MARK_USED(map_nr, new_order, area); \
-				nr_free_pages -= 1 << order; \
-				EXPAND(ret, map_nr, order, new_order, area); \
-				spin_unlock_irqrestore(&page_alloc_lock, flags); \
-				return ADDRESS(map_nr); \
-			} \
-			prev = ret; \
-			ret = ret->next; \
-		} \
-		new_order++; area++; \
-	} while (new_order < NR_MEM_LISTS); \
-} while (0)
 
 #define EXPAND(map,index,low,high,area) \
 do { unsigned long size = 1 << high; \
@@ -255,18 +184,11 @@
 
 unsigned long __get_free_pages(int gfp_mask, unsigned long order)
 {
-	unsigned long flags, maxorder;
+	unsigned long flags, new_order, extra = 0;
+	struct free_area_struct *area;
 
 	if (order >= NR_MEM_LISTS)
-		goto nopage;
-
-	/*
-	 * "maxorder" is the highest order number that we're allowed
-	 * to empty in order to find a free page..
-	 */
-	maxorder = NR_MEM_LISTS-1;
-	if (gfp_mask & __GFP_HIGH)
-		maxorder = NR_MEM_LISTS;
+		return 0;
 
 	if (in_interrupt() && (gfp_mask & __GFP_WAIT)) {
 		static int count = 0;
@@ -277,18 +199,39 @@
 		}
 	}
 
-	for (;;) {
-		spin_lock_irqsave(&page_alloc_lock, flags);
-		RMQUEUE(order, maxorder, (gfp_mask & GFP_DMA));
-		spin_unlock_irqrestore(&page_alloc_lock, flags);
-		if (!(gfp_mask & __GFP_WAIT))
-			break;
-		if (!try_to_free_pages(gfp_mask, SWAP_CLUSTER_MAX))
-			break;
-		gfp_mask &= ~__GFP_WAIT;	/* go through this only once */
-		maxorder = NR_MEM_LISTS;	/* Allow anything this time */
+repeat:
+	if ((gfp_mask & __GFP_WAIT))
+		if (extra || (nr_free_pages < freepages.min && !(gfp_mask & __GFP_MED)))
+			while (nr_free_pages + atomic_read(&nr_async_pages) <
+			       freepages.low + extra)
+				try_to_free_page(gfp_mask);
+	new_order = order;
+	area = free_area + order;
+	spin_lock_irqsave(&page_alloc_lock, flags);
+	do {
+		struct page *prev = memory_head(area), *ret;
+
+		while (memory_head(area) != (ret = prev->next)) {
+			if (!(gfp_mask & GFP_DMA) || PageDMA(ret)) {
+				unsigned long map_nr = ret->map_nr;
+
+				(prev->next = ret->next)->prev = prev;
+				MARK_USED(map_nr, new_order, area);
+				nr_free_pages -= 1 << order;
+				EXPAND(ret, map_nr, order, new_order, area);
+				spin_unlock_irqrestore(&page_alloc_lock, flags);
+				return PAGE_OFFSET + (map_nr << PAGE_SHIFT);
+			}
+			prev = ret;
+		}
+		new_order++;
+		area++;
+	} while (new_order < NR_MEM_LISTS);
+	spin_unlock_irqrestore(&page_alloc_lock, flags);
+	if (gfp_mask & __GFP_WAIT) {
+		extra += SWAP_CLUSTER_MAX;
+		goto repeat;
 	}
-nopage:
 	return 0;
 }
@@ -315,9 +258,6 @@
 	}
 	spin_unlock_irqrestore(&page_alloc_lock, flags);
 	printk("= %lukB)\n", total);
-#ifdef SWAP_CACHE_INFO
-	show_swap_cache_info();
-#endif
 }
 
 #define LONG_ALIGN(x) (((x)+(sizeof(long))-1)&~((sizeof(long))-1))
@@ -340,14 +280,14 @@
 	 * that we don't waste too much memory on large systems.
 	 * This is totally arbitrary.
 	 */
-	i = (end_mem - PAGE_OFFSET) >> (PAGE_SHIFT+7);
+	i = (end_mem - PAGE_OFFSET) >> (PAGE_SHIFT + 7);
 	if (i < 48)
 		i = 48;
 	if (i > 256)
 		i = 256;
 	freepages.min = i;
 	freepages.low = i << 1;
-	freepages.high = freepages.low + i;
+	freepages.high = i << 2;
 	mem_map = (mem_map_t *) LONG_ALIGN(start_mem);
 	p = mem_map + MAP_NR(end_mem);
 	start_mem = LONG_ALIGN((unsigned long) p);
diff -urN --exclude-from=exclude linux-old/mm/slab.c linux/mm/slab.c
--- linux-old/mm/slab.c	Fri Jun 26 19:44:38 1998
+++ linux/mm/slab.c	Tue Jul 14 21:32:56 1998
@@ -308,12 +308,12 @@
 #define SLAB_MAX_GFP_ORDER	5	/* 32 pages */
 
 /* the 'preferred' minimum num of objs per slab - maybe less for large objs */
-#define SLAB_MIN_OBJS_PER_SLAB	4
+#define SLAB_MIN_OBJS_PER_SLAB	1
 
 /* If the num of objs per slab is <= SLAB_MIN_OBJS_PER_SLAB,
 * then the page order must be less than this before trying the next order.
 */
-#define SLAB_BREAK_GFP_ORDER	2
+#define SLAB_BREAK_GFP_ORDER	1
 
 /* Macros for storing/retrieving the cachep and or slab from the
 * global 'mem_map'. With off-slab bufctls, these are used to find the
diff -urN --exclude-from=exclude linux-old/mm/swap.c linux/mm/swap.c
--- linux-old/mm/swap.c	Fri Jun 26 19:44:38 1998
+++ linux/mm/swap.c	Tue Jul 14 21:32:56 1998
@@ -10,7 +10,6 @@
 *	linux/Documentation/sysctl/vm.txt.
 *	Started 18.12.91
 *	Swap aging added 23.2.95, Stephen Tweedie.
- *	Buffermem limits added 12.3.98, Rik van Riel.
 */
 
 #include <linux/mm.h>
@@ -36,8 +35,8 @@
 /*
 * We identify three levels of free memory. We never let free mem
 * fall below the freepages.min except for atomic allocations. We
- * start background swapping if we fall below freepages.high free
- * pages, and we begin intensive swapping below freepages.low.
+ * start background swapping if we fall below freepages.low free
+ * pages, and we begin intensive swapping below freepages.min.
 *
 * These values are there to keep GCC from complaining. Actual
 * initialization is done in mm/page_alloc.c or arch/sparc(64)/mm/init.c.
@@ -45,7 +44,7 @@
 freepages_t freepages = {
 	48,	/* freepages.min */
 	96,	/* freepages.low */
-	144	/* freepages.high */
+	192	/* freepages.high */
 };
 
 /* We track the number of pages currently being asynchronously swapped
@@ -65,21 +64,3 @@
 };
 
 swapstat_t swapstats = {0};
-
-buffer_mem_t buffer_mem = {
-	3,	/* minimum percent buffer */
-	10,	/* borrow percent buffer */
-	30	/* maximum percent buffer */
-};
-
-buffer_mem_t page_cache = {
-	10,	/* minimum percent page cache */
-	30,	/* borrow percent page cache */
-	75	/* maximum */
-};
-
-pager_daemon_t pager_daemon = {
-	512,	/* base number for calculating the number of tries */
-	SWAP_CLUSTER_MAX,	/* minimum number of tries */
-	SWAP_CLUSTER_MAX,	/* do swap I/O in clusters of this size */
-};
diff -urN --exclude-from=exclude linux-old/mm/swap_state.c linux/mm/swap_state.c
--- linux-old/mm/swap_state.c	Tue Mar 10 19:51:02 1998
+++ linux/mm/swap_state.c	Tue Jul 14 21:32:56 1998
@@ -24,14 +24,6 @@
 #include <asm/bitops.h>
 #include <asm/pgtable.h>
 
-#ifdef SWAP_CACHE_INFO
-unsigned long swap_cache_add_total = 0;
-unsigned long swap_cache_add_success = 0;
-unsigned long swap_cache_del_total = 0;
-unsigned long swap_cache_del_success = 0;
-unsigned long swap_cache_find_total = 0;
-unsigned long swap_cache_find_success = 0;
-
 /*
 * Keep a reserved false inode which we will use to mark pages in the
 * page cache are acting as swap cache instead of file cache.
@@ -43,21 +35,8 @@
 */
 struct inode swapper_inode;
-
-void show_swap_cache_info(void)
-{
-	printk("Swap cache: add %ld/%ld, delete %ld/%ld, find %ld/%ld\n",
-		swap_cache_add_total, swap_cache_add_success,
-		swap_cache_del_total, swap_cache_del_success,
-		swap_cache_find_total, swap_cache_find_success);
-}
-#endif
-
 int add_to_swap_cache(struct page *page, unsigned long entry)
 {
-#ifdef SWAP_CACHE_INFO
-	swap_cache_add_total++;
-#endif
 #ifdef DEBUG_SWAP
 	printk("DebugVM: add_to_swap_cache(%08lx count %d, entry %08lx)\n",
 		page_address(page), atomic_read(&page->count), entry);
@@ -78,9 +57,6 @@
 	page->offset = entry;
 	add_page_to_hash_queue(page, &swapper_inode, entry);
 	add_page_to_inode_queue(&swapper_inode, page);
-#ifdef SWAP_CACHE_INFO
-	swap_cache_add_success++;
-#endif
 	return 1;
 }
@@ -168,14 +144,9 @@
 
 long find_in_swap_cache(struct page *page)
 {
-#ifdef SWAP_CACHE_INFO
-	swap_cache_find_total++;
-#endif
 	if (PageSwapCache (page)) {
 		long entry = page->offset;
-#ifdef SWAP_CACHE_INFO
-		swap_cache_find_success++;
-#endif
+
 		remove_from_swap_cache (page);
 		return entry;
 	}
@@ -184,14 +155,8 @@
 
 int delete_from_swap_cache(struct page *page)
 {
-#ifdef SWAP_CACHE_INFO
-	swap_cache_del_total++;
-#endif
 	if (PageSwapCache (page)) {
 		long entry = page->offset;
-#ifdef SWAP_CACHE_INFO
-		swap_cache_del_success++;
-#endif
 #ifdef DEBUG_SWAP
 		printk("DebugVM: delete_from_swap_cache(%08lx count %d, "
 			"entry %08lx)\n",
@@ -297,4 +262,3 @@
 #endif
 	return new_page;
 }
-
diff -urN --exclude-from=exclude linux-old/mm/vmscan.c linux/mm/vmscan.c
--- linux-old/mm/vmscan.c	Fri Jun 26 19:44:38 1998
+++ linux/mm/vmscan.c	Tue Jul 14 21:32:56 1998
@@ -29,17 +29,6 @@
 #include <asm/pgtable.h>
 
 /*
- * When are we next due for a page scan?
- */
-static unsigned long next_swap_jiffies = 0;
-
-/*
- * How often do we do a pageout scan during normal conditions?
- * Default is four times a second.
- */
-int swapout_interval = HZ / 4;
-
-/*
 * The wait queue for waking up the pageout daemon:
 */
 static struct wait_queue * kswapd_wait = NULL;
@@ -444,61 +433,39 @@
 * to be. This works out OK, because we now do proper aging on page
 * contents.
 */
-static inline int do_try_to_free_page(int gfp_mask)
+void try_to_free_page(int gfp_mask)
 {
 	static int state = 0;
-	int i=6;
-	int stop;
+	int prio = 6;
+
+	lock_kernel();
 
 	/* Always trim SLAB caches when memory gets low. */
 	kmem_cache_reap(gfp_mask);
 
-	/* We try harder if we are waiting .. */
-	stop = 3;
-	if (gfp_mask & __GFP_WAIT)
-		stop = 0;
-
-	if (((buffermem >> PAGE_SHIFT) * 100 > buffer_mem.borrow_percent * num_physpages)
-	    || (page_cache_size * 100 > page_cache.borrow_percent * num_physpages))
-		state = 0;
-
-	switch (state) {
-		do {
+	for (prio = 6; prio >= 0; prio--) {
+		switch (state) {
 		case 0:
-			if (shrink_mmap(i, gfp_mask))
-				return 1;
+			if (shrink_mmap(prio, gfp_mask))
+				goto out;
 			state = 1;
 		case 1:
-			if ((gfp_mask & __GFP_IO) && shm_swap(i, gfp_mask))
-				return 1;
+			if ((gfp_mask & __GFP_IO) && shm_swap(prio, gfp_mask))
+				goto out;
 			state = 2;
 		case 2:
-			if (swap_out(i, gfp_mask))
-				return 1;
+			if (swap_out(prio, gfp_mask))
+				goto out;
 			state = 3;
 		case 3:
-			shrink_dcache_memory(i, gfp_mask);
+			shrink_dcache_memory(prio, gfp_mask);
 			state = 0;
-		i--;
-		} while ((i - stop) >= 0);
-	}
-	return 0;
-}
-
-/*
- * This is REALLY ugly.
- *
- * We need to make the locks finer granularity, but right
- * now we need this so that we can do page allocations
- * without holding the kernel lock etc.
- */
-int try_to_free_page(int gfp_mask)
-{
-	int retval;
-
-	lock_kernel();
-	retval = do_try_to_free_page(gfp_mask);
-	unlock_kernel();
-	return retval;
+		}
+	}
+ out:
+	unlock_kernel();
+	if (atomic_read(&nr_async_pages) >= SWAP_CLUSTER_MAX)
+		run_task_queue(&tq_disk);
 }
 
 /*
@@ -547,54 +514,16 @@
 	init_swap_timer();
 	add_wait_queue(&kswapd_wait, &wait);
-	while (1) {
-		int tries;
-		int tried = 0;
-
+	for (;;) {
 		current->state = TASK_INTERRUPTIBLE;
 		flush_signals(current);
-		run_task_queue(&tq_disk);
 		schedule();
 		swapstats.wakeups++;
 
-		/*
-		 * Do the background pageout: be
-		 * more aggressive if we're really
-		 * low on free memory.
-		 *
-		 * We try page_daemon.tries_base times, divided by
-		 * an 'urgency factor'. In practice this will mean
-		 * a value of pager_daemon.tries_base / 8 or 4 = 64
-		 * or 128 pages at a time.
-		 * This gives us 64 (or 128) * 4k * 4 (times/sec) =
-		 * 1 (or 2) MB/s swapping bandwidth in low-priority
-		 * background paging. This number rises to 8 MB/s
-		 * when the priority is highest (but then we'll be
-		 * woken up more often and the rate will be even
-		 * higher).
-		 */
-		tries = pager_daemon.tries_base >> free_memory_available(3);
-
-		while (tries--) {
-			int gfp_mask;
-
-			if (++tried > pager_daemon.tries_min && free_memory_available(0))
-				break;
-			gfp_mask = __GFP_IO;
-			try_to_free_page(gfp_mask);
-			/*
-			 * Syncing large chunks is faster than swapping
-			 * synchronously (less head movement). -- Rik.
-			 */
-			if (atomic_read(&nr_async_pages) >= pager_daemon.swap_cluster)
-				run_task_queue(&tq_disk);
-
-		}
-	}
-	/* As if we could ever get here - maybe we want to make this killable */
-	remove_wait_queue(&kswapd_wait, &wait);
-	unlock_kernel();
-	return 0;
+		while (nr_free_pages + atomic_read(&nr_async_pages) < freepages.high)
+			try_to_free_page(nr_free_pages < freepages.min ?
+					 (__GFP_IO | __GFP_WAIT) : __GFP_IO);
+	}
 }
 
 /*
@@ -602,38 +531,9 @@
 */
 void swap_tick(void)
 {
-	unsigned long now, want;
-	int want_wakeup = 0;
-
-	want = next_swap_jiffies;
-	now = jiffies;
-
-	/*
-	 * Examine the memory queues. Mark memory low
-	 * if there is nothing available in the three
-	 * highest queues.
-	 *
-	 * Schedule for wakeup if there isn't lots
-	 * of free memory.
-	 */
-	switch (free_memory_available(3)) {
-	case 0:
-		want = now;
-		/* Fall through */
-	case 1 ... 3:
-		want_wakeup = 1;
-	default:
-	}
-
-	if ((long) (now - want) >= 0) {
-		if (want_wakeup || (num_physpages * buffer_mem.max_percent) < (buffermem >> PAGE_SHIFT) * 100
-		    || (num_physpages * page_cache.max_percent < page_cache_size * 100)) {
-			/* Set the next wake-up time */
-			next_swap_jiffies = now + swapout_interval;
-			wake_up(&kswapd_wait);
-		}
-	}
-	timer_active |= (1<<SWAP_TIMER);
+	if (nr_free_pages < freepages.low)
+		wake_up(&kswapd_wait);
+	timer_active |= (1 << SWAP_TIMER);
 }
 
 /*
@@ -644,5 +544,5 @@
 {
 	timer_table[SWAP_TIMER].expires = 0;
 	timer_table[SWAP_TIMER].fn = swap_tick;
-	timer_active |= (1<<SWAP_TIMER);
+	timer_active |= (1 << SWAP_TIMER);
 }
--
Posted by Zlatko Calusic        E-mail: <Zlatko.Calusic@CARNet.hr>
---------------------------------------------------------------------
Unix _IS_ user friendly - it's just selective about who its friends are!
* Re: More info: 2.1.108 page cache performance on low memory
From: Stephen C. Tweedie @ 1998-07-14 17:30 UTC
To: Eric W. Biederman
Cc: Stephen C. Tweedie, linux-mm

Hi,

On 13 Jul 1998 13:08:56 -0500, ebiederm+eric@npwt.net (Eric
W. Biederman) said:

> 1) We have a minimum size for the buffer cache as a percentage of
>    physical pages.  Setting the minimum to 0% may help.
...
> Personally I think it is broken to set the limits of cache sizes
> (buffer & page) to anything besides max=100%, min=0% by default.

Yep; I disabled those limits for the benchmarks I announced.
Disabling the ageing but keeping the limits in place still resulted in
a performance loss.

> 2) If we play with an LRU list, it may be most practical to use the
>    page->next and page->prev fields for the list, and for
>    truncate_inode_pages && invalidate_inode_pages

Yikes --- for large files the proposal that we

>    do something like:
>	for (i = 0; i < inode->i_size; i += PAGE_SIZE) {
>		page = find_in_page_cache(inode, i);
>		if (page)
>			/* remove it */ ;
>	}

will be disastrous.  No, I think we still need the per-inode page
lists.  When we eventually get an fsync() which works through the page
cache, this will become even more important.

--Stephen
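To make the cost difference concrete: with the per-inode list kept, a
truncate-style teardown only touches pages that are actually in the
cache.  A minimal sketch, using the list/hash naming visible in the
patch earlier in this thread (the remove_* helpers are assumed
counterparts of add_page_to_inode_queue()/add_page_to_hash_queue(), so
treat this as an illustration rather than the real 2.1 code):

	/* O(pages actually cached), independent of inode->i_size;
	 * the offset-probing loop above is O(i_size / PAGE_SIZE)
	 * even when almost nothing is cached. */
	void truncate_all_inode_pages(struct inode *inode)
	{
		struct page *page;

		while ((page = inode->i_pages) != NULL) {
			remove_page_from_inode_queue(page); /* off i_pages  */
			remove_page_from_hash_queue(page);  /* off the hash */
			page->inode = NULL;
			__free_page(page);                  /* drop cache ref */
		}
	}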
* Re: More info: 2.1.108 page cache performance on low memory
From: Eric W. Biederman @ 1998-07-18 1:10 UTC
To: Stephen C. Tweedie
Cc: Eric W. Biederman, linux-mm

>>>>> "ST" == Stephen C Tweedie <sct@redhat.com> writes:

ST> Yikes --- for large files the proposal that we
[...]
ST> will be disastrous.  No, I think we still need the per-inode page
ST> lists.  When we eventually get an fsync() which works through the
ST> page cache, this will become even more important.

Duh.  Ext2 only does this with the block cache on a real truncate;
when an inode is closed it doesn't need to do that.  Sorry, I thought
I had a precedent for that algorithm.

O.k., scratch that idea.

So I guess an LRU list for pages will require that we increase the
size of struct page.  I guess it makes sense if we can ultimately:
a) use it for every page on the system, a la the swap cache;
b) remove the buffer cache, which should provide the necessary
   expansion room, so we won't ultimately use more space;
c) use it for an LRU of dirty pages;
d) avoid fragmenting memory with slabs...

I hate considering expanding struct page after all of the work that
has gone into shrinking it lately....

And for writes it looks like I'll need a write time too, for best
performance.  I've written the code, I just haven't tested it yet.

Zlatko, could I talk you into setting the defines in mm.h so that
shmfs will use those, and reporting whether bonnie improves?

Eric

p.s. Everyone please excuse any slow replies; I'm in the middle of
moving and I can't read my mail too often.
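A sketch of the two struct page extensions Eric describes; the field
names here are illustrative only, not taken from any posted patch:

	/* include/linux/mm.h, hypothetically extended */
	typedef struct page {
		/* ... existing 2.1 fields: next, prev, inode, offset,
		 *     next_hash, count, flags, age, map_nr, buffers ... */
		struct page *lru_next, *lru_prev; /* global LRU, replaces age */
		unsigned long writetime;          /* when a dirty page is due */
	} mem_map_t;

Three extra words per physical page is roughly 12 bytes per 4K page on
x86, i.e. about 0.3% of memory, which is why growing struct page again
is so unattractive.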
* Re: More info: 2.1.108 page cache performance on low memory
From: Zlatko Calusic @ 1998-07-18 13:28 UTC
To: Eric W. Biederman
Cc: Stephen C. Tweedie, linux-mm

ebiederm+eric@npwt.net (Eric W. Biederman) writes:

[...]
> And for writes it looks like I'll need a write time too, for best
> performance.  I've written the code, I just haven't tested it yet.
>
> Zlatko, could I talk you into setting the defines in mm.h so that
> shmfs will use those, and reporting whether bonnie improves?

When it comes to benchmarking, I'm always prepared. :)  It's just that
I didn't completely understand what you are trying to do, but if you
have a prepared patch, I'll gladly test it.

BTW, looking at 2.1.109, I'm very pleased with the changes made in the
mm/ directory.  Finally, free_memory_available is simple, readable and
efficient. ;)

Next week I will test some ideas which could possibly improve things
WITH page aging.

I must admit, after all the criticism I have heaped on page aging,
that I believe it's the right way to go, but it should be done
properly.  Performance should be better, not worse.

Regards,
--
Posted by Zlatko Calusic        E-mail: <Zlatko.Calusic@CARNet.hr>
---------------------------------------------------------------------
Any sufficiently advanced bug is indistinguishable from a feature.
* Re: More info: 2.1.108 page cache performance on low memory
From: Eric W. Biederman @ 1998-07-18 16:40 UTC
To: Zlatko.Calusic
Cc: Stephen C. Tweedie, linux-mm

>>>>> "ZC" == Zlatko Calusic <Zlatko.Calusic@CARNet.hr> writes:

Let me just step back a second so I can be clear:

A) The idea proposed by Stephen was that perhaps we could use Least
Recently Used lists instead of page aging.  It's effectively the same
thing, but shrink_mmap can find the old pages much, much faster by
simply following a linked list.

B) This idea intrigues me because I have about the same problem in
handling generic dirty pages.  In cloning bdflush for the page cache I
discovered two fields I would need to add to struct page to do an
exact cloning job: a page write time, and LRU list pointers for dirty
pages.  I went ahead and implemented them, but also implemented an
alternative, which is the default.

So I'm terribly interested in any discussion of LRU lists.  As soon as
I get the time I'll even implement the more general case.  Mostly I
just need to get my computer moved to where I am, so I can code when I
have free time. :)

What I have now is controlled by the defines I added to
include/linux/mm.h with my shmfs patches:

#undef USE_PG_FLUSHTIME   (This tells sync_old_pages when to stop)
#undef USE_PG_DIRTY_LIST  (Define this for a first pass at an LRU list
                           for dirty pages)

If nothing else it's worth trying, to see if it improves my write
times, which fall way behind the read times on Zlatko's benchmark. :(

If I can talk Zlatko or someone into looking at these, it would be
nice.  I really need to get my own copy of bonnie and a few other
benchmarks...

ZC> Next week I will test some ideas which could possibly improve
ZC> things WITH page aging.

ZC> I must admit, after all the criticism I have heaped on page aging,
ZC> that I believe it's the right way to go, but it should be done
ZC> properly.  Performance should be better, not worse.

Agreed.  We should look very carefully, though, to see if any aging
solution increases fragmentation.  According to Stephen the current
one does, and this may be a natural result of aging and not just of a
single implementation. :(

Eric
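A minimal sketch of idea (A), assuming a global doubly linked page
LRU; lru_tail(), lru_make_young() and lru_remove() are hypothetical
helpers, and the removal steps mirror what shrink_mmap() already does
in 2.1:

	/* Reclaim from the cold end of the LRU instead of scanning
	 * mem_map and decrementing page->age. */
	static int shrink_mmap_lru(int count, int gfp_mask)
	{
		struct page *page;

		while (count-- > 0 && (page = lru_tail()) != NULL) {
			if (atomic_read(&page->count) != 1 ||
			    test_and_clear_bit(PG_referenced, &page->flags)) {
				lru_make_young(page); /* busy or recently used */
				continue;
			}
			lru_remove(page);
			if (PageSwapCache(page)) {
				delete_from_swap_cache(page);
				return 1;
			}
			remove_page_from_hash_queue(page);
			remove_page_from_inode_queue(page);
			__free_page(page);
			return 1;
		}
		return 0;
	}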
* Re: More info: 2.1.108 page cache performance on low memory
From: Zlatko Calusic @ 1998-07-20 9:15 UTC
To: Eric W. Biederman
Cc: Stephen C. Tweedie, linux-mm

ebiederm+eric@npwt.net (Eric W. Biederman) writes:

> A) The idea proposed by Stephen was that perhaps we could use Least
> Recently Used lists instead of page aging.  It's effectively the same
> thing, but shrink_mmap can find the old pages much, much faster by
> simply following a linked list.

Well, it looks like a good idea.

> B) This idea intrigues me because I have about the same problem in
> handling generic dirty pages.  In cloning bdflush for the page cache
> I discovered two fields I would need to add to struct page to do an
> exact cloning job: a page write time, and LRU list pointers for dirty
> pages.  I went ahead and implemented them, but also implemented an
> alternative, which is the default.

I don't know how much impact adding a few fields to struct page has on
performance.  Why don't you just add those two fields, so we can see
what happens?

I don't know if it's easy, but we should probably get rid of the
buffer cache completely at some point in time.  It's hard to balance
things between two caches, not to mention the other memory objects in
the kernel.  If the page cache is ever to replace the buffer cache, it
will definitely need some parts of the already established mechanisms
and data types that the buffer cache has now.

On the other side, I must admit that I didn't see any more
fragmentation with page aging.  It's just that memory gets used in
weird ways when it's on, and there's lots of unneeded swapping.

Then again, I have made some changes that make my system very stable
wrt memory fragmentation:

#define SLAB_MIN_OBJS_PER_SLAB	1
#define SLAB_BREAK_GFP_ORDER	1

in mm/slab.c.  I discussed this privately with the slab maintainer,
Mark Hemment, who pointed out that with this setting the slab is
probably not as efficient as it could be.  Also, slack is bigger,
obviously.  I didn't completely understand all the reasons why this
could be slower, and I must admit that I can't see any bad impact on
performance.  I did really lots of benchmarking.  5.5MB/sec through
two 100Mbps NICs, via the router and straight to a cheap IDE disk on a
low-end Pentium, is not what you'd call bad performance. :)

But the system is much more stable, and it is now very *very* hard to
get that annoying "Couldn't get a free page..." message, whereas
before (with the default setup) it was as easy as clicking a button in
Netscape.  I even have some custom scripts that make lots of FTP
connections to fast sites, as that was proven to block my system quite
easily before.

> So I'm terribly interested in any discussion of LRU lists.  As soon
> as I get the time I'll even implement the more general case.  Mostly
> I just need to get my computer moved to where I am, so I can code
> when I have free time. :)

I hope you have found a nice place to live, so that you can get happy
and make loads of great code. :)

> What I have now is controlled by the defines I added to
> include/linux/mm.h with my shmfs patches:
>
> #undef USE_PG_FLUSHTIME   (This tells sync_old_pages when to stop)
> #undef USE_PG_DIRTY_LIST  (Define this for a first pass at an LRU
>                            list for dirty pages)
>
> If nothing else it's worth trying, to see if it improves my write
> times, which fall way behind the read times on Zlatko's benchmark. :(

As I already said, it will be my pleasure to test things and give my
comments.  I have spent lots of time tweaking here and there,
measuring not only performance but stability, too.  Half a year ago my
system was really unstable, thanks to memory fragmentation.  I would
occasionally be logged in via XDM and have to kill the whole session
(Ctrl-Alt-BS), because everything would stall after the initial
"Couldn't get a free page...".  Then I got annoyed with that and tried
to find a solution, or at least a workaround... :)

> If I can talk Zlatko or someone into looking at these, it would be
> nice.  I really need to get my own copy of bonnie and a few other
> benchmarks...

I'll send you a copy of the bonnie source in another private mail.

> Agreed.  We should look very carefully, though, to see if any aging
> solution increases fragmentation.  According to Stephen the current
> one does, and this may be a natural result of aging and not just of
> a single implementation. :(

Speaking of low memory machines, I think that inode memory is a much
bigger problem there.  I had the opportunity to test the 2.1.x series
on a 5MB 386DX40, and the system runs nowhere near perfectly. :(

Regards,
--
Posted by Zlatko Calusic        E-mail: <Zlatko.Calusic@CARNet.hr>
---------------------------------------------------------------------
"640K ought to be enough for anybody."  Bill Gates '81
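For readers without mm/slab.c at hand: the two defines bound how many
pages (as a power-of-two "order") a slab may span while it holds few
objects.  A simplified rendering of the sizing logic in
kmem_cache_create(), not the verbatim code:

	/* Grow the slab while too few objects fit, but never past
	 * SLAB_BREAK_GFP_ORDER.  With both defines set to 1, almost
	 * every cache ends up with order-0 or order-1 (one- or
	 * two-page) slabs, which are far easier to allocate on a
	 * fragmented machine than the order-2+ slabs the defaults
	 * allow. */
	static unsigned long slab_gfporder(size_t objsize)
	{
		unsigned long order = 0;

		while ((PAGE_SIZE << order) / objsize <= SLAB_MIN_OBJS_PER_SLAB &&
		       order < SLAB_BREAK_GFP_ORDER)
			order++;
		return order;
	}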
* Re: More info: 2.1.108 page cache performance on low memory
  1998-07-20  9:15 ` Zlatko Calusic
@ 1998-07-22 10:40   ` Stephen C. Tweedie
  1998-07-23 10:06     ` Zlatko Calusic
  0 siblings, 1 reply; 46+ messages in thread
From: Stephen C. Tweedie @ 1998-07-22 10:40 UTC (permalink / raw)
  To: Zlatko.Calusic; +Cc: Eric W. Biederman, Stephen C. Tweedie, linux-mm

Hi,

On 20 Jul 1998 11:15:12 +0200, Zlatko Calusic
<Zlatko.Calusic@CARNet.hr> said:

> I don't know if it's easy, but we probably should get rid of the
> buffer cache completely at some point. It's hard to balance things
> between two caches, not to mention other memory objects in the kernel.

No, we need the buffer cache for all sorts of things.  You'd have to
reinvent it if you got rid of it, since it is the main mechanism by
which we can reliably label IO for the block device driver layer, and
we also cache non-page-aligned filesystem metadata there.

> Then again, I have made some changes that make my system very stable
> wrt memory fragmentation:

> #define SLAB_MIN_OBJS_PER_SLAB 1
> #define SLAB_BREAK_GFP_ORDER 1

The SLAB_BREAK_GFP_ORDER one is the important one on low memory
configurations.  I need to use this setting to get 2.1.110 to work at
all with NFS in low memory.

> I discussed this privately with slab maintainer Mark Hemment, who
> pointed out that with this setting slab is probably not as efficient
> as it could be. Also, slack is bigger, obviously.

Correct, but then the main user of these larger packets is networking,
where the memory is typically short-lived anyway.

> But the system is much more stable, and it is now very *very* hard to
> get that annoying "Couldn't get a free page..." message, whereas
> before (with the default setup) it was as easy as clicking a button
> in Netscape.

I can still reproduce it if I let the inode cache grow too large: it
behaves really badly and seems to lock up rather a lot of memory.
Still chasing this one; it's a killer right now.

--Stephen
--
This is a majordomo managed list.  To unsubscribe, send a message with
the body 'unsubscribe linux-mm me@address' to: majordomo@kvack.org

^ permalink raw reply	[flat|nested] 46+ messages in thread
* Re: More info: 2.1.108 page cache performance on low memory
  1998-07-22 10:40 ` Stephen C. Tweedie
@ 1998-07-23 10:06   ` Zlatko Calusic
  1998-07-23 12:22     ` Stephen C. Tweedie
  1998-07-26 14:49     ` Eric W Biederman
  0 siblings, 2 replies; 46+ messages in thread
From: Zlatko Calusic @ 1998-07-23 10:06 UTC (permalink / raw)
  To: Stephen C. Tweedie; +Cc: Eric W. Biederman, linux-mm

"Stephen C. Tweedie" <sct@redhat.com> writes:

> Hi,
>
> On 20 Jul 1998 11:15:12 +0200, Zlatko Calusic <Zlatko.Calusic@CARNet.hr>
> said:
>
> > I don't know if it's easy, but we probably should get rid of the
> > buffer cache completely at some point. It's hard to balance things
> > between two caches, not to mention other memory objects in the kernel.
>
> No, we need the buffer cache for all sorts of things.  You'd have to
> reinvent it if you got rid of it, since it is the main mechanism by
> which we can reliably label IO for the block device driver layer, and
> we also cache non-page-aligned filesystem metadata there.

Yes, I'm aware of lots of problems that would need to be resolved in
order to get rid of the buffer cache (probably just to reinvent it, as
you said :)). But, then again, if I understand you completely, we will
always have the buffer cache as it is implemented now?!

Non-page-aligned filesystem metadata really looks like a hard problem
to solve without the buffer cache mechanism, that's out of the
question, but is there any possibility that we will introduce some
logic to use a somewhat improved page cache with buffer head
functionality (or similar) that will allow us to use the page cache in
a similar way to how we use the buffer cache now?

Even though I haven't investigated it much, I still see Eric's work on
adding dirty page functionality as a step toward this.

Disclaimer: I really don't see myself as any kind of expert in this
area. But that's one more motivation for me to try to understand
things that I don't have a grip on presently. :) I've been browsing
the Linux source actively for the last 12 months, as time permitted.
The MM area is by far the most interesting to me. But I'm still
learning.

> > Then again, I have made some changes that make my system very stable
> > wrt memory fragmentation:
>
> > #define SLAB_MIN_OBJS_PER_SLAB 1
> > #define SLAB_BREAK_GFP_ORDER 1
>
> The SLAB_BREAK_GFP_ORDER one is the important one on low memory
> configurations.  I need to use this setting to get 2.1.110 to work at
> all with NFS in low memory.
>
> > I discussed this privately with slab maintainer Mark Hemment, who
> > pointed out that with this setting slab is probably not as efficient
> > as it could be. Also, slack is bigger, obviously.
>
> Correct, but then the main user of these larger packets is networking,
> where the memory is typically short-lived anyway.

Two days ago, I rebooted unpatched 2.1.110 with mem=32m, just to find
it dead today:

I left at cca 22:00h on Jul 21.

Jul 21 22:16:43 atlas kernel: eth0: media is 100Mb/s full duplex.
Jul 21 22:34:31 atlas kernel: eth0: Insufficient memory; nuking packet.
Jul 21 22:34:44 atlas last message repeated 174 times
Jul 22 16:03:40 atlas kernel: eth0: media is TP full duplex.
Jul 22 16:03:43 atlas kernel: eth0: media is unconnected, link down or incompatible connection.
...

Being used to patching every kernel I download, I had forgotten how
unstable official kernels are. And that's not good. :( The machine's
only task, when I'm not logged in, is to transfer mail (fetchmail +
sendmail).
> > But the system is much more stable, and it is now very *very* hard
> > to get that annoying "Couldn't get a free page..." message, whereas
> > before (with the default setup) it was as easy as clicking a button
> > in Netscape.
>
> I can still reproduce it if I let the inode cache grow too large: it
> behaves really badly and seems to lock up rather a lot of memory.
> Still chasing this one; it's a killer right now.

My observations with low memory machines led me to the conclusion that
inode memory grows monotonically until it takes cca 1.5MB of
unswappable memory. That is around half of the usable memory on a 5MB
machine. You seconded that in a private mail you sent me in January.

Is there any possibility that we could use the slab allocator for
inode allocation/deallocation?

Regards,
--
Posted by Zlatko Calusic          E-mail: <Zlatko.Calusic@CARNet.hr>
---------------------------------------------------------------------
             So much time, and so little to do.

--
This is a majordomo managed list.  To unsubscribe, send a message with
the body 'unsubscribe linux-mm me@address' to: majordomo@kvack.org

^ permalink raw reply	[flat|nested] 46+ messages in thread
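What allocating inodes from a slab-style cache would buy is that every
inode comes out of pages carved into equal-sized slots, so a freed
slot can only ever be reused by another inode instead of fragmenting
the general allocation pools. A user-space model of that idea (object
size and names are illustrative; the kernel's slab in mm/slab.c is far
more elaborate):

#include <stdio.h>
#include <stdlib.h>

#define PAGE_SIZE 4096
#define OBJ_SIZE  512		/* pretend struct inode is 512 bytes */
#define PER_PAGE  (PAGE_SIZE / OBJ_SIZE)

struct free_obj { struct free_obj *next; };

static struct free_obj *free_list;

/* Grab one whole page and carve it into OBJ_SIZE slots. */
static int cache_grow(void)
{
	char *page = malloc(PAGE_SIZE);	/* stands in for __get_free_page() */
	int i;

	if (!page)
		return -1;
	for (i = 0; i < PER_PAGE; i++) {
		struct free_obj *o = (struct free_obj *)(page + i * OBJ_SIZE);
		o->next = free_list;
		free_list = o;
	}
	return 0;
}

static void *cache_alloc(void)
{
	struct free_obj *o;

	if (!free_list && cache_grow() < 0)
		return NULL;
	o = free_list;
	free_list = o->next;
	return o;
}

static void cache_free(void *obj)
{
	struct free_obj *o = obj;

	o->next = free_list;	/* the slot can only hold another inode */
	free_list = o;
}

int main(void)
{
	void *a = cache_alloc(), *b = cache_alloc();

	cache_free(a);
	printf("freed slot recycled: %s\n", cache_alloc() == a ? "yes" : "no");
	cache_free(b);
	return 0;
}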
* Re: More info: 2.1.108 page cache performance on low memory
  1998-07-23 10:06 ` Zlatko Calusic
@ 1998-07-23 12:22   ` Stephen C. Tweedie
  1998-07-23 14:07     ` Zlatko Calusic
  1998-07-26 14:49   ` Eric W Biederman
  1 sibling, 1 reply; 46+ messages in thread
From: Stephen C. Tweedie @ 1998-07-23 12:22 UTC (permalink / raw)
  To: Zlatko.Calusic; +Cc: Stephen C. Tweedie, Eric W. Biederman, linux-mm

Hi,

On 23 Jul 1998 12:06:05 +0200, Zlatko Calusic
<Zlatko.Calusic@CARNet.hr> said:

> Yes, I'm aware of lots of problems that would need to be resolved in
> order to get rid of the buffer cache (probably just to reinvent it, as
> you said :)). But, then again, if I understand you completely, we will
> always have the buffer cache as it is implemented now?!

I don't see any pressing need to replace it.  Changing the
_management_ of the buffer cache, and doing things like modifying the
file write paths, are different issues which we probably should do.

Ultimately we need synchronised access to individual blocks of a block
device.  We need something which can talk directly to the block device
drivers.  Once you have that in place, with a suitable form of
buffering added, you have something that necessarily looks
sufficiently like the buffer cache that I can't see a need to get rid
of the current one.  That doesn't mean we can't improve the current
system, but improving and replacing are two very different things.

> Non-page-aligned filesystem metadata really looks like a hard problem
> to solve without the buffer cache mechanism, that's out of the
> question, but is there any possibility that we will introduce some
> logic to use a somewhat improved page cache with buffer head
> functionality (or similar) that will allow us to use the page cache
> in a similar way to how we use the buffer cache now?

We still need a way to go to the block device drivers.  As you say, we
still need the buffer_head.  We _already_ have a way of using
buffer_heads without full buffers allocated in the cache (the swapper
uses such temporary buffer_heads, for example).  We also need
mechanisms for things like loop devices and RAID.  There's a lot going
on in the buffer cache!

> Two days ago, I rebooted unpatched 2.1.110 with mem=32m, just to find
> it dead today:

> I left at cca 22:00h on Jul 21.

> Jul 21 22:16:43 atlas kernel: eth0: media is 100Mb/s full duplex.
> Jul 21 22:34:31 atlas kernel: eth0: Insufficient memory; nuking
> packet.

I've got a fix for some of the (serious) fragmentation problems in
110.  111-pre1 with the fixes is looking really, really good.  Post
with patch to follow.

> My observations with low memory machines led me to the conclusion
> that inode memory grows monotonically until it takes cca 1.5MB of
> unswappable memory. That is around half of the usable memory on a
> 5MB machine. You seconded that in a private mail you sent me in
> January.

Does this still happen?  My own tests show 110 behaving very much
better in this respect.

> Is there any possibility that we could use the slab allocator for
> inode allocation/deallocation?

Yes.  I'll have to benchmark to see how much better it gets, but (a)
110 seems to need it less anyway, and (b) it opens up a whole new pile
of synchronisation problems in fs/inode.c, which can currently make
the assumption that an inode structure can move between lists but can
never actually die if the inode spinlock is dropped.

--Stephen
--
This is a majordomo managed list.  To unsubscribe, send a message with
the body 'unsubscribe linux-mm me@address' to: majordomo@kvack.org

^ permalink raw reply	[flat|nested] 46+ messages in thread
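The role Stephen describes, a buffer_head as the unit of I/O the block
device layer understands, can be sketched in a few lines. This is a
user-space model only: the field names echo the kernel's struct
buffer_head, but submit_block() merely stands in for ll_rw_block(),
and the "temporary buffer_heads" trick is simplified to a stack
variable reused per block:

#include <stdio.h>

#define PAGE_SIZE  4096
#define BLOCK_SIZE 1024

/* Just enough state to label one block of I/O for a driver. */
struct buffer_head {
	int           b_dev;		/* device the block lives on */
	unsigned long b_blocknr;	/* block number on that device */
	unsigned long b_size;		/* block size in bytes */
	char         *b_data;		/* where the data sits in memory */
};

/* Stand-in for ll_rw_block(): a real driver would queue the request. */
static void submit_block(struct buffer_head *bh)
{
	printf("dev %d: block %lu (%lu bytes) at %p\n",
	       bh->b_dev, bh->b_blocknr, bh->b_size, (void *)bh->b_data);
}

/* Like the swapper's temporary buffer_heads: describe one page-cache
 * page as several block-sized I/Os without allocating cache buffers. */
static void rw_page(int dev, unsigned long first_block, char *page)
{
	struct buffer_head bh;
	unsigned long i;

	for (i = 0; i < PAGE_SIZE / BLOCK_SIZE; i++) {
		bh.b_dev = dev;
		bh.b_blocknr = first_block + i;
		bh.b_size = BLOCK_SIZE;
		bh.b_data = page + i * BLOCK_SIZE;
		submit_block(&bh);
	}
}

int main(void)
{
	static char page[PAGE_SIZE];

	rw_page(3, 1000, page);	/* one 4k page = four 1k blocks */
	return 0;
}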
* Re: More info: 2.1.108 page cache performance on low memory
  1998-07-23 12:22 ` Stephen C. Tweedie
@ 1998-07-23 14:07   ` Zlatko Calusic
  1998-07-23 17:18     ` Stephen C. Tweedie
  0 siblings, 1 reply; 46+ messages in thread
From: Zlatko Calusic @ 1998-07-23 14:07 UTC (permalink / raw)
  To: Stephen C. Tweedie; +Cc: Eric W. Biederman, linux-mm

"Stephen C. Tweedie" <sct@redhat.com> writes:

> Hi,
>
> On 23 Jul 1998 12:06:05 +0200, Zlatko Calusic <Zlatko.Calusic@CARNet.hr>
> said:
>
> > Yes, I'm aware of lots of problems that would need to be resolved in
> > order to get rid of the buffer cache (probably just to reinvent it,
> > as you said :)). But, then again, if I understand you completely, we
> > will always have the buffer cache as it is implemented now?!
>
> I don't see any pressing need to replace it.  Changing the
> _management_ of the buffer cache, and doing things like modifying the
> file write paths, are different issues which we probably should do.
>
> Ultimately we need synchronised access to individual blocks of a block
> device.  We need something which can talk directly to the block device
> drivers.  Once you have that in place, with a suitable form of
> buffering added, you have something that necessarily looks sufficiently
> like the buffer cache that I can't see a need to get rid of the current
> one.  That doesn't mean we can't improve the current system, but
> improving and replacing are two very different things.

OK, I understand. I needed to hear the opinion of someone who *really*
knows how things work. One of the things that influenced me is the
text at:

http://www.caip.rutgers.edu/~davem/vfsmm.html

but I can't (and won't) pretend that I understand everything mentioned
there. :)

Strangely enough, I think I never explained why *I* think integrating
buffer cache functionality into the page cache would be a good thing.
Since the two caches are very different, I'm not sure memory
management can be fair enough in some cases.

Take a simple example: two I/O-bound applications, where one is
accessing a raw partition (e.g. fsck) and the other uses the
filesystem (web, ftp...). The question is, how do I know that the MM
is fair? Maybe the page cache grows too large at the expense of the
buffer cache, so fsck runs much slower than it could. Or if the buffer
cache grows faster (which is not the case, IMO) then the web server
would be fast, but fsck (or some database accessing a raw partition)
could take a penalty.

Integrating both caches could help in these cases, which are not
uncommon (isn't Linux a beautiful multitasker? :)). All this is a
consequence of the buffer cache buffering raw blocks (including FS
metadata), and the page cache buffering FS data.

BUT! If you say the buffer cache won't go, then I believe you, just to
make that clear. :) And thanks for the explanation. I hope my bad
English doesn't give you too much trouble understanding.

> > Non-page-aligned filesystem metadata really looks like a hard
> > problem to solve without the buffer cache mechanism, that's out of
> > the question, but is there any possibility that we will introduce
> > some logic to use a somewhat improved page cache with buffer head
> > functionality (or similar) that will allow us to use the page cache
> > in a similar way to how we use the buffer cache now?
>
> We still need a way to go to the block device drivers.  As you say, we
> still need the buffer_head.  We _already_ have a way of using
> buffer_heads without full buffers allocated in the cache (the swapper
> uses such temporary buffer_heads, for example).
> We also need mechanisms for things like loop devices and RAID.
> There's a lot going on in the buffer cache!

No doubt! I never tried to underestimate the buffer cache's complexity
and functionality. :)

> > Two days ago, I rebooted unpatched 2.1.110 with mem=32m, just to
> > find it dead today:
>
> > I left at cca 22:00h on Jul 21.
>
> > Jul 21 22:16:43 atlas kernel: eth0: media is 100Mb/s full duplex.
> > Jul 21 22:34:31 atlas kernel: eth0: Insufficient memory; nuking
> > packet.
>
> I've got a fix for some of the (serious) fragmentation problems in
> 110.  111-pre1 with the fixes is looking really, really good.  Post
> with patch to follow.

Nice, I will test it right away.

> > My observations with low memory machines led me to the conclusion
> > that inode memory grows monotonically until it takes cca 1.5MB of
> > unswappable memory. That is around half of the usable memory on a
> > 5MB machine. You seconded that in a private mail you sent me in
> > January.
>
> Does this still happen?  My own tests show 110 behaving very much
> better in this respect.

Huh, I owe an apology here. My tests on the lowmem machine took place
around New Year. I have that 386DX/40 with 5MB at home, but I'm rarely
home. :) So everything I said reflects the situation from 7 months
ago. I haven't done any tests in the meantime. Here at work, Linux is
installed on more appropriate hardware. :)

> > Is there any possibility that we could use the slab allocator for
> > inode allocation/deallocation?
>
> Yes.  I'll have to benchmark to see how much better it gets, but (a)
> 110 seems to need it less anyway, and (b) it opens up a whole new pile
> of synchronisation problems in fs/inode.c, which can currently make
> the assumption that an inode structure can move between lists but can
> never actually die if the inode spinlock is dropped.

Wish you luck!

Regards,
--
Posted by Zlatko Calusic          E-mail: <Zlatko.Calusic@CARNet.hr>
---------------------------------------------------------------------
                    Don't mess with Murphy.

--
This is a majordomo managed list.  To unsubscribe, send a message with
the body 'unsubscribe linux-mm me@address' to: majordomo@kvack.org

^ permalink raw reply	[flat|nested] 46+ messages in thread
* Re: More info: 2.1.108 page cache performance on low memory
  1998-07-23 14:07 ` Zlatko Calusic
@ 1998-07-23 17:18   ` Stephen C. Tweedie
  1998-07-23 19:33     ` Zlatko Calusic
  0 siblings, 1 reply; 46+ messages in thread
From: Stephen C. Tweedie @ 1998-07-23 17:18 UTC (permalink / raw)
  To: Zlatko.Calusic; +Cc: Stephen C. Tweedie, Eric W. Biederman, linux-mm

Hi,

On 23 Jul 1998 16:07:23 +0200, Zlatko Calusic
<Zlatko.Calusic@CARNet.hr> said:

> Strangely enough, I think I never explained why *I* think integrating
> buffer cache functionality into the page cache would be a good thing.
> Since the two caches are very different, I'm not sure memory
> management can be fair enough in some cases.

> Take a simple example: two I/O-bound applications, where one is
> accessing a raw partition (e.g. fsck) and the other uses the
> filesystem (web, ftp...). The question is, how do I know that the MM
> is fair? Maybe the page cache grows too large at the expense of the
> buffer cache, so fsck runs much slower than it could. Or if the
> buffer cache grows faster (which is not the case, IMO) then the web
> server would be fast, but fsck (or some database accessing a raw
> partition) could take a penalty.

There's a single loop in shrink_mmap() which treats both buffer-cache
pages and page-cache pages identically.  It just propagates the buffer
referenced bits into the page's PG_referenced bit before doing any
ageing on the page.  It should be fair enough.  There are other issues
concerning things like locked and dirty buffers which complicate the
issue, but they are not sufficient reason to throw away the buffer
cache!

--Stephen
--
This is a majordomo managed list.  To unsubscribe, send a message with
the body 'unsubscribe linux-mm me@address' to: majordomo@kvack.org

^ permalink raw reply	[flat|nested] 46+ messages in thread
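A toy version of the loop Stephen describes, assuming a page becomes
reclaimable once its age reaches zero (the real shrink_mmap in
mm/filemap.c handles locking, mapping counts and much more; the
constants here are made up):

#include <stdio.h>

#define PAGES        8
#define PG_AGE_START 3

struct page {
	int age;		/* decays on each shrink_mmap pass */
	int pg_referenced;	/* PG_referenced bit */
	int buf_referenced;	/* referenced bit of an attached buffer */
	int has_buffers;	/* buffer-cache page? */
};

/* One pass: fold buffer referenced bits into PG_referenced, then age.
 * Referenced pages get their age topped up; untouched pages decay.
 * Buffer-cache and page-cache pages go through the very same test. */
static void shrink_mmap_pass(struct page *pages, int n)
{
	int i;

	for (i = 0; i < n; i++) {
		struct page *p = &pages[i];

		if (p->has_buffers && p->buf_referenced) {
			p->pg_referenced = 1;
			p->buf_referenced = 0;
		}
		if (p->pg_referenced) {
			p->age = PG_AGE_START;
			p->pg_referenced = 0;
		} else if (p->age > 0) {
			p->age--;
			if (p->age == 0)
				printf("page %d now reclaimable\n", i);
		}
	}
}

int main(void)
{
	struct page pages[PAGES] = { { 0 } };
	int i, pass;

	for (i = 0; i < PAGES; i++)
		pages[i] = (struct page){ .age = PG_AGE_START,
					  .has_buffers = (i % 2) };
	pages[1].buf_referenced = 1;	/* touched via the buffer cache */

	for (pass = 0; pass < 4; pass++)
		shrink_mmap_pass(pages, PAGES);
	return 0;
}

Running it shows the buffer-referenced page surviving one pass longer
than the untouched ones, which is the fairness property in question.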
* Re: More info: 2.1.108 page cache performance on low memory
  1998-07-23 17:18 ` Stephen C. Tweedie
@ 1998-07-23 19:33   ` Zlatko Calusic
  1998-07-27 10:57     ` Stephen C. Tweedie
  0 siblings, 1 reply; 46+ messages in thread
From: Zlatko Calusic @ 1998-07-23 19:33 UTC (permalink / raw)
  To: Stephen C. Tweedie; +Cc: Eric W. Biederman, werner, linux-mm

"Stephen C. Tweedie" <sct@redhat.com> writes:

> Hi,
>
> On 23 Jul 1998 16:07:23 +0200, Zlatko Calusic <Zlatko.Calusic@CARNet.hr>
> said:
>
> > Strangely enough, I think I never explained why *I* think integrating
> > buffer cache functionality into the page cache would be a good thing.
> > Since the two caches are very different, I'm not sure memory
> > management can be fair enough in some cases.
>
> > Take a simple example: two I/O-bound applications, where one is
> > accessing a raw partition (e.g. fsck) and the other uses the
> > filesystem (web, ftp...). The question is, how do I know that the MM
> > is fair? Maybe the page cache grows too large at the expense of the
> > buffer cache, so fsck runs much slower than it could. Or if the
> > buffer cache grows faster (which is not the case, IMO) then the web
> > server would be fast, but fsck (or some database accessing a raw
> > partition) could take a penalty.
>
> There's a single loop in shrink_mmap() which treats both buffer-cache
> pages and page-cache pages identically.  It just propagates the buffer
> referenced bits into the page's PG_referenced bit before doing any
> ageing on the page.  It should be fair enough.  There are other issues
> concerning things like locked and dirty buffers which complicate the
> issue, but they are not sufficient reason to throw away the buffer
> cache!

Hm, I know how shrink_mmap works, but I never looked at it that way.
My eyes are wide open now. It seems none of my reasons are valid, so I
will forget about my ideas for a while. :)

In the meantime, I applied the same benchmark I was already running to
a kernel with Werner's lowmem patch applied, and the results are
interesting. Performance is very similar to that with my change, but
there are some differences. With Werner's patch, kernel behaviour is
slightly less aggressive still:

 procs                  memory    swap        io    system         cpu
 r b w  swpd  free  buff cache  si  so   bi  bo   in   cs  us  sy  id
 0 0 0     0  6492  4548 23100   0   0  179  16  219  157  23   9  68
 0 0 0     0  6492  4548 23100   0   0    0   2  108    9   0   0 100
 1 0 0    84  1384  1964 31168   0   8 6051   3  229  222   1  24  74
 1 0 0   128  1200  1964 31404   0   4 6630   3  238  237   1  25  75
 1 0 0   476  1024  1964 31928   0  35 6802   9  240  241   1  26  73
 1 0 0  1764  1316  1964 32932   0 129 6522  33  240  233   1  23  76
 1 0 0  2584  1172  1964 33896   0  82 6392  21  237  227   1  23  76
 1 0 0  3384  1284  1964 34584   0  80 6330  21  234  224   1  24  75
 1 0 0  4100  1232  1964 35352   0  72 6365  19  234  228   0  23  76
 1 0 0  4164  1432  1964 35236   0   6 6176   2  229  223   1  24  75
 1 0 0  4220  1136  1964 35580   0   6 7331   2  250  258   2  27  71
 1 0 0  4892  1284  1964 36096   0  67 7417  18  255  261   2  28  70
 1 0 0  4940  1532  1964 35896   0   5 7460   2  252  258   1  28  71
 1 0 0  4980  1540  1964 35932   0   4 7307   2  251  256   2  27  72
 0 0 0  4996  1536  1964 35984   0   2 1496   2  140   66   0   5  95
 0 0 0  4996  1536  1964 35984   0   0    0   1  102    7   0   0 100

So whichever solution finds its way into the official kernel will make
me happy. :)

Thank you for your thoughts and opinions!
Wish you a nice weekend (at that wedding, is it yours?) :)
--
Posted by Zlatko Calusic          E-mail: <Zlatko.Calusic@CARNet.hr>
---------------------------------------------------------------------
     Crime doesn't pay... does that mean my job is a crime?

--
This is a majordomo managed list.
To unsubscribe, send a message with the body 'unsubscribe linux-mm me@address' to: majordomo@kvack.org ^ permalink raw reply [flat|nested] 46+ messages in thread
* Re: More info: 2.1.108 page cache performance on low memory
  1998-07-23 19:33 ` Zlatko Calusic
@ 1998-07-27 10:57   ` Stephen C. Tweedie
  0 siblings, 0 replies; 46+ messages in thread
From: Stephen C. Tweedie @ 1998-07-27 10:57 UTC (permalink / raw)
  To: Zlatko.Calusic; +Cc: Stephen C. Tweedie, Eric W. Biederman, werner, linux-mm

> In the meantime, I applied the same benchmark I was already running
> to a kernel with Werner's lowmem patch applied, and the results are
> interesting. Performance is very similar to that with my change, but
> there are some differences. With Werner's patch, kernel behaviour is
> slightly less aggressive still:

OK, time to look at a bigger set of benchmarks for this.  If it helps
this case, it needs to be considered for 2.2.

--Stephen
--
This is a majordomo managed list.  To unsubscribe, send a message with
the body 'unsubscribe linux-mm me@address' to: majordomo@kvack.org

^ permalink raw reply	[flat|nested] 46+ messages in thread
* Re: More info: 2.1.108 page cache performance on low memory
  1998-07-23 10:06 ` Zlatko Calusic
  1998-07-23 12:22   ` Stephen C. Tweedie
@ 1998-07-26 14:49   ` Eric W Biederman
  1998-07-27 11:02     ` Stephen C. Tweedie
  1 sibling, 1 reply; 46+ messages in thread
From: Eric W Biederman @ 1998-07-26 14:49 UTC (permalink / raw)
  To: Zlatko Calusic; +Cc: Stephen C. Tweedie, linux-mm

On 23 Jul 1998, Zlatko Calusic wrote:

> "Stephen C. Tweedie" <sct@redhat.com> writes:
>
> > Hi,
> >
> > On 20 Jul 1998 11:15:12 +0200, Zlatko Calusic <Zlatko.Calusic@CARNet.hr>
> > said:
> >
> > > I don't know if it's easy, but we probably should get rid of the
> > > buffer cache completely at some point. It's hard to balance things
> > > between two caches, not to mention other memory objects in the
> > > kernel.
> >
> > No, we need the buffer cache for all sorts of things.  You'd have to
> > reinvent it if you got rid of it, since it is the main mechanism by
> > which we can reliably label IO for the block device driver layer, and
> > we also cache non-page-aligned filesystem metadata there.
>
> Even though I haven't investigated it much, I still see Eric's work on
> adding dirty page functionality as a step toward this.

From where I sit it looks completely possible to give the buffer cache
a fake inode, and have it use the same mechanisms that I have
developed for handling other dirty data in the page cache.  It should
also be possible in this effort to simplify the buffer_head structure.

As time permits I'll move in that direction.

Eric
--
This is a majordomo managed list.  To unsubscribe, send a message with
the body 'unsubscribe linux-mm me@address' to: majordomo@kvack.org

^ permalink raw reply	[flat|nested] 46+ messages in thread
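Eric's fake-inode trick is essentially a keying convention: the page
cache looks pages up by (inode, offset), so giving the block device
one pseudo inode lets device blocks live in the same hash as file
pages. A minimal model of such a lookup (the hash shape and all names
are invented for illustration):

#include <stdio.h>

#define HASH_SIZE 64

struct inode { long i_ino; };

struct page {
	struct inode *inode;	/* which object the page belongs to */
	unsigned long offset;	/* byte offset within that object */
	struct page  *next_hash;
};

static struct page *page_hash[HASH_SIZE];

/* The buffer cache gets one fake inode; its "offset" is then just
 * the byte offset on the block device. */
static struct inode blkdev_inode = { -1 };

static unsigned int hashfn(struct inode *inode, unsigned long offset)
{
	return ((unsigned long)inode->i_ino ^ (offset >> 12)) % HASH_SIZE;
}

static void add_page(struct page *p)
{
	unsigned int h = hashfn(p->inode, p->offset);

	p->next_hash = page_hash[h];
	page_hash[h] = p;
}

static struct page *find_page(struct inode *inode, unsigned long offset)
{
	struct page *p = page_hash[hashfn(inode, offset)];

	for (; p; p = p->next_hash)
		if (p->inode == inode && p->offset == offset)
			return p;
	return NULL;
}

int main(void)
{
	struct inode file = { 42 };
	struct page fp = { &file, 8192 }, dp = { &blkdev_inode, 8192 };

	add_page(&fp);
	add_page(&dp);	/* device data and file data share one cache */
	printf("file page: %p, device page: %p\n",
	       (void *)find_page(&file, 8192),
	       (void *)find_page(&blkdev_inode, 8192));
	return 0;
}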
* Re: More info: 2.1.108 page cache performance on low memory
  1998-07-26 14:49 ` Eric W Biederman
@ 1998-07-27 11:02   ` Stephen C. Tweedie
  1998-08-02  5:19     ` Eric W Biederman
  0 siblings, 1 reply; 46+ messages in thread
From: Stephen C. Tweedie @ 1998-07-27 11:02 UTC (permalink / raw)
  To: ebiederm+eric; +Cc: Zlatko Calusic, Stephen C. Tweedie, linux-mm

Hi,

On Sun, 26 Jul 1998 09:49:02 -0500 (CDT), Eric W Biederman
<eric@flinx.npwt.net> said:

> From where I sit it looks completely possible to give the buffer cache
> a fake inode, and have it use the same mechanisms that I have
> developed for handling other dirty data in the page cache.  It should
> also be possible in this effort to simplify the buffer_head structure.

> As time permits I'll move in that direction.

You'd still have to persuade people that it's a good idea.  I'm not
convinced.

The reason for having things in the page cache is for fast lookup.
For this to make sense for the buffer cache, you'd have to align the
buffer cache on page boundaries, but buffers on disk are not naturally
aligned this way.  You'd end up wasting a lot of space as perhaps only
a few of the buffers in any page were useful, and you'd also have to
keep track of which buffers within the page were valid/dirty.

We *need* a mechanism which is block-aligned, not page-aligned.  The
buffer cache is a good way of doing it.  Forcing block device caching
into a page-aligned cache is not necessarily going to simplify things.

--Stephen
--
This is a majordomo managed list.  To unsubscribe, send a message with
the body 'unsubscribe linux-mm me@address' to: majordomo@kvack.org

^ permalink raw reply	[flat|nested] 46+ messages in thread
* Re: More info: 2.1.108 page cache performance on low memory
  1998-07-27 11:02 ` Stephen C. Tweedie
@ 1998-08-02  5:19   ` Eric W Biederman
  1998-08-17 13:57     ` Stephen C. Tweedie
  1998-08-17 15:35     ` Stephen C. Tweedie
  0 siblings, 2 replies; 46+ messages in thread
From: Eric W Biederman @ 1998-08-02 5:19 UTC (permalink / raw)
  To: Stephen C. Tweedie; +Cc: Zlatko Calusic, linux-mm

On Mon, 27 Jul 1998, Stephen C. Tweedie wrote:

> Hi,
>
> On Sun, 26 Jul 1998 09:49:02 -0500 (CDT), Eric W Biederman
> <eric@flinx.npwt.net> said:
>
> > From where I sit it looks completely possible to give the buffer
> > cache a fake inode, and have it use the same mechanisms that I have
> > developed for handling other dirty data in the page cache.  It
> > should also be possible in this effort to simplify the buffer_head
> > structure.
>
> > As time permits I'll move in that direction.
>
> You'd still have to persuade people that it's a good idea.  I'm not
> convinced.
>
> The reason for having things in the page cache is for fast lookup.
> For this to make sense for the buffer cache, you'd have to align the
> buffer cache on page boundaries, but buffers on disk are not naturally
> aligned this way.  You'd end up wasting a lot of space as perhaps only
> a few of the buffers in any page were useful, and you'd also have to
> keep track of which buffers within the page were valid/dirty.

That wasn't actually how I was envisioning it.  Though it is a
possibility I have kicked around.  For direct device I/O and mmapping
of devices it is exactly how we should do it.

What I was envisioning is using a single write-out daemon instead of 2
(one for buffer cache, one for page cache).  Using the same tests in
shrink_mmap.  Reducing the size of a buffer_head by a lot because
consolidating the two would reduce the number of lists needed.  To sit
the buffer cache upon a single pseudo inode, and keep its current
hashing scheme.

In general allowing the management to be consolidated between the two,
but nothing more.

At this point it is not a major point, but the buffer cache is quite
likely to shrink into something barely noticeable, assuming regular
files will buffer their writes themselves in the page cache,
preventing double buffering.  When the buffer cache becomes a shrunken
appendage then we will know what we really need it for, and how much
of a performance hit we will take, and we can worry about it then.

> We *need* a mechanism which is block-aligned, not page-aligned.  The
> buffer cache is a good way of doing it.  Forcing block device caching
> into a page-aligned cache is not necessarily going to simplify things.

The page-aligned property is only a matter of the (inode, offset) hash
table, and virtually nothing else really cares.  shrink_mmap and
pgflush, the most universal parts of the page cache, do not.

Eric
--
This is a majordomo managed list.  To unsubscribe, send a message with
the body 'unsubscribe linux-mm me@address' to: majordomo@kvack.org

^ permalink raw reply	[flat|nested] 46+ messages in thread
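Eric's single write-out daemon reduces, at its core, to walking one
LRU of dirty pages and flushing whatever has been dirty longer than
some threshold; that is exactly what the page writetime and dirty-list
pointers he mentioned earlier would support. A user-space model, with
the flush and the clock faked and all names illustrative:

#include <stdio.h>

struct page {
	unsigned long writetime;	/* when the page was first dirtied */
	struct page  *prev, *next;	/* LRU list of dirty pages */
};

static struct page *dirty_head, *dirty_tail;	/* oldest at head */

static void mark_dirty(struct page *p, unsigned long now)
{
	p->writetime = now;
	p->next = NULL;
	p->prev = dirty_tail;
	if (dirty_tail)
		dirty_tail->next = p;
	else
		dirty_head = p;
	dirty_tail = p;
}

/* One pgflush/bdflush-style pass: pages sit in dirtying order, so we
 * can stop at the first page that is still too young. */
static void sync_old_pages(unsigned long now, unsigned long max_age)
{
	while (dirty_head && now - dirty_head->writetime >= max_age) {
		struct page *p = dirty_head;

		dirty_head = p->next;
		if (dirty_head)
			dirty_head->prev = NULL;
		else
			dirty_tail = NULL;
		printf("flushing page dirtied at t=%lu\n", p->writetime);
	}
}

int main(void)
{
	struct page a, b, c;

	mark_dirty(&a, 0);
	mark_dirty(&b, 3);
	mark_dirty(&c, 9);
	sync_old_pages(10, 5);	/* flushes a and b, leaves c dirty */
	return 0;
}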
* Re: More info: 2.1.108 page cache performance on low memory
  1998-08-02  5:19 ` Eric W Biederman
@ 1998-08-17 13:57   ` Stephen C. Tweedie
  1998-08-17 15:35   ` Stephen C. Tweedie
  1 sibling, 0 replies; 46+ messages in thread
From: Stephen C. Tweedie @ 1998-08-17 13:57 UTC (permalink / raw)
  To: ebiederm+eric; +Cc: Stephen C. Tweedie, Zlatko Calusic, linux-mm

Hi,

Sorry, I'm just back from 2 weeks on holiday.

On Sun, 2 Aug 1998 00:19:52 -0500 (CDT), Eric W Biederman
<eric@flinx.npwt.net> said:

>> We *need* a mechanism which is block-aligned, not page-aligned.  The
>> buffer cache is a good way of doing it.  Forcing block device caching
>> into a page-aligned cache is not necessarily going to simplify things.

> The page-aligned property is only a matter of the (inode, offset) hash
> table, and virtually nothing else really cares.  shrink_mmap and
> pgflush, the most universal parts of the page cache, do not.

Any mmap()able files *need* to be page aligned in cache.  Internal
filesystem accesses are always block aligned, not page aligned.
That's the conflict.

--Stephen
--
This is a majordomo managed list.  To unsubscribe, send a message with
the body 'unsubscribe linux-mm me@address' to: majordomo@kvack.org

^ permalink raw reply	[flat|nested] 46+ messages in thread
* Re: More info: 2.1.108 page cache performance on low memory
  1998-08-02  5:19 ` Eric W Biederman
  1998-08-17 13:57   ` Stephen C. Tweedie
@ 1998-08-17 15:35   ` Stephen C. Tweedie
  1998-08-20 12:40     ` Eric W. Biederman
  1 sibling, 1 reply; 46+ messages in thread
From: Stephen C. Tweedie @ 1998-08-17 15:35 UTC (permalink / raw)
  To: ebiederm+eric; +Cc: Stephen C. Tweedie, Zlatko Calusic, linux-mm

Hi,

On Sun, 2 Aug 1998 00:19:52 -0500 (CDT), Eric W Biederman
<eric@flinx.npwt.net> said:

> What I was envisioning is using a single write-out daemon instead of 2
> (one for buffer cache, one for page cache).  Using the same tests in
> shrink_mmap.  Reducing the size of a buffer_head by a lot because
> consolidating the two would reduce the number of lists needed.  To sit
> the buffer cache upon a single pseudo inode, and keep its current
> hashing scheme.

The only reason we currently have two daemons is that we need one for
writing dirty memory and another for reclaiming clean memory.  That
way, even when we stall for disk writes, we are still able to reclaim
free memory via shrink_mmap().  The kswapd daemon and the
shrink_mmap() code already treat the page cache and the buffer cache
the same.

--Stephen
--
This is a majordomo managed list.  To unsubscribe, send a message with
the body 'unsubscribe linux-mm me@address' to: majordomo@kvack.org

^ permalink raw reply	[flat|nested] 46+ messages in thread
* Re: More info: 2.1.108 page cache performance on low memory
  1998-08-17 15:35 ` Stephen C. Tweedie
@ 1998-08-20 12:40   ` Eric W. Biederman
  0 siblings, 0 replies; 46+ messages in thread
From: Eric W. Biederman @ 1998-08-20 12:40 UTC (permalink / raw)
  To: Stephen C. Tweedie; +Cc: Zlatko Calusic, linux-mm

>>>>> "ST" == Stephen C Tweedie <sct@redhat.com> writes:

ST> Hi,
ST> On Sun, 2 Aug 1998 00:19:52 -0500 (CDT), Eric W Biederman
ST> <eric@flinx.npwt.net> said:

>> What I was envisioning is using a single write-out daemon instead of 2
>> (one for buffer cache, one for page cache).  Using the same tests in
>> shrink_mmap.  Reducing the size of a buffer_head by a lot because
>> consolidating the two would reduce the number of lists needed.  To sit
>> the buffer cache upon a single pseudo inode, and keep its current
>> hashing scheme.

ST> The only reason we currently have two daemons

But I have 3.
One for writing dirty data in the buffer cache.  bdflush
One for writing dirty data in the page cache.  pgflush
One for reclaiming clean memory  kswapd

I would like to merge bdflush and pgflush in the long run if I can.
Since pgflush is more generic than bdflush it should be doable.  This
happens to give a degree of page cache and buffer cache unification as
a side effect of setting up the buffer cache to use pgflush.

ST> is that we need one for writing dirty memory and another for
ST> reclaiming clean memory.  That way, even when we stall for disk
ST> writes, we are still able to reclaim free memory via shrink_mmap().
ST> The kswapd daemon and the shrink_mmap() code already treat the page
ST> cache and the buffer cache the same.

I was talking about integrating my ``dirty data in the page cache''
code with the rest of the kernel.  Hopefully for early 2.3.

My apologies for being so unclear that you totally missed what I was
talking about.

Eric
--
This is a majordomo managed list.  To unsubscribe, send a message with
the body 'unsubscribe linux-mm me@address' to: majordomo@kvack.org

^ permalink raw reply	[flat|nested] 46+ messages in thread
* Re: More info: 2.1.108 page cache performance on low memory
  1998-07-18 16:40 ` Eric W. Biederman
  1998-07-20  9:15   ` Zlatko Calusic
@ 1998-07-20 15:58   ` Stephen C. Tweedie
  1998-07-22 10:36   ` Stephen C. Tweedie
  2 siblings, 0 replies; 46+ messages in thread
From: Stephen C. Tweedie @ 1998-07-20 15:58 UTC (permalink / raw)
  To: Eric W. Biederman; +Cc: Zlatko.Calusic, Stephen C. Tweedie, linux-mm

Hi,

On 18 Jul 1998 11:40:20 -0500, ebiederm+eric@npwt.net (Eric W.
Biederman) said:

> >>>>> "ZC" == Zlatko Calusic <Zlatko.Calusic@CARNet.hr> writes:

> Let me just step back a second so I can be clear:

> A) The idea proposed by Stephen was that perhaps we could use Least
> Recently Used lists instead of page aging.  It's effectively the same
> thing but shrink_mmap can find the old pages much much faster, by
> simply following a linked list.

> B) This idea intrigues me because with the handling of generic dirty
> pages I have about the same problem.  In cloning bdflush for the page
> cache I discovered two fields I would need to add to struct page to do
> an exact cloning job.  A page writetime, and LRU list pointers for
> dirty pages.  I went ahead and implemented them, but also implemented
> an alternative, which is the default.

We already have all of the inode's pages on a linked list.  Extending
that to have two separate lists, one for clean pages and one for
dirty, would be cheap and would not have the extra memory overhead.

--Stephen
--
This is a majordomo managed list.  To unsubscribe, send a message with
the body 'unsubscribe linux-mm me@address' to: majordomo@kvack.org

^ permalink raw reply	[flat|nested] 46+ messages in thread
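Stephen's suggestion amounts to splitting the inode's single page list
in two and moving a page between the halves when its dirty state
changes, reusing the linkage the page already has. A user-space sketch
under that assumption (the field names clean_pages/dirty_pages are
hypothetical, not the kernel's):

#include <stdio.h>

struct page {
	int          dirty;
	struct page *next;	/* linkage on one of the inode's lists */
};

struct inode {
	struct page *clean_pages;	/* hypothetical split of i_pages */
	struct page *dirty_pages;
};

static void list_push(struct page **list, struct page *p)
{
	p->next = *list;
	*list = p;
}

static struct page *list_pop(struct page **list)
{
	struct page *p = *list;

	if (p)
		*list = p->next;
	return p;
}

/* Dirtying a page just moves it to the other list, so a writeback
 * daemon never has to scan clean pages at all.  (A real version
 * would unlink the page from wherever it currently sits; here we
 * assume it was just popped off the clean list.) */
static void set_page_dirty(struct inode *inode, struct page *p)
{
	p->dirty = 1;
	list_push(&inode->dirty_pages, p);
}

int main(void)
{
	struct inode ino = { NULL, NULL };
	struct page p1 = { 0, NULL }, p2 = { 0, NULL };

	list_push(&ino.clean_pages, &p1);
	list_push(&ino.clean_pages, &p2);
	set_page_dirty(&ino, list_pop(&ino.clean_pages));
	printf("dirty head dirty=%d, clean head dirty=%d\n",
	       ino.dirty_pages->dirty, ino.clean_pages->dirty);
	return 0;
}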
* Re: More info: 2.1.108 page cache performance on low memory 1998-07-18 16:40 ` Eric W. Biederman 1998-07-20 9:15 ` Zlatko Calusic 1998-07-20 15:58 ` Stephen C. Tweedie @ 1998-07-22 10:36 ` Stephen C. Tweedie 1998-07-22 18:01 ` Rik van Riel 2 siblings, 1 reply; 46+ messages in thread From: Stephen C. Tweedie @ 1998-07-22 10:36 UTC (permalink / raw) To: Eric W. Biederman; +Cc: Zlatko.Calusic, Stephen C. Tweedie, linux-mm Hi, On 18 Jul 1998 11:40:20 -0500, ebiederm+eric@npwt.net (Eric W. Biederman) said: > Agreed. We should look very carefully though to see if any aging > solution increases fragmentation. According to Stephen the current > one does, and this may be a natural result of aging and not just a > single implementation :( No no no! The current VM has two separate but related problems. First is that it keeps too much cache in low memory configurations, and that appears to be much much better in 2.1.109 and 110. Second is the fragmentation issue, but that's a lot harder to address I'm afraid. I have a zoned allocator now working which does help enormously: it's the first time my VM-test 2.1 configuration has _ever_ been able to run successfully with 8k NFS. However, the zoned allocation can use memory less efficiently: the odd free pages in the paged zone cannot be used by non-paged users and vice versa, so overall performance may suffer. Right now I'm cleaning the code up for a release against 2.1.110 so that we can start testing. --Stephen -- This is a majordomo managed list. To unsubscribe, send a message with the body 'unsubscribe linux-mm me@address' to: majordomo@kvack.org ^ permalink raw reply [flat|nested] 46+ messages in thread
* Re: More info: 2.1.108 page cache performance on low memory 1998-07-22 10:36 ` Stephen C. Tweedie @ 1998-07-22 18:01 ` Rik van Riel 1998-07-23 10:59 ` Stephen C. Tweedie 0 siblings, 1 reply; 46+ messages in thread From: Rik van Riel @ 1998-07-22 18:01 UTC (permalink / raw) To: Stephen C. Tweedie; +Cc: Eric W. Biederman, Zlatko.Calusic, linux-mm On Wed, 22 Jul 1998, Stephen C. Tweedie wrote: > successfully with 8k NFS. However, the zoned allocation can use memory > less efficiently: the odd free pages in the paged zone cannot be used by > non-paged users and vice versa, so overall performance may suffer. > Right now I'm cleaning the code up for a release against 2.1.110 so > that we can start testing. Hmm, I'm curious as to what categories your allocator divides memory users in. Is it just plain swappable vs. non-swappable or is it fragmentation-causing vs. fragmentation sensitive or something entirely different? Btw, I'm working on version 2 of my zone allocator design right now. Maybe we want the complex but complete version for 2.3... Rik. +-------------------------------------------------------------------+ | Linux memory management tour guide. H.H.vanRiel@phys.uu.nl | | Scouting Vries cubscout leader. http://www.phys.uu.nl/~riel/ | +-------------------------------------------------------------------+ -- This is a majordomo managed list. To unsubscribe, send a message with the body 'unsubscribe linux-mm me@address' to: majordomo@kvack.org ^ permalink raw reply [flat|nested] 46+ messages in thread
* Re: More info: 2.1.108 page cache performance on low memory 1998-07-22 18:01 ` Rik van Riel @ 1998-07-23 10:59 ` Stephen C. Tweedie 0 siblings, 0 replies; 46+ messages in thread From: Stephen C. Tweedie @ 1998-07-23 10:59 UTC (permalink / raw) To: Rik van Riel Cc: Stephen C. Tweedie, Eric W. Biederman, Zlatko.Calusic, linux-mm Hi, On Wed, 22 Jul 1998 20:01:51 +0200 (CEST), Rik van Riel <H.H.vanRiel@phys.uu.nl> said: > On Wed, 22 Jul 1998, Stephen C. Tweedie wrote: >> successfully with 8k NFS. However, the zoned allocation can use memory >> less efficiently: the odd free pages in the paged zone cannot be used by >> non-paged users and vice versa, so overall performance may suffer. >> Right now I'm cleaning the code up for a release against 2.1.110 so >> that we can start testing. > Hmm, I'm curious as to what categories your allocator > divides memory users in. Is it just plain swappable > vs. non-swappable Yes, and so far it seems to work pretty well. > or is it fragmentation-causing vs. fragmentation sensitive or > something entirely different? As long as there are enough higher-order free pages to go around, the fragmentation distinction is not so important. The problem of course is that the more different zone types we have, the less efficiently we can use memory, so I really just want a minimal solution which does something about fragmentation for non-swappable allocations. --Stephen -- This is a majordomo managed list. To unsubscribe, send a message with the body 'unsubscribe linux-mm me@address' to: majordomo@kvack.org ^ permalink raw reply [flat|nested] 46+ messages in thread
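The swappable/non-swappable split Stephen describes can be modelled as
two pools drawn from one budget, with every allocation tagged by zone;
the inefficiency he mentions shows up as free pages stranded in the
wrong pool. A toy version (the policy and names are invented, not his
patch):

#include <stdio.h>

enum zone { ZONE_SWAPPABLE, ZONE_FIXED, NR_ZONES };

#define PAGES_PER_ZONE 4

static int zone_free[NR_ZONES] = { PAGES_PER_ZONE, PAGES_PER_ZONE };

/* Allocate from the requested zone only: a free page in the paged
 * zone is of no use to a non-paged caller, which is exactly the
 * efficiency cost mentioned above. */
static int alloc_page_zone(enum zone z)
{
	if (zone_free[z] == 0)
		return -1;	/* would have succeeded in a unified pool */
	zone_free[z]--;
	return 0;
}

int main(void)
{
	int i;

	for (i = 0; i < PAGES_PER_ZONE; i++)
		alloc_page_zone(ZONE_FIXED);	/* e.g. network buffers */

	printf("next fixed alloc: %s, swappable pages still free: %d\n",
	       alloc_page_zone(ZONE_FIXED) ? "fails" : "ok",
	       zone_free[ZONE_SWAPPABLE]);
	return 0;
}

The upside, per Stephen's report, is that unswappable allocations stop
fragmenting the memory that higher-order users such as 8k NFS need.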
* Re: More info: 2.1.108 page cache performance on low memory
  1998-07-18 13:28 ` Zlatko Calusic
  1998-07-18 16:40   ` Eric W. Biederman
@ 1998-07-22 10:33   ` Stephen C. Tweedie
  1998-07-23 10:59     ` Zlatko Calusic
  1 sibling, 1 reply; 46+ messages in thread
From: Stephen C. Tweedie @ 1998-07-22 10:33 UTC (permalink / raw)
  To: Zlatko.Calusic; +Cc: Eric W. Biederman, Stephen C. Tweedie, linux-mm

Hi,

On 18 Jul 1998 15:28:17 +0200, Zlatko Calusic
<Zlatko.Calusic@CARNet.hr> said:

> I must admit, after all the criticism I have aimed at page aging,
> that I believe it's the right way to go, but it should be done
> properly. Performance should be better, not worse.

Let me say one thing clearly: I'm not against page ageing (I
implemented it in the first place for the swapper), I'm against the
bad tuning it introduced.  *IF* we can fix that, then keep the ageing,
sure.  However, we need to fix it _completely_.  The non-cache-ageing
scheme at least has the advantage that we understand its behaviour, so
fiddling too much this close to 2.2 is not necessarily a good idea.
2.1.110, for example, now fails to boot for me in low memory
configurations because it cannot keep enough higher order pages free
for 4k NFS to work, never mind 8k.

That's the danger: we need to introduce new schemes like this at the
beginning of the development cycle for a new kernel, not the end.

--Stephen
--
This is a majordomo managed list.  To unsubscribe, send a message with
the body 'unsubscribe linux-mm me@address' to: majordomo@kvack.org

^ permalink raw reply	[flat|nested] 46+ messages in thread
* Re: More info: 2.1.108 page cache performance on low memory
  1998-07-22 10:33 ` Stephen C. Tweedie
@ 1998-07-23 10:59   ` Zlatko Calusic
  1998-07-23 12:23     ` Stephen C. Tweedie
  ` (3 more replies)
  0 siblings, 4 replies; 46+ messages in thread
From: Zlatko Calusic @ 1998-07-23 10:59 UTC (permalink / raw)
  To: Stephen C. Tweedie; +Cc: Eric W. Biederman, linux-mm

"Stephen C. Tweedie" <sct@redhat.com> writes:

> Hi,
>
> On 18 Jul 1998 15:28:17 +0200, Zlatko Calusic <Zlatko.Calusic@CARNet.hr>
> said:
>
> > I must admit, after all the criticism I have aimed at page aging,
> > that I believe it's the right way to go, but it should be done
> > properly. Performance should be better, not worse.
>
> Let me say one thing clearly: I'm not against page ageing (I
> implemented it in the first place for the swapper), I'm against the
> bad tuning it introduced.  *IF* we can fix that, then keep the ageing,
> sure.  However, we need to fix it _completely_.  The non-cache-ageing
> scheme at least has the advantage that we understand its behaviour, so
> fiddling too much this close to 2.2 is not necessarily a good idea.
> 2.1.110, for example, now fails to boot for me in low memory
> configurations because it cannot keep enough higher order pages free
> for 4k NFS to work, never mind 8k.
>
> That's the danger: we need to introduce new schemes like this at the
> beginning of the development cycle for a new kernel, not the end.

Cool! Then we agree on all topics. :)

As promised, I did some testing and I may have a solution (big words,
yeah! :)).

As I see it, the page cache seems too persistent (it grows out of
bounds) when we age pages in it.

One wrong way of fixing it is to limit the page cache size, IMNSHO.

I tried the other way, to age the page cache harder, and it looks like
it works very well. The patch is simple, so simple that I can't
understand why nobody has suggested (something like) it yet.

--- filemap.c.virgin	Tue Jul 21 18:41:30 1998
+++ filemap.c	Thu Jul 23 12:14:43 1998
@@ -171,6 +171,11 @@
 			touch_page(page);
 			break;
 		}
+		/* Age named pages aggressively, so page cache
+		 * doesn't grow too fast. -zcalusic
+		 */
+		age_page(page);
+		age_page(page);
 		age_page(page);
 		if (page->age)
 			break;

After lots of testing, I am quite pleased with the performance with
that small change. Where, using the official kernel, copying a few
hundred MB of data to /dev/null would swap out cca 20MB (and keep
swapping constantly, thus killing performance), now it swaps out only
5MB, probably exactly those pages that are not needed anyway. And that
is something that I like about aging.

I can provide thorough benchmark data, if needed.

If I put only two age_page()s, there's still too much swapping for my
taste. With three age_page()s, read performance is as expected, and we
still manage memory more efficiently than without page aging.

The patch applies cleanly to 2.1.110.

Comments?

Regards,
--
Posted by Zlatko Calusic          E-mail: <Zlatko.Calusic@CARNet.hr>
---------------------------------------------------------------------
       Don't steal - the government hates competition...

--
This is a majordomo managed list.  To unsubscribe, send a message with
the body 'unsubscribe linux-mm me@address' to: majordomo@kvack.org

^ permalink raw reply	[flat|nested] 46+ messages in thread
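The arithmetic behind the patch is easy to see in isolation: touching
a page raises its age, each age_page() call lowers it, and shrink_mmap
only reclaims at age zero. The constants below are assumptions made
for illustration, not the real values from that era's
include/linux/swap.h:

#include <stdio.h>

/* Illustrative constants in the spirit of 2.1's ageing code. */
#define PAGE_INITIAL_AGE 3
#define PAGE_AGE_DECL    1

struct page { int age; };

static void touch_page(struct page *p) { p->age = PAGE_INITIAL_AGE; }

static void age_page(struct page *p)
{
	if (p->age > PAGE_AGE_DECL)
		p->age -= PAGE_AGE_DECL;
	else
		p->age = 0;
}

int main(void)
{
	struct page p;
	int passes;

	/* stock kernel: one age_page() per shrink_mmap pass */
	touch_page(&p);
	for (passes = 0; p.age; passes++)
		age_page(&p);
	printf("1 x age_page: reclaimable after %d passes\n", passes);

	/* with the patch: three age_page() calls per pass */
	touch_page(&p);
	for (passes = 0; p.age; passes++) {
		age_page(&p);
		age_page(&p);
		age_page(&p);
	}
	printf("3 x age_page: reclaimable after %d passes\n", passes);
	return 0;
}

With these numbers an untouched page becomes reclaimable after one
pass instead of three, which matches the reduced swapping Zlatko
reports.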
* Re: More info: 2.1.108 page cache performance on low memory
  1998-07-23 10:59 ` Zlatko Calusic
@ 1998-07-23 12:23   ` Stephen C. Tweedie
  1998-07-23 15:06     ` Zlatko Calusic
  1998-07-23 17:12   ` Stephen C. Tweedie
  ` (2 subsequent siblings)
  3 siblings, 1 reply; 46+ messages in thread
From: Stephen C. Tweedie @ 1998-07-23 12:23 UTC (permalink / raw)
  To: Zlatko.Calusic; +Cc: Stephen C. Tweedie, Eric W. Biederman, linux-mm

Hi,

On 23 Jul 1998 12:59:38 +0200, Zlatko Calusic
<Zlatko.Calusic@CARNet.hr> said:

> As promised, I did some testing and I may have a solution (big words,
> yeah! :)).

> As I see it, the page cache seems too persistent (it grows out of
> bounds) when we age pages in it.

Not on 110, it looks.  On low memory, .110 seems to be even better
than .108 without the page ageing.  It is looking very good right now.

> I can provide thorough benchmark data, if needed.

Please do, but is this on .110?

--Stephen
--
This is a majordomo managed list.  To unsubscribe, send a message with
the body 'unsubscribe linux-mm me@address' to: majordomo@kvack.org

^ permalink raw reply	[flat|nested] 46+ messages in thread
* Re: More info: 2.1.108 page cache performance on low memory
  1998-07-23 12:23 ` Stephen C. Tweedie
@ 1998-07-23 15:06   ` Zlatko Calusic
  1998-07-23 15:17     ` Benjamin C.R. LaHaise
  0 siblings, 1 reply; 46+ messages in thread
From: Zlatko Calusic @ 1998-07-23 15:06 UTC (permalink / raw)
  To: Stephen C. Tweedie; +Cc: Eric W. Biederman, linux-mm

"Stephen C. Tweedie" <sct@redhat.com> writes:

> Hi,
>
> On 23 Jul 1998 12:59:38 +0200, Zlatko Calusic <Zlatko.Calusic@CARNet.hr>
> said:
>
> > As promised, I did some testing and I may have a solution (big
> > words, yeah! :)).
>
> > As I see it, the page cache seems too persistent (it grows out of
> > bounds) when we age pages in it.
>
> Not on 110, it looks.  On low memory, .110 seems to be even better
> than .108 without the page ageing.  It is looking very good right now.
>
> > I can provide thorough benchmark data, if needed.
>
> Please do, but is this on .110?

Yes, this is on .110.

Benchmarking methodology: compile the kernel, reboot, fire up XDM, a
few xterms, XEmacs and Netscape. In one xterm run vmstat 10, in
another copy 800MB worth of .mp3s :) to /dev/null (nothing special
changes if I copy them to another directory).

Official kernel: 1 x age_page() in shrink_mmap():

 procs                  memory    swap        io    system         cpu
 r b w  swpd  free  buff cache  si  so   bi  bo   in   cs  us  sy  id
 1 0 0     0  6832  4292 22740   0   0  182  15  220  155  25   9  66
 0 0 0     0  6860  4292 22744   0   0    0   5  112   10   0   0 100
 1 0 0   428  1380  1964 31260   0  43 5579  13  221  202   1  22  77
 1 0 0  2472  1428  1964 33256  10 209 5742  53  232  211   2  24  75
 1 0 0  5200  3500  1988 33928   5 273 6017  70  236  216   2  25  73
 1 0 0  7012  2940  1964 36292   6 181 6318  46  243  224   2  27  71
 1 0 0 11036  1084  1964 42168   6 402 5910 101  240  212   1  27  72
 1 0 0 12572  3832  2028 40900   6 154 5939  39  239  211   1  23  76
 1 0 0 14288 11336  1964 35180  10 172 5863  44  233  209   1  24  75
 1 0 0 17484  1188  1964 48552  29 320 5076  81  229  189   1  23  76
 1 0 0 18588 10640  1964 40176  42 111 4668  29  217  187   1  18  81
 1 0 0 21988  1576  1964 52636  43 342 5434  86  240  204   1  22  77
 1 0 0 23524 13676  1964 42076  47 154 5652  39  236  222   1  22  77
 1 1 0 23812  1284  1992 54728  41  31 5915   9  234  230   1  25  74
 1 0 0 24076 24324  2028 31916  40  30 6106   8  239  226   1  24  75
 1 0 0 24092 16064  2028 40188  48   7 5869   3  235  226   1  22  77
 0 0 0 24020  1540  2000 54724  30   0 2356   1  162  114   0  11  89
 0 0 0 23980  1536  2000 54688   8   0    2   0  104   19   0   0 100

24MB swapped out, lots of swapouts and swapins!!! There would be much
more swap activity if I were actually using Netscape or XEmacs during
the I/O, but in both tests I wasn't! I forgot to put "time" before cp
:(, but... 15 lines x 10 sec = cca 150 seconds to copy the files.

Also, notice that I'm not memory starved (starting with cca 7 + 4 + 23
= 34 MB for caches to use). In the last minute, the system had swapped
out practically everything it could, so it started to fight for every
other page, effectively losing time (~30 pages out, ~40 pages in,
every second). Too bad. :(
Patched with the small patch I posted: 3 x age_page() in shrink_mmap():

 procs                  memory    swap        io    system         cpu
 r b w  swpd  free  buff cache  si  so   bi  bo   in   cs  us  sy  id
 1 0 0     0  7072  4292 22768   0   0  172  15  217  153  23   9  68
 0 0 0     0  7048  4292 22736   0   0    0   2  109   11   0   0 100
 1 0 0    76  1044  1964 31444   0   8 5899   4  228  219   1  20  79
 1 0 0   116  6076  1964 26432   0   4 6665   2  243  241   2  27  71
 1 0 0   132  6492  2028 25980   0   2 6723   1  239  238   1  25  75
 1 0 0   488  6816  2028 26016   0  36 6671  10  240  233   1  25  74
 1 0 0  1288  1240  1964 32460   0  80 6163  21  232  220   1  23  76
 1 0 0  2152  1536  1964 33028   0  86 6234  22  233  223   1  24  76
 1 0 0  3008  1384  1964 34032   0  86 6313  22  235  229   1  22  77
 1 0 0  3084  1488  1964 34008   0   8 6135   3  229  223   1  22  77
 1 0 0  4816  1128  1964 36096   0 173 6778  44  247  237   2  25  73
 1 0 0  5912  1172  1964 37152   0 110 7103  28  252  252   1  29  70
 1 0 0  6904  1536  1964 37780   0  99 7247  26  250  252   1  27  72
 1 0 0  8348  3704  2028 36988   0 144 7095  37  255  243   1  25  73
 0 0 0  9164 14980  2028 26608   1  82 3278  22  173  120   1  13  86
 0 0 0  9164 14980  2028 26608   0   0    0   0  102    6   0   0 100

The first thing to notice is only 10MB on swap (good). Second, and
more important, the system was *not* swapping things in at all,
because only pages that really belonged in swap (unneeded ones) were
swapped out. Copying finished in 13 x 10 = ~130 seconds.

Conclusion: better I/O performance, and a better feel when using
applications (I didn't have to wait for Netscape or XEmacs to come
back from swap when I started using them for real).

I was very careful to do exactly the same sequence in both tests!
I think it is obvious from the first line of those vmstat reports.

Anything I forgot to test? :)
--
Posted by Zlatko Calusic          E-mail: <Zlatko.Calusic@CARNet.hr>
---------------------------------------------------------------------
     Remember that you are unique. Just like everyone else.

--
This is a majordomo managed list.  To unsubscribe, send a message with
the body 'unsubscribe linux-mm me@address' to: majordomo@kvack.org

^ permalink raw reply	[flat|nested] 46+ messages in thread
* Re: More info: 2.1.108 page cache performance on low memory
  1998-07-23 15:06 ` Zlatko Calusic
@ 1998-07-23 15:17   ` Benjamin C.R. LaHaise
  1998-07-23 15:25     ` Zlatko Calusic
  0 siblings, 1 reply; 46+ messages in thread
From: Benjamin C.R. LaHaise @ 1998-07-23 15:17 UTC (permalink / raw)
  To: Zlatko Calusic; +Cc: Stephen C. Tweedie, Eric W. Biederman, linux-mm

On 23 Jul 1998, Zlatko Calusic wrote:

> I was very careful to do exactly the same sequence in both tests!
> I think it is obvious from the first line of those vmstat reports.
>
> Anything I forgot to test? :)

Yeap! ;-) Could you try Werner Fink's lowmem.patch -- it changes the
MAX_PAGE_AGE mechanism to have a dynamic upper limit which is lower on
systems with less memory...  That should have a similar effect to the
multiple invocations of age_page that you tried.

		-ben
--
This is a majordomo managed list.  To unsubscribe, send a message with
the body 'unsubscribe linux-mm me@address' to: majordomo@kvack.org

^ permalink raw reply	[flat|nested] 46+ messages in thread
* Re: More info: 2.1.108 page cache performance on low memory
  1998-07-23 15:17 ` Benjamin C.R. LaHaise
@ 1998-07-23 15:25   ` Zlatko Calusic
  1998-07-23 17:27     ` Benjamin C.R. LaHaise
  0 siblings, 1 reply; 46+ messages in thread
From: Zlatko Calusic @ 1998-07-23 15:25 UTC (permalink / raw)
  To: Benjamin C.R. LaHaise; +Cc: linux-mm

"Benjamin C.R. LaHaise" <blah@kvack.org> writes:

> On 23 Jul 1998, Zlatko Calusic wrote:
>
> > I was very careful to do exactly the same sequence in both tests!
> > I think it is obvious from the first line of those vmstat reports.
> >
> > Anything I forgot to test? :)
>
> Yeap! ;-) Could you try Werner Fink's lowmem.patch -- it changes the
> MAX_PAGE_AGE mechanism to have a dynamic upper limit which is lower on
> systems with less memory...  That should have a similar effect to the
> multiple invocations of age_page that you tried.

Not really! :) I'm trying and trying, but every time...

While trying to retrieve the URL:
http://riemann.suse.de/~werner/patches/

The following error was encountered:

    ERROR 308 -- Cannot connect to the original site

This means that: The remote site may be down.

Could you please send me a copy, since I don't know how long the host
will be down?

Regards,
--
Posted by Zlatko Calusic          E-mail: <Zlatko.Calusic@CARNet.hr>
---------------------------------------------------------------------
                    Don't mess with Murphy.

--
This is a majordomo managed list.  To unsubscribe, send a message with
the body 'unsubscribe linux-mm me@address' to: majordomo@kvack.org

^ permalink raw reply	[flat|nested] 46+ messages in thread
* Re: More info: 2.1.108 page cache performance on low memory
  1998-07-23 15:25 ` Zlatko Calusic
@ 1998-07-23 17:27   ` Benjamin C.R. LaHaise
  1998-07-23 19:17     ` Dr. Werner Fink
  0 siblings, 1 reply; 46+ messages in thread
From: Benjamin C.R. LaHaise @ 1998-07-23 17:27 UTC (permalink / raw)
  To: Zlatko Calusic; +Cc: linux-mm

On 23 Jul 1998, Zlatko Calusic wrote:

> Could you please send me a copy, since I don't know how long the host
> will be down?

Okay, there's now a copy at
http://www.kvack.org/~blah/patches/werner-lowmem.patch-2.1.110.gz (~5k).

		-ben
--
This is a majordomo managed list.  To unsubscribe, send a message with
the body 'unsubscribe linux-mm me@address' to: majordomo@kvack.org

^ permalink raw reply	[flat|nested] 46+ messages in thread
* Re: More info: 2.1.108 page cache performance on low memory 1998-07-23 17:27 ` Benjamin C.R. LaHaise @ 1998-07-23 19:17 ` Dr. Werner Fink 0 siblings, 0 replies; 46+ messages in thread From: Dr. Werner Fink @ 1998-07-23 19:17 UTC (permalink / raw) To: linux-mm

On Thu, Jul 23, 1998 at 01:27:53PM -0400, Benjamin C.R. LaHaise wrote:
> On 23 Jul 1998, Zlatko Calusic wrote:
>
> > Could you please send me a copy, since I don't know for how long host
> > will be down?
>
> Okay, there's now a copy at
> http://www.kvack.org/~blah/patches/werner-lowmem.patch-2.1.110.gz (~5k).

One remark ... Bill's (to be exact Bill Hawes <whawes@star.net>) patch of
a dynamic number of inodes is more elegant than mine included in this
patch :-)

Werner
--------------------------------------------------------------------------
Hi Bill,

I tried running a test similar to what I think you're using for your
"rust series", and on 2.1.109 I see very little change in compile time
after doing a big find. After booting into 8M, a compile of
net-tools-1.45 takes 83 seconds the first time, 89 seconds after doing a
"find /usr -type f" (about 53,000 files on my system.) Not a speed-up,
but a much smaller change than the typical numbers you've been seeing.
Subsequent finds don't have much effect; compile times remain in the
range of 84-89 sec.

My kernel is heavily patched :-), but I think the relative lack of rust
may be largely due to setting inode-max to scale with memory size. For
an 8M system I have inode-max set to 1024, which nicely limits the
fraction of both inode and dcache memory.

If you don't mind trying some further experiments, could you try 2.1.109
with either the attached patch, or just an
echo 1024 >/proc/sys/fs/inode-max
right after boot. The patch makes this automatic and also preallocates
the inodes so that there's no fragmentation effect, but the important
part is probably to just get the limit right.

Hope this helps a bit ...

Regards,
Bill

[attachment: inode_prealloc109-patch]

--- linux-2.1.109/include/linux/fs.h.old	Fri Jul 17 09:28:55 1998
+++ linux-2.1.109/include/linux/fs.h	Fri Jul 17 09:33:33 1998
@@ -46,7 +46,17 @@
 /* And dynamically-tunable limits and defaults: */
 extern int max_inodes;
 extern int max_files, nr_files, nr_free_files;
-#define NR_INODE 4096	/* This should no longer be bigger than NR_FILE */
+/*
+ * Make the default inode limit scale with memory size
+ * up to a limit. (A 32M system gets 4096 inodes.)
+ *
+ * Note: NR_INODE may be larger than NR_FILE, as unused
+ * inodes are still useful for preserving page cache.
+ */
+#define NR_INODE_MAX 16384
+#define NR_INODE(pages) \
+	(((pages) >> 1) <= NR_INODE_MAX ? ((pages) >> 1) : NR_INODE_MAX)
+
 #define NR_FILE 4096	/* this can well be larger on a larger system */
 #define NR_RESERVED_FILES 10 /* reserved for root */
--- linux-2.1.109/fs/inode.c.old	Fri Jul  3 10:32:32 1998
+++ linux-2.1.109/fs/inode.c	Fri Jul 17 10:05:55 1998
@@ -20,8 +20,12 @@
  * Famous last words.
  */
 
+/* for sizing the inode limit */
+extern unsigned long num_physpages;
+
 #define INODE_PARANOIA 1
 /* #define INODE_DEBUG 1 */
+#define INODE_PREALLOC 1	/* make a CONFIG option */
 
 /*
  * Inode lookup is no longer as critical as it used to be:
@@ -65,7 +69,8 @@
 	int dummy[4];
 } inodes_stat = {0, 0, 0,};
 
-int max_inodes = NR_INODE;
+/* Initialized in inode_init() */
+int max_inodes;
 
 /*
  * Put the inode on the super block's dirty list.
@@ -737,15 +791,34 @@
  */
 void inode_init(void)
 {
-	int i;
 	struct list_head *head = inode_hashtable;
+	int i = HASH_SIZE;
 
-	i = HASH_SIZE;
 	do {
 		INIT_LIST_HEAD(head);
 		head++;
 		i--;
 	} while (i);
+
+	/*
+	 * Initialize the default maximum based on memory size.
+	 */
+	max_inodes = NR_INODE(num_physpages);
+
+#ifdef INODE_PREALLOC
+	/*
+	 * Preallocate the inodes to avoid memory fragmentation.
+	 */
+	spin_lock(&inode_lock);
+	while (inodes_stat.nr_inodes < max_inodes) {
+		struct inode *inode = grow_inodes();
+		if (!inode)
+			break;	/* break, not goto: don't leak inode_lock */
+		list_add(&inode->i_list, &inode_unused);
+		inodes_stat.nr_free_inodes++;
+	}
+	spin_unlock(&inode_lock);
+#endif
 }
 
 /* This belongs in file_table.c, not here... */
-- This is a majordomo managed list. To unsubscribe, send a message with the body 'unsubscribe linux-mm me@address' to: majordomo@kvack.org ^ permalink raw reply [flat|nested] 46+ messages in thread
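For a concrete feel for what the NR_INODE() scaling above works out to, here is a small stand-alone sketch; it assumes 4K pages (i386), and the memory sizes in the loop are illustrative, not taken from the thread:

/* Stand-alone check of the NR_INODE() scaling in the patch above.
 * Assumes 4K pages (i386); the memory sizes below are illustrative. */
#include <stdio.h>

#define NR_INODE_MAX 16384
#define NR_INODE(pages) \
	(((pages) >> 1) <= NR_INODE_MAX ? ((pages) >> 1) : NR_INODE_MAX)

int main(void)
{
	unsigned long mb;

	for (mb = 8; mb <= 256; mb <<= 1) {
		unsigned long pages = mb * 1024 * 1024 / 4096;
		printf("%4luM RAM -> %6lu pages -> max_inodes = %5lu\n",
		       mb, pages, (unsigned long) NR_INODE(pages));
	}
	return 0;
}

An 8M box gets the 1024 inodes Bill mentions setting by hand, a 32M box gets the 4096 from the comment, and the limit saturates at 16384 from 128M upward.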
* Re: More info: 2.1.108 page cache performance on low memory 1998-07-23 10:59 ` Zlatko Calusic 1998-07-23 12:23 ` Stephen C. Tweedie @ 1998-07-23 17:12 ` Stephen C. Tweedie 1998-07-23 17:42 ` Zlatko Calusic 1998-07-23 19:12 ` Dr. Werner Fink 1998-07-23 19:51 ` Rik van Riel 3 siblings, 1 reply; 46+ messages in thread From: Stephen C. Tweedie @ 1998-07-23 17:12 UTC (permalink / raw) To: Zlatko.Calusic; +Cc: Stephen C. Tweedie, Eric W. Biederman, linux-mm

Hi,

On 23 Jul 1998 12:59:38 +0200, Zlatko Calusic <Zlatko.Calusic@CARNet.hr> said:

> As I see it, the page cache seems too persistent (it grows out of
> bounds) when we age pages in it.

> One wrong way of fixing it is to limit page cache size, IMNSHO.

I_my_NSHO, it's an awful way to fix it: adding yet another rule to the
VM is not progress, it's making things worse!

> I tried the other way, to age page cache harder, and it looks like it
> works very well. Patch is simple, so simple that I can't understand
> nobody suggested (something like) it yet.

It has been suggested before, and that's why a lot of people have
reported great success by having page ageing removed: it essentially
lets pages age faster by limiting the number of ageing passes required
to remove a page (in effect it reduces the age value down to the page's
single PG_referenced bit).

And yes, it should work fine.

--Stephen
-- This is a majordomo managed list. To unsubscribe, send a message with the body 'unsubscribe linux-mm me@address' to: majordomo@kvack.org ^ permalink raw reply [flat|nested] 46+ messages in thread
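Stephen's point can be made concrete with a toy model of the two schemes; the constants below (MAX_PAGE_AGE, PAGE_ADVANCE, PAGE_DECLINE) are illustrative assumptions, not the real 2.1 values:

/* Toy model: multi-pass ageing vs. a single referenced bit.
 * Constants are illustrative assumptions, not the 2.1 values. */
#include <stdio.h>

#define MAX_PAGE_AGE 64
#define PAGE_ADVANCE 3
#define PAGE_DECLINE 1

int main(void)
{
	int age = 0, touches;

	/* With ageing: every touch raises the age, and each reclaim
	 * pass drains only PAGE_DECLINE, so a page touched a few
	 * times survives many passes after it goes cold. */
	for (touches = 1; touches <= 8; touches++) {
		age += PAGE_ADVANCE;
		if (age > MAX_PAGE_AGE)
			age = MAX_PAGE_AGE;
		printf("touched %d times -> age %2d -> survives %2d passes\n",
		       touches, age, age / PAGE_DECLINE);
	}

	/* With only PG_referenced: one pass clears the bit, the next
	 * can evict -- at most two passes, whatever the history. */
	printf("referenced bit only -> survives at most 2 passes\n");
	return 0;
}

Removing the ageing passes collapses the whole table to the two-pass case, which is exactly the behaviour people reported as a win on low memory.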
* Re: More info: 2.1.108 page cache performance on low memory 1998-07-23 17:12 ` Stephen C. Tweedie @ 1998-07-23 17:42 ` Zlatko Calusic 0 siblings, 0 replies; 46+ messages in thread From: Zlatko Calusic @ 1998-07-23 17:42 UTC (permalink / raw) To: Stephen C. Tweedie; +Cc: Eric W. Biederman, linux-mm

"Stephen C. Tweedie" <sct@redhat.com> writes:

> Hi,
>
> On 23 Jul 1998 12:59:38 +0200, Zlatko Calusic <Zlatko.Calusic@CARNet.hr>
> said:
>
> > As I see it, the page cache seems too persistent (it grows out of
> > bounds) when we age pages in it.
>
> > One wrong way of fixing it is to limit page cache size, IMNSHO.
>
> I_my_NSHO, it's an awful way to fix it: adding yet another rule to the
> VM is not progress, it's making things worse!
>

Good, we agree. :)

> > I tried the other way, to age page cache harder, and it looks like it
> > works very well. Patch is simple, so simple that I can't understand
> > nobody suggested (something like) it yet.
>
> It has been suggested before, and that's why a lot of people have
> reported great success by having page ageing removed: it essentially
> lets pages age faster by limiting the number of ageing passes required
> to remove a page (in effect it reduces the age value down to the page's
> single PG_referenced bit).
>
> And yes, it should work fine.
>

Yep! Exactly that. If only my English were good enough to explain it as
easily and precisely as you do. :)

As I already said (or at least tried to :)) there's nothing wrong with
the idea of page aging, it's just that the current implementation is not
very good. So I would like page aging to stay, but with my or some
similar change that will make things work well and smoothly.

Thanks to Benjamin, I'm going to download Werner's patch and see how his
idea performs. In a minute. :)

Regards,
--
Posted by Zlatko Calusic E-mail: <Zlatko.Calusic@CARNet.hr>
---------------------------------------------------------------------
If you don't think women are explosive, drop one!
-- This is a majordomo managed list. To unsubscribe, send a message with the body 'unsubscribe linux-mm me@address' to: majordomo@kvack.org ^ permalink raw reply [flat|nested] 46+ messages in thread
* Re: More info: 2.1.108 page cache performance on low memory 1998-07-23 10:59 ` Zlatko Calusic 1998-07-23 12:23 ` Stephen C. Tweedie 1998-07-23 17:12 ` Stephen C. Tweedie @ 1998-07-23 19:12 ` Dr. Werner Fink 1998-07-27 10:40 ` Stephen C. Tweedie 1998-07-23 19:51 ` Rik van Riel 3 siblings, 1 reply; 46+ messages in thread From: Dr. Werner Fink @ 1998-07-23 19:12 UTC (permalink / raw) To: linux-mm

On Thu, Jul 23, 1998 at 12:59:38PM +0200, Zlatko Calusic wrote:
>
> I tried the other way, to age page cache harder, and it looks like it
> works very well. Patch is simple, so simple that I can't understand
> nobody suggested (something like) it yet.
>
>
> --- filemap.c.virgin	Tue Jul 21 18:41:30 1998
> +++ filemap.c	Thu Jul 23 12:14:43 1998
> @@ -171,6 +171,11 @@
>  			touch_page(page);
>  			break;
>  		}
> +		/* Age named pages aggressively, so page cache
> +		 * doesn't grow too fast. -zcalusic
> +		 */
> +		age_page(page);
> +		age_page(page);
>  		age_page(page);
>  		if (page->age)
>  			break;
>

I've something similar ... cut&paste (no tabs) ... which would only do
less graduated ageing on small systems.

-------------------------------------------------------------------------------
diff -urN linux-2.1.110/include/linux/swapctl.h linux/include/linux/swapctl.h
--- linux-2.1.110/include/linux/swapctl.h	Tue Jul 21 02:32:01 1998
+++ linux/include/linux/swapctl.h	Wed Jul 22 18:04:28 1998
@@ -94,12 +94,26 @@
 	return n;
 }
 
+extern int pgcache_max_age;
+extern void do_pgcache_max_age(void);
+
 static inline void touch_page(struct page *page)
 {
-	if (page->age < (MAX_PAGE_AGE - PAGE_ADVANCE))
+	int max_age = MAX_PAGE_AGE;
+
+	if (atomic_read(&page->count) == 1) {
+		static int save_max_age = 0;
+		if (save_max_age != max_age) {
+			save_max_age = max_age;
+			do_pgcache_max_age();
+		}
+		max_age = pgcache_max_age;
+	}
+
+	if (page->age < (max_age - PAGE_ADVANCE))
 		page->age += PAGE_ADVANCE;
 	else
-		page->age = MAX_PAGE_AGE;
+		page->age = max_age;
 }
 
 static inline void age_page(struct page *page)
diff -urN linux-2.1.110/include/linux/swapctl.h linux/include/linux/swapctl.h
--- linux-2.1.110/include/linux/swapctl.h	Tue Jul 21 02:32:01 1998
+++ linux/include/linux/swapctl.h	Wed Jul 22 18:04:28 1998
@@ -94,12 +94,26 @@
 	return n;
 }
 
+extern int pgcache_max_age;
+extern void do_pgcache_max_age(void);
+
 static inline void touch_page(struct page *page)
 {
-	if (page->age < (MAX_PAGE_AGE - PAGE_ADVANCE))
+	int max_age = MAX_PAGE_AGE;
+
+	if (atomic_read(&page->count) == 1) {
+		static int save_max_age = 0;
+		if (save_max_age != max_age) {
+			save_max_age = max_age;
+			do_pgcache_max_age();
+		}
+		max_age = pgcache_max_age;
+	}
+
+	if (page->age < (max_age - PAGE_ADVANCE))
 		page->age += PAGE_ADVANCE;
 	else
-		page->age = MAX_PAGE_AGE;
+		page->age = max_age;
}
 
 static inline void age_page(struct page *page)
-------------------------------------------------------------------------------

Werner
-- This is a majordomo managed list. To unsubscribe, send a message with the body 'unsubscribe linux-mm me@address' to: majordomo@kvack.org ^ permalink raw reply [flat|nested] 46+ messages in thread
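The two patches shrink the same quantity from different ends: Zlatko's drains extra age on each reclaim scan over named pages, while Werner's lowers the ceiling the age can reach for pages held only by the cache (count == 1). A rough back-of-the-envelope comparison, again with assumed constants and an assumed value for the lowered pgcache_max_age:

/* Rough comparison of the two approaches above.  MAX_PAGE_AGE,
 * PAGE_DECLINE and the lowered pgcache_max_age are assumed values. */
#include <stdio.h>

#define MAX_PAGE_AGE 64
#define PAGE_DECLINE 1

int main(void)
{
	int pgcache_max_age = MAX_PAGE_AGE / 4;	/* assumed small-box cap */

	/* Scans a fully-touched, now-cold page survives before its
	 * age reaches 0 and it becomes reclaimable: */
	printf("stock ageing:      %d scans\n", MAX_PAGE_AGE / PAGE_DECLINE);
	printf("triple age_page(): %d scans\n",
	       MAX_PAGE_AGE / (3 * PAGE_DECLINE));
	printf("lowered max age:   %d scans\n",
	       pgcache_max_age / PAGE_DECLINE);
	return 0;
}

Either way a cold page-cache page becomes reclaimable after far fewer scans, which is what keeps the cache from squeezing out process pages; the difference is that Werner's cap can be retuned per machine, while the triple call is compiled in.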
* Re: More info: 2.1.108 page cache performance on low memory 1998-07-23 19:12 ` Dr. Werner Fink @ 1998-07-27 10:40 ` Stephen C. Tweedie 0 siblings, 0 replies; 46+ messages in thread From: Stephen C. Tweedie @ 1998-07-27 10:40 UTC (permalink / raw) To: Dr. Werner Fink; +Cc: linux-mm, Stephen Tweedie Hi Werner, On Thu, 23 Jul 1998 21:12:22 +0200, "Dr. Werner Fink" <werner@suse.de> said: > I've something similar ... cut&paste (no tabs) ... which would only do > less graduated ageing on small systems. > ---------------------------------------------------------------------------- > [patch follows] Interesting, but the patch included just two copies of the diff to swapctl.h and no definition of the new do_pgcache_max_age() function. Could you post a complete patch, please?! Thanks, Stephen -- This is a majordomo managed list. To unsubscribe, send a message with the body 'unsubscribe linux-mm me@address' to: majordomo@kvack.org ^ permalink raw reply [flat|nested] 46+ messages in thread
* Re: More info: 2.1.108 page cache performance on low memory 1998-07-23 10:59 ` Zlatko Calusic ` (2 preceding siblings ...) 1998-07-23 19:12 ` Dr. Werner Fink @ 1998-07-23 19:51 ` Rik van Riel 1998-07-24 11:21 ` Zlatko Calusic 3 siblings, 1 reply; 46+ messages in thread From: Rik van Riel @ 1998-07-23 19:51 UTC (permalink / raw) To: Zlatko Calusic; +Cc: Stephen C. Tweedie, Eric W. Biederman, linux-mm

On 23 Jul 1998, Zlatko Calusic wrote:

> One wrong way of fixing it is to limit page cache size, IMNSHO.
>
> I tried the other way, to age page cache harder, and it looks like it
> works very well. Patch is simple, so simple that I can't understand
> nobody suggested (something like) it yet.

These solutions are somewhat the same, but yours may take a little
less computational power, with the tradeoff that it is very inflexible.

> --- filemap.c.virgin	Tue Jul 21 18:41:30 1998
> +++ filemap.c	Thu Jul 23 12:14:43 1998
> +		age_page(page);
> +		age_page(page);
>  		age_page(page);

> If I put only two age_page()s, there's still too much swapping for my
> taste.

> With three age_page()s, read performance is as expected, and still we
> manage memory more efficiently than without page aging.

This only proves that three age_page()s are a good number for _your_
computer and your workload.

> Comments?

As Stephen put it so nicely when I (in a bad mood) proposed another
artificial limit:
" O no, another arbitrary limit in the kernel! "

And another one of Stephen's wisdoms (heavily paraphrased!):
" Good solutions are dynamic and/or self-tuning "
[Sorry Stephen, this was VERY heavily paraphrased :)]

Rik.
+-------------------------------------------------------------------+
| Linux memory management tour guide. H.H.vanRiel@phys.uu.nl |
| Scouting Vries cubscout leader. http://www.phys.uu.nl/~riel/ |
+-------------------------------------------------------------------+
-- This is a majordomo managed list. To unsubscribe, send a message with the body 'unsubscribe linux-mm me@address' to: majordomo@kvack.org ^ permalink raw reply [flat|nested] 46+ messages in thread
* Re: More info: 2.1.108 page cache performance on low memory 1998-07-23 19:51 ` Rik van Riel @ 1998-07-24 11:21 ` Zlatko Calusic 1998-07-24 14:25 ` Rik van Riel 0 siblings, 1 reply; 46+ messages in thread From: Zlatko Calusic @ 1998-07-24 11:21 UTC (permalink / raw) To: Rik van Riel Cc: Zlatko Calusic, Stephen C. Tweedie, Eric W. Biederman, linux-mm

Rik van Riel <H.H.vanRiel@phys.uu.nl> writes:

> On 23 Jul 1998, Zlatko Calusic wrote:
>
> > One wrong way of fixing it is to limit page cache size, IMNSHO.
> >
> > I tried the other way, to age page cache harder, and it looks like it
> > works very well. Patch is simple, so simple that I can't understand
> > nobody suggested (something like) it yet.
>
> These solutions are somewhat the same, but yours may take
> a little less computational power, with the tradeoff that
> it is very inflexible.

Same? Not in your wildest dream. :)

Limiting means putting an "arbitrary" limit on it. Then the page cache
would NEVER grow above that limit. That's how the buffer cache works at
present. It never grows above circa 30% of installed physical memory.
That means lots of unused memory... I don't like it. Many times, no
matter how heavy the I/O, the last 20MB (for example, but this happens
in many real cases) are free, unused, WASTED. I see that only on two
OSes, NT and recent 2.1.x Linuces. I know I can change that limit in
/proc/sys... but I was always wondering why the default is set so low.

With harder aging you're NOT limiting the size of the page cache. You
just tell the subsystem to be polite, but if you have lots of memory,
that memory will be instantly used by the cache. That's FUNDAMENTALLY
different from limiting.

Triple aging has all the good characteristics of aging. Why do you
think it is inflexible?

> > --- filemap.c.virgin	Tue Jul 21 18:41:30 1998
> > +++ filemap.c	Thu Jul 23 12:14:43 1998
> > +		age_page(page);
> > +		age_page(page);
> >  		age_page(page);
> > If I put only two age_page()s, there's still too much swapping for my
> > taste.
> > With three age_page()s, read performance is as expected, and still we
> > manage memory more efficiently than without page aging.
>
> This only proves that three age_page()s are a good number
> for _your_ computer and your workload.

Could be. So I'd like to see other people's benchmarks. I hope I'm not
the only speed freak around. :)

I will post another, completely different set of benchmarks today.
Under different initial conditions, so as to simulate different
machines and loads.

> > Comments?
>
> As Stephen put it so nicely when I (in a bad mood) proposed
> another artificial limit:
> " O no, another arbitrary limit in the kernel! "

I couldn't agree more. I like sane defaults. And simple solutions, more
than anything.

> And another one of Stephen's wisdoms (heavily paraphrased!):
> " Good solutions are dynamic and/or self-tuning "
> [Sorry Stephen, this was VERY heavily paraphrased :)]

Agreed, but only if that self-tuning does not take more code than the
core functionality itself. :)

I'm very satisfied with the changes (in .109 I think)
free_memory_available() went through. The old function was far too
complicated, not useful at all, and unreadable.

Regards,
--
Posted by Zlatko Calusic E-mail: <Zlatko.Calusic@CARNet.hr>
---------------------------------------------------------------------
File not found. Should I fake it? (Y/N)
-- This is a majordomo managed list.
To unsubscribe, send a message with the body 'unsubscribe linux-mm me@address' to: majordomo@kvack.org ^ permalink raw reply [flat|nested] 46+ messages in thread
* Re: More info: 2.1.108 page cache performance on low memory 1998-07-24 11:21 ` Zlatko Calusic @ 1998-07-24 14:25 ` Rik van Riel 1998-07-24 17:01 ` Zlatko Calusic 0 siblings, 1 reply; 46+ messages in thread From: Rik van Riel @ 1998-07-24 14:25 UTC (permalink / raw) To: Zlatko Calusic; +Cc: Stephen C. Tweedie, Eric W. Biederman, Linux MM

On 24 Jul 1998, Zlatko Calusic wrote:
> Rik van Riel <H.H.vanRiel@phys.uu.nl> writes:
>
> > These solutions are somewhat the same, but yours may take
> > a little less computational power, with the tradeoff that
> > it is very inflexible.
>
> Same? Not in your wildest dream. :)
>
> Limiting means putting an "arbitrary" limit on it. Then the page cache
> would NEVER grow above that limit.

There's also a 'soft limit', or borrow percentage. Ultimately the
minimum and maximum percentages should be 0 and 100% respectively.

> Triple aging has all the good characteristics of aging.
> Why do you think it is inflexible?

Because there's no way to tune the 'priority' of the page aging.
It could be good to do triple aging, but it could be a non-optimal
number on other machines ... and there's no way to get out of it!

> I will post another, completely different set of benchmarks today.
> Under different initial conditions, so as to simulate different
> machines and loads.

Good, I like this. You will probably get somewhat different results
with this...

Oh, and changing the code to:

	int i;
	for (i = page_cache_penalty; i--; )
		age_page(page);

and making page_cache_penalty sysctl tunable will certainly make your
tests easier...

> I'm very satisfied with the changes (in .109 I think)
> free_memory_available() went through. The old function was far too
> complicated, not useful at all, and unreadable.

It _was_ useful; it has always been useful to test for the amount of
memory fragmentation.

In fact, Linus himself said (when free_memory_available() was
introduced in 2.1.89) that he would not accept any function which used
the amount of free pages.

After some protests (by me) Linus managed to explain to us exactly
_why_ we should test for fragmentation; I suggest we all go through the
archives again and reread the arguments...

Rik.
+-------------------------------------------------------------------+
| Linux memory management tour guide. H.H.vanRiel@phys.uu.nl |
| Scouting Vries cubscout leader. http://www.phys.uu.nl/~riel/ |
+-------------------------------------------------------------------+
-- This is a majordomo managed list. To unsubscribe, send a message with the body 'unsubscribe linux-mm me@address' to: majordomo@kvack.org ^ permalink raw reply [flat|nested] 46+ messages in thread
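A minimal sketch of the sysctl hookup Rik is hinting at; the ctl_name value, the variable, and its placement are all hypothetical, and the ctl_table layout is written from memory of the 2.1-era interface, so treat this as an illustration rather than a drop-in patch:

/* Hypothetical sketch -- names and numbers invented for illustration. */
int page_cache_penalty = 3;	/* Zlatko's triple aging as the default */

/* entry added to vm_table[] in kernel/sysctl.c (assumed 2.1-era layout): */
	{VM_PAGECACHE_PENALTY, "pagecache_penalty", &page_cache_penalty,
	 sizeof(int), 0644, NULL, &proc_dointvec},

/* and in the scan loop in mm/filemap.c, replacing the fixed calls: */
	{
		int i;
		for (i = page_cache_penalty; i--; )
			age_page(page);
	}

With that in place, echo 1 >/proc/sys/vm/pagecache_penalty would restore the stock behaviour without a recompile, and testers could try different penalties on the fly.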
* Re: More info: 2.1.108 page cache performance on low memory 1998-07-24 14:25 ` Rik van Riel @ 1998-07-24 17:01 ` Zlatko Calusic 1998-07-24 21:55 ` Rik van Riel 0 siblings, 1 reply; 46+ messages in thread From: Zlatko Calusic @ 1998-07-24 17:01 UTC (permalink / raw) To: Rik van Riel; +Cc: Stephen C. Tweedie, Eric W. Biederman, Linux MM

Rik van Riel <H.H.vanRiel@phys.uu.nl> writes:

> On 24 Jul 1998, Zlatko Calusic wrote:
> > Rik van Riel <H.H.vanRiel@phys.uu.nl> writes:
> >
> > > These solutions are somewhat the same, but yours may take
> > > a little less computational power, with the tradeoff that
> > > it is very inflexible.
> >
> > Same? Not in your wildest dream. :)
> >
> > Limiting means putting an "arbitrary" limit on it. Then the page cache
> > would NEVER grow above that limit.
>
> There's also a 'soft limit', or borrow percentage. Ultimately
> the minimum and maximum percentages should be 0 and 100%
> respectively.

Could you elaborate on the "borrow" percentage? I have some trouble
understanding what that could be.

> > Triple aging has all the good characteristics of aging.
> > Why do you think it is inflexible?
>
> Because there's no way to tune the 'priority' of the page aging.
> It could be good to do triple aging, but it could be a non-optimal
> number on other machines ... and there's no way to get out of it!

Yes, you're right here. See below...

> > I will post another, completely different set of benchmarks today.
> > Under different initial conditions, so as to simulate different
> > machines and loads.
>
> Good, I like this. You will probably get somewhat different
> results with this...
>
> Oh, and changing the code to:
>
> 	int i;
> 	for (i = page_cache_penalty; i--; )
> 		age_page(page);
>
> and making page_cache_penalty sysctl tunable will certainly
> make your tests easier...

Yes, I wanted to do something like this, but then again, I was too lazy
to further complicate things. So I was just recompiling the kernel and
rebooting (to do testing), since only one file (filemap.c) was really
recompiled and the whole operation took no more than a few minutes. :)
Code like that is easy to put in the kernel, but only if people think it
would be a good idea. And then the final question remains: what should
the default value be?

But I also think that too many configurable parameters cause trouble
too. If you have 100 variables to configure one subsystem in the
kernel, where do you start? I like solutions that work well by
themselves. Autotuning. With not too much logic in them. :)

> > I'm very satisfied with the changes (in .109 I think)
> > free_memory_available() went through. The old function was far too
> > complicated, not useful at all, and unreadable.
>
> It _was_ useful; it has always been useful to test for the
> amount of memory fragmentation.

Whoops, here I don't share your opinion. Checking memory fragmentation
and then acting accordingly (in kswapd) seems like a good idea, but,
unfortunately, I am now pretty sure it is NOT. And there is one and only
one reason: throwing pages out of memory at random (blindly). You know
it, too.

I came to this conclusion many months ago, with my first patch, which
aimed to solve the fragmentation problem. My first idea was to make
sure we have at least one 128KB chunk. It ended with many lockups and
kswapd deadlocks. Then I tried to make a few 16KB chunks available and
performance still sucked. To get a few 16KB chunks the system would
happily swap out my whole memory. Thanks, not again.
I used it for a while, only to prevent network lockups.

The old (<= 2.1.108) free_memory_available() was practically that, but
with a limit applied, which effectively worked like: "Oh, no, memory
fragmented, swap out, swap out, oh no, too much swapped out, never mind
fragmentation, stop swapping." So it didn't work. And it was definitely
overcomplicated. Obviously, everybody tried hard to do the right thing,
where the right thing could not be done. Wrong place to search for a
solution.

Stephen's new patch looks promising. It has some new logic in it which
hasn't been tried before. I already tested it, and the results are not
bad. But I can't say it is the final solution either, since I can still
easily produce a memory shortage with many simultaneous network
connections, even on an unloaded 64MB machine. So, lots of work to be
done for 2.4. :)

> In fact, Linus himself said (when free_memory_available()
> was introduced in 2.1.89) that he would not accept any
> function which used the amount of free pages.
>
> After some protests (by me) Linus managed to explain to us
> exactly _why_ we should test for fragmentation; I suggest
> we all go through the archives again and reread the arguments...

Yeah, I remember. That was the time I started patching my kernels with
every new release. That was the time I went for another 32MB to solve my
problems. :(

I'm lagging far behind on the linux-kernel list (~3000 posts) and it
seems like I missed some good discussion about Linux MM (I read about it
on http://lwn.net/). Now I hope I can still catch up on it all, and then
spend some time testing and coding. :)

Regards,
--
Posted by Zlatko Calusic E-mail: <Zlatko.Calusic@CARNet.hr>
---------------------------------------------------------------------
P.S. That Linux-MM page you're doing kicks ass. I just never had the
opportunity to tell you that I really like it. :)
-- This is a majordomo managed list. To unsubscribe, send a message with the body 'unsubscribe linux-mm me@address' to: majordomo@kvack.org ^ permalink raw reply [flat|nested] 46+ messages in thread
* Re: More info: 2.1.108 page cache performance on low memory 1998-07-24 17:01 ` Zlatko Calusic @ 1998-07-24 21:55 ` Rik van Riel 1998-07-25 13:05 ` Zlatko Calusic 1998-07-27 10:54 ` Stephen C. Tweedie 0 siblings, 2 replies; 46+ messages in thread From: Rik van Riel @ 1998-07-24 21:55 UTC (permalink / raw) To: Zlatko Calusic; +Cc: Stephen C. Tweedie, Eric W. Biederman, Linux MM

On 24 Jul 1998, Zlatko Calusic wrote:

> > There's also a 'soft limit', or borrow percentage. Ultimately
> > the minimum and maximum percentages should be 0 and 100%
> > respectively.
>
> Could you elaborate on the "borrow" percentage? I have some trouble
> understanding what that could be.

It's an idea I stole from Digital Unix :)

Basically, the cache is allowed to grow boundlessly, but when memory is
short it is reclaimed until it reaches the borrow percentage.

The philosophy behind it is that caching the disk doesn't make much
sense beyond a certain point.

It's a primitive idea, but it seems to have saved Andrea's machine
quite well (with the additional patch).

I admit your patch (multiple aging) should work even better, but in
order to do that, we probably want to make it auto-tuning on the borrow
percentage:

- if page_cache_size > borrow + 5% --> add aging loop
- if loads_of_disk_io and almost thrashing [*] --> remove aging loop

[*] this thrashing can be measured by testing the cache hit/miss
rate; if it falls below (say) 50% we could consider it thrashing.

(50% should be a good rate for an aging cache, and the number of loops
is trimmed quickly enough when we grow anyway. This could make a nice,
somewhat self-adjusting trimming mechanism. Expect a patch soon...)

Rik.
+-------------------------------------------------------------------+
| Linux memory management tour guide. H.H.vanRiel@phys.uu.nl |
| Scouting Vries cubscout leader. http://www.phys.uu.nl/~riel/ |
+-------------------------------------------------------------------+
-- This is a majordomo managed list. To unsubscribe, send a message with the body 'unsubscribe linux-mm me@address' to: majordomo@kvack.org ^ permalink raw reply [flat|nested] 46+ messages in thread
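As described, 'borrow' is not a ceiling but a reclaim target. A minimal stand-alone sketch of the decision, with a hypothetical percentage and helper name (not code from any kernel):

/* Sketch of the 'borrow' idea described above -- the name and the
 * percentage are hypothetical. */

#define PAGE_CACHE_BORROW_PCT 15	/* assumed soft target */

/* Called only when free memory is short: should reclaim prefer
 * page-cache pages?  There is no hard ceiling -- with memory to
 * spare the cache may grow toward 100% -- but under shortage we
 * shrink it back toward the borrow percentage. */
static int cache_over_borrow(unsigned long page_cache_size,
			     unsigned long num_physpages)
{
	unsigned long borrow = num_physpages * PAGE_CACHE_BORROW_PCT / 100;

	return page_cache_size > borrow;
}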
* Re: More info: 2.1.108 page cache performance on low memory 1998-07-24 21:55 ` Rik van Riel @ 1998-07-25 13:05 ` Zlatko Calusic 1998-07-27 10:54 ` Stephen C. Tweedie 0 siblings, 0 replies; 46+ messages in thread From: Zlatko Calusic @ 1998-07-25 13:05 UTC (permalink / raw) To: Rik van Riel; +Cc: Stephen C. Tweedie, Eric W. Biederman, Linux MM

Rik van Riel <H.H.vanRiel@phys.uu.nl> writes:

> On 24 Jul 1998, Zlatko Calusic wrote:
>
> > > There's also a 'soft limit', or borrow percentage. Ultimately
> > > the minimum and maximum percentages should be 0 and 100%
> > > respectively.
> >
> > Could you elaborate on the "borrow" percentage? I have some trouble
> > understanding what that could be.
>
> It's an idea I stole from Digital Unix :)
>
> Basically, the cache is allowed to grow boundlessly, but when memory
> is short it is reclaimed until it reaches the borrow percentage.

OK, I get it now. Looks good.

> The philosophy behind it is that caching the disk doesn't make much
> sense beyond a certain point.

I mostly agree.

> It's a primitive idea, but it seems to have saved Andrea's
> machine quite well (with the additional patch).
>
> I admit your patch (multiple aging) should work even better,
> but in order to do that, we probably want to make it auto-tuning
> on the borrow percentage:
>
> - if page_cache_size > borrow + 5% --> add aging loop
> - if loads_of_disk_io and almost thrashing [*] --> remove aging loop

Yes, something like this could be worthwhile. I observed some strange
patterns of behaviour with the aging loop: sometimes the system is
still too aggressive, and sometimes you can't tell if it's working at
all. Probably some debugging and profiling code should be added to see
what's going on there.

> [*] this thrashing can be measured by testing the cache hit/miss
> rate; if it falls below (say) 50% we could consider it thrashing.

That probably wouldn't work as well as you expect. The problem is
again that arbitrary 50%. I had code in the kernel that reported the
buffer/page cache hit ratio and was surprised that for both caches it
was > 90%. And that was on a 5MB machine. Can you imagine? :)

> (50% should be a good rate for an aging cache, and the number of
> loops is trimmed quickly enough when we grow anyway. This could
> make a nice, somewhat self-adjusting trimming mechanism. Expect
> a patch soon...)

I'll be glad to test a patch, but I'm not that convinced that this is
really a good idea. But then again, I have nothing against it.

Keep up the good work!
--
Posted by Zlatko Calusic E-mail: <Zlatko.Calusic@CARNet.hr>
---------------------------------------------------------------------
Crime doesn't pay... does that mean my job is a crime?
-- This is a majordomo managed list. To unsubscribe, send a message with the body 'unsubscribe linux-mm me@address' to: majordomo@kvack.org ^ permalink raw reply [flat|nested] 46+ messages in thread
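For reference, the kind of instrumentation Zlatko mentions amounts to a pair of counters in the lookup path; this sketch uses hypothetical names and placement, not his actual patch:

/* Hypothetical hit/miss counters for the page-cache lookup path. */
static unsigned long pgcache_hits, pgcache_misses;

/* in the lookup (e.g. where filemap.c searches the cache for a page): */
	if (page) {
		pgcache_hits++;		/* found in the page cache */
		/* ... use the cached page ... */
	} else {
		pgcache_misses++;	/* must read from disk */
		/* ... allocate a page and start the read ... */
	}

/* hit ratio = pgcache_hits * 100 / (pgcache_hits + pgcache_misses),
 * which could be reported via a /proc file or a printk. */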
* Re: More info: 2.1.108 page cache performance on low memory 1998-07-24 21:55 ` Rik van Riel 1998-07-25 13:05 ` Zlatko Calusic @ 1998-07-27 10:54 ` Stephen C. Tweedie 1 sibling, 0 replies; 46+ messages in thread From: Stephen C. Tweedie @ 1998-07-27 10:54 UTC (permalink / raw) To: Rik van Riel Cc: Zlatko Calusic, Stephen C. Tweedie, Eric W. Biederman, Linux MM

Hi,

On Fri, 24 Jul 1998 23:55:10 +0200 (CEST), Rik van Riel
<H.H.vanRiel@phys.uu.nl> said:

> I admit your patch (multiple aging) should work even better,
> but in order to do that, we probably want to make it auto-tuning
> on the borrow percentage:

> - if page_cache_size > borrow + 5% --> add aging loop

<Bzzt> wrong answer...

> - if loads_of_disk_io and almost thrashing [*] --> remove aging loop

Yep, much better.

> [*] this thrashing can be measured by testing the cache hit/miss
> rate; if it falls below (say) 50% we could consider it thrashing.

Adding even more rules based on the actual cache size is a bad thing,
since it enforces an arbitrary limit which does not depend on what the
system load is right now. Making it adapt to the current load is ALWAYS
going to be a better way of doing things.

--Stephen
-- This is a majordomo managed list. To unsubscribe, send a message with the body 'unsubscribe linux-mm me@address' to: majordomo@kvack.org ^ permalink raw reply [flat|nested] 46+ messages in thread
end of thread, other threads:[~1998-08-20 14:30 UTC | newest] Thread overview: 46+ messages (download: mbox.gz / follow: Atom feed) -- links below jump to the message on this page -- 1998-07-13 16:53 More info: 2.1.108 page cache performance on low memory Stephen C. Tweedie 1998-07-13 18:08 ` Eric W. Biederman 1998-07-13 18:29 ` Zlatko Calusic 1998-07-14 17:32 ` Stephen C. Tweedie 1998-07-16 12:31 ` Zlatko Calusic 1998-07-14 17:30 ` Stephen C. Tweedie 1998-07-18 1:10 ` Eric W. Biederman 1998-07-18 13:28 ` Zlatko Calusic 1998-07-18 16:40 ` Eric W. Biederman 1998-07-20 9:15 ` Zlatko Calusic 1998-07-22 10:40 ` Stephen C. Tweedie 1998-07-23 10:06 ` Zlatko Calusic 1998-07-23 12:22 ` Stephen C. Tweedie 1998-07-23 14:07 ` Zlatko Calusic 1998-07-23 17:18 ` Stephen C. Tweedie 1998-07-23 19:33 ` Zlatko Calusic 1998-07-27 10:57 ` Stephen C. Tweedie 1998-07-26 14:49 ` Eric W Biederman 1998-07-27 11:02 ` Stephen C. Tweedie 1998-08-02 5:19 ` Eric W Biederman 1998-08-17 13:57 ` Stephen C. Tweedie 1998-08-17 15:35 ` Stephen C. Tweedie 1998-08-20 12:40 ` Eric W. Biederman 1998-07-20 15:58 ` Stephen C. Tweedie 1998-07-22 10:36 ` Stephen C. Tweedie 1998-07-22 18:01 ` Rik van Riel 1998-07-23 10:59 ` Stephen C. Tweedie 1998-07-22 10:33 ` Stephen C. Tweedie 1998-07-23 10:59 ` Zlatko Calusic 1998-07-23 12:23 ` Stephen C. Tweedie 1998-07-23 15:06 ` Zlatko Calusic 1998-07-23 15:17 ` Benjamin C.R. LaHaise 1998-07-23 15:25 ` Zlatko Calusic 1998-07-23 17:27 ` Benjamin C.R. LaHaise 1998-07-23 19:17 ` Dr. Werner Fink 1998-07-23 17:12 ` Stephen C. Tweedie 1998-07-23 17:42 ` Zlatko Calusic 1998-07-23 19:12 ` Dr. Werner Fink 1998-07-27 10:40 ` Stephen C. Tweedie 1998-07-23 19:51 ` Rik van Riel 1998-07-24 11:21 ` Zlatko Calusic 1998-07-24 14:25 ` Rik van Riel 1998-07-24 17:01 ` Zlatko Calusic 1998-07-24 21:55 ` Rik van Riel 1998-07-25 13:05 ` Zlatko Calusic 1998-07-27 10:54 ` Stephen C. Tweedie
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox