* Re: Looking for better VM
From: Szabolcs Szakacsits @ 2000-11-08 11:34 UTC
To: Rik van Riel; +Cc: linux-kernel, linux-mm, Linus Torvalds, Ingo Molnar

On Mon, 6 Nov 2000, Rik van Riel wrote:
> On Mon, 6 Nov 2000, Szabolcs Szakacsits wrote:
> > On Wed, 1 Nov 2000, Rik van Riel wrote:
> > > but simply because
> > > it appears there has been amazingly little research on this
> > > subject and it's completely unknown which approach will work
> > There has been a lot of research, this is the reason most Unices support
> > both non-overcommit and overcommit memory handling and default to
> > non-overcommit [think of reliability and high availability].
> It's a shame you didn't take the trouble to actually
> go out and see that non-overcommit doesn't solve the
> "out of memory" deadlock problem.

Read my *entire* email again and please try to understand. No deadlock
at all, since the kernel *falls back* to process killing if the memory
reserved for *root* is also out.

You could ask, so what's the point of non-overcommit if we use
process killing in the end? And the answer: in *practice* this almost
never happens, root can always clean up and no processes are lost
[just as when the disk is "full" except for the area reserved for root].
See? Humans get a chance against hard-wired AI.

I also didn't say non-overcommit should be used as the default, and a
patch, http://www.cs.helsinki.fi/linux/linux-kernel/2000-13/1208.html,
developed for 2.3.99-pre3 by Eduardo Horvath and unfortunately
ignored completely, implemented it this way.

And with a runtime tunable OOM killer, Linux really would beat the
competitors [where it is quite behind at present] in this area. See?
Humans get a chance against hard-wired AI again.

Believe me, there are people [don't read only kernel lists] who want a
reliable and controllable system where the kernel doesn't play
Russian roulette. [For those who missed my first email: forget about mem
quotas and the non-scalable "add GB's of swap" in this discussion.]

> [if you want an explanation, look in the archives,
> we've explained this a dozen times now]

I've been reading the list much longer than you and am really pissed off
that after so many years of discussions, this problem and user
requirements^Wwishes are still not understood. You think in black and
white but the world is colorful.

	Szaka

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.
For more info on Linux MM, see: http://www.linux.eu.org/Linux-MM/
* Re: Looking for better VM
From: Rik van Riel @ 2000-11-08 13:53 UTC
To: Szabolcs Szakacsits; +Cc: linux-kernel, linux-mm, Linus Torvalds, Ingo Molnar

On Wed, 8 Nov 2000, Szabolcs Szakacsits wrote:
> On Mon, 6 Nov 2000, Rik van Riel wrote:
> > On Mon, 6 Nov 2000, Szabolcs Szakacsits wrote:
> > > On Wed, 1 Nov 2000, Rik van Riel wrote:
> > > > but simply because
> > > > it appears there has been amazingly little research on this
> > > > subject and it's completely unknown which approach will work
> > > There has been a lot of research, this is the reason most Unices support
> > > both non-overcommit and overcommit memory handling and default to
> > > non-overcommit [think of reliability and high availability].
> > It's a shame you didn't take the trouble to actually
> > go out and see that non-overcommit doesn't solve the
> > "out of memory" deadlock problem.
>
> Read my *entire* email again and please try to understand. No deadlock
> at all, since the kernel *falls back* to process killing if the memory
> reserved for *root* is also out.
>
> You could ask, so what's the point of non-overcommit if we use
> process killing in the end? And the answer: in *practice* this almost
> never happens, root can always clean up and no processes are lost
> [just as when the disk is "full" except for the area reserved for root].
> See? Humans get a chance against hard-wired AI.
>
> I also didn't say non-overcommit should be used as the default, and a
> patch, http://www.cs.helsinki.fi/linux/linux-kernel/2000-13/1208.html,
> developed for 2.3.99-pre3 by Eduardo Horvath and unfortunately
> ignored completely, implemented it this way.

OK. This is a lot more reasonable. I'm actually looking
into putting non-overcommit as a configurable option in
the kernel.

However, this does not save you from the fact that the
system is essentially deadlocked when nothing can get
more memory and nothing goes away. Non-overcommit won't
give you any extra reliability unless your applications
are very well behaved ... in which case you don't need
non-overcommit.

regards,

Rik
--
The Internet is not a network of computers. It is a network
of people. That is its real strength.

http://www.conectiva.com/		http://www.surriel.com/
* Re: Looking for better VM
From: Mikulas Patocka @ 2000-11-08 16:36 UTC
To: Rik van Riel
Cc: Szabolcs Szakacsits, linux-kernel, linux-mm, Linus Torvalds, Ingo Molnar

Hi.

> > I also didn't say non-overcommit should be used as the default, and a
> > patch, http://www.cs.helsinki.fi/linux/linux-kernel/2000-13/1208.html,
> > developed for 2.3.99-pre3 by Eduardo Horvath and unfortunately
> > ignored completely, implemented it this way.
>
> OK. This is a lot more reasonable. I'm actually looking
> into putting non-overcommit as a configurable option in
> the kernel.
>
> However, this does not save you from the fact that the
> system is essentially deadlocked when nothing can get
> more memory and nothing goes away. Non-overcommit won't
> give you any extra reliability unless your applications
> are very well behaved ... in which case you don't need
> non-overcommit.

BTW. Why does your OOM killer in 2.4 try to kill the process that mmapped
the most memory? mmap is harmless. mmap on files can't eat memory and swap.

Imagine a case: you have a database server that mmaps a whole 2G file but
doesn't have much anonymous memory. You have an offending process that
does while (1) malloc(1000) and fills up 512M of swap. Your OOM killer
would kill the server first...

Mikulas
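The database-server scenario above can be made concrete with a small user-space sketch. It is not the 2.4 OOM killer code: `struct proc_mem`, `badness_total()`, `badness_anon()` and `pick_victim()` are hypothetical names invented for illustration, and the only assumption is that a killer scores each process by a page count and kills the highest scorer.

```c
#include <stddef.h>

/* Hypothetical per-process accounting in pages; not a kernel structure. */
struct proc_mem {
	const char *name;
	unsigned long anon_pages;	/* malloc'ed memory: needs RAM or swap */
	unsigned long file_pages;	/* file-backed mmap: can be dropped or written back */
};

/* Score by everything that is mapped (the behaviour being complained about). */
unsigned long badness_total(const struct proc_mem *p)
{
	return p->anon_pages + p->file_pages;
}

/* Score by anonymous memory only; file-backed mappings are ignored. */
unsigned long badness_anon(const struct proc_mem *p)
{
	return p->anon_pages;
}

/* Return the index of the process with the highest score (first one wins ties). */
int pick_victim(const struct proc_mem *procs, int n,
		unsigned long (*badness)(const struct proc_mem *))
{
	int i, victim = 0;

	for (i = 1; i < n; i++)
		if (badness(&procs[i]) > badness(&procs[victim]))
			victim = i;
	return victim;
}
```

With 4K pages, a server mapping a 2G file (524288 file pages, few anonymous pages) and a leaker holding 512M of anonymous memory (131072 pages), `badness_total` selects the server while `badness_anon` selects the leaker. Shared memory segments would need their own treatment and are left out of this sketch.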
* Re: Looking for better VM
From: Christoph Rohland @ 2000-11-08 17:03 UTC
To: Mikulas Patocka
Cc: Rik van Riel, Szabolcs Szakacsits, linux-kernel, linux-mm, Linus Torvalds, Ingo Molnar

Hi Mikulas,

On Wed, 8 Nov 2000, Mikulas Patocka wrote:
> BTW. Why does your OOM killer in 2.4 try to kill the process that mmapped
> the most memory? mmap is harmless. mmap on files can't eat memory and
> swap.

Be careful: They may have shm segments mmapped!

Greetings
		Christoph
* Re: Looking for better VM
From: Ingo Oeser @ 2000-11-08 20:52 UTC
To: Mikulas Patocka
Cc: Rik van Riel, Szabolcs Szakacsits, linux-kernel, linux-mm, Linus Torvalds, Ingo Molnar

On Wed, Nov 08, 2000 at 05:36:40PM +0100, Mikulas Patocka wrote:
> BTW. Why does your OOM killer in 2.4 try to kill the process that mmapped
> the most memory? mmap is harmless. mmap on files can't eat memory and swap.

Don't complain, build your own and test it ;-)

Apply my patch

   http://www.tu-chemnitz.de/~ioe/oom_kill_api.patch

and install your own OOM handler using install_oom_killer() from
<linux/swap.h>. It has all the needed documentation inline, which
will be built along with the kernel-api-book.

Have fun researching in this area.

PS: Applies cleanly since oom_kill.c exists and also against
   2.4.0-test11-pre1.

Regards

Ingo Oeser
--
To the systems programmer, users and applications
serve only to provide a test load.
<esc>:x
* Re: Looking for better VM
From: Rik van Riel @ 2000-11-09 0:08 UTC
To: Mikulas Patocka
Cc: Szabolcs Szakacsits, linux-kernel, linux-mm, Linus Torvalds, Ingo Molnar

On Wed, 8 Nov 2000, Mikulas Patocka wrote:

> BTW. Why does your OOM killer in 2.4 try to kill the process that mmapped
> the most memory? mmap is harmless. mmap on files can't eat memory and
> swap.

Because the thing is too stupid to take that into
consideration? :)

Btw, if your mmap()ed file still takes 1GB of memory,
you have 1GB of freeable memory left and you shouldn't
be out of memory ... or should you??

regards,

Rik
--
The Internet is not a network of computers. It is a network
of people. That is its real strength.

http://www.conectiva.com/		http://www.surriel.com/
* [PATCH] Reserve VM for root (was: Re: Looking for better VM)
From: Szabolcs Szakacsits @ 2000-11-09 17:30 UTC
To: Rik van Riel; +Cc: linux-kernel, linux-mm, Linus Torvalds, Ingo Molnar

On Wed, 8 Nov 2000, Rik van Riel wrote:

> OK. This is a lot more reasonable.

Just the same as what was in my first email.

> I'm actually looking into putting non-overcommit as a configurable
> option in the kernel.

Nice to hear, please make it a boot time option, not a compile time
one. Also a control for what percentage the kernel can overcommit would
be nice -- this is how modern Unices do it.

> However, this does not save you from the fact that the
> system is essentially deadlocked when nothing can get
> more memory and nothing goes away.

I've also never said the OOM killer should be disabled. In theory
non-overcommitting systems deadlock and Linux survives. Ironically, it's
usually just the opposite in practice. Any user can deadlock/crash
Linux [default install, no quotas] but not a non-overcommitting system
[root can clean up].

Here is example code "simulating" a leaking daemon that will
"deadlock" Linux even with your OOM killer patch [which is anyway *MUCH*
better than the effectively non-existent one in 2.2.x kernels]:

#include <stdlib.h>
#include <unistd.h>
/* every process keeps forking; each parent leaks one allocation per fork */
int main(void) { while (1) if (fork()) malloc(1); }

With the patch below I could ssh to the host and killall the offending
processes. To enable reserving VM space for root do

echo -1 > /proc/sys/vm/overcommit_memory

The number of reserved pages can be tuned via /proc/sys/vm/reserved,
default is 5% of the RAM (note, RAM won't be reserved, but VM).

BTW, I wanted to take a look at the frequently mentioned beancounter patch,
here is the current state,
	http://www.asp-linux.com/en/products/ubpatch.shtml
"Sorry, due to growing expenses for support of public version of ASPcomplete
we do not provide sources till first official release."

	Szaka

diff -ur linux.orig/include/linux/sysctl.h linux/include/linux/sysctl.h
--- linux.orig/include/linux/sysctl.h	Thu Nov  9 08:20:19 2000
+++ linux/include/linux/sysctl.h	Thu Nov  9 06:30:11 2000
@@ -122,7 +122,8 @@
 	VM_PAGECACHE=7,		/* struct: Set cache memory thresholds */
 	VM_PAGERDAEMON=8,	/* struct: Control kswapd behaviour */
 	VM_PGT_CACHE=9,		/* struct: Set page table cache parameters */
-	VM_PAGE_CLUSTER=10	/* int: set number of pages to swap together */
+	VM_PAGE_CLUSTER=10,	/* int: set number of pages to swap together */
+	VM_RESERVED=11		/* int: number of pages reserved for root */
 };
 
diff -ur linux.orig/kernel/sysctl.c linux/kernel/sysctl.c
--- linux.orig/kernel/sysctl.c	Thu Nov  9 08:20:19 2000
+++ linux/kernel/sysctl.c	Thu Nov  9 06:27:33 2000
@@ -37,6 +37,7 @@
 extern int bdf_prm[], bdflush_min[], bdflush_max[];
 extern char binfmt_java_interpreter[], binfmt_java_appletviewer[];
 extern int sysctl_overcommit_memory;
+extern int vm_reserved;
 extern int nr_queued_signals, max_queued_signals;
 
 #ifdef CONFIG_KMOD
@@ -259,6 +260,8 @@
 	 &pgt_cache_water, 2*sizeof(int), 0600, NULL, &proc_dointvec},
 	{VM_PAGE_CLUSTER, "page-cluster",
 	 &page_cluster, sizeof(int), 0600, NULL, &proc_dointvec},
+	{VM_RESERVED, "reserved",
+	 &vm_reserved, sizeof(int), 0600, NULL, &proc_dointvec},
 	{0}
 };
 
diff -ur linux.orig/mm/mmap.c linux/mm/mmap.c
--- linux.orig/mm/mmap.c	Thu Nov  9 08:20:19 2000
+++ linux/mm/mmap.c	Thu Nov  9 08:17:10 2000
@@ -40,6 +40,7 @@
 kmem_cache_t *vm_area_cachep;
 
 int sysctl_overcommit_memory;
+int vm_reserved;
 
 /* Check that a process has enough memory to allocate a
  * new virtual mapping.
@@ -59,7 +60,7 @@
 	long free;
 
 	/* Sometimes we want to use more memory than we have. */
-	if (sysctl_overcommit_memory)
+	if (sysctl_overcommit_memory == 1)
 	    return 1;
 
 	free = buffermem >> PAGE_SHIFT;
@@ -67,6 +68,8 @@
 	free += nr_free_pages;
 	free += nr_swap_pages;
 	free -= (page_cache.min_percent + buffer_mem.min_percent + 2)*num_physpages/100;
+	if (sysctl_overcommit_memory == -1 && current->uid && free < vm_reserved)
+		return 0;
 	return free > pages;
 }
 
@@ -872,6 +875,11 @@
 
 void __init vma_init(void)
 {
+	struct sysinfo i;
+
+	si_meminfo(&i);
+	vm_reserved = (i.totalram >> PAGE_SHIFT) / 20;
+
 	vm_area_cachep = kmem_cache_create("vm_area_struct",
 					   sizeof(struct vm_area_struct),
 					   0, SLAB_HWCACHE_ALIGN,
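The decision the patch adds to vm_enough_memory() can be followed more easily in a user-space model. The sketch below assumes the three values of overcommit_memory behave as in the mm/mmap.c hunks (1 always succeeds, -1 holds pages back for root, anything else uses the existing heuristic); `struct vm_state`, `default_reserve()` and `vm_enough_memory_model()` are hypothetical names, and the `free` estimate is passed in rather than computed from kernel counters.

```c
#include <stdbool.h>

/* Values of /proc/sys/vm/overcommit_memory as the patch interprets them. */
enum {
	OVERCOMMIT_HEURISTIC    = 0,	/* plain "free > pages" check */
	OVERCOMMIT_ALWAYS       = 1,	/* every request succeeds */
	OVERCOMMIT_RESERVE_ROOT = -1	/* heuristic plus a reserve for root */
};

/* Hypothetical snapshot of the inputs; all counts are in pages. */
struct vm_state {
	int  overcommit_memory;
	long free;		/* the kernel's estimate of available VM */
	long vm_reserved;	/* pages held back for root */
};

/* Default reserve: 5% of RAM, mirroring the vma_init() hunk. */
long default_reserve(long total_ram_pages)
{
	return total_ram_pages / 20;
}

/* Model of the patched check: may this uid map `pages` more pages? */
bool vm_enough_memory_model(const struct vm_state *vm, unsigned int uid, long pages)
{
	if (vm->overcommit_memory == OVERCOMMIT_ALWAYS)
		return true;

	/* Non-root callers are refused once the estimate dips into the reserve. */
	if (vm->overcommit_memory == OVERCOMMIT_RESERVE_ROOT &&
	    uid != 0 && vm->free < vm->vm_reserved)
		return false;

	return vm->free > pages;
}
```

As in the patch, the reserve test compares the current estimate with the reserve and does not subtract the size of the request, so a single large request from a non-root user can still eat into the reserved area. On a machine with 32768 pages of RAM the default reserve is 1638 pages; with 1000 pages left, a uid 500 request for 10 pages is refused while the same request from root succeeds.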
* Re: Reserve VM for root (was: Re: Looking for better VM)
From: Andrey Savochkin @ 2000-11-10 10:38 UTC
To: Szabolcs Szakacsits
Cc: linux-kernel, linux-mm, Linus Torvalds, Ingo Molnar, Rik van Riel

Hello,

On Thu, Nov 09, 2000 at 06:30:32PM +0100, Szabolcs Szakacsits wrote:
> BTW, I wanted to take a look at the frequently mentioned beancounter patch,
> here is the current state,
> 	http://www.asp-linux.com/en/products/ubpatch.shtml
> "Sorry, due to growing expenses for support of public version of ASPcomplete
> we do not provide sources till first official release."

That's not a place where I keep my code (and has never been :-)
ftp://ftp.sw.com.sg/pub/Linux/people/saw/kernel/user_beancounter/UserBeancounter.html
is the right place (but it has some availability problems :-(

As for memory management, it provides a simple variant of service level
support for
 - in-core memory (as opposed to swap)
 - total "virtual" memory.
The latter ends up in accounting of how much memory is consumed by each
subject of accounting, and an OOM-killer.
The OOM-killer takes into account the guarantees given to each subject when
it selects the victim. In the patch on the ftp site the selection code is
very simple and taken from some old OOM patches.

BTW, I've redone the memory accounting code to significantly improve its
performance (or, in other words, to reduce the performance penalty
imposed by the accounting). But this new code isn't yet integrated into the
complete user beancounter patch.

Best regards
		Andrey
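One way to read "takes into account guarantees" is that a subject is only eligible as a victim once it uses more than it was promised. The sketch below shows that idea only; it is not taken from the user beancounter patch, and `struct subject`, `overage()` and `select_victim()` are hypothetical names with an assumed selection rule (largest excess over the guarantee).

```c
#include <stddef.h>

/* Hypothetical accounting subject; fields are illustrative, counts in pages. */
struct subject {
	const char *name;
	unsigned long used;		/* memory currently charged to the subject */
	unsigned long guarantee;	/* memory the subject has been promised */
};

/* Pages used beyond the guarantee, or 0 if the subject is within it. */
unsigned long overage(const struct subject *s)
{
	return s->used > s->guarantee ? s->used - s->guarantee : 0;
}

/*
 * Return the index of the subject furthest over its guarantee,
 * or -1 if every subject is within its guarantee.
 */
int select_victim(const struct subject *subj, int n)
{
	unsigned long worst = 0;
	int i, victim = -1;

	for (i = 0; i < n; i++) {
		unsigned long over = overage(&subj[i]);

		if (over > worst) {
			worst = over;
			victim = i;
		}
	}
	return victim;
}
```

Under this rule a large consumer that stays inside its guarantee is never chosen, even if it is the biggest user on the machine, which is the property a guarantee is meant to provide.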
* Re: user beancounter (was: Reserve VM for root)
From: Szabolcs Szakacsits @ 2000-11-13 23:44 UTC
To: Andrey Savochkin; +Cc: linux-kernel, linux-mm

On Fri, 10 Nov 2000, Andrey Savochkin wrote:
> On Thu, Nov 09, 2000 at 06:30:32PM +0100, Szabolcs Szakacsits wrote:
> > BTW, I wanted to take a look at the frequently mentioned beancounter patch,
> > here is the current state,
> > 	http://www.asp-linux.com/en/products/ubpatch.shtml
> > "Sorry, due to growing expenses for support of public version of ASPcomplete
> > we do not provide sources till first official release."
>
> That's not a place where I keep my code (and has never been :-)

Sorry, I was misguided by your earlier message at
http://boudicca.tux.org/hypermail/linux-kernel/2000week30/0114.html
where you wrote
"Patch web page is http://www.asplinux.com.sg/install/ubpatch.html"
They are the same sites [mirrors in .us, .sg, .kr and .ru].

> ftp://ftp.sw.com.sg/pub/Linux/people/saw/kernel/user_beancounter/UserBeancounter.html
> is the right place (but it has some availability problems :-(

I've also tried two other ftp sites, none of them were available, just
as at present ...

> As for memory management, it provides a simple variant of service level
> support for
[...]

Thanks for the info, user beancounter is definitely needed but it's a
2.5 issue and people have problems now. Ironically it seems disks will
soon be as fast as RAM, many think the max swap space supported is still
128 MB and set up their systems accordingly, app requirements
(multimedia, etc.) grow quickly and users run out of memory much more
easily than before. For many, quota isn't a solution because of
performance or other reasons, and Linux doesn't give them any chance to
survive such a situation.

	Szaka
* Re: Looking for better VM
From: Jesse Pollard @ 2000-11-08 14:53 UTC
To: riel, Szabolcs Szakacsits
Cc: linux-kernel, linux-mm, Linus Torvalds, Ingo Molnar

------
> On Wed, 8 Nov 2000, Szabolcs Szakacsits wrote:
> > On Mon, 6 Nov 2000, Rik van Riel wrote:
[snip]
> > You could ask, so what's the point of non-overcommit if we use
> > process killing in the end? And the answer: in *practice* this almost
> > never happens, root can always clean up and no processes are lost
> > [just as when the disk is "full" except for the area reserved for root].
> > See? Humans get a chance against hard-wired AI.
> >
> > I also didn't say non-overcommit should be used as the default, and a
> > patch, http://www.cs.helsinki.fi/linux/linux-kernel/2000-13/1208.html,
> > developed for 2.3.99-pre3 by Eduardo Horvath and unfortunately
> > ignored completely, implemented it this way.
>
> OK. This is a lot more reasonable. I'm actually looking
> into putting non-overcommit as a configurable option in
> the kernel.
>
> However, this does not save you from the fact that the
> system is essentially deadlocked when nothing can get
> more memory and nothing goes away. Non-overcommit won't
> give you any extra reliability unless your applications
> are very well behaved ... in which case you don't need
> non-overcommit.

Applications are not usually the problem, users are. If a user starts
one "well behaved" process, and then starts another, and another....
The system WILL go OOM, and with unpredictable results (as far as the user
is concerned).

The Eduardo Horvath patch works exactly as he designed it. It could allow
overcommit by root while disallowing users from generating overcommit,
or it could disallow overcommit by all, or operate the same as without
the patch (but it did accumulate some statistics).

The problem is that unless user memory resource controls are available
to the administrator to establish some policy, system deadlock will always
occur, OR you have random shutdowns, or random process aborts.

The resource controls should allow an administrator defined policy,
established in user space, and enforced by the kernel. The kernel should
be able to enforce any policy from no memory restriction (current, and
reasonable for single user workstations), to fully disabled overcommit
(dedicated multi-user batch processing in clustered environments).

I know the patch was an early prototype. It did provide some identification
of the locations where resource controls could/should be done (this should
be a 2.5 development item).

-------------------------------------------------------------------------
Jesse I Pollard, II
Email: pollard@navo.hpc.mil

Any opinions expressed are solely my own.