* [discuss] memrlimit - potential applications that can use
@ 2008-08-19  7:18 Balbir Singh
  2008-08-19 15:58 ` Dave Hansen
  0 siblings, 1 reply; 14+ messages in thread
From: Balbir Singh @ 2008-08-19 7:18 UTC (permalink / raw)
To: Paul Menage, Dave Hansen
Cc: Andrea Righi, Hugh Dickins, Andrew Morton, Linux Memory Management List, linux kernel mailing list

After having discussed memrlimit at the container mini-summit, I've been investigating potential users of memrlimits. Here are the use cases that I have so far.

1. To provide a soft landing mechanism for applications that exceed their memory limit. Currently in the memory resource controller, we swap and on failure OOM.
2. To provide a mechanism similar to memory overcommit for control groups. Overcommit has finer accounting, we just account for virtual address space usage.
3. Vserver will directly be able to port over on top of memrlimit (their address space limitation feature)

The case against 1 has been that applications do not tolerate malloc failure. That does not imply that applications should not have the capability, or should never be allowed the flexibility, of doing so.

Other users of memory limits I found are

1. php - through php.ini allows setting of maximum memory limit
2. Apache - supports setting of memory limits for child processes (RLimitMEM Directive)
3. Java/KVM all take hints about the maximum memory to be used by the application
4. google.com/codesearch for RLIMIT_AS will show up a big list of applications that use memory limits.

With this background, I propose that we need a mechanism of providing a memory overcommit feature for cgroups. The options are

1. We keep memrlimit and use it. It's very flexible, but on the down side it does simple total_vm based accounting and provides functionality similar to RLIMIT_AS for control groups.
2. We port the overcommit feature (Andrea did post patches for this). It's harder to implement, but provides functionality similar to what exists for overcommit.

Comments?
-- Warm Regards, Balbir -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@kvack.org. For more info on Linux MM, see: http://www.linux-mm.org/ . Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a> ^ permalink raw reply [flat|nested] 14+ messages in thread
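The "soft landing" in use case 1 can be illustrated with the existing per-process RLIMIT_AS, which memrlimit is described above as mirroring for control groups. The following is a minimal sketch, assuming a Linux system and Python's standard resource module; the sizes and the names CHILD_SOURCE and run_limited_child are illustrative assumptions, not anything from the memrlimit patches. A child process caps its own address space, requests more than the cap, and gets an allocation failure it can handle instead of being killed later.

```python
import resource
import subprocess
import sys

# Child program: cap its own address space, then attempt an allocation
# larger than the cap. The sizes are arbitrary illustration values.
CHILD_SOURCE = r"""
import resource
import sys

LIMIT = 512 * 1024 * 1024      # 512 MiB address-space cap (assumed value)
REQUEST = 1024 * 1024 * 1024   # 1 GiB request, deliberately over the cap

soft, hard = resource.getrlimit(resource.RLIMIT_AS)
if hard != resource.RLIM_INFINITY:
    LIMIT = min(LIMIT, hard)   # the soft limit may not exceed the hard limit
resource.setrlimit(resource.RLIMIT_AS, (LIMIT, hard))

try:
    buf = bytearray(REQUEST)
except MemoryError:
    # The soft landing: the process is still alive and can pick a cheaper plan.
    print("allocation refused, falling back")
    sys.exit(0)
print("allocation succeeded")
"""


def run_limited_child():
    """Run the child under its own RLIMIT_AS and return what it printed."""
    result = subprocess.run(
        [sys.executable, "-c", CHILD_SOURCE],
        capture_output=True,
        text=True,
        check=False,
    )
    return result.stdout.strip()


if __name__ == "__main__":
    print(run_limited_child())
```

The limit is applied in a child process so that the caller's own address space is left untouched. Whether an application can do anything useful after the failure is exactly the point debated in the replies.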
* Re: [discuss] memrlimit - potential applications that can use 2008-08-19 7:18 [discuss] memrlimit - potential applications that can use Balbir Singh @ 2008-08-19 15:58 ` Dave Hansen 2008-08-19 16:45 ` Balbir Singh 0 siblings, 1 reply; 14+ messages in thread From: Dave Hansen @ 2008-08-19 15:58 UTC (permalink / raw) To: balbir Cc: Paul Menage, Dave Hansen, Andrea Righi, Hugh Dickins, Andrew Morton, Linux Memory Management List, linux kernel mailing list On Tue, 2008-08-19 at 12:48 +0530, Balbir Singh wrote: > 1. To provide a soft landing mechanism for applications that exceed their memory > limit. Currently in the memory resource controller, we swap and on failure OOM. > 2. To provide a mechanism similar to memory overcommit for control groups. > Overcommit has finer accounting, we just account for virtual address space usage. > 3. Vserver will directly be able to port over on top of memrlimit (their address > space limitation feature) Balbir, This all seems like a little bit too much hand waving to me. I don't really see a single concrete user in the "potential applications" here. I really don't understand why you're pushing this so hard if you don't have anyone to actually use it. I just don't see anyone that *needs* it. There's a lot of "it would be nice", but no "needs". -- Dave
* Re: [discuss] memrlimit - potential applications that can use 2008-08-19 15:58 ` Dave Hansen @ 2008-08-19 16:45 ` Balbir Singh 2008-08-19 17:41 ` Dave Hansen 0 siblings, 1 reply; 14+ messages in thread From: Balbir Singh @ 2008-08-19 16:45 UTC (permalink / raw) To: Dave Hansen Cc: Paul Menage, Dave Hansen, Andrea Righi, Hugh Dickins, Andrew Morton, Linux Memory Management List, linux kernel mailing list Dave Hansen wrote: > On Tue, 2008-08-19 at 12:48 +0530, Balbir Singh wrote: >> 1. To provide a soft landing mechanism for applications that exceed their memory >> limit. Currently in the memory resource controller, we swap and on failure OOM. >> 2. To provide a mechanism similar to memory overcommit for control groups. >> Overcommit has finer accounting, we just account for virtual address space usage. >> 3. Vserver will directly be able to port over on top of memrlimit (their address >> space limitation feature) > > Balbir, > > This all seems like a little bit too much hand waving to me. I don't Dave, there is no hand waving, just an honest discussion. Although, you may not see it in the background, we still need overcommit protection and we have it enabled by default for the system. There are applications that can deal with the constraints setup by the administrator and constraints of the environment, please see http://en.wikipedia.org/wiki/Autonomic_computing. > really see a single concrete user in the "potential applications" here. > I really don't understand why you're pushing this so hard if you don't > have anyone to actually use it. > > I just don't see anyone that *needs* it. There's a lot of "it would be > nice", but no "needs". If you see the original email, I've sent - I've mentioned that we need overcommit support (either via memrlimit or by porting over the overcommit feature) and the exploiters you are looking for is the same as the ones who need overcommit and RLIMIT_AS support. 
On the memory overcommit front, please see the PostgreSQL Server Administrator's Guide at http://www.network-theory.co.uk/docs/postgresql/vol3/LinuxMemoryOvercommit.html The guide discusses turning off memory overcommit so that the database is never OOM killed. How do we provide these guarantees for a particular control group? We can do it system wide, but ideally we want the control point to be per control group. As far as other users are concerned, I've listed users of the memory limit feature in the original email I sent out. To try and understand your viewpoint better, could you please tell me whether you are opposed to 1. Overcommit and RLIMIT_AS as features OR 2. Expanding them to control groups -- Balbir
* Re: [discuss] memrlimit - potential applications that can use 2008-08-19 16:45 ` Balbir Singh @ 2008-08-19 17:41 ` Dave Hansen 2008-08-20 8:26 ` Balbir Singh 2008-08-20 13:25 ` righi.andrea 0 siblings, 2 replies; 14+ messages in thread From: Dave Hansen @ 2008-08-19 17:41 UTC (permalink / raw) To: balbir Cc: Paul Menage, Dave Hansen, Andrea Righi, Hugh Dickins, Andrew Morton, Linux Memory Management List, linux kernel mailing list On Tue, 2008-08-19 at 22:15 +0530, Balbir Singh wrote: > Dave Hansen wrote: > > On Tue, 2008-08-19 at 12:48 +0530, Balbir Singh wrote: > >> 1. To provide a soft landing mechanism for applications that exceed their memory > >> limit. Currently in the memory resource controller, we swap and on failure OOM. > >> 2. To provide a mechanism similar to memory overcommit for control groups. > >> Overcommit has finer accounting, we just account for virtual address space usage. > >> 3. Vserver will directly be able to port over on top of memrlimit (their address > >> space limitation feature) > > > > Balbir, > > > > This all seems like a little bit too much hand waving to me. I don't > > Dave, there is no hand waving, just an honest discussion. Although, you may not > see it in the background, we still need overcommit protection and we have it > enabled by default for the system. There are applications that can deal with the > constraints setup by the administrator and constraints of the environment, > please see http://en.wikipedia.org/wiki/Autonomic_computing. OK, let's get back to describing the basic problem here. What is the basic problem being solved? Applications basically want to get a failure back from malloc() when the machine is (nearly?) out of memory so they can stop consuming? Is this the only way to do autonomic computing with memory? Or, are there other or better approaches? Surely an autonomic computing app could keep track of its own memory footprint. > > really see a single concrete user in the "potential applications" here. 
> > I really don't understand why you're pushing this so hard if you don't > > have anyone to actually use it. > > > > I just don't see anyone that *needs* it. There's a lot of "it would be > > nice", but no "needs". > > If you see the original email, I've sent - I've mentioned that we need > overcommit support (either via memrlimit or by porting over the overcommit > feature) and the exploiters you are looking for is the same as the ones who need > overcommit and RLIMIT_AS support. > > On the memory overcommit front, please see PostgreSQL Server Administrator's > Guide at > http://www.network-theory.co.uk/docs/postgresql/vol3/LinuxMemoryOvercommit.html > > The guide discusses turning off memory overcommit so that the database is never > OOM killed, how do we provide these guarantees for a particular control group? > We can do it system wide, but ideally we want the control point to be per > control group. Heh. That suggestion is, at best, working around a kernel bug. The DB guys are just saying to do that because they're the biggest memory users and always seem to get OOM killed first. The base problem here is the OOM killer, not an application that truly uses memory overcommit restriction in an interesting way. > As far as other users are concerned, I've listed users of the memory limit > feature, in the original email I sent out. To try and understand your viewpoint > better, could you please tell me if > > 1. You are opposed to overcommit and RLIMIT_AS as features > > OR > > 2. Expanding them to control groups I think that too many of the users of (1) probably fall into the PostgreSQL category. They found that turning it on "fixed" their bugs, but it really just swept them under the rug. So, before we expand the use of those features to control groups by adding a bunch of new code, let's make sure that there will be users for it and that those users have no better way of doing it. 
-- Dave
* Re: [discuss] memrlimit - potential applications that can use 2008-08-19 17:41 ` Dave Hansen @ 2008-08-20 8:26 ` Balbir Singh 2008-08-20 16:29 ` Dave Hansen 2008-08-20 13:25 ` righi.andrea 1 sibling, 1 reply; 14+ messages in thread From: Balbir Singh @ 2008-08-20 8:26 UTC (permalink / raw) To: Dave Hansen Cc: Paul Menage, Dave Hansen, Andrea Righi, Hugh Dickins, Andrew Morton, Linux Memory Management List, linux kernel mailing list Dave Hansen wrote: > On Tue, 2008-08-19 at 22:15 +0530, Balbir Singh wrote: >> Dave Hansen wrote: >>> On Tue, 2008-08-19 at 12:48 +0530, Balbir Singh wrote: >>>> 1. To provide a soft landing mechanism for applications that exceed their memory >>>> limit. Currently in the memory resource controller, we swap and on failure OOM. >>>> 2. To provide a mechanism similar to memory overcommit for control groups. >>>> Overcommit has finer accounting, we just account for virtual address space usage. >>>> 3. Vserver will directly be able to port over on top of memrlimit (their address >>>> space limitation feature) >>> Balbir, >>> >>> This all seems like a little bit too much hand waving to me. I don't >> Dave, there is no hand waving, just an honest discussion. Although, you may not >> see it in the background, we still need overcommit protection and we have it >> enabled by default for the system. There are applications that can deal with the >> constraints setup by the administrator and constraints of the environment, >> please see http://en.wikipedia.org/wiki/Autonomic_computing. > > OK, let's get back to describing the basic problem here. What is the > basic problem being solved? Applications basically want to get a > failure back from malloc() when the machine is (nearly?) out of memory > so they can stop consuming? > > Is this the only way to do autonomic computing with memory? Or, are > there other or better approaches? > I guess the application needs to know how much of the resources it can consume. 
> Surely an autonomic computing app could keep track of its own memory > footprint. Yes, an application does know its memory footprint, but does it know how it is supposed to consume resources in the system? Consider a linear algebra package trying to multiply matrices of 1 million x 1 million rows. Depending on how much of the resources it is allowed to consume, it could do so in one shot or, if there was a restriction, it could multiply smaller matrices and then collate the results. The application wants to stretch itself (memory footprint) for performance, but at the same time does not want to get killed because 1. Other applications came in and caused an OOM 2. It stretched itself too much beyond what the system can support > >>> really see a single concrete user in the "potential applications" here. >>> I really don't understand why you're pushing this so hard if you don't >>> have anyone to actually use it. >>> >>> I just don't see anyone that *needs* it. There's a lot of "it would be >>> nice", but no "needs". >> If you see the original email, I've sent - I've mentioned that we need >> overcommit support (either via memrlimit or by porting over the overcommit >> feature) and the exploiters you are looking for is the same as the ones who need >> overcommit and RLIMIT_AS support. >> >> On the memory overcommit front, please see PostgreSQL Server Administrator's >> Guide at >> http://www.network-theory.co.uk/docs/postgresql/vol3/LinuxMemoryOvercommit.html >> >> The guide discusses turning off memory overcommit so that the database is never >> OOM killed, how do we provide these guarantees for a particular control group? >> We can do it system wide, but ideally we want the control point to be per >> control group. > > Heh. That suggestion is, at best, working around a kernel bug. The DB > guys are just saying to do that because they're the biggest memory users > and always seem to get OOM killed first.
> > The base problem here is the OOM killer, not an application that truly > uses memory overcommit restriction in an interesting way. > No, it is not a kernel bug. Agreed, the database is using a lot of memory, but how can it predict what else will run on the system? Why is it bad for a database, for the sake of data integrity, to ensure that it does not get OOM killed and thus make sure memory is never overcommitted? Yes, you need performance, so the application expands its footprint, but at the same time, the stretching should not cause it to be killed. How would you propose to solve the problem without overcommit control? >> As far as other users are concerned, I've listed users of the memory limit >> feature, in the original email I sent out. To try and understand your viewpoint >> better, could you please tell me if >> >> 1. You are opposed to overcommit and RLIMIT_AS as features >> >> OR >> >> 2. Expanding them to control groups > > I think that too many of the users of (1) probably fall into the > PostgreSQL category. They found that turning it on "fixed" their bugs, > but it really just swept them under the rug. > Please see my comment on this in the paragraph above. > So, before we expand the use of those features to control groups by > adding a bunch of new code, let's make sure that there will be users for > it and that those users have no better way of doing it. I am all ears to better ways of doing it. Are you suggesting that overcommit was added even though we don't actually need it? -- Balbir
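The linear algebra example in the message above (do the multiplication in one shot if memory allows, otherwise multiply smaller matrices and collate the results) can be sketched roughly as follows. This is an illustration only: the function names, the 8-byte element size, and the rule that three square tiles must fit in the budget are assumptions made for the sketch, and where the budget comes from (an rlimit, a control group limit, or an administrator setting) is left open.

```python
import math


def choose_block_size(n, budget_bytes, elem_bytes=8):
    """Largest tile edge b <= n such that three b x b tiles fit in the budget.

    Three tiles are assumed resident at once: one from each input matrix
    and one from the output matrix.
    """
    max_edge = math.isqrt(budget_bytes // (3 * elem_bytes))
    if max_edge < 1:
        raise ValueError("budget too small for even a 1 x 1 tile")
    return min(n, max_edge)


def blocked_matmul(a, b, block):
    """Multiply two n x n matrices (lists of lists) one tile at a time."""
    n = len(a)
    c = [[0.0] * n for _ in range(n)]
    for i0 in range(0, n, block):
        for j0 in range(0, n, block):
            for k0 in range(0, n, block):
                # Accumulate the contribution of one pair of tiles into c.
                for i in range(i0, min(i0 + block, n)):
                    for k in range(k0, min(k0 + block, n)):
                        aik = a[i][k]
                        for j in range(j0, min(j0 + block, n)):
                            c[i][j] += aik * b[k][j]
    return c


if __name__ == "__main__":
    a = [[1, 2], [3, 4]]
    b = [[5, 6], [7, 8]]
    block = choose_block_size(len(a), budget_bytes=24)  # room for 1 x 1 tiles only
    print(block, blocked_matmul(a, b, block))
```

With a generous budget the block size equals n and the loop degenerates into the one-shot multiplication; with a tight budget the same code produces the same result from smaller tiles.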
* Re: [discuss] memrlimit - potential applications that can use 2008-08-20 8:26 ` Balbir Singh @ 2008-08-20 16:29 ` Dave Hansen 2008-08-21 3:25 ` Balbir Singh 0 siblings, 1 reply; 14+ messages in thread From: Dave Hansen @ 2008-08-20 16:29 UTC (permalink / raw) To: balbir Cc: Paul Menage, Dave Hansen, Andrea Righi, Hugh Dickins, Andrew Morton, Linux Memory Management List, linux kernel mailing list On Wed, 2008-08-20 at 13:56 +0530, Balbir Singh wrote: > Dave Hansen wrote: > > On Tue, 2008-08-19 at 22:15 +0530, Balbir Singh wrote: > >> Dave Hansen wrote: > >>> On Tue, 2008-08-19 at 12:48 +0530, Balbir Singh wrote: > >>>> 1. To provide a soft landing mechanism for applications that exceed their memory > >>>> limit. Currently in the memory resource controller, we swap and on failure OOM. > >>>> 2. To provide a mechanism similar to memory overcommit for control groups. > >>>> Overcommit has finer accounting, we just account for virtual address space usage. > >>>> 3. Vserver will directly be able to port over on top of memrlimit (their address > >>>> space limitation feature) > >>> Balbir, > >>> > >>> This all seems like a little bit too much hand waving to me. I don't > >> Dave, there is no hand waving, just an honest discussion. Although, you may not > >> see it in the background, we still need overcommit protection and we have it > >> enabled by default for the system. There are applications that can deal with the > >> constraints setup by the administrator and constraints of the environment, > >> please see http://en.wikipedia.org/wiki/Autonomic_computing. > > > > OK, let's get back to describing the basic problem here. What is the > > basic problem being solved? Applications basically want to get a > > failure back from malloc() when the machine is (nearly?) out of memory > > so they can stop consuming? > > > > Is this the only way to do autonomic computing with memory? Or, are > > there other or better approaches? 
> > > Yes, an application does know it's memory footprint, but does it know how it is > supposed to consume resources in the system. Consider a linear algebra package > trying to do a multiplication of 1 million x 1 million rows. Depending on how > much resources it is allowed to consume, it could do so in one shot or if there > was a restriction, it could multiply smaller matrices and then collate results. > The application wants to stretch itself (memory footprint) for performance, but > at the same time does not want to get killed because > > 1. Other applications came in and caused an OOM > 2. It stretched itself too much beyond what the system can support So, in (2) it deserves to be oom'd. If other applications came in and caused the oom, then we do have /proc/$pid/oom_adj to help out. That's a much better tunable than overcommit. > >>> really see a single concrete user in the "potential applications" here. > >>> I really don't understand why you're pushing this so hard if you don't > >>> have anyone to actually use it. > >>> > >>> I just don't see anyone that *needs* it. There's a lot of "it would be > >>> nice", but no "needs". > >> If you see the original email, I've sent - I've mentioned that we need > >> overcommit support (either via memrlimit or by porting over the overcommit > >> feature) and the exploiters you are looking for is the same as the ones who need > >> overcommit and RLIMIT_AS support. > >> > >> On the memory overcommit front, please see PostgreSQL Server Administrator's > >> Guide at > >> http://www.network-theory.co.uk/docs/postgresql/vol3/LinuxMemoryOvercommit.html > >> > >> The guide discusses turning off memory overcommit so that the database is never > >> OOM killed, how do we provide these guarantees for a particular control group? > >> We can do it system wide, but ideally we want the control point to be per > >> control group. > > > > Heh. That suggestion is, at best, working around a kernel bug. 
The DB > > guys are just saying to do that because they're the biggest memory users > > and always seem to get OOM killed first. > > > > The base problem here is the OOM killer, not an application that truly > > uses memory overcommit restriction in an interesting way. > > > > No it is not a kernel BUG, agreed that the database is using a lot of memory, > but how can it predict what else will run on the system. Why is it bad for a > database for the sake of data integrity to ensure that it does not get OOM > killed and thus make sure memory is never overcommitted. Yes, you need > performance, so the application expands its footprint, but at the same time, > the stretching should not cause it to be killed. How would you propose to solve > the problem without overcommit control? I think that we're tying OOM'ing and overcommit a little too closely together here. It's not like you can't have OOMs when strict overcommit is being observed. There are lots of other ways to lock memory down, and any one of those can also cause an oom. Yes, userspace mapped memory is usually the largest single consumer, but the problem space is well beyond overcommit control. Agreed? Just look at why beancounters were implemented and why they track things far beyond userspace memory use. > > So, before we expand the use of those features to control groups by > > adding a bunch of new code, let's make sure that there will be users for > > it and that those users have no better way of doing it. > > I am all ears to better ways of doing it. Are you suggesting that overcommit was > added even though we don't actually need it? It serves a purpose, certainly. We have better ways of doing it now, though. "So, before we expand the use of those features to control groups by adding a bunch of new code, let's make sure that there will be users for it and that those users have no better way of doing it." The one concrete user that's been offered so far is postgres.
I've suggested something that I hope will be more effective than enforcing overcommit. -- Dave
* Re: [discuss] memrlimit - potential applications that can use 2008-08-20 16:29 ` Dave Hansen @ 2008-08-21 3:25 ` Balbir Singh 2008-08-21 7:43 ` KAMEZAWA Hiroyuki 0 siblings, 1 reply; 14+ messages in thread From: Balbir Singh @ 2008-08-21 3:25 UTC (permalink / raw) To: Dave Hansen Cc: Paul Menage, Dave Hansen, Andrea Righi, Hugh Dickins, Andrew Morton, Linux Memory Management List, linux kernel mailing list Dave Hansen wrote: > On Wed, 2008-08-20 at 13:56 +0530, Balbir Singh wrote: >> Dave Hansen wrote: >>> On Tue, 2008-08-19 at 22:15 +0530, Balbir Singh wrote: >>>> Dave Hansen wrote: >>>>> On Tue, 2008-08-19 at 12:48 +0530, Balbir Singh wrote: >>>>>> 1. To provide a soft landing mechanism for applications that exceed their memory >>>>>> limit. Currently in the memory resource controller, we swap and on failure OOM. >>>>>> 2. To provide a mechanism similar to memory overcommit for control groups. >>>>>> Overcommit has finer accounting, we just account for virtual address space usage. >>>>>> 3. Vserver will directly be able to port over on top of memrlimit (their address >>>>>> space limitation feature) >>>>> Balbir, >>>>> >>>>> This all seems like a little bit too much hand waving to me. I don't >>>> Dave, there is no hand waving, just an honest discussion. Although, you may not >>>> see it in the background, we still need overcommit protection and we have it >>>> enabled by default for the system. There are applications that can deal with the >>>> constraints setup by the administrator and constraints of the environment, >>>> please see http://en.wikipedia.org/wiki/Autonomic_computing. >>> OK, let's get back to describing the basic problem here. What is the >>> basic problem being solved? Applications basically want to get a >>> failure back from malloc() when the machine is (nearly?) out of memory >>> so they can stop consuming? >>> >>> Is this the only way to do autonomic computing with memory? Or, are >>> there other or better approaches? 
>>> >> Yes, an application does know its memory footprint, but does it know how it is >> supposed to consume resources in the system. Consider a linear algebra package >> trying to do a multiplication of 1 million x 1 million rows. Depending on how >> much resources it is allowed to consume, it could do so in one shot or if there >> was a restriction, it could multiply smaller matrices and then collate results. >> The application wants to stretch itself (memory footprint) for performance, but >> at the same time does not want to get killed because >> >> 1. Other applications came in and caused an OOM >> 2. It stretched itself too much beyond what the system can support > > So, in (2) it deserves to be oom'd. > Not really. How does an application know how to trade off maximum performance on a system against consuming so much of the memory resource that it OOMs? > If other applications came in and caused the oom, then we do > have /proc/$pid/oom_adj to help out. That's a much better tunable than > overcommit. > And oom_adj is not a hack? What if several memory-hungry applications striving for performance all set oom_adj to prevent themselves from being oom'ed? >>>>> really see a single concrete user in the "potential applications" here. >>>>> I really don't understand why you're pushing this so hard if you don't >>>>> have anyone to actually use it. >>>>> >>>>> I just don't see anyone that *needs* it. There's a lot of "it would be >>>>> nice", but no "needs". >>>> If you see the original email, I've sent - I've mentioned that we need >>>> overcommit support (either via memrlimit or by porting over the overcommit >>>> feature) and the exploiters you are looking for is the same as the ones who need >>>> overcommit and RLIMIT_AS support.
>>>> >>>> On the memory overcommit front, please see PostgreSQL Server Administrator's >>>> Guide at >>>> http://www.network-theory.co.uk/docs/postgresql/vol3/LinuxMemoryOvercommit.html >>>> >>>> The guide discusses turning off memory overcommit so that the database is never >>>> OOM killed, how do we provide these guarantees for a particular control group? >>>> We can do it system wide, but ideally we want the control point to be per >>>> control group. >>> Heh. That suggestion is, at best, working around a kernel bug. The DB >>> guys are just saying to do that because they're the biggest memory users >>> and always seem to get OOM killed first. >>> >>> The base problem here is the OOM killer, not an application that truly >>> uses memory overcommit restriction in an interesting way. >>> >> No it is not a kernel BUG, agreed that the database is using a lot of memory, >> but how can it predict what else will run on the system. Why is it bad for a >> database for the sake of data integrity to ensure that it does not get OOM >> killed and thus make sure memory is never overcommitted. Yes, you need >> performance, so the application expands its footprint, but at the same time, >> the stretching should not cause it to be killed. How would you propose to solve >> the problem without overcommit control? > > I think that we're tying OOM'ing and overcommit a little too close > together here. It's not like you can't have OOMs when strict overcommit > is being observed. > > There are lots of other ways to lock memory down, and any one of those > can also cause an oom. > The other way of locking memory down is mlock(), which by default is limited on most distros. We'll end up implementing mlock() control per control group as well. > Yes, userspace mapped memory is usually the largest single consumer, but > the problem space is well beyond overcommit control. Agreed? Just look > at why beancounters were implemented and track things far beyond > userspace memory use.
> I've looked at http://wiki.openvz.org/User_pages_accounting and it states "Account a part of memory on mmap/brk and reject there, and account the rest of the memory in page fault handlers without any rejects." This type of accounting is used in UBC. I looked through the code in mm/mmap.c for beancounters; ub_memory_charge() is called from almost the same places where the memrlimit controller does accounting. Please see their git tree at git.openvz.org. My understanding of the code is that the private vm and locked vm pages are only charged in that implementation. Agreed, they have additional finer accounting of kernel data structures, but beancounters account for VM usage too. >>> So, before we expand the use of those features to control groups by >>> adding a bunch of new code, let's make sure that there will be users for >>> it and that those users have no better way of doing it. >> I am all ears to better ways of doing it. Are you suggesting that overcommit was >> added even though we don't actually need it? > > It serves a purpose, certainly. We have better ways of doing it > now, though. "So, before we expand the use of those features to > control groups by adding a bunch of new code, let's make sure that there > will be users for it and that those users have no better way of doing > it." > > The one concrete user that's been offered so far is postgres. I've No, you've been offered several, including php and apache that use memory limits. > suggested something that I hope will be more effective than enforcing > overcommit. Is your suggestion beancounters? -- Balbir
* Re: [discuss] memrlimit - potential applications that can use 2008-08-21 3:25 ` Balbir Singh @ 2008-08-21 7:43 ` KAMEZAWA Hiroyuki 2008-08-21 10:26 ` Balbir Singh 2008-08-21 15:18 ` righi.andrea 0 siblings, 2 replies; 14+ messages in thread From: KAMEZAWA Hiroyuki @ 2008-08-21 7:43 UTC (permalink / raw) To: balbir Cc: Dave Hansen, Paul Menage, Dave Hansen, Andrea Righi, Hugh Dickins, Andrew Morton, Linux Memory Management List, linux kernel mailing list On Thu, 21 Aug 2008 08:55:52 +0530 Balbir Singh <balbir@linux.vnet.ibm.com> wrote: > >>> So, before we expand the use of those features to control groups by > >>> adding a bunch of new code, let's make sure that there will be users for > >>> it and that those users have no better way of doing it. > >> I am all ears to better ways of doing it. Are you suggesting that overcommit was > >> added even though we don't actually need it? > > > > It serves a purpose, certainly. We have better ways of doing it > > now, though. "So, before we expand the use of those features to > > control groups by adding a bunch of new code, let's make sure that there > > will be users for it and that those users have no better way of doing > > it." > > > > The one concrete user that's been offered so far is postgres. I've > > No, you've been offered several, including php and apache that use memory limits. > > > suggested something that I hope will be more effective than enforcing > > overcommit. > I'm sorry if I miss the point. My concern about memrlimit (for overcommitting) is that it's not fair, because an application which gets -ENOMEM at mmap() is just someone unlucky. I think it's better to trigger some notifier to the application or a daemon rather than return -ENOMEM at mmap(). A notification like "Oh, it seems the VSZ of total application exceeds the limit you set. Although you can continue your operation, it's recommended that you should fix up the situation" would be good.
Thanks, -Kame -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@kvack.org. For more info on Linux MM, see: http://www.linux-mm.org/ . Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a> ^ permalink raw reply [flat|nested] 14+ messages in thread
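
The two behaviours contrasted in this message (rejecting the mapping with -ENOMEM versus letting it proceed and raising a notification) can be sketched as a toy model. This is only an illustration, not kernel code and not the memrlimit patch: the `VszLimit` class, its `map()` method and the `on_exceed` callback are hypothetical names invented here, and sizes are plain megabyte counts.

```python
class VszLimit:
    """Toy model of a group-wide virtual-size limit (hypothetical, not a kernel API)."""

    def __init__(self, limit_mb, on_exceed=None):
        self.limit_mb = limit_mb
        self.used_mb = 0
        # on_exceed=None models the "reject" policy (-ENOMEM at mmap());
        # a callable models the "notify and continue" policy.
        self.on_exceed = on_exceed

    def map(self, size_mb):
        """Charge a new mapping against the limit."""
        wanted = self.used_mb + size_mb
        if wanted > self.limit_mb:
            if self.on_exceed is None:
                # Reject policy: the caller sees an allocation failure.
                raise MemoryError("ENOMEM: group VSZ limit reached")
            # Notify policy: tell a listener, then let the mapping proceed.
            self.on_exceed(wanted, self.limit_mb)
        self.used_mb = wanted


# Notify policy: the second mapping crosses the limit but still succeeds.
events = []
soft = VszLimit(100, on_exceed=lambda used, limit: events.append((used, limit)))
soft.map(80)
soft.map(40)

# Reject policy: the same second mapping fails and nothing is charged.
hard = VszLimit(100)
hard.map(80)
try:
    hard.map(40)
    rejected = False
except MemoryError:
    rejected = True
```

In this sketch the notify policy records one event and ends at 120 MB charged, while the reject policy stays at 80 MB; which of the two is preferable is exactly what the thread is debating.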

* Re: [discuss] memrlimit - potential applications that can use
  2008-08-21  7:43 ` KAMEZAWA Hiroyuki
@ 2008-08-21 10:26 ` Balbir Singh
  2008-08-21 10:59   ` KAMEZAWA Hiroyuki
  2008-08-21 15:18 ` righi.andrea
  1 sibling, 1 reply; 14+ messages in thread
From: Balbir Singh @ 2008-08-21 10:26 UTC
To: KAMEZAWA Hiroyuki
Cc: Dave Hansen, Paul Menage, Dave Hansen, Andrea Righi, Hugh Dickins,
	Andrew Morton, Linux Memory Management List,
	linux kernel mailing list

KAMEZAWA Hiroyuki wrote:
> On Thu, 21 Aug 2008 08:55:52 +0530
> Balbir Singh <balbir@linux.vnet.ibm.com> wrote:
>
>>>>> So, before we expand the use of those features to control groups by
>>>>> adding a bunch of new code, let's make sure that there will be users for
>>>>> it and that those users have no better way of doing it.
>>>> I am all ears to better ways of doing it. Are you suggesting that overcommit was
>>>> added even though we don't actually need it?
>>> It serves a purpose, certainly. We have better ways of doing it
>>> now, though. "So, before we expand the use of those features to
>>> control groups by adding a bunch of new code, let's make sure that there
>>> will be users for it and that those users have no better way of doing
>>> it."
>>>
>>> The one concrete user that's been offered so far is postgres. I've
>> No, you've been offered several, including php and apache that use memory limits.
>>
>>> suggested something that I hope will be more effective than enforcing
>>> overcommit.
>
> I'm sorry, I miss the point. My concern on memrlimit (for overcommitting) is that
> it's not fair, because an application which gets -ENOMEM at mmap() is just someone
> unlucky.

It can happen today with overcommit turned on. Why is it unlucky?

> I think it's better to trigger some notifier to an application or daemon
> rather than return -ENOMEM at mmap(). A notification like "Oh, it seems the VSZ
> of total application exceeds the limit you set. Although you can continue your
> operation, it's recommended that you should fix up the situation"
> will be good.
>

So you are suggesting that when we are running out of memory (as defined by our
current resource constraints), we don't return -ENOMEM, but instead we now
handle a new event that states that we are running out of memory?

NOTE: I am not opposed to the event, it can be useful for container
administrators to know how to size their containers, not to application
developers who want to auto-tune their applications (see my comment on autonomic
computing in an earlier thread) or to applications that want to make sure they
don't OOM without the system administrator having to do oom_adj for every
important application.

> Thanks,
> -Kame
>

--
	Balbir

* Re: [discuss] memrlimit - potential applications that can use
  2008-08-21 10:26 ` Balbir Singh
@ 2008-08-21 10:59 ` KAMEZAWA Hiroyuki
  2008-08-21 11:13   ` Balbir Singh
  0 siblings, 1 reply; 14+ messages in thread
From: KAMEZAWA Hiroyuki @ 2008-08-21 10:59 UTC
To: balbir
Cc: Dave Hansen, Paul Menage, Dave Hansen, Andrea Righi, Hugh Dickins,
	Andrew Morton, Linux Memory Management List,
	linux kernel mailing list

On Thu, 21 Aug 2008 15:56:41 +0530
Balbir Singh <balbir@linux.vnet.ibm.com> wrote:

> KAMEZAWA Hiroyuki wrote:
> > On Thu, 21 Aug 2008 08:55:52 +0530
> > Balbir Singh <balbir@linux.vnet.ibm.com> wrote:
> >
> >>>>> So, before we expand the use of those features to control groups by
> >>>>> adding a bunch of new code, let's make sure that there will be users for
> >>>>> it and that those users have no better way of doing it.
> >>>> I am all ears to better ways of doing it. Are you suggesting that overcommit was
> >>>> added even though we don't actually need it?
> >>> It serves a purpose, certainly. We have better ways of doing it
> >>> now, though. "So, before we expand the use of those features to
> >>> control groups by adding a bunch of new code, let's make sure that there
> >>> will be users for it and that those users have no better way of doing
> >>> it."
> >>>
> >>> The one concrete user that's been offered so far is postgres. I've
> >> No, you've been offered several, including php and apache that use memory limits.
> >>
> >>> suggested something that I hope will be more effective than enforcing
> >>> overcommit.
> >
> > I'm sorry, I miss the point. My concern on memrlimit (for overcommitting) is that
> > it's not fair, because an application which gets -ENOMEM at mmap() is just someone
> > unlucky.
>
> It can happen today with overcommit turned on. Why is it unlucky?
>

Today's overcommit is also unlucky ;)

For example, processes A and B are under a memrlimit.
  process A has no memory leak; it often calls malloc() and free().
  process B does leak memory, 100MB per night.

  process A cannot do anything when it notices malloc() returns NULL.
  It controls its memory usage perfectly. It is unlucky and will die.
  process B can use up the VSZ which is freed by process A.

(The OOM-killer, though disliked by everyone, has some kind of fairness.
 It checks usage.)

> > I think it's better to trigger some notifier to an application or daemon
> > rather than return -ENOMEM at mmap(). A notification like "Oh, it seems the VSZ
> > of total application exceeds the limit you set. Although you can continue your
> > operation, it's recommended that you should fix up the situation"
> > will be good.
> >
>
> So you are suggesting that when we are running out of memory (as defined by our
> current resource constraints), we don't return -ENOMEM, but instead we now
> handle a new event that states that we are running out of memory?
>

Not "running out of memory". Just "VSZ is over the limit you set/expected".

My point is that an application which can handle NULL returned by malloc() is
not very common, I think.

Sorry for the noise.

Thanks,
-Kame

> NOTE: I am not opposed to the event, it can be useful for container
> administrators to know how to size their containers, not to application
> developers who want to auto-tune their applications (see my comment on autonomic
> computing in an earlier thread) or to applications that want to make sure they
> don't OOM without the system administrator having to do oom_adj for every
> important application.
>
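
The process A / process B scenario above can be played out in a small simulation. Everything here is an assumption made for illustration: the `SharedVszLimit` class, the 1000 MB limit, A's 100 MB base plus a 50 MB buffer it frees and re-requests each night, and the "pick the largest user" stand-in for the OOM killer (the real heuristic is more involved). Only the 100 MB-per-night leak comes from the message.

```python
class SharedVszLimit:
    """Toy group-wide address-space limit shared by several processes."""

    def __init__(self, limit_mb):
        self.limit_mb = limit_mb
        self.usage = {}  # process name -> charged MB

    def charge(self, proc, size_mb):
        """Return False (think: malloc() returning NULL) if the group limit is hit."""
        if sum(self.usage.values()) + size_mb > self.limit_mb:
            return False
        self.usage[proc] = self.usage.get(proc, 0) + size_mb
        return True

    def uncharge(self, proc, size_mb):
        self.usage[proc] -= size_mb


def simulate(limit_mb=1000, leak_mb=100, churn_mb=50, base_mb=100, max_nights=365):
    """Return (first process to see a failure, night number, usage snapshot)."""
    group = SharedVszLimit(limit_mb)
    group.charge("A", base_mb + churn_mb)   # A: steady working set plus one buffer
    group.usage["B"] = 0
    for night in range(1, max_nights + 1):
        group.uncharge("A", churn_mb)        # A frees its buffer
        if not group.charge("B", leak_mb):   # B leaks a bit more
            return "B", night, dict(group.usage)
        if not group.charge("A", churn_mb):  # A asks for its buffer back
            return "A", night, dict(group.usage)
    return None, max_nights, dict(group.usage)


def largest_user(usage):
    """Crude stand-in for a usage-based victim choice; not the real OOM heuristic."""
    return max(usage, key=usage.get)


victim, night, usage = simulate()
```

With these made-up numbers the well-behaved process A is the first to see a failed allocation, while a usage-based choice would have pointed at B, which is the fairness concern being raised.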

* Re: [discuss] memrlimit - potential applications that can use
  2008-08-21 10:59 ` KAMEZAWA Hiroyuki
@ 2008-08-21 11:13 ` Balbir Singh
  0 siblings, 0 replies; 14+ messages in thread
From: Balbir Singh @ 2008-08-21 11:13 UTC
To: KAMEZAWA Hiroyuki
Cc: Dave Hansen, Paul Menage, Dave Hansen, Andrea Righi, Hugh Dickins,
	Andrew Morton, Linux Memory Management List,
	linux kernel mailing list

KAMEZAWA Hiroyuki wrote:
> On Thu, 21 Aug 2008 15:56:41 +0530
> Balbir Singh <balbir@linux.vnet.ibm.com> wrote:
>
>> KAMEZAWA Hiroyuki wrote:
>>> On Thu, 21 Aug 2008 08:55:52 +0530
>>> Balbir Singh <balbir@linux.vnet.ibm.com> wrote:
>>>
>>>>>>> So, before we expand the use of those features to control groups by
>>>>>>> adding a bunch of new code, let's make sure that there will be users for
>>>>>>> it and that those users have no better way of doing it.
>>>>>> I am all ears to better ways of doing it. Are you suggesting that overcommit was
>>>>>> added even though we don't actually need it?
>>>>> It serves a purpose, certainly. We have better ways of doing it
>>>>> now, though. "So, before we expand the use of those features to
>>>>> control groups by adding a bunch of new code, let's make sure that there
>>>>> will be users for it and that those users have no better way of doing
>>>>> it."
>>>>>
>>>>> The one concrete user that's been offered so far is postgres. I've
>>>> No, you've been offered several, including php and apache that use memory limits.
>>>>
>>>>> suggested something that I hope will be more effective than enforcing
>>>>> overcommit.
>>> I'm sorry, I miss the point. My concern on memrlimit (for overcommitting) is that
>>> it's not fair, because an application which gets -ENOMEM at mmap() is just someone
>>> unlucky.
>> It can happen today with overcommit turned on. Why is it unlucky?
>>
> Today's overcommit is also unlucky ;)
>
> For example, processes A and B are under a memrlimit.
>   process A has no memory leak; it often calls malloc() and free().
>   process B does leak memory, 100MB per night.
>
>   process A cannot do anything when it notices malloc() returns NULL.
>   It controls its memory usage perfectly. It is unlucky and will die.
>   process B can use up the VSZ which is freed by process A.
>

Yes, true, that will happen. Why will A die because it sees NULL? Yes, many
applications do die, but that is not how malloc == NULL is expected to be
handled. If that is a concern, do not use any memrlimits for A and B; if you do,
you will find the bug early.

Now consider the other scenario: if there really is a memory leak and process B
is using all that memory, there are two things to consider

1. Without the swap controller, B will start swapping out A's memory and cause
   excessive swapping and performance loss
2. With the swap controller enabled, at some point we will hit the swap limit,
   what happens then?

> (The OOM-killer, though disliked by everyone, has some kind of fairness.
>  It checks usage.)
>
>>> I think it's better to trigger some notifier to an application or daemon
>>> rather than return -ENOMEM at mmap(). A notification like "Oh, it seems the VSZ
>>> of total application exceeds the limit you set. Although you can continue your
>>> operation, it's recommended that you should fix up the situation"
>>> will be good.
>>>
>> So you are suggesting that when we are running out of memory (as defined by our
>> current resource constraints), we don't return -ENOMEM, but instead we now
>> handle a new event that states that we are running out of memory?
>>
> Not "running out of memory". Just "VSZ is over the limit you set/expected".
>
> My point is that an application which can handle NULL returned by malloc() is
> not very common, I think.
>

Yes, and that's why we have the flexibility: if the application can't deal with
it, don't set memrlimits for those applications :)

--
	Balbir

* Re: [discuss] memrlimit - potential applications that can use
  2008-08-21  7:43 ` KAMEZAWA Hiroyuki
  2008-08-21 10:26 ` Balbir Singh
@ 2008-08-21 15:18 ` righi.andrea
  1 sibling, 0 replies; 14+ messages in thread
From: righi.andrea @ 2008-08-21 15:18 UTC
To: KAMEZAWA Hiroyuki
Cc: balbir, Dave Hansen, Paul Menage, Dave Hansen, Hugh Dickins,
	Andrew Morton, Linux Memory Management List,
	linux kernel mailing list

On 8/21/08, KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> wrote:
> I'm sorry I miss the point. My concern on memrlimit (for overcommiting) is that
> it's not fair because an application which get -ENOMEM at mmap() is just someone
> unlucky. I think it's better to trigger some notifier to application or daemon
> rather than return -ENOMEM at mmap(). Notification like "Oh, it seems the VSZ
> of total application exceeds the limit you set. Although you can continue your
> operation, it's recommended that you should fix up the situation".
> will be good.

-ENOMEM should be considered by applications like "try again" (maybe
-EAGAIN would be more appropriate). When the notification of the
out-of-virtual-memory event occurs the dedicated userspace daemon can
do ehm... something... to resolve the situation. Just like the OOM
handling in userspace. Similar issues, but a common solution could
resolve both problems.

-Andrea
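
A minimal sketch of the "treat -ENOMEM as try again" idea from this message, under stated assumptions: `map_with_retry`, `fake_map` and `fake_daemon` are hypothetical names, the "daemon" is just a function that frees a fixed amount, and no real mmap() call is made. It only shows the control flow of retrying after something else has relieved the pressure.

```python
import errno


def map_with_retry(try_map, relieve_pressure, attempts=3):
    """Call try_map(); on ENOMEM ask relieve_pressure() for help and retry.

    Returns (result, attempt number). Other errors, and ENOMEM on the
    final attempt, are re-raised to the caller.
    """
    for attempt in range(1, attempts + 1):
        try:
            return try_map(), attempt
        except OSError as exc:
            if exc.errno != errno.ENOMEM or attempt == attempts:
                raise
            relieve_pressure()


# Stand-ins for the kernel and the userspace daemon (purely illustrative).
free_mb = {"value": 10}


def fake_map(size_mb=30):
    """Pretend mapping: fails with ENOMEM while too little room is left."""
    if free_mb["value"] < size_mb:
        raise OSError(errno.ENOMEM, "Cannot allocate memory")
    free_mb["value"] -= size_mb
    return size_mb


def fake_daemon():
    """Pretend daemon reaction: frees 15 MB somewhere else in the group."""
    free_mb["value"] += 15


mapped, tries = map_with_retry(fake_map, fake_daemon)
```

Here the request succeeds on the third attempt. Whether a real application could wait like this, and what the daemon would actually do, is left open in the thread.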

* Re: [discuss] memrlimit - potential applications that can use
  2008-08-19 17:41 ` Dave Hansen
  2008-08-20  8:26 ` Balbir Singh
@ 2008-08-20 13:25 ` righi.andrea
  2008-08-20 16:38   ` Dave Hansen
  1 sibling, 1 reply; 14+ messages in thread
From: righi.andrea @ 2008-08-20 13:25 UTC
To: Dave Hansen
Cc: balbir, Paul Menage, Dave Hansen, Hugh Dickins, Andrew Morton,
	Marco Sbrighi, Linux Memory Management List,
	linux kernel mailing list

On 8/19/08, Dave Hansen <dave@linux.vnet.ibm.com> wrote:
> On Tue, 2008-08-19 at 22:15 +0530, Balbir Singh wrote:
>> Dave Hansen wrote:
>> > On Tue, 2008-08-19 at 12:48 +0530, Balbir Singh wrote:
>> >> 1. To provide a soft landing mechanism for applications that exceed
>> >> their memory limit. Currently in the memory resource controller, we
>> >> swap and on failure OOM.
>> >> 2. To provide a mechanism similar to memory overcommit for control
>> >> groups. Overcommit has finer accounting, we just account for virtual
>> >> address space usage.
>> >> 3. Vserver will directly be able to port over on top of memrlimit
>> >> (their address space limitation feature)
>> >
>> > Balbir,
>> >
>> > This all seems like a little bit too much hand waving to me. I don't
>>
>> Dave, there is no hand waving, just an honest discussion. Although, you
>> may not see it in the background, we still need overcommit protection
>> and we have it enabled by default for the system. There are applications
>> that can deal with the constraints setup by the administrator and
>> constraints of the environment, please see
>> http://en.wikipedia.org/wiki/Autonomic_computing.
>
> OK, let's get back to describing the basic problem here. What is the
> basic problem being solved? Applications basically want to get a
> failure back from malloc() when the machine is (nearly?) out of memory
> so they can stop consuming?

Hi Dave,

IMHO there are two different problems, and both should be considered
by the kernel system-wide as well as for each cgroup:

1) how to prevent OOM conditions
2) how to handle OOM conditions

The perfect solution for 2) doesn't exist IMHO, because there's no
clean way, from the application's point of view, to handle such a critical
condition post-facto. Containing the OOM within a cgroup is surely a
great improvement, but there's always the risk of killing the wrong
applications (within the cgroup). Another good improvement would be to
handle the OOM condition in userspace; Balbir is
working on/discussing/planning something about this, if I remember well.

An interesting solution, proposed in the past, was to send a special
signal to userspace apps to free up caches/buffers/unused mem when the
whole memory in the system goes under a critical threshold. But this
would require active support from all the userspace applications,
which would have to implement the signal handler in a proper way. Maybe this
could even be considered a special case of the userspace OOM handling.

Memory overcommit protection, instead, is a way to *prevent* OOM
conditions (problem 1). This approach is safer for critical
applications that have a chance to cleanly handle the OOM at the time
they're requesting memory from the kernel, instead of receiving a
SIGKILL (or whatever signal) asynchronously during the execution path.

Unfortunately, this kind of prevention is not always acceptable,
because, in this case, userspace apps must request virtual memory
carefully, otherwise it would be quite easy to create a memory DoS for
other applications (and probably the per-application/per-cgroup
RLIMIT_AS could help here).

As an example, an ideal solution I'd like to implement for a generic
enterprise environment is to create all the critical apps inside a
cgroup with a never-overcommit memory policy and move all the other
userspace apps into another cgroup with the oom-killer enabled. But for this
we need both 1) and 2) functionalities, and I don't see any other way
to do so.

-Andrea

>
> Is this the only way to do autonomic computing with memory? Or, are
> there other or better approaches?
>
> Surely an autonomic computing app could keep track of its own memory
> footprint.
>
>> > really see a single concrete user in the "potential applications" here.
>> > I really don't understand why you're pushing this so hard if you don't
>> > have anyone to actually use it.
>> >
>> > I just don't see anyone that *needs* it. There's a lot of "it would be
>> > nice", but no "needs".
>>
>> If you see the original email, I've sent - I've mentioned that we need
>> overcommit support (either via memrlimit or by porting over the overcommit
>> feature) and the exploiters you are looking for is the same as the ones
>> who need overcommit and RLIMIT_AS support.
>>
>> On the memory overcommit front, please see PostgreSQL Server
>> Administrator's Guide at
>> http://www.network-theory.co.uk/docs/postgresql/vol3/LinuxMemoryOvercommit.html
>>
>> The guide discusses turning off memory overcommit so that the database is
>> never OOM killed, how do we provide these guarantees for a particular
>> control group? We can do it system wide, but ideally we want the control
>> point to be per control group.
>
> Heh. That suggestion is, at best, working around a kernel bug. The DB
> guys are just saying to do that because they're the biggest memory users
> and always seem to get OOM killed first.
>
> The base problem here is the OOM killer, not an application that truly
> uses memory overcommit restriction in an interesting way.
>
>> As far as other users are concerned, I've listed users of the memory limit
>> feature, in the original email I sent out. To try and understand your
>> viewpoint better, could you please tell me if
>>
>> 1. You are opposed to overcommit and RLIMIT_AS as features
>>
>> OR
>>
>> 2. Expanding them to control groups
>
> I think that too many of the users of (1) probably fall into the
> PostgreSQL category. They found that turning it on "fixed" their bugs,
> but it really just swept them under the rug.
>
> So, before we expand the use of those features to control groups by
> adding a bunch of new code, let's make sure that there will be users for
> it and that those users have no better way of doing it.
>
> -- Dave
>
>

* Re: [discuss] memrlimit - potential applications that can use
  2008-08-20 13:25 ` righi.andrea
@ 2008-08-20 16:38 ` Dave Hansen
  0 siblings, 0 replies; 14+ messages in thread
From: Dave Hansen @ 2008-08-20 16:38 UTC
To: righiandr
Cc: balbir, Paul Menage, Dave Hansen, Hugh Dickins, Andrew Morton,
	Marco Sbrighi, Linux Memory Management List,
	linux kernel mailing list

On Wed, 2008-08-20 at 15:25 +0200, righi.andrea@gmail.com wrote:
> Memory overcommit protection, instead, is a way to *prevent* OOM
> conditions (problem 1).

I completely disagree. :)

Think of all the work Eric Biederman did on pid namespaces. One of his
motivations was to keep /proc from being able to pin task structs. That
is one great example of a way a process can pin lots of memory without
mapping it, and overcommit has no effect on this!

Eric had a couple of other good examples, but I think task structs were
the biggest.

As I said to Balbir, there probably are some large-scale solutions to
this: things like beancounters.

-- Dave

end of thread

Thread overview: 14+ messages
2008-08-19  7:18 [discuss] memrlimit - potential applications that can use Balbir Singh
2008-08-19 15:58 ` Dave Hansen
2008-08-19 16:45   ` Balbir Singh
2008-08-19 17:41     ` Dave Hansen
2008-08-20  8:26       ` Balbir Singh
2008-08-20 16:29         ` Dave Hansen
2008-08-21  3:25           ` Balbir Singh
2008-08-21  7:43             ` KAMEZAWA Hiroyuki
2008-08-21 10:26               ` Balbir Singh
2008-08-21 10:59                 ` KAMEZAWA Hiroyuki
2008-08-21 11:13                   ` Balbir Singh
2008-08-21 15:18               ` righi.andrea
2008-08-20 13:25       ` righi.andrea
2008-08-20 16:38         ` Dave Hansen