* [discuss][memcg] oom-kill extension
@ 2008-10-29 2:38 KAMEZAWA Hiroyuki
2008-10-29 4:08 ` Balbir Singh
2008-10-29 5:35 ` Paul Menage
0 siblings, 2 replies; 9+ messages in thread
From: KAMEZAWA Hiroyuki @ 2008-10-29 2:38 UTC (permalink / raw)
To: linux-mm; +Cc: balbir, menage
Under the memory resource controller (memcg), the oom-killer can be invoked when
a cgroup reaches its limit and no memory can be reclaimed.
In general, outside memcg, oom-kill (or panic) is the only chance to recover
the system because there is no available memory. But when OOM occurs under
memcg, the group has merely reached its limit, so it seems we can do something else.
Does anyone have a plan to enhance oom-kill?
What I can think of now is
- add a notifier to user-land.
- the receiver of the notification should run in another cgroup.
- automatically extend the limit in an emergency
- trigger a fail-over process.
- automatically create a precise report of the OOM.
- record a snapshot of 'ps -elf' and so on for the memcg which triggers the OOM.
- freeze processes under the cgroup.
- maybe the freezer cgroup should be mounted at the same time.
- can we add a memcg-oom-freezing-point somewhere we can sleep?
Is there a chance to add an oom_notifier to memcg? (netlink?)
But the real problem is that what we can do in the kernel is limited
and we need a proper userland anyway ;)
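To make the "precise report" item concrete, here is a minimal userland sketch, assuming the receiver is told which cgroup hit OOM and can read that group's `tasks` file. The function name `snapshot_cgroup`, the `proc_root` parameter and the use of `/proc/<pid>/comm` are illustrative assumptions, not part of any proposed interface.

```python
import os


def snapshot_cgroup(tasks_path, proc_root="/proc"):
    """Return [(pid, command_name)] for every PID listed in a cgroup 'tasks' file.

    Hypothetical helper: a real report would collect far more than the
    command name (something closer to 'ps -elf').
    """
    with open(tasks_path) as f:
        pids = [int(line) for line in f if line.strip()]

    report = []
    for pid in pids:
        try:
            with open(os.path.join(proc_root, str(pid), "comm")) as f:
                name = f.read().strip()
        except OSError:
            # The task may have exited between reading 'tasks' and reading /proc.
            name = "?"
        report.append((pid, name))
    return report
```

The receiver would have to run outside the OOMing cgroup, as suggested above, so that taking the snapshot does not itself need memory from the exhausted group.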
Thanks,
-Kame
--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org. For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>
* Re: [discuss][memcg] oom-kill extension
2008-10-29 2:38 [discuss][memcg] oom-kill extension KAMEZAWA Hiroyuki
@ 2008-10-29 4:08 ` Balbir Singh
2008-10-29 5:00 ` KAMEZAWA Hiroyuki
2008-10-29 5:35 ` Paul Menage
1 sibling, 1 reply; 9+ messages in thread
From: Balbir Singh @ 2008-10-29 4:08 UTC (permalink / raw)
To: KAMEZAWA Hiroyuki; +Cc: linux-mm, menage
KAMEZAWA Hiroyuki wrote:
> Under memory resource controller(memcg), oom-killer can be invoked when it
> reaches limit and no memory can be reclaimed.
>
> In general, not under memcg, oom-kill(or panic) is an only chance to recover
> the system because there is no available memory. But when oom occurs under
> memcg, it just reaches limit and it seems we can do something else.
>
> Does anyone have plan to enhance oom-kill ?
>
> What I can think of now is
> - add an notifier to user-land.
> - receiver of notify should work in another cgroup.
The discussion at the mini-summit was to notify a FIFO in the cgroup and any
application can listen in for events.
> - automatically extend the limit as emergency
No.. I don't like this
> - trigger fail-over process.
I had suggested memrlimits for the ability to fail application allocations, but
no-one liked the idea. We can still implement overcommit functionality if needed
and catch failures at allocation time.
> - automatically create a precise report of OOM.
> - record snapshot of 'ps -elf' and so on of memcg which triggers oom.
>
> - freeze processes under cgroup.
> - maybe freezer cgroup should be mounted at the same time.
> - can we add memcg-oom-freezing-point in somewhere we can sleep ?
>
> Is there a chance to add oom_notifier to memcg ? (netlink ?)
>
Yes, we should add the oom-notifier. We already have cgroupstats if you want to
make use of it.
> But the real problem is that what we can do in the kernel is limited
> and we need proper userland, anyway ;)
>
Agreed.
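The FIFO idea from the mini-summit could look roughly like this from the listener's side. This is only a sketch under assumptions: the FIFO path, the one-line-per-event text format and the `listen` helper are all hypothetical, since no such interface has been defined.

```python
import os


def listen(fifo_path, on_event, max_events=None):
    """Read newline-terminated events from a FIFO and pass each one to on_event.

    Hypothetical event format: one text line per event.
    Returns the number of events delivered.
    """
    seen = 0
    with open(fifo_path) as fifo:  # blocks until some writer opens the FIFO
        for line in fifo:
            on_event(line.rstrip("\n"))
            seen += 1
            if max_events is not None and seen >= max_events:
                break
    return seen
```

Any application with read access to the FIFO could run such a loop, which is the attraction compared with a netlink socket that needs a dedicated protocol.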
--
Balbir
* Re: [discuss][memcg] oom-kill extension
2008-10-29 4:08 ` Balbir Singh
@ 2008-10-29 5:00 ` KAMEZAWA Hiroyuki
2008-10-29 5:13 ` David Rientjes
0 siblings, 1 reply; 9+ messages in thread
From: KAMEZAWA Hiroyuki @ 2008-10-29 5:00 UTC (permalink / raw)
To: balbir; +Cc: linux-mm, menage
On Wed, 29 Oct 2008 09:38:20 +0530
Balbir Singh <balbir@linux.vnet.ibm.com> wrote:
> KAMEZAWA Hiroyuki wrote:
> > Under memory resource controller(memcg), oom-killer can be invoked when it
> > reaches limit and no memory can be reclaimed.
> >
> > In general, not under memcg, oom-kill(or panic) is an only chance to recover
> > the system because there is no available memory. But when oom occurs under
> > memcg, it just reaches limit and it seems we can do something else.
> >
> > Does anyone have plan to enhance oom-kill ?
> >
> > What I can think of now is
> > - add an notifier to user-land.
> > - receiver of notify should work in another cgroup.
>
> The discussion at the mini-summit was to notify a FIFO in the cgroup and any
> application can listen in for events.
>
So, add a FIFO rather than netlink or a user-mode helper?
> > - automatically extend the limit as emergency
>
> No.. I don't like this
>
Oh, I should have written
"automatically extend the limit in an emergency via a userland daemon
which receives the notification"
> > - trigger fail-over process.
>
> I had suggested memrlimits for the ability to fail application allocations, but
> no-one liked the idea. We can still implement overcommit functionality if needed
> and catch failures at allocation time.
>
The difficult point of memrlimit is that a system engineer cannot guarantee that
"your application will run a proper fail-over process when malloc() returns NULL".
Important applications have an emergency fail-over method triggered by a signal
(SIGTERM or similar), provided they are not killed by SIGKILL.
I wonder whether adding a "moderate oom kill mode" to memcg, which sends SIGTERM
rather than SIGKILL, might help many(?) applications.
(But to do the fail-over, the apps may use more memory....)
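On the application side, the signal-driven fail-over mentioned above usually amounts to a small SIGTERM handler. A minimal sketch, assuming the hypothetical "moderate" mode delivers SIGTERM first; the `FailOver` class and helper names are illustrative only:

```python
import os
import signal
import time


class FailOver:
    """Records that a fail-over was requested by SIGTERM."""

    def __init__(self):
        self.requested = False

    def handle(self, signum, frame):
        # Keep the handler tiny: only record the request, so that it needs
        # as little extra memory as possible.
        self.requested = True


def install(failover):
    """Install the SIGTERM handler and return the previous one."""
    return signal.signal(signal.SIGTERM, failover.handle)


def wait_for_request(failover, timeout_s=1.0):
    """Poll until a fail-over was requested or the timeout expires."""
    deadline = time.monotonic() + timeout_s
    while not failover.requested and time.monotonic() < deadline:
        time.sleep(0.01)
    return failover.requested
```

The main loop would check `requested` and hand its work to a standby instance. SIGKILL cannot be caught, so this path only helps if the kernel sends a catchable signal first.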
> > - automatically create a precise report of OOM.
> > - record snapshot of 'ps -elf' and so on of memcg which triggers oom.
> >
> > - freeze processes under cgroup.
> > - maybe freezer cgroup should be mounted at the same time.
> > - can we add memcg-oom-freezing-point in somewhere we can sleep ?
> >
> > Is there a chance to add oom_notifier to memcg ? (netlink ?)
> >
>
> Yes, we should add the oom-notifier. We already have cgroupstats if you want to
> make use of it.
>
OK, I'll look into that.
> > But the real problem is that what we can do in the kernel is limited
> > and we need proper userland, anyway ;)
> >
>
> Agreed.
>
Thanks,
-Kame
* Re: [discuss][memcg] oom-kill extension
2008-10-29 5:00 ` KAMEZAWA Hiroyuki
@ 2008-10-29 5:13 ` David Rientjes
2008-10-29 5:28 ` KAMEZAWA Hiroyuki
2008-10-29 6:55 ` KOSAKI Motohiro
0 siblings, 2 replies; 9+ messages in thread
From: David Rientjes @ 2008-10-29 5:13 UTC (permalink / raw)
To: KAMEZAWA Hiroyuki
Cc: balbir, linux-mm, menage, KOSAKI Motohiro, Marcelo Tosatti
On Wed, 29 Oct 2008, KAMEZAWA Hiroyuki wrote:
> > > Does anyone have plan to enhance oom-kill ?
> > >
> > > What I can think of now is
> > > - add an notifier to user-land.
> > > - receiver of notify should work in another cgroup.
> >
> > The discussion at the mini-summit was to notify a FIFO in the cgroup and any
> > application can listen in for events.
> >
> add FIFO rather than netlink or user mode helper ?
>
There was a patchset from February that added /dev/mem_notify to warn
userspace of low or out of memory conditions:
http://marc.info/?l=linux-kernel&m=120257050719077
http://marc.info/?l=linux-kernel&m=120257050719087
http://marc.info/?l=linux-kernel&m=120257062719234
http://marc.info/?l=linux-kernel&m=120257071219327
http://marc.info/?l=linux-kernel&m=120257071319334
http://marc.info/?l=linux-kernel&m=120257080919488
http://marc.info/?l=linux-kernel&m=120257081019497
http://marc.info/?l=linux-kernel&m=120257096219705
http://marc.info/?l=linux-kernel&m=120257096319717
Perhaps this idea can simply be reworked for the memory controller or
standalone cgroup?
* Re: [discuss][memcg] oom-kill extension
2008-10-29 5:13 ` David Rientjes
@ 2008-10-29 5:28 ` KAMEZAWA Hiroyuki
2008-10-29 6:55 ` KOSAKI Motohiro
1 sibling, 0 replies; 9+ messages in thread
From: KAMEZAWA Hiroyuki @ 2008-10-29 5:28 UTC (permalink / raw)
To: David Rientjes; +Cc: balbir, linux-mm, menage, KOSAKI Motohiro, Marcelo Tosatti
On Tue, 28 Oct 2008 22:13:03 -0700 (PDT)
David Rientjes <rientjes@google.com> wrote:
>
> There was a patchset from February that added /dev/mem_notify to warn
> userspace of low or out of memory conditions:
>
> http://marc.info/?l=linux-kernel&m=120257050719077
> http://marc.info/?l=linux-kernel&m=120257050719087
> http://marc.info/?l=linux-kernel&m=120257062719234
> http://marc.info/?l=linux-kernel&m=120257071219327
> http://marc.info/?l=linux-kernel&m=120257071319334
> http://marc.info/?l=linux-kernel&m=120257080919488
> http://marc.info/?l=linux-kernel&m=120257081019497
> http://marc.info/?l=linux-kernel&m=120257096219705
> http://marc.info/?l=linux-kernel&m=120257096319717
>
> Perhaps this idea can simply be reworked for the memory controller or
> standalone cgroup?
>
I know that work and like it. The concept of mem_notify is to notify userspace of a
memory shortage by watching page reclamation.
But the situation/usage/purpose is a bit different from the oom-killer.
(oom-kill is the final stage of recovering memory...)
To implement mem_notify in memcg's context, my idea is
- support the following.
=> account swap (now in progress)
=> show usage of swap
=> a "reduce memory usage" interface (to decrease noise from file cache usage)
On ordinary systems, we watch the amount of swap.
On swapless systems, we watch the amount of anonymous/locked memory under memcg.
Or we "measure how much time it takes to reduce memory usage to some level".
Maybe it would be interesting to add a multi-purpose notifier to memcg.
For example,
- triggered when anonymous memory is over 95% of the limit
- triggered when swap occurs.
(But this can be done by a user-land daemon... Hmm?)
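The user-land variant of those two triggers can be sketched as a polling check over per-cgroup statistics. This is a sketch under assumptions: it takes "key value" text of the kind memory.stat provides, and the field names `rss` (anonymous memory) and `swap`, like the function names, are assumptions made for illustration.

```python
def parse_stat(text):
    """Parse 'key value' lines into a dict of ints, skipping malformed lines."""
    stats = {}
    for line in text.splitlines():
        parts = line.split()
        if len(parts) == 2:
            stats[parts[0]] = int(parts[1])
    return stats


def triggers(prev, cur, limit_bytes, anon_ratio=0.95):
    """Return the list of events raised between two samples.

    prev, cur   -- dicts produced by parse_stat()
    limit_bytes -- the cgroup's memory limit
    anon_ratio  -- fraction of the limit that anonymous memory may reach
    """
    events = []
    if cur.get("rss", 0) > anon_ratio * limit_bytes:
        events.append("anon_over_threshold")
    if cur.get("swap", 0) > prev.get("swap", 0):
        events.append("swap_grew")
    return events
```

A daemon doing this has to poll, so it can miss short spikes between samples; an in-kernel notifier would not have that gap.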
Thanks,
-Kame
* Re: [discuss][memcg] oom-kill extension
2008-10-29 2:38 [discuss][memcg] oom-kill extension KAMEZAWA Hiroyuki
2008-10-29 4:08 ` Balbir Singh
@ 2008-10-29 5:35 ` Paul Menage
2008-10-29 5:45 ` KAMEZAWA Hiroyuki
1 sibling, 1 reply; 9+ messages in thread
From: Paul Menage @ 2008-10-29 5:35 UTC (permalink / raw)
To: KAMEZAWA Hiroyuki; +Cc: linux-mm, balbir
On Tue, Oct 28, 2008 at 7:38 PM, KAMEZAWA Hiroyuki
<kamezawa.hiroyu@jp.fujitsu.com> wrote:
> Under memory resource controller(memcg), oom-killer can be invoked when it
> reaches limit and no memory can be reclaimed.
>
> In general, not under memcg, oom-kill(or panic) is an only chance to recover
> the system because there is no available memory. But when oom occurs under
> memcg, it just reaches limit and it seems we can do something else.
>
> Does anyone have plan to enhance oom-kill ?
We have an in-house implementation of a per-cgroup OOM handler that
we've just ported from cpusets to cgroups. We were considering sending
the patch in as a starting point for discussions - it's a bit of a
kludge as it is.
It's a standalone subsystem that can work with either the memory
cgroup or with cpusets (where memory is constrained by numa nodes).
The features are:
- an oom.delay file that controls how long a thread will pause in the
OOM killer waiting for a response from userspace (in milliseconds)
- an oom.await file that a userspace handler can write a timeout value
to, and be awoken either when a process in that cgroup enters the OOM
killer, or the timeout expires.
If a userspace thread catches and handles the OOM, the OOMing thread
doesn't trigger a kill, but returns to alloc_pages to try again;
alternatively userspace can cause the OOM killer to go ahead as
normal.
We've found it works pretty successfully as a last-ditch notification
to a daemon waiting in a system cgroup which can then expand the
memory limits of the failing cgroup if necessary (potentially killing
off processes from some other cgroup first if necessary to free up
more memory).
I'll try to get someone to send in the patch.
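Until the patch is posted, the handshake described above can be modelled in a few lines. This is a toy userspace simulation, not the in-house code: the `OomGate` class, its method names and the use of seconds instead of milliseconds are all assumptions made for illustration.

```python
import threading


class OomGate:
    """Toy model of the oom.delay / oom.await handshake (hypothetical)."""

    def __init__(self, delay_s):
        self.delay_s = delay_s               # plays the role of oom.delay
        self._oom = threading.Event()        # set when a task enters the OOM path
        self._handled = threading.Event()    # set when the handler reports success

    def await_oom(self, timeout_s):
        """Handler side (oom.await): block until an OOM occurs or the timeout expires."""
        return self._oom.wait(timeout_s)

    def mark_handled(self):
        """Handler side: report that memory was freed or the limit was raised."""
        self._handled.set()

    def enter_oom(self):
        """OOMing task side: wake the handler, then pause for up to delay_s."""
        self._handled.clear()
        self._oom.set()
        handled = self._handled.wait(self.delay_s)
        self._oom.clear()
        # "retry" stands for going back to the allocator, "kill" for a normal OOM kill.
        return "retry" if handled else "kill"
```

If no handler answers within the delay, the model falls through to "kill", which corresponds to the OOM killer going ahead as normal.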
Paul
* Re: [discuss][memcg] oom-kill extension
2008-10-29 5:35 ` Paul Menage
@ 2008-10-29 5:45 ` KAMEZAWA Hiroyuki
2008-10-29 5:49 ` Paul Menage
0 siblings, 1 reply; 9+ messages in thread
From: KAMEZAWA Hiroyuki @ 2008-10-29 5:45 UTC (permalink / raw)
To: Paul Menage; +Cc: linux-mm, balbir
On Tue, 28 Oct 2008 22:35:21 -0700
"Paul Menage" <menage@google.com> wrote:
> On Tue, Oct 28, 2008 at 7:38 PM, KAMEZAWA Hiroyuki
> <kamezawa.hiroyu@jp.fujitsu.com> wrote:
> > Under memory resource controller(memcg), oom-killer can be invoked when it
> > reaches limit and no memory can be reclaimed.
> >
> > In general, not under memcg, oom-kill(or panic) is an only chance to recover
> > the system because there is no available memory. But when oom occurs under
> > memcg, it just reaches limit and it seems we can do something else.
> >
> > Does anyone have plan to enhance oom-kill ?
>
> We have an in-house implementation of a per-cgroup OOM handler that
> we've just ported from cpusets to cgroups. We were considering sending
> the patch in as a starting point for discussions - it's a bit of a
> kludge as it is.
>
Sounds interesting. (But I won't ask for details now.)
> It's a standalone subsystem that can work with either the memory
> cgroup or with cpusets (where memory is constrained by numa nodes).
> The features are:
>
> - an oom.delay file that controls how long a thread will pause in the
> OOM killer waiting for a response from userspace (in milliseconds)
>
> - an oom.await file that a userspace handler can write a timeout value
> to, and be awoken either when a process in that cgroup enters the OOM
> killer, or the timeout expires.
>
> If a userspace thread catches and handles the OOM, the OOMing thread
> doesn't trigger a kill, but returns to alloc_pages to try again;
> alternatively userspace can cause the OOM killer to go ahead as
> normal.
>
Can userland find out which is the "bad process" in the group?
> We've found it works pretty successfully as a last-ditch notification
> to a daemon waiting in a system cgroup which can then expand the
> memory limits of the failing cgroup if necessary (potentially killing
> off processes from some other cgroup first if necessary to free up
> more memory).
>
This is good news :)
> I'll try to get someone to send in the patch.
>
O.K., I'm looking forward to seeing that.
Regards,
-Kame
* Re: [discuss][memcg] oom-kill extension
2008-10-29 5:45 ` KAMEZAWA Hiroyuki
@ 2008-10-29 5:49 ` Paul Menage
0 siblings, 0 replies; 9+ messages in thread
From: Paul Menage @ 2008-10-29 5:49 UTC (permalink / raw)
To: KAMEZAWA Hiroyuki; +Cc: linux-mm, balbir
On Tue, Oct 28, 2008 at 10:45 PM, KAMEZAWA Hiroyuki
<kamezawa.hiroyu@jp.fujitsu.com> wrote:
> Can userland find out which is the "bad process" in the group?
Not in our current implementation - that's something that might be
good to add if we were doing a proper API for inclusion in mainline.
Paul
* Re: [discuss][memcg] oom-kill extension
2008-10-29 5:13 ` David Rientjes
2008-10-29 5:28 ` KAMEZAWA Hiroyuki
@ 2008-10-29 6:55 ` KOSAKI Motohiro
1 sibling, 0 replies; 9+ messages in thread
From: KOSAKI Motohiro @ 2008-10-29 6:55 UTC (permalink / raw)
To: David Rientjes
Cc: KAMEZAWA Hiroyuki, balbir, linux-mm, menage, Marcelo Tosatti
> There was a patchset from February that added /dev/mem_notify to warn
> userspace of low or out of memory conditions:
>
> http://marc.info/?l=linux-kernel&m=120257050719077
> http://marc.info/?l=linux-kernel&m=120257050719087
> http://marc.info/?l=linux-kernel&m=120257062719234
> http://marc.info/?l=linux-kernel&m=120257071219327
> http://marc.info/?l=linux-kernel&m=120257071319334
> http://marc.info/?l=linux-kernel&m=120257080919488
> http://marc.info/?l=linux-kernel&m=120257081019497
> http://marc.info/?l=linux-kernel&m=120257096219705
> http://marc.info/?l=linux-kernel&m=120257096319717
>
> Perhaps this idea can simply be reworked for the memory controller or
> standalone cgroup?
Very sorry.
I know my laziness is to blame.
For a while I have given the split-lru effort priority over everything else.
So I'll restart the user-land notify effort soon.
Paul, I am strongly interested in your implementation.
Could you post your notify patch?