From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from mail-qk0-f197.google.com (mail-qk0-f197.google.com [209.85.220.197]) by kanga.kvack.org (Postfix) with ESMTP id 4FD546B0038 for ; Wed, 13 Sep 2017 00:57:35 -0400 (EDT) Received: by mail-qk0-f197.google.com with SMTP id t184so19548673qke.0 for ; Tue, 12 Sep 2017 21:57:35 -0700 (PDT) Received: from sonic301-29.consmr.mail.bf2.yahoo.com (sonic301-29.consmr.mail.bf2.yahoo.com. [74.6.129.228]) by mx.google.com with ESMTPS id s29si14304046qtk.103.2017.09.12.21.57.33 for (version=TLS1_2 cipher=ECDHE-RSA-AES128-GCM-SHA256 bits=128/128); Tue, 12 Sep 2017 21:57:34 -0700 (PDT) Date: Wed, 13 Sep 2017 04:51:26 +0000 (UTC) From: PINTU KUMAR Message-ID: <1969140653.911396.1505278286673@mail.yahoo.com> In-Reply-To: <80c9060f-bf80-51fb-39c0-b36f273c0c9c@yandex-team.ru> References: <149570810989.203600.9492483715840752937.stgit@buzz> <20170605085011.GJ9248@dhcp22.suse.cz> <80c9060f-bf80-51fb-39c0-b36f273c0c9c@yandex-team.ru> Subject: Re: [PATCH v2] mm/oom_kill: count global and memory cgroup oom kills MIME-Version: 1.0 Content-Type: multipart/alternative; boundary="----=_Part_911395_232869749.1505278286671" Sender: owner-linux-mm@kvack.org List-ID: To: Michal Hocko , Konstantin Khlebnikov Cc: linux-mm@kvack.org, linux-kernel@vger.kernel.org, Tetsuo Handa , Andrew Morton , Roman Guschin , David Rientjes , Greg Kroah-Hartman ------=_Part_911395_232869749.1505278286671 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: quoted-printable Hi, I have submitted a similar patch 2 years ago (Oct/2015). But at that time the patch was rejected. Here is the history: https://lkml.org/lkml/2015/10/1/372 Now I see the similar patch got accepted. At least the initial idea and the= objective were same.=C2=A0 Even I were not included here. On one side I feel happy that my initial idea got accepted now.But on the = other side it really hurts :( Thanks,Pintu =20 On Monday 5 June 2017, 7:57:57 PM IST, Konstantin Khlebnikov wrote:=20 On 05.06.2017 11:50, Michal Hocko wrote: > On Thu 25-05-17 13:28:30, Konstantin Khlebnikov wrote: >> Show count of oom killer invocations in /proc/vmstat and count of >> processes killed in memory cgroup in knob "memory.events" >> (in memory.oom_control for v1 cgroup). >> >> Also describe difference between "oom" and "oom_kill" in memory >> cgroup documentation. Currently oom in memory cgroup kills tasks >> iff shortage has happened inside page fault. >> >> These counters helps in monitoring oom kills - for now >> the only way is grepping for magic words in kernel log. >=20 > Yes this is less than optimal and the counter sounds like a good step > forward. I have 2 comments to the patch though. >=20 > [...] >=20 >> diff --git a/include/linux/memcontrol.h b/include/linux/memcontrol.h >> index 899949bbb2f9..42296f7001da 100644 >> --- a/include/linux/memcontrol.h >> +++ b/include/linux/memcontrol.h >> @@ -556,8 +556,11 @@ static inline void mem_cgroup_count_vm_event(struct= mm_struct *mm, >>=C2=A0=20 >>=C2=A0 =C2=A0=C2=A0=C2=A0 rcu_read_lock(); >>=C2=A0 =C2=A0=C2=A0=C2=A0 memcg =3D mem_cgroup_from_task(rcu_dereference(= mm->owner)); >> -=C2=A0=C2=A0=C2=A0 if (likely(memcg)) >> +=C2=A0=C2=A0=C2=A0 if (likely(memcg)) { >>=C2=A0 =C2=A0=C2=A0=C2=A0 =C2=A0=C2=A0=C2=A0 this_cpu_inc(memcg->stat->ev= ents[idx]); >> +=C2=A0=C2=A0=C2=A0 =C2=A0=C2=A0=C2=A0 if (idx =3D=3D OOM_KILL) >> +=C2=A0=C2=A0=C2=A0 =C2=A0=C2=A0=C2=A0 =C2=A0=C2=A0=C2=A0 cgroup_file_no= tify(&memcg->events_file); >> +=C2=A0=C2=A0=C2=A0 } >>=C2=A0 =C2=A0=C2=A0=C2=A0 rcu_read_unlock(); >=20 > Well, this is ugly. I see how you want to share the global counter and > the memcg event which needs the notification. But I cannot say this > would be really easy to follow. Can we have at least a comment in > memcg_event_item enum definition? Yep, this is a little bit ugly. But this funciton is static-inline and idx always constant so resulting cod= e is fine. >=20 >> diff --git a/mm/oom_kill.c b/mm/oom_kill.c >> index 04c9143a8625..dd30a045ef5b 100644 >> --- a/mm/oom_kill.c >> +++ b/mm/oom_kill.c >> @@ -876,6 +876,11 @@ static void oom_kill_process(struct oom_control *oc= , const char *message) >>=C2=A0 =C2=A0=C2=A0=C2=A0 /* Get a reference to safely compare mm after t= ask_unlock(victim) */ >>=C2=A0 =C2=A0=C2=A0=C2=A0 mm =3D victim->mm; >>=C2=A0 =C2=A0=C2=A0=C2=A0 mmgrab(mm); >> + >> +=C2=A0=C2=A0=C2=A0 /* Raise event before sending signal: reaper must se= e this */ >> +=C2=A0=C2=A0=C2=A0 count_vm_event(OOM_KILL); >> +=C2=A0=C2=A0=C2=A0 mem_cgroup_count_vm_event(mm, OOM_KILL); >> + >>=C2=A0 =C2=A0=C2=A0=C2=A0 /* >>=C2=A0 =C2=A0=C2=A0=C2=A0 * We should send SIGKILL before setting TIF_MEM= DIE in order to prevent >>=C2=A0 =C2=A0=C2=A0=C2=A0 * the OOM victim from depleting the memory rese= rves from the user >=20 > Why don't you count tasks which share mm with the oom victim? Yes, this makes sense. But these kills are not logged thus counter will dif= fers from logged events. Also these tasks might live in different cgroups, so counting to mm owner i= sn't correct. > diff --git a/mm/oom_kill.c b/mm/oom_kill.c > index 0e2c925e7826..9a95947a60ba 100644 > --- a/mm/oom_kill.c > +++ b/mm/oom_kill.c > @@ -924,6 +924,8 @@ static void oom_kill_process(struct oom_control *oc, = const char *message) >=C2=A0 =C2=A0=C2=A0=C2=A0 =C2=A0=C2=A0=C2=A0 */ >=C2=A0 =C2=A0=C2=A0=C2=A0 =C2=A0=C2=A0=C2=A0 if (unlikely(p->flags & PF_KT= HREAD)) >=C2=A0 =C2=A0=C2=A0=C2=A0 =C2=A0=C2=A0=C2=A0 =C2=A0=C2=A0=C2=A0 continue; > +=C2=A0=C2=A0=C2=A0 =C2=A0=C2=A0=C2=A0 count_vm_event(OOM_KILL); > +=C2=A0=C2=A0=C2=A0 =C2=A0=C2=A0=C2=A0 count_memcg_event_mm(mm, OOM_KILL)= ; >=C2=A0 =C2=A0=C2=A0=C2=A0 =C2=A0=C2=A0=C2=A0 do_send_sig_info(SIGKILL, SEN= D_SIG_FORCED, p, true); >=C2=A0 =C2=A0=C2=A0=C2=A0 } >=C2=A0 =C2=A0=C2=A0=C2=A0 rcu_read_unlock(); >=20 > Other than that looks good to me. > Acked-by: Michal Hocko >=20 -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@kvack.org.=C2=A0 For more info on Linux MM, see: http://www.linux-mm.org/ . Don't email: email@kvack.org ------=_Part_911395_232869749.1505278286671 Content-Type: text/html; charset=UTF-8 Content-Transfer-Encoding: quoted-printable


Hi,

I have submitted a similar patch = 2 years ago (Oct/2015).
But at that time the patch was rejected.
Here= is the history:
https://lkml.org/lkml/2015/10/1/372

Now I see th= e similar patch got accepted. At least the initial idea and the objective w= ere same. 
Even I were not included here.
On one side I fe= el happy that my initial idea got accepted now.
But on the other = side it really hurts :(


Thanks,
Pintu


On Monday 5 June 2017, 7:57:57 PM IS= T, Konstantin Khlebnikov <khlebnikov@yandex-team.ru> wrote:

<= br>On 05.06.2017 11:50, Michal Hocko wrote:
> On Thu 25-05-17 13:28:3= 0, Konstantin Khlebnikov wrote:
>> Show count of oom killer invoca= tions in /proc/vmstat and count of
>> processes killed in memory c= group in knob "memory.events"
>> (in memory.oom_control for v1 cgr= oup).
>>
>> Also describe difference between "oom" and "o= om_kill" in memory
>> cgroup documentation. Currently oom in memor= y cgroup kills tasks
>> iff shortage has happened inside page faul= t.
>>
>> These counters helps in monitoring oom kills - f= or now
>> the only way is grepping for magic words in kernel log.<= br>>
> Yes this is less than optimal and the counter sounds like = a good step
> forward. I have 2 comments to the patch though.
>=
> [...]
>
>> diff --git a/include/linux/memcontrol.= h b/include/linux/memcontrol.h
>> index 899949bbb2f9..42296f7001da= 100644
>> --- a/include/linux/memcontrol.h
>> +++ b/incl= ude/linux/memcontrol.h
>> @@ -556,8 +556,11 @@ static inline void = mem_cgroup_count_vm_event(struct mm_struct *mm,
>> 
>&= gt;      rcu_read_lock();
>>    = ;  memcg =3D mem_cgroup_from_task(rcu_dereference(mm->owner));
&= gt;> -    if (likely(memcg))
>> +  &nb= sp; if (likely(memcg)) {
>>        &= nbsp; this_cpu_inc(memcg->stat->events[idx]);
>> + &nbs= p;      if (idx =3D=3D OOM_KILL)
>> + &nb= sp;          cgroup_file_notify(&mem= cg->events_file);
>> +    }
>>  &n= bsp;   rcu_read_unlock();
>
> Well, this is ugly. I = see how you want to share the global counter and
> the memcg event wh= ich needs the notification. But I cannot say this
> would be really e= asy to follow. Can we have at least a comment in
> memcg_event_item e= num definition?

Yep, this is a little bit ugly.
But this funciton= is static-inline and idx always constant so resulting code is fine.
>
>> diff --git a/mm/oom_kill.c b/mm/oom_kill.c
>> i= ndex 04c9143a8625..dd30a045ef5b 100644
>> --- a/mm/oom_kill.c
&= gt;> +++ b/mm/oom_kill.c
>> @@ -876,6 +876,11 @@ static void oo= m_kill_process(struct oom_control *oc, const char *message)
>>&nbs= p;     /* Get a reference to safely compare mm after task_un= lock(victim) */
>>      mm =3D victim->mm;<= br>>>      mmgrab(mm);
>> +
>> = +    /* Raise event before sending signal: reaper must see t= his */
>> +    count_vm_event(OOM_KILL);
>>= ; +    mem_cgroup_count_vm_event(mm, OOM_KILL);
>> = +
>>      /*
>>    &nb= sp; * We should send SIGKILL before setting TIF_MEMDIE in order to prevent<= br>>>      * the OOM victim from depleting the me= mory reserves from the user
>
> Why don't you count tasks whic= h share mm with the oom victim?

Yes, this makes sense. But these kil= ls are not logged thus counter will differs from logged events.
Also the= se tasks might live in different cgroups, so counting to mm owner isn't cor= rect.


> diff --git a/mm/oom_kill.c b/mm/oom_kill.c
> in= dex 0e2c925e7826..9a95947a60ba 100644
> --- a/mm/oom_kill.c
> += ++ b/mm/oom_kill.c
> @@ -924,6 +924,8 @@ static void oom_kill_process= (struct oom_control *oc, const char *message)
>    &nb= sp;     */
>        &nbs= p; if (unlikely(p->flags & PF_KTHREAD))
>    &n= bsp;         continue;
> +  &= nbsp;     count_vm_event(OOM_KILL);
> +  &nb= sp;     count_memcg_event_mm(mm, OOM_KILL);
>  &n= bsp;       do_send_sig_info(SIGKILL, SEND_SIG_FORC= ED, p, true);
>      }
>    =   rcu_read_unlock();
>
> Other than that looks good to me= .
> Acked-by: Michal Hocko <mhocko@suse.com>
>

--=
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the bo= dy to majordomo@kvack.org.  For more info on Linux MM,
see: http://= www.linux-mm.org/ .
Don't email: <a href=3Dmailto:"dont@kvack.org">= ; email@kvack.org </a>

------=_Part_911395_232869749.1505278286671-- -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@kvack.org. For more info on Linux MM, see: http://www.linux-mm.org/ . Don't email: email@kvack.org