From: Hyunhee Kim <hyunhee.kim@samsung.com>
To: Michal Hocko <mhocko@suse.cz>, Minchan Kim <minchan@kernel.org>
Cc: Anton Vorontsov <anton@enomsg.org>,
linux-mm@kvack.org, akpm@linux-foundation.org, rob@landley.net,
kamezawa.hiroyu@jp.fujitsu.com, hannes@cmpxchg.org,
rientjes@google.com, kirill@shutemov.name,
Kyungmin Park <kyungmin.park@samsung.com>
Subject: Re: [PATCH v6] memcg: event control at vmpressure.
Date: Fri, 21 Jun 2013 20:02:50 +0900
Message-ID: <CAOK=xRMZTTEqX7kAUkkFU+www6jwTQw8bvw6a0p-Jfd828gyCQ@mail.gmail.com>
In-Reply-To: <20130621091944.GC12424@dhcp22.suse.cz>
2013/6/21 Michal Hocko <mhocko@suse.cz>:
> On Fri 21-06-13 10:22:34, Minchan Kim wrote:
>> On Fri, Jun 21, 2013 at 09:24:38AM +0900, Hyunhee Kim wrote:
>> > In the original vmpressure, events are triggered whenever there is a reclaim
>> > activity. This becomes overheads to user space module and also increases
>>
>> Not true.
>> We have lots of filter to not trigger event even if reclaim is going on.
>> Your statement would make confuse.
>
> Where is the filter implemented? In the kernel? I do not see any
> throttling in the current mm tree.
Thanks for your comments.
As Minchan said, vmpressure_win filters out some of the event signals. But
when I tested, many events were still signaled if a task consumed a
lot of memory.
I'll change the expression "whenever there is a reclaim activity".
>
>> > power consumption if there is somebody to listen to it. This patch provides
>> > options to trigger events only when the pressure level changes.
>> > This trigger option can be set when registering each event by writing
>> > a trigger option, "edge" or "always", next to the string of levels.
>> > "edge" means that the event is triggered only when the pressure level is changed.
>> > "always" means that events are triggered whenever there is a reclaim process.
>> ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
>> Not true, either.
>
> Is this about vmpressure_win? But I agree that this could be more
> specific. Something like "`Always' trigger option will signal all events
> while `edge' option will trigger only events when the level changes."
I also agree. I'll improve it.
>
>> > To keep backward compatibility, "always" is set by default if nothing is input
>> > as an option. Each event can have different option. For example,
>> > "low" level uses "always" trigger option to see reclaim activity at user space
>> > while "medium"/"critical" uses "edge" to do an important job
>> > like killing tasks only once.
>>
>> Question.
>>
>> 1. user: set critical edge
>> 2. kernel: memory is tight and trigger event with critical
>> 3. user: kill a program when he receives a event
>> 4. kernel: memory is very tight again and want to trigger a event
>> with critical but fail because last_level was critical and it was edge.
>>
>> Right?
>
> yes, this is the risk of the edge triggering and the user has to be
> prepared for that. I still think that it makes some sense to have the
> two modes.
Right. The above scenario can happen.
And, as Michal said, I also think that the user should handle this
situation. Having two modes leaves the choice to the user: either
handle continuous events ("always") or handle the above situation ("edge").
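
To make the scenario concrete, here is a minimal user-space sketch of the
filter from the v6 hunk in vmpressure_event(). It is not kernel code:
struct ev, deliver(), the signalled counter, and the LVL_* names
(including the LVL_NONE sentinel) are stand-ins made up for illustration,
and the eventfd is replaced by a plain counter.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Stand-in for enum vmpressure_levels; LVL_NONE is a hypothetical sentinel. */
enum lvl { LVL_NONE = -1, LVL_LOW, LVL_MEDIUM, LVL_CRITICAL };

/* Stand-in for struct vmpressure_event, without the eventfd. */
struct ev {
	enum lvl level;      /* threshold the listener registered for */
	enum lvl last_level; /* level seen on the previous call */
	bool edge;           /* true: "edge", false: "always" */
	int signalled;       /* counts what eventfd_signal() would deliver */
};

/* Mirrors the body of the list_for_each_entry() loop in the patch. */
static bool deliver(struct ev *ev, enum lvl level)
{
	bool sent = false;

	assert(ev != NULL);

	/* Signal unless this is an edge listener seeing the same level again. */
	if (level >= ev->level &&
	    !(ev->edge && level == ev->last_level)) {
		ev->signalled++;
		sent = true;
	}

	/* Updated even below the threshold, so a drop re-arms the edge. */
	ev->last_level = level;
	return sent;
}
```

With an "edge" listener on critical, two consecutive deliver() calls at
LVL_CRITICAL produce one signal, which is the case Minchan describes. If a
lower level is reported in between, last_level changes and the next
critical event is signaled again.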
>
>> > @@ -823,7 +831,7 @@ Test:
>> > # cd /sys/fs/cgroup/memory/
>> > # mkdir foo
>> > # cd foo
>> > - # cgroup_event_listener memory.pressure_level low &
>> > + # cgroup_event_listener memory.pressure_level low edge &
>> > # echo 8000000 > memory.limit_in_bytes
>> > # echo 8000000 > memory.memsw.limit_in_bytes
>> > # echo $$ > tasks
>> > diff --git a/mm/vmpressure.c b/mm/vmpressure.c
>> > index 736a601..a08252e 100644
>> > --- a/mm/vmpressure.c
>> > +++ b/mm/vmpressure.c
>> > @@ -137,6 +137,8 @@ static enum vmpressure_levels vmpressure_calc_level(unsigned long scanned,
>> > struct vmpressure_event {
>> > struct eventfd_ctx *efd;
>> > enum vmpressure_levels level;
>> > + int last_level;
>>
>> int? but level is enum vmpressure_levels?
>
> good catch
Ok, I'll fix it.
>
>> > + bool edge_trigger;
>> > struct list_head node;
>> > };
>> >
>> > @@ -153,11 +155,14 @@ static bool vmpressure_event(struct vmpressure *vmpr,
>> >
>> > list_for_each_entry(ev, &vmpr->events, node) {
>> > if (level >= ev->level) {
>> > + if (ev->edge_trigger && level == ev->last_level)
>> > + continue;
>> > +
>> > eventfd_signal(ev->efd, 1);
>> > signalled = true;
>> > }
>> > + ev->last_level = level;
>> > }
>> > -
>>
>> Unnecessary change.
Ok.
>>
>> > mutex_unlock(&vmpr->events_lock);
>> >
>> > return signalled;
>> > @@ -290,9 +295,11 @@ void vmpressure_prio(gfp_t gfp, struct mem_cgroup *memcg, int prio)
>> > *
>> > * This function associates eventfd context with the vmpressure
>> > * infrastructure, so that the notifications will be delivered to the
>> > - * @eventfd. The @args parameter is a string that denotes pressure level
>> > + * @eventfd. The @args parameters are a string that denotes pressure level
>> > * threshold (one of vmpressure_str_levels, i.e. "low", "medium", or
>> > - * "critical").
>> > + * "critical") and a trigger option that decides whether events are triggered
>> > + * continuously or only on edge ("always" or "edge" if "edge", events
>> > + * are triggered when the pressure level changes.
>> > *
>> > * This function should not be used directly, just pass it to (struct
>> > * cftype).register_event, and then cgroup core will handle everything by
>> > @@ -303,22 +310,43 @@ int vmpressure_register_event(struct cgroup *cg, struct cftype *cft,
>> > {
>> > struct vmpressure *vmpr = cg_to_vmpressure(cg);
>> > struct vmpressure_event *ev;
>> > + char *strlevel, *strtrigger;
>> > int level;
>> > + bool trigger;
>>
>> What trigger?
>> Would be better to use "bool egde" instead?
>
> yes
Ok. edge is better.
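
For reference, a user-space sketch of the argument parsing with the
variable renamed to "edge". This only assumes the "<level>[ <trigger>]"
format from the patch; parse_args(), str_levels, and NUM_LEVELS are
hypothetical names for this sketch, not the kernel symbols.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <string.h>

/* Stand-in for vmpressure_str_levels. */
static const char * const str_levels[] = { "low", "medium", "critical" };
#define NUM_LEVELS ((int)(sizeof(str_levels) / sizeof(str_levels[0])))

/*
 * Parse "<level>[ <trigger>]" in place (args is modified).
 * A missing trigger means "always", which keeps the old behavior.
 * Returns 0 on success, -EINVAL on an unknown level or trigger.
 */
static int parse_args(char *args, int *level, bool *edge)
{
	char *strtrigger;
	int i;

	assert(args && level && edge);

	/* Split at the first space: level before it, trigger after it. */
	strtrigger = strchr(args, ' ');
	if (strtrigger)
		*strtrigger++ = '\0';

	for (i = 0; i < NUM_LEVELS; i++)
		if (!strcmp(str_levels[i], args))
			break;
	if (i >= NUM_LEVELS)
		return -EINVAL;

	if (!strtrigger || !strcmp(strtrigger, "always"))
		*edge = false;
	else if (!strcmp(strtrigger, "edge"))
		*edge = true;
	else
		return -EINVAL;

	*level = i;
	return 0;
}
```

Outputs are written only on success, so a caller can keep its defaults
when -EINVAL is returned.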
>
>> > +
>> > + strlevel = args;
>> > + strtrigger = strchr(args, ' ');
>> > +
>> > + if (strtrigger) {
>> > + *strtrigger = '\0';
>> > + strtrigger++;
>> > + }
>> >
>> > for (level = 0; level < VMPRESSURE_NUM_LEVELS; level++) {
>> > - if (!strcmp(vmpressure_str_levels[level], args))
>> > + if (!strcmp(vmpressure_str_levels[level], strlevel))
>> > break;
>> > }
>> >
>> > if (level >= VMPRESSURE_NUM_LEVELS)
>> > return -EINVAL;
>> >
>> > + if (strtrigger == NULL)
>> > + trigger = false;
>> > + else if (!strcmp(strtrigger, "always"))
>> > + trigger = false;
>> > + else if (!strcmp(strtrigger, "edge"))
>> > + trigger = true;
>> > + else
>> > + return -EINVAL;
>> > +
>> > ev = kzalloc(sizeof(*ev), GFP_KERNEL);
>> > if (!ev)
>> > return -ENOMEM;
>> >
>> > ev->efd = eventfd;
>> > ev->level = level;
>> > + ev->last_level = -1;
>>
>> VMPRESSURE_NONE is better?
>
> Yes
Ok, I'll add VMPRESSURE_NONE. It will be better.
Thanks for the review.
Hyunhee Kim.
> --
> Michal Hocko
> SUSE Labs
--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org. For more info on Linux MM,
see: http://www.linux-mm.org/ .