From: Michal Hocko <mhocko@suse.cz>
To: Minchan Kim <minchan@kernel.org>
Cc: Hyunhee Kim <hyunhee.kim@samsung.com>,
'Anton Vorontsov' <anton@enomsg.org>,
linux-mm@kvack.org, akpm@linux-foundation.org, rob@landley.net,
kamezawa.hiroyu@jp.fujitsu.com, hannes@cmpxchg.org,
rientjes@google.com, kirill@shutemov.name,
'Kyungmin Park' <kyungmin.park@samsung.com>
Subject: Re: [PATCH v6] memcg: event control at vmpressure.
Date: Fri, 21 Jun 2013 11:19:44 +0200
Message-ID: <20130621091944.GC12424@dhcp22.suse.cz>
In-Reply-To: <20130621012234.GF11659@bbox>
On Fri 21-06-13 10:22:34, Minchan Kim wrote:
> On Fri, Jun 21, 2013 at 09:24:38AM +0900, Hyunhee Kim wrote:
> > In the original vmpressure, events are triggered whenever there is a reclaim
> > activity. This becomes overheads to user space module and also increases
>
> Not true.
> We have lots of filters to not trigger an event even if reclaim is going on.
> Your statement would be confusing.
Where is the filter implemented? In the kernel? I do not see any
throttling in the current mm tree.
> > power consumption if there is somebody to listen to it. This patch provides
> > options to trigger events only when the pressure level changes.
> > This trigger option can be set when registering each event by writing
> > a trigger option, "edge" or "always", next to the string of levels.
> > "edge" means that the event is triggered only when the pressure level is changed.
> > "always" means that events are triggered whenever there is a reclaim process.
> ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
> Not true, either.
Is this about vmpressure_win? But I agree that this could be more
specific. Something like "`Always' trigger option will signal all events
while `edge' option will trigger events only when the level changes."
> > To keep backward compatibility, "always" is set by default if nothing is input
> > as an option. Each event can have different option. For example,
> > "low" level uses "always" trigger option to see reclaim activity at user space
> > while "medium"/"critical" uses "edge" to do an important job
> > like killing tasks only once.
>
> Question.
>
> 1. user: set critical edge
> 2. kernel: memory is tight and trigger event with critical
> 3. user: kill a program when he receives a event
> 4. kernel: memory is very tight again and want to trigger a event
> with critical but fail because last_level was critical and it was edge.
>
> Right?
Yes, this is the risk of edge triggering and the user has to be
prepared for that. I still think that it makes some sense to have the
two modes.
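To make the trade-off concrete, here is a minimal userspace sketch (not kernel code) that counts how many notifications a listener would get for a sequence of pressure levels. The level constants and count_events() are hypothetical names used only for illustration; the comparison mirrors the check added in the patch.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for the kernel's pressure levels. */
enum { LOW, MEDIUM, CRITICAL };

/*
 * Count the wakeups a listener registered on `threshold` would receive
 * for a trace of reported levels. With `edge` set, a report is skipped
 * when it repeats the previously reported level.
 */
static size_t count_events(const int *trace, size_t n, int threshold, bool edge)
{
	int last = -1;		/* nothing reported yet */
	size_t events = 0;
	size_t i;

	for (i = 0; i < n; i++) {
		if (trace[i] >= threshold && !(edge && trace[i] == last))
			events++;
		last = trace[i];
	}
	return events;
}
```

For the trace { CRITICAL, CRITICAL } an "always" listener is woken twice while an "edge" listener is woken only once, which is the situation described in the question above. The edge listener is woken again only after the level has moved away from critical and come back.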
> > @@ -823,7 +831,7 @@ Test:
> > # cd /sys/fs/cgroup/memory/
> > # mkdir foo
> > # cd foo
> > - # cgroup_event_listener memory.pressure_level low &
> > + # cgroup_event_listener memory.pressure_level low edge &
> > # echo 8000000 > memory.limit_in_bytes
> > # echo 8000000 > memory.memsw.limit_in_bytes
> > # echo $$ > tasks
> > diff --git a/mm/vmpressure.c b/mm/vmpressure.c
> > index 736a601..a08252e 100644
> > --- a/mm/vmpressure.c
> > +++ b/mm/vmpressure.c
> > @@ -137,6 +137,8 @@ static enum vmpressure_levels vmpressure_calc_level(unsigned long scanned,
> > struct vmpressure_event {
> > struct eventfd_ctx *efd;
> > enum vmpressure_levels level;
> > + int last_level;
>
> int? but level is enum vmpressure_levels?
Good catch.
> > + bool edge_trigger;
> > struct list_head node;
> > };
> >
> > @@ -153,11 +155,14 @@ static bool vmpressure_event(struct vmpressure *vmpr,
> >
> > list_for_each_entry(ev, &vmpr->events, node) {
> > if (level >= ev->level) {
> > + if (ev->edge_trigger && level == ev->last_level)
> > + continue;
> > +
> > eventfd_signal(ev->efd, 1);
> > signalled = true;
> > }
> > + ev->last_level = level;
> > }
> > -
>
> Unnecessary change.
>
> > mutex_unlock(&vmpr->events_lock);
> >
> > return signalled;
> > @@ -290,9 +295,11 @@ void vmpressure_prio(gfp_t gfp, struct mem_cgroup *memcg, int prio)
> > *
> > * This function associates eventfd context with the vmpressure
> > * infrastructure, so that the notifications will be delivered to the
> > - * @eventfd. The @args parameter is a string that denotes pressure level
> > + * @eventfd. The @args parameters are a string that denotes pressure level
> > * threshold (one of vmpressure_str_levels, i.e. "low", "medium", or
> > - * "critical").
> > + * "critical") and a trigger option that decides whether events are triggered
> > + * continuously or only on edge ("always" or "edge" if "edge", events
> > + * are triggered when the pressure level changes.
> > *
> > * This function should not be used directly, just pass it to (struct
> > * cftype).register_event, and then cgroup core will handle everything by
> > @@ -303,22 +310,43 @@ int vmpressure_register_event(struct cgroup *cg, struct cftype *cft,
> > {
> > struct vmpressure *vmpr = cg_to_vmpressure(cg);
> > struct vmpressure_event *ev;
> > + char *strlevel, *strtrigger;
> > int level;
> > + bool trigger;
>
> What trigger?
> Would be better to use "bool edge" instead?
Yes.
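With that rename, the option parsing in the hunk that follows could be sketched roughly as below. This is a standalone userspace approximation, not the patch itself: parse_args(), level_names and the -1 error return are assumptions (the kernel code returns -EINVAL and uses vmpressure_str_levels).

```c
#include <stdbool.h>
#include <string.h>

/* Hypothetical copy of the level names accepted by the interface. */
static const char * const level_names[] = { "low", "medium", "critical" };
#define NUM_LEVELS (sizeof(level_names) / sizeof(level_names[0]))

/*
 * Parse "<level>[ <mode>]" in place. On success returns 0 and fills
 * *level and *edge; returns -1 for an unknown level or mode.
 */
static int parse_args(char *args, int *level, bool *edge)
{
	char *mode = strchr(args, ' ');
	size_t i;

	if (mode)
		*mode++ = '\0';	/* split the level from the mode */

	for (i = 0; i < NUM_LEVELS; i++)
		if (!strcmp(level_names[i], args))
			break;
	if (i == NUM_LEVELS)
		return -1;

	if (!mode || !strcmp(mode, "always"))
		*edge = false;	/* no mode given: keep the old behaviour */
	else if (!strcmp(mode, "edge"))
		*edge = true;
	else
		return -1;

	*level = (int)i;
	return 0;
}
```

Note that the buffer is modified in place, so the caller has to pass a writable string, just as the patch does with @args.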
> > +
> > + strlevel = args;
> > + strtrigger = strchr(args, ' ');
> > +
> > + if (strtrigger) {
> > + *strtrigger = '\0';
> > + strtrigger++;
> > + }
> >
> > for (level = 0; level < VMPRESSURE_NUM_LEVELS; level++) {
> > - if (!strcmp(vmpressure_str_levels[level], args))
> > + if (!strcmp(vmpressure_str_levels[level], strlevel))
> > break;
> > }
> >
> > if (level >= VMPRESSURE_NUM_LEVELS)
> > return -EINVAL;
> >
> > + if (strtrigger == NULL)
> > + trigger = false;
> > + else if (!strcmp(strtrigger, "always"))
> > + trigger = false;
> > + else if (!strcmp(strtrigger, "edge"))
> > + trigger = true;
> > + else
> > + return -EINVAL;
> > +
> > ev = kzalloc(sizeof(*ev), GFP_KERNEL);
> > if (!ev)
> > return -ENOMEM;
> >
> > ev->efd = eventfd;
> > ev->level = level;
> > + ev->last_level = -1;
>
> VMPRESSURE_NONE is better?
Yes
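Taken together with the type comment above, the event bookkeeping might end up looking something like the sketch below. It is a userspace approximation under stated assumptions: VMPRESSURE_NONE, its value of -1, and the event_state/event_state_init() names are hypothetical and only illustrate keeping last_level in the same enum as level.

```c
#include <stdbool.h>

enum vmpressure_levels {
	VMPRESSURE_NONE = -1,	/* assumed sentinel: no level reported yet */
	VMPRESSURE_LOW,
	VMPRESSURE_MEDIUM,
	VMPRESSURE_CRITICAL,
	VMPRESSURE_NUM_LEVELS,
};

/* Hypothetical userspace mirror of the per-listener state. */
struct event_state {
	enum vmpressure_levels level;		/* registered threshold */
	enum vmpressure_levels last_level;	/* same type as level */
	bool edge;				/* "edge" vs "always" */
};

static void event_state_init(struct event_state *ev,
			     enum vmpressure_levels level, bool edge)
{
	ev->level = level;
	ev->last_level = VMPRESSURE_NONE;	/* never equals a real level */
	ev->edge = edge;
}
```

Because the sentinel never compares equal to a real level, the first report at or above the threshold is always delivered, even in edge mode.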
--
Michal Hocko
SUSE Labs