On Wed, Apr 20, 2011 at 8:48 PM, KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> wrote:
>
> memcg-kswapd visits each memcg in round-robin. But the required
> amount of work depends on the memcg's usage and high/low watermarks,
> so taking that into account will be good.
>
> Signed-off-by: KAMEZAWA Hiroyuki
> ---
>  include/linux/memcontrol.h |    1 +
>  mm/memcontrol.c            |   17 +++++++++++++++++
>  mm/vmscan.c                |    2 ++
>  3 files changed, 20 insertions(+)
>
> Index: mmotm-Apr14/include/linux/memcontrol.h
> ===================================================================
> --- mmotm-Apr14.orig/include/linux/memcontrol.h
> +++ mmotm-Apr14/include/linux/memcontrol.h
> @@ -98,6 +98,7 @@ extern bool mem_cgroup_kswapd_can_sleep(
>  extern struct mem_cgroup *mem_cgroup_get_shrink_target(void);
>  extern void mem_cgroup_put_shrink_target(struct mem_cgroup *mem);
>  extern wait_queue_head_t *mem_cgroup_kswapd_waitq(void);
> +extern int mem_cgroup_kswapd_bonus(struct mem_cgroup *mem);
>
>  static inline
>  int mm_match_cgroup(const struct mm_struct *mm, const struct mem_cgroup *cgroup)
>
> Index: mmotm-Apr14/mm/memcontrol.c
> ===================================================================
> --- mmotm-Apr14.orig/mm/memcontrol.c
> +++ mmotm-Apr14/mm/memcontrol.c
> @@ -4673,6 +4673,23 @@ struct memcg_kswapd_work
>
>  struct memcg_kswapd_work memcg_kswapd_control;
>
> +int mem_cgroup_kswapd_bonus(struct mem_cgroup *mem)
> +{
> +	unsigned long long usage, lowat, hiwat;
> +	int rate;
> +
> +	usage = res_counter_read_u64(&mem->res, RES_USAGE);
> +	lowat = res_counter_read_u64(&mem->res, RES_LOW_WMARK_LIMIT);
> +	hiwat = res_counter_read_u64(&mem->res, RES_HIGH_WMARK_LIMIT);
> +	if (lowat == hiwat)
> +		return 0;
> +
> +	rate = (usage - hiwat) * 10 / (lowat - hiwat);
> +	/* If usage is big, we reclaim more */
> +	return rate * SWAP_CLUSTER_MAX;
> +}
> +

I understand the logic in general: we would like to reclaim more on each pass when there is more work to be done. But I am not quite sure about the calculation here. (usage - hiwat) determines the amount of work for kswapd, but why divide by (lowat - hiwat)? My guess is that the larger that value is, the later we will trigger kswapd?

--Ying

>  static void wake_memcg_kswapd(struct mem_cgroup *mem)
>  {
>  	if (atomic_read(&mem->kswapd_running)) /* already running */
>
> Index: mmotm-Apr14/mm/vmscan.c
> ===================================================================
> --- mmotm-Apr14.orig/mm/vmscan.c
> +++ mmotm-Apr14/mm/vmscan.c
> @@ -2732,6 +2732,8 @@ static int shrink_mem_cgroup(struct mem_
>  	sc.nr_reclaimed = 0;
>  	total_scanned = 0;
>
> +	sc.nr_to_reclaim += mem_cgroup_kswapd_bonus(mem_cont);
> +
>  	do_nodes = node_states[N_ONLINE];
>
>  	for (priority = DEF_PRIORITY;
>