On Mon, Apr 25, 2011 at 2:34 AM, KAMEZAWA Hiroyuki
<kamezawa.hiroyu@jp.fujitsu.com> wrote:
>
> At memcg memory reclaim, get_scan_count() may return [0, 0, 0, 0],
> and no scan is issued at the reclaim priority.
>
> The reason is that the memory cgroup may not be big enough to hold
> a number of pages greater than 1 << priority.
>
> Because priority affects many routines in vmscan.c, it's better
> to scan memory even if usage >> priority == 0.
> From another point of view, if a memcg's zone doesn't have enough
> memory to meet the priority, it should be skipped. So, this patch
> creates a temporary priority in get_scan_count() and scans some
> amount of pages even when usage is small. By this, memcg reclaim
> goes more smoothly without reaching too high a priority, which would
> cause unnecessary congestion_wait(), etc.
>
> Signed-off-by: KAMEZAWA Hiroyuki
> ---
>  include/linux/memcontrol.h |    6 ++++++
>  mm/memcontrol.c            |    5 +++++
>  mm/vmscan.c                |   11 +++++++++++
>  3 files changed, 22 insertions(+)
>
> Index: memcg/include/linux/memcontrol.h
> ===================================================================
> --- memcg.orig/include/linux/memcontrol.h
> +++ memcg/include/linux/memcontrol.h
> @@ -152,6 +152,7 @@ unsigned long mem_cgroup_soft_limit_recl
>  						gfp_t gfp_mask,
>  						unsigned long *total_scanned);
>  u64 mem_cgroup_get_limit(struct mem_cgroup *mem);
> +u64 mem_cgroup_get_usage(struct mem_cgroup *mem);
>
>  void mem_cgroup_count_vm_event(struct mm_struct *mm, enum vm_event_item idx);
>  #ifdef CONFIG_TRANSPARENT_HUGEPAGE
> @@ -357,6 +358,11 @@ u64 mem_cgroup_get_limit(struct mem_cgro
>  	return 0;
>  }
>
> +static inline u64 mem_cgroup_get_limit(struct mem_cgroup *mem)
> +{
> +	return 0;
> +}
> +

should be mem_cgroup_get_usage()

>  static inline void mem_cgroup_split_huge_fixup(struct page *head,
>  					       struct page *tail)
>  {
> Index: memcg/mm/memcontrol.c
> ===================================================================
> --- memcg.orig/mm/memcontrol.c
> +++ memcg/mm/memcontrol.c
> @@ -1483,6 +1483,11 @@ u64 mem_cgroup_get_limit(struct mem_cgro
>  	return min(limit, memsw);
>  }
>
> +u64 mem_cgroup_get_usage(struct mem_cgroup *memcg)
> +{
> +	return res_counter_read_u64(&memcg->res, RES_USAGE);
> +}
> +
>  /*
>   * Visit the first child (need not be the first child as per the ordering
>   * of the cgroup list, since we track last_scanned_child) of @mem and use
> Index: memcg/mm/vmscan.c
> ===================================================================
> --- memcg.orig/mm/vmscan.c
> +++ memcg/mm/vmscan.c
> @@ -1762,6 +1762,17 @@ static void get_scan_count(struct zone *
>  		denominator = 1;
>  		goto out;
>  	}
> +	} else {
> +		u64 usage;
> +		/*
> +		 * When memcg is enough small, anon+file >> priority
> +		 * can be 0 and we'll do no scan. Adjust it to proper
> +		 * value against its usage. If this zone's usage is enough
> +		 * small, scan will ignore this zone until priority goes down.
> +		 */
> +		for (usage = mem_cgroup_get_usage(sc->mem_cgroup) >> PAGE_SHIFT;
> +		     priority && ((usage >> priority) < SWAP_CLUSTER_MAX);
> +		     priority--);
>  	}

--Ying

>
>  /*
>
>