Please find my comments below. > More comments on the code bellow. > > [...] > >> ?diff --git a/mm/memcontrol.c b/mm/memcontrol.c >> ?index 53b8201..d8e6ee6 100644 >> ?--- a/mm/memcontrol.c >> ?+++ b/mm/memcontrol.c >> ?@@ -1743,6 +1743,53 @@ static void mem_cgroup_out_of_memory(struct mem_cgroup *memcg, gfp_t gfp_mask, >> ???????????????????????????NULL, "Memory cgroup out of memory"); >> ??} >> >> ?+/* >> ?+ * If a cgroup is under low limit or enough close to it, >> ?+ * decrease speed of page scanning. >> ?+ * >> ?+ * mem_cgroup_low_limit_scale() returns a number >> ?+ * from range [0, DEF_PRIORITY - 2], which is used >> ?+ * in the reclaim code as a scanning priority modifier. >> ?+ * >> ?+ * If the low limit is not set, it returns 0; >> ?+ * >> ?+ * usage - low_limit > usage / 8 ?=> 0 >> ?+ * usage - low_limit > usage / 16 => 1 >> ?+ * usage - low_limit > usage / 32 => 2 >> ?+ * ... >> ?+ * usage - low_limit > usage / (2 ^ DEF_PRIORITY - 3) => DEF_PRIORITY - 3 >> ?+ * usage < low_limit => DEF_PRIORITY - 2 > > Could you clarify why you have used this calculation. The comment > exlaims _what_ is done but not _why_ it is done. > > It is also strange (and unexplained) that the low limit will work > differently depending on the memcg memory usage - bigger groups have a > bigger chance to be reclaimed even if they are under the limit. The idea is to decrease scanning speed smoothly. It's hard to explain why I used exact these numbers. It' like why DEF_PRIORITY is 12? Just because it works :). Of course, these numbers are an object for discussion/change. There is a picture in attachment that illustrates how low limits work: red line - memory usage of cgroup with low_limit set to 1Gb, blue line - memory usage of another cgroup, where I ran cat > /dev/null. >> ?+ * >> ?+ */ >> ?+unsigned int mem_cgroup_low_limit_scale(struct lruvec *lruvec) >> ?+{ >> ?+ struct mem_cgroup_per_zone *mz; >> ?+ struct mem_cgroup *memcg; >> ?+ unsigned long long low_limit; >> ?+ unsigned long long usage; >> ?+ unsigned int i; >> ?+ >> ?+ mz = container_of(lruvec, struct mem_cgroup_per_zone, lruvec); >> ?+ memcg = mz->memcg; >> ?+ if (!memcg) >> ?+ return 0; >> ?+ >> ?+ low_limit = res_counter_read_u64(&memcg->res, RES_LOW_LIMIT); >> ?+ if (!low_limit) >> ?+ return 0; >> ?+ >> ?+ usage = res_counter_read_u64(&memcg->res, RES_USAGE); >> ?+ >> ?+ if (usage < low_limit) >> ?+ return DEF_PRIORITY - 2; >> ?+ >> ?+ for (i = 0; i < DEF_PRIORITY - 2; i++) >> ?+ if (usage - low_limit > (usage >> (i + 3))) >> ?+ break; > > why this doesn't depend in the current reclaim priority? How do you want to use reclaim priority here? I don't like an idea to start ignoring low limit on some priorities. In my implementation low_limit_scale just "increases" scanning priority, but no more than for 10 (DEF_PRIORITY - 2). So, if priority is 0-2, the reclaim works as if the priority were 10-12, that means "normal" slow reclaim. > >> ?+ >> ?+ return i; >> ?+} >> ?+ >> ??static unsigned long mem_cgroup_reclaim(struct mem_cgroup *memcg, >> ??????????????????????????????????????????gfp_t gfp_mask, >> ??????????????????????????????????????????unsigned long flags) > > [...] > >> ?diff --git a/mm/vmscan.c b/mm/vmscan.c >> ?index 88c5fed..9c1c702 100644 >> ?--- a/mm/vmscan.c >> ?+++ b/mm/vmscan.c >> ?@@ -1660,6 +1660,7 @@ static void get_scan_count(struct lruvec *lruvec, struct scan_control *sc, >> ??????????bool force_scan = false; >> ??????????unsigned long ap, fp; >> ??????????enum lru_list lru; >> ?+ unsigned int low_limit_scale = 0; >> >> ??????????/* >> ???????????* If the zone or memcg is small, nr[l] can be 0. ?This >> ?@@ -1779,6 +1780,9 @@ static void get_scan_count(struct lruvec *lruvec, struct scan_control *sc, >> ??????????fraction[1] = fp; >> ??????????denominator = ap + fp + 1; >> ??out: >> ?+ if (global_reclaim(sc)) >> ?+ low_limit_scale = mem_cgroup_low_limit_scale(lruvec); > > What if the group is reclaimed as a result from parent hitting its > limit? For now, low limits will work only for global reclaim. Enabling them for target reclaim will require some additional checks. I plan to do this as a separate change. Thank you for your comments! -- Regards, Roman