On Thu, 14 Apr 2011 15:54:24 -0700
Ying Han <yinghan@google.com> wrote:
> This adds the mechanism for background reclaim in which we remember the
> last scanned node and always start from the next one each time.
> The simple round-robin fashion provides fairness between nodes for
> each memcg.
>
> changelog v4..v3:
> 1. split off from the per-memcg background reclaim patch.
>
> Signed-off-by: Ying Han <yinghan@google.com>
Yeah, looks nice. Thank you for splitting.
> ---
> include/linux/memcontrol.h | 3 +++
> mm/memcontrol.c | 40 ++++++++++++++++++++++++++++++++++++++++
> 2 files changed, 43 insertions(+), 0 deletions(-)
>
> diff --git a/include/linux/memcontrol.h b/include/linux/memcontrol.h
> index f7ffd1f..d4ff7f2 100644
> --- a/include/linux/memcontrol.h
> +++ b/include/linux/memcontrol.h
> @@ -88,6 +88,9 @@ extern int mem_cgroup_init_kswapd(struct mem_cgroup *mem,
> struct kswapd *kswapd_p);
> extern void mem_cgroup_clear_kswapd(struct mem_cgroup *mem);
> extern wait_queue_head_t *mem_cgroup_kswapd_wait(struct mem_cgroup *mem);
> +extern int mem_cgroup_last_scanned_node(struct mem_cgroup *mem);
> +extern int mem_cgroup_select_victim_node(struct mem_cgroup *mem,
> + const nodemask_t *nodes);
>
> static inline
> int mm_match_cgroup(const struct mm_struct *mm, const struct mem_cgroup *cgroup)
> diff --git a/mm/memcontrol.c b/mm/memcontrol.c
> index c4e1904..e22351a 100644
> --- a/mm/memcontrol.c
> +++ b/mm/memcontrol.c
> @@ -279,6 +279,11 @@ struct mem_cgroup {
> u64 high_wmark_distance;
> u64 low_wmark_distance;
>
> + /* While doing per cgroup background reclaim, we cache the
> + * last node we reclaimed from
> + */
> + int last_scanned_node;
> +
> wait_queue_head_t *kswapd_wait;
> };
>
> @@ -1536,6 +1541,32 @@ static int mem_cgroup_hierarchical_reclaim(struct mem_cgroup *root_mem,
> }
>
> /*
> + * Visit the first node after the last_scanned_node of @mem and use that to
> + * reclaim free pages from.
> + */
> +int
> +mem_cgroup_select_victim_node(struct mem_cgroup *mem, const nodemask_t *nodes)
> +{
> + int next_nid;
> + int last_scanned;
> +
> + last_scanned = mem->last_scanned_node;
> +
> + /* Initial stage and start from node0 */
> + if (last_scanned == -1)
> + next_nid = 0;
> + else
> + next_nid = next_node(last_scanned, *nodes);
> +
IIUC, mem->last_scanned_node should be initialized to MAX_NUMNODES.
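
To make the suggestion concrete, here is a rough userspace sketch of the round-robin selection with the cached node starting at the "no node" sentinel instead of -1. It is only an illustration under assumptions, not the patch itself: a 64-bit mask stands in for nodemask_t, SKETCH_MAX_NODES stands in for MAX_NUMNODES, and struct memcg_sketch and every sketch_* helper are hypothetical names. The one kernel behaviour it relies on is that next_node() returns MAX_NUMNODES when no further node is set in the mask.

```c
#include <assert.h>
#include <stdint.h>

#define SKETCH_MAX_NODES 64	/* hypothetical stand-in for MAX_NUMNODES */

/* Hypothetical stand-in for struct mem_cgroup; only the cached node. */
struct memcg_sketch {
	int last_scanned_node;
};

/*
 * First node set in @mask strictly after @nid, or SKETCH_MAX_NODES if
 * there is none. This mirrors the return convention of next_node().
 */
static int sketch_next_node(int nid, uint64_t mask)
{
	int n;

	for (n = nid + 1; n < SKETCH_MAX_NODES; n++)
		if (mask & (UINT64_C(1) << n))
			return n;
	return SKETCH_MAX_NODES;
}

/* First node set in @mask, or SKETCH_MAX_NODES if the mask is empty. */
static int sketch_first_node(uint64_t mask)
{
	return sketch_next_node(-1, mask);
}

/*
 * Pick the node after the cached one and wrap to the first node when
 * the end of the mask is reached. With the cache initialised to
 * SKETCH_MAX_NODES, the very first call takes the wrap path, so no
 * special case for the initial state is needed.
 */
static int sketch_select_victim_node(struct memcg_sketch *mem, uint64_t mask)
{
	int nid;

	assert(mem->last_scanned_node >= 0 &&
	       mem->last_scanned_node <= SKETCH_MAX_NODES);

	nid = sketch_next_node(mem->last_scanned_node, mask);
	if (nid == SKETCH_MAX_NODES)
		nid = sketch_first_node(mask);

	mem->last_scanned_node = nid;
	return nid;
}
```

With the cache initialised to SKETCH_MAX_NODES and nodes 0, 2 and 3 set in the mask, successive calls would visit 0, 2, 3 and then wrap back to 0. An empty mask leaves the cache at the sentinel, which a caller would have to check for.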