linux-mm.kvack.org archive mirror
From: Balbir Singh <balbir@linux.vnet.ibm.com>
To: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Cc: "linux-mm@kvack.org" <linux-mm@kvack.org>,
	"linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>,
	"nishimura@mxp.nes.nec.co.jp" <nishimura@mxp.nes.nec.co.jp>,
	"kosaki.motohiro@jp.fujitsu.com" <kosaki.motohiro@jp.fujitsu.com>
Subject: Re: [RFC][PATCH 5/5] memcg softlimit hooks to kswapd
Date: Thu, 12 Mar 2009 09:28:37 +0530
Message-ID: <20090312035837.GD23583@balbir.in.ibm.com>
In-Reply-To: <20090312100008.aa8379d7.kamezawa.hiroyu@jp.fujitsu.com>

* KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> [2009-03-12 10:00:08]:

> This patch needs MORE investigation...
> 
> ==
> From: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
> 
> This patch adds hooks for memcg's softlimit to kswapd().
> 
> Softlimit handler is called...
>   - before generic shrink_zone() is called.
>   - # of pages to be scanned depends on priority.
>   - If not enough progress, selected memcg will be moved to UNUSED queue.
>   - at each call for balance_pgdat(), softlimit queue is rebalanced.
> 
> Changelog: v3 -> v4
>  - move "sc" as local variable
> 
> Signed-off-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
> ---
>  mm/vmscan.c |   52 ++++++++++++++++++++++++++++++++++++++++++++++++++++
>  1 file changed, 52 insertions(+)
> 
> Index: mmotm-2.6.29-Mar10/mm/vmscan.c
> ===================================================================
> --- mmotm-2.6.29-Mar10.orig/mm/vmscan.c
> +++ mmotm-2.6.29-Mar10/mm/vmscan.c
> @@ -1733,6 +1733,49 @@ unsigned long try_to_free_mem_cgroup_pag
>  }
>  #endif
> 
> +static void shrink_zone_softlimit(struct zone *zone, int order, int priority,
> +			   int target, int end_zone)
> +{
> +	int scan = SWAP_CLUSTER_MAX;
> +	int nid = zone->zone_pgdat->node_id;
> +	int zid = zone_idx(zone);
> +	struct mem_cgroup *mem;
> +	struct scan_control sc =  {
> +		.gfp_mask = GFP_KERNEL,
> +		.may_writepage = !laptop_mode,
> +		.swap_cluster_max = SWAP_CLUSTER_MAX,
> +		.may_unmap = 1,
> +		.swappiness = vm_swappiness,
> +		.order = order,
> +		.mem_cgroup = NULL,
> +		.isolate_pages = mem_cgroup_isolate_pages,
> +	};
> +
> +	scan = target * 2;
> +
> +	sc.nr_scanned = 0;
> +	sc.nr_reclaimed = 0;
> +	while (scan > 0) {
> +		if (zone_watermark_ok(zone, order, target, end_zone, 0))
> +			break;
> +		mem = mem_cgroup_schedule(nid, zid);
> +		if (!mem)
> +			return;
> +		sc.mem_cgroup = mem;
> +
> +		sc.nr_reclaimed = 0;
> +		shrink_zone(priority, zone, &sc);
> +
> +		if (sc.nr_reclaimed >= SWAP_CLUSTER_MAX/2)
> +			mem_cgroup_schedule_end(nid, zid, mem, true);
> +		else
> +			mem_cgroup_schedule_end(nid, zid, mem, false);
> +
> +		scan -= sc.nr_scanned;
> +	}
> +
> +	return;
> +}

I experimented a *lot* with zone-based reclaim and found it not very
effective. Here is why:

1. We have no control over the priority or over how much to scan; both
are controlled by balance_pgdat(). If we find that we are unable to
scan anything, the scan > 0 check keeps us looping, but each pass scans
the same pages and the same number of them, because shrink_zone() does
scan >> priority.
2. If we fail to reclaim pages in shrink_zone_softlimit(), the generic
shrink_zone() call will reclaim pages for us anyway, independent of the
soft limit.

I spent a couple of days looking at zone-based reclaim, but ran into
(1) and (2) above.
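
To make (1) concrete, here is a rough userspace sketch, not kernel
code. The helper names pass_scan_count() and passes_for_budget() are
made up for illustration, and the model assumes only what is described
above: each pass scans (LRU size >> priority) pages, and the caller
keeps looping while its scan budget is positive. It ignores the
watermark check and the memcg scheduling in the patch.

```c
#include <assert.h>

/*
 * Hypothetical model: pages scanned by one pass at a given priority.
 * Mirrors the "scan >> priority" behaviour described in the mail.
 */
static unsigned long pass_scan_count(unsigned long lru_pages, int priority)
{
	/* Shifting by the type width or more is undefined in C. */
	assert(priority >= 0 && priority < (int)(sizeof(unsigned long) * 8));
	return lru_pages >> priority;
}

/*
 * Hypothetical model of a "while (scan > 0)" loop run at a fixed
 * priority. Every pass charges the same count against the budget,
 * because the count depends only on lru_pages and priority.
 * Returns the number of passes, or -1 when a pass scans nothing and
 * the budget can therefore never be consumed.
 */
static int passes_for_budget(long budget, unsigned long lru_pages, int priority)
{
	unsigned long per_pass = pass_scan_count(lru_pages, priority);
	int passes = 0;

	if (per_pass == 0)
		return -1;	/* no progress is possible at this priority */

	while (budget > 0) {
		budget -= (long)per_pass;
		passes++;
	}
	return passes;
}
```

With a small LRU and a high priority value the per-pass count shifts
down to zero, which is the "unable to scan anything" case: the caller
cannot fix it, since the priority is chosen by balance_pgdat().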

>  /*
>   * For kswapd, balance_pgdat() will work across all this node's zones until
>   * they are all at pages_high.
> @@ -1776,6 +1819,8 @@ static unsigned long balance_pgdat(pg_da
>  	 */
>  	int temp_priority[MAX_NR_ZONES];
> 
> +	/* Refill softlimit queue */
> +	mem_cgroup_reschedule_all(pgdat->node_id);
>  loop_again:
>  	total_scanned = 0;
>  	sc.nr_reclaimed = 0;
> @@ -1856,6 +1901,13 @@ loop_again:
>  					       end_zone, 0))
>  				all_zones_ok = 0;
>  			temp_priority[i] = priority;
> +
> +			/*
> +			 * Try soft limit at first.  This reclaims page
> +			 * with regard to user's hint.
> +			 */
> +			shrink_zone_softlimit(zone, order, priority,
> +					       8 * zone->pages_high, end_zone);
>  			sc.nr_scanned = 0;
>  			note_zone_scanning_priority(zone, priority);
>  			/*
> 
> 

-- 
	Balbir


