From: Johannes Weiner <hannes@cmpxchg.org>
To: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>,
	Daisuke Nishimura <nishimura@mxp.nes.nec.co.jp>,
	Balbir Singh <balbir@linux.vnet.ibm.com>,
	Ying Han <yinghan@google.com>, Michal Hocko <mhocko@suse.cz>,
	Andrew Morton <akpm@linux-foundation.org>,
	Rik van Riel <riel@redhat.com>,
	Minchan Kim <minchan.kim@gmail.com>,
	KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>,
	Mel Gorman <mgorman@suse.de>, Greg Thelen <gthelen@google.com>,
	Michel Lespinasse <walken@google.com>
Cc: linux-mm@kvack.org, linux-kernel@vger.kernel.org
Subject: [patch 3/8] memcg: reclaim statistics
Date: Wed,  1 Jun 2011 08:25:14 +0200
Message-ID: <1306909519-7286-4-git-send-email-hannes@cmpxchg.org>
In-Reply-To: <1306909519-7286-1-git-send-email-hannes@cmpxchg.org>

Currently, there are no statistics whatsoever that would give an
insight into how memory is reclaimed from specific memcgs.

This patch introduces statistics that break down into the following
categories.

1. Limit-triggered direct reclaim

   pgscan_direct_limit
   pgreclaim_direct_limit

   These counters indicate the number of pages scanned and reclaimed
   directly by tasks that needed to allocate memory while the memcg
   had reached its hard limit.

2. Limit-triggered background reclaim

   pgscan_background_limit
   pgreclaim_background_limit

   These counters indicate the number of pages scanned and reclaimed
   by a kernel thread while the memcg's usage was coming close to the
   hard limit, so as to prevent allocators from having to drop into
   direct reclaim.

   There is currently no mechanism in the kernel that would increase
   those counters, but there is per-memcg watermark reclaim in the
   works that would fall into this category.

3. Hierarchy-triggered direct reclaim

   pgscan_direct_hierarchy
   pgreclaim_direct_hierarchy

   These counters indicate the number of pages scanned and reclaimed
   directly by tasks that needed to allocate memory in hierarchical
   parents of the memcg while those parents were experiencing a memory
   shortage.

   For now, this could be either because of a hard limit in the
   parents, or because of global memory pressure.

4. Hierarchy-triggered background reclaim

   pgscan_background_hierarchy
   pgreclaim_background_hierarchy

   These counters indicate the number of pages scanned and reclaimed
   by a kernel thread while one of the memcg's hierarchical parents was
   coming close to running out of memory.

   For now, this only accounts for the work done by kswapd to balance
   zones, but there is also per-memcg watermark reclaim in the
   works that would fall into this category.
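
To illustrate how the eight counters above relate to one another, here
is a minimal userspace sketch of the index arithmetic, assuming the bit
weights of the RECLAIM_* defines in this patch.  The helper names
reclaim_counter_index() and reclaim_counter_name() are made up for this
illustration and are not part of the patch.

```c
#include <stdbool.h>

/* Bit weights mirroring the RECLAIM_* defines in the patch. */
enum {
	RECLAIM_RECLAIMED   = 1,	/* 0: pages scanned, 1: pages reclaimed */
	RECLAIM_BACKGROUND  = 2,	/* 0: direct, 1: background */
	RECLAIM_HIERARCHY   = 4,	/* 0: limit, 1: hierarchy */
	NR_RECLAIM_COUNTERS = 8,
};

/* Counter names in the order the patch adds them to the events enum. */
static const char *const reclaim_counter_names[NR_RECLAIM_COUNTERS] = {
	"pgscan_direct_limit",
	"pgreclaim_direct_limit",
	"pgscan_background_limit",
	"pgreclaim_background_limit",
	"pgscan_direct_hierarchy",
	"pgreclaim_direct_hierarchy",
	"pgscan_background_hierarchy",
	"pgreclaim_background_hierarchy",
};

/* Hypothetical helper: offset of a counter relative to RECLAIM_BASE. */
static unsigned int reclaim_counter_index(bool background, bool hierarchy,
					  bool reclaimed)
{
	unsigned int idx = 0;

	if (hierarchy)
		idx += RECLAIM_HIERARCHY;
	if (background)
		idx += RECLAIM_BACKGROUND;
	if (reclaimed)
		idx += RECLAIM_RECLAIMED;
	return idx;
}

/* Hypothetical helper: name of the counter for a flag combination. */
static const char *reclaim_counter_name(bool background, bool hierarchy,
					bool reclaimed)
{
	return reclaim_counter_names[reclaim_counter_index(background,
							   hierarchy,
							   reclaimed)];
}
```

Each scan/reclaim pair occupies two adjacent slots, which is why the
patch can add RECLAIM_RECLAIMED to the scan index to reach the matching
reclaim counter.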

The counters for limit-triggered reclaim always reflect pressure that
originates within the memcg, and thus show whether the workload is too
big for its container.  The counters for hierarchy-triggered reclaim,
on the other hand, reflect pressure from outside the memcg, such as the
limit of a parent or a shortage of physical memory.  Having this
distinction helps locate the cause of a thrashing workload in the
hierarchy.

In addition, the distinction between direct and background reclaim
shows how well background reclaim can keep up or whether it is
overwhelmed and forces allocators into direct reclaim.
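
As a rough sketch of how these two distinctions could be evaluated from
userspace, the helpers below compute what share of the scanned pages
was due to direct reclaim and what share was limit-triggered.  struct
reclaim_scan_counts and both helper functions are hypothetical; they
only assume that the four pgscan_* values have already been read for
one memcg, however they were obtained.

```c
/* Hypothetical snapshot of the four pgscan_* counters of one memcg. */
struct reclaim_scan_counts {
	unsigned long direct_limit;
	unsigned long background_limit;
	unsigned long direct_hierarchy;
	unsigned long background_hierarchy;
};

/* Integer percentage of part in total, or -1 if nothing was scanned. */
static int scan_percent(unsigned long long part, unsigned long long total)
{
	if (total == 0)
		return -1;
	return (int)(part * 100 / total);
}

static unsigned long long scan_total(const struct reclaim_scan_counts *c)
{
	return (unsigned long long)c->direct_limit + c->background_limit +
	       c->direct_hierarchy + c->background_hierarchy;
}

/*
 * Share of scanning done by allocating tasks themselves.  A high value
 * suggests that background reclaim is not keeping up.
 */
static int direct_scan_percent(const struct reclaim_scan_counts *c)
{
	unsigned long long direct =
		(unsigned long long)c->direct_limit + c->direct_hierarchy;

	return scan_percent(direct, scan_total(c));
}

/*
 * Share of scanning caused by the memcg's own limit.  A high value
 * points at pressure inside the memcg, a low one at pressure from a
 * parent or from global memory shortage.
 */
static int limit_scan_percent(const struct reclaim_scan_counts *c)
{
	unsigned long long limit =
		(unsigned long long)c->direct_limit + c->background_limit;

	return scan_percent(limit, scan_total(c));
}
```

The same ratios could be computed on the pgreclaim_* counters; scanned
pages are used here only because they reflect the work done regardless
of how successful it was.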

Signed-off-by: Johannes Weiner <hannes@cmpxchg.org>
---
 include/linux/memcontrol.h |    9 ++++++
 mm/memcontrol.c            |   61 ++++++++++++++++++++++++++++++++++++++++++++
 mm/vmscan.c                |    6 ++++
 3 files changed, 76 insertions(+), 0 deletions(-)

diff --git a/include/linux/memcontrol.h b/include/linux/memcontrol.h
index 332b0a6..8f402b9 100644
--- a/include/linux/memcontrol.h
+++ b/include/linux/memcontrol.h
@@ -108,6 +108,8 @@ void mem_cgroup_stop_hierarchy_walk(struct mem_cgroup *, struct mem_cgroup *);
 /*
  * For memory reclaim.
  */
+void mem_cgroup_count_reclaim(struct mem_cgroup *, bool, bool,
+			      unsigned long, unsigned long);
 int mem_cgroup_inactive_anon_is_low(struct mem_cgroup *memcg);
 int mem_cgroup_inactive_file_is_low(struct mem_cgroup *memcg);
 unsigned long mem_cgroup_zone_nr_pages(struct mem_cgroup *memcg,
@@ -293,6 +295,13 @@ static inline bool mem_cgroup_disabled(void)
 	return true;
 }
 
+static inline void mem_cgroup_count_reclaim(struct mem_cgroup *mem,
+					    bool background, bool hierarchy,
+					    unsigned long scanned,
+					    unsigned long reclaimed)
+{
+}
+
 static inline int
 mem_cgroup_inactive_anon_is_low(struct mem_cgroup *memcg)
 {
diff --git a/mm/memcontrol.c b/mm/memcontrol.c
index 850176e..983efe4 100644
--- a/mm/memcontrol.c
+++ b/mm/memcontrol.c
@@ -90,10 +90,24 @@ enum mem_cgroup_stat_index {
 	MEM_CGROUP_STAT_NSTATS,
 };
 
+#define RECLAIM_RECLAIMED 1
+#define RECLAIM_BACKGROUND 2
+#define RECLAIM_HIERARCHY 4
+
 enum mem_cgroup_events_index {
 	MEM_CGROUP_EVENTS_PGPGIN,	/* # of pages paged in */
 	MEM_CGROUP_EVENTS_PGPGOUT,	/* # of pages paged out */
 	MEM_CGROUP_EVENTS_COUNT,	/* # of pages paged in/out */
+	RECLAIM_BASE,
+	/* base + [!]hierarchy + [!]background + [!]reclaimed */
+	PGSCAN_DIRECT_LIMIT = RECLAIM_BASE,
+	PGRECLAIM_DIRECT_LIMIT,
+	PGSCAN_BACKGROUND_LIMIT,
+	PGRECLAIM_BACKGROUND_LIMIT,
+	PGSCAN_DIRECT_HIERARCHY,
+	PGRECLAIM_DIRECT_HIERARCHY,
+	PGSCAN_BACKGROUND_HIERARCHY,
+	PGRECLAIM_BACKGROUND_HIERARCHY,
 	MEM_CGROUP_EVENTS_NSTATS,
 };
 /*
@@ -585,6 +599,21 @@ static void mem_cgroup_swap_statistics(struct mem_cgroup *mem,
 	this_cpu_add(mem->stat->count[MEM_CGROUP_STAT_SWAPOUT], val);
 }
 
+void mem_cgroup_count_reclaim(struct mem_cgroup *mem,
+			      bool background, bool hierarchy,
+			      unsigned long scanned, unsigned long reclaimed)
+{
+	unsigned int base = RECLAIM_BASE;
+
+	if (hierarchy)
+		base += RECLAIM_HIERARCHY;
+	if (background)
+		base += RECLAIM_BACKGROUND;
+
+	this_cpu_add(mem->stat->events[base], scanned);
+	this_cpu_add(mem->stat->events[base + RECLAIM_RECLAIMED], reclaimed);
+}
+
 static unsigned long mem_cgroup_read_events(struct mem_cgroup *mem,
 					    enum mem_cgroup_events_index idx)
 {
@@ -3821,6 +3850,14 @@ enum {
 	MCS_FILE_MAPPED,
 	MCS_PGPGIN,
 	MCS_PGPGOUT,
+	MCS_PGSCAN_DIRECT_LIMIT,
+	MCS_PGRECLAIM_DIRECT_LIMIT,
+	MCS_PGSCAN_BACKGROUND_LIMIT,
+	MCS_PGRECLAIM_BACKGROUND_LIMIT,
+	MCS_PGSCAN_DIRECT_HIERARCHY,
+	MCS_PGRECLAIM_DIRECT_HIERARCHY,
+	MCS_PGSCAN_BACKGROUND_HIERARCHY,
+	MCS_PGRECLAIM_BACKGROUND_HIERARCHY,
 	MCS_SWAP,
 	MCS_INACTIVE_ANON,
 	MCS_ACTIVE_ANON,
@@ -3843,6 +3880,14 @@ struct {
 	{"mapped_file", "total_mapped_file"},
 	{"pgpgin", "total_pgpgin"},
 	{"pgpgout", "total_pgpgout"},
+	{"pgscan_direct_limit", "total_pgscan_direct_limit"},
+	{"pgreclaim_direct_limit", "total_pgreclaim_direct_limit"},
+	{"pgscan_background_limit", "total_pgscan_background_limit"},
+	{"pgreclaim_background_limit", "total_pgreclaim_background_limit"},
+	{"pgscan_direct_hierarchy", "total_pgscan_direct_hierarchy"},
+	{"pgreclaim_direct_hierarchy", "total_pgreclaim_direct_hierarchy"},
+	{"pgscan_background_hierarchy", "total_pgscan_background_hierarchy"},
+	{"pgreclaim_background_hierarchy", "total_pgreclaim_background_hierarchy"},
 	{"swap", "total_swap"},
 	{"inactive_anon", "total_inactive_anon"},
 	{"active_anon", "total_active_anon"},
@@ -3868,6 +3913,22 @@ mem_cgroup_get_local_stat(struct mem_cgroup *mem, struct mcs_total_stat *s)
 	s->stat[MCS_PGPGIN] += val;
 	val = mem_cgroup_read_events(mem, MEM_CGROUP_EVENTS_PGPGOUT);
 	s->stat[MCS_PGPGOUT] += val;
+	val = mem_cgroup_read_events(mem, PGSCAN_DIRECT_LIMIT);
+	s->stat[MCS_PGSCAN_DIRECT_LIMIT] += val;
+	val = mem_cgroup_read_events(mem, PGRECLAIM_DIRECT_LIMIT);
+	s->stat[MCS_PGRECLAIM_DIRECT_LIMIT] += val;
+	val = mem_cgroup_read_events(mem, PGSCAN_BACKGROUND_LIMIT);
+	s->stat[MCS_PGSCAN_BACKGROUND_LIMIT] += val;
+	val = mem_cgroup_read_events(mem, PGRECLAIM_BACKGROUND_LIMIT);
+	s->stat[MCS_PGRECLAIM_BACKGROUND_LIMIT] += val;
+	val = mem_cgroup_read_events(mem, PGSCAN_DIRECT_HIERARCHY);
+	s->stat[MCS_PGSCAN_DIRECT_HIERARCHY] += val;
+	val = mem_cgroup_read_events(mem, PGRECLAIM_DIRECT_HIERARCHY);
+	s->stat[MCS_PGRECLAIM_DIRECT_HIERARCHY] += val;
+	val = mem_cgroup_read_events(mem, PGSCAN_BACKGROUND_HIERARCHY);
+	s->stat[MCS_PGSCAN_BACKGROUND_HIERARCHY] += val;
+	val = mem_cgroup_read_events(mem, PGRECLAIM_BACKGROUND_HIERARCHY);
+	s->stat[MCS_PGRECLAIM_BACKGROUND_HIERARCHY] += val;
 	if (do_swap_account) {
 		val = mem_cgroup_read_stat(mem, MEM_CGROUP_STAT_SWAPOUT);
 		s->stat[MCS_SWAP] += val * PAGE_SIZE;
diff --git a/mm/vmscan.c b/mm/vmscan.c
index 7e9bfca..c7d4b44 100644
--- a/mm/vmscan.c
+++ b/mm/vmscan.c
@@ -1985,10 +1985,16 @@ static void shrink_zone(int priority, struct zone *zone,
 
 	first = mem = mem_cgroup_hierarchy_walk(root, mem);
 	for (;;) {
+		unsigned long reclaimed = sc->nr_reclaimed;
+		unsigned long scanned = sc->nr_scanned;
 		unsigned long nr_reclaimed;
 
 		sc->mem_cgroup = mem;
 		do_shrink_zone(priority, zone, sc);
+		mem_cgroup_count_reclaim(mem, current_is_kswapd(),
+					 mem != root, /* limit or hierarchy? */
+					 sc->nr_scanned - scanned,
+					 sc->nr_reclaimed - reclaimed);
 
 		nr_reclaimed = sc->nr_reclaimed - nr_reclaimed_before;
 		if (nr_reclaimed >= sc->nr_to_reclaim)
-- 
1.7.5.2
