Re: issue with direct reclaims and kswapd reclaims on 2.6.35.7

From: Jeffrey Vanhoof <jdv1029@gmail.com>
To: linux-mm@kvack.org
Cc: tghk48@motorola.com
Subject: Re: issue with direct reclaims and kswapd reclaims on 2.6.35.7
Date: Thu, 18 Aug 2011 01:47:19 -0500
Message-ID: <CAML7nqdaLZ-e-+ghU7ywK13NmzXv9fDoK01HwUQ8aX5rsLBnOQ@mail.gmail.com>
In-Reply-To: <CAML7nqd9_F4L0M7ynLFz4HKET94n2mwsk42Z7g2EjAfYnD-JgQ@mail.gmail.com>

[-- Attachment #1: Type: text/plain, Size: 1170 bytes --]

w.r.t.:
> 1) direct reclaims occurring quite frequently, resulting in delayed
> file read requests
> 2) direct reclaims falling into congestion_wait() even though no
> congestion at the time, this results in video jitter.

Backporting the following changes seemed to greatly reduce these issues:
- vmscan: synchronous lumpy reclaim should not call congestion_wait()
- writeback: do not sleep on the congestion queue if there are no
  congested BDIs or if significant congestion is not being encountered
  in the current zone
- vmscan: avoid setting zone congested if no page dirty

w.r.t.:
> 3) kswapd not reclaiming pages quickly enough due to falling into
> congestion_wait() very often. (or stays in congestion_wait() for too long)

The attached patch takes an idea from Mel Gorman's patch
"writeback: do not sleep on the congestion queue if there are no
congested BDIs or if significant congestion is not being encountered
in the current zone" and applies it around the congestion_wait() call
in balance_pgdat(). The idea is that if no zone is congested, kswapd
should avoid potentially waiting for too long: while priority is still
above DEF_PRIORITY - 8 it naps for at most HZ/50 instead of HZ/10.
Comments or alternative solutions would be appreciated.
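
The nap decision described above can be modelled in user space. The following is only an illustrative sketch, not part of the attached patch: kswapd_nap_ms() is a hypothetical helper, DEF_PRIORITY is assumed to be 12 (its value in 2.6.35), and HZ/50 and HZ/10 are expressed as 20 ms and 100 ms. congestion_wait() can return earlier than its timeout, so the values are upper bounds.

```c
#include <assert.h>
#include <stdbool.h>

#define DEF_PRIORITY 12 /* assumed: value used by 2.6.35 */

/*
 * Hypothetical model of the nap chosen at the end of one balance_pgdat()
 * pass. Returns the maximum nap in milliseconds, or 0 for no nap.
 */
static int kswapd_nap_ms(int priority, bool total_scanned,
                         bool under_min_watermark, bool any_zone_congested)
{
    assert(priority >= 0 && priority <= DEF_PRIORITY);

    /* kswapd only naps once it has scanned pages at a low enough priority */
    if (!total_scanned || priority >= DEF_PRIORITY - 2)
        return 0;

    /* existing behaviour: skip the wait if a zone is under its min watermark */
    if (under_min_watermark)
        return 0;

    /* patched behaviour: short nap when nothing is congested */
    if (!any_zone_congested && priority > DEF_PRIORITY - 8)
        return 1000 / 50; /* HZ/50 */

    /* original behaviour */
    return 1000 / 10; /* HZ/10 */
}
```

Under this model, kswapd_nap_ms(9, true, false, false) gives 20, while the same call with any_zone_congested set to true, or with priority 4, gives 100.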

Thanks,
Jeff Vanhoof

[-- Attachment #2: k35_kswapd_query_iff_congested_workaround.txt --]
[-- Type: text/plain, Size: 3196 bytes --]

commit af8ebaca0d367e14b49a151731e8a3e9bc6685f1
Author: Jeff Vanhoof <jdv1029@gmail.com>
Date:   Tue Aug 16 00:14:31 2011 -0500

    linux-mm: Improve kswapd reclamation of memory
    
    This is a workaround to improve the number of pages reclaimed
    in kswapd so that direct reclaims and iowaits are minimized.
    
    Change-Id: I491e9c80809b5ec3e1e7807742807a2317fc2394

diff --git a/include/linux/backing-dev.h b/include/linux/backing-dev.h
index fa79632..e5b6d3d 100644
--- a/include/linux/backing-dev.h
+++ b/include/linux/backing-dev.h
@@ -286,6 +286,7 @@ void clear_bdi_congested(struct backing_dev_info *bdi, int sync);
 void set_bdi_congested(struct backing_dev_info *bdi, int sync);
 long congestion_wait(int sync, long timeout);
 long wait_iff_congested(struct zone *zone, int sync, long timeout);
+int query_iff_congested(struct zone *zone, int sync);
 
 static inline bool bdi_cap_writeback_dirty(struct backing_dev_info *bdi)
 {
diff --git a/mm/backing-dev.c b/mm/backing-dev.c
index 4254946..fcf7976 100644
--- a/mm/backing-dev.c
+++ b/mm/backing-dev.c
@@ -850,3 +850,32 @@ out:
 	return ret;
 }
 EXPORT_SYMBOL(wait_iff_congested);
+
+/**
+ * query_iff_congested - Checks if a backing_dev (any backing_dev) is
+ *     congested and the given @zone has experienced recent congestion.
+ * @zone: A zone to check if it is heavily congested
+ * @sync: SYNC or ASYNC IO
+ *
+ * The return value is 1 only if a backing_dev (any) and @zone are both
+ * congested, otherwise 0 is returned.
+ *
+ */
+int query_iff_congested(struct zone *zone, int sync)
+{
+	int ret = 1;
+	DEFINE_WAIT(wait);
+	wait_queue_head_t *wqh = &congestion_wqh[sync];
+
+	/*
+	 * If there is no congestion, or heavy congestion is not being
+	 * encountered in the current zone, set ret to 0
+	 */
+	if (atomic_read(&nr_bdi_congested[sync]) == 0 ||
+			!zone_is_reclaim_congested(zone)) {
+		ret = 0;
+	}
+
+	return ret;
+}
+EXPORT_SYMBOL(query_iff_congested);
diff --git a/mm/vmscan.c b/mm/vmscan.c
index 1bd01ee..e8686f0 100644
--- a/mm/vmscan.c
+++ b/mm/vmscan.c
@@ -2169,6 +2169,7 @@ loop_again:
 		int end_zone = 0;	/* Inclusive.  0 = ZONE_DMA */
 		unsigned long lru_pages = 0;
 		int has_under_min_watermark_zone = 0;
+		int any_zone_congested = 0;
 
 		/* The swap token gets in the way of swapout... */
 		if (!priority)
@@ -2294,6 +2295,13 @@ loop_again:
 		}
 		if (all_zones_ok)
 			break;		/* kswapd: all done */
+
+		/* Check to see if any zones are congested */
+		for (i = pgdat->nr_zones - 1; i >= 0; i--) {
+			struct zone *zone = pgdat->node_zones + i;
+			any_zone_congested |=
+				 query_iff_congested(zone, BLK_RW_ASYNC);
+		}
+
 		/*
 		 * OK, kswapd is getting into trouble.  Take a nap, then take
 		 * another pass across the zones.
@@ -2301,6 +2309,9 @@ loop_again:
 		if (total_scanned && (priority < DEF_PRIORITY - 2)) {
 			if (has_under_min_watermark_zone)
 				count_vm_event(KSWAPD_SKIP_CONGESTION_WAIT);
+			else if (!any_zone_congested &&
+				 (priority > DEF_PRIORITY - 8))
+				congestion_wait(BLK_RW_ASYNC, HZ/50);
 			else
 				congestion_wait(BLK_RW_ASYNC, HZ/10);
 		}
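
For readers checking the logic of the patch above, here is a small user-space sketch of the predicate that query_iff_congested() evaluates and of the per-node OR that balance_pgdat() builds from it. It is a model under stated assumptions, not kernel code: both function names are hypothetical, nr_bdi_congested stands in for atomic_read(&nr_bdi_congested[sync]), and each bool in zone_flags stands in for zone_is_reclaim_congested(zone).

```c
#include <stdbool.h>

/*
 * Hypothetical model of query_iff_congested(): true only when at least
 * one backing device is congested AND the zone is flagged as congested.
 */
static bool query_congested_model(int nr_bdi_congested,
                                  bool zone_reclaim_congested)
{
    return nr_bdi_congested != 0 && zone_reclaim_congested;
}

/*
 * Hypothetical model of the loop added to balance_pgdat(): OR the
 * per-zone result over every zone in the node, highest zone first.
 */
static bool any_zone_congested_model(int nr_bdi_congested,
                                     const bool *zone_flags, int nr_zones)
{
    bool any = false;

    for (int i = nr_zones - 1; i >= 0; i--)
        any |= query_congested_model(nr_bdi_congested, zone_flags[i]);

    return any;
}
```

One consequence visible in the model: when no backing device is congested, the result is false for every zone regardless of the zone flags, so the shorter HZ/50 nap becomes eligible.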

Thread overview: 2+ messages
2011-08-15  5:08 Jeffrey Vanhoof
2011-08-18  6:47 ` Jeffrey Vanhoof [this message]
