linux-mm.kvack.org archive mirror
From: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
To: Mel Gorman <mel@csn.ul.ie>
Cc: kosaki.motohiro@jp.fujitsu.com,
	Andrew Morton <akpm@linux-foundation.org>,
	Andrea Arcangeli <aarcange@redhat.com>,
	Christoph Lameter <cl@linux-foundation.org>,
	Adam Litke <agl@us.ibm.com>, Avi Kivity <avi@redhat.com>,
	David Rientjes <rientjes@google.com>,
	Rik van Riel <riel@redhat.com>,
	linux-kernel@vger.kernel.org, linux-mm@kvack.org
Subject: Re: [PATCH 10/11] Direct compact when a high-order allocation fails
Date: Fri, 19 Mar 2010 15:21:31 +0900 (JST)
Message-ID: <20100319152105.8772.A69D9226@jp.fujitsu.com>
In-Reply-To: <1268412087-13536-11-git-send-email-mel@csn.ul.ie>

> @@ -1765,6 +1766,31 @@ __alloc_pages_direct_reclaim(gfp_t gfp_mask, unsigned int order,
>  
>  	cond_resched();
>  
> +	/* Try memory compaction for high-order allocations before reclaim */
> +	if (order) {
> +		*did_some_progress = try_to_compact_pages(zonelist,
> +						order, gfp_mask, nodemask);
> +		if (*did_some_progress != COMPACT_INCOMPLETE) {
> +			page = get_page_from_freelist(gfp_mask, nodemask,
> +					order, zonelist, high_zoneidx,
> +					alloc_flags, preferred_zone,
> +					migratetype);
> +			if (page) {
> +				__count_vm_event(COMPACTSUCCESS);
> +				return page;
> +			}
> +
> +			/*
> +			 * It's bad if compaction run occurs and fails.
> +			 * The most likely reason is that pages exist,
> +			 * but not enough to satisfy watermarks.
> +			 */
> +			count_vm_event(COMPACTFAIL);
> +
> +			cond_resched();
> +		}
> +	}
> +

Hmm..Hmmm...........

Today I reviewed this patch and [11/11] carefully, twice, but I find it hard to ack.

This patch seems to assume that page compaction is faster than direct
reclaim, but often it is not: dropping useless page cache is a very
lightweight operation, whereas page compaction does a lot of memcpy (i.e. cpu
cache pollution). IOW, this patch focuses very aggressively on hugepage
allocation, but does not seem to take enough care to limit the damage to
typical workloads.


First, in this mail I would like to clarify the current reclaim corner cases
and how vmscan should behave.

Now we have lumpy reclaim. It is an excellent solution for external fragmentation,
but unfortunately it has a lot of corner cases.

Viewpoint 1. Unnecessary IO

isolate_pages() for lumpy reclaim frequently grabs very young pages, which are
often still dirty, so pageout() gets called a lot.

Unfortunately, page-size-grained IO is _very_ inefficient. It can cause lots of
disk seeks and kill disk IO bandwidth.


Viewpoint 2. Unevictable pages

isolate_pages() for lumpy reclaim can pick up unevictable pages, which obviously
cannot be dropped. So if the zone has plenty of mlocked pages (not a rare case in
server use), lumpy reclaim can become nearly useless.


Viewpoint 3. GFP_ATOMIC allocation failure

Obviously lumpy reclaim can't help with the GFP_ATOMIC issue.


Viewpoint 4. Reclaim latency

Reclaim latency directly affects page allocation latency. So if lumpy reclaim with
a lot of pageout IO is slow (and it often is), it hurts page allocation latency and
can degrade the end-user experience.


I really hoped that automatic page migration would help solve the above issues,
but sadly this patch does not seem to.

Honestly, I think this patch would have been very impressive and useful 2-3 years
ago, because 1) we didn't have lumpy reclaim and 2) we didn't have a sane reclaim
bail-out. Back then, vmscan was a very heavyweight and inefficient operation for
high-order reclaim, so the downside of adding this page migration would have been
relatively hidden. But...

We have to make an effort to reduce reclaim latency, not add a new latency source.
Instead, I would recommend tightly integrating page compaction and lumpy reclaim.
I mean 1) reuse lumpy reclaim's logic for picking up neighboring pfn pages, and
2) do page migration instead of pageout when the page meets some condition (for
example active, dirty, referenced or swapbacked).
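
To make point 2) a bit more concrete, here is a rough userspace sketch of the
per-page decision, not kernel code. The struct, the enum and both function names
are hypothetical and exist only for illustration; a real version would test the
actual page flags inside the lumpy reclaim scan.

```c
#include <stdbool.h>

/* Hypothetical, simplified page state; this is NOT the kernel's struct page. */
struct sketch_page {
	bool active;
	bool dirty;
	bool referenced;
	bool swapbacked;
};

/* Hypothetical outcome of looking at one neighboring page. */
enum sketch_action {
	SKETCH_RECLAIM,	/* clean, cold page cache: cheap to drop */
	SKETCH_MIGRATE,	/* would need pageout or is still in use: move it */
};

/*
 * Pick migration instead of pageout when the page matches any of the
 * example conditions (active, dirty, referenced or swapbacked).
 */
static enum sketch_action sketch_pick_action(const struct sketch_page *page)
{
	if (page->active || page->dirty || page->referenced || page->swapbacked)
		return SKETCH_MIGRATE;
	return SKETCH_RECLAIM;
}

/*
 * Walk a block of neighboring pages, as lumpy reclaim would pick them up,
 * and count how many would be migrated rather than written out.
 */
static unsigned int sketch_count_migrations(const struct sketch_page *pages,
					    unsigned int nr)
{
	unsigned int i, migrations = 0;

	for (i = 0; i < nr; i++)
		if (sketch_pick_action(&pages[i]) == SKETCH_MIGRATE)
			migrations++;
	return migrations;
}
```

The intent is that only clean, cold pages take the cheap drop path, while pages
that would otherwise trigger pageout IO are moved instead.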

This patch seems to shoot me! /me dies. R.I.P. ;-)


btw, please don't use 'hugeadm --set-recommended-min_free_kbytes' when testing.
    Evaluating the case of free memory starvation is very important for this patch
    series, I think. I slightly suspect this patch might invoke useless compaction
    in such a case.



Bottom line: the explicit compaction via /proc can be merged soon, I think,
but this automatic compaction logic seems to need more discussion.





