linux-mm.kvack.org archive mirror
From: Mel Gorman <mel@csn.ul.ie>
To: Christoph Lameter <cl@linux-foundation.org>
Cc: Andrew Morton <akpm@linux-foundation.org>,
	Andrea Arcangeli <aarcange@redhat.com>,
	Adam Litke <agl@us.ibm.com>, Avi Kivity <avi@redhat.com>,
	David Rientjes <rientjes@google.com>,
	Minchan Kim <minchan.kim@gmail.com>,
	KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>,
	KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>,
	Rik van Riel <riel@redhat.com>,
	linux-kernel@vger.kernel.org, linux-mm@kvack.org
Subject: Re: [PATCH 11/11] Do not compact within a preferred zone after a compaction failure
Date: Wed, 24 Mar 2010 10:37:49 +0000
Message-ID: <20100324103749.GB21147@csn.ul.ie>
In-Reply-To: <alpine.DEB.2.00.1003231422290.10178@router.home>

On Tue, Mar 23, 2010 at 02:27:08PM -0500, Christoph Lameter wrote:
> On Tue, 23 Mar 2010, Mel Gorman wrote:
> 
> > I was having some sort of fit when I wrote that obviously. Try this on
> > for size
> >
> > The fragmentation index may indicate that a failure is due to external
> > fragmentation but after a compaction run completes, it is still possible
> > for an allocation to fail.
> 
> Ok.
> 
> > > > fail. There are two obvious reasons as to why
> > > >
> > > >   o Page migration cannot move all pages so fragmentation remains
> > > >   o A suitable page may exist but watermarks are not met
> > > >
> > > > In the event of compaction and allocation failure, this patch prevents
> > > > compaction happening for a short interval. It's only recorded on the
> > >
> > > compaction is "recorded"? deferred?
> > >
> >
> > deferred makes more sense.
> >
> > What I was thinking at the time was that compact_resume was stored in struct
> > zone - i.e. that is where it is recorded.
> 
> Ok adding a dozen or more words here may be useful.
> 

In the event of compaction followed by an allocation failure, this patch
defers further compaction in the zone for a period of time. The zone that
is deferred is the first zone in the zonelist - i.e. the preferred zone.
To defer compaction in the other zones, the information would need to
be stored in the zonelist or implemented similarly to the zonelist_cache.
Either would impact the fast-paths and is not justified at this time.

?
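
As a rough userspace sketch of the deferral described above (not the patch
itself): compact_resume and the defer_compaction(zone, jiffies + HZ/50) call
come from this thread, while struct zone_sketch, tick_after() and
compaction_deferred_sketch() are hypothetical names used only for
illustration. The check assumes a wrap-safe comparison in the spirit of the
kernel's time_after().

```c
#include <stdbool.h>

/* Hypothetical stand-in for struct zone; only the field discussed in
 * this thread (compact_resume) is modelled. */
struct zone_sketch {
	unsigned long compact_resume;	/* earliest tick at which compaction may run again */
};

/* Wrap-safe "a is after b", in the spirit of the kernel's time_after(). */
static bool tick_after(unsigned long a, unsigned long b)
{
	return (long)(b - a) < 0;
}

/* Record in the zone when compaction may next be attempted. */
static void defer_compaction_sketch(struct zone_sketch *zone, unsigned long resume)
{
	zone->compact_resume = resume;
}

/* Assumed check: compaction stays deferred until 'now' reaches compact_resume. */
static bool compaction_deferred_sketch(const struct zone_sketch *zone, unsigned long now)
{
	return tick_after(zone->compact_resume, now);
}
```

Because the state lives in the zone itself, only the preferred zone is
covered; nothing is added to the zonelist.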

> > > > preferred zone but that should be enough coverage. This could have been
> > > > implemented similar to the zonelist_cache but the increased size of the
> > > > zonelist did not appear to be justified.
> > >
> > > > @@ -1787,6 +1787,9 @@ __alloc_pages_direct_reclaim(gfp_t gfp_mask, unsigned int order,
> > > >  			 */
> > > >  			count_vm_event(COMPACTFAIL);
> > > >
> > > > +			/* On failure, avoid compaction for a short time. */
> > > > +			defer_compaction(preferred_zone, jiffies + HZ/50);
> > > > +
> > >
> > > 20ms? How was that interval determined?
> > >
> >
> > Matches the time the page allocator would defer to an event like
> > congestion. The choice is somewhat arbitrary. Ideally, there would be
> > some sort of event that would re-enable compaction but there wasn't an
> > obvious candidate so I used time.
> 
> There are frequent uses of HZ/10 as well especially in vmscan.c. A longer
> time may be better? HZ/50 looks like an interval for writeout. But this
> is related to reclaim?
> 

HZ/10 is a somewhat arbitrary choice as well and there isn't data on
which is better and which is worse. If the zone is full of dirty data, then
HZ/10 makes sense for IO. If it happened to be mainly clean cache but under
heavy memory pressure, then reclaim would be a relatively fast event and a
shorter wait of HZ/50 makes sense.
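
For reference, the two intervals can be worked out with a small sketch. The
helper names below are hypothetical; the only assumption is that HZ is the
number of jiffies per second and that HZ/50 is evaluated with integer
division, as in the quoted hunk.

```c
/* Number of jiffies in a timeout written as HZ/divisor (integer division). */
static long hz_fraction_jiffies(long hz, long divisor)
{
	return hz / divisor;
}

/* Convert a jiffies count back to milliseconds for a given HZ. */
static long jiffies_to_ms_sketch(long hz, long jiffies)
{
	return jiffies * 1000 / hz;
}
```

With HZ set to 100, 250 or 1000, HZ/50 comes to 2, 5 or 20 jiffies, which is
20ms in each case, and HZ/10 comes to 100ms. Rounding only starts to matter
for finer fractions: HZ/100 with HZ=250 truncates to 2 jiffies, or 8ms.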

Thing is, if we start with a short timer and it's too short, COMPACTFAIL
will grow steadily, so the mistake is visible. If we choose a long timer and
it's too long, there is no counter to indicate it was a bad choice. Hence,
I'd prefer to start with the short timer and ideally resume compaction after
some event in the future rather than depending on time.
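
One way to watch for that signal is to sample the counter twice and compare.
The sketch below parses vmstat-style text ("name value" per line); the helper
names are hypothetical, and it is an assumption here that COMPACTFAIL is
exported under a name such as compact_fail.

```c
#include <stdio.h>
#include <string.h>

/* Return the value of 'name' in vmstat-style text, or -1 if it is absent
 * or malformed. The counter name passed in is the caller's assumption. */
static long long vmstat_counter(const char *text, const char *name)
{
	size_t len = strlen(name);
	const char *line = text;

	while (line && *line) {
		/* Match the whole name, followed by the separating space. */
		if (strncmp(line, name, len) == 0 && line[len] == ' ') {
			long long value;

			if (sscanf(line + len + 1, "%lld", &value) == 1)
				return value;
			return -1;
		}
		line = strchr(line, '\n');
		if (line)
			line++;
	}
	return -1;
}

/* True if the counter increased between two samples. */
static int counter_growing(long long before, long long after)
{
	return after > before;
}
```

A counter that keeps growing between samples would suggest the deferral is
too short; a long deferral offers no equivalent signal.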

Does that make sense?

> 
>  backing-dev.h    <global>                      283 long congestion_wait(int sync, long timeout);
> 1 backing-dev.c    <global>                      762 EXPORT_SYMBOL(congestion_wait);
> 2 usercopy_32.c    __copy_to_user_ll             754 congestion_wait(BLK_RW_ASYNC, HZ/50);
> 3 pktcdvd.c        pkt_make_request             2557 congestion_wait(BLK_RW_ASYNC, HZ);
> 4 dm-crypt.c       kcryptd_crypt_write_convert   834 congestion_wait(BLK_RW_ASYNC, HZ/100);
> 5 file.c           fat_file_release              137 congestion_wait(BLK_RW_ASYNC, HZ/10);
> 6 journal.c        reiserfs_async_progress_wait  990 congestion_wait(BLK_RW_ASYNC, HZ / 10);
> 7 kmem.c           kmem_alloc                     61 congestion_wait(BLK_RW_ASYNC, HZ/50);
> 8 kmem.c           kmem_zone_alloc               117 congestion_wait(BLK_RW_ASYNC, HZ/50);
> 9 xfs_buf.c        _xfs_buf_lookup_pages         343 congestion_wait(BLK_RW_ASYNC, HZ/50);
> a backing-dev.c    congestion_wait               751 long congestion_wait(int sync, long timeout)
> b memcontrol.c     mem_cgroup_force_empty       2858 congestion_wait(BLK_RW_ASYNC, HZ/10);
> c page-writeback.c throttle_vm_writeout          674 congestion_wait(BLK_RW_ASYNC, HZ/10);
> d page_alloc.c     __alloc_pages_high_priority  1753 congestion_wait(BLK_RW_ASYNC, HZ/50);
> e page_alloc.c     __alloc_pages_slowpath       1924 congestion_wait(BLK_RW_ASYNC, HZ/50);
> f vmscan.c         shrink_inactive_list         1136 congestion_wait(BLK_RW_ASYNC, HZ/10);
> g vmscan.c         shrink_inactive_list         1220 congestion_wait(BLK_RW_ASYNC, HZ/10);
> h vmscan.c         do_try_to_free_pages         1837 congestion_wait(BLK_RW_ASYNC, HZ/10);
> i vmscan.c         balance_pgdat                2161 congestion_wait(BLK_RW_ASYNC, HZ/10);
> 

-- 
Mel Gorman
Part-time Phd Student                          Linux Technology Center
University of Limerick                         IBM Dublin Software Lab


