linux-mm.kvack.org archive mirror
 help / color / mirror / Atom feed
From: Mel Gorman <mel@csn.ul.ie>
To: Simon Kirby <sim@hostway.ca>
Cc: Minchan Kim <minchan.kim@gmail.com>,
	KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>,
	Shaohua Li <shaohua.li@intel.com>, linux-mm <linux-mm@kvack.org>,
	Andrew Morton <akpm@linux-foundation.org>
Subject: Re: [patch]vmscan: make kswapd use a correct order
Date: Fri, 10 Dec 2010 11:32:45 +0000	[thread overview]
Message-ID: <20101210113245.GQ20133@csn.ul.ie> (raw)
In-Reply-To: <20101209234452.GA18263@hostway.ca>

On Thu, Dec 09, 2010 at 03:44:52PM -0800, Simon Kirby wrote:
> On Mon, Dec 06, 2010 at 12:03:42PM +0000, Mel Gorman wrote:
> 
> > > This was part of the problem.  kswapd was throwing so much out while
> > > trying to meet the watermark in zone Normal that the daemons had to keep
> > > being read back in from /dev/sda (non-ssd), and this ended up causing
> > > degraded performance.
> > 
> > But there is still potentially two problems here. The first was kswapd
> > throwing out everything in zone normal. Even when fixed, there is
> > potentially still too many pages being thrown out. The situation might
> > be improved but not repaired.
> 
> Yes.
> 
> > > > Before you said SLUB was using only order-0 and order-1, I would have
> > > > suspected lumpy reclaim. Without high-order allocations, fragmentation
> > > > is not a problem and shouldn't be triggering a mass freeing of memory.
> > > > can you confirm with perf that there is no other constant source of
> > > > high-order allocations?
> > > 
> > > Let me clarify: On _another_ box, with 2.6.36 but without your patches
> > > and without as much load or SSD devices, I forced slub to use order-0
> > > except where order-1 was absolutely necessary (objects > 4096 bytes),
> > > just to see what impact this had on free memory.  There was a change,
> > > but still lots of memory left free.  I was trying to avoid confusion by
> > > posting graphs from different machines, but here is that one just as a
> > > reference: http://0x.ca/sim/ref/2.6.36/memory_stor25r_week.png
> > > (I made the slub order adjustment on Tuesday, November 30th.)
> > > The spikes are actually from mail nightly expunge/purge runs.  It seems
> > > that minimizing the slub orders did remove the large free spike that
> > > was happening during mailbox compaction runs (nightly), and overall there
> > > was a bit more memory used on average, but it definitely didn't "fix" it. 
> > 
> > Ok, but it's still evidence that lumpy reclaim is still the problem here. This
> > should be "fixed" by reclaim/compaction which has less impact and frees
> > fewer pages than lumpy reclaim. If necessary, I can backport this to 2.6.36
> > for you to verify. There is little chance the series would be accepted into
> > -stable but you'd at least know that 2.6.37 or 2.6.38 would behave as expected.
> 
> Isn't lumpy reclaim supposed to _improve_ this situation by trying to
> free contiguous stuff rather than shooting aimlessly until contiguous
> pages appear? 

For lower orders like order-1 and order-2, it reclaims randomly before
using lumpy reclaim as the assumption is that these lower pages free
naturally.

> Or is there some other point to it?  If this is the case,
> maybe the issue is that lumpy reclaim isn't happening soon enough, so it
> shoots around too much before it tries to look for lumpy stuff. 

It used to happen sooner but it ran into latency problems.

> In
> 2.6.3[67], set_lumpy_reclaim_mode() only sets lumpy mode if sc->order >
> PAGE_ALLOC_COSTLY_ORDER (>= 4), or if priority < DEF_PRIORITY - 2.
> 
> Also, try_to_compact_pages() bails without doing anything when order <=
> PAGE_ALLOC_COSTLY_ORDER, which is the order I'm seeing problems at.  So,
> without further chanegs, I don't see how CONFIG_COMPACTION or 2.6.37 will
> make any difference, unless I'm missing some related 2.6.37 changes.
> 

There is increasing pressure to use compaction for the lower orders as
well. This problem is going to be added to the list of justifications :/

> > > There are definitely pages that are leaking from dovecot or similar which
> > > can be swapped out and not swapped in again (you can see "apps" growing),
> > > but there are no tasks I can think of that would ever cause the system to
> > > be starved. 
> > 
> > So dovecot has a memory leak? As you say, this shouldn't starve the system
> > but it's inevitable that swap usage will grow over time.
> 
> Yeah, we just squashed what seemed to be the biggest leak in dovecot, so
> this should stop happening once we rebuild and restart everything.
> 

Ok.

> > > The calls to pageout() seem to happen if sc.may_writepage is
> > > set, which seems to happen when it thinks it has scanned enough without
> > > making enough progress.  Could this happen just from too much
> > > fragmentation?
> > > 
> > 
> > Not on its own but if too many pages have to be scanned due to
> > fragmentation, it can get set.
> > 
> > > The swapping seems to be at a slow but constant rate, so maybe it's
> > 
> > I assume you mean swap usage is growing at a slow but constant rate?
> 
> Yes.
> 
> > > happening just due to the way the types of allocations are biasing to
> > > Normal instead of DMA32, or vice-versa. 
> > > Check out the latest memory
> > > graphs for the server running your original patch:
> > > 
> > > http://0x.ca/sim/ref/2.6.36/memory_mel_patch_dec4.png
> > 
> > Do you think the growth in swap usage is due to dovecot leaking?
> 
> I guess we'll find out shortly, with dovecot being fixed. :)
> 

Great.

> > > http://0x.ca/sim/ref/2.6.36/zoneinfo_mel_patch_dec4
> > > http://0x.ca/sim/ref/2.6.36/pagetypeinfo_mel_patch_dec4
> > > 
> > > Hmm, pagetypeinfo shows none or only a few of the pages in Normal are
> > > considered reclaimable...
> > > 
> > 
> > Reclaimable in the context of pagetypeinfo means slab-reclaimable. The
> > results imply that very few slab allocations are being satisified from
> > the Normal zone or at least very few have been released recently.
> 
> Hmm, ok.
> 
> Simon-
> 

-- 
Mel Gorman
Part-time Phd Student                          Linux Technology Center
University of Limerick                         IBM Dublin Software Lab

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Fight unfair telecom policy in Canada: sign http://dissolvethecrtc.ca/
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

  reply	other threads:[~2010-12-10 11:33 UTC|newest]

Thread overview: 35+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2010-12-01  3:08 Shaohua Li
2010-12-01  4:21 ` Minchan Kim
2010-12-01  5:42   ` Shaohua Li
2010-12-01  9:44 ` KOSAKI Motohiro
2010-12-01 15:58   ` Minchan Kim
2010-12-02  0:09     ` KOSAKI Motohiro
2010-12-02  0:29       ` KOSAKI Motohiro
2010-12-02  0:58         ` Minchan Kim
2010-12-02  0:19     ` Andrew Morton
2010-12-02  9:40       ` Mel Gorman
2010-12-02  0:29     ` Shaohua Li
2010-12-02  0:54       ` Minchan Kim
2010-12-02  1:05         ` Shaohua Li
2010-12-02  1:23           ` Minchan Kim
2010-12-02  1:36             ` Minchan Kim
2010-12-02  9:42               ` Mel Gorman
2010-12-02 15:25                 ` Minchan Kim
2010-12-02  2:39             ` Shaohua Li
2010-12-02  1:28       ` KOSAKI Motohiro
2010-12-02 10:12     ` Mel Gorman
2010-12-02 15:35       ` Minchan Kim
2010-12-02 15:42         ` Mel Gorman
2010-12-02 20:53           ` Simon Kirby
2010-12-03 12:00             ` Mel Gorman
2010-12-04 12:07               ` Simon Kirby
2010-12-06 12:03                 ` Mel Gorman
2010-12-09 23:44                   ` Simon Kirby
2010-12-10 11:32                     ` Mel Gorman [this message]
2010-12-10 23:42                       ` Simon Kirby
2010-12-14  9:52                         ` Mel Gorman
2010-12-02 16:00 [PATCH] vmscan: " Minchan Kim
2010-12-03 12:11 ` Mel Gorman
2010-12-09 22:13 ` Andrew Morton
2010-12-10  3:53   ` Minchan Kim
2010-12-10 11:17   ` Mel Gorman

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20101210113245.GQ20133@csn.ul.ie \
    --to=mel@csn.ul.ie \
    --cc=akpm@linux-foundation.org \
    --cc=kosaki.motohiro@jp.fujitsu.com \
    --cc=linux-mm@kvack.org \
    --cc=minchan.kim@gmail.com \
    --cc=shaohua.li@intel.com \
    --cc=sim@hostway.ca \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox