From: Mel Gorman <mel@csn.ul.ie>
To: Simon Kirby <sim@hostway.ca>
Cc: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>,
	Shaohua Li <shaohua.li@intel.com>,
	Dave Hansen <dave@linux.vnet.ibm.com>,
	Andrew Morton <akpm@linux-foundation.org>,
	Johannes Weiner <hannes@cmpxchg.org>,
	Rik van Riel <riel@redhat.com>, linux-mm <linux-mm@kvack.org>,
	linux-kernel <linux-kernel@vger.kernel.org>
Subject: Re: [PATCH 0/5] Prevent kswapd dumping excessive amounts of memory in response to high-order allocations V2
Date: Thu, 9 Dec 2010 12:13:09 +0000
Message-ID: <20101209121309.GB20133@csn.ul.ie>
In-Reply-To: <20101209011808.GC3796@hostway.ca>

On Wed, Dec 08, 2010 at 05:18:08PM -0800, Simon Kirby wrote:
> On Fri, Dec 03, 2010 at 11:45:29AM +0000, Mel Gorman wrote:
> 
> > This still needs testing. I've tried multiple reproduction scenarios locally
> > but two things are tripping me up. One, Simon's network card is using
> > GFP_ATOMIC allocations whereas the one I use locally does not. Second,
> > Simon's is a real mail workload with network traffic and there are no decent
> > mail simulator benchmarks (that I could find at least) that would replicate
> > the situation.
> > Still, I'm hopeful it'll stop kswapd going mad on his machine and might
> > also alleviate some of the "too much free memory" problem.
> > 
> > Changelog since V1
> >   o Take classzone into account
> >   o Ensure that kswapd always balances at order-0
> >   o Reset classzone and order after reading
> >   o Require a percentage of a node be balanced for high-order allocations,
> >     not just any zone as ZONE_DMA could be balanced when the node in general
> >     is a mess
> > 
> > Simon Kirby reported the following problem
> > 
> >    We're seeing cases on a number of servers where cache never fully
> >    grows to use all available memory.  Sometimes we see servers with 4
> >    GB of memory that never seem to have less than 1.5 GB free, even with
> >    a constantly-active VM.  In some cases, these servers also swap out
> >    while this happens, even though they are constantly reading the working
> >    set into memory.  We have been seeing this happening for a long time;
> >    I don't think it's anything recent, and it still happens on 2.6.36.
> > 
> > After some debugging work by Simon, Dave Hansen and others, the prevailing
> > theory became that kswapd is reclaiming order-3 pages requested by SLUB
> > too aggressively.
> > 
> > There are two apparent problems here. On the target machine, there is a small
> > Normal zone in comparison to DMA32. As kswapd tries to balance all zones, it
> > would continually try reclaiming for Normal even though DMA32 was balanced
> > enough for callers. The second problem is that sleeping_prematurely() uses
> > the requested order, not the order kswapd finally reclaimed at. This keeps
> > kswapd artificially awake.
> > 
> > This series aims to alleviate these problems but needs testing to confirm
> > it alleviates the actual problem and wider review to consider whether there
> > is a better alternative approach. Local tests passed but unfortunately do
> > not reproduce the same problem, so the results are inconclusive.
> 
> So, we have been running the first version of this series in production
> since November 26th, and this version of this series in production since
> early yesterday morning.  Both versions definitely solve the kswapd not
> sleeping problem and do improve the use of memory for caching.  There are
> still problems with fragmentation causing reclaim of more page cache than
> I would like, but without this patch, the system is in bad shape (it
> keeps reading daemons in from disk because kswapd keeps reclaiming them).
> 

This is a plus at least. I've cc'd Andrew, Johannes and Rik so they are
aware of this result. I just released V3 of the series which is very
similar to this version with one major exception, patch 5, which alters
how sleeping_prematurely() treats zone->all_unreclaimable.
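
For anyone following along, the "percentage of a node" idea from the
changelog quoted above can be illustrated with a small userspace model.
This is only a sketch, not the kernel code: the Zone class, node_balanced()
and the 25% threshold (min_fraction) are hypothetical names and values
chosen for illustration, the zone sizes are made up, and the model ignores
the classzone handling.

```python
from dataclasses import dataclass


@dataclass
class Zone:
    name: str
    present_pages: int  # pages managed by the zone (made-up figures below)
    balanced: bool      # True if the zone meets its watermark at this order


def node_balanced(zones, order, min_fraction=0.25):
    """Userspace model only; min_fraction is an assumed threshold."""
    if order == 0:
        # Order-0 keeps the strict rule: every zone must be balanced.
        return all(z.balanced for z in zones)
    # High-order: enough of the node, by size, must sit in balanced zones.
    total = sum(z.present_pages for z in zones)
    balanced = sum(z.present_pages for z in zones if z.balanced)
    return balanced >= min_fraction * total


# Illustrative node: tiny DMA, large DMA32, smaller Normal.
zones = [
    Zone("DMA", 4096, True),
    Zone("DMA32", 750000, False),
    Zone("Normal", 250000, False),
]

# Only ZONE_DMA is balanced, which is far too little of the node.
print(node_balanced(zones, order=3))  # False

# Once DMA32 is balanced, kswapd could stop even though Normal is not.
zones[1].balanced = True
print(node_balanced(zones, order=3))  # True
print(node_balanced(zones, order=0))  # False
```

The point of weighting by zone size is that a balanced ZONE_DMA alone
cannot end high-order reclaim, while a small unbalanced Normal zone no
longer keeps kswapd running on its own.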

> http://0x.ca/sim/ref/2.6.36/?C=M;O=A
> http://0x.ca/sim/ref/2.6.36/mel_v2_memory_day.png
> http://0x.ca/sim/ref/2.6.36/mel_v2_buddyinfo_day.png
> http://0x.ca/sim/ref/2.6.36/mel_v2_buddyinfo_DMA32_day.png
> http://0x.ca/sim/ref/2.6.36/mel_v2_buddyinfo_Normal_day.png
> 
> No problem with page allocation failures or any other problem in the
> weeks of testing.
> 

As you've reported that moving slub to order-0 does not help, I don't
think slub is the only problem any more. I think V3 of the series is
worth merging just for the kswapd-being-awake problem. If there are
still too many free pages after this is merged, the next best guess is
that it's order-1 pages for task_struct causing the problem.
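
To put the orders in perspective: an order-n block is 2^n contiguous
pages. The sketch below does that arithmetic and sums the free pages at or
above a given order from one line of /proc/buddyinfo. It assumes 4 KiB
pages; the helper names and the sample line are made up for illustration
and are not taken from Simon's machine.

```python
PAGE_SIZE = 4096  # assumption: 4 KiB pages


def block_bytes(order, page_size=PAGE_SIZE):
    """Size in bytes of one order-`order` block (2**order contiguous pages)."""
    return (1 << order) * page_size


def parse_buddyinfo_line(line):
    """Split one /proc/buddyinfo line into (node, zone, counts).

    counts[i] is the number of free blocks of order i.
    """
    fields = line.split()
    # Fields: "Node", "<n>,", "zone", "<name>", then one count per order.
    node = int(fields[1].rstrip(","))
    zone = fields[3]
    counts = [int(f) for f in fields[4:]]
    return node, zone, counts


def free_pages_at_or_above(counts, order):
    """Free pages held in blocks of at least the given order."""
    return sum(n << o for o, n in enumerate(counts) if o >= order)


# Made-up sample line in the /proc/buddyinfo layout.
sample = "Node 0, zone   Normal    120     40     10      2      0      0      0      0      0      0      0"
node, zone, counts = parse_buddyinfo_line(sample)

print(block_bytes(1))                     # 8192  (order-1)
print(block_bytes(3))                     # 32768 (order-3)
print(free_pages_at_or_above(counts, 0))  # 256
print(free_pages_at_or_above(counts, 1))  # 136
print(free_pages_at_or_above(counts, 3))  # 16
```

In a zone fragmented like this sample, plenty of memory is free in total
but very little of it can satisfy an order-3 request, which is the
situation in which kswapd gets woken for high-order reclaim.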

-- 
Mel Gorman
Part-time Phd Student                          Linux Technology Center
University of Limerick                         IBM Dublin Software Lab

