linux-mm.kvack.org archive mirror
From: Howard Chu <hyc@symas.com>
To: Johannes Weiner <hannes@cmpxchg.org>
Cc: Chris Friesen <chris.friesen@genband.com>,
	"Kirill A. Shutemov" <kirill@shutemov.name>,
	Jan Kara <jack@suse.cz>, Mel Gorman <mel@csn.ul.ie>,
	Rik van Riel <riel@redhat.com>,
	linux-kernel <linux-kernel@vger.kernel.org>,
	linux-mm@kvack.org
Subject: Re: mmap vs fs cache
Date: Fri, 08 Mar 2013 12:04:46 -0800
Message-ID: <513A445E.9070806@symas.com>
In-Reply-To: <20130308161643.GE23767@cmpxchg.org>

Johannes Weiner wrote:
> On Fri, Mar 08, 2013 at 07:00:55AM -0800, Howard Chu wrote:
>> Chris Friesen wrote:
>>> On 03/08/2013 03:40 AM, Howard Chu wrote:
>>>
>>>> There is no way that a process that is accessing only 30GB of a mmap
>>>> should be able to fill up 32GB of RAM. There's nothing else running on
>>>> the machine, I've killed or suspended everything else in userland
>>>> besides a couple shells running top and vmstat. When I manually
>>>> drop_caches repeatedly, then eventually slapd RSS/SHR grows to 30GB and
>>>> the physical I/O stops.
>>>
>>> Is it possible that the kernel is doing some sort of automatic
>>> readahead, but it ends up reading pages corresponding to data that isn't
>>> ever queried and so doesn't get mapped by the application?
>>
>> Yes, that's what I was thinking. I added a
>> posix_madvise(..POSIX_MADV_RANDOM) but that had no effect on the
>> test.
>>
>> First obvious conclusion - kswapd is being too aggressive. When free
>> memory hits the low watermark, the reclaim shrinks slapd down from
>> 25GB to 18-19GB, while the page cache still contains ~7GB of
>> unmapped pages. Ideally I'd like a tuning knob so I can say to keep
>> no more than 2GB of unmapped pages in the cache. (And the desired
>> effect of that would be to allow user processes to grow to 30GB
>> total, in this case.)
>
> We should find out where the unmapped page cache is coming from if you
> are only accessing mapped file cache and have disabled readahead.
>
> How do you arrive at this number of unmapped page cache?

The number is simple arithmetic: by the time slapd has grown to 25GB, the page 
cache has grown to 32GB (less about 200MB, the min-free reserve). That leaves 
roughly 7GB of unmapped pages in the cache.
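
A minimal sketch of that arithmetic, assuming the inputs are the Cached and 
Mapped fields of /proc/meminfo (values in kB):

  #include <stdio.h>
  #include <string.h>

  /* Return the value (in kB) of one /proc/meminfo field, or -1. */
  static long meminfo_kb(const char *key)
  {
      FILE *f = fopen("/proc/meminfo", "r");
      char line[128], name[64];
      long val = -1, v;

      if (!f)
          return -1;
      while (fgets(line, sizeof(line), f)) {
          if (sscanf(line, "%63[^:]: %ld", name, &v) == 2 &&
              strcmp(name, key) == 0) {
              val = v;
              break;
          }
      }
      fclose(f);
      return val;
  }

  int main(void)
  {
      long cached = meminfo_kb("Cached");
      long mapped = meminfo_kb("Mapped");

      if (cached < 0 || mapped < 0)
          return 1;
      /* Unmapped page cache ~= file cache not mapped into any process. */
      printf("unmapped cache: ~%ld MB\n", (cached - mapped) / 1024);
      return 0;
  }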

> What could happen is that previously used and activated pages do not
> get evicted anymore since there is a constant supply of younger
> reclaimable cache that is actually thrashing.  Whenever you drop the
> caches, you get rid of those stale active pages and allow the
> previously thrashing cache to get activated.  However, that would
> require that there is already a significant amount of active file
> pages before your workload starts (check the nr_active_file number in
> /proc/vmstat before launching slapd, try sync; echo 3 >drop_caches
> before launching to eliminate this option) OR that the set of pages
> accessed during your workload changes and the combined set of pages
> accessed by your workload is bigger than available memory -- which you
> claimed would not happen because you only access the 30GB file area on
> that system.

There are no other active pages before the test begins: nothing else is 
running, and the caches are dropped completely at the start.
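
For completeness, a minimal sketch of the pre-test reset and check suggested 
above (sync, drop all caches, then read nr_active_file from /proc/vmstat); 
writing to drop_caches requires root:

  #include <stdio.h>
  #include <string.h>
  #include <unistd.h>

  int main(void)
  {
      FILE *f;
      char name[64];
      long val;

      sync();                     /* flush dirty data first */

      /* equivalent of: echo 3 > /proc/sys/vm/drop_caches */
      f = fopen("/proc/sys/vm/drop_caches", "w");
      if (!f)
          return 1;
      fputs("3\n", f);
      fclose(f);

      /* nr_active_file should now be near zero */
      f = fopen("/proc/vmstat", "r");
      if (!f)
          return 1;
      while (fscanf(f, "%63s %ld", name, &val) == 2) {
          if (strcmp(name, "nr_active_file") == 0)
              printf("nr_active_file = %ld\n", val);
      }
      fclose(f);
      return 0;
  }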

The test is clearly accessing only 30GB of data. Once slapd reaches that 
process size, the test can be stopped and restarted any number of times and 
run continuously for any number of hours; memory use on the system is 
unchanged and no pageins occur.
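
For reference, a minimal sketch of the kind of mapping setup under discussion: 
map the file read-only and hint random access so the kernel suppresses 
readahead on the mapping. The path is a placeholder:

  #include <fcntl.h>
  #include <stdio.h>
  #include <sys/mman.h>
  #include <sys/stat.h>
  #include <unistd.h>

  int main(void)
  {
      const char *path = "/path/to/data.mdb";    /* placeholder path */
      struct stat st;
      void *map;
      int rc, fd = open(path, O_RDONLY);

      if (fd < 0 || fstat(fd, &st) < 0)
          return 1;

      map = mmap(NULL, st.st_size, PROT_READ, MAP_SHARED, fd, 0);
      if (map == MAP_FAILED)
          return 1;

      /* Hint random access; same intent as madvise(MADV_RANDOM).
       * posix_madvise() returns 0 on success, an error number otherwise. */
      rc = posix_madvise(map, st.st_size, POSIX_MADV_RANDOM);
      if (rc != 0)
          fprintf(stderr, "posix_madvise: error %d\n", rc);

      /* ... run the workload against the mapping here ... */

      munmap(map, st.st_size);
      close(fd);
      return 0;
  }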

-- 
   -- Howard Chu
   CTO, Symas Corp.           http://www.symas.com
   Director, Highland Sun     http://highlandsun.com/hyc/
   Chief Architect, OpenLDAP  http://www.openldap.org/project/

