From: Johannes Weiner <hannes@cmpxchg.org>
To: Bob Liu <bob.liu@oracle.com>
Cc: Andrew Morton <akpm@linux-foundation.org>,
	Andi Kleen <andi@firstfloor.org>,
	Andrea Arcangeli <aarcange@redhat.com>,
	Christoph Hellwig <hch@infradead.org>,
	Dave Chinner <david@fromorbit.com>,
	Greg Thelen <gthelen@google.com>, Hugh Dickins <hughd@google.com>,
	Jan Kara <jack@suse.cz>,
	KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>,
	Luigi Semenzato <semenzato@google.com>,
	Mel Gorman <mgorman@suse.de>, Metin Doslu <metin@citusdata.com>,
	Michel Lespinasse <walken@google.com>,
	Minchan Kim <minchan.kim@gmail.com>,
	Ozgun Erdogan <ozgun@citusdata.com>,
	Peter Zijlstra <peterz@infradead.org>,
	Rik van Riel <riel@redhat.com>,
	Roman Gushchin <klamm@yandex-team.ru>,
	Ryan Mallon <rmallon@gmail.com>, Tejun Heo <tj@kernel.org>,
	Vlastimil Babka <vbabka@suse.cz>,
	linux-mm@kvack.org, linux-fsdevel@vger.kernel.org,
	linux-kernel@vger.kernel.org
Subject: Re: [patch 7/9] mm: thrash detection-based file cache sizing
Date: Tue, 14 Jan 2014 14:16:19 -0500
Message-ID: <20140114191619.GI6963@cmpxchg.org>
In-Reply-To: <52D48C55.3020200@oracle.com>

On Tue, Jan 14, 2014 at 09:01:09AM +0800, Bob Liu wrote:
> Hi Johannes,
>
> On 01/11/2014 02:10 AM, Johannes Weiner wrote:
> > The VM maintains cached filesystem pages on two types of lists. One
> > list holds the pages recently faulted into the cache, the other list
> > holds pages that have been referenced repeatedly on that first list.
> > The idea is to prefer reclaiming young pages over those that have
> > shown to benefit from caching in the past. We call the recently used
> > list "inactive list" and the frequently used list "active list".
> >
> > Currently, the VM aims for a 1:1 ratio between the lists, which is the
> > "perfect" trade-off between the ability to *protect* frequently used
> > pages and the ability to *detect* frequently used pages. This means
> > that working set changes bigger than half of cache memory go
> > undetected and thrash indefinitely, whereas working sets bigger than
> > half of cache memory are unprotected against used-once streams that
> > don't even need caching.
> >
>
> Good job! This patch looks good to me and with nice descriptions.
> But it seems that this patch only fix the issue "working set changes
> bigger than half of cache memory go undetected and thrash indefinitely".
> My concern is could it be extended easily to address all other issues
> based on this patch set?
>
> The other possible way is something like Peter has implemented the CART
> and Clock-Pro which I think may be better because of using advanced
> algorithms and consider the problem as a whole from the beginning.(Sorry
> I haven't get enough time to read the source code, so I'm not 100% sure.)
> http://linux-mm.org/PeterZClockPro2

My patches are moving the VM towards something that is comparable to
how Peter implemented Clock-Pro. However, the current VM has evolved
over time in small increments based on real life performance
observations. Rewriting everything in one go would be incredibly
disruptive and I doubt very much we would merge any such proposal in
the first place. So it's not like I don't see the big picture, it's
just divide and conquer:

Peter's Clock-Pro implementation was basically a double clock with an
intricate system to classify hotness, augmented by eviction
information to work with reuse distances independent of memory size.
What we have right now is a double clock with a very rudimentary
system to classify whether a page is hot: it has been accessed twice
while on the inactive clock. My patches now add eviction information
to this, and improve the classification so that it can work with reuse
distances up to memory size and is no longer dependent on the inactive
clock size.
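
To make the new data point a bit more concrete, here is a minimal
userspace sketch of the idea. It is not the patch itself. It assumes a
single counter that advances on every eviction and activation, a
snapshot of that counter left behind when a page is evicted, and a
comparison of the resulting distance against the active list size. All
struct and function names are invented for illustration. With three
active pages, a page that refaults after two further evictions has a
distance of 2 and would be activated; after four, the distance is 4
and it would not.

```c
#include <stdbool.h>

/* Hypothetical, simplified model. None of these names are kernel APIs. */
struct cache_model {
	unsigned long inactive_age;	/* advances on every eviction and activation */
	unsigned long nr_active;	/* current number of pages on the active list */
};

/* What stays behind in place of an evicted page: the eviction information. */
struct shadow_entry {
	unsigned long eviction_age;
};

/* Evict one page from the inactive tail and remember when that happened. */
static struct shadow_entry model_evict(struct cache_model *c)
{
	struct shadow_entry s;

	c->inactive_age++;
	s.eviction_age = c->inactive_age;
	return s;
}

/* Promote one inactive page; this also ages the inactive list by one slot. */
static void model_activate(struct cache_model *c)
{
	c->inactive_age++;
	c->nr_active++;
}

/* Evictions and activations that happened since the page was evicted. */
static unsigned long refault_distance(const struct cache_model *c,
				      struct shadow_entry s)
{
	return c->inactive_age - s.eviction_age;
}

/*
 * Assumption of this sketch: a refaulting page counts as hot when the
 * missing slots could have been provided by the active list.
 */
static bool refault_is_hot(const struct cache_model *c, struct shadow_entry s)
{
	return refault_distance(c, s) <= c->nr_active;
}
```
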
This is the smallest imaginable step that is still useful, and even
then we had a lot of discussions about scalability of the data
structures and confusion about how the new data point should be
interpreted. It also took a long time until somebody read the series
and went, "Ok, this actually makes sense to me." Now, maybe I suck at
documenting, but maybe this is just complicated stuff. Either way, we
have to get there collectively, so that the code is maintainable in
the long term.

Once we have these new concepts established, we can further improve
the hotness detector so that it can classify and order pages with
reuse distances beyond memory size. But this will come with its own
set of problems. For example, some time ago we stopped regularly
scanning and rotating active pages because of scalability issues, but
we'll most likely need an up-to-date estimate of the reuse distances on
the active list in order to classify refaults properly.

> > + * Approximating inactive page access frequency - Observations:
> > + *
> > + * 1. When a page is accessed for the first time, it is added to the
> > + * head of the inactive list, slides every existing inactive page
> > + * towards the tail by one slot, and pushes the current tail page
> > + * out of memory.
> > + *
> > + * 2. When a page is accessed for the second time, it is promoted to
> > + * the active list, shrinking the inactive list by one slot. This
> > + * also slides all inactive pages that were faulted into the cache
> > + * more recently than the activated page towards the tail of the
> > + * inactive list.
> > + *
>
> Nitpick, how about the reference bit?

What do you mean?