From: Glauber Costa <glommer@parallels.com>
To: Serge Hallyn <serge.hallyn@ubuntu.com>
Cc: linux-mm@kvack.org, hughd@google.com,
	containers@lists.linux-foundation.org,
	Dave Chinner <david@fromorbit.com>,
	Michal Hocko <mhocko@suse.cz>,
	Johannes Weiner <hannes@cmpxchg.org>,
	linux-fsdevel@vger.kernel.org,
	Andrew Morton <akpm@linux-foundation.org>
Subject: Re: [PATCH v2 00/28] memcg-aware slab shrinking
Date: Mon, 8 Apr 2013 12:11:55 +0400
Message-ID: <51627BCB.1060603@parallels.com>
In-Reply-To: <20130401141217.GA9336@sergelap>

On 04/01/2013 06:12 PM, Serge Hallyn wrote:
> Quoting Glauber Costa (glommer@parallels.com):
>> On 04/01/2013 04:38 PM, Serge Hallyn wrote:
>>> Quoting Glauber Costa (glommer@parallels.com):
>>>> Hi,
>>>>
>>>> Notes:
>>>> ======
>>>>
>>>> This is v2 of memcg-aware LRU shrinking. I've been testing it extensively
>>>> and it behaves well, at least from the isolation point of view. However,
>>>> I feel some more testing is needed before we commit to it. Still, this is
>>>> doing the job fairly well. Comments welcome.
>>>
>>> Do you have any performance tests (preferably with enough runs with and
>>> without this patchset to show 95% confidence interval) to show the
>>> impact this has?  Certainly the feature sounds worthwhile, but I'm
>>> curious about the cost of maintaining this extra state.
>>>
>>> -serge
>>>
>> Not yet. I intend to include them in my next run. I haven't yet decided
>> on a set of tests to run (maybe just a memcg-contained kernel compile?)
>>
>> So if you have suggestions of what I could run to show this, feel free
>> to lay them down here.
> 
> Perhaps mount a 4G tmpfs, copy kernel tree there, and build kernel on
> that tmpfs?
> 
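
For concreteness, that suggestion amounts to something like the sketch
below. This is only an illustration: the /mnt/kbuild mount point is an
assumption, and the mount could just as well be done from the shell.

/* Sketch of the suggested setup: a 4G tmpfs to hold the source tree.
 * Needs root. */
#include <stdio.h>
#include <sys/mount.h>

int main(void)
{
	if (mount("tmpfs", "/mnt/kbuild", "tmpfs", 0, "size=4G")) {
		perror("mount tmpfs");
		return 1;
	}
	/* copy the kernel tree to /mnt/kbuild and build from there */
	return 0;
}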

I've just run kernbench with 2GB setups, with 3 different kernels. I
will include all this data in my opening letter for the next submission,
but wanted to drop a heads-up here:

Kernels
========
base: the current -mm
davelru: that + Dave's patches applied
fulllru: that + my patches applied

I ran all of them in a first-level cgroup. Please note that the first
two kernels are not capable of shrinking metadata, so I had to select a
size that keeps the workload under relatively constant pressure without
that pressure coming exclusively from kernel memory. 2GB did the job.
This is a 2-node, 24-way machine. My access to it is very limited, and I
have no idea when I'll be able to get my hands on it again.
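
For reference, the confinement was conceptually along the lines of the
sketch below. This is a simplification, not the exact harness: it
assumes the cgroup v1 memory controller is mounted at
/sys/fs/cgroup/memory, the cgroup name "kbench" is made up, and the
plain make invocation stands in for the builds that kernbench drives.

/* Sketch: cap a kernel build at 2G with the cgroup v1 memory
 * controller. Needs root. */
#include <errno.h>
#include <stdio.h>
#include <stdlib.h>
#include <sys/stat.h>
#include <unistd.h>

static void write_str(const char *path, const char *val)
{
	FILE *f = fopen(path, "w");

	if (!f || fputs(val, f) == EOF) {
		perror(path);
		exit(1);
	}
	fclose(f);
}

int main(void)
{
	char pid[16];

	/* Create the memcg and limit it to 2G. */
	if (mkdir("/sys/fs/cgroup/memory/kbench", 0755) && errno != EEXIST) {
		perror("mkdir");
		return 1;
	}
	write_str("/sys/fs/cgroup/memory/kbench/memory.limit_in_bytes", "2G");

	/* Move this task (and everything it execs) into the memcg. */
	snprintf(pid, sizeof(pid), "%d", (int)getpid());
	write_str("/sys/fs/cgroup/memory/kbench/tasks", pid);

	/* Run the build under the memory limit. */
	execlp("make", "make", "-j", "24", (char *)NULL);
	perror("execlp make");
	return 1;
}

kernbench itself essentially repeats timed kernel builds, so the part
that matters for the numbers below is the memory limit, not the exact
build command.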

Results:
========

Base
====

Average Optimal load -j 24 Run (std deviation):
Elapsed Time 415.988 (8.37909)
User Time 4142 (759.964)
System Time 418.483 (62.0377)
Percent CPU 1030.7 (267.462)
Context Switches 391509 (268361)
Sleeps 738483 (149934)

Dave
====

Average Optimal load -j 24 Run (std deviation):
Elapsed Time 424.486 (16.7365) ( + 2 % vs base)
User Time 4146.8 (764.012) ( + 0.84 % vs base)
System Time 419.24 (62.4507) (+ 0.18 % vs base)
Percent CPU 1012.1 (264.558) (-1.8 % vs base)
Context Switches 393363 (268899) (+ 0.47 % vs base)
Sleeps 739905 (147344) (+ 0.19 % vs base)


Full
=====

Average Optimal load -j 24 Run (std deviation):
Elapsed Time 456.644 (15.3567) ( + 9.7 % vs base)
User Time 4036.3 (645.261) ( - 2.5 % vs base)
System Time 438.134 (82.251) ( + 4.7 % vs base)
Percent CPU 973 (168.581) ( - 5.6 % vs base)
Context Switches 350796 (229700) ( - 10 % vs base)
Sleeps 728156 (138808) ( - 1.4 % vs base )

Discussion:
===========

First-level analysis: all figures fall within one standard deviation,
except for the Full LRU wall time, which still falls within two standard
deviations. On the other hand, the Full LRU kernel achieves better CPU
utilization and greater efficiency.

Details: the reclaim patterns of the three kernels are expected to
differ. User memory will always be the main driver, but under pressure
the first two kernels will shrink only user memory while keeping the
metadata intact. This should lead to smaller system time figures at the
expense of bigger user time figures, since user pages will be evicted
more often. This is consistent with the figures I measured.

The Full LRU kernel shows 2.5 % better user time, with 5.6 % less CPU
consumed and 10 % fewer context switches.

This comes at the expense of a 4.7 % increase in system time: because we
have to bring more dentry and inode objects back into the caches, we
stress the slab code more.

Because this benchmark stresses metadata heavily, it is expected that
this increase affects the final wall-time result proportionally. Note
that the mere introduction of the LRU code (Dave's kernel) does not move
the wall-time result outside the standard deviation. Shrinking those
objects, however, does lead to bigger wall times. This is within
expectations: no one would argue that the right kernel behavior for all
cases is to keep metadata in memory at the expense of user memory (and
even if we should, we should do the same for the cgroups).

My final conclusion is that, performance-wise, the work is sound and
operates within expectations.


