From: Balbir Singh <balbir@linux.vnet.ibm.com>
To: linux-mm@kvack.org
Cc: YAMAMOTO Takashi <yamamoto@valinux.co.jp>,
lizf@cn.fujitsu.com,
KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>,
Rik van Riel <riel@redhat.com>,
Andrew Morton <akpm@linux-foundation.org>,
KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Subject: Re: [PATCH 0/5] Memory controller soft limit patches (v7)
Date: Tue, 24 Mar 2009 23:04:14 +0530
Message-ID: <20090324173414.GB24227@balbir.in.ibm.com>
In-Reply-To: <20090319165713.27274.94129.sendpatchset@localhost.localdomain>

* Balbir Singh <balbir@linux.vnet.ibm.com> [2009-03-19 22:27:13]:
>
> From: Balbir Singh <balbir@linux.vnet.ibm.com>
>
> New Feature: Soft limits for memory resource controller.
>
> Changelog v7...v6
> 1. Added checks in reclaim path to make sure we don't infinitely loop
> 2. Refactored reclaim options into a new patch
> 3. Tested several scenarios, see tests below
>
> Changelog v6...v5
> 1. If the number of reclaimed pages is zero, select the next mem cgroup
>    for reclamation
> 2. Fixed a bug where the key was being updated after insertion into the tree
> 3. Fixed a build issue, when CONFIG_MEM_RES_CTLR is not enabled
>
> Changelog v5...v4
> 1. Several changes to the reclaim logic, please see patch 4 (reclaim on
>    contention). I've experimented with several possibilities for reclaim
>    and chose to come back to this due to the excellent behaviour seen while
>    testing the patchset.
> 2. Reduced the overhead of soft limits on resource counters very significantly.
> Reaim benchmark now shows almost no drop in performance.
>
> Changelog v4...v3
> 1. Adopted suggestions from Kamezawa to do a per-zone-per-node reclaim
>    while doing soft limit reclaim. We don't record priorities while
>    doing soft reclaim
> 2. Some of the overheads associated with soft limits (like calculating
>    excess each time) have been eliminated
> 3. The time_after(jiffies, 0) bug has been fixed
> 4. Tasks are throttled if the mem cgroup they belong to is being soft reclaimed
>    and at the same time tasks are increasing the memory footprint and causing
>    the mem cgroup to exceed its soft limit.
>
> Changelog v3...v2
> 1. Implemented several review comments from Kosaki-San and Kamezawa-San
> Please see individual changelogs for changes
>
> Changelog v2...v1
> 1. Soft limits now support hierarchies
> 2. Use spinlocks instead of mutexes for synchronization of the RB tree
>
> Here is v7 of the new soft limit implementation. Soft limits are a new feature
> for the memory resource controller; something similar has existed in the
> group scheduler in the form of shares. The CPU controller's interpretation
> of shares is very different, though.
>
> Soft limits are most useful in environments where the administrator wants
> to overcommit the system, so that the limits become active only under memory
> contention. The current soft limit implementation provides a
> soft_limit_in_bytes interface for the memory controller but not for the
> memory+swap controller. The implementation maintains an RB-tree of groups
> that exceed their soft limit and starts reclaiming from the group that
> exceeds its limit by the largest amount.
>
> So far I have the best test results with this patchset. I've experimented with
> several approaches and methods. I might be a little delayed in responding, as
> I might have only intermittent access to the internet for the next few days.
>
> TODOs
>
> 1. The current implementation maintains the delta from the soft limit
>    and pushes groups back to their soft limits; a ratio of delta/soft_limit
>    might be more useful
>
>
> Tests
> -----
>
> I've run two memory intensive workloads with differing soft limits and
> seen that they are pushed back to their soft limit on contention. Their usage
> was their soft limit plus additional memory that they were able to grab
> on the system. Soft limit reclaim can take a while to produce the expected
> results.
>
> The other tests I've run are
> 1. Deletion of groups while soft limit is in progress in the hierarchy
> 2. Setting the soft limit to zero and running other groups with non-zero
> soft limits.
> 3. Setting the soft limit to zero and testing if the mem cgroup is able
> to use available memory
> 4. Tested the patches with hierarchy enabled
> 5. Tested with swapoff -a, to make sure we don't go into an infinite loop
>
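
To make the selection rule in the quoted description concrete, here is a
minimal sketch in Python. It is not the kernel code: the names Group and
pick_victim are made up for illustration, a plain max() over a list stands in
for the RB-tree, and treating a zero soft limit as an infinite ratio is an
assumption of this sketch. It shows the current rule (largest delta over the
soft limit) next to the delta/soft_limit ratio mentioned under TODOs.

```python
from dataclasses import dataclass


@dataclass
class Group:
    name: str
    usage: int       # bytes currently charged to the group
    soft_limit: int  # bytes, as written to soft_limit_in_bytes

    @property
    def excess(self) -> int:
        # Delta above the soft limit; zero when the group is within it.
        return max(self.usage - self.soft_limit, 0)


def pick_victim(groups, by_ratio=False):
    """Return the group to reclaim from first, or None if none is over."""
    over = [g for g in groups if g.excess > 0]
    if not over:
        return None
    if by_ratio:
        # TODO variant: order by delta / soft_limit.
        # Assumption: a zero soft limit counts as an infinite ratio.
        def key(g):
            return g.excess / g.soft_limit if g.soft_limit else float("inf")
    else:
        # Current rule from the description: largest absolute excess.
        def key(g):
            return g.excess
    return max(over, key=key)


MB = 1 << 20
groups = [
    Group("a", usage=1500 * MB, soft_limit=1000 * MB),  # 500 MB over, ratio 0.5
    Group("b", usage=300 * MB, soft_limit=100 * MB),    # 200 MB over, ratio 2.0
    Group("c", usage=50 * MB, soft_limit=200 * MB),     # within its soft limit
]

first_by_delta = pick_victim(groups)
first_by_ratio = pick_victim(groups, by_ratio=True)
```

With these made-up numbers the two orderings disagree: the delta rule picks
"a", while the ratio rule picks "b", which is the kind of difference the TODO
item is about.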

I've run lmbench with the soft limit patches, and the results show no
major overhead. There are some outliers and some unexpected results.
The outliers are in context switching at 16p/64K and in the communication
latencies. The unexpected results are cases where the soft limit changes
appear to improve performance; I consider these to be within the range
of noise.

                 L M B E N C H  2 . 0   S U M M A R Y
                 ------------------------------------

Basic system parameters
----------------------------------------------------
Host                 OS Description              Mhz
--------- ------------- ----------------------- ----
nosoftlim Linux 2.6.29-        x86_64-linux-gnu 2131
softlimit Linux 2.6.29-        x86_64-linux-gnu 2131

Processor, Processes - times in microseconds - smaller is better
----------------------------------------------------------------
Host                 OS  Mhz null null      open selct sig  sig  fork exec sh
                             call  I/O stat clos TCP   inst hndl proc proc proc
--------- ------------- ---- ---- ---- ---- ---- ----- ---- ---- ---- ---- ----
nosoftlim Linux 2.6.29- 2131 0.67 1.33 29.9 36.8 6.484 1.12 12.1 508. 1708 6281
softlimit Linux 2.6.29- 2131 0.66 1.31 29.8 36.8 6.486 1.11 12.3 483. 1697 6241

Context switching - times in microseconds - smaller is better
-------------------------------------------------------------
Host                 OS 2p/0K 2p/16K 2p/64K 8p/16K 8p/64K 16p/16K 16p/64K
                        ctxsw  ctxsw  ctxsw  ctxsw  ctxsw   ctxsw   ctxsw
--------- ------------- ----- ------ ------ ------ ------ ------- -------
nosoftlim Linux 2.6.29- 2.190 9.2300 3.1900 9.7400   10.8 7.93000 4.36000
softlimit Linux 2.6.29- 0.970 4.8200 3.1300 8.8900   10.3 8.82000    10.7

*Local* Communication latencies in microseconds - smaller is better
-------------------------------------------------------------------
Host                 OS 2p/0K  Pipe AF     UDP  RPC/   TCP  RPC/ TCP
                        ctxsw       UNIX         UDP         TCP conn
--------- ------------- ----- ----- ---- ----- ----- ----- ----- ----
nosoftlim Linux 2.6.29- 2.190  22.0 58.5  53.3  68.7  61.7  64.9 210.
softlimit Linux 2.6.29- 0.970  20.3 55.3  54.0  53.8  79.7  64.5 211.

File & VM system latencies in microseconds - smaller is better
--------------------------------------------------------------
Host OS 0K File 10K File Mmap Prot Page
Create Delete Create Delete Latency Fault Fault
--------- ------------- ------ ------ ------ ------ ------- ----- -----
nosoftlim Linux 2.6.29- 51.6 48.6 153.6 87.4 20.2K 7.00000
softlimit Linux 2.6.29- 51.6 48.2 137.8 83.9 20.2K 6.00000
*Local* Communication bandwidths in MB/s - bigger is better
-----------------------------------------------------------
Host                 OS Pipe AF   TCP  File   Mmap   Bcopy  Bcopy  Mem  Mem
                             UNIX      reread reread (libc) (hand) read write
--------- ------------- ---- ---- ---- ------ ------ ------ ------ ---- -----
nosoftlim Linux 2.6.29- 1367 778. 803. 2058.5 4659.4 1303.9 1303.5 4664 1422.
softlimit Linux 2.6.29- 1314 823. 812. 2061.3 4659.9 1290.2 1280.9 4662 1422.

Memory latencies in nanoseconds - smaller is better
    (WARNING - may not be correct, check graphs)
---------------------------------------------------
Host                 OS  Mhz  L1 $   L2 $ Main mem Guesses
--------- ------------- ---- ----- ------ -------- -------
nosoftlim Linux 2.6.29- 2131 1.875 6.5990     76.8
softlimit Linux 2.6.29- 2131 1.875 6.5980     76.8

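
For anyone who wants to pick out the outliers mechanically, a small Python
sketch along these lines computes the relative change between the two runs.
The value pairs are copied from the context switching and communication
latency tables above; the 20% cut-off and the function names are arbitrary
choices for illustration, not part of lmbench.

```python
# (nosoftlim, softlimit) pairs in microseconds, copied from the tables above.
results = {
    "ctxsw 2p/0K": (2.190, 0.970),
    "ctxsw 2p/16K": (9.23, 4.82),
    "ctxsw 2p/64K": (3.19, 3.13),
    "ctxsw 8p/16K": (9.74, 8.89),
    "ctxsw 8p/64K": (10.8, 10.3),
    "ctxsw 16p/16K": (7.93, 8.82),
    "ctxsw 16p/64K": (4.36, 10.7),
    "RPC/UDP latency": (68.7, 53.8),
    "TCP latency": (61.7, 79.7),
}

THRESHOLD = 20.0  # percent; an arbitrary cut-off chosen for this sketch


def relative_change(base, new):
    """Change of `new` relative to `base`, in percent (negative = faster)."""
    return (new - base) / base * 100.0


def outliers(data, threshold=THRESHOLD):
    """Return {name: percent change} for entries beyond +/- threshold."""
    flagged = {}
    for name, (base, new) in data.items():
        change = relative_change(base, new)
        if abs(change) > threshold:
            flagged[name] = round(change, 1)
    return flagged


if __name__ == "__main__":
    for name, change in sorted(outliers(results).items()):
        print(f"{name:18s} {change:+.1f}%")
```

A negative percentage means the softlimit run was faster. A single run per
kernel says nothing about variance, so this only highlights where to look.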
Earlier, I ran reaim and saw no regression there either.

--
Balbir