linux-mm.kvack.org archive mirror
 help / color / mirror / Atom feed
From: Chandra Seetharaman <sekharan@us.ibm.com>
To: ckrm-tech <ckrm-tech@lists.sourceforge.net>,
	linux-mm <linux-mm@kvack.org>
Subject: [PATCH 6/6] CKRM: Documentation for mem controller
Date: Fri, 24 Jun 2005 15:29:02 -0700	[thread overview]
Message-ID: <1119652142.14910.0.camel@linuxchandra> (raw)

Patch 6 of 6 patches to support memory controller under CKRM framework.
Documentaion for the memory controller.

 Documentation/ckrm/mem_rc.design |  184 +++++++++++++++++++++++++++++++
++++++++
 Documentation/ckrm/mem_rc.todo   |   12 ++
 Documentation/ckrm/mem_rc.usage  |  112 +++++++++++++++++++++++
 3 files changed, 308 insertions(+)

Content-Disposition: inline; filename=11-06-mem_config-docs

Index: linux-2.6.12/Documentation/ckrm/mem_rc.design
===================================================================
--- /dev/null
+++ linux-2.6.12/Documentation/ckrm/mem_rc.design
@@ -0,0 +1,184 @@
+0. Lifecycle of a LRU Page:
+----------------------------
+These are the events in a page's lifecycle:
+   - allocation of the page
+     there are multiple high level page alloc functions; __alloc_pages
()
+	 is the lowest level function that does the real allocation.
+   - get into LRU list (active list or inactive list)
+   - get out of LRU list
+   - freeing the page
+     there are multiple high level page free functions; free_pages_bulk
()
+	 is the lowest level function that does the real free.
+
+When the memory subsystem runs low on LRU pages, pages are reclaimed by
+    - moving pages from active list to inactive list
(refill_inactive_zone())
+    - freeing pages from the inactive list (shrink_zone)
+depending on the recent usage of the page(approximately).
+
+In the process of the life cycle a page can move from the lru list to
swap
+and back. For this document's purpose, we treat it same as freeing and
+allocating the page, respectfully.
+
+1. Introduction
+---------------
+Memory resource controller controls the number of lru physical pages
+(active and inactive list) a class uses. It does not restrict any
+other physical pages (slabs etc.,)
+
+For simplicity, this document will always refer lru physical pages as
+physical pages or simply pages.
+
+There are two parameters(that are set by the user) that affect the
number
+of pages a class is allowed to have in active/inactive list.
+They are
+  - guarantee - specifies the number of pages a class is
+	guaranteed to get. In other words, if a class is using less than
+	'guarantee' number of pages, its pages will not be freed when the
+	memory subsystem tries to free some pages.
+  - limit - specifies the maximum number of pages a class can get;
+    'limit' in essence can be considered as the 'hard limit'
+
+Rest of this document details how these two parameters are used in the
+memory allocation logic.
+
+Note that the numbers that are specified in the shares file, doesn't
+directly correspond to the number of pages. But, the user can make
+it so by making the total_guarantee and max_limit of the default class
+(/rcfs/taskclass) to be the total number of pages(given in stats file)
+available in the system.
+
+  for example:
+   # cd /rcfs/taskclass
+   # grep System stats
+   System: tot_pages=257512,active=5897,inactive=2931,free=243991
+   # cat shares
+   res=mem,guarantee=-2,limit=-2,total_guarantee=100,max_limit=100
+
+  "tot_pages=257512" above mean there are 257512 lru pages in
+  the system.
+
+  By making total_guarantee and max_limit to be same as this number at
+  this level (/rcfs/taskclass), one can make guarantee and limit in all
+  classes refer to the number of pages.
+
+  # echo 'res=mem,total_guarantee=257512,max_limit=257512' > shares
+  # cat shares
+  res=mem,guarantee=-2,limit=-2,total_guarantee=257512,max_limit=257512
+
+
+The number of pages a class can use be anywhere between zero and its
+limit. CKRM memory controller springs into action when the system needs
+to choose a victim page to swap out. While the number of pages a class
can
+have allocated may be anywhere between zero and its limit, victim
+pages will be choosen from classes that are above their guarantee.
+
+Victim class will be chosen by the number pages a class is using over
its
+guarantee. i.e a class that is using 10000 pages over its guarantee
will be
+chosen against a class that is using 1000 pages over its guarantee.
+Pages belonging to classes that are below their guarantee will not be
+chosen as a victim.
+
+Whenever a class's usage goes over its limit number of pages, memory
+allocations will fail. In order to reduce the failure rate and to
behave
+like the VM, CKRM provides config parameters that will free up pages
+of a class when it is getting closer to its limit. Next section details
+different parameters and how they can be used.
+
+2. Configuaration parameters
+---------------------------
+
+Memory controller provides the following configuration parameters.
Usage of
+these parameters will be made clear in the following section.
+
+state: Shows whether the memory controller is enabled(1) or disabled
(0). By
+    default, the controller is disabled. User can either enabled it by
just
+    changing the state or is is enabled automatically either when the
user
+    defines a new class or changes the shares of the default root
class.
+
+fail_over: When pages are being allocated, if the class is over
fail_over % of
+    its limit, then fail the memory allocation. Default is 110.
+    ex: If limit of a class is 30000 and fail_over is 110, then memory
+    allocations would start failing once the class is using more than
33000
+    pages.
+
+shrink_at: When a class is using shrink_at % of its limit, then start
+    shrinking the class, i.e start freeing the page to make more free
pages
+    available for this class. Default is 90.
+    ex: If limit of a class is 30000 and shrink_at is 90, then pages
from this
+    class will start to get freed when the class's usage is above 27000
+
+shrink_to: When a class reached shrink_at % of its limit, ckrm will try
to
+    shrink the class's usage to shrink_to %. Defalut is 80.
+    ex: If limit of a class is 30000 with shrink_at being 90 and
shrink_to
+    being 80, then ckrm will try to free pages from the class when its
+    usage reaches 27000 and will try to bring it down to 24000.
+
+num_shrinks: Number of shrink attempts ckrm will do within
shrink_interval
+    seconds. After this many attempts in a period, ckrm will not
attempt a
+    shrink even if the class's usage goes over shrink_at %. Default is
10.
+
+shrink_interval: Number of seconds in a shrink period. Default is 10.
+
+3. Design
+--------------------------
+
+CKRM memory resource controller taps at appropriate low level memory
+management functions to associate a page with a class and to charge
+a class that brings the page to the LRU list.
+
+CKRM maintains lru lists per-class instead of keeping it system-wide,
so
+that reducing a class's usage doesn't involve going through the system-
wide
+lru lists.
+
+3.1 Changes in page allocation function(__alloc_pages())
+--------------------------------------------------------
+- If the class that the current task belong to is over 'fail_over' % of
its
+  'limit', allocation of page(s) fail. Otherwise, the page allocation
will
+  proceed as before.
+- Note that the class is _not_ charged for the page(s) here.
+
+3.2 Adding/Deleting page to active/inactive list
+-------------------------------------------------
+When a page is added to the active or inactive list, the class that the
+task belongs to is charged for the page usage.
+
+When a page is deleted from the active or inactive list, the class that
the
+page belongs to is credited back.
+
+If a class uses 'shrink_at' % of its limit, attempt is made to shrink
+the class's usage to 'shrink_to' % of its limit, in order to help the
class
+stay within its limit.
+But, if the class is aggressive, and keep getting over the class's
limit
+often(more than such 'num_shrinks' events in 'shrink_interval'
seconds),
+then the memory resource controller gives up on the class and doesn't
try
+to shrink the class, which will eventually lead the class to reach
+fail_over % and then the page allocations will start failing.
+
+3.3 Changes in the page reclaimation path (refill_inactive_zone and
shrink_zone)
+-------------------------------------------------------------------------------
+Pages will be moved from active to inactive list(refill_inactive_zone)
and
+pages from inactive list by choosing victim classes. Victim classes are
+chosen depending on their usage over their guarantee.
+
+Classes with DONT_CARE guarantee are assumed an implicit guarantee
which is
+based on the number of children(with DONT_CARE guarantee) its parent
has
+(including the default class) and the unused pages its parent still
has.
+ex1: If a default root class /rcfs/taskclass has 3 children c1, c2 and
c3
+and has 200000 pages, and all the classes have DONT_CARE guarantees,
then
+all the classes (c1, c2, c3 and the default class of /rcfs/taskclass)
will
+get 50000 (200000 / 4) pages each.
+ex2: If, in the above example c1 is set with a guarantee of 80000
pages,
+then the other classes (c2, c3 and the default class
of /rcfs/taskclass)
+will get 40000 ((200000 - 80000) / 3) pages each.
+
+3.5 Handling of Shared pages
+----------------------------
+Even if a mm is shared by tasks, the pages that belong to the mm will
be
+charged against the individual tasks that bring the page into LRU.
+
+But, when any task that is using a mm moves to a different class or
exits,
+then all pages that belong to the mm will be charged against the
richest
+class among the tasks that are using the mm.
+
+Note: Shared page handling need to be improved with a better policy.
+
Index: linux-2.6.12/Documentation/ckrm/mem_rc.usage
===================================================================
--- /dev/null
+++ linux-2.6.12/Documentation/ckrm/mem_rc.usage
@@ -0,0 +1,112 @@
+Installation
+------------
+
+1. Configure "Class based physical memory controller" under CKRM (see
+      Documentation/ckrm/installation)
+
+2. Reboot the system with the new kernel.
+
+3. Verify that the memory controller is present by reading the file
+   /rcfs/taskclass/config (should show a line with res=mem)
+
+Usage
+-----
+
+For brevity, unless otherwise specified all the following commands are
+executed in the default class (/rcfs/taskclass).
+
+Initially, the systemwide default class gets 100% of the LRU pages, and
the
+stats file at the /rcfs/taskclass level displays the total number of
+physical pages.
+
+   # cd /rcfs/taskclass
+   # grep System stats
+   System: tot_pages=239778,active=60473,inactive=135285,free=44555
+   # cat shares
+   res=mem,guarantee=-2,limit=-2,total_guarantee=100,max_limit=100
+
+   tot_pages - total number of pages
+   active    - number of pages in the active list ( sum of all zones)
+   inactive  - number of pages in the inactive list ( sum of all zones)
+   free      - number of free pages (sum of all zones)
+
+   By making total_guarantee and max_limit to be same as tot_pages, one
can
+   make the numbers in shares file be same as the number of pages for a
+   class.
+
+   # echo 'res=mem,total_guarantee=239778,max_limit=239778' > shares
+   # cat shares
+
res=mem,guarantee=-2,limit=-2,total_guarantee=239778,max_limit=239778
+
+Changing configuration parameters:
+----------------------------------
+For description of the paramters read the file mem_rc.design in this
same directory.
+
+Following is the default values for the configuration parameters:
+
+   localhost:~ # cd /rcfs/taskclass
+   localhost:/rcfs/taskclass # cat config
+
res=mem,state=1,fail_over=110,shrink_at=90,shrink_to=80,num_shrinks=10,shrink_interval=10
+
+Here is how to change a specific configuration parameter. Note that
more than one
+configuration parameter can be changed in a single echo command though
for simplicity
+we show one per echo.
+
+ex: Changing fail_over:
+   localhost:/rcfs/taskclass # echo "res=mem,fail_over=120" > config
+   localhost:/rcfs/taskclass # cat config
+
res=mem,state=1,fail_over=120,shrink_at=90,shrink_to=80,num_shrinks=10,shrink_interval=10
+
+ex: Changing shrink_at:
+   localhost:/rcfs/taskclass # echo "res=mem,shrink_at=85" > config
+   localhost:/rcfs/taskclass # cat config
+
res=mem,state=1,fail_over=120,shrink_at=85,shrink_to=80,num_shrinks=10,shrink_interval=10
+
+ex: Changing shrink_to:
+   localhost:/rcfs/taskclass # echo "res=mem,shrink_to=75" > config
+   localhost:/rcfs/taskclass # cat config
+
res=mem,state=1,fail_over=120,shrink_at=85,shrink_to=75,num_shrinks=10,shrink_interval=10
+
+ex: Changing num_shrinks:
+   localhost:/rcfs/taskclass # echo "res=mem,num_shrinks=20" > config
+   localhost:/rcfs/taskclass # cat config
+
res=mem,state=1,fail_over=120,shrink_at=85,shrink_to=75,num_shrinks=20,shrink_interval=10
+
+ex: Changing shrink_interval:
+   localhost:/rcfs/taskclass # echo "res=mem,shrink_interval=15" >
config
+   localhost:/rcfs/taskclass # cat config
+
res=mem,state=1,fail_over=120,shrink_at=85,shrink_to=75,num_shrinks=20,shrink_interval=15
+
+Class creation
+--------------
+
+   # mkdir c1
+
+Its initial share is DONT_CARE. The parent's share values will be
unchanged.
+
+Setting a new class share
+-------------------------
+
+   # echo 'res=mem,guarantee=25000,limit=50000' > c1/shares
+
+   # cat c1/shares
+
res=mem,guarantee=25000,limit=50000,total_guarantee=100,max_limit=100
+
+   'guarantee' specifies the number of pages this class entitled to get
+   'limit' is the maximum number of pages this class can get.
+
+Monitoring
+----------
+
+stats file shows statistics of the page usage of a class
+   # cat stats
+   ----------- Memory Resource stats start -----------
+   System: tot_pages=239778,active=60473,inactive=135285,free=44555
+   Number of pages used(including pages lent to children): 196654
+   Number of pages guaranteed: 239778
+   Maximum limit of pages: 239778
+   Total number of pages available(after serving guarantees to
children): 214778
+   Number of pages lent to children: 0
+   Number of pages borrowed from the parent: 0
+   ----------- Memory Resource stats end -----------
+
Index: linux-2.6.12/Documentation/ckrm/mem_rc.todo
===================================================================
--- /dev/null
+++ linux-2.6.12/Documentation/ckrm/mem_rc.todo
@@ -0,0 +1,12 @@
+Here are list of things to be done in the memory controller.
+
+	- meaningful names for parres, mem_rcbs etc.,
+	- make functions set_impl() and recalc() clean and simple
+	- in __alloc_pages(), when try_harder is set, try reclaiming
+	  pages if class is over its limit.
+	- move accounting (from zone/ckrm_zone) to different area and
+	  use it in both places
+	- support NUMA
+	- account shared pages properly
+	- use attributes file and make most of the config parameters class
+	  specific.

-- 

----------------------------------------------------------------------
    Chandra Seetharaman               | Be careful what you choose....
              - sekharan@us.ibm.com   |      .......you may get it.
----------------------------------------------------------------------


--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"aart@kvack.org"> aart@kvack.org </a>

             reply	other threads:[~2005-06-24 22:29 UTC|newest]

Thread overview: 3+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2005-06-24 22:29 Chandra Seetharaman [this message]
  -- strict thread matches above, loose matches on Subject: below --
2005-05-19  0:33 Chandra Seetharaman
2005-04-02  3:15 Chandra Seetharaman

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=1119652142.14910.0.camel@linuxchandra \
    --to=sekharan@us.ibm.com \
    --cc=ckrm-tech@lists.sourceforge.net \
    --cc=linux-mm@kvack.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox