* [PATCH 6/6] CKRM: Documentation for mem controller
@ 2005-06-24 22:29 Chandra Seetharaman
0 siblings, 0 replies; 3+ messages in thread
From: Chandra Seetharaman @ 2005-06-24 22:29 UTC (permalink / raw)
To: ckrm-tech, linux-mm
Patch 6 of 6 patches to support memory controller under CKRM framework.
Documentaion for the memory controller.
Documentation/ckrm/mem_rc.design | 184 +++++++++++++++++++++++++++++++
++++++++
Documentation/ckrm/mem_rc.todo | 12 ++
Documentation/ckrm/mem_rc.usage | 112 +++++++++++++++++++++++
3 files changed, 308 insertions(+)
Content-Disposition: inline; filename=11-06-mem_config-docs
Index: linux-2.6.12/Documentation/ckrm/mem_rc.design
===================================================================
--- /dev/null
+++ linux-2.6.12/Documentation/ckrm/mem_rc.design
@@ -0,0 +1,184 @@
+0. Lifecycle of a LRU Page:
+----------------------------
+These are the events in a page's lifecycle:
+ - allocation of the page
+ there are multiple high level page alloc functions; __alloc_pages
()
+ is the lowest level function that does the real allocation.
+ - get into LRU list (active list or inactive list)
+ - get out of LRU list
+ - freeing the page
+ there are multiple high level page free functions; free_pages_bulk
()
+ is the lowest level function that does the real free.
+
+When the memory subsystem runs low on LRU pages, pages are reclaimed by
+ - moving pages from active list to inactive list
(refill_inactive_zone())
+ - freeing pages from the inactive list (shrink_zone)
+depending on the recent usage of the page(approximately).
+
+In the process of the life cycle a page can move from the lru list to
swap
+and back. For this document's purpose, we treat it same as freeing and
+allocating the page, respectfully.
+
+1. Introduction
+---------------
+Memory resource controller controls the number of lru physical pages
+(active and inactive list) a class uses. It does not restrict any
+other physical pages (slabs etc.,)
+
+For simplicity, this document will always refer lru physical pages as
+physical pages or simply pages.
+
+There are two parameters(that are set by the user) that affect the
number
+of pages a class is allowed to have in active/inactive list.
+They are
+ - guarantee - specifies the number of pages a class is
+ guaranteed to get. In other words, if a class is using less than
+ 'guarantee' number of pages, its pages will not be freed when the
+ memory subsystem tries to free some pages.
+ - limit - specifies the maximum number of pages a class can get;
+ 'limit' in essence can be considered as the 'hard limit'
+
+Rest of this document details how these two parameters are used in the
+memory allocation logic.
+
+Note that the numbers that are specified in the shares file, doesn't
+directly correspond to the number of pages. But, the user can make
+it so by making the total_guarantee and max_limit of the default class
+(/rcfs/taskclass) to be the total number of pages(given in stats file)
+available in the system.
+
+ for example:
+ # cd /rcfs/taskclass
+ # grep System stats
+ System: tot_pages=257512,active=5897,inactive=2931,free=243991
+ # cat shares
+ res=mem,guarantee=-2,limit=-2,total_guarantee=100,max_limit=100
+
+ "tot_pages=257512" above mean there are 257512 lru pages in
+ the system.
+
+ By making total_guarantee and max_limit to be same as this number at
+ this level (/rcfs/taskclass), one can make guarantee and limit in all
+ classes refer to the number of pages.
+
+ # echo 'res=mem,total_guarantee=257512,max_limit=257512' > shares
+ # cat shares
+ res=mem,guarantee=-2,limit=-2,total_guarantee=257512,max_limit=257512
+
+
+The number of pages a class can use be anywhere between zero and its
+limit. CKRM memory controller springs into action when the system needs
+to choose a victim page to swap out. While the number of pages a class
can
+have allocated may be anywhere between zero and its limit, victim
+pages will be choosen from classes that are above their guarantee.
+
+Victim class will be chosen by the number pages a class is using over
its
+guarantee. i.e a class that is using 10000 pages over its guarantee
will be
+chosen against a class that is using 1000 pages over its guarantee.
+Pages belonging to classes that are below their guarantee will not be
+chosen as a victim.
+
+Whenever a class's usage goes over its limit number of pages, memory
+allocations will fail. In order to reduce the failure rate and to
behave
+like the VM, CKRM provides config parameters that will free up pages
+of a class when it is getting closer to its limit. Next section details
+different parameters and how they can be used.
+
+2. Configuaration parameters
+---------------------------
+
+Memory controller provides the following configuration parameters.
Usage of
+these parameters will be made clear in the following section.
+
+state: Shows whether the memory controller is enabled(1) or disabled
(0). By
+ default, the controller is disabled. User can either enabled it by
just
+ changing the state or is is enabled automatically either when the
user
+ defines a new class or changes the shares of the default root
class.
+
+fail_over: When pages are being allocated, if the class is over
fail_over % of
+ its limit, then fail the memory allocation. Default is 110.
+ ex: If limit of a class is 30000 and fail_over is 110, then memory
+ allocations would start failing once the class is using more than
33000
+ pages.
+
+shrink_at: When a class is using shrink_at % of its limit, then start
+ shrinking the class, i.e start freeing the page to make more free
pages
+ available for this class. Default is 90.
+ ex: If limit of a class is 30000 and shrink_at is 90, then pages
from this
+ class will start to get freed when the class's usage is above 27000
+
+shrink_to: When a class reached shrink_at % of its limit, ckrm will try
to
+ shrink the class's usage to shrink_to %. Defalut is 80.
+ ex: If limit of a class is 30000 with shrink_at being 90 and
shrink_to
+ being 80, then ckrm will try to free pages from the class when its
+ usage reaches 27000 and will try to bring it down to 24000.
+
+num_shrinks: Number of shrink attempts ckrm will do within
shrink_interval
+ seconds. After this many attempts in a period, ckrm will not
attempt a
+ shrink even if the class's usage goes over shrink_at %. Default is
10.
+
+shrink_interval: Number of seconds in a shrink period. Default is 10.
+
+3. Design
+--------------------------
+
+CKRM memory resource controller taps at appropriate low level memory
+management functions to associate a page with a class and to charge
+a class that brings the page to the LRU list.
+
+CKRM maintains lru lists per-class instead of keeping it system-wide,
so
+that reducing a class's usage doesn't involve going through the system-
wide
+lru lists.
+
+3.1 Changes in page allocation function(__alloc_pages())
+--------------------------------------------------------
+- If the class that the current task belong to is over 'fail_over' % of
its
+ 'limit', allocation of page(s) fail. Otherwise, the page allocation
will
+ proceed as before.
+- Note that the class is _not_ charged for the page(s) here.
+
+3.2 Adding/Deleting page to active/inactive list
+-------------------------------------------------
+When a page is added to the active or inactive list, the class that the
+task belongs to is charged for the page usage.
+
+When a page is deleted from the active or inactive list, the class that
the
+page belongs to is credited back.
+
+If a class uses 'shrink_at' % of its limit, attempt is made to shrink
+the class's usage to 'shrink_to' % of its limit, in order to help the
class
+stay within its limit.
+But, if the class is aggressive, and keep getting over the class's
limit
+often(more than such 'num_shrinks' events in 'shrink_interval'
seconds),
+then the memory resource controller gives up on the class and doesn't
try
+to shrink the class, which will eventually lead the class to reach
+fail_over % and then the page allocations will start failing.
+
+3.3 Changes in the page reclaimation path (refill_inactive_zone and
shrink_zone)
+-------------------------------------------------------------------------------
+Pages will be moved from active to inactive list(refill_inactive_zone)
and
+pages from inactive list by choosing victim classes. Victim classes are
+chosen depending on their usage over their guarantee.
+
+Classes with DONT_CARE guarantee are assumed an implicit guarantee
which is
+based on the number of children(with DONT_CARE guarantee) its parent
has
+(including the default class) and the unused pages its parent still
has.
+ex1: If a default root class /rcfs/taskclass has 3 children c1, c2 and
c3
+and has 200000 pages, and all the classes have DONT_CARE guarantees,
then
+all the classes (c1, c2, c3 and the default class of /rcfs/taskclass)
will
+get 50000 (200000 / 4) pages each.
+ex2: If, in the above example c1 is set with a guarantee of 80000
pages,
+then the other classes (c2, c3 and the default class
of /rcfs/taskclass)
+will get 40000 ((200000 - 80000) / 3) pages each.
+
+3.5 Handling of Shared pages
+----------------------------
+Even if a mm is shared by tasks, the pages that belong to the mm will
be
+charged against the individual tasks that bring the page into LRU.
+
+But, when any task that is using a mm moves to a different class or
exits,
+then all pages that belong to the mm will be charged against the
richest
+class among the tasks that are using the mm.
+
+Note: Shared page handling need to be improved with a better policy.
+
Index: linux-2.6.12/Documentation/ckrm/mem_rc.usage
===================================================================
--- /dev/null
+++ linux-2.6.12/Documentation/ckrm/mem_rc.usage
@@ -0,0 +1,112 @@
+Installation
+------------
+
+1. Configure "Class based physical memory controller" under CKRM (see
+ Documentation/ckrm/installation)
+
+2. Reboot the system with the new kernel.
+
+3. Verify that the memory controller is present by reading the file
+ /rcfs/taskclass/config (should show a line with res=mem)
+
+Usage
+-----
+
+For brevity, unless otherwise specified all the following commands are
+executed in the default class (/rcfs/taskclass).
+
+Initially, the systemwide default class gets 100% of the LRU pages, and
the
+stats file at the /rcfs/taskclass level displays the total number of
+physical pages.
+
+ # cd /rcfs/taskclass
+ # grep System stats
+ System: tot_pages=239778,active=60473,inactive=135285,free=44555
+ # cat shares
+ res=mem,guarantee=-2,limit=-2,total_guarantee=100,max_limit=100
+
+ tot_pages - total number of pages
+ active - number of pages in the active list ( sum of all zones)
+ inactive - number of pages in the inactive list ( sum of all zones)
+ free - number of free pages (sum of all zones)
+
+ By making total_guarantee and max_limit to be same as tot_pages, one
can
+ make the numbers in shares file be same as the number of pages for a
+ class.
+
+ # echo 'res=mem,total_guarantee=239778,max_limit=239778' > shares
+ # cat shares
+
res=mem,guarantee=-2,limit=-2,total_guarantee=239778,max_limit=239778
+
+Changing configuration parameters:
+----------------------------------
+For description of the paramters read the file mem_rc.design in this
same directory.
+
+Following is the default values for the configuration parameters:
+
+ localhost:~ # cd /rcfs/taskclass
+ localhost:/rcfs/taskclass # cat config
+
res=mem,state=1,fail_over=110,shrink_at=90,shrink_to=80,num_shrinks=10,shrink_interval=10
+
+Here is how to change a specific configuration parameter. Note that
more than one
+configuration parameter can be changed in a single echo command though
for simplicity
+we show one per echo.
+
+ex: Changing fail_over:
+ localhost:/rcfs/taskclass # echo "res=mem,fail_over=120" > config
+ localhost:/rcfs/taskclass # cat config
+
res=mem,state=1,fail_over=120,shrink_at=90,shrink_to=80,num_shrinks=10,shrink_interval=10
+
+ex: Changing shrink_at:
+ localhost:/rcfs/taskclass # echo "res=mem,shrink_at=85" > config
+ localhost:/rcfs/taskclass # cat config
+
res=mem,state=1,fail_over=120,shrink_at=85,shrink_to=80,num_shrinks=10,shrink_interval=10
+
+ex: Changing shrink_to:
+ localhost:/rcfs/taskclass # echo "res=mem,shrink_to=75" > config
+ localhost:/rcfs/taskclass # cat config
+
res=mem,state=1,fail_over=120,shrink_at=85,shrink_to=75,num_shrinks=10,shrink_interval=10
+
+ex: Changing num_shrinks:
+ localhost:/rcfs/taskclass # echo "res=mem,num_shrinks=20" > config
+ localhost:/rcfs/taskclass # cat config
+
res=mem,state=1,fail_over=120,shrink_at=85,shrink_to=75,num_shrinks=20,shrink_interval=10
+
+ex: Changing shrink_interval:
+ localhost:/rcfs/taskclass # echo "res=mem,shrink_interval=15" >
config
+ localhost:/rcfs/taskclass # cat config
+
res=mem,state=1,fail_over=120,shrink_at=85,shrink_to=75,num_shrinks=20,shrink_interval=15
+
+Class creation
+--------------
+
+ # mkdir c1
+
+Its initial share is DONT_CARE. The parent's share values will be
unchanged.
+
+Setting a new class share
+-------------------------
+
+ # echo 'res=mem,guarantee=25000,limit=50000' > c1/shares
+
+ # cat c1/shares
+
res=mem,guarantee=25000,limit=50000,total_guarantee=100,max_limit=100
+
+ 'guarantee' specifies the number of pages this class entitled to get
+ 'limit' is the maximum number of pages this class can get.
+
+Monitoring
+----------
+
+stats file shows statistics of the page usage of a class
+ # cat stats
+ ----------- Memory Resource stats start -----------
+ System: tot_pages=239778,active=60473,inactive=135285,free=44555
+ Number of pages used(including pages lent to children): 196654
+ Number of pages guaranteed: 239778
+ Maximum limit of pages: 239778
+ Total number of pages available(after serving guarantees to
children): 214778
+ Number of pages lent to children: 0
+ Number of pages borrowed from the parent: 0
+ ----------- Memory Resource stats end -----------
+
Index: linux-2.6.12/Documentation/ckrm/mem_rc.todo
===================================================================
--- /dev/null
+++ linux-2.6.12/Documentation/ckrm/mem_rc.todo
@@ -0,0 +1,12 @@
+Here are list of things to be done in the memory controller.
+
+ - meaningful names for parres, mem_rcbs etc.,
+ - make functions set_impl() and recalc() clean and simple
+ - in __alloc_pages(), when try_harder is set, try reclaiming
+ pages if class is over its limit.
+ - move accounting (from zone/ckrm_zone) to different area and
+ use it in both places
+ - support NUMA
+ - account shared pages properly
+ - use attributes file and make most of the config parameters class
+ specific.
--
----------------------------------------------------------------------
Chandra Seetharaman | Be careful what you choose....
- sekharan@us.ibm.com | .......you may get it.
----------------------------------------------------------------------
--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org. For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"aart@kvack.org"> aart@kvack.org </a>
^ permalink raw reply [flat|nested] 3+ messages in thread
* [PATCH 6/6] CKRM: Documentation for mem controller
@ 2005-05-19 0:33 Chandra Seetharaman
0 siblings, 0 replies; 3+ messages in thread
From: Chandra Seetharaman @ 2005-05-19 0:33 UTC (permalink / raw)
To: ckrm-tech, linux-mm
Patch 6 of 6 patches to support memory controller under CKRM framework.
Documentaion for the memory controller.
mem_rc.design | 184 ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
mem_rc.usage | 112 +++++++++++++++++++++++++++++++++++
2 files changed, 296 insertions(+)
Content-Disposition: inline; filename=11-06-mem_config-docs
Index: aa/Documentation/ckrm/mem_rc.design
===================================================================
--- /dev/null
+++ aa/Documentation/ckrm/mem_rc.design
@@ -0,0 +1,184 @@
+0. Lifecycle of a LRU Page:
+----------------------------
+These are the events in a page's lifecycle:
+ - allocation of the page
+ there are multiple high level page alloc functions; __alloc_pages()
+ is the lowest level function that does the real allocation.
+ - get into LRU list (active list or inactive list)
+ - get out of LRU list
+ - freeing the page
+ there are multiple high level page free functions; free_pages_bulk()
+ is the lowest level function that does the real free.
+
+When the memory subsystem runs low on LRU pages, pages are reclaimed by
+ - moving pages from active list to inactive list (refill_inactive_zone())
+ - freeing pages from the inactive list (shrink_zone)
+depending on the recent usage of the page(approximately).
+
+In the process of the life cycle a page can move from the lru list to swap
+and back. For this document's purpose, we treat it same as freeing and
+allocating the page, respectfully.
+
+1. Introduction
+---------------
+Memory resource controller controls the number of lru physical pages
+(active and inactive list) a class uses. It does not restrict any
+other physical pages (slabs etc.,)
+
+For simplicity, this document will always refer lru physical pages as
+physical pages or simply pages.
+
+There are two parameters(that are set by the user) that affect the number
+of pages a class is allowed to have in active/inactive list.
+They are
+ - guarantee - specifies the number of pages a class is
+ guaranteed to get. In other words, if a class is using less than
+ 'guarantee' number of pages, its pages will not be freed when the
+ memory subsystem tries to free some pages.
+ - limit - specifies the maximum number of pages a class can get;
+ 'limit' in essence can be considered as the 'hard limit'
+
+Rest of this document details how these two parameters are used in the
+memory allocation logic.
+
+Note that the numbers that are specified in the shares file, doesn't
+directly correspond to the number of pages. But, the user can make
+it so by making the total_guarantee and max_limit of the default class
+(/rcfs/taskclass) to be the total number of pages(given in stats file)
+available in the system.
+
+ for example:
+ # cd /rcfs/taskclass
+ # grep System stats
+ System: tot_pages=257512,active=5897,inactive=2931,free=243991
+ # cat shares
+ res=mem,guarantee=-2,limit=-2,total_guarantee=100,max_limit=100
+
+ "tot_pages=257512" above mean there are 257512 lru pages in
+ the system.
+
+ By making total_guarantee and max_limit to be same as this number at
+ this level (/rcfs/taskclass), one can make guarantee and limit in all
+ classes refer to the number of pages.
+
+ # echo 'res=mem,total_guarantee=257512,max_limit=257512' > shares
+ # cat shares
+ res=mem,guarantee=-2,limit=-2,total_guarantee=257512,max_limit=257512
+
+
+The number of pages a class can use be anywhere between zero and its
+limit. CKRM memory controller springs into action when the system needs
+to choose a victim page to swap out. While the number of pages a class can
+have allocated may be anywhere between zero and its limit, victim
+pages will be choosen from classes that are above their guarantee.
+
+Victim class will be chosen by the number pages a class is using over its
+guarantee. i.e a class that is using 10000 pages over its guarantee will be
+chosen against a class that is using 1000 pages over its guarantee.
+Pages belonging to classes that are below their guarantee will not be
+chosen as a victim.
+
+Whenever a class's usage goes over its limit number of pages, memory
+allocations will fail. In order to reduce the failure rate and to behave
+like the VM, CKRM provides config parameters that will free up pages
+of a class when it is getting closer to its limit. Next section details
+different parameters and how they can be used.
+
+2. Configuaration parameters
+---------------------------
+
+Memory controller provides the following configuration parameters. Usage of
+these parameters will be made clear in the following section.
+
+state: Shows whether the memory controller is enabled(1) or disabled(0). By
+ default, the controller is disabled. User can either enabled it by just
+ changing the state or is is enabled automatically either when the user
+ defines a new class or changes the shares of the default root class.
+
+fail_over: When pages are being allocated, if the class is over fail_over % of
+ its limit, then fail the memory allocation. Default is 110.
+ ex: If limit of a class is 30000 and fail_over is 110, then memory
+ allocations would start failing once the class is using more than 33000
+ pages.
+
+shrink_at: When a class is using shrink_at % of its limit, then start
+ shrinking the class, i.e start freeing the page to make more free pages
+ available for this class. Default is 90.
+ ex: If limit of a class is 30000 and shrink_at is 90, then pages from this
+ class will start to get freed when the class's usage is above 27000
+
+shrink_to: When a class reached shrink_at % of its limit, ckrm will try to
+ shrink the class's usage to shrink_to %. Defalut is 80.
+ ex: If limit of a class is 30000 with shrink_at being 90 and shrink_to
+ being 80, then ckrm will try to free pages from the class when its
+ usage reaches 27000 and will try to bring it down to 24000.
+
+num_shrinks: Number of shrink attempts ckrm will do within shrink_interval
+ seconds. After this many attempts in a period, ckrm will not attempt a
+ shrink even if the class's usage goes over shrink_at %. Default is 10.
+
+shrink_interval: Number of seconds in a shrink period. Default is 10.
+
+3. Design
+--------------------------
+
+CKRM memory resource controller taps at appropriate low level memory
+management functions to associate a page with a class and to charge
+a class that brings the page to the LRU list.
+
+CKRM maintains lru lists per-class instead of keeping it system-wide, so
+that reducing a class's usage doesn't involve going through the system-wide
+lru lists.
+
+3.1 Changes in page allocation function(__alloc_pages())
+--------------------------------------------------------
+- If the class that the current task belong to is over 'fail_over' % of its
+ 'limit', allocation of page(s) fail. Otherwise, the page allocation will
+ proceed as before.
+- Note that the class is _not_ charged for the page(s) here.
+
+3.2 Adding/Deleting page to active/inactive list
+-------------------------------------------------
+When a page is added to the active or inactive list, the class that the
+task belongs to is charged for the page usage.
+
+When a page is deleted from the active or inactive list, the class that the
+page belongs to is credited back.
+
+If a class uses 'shrink_at' % of its limit, attempt is made to shrink
+the class's usage to 'shrink_to' % of its limit, in order to help the class
+stay within its limit.
+But, if the class is aggressive, and keep getting over the class's limit
+often(more than such 'num_shrinks' events in 'shrink_interval' seconds),
+then the memory resource controller gives up on the class and doesn't try
+to shrink the class, which will eventually lead the class to reach
+fail_over % and then the page allocations will start failing.
+
+3.3 Changes in the page reclaimation path (refill_inactive_zone and shrink_zone)
+-------------------------------------------------------------------------------
+Pages will be moved from active to inactive list(refill_inactive_zone) and
+pages from inactive list by choosing victim classes. Victim classes are
+chosen depending on their usage over their guarantee.
+
+Classes with DONT_CARE guarantee are assumed an implicit guarantee which is
+based on the number of children(with DONT_CARE guarantee) its parent has
+(including the default class) and the unused pages its parent still has.
+ex1: If a default root class /rcfs/taskclass has 3 children c1, c2 and c3
+and has 200000 pages, and all the classes have DONT_CARE guarantees, then
+all the classes (c1, c2, c3 and the default class of /rcfs/taskclass) will
+get 50000 (200000 / 4) pages each.
+ex2: If, in the above example c1 is set with a guarantee of 80000 pages,
+then the other classes (c2, c3 and the default class of /rcfs/taskclass)
+will get 40000 ((200000 - 80000) / 3) pages each.
+
+3.5 Handling of Shared pages
+----------------------------
+Even if a mm is shared by tasks, the pages that belong to the mm will be
+charged against the individual tasks that bring the page into LRU.
+
+But, when any task that is using a mm moves to a different class or exits,
+then all pages that belong to the mm will be charged against the richest
+class among the tasks that are using the mm.
+
+Note: Shared page handling need to be improved with a better policy.
+
Index: aa/Documentation/ckrm/mem_rc.usage
===================================================================
--- /dev/null
+++ aa/Documentation/ckrm/mem_rc.usage
@@ -0,0 +1,112 @@
+Installation
+------------
+
+1. Configure "Class based physical memory controller" under CKRM (see
+ Documentation/ckrm/installation)
+
+2. Reboot the system with the new kernel.
+
+3. Verify that the memory controller is present by reading the file
+ /rcfs/taskclass/config (should show a line with res=mem)
+
+Usage
+-----
+
+For brevity, unless otherwise specified all the following commands are
+executed in the default class (/rcfs/taskclass).
+
+Initially, the systemwide default class gets 100% of the LRU pages, and the
+stats file at the /rcfs/taskclass level displays the total number of
+physical pages.
+
+ # cd /rcfs/taskclass
+ # grep System stats
+ System: tot_pages=239778,active=60473,inactive=135285,free=44555
+ # cat shares
+ res=mem,guarantee=-2,limit=-2,total_guarantee=100,max_limit=100
+
+ tot_pages - total number of pages
+ active - number of pages in the active list ( sum of all zones)
+ inactive - number of pages in the inactive list ( sum of all zones)
+ free - number of free pages (sum of all zones)
+
+ By making total_guarantee and max_limit to be same as tot_pages, one can
+ make the numbers in shares file be same as the number of pages for a
+ class.
+
+ # echo 'res=mem,total_guarantee=239778,max_limit=239778' > shares
+ # cat shares
+ res=mem,guarantee=-2,limit=-2,total_guarantee=239778,max_limit=239778
+
+Changing configuration parameters:
+----------------------------------
+For description of the paramters read the file mem_rc.design in this same directory.
+
+Following is the default values for the configuration parameters:
+
+ localhost:~ # cd /rcfs/taskclass
+ localhost:/rcfs/taskclass # cat config
+ res=mem,state=1,fail_over=110,shrink_at=90,shrink_to=80,num_shrinks=10,shrink_interval=10
+
+Here is how to change a specific configuration parameter. Note that more than one
+configuration parameter can be changed in a single echo command though for simplicity
+we show one per echo.
+
+ex: Changing fail_over:
+ localhost:/rcfs/taskclass # echo "res=mem,fail_over=120" > config
+ localhost:/rcfs/taskclass # cat config
+ res=mem,state=1,fail_over=120,shrink_at=90,shrink_to=80,num_shrinks=10,shrink_interval=10
+
+ex: Changing shrink_at:
+ localhost:/rcfs/taskclass # echo "res=mem,shrink_at=85" > config
+ localhost:/rcfs/taskclass # cat config
+ res=mem,state=1,fail_over=120,shrink_at=85,shrink_to=80,num_shrinks=10,shrink_interval=10
+
+ex: Changing shrink_to:
+ localhost:/rcfs/taskclass # echo "res=mem,shrink_to=75" > config
+ localhost:/rcfs/taskclass # cat config
+ res=mem,state=1,fail_over=120,shrink_at=85,shrink_to=75,num_shrinks=10,shrink_interval=10
+
+ex: Changing num_shrinks:
+ localhost:/rcfs/taskclass # echo "res=mem,num_shrinks=20" > config
+ localhost:/rcfs/taskclass # cat config
+ res=mem,state=1,fail_over=120,shrink_at=85,shrink_to=75,num_shrinks=20,shrink_interval=10
+
+ex: Changing shrink_interval:
+ localhost:/rcfs/taskclass # echo "res=mem,shrink_interval=15" > config
+ localhost:/rcfs/taskclass # cat config
+ res=mem,state=1,fail_over=120,shrink_at=85,shrink_to=75,num_shrinks=20,shrink_interval=15
+
+Class creation
+--------------
+
+ # mkdir c1
+
+Its initial share is DONT_CARE. The parent's share values will be unchanged.
+
+Setting a new class share
+-------------------------
+
+ # echo 'res=mem,guarantee=25000,limit=50000' > c1/shares
+
+ # cat c1/shares
+ res=mem,guarantee=25000,limit=50000,total_guarantee=100,max_limit=100
+
+ 'guarantee' specifies the number of pages this class entitled to get
+ 'limit' is the maximum number of pages this class can get.
+
+Monitoring
+----------
+
+stats file shows statistics of the page usage of a class
+ # cat stats
+ ----------- Memory Resource stats start -----------
+ System: tot_pages=239778,active=60473,inactive=135285,free=44555
+ Number of pages used(including pages lent to children): 196654
+ Number of pages guaranteed: 239778
+ Maximum limit of pages: 239778
+ Total number of pages available(after serving guarantees to children): 214778
+ Number of pages lent to children: 0
+ Number of pages borrowed from the parent: 0
+ ----------- Memory Resource stats end -----------
+
--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org. For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"aart@kvack.org"> aart@kvack.org </a>
^ permalink raw reply [flat|nested] 3+ messages in thread
* [PATCH 6/6] CKRM: Documentation for mem controller
@ 2005-04-02 3:15 Chandra Seetharaman
0 siblings, 0 replies; 3+ messages in thread
From: Chandra Seetharaman @ 2005-04-02 3:15 UTC (permalink / raw)
To: ckrm-tech, linux-mm
[-- Attachment #1: Type: text/plain, Size: 287 bytes --]
--
----------------------------------------------------------------------
Chandra Seetharaman | Be careful what you choose....
- sekharan@us.ibm.com | .......you may get it.
----------------------------------------------------------------------
[-- Attachment #2: 11-06-mem_config-docs --]
[-- Type: text/plain, Size: 13315 bytes --]
Patch 6 of 6 patches to support memory controller under CKRM framework.
Documentaion for the memory controller.
mem_rc.design | 178 ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
mem_rc.usage | 112 ++++++++++++++++++++++++++++++++++++
2 files changed, 290 insertions(+)
Index: linux-2.6.12-rc1/Documentation/ckrm/mem_rc.design
===================================================================
--- /dev/null
+++ linux-2.6.12-rc1/Documentation/ckrm/mem_rc.design
@@ -0,0 +1,178 @@
+0. Lifecycle of a LRU Page:
+----------------------------
+These are the events in a page's lifecycle:
+ - allocation of the page
+ there are multiple high level page alloc functions; __alloc_pages()
+ is the lowest level function that does the real allocation.
+ - get into LRU list (active list or inactive list)
+ - get out of LRU list
+ - freeing the page
+ there are multiple high level page free functions; free_pages_bulk()
+ is the lowest level function that does the real free.
+
+When the memory subsystem runs low on LRU pages, pages are reclaimed by
+ - moving pages from active list to inactive list (refill_inactive_zone())
+ - freeing pages from the inactive list (shrink_zone)
+depending on the recent usage of the page(approximately).
+
+In the process of the life cycle a page can move from the lru list to swap
+and back. For this document's purpose, we treat it same as freeing and
+allocating the page, respectfully.
+
+1. Introduction
+---------------
+Memory resource controller controls the number of lru physical pages
+(active and inactive list) a class uses. It does not restrict any
+other physical pages (slabs etc.,)
+
+For simplicity, this document will always refer lru physical pages as
+physical pages or simply pages.
+
+There are two parameters(that are set by the user) that affect the number
+of pages a class is allowed to have in active/inactive list.
+They are
+ - guarantee - specifies the number of pages a class is
+ guaranteed to get. In other words, if a class is using less than
+ 'guarantee' number of pages, its pages will not be freed when the
+ memory subsystem tries to free some pages.
+ - limit - specifies the maximum number of pages a class can get;
+ 'limit' in essence can be considered as the 'hard limit'
+
+Rest of this document details how these two parameters are used in the
+memory allocation logic.
+
+Note that the numbers that are specified in the shares file, doesn't
+directly correspond to the number of pages. But, the user can make
+it so by making the total_guarantee and max_limit of the default class
+(/rcfs/taskclass) to be the total number of pages(given in stats file)
+available in the system.
+
+ for example:
+ # cd /rcfs/taskclass
+ # grep System stats
+ System: tot_pages=257512,active=5897,inactive=2931,free=243991
+ # cat shares
+ res=mem,guarantee=-2,limit=-2,total_guarantee=100,max_limit=100
+
+ "tot_pages=257512" above mean there are 257512 lru pages in
+ the system.
+
+ By making total_guarantee and max_limit to be same as this number at
+ this level (/rcfs/taskclass), one can make guarantee and limit in all
+ classes refer to the number of pages.
+
+ # echo 'res=mem,total_guarantee=257512,max_limit=257512' > shares
+ # cat shares
+ res=mem,guarantee=-2,limit=-2,total_guarantee=257512,max_limit=257512
+
+
+The number of pages a class can use be anywhere between its guarantee and
+limit. CKRM memory controller springs into action when the system needs
+to choose a victim page to swap out. While the number of pages a class can
+have allocated may be anywhere between its guarantee and limit, victim
+pages will be choosen from classes that are above their guarantee.
+
+Victim class will be chosen by the number pages a class is using over its
+guarantee. i.e a class that is using 10000 pages over its guarantee will be
+chosen against a class that is using 1000 pages over its guarantee.
+Pages belonging to classes that are below their guarantee will not be
+chosen as a victim.
+
+2. Configuaration parameters
+---------------------------
+
+Memory controller provides the following configuration parameters. Usage of
+these parameters will be made clear in the following section.
+
+fail_over: When pages are being allocated, if the class is over fail_over % of
+ its limit, then fail the memory allocation. Default is 110.
+ ex: If limit of a class is 30000 and fail_over is 110, then memory
+ allocations would start failing once the class is using more than 33000
+ pages.
+
+shrink_at: When a class is using shrink_at % of its limit, then start
+ shrinking the class, i.e start freeing the page to make more free pages
+ available for this class. Default is 90.
+ ex: If limit of a class is 30000 and shrink_at is 90, then pages from this
+ class will start to get freed when the class's usage is above 27000
+
+shrink_to: When a class reached shrink_at % of its limit, ckrm will try to
+ shrink the class's usage to shrink_to %. Defalut is 80.
+ ex: If limit of a class is 30000 with shrink_at being 90 and shrink_to
+ being 80, then ckrm will try to free pages from the class when its
+ usage reaches 27000 and will try to bring it down to 24000.
+
+num_shrinks: Number of shrink attempts ckrm will do within shrink_interval
+ seconds. After this many attempts in a period, ckrm will not attempt a
+ shrink even if the class's usage goes over shrink_at %. Default is 10.
+
+shrink_interval: Number of seconds in a shrink period. Default is 10.
+
+3. Design
+--------------------------
+
+CKRM memory resource controller taps at appropriate low level memory
+management functions to associate a page with a class and to charge
+a class that brings the page to the LRU list.
+
+CKRM maintains lru lists per-class instead of keeping it system-wide, so
+that reducing a class's usage doesn't involve going through the system-wide
+lru lists.
+
+3.1 Changes in page allocation function(__alloc_pages())
+--------------------------------------------------------
+- If the class that the current task belong to is over 'fail_over' % of its
+ 'limit', allocation of page(s) fail. Otherwise, the page allocation will
+ proceed as before.
+- Note that the class is _not_ charged for the page(s) here.
+
+3.2 Changes in page free(free_pages_bulk())
+-------------------------------------------
+- If the page still belong to a class, the class will be credited for this
+ page.
+
+3.3 Adding/Deleting page to active/inactive list
+-------------------------------------------------
+When a page is added to the active or inactive list, the class that the
+task belongs to is charged for the page usage.
+
+When a page is deleted from the active or inactive list, the class that the
+page belongs to is credited back.
+
+If a class uses 'shrink_at' % of its limit, attempt is made to shrink
+the class's usage to 'shrink_to' % of its limit, in order to help the class
+stay within its limit.
+But, if the class is aggressive, and keep getting over the class's limit
+often(more than such 'num_shrinks' events in 'shrink_interval' seconds),
+then the memory resource controller gives up on the class and doesn't try
+to shrink the class, which will eventually lead the class to reach
+fail_over % and then the page allocations will start failing.
+
+3.4 Changes in the page reclaimation path (refill_inactive_zone and shrink_zone)
+-------------------------------------------------------------------------------
+Pages will be moved from active to inactive list(refill_inactive_zone) and
+pages from inactive list by choosing victim classes. Victim classes are
+chosen depending on their usage over their guarantee.
+
+Classes with DONT_CARE guarantee are assumed an implicit guarantee which is
+based on the number of children(with DONT_CARE guarantee) its parent has
+(including the default class) and the unused pages its parent still has.
+ex1: If a default root class /rcfs/taskclass has 3 children c1, c2 and c3
+and has 200000 pages, and all the classes have DONT_CARE guarantees, then
+all the classes (c1, c2, c3 and the default class of /rcfs/taskclass) will
+get 50000 (200000 / 4) pages each.
+ex2: If, in the above example c1 is set with a guarantee of 80000 pages,
+then the other classes (c2, c3 and the default class of /rcfs/taskclass)
+will get 40000 ((200000 - 80000) / 3) pages each.
+
+3.5 Handling of Shared pages
+----------------------------
+Even if a mm is shared by tasks, the pages that belong to the mm will be
+charged against the individual tasks that bring the page into LRU.
+
+But, when any task that is using a mm moves to a different class or exits,
+then all pages that belong to the mm will be charged against the richest
+class among the tasks that are using the mm.
+
+Note: Shared page handling need to be improved with a better policy.
+
Index: linux-2.6.12-rc1/Documentation/ckrm/mem_rc.usage
===================================================================
--- /dev/null
+++ linux-2.6.12-rc1/Documentation/ckrm/mem_rc.usage
@@ -0,0 +1,112 @@
+Installation
+------------
+
+1. Configure "Class based physical memory controller" under CKRM (see
+ Documentation/ckrm/installation)
+
+2. Reboot the system with the new kernel.
+
+3. Verify that the memory controller is present by reading the file
+ /rcfs/taskclass/config (should show a line with res=mem)
+
+Usage
+-----
+
+For brevity, unless otherwise specified all the following commands are
+executed in the default class (/rcfs/taskclass).
+
+Initially, the systemwide default class gets 100% of the LRU pages, and the
+stats file at the /rcfs/taskclass level displays the total number of
+physical pages.
+
+ # cd /rcfs/taskclass
+ # grep System stats
+ System: tot_pages=239778,active=60473,inactive=135285,free=44555
+ # cat shares
+ res=mem,guarantee=-2,limit=-2,total_guarantee=100,max_limit=100
+
+ tot_pages - total number of pages
+ active - number of pages in the active list ( sum of all zones)
+ inactive - number of pages in the inactive list ( sum of all zones)
+ free - number of free pages (sum of all zones)
+
+ By making total_guarantee and max_limit to be same as tot_pages, one can
+ make the numbers in shares file be same as the number of pages for a
+ class.
+
+ # echo 'res=mem,total_guarantee=239778,max_limit=239778' > shares
+ # cat shares
+ res=mem,guarantee=-2,limit=-2,total_guarantee=239778,max_limit=239778
+
+Changing configuration parameters:
+----------------------------------
+For description of the paramters read the file mem_rc.design in this same directory.
+
+Following is the default values for the configuration parameters:
+
+ localhost:~ # cd /rcfs/taskclass
+ localhost:/rcfs/taskclass # cat config
+ res=mem,fail_over=110,shrink_at=90,shrink_to=80,num_shrinks=10,shrink_interval=10
+
+Here is how to change a specific configuration parameter. Note that more than one
+configuration parameter can be changed in a single echo command though for simplicity
+we show one per echo.
+
+ex: Changing fail_over:
+ localhost:/rcfs/taskclass # echo "res=mem,fail_over=120" > config
+ localhost:/rcfs/taskclass # cat config
+ res=mem,fail_over=120,shrink_at=90,shrink_to=80,num_shrinks=10,shrink_interval=10
+
+ex: Changing shrink_at:
+ localhost:/rcfs/taskclass # echo "res=mem,shrink_at=85" > config
+ localhost:/rcfs/taskclass # cat config
+ res=mem,fail_over=120,shrink_at=85,shrink_to=80,num_shrinks=10,shrink_interval=10
+
+ex: Changing shrink_to:
+ localhost:/rcfs/taskclass # echo "res=mem,shrink_to=75" > config
+ localhost:/rcfs/taskclass # cat config
+ res=mem,fail_over=120,shrink_at=85,shrink_to=75,num_shrinks=10,shrink_interval=10
+
+ex: Changing num_shrinks:
+ localhost:/rcfs/taskclass # echo "res=mem,num_shrinks=20" > config
+ localhost:/rcfs/taskclass # cat config
+ res=mem,fail_over=120,shrink_at=85,shrink_to=75,num_shrinks=20,shrink_interval=10
+
+ex: Changing shrink_interval:
+ localhost:/rcfs/taskclass # echo "res=mem,shrink_interval=15" > config
+ localhost:/rcfs/taskclass # cat config
+ res=mem,fail_over=120,shrink_at=85,shrink_to=75,num_shrinks=20,shrink_interval=15
+
+Class creation
+--------------
+
+ # mkdir c1
+
+Its initial share is DONT_CARE. The parent's share values will be unchanged.
+
+Setting a new class share
+-------------------------
+
+ # echo 'res=mem,guarantee=25000,limit=50000' > c1/shares
+
+ # cat c1/shares
+ res=mem,guarantee=25000,limit=50000,total_guarantee=100,max_limit=100
+
+ 'guarantee' specifies the number of pages this class entitled to get
+ 'limit' is the maximum number of pages this class can get.
+
+Monitoring
+----------
+
+stats file shows statistics of the page usage of a class
+ # cat stats
+ ----------- Memory Resource stats start -----------
+ System: tot_pages=239778,active=60473,inactive=135285,free=44555
+ Number of pages used(including pages lent to children): 196654
+ Number of pages guaranteed: 239778
+ Maximum limit of pages: 239778
+ Total number of pages available(after serving guarantees to children): 214778
+ Number of pages lent to children: 0
+ Number of pages borrowed from the parent: 0
+ ----------- Memory Resource stats end -----------
+
^ permalink raw reply [flat|nested] 3+ messages in thread
end of thread, other threads:[~2005-06-24 22:29 UTC | newest]
Thread overview: 3+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2005-06-24 22:29 [PATCH 6/6] CKRM: Documentation for mem controller Chandra Seetharaman
-- strict thread matches above, loose matches on Subject: below --
2005-05-19 0:33 Chandra Seetharaman
2005-04-02 3:15 Chandra Seetharaman
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox