linux-mm.kvack.org archive mirror
From: Paul Jackson <pj@sgi.com>
To: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: clameter@sgi.com, rohitseth@google.com, nickpiggin@yahoo.com.au,
	ckrm-tech@lists.sourceforge.net, devel@openvz.org,
	linux-kernel@vger.kernel.org, linux-mm@kvack.org
Subject: Re: [patch00/05]: Containers(V2)- Introduction
Date: Wed, 20 Sep 2006 12:48:13 -0700
Message-ID: <20060920124813.fe160e71.pj@sgi.com>
In-Reply-To: <1158775586.28174.27.camel@lappy>

Peter wrote:
> > Which comes naturally with cpusets.
> 
> How are shared mappings dealt with, are pages charged to the set that
> first faults them in?

Cpusets does not attempt to manage how much memory a task can allocate,
but where it can allocate it.  If a task can find an existing page to
share, and avoid the allocation, then it entirely avoids dealing with
cpusets in that case.

Cpusets pays no attention to how often a page is shared.  It controls
which tasks can allocate a given free page, based on the node on which
that page resides.  If that node is allowed in a task's 'nodemask_t
mems_allowed' (a task struct field), then the task can allocate
that page, so far as cpusets is concerned.
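
A minimal userspace sketch of that check, for illustration only.  It is
not the kernel's code: the model_* names and the 64-bit mask standing in
for nodemask_t are assumptions made up for this example.  The point is
that the test looks only at the node of the free page and the task's
mems_allowed, never at how many tasks share a page.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for nodemask_t: one bit per memory node. */
typedef uint64_t model_nodemask_t;

/* Hypothetical stand-in for the relevant part of the task struct. */
struct model_task {
	model_nodemask_t mems_allowed;	/* nodes this task may allocate from */
};

/* Hypothetical page descriptor for the model. */
struct model_page {
	int node;	/* node on which the page resides */
	int map_count;	/* number of sharers; deliberately never consulted */
};

/* Is the given node set in the task's mems_allowed? */
static inline bool model_node_allowed(const struct model_task *t, int node)
{
	if (node < 0 || node >= 64)
		return false;
	return (t->mems_allowed >> node) & 1u;
}

/* May the task allocate this free page, so far as cpusets is concerned? */
static inline bool model_may_allocate(const struct model_task *t,
				      const struct model_page *free_page)
{
	return model_node_allowed(t, free_page->node);
}
```

A task whose mask has nodes 0 and 2 set would pass this check for a free
page on node 2 and fail it for a free page on node 1, whatever the
map_count of either page.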

Cpusets does not care who links to a page, once it is allocated.

Every page is assigned to one specific node, and may only be allocated
by tasks allowed to allocate from that node.

These cpusets can overlap - which so far as memory goes, roughly means
that the various mems_allowed nodemask_t's of different tasks can overlap.

Here's an oddball example configuration that might make this easier to
think about.

    Let's say we have a modest sized NUMA system with an extra bank
    of memory added, in addition to the per-node memory.  Let's say
    the extra bank is a huge pile of cheaper (slower) memory, off a
    slower bus.

    Normal sized tasks running on one or more of the NUMA nodes just
    get to fight for the CPUs and memory on the nodes allowed to them.

    Let's say an occasional big memory job is to be allowed to use
    some of the extra cheap memory, and we use the idea of Andrew
    and others to split that memory into fake nodes to manage the
    portion of memory available to specified tasks.

    Then one of these big jobs could be in a cpuset that let it use
    one or more of the CPUs and memory on the node it ran on, plus
    some number of the fake nodes on the extra cheap memory.

    Other jobs could be allowed, using cpusets, to use any combination
    of the same or overlapping CPUs or nodes, and/or other disjoint
    CPUs or nodes, fake or real.
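
The configuration above can be written down as plain bitmasks.  This is
a sketch under made-up assumptions: the node numbering (0-3 for the real
NUMA nodes, 4-7 for fake nodes carved from the slow bank), the job names
and the cfg_* identifiers are all hypothetical, and a 64-bit integer
stands in for nodemask_t.

```c
#include <stdint.h>

/* Hypothetical stand-in for nodemask_t: one bit per memory node. */
typedef uint64_t cfg_nodemask_t;

#define CFG_NODE(n)	((cfg_nodemask_t)1 << (n))

/*
 * Assumed layout for this example only:
 *   nodes 0-3: real NUMA nodes with their per-node memory
 *   nodes 4-7: fake nodes splitting up the extra, slower bank
 */

/* A big job on real node 1, plus two fake nodes of cheap memory. */
static const cfg_nodemask_t big_job_mems =
	CFG_NODE(1) | CFG_NODE(4) | CFG_NODE(5);

/* Another job whose memory nodes overlap the big job's. */
static const cfg_nodemask_t other_job_mems =
	CFG_NODE(1) | CFG_NODE(2) | CFG_NODE(5) | CFG_NODE(6);

/* A normal sized job confined to a disjoint real node. */
static const cfg_nodemask_t small_job_mems = CFG_NODE(3);

/* Nodes from which both jobs may allocate. */
static inline cfg_nodemask_t cfg_shared_nodes(cfg_nodemask_t a,
					      cfg_nodemask_t b)
{
	return a & b;
}
```

With these masks the big job and the other job compete for free pages
on nodes 1 and 5 only, and the small job competes with neither.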

Another example, restating some of the above.

    If, say, some application happened to fault in a libc.so page,
    it would be required to place that page on one of the nodes
    allowed to it.  If another application comes along later and
    ends up wanting shared references to that same page, it could
    certainly do so, regardless of its cpuset settings.  It would
    not be allocating a new page for this, so would not encounter
    the cpuset constraints on where it could allocate such a page.
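
A toy model of that libc.so case, again only a sketch.  The demo_*
names are hypothetical, a 64-bit integer stands in for nodemask_t, and
the model assumes every allowed node has free memory, which a real
allocator cannot assume.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical model of one shareable (e.g. libc.so) page. */
struct demo_page {
	bool present;	/* already allocated by some earlier fault? */
	int node;	/* node the page was placed on */
	int refs;	/* number of tasks referencing the page */
};

/*
 * Model of a task faulting on the page.  Returns the node the page
 * lives on, or -1 if a new page was needed but no node is allowed.
 */
static inline int demo_fault(uint64_t mems_allowed, struct demo_page *pg,
			     int nr_nodes)
{
	int n;

	/* Existing page: just share it.  mems_allowed is not consulted. */
	if (pg->present) {
		pg->refs++;
		return pg->node;
	}

	/* New page: it must land on a node allowed to this task. */
	for (n = 0; n < nr_nodes && n < 64; n++) {
		if ((mems_allowed >> n) & 1u) {
			pg->present = true;
			pg->node = n;
			pg->refs = 1;
			return n;
		}
	}
	return -1;
}
```

If a task allowed only node 0 faults first, the page lands on node 0.
A later task allowed only node 3 still ends up sharing the page on
node 0, because no allocation takes place on its behalf.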

-- 
                  I won't rest till it's the best ...
                  Programmer, Linux Scalability
                  Paul Jackson <pj@sgi.com> 1.925.600.0401

