Re: Supporting overcommit with the memory controller

From: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
To: Paul Menage <menage@google.com>
Cc: Balbir Singh <balbir@linux.vnet.ibm.com>,
	Pavel Emelianov <xemul@openvz.org>,
	Hugh Dickins <hugh@veritas.com>,
	Linux Containers <containers@lists.osdl.org>,
	Linux Memory Management List <linux-mm@kvack.org>
Subject: Re: Supporting overcommit with the memory controller
Date: Thu, 6 Mar 2008 12:20:27 +0900
Message-ID: <20080306122027.018e7d52.kamezawa.hiroyu@jp.fujitsu.com>
In-Reply-To: <6599ad830803051854x5ee204bej7212d9c1e444e4d0@mail.gmail.com>

On Wed, 5 Mar 2008 18:54:52 -0800
"Paul Menage" <menage@google.com> wrote:

> On Wed, Mar 5, 2008 at 5:01 PM, KAMEZAWA Hiroyuki
> <kamezawa.hiroyu@jp.fujitsu.com> wrote:
> >  > But to make this more interesting, there are plenty of jobs that will
> >  > happily fill as much pagecache as they have available. Even a job
> >  > that's just writing out logs will continually expand its pagecache
> >  > usage without anything to stop it, and so just keeping the reserved
> >  > pool at a fixed amount of free memory will result in the job expanding
> >  > even if it doesn't need to.
> >  That's the current memory management style: "reclaim only when necessary".
> >
> 
> Exactly - if the high-priority latency-sensitive job really needs that
> extra memory, we want it to be able to automatically squash/kill the
> low-priority job when memory runs low, and not suffer any latency
> spikes. But if it doesn't actually need the memory, we'd rather use it
> for low-priority batch stuff. The "no latency spikes" bit is important
> - we don't want the high-priority job to get bogged down in
> try_to_free_pages() and out_of_memory() loops when it needs to
> allocate memory.
> 
In our measurements (on RHEL5), setting dirty_ratio to a suitable value
helps avoid *long* latency in most *usual* situations.
(I'm sorry that I can't show the numbers; please try it.)
Some mm people are trying to improve kernel behavior in *unusual*
situations. If you don't want any latency spikes for high-priority processes,
we'll have to make the global page allocator take the priority of processes/pages into account.
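
As a rough illustration of the dirty_ratio tuning mentioned above, here is a
minimal Python sketch that reads a vm sysctl such as dirty_ratio from
/proc/sys/vm and estimates the dirty-page threshold it implies. The helper
names, the `root` parameter, and the caller-supplied "dirtyable memory"
figure are assumptions for illustration only; the kernel's own calculation
is more involved.

```python
from pathlib import Path


def read_vm_sysctl(name, root="/proc/sys/vm"):
    """Return a vm sysctl (e.g. "dirty_ratio") as an integer.

    `root` is a parameter only so the helper can be pointed at a
    test directory; it is not part of any kernel interface.
    """
    return int(Path(root, name).read_text().split()[0])


def dirty_threshold_bytes(dirtyable_bytes, ratio_percent):
    """Approximate the number of dirty bytes at which writers get throttled.

    Assumption: the threshold is simply ratio_percent of the memory the
    caller considers dirtyable. This is a simplification.
    """
    return dirtyable_bytes * ratio_percent // 100
```

For example, with 4G of dirtyable memory and a ratio of 10, the estimate is
a little over 400M of dirty pages before writers are throttled. A lower
ratio bounds how much dirty pagecache a log-writing job can build up before
writeback starts pushing back on it.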

It seems what you really want is priority-based file-cache control.
I have no objection to using cgroup as the controller interface for it.

To avoid spikes, I'm now considering supporting dirty_ratio
for memcg. (For now, it seems difficult.)

> >  >
> >  Can Balbir's soft-limit patches help ?
> >
> >  It reclaims each cgroup's pages down to its soft limit if the system needs memory.
> >
> >  Set the limits like this:
> >
> >  Assume a 4G server.
> >                            Limit      soft-limit
> >  Not important Apps:         2G          100M
> >  Important Apps    :         3G          2.7G
> >
> >  When system memory reaches the limit, each cgroup's memory usage will
> >  go down to its soft limit. (And there will be about 1.2G of free pages in
> >  the above example: 4G - 2.7G - 100M.)
> >
> 
> Yes, that could be a useful part of the solution - I suspect we'd need
> to have kswapd do the soft-limit push back as well as in
> try_to_free_pages(), to avoid the high-priority jobs getting stuck in
> the reclaim code. It would also be nice if we had:
> 
> - a way to have the soft-limit pushing kick in substantially *before*
> the machine ran out of memory, to provide a buffer for the
> high-priority jobs.
> 
Maybe a background-reclaim thread can help. (I'm currently maintaining a patch for it.)
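
To make the idea concrete, here is a small Python model of soft-limit
pushback that starts before memory is exhausted. It is only a sketch of the
policy discussed in this thread, not the soft-limit patches themselves: the
function names and the `low_watermark_mb` parameter are hypothetical, and
2.7G is rounded to 2765M.

```python
def soft_limit_pushback(total_mb, usage_mb, soft_limit_mb, low_watermark_mb):
    """Return how many MB to reclaim from each cgroup.

    Nothing is reclaimed while free memory stays at or above the
    (hypothetical) low watermark. Below it, every cgroup that is over
    its soft limit is pushed back down to that soft limit.
    """
    free_mb = total_mb - sum(usage_mb.values())
    if free_mb >= low_watermark_mb:
        return {name: 0 for name in usage_mb}
    return {
        name: max(0, usage_mb[name] - soft_limit_mb[name])
        for name in usage_mb
    }


def free_after_pushback(total_mb, usage_mb, reclaim_mb):
    """Free memory once the reclaim amounts above have been applied."""
    used = sum(usage_mb[name] - reclaim_mb[name] for name in usage_mb)
    return total_mb - used
```

With the 4G example from the quoted table (soft limits of 100M and 2765M),
a machine whose two groups together fill all 4096M and both exceed their
soft limits ends up with 1231M free after pushback. If only the batch group
is over its soft limit, only the batch group is reclaimed from.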

> - a way to measure the actual working set of a cgroup (which may be
> smaller than its allocated memory if it has plenty of stale pagecache
> pages allocated). Maybe refaults, or maybe usage-based information.
> 
Hmm, the current memory resource controller shows:

- failcnt
- active/inactive
- rss/cache

I think we have enough infrastructure to account for additional parameters.
But I think supporting all vmstat members for memcg is a bit of overkill.
We'll have to choose what is necessary.
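
As a sketch of what "choosing what is necessary" could look like from user
space, the Python below parses a per-cgroup statistics file and keeps only a
selected set of counters. It assumes the file holds one `name value` pair
per line; the counter names and numbers in SAMPLE are made up for
illustration and are not output from a real system.

```python
# Illustrative contents only; names and values are assumptions.
SAMPLE = """\
cache 734003200
rss 104857600
active 209715200
inactive 629145600
"""


def parse_stat(text):
    """Parse 'name value' lines into a dict of integer counters.

    Lines that do not match that shape are skipped.
    """
    stats = {}
    for line in text.splitlines():
        parts = line.split()
        if len(parts) == 2 and parts[1].isdigit():
            stats[parts[0]] = int(parts[1])
    return stats


def select_counters(stats, wanted):
    """Keep only the counters a consumer decided it needs."""
    return {name: stats[name] for name in wanted if name in stats}
```

For instance, `select_counters(parse_stat(SAMPLE), ["rss", "cache"])` keeps
just the rss and cache counters and ignores the rest, including any name
that is not present in the file.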

Thanks,
-Kame
