From: Balbir Singh <balbir@linux.vnet.ibm.com>
To: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Cc: linux-mm@kvack.org, Sudhir Kumar <skumar@linux.vnet.ibm.com>,
	YAMAMOTO Takashi <yamamoto@valinux.co.jp>,
	Bharata B Rao <bharata@in.ibm.com>,
	Paul Menage <menage@google.com>,
	lizf@cn.fujitsu.com, linux-kernel@vger.kernel.org,
	KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>,
	David Rientjes <rientjes@google.com>,
	Pavel Emelianov <xemul@openvz.org>,
	Dhaval Giani <dhaval@linux.vnet.ibm.com>,
	Rik van Riel <riel@redhat.com>,
	Andrew Morton <akpm@linux-foundation.org>
Subject: Re: [PATCH 4/4] Memory controller soft limit reclaim on contention (v4)
Date: Fri, 6 Mar 2009 16:11:06 +0530
Message-ID: <20090306104106.GE5482@balbir.in.ibm.com>
In-Reply-To: <20090306191436.ceeb6e42.kamezawa.hiroyu@jp.fujitsu.com>

* KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> [2009-03-06 19:14:36]:

> On Fri, 6 Mar 2009 15:31:55 +0530
> Balbir Singh <balbir@linux.vnet.ibm.com> wrote:
> 
> 
> > > > +		if (wait)
> > > > +			wait_for_completion(&mem->wait_on_soft_reclaim);
> > > >  	}
> > > > What? Why do we have to wait here, holding mmap_sem? This is too bad.
> > >
> > 
> > Since mmap_sem is no longer used for pthread_mutex*, I was not sure.
> > That is why I added the comment asking for more review and see what
> > people think about it. We get here only when
> > 
> > 1. The memcg is over its soft limit
> > 2. Tasks/threads belonging to memcg are faulting in more pages
> > 
> > The idea is to throttle them. If we did reclaim inline, like we do for
> > hard limits, we can still end up holding mmap_sem for a long time.
> > 
> The effect of this "throttle" is hard to measure and, IIUC, it is not yet
> implemented in vmscan.c for global try_to_free_pages().
> Under memory shortage, before reaching here, the thread has already called
> try_to_free_pages() or checked some memory shortage conditions, because
> it called alloc_pages(). So waiting here is redundant and gives it
> too much of a penalty.

The reason for adding it: consider the following scenario.

1. Create cgroup "a", give it a soft limit of 0
2. Create cgroup "b", give it a soft limit of 3G.

With both "a" and "b" running, reclaiming from "a" makes no sense: it
goes and does a bulk allocation and increases its usage again. It does
not make sense to reclaim from "b" until it crosses 3G.

Throttling is not implemented in the main VM, but we have seen several
patches for it. This is a special case for soft limits.
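
To make the scenario concrete, here is a minimal userspace sketch of how
"excess over soft limit" ranks the two groups. It is not the patch code:
struct group_model, soft_limit_excess() and pick_victim() are hypothetical
names used only for illustration, and the usage figures are made up.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical userspace model of a cgroup; not the kernel's struct mem_cgroup. */
struct group_model {
	const char *name;
	uint64_t usage;      /* bytes currently charged */
	uint64_t soft_limit; /* bytes; 0 means any usage counts as excess */
};

/* How far a group is above its soft limit, or 0 if it is at or below it. */
static uint64_t soft_limit_excess(const struct group_model *g)
{
	return g->usage > g->soft_limit ? g->usage - g->soft_limit : 0;
}

/*
 * Return the group with the largest excess, or NULL if no group is
 * over its soft limit. A linear scan stands in for the RB-tree lookup.
 */
static const struct group_model *pick_victim(const struct group_model *groups,
					     size_t n)
{
	const struct group_model *victim = NULL;
	uint64_t worst = 0;
	size_t i;

	assert(groups != NULL || n == 0);
	for (i = 0; i < n; i++) {
		uint64_t excess = soft_limit_excess(&groups[i]);

		if (excess > worst) {
			worst = excess;
			victim = &groups[i];
		}
	}
	return victim;
}
```

With "a" at a soft limit of 0 and "b" below 3G, pick_victim() keeps
returning "a" no matter how often it is reclaimed, as long as "a" keeps
allocating. That is the case the throttle is meant to address.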

> 
> 
> > > > +	/*
> > > > +	 * This loop can run a while, specially if mem_cgroup's continuously
> > > > +	 * keep exceeding their soft limit and putting the system under
> > > > +	 * pressure
> > > > +	 */
> > > > +	do {
> > > > +		mem = mem_cgroup_get_largest_soft_limit_exceeding_node();
> > > > +		if (!mem)
> > > > +			break;
> > > > +		usage = mem_cgroup_get_node_zone_usage(mem, zone, nid);
> > > > +		if (!usage)
> > > > +			goto skip_reclaim;
> > > 
> > > > Why does this work well? If "mem" is the largest, it will be inserted
> > > > again as the largest. Am I missing something?
> > >
> > 
> > No, that is correct, but when reclaim is initiated from a different
> > zone/node combination, we still want mem to show up.
> ....
> Your logic is
> ==
>    nr_reclaimed = 0;
>    do {
>       mem = select victim.
>       remove victim from the RB tree (the one with the largest usage is selected)
>       if (victim is not good)
>           goto skip_this;
>       nr_reclaimed += shrink_zone.
>
> skip_this:
>       if (mem still exceeds soft limit)
>            insert into RB tree again.
>    } while (!nr_reclaimed)
> ==
> When does this loop exit?
>

This is a spillover from the main code without zones and nodes. There,
there was no concept of a mem_cgroup having 0 usage while sitting on the
tree with the highest usage. In practice, if we hit soft limit reclaim,
kswapd will be called for each zone, including at least one of the
node/zones in which the mem we dequeued has memory usage. At that
point, the necessary changes to the RB-Tree will happen. However, you
have found a potential problem and I'll fix it in the next iteration.
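
As a rough illustration of one way to bound the loop (an assumption for
discussion, not necessarily what the next iteration will do), the sketch
below marks each victim as tried within a pass, so a group with no usage
in the zone cannot be picked again. It is a userspace model: struct
victim_model, largest_untried() and soft_reclaim_pass() are hypothetical
names, and an array scan stands in for the RB-tree.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical model of one group as seen from a single zone. */
struct victim_model {
	uint64_t excess;     /* pages above the soft limit */
	uint64_t zone_usage; /* pages charged in the zone being reclaimed */
	bool tried;          /* already considered during this pass */
};

/* Largest-excess group not yet tried in this pass, or NULL. */
static struct victim_model *largest_untried(struct victim_model *v, size_t n)
{
	struct victim_model *best = NULL;

	for (size_t i = 0; i < n; i++) {
		if (v[i].tried || v[i].excess == 0)
			continue;
		if (!best || v[i].excess > best->excess)
			best = &v[i];
	}
	return best;
}

/*
 * One reclaim pass over a zone. Each group is visited at most once,
 * so the loop ends after at most n iterations even when the largest
 * group has no pages in this zone.
 */
static uint64_t soft_reclaim_pass(struct victim_model *v, size_t n,
				  uint64_t batch)
{
	uint64_t reclaimed = 0;
	struct victim_model *mem;

	assert(v != NULL || n == 0);
	for (size_t i = 0; i < n; i++)
		v[i].tried = false;

	while (!reclaimed && (mem = largest_untried(v, n)) != NULL) {
		uint64_t take;

		mem->tried = true;
		if (!mem->zone_usage)
			continue; /* nothing to reclaim here; try the next group */

		/* Model shrink_zone(): take at most batch, usage and excess. */
		take = mem->zone_usage < batch ? mem->zone_usage : batch;
		if (take > mem->excess)
			take = mem->excess;
		mem->zone_usage -= take;
		mem->excess -= take;
		reclaimed += take;
	}
	return reclaimed;
}
```

The group skipped for this zone keeps its excess, so it still shows up
when reclaim is initiated from a different zone/node combination.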
 
> Thanks,
> -Kame
> 
> 

-- 
	Balbir

