linux-mm.kvack.org archive mirror
From: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
To: Minchan Kim <minchan.kim@gmail.com>
Cc: Satoru Moriya <satoru.moriya@hds.com>,
	Andrew Morton <akpm@linux-foundation.org>,
	Rik van Riel <riel@redhat.com>,
	Randy Dunlap <rdunlap@xenotime.net>,
	Satoru Moriya <smoriya@redhat.com>,
	"linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>,
	"linux-mm@kvack.org" <linux-mm@kvack.org>,
	"lwoodman@redhat.com" <lwoodman@redhat.com>,
	Seiji Aguchi <saguchi@redhat.com>,
	"hughd@google.com" <hughd@google.com>,
	"hannes@cmpxchg.org" <hannes@cmpxchg.org>,
	David Rientjes <rientjes@google.com>
Subject: Re: [PATCH -v2 -mm] add extra free kbytes tunable
Date: Thu, 13 Oct 2011 17:09:07 +0900
Message-ID: <20111013170907.80775c54.kamezawa.hiroyu@jp.fujitsu.com>
In-Reply-To: <20111013073321.GA2784@barrios-desktop>

On Thu, 13 Oct 2011 16:33:21 +0900
Minchan Kim <minchan.kim@gmail.com> wrote:

> On Fri, Sep 02, 2011 at 12:31:14PM -0400, Satoru Moriya wrote:
> > On 09/01/2011 05:58 PM, Andrew Morton wrote:
> > > On Thu, 1 Sep 2011 15:26:50 -0400
> > > Rik van Riel <riel@redhat.com> wrote:
> > > 
> > >> Add a userspace visible knob
> > > 
> > > argh.  Fear and hostility at new knobs which need to be maintained for 
> > > ever, even if the underlying implementation changes.
> > > 
> > > Unfortunately, this one makes sense.
> > > 
> > >> to tell the VM to keep an extra amount of memory free, by increasing 
> > >> the gap between each zone's min and low watermarks.
> > >>
> > >> This is useful for realtime applications that call system calls and 
> > >> have a bound on the number of allocations that happen in any short 
> > >> time period.  In this application, extra_free_kbytes would be left at 
> > >> an amount equal to or larger than the maximum number of 
> > >> allocations that happen in any burst.
> > > 
> > > _is_ it useful?  Proof?
> > > 
> > > Who is requesting this?  Have they tested it?  Results?
> > 
> > This is interesting for me.
> > 
> > Some of our customers have realtime applications and they are concerned 
> > about the fact that Linux uses free memory as pagecache. It means that
> > when their application allocates memory, the Linux kernel first tries to
> > reclaim memory and only then allocates it. This may increase memory
> > allocation latency.
> > 
> > In many cases this is not a big issue because Linux has kswapd for
> > background reclaim and it is fast enough to avoid the direct reclaim
> > path if there is a lot of clean cache. But in some situations, e.g.
> > when an application allocates more memory in a short time than the delta
> > between watermark_low and watermark_min and kswapd can't reclaim fast
> > enough because of dirty page reclaim, direct reclaim is executed and
> > causes large latency.
> > 
> > We can avoid the issue above by using preallocation and mlock.
> > But that can't cover kmalloc used in system calls. So I'd like to use
> > this patch together with mlock to keep memory allocation latency as
> > low as possible. It may not be a perfect solution, but it is important
> > for enterprise customers to be able to configure the amount of free
> > memory at their own risk.
> 
> I agree with the need for such a feature, but I don't like exporting such a
> primitive interface to users.
> 
> As Satoru said, we can reserve free pages for the user through preallocation and mlocking.
> The issue is free pages for the kernel itself.
> The most desirable thing is to avoid syscalls in the critical realtime section.
> But if we can't avoid them, my crazy idea is to use memcg for kernel pages.
> Of course, we would have to implement it and it is not simple, but AFAIK the memcg
> people have always considered it and will eventually do it. :)
> Recently, Glauber posted "Basic kernel memory functionality", but I haven't reviewed
> it yet. I am not sure we can reuse it, anyway. Kame?
> 

I reviewed it and it looks good. It adds kmem.limit_in_bytes, so we are ready
to move forward with a kernel memory cgroup.
For now, though, it only adds the interfaces.

I think Greg Thelen <gthelen@google.com> has some ideas.


> My simple idea is as follows,
> 
> We can assign a basic reserved page pool and/or a user-determined page pool size
> for each task registered at memcg-slab.

Hmm, memcg-mempool?


> The application has to notify memcg of the start of the RT section before it enters
> the RT section, so memcg can fill up the page pool if it is short. In this case the
> application can get stuck, but that's okay as it hasn't entered the RT section yet.
> The application has to notify memcg of the end of the RT section too, so that memcg
> can try to refill the reserved page pool in case of shortage.
> 

That 'notification' doesn't sound good to me. If the application dies or moves to
another group without sending the notification, memcg will be left unstable.
It should be the task's state rather than memcg's state.


> The reason we need such a notification is that a high-priority kswapd, a new knob,
> and other approaches can never meet the application's deadline requirement in some
> situations (e.g. there are many dirty pages in the LRU, or anon pages fill up memory
> in the non-swap case, and so on), so the application might end up stuck at some point.
> That point must be outside the RT section of the task.
> 
> For implementation, we might need a new watermark setting for each memcg and/or
> something like kswapd priority promotion to hurry reclaim.
> Anyway, those are just implementation details and we could enhance or add more through
> various techniques as time goes by.
> 
> Personally, I think it could be a valuable feature.
> 

Hmm. To avoid latency at allocation time, all we can do is pre-allocate before the
memory is required. But the problem is that applications cannot forecast when a 'burst'
allocation will happen, so we need to keep a memory pool ready at all times.
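
As a rough illustration of the burst problem discussed in this thread, the
following Python sketch checks whether a burst of allocations would exhaust the
headroom between the low and min watermarks before background reclaim can keep
up. It is a toy model only: the function names, the constant reclaim rate, and
the 4 kB page size are assumptions made for the example, not kernel behaviour.

```python
def kbytes_to_pages(kbytes, page_kbytes=4):
    """Convert kilobytes to whole pages (4 kB pages assumed for illustration)."""
    return kbytes // page_kbytes


def burst_hits_direct_reclaim(burst_pages, gap_pages, reclaim_pages_per_ms, burst_ms):
    """Return True if the burst would push free memory below the min watermark.

    Toy assumptions: free memory starts exactly at the low watermark, and
    background reclaim frees pages at a constant rate while the burst runs.
    """
    reclaimed = reclaim_pages_per_ms * burst_ms
    shortfall = burst_pages - reclaimed
    return shortfall > gap_pages
```

With a 1024-page gap, a 4096-page burst over 10 ms, and reclaim at 100 pages/ms,
the model reports a stall; widening the gap by kbytes_to_pages(16384) pages
makes the same burst fit. The numbers are made up and only show the shape of
the trade-off.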

I think we need 2 implementations.

1. A free-page mempool for a memcg.
2. A background reclaim thread for a memcg. This is triggered by the mempool.
   The priority of this thread should be controllable in some way.

If we take memcg's limit into account, a watermark should trigger background reclaim.

?
But the memory reclaim routine should never sleep...
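
The two pieces above (a free-page pool plus a background thread triggered by
it) can be sketched as a small Python model. Everything here is hypothetical:
ReservePool, take() and background_refill() are names invented for the
example, and no real threads or kernel interfaces are involved.

```python
class ReservePool:
    """Toy free-page pool with a low watermark (hypothetical, not kernel code)."""

    def __init__(self, capacity, low_watermark):
        if not 0 <= low_watermark <= capacity:
            raise ValueError("low_watermark must be between 0 and capacity")
        self.capacity = capacity
        self.low_watermark = low_watermark
        self.free = capacity
        self.refill_requested = False

    def take(self, pages):
        """Take pages from the pool; return False if the caller would have to stall."""
        if pages > self.free:
            # Pool cannot cover the request: ask for a refill and report the stall.
            self.refill_requested = True
            return False
        self.free -= pages
        if self.free < self.low_watermark:
            # Stand-in for waking the background reclaim thread.
            self.refill_requested = True
        return True

    def background_refill(self, max_pages):
        """One step of the background thread: add up to max_pages to the pool."""
        if not self.refill_requested:
            return 0
        added = min(max_pages, self.capacity - self.free)
        self.free += added
        if self.free == self.capacity:
            self.refill_requested = False
        return added
```

In this model a request larger than the remaining pool is the case where the
task would stall, and max_pages stands in for the priority of the background
thread: a larger value refills the pool in fewer steps.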


Thanks,
-Kame



