From: Mel Gorman <mel@csn.ul.ie>
To: Yasunori Goto <y-goto@jp.fujitsu.com>
Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>,
	Joel Schopp <jschopp@austin.ibm.com>,
	linux-mm@kvack.org, linux-kernel@vger.kernel.org,
	lhms-devel@lists.sourceforge.net
Subject: Re: [Lhms-devel] Re: [PATCH 0/5] Reducing fragmentation using zones
Date: Fri, 20 Jan 2006 14:02:13 +0000 (GMT)
Message-ID: <Pine.LNX.4.58.0601201357450.14292@skynet>
In-Reply-To: <20060120213210.126B.Y-GOTO@jp.fujitsu.com>

On Fri, 20 Jan 2006, Yasunori Goto wrote:

> > > > So, in terms of performance on this set of tests, both approaches perform
> > > > roughly the same as the stock kernel in terms of absolute performance. In
> > > > terms of high-order allocations, zone-based appears to do better under
> > > > load. However, if you look at the zones that are used, you will see that
> > > > zone-based appears to do as well as list-based *only* because it has the
> > > > EASYRCLM zone to play with. list-based was way better at keeping the
> > > > normal zone defragmented as well as highmem which is especially obvious
> > > > when tested at rest.  list-based was able to allocate 83 huge pages from
> > > > ZONE_NORMAL at rest while zone-based only managed 8.
> > > >
> > > Yes, this is an interesting point :)
> > > The list-based one can defrag the NORMAL zone.
> > > The question will be "do we need to defrag NORMAL?", I think.
> >
> > The original intention was twofold. One, it helps HugeTLB in situations
> > where it was not configured correctly at boot-time. This is the case for a
> > number of sites running HPC-related jobs. The second objective was to help
> > high-order kernel allocations to potentially reduce things like
> > scatter-gather IO.
>
> Probably, Linus-san's wish is to reduce high-order kernel allocations
> to avoid fragmentation. (He said defragmentation is meaningless, right?)

Right.

> If there is a driver/kernel component which requires a high-order
> allocation even though physically contiguous memory is not necessary,
> it should be modified to collect individual pages.

Yes.

> (I guess there are some components like that, but I'm not sure.)
> If scatter-gather IO is the cause of bad performance,
> it might be desirable to try a high-order allocation first,
> then fall back to collecting whatever individual pages can be allocated.
>

Figures have never been produced to show that high-order allocations would
help performance for something like scatter/gather IO.
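
To make the "try high-order first, then collect pieces" idea concrete, here
is a minimal userspace sketch. It is only a toy model: struct toy_zone,
toy_find_block() and toy_alloc() are hypothetical names invented for this
illustration, not kernel interfaces, and the free map stands in for a
fragmented zone. It says nothing about whether the contiguous case is
actually faster.

```c
#include <assert.h>
#include <stddef.h>

#define TOY_NPAGES 16   /* size of the toy "zone"; hypothetical value */

/* 1 = page free, 0 = page in use. Purely a userspace model. */
struct toy_zone {
    unsigned char free_map[TOY_NPAGES];
};

/* Find a naturally aligned, contiguous run of (1 << order) free pages. */
static int toy_find_block(const struct toy_zone *z, unsigned int order)
{
    size_t count = (size_t)1 << order;

    for (size_t start = 0; start + count <= TOY_NPAGES; start += count) {
        size_t i;

        for (i = 0; i < count; i++)
            if (!z->free_map[start + i])
                break;
        if (i == count)
            return (int)start;
    }
    return -1;
}

/*
 * Fill pages[] with (1 << order) page indexes.
 * Returns 1 if the pages are contiguous, 0 if they had to be gathered
 * one by one, and -1 if there are not enough free pages (in which case
 * nothing is allocated).
 */
static int toy_alloc(struct toy_zone *z, unsigned int order, int *pages)
{
    size_t count = (size_t)1 << order;
    size_t got = 0;
    int start;

    assert(count <= TOY_NPAGES);

    /* First choice: one high-order block. */
    start = toy_find_block(z, order);
    if (start >= 0) {
        for (size_t i = 0; i < count; i++) {
            z->free_map[start + i] = 0;
            pages[i] = start + (int)i;
        }
        return 1;
    }

    /* Fallback: gather single pages wherever they are free. */
    for (size_t i = 0; i < TOY_NPAGES && got < count; i++)
        if (z->free_map[i])
            pages[got++] = (int)i;
    if (got < count)
        return -1;
    for (size_t i = 0; i < count; i++)
        z->free_map[pages[i]] = 0;
    return 0;
}
```

A caller would check the return value to decide whether it can treat the
buffer as one contiguous region or has to build a scatter list from
pages[].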

> It is just my guess.
> But some components might not be able to do it.
> If there are components for which this is impossible, that is a good
> reason for defragmentation.
>
> > > > On the flip side, zone-based code changes are easier to understand than
> > > > the list-based ones (at least in terms of volume of code changes). The
> > > > zone-based gives guarantees on what will happen in the future while
> > > > list-based is best-effort.
> > > >
> > > > In terms of fragmentation, I still think that list-based is better overall
> > > > without configuration.
> > > I agree here.
> > >
> > > > The results above also represent the best possible
> > > > configuration with zone-based versus no configuration at all against
> > > > list-based. In an environment where changing workloads are a constant reality,
> > > > I bet that list-based would win overall.
> > > >
> > > On x86, NORMAL is only 896M anyway. There is no discussion.
> > >
> >
> > There is a discussion with architectures like ppc64 which do not have a
> > normal zone (only ZONE_DMA) and 64-bit architectures that have very large
> > normal zones.
> >
> > Take ppc64 as an example. Today, when memory is hot-added, it is available
> > for use by the kernel and userspace applications. Right now, hot-added
> > memory goes to ZONE_DMA but it should be going to ZONE_EASYRCLM. In this
> > case, the size of the kernel at the beginning is fixed. If you allow the
> > kernel zone to grow, it cannot be shrunk again and worse, if the kernel
> > expands to take up available memory, it loses all advantages.
>
> Just a correction: ZONE_EASYRCLM is useful only for hot-remove.
> So, if the kernel would like to have more memory, hot-adding to ZONE_DMA (if
> its address is in the DMA area) or ZONE_NORMAL should be OK.
> The only cost is that the new memory will not be removable.
>

My understanding is that choosing what zone to add memory to is not an
option. The main case where memory is hot-added and hot-removed is to meet
changing demands of the workload. The memory is hot-added and removed by
an automated system which, no matter how well written, will end up adding
memory to the wrong zone some of the time.

-- 
Mel Gorman
Part-time Phd Student                          Linux Technology Center
University of Limerick                         IBM Dublin Software Lab


