linux-mm.kvack.org archive mirror
From: Paul Jackson <pj@sgi.com>
To: David Rientjes <rientjes@cs.washington.edu>
Cc: linux-mm@kvack.org, akpm@osdl.org, nickpiggin@yahoo.com.au,
	ak@suse.de, mbligh@google.com, rohitseth@google.com,
	menage@google.com, clameter@sgi.com
Subject: Re: [RFC] another way to speed up fake numa node page_alloc
Date: Tue, 26 Sep 2006 00:06:12 -0700
Message-ID: <20060926000612.9db145a9.pj@sgi.com>
In-Reply-To: <Pine.LNX.4.64N.0609252214590.14826@attu4.cs.washington.edu>

Thanks for reviewing this, David.

David wrote:
> If there's mangling on 'last_full_zap' in the scenario with multiple CPU's 
> on one node, that means that we might be clearing 'fullnodes' more often 
> than every 1*HZ, and that clear is always done by one CPU.  Since the only 
> purpose of the delay is to allow a certain period of time go by where 
> these hints will actually serve a purpose, this entire speed-up will 
> then be degraded.  I agree that adding locking for 'zonelist_faster' is 
> probably going too far in terms of performance hint data, but it seems 
> necessary with 'last_full_zap' if the goal is to preserve this 1*HZ 
> delay.

I doubt it.  An occasional extra clearing of fullnodes seems quite
harmless to me.  I doubt it matters whether we zap fullnodes once per
second, or once per two seconds, or twice a second.  We're just dealing
with a single 64-bit word (a jiffies value), and it's a word that just
the few CPUs local to a single node are contending over.  On real 64-bit
systems, it may not even be possible to mangle it.

The goal is not to preserve a 1*HZ delay.  I just pulled that delay out
of some unspeakable place.

Roughly, I wanted to throttle the wasteful scans of already full
zones to a rate that was infrequent enough to solve our performance
problem, while still frequent enough that no one would ever seriously
notice the subtle transient changes in memory placement behaviour.
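
To make the throttling concrete, here is a minimal user-space sketch,
not the code from the patch.  The names 'fullnodes' and 'last_full_zap'
come from this thread; the struct name, the helper functions, the HZ
value, the explicit 'now' tick argument (standing in for jiffies) and
the single 64-bit mask limiting it to 64 nodes are all assumptions made
for illustration.  It deliberately takes no locks, so a race between
CPUs would at worst cause an extra clear of the hints.

```c
#include <stdbool.h>
#include <stdint.h>

#define HZ 1000UL               /* assumed tick rate, for illustration only */
#define ZAP_INTERVAL (1 * HZ)   /* the arbitrary "1*HZ" delay discussed above */

/* Hypothetical per-zonelist hint data; not the layout used by the patch. */
struct zonelist_hints_sketch {
	uint64_t fullnodes;          /* bit n set: node n was recently seen full */
	unsigned long last_full_zap; /* tick count when fullnodes was last cleared */
};

/*
 * Clear the hints once at least ZAP_INTERVAL ticks have gone by.
 * No locking: if two callers race here, the hints are simply cleared
 * a little more often than intended.  Unsigned subtraction keeps the
 * comparison correct across tick counter wraparound.
 */
static bool maybe_zap(struct zonelist_hints_sketch *h, unsigned long now)
{
	if (now - h->last_full_zap < ZAP_INTERVAL)
		return false;
	h->fullnodes = 0;
	h->last_full_zap = now;
	return true;
}

/* Record that an allocation attempt found this node (0..63) full. */
static void mark_node_full(struct zonelist_hints_sketch *h, unsigned int node)
{
	h->fullnodes |= UINT64_C(1) << node;
}

/* Should the allocator bother scanning this node (0..63) right now? */
static bool node_worth_trying(struct zonelist_hints_sketch *h,
			      unsigned int node, unsigned long now)
{
	maybe_zap(h, now);
	return !(h->fullnodes & (UINT64_C(1) << node));
}
```

In this sketch a node marked full is skipped until the next zap, after
which it gets scanned again; that is the transient change in placement
behaviour mentioned above.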

> It seems like an immutable time interval embedded in the page alloc code 
> may not be the best way to measure when a full zap should occur.

Eh ... why not?  Sure, it's dirt simple.  But in this case, fancier
control of this interval seems like it risks spending more effort than
it would save, with almost no discernible advantage to the user.

If we already had the exact metric handy that we needed, so no more
code needed to be added to a hot path to maintain the metric (including
likely real locks, since most metrics don't like to be mangled by
code that takes a cavalier attitude to locking), then I might reconsider.

But I doubt that this use would justify adding a metric.

> This is a creative solution, 

thanks ..

> This definitely seems to be headed in the right direction because it works 
> in both the real NUMA case and the fake NUMA case.

I hope so.

> I would really like to 
> run benchmarks on this implementation as I have done for the others but I 
> no longer have access to a 64-bit machine. 

Odd ...  Do you expect that situation to be remedied anytime soon?

I'd like to see the results of your rerunning your benchmark.

-- 
                  I won't rest till it's the best ...
                  Programmer, Linux Scalability
                  Paul Jackson <pj@sgi.com> 1.925.600.0401

