Re: [RFC] another way to speed up fake numa node page_alloc

From: Paul Jackson <pj@sgi.com>
To: David Rientjes <rientjes@cs.washington.edu>
Cc: linux-mm@kvack.org, akpm@osdl.org, nickpiggin@yahoo.com.au,
	ak@suse.de, mbligh@google.com, rohitseth@google.com,
	menage@google.com, clameter@sgi.com
Subject: Re: [RFC] another way to speed up fake numa node page_alloc
Date: Tue, 26 Sep 2006 14:48:12 -0700
Message-ID: <20060926144812.3ebbd7e6.pj@sgi.com>
In-Reply-To: <Pine.LNX.4.64N.0609261242170.22108@attu2.cs.washington.edu>

> Why is it arbitrary, though?

I was just trying to throttle the rate of futile zonelist scans.

In my implementation, the choice of 1*HZ for the zap time is obviously
an arbitrarily chosen time, within some acceptable range - right?

If you are asking why I didn't pick the non-arbitrary variant you
suggested, in which we clear a full node's bit in the nodemask any time
we free memory on that node: I did not do this because it was more
code, because it required a lock to clear the bit safely, and because I
had no particular reason to think it would provide a measurable
improvement anyway.

I am quite happy coding stupid, simple, short and racy code, if it
looks to me like it will perform just as well, and be just as robust,
if not more so, than the more exact, longer, lock protected code.

> If that's the case, then the entire speed-up is broken. 

Are we looking at the same patch ;)?  My patch enables us to only have
to look closely at each full node once per second, instead of once per
page allocation.  That's the speedup.  That and the more rapid
application of the cpuset constraint in most cases.  The unallowed and
recently full nodes are skipped over on the first scan at the per-zone
cost of loading just a single unsigned short, from a compact array, plus
modest constant overhead per __alloc_pages call.

(My unit of cost here is 'cache line misses'.)
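
A rough userspace sketch of the throttling described above may help.
It is not the patch itself: the struct, the function names, MAX_ZONES
and ZAP_INTERVAL (standing in for 1*HZ) are all made up for
illustration, and a plain 64-bit word stands in for the nodemask. The
point is only that the first scan costs one unsigned short load per
zone plus two bit tests, and that "full" marks are forgotten in bulk
once per interval rather than cleared one by one on free.

```c
#include <assert.h>
#include <stdint.h>

#define MAX_ZONES    16      /* hypothetical zonelist length */
#define MAX_NODES    64      /* one bit per node in a 64-bit mask */
#define ZAP_INTERVAL 1000UL  /* stand-in for 1*HZ, in ticks */

struct zl_cache {
    unsigned short z_to_n[MAX_ZONES]; /* zonelist position -> node id */
    uint64_t fullnodes;               /* bit n set: node n recently full */
    unsigned long last_zap;           /* tick of the last bulk clear */
};

/* Forget every "full" mark once per interval. No lock on purpose: a
 * lost or late zap only costs a few extra skipped or wasted scans. */
static void zlc_maybe_zap(struct zl_cache *zlc, unsigned long now)
{
    if (now - zlc->last_zap >= ZAP_INTERVAL) {
        zlc->fullnodes = 0;
        zlc->last_zap = now;
    }
}

/* Remember that an allocation attempt found this node too full. */
static void zlc_mark_full(struct zl_cache *zlc, int node)
{
    assert(node >= 0 && node < MAX_NODES);
    zlc->fullnodes |= UINT64_C(1) << node;
}

/* First scan: return the first zonelist position whose node is both
 * allowed and not recently full, or -1 if there is none. */
static int zlc_first_candidate(struct zl_cache *zlc, int nzones,
                               uint64_t allowed, unsigned long now)
{
    assert(nzones >= 0 && nzones <= MAX_ZONES);
    zlc_maybe_zap(zlc, now);
    for (int z = 0; z < nzones; z++) {
        uint64_t bit = UINT64_C(1) << zlc->z_to_n[z];
        if ((allowed & bit) && !(zlc->fullnodes & bit))
            return z;
    }
    return -1;
}
```

A caller would fall back to a full, careful scan of the zonelist when
zlc_first_candidate() returns -1, and call zlc_mark_full() whenever
that careful look finds a node short of memory.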

> And since the node bit is only turned 
> on when it has been passed by and deemed too full to allocate on, I don't 
> see where the race exists.

If two cpus on the same node each go to clear a (different) bit in the
nodemask at the same time, you could have each cpu load the mask, each
cpu compute a new mask, with its bit cleared, and each cpu store the
mask, all in that order.  Notice that the second cpu to store just
clobbered the bit clear done by the first cpu.
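
The interleaving above can be replayed deterministically in a few
lines of userspace C. This is only a sketch: the function names are
made up, a 64-bit word stands in for the nodemask, and the two "cpus"
are just local variables stepped through in the order described. The
second function shows one lock-free alternative, an atomic
read-modify-write from C11 <stdatomic.h>; that is a swapped-in
technique for comparison, not something proposed in this thread.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

/* Replay the lost update: both cpus load, both compute, both store. */
static uint64_t racy_clear_two(uint64_t mask, int bit_a, int bit_b)
{
    uint64_t a = mask;                 /* cpu A loads the mask */
    uint64_t b = mask;                 /* cpu B loads the same value */
    a &= ~(UINT64_C(1) << bit_a);      /* A computes its new mask */
    b &= ~(UINT64_C(1) << bit_b);      /* B computes its new mask */
    mask = a;                          /* A stores */
    mask = b;                          /* B stores, undoing A's clear */
    return mask;
}

/* Same two clears done as atomic read-modify-write operations, so
 * neither update can be lost whatever the interleaving. */
static uint64_t atomic_clear_two(uint64_t initial, int bit_a, int bit_b)
{
    _Atomic uint64_t mask;
    atomic_init(&mask, initial);
    atomic_fetch_and(&mask, ~(UINT64_C(1) << bit_a));
    atomic_fetch_and(&mask, ~(UINT64_C(1) << bit_b));
    return atomic_load(&mask);
}
```

Starting from a mask with bits 1 and 2 set, racy_clear_two(0x6, 1, 2)
leaves bit 1 still set, while atomic_clear_two(0x6, 1, 2) clears both.
In the kernel the same distinction shows up as the atomic clear_bit()
versus the non-atomic __clear_bit().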

-- 
                  I won't rest till it's the best ...
                  Programmer, Linux Scalability
                  Paul Jackson <pj@sgi.com> 1.925.600.0401