[PATCH] avoid atomic op on page free

linux-mm.kvack.org archive mirror

* [PATCH] avoid atomic op on page free
@ 2006-03-07  0:10 Benjamin LaHaise
  2006-03-07  0:50 ` Andrew Morton
  2006-03-07  1:53 ` Nick Piggin
  0 siblings, 2 replies; 15+ messages in thread
From: Benjamin LaHaise @ 2006-03-07  0:10 UTC (permalink / raw)
  To: akpm; +Cc: linux-mm, netdev

Hello Andrew et al,

The patch below adds a fast path that avoids the atomic dec and test 
operation and spinlock acquire/release on page free.  This is especially 
important to the network stack which uses put_page() to free user 
buffers.  Removing these atomic ops helps improve netperf on the P4 
from ~8126Mbit/s to ~8199Mbit/s (although that number fluctuates quite a 
bit with some runs getting 8243Mbit/s).  There are probably better 
workloads to see an improvement from this on, but removing 3 atomics and 
an irq save/restore is good.

		-ben
-- 
"Time is of no importance, Mr. President, only life is important."
Don't Email: <dont@kvack.org>.

Signed-off-by: Benjamin LaHaise <bcrl@linux.intel.com>
diff --git a/mm/swap.c b/mm/swap.c
index cce3dda..d6934cf 100644
--- a/mm/swap.c
+++ b/mm/swap.c
@@ -49,7 +49,10 @@ void put_page(struct page *page)
 {
 	if (unlikely(PageCompound(page)))
 		put_compound_page(page);
-	else if (put_page_testzero(page))
+	else if (page_count(page) == 1 && !PageLRU(page)) {
+		set_page_count(page, 0);
+		free_hot_page(page);
+	} else if (put_page_testzero(page))
 		__page_cache_release(page);
 }
 EXPORT_SYMBOL(put_page);

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>


* Re: [PATCH] avoid atomic op on page free
  2006-03-07  0:10 [PATCH] avoid atomic op on page free Benjamin LaHaise
@ 2006-03-07  0:50 ` Andrew Morton
  2006-03-07  1:11   ` Benjamin LaHaise
  2006-03-07  1:21   ` Rick Jones
  2006-03-07  1:53 ` Nick Piggin
  1 sibling, 2 replies; 15+ messages in thread
From: Andrew Morton @ 2006-03-07  0:50 UTC (permalink / raw)
  To: Benjamin LaHaise; +Cc: linux-mm, netdev

Benjamin LaHaise <bcrl@linux.intel.com> wrote:
>
> Hello Andrew et al,
> 
> The patch below adds a fast path that avoids the atomic dec and test 
> operation and spinlock acquire/release on page free.  This is especially 
> important to the network stack which uses put_page() to free user 
> buffers.  Removing these atomic ops helps improve netperf on the P4 
> from ~8126Mbit/s to ~8199Mbit/s (although that number fluctuates quite a 
> bit with some runs getting 8243Mbit/s).  There are probably better 
> workloads to see an improvement from this on, but removing 3 atomics and 
> an irq save/restore is good.
> 

Am a bit surprised at those numbers.

> diff --git a/mm/swap.c b/mm/swap.c
> index cce3dda..d6934cf 100644
> --- a/mm/swap.c
> +++ b/mm/swap.c
> @@ -49,7 +49,10 @@ void put_page(struct page *page)
>  {
>  	if (unlikely(PageCompound(page)))
>  		put_compound_page(page);
> -	else if (put_page_testzero(page))
> +	else if (page_count(page) == 1 && !PageLRU(page)) {
> +		set_page_count(page, 0);
> +		free_hot_page(page);
> +	} else if (put_page_testzero(page))
>  		__page_cache_release(page);

Because userspace has to do peculiar things to get its pages taken off the
LRU.  What exactly was that application doing?

The patch adds slight overhead to the common case while providing
improvement to what I suspect is a very uncommon case?


* Re: [PATCH] avoid atomic op on page free
  2006-03-07  0:50 ` Andrew Morton
@ 2006-03-07  1:11   ` Benjamin LaHaise
  2006-03-07  1:39     ` Andrew Morton
  2006-03-07  2:04     ` Nick Piggin
  2006-03-07  1:21   ` Rick Jones
  1 sibling, 2 replies; 15+ messages in thread
From: Benjamin LaHaise @ 2006-03-07  1:11 UTC (permalink / raw)
  To: Andrew Morton; +Cc: linux-mm, netdev

On Mon, Mar 06, 2006 at 04:50:39PM -0800, Andrew Morton wrote:
> Am a bit surprised at those numbers.

> Because userspace has to do peculiar things to get its pages taken off the
> LRU.  What exactly was that application doing?

It's just a simple send() and recv() pair of processes.  Networking uses 
pages for the buffer on user transmits.  Those pages tend to be freed 
in irq context on transmit or in the receiver if the traffic is local.

> The patch adds slight overhead to the common case while providing
> improvement to what I suspect is a very uncommon case?

At least on any modern CPU with branch prediction, the test is essentially 
free (2 memory reads that pipeline well, iow 1 cycle, maybe 2).  The 
upside is that you get to avoid the atomic (~17 cycles on a P4 with a 
simple test program, the penalty doubles if there is one other instruction 
that operates on memory in the loop), disabling interrupts (~20 cycles?  I 
don't remember), another atomic for the spinlock, another atomic for 
TestClearPageLRU(), and the pushf/popf (expensive, as they wait for whatever 
instructions are still in flight to complete and add the penalty 
for changing irq state).  That's at least 70 cycles without including the 
memory barrier side effects which can cost 100 cycles+.  Add in the costs 
for the cacheline bouncing of the lru_lock and we're talking *expensive*.

So, a 1-2 cycle cost for a case that normally takes from 17 to 100+ cycles?  
I think that's worth it given the benefits.
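
For illustration only, the estimates above can be tallied in a few lines of C.  This is a rough sketch, not a measurement: it assumes that all three atomics (the dec-and-test, the spinlock acquire and TestClearPageLRU()) cost the same ~17 cycles, it leaves out pushf/popf, barrier side effects and cacheline bouncing, and the helper name is hypothetical.

```c
#include <assert.h>

/* Per-operation estimates quoted in this message for a P4.  The ~20 cycle
 * figure was given from memory, so treat it as approximate. */
enum {
	ATOMIC_OP_CYCLES   = 17,	/* one locked read-modify-write */
	IRQ_DISABLE_CYCLES = 20,	/* disabling interrupts */
	FAST_PATH_CYCLES   = 2,		/* two pipelined loads, upper estimate */
};

/* Hypothetical helper: lower bound for the slow path, assuming each of the
 * three atomics costs the same as the measured one. */
static int slow_path_min_cycles(void)
{
	return 3 * ATOMIC_OP_CYCLES + IRQ_DISABLE_CYCLES;
}

/* 3 * 17 + 20 = 71, consistent with "at least 70 cycles". */
static_assert(3 * ATOMIC_OP_CYCLES + IRQ_DISABLE_CYCLES >= 70,
	      "slow path estimate should be at least 70 cycles");
```

Under these assumptions the slow path costs at least 35 times the upper estimate for the added test, before any barrier or cache effects are counted.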

Also, I think the common case (page cache read / map) is something that 
should be done differently, as those atomics really do add up to major 
pain.  Using rcu for page cache reads would be truly wonderful, but that 
will take some time.

		-ben


* Re: [PATCH] avoid atomic op on page free
  2006-03-07  0:50 ` Andrew Morton
  2006-03-07  1:11   ` Benjamin LaHaise
@ 2006-03-07  1:21   ` Rick Jones
  1 sibling, 0 replies; 15+ messages in thread
From: Rick Jones @ 2006-03-07  1:21 UTC (permalink / raw)
  To: netdev; +Cc: linux-mm

Andrew Morton wrote:
> Benjamin LaHaise <bcrl@linux.intel.com> wrote:
> 
>>Hello Andrew et al,
>>
>>The patch below adds a fast path that avoids the atomic dec and test 
>>operation and spinlock acquire/release on page free.  This is especially 
>>important to the network stack which uses put_page() to free user 
>>buffers.  Removing these atomic ops helps improve netperf on the P4 
>>from ~8126Mbit/s to ~8199Mbit/s (although that number fluctuates quite a 
>>bit with some runs getting 8243Mbit/s).  There are probably better 
>>workloads to see an improvement from this on, but removing 3 atomics and 
>>an irq save/restore is good.
>>
 > ...
> Because userspace has to do peculiar things to get its pages taken off the
> LRU.  What exactly was that application doing?
> 
> The patch adds slight overhead to the common case while providing
> improvement to what I suspect is a very uncommon case?

A netperf TCP_STREAM test sits in a tight loop calling send() on the 
side running netperf and recv() on the side running netserver.  By 
default it accepts the default socket buffer sizes, and uses what is 
returned by a getsockopt(SO_SNDBUF) _before_ connect() as its "send 
size"  (and SO_RCVBUF as the default recv size)

So, in that regard it will be akin to a unidirectional bulk transfer 
application, e.g. ftp.

Netperf TCP_STREAM sends from a "ring" of buffers allocated all at once 
via malloc(); the ring holds one more buffer than SO_SNDBUF/sendsize.
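
As a rough sketch of the ring sizing described above (not netperf's actual source; the function names are hypothetical and the sizes are passed in rather than read with getsockopt()):

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical helper: number of buffers in the send ring, one more than
 * SO_SNDBUF / send size. */
static size_t ring_buffers(size_t sndbuf_bytes, size_t send_size)
{
	assert(send_size > 0);
	return sndbuf_bytes / send_size + 1;
}

/* Allocate the whole ring up front; returns NULL on failure. */
static char **ring_alloc(size_t count, size_t send_size)
{
	char **ring = calloc(count, sizeof(*ring));

	if (!ring)
		return NULL;
	for (size_t i = 0; i < count; i++) {
		ring[i] = malloc(send_size);
		if (!ring[i]) {
			/* Undo the buffers allocated so far. */
			while (i--)
				free(ring[i]);
			free(ring);
			return NULL;
		}
	}
	return ring;
}

/* Release every buffer, then the ring itself. */
static void ring_free(char **ring, size_t count)
{
	if (!ring)
		return;
	for (size_t i = 0; i < count; i++)
		free(ring[i]);
	free(ring);
}
```

With the default send size equal to SO_SNDBUF, ring_buffers() gives two buffers, and the sender cycles through them on successive send() calls.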

There is also the TCP_SENDFILE test that is similar to TCP_STREAM only 
the netperf side calls sendfile(); and a TCP_RR test that will by 
default exchange single-byte requests and responses - single 
"transaction" outstanding at a time.  The idea was to test path length 
without taxing link bandwidth.

There are commandline options to change all of that, and several other 
tests, some optional compilations:

http://www.netperf.org/svn/netperf2/trunk/doc/

will have most if not all the nitty gritty details.  Some of the more 
recent additions to netperf are only described in the netperf-talk 
mailing list:

http://www.netperf.org/pipermail/netperf-talk/

eg support for more than one transaction outstanding in an _RR test and 
other odds and ends.

rick jones


* Re: [PATCH] avoid atomic op on page free
  2006-03-07  1:11   ` Benjamin LaHaise
@ 2006-03-07  1:39     ` Andrew Morton
  2006-03-07  1:52       ` Benjamin LaHaise
  2006-03-07  2:04     ` Nick Piggin
  1 sibling, 1 reply; 15+ messages in thread
From: Andrew Morton @ 2006-03-07  1:39 UTC (permalink / raw)
  To: Benjamin LaHaise; +Cc: linux-mm, netdev

Benjamin LaHaise <bcrl@linux.intel.com> wrote:
>
> On Mon, Mar 06, 2006 at 04:50:39PM -0800, Andrew Morton wrote:
> > Am a bit surprised at those numbers.
> 
> > Because userspace has to do peculiar things to get its pages taken off the
> > LRU.  What exactly was that application doing?
> 
> It's just a simple send() and recv() pair of processes.  Networking uses 
> pages for the buffer on user transmits.

You mean non-zero-copy transmits?  If they were zero-copy then those pages
would still be on the LRU.

>  Those pages tend to be freed 
> in irq context on transmit or in the receiver if the traffic is local.

If it was a non-zero-copy Tx then networking owns that page and can just do
free_hot_page() on it and avoid all that stuff in put_page().


> > The patch adds slight overhead to the common case while providing
> > improvement to what I suspect is a very uncommon case?
> 
> At least on any modern CPU with branch prediction, the test is essentially 
> free (2 memory reads that pipeline well, iow 1 cycle, maybe 2).  The 
> upside is that you get to avoid the atomic (~17 cycles on a P4 with a 
> simple test program, the penalty doubles if there is one other instruction 
> that operates on memory in the loop), disabling interrupts (~20 cycles?, I 
> don't remember) another atomic for the spinlock, another atomic for 
> TestClearPageLRU() and the pushf/popf (expensive as they rely on whatever 
> instruction that might still be in flight to complete and add the penalty 
> for changing irq state).  That's at least 70 cycles without including the 
> memory barrier side effects which can cost 100 cycles+.  Add in the costs 
> for the cacheline bouncing of the lru_lock and we're talking *expensive*.
> 
> So, a 1-2 cycle cost for a case that normally takes from 17 to 100+ cycles?  
> I think that's worth it given the benefits.

Thing is, that case would represent about 1000000th of the number of
put_pages()s which get done in the world.  IOW: a net loss.

> Also, I think the common case (page cache read / map) is something that 
> should be done differently, as those atomics really do add up to major 
> pain.  Using rcu for page cache reads would be truly wonderful, but that 
> will take some time.
> 

We'd need to consider the interaction with those pages which get temporarily
removed from the LRU in reclaim.


* Re: [PATCH] avoid atomic op on page free
  2006-03-07  1:39     ` Andrew Morton
@ 2006-03-07  1:52       ` Benjamin LaHaise
  2006-03-07  6:30         ` Andi Kleen
  0 siblings, 1 reply; 15+ messages in thread
From: Benjamin LaHaise @ 2006-03-07  1:52 UTC (permalink / raw)
  To: Andrew Morton; +Cc: linux-mm, netdev

On Mon, Mar 06, 2006 at 05:39:41PM -0800, Andrew Morton wrote:
> > It's just a simple send() and recv() pair of processes.  Networking uses 
> > pages for the buffer on user transmits.
> 
> You mean non-zero-copy transmits?  If they were zero-copy then those pages
> would still be on the LRU.

Correct.

> >  Those pages tend to be freed 
> > in irq context on transmit or in the receiver if the traffic is local.
> 
> If it was a non-zero-copy Tx then networking owns that page and can just do
> free_hot_page() on it and avoid all that stuff in put_page().

At least currently, networking has no way of knowing that is the case since 
pages may have their reference count increased when an skb is cloned, and 
in fact do when TCP sends them off.

> Thing is, that case would represent about 1000000th of the number of
> put_pages()s which get done in the world.  IOW: a net loss.

Those 1-2 cycles are free if you look at how things get scheduled with the 
execution of the surrounding code. I bet $20 that you can't find a modern 
CPU where the cost is measurable (meaning something like a P4, Athlon).  
If this level of cost for the common case is a concern, it's probably worth 
making atomic_dec_and_test() inline for page_cache_release().  The overhead 
of the function call and the PageCompound() test is probably more than what 
we're talking about as you're increasing the cache footprint and actually 
performing a write to memory.

		-ben


* Re: [PATCH] avoid atomic op on page free
  2006-03-07  0:10 [PATCH] avoid atomic op on page free Benjamin LaHaise
  2006-03-07  0:50 ` Andrew Morton
@ 2006-03-07  1:53 ` Nick Piggin
  2006-03-07  1:58   ` Benjamin LaHaise
  1 sibling, 1 reply; 15+ messages in thread
From: Nick Piggin @ 2006-03-07  1:53 UTC (permalink / raw)
  To: Benjamin LaHaise; +Cc: akpm, linux-mm, netdev

Benjamin LaHaise wrote:

>Hello Andrew et al,
>
>The patch below adds a fast path that avoids the atomic dec and test 
>operation and spinlock acquire/release on page free.  This is especially 
>important to the network stack which uses put_page() to free user 
>buffers.  Removing these atomic ops helps improve netperf on the P4 
>from ~8126Mbit/s to ~8199Mbit/s (although that number fluctuates quite a 
>bit with some runs getting 8243Mbit/s).  There are probably better 
>workloads to see an improvement from this on, but removing 3 atomics and 
>an irq save/restore is good.
>
>		-ben
>

You can't do this because you can't test PageLRU like that.

Have a look in the lkml archives a few months back, where I proposed
a way to do this for __free_pages(). You can't do it for put_page.

BTW I have quite a large backlog of patches in -mm which should end
up avoiding an atomic or two around these parts.


* Re: [PATCH] avoid atomic op on page free
  2006-03-07  1:53 ` Nick Piggin
@ 2006-03-07  1:58   ` Benjamin LaHaise
  2006-03-07  2:14     ` Nick Piggin
  0 siblings, 1 reply; 15+ messages in thread
From: Benjamin LaHaise @ 2006-03-07  1:58 UTC (permalink / raw)
  To: Nick Piggin; +Cc: akpm, linux-mm, netdev

On Tue, Mar 07, 2006 at 12:53:27PM +1100, Nick Piggin wrote:
> You can't do this because you can't test PageLRU like that.
> 
> Have a look in the lkml archives a few months back, where I proposed
> a way to do this for __free_pages(). You can't do it for put_page.

Even if we know that we are the last user of the page (the count is 1)?  
Who can bump the page's count then?

> BTW I have quite a large backlog of patches in -mm which should end
> up avoiding an atomic or two around these parts.

That certainly looks like it will help.  Not taking the spinlock 
unconditionally gets rid of quite a bit of the cost.

		-ben


* Re: [PATCH] avoid atomic op on page free
  2006-03-07  1:11   ` Benjamin LaHaise
  2006-03-07  1:39     ` Andrew Morton
@ 2006-03-07  2:04     ` Nick Piggin
  2006-03-07  2:10       ` Benjamin LaHaise
  2006-03-07  2:30       ` Chen, Kenneth W
  1 sibling, 2 replies; 15+ messages in thread
From: Nick Piggin @ 2006-03-07  2:04 UTC (permalink / raw)
  To: Benjamin LaHaise; +Cc: Andrew Morton, linux-mm, netdev

Benjamin LaHaise wrote:

>On Mon, Mar 06, 2006 at 04:50:39PM -0800, Andrew Morton wrote:
>
>>Am a bit surprised at those numbers.
>>
>
>>Because userspace has to do peculiar things to get its pages taken off the
>>LRU.  What exactly was that application doing?
>>
>
>It's just a simple send() and recv() pair of processes.  Networking uses 
>pages for the buffer on user transmits.  Those pages tend to be freed 
>in irq context on transmit or in the receiver if the traffic is local.
>
>
>>The patch adds slight overhead to the common case while providing
>>improvement to what I suspect is a very uncommon case?
>>
>
>At least on any modern CPU with branch prediction, the test is essentially 
>free (2 memory reads that pipeline well, iow 1 cycle, maybe 2).  The 
>upside is that you get to avoid the atomic (~17 cycles on a P4 with a 
>simple test program, the penalty doubles if there is one other instruction 
>that operates on memory in the loop), disabling interrupts (~20 cycles?, I 
>don't remember) another atomic for the spinlock, another atomic for 
>TestClearPageLRU() and the pushf/popf (expensive as they rely on whatever 
>instruction that might still be in flight to complete and add the penalty 
>for changing irq state).  That's at least 70 cycles without including the 
>memory barrier side effects which can cost 100 cycles+.  Add in the costs 
>for the cacheline bouncing of the lru_lock and we're talking *expensive*.
>
>

My patches in -mm avoid the lru_lock and disabling/enabling interrupts
if the page is not on lru too, btw.

>So, a 1-2 cycle cost for a case that normally takes from 17 to 100+ cycles?  
>I think that's worth it given the benefits.
>
>Also, I think the common case (page cache read / map) is something that 
>should be done differently, as those atomics really do add up to major 
>pain.  Using rcu for page cache reads would be truly wonderful, but that 
>will take some time.
>
>

It is not very difficult to implement (and is something I intend to look
at after I finish my lockless pagecache). But it has quite a lot of
problems, including a potentially big (temporal) increase of cache
footprint to process the pages, more CPU time in general to traverse the
lists, and increased over/underflows in the per-cpu pagelists. Possibly
even worse would be the increased overhead on the RCU infrastructure and
potential OOM conditions.

Not to mention the extra logic involved to either retry, or fall back to
get/put in the case that the userspace target page is not resident.

I'd say it will turn out to be more trouble than it's worth, for the
miserly cost of avoiding one atomic_inc and one atomic_dec_and_test on
page-local data that will be in L1 cache. I'd never turn my nose up at
anyone just having a go though :)


* Re: [PATCH] avoid atomic op on page free
  2006-03-07  2:04     ` Nick Piggin
@ 2006-03-07  2:10       ` Benjamin LaHaise
  2006-03-07  4:08         ` Nick Piggin
  2006-03-07  2:30       ` Chen, Kenneth W
  1 sibling, 1 reply; 15+ messages in thread
From: Benjamin LaHaise @ 2006-03-07  2:10 UTC (permalink / raw)
  To: Nick Piggin; +Cc: Andrew Morton, linux-mm, netdev

On Tue, Mar 07, 2006 at 01:04:36PM +1100, Nick Piggin wrote:
> I'd say it will turn out to be more trouble than it's worth, for the
> miserly cost of avoiding one atomic_inc and one atomic_dec_and_test on
> page-local data that will be in L1 cache. I'd never turn my nose up at
> anyone just having a go though :)

The cost is anything but miserly.  Consider that every lock instruction is 
a memory barrier which takes your OoO CPU with lots of instructions in flight 
to ramp down to just 1 for the time it takes that instruction to execute.  
That synchronization is what makes the atomic expensive.

In the case of netperf, I ended up with a 2.5Gbit/s (~30%) performance 
improvement through nothing but microoptimizations.  There is method to 
my madness. ;-)

		-ben


* Re: [PATCH] avoid atomic op on page free
  2006-03-07  1:58   ` Benjamin LaHaise
@ 2006-03-07  2:14     ` Nick Piggin
  0 siblings, 0 replies; 15+ messages in thread
From: Nick Piggin @ 2006-03-07  2:14 UTC (permalink / raw)
  To: Benjamin LaHaise; +Cc: akpm, linux-mm, netdev

Benjamin LaHaise wrote:

>On Tue, Mar 07, 2006 at 12:53:27PM +1100, Nick Piggin wrote:
>
>>You can't do this because you can't test PageLRU like that.
>>
>>Have a look in the lkml archives a few months back, where I proposed
>>a way to do this for __free_pages(). You can't do it for put_page.
>>
>
>Even if we know that we are the last user of the page (the count is 1)?  
>Who can bump the page's count then?
>
>

Yes. vmscan.

Your page_count and PageLRU tests have no synchronisation between
them, which is the problem AFAIKS. Anything can happen between them
and they can probably also be executed out of order (the loads).
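
To make the objection concrete, here is a minimal single-threaded replay of one interleaving that seems consistent with it.  This is only a sketch under stated assumptions: struct toy_page, its fields and the function names are hypothetical stand-ins rather than kernel code, and real reclaim does more than the two steps attributed to "CPU B" here.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for struct page: only the two fields involved. */
struct toy_page {
	atomic_int  count;	/* reference count */
	atomic_bool on_lru;	/* stand-in for the PageLRU flag */
};

static void toy_page_init(struct toy_page *p, int count, bool on_lru)
{
	atomic_init(&p->count, count);
	atomic_init(&p->on_lru, on_lru);
}

/* Replays the two unsynchronised tests with another CPU acting between
 * them.  Returns the count left behind once "CPU A" has run. */
static int replay_unsynchronised_free(void)
{
	struct toy_page p;

	toy_page_init(&p, 1, true);

	/* CPU A: first load of the fast path sees a single reference. */
	int seen_count = atomic_load(&p.count);

	/* CPU B (a scanner): takes a reference, pulls the page off the list. */
	atomic_fetch_add(&p.count, 1);
	atomic_store(&p.on_lru, false);

	/* CPU A: second load now sees the page off the list. */
	bool seen_lru = atomic_load(&p.on_lru);

	/* CPU A: both stale tests pass, so it zeroes the count and would free. */
	if (seen_count == 1 && !seen_lru)
		atomic_store(&p.count, 0);

	return atomic_load(&p.count);	/* CPU B's reference has been lost */
}

/* One atomic read-modify-write: exactly one caller sees the drop to zero. */
static bool put_testzero(struct toy_page *p)
{
	int old = atomic_fetch_sub(&p->count, 1);

	assert(old > 0);
	return old == 1;
}
```

In the replay the count ends at 0 while the scanner still believes it holds a reference, which is the kind of window a single dec-and-test closes.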

>>BTW I have quite a large backlog of patches in -mm which should end
>>up avoiding an atomic or two around these parts.
>>
>
>That certainly looks like it will help.  Not taking the spinlock 
>unconditionally gets rid of quite a bit of the cost.
>
>

Cool.


* RE: [PATCH] avoid atomic op on page free
  2006-03-07  2:04     ` Nick Piggin
  2006-03-07  2:10       ` Benjamin LaHaise
@ 2006-03-07  2:30       ` Chen, Kenneth W
  2006-03-07  4:13         ` Nick Piggin
  1 sibling, 1 reply; 15+ messages in thread
From: Chen, Kenneth W @ 2006-03-07  2:30 UTC (permalink / raw)
  To: 'Nick Piggin', Benjamin LaHaise; +Cc: Andrew Morton, linux-mm, netdev

Nick Piggin wrote on Monday, March 06, 2006 6:05 PM
> 
> My patches in -mm avoid the lru_lock and disabling/enabling interrupts
> if the page is not on lru too, btw.

Can you put the spin lock/unlock inside TestClearPageLRU()?  The
difference is subtle though.

- Ken


--- ./mm/swap.c.orig	2006-03-06 19:25:10.680967542 -0800
+++ ./mm/swap.c	2006-03-06 19:27:02.334286487 -0800
@@ -210,14 +210,16 @@ int lru_add_drain_all(void)
 void fastcall __page_cache_release(struct page *page)
 {
 	unsigned long flags;
-	struct zone *zone = page_zone(page);
+	struct zone *zone;
 
-	spin_lock_irqsave(&zone->lru_lock, flags);
-	if (TestClearPageLRU(page))
+	if (TestClearPageLRU(page)) {
+		zone = page_zone(page);
+		spin_lock_irqsave(&zone->lru_lock, flags);
 		del_page_from_lru(zone, page);
-	if (page_count(page) != 0)
-		page = NULL;
-	spin_unlock_irqrestore(&zone->lru_lock, flags);
+		if (page_count(page) != 0)
+			page = NULL;
+		spin_unlock_irqrestore(&zone->lru_lock, flags);
+	}
 	if (page)
 		free_hot_page(page);
 }


* Re: [PATCH] avoid atomic op on page free
  2006-03-07  2:10       ` Benjamin LaHaise
@ 2006-03-07  4:08         ` Nick Piggin
  0 siblings, 0 replies; 15+ messages in thread
From: Nick Piggin @ 2006-03-07  4:08 UTC (permalink / raw)
  To: Benjamin LaHaise; +Cc: Andrew Morton, linux-mm, netdev

Benjamin LaHaise wrote:
> On Tue, Mar 07, 2006 at 01:04:36PM +1100, Nick Piggin wrote:
> 
>>I'd say it will turn out to be more trouble than it's worth, for the
>>miserly cost of avoiding one atomic_inc and one atomic_dec_and_test on
>>page-local data that will be in L1 cache. I'd never turn my nose up at
>>anyone just having a go though :)
> 
> 
> The cost is anything but miserly.  Consider that every lock instruction is 
> a memory barrier which takes your OoO CPU with lots of instructions in flight 
> to ramp down to just 1 for the time it takes that instruction to execute.  
> That synchronization is what makes the atomic expensive.
> 

Yeah x86(-64) is a _little_ worse off in that regard because its locks
imply rmbs.

But I'm saying the cost is miserly compared to the likely overheads
of using RCU-ed page freeing, when taken as impact on the system as a
whole.

Though definitely if we can get rid of atomic ops for free in any low
level page handling functions in mm/ then we want to do that.

> In the case of netperf, I ended up with a 2.5Gbit/s (~30%) performance 
> improvement through nothing but microoptimizations.  There is method to 
> my madness. ;-)
> 

Well... it was wrong too ;)

But as you can see, I'm not against microoptimisations either and I'm
glad others, like yourself, are looking at the problem too.

The 30% number is very impressive. I'd be interested to see what the
stuff currently in -mm is worth.

-- 
SUSE Labs, Novell Inc.

* Re: [PATCH] avoid atomic op on page free
  2006-03-07  2:30       ` Chen, Kenneth W
@ 2006-03-07  4:13         ` Nick Piggin
  0 siblings, 0 replies; 15+ messages in thread
From: Nick Piggin @ 2006-03-07  4:13 UTC (permalink / raw)
  To: Chen, Kenneth W; +Cc: Benjamin LaHaise, Andrew Morton, linux-mm, netdev

Chen, Kenneth W wrote:
> Nick Piggin wrote on Monday, March 06, 2006 6:05 PM
> 
>>My patches in -mm avoid the lru_lock and disabling/enabling interrupts
>>if the page is not on lru too, btw.
> 
> 
> Can you put the spin lock/unlock inside TestClearPageLRU()?  The
> difference is subtle though.
> 

That's the idea, but you just need to do a little bit more so as not to
introduce a race.

http://www.kernel.org/pub/linux/kernel/people/akpm/patches/2.6/2.6.16-rc2/2.6.16-rc2-mm1/broken-out/mm-never-clearpagelru-released-pages.patch

> - Ken
> 
> 
> --- ./mm/swap.c.orig	2006-03-06 19:25:10.680967542 -0800
> +++ ./mm/swap.c	2006-03-06 19:27:02.334286487 -0800
> @@ -210,14 +210,16 @@ int lru_add_drain_all(void)
>  void fastcall __page_cache_release(struct page *page)
>  {
>  	unsigned long flags;
> -	struct zone *zone = page_zone(page);
> +	struct zone *zone;
>  
> -	spin_lock_irqsave(&zone->lru_lock, flags);
> -	if (TestClearPageLRU(page))
> +	if (TestClearPageLRU(page)) {
> +		zone = page_zone(page);
> +		spin_lock_irqsave(&zone->lru_lock, flags);
>  		del_page_from_lru(zone, page);
> -	if (page_count(page) != 0)
> -		page = NULL;
> -	spin_unlock_irqrestore(&zone->lru_lock, flags);
> +		if (page_count(page) != 0)
> +			page = NULL;
> +		spin_unlock_irqrestore(&zone->lru_lock, flags);
> +	}
>  	if (page)
>  		free_hot_page(page);
>  }
> 
> 


-- 
SUSE Labs, Novell Inc.

* Re: [PATCH] avoid atomic op on page free
  2006-03-07  1:52       ` Benjamin LaHaise
@ 2006-03-07  6:30         ` Andi Kleen
  0 siblings, 0 replies; 15+ messages in thread
From: Andi Kleen @ 2006-03-07  6:30 UTC (permalink / raw)
  To: Benjamin LaHaise; +Cc: Andrew Morton, linux-mm, netdev

On Tuesday 07 March 2006 02:52, Benjamin LaHaise wrote:

> Those 1-2 cycles are free if you look at how things get scheduled with the 
> execution of the surrounding code. I bet $20 that you can't find a modern 
> CPU where the cost is measurable (meaning something like a P4, Athlon).  
> If this level of cost for the common case is a concern, it's probably worth 
> making atomic_dec_and_test() inline for page_cache_release().  The overhead 
> of the function call and the PageCompound() test is probably more than what 
> we're talking about as you're increasing the cache footprint and actually 
> performing a write to memory.

The test should be essentially free, at least on an out-of-order CPU. Not quite sure 
about in-order though.

-Andi


end of thread, other threads:[~2006-03-07  6:30 UTC | newest]
