[PATCH] Remove unnecessary smp_wmb from clear_user


* [PATCH] Remove unnecessary smp_wmb from clear_user_highpage()
@ 2007-07-18 15:05 Mel Gorman
  2007-07-18 16:45 ` Hugh Dickins
  2007-07-19  1:57 ` Nick Piggin
  0 siblings, 2 replies; 12+ messages in thread
From: Mel Gorman @ 2007-07-18 15:05 UTC
  To: npiggin, hugh; +Cc: linux-mm

Hi,

At the nudging of Andrew, I was checking to see if the architecture-specific
implementations of alloc_zeroed_user_highpage() can be removed or not.
With the exception of barriers, the differences are negligible and the main
memory barrier is in clear_user_highpage(). However, it's unclear why it's
needed. Do you mind looking at the following patch and telling me if it's
wrong and if so, why?

Thanks a lot.

===

    This patch removes an unnecessary write barrier from clear_user_highpage().
    
    clear_user_highpage() is called from alloc_zeroed_user_highpage() on a
    number of architectures and from clear_huge_page(). However, these callers
    are already protected by the necessary memory barriers due to spinlocks
    in the fault path and the page should not be visible on other CPUs anyway
    making the barrier unnecessary. A hint of lack of necessity is that there
    does not appear to be a read barrier anywhere for this zeroed page.
    
    The sequence for the first use of alloc_zeroed_user_highpage()
    looks like;
    
    pte_unmap_unlock()
    alloc_zeroed_user_highpage()
    pte_offset_map_lock()
    
    The second is
    
    pte_unmap()	(usually nothing but sometimes a barrier())
    alloc_zeroed_user_highpage()
    pte_offset_map_lock()
    
    The two sequences with the use of locking should already have sufficient
    barriers.
    
    By removing this write barrier, IA64 could use the default implementation
    of alloc_zeroed_user_highpage() instead of a custom version which appears
    to do nothing but avoid calling smp_wmb(). Once that is done, there is
    little reason to have architecture-specific alloc_zeroed_user_highpage()
    helpers as it would be redundant.

diff --git a/include/linux/highmem.h b/include/linux/highmem.h
index 12c5e4e..ace5a32 100644
--- a/include/linux/highmem.h
+++ b/include/linux/highmem.h
@@ -68,8 +68,6 @@ static inline void clear_user_highpage(struct page *page, unsigned long vaddr)
 	void *addr = kmap_atomic(page, KM_USER0);
 	clear_user_page(addr, vaddr, page);
 	kunmap_atomic(addr, KM_USER0);
-	/* Make sure this page is cleared on other CPU's too before using it */
-	smp_wmb();
 }
 
 #ifndef __HAVE_ARCH_ALLOC_ZEROED_USER_HIGHPAGE
-- 
-- 
Mel Gorman
Part-time Phd Student                          Linux Technology Center
University of Limerick                         IBM Dublin Software Lab

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>


* Re: [PATCH] Remove unnecessary smp_wmb from clear_user_highpage()
  2007-07-18 15:05 [PATCH] Remove unnecessary smp_wmb from clear_user_highpage() Mel Gorman
@ 2007-07-18 16:45 ` Hugh Dickins
  2007-07-19  2:17   ` Nick Piggin
                     ` (3 more replies)
  2007-07-19  1:57 ` Nick Piggin
  1 sibling, 4 replies; 12+ messages in thread
From: Hugh Dickins @ 2007-07-18 16:45 UTC
  To: Mel Gorman; +Cc: Linus Torvalds, npiggin, linux-mm

On Wed, 18 Jul 2007, Mel Gorman wrote:
> 
> At the nudging of Andrew, I was checking to see if the architecture-specific
> implementations of alloc_zeroed_user_highpage() can be removed or not.

Ah, so that was part of the deal for getting MOVABLE in, eh ;-?

> With the exception of barriers, the differences are negligible and the main
> memory barrier is in clear_user_highpage(). However, it's unclear why it's
> needed. Do you mind looking at the following patch and telling me if it's
> wrong and if so, why?
> 
> Thanks a lot.

I laugh when someone approaches me with a question on barriers ;)
I usually get confused and have to go ask someone else.

And I should really leave this query to Nick: he'll be glad of the
opportunity to post his PageUptodate memorder patches again (looking
in my mailbox I see versions from February, but I'm pretty sure he put
out a more compact, less scary one later on).  He contends that the
barrier in clear_user_highpage should not be there, and that barriers
are instead (usually) needed when setting and testing PageUptodate.

Andrew and I weren't entirely convinced: I don't think we found
him wrong, just didn't find time to think about it deeply enough,
suspicious of a fix in search of a problem, scared by the extent
of the first patch, put off by the usual host of __..._nolock
variants and micro-optimizations.  It is worth another look.

But setting aside PageUptodate futures...  "git blame" is handy,
and took me to the patch from Linus appended.  I think there's
as much need for that smp_wmb() now as there was then.  (But
am I really _thinking_?  No, just pointing you in directions.)

> ===
> 
>     This patch removes an unnecessary write barrier from clear_user_highpage().
>     
>     clear_user_highpage() is called from alloc_zeroed_user_highpage() on a
>     number of architectures and from clear_huge_page(). However, these callers
>     are already protected by the necessary memory barriers due to spinlocks

Be careful: as Linus indicates, spinlocks on x86 act as good barriers,
but on some architectures they guarantee no more than is strictly
necessary.  alpha, powerpc and ia64 spring to my mind as particularly
difficult ordering-wise, but I bet there are others too.

>     in the fault path and the page should not be visible on other CPUs anyway

The page may not be intentionally visible on another CPU yet.  But imagine
interesting stale data in the page being cleared, and another thread
peeking racily at unfaulted areas, hoping to catch sight of that data.

>     making the barrier unnecessary. A hint of lack of necessity is that there
>     does not appear to be a read barrier anywhere for this zeroed page.

Yes, I think Nick was similarly suspicious of a wmb without an rmb; but
Linus is _very_ barrier-savvy, so we might want to ask him about it (CC'ed).

>     
>     The sequence for the first use of alloc_zeroed_user_highpage()
>     looks like;
>     
>     pte_unmap_unlock()
>     alloc_zeroed_user_highpage()
>     pte_offset_map_lock()
>     
>     The second is
>     
>     pte_unmap()	(usually nothing but sometimes a barrier())
>     alloc_zeroed_user_highpage()
>     pte_offset_map_lock()
>     
>     The two sequences with the use of locking should already have sufficient
>     barriers.

To be honest, I've not thought about what you've written there:
assumed perhaps wrongly that my remarks above invalidate your logic.

>     
>     By removing this write barrier, IA64 could use the default implementation
>     of alloc_zeroed_user_highpage() instead of a custom version which appears
>     to do nothing but avoid calling smp_wmb(). Once that is done, there is
>     little reason to have architecture-specific alloc_zeroed_user_highpage()
>     helpers as it would be redundant.

Hmm, I'd expect IA64 to be one of the ones that really needs that smp_wmb()
anyway.

> 
> diff --git a/include/linux/highmem.h b/include/linux/highmem.h
> index 12c5e4e..ace5a32 100644
> --- a/include/linux/highmem.h
> +++ b/include/linux/highmem.h
> @@ -68,8 +68,6 @@ static inline void clear_user_highpage(struct page *page, unsigned long vaddr)
>  	void *addr = kmap_atomic(page, KM_USER0);
>  	clear_user_page(addr, vaddr, page);
>  	kunmap_atomic(addr, KM_USER0);
> -	/* Make sure this page is cleared on other CPU's too before using it */
> -	smp_wmb();
>  }
>  
>  #ifndef __HAVE_ARCH_ALLOC_ZEROED_USER_HIGHPAGE

commit 538ce05c0ef4055cf29a92a4abcdf139d180a0f9
Author: Linus Torvalds <torvalds@ppc970.osdl.org>
Date:   Wed Oct 13 21:00:06 2004 -0700

    Fix threaded user page write memory ordering

    Make sure we order the writes to a newly created page
    with the page table update that potentially exposes the
    page to another CPU.

    This is a no-op on any architecture where getting the
    page table spinlock will already do the ordering (notably
    x86), but other architectures can care.

diff --git a/include/linux/highmem.h b/include/linux/highmem.h
index 232d8fd..7153aef 100644
--- a/include/linux/highmem.h
+++ b/include/linux/highmem.h
@@ -40,6 +40,8 @@ static inline void clear_user_highpage(struct page *page, unsigned long vaddr)
 	void *addr = kmap_atomic(page, KM_USER0);
 	clear_user_page(addr, vaddr, page);
 	kunmap_atomic(addr, KM_USER0);
+	/* Make sure this page is cleared on other CPU's too before using it */
+	smp_wmb();
 }

 static inline void clear_highpage(struct page *page)
@@ -73,6 +75,8 @@ static inline void copy_user_highpage(struct page *to, struct page *from, unsign
 	copy_user_page(vto, vfrom, vaddr, to);
 	kunmap_atomic(vfrom, KM_USER0);
 	kunmap_atomic(vto, KM_USER1);
+	/* Make sure this page is cleared on other CPU's too before using it */
+	smp_wmb();
 }

 static inline void copy_highpage(struct page *to, struct page *from)


* Re: [PATCH] Remove unnecessary smp_wmb from clear_user_highpage()
  2007-07-18 16:45 ` Hugh Dickins
@ 2007-07-19  2:17   ` Nick Piggin
  2007-07-20 13:08     ` Mel Gorman
  2007-07-19  2:28   ` Linus Torvalds
                     ` (2 subsequent siblings)
  3 siblings, 1 reply; 12+ messages in thread
From: Nick Piggin @ 2007-07-19  2:17 UTC
  To: Hugh Dickins; +Cc: Mel Gorman, Linus Torvalds, linux-mm

On Wed, Jul 18, 2007 at 05:45:22PM +0100, Hugh Dickins wrote:
> On Wed, 18 Jul 2007, Mel Gorman wrote:
> > 
> > At the nudging of Andrew, I was checking to see if the architecture-specific
> > implementations of alloc_zeroed_user_highpage() can be removed or not.
> 
> Ah, so that was part of the deal for getting MOVABLE in, eh ;-?
> 
> > With the exception of barriers, the differences are negligible and the main
> > memory barrier is in clear_user_highpage(). However, it's unclear why it's
> > needed. Do you mind looking at the following patch and telling me if it's
> > wrong and if so, why?
> > 
> > Thanks a lot.
> 
> I laugh when someone approaches me with a question on barriers ;)
> I usually get confused and have to go ask someone else.
> 
> And I should really leave this query to Nick: he'll be glad of the
> opportunity to post his PageUptodate memorder patches again (looking
> in my mailbox I see versions from February, but I'm pretty sure he put
> out a more compact, less scary one later on).  He contends that the
> barrier in clear_user_highpage should not be there, but instead
> barriers (usually) needed when setting and testing PageUptodate.
> 
> Andrew and I weren't entirely convinced: I don't think we found
> him wrong, just didn't find time to think about it deeply enough,
> suspicious of a fix in search of a problem, scared by the extent
> of the first patch, put off by the usual host of __..._nolock
> variants and micro-optimizations.  It is worth another look.

Well, at least I probably won't have to debug the remaining problem --
the IBM guys will :)

 
> But setting aside PageUptodate futures...  "git blame" is handy,
> and took me to the patch from Linus appended.  I think there's
> as much need for that smp_wmb() now as there was then.  (But
> am I really _thinking_?  No, just pointing you in directions.)
> 
> > ===
> > 
> >     This patch removes an unnecessary write barrier from clear_user_highpage().
> >     
> >     clear_user_highpage() is called from alloc_zeroed_user_highpage() on a
> >     number of architectures and from clear_huge_page(). However, these callers
> >     are already protected by the necessary memory barriers due to spinlocks
> 
> Be careful: as Linus indicates, spinlocks on x86 act as good barriers,
> but on some architectures they guarantee no more than is strictly
> necessary.  alpha, powerpc and ia64 spring to my mind as particularly
> difficult ordering-wise, but I bet there are others too.

The problem cases here are those which don't provide an smp_mb() over
locks (eg. ones which only give acquire semantics). I think these are
only ia64 and powerpc. Of those, I think only powerpc implementations have
a really deep out of order memory system (at least on the store side)...
which is probably why they see and have to fix most of our barrier
problems :)


> >     in the fault path and the page should not be visible on other CPUs anyway
> 
> The page may not be intentionally visible on another CPU yet.  But imagine
> interesting stale data in the page being cleared, and another thread
> peeking racily at unfaulted areas, hoping to catch sight of that data.
> 
> >     making the barrier unnecessary. A hint of lack of necessity is that there
> >     does not appear to be a read barrier anywhere for this zeroed page.
> 
> Yes, I think Nick was similarly suspicious of a wmb without an rmb; but
> Linus is _very_ barrier-savvy, so we might want to ask him about it (CC'ed).

I was not so suspicious in the page fault case: there is a causal
ordering between loading the valid pte and dereferencing it to load
the page data. Potentially I think alpha is the only thing that
could have problems here, but a) if any implementations did hardware
TLB fills, they would have to do the rmb in microcode; and b) the
software path appears to use the regular fault handler, so it would
be subject to synchronisation via ptl. But maybe they are unsafe...

What I am worried about is exactly the same race at the read(2)/write(2)
level where there is _no_ spinlock synchronisation, and no wmb, let
alone a rmb :)


* Re: [PATCH] Remove unnecessary smp_wmb from clear_user_highpage()
  2007-07-19  2:17   ` Nick Piggin
@ 2007-07-20 13:08     ` Mel Gorman
  2007-07-23  2:02       ` Nick Piggin
  0 siblings, 1 reply; 12+ messages in thread
From: Mel Gorman @ 2007-07-20 13:08 UTC
  To: Nick Piggin; +Cc: Hugh Dickins, Linus Torvalds, linux-mm

On (19/07/07 04:17), Nick Piggin didst pronounce:
> On Wed, Jul 18, 2007 at 05:45:22PM +0100, Hugh Dickins wrote:
> > On Wed, 18 Jul 2007, Mel Gorman wrote:
> > > 
> > > At the nudging of Andrew, I was checking to see if the architecture-specific
> > > implementations of alloc_zeroed_user_highpage() can be removed or not.
> > 
> > Ah, so that was part of the deal for getting MOVABLE in, eh ;-?
> > 
> > > With the exception of barriers, the differences are negligible and the main
> > > memory barrier is in clear_user_highpage(). However, it's unclear why it's
> > > needed. Do you mind looking at the following patch and telling me if it's
> > > wrong and if so, why?
> > > 
> > > Thanks a lot.
> > 
> > I laugh when someone approaches me with a question on barriers ;)
> > I usually get confused and have to go ask someone else.
> > 
> > And I should really leave this query to Nick: he'll be glad of the
> > opportunity to post his PageUptodate memorder patches again (looking
> > in my mailbox I see versions from February, but I'm pretty sure he put
> > out a more compact, less scary one later on).  He contends that the
> > barrier in clear_user_highpage should not be there, but instead
> > barriers (usually) needed when setting and testing PageUptodate.
> > 
> > Andrew and I weren't entirely convinced: I don't think we found
> > him wrong, just didn't find time to think about it deeply enough,
> > suspicious of a fix in search of a problem, scared by the extent
> > of the first patch, put off by the usual host of __..._nolock
> > variants and micro-optimizations.  It is worth another look.
> 
> Well, at least I probably won't have to debug the remaining problem --
> the IBM guys will :)
> 

I weep for joy. I'll go looking for a test case for this. It sounds like
something that we'll need anyway if this area is to be kicked at all.

> > But setting aside PageUptodate futures...  "git blame" is handy,
> > and took me to the patch from Linus appended.  I think there's
> > as much need for that smp_wmb() now as there was then.  (But
> > am I really _thinking_?  No, just pointing you in directions.)
> > 
> > > ===
> > > 
> > >     This patch removes an unnecessary write barrier from clear_user_highpage().
> > >     
> > >     clear_user_highpage() is called from alloc_zeroed_user_highpage() on a
> > >     number of architectures and from clear_huge_page(). However, these callers
> > >     are already protected by the necessary memory barriers due to spinlocks
> > 
> > Be careful: as Linus indicates, spinlocks on x86 act as good barriers,
> > but on some architectures they guarantee no more than is strictly
> > necessary.  alpha, powerpc and ia64 spring to my mind as particularly
> > difficult ordering-wise, but I bet there are others too.
> 
> The problem cases here are those which don't provide an smp_mb() over
> locks (eg. ones which only give acquire semantics). I think these only
> are ia64 and powerpc.

If IA64 has this sort of semantics, then its current behaviour is
buggy unless their call to flush_dcache_page() has a similar effect to
having a write barrier elsewhere. I'll ask them.

> Of those, I think only powerpc implementations have
> a really deep out of order memory system (at least on the store side)...
> which is probably why they see and have to fix most of our barrier
> problems :)
> 

Yeah, this could be more of the same.

> 
> > >     in the fault path and the page should not be visible on other CPUs anyway
> > 
> > The page may not be intentionally visible on another CPU yet.  But imagine
> > interesting stale data in the page being cleared, and another thread
> > peeking racily at unfaulted areas, hoping to catch sight of that data.
> > 
> > >     making the barrier unnecessary. A hint of lack of necessity is that there
> > >     does not appear to be a read barrier anywhere for this zeroed page.
> > 
> > Yes, I think Nick was similarly suspicious of a wmb without an rmb; but
> > Linus is _very_ barrier-savvy, so we might want to ask him about it (CC'ed).
> 
> I was not so suspicious in the page fault case: there is a causal
> ordering between loading the valid pte and dereferencing it to load
> the page data. Potentially I think alpha is the only thing that
> could have problems here, but a) if any implementations did hardware
> TLB fills, they would have to do the rmb in microcode; and b) the
> software path appears to use the regular fault handler, so it would
> be subject to synchronisation via ptl. But maybe they are unsafe...
> 

One way to find out. Minimally, I think the cleanup here, if it exists at
all, is to replace the arch-specific alloc_zeroed helpers with barrier and
no-barrier versions and have architectures specify when they do not require
the barrier, so the default behaviour is the safer choice. At least the
issue will be a bit clearer then to the next guy. IA64 will still be
different but maybe it can be brought in line with other arches' behaviour.

> What I am worried about is exactly the same race at the read(2)/write(2)
> level where there is _no_ spinlock synchronisation, and no wmb, let
> alone a rmb :)
> 

Where is the race in read/write that is affected by the behaviour of
clear_user_highpage()? Is it where a sparse file is mmap()ed and being read
at the same time or what?

-- 
Mel Gorman
Part-time Phd Student                          Linux Technology Center
University of Limerick                         IBM Dublin Software Lab


* Re: [PATCH] Remove unnecessary smp_wmb from clear_user_highpage()
  2007-07-20 13:08     ` Mel Gorman
@ 2007-07-23  2:02       ` Nick Piggin
  0 siblings, 0 replies; 12+ messages in thread
From: Nick Piggin @ 2007-07-23  2:02 UTC
  To: Mel Gorman; +Cc: Hugh Dickins, Linus Torvalds, linux-mm

On Fri, Jul 20, 2007 at 02:08:49PM +0100, Mel Gorman wrote:
> On (19/07/07 04:17), Nick Piggin didst pronounce:
> > On Wed, Jul 18, 2007 at 05:45:22PM +0100, Hugh Dickins wrote:
> > > 
> > > Andrew and I weren't entirely convinced: I don't think we found
> > > him wrong, just didn't find time to think about it deeply enough,
> > > suspicious of a fix in search of a problem, scared by the extent
> > > of the first patch, put off by the usual host of __..._nolock
> > > variants and micro-optimizations.  It is worth another look.
> > 
> > Well, at least I probably won't have to debug the remaining problem --
> > the IBM guys will :)
> > 
> 
> I weep for joy. I'll go looking for a test case for this. It sounds like
> something that we'll need anyway if this area is to be kicked at all.

Well I'd be happy to fix it right now, but nobody believes me! (I
might be wrong of course, but nobody has told me why).
Yes maybe a test case would help :)


> > > Be careful: as Linus indicates, spinlocks on x86 act as good barriers,
> > > but on some architectures they guarantee no more than is strictly
> > > necessary.  alpha, powerpc and ia64 spring to my mind as particularly
> > > difficult ordering-wise, but I bet there are others too.
> > 
> > The problem cases here are those which don't provide an smp_mb() over
> > locks (eg. ones which only give acquire semantics). I think these only
> > are ia64 and powerpc.
> 
> If IA64 has this sort of semantics, then its current behaviour is
> buggy unless their call to flush_dcache_page() has a similar effect to
> having a write barrier elsewhere. I'll ask them.
> 
> > Of those, I think only powerpc implementations have
> > a really deep out of order memory system (at least on the store side)...
> > which is probably why they see and have to fix most of our barrier
> > problems :)
> > 
> 
> Yeah, this could be more of the same.
> 
> > I was not so suspicious in the page fault case: there is a causal
> > ordering between loading the valid pte and dereferencing it to load
> > the page data. Potentially I think alpha is the only thing that
> > could have problems here, but a) if any implementations did hardware
> > TLB fills, they would have to do the rmb in microcode; and b) the
> > software path appears to use the regular fault handler, so it would
> > be subject to synchronisation via ptl. But maybe they are unsafe...
> > 
> 
> One way to find out. Minimally, I think the cleanup here if it exists at
> all is to replace the arch-specific alloc_zeroed helpers with barrier and
> no-barrier versions and have architectures specify when they do not require
> barrier to exist so the default behaviour is the safer choice. At least the
> issue will be a bit clearer then to the next guy. IA64 will still be
> different but maybe it can be brought in line with other arches behaviour.

I'd be inclined to unify them and put the barrier in SetPageUptodate
as in my patch. If architectures really can do out of order stores, then
they need it; if not then smp_wmb should be a noop.

We could argue to have a smp_wmb__before_spin_lock, but I'd really rather
do the sane thing first, and then introduce yet another barrier type
after it is proven to have a performance benefit.


> > What I am worried about is exactly the same race at the read(2)/write(2)
> > level where there is _no_ spinlock synchronisation, and no wmb, let
> > alone a rmb :)
> > 
> 
> Where is the race in read/write that is affected by the behaviour of
> clear_user_highpage()? Is it where a sparse file is mmap()ed and being read
> at the same time or what?

No not by the behaviour of clear_user_highpage, but the larger conceptual
problem that pages are being initialised, then made visible to the wider
VM (with SetPageUptodate or set_pte), without an smp_wmb between the stores
to initialise the page and the store to make it visible.

This clear_user_highpage thingy is just a subset of that.

And no, the read/write inconsistency is not just for sparse files: write(2)
writes have the same problem, and even non-sparse reads in some filesystems
(they don't all do DMA: think RAM backed filesystems, ecryptfs, and I think
pktcdvd, possibly NFS, and probably others)
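
For illustration only (this is not the PageUptodate patch itself): the
"initialise, then make visible" problem can be sketched in user-space C11.
struct demo_page and both demo_* helpers are hypothetical names invented for
this sketch. The release store stands in for an smp_wmb() before setting the
flag, and the acquire load stands in for an smp_rmb() after testing it. Since
the reader tests a flag rather than following a pointer, there is no address
dependency to lean on, which is why the read side needs its own ordering.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <string.h>

#define PAGE_BYTES 4096

/* Hypothetical page-cache page: payload plus an "uptodate" flag. */
struct demo_page {
	unsigned char data[PAGE_BYTES];
	atomic_bool uptodate;
};

/*
 * Writer: initialise the page, then make it visible.
 * The release store orders the memset() stores before the flag store,
 * which is the job an smp_wmb() inside SetPageUptodate would do.
 */
static void demo_fill_and_set_uptodate(struct demo_page *pg, unsigned char fill)
{
	memset(pg->data, fill, sizeof(pg->data));
	atomic_store_explicit(&pg->uptodate, true, memory_order_release);
}

/*
 * Reader: test the flag, then read the data.
 * The acquire load orders the flag test before the data loads,
 * which is the job an smp_rmb() after PageUptodate would do.
 * Returns false, leaving out untouched, if the page is not uptodate.
 */
static bool demo_read_if_uptodate(struct demo_page *pg,
				  unsigned char *out, size_t len)
{
	if (!atomic_load_explicit(&pg->uptodate, memory_order_acquire))
		return false;
	assert(len <= sizeof(pg->data));
	memcpy(out, pg->data, len);
	return true;
}
```

With relaxed ordering on either side, a weakly ordered CPU would be allowed
to show the reader the flag set but the old page contents.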


* Re: [PATCH] Remove unnecessary smp_wmb from clear_user_highpage()
  2007-07-18 16:45 ` Hugh Dickins
  2007-07-19  2:17   ` Nick Piggin
@ 2007-07-19  2:28   ` Linus Torvalds
  2007-07-19  2:58     ` Nick Piggin
  2007-07-19  2:36   ` Nick Piggin
  2007-07-19 11:16   ` Mel Gorman
  3 siblings, 1 reply; 12+ messages in thread
From: Linus Torvalds @ 2007-07-19  2:28 UTC
  To: Hugh Dickins; +Cc: Mel Gorman, npiggin, linux-mm

On Wed, 18 Jul 2007, Hugh Dickins wrote:
> 
> Be careful: as Linus indicates, spinlocks on x86 act as good barriers,
> but on some architectures they guarantee no more than is strictly
> necessary.  alpha, powerpc and ia64 spring to my mind as particularly
> difficult ordering-wise, but I bet there are others too.

A full lock/unlock *pair* should (as far as I know) always be equivalent 
to a full memory barrier. Why? Because, by definition, no reads or writes 
inside the locked region may escape outside it, and that in turn implies 
that no access _outside_ the locked region may escape to the other side of 
it. 

I think.

However, neither a "lock" nor an "unlock" on *its*own* is a barrier at 
all, at most they are semi-permeable barriers for some things, where 
different architectures can be differently semi-permeable.

So if you have both a lock and an unlock between two points, you don't 
need any extra barriers, but if you only have one or the other, you'd need 
to add barriers.

And yes, on x86, just the "lock" part ends up being a total barrier, but 
that's not necessarily true on other architectures.

(Interestingly, it shouldn't matter "which way" the lock/unlock pair is: 
if the unlock of a previous lock was first, and a lock of another lock 
comes second, the *combination* of those two operations should still be a 
total memory barrier on the CPU that executed that pair, afaik, and it 
would be a bug if a memory op could escape from one critical region to the 
other. So "lock + unlock" and "unlock + lock" should both be equivalent to 
memory barriers, I think, even if neither of lock and unlock on their own 
is one).

> >     making the barrier unnecessary. A hint of lack of necessity is that there
> >     does not appear to be a read barrier anywhere for this zeroed page.
> 
> Yes, I think Nick was similarly suspicious of a wmb without an rmb; but
> Linus is _very_ barrier-savvy, so we might want to ask him about it (CC'ed).

A smp_wmb() should in general always have a paired smp_rmb(), or it's 
pointless. A special case is when the wmb() is between the "data" and the 
"exposure" of that data (ie the pointer write that makes the data 
visible), in which case the other end doesn't need a smp_rmb(), but may 
well still need a "smp_read_barrier_depends()".
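
For illustration only, the data/exposure special case can be sketched in
user-space C11 rather than kernel code. The names page, pte_slot,
fault_in_zeroed_page() and count_stale_bytes() are made up for this sketch.
The release fence plays the role of smp_wmb(), and the memory_order_consume
load plays the role of smp_read_barrier_depends() (compilers commonly
implement consume as the stronger acquire).

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>
#include <string.h>

#define PAGE_BYTES 4096

/* Hypothetical stand-ins: one "page" and one "pte" slot that exposes it. */
static unsigned char page[PAGE_BYTES];
static _Atomic(unsigned char *) pte_slot;	/* NULL means "not mapped yet" */

/* Writer: clear the page, then publish it through the pointer. */
static void fault_in_zeroed_page(void)
{
	/* This sketch only ever installs into an empty slot. */
	assert(atomic_load_explicit(&pte_slot, memory_order_relaxed) == NULL);

	memset(page, 0xAA, sizeof(page));	/* pretend stale data is present */
	memset(page, 0, sizeof(page));		/* clear_user_page() analogue */

	/* smp_wmb() analogue: clearing stores are ordered before the publish. */
	atomic_thread_fence(memory_order_release);
	atomic_store_explicit(&pte_slot, page, memory_order_relaxed);
}

/*
 * Reader: follow the pointer without taking any lock.
 * Returns -1 if nothing is mapped, else the number of non-zero bytes seen.
 */
static int count_stale_bytes(void)
{
	/* Dependency-ordered load: the reads through p depend on p itself. */
	unsigned char *p = atomic_load_explicit(&pte_slot, memory_order_consume);
	int stale = 0;

	if (p == NULL)
		return -1;
	for (size_t i = 0; i < PAGE_BYTES; i++)
		stale += (p[i] != 0);
	return stale;
}
```

Without the fence, nothing in the writer stops the pointer store from
becoming visible before the zeroing stores on a weakly ordered machine, so
a lockless reader could count stale bytes.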

> >  	void *addr = kmap_atomic(page, KM_USER0);
> >  	clear_user_page(addr, vaddr, page);
> >  	kunmap_atomic(addr, KM_USER0);
> > -	/* Make sure this page is cleared on other CPU's too before using it */
> > -	smp_wmb();

I suspect that the smp_wmb() is probably a good idea, since the 
"kunmap_atomic()" is generally a no-op, and other CPU's may read the page 
through the page tables without any other serialization.

And in that case, the others only need the "smp_read_barrier_depends()", 
and the fact is, that's a no-op for pretty much everybody, and a TLB 
lookup *has* to have that even on alpha, because otherwise the race is 
simply unfixable.

But I did *not* look through the whole sequence, so who knows. If there is 
a full lock/unlock pair between the clear_user_highpage() and actually 
making it available in the page tables, the wmb wouldn't be needed.

			Linus


* Re: [PATCH] Remove unnecessary smp_wmb from clear_user_highpage()
  2007-07-19  2:28   ` Linus Torvalds
@ 2007-07-19  2:58     ` Nick Piggin
  0 siblings, 0 replies; 12+ messages in thread
From: Nick Piggin @ 2007-07-19  2:58 UTC
  To: Linus Torvalds; +Cc: Hugh Dickins, Mel Gorman, linux-mm

On Wed, Jul 18, 2007 at 07:28:26PM -0700, Linus Torvalds wrote:
> 
> 
> On Wed, 18 Jul 2007, Hugh Dickins wrote:
> 
> > >     making the barrier unnecessary. A hint of lack of necessity is that there
> > >     does not appear to be a read barrier anywhere for this zeroed page.
> > 
> > Yes, I think Nick was similarly suspicious of a wmb without an rmb; but
> > Linus is _very_ barrier-savvy, so we might want to ask him about it (CC'ed).
> 
> A smp_wmb() should in general always have a paired smp_rmb(), or it's 
> pointless. A special case is when the wmb() is between the "data" and the 
> "exposure" of that data (ie the pointer write that makes the data 
> visible), in which case the other end doesn't need a smp_rmb(), but may 
> well still need a "smp_read_barrier_depends()".

I think the core mm should be OK, because setting and getting ptes should
(AFAIKS) always take the ptl. arch code that does lockless pte lookups
(ppc64's find_linux_pte for example seems to), and hardware fills of course
need a causal ordering there. So if there was something like find_linux_pte
used to load the TLB on alpha without smp_read_barrier_depends, I think
that would be a bug.
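
As a rough user-space sketch of the locked case (C11, not kernel code): ptl,
pte, install_zeroed_page() and lookup_pte() below are hypothetical stand-ins,
and the spinlock is a toy built on atomic_flag with acquire on lock and
release on unlock. It assumes every reader takes the same lock, which is
exactly the property a lockless lookup does not have.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>
#include <string.h>

#define PAGE_BYTES 4096

static atomic_flag ptl = ATOMIC_FLAG_INIT;	/* stand-in page table lock */
static unsigned char page[PAGE_BYTES];
static unsigned char *pte;			/* only touched under ptl */

static void ptl_lock(void)
{
	/* Acquire: later accesses cannot move above taking the lock. */
	while (atomic_flag_test_and_set_explicit(&ptl, memory_order_acquire))
		;	/* spin */
}

static void ptl_unlock(void)
{
	/* Release: earlier accesses cannot move below dropping the lock. */
	atomic_flag_clear_explicit(&ptl, memory_order_release);
}

/* Writer: zero the page outside the lock, install it under the lock. */
static void install_zeroed_page(void)
{
	memset(page, 0, sizeof(page));
	ptl_lock();
	assert(pte == NULL);	/* this sketch installs exactly once */
	pte = page;
	ptl_unlock();		/* the zeroing is ordered before this release */
}

/* Reader: load the pte under the lock, then use the page. */
static unsigned char *lookup_pte(void)
{
	unsigned char *p;

	ptl_lock();
	p = pte;
	ptl_unlock();
	return p;
}
```

The zeroing stores may sink into the critical section, but they cannot pass
the unlock, and a reader only sees the new pte after acquiring the lock that
the writer released. A reader that skips ptl_lock() gets no such guarantee.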


> > >  	void *addr = kmap_atomic(page, KM_USER0);
> > >  	clear_user_page(addr, vaddr, page);
> > >  	kunmap_atomic(addr, KM_USER0);
> > > -	/* Make sure this page is cleared on other CPU's too before using it */
> > > -	smp_wmb();
> 
> I suspect that the smp_wmb() is probably a good idea, since the 
> "kunmap_atomic()" is generally a no-op, and other CPU's may read the page 
> through the page tables without any other serialization.
> 
> And in that case, the others only need the "smp_read_barrier_depends()", 
> and the fact is, that's a no-op for pretty much everybody, and a TLB 
> lookup *has* to have that even on alpha, because otherwise the race is 
> simply unfixable.
> 
> But I did *not* look through the whole sequence, so who knows. If there is 
> a full lock/unlock pair between the clear_user_highpage() and actually 
> making it available in the page tables, the wmb wouldn't be needed.

Pretty sure Paulus, Ben, or Anton ran into it, yes. Actually, from
memory they submitted a variant on that patch which you didn't like ;)


--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .

* Re: [PATCH] Remove unnecessary smp_wmb from clear_user_highpage()
  2007-07-18 16:45 ` Hugh Dickins
  2007-07-19  2:17   ` Nick Piggin
  2007-07-19  2:28   ` Linus Torvalds
@ 2007-07-19  2:36   ` Nick Piggin
  2007-07-19 11:16   ` Mel Gorman
  3 siblings, 0 replies; 12+ messages in thread
From: Nick Piggin @ 2007-07-19  2:36 UTC (permalink / raw)
  To: Hugh Dickins; +Cc: Mel Gorman, Linus Torvalds, linux-mm

On Wed, Jul 18, 2007 at 05:45:22PM +0100, Hugh Dickins wrote:
> On Wed, 18 Jul 2007, Mel Gorman wrote:
> > 
> > At the nudging of Andrew, I was checking to see if the architecture-specific
> > implementations of alloc_zeroed_user_highpage() can be removed or not.
> 
> Ah, so that was part of the deal for getting MOVABLE in, eh ;-?
> 
> > With the exception of barriers, the differences are negligible and the main
> > memory barrier is in clear_user_highpage(). However, it's unclear why it's
> > needed. Do you mind looking at the following patch and telling me if it's
> > wrong and if so, why?
> > 
> > Thanks a lot.
> 
> I laugh when someone approaches me with a question on barriers ;)
> I usually get confused and have to go ask someone else.
> 
> And I should really to leave this query to Nick: he'll be glad of the
> opportunity to post his PageUptodate memorder patches again (looking
> in my mailbox I see versions from February, but I'm pretty sure he put
> out a more compact, less scary one later on).  He contends that the
> barrier in clear_user_highpage should not be there, but instead
> barriers (usually) needed when setting and testing PageUptodate.

And btw. (I don't think you're confused, but the last sentence could
be misleading to readers)... I don't contend the barrier should not be
there in that it is _technically_ wrong... but logically the condition
we are interested in is whether the page is uptodate or not (the fact
that we only ever have uptodate pages in ptes *cough*, and the causal
dependency on *pte -> page means we don't bother setting or checking
PageUptodate for anonymous faults, but the logical condition we want
is that the page is uptodate).

So when I found that both ordering problems (fault and read(2)) could
be solved with PageUptodate, it just seems like a better place to
put it than in clear_user_highpage.
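
The idea of tying the ordering to the "uptodate" condition can be modelled
in userspace with C11 atomics. This is only a rough sketch and not the
PageUptodate patches themselves: struct model_page and the model_* helpers
are hypothetical names, and release/acquire ordering is used as a stand-in
for an smp_wmb() before setting the flag and an smp_rmb() after testing it.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <string.h>

/* Hypothetical model of a page with an "uptodate" flag; not struct page. */
struct model_page {
	unsigned char data[64];
	atomic_bool uptodate;
};

static void model_page_init(struct model_page *page)
{
	assert(page != NULL);
	memset(page->data, 0xAA, sizeof(page->data));	/* stale contents */
	atomic_init(&page->uptodate, false);
}

/* Writer: fill the page first, then mark it uptodate. */
static void model_set_uptodate(struct model_page *page, unsigned char fill)
{
	memset(page->data, fill, sizeof(page->data));
	/* Release ordering stands in for smp_wmb() before setting the flag. */
	atomic_store_explicit(&page->uptodate, true, memory_order_release);
}

/* Reader: test the flag first; the data is read only if the flag is set. */
static bool model_read_if_uptodate(struct model_page *page, unsigned char *out)
{
	/* Acquire ordering stands in for smp_rmb() after testing the flag. */
	if (!atomic_load_explicit(&page->uptodate, memory_order_acquire))
		return false;
	*out = page->data[0];
	return true;
}
```

With this shape the barrier pair sits with the condition that readers
actually test, which is the argument made above for preferring it over a
lone barrier in clear_user_highpage().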


* Re: [PATCH] Remove unnecessary smp_wmb from clear_user_highpage()
  2007-07-18 16:45 ` Hugh Dickins
                     ` (2 preceding siblings ...)
  2007-07-19  2:36   ` Nick Piggin
@ 2007-07-19 11:16   ` Mel Gorman
  3 siblings, 0 replies; 12+ messages in thread
From: Mel Gorman @ 2007-07-19 11:16 UTC (permalink / raw)
  To: Hugh Dickins; +Cc: Linus Torvalds, npiggin, linux-mm

On (18/07/07 17:45), Hugh Dickins didst pronounce:
> On Wed, 18 Jul 2007, Mel Gorman wrote:
> > 
> > At the nudging of Andrew, I was checking to see if the architecture-specific
> > implementations of alloc_zeroed_user_highpage() can be removed or not.
> 
> Ah, so that was part of the deal for getting MOVABLE in, eh ;-?
> 

heh, no. But I was touching off that area so I get to kick it while I'm
there. It's an interesting one.

> > With the exception of barriers, the differences are negligible and the main
> > memory barrier is in clear_user_highpage(). However, it's unclear why it's
> > needed. Do you mind looking at the following patch and telling me if it's
> > wrong and if so, why?
> > 
> > Thanks a lot.
> 
> I laugh when someone approaches me with a question on barriers ;)

I guess people live in hope :)

> I usually get confused and have to go ask someone else.
> 
> And I should really to leave this query to Nick: he'll be glad of the
> opportunity to post his PageUptodate memorder patches again (looking
> in my mailbox I see versions from February, but I'm pretty sure he put
> out a more compact, less scary one later on). 

Ok, I didn't look at these closely at the time. I'll take a closer look
when/if the patches make a re-appearance. As of now, it's looking like the
barrier is needed and removing it may result in really obscure bugs with
relation to threads running on different CPUs faulting the same region.

The core of the problem I'm getting from this thread is that with the locking
as-is, the set_pte() can appear to happen before the page was zeroed so many
readers/writers on different CPUs will see a different result if they are
looking up PTEs in a lockless fashion.

> He contends that the
> barrier in clear_user_highpage should not be there, but instead
> barriers (usually) needed when setting and testing PageUptodate.
> 
> Andrew and I weren't entirely convinced: I don't think we found
> him wrong, just didn't find time to think about it deeply enough,
> suspicious of a fix in search of a problem, scared by the extent
> of the first patch, put off by the usual host of __..._nolock
> variants and micro-optimizations.  It is worth another look.
> 

It is not easy to prove right or wrong. Building a test-case is not
particularly easy either.

> But setting aside PageUptodate futures...  "git blame" is handy,
> and took me to the patch from Linus appended.  I think there's
> as much need for that smp_wmb() now as there was then.  (But
> am I really _thinking_?  No, just pointing you in directions.)
> 

Good tip. For those watching, finding this commit
via git-blame needs the historical 2.6 git tree at
git://git.kernel.org/pub/scm/linux/kernel/git/tglx/history.git .

> > ===
> > 
> >     This patch removes an unnecessary write barrier from clear_user_highpage().
> >     
> >     clear_user_highpage() is called from alloc_zeroed_user_highpage() on a
> >     number of architectures and from clear_huge_page(). However, these callers
> >     are already protected by the necessary memory barriers due to spinlocks
> 
> Be careful: as Linus indicates, spinlocks on x86 act as good barriers,
> but on some architectures they guarantee no more than is strictly
> necessary.  alpha, powerpc and ia64 spring to my mind as particularly
> difficult ordering-wise, but I bet there are others too.
> 

There was a good reminder of the rules here and it's a bit clearer why
it's possible for the page clear to apparently happen after the set_pte.

> >     in the fault path and the page should not be visible on other CPUs anyway
> 
> The page may not be intentionally visible on another CPU yet.  But imagine
> interesting stale data in the page being cleared, and another thread
> peeking racily at unfaulted areas, hoping to catch sight of that data.
> 

I'm going to attempt to construct a test case to see if it's possible to
reproduce without that barrier in place. I'll contact the PowerPC people
to know if they've done this already.

> >     making the barrier unnecessary. A hint of lack of necessity is that there
> >     does not appear to be a read barrier anywhere for this zeroed page.
> 
> Yes, I think Nick was similarly suspicious of a wmb without an rmb; but
> Linus is _very_ barrier-savvy, so we might want to ask him about it (CC'ed).
> 

Thanks

> >     
> >     The sequence for the first use of alloc_zeroed_user_highpage()
> >     looks like;
> >     
> >     pte_unmap_unlock()
> >     alloc_zeroed_user_highpage()
> >     pte_offset_map_lock()
> >     
> >     The second is
> >     
> >     pte_unmap()	(usually nothing but sometimes a barrier()
> >     alloc_zeroed_user_highpage()
> >     pte_offset_map_lock()
> >     
> >     The two sequences with the use of locking should already have sufficient
> >     barriers.
> 
> To be honest, I've not thought about what you've written there:
> assumed perhaps wrongly that my remarks above invalidate your logic.
> 

Yeah, my logic is invalidated to the extent that removing this barrier
is almost certainly wrong but very difficult to reproduce.

> >     
> >     By removing this write barrier, IA64 could use the default implementation
> >     of alloc_zeroed_user_highpage() instead of a custom version which appears
> >     to do nothing but avoid calling smp_wmb(). Once that is done, there is
> >     little reason to have architecture-specific alloc_zeroed_user_highpage()
> >     helpers as it would be redundant.
> 
> Hmm, I'd expect IA64 to be one of the ones that really needs that smp_wmb()
> anyway.
> 

I'll have to check. They avoid the memory barrier at the moment so we
might as well check that it's being done on purpose.

> > 
> > diff --git a/include/linux/highmem.h b/include/linux/highmem.h
> > index 12c5e4e..ace5a32 100644
> > --- a/include/linux/highmem.h
> > +++ b/include/linux/highmem.h
> > @@ -68,8 +68,6 @@ static inline void clear_user_highpage(struct page *page, unsigned long vaddr)
> >  	void *addr = kmap_atomic(page, KM_USER0);
> >  	clear_user_page(addr, vaddr, page);
> >  	kunmap_atomic(addr, KM_USER0);
> > -	/* Make sure this page is cleared on other CPU's too before using it */
> > -	smp_wmb();
> >  }
> >  
> >  #ifndef __HAVE_ARCH_ALLOC_ZEROED_USER_HIGHPAGE
> 
> commit 538ce05c0ef4055cf29a92a4abcdf139d180a0f9
> Author: Linus Torvalds <torvalds@ppc970.osdl.org>
> Date:   Wed Oct 13 21:00:06 2004 -0700
> 
>     Fix threaded user page write memory ordering
>     
>     Make sure we order the writes to a newly created page
>     with the page table update that potentially exposes the
>     page to another CPU.
>     
>     This is a no-op on any architecture where getting the
>     page table spinlock will already do the ordering (notably
>     x86), but other architectures can care.
> 
> diff --git a/include/linux/highmem.h b/include/linux/highmem.h
> index 232d8fd..7153aef 100644
> --- a/include/linux/highmem.h
> +++ b/include/linux/highmem.h
> @@ -40,6 +40,8 @@ static inline void clear_user_highpage(struct page *page, unsigned long vaddr)
>  	void *addr = kmap_atomic(page, KM_USER0);
>  	clear_user_page(addr, vaddr, page);
>  	kunmap_atomic(addr, KM_USER0);
> +	/* Make sure this page is cleared on other CPU's too before using it */
> +	smp_wmb();
>  }
>  
>  static inline void clear_highpage(struct page *page)
> @@ -73,6 +75,8 @@ static inline void copy_user_highpage(struct page *to, struct page *from, unsign
>  	copy_user_page(vto, vfrom, vaddr, to);
>  	kunmap_atomic(vfrom, KM_USER0);
>  	kunmap_atomic(vto, KM_USER1);
> +	/* Make sure this page is cleared on other CPU's too before using it */
> +	smp_wmb();
>  }
>  
>  static inline void copy_highpage(struct page *to, struct page *from)

-- 
-- 
Mel Gorman
Part-time Phd Student                          Linux Technology Center
University of Limerick                         IBM Dublin Software Lab


* Re: [PATCH] Remove unnecessary smp_wmb from clear_user_highpage()
  2007-07-18 15:05 [PATCH] Remove unnecessary smp_wmb from clear_user_highpage() Mel Gorman
  2007-07-18 16:45 ` Hugh Dickins
@ 2007-07-19  1:57 ` Nick Piggin
  1 sibling, 0 replies; 12+ messages in thread
From: Nick Piggin @ 2007-07-19  1:57 UTC (permalink / raw)
  To: Mel Gorman; +Cc: hugh, linux-mm, Linus Torvalds

On Wed, Jul 18, 2007 at 04:05:14PM +0100, Mel Gorman wrote:
> Hi,
> 
> At the nudging of Andrew, I was checking to see if the architecture-specific
> implementations of alloc_zeroed_user_highpage() can be removed or not.
> With the exception of barriers, the differences are negligible and the main
> memory barrier is in clear_user_highpage(). However, it's unclear why it's
> needed. Do you mind looking at the following patch and telling me if it's
> wrong and if so, why?
> 
> Thanks a lot.
> 
> ===
> 
>     This patch removes an unnecessary write barrier from clear_user_highpage().
>     
>     clear_user_highpage() is called from alloc_zeroed_user_highpage() on a
>     number of architectures and from clear_huge_page(). However, these callers
>     are already protected by the necessary memory barriers due to spinlocks
>     in the fault path and the page should not be visible on other CPUs anyway
>     making the barrier unnecessary. A hint of lack of necessity is that there
>     does not appear to be a read barrier anywhere for this zeroed page.
>     
>     The sequence for the first use of alloc_zeroed_user_highpage()
>     looks like;
>     
>     pte_unmap_unlock()
>     alloc_zeroed_user_highpage()

pte_offset_map_lock only provides acquire semantics. So stores from
alloc_zeroed_user_highpage can sit in store buffers and not hit the
cache coherency until...

>     pte_offset_map_lock()

      set_pte()

... here. By which time the store from the set_pte may already be
in cache (and it is likely -- a previous fault to an adjacent pte
will probably have brought the line in).

So then along comes another CPU and fills the TLB with the now visible
pte and returns to let userspace play with uninitialized memory (and it
doesn't matter how many memory synchronisation ops this guy does, because
the problem stores are sitting in the first CPU).

I'm pretty sure this was actually observed in powerpc CPUs (what fun to
debug).
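
The sequence above can be modelled in userspace with C11 atomics. This is
only a sketch under stated assumptions, not kernel code: struct fake_page,
fake_pte, publish_zeroed_page() and lookup_page() are hypothetical
stand-ins for the page, the pte, the fault path and a lockless lookup. The
release fence plays the role of the smp_wmb() (it is somewhat stronger,
since it also orders earlier loads).

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>
#include <string.h>

#define PAGE_BYTES 4096

/* Hypothetical stand-in for the page being zeroed. */
struct fake_page {
	unsigned char data[PAGE_BYTES];
};

/* Hypothetical stand-in for the pte that exposes the page to other CPUs. */
static _Atomic(struct fake_page *) fake_pte;

/* Faulting CPU: zero the page, then publish it. */
static void publish_zeroed_page(struct fake_page *page)
{
	assert(page != NULL);
	memset(page->data, 0, sizeof(page->data));	/* clear_user_page() */
	/*
	 * Without this fence the zeroing stores may become visible to other
	 * CPUs after the pointer store below.
	 */
	atomic_thread_fence(memory_order_release);	/* role of smp_wmb() */
	atomic_store_explicit(&fake_pte, page, memory_order_relaxed); /* set_pte() */
}

/*
 * Other CPU: lockless lookup, as a TLB fill would do. The kernel reader
 * relies only on dependency ordering; acquire is used here because it is
 * at least as strong.
 */
static struct fake_page *lookup_page(void)
{
	return atomic_load_explicit(&fake_pte, memory_order_acquire);
}
```

Note that nothing the reader does can repair a missing fence on the writer
side, which matches the point above that the problem stores are sitting in
the first CPU.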

>     
>     The second is
>     
>     pte_unmap()	(usually nothing but sometimes a barrier()
>     alloc_zeroed_user_highpage()
>     pte_offset_map_lock()
>     
>     The two sequences with the use of locking should already have sufficient
>     barriers.
>     
>     By removing this write barrier, IA64 could use the default implementation
>     of alloc_zeroed_user_highpage() instead of a custom version which appears
>     to do nothing but avoid calling smp_wmb(). Once that is done, there is
>     little reason to have architecture-specific alloc_zeroed_user_highpage()
>     helpers as it would be redundant.

I'd say that either ia64 didn't realise they need the wmb, or did
realise they didn't need it :) Either way you have to talk them into
adding an smp_wmb here, rather than us removing it, if you just want
a single version.


> 
> diff --git a/include/linux/highmem.h b/include/linux/highmem.h
> index 12c5e4e..ace5a32 100644
> --- a/include/linux/highmem.h
> +++ b/include/linux/highmem.h
> @@ -68,8 +68,6 @@ static inline void clear_user_highpage(struct page *page, unsigned long vaddr)
>  	void *addr = kmap_atomic(page, KM_USER0);
>  	clear_user_page(addr, vaddr, page);
>  	kunmap_atomic(addr, KM_USER0);
> -	/* Make sure this page is cleared on other CPU's too before using it */
> -	smp_wmb();
>  }
>  
>  #ifndef __HAVE_ARCH_ALLOC_ZEROED_USER_HIGHPAGE
> -- 
> -- 
> Mel Gorman
> Part-time Phd Student                          Linux Technology Center
> University of Limerick                         IBM Dublin Software Lab


* Re: [PATCH] Remove unnecessary smp_wmb from clear_user_highpage()
@ 2007-07-20 21:06 Oleg Nesterov
  2007-07-20 21:57 ` Linus Torvalds
  0 siblings, 1 reply; 12+ messages in thread
From: Oleg Nesterov @ 2007-07-20 21:06 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: Hugh Dickins, Mel Gorman, Nick Piggin, linux-mm

(Off-topic)

Linus Torvalds wrote:
>
> A full lock/unlock *pair* should (as far as I know) always be equivalent 
> to a full memory barrier.

Is it so? I am not arguing, I am trying to understand.

> Because, by definition, no reads or writes 
> inside the locked region may escape outside it, and that in turn implies 
> that no access _outside_ the locked region may escape to the other side of 
> it.

This means that unlock + lock is a full barrier,

> However, neither a "lock" nor an "unlock" on *its*own* is a barrier at 
> all, at most they are semi-permeable barriers for some things, where 
> different architectures can be differently semi-permeable.

and this means that lock + unlock is not.

	A;
	lock();
	unlock();
	B;

If both A and B can leak into the critical section, they could be reordered
inside this section, so we can have

	lock();
	B;
	A;
	unlock();

Yes?

Oleg.


* Re: [PATCH] Remove unnecessary smp_wmb from clear_user_highpage()
  2007-07-20 21:06 Oleg Nesterov
@ 2007-07-20 21:57 ` Linus Torvalds
  0 siblings, 0 replies; 12+ messages in thread
From: Linus Torvalds @ 2007-07-20 21:57 UTC (permalink / raw)
  To: Oleg Nesterov; +Cc: Hugh Dickins, Mel Gorman, Nick Piggin, linux-mm


On Sat, 21 Jul 2007, Oleg Nesterov wrote:
> 
> Linus Torvalds wrote:
> >
> > A full lock/unlock *pair* should (as far as I know) always be equivalent 
> > to a full memory barrier.
> 
> Is it so? I am not arguing, I am trying to understand.

Yeah, no, I think you're right, and I'm wrong.

I think unlock+lock is a complete barrier, but lock+unlock isn't. Funny.

> This means that unlock + lock is a full barrier,

Indeed. If nothing else, because on the same lock it obviously had better 
be (you have two critical regions, and the whole *point* of the lock is to 
keep them clear of each others).

> > However, neither a "lock" nor an "unlock" on *its*own* is a barrier at 
> > all, at most they are semi-permeable barriers for some things, where 
> > different architectures can be differently semi-permeable.
> 
> and this means that lock + unlock is not.
> 
> 	A;
> 	lock();
> 	unlock();
> 	B;
> 
> If both A and B can leak into the critical section, they could be reordered
> inside this section, so we can have
> 
> 	lock();
> 	B;
> 	A;
> 	unlock();
> 
> Yes?

Yes, I think you're right.

		Linus


end of thread, other threads:[~2007-07-23  2:02 UTC | newest]

Thread overview: 12+ messages
2007-07-18 15:05 [PATCH] Remove unnecessary smp_wmb from clear_user_highpage() Mel Gorman
2007-07-18 16:45 ` Hugh Dickins
2007-07-19  2:17   ` Nick Piggin
2007-07-20 13:08     ` Mel Gorman
2007-07-23  2:02       ` Nick Piggin
2007-07-19  2:28   ` Linus Torvalds
2007-07-19  2:58     ` Nick Piggin
2007-07-19  2:36   ` Nick Piggin
2007-07-19 11:16   ` Mel Gorman
2007-07-19  1:57 ` Nick Piggin
2007-07-20 21:06 Oleg Nesterov
2007-07-20 21:57 ` Linus Torvalds
