From: Nick Piggin <npiggin@suse.de>
To: Mel Gorman <mel@skynet.ie>
Cc: hugh@veritas.com, linux-mm@kvack.org,
	Linus Torvalds <torvalds@linux-foundation.org>
Subject: Re: [PATCH] Remove unnecessary smp_wmb from clear_user_highpage()
Date: Thu, 19 Jul 2007 03:57:22 +0200
Message-ID: <20070719015722.GB23641@wotan.suse.de>
In-Reply-To: <20070718150514.GA21823@skynet.ie>

On Wed, Jul 18, 2007 at 04:05:14PM +0100, Mel Gorman wrote:
> Hi,
> 
> At the nudging of Andrew, I was checking to see if the architecture-specific
> implementations of alloc_zeroed_user_highpage() can be removed or not.
> With the exception of barriers, the differences are negligible and the main
> memory barrier is in clear_user_highpage(). However, it's unclear why it's
> needed. Do you mind looking at the following patch and telling me if it's
> wrong and if so, why?
> 
> Thanks a lot.
> 
> ===
> 
>     This patch removes an unnecessary write barrier from clear_user_highpage().
>     
>     clear_user_highpage() is called from alloc_zeroed_user_highpage() on a
>     number of architectures and from clear_huge_page(). However, these callers
>     are already protected by the necessary memory barriers due to spinlocks
>     in the fault path and the page should not be visible on other CPUs anyway
>     making the barrier unnecessary. A hint of lack of necessity is that there
>     does not appear to be a read barrier anywhere for this zeroed page.
>     
>     The sequence for the first use of alloc_zeroed_user_highpage()
>     looks like;
>     
>     pte_unmap_unlock()
>     alloc_zeroed_user_highpage()

pte_offset_map_lock only provides acquire semantics. So stores from
alloc_zeroed_user_highpage can sit in store buffers and not become
visible to other CPUs until...

>     pte_offset_map_lock()

      set_pte()

... here. By which time the store from the set_pte may already be
in cache (and it is likely -- a previous fault to an adjacent pte
will probably have brought the line in).

So then along comes another CPU and fills the TLB with the now visible
pte and returns to let userspace play with uninitialized memory (and it
doesn't matter how many memory synchronisation ops this guy does, because
the problem stores are sitting in the first CPU).

I'm pretty sure this was actually observed in powerpc CPUs (what fun to
debug).
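
As a rough userspace illustration of the ordering described above (not
kernel code): the sketch below uses C11 atomics and pthreads, with a
release fence standing in for smp_wmb() and an atomic pointer standing in
for the pte. The names page, pte, faulting_cpu, other_cpu and run_demo are
made up for this example, and the acquire load on the reader side is only
there to satisfy the C11 memory model; it is an assumption of the sketch,
not a claim about what a hardware TLB fill does.

```c
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>
#include <stddef.h>
#include <string.h>

#define PAGE_BYTES 4096

static unsigned char page[PAGE_BYTES];   /* stand-in for the new user page */
static _Atomic(unsigned char *) pte;     /* stand-in for the pte; NULL = not present */

/* First CPU: clear the page, order the clearing stores, then publish. */
static void *faulting_cpu(void *unused)
{
	(void)unused;
	memset(page, 0, sizeof(page));              /* clear_user_page() analogue */
	atomic_thread_fence(memory_order_release);  /* smp_wmb() analogue */
	atomic_store_explicit(&pte, page, memory_order_relaxed); /* set_pte() analogue */
	return NULL;
}

/* Second CPU: wait until the "pte" is visible, then read through it. */
static void *other_cpu(void *arg)
{
	size_t *stale = arg;
	unsigned char *p;
	size_t i;

	/* Acquire pairs with the release fence above in the C11 model. */
	while ((p = atomic_load_explicit(&pte, memory_order_acquire)) == NULL)
		;
	for (i = 0; i < PAGE_BYTES; i++)
		if (p[i] != 0)
			(*stale)++;             /* would be leaked old contents */
	return NULL;
}

/* Returns the number of non-zero bytes the second thread observed. */
size_t run_demo(void)
{
	pthread_t writer, reader;
	size_t stale = 0;
	int rc;

	memset(page, 0xAA, sizeof(page));           /* pretend the page holds old data */
	atomic_store(&pte, (unsigned char *)NULL);

	rc = pthread_create(&reader, NULL, other_cpu, &stale);
	assert(rc == 0);
	rc = pthread_create(&writer, NULL, faulting_cpu, NULL);
	assert(rc == 0);
	(void)rc;

	pthread_join(writer, NULL);
	pthread_join(reader, NULL);
	return stale;
}
```

With the fence in place the reader is guaranteed to see only zeroes once it
sees the published pointer. Dropping the fence would make the sketch a data
race in C11 terms, which is the userspace counterpart of the pte becoming
visible before the page-clearing stores.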

>     
>     The second is
>     
>     pte_unmap()	(usually nothing but sometimes a barrier())
>     alloc_zeroed_user_highpage()
>     pte_offset_map_lock()
>     
>     The two sequences with the use of locking should already have sufficient
>     barriers.
>     
>     By removing this write barrier, IA64 could use the default implementation
>     of alloc_zeroed_user_highpage() instead of a custom version which appears
>     to do nothing but avoid calling smp_wmb(). Once that is done, there is
>     little reason to have architecture-specific alloc_zeroed_user_highpage()
>     helpers as it would be redundant.

I'd say that either ia64 didn't realise they need the wmb, or did
realise they didn't need it :) Either way you have to talk them into
adding an smp_wmb here, rather than us removing it, if you just want
a single version.


> 
> diff --git a/include/linux/highmem.h b/include/linux/highmem.h
> index 12c5e4e..ace5a32 100644
> --- a/include/linux/highmem.h
> +++ b/include/linux/highmem.h
> @@ -68,8 +68,6 @@ static inline void clear_user_highpage(struct page *page, unsigned long vaddr)
>  	void *addr = kmap_atomic(page, KM_USER0);
>  	clear_user_page(addr, vaddr, page);
>  	kunmap_atomic(addr, KM_USER0);
> -	/* Make sure this page is cleared on other CPU's too before using it */
> -	smp_wmb();
>  }
>  
>  #ifndef __HAVE_ARCH_ALLOC_ZEROED_USER_HIGHPAGE
> -- 
> -- 
> Mel Gorman
> Part-time Phd Student                          Linux Technology Center
> University of Limerick                         IBM Dublin Software Lab
