From: Linus Torvalds <torvalds@osdl.org>
To: Matthew Wilcox <willy@debian.org>
Cc: Andrea Arcangeli <andrea@suse.de>,
	Benjamin Herrenschmidt <benh@kernel.crashing.org>,
	Andrew Morton <akpm@osdl.org>,
	Linux Kernel list <linux-kernel@vger.kernel.org>,
	Ingo Molnar <mingo@elte.hu>, Ben LaHaise <bcrl@kvack.org>,
	linux-mm@kvack.org,
	Architectures Group <linux-arch@vger.kernel.org>
Subject: Re: [PATCH] ppc64: Fix possible race with set_pte on a present PTE
Date: Tue, 25 May 2004 07:48:24 -0700 (PDT)
Message-ID: <Pine.LNX.4.58.0405250726000.9951@ppc970.osdl.org>
In-Reply-To: <20040525114437.GC29154@parcelfarce.linux.theplanet.co.uk>


On Tue, 25 May 2004, Matthew Wilcox wrote:

> On Mon, May 24, 2004 at 09:00:02PM -0700, Linus Torvalds wrote:
> > I suspect we should just make a "ptep_set_bits()" inline function that 
> > _atomically_ does "set the dirty/accessed bits". On x86, it would be a 
> > simple
> > 
> > 		asm("lock ; orl %1,%0"
> > 			:"+m" (*ptep)
> > 			:"r" (entry));
> > 
> > and similarly on most other architectures it should be quite easy to do 
> > the equivalent. You can always do it with a simple compare-and-exchange 
> > loop, something any SMP-capable architecture should have.
> 
> ... but PA doesn't.  Just load-and-clear-word (and its 64-bit equivalent
> in 64-bit mode).  And that word has to be 16-byte aligned.

Wow. And this architecture claims to support SMP? 

> What race are we protecting against?  If it's like xchg() and we only
> need to protect against a racing xchg() and not a reader, we can just
> reuse the global array of hashed spinlocks we have for that.

The race is:
 - one CPU sets the dirty bit (possibly with a hardware walker, but I 
   guess on PA it's probably done in sw)
 - the other CPU sets the accessed bit in sw as part of the 
   "handle_pte_fault()" processing.

Right now we set the accessed bit with a simple "ptep_establish()", which 
will use "set_pte()", which is just a regular write. So setting the 
accessed bit will basically be a nonatomic sequence of

 - read pte entry
 - entry = pte_mkyoung(entry)
 - set_pte(entry)

which is all done under the mm->page_table_lock, but which does NOT 
protect against any hardware page-table walkers or any asynchronous sw 
walkers (if anybody does them).
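
The lost update described above can be modelled in user space. This is only a sketch under stated assumptions, not kernel code: the bit values, "model_pte_t" and both function names are invented for illustration, and the "hw_bits" argument stands in for a hardware walker that touches the entry between our read and our write.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical bit layout, chosen for illustration only. */
#define PTE_PRESENT  0x001u
#define PTE_ACCESSED 0x020u
#define PTE_DIRTY    0x040u

typedef _Atomic uint32_t model_pte_t;   /* stand-in for a real pte */

/* The non-atomic sequence: read, pte_mkyoung, set_pte.
 * The walker's update lands between the read and the write. */
static uint32_t racy_mkyoung(model_pte_t *ptep, uint32_t hw_bits)
{
    assert(ptep != NULL);
    uint32_t entry = atomic_load(ptep);   /* read pte entry */
    atomic_fetch_or(ptep, hw_bits);       /* simulated walker sets its bits */
    entry |= PTE_ACCESSED;                /* entry = pte_mkyoung(entry) */
    atomic_store(ptep, entry);            /* set_pte(entry): plain overwrite */
    return atomic_load(ptep);
}

/* Same interleaving, but the software update is an atomic OR,
 * so it cannot discard bits set by anybody else. */
static uint32_t atomic_mkyoung(model_pte_t *ptep, uint32_t hw_bits)
{
    assert(ptep != NULL);
    atomic_fetch_or(ptep, hw_bits);       /* simulated walker sets its bits */
    atomic_fetch_or(ptep, PTE_ACCESSED);  /* atomic "set the accessed bit" */
    return atomic_load(ptep);
}
```

Starting from a present entry and a walker that sets the dirty bit, racy_mkyoung() ends up with present+accessed and silently drops dirty, while atomic_mkyoung() keeps all three bits.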

Basically, the suggestion is to replace the "set_pte()" with something 
that is safe against anything else that updates the page tables (whether 
software or hardware). If only core kernel code does that, then you should 
already be fine, since the page-table spinlock should already be held by 
all updaters.
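
For architectures without an atomic OR, the compare-and-exchange loop mentioned in the quoted text would look roughly like the sketch below. It is written with C11 atomics in user space; "model_ptep_set_bits" and "model_pte_t" are hypothetical names, not an existing kernel interface, and an architecture with only load-and-clear would need a different scheme (for example the hashed spinlocks suggested above).

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>
#include <stdint.h>

typedef _Atomic uint32_t model_pte_t;   /* stand-in for a real pte */

/* Atomically OR 'bits' into *ptep using a compare-and-exchange loop.
 * Returns the value the entry held just before our update. */
static uint32_t model_ptep_set_bits(model_pte_t *ptep, uint32_t bits)
{
    assert(ptep != NULL);
    uint32_t old = atomic_load(ptep);
    /* On failure, 'old' is reloaded with the current value, so any
     * bits set concurrently by another updater are carried forward. */
    while (!atomic_compare_exchange_weak(ptep, &old, old | bits)) {
        /* retry with the refreshed 'old' */
    }
    return old;
}
```

The loop never writes a value that was computed from a stale read, which is the property the plain set_pte() sequence lacks.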

NOTE! One really easy approach would be to say that we never mix software 
updates of the accessed bit with hw updates, and just have a rule that if 
the architecture does accessed-bit updates in hardware (and can thus race 
with us doing them in software _despite_ the fact that we hold the page 
table lock), then we just don't do the update at all. 

We'd just pass in a flag to "ptep_establish()" to tell it whether we
changed the dirty bit or not. It would be "write_access" in
handle_pte_fault(), and 1 in the other two cases.
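
A rough sketch of that flag, again as a user-space model rather than real kernel code. The function name, the plain uint32_t entry and the "hw_updates_accessed" parameter are assumptions made for illustration; in a real tree the last one would presumably be a per-architecture compile-time choice.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Model of a ptep_establish() that takes a "dirty bit changed" flag.
 * Returns true if the entry was written, false if the update was skipped. */
static bool model_ptep_establish(uint32_t *ptep, uint32_t entry,
                                 bool dirty_changed, bool hw_updates_accessed)
{
    assert(ptep != NULL);
    /* If hardware maintains the accessed bit itself and the only thing
     * software would change is that bit, skip the write entirely so we
     * cannot race with the hardware walker. */
    if (hw_updates_accessed && !dirty_changed)
        return false;
    *ptep = entry;   /* stands in for set_pte(ptep, entry) */
    return true;
}
```

With this shape, handle_pte_fault() would pass its write_access value as "dirty_changed", and the other two callers would pass 1, as described above.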

> Ah, atomic writes we can do.  That's easy.  I think all Linux architectures
> support atomic writes to naturally aligned addresses, don't they?

Yes. You'd really have to work at it _not_ to support them ;)

However, the atomic write case only helps in the case when we update _all_ 
the bits that hw walkers can update, 

			Linus
