Re: x86 ptep_get_and_clear question

linux-mm.kvack.org archive mirror
 help / color / mirror / Atom feed

From: Jamie Lokier <lk@tantalophile.demon.co.uk>
To: Kanoj Sarcar <kanoj@google.engr.sgi.com>
Cc: Ben LaHaise <bcrl@redhat.com>,
	linux-mm@kvack.org, mingo@redhat.com, alan@redhat.com,
	linux-kernel@vger.kernel.org
Subject: Re: x86 ptep_get_and_clear question
Date: Thu, 15 Feb 2001 18:47:29 +0100	[thread overview]
Message-ID: <20010215184729.A2247@pcep-jamie.cern.ch> (raw)
In-Reply-To: <200102151723.JAA43255@google.engr.sgi.com>; from kanoj@google.engr.sgi.com on Thu, Feb 15, 2001 at 09:23:42AM -0800

[Added Linus and linux-kernel as I think it's of general interest]

Kanoj Sarcar wrote:
> Whether Jamie was trying to illustrate a different problem, I am not
> sure.

Yes, I was talking about pte_test_and_clear_dirty in the earlier post.

> Look in mm/mprotect.c. Look at the call sequence change_protection() -> ...
> change_pte_range(). Specifically at the sequence:
> 
> 	entry = ptep_get_and_clear(pte);
> 	set_pte(pte, pte_modify(entry, newprot));
>
> Go ahead and pull your x86 specs, and prove to me that between the 
> ptep_get_and_clear(), which zeroes out the pte (specifically, when the 
> dirty bit is not set), processor 2 can not come in and set the dirty 
> bit on the in-memory pte. Which immediately gets overwritten by the 
> set_pte(). For an example of how this can happen, look at my previous 
> postings.

Let's see.  We'll assume processor 2 does a write between the
ptep_get_and_clear and the set_pte, which are done on processor 1.

Now, ptep_get_and_clear is atomic, so we can talk about "before" and
"after".  Before it, either processor 2 has a TLB entry with the dirty
bit set, or it does not (it has either a clean TLB entry or no TLB entry
at all).

After ptep_get_and_clear, processor 2 does a write.  If it already has a
dirty TLB entry, then `entry' will also be dirty so the dirty bit is
preserved.  If processor 2 does not have a dirty TLB entry, then it will
look up the pte.  Processor 2 finds the pte is clear, so raises a page fault.
Spinlocks etc. sort everything out in the page fault.

Here's the important part: when processor 2 wants to set the pte's dirty
bit, it *rereads* the pte and *rechecks* the permission bits again.
Even though it has a non-dirty TLB entry for that pte.

That is how I read Ben LaHaise's description, and his test program tests
exactly this.

If the processor worked by atomically setting the dirty bit in the pte
without rechecking the permissions when it reads that pte bit, then this
scheme would fail and you'd be right about the lost dirty bits.  I would
have thought it would be simpler to implement a CPU this way, but
clearly it is not as efficient for SMP OS design so perhaps CPU
designers thought about this.

The only remaining question is: is the observed behaviour defined for
x86 CPUs in general, or are we depending on the results of testing a few
particular CPUs?

-- Jamie
--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux.eu.org/Linux-MM/

next prev parent reply	other threads:[~2001-02-15 17:47 UTC|newest]

Thread overview: 21+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2001-02-15  1:50 Kanoj Sarcar
2001-02-15  2:13 ` Ben LaHaise
2001-02-15  2:37   ` Kanoj Sarcar
2001-02-15 10:55   ` Jamie Lokier
2001-02-15 16:06     ` Ben LaHaise
2001-02-15 16:35       ` Jamie Lokier
2001-02-15 17:23         ` Kanoj Sarcar
2001-02-15 17:27           ` Ben LaHaise
2001-02-15 17:38             ` Kanoj Sarcar
2001-02-15 17:46               ` Ben LaHaise
2001-02-15 17:47           ` Jamie Lokier [this message]
2001-02-15 18:05             ` Kanoj Sarcar
2001-02-15 18:23             ` Kanoj Sarcar
2001-02-15 18:42               ` Jamie Lokier
2001-02-15 18:57                 ` Kanoj Sarcar
2001-02-15 19:06                   ` Ben LaHaise
2001-02-15 19:19                     ` Kanoj Sarcar
2001-02-15 18:51               ` Manfred Spraul
2001-02-15 19:05                 ` Kanoj Sarcar
2001-02-15 19:19                   ` Jamie Lokier
2001-02-15 19:07                 ` Jamie Lokier

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20010215184729.A2247@pcep-jamie.cern.ch \
    --to=lk@tantalophile.demon.co.uk \
    --cc=alan@redhat.com \
    --cc=bcrl@redhat.com \
    --cc=kanoj@google.engr.sgi.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-mm@kvack.org \
    --cc=mingo@redhat.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Be sure your reply has a Subject: header at the top and a blank line before the message body.

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox