From: Robin Holt <holt@sgi.com>
To: Nick Piggin <nickpiggin@yahoo.com.au>
Cc: Robin Holt <holt@sgi.com>, Hugh Dickins <hugh@veritas.com>,
	Roland McGrath <roland@redhat.com>,
	linux-mm@kvack.org
Subject: Re: get_user_pages() with write=1 and force=1 gets read-only pages.
Date: Sun, 31 Jul 2005 06:30:59 -0500
Message-ID: <20050731113059.GC2254@lnx-holt.americas.sgi.com>
In-Reply-To: <42ECB0EC.4000808@yahoo.com.au>

On Sun, Jul 31, 2005 at 09:07:24PM +1000, Nick Piggin wrote:
> Robin Holt wrote:
> >Should there be a check to ensure we don't return VM_FAULT_RACE when the
> >pte which was inserted is exactly the same one we would have inserted?
> 
> That would slow down the do_xxx_fault fastpaths, though.
> 
> Considering VM_FAULT_RACE will only make any difference to get_user_pages
> (ie. not the page fault fastpath), and only then in rare cases of a racing
> fault on the same pte, I don't think the extra test would be worthwhile.
> 
> >Could we generalize that more to the point of only returning VM_FAULT_RACE
> >when write access was requested but the racing pte was not writable?
> >
> 
> I guess get_user_pages could be changed to retry on VM_FAULT_RACE only if
> it is attempting write access... is that worthwhile? I guess so...
> 
> >Most of the test cases I have thrown at this have gotten the writer
> >faulting first which did not result in problems.  I would hate to slow
> >things down if not necessary.  I am unaware of more issues than the one
> >I have been tripping.
> >
> 
> I think the VM_FAULT_RACE patch as-is should be fairly unintrusive to the
> page fault fastpaths. I think weighing down get_user_pages is preferable to
> putting logic in the general fault path - though I don't think there should
> be too much overhead introduced even there...
> 
> Do you think the patch (or at least, the idea) looks like a likely solution
> to your problem? Obviously the !i386 architecture specific parts still need
> to be filled in...

The patch works for me.

What I had in mind did not seem much heavier than what is already being
done.  A patch on top of yours is probably the best illustration:

Index: linux/mm/memory.c
===================================================================
--- linux.orig/mm/memory.c	2005-07-31 05:39:24.161826311 -0500
+++ linux/mm/memory.c	2005-07-31 06:26:33.687274327 -0500
@@ -1768,17 +1768,17 @@ do_anonymous_page(struct mm_struct *mm, 
 		spin_lock(&mm->page_table_lock);
 		page_table = pte_offset_map(pmd, addr);
 
+		entry = maybe_mkwrite(pte_mkdirty(mk_pte(page,
+						vma->vm_page_prot)), vma);
 		if (!pte_none(*page_table)) {
+			if (!pte_same(*page_table, entry))
+				ret = VM_FAULT_RACE;
 			pte_unmap(page_table);
 			page_cache_release(page);
 			spin_unlock(&mm->page_table_lock);
-			ret = VM_FAULT_RACE;
 			goto out;
 		}
 		inc_mm_counter(mm, rss);
-		entry = maybe_mkwrite(pte_mkdirty(mk_pte(page,
-							 vma->vm_page_prot)),
-				      vma);
 		lru_cache_add_active(page);
 		SetPageReferenced(page);
 		page_add_anon_rmap(page, vma, addr);
@@ -1879,6 +1879,10 @@ retry:
 	}
 	page_table = pte_offset_map(pmd, address);
 
+	entry = mk_pte(new_page, vma->vm_page_prot);
+	if (write_access)
+		entry = maybe_mkwrite(pte_mkdirty(entry), vma);
+
 	/*
 	 * This silly early PAGE_DIRTY setting removes a race
 	 * due to the bad i386 page protection. But it's valid
@@ -1895,9 +1899,6 @@ retry:
 			inc_mm_counter(mm, rss);
 
 		flush_icache_page(vma, new_page);
-		entry = mk_pte(new_page, vma->vm_page_prot);
-		if (write_access)
-			entry = maybe_mkwrite(pte_mkdirty(entry), vma);
 		set_pte_at(mm, address, page_table, entry);
 		if (anon) {
 			lru_cache_add_active(new_page);
@@ -1906,11 +1907,12 @@ retry:
 			page_add_file_rmap(new_page);
 		pte_unmap(page_table);
 	} else {
+		if (!pte_same(*page_table, entry))
+			ret = VM_FAULT_RACE;
 		/* One of our sibling threads was faster, back out. */
 		pte_unmap(page_table);
 		page_cache_release(new_page);
 		spin_unlock(&mm->page_table_lock);
-		ret = VM_FAULT_RACE;
 		goto out;
 	}



In both cases the pte has been read immediately beforehand for the
pte_none() test, so everything the processor needs for the access is
already in place and the second read should be extremely quick.  The
compiler may well eliminate the second load altogether, but I cannot
guarantee that.
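
To show the intended behaviour outside the kernel, here is a minimal
user-space sketch of the check.  Everything in it is hypothetical:
model_pte_t, the MODEL_* constants, model_mk_entry() and model_install()
are invented for illustration, the bit layout is made up, and a plain
integer comparison stands in for pte_same().  It is a model of the idea,
not the kernel code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-ins for the kernel's pte_t and fault return codes. */
typedef uint64_t model_pte_t;

#define MODEL_PTE_NONE		((model_pte_t)0)
#define MODEL_PTE_PRESENT	((model_pte_t)1 << 0)
#define MODEL_PTE_WRITE		((model_pte_t)1 << 1)
#define MODEL_PTE_DIRTY		((model_pte_t)1 << 6)

enum model_fault { MODEL_FAULT_MINOR = 0, MODEL_FAULT_RACE = 1 };

/* Build the entry the fault handler would like to install (made-up layout). */
static model_pte_t model_mk_entry(uint64_t pfn, bool write_access)
{
	model_pte_t entry = (pfn << 12) | MODEL_PTE_PRESENT;

	if (write_access)
		entry |= MODEL_PTE_WRITE | MODEL_PTE_DIRTY;
	return entry;
}

/*
 * Install 'entry' if the slot is empty.  If another thread got there
 * first, report a race only when its entry differs from ours.
 */
static enum model_fault model_install(model_pte_t *slot, model_pte_t entry)
{
	assert(entry != MODEL_PTE_NONE);

	if (*slot == MODEL_PTE_NONE) {
		*slot = entry;
		return MODEL_FAULT_MINOR;
	}
	/* Stand-in for !pte_same(*page_table, entry). */
	return (*slot == entry) ? MODEL_FAULT_MINOR : MODEL_FAULT_RACE;
}
```

With an empty slot the first call installs the entry.  A second call with
an identical entry is treated as harmless, while a write fault that finds
a read-only entry already in place reports the race, which is the only
case get_user_pages() would need to retry.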

What do you think?

Robin
