linux-mm.kvack.org archive mirror
 help / color / mirror / Atom feed
From: Andrew Morton <akpm@osdl.org>
To: Hugh Dickins <hugh@veritas.com>
Cc: a.p.zijlstra@chello.nl, torvalds@osdl.org, dhowells@redhat.com,
	linux-mm@kvack.org
Subject: Re: tracking dirty pages patches
Date: Mon, 22 May 2006 13:29:05 -0700	[thread overview]
Message-ID: <20060522132905.6e1a711c.akpm@osdl.org> (raw)
In-Reply-To: <Pine.LNX.4.64.0605222022100.11067@blonde.wat.veritas.com>

Hugh Dickins <hugh@veritas.com> wrote:
>
> Belated observations on your "tracking dirty pages" patches.

Thanks, Hugh.

> page_wrprotect is a nice use of rmap, but I see a couple of problems.
> One is in the lock ordering (there's info on mm lock ordering at the
> top of filemap.c, but I find the list at the top of rmap.c easier).
> 
> set_page_dirty has always (awkwardly) been liable to be called from
> very low in the hierarchy; whereas you're assuming clear_page_dirty
> is called from much higher up.  And in most cases there's no problem
> (please cross-check to confirm that); but try_to_free_buffers in fs/
> buffer.c calls it while holding mapping->private_lock - page_wrprotect
> called from test_clear_page_dirty then violates the order.
> 
> If we're lucky and that is indeed the only violation, maybe Andrew
> can recommend a change to try_to_free_buffers to avoid it: I have
> no appreciation of the issues at that end myself.

I had troubles with that as well - tree_lock is a very "inner" lock.  So I
moved test_clear_page_dirty()'s call to page_wrprotect() to be outside
tree_lock.

But I don't think you were referring to that - I am unable to evaluate your
expression "the order".

The running of page_wrprotect_file() inside private_lock is a worry, yes. 
We can move the clear_page_dirty() call in try_to_free_buffers() to be
outside private_lock.

But I don't know which particular ranking violation you've identified.

> ...
>
> (Why does follow_pages set_page_dirty at all?  I _think_ it's in case
> the get_user_pages caller forgets to set_page_dirty when releasing.
> But that's not how we usually write kernel code, to hide mistakes most
> of the time,

Yes, that would be bad.

> and your mods may change the balance there.  Andrew will
> remember better whether that set_page_dirty has stronger justification.)

It was added by the below, which nobody was terribly happy with at the
time.  (Took me 5-10 minutes to hunt this down.  Insert rote comment about
comments).




Date: Mon, 19 Jan 2004 18:43:46 +0000
From: Linux Kernel Mailing List <linux-kernel@vger.kernel.org>
To: bk-commits-head@vger.kernel.org
Subject: [PATCH] s390: endless loop in follow_page.


ChangeSet 1.1490.3.215, 2004/01/19 10:43:46-08:00, akpm@osdl.org

	[PATCH] s390: endless loop in follow_page.
	
	From: Martin Schwidefsky <schwidefsky@de.ibm.com>
	
	Fix endless loop in get_user_pages() on s390.  It happens only on s/390
	because pte_dirty always returns 0.  For all other architectures this is an
	optimization.
	
	In the case of "write && !pte_dirty(pte)" follow_page() returns NULL.  On all
	architectures except s390 handle_pte_fault() will then create a pte with
	pte_dirty(pte)==1 because write_access==1.  In the following, second call to
	follow_page() all is fine.  With the physical dirty bit patch pte_dirty() is
	always 0 for s/390 because the dirty bit doesn't live in the pte.


# This patch includes the following deltas:
#	           ChangeSet	1.1490.3.214 -> 1.1490.3.215
#	         mm/memory.c	1.145   -> 1.146  
#

 memory.c |   21 +++++++++++++--------
 1 files changed, 13 insertions(+), 8 deletions(-)


diff -Nru a/mm/memory.c b/mm/memory.c
--- a/mm/memory.c	Mon Jan 19 15:47:24 2004
+++ b/mm/memory.c	Mon Jan 19 15:47:24 2004
@@ -651,14 +651,19 @@
 	pte = *ptep;
 	pte_unmap(ptep);
 	if (pte_present(pte)) {
-		if (!write || (pte_write(pte) && pte_dirty(pte))) {
-			pfn = pte_pfn(pte);
-			if (pfn_valid(pfn)) {
-				struct page *page = pfn_to_page(pfn);
-
-				mark_page_accessed(page);
-				return page;
-			}
+		if (write && !pte_write(pte))
+			goto out;
+		if (write && !pte_dirty(pte)) {
+			struct page *page = pte_page(pte);
+			if (!PageDirty(page))
+				set_page_dirty(page);
+		}
+		pfn = pte_pfn(pte);
+		if (pfn_valid(pfn)) {
+			struct page *page = pfn_to_page(pfn);
+			
+			mark_page_accessed(page);
+			return page;
 		}
 	}
 
-
To unsubscribe from this list: send the line "unsubscribe bk-commits-head" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

  reply	other threads:[~2006-05-22 20:29 UTC|newest]

Thread overview: 20+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2006-05-22 19:31 Hugh Dickins
2006-05-22 20:29 ` Andrew Morton [this message]
2006-05-23  8:17   ` Nick Piggin
2006-05-23 14:55   ` Hugh Dickins
2006-05-23 16:24 ` Christoph Lameter
2006-05-23 19:21   ` Hugh Dickins
2006-05-23 19:31     ` Christoph Lameter
2006-05-23 20:34       ` Hugh Dickins
2006-05-23 21:16         ` Christoph Lameter
2006-05-23 21:17         ` Chen, Kenneth W
2006-05-23 21:40           ` update_mmu_cache vs. lazy_mmu_prot_update Christoph Lameter
2006-05-24 14:12             ` Hugh Dickins
2006-05-23 22:28         ` remove VM_LOCKED before remap_pfn_range and drop VM_SHM Christoph Lameter
2006-05-24 14:57           ` Hugh Dickins
2006-05-24  2:25         ` tracking dirty pages patches Arjan van de Ven
2006-05-24 15:10           ` Hugh Dickins
2006-05-25  2:26             ` Arjan van de Ven
2006-05-23 16:41 ` David Howells
2006-05-23 23:07 ` Peter Zijlstra
2006-05-24 14:20   ` Hugh Dickins

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20060522132905.6e1a711c.akpm@osdl.org \
    --to=akpm@osdl.org \
    --cc=a.p.zijlstra@chello.nl \
    --cc=dhowells@redhat.com \
    --cc=hugh@veritas.com \
    --cc=linux-mm@kvack.org \
    --cc=torvalds@osdl.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox