linux-mm.kvack.org archive mirror
 help / color / mirror / Atom feed
From: Peter Zijlstra <a.p.zijlstra@chello.nl>
To: Hugh Dickins <hugh@veritas.com>
Cc: linux-mm@kvack.org, linux-kernel@vger.kernel.org,
	Andrew Morton <akpm@osdl.org>,
	David Howells <dhowells@redhat.com>,
	Christoph Lameter <christoph@lameter.com>,
	Martin Bligh <mbligh@google.com>, Nick Piggin <npiggin@suse.de>,
	Linus Torvalds <torvalds@osdl.org>
Subject: Re: [PATCH 1/6] mm: tracking shared dirty pages
Date: Fri, 23 Jun 2006 01:02:27 +0200	[thread overview]
Message-ID: <1151017347.15744.135.camel@lappy> (raw)
In-Reply-To: <Pine.LNX.4.64.0606222126310.26805@blonde.wat.veritas.com>

On Thu, 2006-06-22 at 21:52 +0100, Hugh Dickins wrote:
> On Mon, 19 Jun 2006, Peter Zijlstra wrote:

> > +static inline int is_shared_writable(unsigned int flags)
> > +{
> > +	return (flags & (VM_SHARED|VM_WRITE|VM_PFNMAP)) ==
> > +		(VM_SHARED|VM_WRITE);
> > +}
> > +
> 
> Andrew asked for the inclusion of VM_PFNMAP to be commented there,
> I don't believe that's enough: a function called "is_shared_writable"
> should be testing precisely that, or people will misuse it.
> 
> Either you change the name to "is_shared_writable_but_not_pfnmap"
> or somesuch, or you split out the VM_PFNMAP test, or you do away
> with the function and make the tests explicit inline.  As before,
> my instinctive preference is the latter: I really want to see what's
> being tested (especially in do_wp_page); but perhaps it'll just look
> too ugly all over - give it a try and see.

*sight*, thats it, explicit it will be :-)

> > +	/*
> > +	 * This is not fully correct in the light of trapping write faults
> > +	 * for writable shared mappings. However since we're going to mark
> > +	 * the page dirty anyway some few lines downward, we might as well
> > +	 * take the write fault now.
> > +	 */
> 
> I don't understand what you're getting at here: please explain,
> what is not fully correct and why?  In mail first, then we can
> decide what the comment should say, or if it should be removed.
> follow_page isn't making a pte writable, so what's the issue?

I have no idea either, I reread this part earlier today and found it one
big brainfart. It does indeed seem to do the right thing.

> > -	if (unlikely(vma->vm_flags & VM_SHARED)) {
> > +	if (unlikely(is_shared_writable(vma->vm_flags))) {
> 
> Most interesting line in the series, yes, and I'd find it
> easier to think through if it showed the flags test explicitly:
> 	if ((vma->vm_flags & (VM_SHARED|VM_WRITE|VM_PFNMAP)) ==
> 		(VM_SHARED|VM_WRITE))
> 
> Yes, Andrew, you're right it's a change in behaviour from David's
> page_mkwrite patch.  I've realized that when I was originally
> reviewing David's patch, I believed do_wp_page was mistaken to be
> doing COW on VM_SHARED areas.  But Linus has since asserted very
> forcefully that it's intentional, that ptrace poke on a VM_SHARED
> area which is currently not !VM_WRITE should COW it, so I mentioned
> that to Peter.
> 
> Has he got the test right there now?  Ummm... maybe: my brain
> exploded weeks ago.  Several strangenesses collide here, I'll
> try again tomorrow, maybe others will argue it to certainty before.

I don't think the VM_PFNMAP is needed here, but it doesn't hurt either.
Like said, I'll do explicits from now on.

> > @@ -1084,18 +1086,13 @@ munmap_back:
> >  		error = file->f_op->mmap(file, vma);
> >  		if (error)
> >  			goto unmap_and_free_vma;
> > +
> 
> Do you really need this blank line?

:-) uhu..

> > +	/*
> > +	 * Tracking of dirty pages for shared writable mappings. Do this by
> > +	 * write protecting writable pages, and mark dirty in the write fault.
> > +	 *
> > +	 * Modify vma->vm_page_prot (the default protection for new pages)
> > +	 * to this effect.
> > +	 *
> > +	 * Cannot do before because the condition depends on:
> > +	 *  - backing_dev_info having the right capabilities
> > +	 *    (set by f_op->open())
> 
> Is that so, backing_dev_info set by f_op->open()?
> And how would that be a problem here if it were so?

useless information indeed, a remnant from old times when I placed the
vm_page_prot modification between the two calls, shall remove.

> > +	 *  - vma->vm_flags being fully set
> > +	 *    (finished in f_op->mmap(), which could call remap_pfn_range())
> > +	 *
> > +	 *  Also, cannot reset vma->vm_page_prot from vma->vm_flags because
> > +	 *  f_op->mmap() can modify it.
> > +	 */
> > +	if (is_shared_writable(vm_flags) && vma->vm_file)
> > +		mapping = vma->vm_file->f_mapping;
> > +	if ((mapping && mapping_cap_account_dirty(mapping)) ||
> > +			(vma->vm_ops && vma->vm_ops->page_mkwrite))
> 
> The only way "mapping" might be set is just above.
> Wouldn't it all be clearer (though more indented) if you said
> 
> 	if (is_shared_writable(vm_flags) && vma->vm_file) {
> 		mapping = vma->vm_file->f_mapping;
> 		if ((mapping && mapping_cap_account_dirty(mapping)) ||
> 				(vma->vm_ops && vma->vm_ops->page_mkwrite)) {
> 			vma->vm_page_prot = whatever;
> 		}
> 	}
> 
> Or no need for "mapping" here at all if you change
> mapping_cap_account_dirty(vma->vm_file->f_mapping)
> to do the right thing with NULL.

Made it one big if stmt, perhaps too big, we'll see.

> 
> > +		vma->vm_page_prot =
> > +			__pgprot(pte_val
> > +				(pte_wrprotect
> > +				 (__pte(pgprot_val(vma->vm_page_prot)))));
> > +
> 
> In other mail I've suggested saving vm_page_prot above, and
> changing it here only if the driver's ->mmap did not change it.

Yes, that was a very good suggestion and has already been incorporated,
thanks.

> I remain uneasy about interfering with the permissions expected by
> strange drivers, but can't really justify my paranoia.  Certainly
> you're right to exclude VM_PFNMAPs from this interference, that's
> important; I'd be less uneasy if you also exclude VM_INSERTPAGEs,
> they're strange too - but at least they're dealing with proper struct
> pages, so should be able to handle an unexpected do_wp_page; that
> leaves the driver nopage cases, which again should be okay now you're
> (one way or another) protecting specially added vm_page_prot flags.

VM_INSERTPAGE thou shall have.

> I guess I'm just paranoid; it's irritating me that we do not have
> the right backing_dev_infos in place and having to hack around it.

Sad situation but true.

> > +static int page_mkclean_file(struct address_space *mapping, struct page *page)
> > +{
> > +	pgoff_t pgoff = page->index << (PAGE_CACHE_SHIFT - PAGE_SHIFT);
> > +	struct vm_area_struct *vma;
> > +	struct prio_tree_iter iter;
> > +	int ret = 0;
> > +
> > +	BUG_ON(PageAnon(page));
> > +
> > +	spin_lock(&mapping->i_mmap_lock);
> > +	vma_prio_tree_foreach(vma, &iter, &mapping->i_mmap, pgoff, pgoff) {
> > +		int protect = mapping_cap_account_dirty(mapping) &&
> > +			is_shared_writable(vma->vm_flags);
> > +		ret += page_mkclean_one(page, vma, protect);
> 
> You have a good point here, one I'd completely missed: because a vma
> may have been recently mprotected !VM_WRITE, you have to check readonly
> mappings too.  Perhaps worth a comment.  But I think "is_shared_writable"
> is not the best test here: just test for VM_SHARED vmas, they're the
> only ones which can be mprotected to/from shared writable.  And then
> I think you don't need to pass down an additional "protect" argument?
> It's only being called for mapping_cap_account_dirty mappings anyway,
> isn't it?

Well, no, not anymore. I thought to make it actually do what its name
said it does: clean the page's PTEs (I am even pondering about
implementing the anonymous branch).

In that light, its now called for each page.


New patch will follow shortly since I can't seem to sleep anyway...



--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

  reply	other threads:[~2006-06-22 23:02 UTC|newest]

Thread overview: 48+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2006-06-19 17:52 [PATCH 0/6] mm: tracking dirty pages -v9 Peter Zijlstra
2006-06-19 17:52 ` [PATCH 1/6] mm: tracking shared dirty pages Peter Zijlstra
2006-06-22  5:56   ` Andrew Morton
2006-06-22  6:07     ` Christoph Lameter
2006-06-22  6:15       ` Andrew Morton
2006-06-22 11:33     ` Peter Zijlstra
2006-06-22 13:17       ` Hugh Dickins
2006-06-22 20:52   ` Hugh Dickins
2006-06-22 23:02     ` Peter Zijlstra [this message]
2006-06-22 23:39     ` [PATCH] mm: tracking shared dirty pages -v10 Peter Zijlstra
2006-06-23  3:10       ` Jeff Dike
2006-06-23  3:31         ` Andrew Morton
2006-06-23  3:50           ` Jeff Dike
2006-06-23  4:01           ` H. Peter Anvin
2006-06-23 15:08             ` Jeff Dike
2006-06-23  6:08       ` Linus Torvalds
2006-06-23  7:27         ` Hugh Dickins
2006-06-23 17:00           ` Christoph Lameter
2006-06-23 17:22             ` Peter Zijlstra
2006-06-23 17:52               ` Christoph Lameter
2006-06-23 18:11                 ` Martin Bligh
2006-06-23 18:20                   ` Linus Torvalds
2006-06-23 17:56               ` Linus Torvalds
2006-06-23 18:03                 ` Peter Zijlstra
2006-06-23 18:23                   ` Christoph Lameter
2006-06-23 18:41                 ` Christoph Hellwig
2006-06-23 17:49           ` Linus Torvalds
2006-06-23 18:05             ` Arjan van de Ven
2006-06-23 18:08             ` Miklos Szeredi
2006-06-23 19:06       ` Hugh Dickins
2006-06-23 22:00         ` Peter Zijlstra
2006-06-23 22:35           ` Linus Torvalds
2006-06-23 22:44             ` Peter Zijlstra
2006-06-28 14:58         ` [RFC][PATCH] mm: fixup do_wp_page() Peter Zijlstra
2006-06-28 18:20           ` Hugh Dickins
2006-06-19 17:53 ` [PATCH 2/6] mm: balance dirty pages Peter Zijlstra
2006-06-19 17:53 ` [PATCH 3/6] mm: msync() cleanup Peter Zijlstra
2006-06-22 17:02   ` Hugh Dickins
2006-06-19 17:53 ` [PATCH 4/6] mm: optimize the new mprotect() code a bit Peter Zijlstra
2006-06-22 17:21   ` Hugh Dickins
2006-06-19 17:53 ` [PATCH 5/6] mm: small cleanup of install_page() Peter Zijlstra
2006-06-19 17:53 ` [PATCH 6/6] mm: remove some update_mmu_cache() calls Peter Zijlstra
2006-06-22 16:29   ` Hugh Dickins
2006-06-22 16:37     ` Christoph Lameter
2006-06-22 17:35       ` Hugh Dickins
2006-06-22 18:31         ` Christoph Lameter
  -- strict thread matches above, loose matches on Subject: below --
2006-06-28 20:17 [PATCH 0/6] mm: tracking dirty pages -v14 Peter Zijlstra
2006-06-28 20:17 ` [PATCH 1/6] mm: tracking shared dirty pages Peter Zijlstra
2006-06-13 11:21 [PATCH 0/6] mm: tracking dirty pages -v8 Peter Zijlstra
2006-06-13 11:21 ` [PATCH 1/6] mm: tracking shared dirty pages Peter Zijlstra

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=1151017347.15744.135.camel@lappy \
    --to=a.p.zijlstra@chello.nl \
    --cc=akpm@osdl.org \
    --cc=christoph@lameter.com \
    --cc=dhowells@redhat.com \
    --cc=hugh@veritas.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-mm@kvack.org \
    --cc=mbligh@google.com \
    --cc=npiggin@suse.de \
    --cc=torvalds@osdl.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox