From: Peter Zijlstra <a.p.zijlstra@chello.nl>
To: Hugh Dickins <hugh@veritas.com>
Cc: linux-mm@kvack.org, linux-kernel@vger.kernel.org,
Andrew Morton <akpm@osdl.org>,
David Howells <dhowells@redhat.com>,
Christoph Lameter <christoph@lameter.com>,
Martin Bligh <mbligh@google.com>, Nick Piggin <npiggin@suse.de>,
Linus Torvalds <torvalds@osdl.org>
Subject: Re: [PATCH] mm: tracking shared dirty pages -v10
Date: Sat, 24 Jun 2006 00:00:17 +0200 [thread overview]
Message-ID: <1151100017.30819.50.camel@lappy> (raw)
In-Reply-To: <Pine.LNX.4.64.0606231933060.7524@blonde.wat.veritas.com>
On Fri, 2006-06-23 at 20:06 +0100, Hugh Dickins wrote:
> > - if (unlikely((vma->vm_flags & (VM_SHARED|VM_WRITE)) ==
> > - (VM_SHARED|VM_WRITE))) {
> > + /*
> > + * Only catch write-faults on shared writable pages, read-only
> > + * shared pages can get COWed by get_user_pages(.write=1, .force=1).
> > + */
> > + if (unlikely(((vma->vm_flags & (VM_WRITE|VM_SHARED)) ==
> > + (VM_WRITE|VM_SHARED)))) {
>
> I was about to say I'm satisfied with that test now, and it's all the
> better for not mentioning VM_PFNMAP there - though odd that you've
> exchanged WRITE and SHARED, and added extra unnecessary parentheses ;)
Oops, on the extra parentheses. I exchanged the order because that is
more consistent with all the other sites.
> But grrr, sigh, damn, argh - I now realize it's right to the first
> order (normal case) and to the second order (ptrace poke), but not
> to the third order (ptrace poke anon page here to be COWed -
> perhaps can't occur without intervening mprotects).
>
> That's not an issue for you at all (there are other places which
> are inconsistent on whether such pages are private or shared e.g.
> copy_one_pte does not wrprotect them), but could be a problem for
> David's page_mkwrite - there's a danger of passing it an anonymous
> page, which (depending on what the ->page_mkwrite actually does)
> could go seriously wrong.
>
> I guess it ought to be restructured
> if (PageAnon(old_page)) {
> ...
> } else if (shared writable vma) {
> ...
> }
> and a patch to do that should precede your dirty page patches
> (and the only change your dirty page patches need add here on top
> of that would be the dirty_page = old_page, get_page(dirty_page)).
Hmm, I'll investigate this and maybe write a follow-up patch if you or
someone else doesn't beaten me to it.
> (You've probably noticed that use of mprotect on problem vmas
> is liable to remove the cache bits. There have been occasional
> complaints about that, and perhaps a patch to fix it. But that's
> long-standing behaviour, so not something you need worry about:
> we've only had to worry about a new change in behaviour in mmap.)
Yeah, had noticed, but had assumed that since it was a user action
they'd know better than to call it on dubious mappings.
However if you say there is demand; I've been through all the arch's to
verify the use of that most ugly drm construct, and whipping up an arch
function like pgprot_modify() that would fold a pgprot_t and
protection_map[] argument together seems quite doable.
> > Page from remap_pfn_range() are explicitly excluded because their
> > COW semantics are already horrid enough (see vm_normal_page() in
> > do_wp_page()) and because they don't have a backing store anyway.
> >
> > copy_one_pte() cleans child PTEs, not wrong in the current context,
> > but why?
>
> Dates back to Linux 2.2 if not earlier. I think the idea is that the
> parent has its pte dirty, and that's enough to mark the page as dirty;
> the child is unlikely to dirty the page further itself, and a second
> instance of dirty pte for that page may entail some overhead. What
> overhead would vary from release to release, I've not thought through
> what the current saving would be in 2.6.17, probably very little.
> But you can imagine there might have been some release which would
> write out the page each time it found pte dirty, in which case good
> reason to clear it there. But I'm guessing, it's from before my time.
I'll keep that in mind for a future cleanup.
> > + if ((pgprot_val(vma->vm_page_prot) == pgprot_val(vm_page_prot) &&
> > + ((vm_flags & (VM_WRITE|VM_SHARED|VM_PFNMAP|VM_INSERTPAGE)) ==
> > + (VM_WRITE|VM_SHARED)) &&
> > + vma->vm_file && vma->vm_file->f_mapping &&
> > + mapping_cap_account_dirty(vma->vm_file->f_mapping)) ||
> > + (vma->vm_ops && vma->vm_ops->page_mkwrite))
> > + vma->vm_page_prot =
> > + protection_map[vm_flags & (VM_READ|VM_WRITE|VM_EXEC)];
> > +
>
> I'm dazzled by the beauty of it!
It's a real beauty isn't it :-)
> Given the change you've made to
> mapping_cap_account_dirty, you don't have to check vma->vm_file->f_mapping
> there; but other than that, seems right to me. May offend other tastes.
I meant to remove the mapping_cap_account_dirty() changes, so as not to
burden all call sites with the extra conditional. This one extra check
in here does not make the difference between beauty and beast.
> > Index: 2.6-mm/mm/rmap.c
> > ===================================================================
> > --- 2.6-mm.orig/mm/rmap.c 2006-06-22 17:59:07.000000000 +0200
> > +++ 2.6-mm/mm/rmap.c 2006-06-23 01:14:39.000000000 +0200
> > +static int page_mkclean_file(struct address_space *mapping, struct page *page)
> > +{
> > + pgoff_t pgoff = page->index << (PAGE_CACHE_SHIFT - PAGE_SHIFT);
> > + struct vm_area_struct *vma;
> > + struct prio_tree_iter iter;
> > + int ret = 0;
> > +
> > + BUG_ON(PageAnon(page));
> > +
> > + spin_lock(&mapping->i_mmap_lock);
> > + vma_prio_tree_foreach(vma, &iter, &mapping->i_mmap, pgoff, pgoff) {
> > + int protect = mapping_cap_account_dirty(mapping) &&
> > + ((vma->vm_flags & (VM_WRITE|VM_SHARED|VM_PFNMAP|VM_INSERTPAGE)) ==
> > + (VM_WRITE|VM_SHARED));
>
> I think you can forget about testing VM_PFNMAP|VM_INSERTPAGE here,
> or else BUG_ON or WARN_ON(vma->vm_flags & (VM_PFNMAP|VM_INSERTPAGE)):
> those just shouldn't arrive here.
Done
> But I'm still puzzled, why did you move page_mkclean out from under
> the mapping_cap_account_dirty tests in page-writeback.c: no explanation
> in your change history. Doesn't that just add some unnecessary work,
> and this "protect" business?
You're right, added back under the wing of mapping_cap_account_dirty().
> If you're thinking ahead to other changes you want to make, you'd
> do better to add these changes then, and for now just work on the
> VM_SHARED vmas.
But I'll leave page_mkclean() as callable without, that way its more
true to its name.
Peter
--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org. For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>
next prev parent reply other threads:[~2006-06-23 22:00 UTC|newest]
Thread overview: 47+ messages / expand[flat|nested] mbox.gz Atom feed top
2006-06-19 17:52 [PATCH 0/6] mm: tracking dirty pages -v9 Peter Zijlstra
2006-06-19 17:52 ` [PATCH 1/6] mm: tracking shared dirty pages Peter Zijlstra
2006-06-22 5:56 ` Andrew Morton
2006-06-22 6:07 ` Christoph Lameter
2006-06-22 6:15 ` Andrew Morton
2006-06-22 11:33 ` Peter Zijlstra
2006-06-22 13:17 ` Hugh Dickins
2006-06-22 20:52 ` Hugh Dickins
2006-06-22 23:02 ` Peter Zijlstra
2006-06-22 23:39 ` [PATCH] mm: tracking shared dirty pages -v10 Peter Zijlstra
2006-06-23 3:10 ` Jeff Dike
2006-06-23 3:31 ` Andrew Morton
2006-06-23 3:50 ` Jeff Dike
2006-06-23 4:01 ` H. Peter Anvin
2006-06-23 15:08 ` Jeff Dike
2006-06-23 6:08 ` Linus Torvalds
2006-06-23 7:27 ` Hugh Dickins
2006-06-23 17:00 ` Christoph Lameter
2006-06-23 17:22 ` Peter Zijlstra
2006-06-23 17:52 ` Christoph Lameter
2006-06-23 18:11 ` Martin Bligh
2006-06-23 18:20 ` Linus Torvalds
2006-06-23 17:56 ` Linus Torvalds
2006-06-23 18:03 ` Peter Zijlstra
2006-06-23 18:23 ` Christoph Lameter
2006-06-23 18:41 ` Christoph Hellwig
2006-06-23 17:49 ` Linus Torvalds
2006-06-23 18:05 ` Arjan van de Ven
2006-06-23 18:08 ` Miklos Szeredi
2006-06-23 19:06 ` Hugh Dickins
2006-06-23 22:00 ` Peter Zijlstra [this message]
2006-06-23 22:35 ` Linus Torvalds
2006-06-23 22:44 ` Peter Zijlstra
2006-06-28 14:58 ` [RFC][PATCH] mm: fixup do_wp_page() Peter Zijlstra
2006-06-28 18:20 ` Hugh Dickins
2006-06-19 17:53 ` [PATCH 2/6] mm: balance dirty pages Peter Zijlstra
2006-06-19 17:53 ` [PATCH 3/6] mm: msync() cleanup Peter Zijlstra
2006-06-22 17:02 ` Hugh Dickins
2006-06-19 17:53 ` [PATCH 4/6] mm: optimize the new mprotect() code a bit Peter Zijlstra
2006-06-22 17:21 ` Hugh Dickins
2006-06-19 17:53 ` [PATCH 5/6] mm: small cleanup of install_page() Peter Zijlstra
2006-06-19 17:53 ` [PATCH 6/6] mm: remove some update_mmu_cache() calls Peter Zijlstra
2006-06-22 16:29 ` Hugh Dickins
2006-06-22 16:37 ` Christoph Lameter
2006-06-22 17:35 ` Hugh Dickins
2006-06-22 18:31 ` Christoph Lameter
2006-06-23 18:27 [PATCH] mm: tracking shared dirty pages -v10 Brian D. McGrew
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=1151100017.30819.50.camel@lappy \
--to=a.p.zijlstra@chello.nl \
--cc=akpm@osdl.org \
--cc=christoph@lameter.com \
--cc=dhowells@redhat.com \
--cc=hugh@veritas.com \
--cc=linux-kernel@vger.kernel.org \
--cc=linux-mm@kvack.org \
--cc=mbligh@google.com \
--cc=npiggin@suse.de \
--cc=torvalds@osdl.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox