Re: PTE access rules & abstraction - Benjamin Herrenschmidt

From: Benjamin Herrenschmidt <benh@kernel.crashing.org>
To: Jeremy Fitzhardinge <jeremy@goop.org>
Cc: Nick Piggin <npiggin@suse.de>,
	Linux Memory Management List <linux-mm@kvack.org>,
	Linux Kernel list <linux-kernel@vger.kernel.org>,
	Hugh Dickins <hugh@veritas.com>
Subject: Re: PTE access rules & abstraction
Date: Tue, 23 Sep 2008 16:49:32 +1000
Message-ID: <1222152572.12085.129.camel@pasglop>
In-Reply-To: <48D88904.4030909@goop.org>

> A good first step might be to define some conventions.  For example,
> define that set_pte*() *always* means setting a non-valid pte to either
> a new non-valid state (like a swap reference) or to a valid state. 
> modify_pte() would modify the flags of a valid
> pte, giving a new valid pte.  etc...

Yup. Or make it clear that ptep_set_access_flags() should only be used
to -relax- access (ie, set dirty, writeable, accessed, ... but not
remove any of them).
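
To make the "relax only" rule concrete, here is a minimal userspace sketch of
the check it implies. The flag bits, the pte_flags_t type and the helper name
pte_change_only_relaxes() are all made up for illustration; real PTE layouts
are per-arch and nothing below is an existing kernel interface.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical PTE flag bits, for illustration only; real layouts are per-arch. */
#define PTE_PRESENT  (1u << 0)
#define PTE_WRITE    (1u << 1)
#define PTE_ACCESSED (1u << 2)
#define PTE_DIRTY    (1u << 3)

/* Bits that a "relaxing" update is allowed to add. */
#define PTE_RELAX_MASK (PTE_WRITE | PTE_ACCESSED | PTE_DIRTY)

typedef uint32_t pte_flags_t;

/* True if going from old_flags to new_flags only adds access/status bits. */
static bool pte_change_only_relaxes(pte_flags_t old_flags, pte_flags_t new_flags)
{
    /* Both the old and the new entry must be valid. */
    if (!(old_flags & PTE_PRESENT) || !(new_flags & PTE_PRESENT))
        return false;
    /* Anything outside the relax mask must be left untouched. */
    if ((old_flags & ~PTE_RELAX_MASK) != (new_flags & ~PTE_RELAX_MASK))
        return false;
    /* No bit that was set before may be cleared. */
    return (old_flags & ~new_flags) == 0;
}
```

With a helper like this, setting DIRTY on a present PTE passes, while dropping
WRITE (what mprotect does when it restricts access) is rejected.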

> It may be that a given architecture collapses some or all of these down
> to the same underlying functionality, but it would allow the core intent
> to be clearly expressed.
> 
> What is the complete set of primitives we need?  I also noticed that a
> number of the existing pagetable operations are used only once or twice
> in the core code; I wonder if we really need such special cases, or
> whether we can make each arch pte operation carry a bit more weight?

Yes, that was some of my concern. It's getting close to having one API
per call site :-)

> Also, rather than leaving all the rule enforcing to documentation and a
> maintainer, we should also consider having a debug mode which adds
> enough paranoid checks to each operation so that any rule breakage will
> fail obviously on all architectures.

We could do both.
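
As a rough idea of what such a debug mode could look like, here is a userspace
sketch that wraps the two conventions proposed above in paranoid checks. It is
only an illustration: pte_t as a plain integer, PTE_VALID, and the
checked_set_pte() / checked_modify_pte() names are assumptions, and the sketch
counts violations where a kernel build would more likely BUG() or WARN().

```c
#include <stdbool.h>
#include <stdint.h>

typedef uint32_t pte_t;          /* hypothetical: the real pte_t is per-arch */
#define PTE_VALID (1u << 0)      /* hypothetical valid/present bit */

/* Counts rule violations so the sketch can run in userspace. */
static unsigned int pte_rule_violations;

static void pte_check(bool ok)
{
    if (!ok)
        pte_rule_violations++;
}

/* Convention: set_pte*() only ever replaces a non-valid entry. */
static void checked_set_pte(pte_t *ptep, pte_t newval)
{
    pte_check(!(*ptep & PTE_VALID));
    *ptep = newval;
}

/* Convention: modify_pte() turns a valid entry into another valid entry. */
static void checked_modify_pte(pte_t *ptep, pte_t newval)
{
    pte_check((*ptep & PTE_VALID) && (newval & PTE_VALID));
    *ptep = newval;
}
```

Because the checks live in generic wrappers and not in any arch code, a broken
caller would trip them on every architecture, including those where the two
operations collapse to the same store.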

Now, regarding operations, let's first find the major call sites and see
what I've missed. I'm omitting free_* in memory.c as those are for freeing
pte pages, not accessing PTEs themselves. I'm also ignoring read-only
call sites and hugetlb for now.

* Non-iterative accessors

 - handle_pte_fault in memory.c, on "fixup" faults (pte is present and
it's not a COW), for fixing up DIRTY and ACCESSED. Textbook usage of
ptep_set_access_flags(), so that's fine. (Btw, could we make that also
fix up EXEC ? I would like this for some stuff I'm working on at the
moment, ie set it if the vma has VM_EXEC and it was lost from the PTE,
as I might want to mask it out of PTEs under some circumstances.)

 - do_wp_page() in memory.c, for COW or for restoring write permission
on a shared writeable mapping. It doesn't overwrite the existing PTE for
COW anymore, it uses clear_flush nowadays, and the shared writeable
fixup uses ptep_set_access_flags() as it should, so that's all good.

 - insert_pfn() and insert_page(), still in memory.c, for fancy page
faults. Just a trivial set_pte_at() on a !present PTE, no big deal here.

 - RMAP ones ? Some ad-hoc stuff due to _notify thingies.

* Iterative accessors (some don't batch, maybe they could/should).

 - zapping a mapping (zap_p*) in memory.c
 - fork (copy_p*) in memory.c could batch better maybe ?
 - setting linear user mappings (remap_p*) in memory.c, trivial
set_pte_at() on a range, pte's should be !present I think.
 - mprotect (change_p*) in memory.c, which has the problem I mentioned
 - moving page tables (move_p*), pretty trivial clear_flush + set_pte_at
 - clear_refs_pte_range via walk_page_range in fs/proc/task_mmu.c, does
a test_and_clear_young, flushes mm afterward, could use some lazy stuff
so we can batch properly on ppc64.
 - vmalloc, that's a bit special and kernel only, doesn't have nasty
races between creating/tearing down mappings vs. using them
 - highmem I leave alone for now, it's mostly trivial set_pte_at &
flushing for normal kmap but kmap_atomic can be nasty, though it's arch
specific.
 - some stuff in fremap I'm not too familiar with and I need to run...
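
To illustrate what "batching" means for the iterative accessors above, here is
a small userspace sketch of a walk that clears the young bit over a range and
flushes once at the end, not once per PTE. Everything in it is hypothetical
(the pte_t type, PTE_YOUNG, flush_range_stub(), clear_young_batched()); it
does not show how ppc64 or any other arch actually batches.

```c
#include <stddef.h>
#include <stdint.h>

typedef uint32_t pte_t;           /* hypothetical: the real pte_t is per-arch */
#define PTE_YOUNG (1u << 2)       /* hypothetical accessed/young bit */

static unsigned int flush_count;  /* stands in for the cost of a TLB flush */

static void flush_range_stub(void)
{
    flush_count++;
}

/* Clear the young bit; return nonzero if it was set. */
static int test_and_clear_young(pte_t *ptep)
{
    int was_young = (*ptep & PTE_YOUNG) != 0;

    *ptep &= ~PTE_YOUNG;
    return was_young;
}

/* Batched walk: clear every young bit, then flush once if anything changed. */
static size_t clear_young_batched(pte_t *ptes, size_t n)
{
    size_t cleared = 0;

    for (size_t i = 0; i < n; i++)
        cleared += test_and_clear_young(&ptes[i]);
    if (cleared)
        flush_range_stub();
    return cleared;
}
```

The point of the shape is that the per-PTE step only records work, and the
expensive invalidation happens once per range.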

What did I miss ?

Cheers,
Ben.



