On 11/01/2017 09:08 AM, Linus Torvalds wrote:
> On Tue, Oct 31, 2017 at 4:44 PM, Dave Hansen wrote:
>> On 10/31/2017 04:27 PM, Linus Torvalds wrote:
>>> (c) am I reading the code correctly, and the shadow page tables are
>>> *completely* duplicated?
>>>
>>> That seems insane. Why isn't only the top level shadowed, and
>>> then lower levels are shared between the shadowed and the "kernel"
>>> page tables?
>>
>> There are obviously two PGDs. The userspace half of the PGD is an exact
>> copy so all the lower levels are shared. The userspace copying is
>> done via the code we add to native_set_pgd().
>
> So the thing that made me think you do all levels was that confusing
> kaiser_pagetable_walk() code (and to a lesser degree
> get_pa_from_mapping()).
>
> That code definitely walks and allocates all levels.
>
> So it really doesn't seem to be just sharing the top page table entry.

Yeah, they're quite lightly commented and badly named now that I go look
at them.

get_pa_from_mapping() should be called something like
get_pa_from_kernel_map(). Its job is to look at the main (kernel) page
tables and go get an address from there. It's only ever called on
kernel addresses.

kaiser_pagetable_walk() should probably be
kaiser_shadow_pagetable_walk(). Its job is to walk the shadow copy and
find the location of a 4k PTE. You can then populate that PTE with the
address you got from get_pa_from_mapping() (or clear it in the remove
mapping case).

I've attached an update to the core patch and Documentation that should
help clear this up.

> And that worries me because that seems to be a very fundamental
> coherency issue.
>
> I'm assuming that this is about mapping only the individual kernel
> parts, but I'd like to get comments and clarification about that.

I assume that you're really worried about having to go two places to do
one thing, like clearing a dirty bit, or unmapping a PTE, especially
when we have to do that for userspace.
Thankfully, the sharing of the page tables (under the PGD) for userspace
gets rid of most of this nastiness.

I hope that's more clear now.
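
To make the split concrete, here is a toy userspace model of the scheme described above. It is a sketch under loud assumptions, not the patch code: every toy_* name is hypothetical, the layout is flattened to two levels with 8 entries each, and slots 0-3 of the PGD stand in for the userspace half. It only illustrates that userspace slots end up sharing their lower-level table between the two PGDs, while kernel addresses get a separately allocated shadow table that is populated one 4k PTE at a time.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Toy layout (assumption): 8 PGD slots, 8 PTEs per table, 4k pages.
 * PGD slots 0-3 model the userspace half, 4-7 the kernel half. */
#define TOY_ENTRIES    8
#define TOY_USER_TOP   4
#define TOY_PAGE_SHIFT 12

typedef struct { uint64_t pte[TOY_ENTRIES]; } toy_table;
typedef struct { toy_table *slot[TOY_ENTRIES]; } toy_pgd;

static unsigned pgd_index(uint64_t va) { return (va >> (TOY_PAGE_SHIFT + 3)) % TOY_ENTRIES; }
static unsigned pte_index(uint64_t va) { return (va >> TOY_PAGE_SHIFT) % TOY_ENTRIES; }

/* Stand-in for the hook added to native_set_pgd(): a userspace slot is
 * copied into the shadow PGD, so both PGDs point at the same table. */
static void toy_set_pgd(toy_pgd *kernel, toy_pgd *shadow, unsigned idx, toy_table *t)
{
    kernel->slot[idx] = t;
    if (idx < TOY_USER_TOP)
        shadow->slot[idx] = t;
}

/* Look an address up in the main (kernel) tables; 0 means "not mapped". */
static uint64_t toy_get_pa_from_kernel_map(const toy_pgd *kernel, uint64_t va)
{
    const toy_table *t = kernel->slot[pgd_index(va)];
    return t ? t->pte[pte_index(va)] : 0;
}

/* Walk the shadow copy, allocating the lower level if it is missing,
 * and return the location of the PTE for va (NULL on allocation failure). */
static uint64_t *toy_shadow_pagetable_walk(toy_pgd *shadow, uint64_t va)
{
    toy_table **slot = &shadow->slot[pgd_index(va)];
    if (!*slot) {
        *slot = calloc(1, sizeof(**slot));
        if (!*slot)
            return NULL;
    }
    return &(*slot)->pte[pte_index(va)];
}

/* Copy one kernel mapping into the shadow tables. */
static int toy_add_shadow_mapping(const toy_pgd *kernel, toy_pgd *shadow, uint64_t va)
{
    uint64_t pa = toy_get_pa_from_kernel_map(kernel, va);
    uint64_t *pte = pa ? toy_shadow_pagetable_walk(shadow, va) : NULL;
    if (!pte)
        return -1;
    *pte = pa;
    return 0;
}

int toy_demo(void)
{
    toy_pgd kernel = {0}, shadow = {0};
    toy_table *user = calloc(1, sizeof(*user));
    toy_table *kern = calloc(1, sizeof(*kern));
    uint64_t uva = 0x1000, kva = 0x22000, kva_other = 0x23000;
    uint64_t *pte;

    if (!user || !kern)
        return -1;
    toy_set_pgd(&kernel, &shadow, pgd_index(uva), user);
    toy_set_pgd(&kernel, &shadow, pgd_index(kva), kern);
    user->pte[pte_index(uva)] = 0xaaaa000;
    kern->pte[pte_index(kva)] = 0xbbbb000;
    kern->pte[pte_index(kva_other)] = 0xcccc000;

    /* Userspace: one table, reachable from both PGDs. */
    assert(shadow.slot[pgd_index(uva)] == kernel.slot[pgd_index(uva)]);
    /* Kernel: nothing in the shadow until a mapping is added explicitly. */
    assert(shadow.slot[pgd_index(kva)] == NULL);
    assert(toy_add_shadow_mapping(&kernel, &shadow, kva) == 0);
    assert(shadow.slot[pgd_index(kva)] != kernel.slot[pgd_index(kva)]);
    pte = toy_shadow_pagetable_walk(&shadow, kva);
    assert(pte && *pte == 0xbbbb000);
    pte = toy_shadow_pagetable_walk(&shadow, kva_other);
    assert(pte && *pte == 0);          /* neighbouring kernel page stays hidden */

    /* Clearing a userspace PTE in one place is seen through both PGDs. */
    user->pte[pte_index(uva)] = 0;
    assert(shadow.slot[pgd_index(uva)]->pte[pte_index(uva)] == 0);

    free(shadow.slot[pgd_index(kva)]);
    free(user);
    free(kern);
    return 0;
}
```

Calling toy_demo() from a main() exercises both cases. The point of the model is the asymmetry: the userspace half needs no second walk because the tables under the PGD are the same objects, whereas each kernel mapping exposed in the shadow has to be looked up in the kernel tables and written into a separately walked shadow PTE.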