On Mon, Apr 21, 2014 at 3:44 PM, Dave Hansen wrote:
>
> I came up with something pretty similar to what you've got. I used some
> local variables for the dirty state rather than using the pte, but
> otherwise looks pretty similar. It actually boots, runs, and
> superficially looks to be doing the right thing.

.. except your version doesn't seem to have a chance of even compiling
on anything that doesn't use asm-generic/tlb.h and thus
HAVE_GENERIC_MMU_GATHER.

Now, I don't know that mine works either, but at least I tried. I'd
love to hear from somebody who has a cross-compile environment set up
for the non-generic architectures. I tried 'um', but we have at least
arm, ia64, s390 and sh that don't use the generic mmu gather logic.

I'm not entirely sure why ARM doesn't do the generic one, but I think
s390 is TLB-coherent at the ptep_get_and_clear() point, so there just
doing the set_page_dirty() is fine (assuming it compiles - there could
be some header file ordering issue).

> I fixed free_pages_and_swap_cache() but just making a first pass through
> the array and clearing the bits.

Yeah. I have to say, I think it's a bit ugly.

I am personally starting to think that we could just make
release_pages() ignore the low bit of the "struct page" pointer in the
array it is passed in, and then free_pages_and_swap_cache() could
easily just do the "set_page_dirty()" in the loop it already does.

Now, I agree that that is certainly *also* a bit ugly, but it ends up
simplifying everything else, so it's a preferable kind of ugly to me.

So here's a suggested *incremental* patch (on top of my previous patch
that did the interface change) that does that.

Does this work for people? It *looks* sane. It compiles for me (tested
on x86 that uses generic mmu gather, and on UM that does not).

               Linus
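
For anyone reading this without the patch in front of them, the low-bit
trick described above can be sketched in plain userspace C. This is NOT
the actual patch: "struct page" here is a toy stand-in, and the names
PAGE_DIRTY_TAG, tag_dirty(), untag(), is_tagged_dirty() and
toy_free_pages() are made up for illustration. It only assumes that
page pointers are at least 2-byte aligned, so that bit 0 is free to
carry the "needs set_page_dirty()" flag.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Toy stand-in for the kernel's struct page (hypothetical layout). */
struct page {
	int dirty;	/* set where the kernel would call set_page_dirty() */
	int released;	/* set once the release loop has handled the page */
};

/* Hypothetical name for the flag carried in bit 0 of the pointer. */
#define PAGE_DIRTY_TAG ((uintptr_t)1)

/* Mark an array entry as "dirty this page before releasing it". */
static struct page *tag_dirty(struct page *page)
{
	/* Only works because the pointer's low bit is otherwise zero. */
	assert(((uintptr_t)page & PAGE_DIRTY_TAG) == 0);
	return (struct page *)((uintptr_t)page | PAGE_DIRTY_TAG);
}

/* Does this array entry carry the dirty tag? */
static int is_tagged_dirty(const struct page *entry)
{
	return (int)((uintptr_t)entry & PAGE_DIRTY_TAG);
}

/* Strip the tag to recover the real pointer. */
static struct page *untag(struct page *entry)
{
	return (struct page *)((uintptr_t)entry & ~PAGE_DIRTY_TAG);
}

/*
 * Toy analogue of the loop in free_pages_and_swap_cache(): a single
 * pass that both applies the dirty state and releases the page, so no
 * separate "clear the bits" pass over the array is needed.
 */
static void toy_free_pages(struct page **pages, size_t nr)
{
	for (size_t i = 0; i < nr; i++) {
		struct page *page = untag(pages[i]);

		if (is_tagged_dirty(pages[i]))
			page->dirty = 1;
		page->released = 1;
	}
}
```

A caller would fill the batch with a mix of tag_dirty(&page) and plain
&page entries and hand the array to toy_free_pages(). The point of the
design is that whoever consumes the array masks the bit off before
dereferencing, which is the "ignore the low bit" part.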