On 12/17/2013 07:44 AM, Kirill A. Shutemov wrote:
> Sasha Levin wrote:
>> Hi Andrea,
>>
>> On 12/16/2013 03:52 PM, Andrea Arcangeli wrote:
>>> Is the bug reproducible? If yes the simplest is probably to add some
>>> allocation tracking to the page, so if page->ptl is null we can simply
>>> print a stack trace of who allocated the page (and later forgot to
>>> initialize the ptl).
>>
>> Yes, it's easy to reproduce.
>
> I'm trying to reproduce it with trinity. No luck so far. Any suggestions?
> Kernel config? VM setup? Do you have swap enabled? How do you run trinity?

I've attached my config. Swap is enabled. I'm running trinity with '-g vm'
to focus on the mm code, in a VM with 65GB of RAM.

>> I've done as suggested and here's the trace from the allocation:
>>
>> [ 184.139519]  [] save_stack_trace+0x2f/0x50
>> [ 184.140706]  [] get_page_from_freelist+0x759/0x7a0
>> [ 184.141605]  [] __alloc_pages_nodemask+0x3b8/0x520
>> [ 184.142810]  [] alloc_pages_vma+0x1df/0x220
>> [ 184.143631]  [] do_huge_pmd_wp_page+0x2d8/0x730
>> [ 184.144526]  [] __handle_mm_fault+0x2b1/0x3d0
>> [ 184.145361]  [] handle_mm_fault+0x133/0x1c0
>> [ 184.146129]  [] __get_user_pages+0x448/0x640
>> [ 184.147055]  [] __mlock_vma_pages_range+0xd4/0xe0
>> [ 184.147980]  [] __mm_populate+0x110/0x190
>> [ 184.148933]  [] SyS_mlock+0xf2/0x130
>> [ 184.149689]  [] tracesys+0xdd/0xe2
>
> It's a trace from the huge page allocation, not from the page table
> allocation we are interested in.
>
> In our case we need to know who allocated pmd_page(*pmd) when
>
>         orig_pte = pte_offset_map_lock(vma->vm_mm, pmd, start, &ptl);
>
> crashes. Note: we usually allocate page tables with __GFP_NOTRACK. It
> probably needs to be changed for this experiment.

So this is actually the place where pmd_page(*pmd) gets allocated. I've
basically added a saved stack trace to struct page; saving it in
get_page_from_freelist() seems to catch most allocations (a rough sketch
of the instrumentation is at the end of this mail).

I have this piece of code in the place that crashes:

        for (index = start; index != end; index += PAGE_SIZE) {
                pte_t pte;
                swp_entry_t entry;
                struct page *page;
                spinlock_t *ptl;

                if (!pte_lockptr(vma->vm_mm, pmd))
                        print_stack_trace(&pmd_page(*(pmd))->trace, 0);

                orig_pte = pte_offset_map_lock(vma->vm_mm, pmd, start, &ptl);

And the result is the trace you see above.

>>> Agree with Kirill that it would help to verify the bug goes away by
>>> disabling USE_SPLIT_PTE_PTLOCKS.
>>
>> It seems that the bug is gone without USE_SPLIT_PTE_PTLOCKS.
>
> What about the PMD sibling: USE_SPLIT_PMD_PTLOCKS?
> I mean USE_SPLIT_PTE_PTLOCKS == 1, USE_SPLIT_PMD_PTLOCKS == 0.

The crash still occurs in that case.

Thanks,
Sasha
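
P.S. For reference, the instrumentation looks roughly like this. It's a
sketch rather than the exact patch: the depth, the trace_entries array and
the skip value are just what I picked for the experiment; only the 'trace'
field name matches the snippet above.

        #include <linux/stacktrace.h>

        #define PAGE_ALLOC_TRACE_DEPTH  16

        /* Fields added to struct page in include/linux/mm_types.h: */
        struct page {
                /* ... existing fields ... */
                struct stack_trace trace;
                unsigned long trace_entries[PAGE_ALLOC_TRACE_DEPTH];
        };

        /* Called on the freshly allocated page in get_page_from_freelist(). */
        static void page_save_alloc_trace(struct page *page)
        {
                page->trace.nr_entries  = 0;
                page->trace.max_entries = PAGE_ALLOC_TRACE_DEPTH;
                page->trace.entries     = page->trace_entries;
                page->trace.skip        = 2;    /* skip the stacktrace helpers */
                save_stack_trace(&page->trace);
        }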
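
P.P.S. For completeness, one way to get the USE_SPLIT_PTE_PTLOCKS == 1 /
USE_SPLIT_PMD_PTLOCKS == 0 combination is to hard-code the PMD macro in
include/linux/mm_types.h while leaving the PTE one alone, along these
lines (sketch only):

        #define USE_SPLIT_PTE_PTLOCKS   (NR_CPUS >= CONFIG_SPLIT_PTLOCK_CPUS)
        /*
         * Upstream this is
         *   (USE_SPLIT_PTE_PTLOCKS && IS_ENABLED(CONFIG_ARCH_ENABLE_SPLIT_PMD_PTLOCK))
         * hard-coded to 0 for the experiment.
         */
        #define USE_SPLIT_PMD_PTLOCKS   0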