* vma_needs_copy always true for VM_HUGETLB ? @ 2022-05-18 22:36 Mike Kravetz 2022-05-19 1:30 ` Hugh Dickins 0 siblings, 1 reply; 3+ messages in thread From: Mike Kravetz @ 2022-05-18 22:36 UTC (permalink / raw) To: linux-mm, linux-kernel Cc: Michal Hocko, Oscar Salvador, David Hildenbrand, Naoya Horiguchi, Hugh Dickins, Peter Xu For most non-anonymous vmas, we do not copy page tables at fork time, but rather lazily populate the tables after fork via faults. The routine vma_needs_copy() is used to make this decision. For VM_HUGETLB vmas, it always returns true. Anyone know/remember why? The code was added more than 15 years ago and my search for why hugetlb vmas were excluded came up empty. I do not see a reason why VM_HUGETLB is in this list. Initial testing did not reveal any problems when I removed the VM_HUGETLB check. FYI - I am looking at the performance of fork and exec (unmap) of processes with very large hugetlb mappings. Skipping the copy at fork time would certainly speed things up. Of course, there could some users who would notice if hugetlb page tables are not copied at fork time. However, this is the behavior for 'normal' mappings. I am inclined to make hugetlb be 'more normal'. -- Mike Kravetz ^ permalink raw reply [flat|nested] 3+ messages in thread
* Re: vma_needs_copy always true for VM_HUGETLB ? 2022-05-18 22:36 vma_needs_copy always true for VM_HUGETLB ? Mike Kravetz @ 2022-05-19 1:30 ` Hugh Dickins 2022-05-19 3:36 ` Mike Kravetz 0 siblings, 1 reply; 3+ messages in thread From: Hugh Dickins @ 2022-05-19 1:30 UTC (permalink / raw) To: Mike Kravetz Cc: linux-mm, linux-kernel, Michal Hocko, Oscar Salvador, David Hildenbrand, Naoya Horiguchi, Hugh Dickins, Peter Xu, Nick Piggin, Andi Kleen On Wed, 18 May 2022, Mike Kravetz wrote: > For most non-anonymous vmas, we do not copy page tables at fork time, but > rather lazily populate the tables after fork via faults. The routine > vma_needs_copy() is used to make this decision. For VM_HUGETLB vmas, it always > returns true. "vma_needs_copy()" is *very* recent coinage, not reached Linus yet. > > Anyone know/remember why? The code was added more than 15 years ago and > my search for why hugetlb vmas were excluded came up empty. > > I do not see a reason why VM_HUGETLB is in this list. Initial testing did > not reveal any problems when I removed the VM_HUGETLB check. > > FYI - I am looking at the performance of fork and exec (unmap) of processes > with very large hugetlb mappings. Skipping the copy at fork time would > certainly speed things up. Of course, there could some users who would > notice if hugetlb page tables are not copied at fork time. However, this > is the behavior for 'normal' mappings. I am inclined to make hugetlb be > 'more normal'. Good question, not obvious to me either: but I've found the answer. The commit was of course Nick's d992895ba2b2 ("[PATCH] Lazy page table copies in fork()") in 2.6.14; but it doesn't explain why VM_HUGETLB is there in the test, and goes on to be copied. I haven't re-read through the whole mail thread which led to that commit, but I think you'll find the crucial observation comes from Andi in https://lore.kernel.org/lkml/200508251756.07849.ak@suse.de/#t "Actually I disabled it for hugetlbfs (... !is_huge...vma). The reason is that lazy faulting for huge pages is still not in mainline." and indeed, look at the 2.6.13 or 2.6.14 mm/hugetlb.c and you find /* * We cannot handle pagefaults against hugetlb pages at all. They cause * handle_mm_fault() to try to instantiate regular-sized pages in the * hugegpage VMA. do_page_fault() is supposed to trap this, so BUG is we get * this far. */ static struct page *hugetlb_nopage(struct vm_area_struct *vma, unsigned long address, int *unused) { BUG(); return NULL; } Oh, and that pretty much still exists to this day, to cover that path to a fault; but 2.6.16 implemented hugetlb_no_page(), which is what then actually got used to satisfy a hugetlb fault. So the reason for fork copying VM_HUGETLB appears to have gone away in 2.6.16. (I haven't a clue on private hugetlb mappings and reservations and whether anon_vma means the same on hugetlb, but you know all that.) Hugh ^ permalink raw reply [flat|nested] 3+ messages in thread
* Re: vma_needs_copy always true for VM_HUGETLB ? 2022-05-19 1:30 ` Hugh Dickins @ 2022-05-19 3:36 ` Mike Kravetz 0 siblings, 0 replies; 3+ messages in thread From: Mike Kravetz @ 2022-05-19 3:36 UTC (permalink / raw) To: Hugh Dickins Cc: linux-mm, linux-kernel, Michal Hocko, Oscar Salvador, David Hildenbrand, Naoya Horiguchi, Peter Xu, Nick Piggin, Andi Kleen On 5/18/22 18:30, Hugh Dickins wrote: > On Wed, 18 May 2022, Mike Kravetz wrote: > >> For most non-anonymous vmas, we do not copy page tables at fork time, but >> rather lazily populate the tables after fork via faults. The routine >> vma_needs_copy() is used to make this decision. For VM_HUGETLB vmas, it always >> returns true. > > "vma_needs_copy()" is *very* recent coinage, not reached Linus yet. > >> >> Anyone know/remember why? The code was added more than 15 years ago and >> my search for why hugetlb vmas were excluded came up empty. >> >> I do not see a reason why VM_HUGETLB is in this list. Initial testing did >> not reveal any problems when I removed the VM_HUGETLB check. >> >> FYI - I am looking at the performance of fork and exec (unmap) of processes >> with very large hugetlb mappings. Skipping the copy at fork time would >> certainly speed things up. Of course, there could some users who would >> notice if hugetlb page tables are not copied at fork time. However, this >> is the behavior for 'normal' mappings. I am inclined to make hugetlb be >> 'more normal'. > > Good question, not obvious to me either: but I've found the answer. Thank you Hugh! You went above and beyond as usual. > The commit was of course Nick's d992895ba2b2 ("[PATCH] Lazy page table > copies in fork()") in 2.6.14; but it doesn't explain why VM_HUGETLB is > there in the test, and goes on to be copied. > > I haven't re-read through the whole mail thread which led to that > commit, but I think you'll find the crucial observation comes from > Andi in https://lore.kernel.org/lkml/200508251756.07849.ak@suse.de/#t Sorry, that I did not find the entire thread. There were only a couple pieces on linux-mm and that is the only place I looked. > > "Actually I disabled it for hugetlbfs (... !is_huge...vma). The reason > is that lazy faulting for huge pages is still not in mainline." > > and indeed, look at the 2.6.13 or 2.6.14 mm/hugetlb.c and you find > /* > * We cannot handle pagefaults against hugetlb pages at all. They cause > * handle_mm_fault() to try to instantiate regular-sized pages in the > * hugegpage VMA. do_page_fault() is supposed to trap this, so BUG is we get > * this far. > */ > static struct page *hugetlb_nopage(struct vm_area_struct *vma, > unsigned long address, int *unused) > { > BUG(); > return NULL; > } > > Oh, and that pretty much still exists to this day, to cover that path > to a fault; but 2.6.16 implemented hugetlb_no_page(), which is what > then actually got used to satisfy a hugetlb fault. > > So the reason for fork copying VM_HUGETLB appears to have gone away > in 2.6.16. Yes, that is the likely reason. Functionality was not originally supported, and when it was added this 'optimization' was not enabled. > (I haven't a clue on private hugetlb mappings and reservations and > whether anon_vma means the same on hugetlb, but you know all that.) Yes, I believe anon_vma means the same on hugetlb for this purpose. Although, I do need to look closer just to make sure there are not hidden surprises. Thanks again, -- Mike Kravetz ^ permalink raw reply [flat|nested] 3+ messages in thread
end of thread, other threads:[~2022-05-19 3:36 UTC | newest] Thread overview: 3+ messages (download: mbox.gz / follow: Atom feed) -- links below jump to the message on this page -- 2022-05-18 22:36 vma_needs_copy always true for VM_HUGETLB ? Mike Kravetz 2022-05-19 1:30 ` Hugh Dickins 2022-05-19 3:36 ` Mike Kravetz
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox