Date: Wed, 18 May 2022 18:30:03 -0700 (PDT)
From: Hugh Dickins
To: Mike Kravetz
cc: linux-mm@kvack.org, linux-kernel, Michal Hocko, Oscar Salvador, David Hildenbrand, Naoya Horiguchi, Hugh Dickins, Peter Xu, Nick Piggin, Andi Kleen
Subject: Re: vma_needs_copy always true for VM_HUGETLB ?
Message-ID: <872b743d-ac21-59a3-bd31-109229f63112@google.com>

On Wed, 18 May 2022, Mike Kravetz wrote:

> For most non-anonymous vmas, we do not copy page tables at fork time, but
> rather lazily populate the tables after fork via faults.  The routine
> vma_needs_copy() is used to make this decision.  For VM_HUGETLB vmas, it always
> returns true.

"vma_needs_copy()" is *very* recent coinage, not reached Linus yet.

>
> Anyone know/remember why?  The code was added more than 15 years ago and
> my search for why hugetlb vmas were excluded came up empty.
>
> I do not see a reason why VM_HUGETLB is in this list.  Initial testing did
> not reveal any problems when I removed the VM_HUGETLB check.
>
> FYI - I am looking at the performance of fork and exec (unmap) of processes
> with very large hugetlb mappings.  Skipping the copy at fork time would
> certainly speed things up.  Of course, there could some users who would
> notice if hugetlb page tables are not copied at fork time.  However, this
> is the behavior for 'normal' mappings.  I am inclined to make hugetlb be
> 'more normal'.

Good question, not obvious to me either: but I've found the answer.

The commit was of course Nick's d992895ba2b2 ("[PATCH] Lazy page table
copies in fork()") in 2.6.14; but it doesn't explain why VM_HUGETLB is
there in the test, and goes on to be copied.

I haven't re-read through the whole mail thread which led to that commit,
but I think you'll find the crucial observation comes from Andi in
https://lore.kernel.org/lkml/200508251756.07849.ak@suse.de/#t

"Actually I disabled it for hugetlbfs (... !is_huge...vma). The reason
 is that lazy faulting for huge pages is still not in mainline."

and indeed, look at the 2.6.13 or 2.6.14 mm/hugetlb.c and you find

/*
 * We cannot handle pagefaults against hugetlb pages at all.  They cause
 * handle_mm_fault() to try to instantiate regular-sized pages in the
 * hugegpage VMA.  do_page_fault() is supposed to trap this, so BUG is we get
 * this far.
 */
static struct page *hugetlb_nopage(struct vm_area_struct *vma,
				unsigned long address, int *unused)
{
	BUG();
	return NULL;
}

Oh, and that pretty much still exists to this day, to cover that path
to a fault; but 2.6.16 implemented hugetlb_no_page(), which is what
then actually got used to satisfy a hugetlb fault.

So the reason for fork copying VM_HUGETLB appears to have gone away
in 2.6.16.

(I haven't a clue on private hugetlb mappings and reservations and
whether anon_vma means the same on hugetlb, but you know all that.)

Hugh
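
For readers following the thread, here is a minimal userspace sketch of the
fork-time decision being discussed.  It is not the kernel's code: struct
toy_vma, the TOY_VM_* flag values and the hugetlb_can_fault parameter are
all made up for illustration, and TOY_VM_SPECIAL merely stands in for the
other flags in "this list".  The point it tries to show is only that, once
hugetlb faults can be serviced lazily, the hugetlb test stops being a reason
to copy page tables eagerly.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical flag bits for this sketch; not the kernel's real values. */
#define TOY_VM_HUGETLB	(1u << 0)
#define TOY_VM_SPECIAL	(1u << 1)	/* stands in for the rest of the list */

/* Toy stand-in for struct vm_area_struct: only what the decision reads. */
struct toy_vma {
	unsigned int flags;
	const void *anon_vma;	/* non-NULL once the vma has anonymous pages */
};

/*
 * Return true when fork must copy this vma's page tables up front,
 * false when the child can repopulate them lazily through faults.
 *
 * hugetlb_can_fault is an assumption of this sketch: false models a
 * kernel whose hugetlb fault path is just BUG(), true models one where
 * hugetlb faults are satisfied on demand.
 */
static bool toy_vma_needs_copy(const struct toy_vma *vma,
			       bool hugetlb_can_fault)
{
	/* No working fault path means lazy population is impossible. */
	if ((vma->flags & TOY_VM_HUGETLB) && !hugetlb_can_fault)
		return true;

	/* Other special mappings are always copied in this model. */
	if (vma->flags & TOY_VM_SPECIAL)
		return true;

	/* Anonymous pages cannot be recreated by faulting from a file. */
	if (vma->anon_vma != NULL)
		return true;

	return false;
}
```

In this model a hugetlb vma with no anon_vma is copied when
hugetlb_can_fault is false and skipped when it is true.  A hugetlb vma
with anon_vma set is still copied either way; whether that is the right
treatment for private hugetlb mappings and reservations is exactly the
open question raised in the closing parenthesis of the mail.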