From: Nico Pache
Date: Fri, 10 Jan 2025 12:37:58 -0700
Subject: Re: [RFC 03/11] khugepaged: Don't allocate khugepaged mm_slot early
To: Dev Jain
Cc: linux-kernel@vger.kernel.org, linux-mm@kvack.org, ryan.roberts@arm.com,
 anshuman.khandual@arm.com, catalin.marinas@arm.com, cl@gentwo.org,
 vbabka@suse.cz, mhocko@suse.com, apopple@nvidia.com,
 dave.hansen@linux.intel.com, will@kernel.org, baohua@kernel.org,
 jack@suse.cz, srivatsa@csail.mit.edu, haowenchao22@gmail.com,
 hughd@google.com, aneesh.kumar@kernel.org, yang@os.amperecomputing.com,
 peterx@redhat.com, ioworker0@gmail.com, wangkefeng.wang@huawei.com,
 ziy@nvidia.com, jglisse@google.com, surenb@google.com,
 vishal.moola@gmail.com, zokeefe@google.com, zhengqi.arch@bytedance.com,
 jhubbard@nvidia.com, 21cnbao@gmail.com, willy@infradead.org,
 kirill.shutemov@linux.intel.com, david@redhat.com, aarcange@redhat.com,
 raquini@redhat.com, sunnanyong@huawei.com, usamaarif642@gmail.com,
 audra@redhat.com, akpm@linux-foundation.org
References: <20250108233128.14484-1-npache@redhat.com>
 <20250108233128.14484-4-npache@redhat.com>
 <3a1af9a6-451d-46ec-804f-cdb5d3d21f41@arm.com>
In-Reply-To: <3a1af9a6-451d-46ec-804f-cdb5d3d21f41@arm.com>

On Thu, Jan 9, 2025 at 11:11 PM Dev Jain wrote:
>
> On 09/01/25 5:01 am, Nico Pache wrote:
> > We should only "enter"/allocate the khugepaged mm_slot if we succeed at
> > allocating the PMD sized folio. Move the khugepaged_enter_vma call until
> > after we know the vma_alloc_folio was successful.
>
> Why?
> We have the appropriate checks from thp_vma_allowable_orders() and
> friends, so the VMA should be registered with khugepaged irrespective of
> whether during fault time we are able to allocate a PMD-THP or not. If
> we fail at fault time, it is the job of khugepaged to try to collapse it
> later.

That's a fair point. This was written a while back when I first
started looking into khugepaged. I believe the current scheme for
khugepaged_enter_vma() is to register only when there is a mapping large
enough for khugepaged to work on. I'd like to remove this restriction
in the future to simplify the entry points of khugepaged. Currently we
need to add these khugepaged_enter_vma() calls all over the place;
ideally we would just register everything with khugepaged.

Either way, you are correct: even if we FALLBACK, the mapping would
still be eligible for promotion in the future. I'll drop this patch.

Thanks!

> >
> > Signed-off-by: Nico Pache
> > ---
> >  mm/huge_memory.c | 3 +--
> >  1 file changed, 1 insertion(+), 2 deletions(-)
> >
> > diff --git a/mm/huge_memory.c b/mm/huge_memory.c
> > index e53d83b3e5cf..635c65e7ef63 100644
> > --- a/mm/huge_memory.c
> > +++ b/mm/huge_memory.c
> > @@ -1323,7 +1323,6 @@ vm_fault_t do_huge_pmd_anonymous_page(struct vm_fault *vmf)
> >  	ret = vmf_anon_prepare(vmf);
> >  	if (ret)
> >  		return ret;
> > -	khugepaged_enter_vma(vma, vma->vm_flags);
> >
> >  	if (!(vmf->flags & FAULT_FLAG_WRITE) &&
> >  			!mm_forbids_zeropage(vma->vm_mm) &&
> > @@ -1365,7 +1364,7 @@ vm_fault_t do_huge_pmd_anonymous_page(struct vm_fault *vmf)
> >  		}
> >  		return ret;
> >  	}
> > -
> > +	khugepaged_enter_vma(vma, vma->vm_flags);
> >  	return __do_huge_pmd_anonymous_page(vmf);
> >  }
> >
>
> In any case, you are not achieving what you described in the patch
> description: you have moved khugepaged_enter_vma() after the read fault
> logic, what you want to do is to move it after
> vma_alloc_anon_folio_pmd() in __do_huge_pmd_anonymous_page().

Good catch!
This was a byproduct of a rebase. Back when I wrote this, the
vma_alloc_folio() call was in the do_huge_pmd_anonymous_page() function.
>
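
For anyone following along, the ordering question can be sketched with a
small userspace toy model. This is not kernel code: struct toy_vma,
toy_enter(), toy_fault() and toy_fault_then_scan() are hypothetical names
made up for this illustration, and the model assumes the fault path is the
only place a VMA gets registered, which is not true of the real kernel.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-in for a VMA; NOT the kernel's struct vm_area_struct. */
struct toy_vma {
	bool registered;	/* models "khugepaged has an mm_slot for this" */
	bool huge_mapped;	/* models "a PMD-sized folio is mapped" */
};

/* Hypothetical stand-in for khugepaged_enter_vma(). */
static void toy_enter(struct toy_vma *vma)
{
	assert(vma != NULL);
	vma->registered = true;
}

/*
 * Models the anonymous huge-page fault path.
 * register_early == true:  register before attempting the allocation.
 * register_early == false: register only once the allocation succeeded.
 */
static void toy_fault(struct toy_vma *vma, bool register_early, bool alloc_ok)
{
	assert(vma != NULL);
	if (register_early)
		toy_enter(vma);
	if (!alloc_ok)
		return;		/* fall back to small pages */
	if (!register_early)
		toy_enter(vma);
	vma->huge_mapped = true;
}

/* Models a later khugepaged pass: it only looks at registered VMAs. */
static bool toy_scan_collapses(struct toy_vma *vma)
{
	assert(vma != NULL);
	if (!vma->registered || vma->huge_mapped)
		return false;
	vma->huge_mapped = true;
	return true;
}

/* Fault with a failed huge allocation, then report whether a scan collapses. */
static bool toy_fault_then_scan(bool register_early)
{
	struct toy_vma vma = { .registered = false, .huge_mapped = false };

	toy_fault(&vma, register_early, false);
	return toy_scan_collapses(&vma);
}
```

In this model toy_fault_then_scan(true) returns true and
toy_fault_then_scan(false) returns false: when the allocation fails, only
the early-registration ordering leaves the VMA visible to a later collapse
pass, which is the argument made in the review above.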