Date: Fri, 22 Mar 2024 10:30:12 -0300
From: Jason Gunthorpe <jgg@nvidia.com>
To: peterx@redhat.com
Cc: linux-mm@kvack.org, linux-kernel@vger.kernel.org,
	linuxppc-dev@lists.ozlabs.org, Michael Ellerman, Christophe Leroy,
	Matthew Wilcox, Rik van Riel, Lorenzo Stoakes, Axel Rasmussen,
	Yang Shi, John Hubbard, linux-arm-kernel@lists.infradead.org,
	"Kirill A . Shutemov", Andrew Jones, Vlastimil Babka, Mike Rapoport,
	Andrew Morton, Muchun Song, Christoph Hellwig,
	linux-riscv@lists.infradead.org, James Houghton, David Hildenbrand,
	Andrea Arcangeli, "Aneesh Kumar K . V", Mike Kravetz
Subject: Re: [PATCH v3 12/12] mm/gup: Handle hugetlb in the generic follow_page_mask code
Message-ID: <20240322133012.GI159172@nvidia.com>
References: <20240321220802.679544-1-peterx@redhat.com> <20240321220802.679544-13-peterx@redhat.com>
In-Reply-To: <20240321220802.679544-13-peterx@redhat.com>

On Thu, Mar 21, 2024 at 06:08:02PM -0400, peterx@redhat.com wrote:
> A quick performance test on an aarch64 VM on M1 chip shows 15% degrade over
> a tight loop of slow gup after the path switched.  That shouldn't be a
> problem because slow-gup should not be a hot path for GUP in general: when
> page is commonly present, fast-gup will already succeed, while when the
> page is indeed missing and require a follow up page fault, the slow gup
> degrade will probably buried in the fault paths anyway.  It also explains
> why slow gup for THP used to be very slow before 57edfcfd3419 ("mm/gup:
> accelerate thp gup even for "pages != NULL"") lands, the latter not part of
> a performance analysis but a side benefit.  If the performance will be a
> concern, we can consider handle CONT_PTE in follow_page().

I think this is probably fine for the moment, at least for this
series, as CONT_PTE is still very new.

But it will need to be optimized. "Slow" GUP is the only GUP that is
used by FOLL_LONGTERM, and it still needs to be optimized because you
can't assume a FOLL_LONGTERM user will be hitting the really slow
fault path. There are enough important cases where it is just reading
already populated page tables, and these days, often with large folios.

Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>

Jason