Date: Tue, 9 Apr 2024 20:43:55 -0300
From: Jason Gunthorpe
To: Peter Xu
Cc: linux-mm@kvack.org, linux-kernel@vger.kernel.org,
	linuxppc-dev@lists.ozlabs.org, Michael Ellerman, Christophe Leroy,
	Matthew Wilcox, Rik van Riel, Lorenzo Stoakes, Axel Rasmussen,
	Yang Shi, John Hubbard, linux-arm-kernel@lists.infradead.org,
	"Kirill A . Shutemov", Andrew Jones, Vlastimil Babka, Mike Rapoport,
	Andrew Morton, Muchun Song, Christoph Hellwig,
	linux-riscv@lists.infradead.org, James Houghton, David Hildenbrand,
	Andrea Arcangeli, "Aneesh Kumar K . V", Mike Kravetz
Subject: Re: [PATCH v3 00/12] mm/gup: Unify hugetlb, part 2
Message-ID: <20240409234355.GJ5383@nvidia.com>
References: <20240321220802.679544-1-peterx@redhat.com>
	<20240322161000.GJ159172@nvidia.com>
	<20240326140252.GH6245@nvidia.com>
	<20240405181633.GH5383@nvidia.com>

On Fri, Apr 05,
2024 at 05:42:44PM -0400, Peter Xu wrote:

> In short, hugetlb mappings shouldn't be special comparing to other huge pXd
> and large folio (cont-pXd) mappings for most of the walkers in my mind, if
> not all. I need to look at all the walkers and there can be some tricky
> ones, but I believe that applies in general. It's actually similar to what
> I did with slow gup here.

I think that is the big question; I also haven't done the research to
know the answer.

At this point focusing on moving what is reasonable to the pXX_* API
makes sense to me. Then we can review what remains and make a
decision.

> Like this series, for cont-pXd we'll need multiple walks comparing to
> before (when with hugetlb_entry()), but for that part I'll provide some
> performance tests too, and we also have a fallback plan, which is to detect
> cont-pXd existence, which will also work for large folios.

I think we can optimize this pretty easily.

> > I think if you do the easy places for pXX conversion you will have a
> > good idea about what is needed for the hard places.
>
> Here IMHO we don't need to understand "what is the size of this hugetlb
> vma"

Yeah, I never really understood why hugetlb was linked to the VMA. The
page table is self describing, obviously.

> or "which level of pgtable does this hugetlb vma pages locate",

Ditto

> because we may not need that, e.g., when we only want to collect some smaps
> statistics. "whether it's hugetlb" may matter, though. E.g. in the mm
> walker we see a huge pmd, it can be a thp, it can be a hugetlb (when
> hugetlb_entry removed), we may need extra check later to put things into
> the right bucket, but for the walker itself it doesn't necessarily need
> hugetlb_entry().

Right, places may still need to know it is part of a huge VMA because
we have special stuff linked to that.

> > But then again we come back to power and its big list of page sizes
> > and variety :( Looks like some there have huge sizes at the pgd level
> > at least.
>
> Yeah this is something I want to be super clear, because I may miss
> something: we don't have real pgd pages, right? Powerpc doesn't even
> define p4d_leaf(), AFAICT.

AFAICT it is because it hides it all in hugepd. If the goal is to
purge hugepd then some of the options might turn out to convert hugepd
into huge p4d/pgd, as I understand it. It would be nice to have
certainty on this at least.

We have effectively three APIs to parse a single page table, and
currently none of them can return 100% of the data for power.

Jason
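
To make the "page table is self describing" remark concrete, here is a
minimal userspace sketch. It is a toy model and not kernel code: the
struct layout, the three-level/4-entry geometry, and every name
(toy_mapping_size(), toy_demo_table(), LVL_*) are assumptions made up for
illustration. The only point is that a walker can learn the mapping size
from the level at which it finds a leaf entry, without asking the VMA.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Toy model only: all names and sizes below are hypothetical. */
enum { LVL_PTE = 0, LVL_PMD = 1, LVL_PUD = 2 };

#define TOY_PAGE_SHIFT 12	/* 4 KiB base pages */
#define TOY_LEVEL_BITS 2	/* 4 entries per table */
#define TOY_ENTRIES    (1 << TOY_LEVEL_BITS)

struct table;

struct entry {
	int present;
	int leaf;		/* maps memory directly at this level */
	struct table *next;	/* lower table when present && !leaf */
};

struct table {
	struct entry e[TOY_ENTRIES];
};

/* Address bits covered by one entry at the given level. */
static unsigned int toy_level_shift(int level)
{
	return TOY_PAGE_SHIFT + TOY_LEVEL_BITS * level;
}

/*
 * Walk down from top_level. The size of the mapping is implied by the
 * level where the leaf is found. Returns 0 if addr is not mapped.
 */
static uint64_t toy_mapping_size(const struct table *top, int top_level,
				 uint64_t addr)
{
	const struct table *t = top;
	int level;

	assert(top_level >= LVL_PTE && top_level <= LVL_PUD);

	for (level = top_level; level >= LVL_PTE; level--) {
		unsigned int shift = toy_level_shift(level);
		const struct entry *e =
			&t->e[(addr >> shift) & (TOY_ENTRIES - 1)];

		if (!e->present)
			return 0;
		/* Present entries at the lowest level are always leaves. */
		if (e->leaf || level == LVL_PTE)
			return 1ULL << shift;
		t = e->next;
	}
	return 0;
}

/* Demo tree: one huge leaf at the PMD level and one ordinary PTE. */
static const struct table *toy_demo_table(void)
{
	static struct table pud, pmd, pte;

	pud.e[0] = (struct entry){ .present = 1, .leaf = 0, .next = &pmd };
	pmd.e[0] = (struct entry){ .present = 1, .leaf = 1, .next = NULL };
	pmd.e[1] = (struct entry){ .present = 1, .leaf = 0, .next = &pte };
	pte.e[2] = (struct entry){ .present = 1, .leaf = 1, .next = NULL };
	return &pud;
}
```

With the demo tree, toy_mapping_size(toy_demo_table(), LVL_PUD, 0x3000)
hits the PMD-level leaf and reports 16 KiB (the toy PMD size), while
address 0x6000 resolves through the PTE table and reports 4 KiB. Whether
a huge entry is a thp or hugetlb is a separate question that this sketch
does not model.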