Date: Fri, 13 Jun 2025 20:16:57 -0300
From: Jason Gunthorpe
To: Peter Xu
Cc: linux-kernel@vger.kernel.org, linux-mm@kvack.org, kvm@vger.kernel.org, Andrew Morton, Alex Williamson, Zi Yan, Alex Mastro, David Hildenbrand, Nico Pache
Subject: Re: [PATCH 5/5] vfio-pci: Best-effort huge pfnmaps with !MAP_FIXED mappings
Message-ID: <20250613231657.GO1174925@nvidia.com>
References: <20250613134111.469884-1-peterx@redhat.com> <20250613134111.469884-6-peterx@redhat.com> <20250613142903.GL1174925@nvidia.com> <20250613160956.GN1174925@nvidia.com>

On Fri, Jun 13, 2025 at 03:15:19PM -0400, Peter Xu wrote:
> > > > > +	if (phys_len >= PMD_SIZE) {
> > > > > +		ret = mm_get_unmapped_area_aligned(file, addr, len, phys_addr,
> > > > > +						   flags, PMD_SIZE, 0);
> > > > > +		if (ret)
> > > > > +			return ret;
> > > > > +	}
> > > >
> > > > Hurm, we have contiguous pages now, so PMD_SIZE is not so great, eg on
> > > > 4k ARM with we can have a 16*2M=32MB contiguity, and 16k ARM uses
> > > > contiguity to get a 32*16k=1GB option.
> > > >
> > > > Forcing to only align to the PMD or PUD seems suboptimal..
> > >
> > > Right, however the cont-pte / cont-pmd are still not supported in huge
> > > pfnmaps in general?  It'll definitely be nice if someone could look at that
> > > from ARM perspective, then provide support of both in one shot.
> >
> > Maybe leave behind a comment about this. I've been poking around if
> > somone would do the ARM PFNMAP support but can't report any commitment.
>
> I didn't know what's the best part to take a note for the whole pfnmap
> effort, but I added a note into the commit message on this patch:
>
>         Note 2: Currently continuous pgtable entries (for example, cont-pte) is not
>         yet supported for huge pfnmaps in general.  It also is not considered in
>         this patch so far.  Separate work will be needed to enable continuous
>         pgtable entries on archs that support it.
>
> >
> > > > > +fallback:
> > > > > +	return mm_get_unmapped_area(current->mm, file, addr, len, pgoff, flags);
> > > >
> > > > Why not put this into mm_get_unmapped_area_vmflags() and get rid of
> > > > thp_get_unmapped_area_vmflags() too?
> > > >
> > > > Is there any reason the caller should have to do a retry?
> > >
> > > We would still need thp_get_unmapped_area_vmflags() because that encodes
> > > PMD_SIZE for THPs; we need the flexibility of providing any size alignment
> > > as a generic helper.
> >
> > There is only one caller for thp_get_unmapped_area_vmflags(), just
> > open code PMD_SIZE there and thin this whole thing out. It reads
> > better like that anyhow:
> >
> > 	} else if (IS_ENABLED(CONFIG_TRANSPARENT_HUGEPAGE) && !file
> > 		   && !addr /* no hint */
> > 		   && IS_ALIGNED(len, PMD_SIZE)) {
> > 		/* Ensures that larger anonymous mappings are THP aligned. */
> > 		addr = mm_get_unmapped_area_aligned(file, 0, len, pgoff,
> > 						    flags, vm_flags, PMD_SIZE);
> >
> > > That was ok, however that loses some flexibility when the caller wants to
> > > try with different alignments, exactly like above: currently, it was trying
> > > to do a first attempt of PUD mapping then fallback to PMD if that fails.
> >
> > Oh, that's a good point, I didn't notice that subtle bit.
> >
> > But then maybe that is showing the API is just wrong and the core code
> > should be trying to find the best alignment not the caller. Like we
> > can have those PUD/PMD size ifdefs inside the mm instead of in VFIO?
> >
> > VFIO would just pass the BAR size, implying the best alignment, and
> > the core implementation will try to get the largest VMA alignment that
> > snaps to an arch supported page contiguity, testing each of the arches
> > page size possibilities in turn.
> >
> > That sounds like a much better API than pushing this into drivers??
>
> Yes it would be nice if the core mm can evolve to make supporting such
> easier.  Though the question is how to pass information over to core mm.

I was just thinking of something simple: change how your new
mm_get_unmapped_area_aligned() works so that the caller is expected to
pass in the size of the biggest folio/pfn page as align.

mm_get_unmapped_area_aligned() then returns a VM address that will
result in large mappings.

pgoff works the same way: the assumption is that the biggest folio is at
pgoff 0 and is followed by another biggest folio, so the pgoff logic
tries to make the second folio map fully, i.e. what a hugetlb fd or THP
memfd would like.

Then you still hook the file operations and still figure out which BAR
and so on, to call mm_get_unmapped_area_aligned() with the correct
align parameter.

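To make the idea concrete, here is a rough userspace sketch, not kernel
code and not the patch's implementation. It assumes a hypothetical
arm64-like arch with 4K pages whose mapping sizes are 64K (cont-PTE),
2M (PMD), 32M (cont-PMD) and 1G (PUD). The names sketch_map_sizes,
sketch_best_align() and sketch_align_addr() are made up for
illustration. The first helper picks the largest supported size that
fits both the caller's biggest folio and the mapping length; the second
picks the lowest address at or above a base that keeps the virtual
address and the file offset congruent modulo that size.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define SKETCH_PAGE_SIZE 4096ULL

/* Hypothetical per-arch table of mapping sizes, largest first. */
static const uint64_t sketch_map_sizes[] = {
	1ULL << 30,	/* PUD */
	32ULL << 20,	/* cont-PMD: 16 * 2M */
	2ULL << 20,	/* PMD */
	64ULL << 10,	/* cont-PTE: 16 * 4K */
};

/*
 * Return the largest supported mapping size that does not exceed either
 * the caller's biggest folio (max_align) or the mapping length (len).
 * Falls back to the base page size when nothing larger fits.
 */
static uint64_t sketch_best_align(uint64_t len, uint64_t max_align)
{
	size_t n = sizeof(sketch_map_sizes) / sizeof(sketch_map_sizes[0]);

	for (size_t i = 0; i < n; i++) {
		uint64_t sz = sketch_map_sizes[i];

		if (sz <= max_align && sz <= len)
			return sz;
	}
	return SKETCH_PAGE_SIZE;
}

/*
 * Return the lowest address >= base such that
 * (addr - file_off) is a multiple of align, so that an align-sized
 * folio at an aligned file offset can be mapped by one large entry.
 */
static uint64_t sketch_align_addr(uint64_t base, uint64_t file_off,
				  uint64_t align)
{
	/* align must be a non-zero power of two */
	assert(align && (align & (align - 1)) == 0);

	uint64_t want = file_off & (align - 1);
	uint64_t addr = (base & ~(align - 1)) + want;

	if (addr < base)
		addr += align;
	return addr;
}
```

With this shape a driver would only report the BAR size as max_align,
and the size table would live on the mm side. A real implementation
would also have to search for a free range with padding, which this
sketch leaves out.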
mm_get_unmapped_area_aligned() goes through the supported page sizes of
the arch and selects the best one for the indicated biggest folio.

If we were happy writing this in vfio then it can work just as well in
the core mm side.

> It's similar to many other use cases of get_unmapped_area() users.  For
> example, see v4l2_m2m_get_unmapped_area() which has similar treatment on at
> least knowing which part of the file was being mapped:
>
>         if (offset < DST_QUEUE_OFF_BASE) {
>                 vq = v4l2_m2m_get_src_vq(fh->m2m_ctx);
>         } else {
>                 vq = v4l2_m2m_get_dst_vq(fh->m2m_ctx);
>                 pgoff -= (DST_QUEUE_OFF_BASE >> PAGE_SHIFT);
>         }

Careful, that's only used for nommu :)

Jason