From mboxrd@z Thu Jan 1 00:00:00 1970
From: Zi Yan
To: Ryan Roberts
Cc: Andrew Morton, David Hildenbrand, Lorenzo Stoakes, "Liam R. Howlett",
 Vlastimil Babka, Mike Rapoport, Suren Baghdasaryan, Michal Hocko,
 Brendan Jackman, Johannes Weiner, Uladzislau Rezki, "Vishal Moola (Oracle)",
 linux-mm@kvack.org, linux-kernel@vger.kernel.org, Jiaqi Yan
Subject: Re: [PATCH v1 1/2] mm/page_alloc: Optimize free_contig_range()
Date: Mon, 12 Jan 2026 13:58:50 -0500
Message-ID: <48E7570E-0C7A-4082-902E-52984A7781EB@nvidia.com>
In-Reply-To: <03ce2f9b-9a77-48fe-89bd-0e1e598ed85f@arm.com>
References: <20260105161741.3952456-1-ryan.roberts@arm.com>
 <20260105161741.3952456-2-ryan.roberts@arm.com>
 <280a3945-ff1e-48a5-a51b-6bb479d23819@arm.com>
 <700A6B2B-9DD1-4F8D-8A38-17FC8BA2F779@nvidia.com>
 <89dfd2d0-cf28-41c7-be9f-b49963e77aa5@arm.com>
 <21E14DB6-DA70-4F7F-8482-12DFED81153D@nvidia.com>
 <03ce2f9b-9a77-48fe-89bd-0e1e598ed85f@arm.com>
Content-Type: text/plain; charset=UTF-8
MIME-Version: 1.0
On 12 Jan 2026, at 13:21, Ryan Roberts wrote:

> On 12/01/2026 15:57, Zi Yan wrote:
>> On 12 Jan 2026, at 8:24, Ryan Roberts wrote:
>>
>>> Hi Zi,
>>>
>>> Sorry for slow response - I had some other high priority stuff come in...
>>>
>>>
>>> On 07/01/2026 03:32, Zi Yan wrote:
>>>> On 5 Jan 2026, at 12:31, Ryan Roberts wrote:
>>>>
>>>>> On 05/01/2026 17:15, Zi Yan wrote:
>>>>>> On 5 Jan 2026, at 11:17, Ryan Roberts wrote:
>>>>>>
>>>>>>> Decompose the range of order-0 pages to be freed into the set of largest
>>>>>>> possible power-of-2 sized and aligned chunks and free them to the pcp or
>>>>>>> buddy. This improves on the previous approach, which freed each order-0
>>>>>>> page individually in a loop. Testing shows performance to be improved by
>>>>>>> more than 10x in some cases.
>>>>>>>
>>>>>>> Since each page is order-0, we must decrement each page's reference
>>>>>>> count individually and only consider the page for freeing as part of a
>>>>>>> high order chunk if the reference count goes to zero. Additionally,
>>>>>>> free_pages_prepare() must be called for each individual order-0 page
>>>>>>> too, so that the struct page state and global accounting state can be
>>>>>>> appropriately managed. But once this is done, the resulting high order
>>>>>>> chunks can be freed as a unit to the pcp or buddy.
>>>>>>>
>>>>>>> This significantly speeds up the free operation but also has the side
>>>>>>> benefit that high order blocks are added to the pcp instead of each page
>>>>>>> ending up on the pcp order-0 list; memory remains more readily available
>>>>>>> in high orders.
>>>>>>>
>>>>>>> vmalloc will shortly become a user of this new optimized
>>>>>>> free_contig_range() since it aggressively allocates high order
>>>>>>> non-compound pages, but then calls split_page() to end up with
>>>>>>> contiguous order-0 pages. These can now be freed much more efficiently.
>>>>>>>
>>>>>>> The execution time of the following function was measured in a VM on an
>>>>>>> Apple M2 system:
>>>>>>>
>>>>>>> static int page_alloc_high_ordr_test(void)
>>>>>>> {
>>>>>>> 	unsigned int order = HPAGE_PMD_ORDER;
>>>>>>> 	struct page *page;
>>>>>>> 	int i;
>>>>>>>
>>>>>>> 	for (i = 0; i < 100000; i++) {
>>>>>>> 		page = alloc_pages(GFP_KERNEL, order);
>>>>>>> 		if (!page)
>>>>>>> 			return -1;
>>>>>>> 		split_page(page, order);
>>>>>>> 		free_contig_range(page_to_pfn(page), 1UL << order);
>>>>>>> 	}
>>>>>>>
>>>>>>> 	return 0;
>>>>>>> }
>>>>>>>
>>>>>>> Execution time before: 1684366 usec
>>>>>>> Execution time after:   136216 usec
>>>>>>>
>>>>>>> Perf trace before:
>>>>>>>
>>>>>>>  60.93%  0.00%  kthreadd  [kernel.kallsyms]  [k] ret_from_fork
>>>>>>>          |
>>>>>>>          ---ret_from_fork
>>>>>>>             kthread
>>>>>>>             0xffffbba283e63980
>>>>>>>             |
>>>>>>>             |--60.01%--0xffffbba283e636dc
>>>>>>>             |          |
>>>>>>>             |          |--58.57%--free_contig_range
>>>>>>>             |          |          |
>>>>>>>             |          |          |--57.19%--___free_pages
>>>>>>>             |          |          |          |
>>>>>>>             |          |          |          |--46.65%--__free_frozen_pages
>>>>>>>             |          |          |          |          |
>>>>>>>             |          |          |          |          |--28.08%--free_pcppages_bulk
>>>>>>>             |          |          |          |          |
>>>>>>>             |          |          |          |          --12.05%--free_frozen_page_commit.constprop.0
>>>>>>>             |          |          |          |
>>>>>>>             |          |          |          |--5.10%--__get_pfnblock_flags_mask.isra.0
>>>>>>>             |          |          |          |
>>>>>>>             |          |          |          |--1.13%--_raw_spin_unlock
>>>>>>>             |          |          |          |
>>>>>>>             |          |          |          |--0.78%--free_frozen_page_commit.constprop.0
>>>>>>>             |          |          |          |
>>>>>>>             |          |          |          --0.75%--_raw_spin_trylock
>>>>>>>             |          |          |
>>>>>>>             |          |          --0.95%--__free_frozen_pages
>>>>>>>             |          |
>>>>>>>             |          --1.44%--___free_pages
>>>>>>>             |
>>>>>>>             --0.78%--0xffffbba283e636c0
>>>>>>>                       split_page
>>>>>>>
>>>>>>> Perf trace after:
>>>>>>>
>>>>>>>  10.62%  0.00%  kthreadd  [kernel.kallsyms]  [k] ret_from_fork
>>>>>>>          |
>>>>>>>          ---ret_from_fork
>>>>>>>             kthread
>>>>>>>             0xffffbbd55ef74980
>>>>>>>             |
>>>>>>>             |--8.74%--0xffffbbd55ef746dc
>>>>>>>             |         free_contig_range
>>>>>>>             |         |
>>>>>>>             |         --8.72%--__free_contig_range
>>>>>>>             |
>>>>>>>             --1.56%--0xffffbbd55ef746c0
>>>>>>>                       |
>>>>>>>                       --1.54%--split_page
>>>>>>>
>>>>>>> Signed-off-by: Ryan Roberts
>>>>>>> ---
>>>>>>>  include/linux/gfp.h |   1 +
>>>>>>>  mm/page_alloc.c     | 116 +++++++++++++++++++++++++++++++++++++++-----
>>>>>>>  2 files changed, 106 insertions(+), 11 deletions(-)
>>>>>>>
>>>>>>> diff --git a/include/linux/gfp.h b/include/linux/gfp.h
>>>>>>> index b155929af5b1..3ed0bef34d0c 100644
>>>>>>> --- a/include/linux/gfp.h
>>>>>>> +++ b/include/linux/gfp.h
>>>>>>> @@ -439,6 +439,7 @@ extern struct page *alloc_contig_pages_noprof(unsigned long nr_pages, gfp_t gfp_
>>>>>>>  #define alloc_contig_pages(...)	alloc_hooks(alloc_contig_pages_noprof(__VA_ARGS__))
>>>>>>>
>>>>>>>  #endif
>>>>>>> +unsigned long __free_contig_range(unsigned long pfn, unsigned long nr_pages);
>>>>>>>  void free_contig_range(unsigned long pfn, unsigned long nr_pages);
>>>>>>>
>>>>>>>  #ifdef CONFIG_CONTIG_ALLOC
>>>>>>> diff --git a/mm/page_alloc.c b/mm/page_alloc.c
>>>>>>> index a045d728ae0f..1015c8edf8a4 100644
>>>>>>> --- a/mm/page_alloc.c
>>>>>>> +++ b/mm/page_alloc.c
>>>>>>> @@ -91,6 +91,9 @@ typedef int __bitwise fpi_t;
>>>>>>>  /* Free the page without taking locks. Rely on trylock only. */
>>>>>>>  #define FPI_TRYLOCK		((__force fpi_t)BIT(2))
>>>>>>>
>>>>>>> +/* free_pages_prepare() has already been called for page(s) being freed. */
>>>>>>> +#define FPI_PREPARED		((__force fpi_t)BIT(3))
>>>>>>> +
>>>>>>>  /* prevent >1 _updater_ of zone percpu pageset ->high and ->batch fields */
>>>>>>>  static DEFINE_MUTEX(pcp_batch_high_lock);
>>>>>>>  #define MIN_PERCPU_PAGELIST_HIGH_FRACTION (8)
>>>>>>> @@ -1582,8 +1585,12 @@ static void __free_pages_ok(struct page *page, unsigned int order,
>>>>>>>  	unsigned long pfn = page_to_pfn(page);
>>>>>>>  	struct zone *zone = page_zone(page);
>>>>>>>
>>>>>>> -	if (free_pages_prepare(page, order))
>>>>>>> -		free_one_page(zone, page, pfn, order, fpi_flags);
>>>>>>> +	if (!(fpi_flags & FPI_PREPARED)) {
>>>>>>> +		if (!free_pages_prepare(page, order))
>>>>>>> +			return;
>>>>>>> +	}
>>>>>>> +
>>>>>>> +	free_one_page(zone, page, pfn, order, fpi_flags);
>>>>>>>  }
>>>>>>>
>>>>>>>  void __meminit __free_pages_core(struct page *page, unsigned int order,
>>>>>>> @@ -2943,8 +2950,10 @@ static void __free_frozen_pages(struct page *page, unsigned int order,
>>>>>>>  		return;
>>>>>>>  	}
>>>>>>>
>>>>>>> -	if (!free_pages_prepare(page, order))
>>>>>>> -		return;
>>>>>>> +	if (!(fpi_flags & FPI_PREPARED)) {
>>>>>>> +		if (!free_pages_prepare(page, order))
>>>>>>> +			return;
>>>>>>> +	}
>>>>>>>
>>>>>>>  	/*
>>>>>>>  	 * We only track unmovable, reclaimable and movable on pcp lists.
>>>>>>> @@ -7250,9 +7259,99 @@ struct page *alloc_contig_pages_noprof(unsigned long nr_pages, gfp_t gfp_mask,
>>>>>>>  }
>>>>>>>  #endif /* CONFIG_CONTIG_ALLOC */
>>>>>>>
>>>>>>> +static void free_prepared_contig_range(struct page *page,
>>>>>>> +		unsigned long nr_pages)
>>>>>>> +{
>>>>>>> +	while (nr_pages) {
>>>>>>> +		unsigned int fit_order, align_order, order;
>>>>>>> +		unsigned long pfn;
>>>>>>> +
>>>>>>> +		/*
>>>>>>> +		 * Find the largest aligned power-of-2 number of pages that
>>>>>>> +		 * starts at the current page, does not exceed nr_pages and is
>>>>>>> +		 * less than or equal to pageblock_order.
>>>>>>> +		 */
>>>>>>> +		pfn = page_to_pfn(page);
>>>>>>> +		fit_order = ilog2(nr_pages);
>>>>>>> +		align_order = pfn ? __ffs(pfn) : fit_order;
>>>>>>> +		order = min3(fit_order, align_order, pageblock_order);
>>>>>>> +
>>>>>>> +		/*
>>>>>>> +		 * Free the chunk as a single block. Our caller has already
>>>>>>> +		 * called free_pages_prepare() for each order-0 page.
>>>>>>> +		 */
>>>>>>> +		__free_frozen_pages(page, order, FPI_PREPARED);
>>>>>>> +
>>>>>>> +		page += 1UL << order;
>>>>>>> +		nr_pages -= 1UL << order;
>>>>>>> +	}
>>>>>>> +}
>>>>>>> +
>>>>>>> +/**
>>>>>>> + * __free_contig_range - Free contiguous range of order-0 pages.
>>>>>>> + * @pfn: Page frame number of the first page in the range.
>>>>>>> + * @nr_pages: Number of pages to free.
>>>>>>> + *
>>>>>>> + * For each order-0 struct page in the physically contiguous range, put a
>>>>>>> + * reference. Free any page whose reference count falls to zero. The
>>>>>>> + * implementation is functionally equivalent to, but significantly faster
>>>>>>> + * than, calling __free_page() for each struct page in a loop.
>>>>>>> + *
>>>>>>> + * Memory allocated with alloc_pages(order>=1) then subsequently split to
>>>>>>> + * order-0 with split_page() is an example of appropriate contiguous pages
>>>>>>> + * that can be freed with this API.
>>>>>>> + *
>>>>>>> + * Returns the number of pages which were not freed, because their reference
>>>>>>> + * count did not fall to zero.
>>>>>>> + *
>>>>>>> + * Context: May be called in interrupt context or while holding a normal
>>>>>>> + * spinlock, but not in NMI context or while holding a raw spinlock.
>>>>>>> + */
>>>>>>> +unsigned long __free_contig_range(unsigned long pfn, unsigned long nr_pages)
>>>>>>> +{
>>>>>>> +	struct page *page = pfn_to_page(pfn);
>>>>>>> +	unsigned long not_freed = 0;
>>>>>>> +	struct page *start = NULL;
>>>>>>> +	unsigned long i;
>>>>>>> +	bool can_free;
>>>>>>> +
>>>>>>> +	/*
>>>>>>> +	 * Chunk the range into contiguous runs of pages for which the refcount
>>>>>>> +	 * went to zero and for which free_pages_prepare() succeeded. If
>>>>>>> +	 * free_pages_prepare() fails we consider the page to have been freed
>>>>>>> +	 * and deliberately leak it.
>>>>>>> +	 *
>>>>>>> +	 * Code assumes contiguous PFNs have contiguous struct pages, but not
>>>>>>> +	 * vice versa.
>>>>>>> +	 */
>>>>>>> +	for (i = 0; i < nr_pages; i++, page++) {
>>>>>>> +		VM_BUG_ON_PAGE(PageHead(page), page);
>>>>>>> +		VM_BUG_ON_PAGE(PageTail(page), page);
>>>>>>> +
>>>>>>> +		can_free = put_page_testzero(page);
>>>>>>> +		if (!can_free)
>>>>>>> +			not_freed++;
>>>>>>> +		else if (!free_pages_prepare(page, 0))
>>>>>>> +			can_free = false;
>>>>>>
>>>>>> I understand you use free_pages_prepare() here to catch early failures.
>>>>>> I wonder if we could let __free_frozen_pages() handle the failure of
>>>>>> non-compound >0 order pages instead of a new FPI flag.
>>>>>
>>>>> I'm not sure I follow. You would still need to provide a flag to
>>>>> __free_frozen_pages() to tell it "this is a set of order-0 pages". Otherwise it
>>>>> will treat it as a non-compound high order page, which would be wrong;
>>>>> free_pages_prepare() would only be called for the head page (with the order
>>>>> passed in) and that won't do the right thing.
>>>>>
>>>>> I guess you could pass the flag all the way to free_pages_prepare(); then it
>>>>> could be modified to do the right thing for contiguous order-0 pages. That would
>>>>> probably ultimately be more efficient than calling free_pages_prepare() for
>>>>> every order-0 page. Is that what you are suggesting?
>>>>
>>>> Yes.
>>>> I mistakenly mixed up a non-compound high order page and a set of order-0
>>>> pages. There is alloc_pages_bulk() to get a list of order-0 pages, but
>>>> free_pages_bulk() does not exist. Maybe that is what we need here?
>>>
>>> This is what I initially started with; vmalloc maintains a list of struct pages,
>>> so why not just pass that list to something like free_pages_bulk()? The problem
>>> is that the optimization relies on a *contiguous* set of order-0 pages. A list
>>> of pointers to pages does not imply they must be contiguous. So I don't think
>>> it's the right API.
>>>
>>> We already have free_contig_range(), which takes a range of PFNs. That's the
>>> exact semantic we can optimize, so surely that's the better API style? Hence I
>>> added __free_contig_range() and reimplemented free_contig_range() (which does
>>> more than vfree wants) on top.
>>>
>>>> Using __free_frozen_pages() for a set of order-0 pages looks like
>>>> shoehorning.
>>>
>>> Perhaps; that's an internal function, so we could rename it to
>>> __free_frozen_order(), passing in a PFN and an order, and teach it to handle
>>> all 3 cases internally:
>>>
>>> - high-order non-compound page (already supported)
>>> - high-order compound page (already supported)
>>> - power-of-2 sized and aligned number of contiguous order-0 pages (new)
>>>
>>>> I admit that adding free_pages_bulk() with maximal code
>>>> reuse and a good interface will take some effort, so it probably is a long
>>>> term goal. free_pages_bulk() is also slightly different from what you
>>>> want to do, since, if it uses the same interface as alloc_pages_bulk(),
>>>> it will need to accept a page array instead of page + order.
>>>
>>> Yeah; I don't think that's a good API for this case. We would spend more effort
>>> looking for contiguous pages when there are none (for the vfree case it's
>>> highly likely that they are contiguous, because that's how they were allocated).
>>>
>>
>> All of the above makes sense to me.
>>
>>>>
>>>> I am not suggesting you should do this; just thinking out loud.
>>>>
>>>>>
>>>>>>
>>>>>> Looking at free_pages_prepare(), three cases would cause failures:
>>>>>> 1. PageHWPoison(page): the code excludes >0 order pages, so it needs
>>>>>> to be fixed. BTW, Jiaqi Yan has a series trying to tackle it[1].
>>>>>>
>>>>>> 2. uncleared PageNetpp(page): probably need to check every individual
>>>>>> page of this >0 order page and call bad_page() for any violator.
>>>>>>
>>>>>> 3. bad free page: probably need to do it for individual pages as well.
>>>>>
>>>>> It's not just handling the failures, it's accounting; e.g.
>>>>> __memcg_kmem_uncharge_page().
>>>>
>>>> Got it. Another idea comes to mind.
>>>>
>>>> Is it doable to
>>>> 1) use put_page_testzero() to bring all pages' refs to 0,
>>>> 2) unsplit/merge these contiguous order-0 pages back into non-compound
>>>> high order pages, and
>>>> 3) free the unsplit/merged pages with __free_frozen_pages()?
>>>
>>> Yes, I thought about this approach. I think it would work, but I assumed it
>>> would be extra effort to merge the pages only to then free them. It might be in
>>> the noise though. Do you think this approach is better than what I suggest
>>> above? If so, I'll have a go.
>>
>> Yes, the symmetry also makes it easier to follow. Thanks.
>
> Having taken another look at this, I'm not sure it's the correct approach. We
> would need a bunch of extra checks to ensure the order-0 pages can be safely
> merged together; they need the same page owner, tag and memcg metadata. And we
> need to check stuff like poisoning, etc. Effectively this would turn into all
> the same stuff that free_pages_prepare() already does.

OK, it seems like a lot of work. Thank you for looking into it.
>
> So I don't think merging is the way to go; I'd prefer to either do it as I've
> done it in this version of the series (call free_pages_prepare() per order-0
> page, then __free_frozen_pages() for the whole block) or call
> free_pages_prepare() once for the whole block but tell it that it has order-0
> pages and let it iterate over them internally.
>
> Can you be convinced?

Either works. I prefer the latter, but have no strong opinion on it.

Best Regards,
Yan, Zi