From mboxrd@z Thu Jan 1 00:00:00 1970
From: Zi Yan
To: Usama Arif
Cc: David Hildenbrand, Andrew Morton, linux-mm@kvack.org,
 hannes@cmpxchg.org, shakeel.butt@linux.dev, riel@surriel.com,
 baolin.wang@linux.alibaba.com, lorenzo.stoakes@oracle.com,
 Liam.Howlett@oracle.com, npache@redhat.com, ryan.roberts@arm.com,
 dev.jain@arm.com, hughd@google.com, linux-kernel@vger.kernel.org,
 linux-doc@vger.kernel.org, kernel-team@meta.com, Matthew Wilcox
Subject: Re: [RFC] mm: khugepaged: use largest enabled hugepage order for min_free_kbytes
Date: Mon, 09 Jun 2025 09:28:06 -0400
Message-ID: <76D057AA-58C1-46A0-B067-EB78FE5D2D37@nvidia.com>
In-Reply-To: <4adf1f8b-781d-4ab0-b82e-49795ad712cb@gmail.com>
References: <20250606143700.3256414-1-usamaarif642@gmail.com>
 <4c1d5033-0c90-4672-84a1-15978ced245d@redhat.com>
 <4adf1f8b-781d-4ab0-b82e-49795ad712cb@gmail.com>

On 9 Jun 2025, at 7:34, Usama Arif wrote:

> On 06/06/2025 18:37, David Hildenbrand wrote:
>> On 06.06.25 16:37, Usama Arif wrote:
>>> On arm64 machines with 64K PAGE_SIZE, the min_free_kbytes and hence the
>>> watermarks are evaluated to extremely high values, e.g. for a server with
>>> 480G of memory, only 2M mTHP hugepage size set to madvise, with the rest
>>> of the sizes set to never, the min, low and high watermarks evaluate to
>>> 11.2G, 14G and 16.8G respectively.
>>> In contrast for 4K PAGE_SIZE of the same machine, with only 2M THP hugepage
>>> size set to madvise, the min, low and high watermarks evaluate to 86M, 566M
>>> and 1G respectively.
>>> This is because set_recommended_min_free_kbytes is designed for PMD
>>> hugepages (pageblock_order = min(HPAGE_PMD_ORDER, PAGE_BLOCK_ORDER)).
>>> Such high watermark values can cause performance and latency issues in
>>> memory bound applications on arm servers that use 64K PAGE_SIZE, even though
>>> most of them would never actually use a 512M PMD THP.
>>>
>>> Instead of using HPAGE_PMD_ORDER for pageblock_order use the highest large
>>> folio order enabled in set_recommended_min_free_kbytes.
>>> With this patch, when only 2M THP hugepage size is set to madvise for the
>>> same machine with 64K page size, with the rest of the sizes set to never,
>>> the min, low and high watermarks evaluate to 2.08G, 2.6G and 3.1G
>>> respectively. When 512M THP hugepage size is set to madvise for the same
>>> machine with 64K page size, the min, low and high watermarks evaluate to
>>> 11.2G, 14G and 16.8G respectively, the same as without this patch.
>>>
>>> An alternative solution would be to change PAGE_BLOCK_ORDER by changing
>>> ARCH_FORCE_MAX_ORDER to a lower value for ARM64_64K_PAGES. However, this
>>> is not dynamic with hugepage size, will need different kernel builds for
>>> different hugepage sizes and most users won't know that this needs to be
>>> done as it can be difficult to determine that the performance and latency
>>> issues are coming from the high watermark values.
>>>
>>> All watermark numbers are for zones of nodes that had the highest number
>>> of pages, i.e. the value for min size for 4K is obtained using:
>>> cat /proc/zoneinfo | grep -i min | awk '{print $2}' | sort -n | tail -n 1 | awk '{print $1 * 4096 / 1024 / 1024}';
>>> and for 64K using:
>>> cat /proc/zoneinfo | grep -i min | awk '{print $2}' | sort -n | tail -n 1 | awk '{print $1 * 65536 / 1024 / 1024}';
>>>
>>> An arbitrary min of 128 pages is used for when no hugepage sizes are
>>> enabled.
>>>
>>> Signed-off-by: Usama Arif
>>> ---
>>>   include/linux/huge_mm.h | 25 +++++++++++++++++++++++++
>>>   mm/khugepaged.c         | 32 ++++++++++++++++++++++++++++----
>>>   mm/shmem.c              | 29 +++++------------------------
>>>   3 files changed, 58 insertions(+), 28 deletions(-)
>>>
>>> diff --git a/include/linux/huge_mm.h b/include/linux/huge_mm.h
>>> index 2f190c90192d..fb4e51ef0acb 100644
>>> --- a/include/linux/huge_mm.h
>>> +++ b/include/linux/huge_mm.h
>>> @@ -170,6 +170,25 @@ static inline void count_mthp_stat(int order, enum mthp_stat_item item)
>>>   }
>>>   #endif
>>>   +/*
>>> + * Definitions for "huge tmpfs": tmpfs mounted with the huge= option
>>> + *
>>> + * SHMEM_HUGE_NEVER:
>>> + *    disables huge pages for the mount;
>>> + * SHMEM_HUGE_ALWAYS:
>>> + *    enables huge pages for the mount;
>>> + * SHMEM_HUGE_WITHIN_SIZE:
>>> + *    only allocate huge pages if the page will be fully within i_size,
>>> + *    also respect madvise() hints;
>>> + * SHMEM_HUGE_ADVISE:
>>> + *    only allocate huge pages if requested with madvise();
>>> + */
>>> +
>>> + #define SHMEM_HUGE_NEVER    0
>>> + #define SHMEM_HUGE_ALWAYS    1
>>> + #define SHMEM_HUGE_WITHIN_SIZE    2
>>> + #define SHMEM_HUGE_ADVISE    3
>>> +
>>>   #ifdef CONFIG_TRANSPARENT_HUGEPAGE
>>>     extern unsigned long transparent_hugepage_flags;
>>> @@ -177,6 +196,12 @@ extern unsigned long huge_anon_orders_always;
>>>   extern unsigned long huge_anon_orders_madvise;
>>>   extern unsigned long huge_anon_orders_inherit;
>>>   +extern int shmem_huge __read_mostly;
>>> +extern unsigned long huge_shmem_orders_always;
>>> +extern unsigned long huge_shmem_orders_madvise;
>>> +extern unsigned long huge_shmem_orders_inherit;
>>> +extern unsigned long huge_shmem_orders_within_size;
>>
>> Do all of these really have to be exported?
>>
>
> Hi David,
>
> Thanks for the review!
>
> For the RFC, I just did it similar to the anon ones when I got the build error
> trying to use these, but yeah a much better approach would be to just have a
> function in shmem that would return the largest shmem thp allowable order.
>
>>> +
>>>   static inline bool hugepage_global_enabled(void)
>>>   {
>>>       return transparent_hugepage_flags &
>>> diff --git a/mm/khugepaged.c b/mm/khugepaged.c
>>> index 15203ea7d007..e64cba74eb2a 100644
>>> --- a/mm/khugepaged.c
>>> +++ b/mm/khugepaged.c
>>> @@ -2607,6 +2607,26 @@ static int khugepaged(void *none)
>>>       return 0;
>>>   }
>>>   +static int thp_highest_allowable_order(void)
>>
>> Did you mean "largest"?
>
> Yes
>
>>
>>> +{
>>> +    unsigned long orders = READ_ONCE(huge_anon_orders_always)
>>> +                   | READ_ONCE(huge_anon_orders_madvise)
>>> +                   | READ_ONCE(huge_shmem_orders_always)
>>> +                   | READ_ONCE(huge_shmem_orders_madvise)
>>> +                   | READ_ONCE(huge_shmem_orders_within_size);
>>> +    if (hugepage_global_enabled())
>>> +        orders |= READ_ONCE(huge_anon_orders_inherit);
>>> +    if (shmem_huge != SHMEM_HUGE_NEVER)
>>> +        orders |= READ_ONCE(huge_shmem_orders_inherit);
>>> +
>>> +    return orders == 0 ? 0 : fls(orders) - 1;
>>> +}
>>
>> But how does this interact with large folios / THPs in the page cache?
>>
>
> Yes this will be a problem.
>
> From what I see, there doesn't seem to be a max order for pagecache, only
> mapping_set_folio_min_order for the min.

Actually, there is one [1]. It is currently constrained by
xas_split_alloc(), and that limit can be lifted once xas_split_alloc()
is gone (which implies READ_ONLY_THP_FOR_FS needs to go).

[1] https://elixir.bootlin.com/linux/v6.15.1/source/include/linux/pagemap.h#L377

> Does this mean that pagecache can fault in 128M, 256M, 512M large folios?
>
> I think this could increase the OOM rate significantly when ARM64 servers
> are used with filesystems that support large folios.
>
> Should there be an upper limit for pagecache? If so, it would either be a new
> sysfs entry (which I don't like :( ) or just try and reuse the existing entries
> with something like thp_highest_allowable_order?

In theory, MAX_PAGECACHE_ORDER currently limits the maximum folio size,
and IIRC the readahead code only reads folios up to PMD level.

--
Best Regards,
Yan, Zi
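
For readers following the order arithmetic in the quoted patch, here is a minimal userspace sketch, not the kernel code. It assumes the per-order bitmask convention used by the quoted huge_*_orders_* variables (bit N set means order N is enabled) and mirrors the quoted `orders == 0 ? 0 : fls(orders) - 1` expression. The helpers fls_ulong(), largest_enabled_order() and folio_bytes() are hypothetical names introduced only for illustration, and the shmem masks are left out for brevity.

```c
#include <stdint.h>

/*
 * Userspace stand-in for the kernel's fls() (an assumption for this sketch):
 * returns the 1-based index of the highest set bit, or 0 if no bit is set.
 */
static int fls_ulong(unsigned long x)
{
	int pos = 0;

	while (x) {
		pos++;
		x >>= 1;
	}
	return pos;
}

/*
 * Largest order enabled in any mask. The inherit mask only counts when the
 * global setting is enabled, as in the quoted thp_highest_allowable_order().
 * Returns 0 when no order is enabled.
 */
static int largest_enabled_order(unsigned long always, unsigned long madvise,
				 unsigned long inherit, int global_enabled)
{
	unsigned long orders = always | madvise;

	if (global_enabled)
		orders |= inherit;

	return orders == 0 ? 0 : fls_ulong(orders) - 1;
}

/* Size in bytes of a folio of the given order for a given page shift. */
static uint64_t folio_bytes(int order, unsigned int page_shift)
{
	return (uint64_t)1 << (order + page_shift);
}
```

With 64K pages (page shift 16), a 2M folio is order 5 and the 512M PMD THP is order 13, so enabling only the 2M size gives largest_enabled_order(0, 1UL << 5, 0, 0) == 5, while additionally enabling the 512M size raises the result to 13. With 4K pages (page shift 12), 2M is order 9.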