Message-ID: <003b0818-a35e-429c-9408-5e7344e981f2@amd.com>
Date: Thu, 9 Jan 2025 23:33:57 +0530
Subject: Re: [RFC PATCH 0/5] Accelerate page migration with batching and multi threads
To: Zi Yan
Cc: linux-mm@kvack.org, David Rientjes, Aneesh Kumar, David Hildenbrand,
 John Hubbard, Kirill Shutemov, Matthew Wilcox, Mel Gorman,
 "Rao, Bharata Bhasker", Rik van Riel, RaghavendraKT, Wei Xu, Suyeon Lee,
 Lei Chen, "Shukla, Santosh", "Grimm, Jon", sj@kernel.org,
 shy828301@gmail.com, Liam Howlett, Gregory Price, "Huang, Ying"
References: <20250103172419.4148674-1-ziy@nvidia.com>
 <600a57ff-a462-4997-a621-f919c2c4fa84@amd.com>
 <567FDE63-E84E-4B1E-85F4-4E1EB0C2CD26@nvidia.com>
From: Shivank Garg
In-Reply-To: <567FDE63-E84E-4B1E-85F4-4E1EB0C2CD26@nvidia.com>

On 1/9/2025 8:34 PM, Zi Yan wrote:
> On 9 Jan 2025, at 6:47, Shivank Garg wrote:
>
>> On 1/3/2025 10:54 PM, Zi Yan wrote:
>>
>>>
>>> 6. A better interface than copy_page_lists_mt() to allow DMA data copy
>>> to be used as well.
>>
>> I think Static Calls can be better option for this.
>
> This is the first time I hear about it. Based on the info I find, I agree
> it is a great mechanism to switch between two methods globally.
>>
>> This will give a flexible copy interface to support both CPU and various DMA-based
>> folio copy. DMA-capable driver can override the default CPU copy path without any
>> additional runtime overheads.
>
> Yes, supporting DMA-based folio copy is also my intention too. I am happy to
> with you on that. Things to note are:
> 1. DMA engine should have more copy throughput as a single CPU thread, otherwise
> the scatter-gather setup overheads will eliminate the benefit of using DMA engine.

I agree on this.

> 2. Unless the DMA engine is really beef and can handle all possible page migration
> requests, CPU-based migration (single or multi threads) should be a fallback.
>
> In terms of 2, I wonder how much overheads does Static Calls have when switching
> between functions.
> Also, a lock might be needed since falling back to CPU might
> be per migrate_pages(). Considering these two, Static Calls might not work
> as you intended if switching between CPU and DMA is needed.

You can check patches 4/5 and 5/5 for the static call implementation using a DMA driver:
https://lore.kernel.org/linux-mm/20240614221525.19170-5-shivankg@amd.com

The static call approach adds no run-time overhead, because the update happens only
during DMA driver registration/unregistration, in dma_update_migrator().
SRCU synchronization will ensure safety during updates.
The copy path will use static_call(_folios_copy)().
A wrapper inside the DMA driver can ensure it falls back to folios_copy().

Does this address your concern regarding point 2?

>> main() {
>> ...
>>
>> // code snippet to measure throughput
>> clock_gettime(CLOCK_MONOTONIC, &t1);
>> retcode = move_pages(getpid(), num_pages, pages, nodesArray , statusArray, MPOL_MF_MOVE);
>> clock_gettime(CLOCK_MONOTONIC, &t2);
>>
>> // tput = num_pages*PAGE_SIZE/(t2-t1)
>>
>> ...
>> }
>>
>>
>> Measurements:
>> ============
>> vanilla: base kernel without patchset
>> mt:0 = MT kernel with use_mt_copy=0
>> mt:1..mt:32 = MT kernel with use_mt_copy=1 and thread cnt = 1,2,...,32
>>
>> Measured for both configuration push_0_pull_1=0 and push_0_pull_1=1 and
>> for 4KB migration and THP migration.
>>
>> --------------------
>> #1 push_0_pull_1 = 0 (src node CPUs are used)
>>
>> #1.1 THP=Never, 4KB (GB/s):
>> nr_pages vanilla    mt:0    mt:1    mt:2    mt:4    mt:8   mt:16   mt:32
>> 512         1.28    1.28    1.92    1.80    2.24    2.35    2.22    2.17
>> 4096        2.40    2.40    2.51    2.58    2.83    2.72    2.99    3.25
>> 8192        3.18    2.88    2.83    2.69    3.49    3.46    3.57    3.80
>> 16348       3.17    2.94    2.96    3.17    3.63    3.68    4.06    4.15
>>
>> #1.2 THP=Always, 2MB (GB/s):
>> nr_pages vanilla    mt:0    mt:1    mt:2    mt:4    mt:8   mt:16   mt:32
>> 512         4.31    5.02    3.39    3.40    3.33    3.51    3.91    4.03
>> 1024        7.13    4.49    3.58    3.56    3.91    3.87    4.39    4.57
>> 2048        5.26    6.47    3.91    4.00    3.71    3.85    4.97    6.83
>> 4096        9.93    7.77    4.58    3.79    3.93    3.53    6.41    4.77
>> 8192        6.47    6.33    4.37    4.67    4.52    4.39    5.30    5.37
>> 16348       7.66    8.00    5.20    5.22    5.24    5.28    6.41    7.02
>> 32768       8.56    8.62    6.34    6.20    6.20    6.19    7.18    8.10
>> 65536       9.41    9.40    7.14    7.15    7.15    7.19    7.96    8.89
>> 262144     10.17   10.19    7.26    7.90    7.98    8.05    9.46   10.30
>> 524288     10.40    9.95    7.25    7.93    8.02    8.76    9.55   10.30
>>
>> --------------------
>> #2 push_0_pull_1 = 1 (dst node CPUs are used):
>>
>> #2.1 THP=Never 4KB (GB/s):
>> nr_pages vanilla    mt:0    mt:1    mt:2    mt:4    mt:8   mt:16   mt:32
>> 512         1.28    1.36    2.01    2.74    2.33    2.31    2.53    2.96
>> 4096        2.40    2.84    2.94    3.04    3.40    3.23    3.31    4.16
>> 8192        3.18    3.27    3.34    3.94    3.77    3.68    4.23    4.76
>> 16348       3.17    3.42    3.66    3.21    3.82    4.40    4.76    4.89
>>
>> #2.2 THP=Always 2MB (GB/s):
>> nr_pages vanilla    mt:0    mt:1    mt:2    mt:4    mt:8   mt:16   mt:32
>> 512         4.31    5.91    4.03    3.73    4.26    4.13    4.78    3.44
>> 1024        7.13    6.83    4.60    5.13    5.03    5.19    5.94    7.25
>> 2048        5.26    7.09    5.20    5.69    5.83    5.73    6.85    8.13
>> 4096        9.93    9.31    4.90    4.82    4.82    5.26    8.46    8.52
>> 8192        6.47    7.63    5.66    5.85    5.75    6.14    7.45    8.63
>> 16348       7.66   10.00    6.35    6.54    6.66    6.99    8.18   10.21
>> 32768       8.56    9.78    7.06    7.41    7.76    9.02    9.55   11.92
>> 65536       9.41   10.00    8.19    9.20    9.32    8.68   11.00   13.31
>> 262144     10.17   11.17    9.01    9.96    9.99   10.00   11.70   14.27
>> 524288     10.40   11.38    9.07    9.98   10.01   10.09   11.95   14.48
>>
>> Note:
>> 1. For THP = Never: I'm doing for 16X pages to keep total size same for your
>> experiment with 64KB pagesize)
>> 2. For THP = Always: nr_pages = Number of 4KB pages moved.
>> nr_pages=512 => 512 4KB pages => 1 2MB page)
>>
>>
>> I'm seeing little (1.5X in some cases) to no benefits. The performance scaling is
>> relatively flat across thread counts.
>>
>> Is it possible I'm missing something in my testing?
>>
>> Could the base page size difference (4KB vs 64KB) be playing a role in
>> the scaling behavior? How the performance varies with 4KB pages on your system?
>>
>> I'd be happy to work with you on investigating this differences.
>> Let me know if you'd like any additional test data or if there are specific
>> configurations I should try.
>
> The results surprises me, since I was able to achieve ~9GB/s when migrating
> 16 2MB THPs with 16 threads on a two socket system with Xeon E5-2650 v3 @ 2.30GHz
> (a 19.2GB/s bandwidth QPI link between two sockets) back in 2019[1].
> These are 10-year-old Haswell CPUs. And your results above show that EPYC 5 can
> only achieve ~4GB/s when migrating 512 2MB THPs with 16 threads. It just does
> not make sense.
>
> One thing you might want to try is to set init_on_alloc=0 in your boot
> parameters to use folio_zero_user() instead of GFP_ZERO to zero pages. That
> might reduce the time spent on page zeros.
>
> I am also going to rerun the experiments locally on x86_64 boxes to see if your
> results can be replicated.
>
> Thank you for the review and running these experiments. I really appreciate
> it.
>
> [1] https://lore.kernel.org/linux-mm/20190404020046.32741-1-zi.yan@sent.com/
>

Using init_on_alloc=0 gave significant performance gain over the last experiment
but I'm still missing the performance scaling you observed.

THP Never
nr_pages vanilla    mt:0    mt:1    mt:2    mt:4    mt:8   mt:16   mt:32
512         1.40    1.43    2.79    3.48    3.63    3.73    3.63    3.57
4096        2.54    3.32    3.18    4.65    4.83    5.11    5.39    5.78
8192        3.35    4.40    4.39    4.71    3.63    5.04    5.33    6.00
16348       3.76    4.50    4.44    5.33    5.41    5.41    6.47    6.41

THP Always
nr_pages vanilla    mt:0    mt:1    mt:2    mt:4    mt:8   mt:16   mt:32
512         5.21    5.47    5.77    6.92    3.71    2.75    7.54    7.44
1024        6.10    7.65    8.12    8.41    8.87    8.55    9.13   11.36
2048        6.39    6.66    9.58    8.92   10.75   12.99   13.33   12.23
4096        7.33   10.85    8.22   13.57   11.43   10.93   12.53   16.86
8192        7.26    7.46    8.88   11.82   10.55   10.94   13.27   14.11
16348       9.07    8.53   11.82   14.89   12.97   13.22   16.14   18.10
32768      10.45   10.55   11.79   19.19   16.85   17.56   20.58   26.57
65536      11.00   11.12   13.25   18.27   16.18   16.11   19.61   27.73
262144     12.37   12.40   15.65   20.00   19.25   19.38   22.60   31.95
524288     12.44   12.33   15.66   19.78   19.06   18.96   23.31   32.29

Thanks,
Shivank

> Best Regards,
> Yan, Zi
>
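
For anyone reproducing the GB/s figures in the tables above: they follow the
tput = num_pages*PAGE_SIZE/(t2-t1) formula from the quoted snippet. Below is a
minimal sketch of that arithmetic only. The helper names (elapsed_ns, tput_gbps),
the SKETCH_PAGE_SIZE constant, the 4 KB page size and the choice of
1 GB = 10^9 bytes are assumptions for illustration; they are not taken from the
actual test program.

```c
#include <assert.h>
#include <stdint.h>
#include <time.h>

/* Assumption: 4 KB base pages, matching the nr_pages convention in the tables. */
#define SKETCH_PAGE_SIZE 4096ULL

/* Elapsed time between two CLOCK_MONOTONIC samples, in nanoseconds. */
static uint64_t elapsed_ns(const struct timespec *t1, const struct timespec *t2)
{
    int64_t sec = (int64_t)t2->tv_sec - (int64_t)t1->tv_sec;
    int64_t nsec = (int64_t)t2->tv_nsec - (int64_t)t1->tv_nsec;
    int64_t total = sec * 1000000000LL + nsec;

    /* A monotonic clock should never run backwards; also avoids dividing by 0. */
    assert(total > 0);
    return (uint64_t)total;
}

/*
 * Throughput in GB/s for num_pages base pages moved between t1 and t2.
 * Assumption: 1 GB = 10^9 bytes, so bytes per nanosecond equals GB/s.
 */
static double tput_gbps(uint64_t num_pages,
                        const struct timespec *t1,
                        const struct timespec *t2)
{
    double bytes = (double)(num_pages * SKETCH_PAGE_SIZE);

    return bytes / (double)elapsed_ns(t1, t2);
}
```

Here t1 and t2 would be the values filled in by clock_gettime(CLOCK_MONOTONIC, ...)
around the move_pages() call, as in the quoted snippet.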