From: Zi Yan
To: Hyeonggon Yoo
Cc: linux-mm@kvack.org, kernel_team@skhynix.com, 42.hyeyoo@gmail.com,
 David Rientjes, Shivank Garg, Aneesh Kumar, David Hildenbrand,
 John Hubbard, Kirill Shutemov, Matthew Wilcox, Mel Gorman,
 "Rao, Bharata Bhasker", Rik van Riel, RaghavendraKT, Wei Xu, Suyeon Lee,
 Lei Chen, "Shukla, Santosh", "Grimm, Jon", sj@kernel.org,
 shy828301@gmail.com, Liam Howlett, Gregory Price, "Huang, Ying"
Subject: Re: [RFC PATCH 4/5] mm/migrate: introduce multi-threaded page copy routine
Date: Sun, 05 Jan 2025 21:01:48 -0500
Message-ID: <8B66C7BA-96D6-4E04-89F7-13829BF480D7@nvidia.com>
References: <20250103172419.4148674-1-ziy@nvidia.com>
 <20250103172419.4148674-5-ziy@nvidia.com>

On 5 Jan 2025, at 20:18, Hyeonggon Yoo wrote:

> On 2025-01-04 2:24 AM, Zi Yan wrote:
>> Now page copies are batched, multi-threaded page copy can be used to
>> increase page copy throughput. Add copy_page_lists_mt() to copy pages in
>> multi-threaded manners.
>> Empirical data show more than 32 base pages are
>> needed to show the benefit of using multi-threaded page copy, so use 32 as
>> the threshold.
>>
>> Signed-off-by: Zi Yan
>> ---
>>  include/linux/migrate.h |   3 +
>>  mm/Makefile             |   2 +-
>>  mm/copy_pages.c         | 186 ++++++++++++++++++++++++++++++++++++++++
>>  mm/migrate.c            |  19 ++--
>>  4 files changed, 199 insertions(+), 11 deletions(-)
>>  create mode 100644 mm/copy_pages.c
>>
>
> [...snip...]
>
>> +++ b/mm/copy_pages.c
>> @@ -0,0 +1,186 @@
>> +// SPDX-License-Identifier: GPL-2.0
>> +/*
>> + * Parallel page copy routine.
>> + */
>> +
>> +#include
>> +#include
>> +#include
>> +#include
>> +#include
>> +
>> +
>> +unsigned int limit_mt_num = 4;
>> +
>> +struct copy_item {
>> +	char *to;
>> +	char *from;
>> +	unsigned long chunk_size;
>> +};
>> +
>> +struct copy_page_info {
>> +	struct work_struct copy_page_work;
>> +	unsigned long num_items;
>> +	struct copy_item item_list[];
>> +};
>> +
>> +static void copy_page_routine(char *vto, char *vfrom,
>> +	unsigned long chunk_size)
>> +{
>> +	memcpy(vto, vfrom, chunk_size);
>> +}
>> +
>> +static void copy_page_work_queue_thread(struct work_struct *work)
>> +{
>> +	struct copy_page_info *my_work = (struct copy_page_info *)work;
>> +	int i;
>> +
>> +	for (i = 0; i < my_work->num_items; ++i)
>> +		copy_page_routine(my_work->item_list[i].to,
>> +				  my_work->item_list[i].from,
>> +				  my_work->item_list[i].chunk_size);
>> +}
>> +
>> +int copy_page_lists_mt(struct list_head *dst_folios,
>> +		struct list_head *src_folios, int nr_items)
>> +{
>> +	int err = 0;
>> +	unsigned int total_mt_num = limit_mt_num;
>> +	int to_node = folio_nid(list_first_entry(dst_folios, struct folio, lru));
>> +	int i;
>> +	struct copy_page_info *work_items[32] = {0};
>> +	const struct cpumask *per_node_cpumask = cpumask_of_node(to_node);
>
> What happens here if to_node is a NUMA node without CPUs? (e.g. CXL
> node).

I did not think about that case. In that case, from_node will be used.
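
A rough user-space sketch of that fallback, for illustration only.
cpus_per_node[] and pick_copy_node() are hypothetical names that are not
part of the patch; in the kernel the CPU check would more likely be
something like node_state(nid, N_CPU), which is an assumption here.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/*
 * Hypothetical model: cpus_per_node[nid] holds the number of CPUs attached
 * to NUMA node nid. A CPU-less node (e.g. a CXL memory node) has 0.
 */
static bool node_has_cpus(const unsigned int *cpus_per_node, size_t nr_nodes,
			  int nid)
{
	return nid >= 0 && (size_t)nid < nr_nodes && cpus_per_node[nid] > 0;
}

/*
 * Pick the node whose CPUs would run the copy: prefer the destination node,
 * fall back to the source node, and return -1 when both are CPU-less so the
 * caller has to choose something else.
 */
int pick_copy_node(const unsigned int *cpus_per_node, size_t nr_nodes,
		   int to_node, int from_node)
{
	assert(cpus_per_node != NULL);

	if (node_has_cpus(cpus_per_node, nr_nodes, to_node))
		return to_node;
	if (node_has_cpus(cpus_per_node, nr_nodes, from_node))
		return from_node;
	return -1;
}
```
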
If both from and to are CPUless nodes, maybe the node of the executing CPU
should be used to select cpumask here.

>
> And even with a NUMA node with CPUs I think offloading copies to CPUs
> of either "from node" or "to node" will end up a CPU touching two pages
> in two different NUMA nodes anyway, one page in the local node
> and the other page in the remote node.
>
> In that sense, I don't understand when push_0_pull_1 (introduced in
> patch 5) should be 0 or 1. Am I missing something?

From my experiments, copy throughput differs between pushing data from
local CPUs and pulling data from remote CPUs. On NVIDIA Grace CPU, pushing
data has higher throughput. Back in 2019, when I tested it on Intel Xeon
Broadwell, pulling data had higher throughput. In the final version, a
boot-time benchmark might be needed to decide whether to push data or pull
data.

>> +	int cpu_id_list[32] = {0};
>> +	int cpu;
>> +	int max_items_per_thread;
>> +	int item_idx;
>> +	struct folio *src, *src2, *dst, *dst2;
>> +
>> +	total_mt_num = min_t(unsigned int, total_mt_num,
>> +			     cpumask_weight(per_node_cpumask));
>> +
>> +	if (total_mt_num > 32)
>> +		total_mt_num = 32;
>> +
>> +	/* Each threads get part of each page, if nr_items < totla_mt_num */
>> +	if (nr_items < total_mt_num)
>> +		max_items_per_thread = nr_items;
>> +	else
>> +		max_items_per_thread = (nr_items / total_mt_num) +
>> +				((nr_items % total_mt_num) ? 1 : 0);
>> +
>> +
>> +	for (cpu = 0; cpu < total_mt_num; ++cpu) {
>> +		work_items[cpu] = kzalloc(sizeof(struct copy_page_info) +
>> +					sizeof(struct copy_item) * max_items_per_thread,
>> +					GFP_NOWAIT);
>> +
>> +		if (!work_items[cpu]) {
>> +			err = -ENOMEM;
>> +			goto free_work_items;
>> +		}
>> +	}
>
> [...snip...]
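
For reference, the per-thread sizing in the hunk above can be modeled in
user space. This is only an illustrative sketch: max_items_per_thread()
mirrors the quoted computation, while items_for_thread() is a hypothetical
helper (not in the patch) showing one even way to deal whole items out to
threads. The snipped distribution code may well do it differently.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Mirrors the quoted sizing logic: with fewer items than threads, every
 * thread copies a slice of every item, so each needs nr_items slots;
 * otherwise each thread needs ceil(nr_items / nr_threads) slots.
 */
size_t max_items_per_thread(size_t nr_items, size_t nr_threads)
{
	assert(nr_threads > 0);

	if (nr_items < nr_threads)
		return nr_items;
	return (nr_items / nr_threads) + ((nr_items % nr_threads) ? 1 : 0);
}

/*
 * Hypothetical helper: number of whole items thread tid receives when the
 * remainder is handed to the lowest-numbered threads, one item each.
 */
size_t items_for_thread(size_t nr_items, size_t nr_threads, size_t tid)
{
	assert(nr_threads > 0 && tid < nr_threads);

	return nr_items / nr_threads + (tid < nr_items % nr_threads ? 1 : 0);
}
```

With this dealing scheme no thread ever receives more than
max_items_per_thread() items, so the kzalloc() sizing above would be
sufficient for it.
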
>
>> +
>> +	/* Wait until it finishes */
>> +	for (i = 0; i < total_mt_num; ++i)
>> +		flush_work((struct work_struct *)work_items[i]);
>> +
>> +free_work_items:
>> +	for (cpu = 0; cpu < total_mt_num; ++cpu)
>> +		kfree(work_items[cpu]);
>> +
>> +	return err;
>
> Should the kernel re-try migration without multi-threading if it failed
> to allocate memory?

Sure. Will add it in the next version.

Thank you for the reviews.

--
Best Regards,
Yan, Zi