From: Zi Yan <ziy@nvidia.com>
To: Yang Shi
Cc: linux-mm@kvack.org, David Rientjes, Shivank Garg, Aneesh Kumar,
 David Hildenbrand, John Hubbard, Kirill Shutemov, Matthew Wilcox,
 Mel Gorman, "Rao, Bharata Bhasker", Rik van Riel, RaghavendraKT, Wei Xu,
 Suyeon Lee, Lei Chen, "Shukla, Santosh", "Grimm, Jon", sj@kernel.org,
 Liam Howlett, Gregory Price, "Huang, Ying"
Subject: Re: [RFC PATCH 0/5] Accelerate page migration with batching and
 multi threads
Date: Sun, 05 Jan 2025 21:33:21 -0500
References: <20250103172419.4148674-1-ziy@nvidia.com>

On 3 Jan 2025, at 17:09, Yang Shi wrote:

> On Fri, Jan 3, 2025 at 9:24 AM Zi Yan wrote:
>>
>> Hi all,
>>
>> This patchset accelerates page migration by batching folio copy
>> operations and using multiple CPU threads. It is based on Shivank's
>> "Enhancements to Page Migration with Batch Offloading via DMA"
>> patchset [1] and my original accelerated page migration patchset [2],
>> and applies on top of mm-everything-2025-01-03-05-59. The last patch
>> is for testing purposes only and should not be considered for merging.
>>
>> The motivations are:
>>
>> 1. Batching folio copies increases copy throughput. This especially
>> helps base page migration, where folio copy throughput is low because
>> kernel activities such as moving folio metadata and updating page
>> table entries sit between consecutive folio copies, and base pages are
>> relatively small: 4KB on x86_64 and ARM64, or 64KB on ARM64 configured
>> with 64KB pages.
>>
>> 2. A single CPU thread has limited copy throughput. Using multiple
>> threads is a natural extension to speed up folio copying when no DMA
>> engine is available in the system.
>>
>>
>> Design
>> ===
>>
>> This builds on Shivank's patchset and revises MIGRATE_SYNC_NO_COPY
>> (renamed to MIGRATE_NO_COPY) to skip the folio copy inside
>> migrate_folio_move() and perform all the copies in one shot
>> afterwards. A copy_page_lists_mt() function is added to copy folios
>> from the src list to the dst list using multiple threads.
>>
>> Changes compared to Shivank's patchset (mainly a rewrite of the
>> batched folio copy code)
>> ===
>>
>> 1. mig_info is removed, so no memory allocation is needed during
>> batched folio copies. src->private is used to store the old page state
>> and anon_vma after folio metadata is copied from src to dst.
>>
>> 2. move_to_new_folio() and migrate_folio_move() are refactored to
>> remove redundant code in migrate_folios_batch_move().
>>
>> 3. folio_mc_copy() is used for the single-threaded copy path to keep
>> the original kernel behavior.
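>>
>> As a rough sketch of the idea (the work item layout and helper names
>> below are illustrative, not the exact patch code), the batched copy is
>> split across workqueue items, each copying a slice of the folio pairs:
>>
>> struct copy_work {
>> 	struct work_struct work;
>> 	struct list_head src;	/* slice of the source folios */
>> 	struct list_head dst;	/* matching destination folios */
>> };
>>
>> static void folio_copy_work_fn(struct work_struct *work)
>> {
>> 	struct copy_work *cw = container_of(work, struct copy_work, work);
>> 	struct folio *src, *dst;
>>
>> 	/* walk both lists in lockstep; assumes !HIGHMEM (see TODO 5) */
>> 	dst = list_first_entry(&cw->dst, struct folio, lru);
>> 	list_for_each_entry(src, &cw->src, lru) {
>> 		memcpy(folio_address(dst), folio_address(src),
>> 		       folio_size(src));
>> 		dst = list_next_entry(dst, lru);
>> 	}
>> }
>>
>> /* caller side: queue one work item per thread, then wait for all */
>> for (i = 0; i < nr_threads; i++) {
>> 	INIT_WORK(&cw[i].work, folio_copy_work_fn);
>> 	queue_work(system_unbound_wq, &cw[i].work);
>> }
>> for (i = 0; i < nr_threads; i++)
>> 	flush_work(&cw[i].work);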
>>
>>
>> Performance
>> ===
>>
>> I benchmarked move_pages() throughput on a two-socket NUMA system with
>> two NVIDIA Grace CPUs. The base page size is 64KB. Both 64KB page
>> migration and 2MB mTHP migration are measured.
>>
>> The tables below show move_pages() throughput for different
>> configurations and different numbers of copied pages. The columns are
>> the configurations, from the vanilla Linux kernel to this patchset
>> using 1, 2, 4, 8, 16, or 32 threads; the rows are the numbers of
>> migrated pages. The unit is GB/s.
>>
>> The 32-thread copy throughput can be up to 10x that of the
>> single-threaded serial folio copy. Batching folio copies benefits not
>> only huge pages but also base pages.
>>
>> 64KB (GB/s):
>>
>>        vanilla  mt_1   mt_2   mt_4   mt_8  mt_16  mt_32
>>   32      5.43  4.90   5.65   7.31   7.60   8.61   6.43
>>  256      6.95  6.89   9.28  14.67  22.41  23.39  23.93
>>  512      7.88  7.26  10.15  17.53  27.82  27.88  33.93
>>  768      7.65  7.42  10.46  18.59  28.65  29.67  30.76
>> 1024      7.46  8.01  10.90  17.77  27.04  32.18  38.80
>>
>> 2MB mTHP (GB/s):
>>
>>        vanilla  mt_1   mt_2   mt_4   mt_8  mt_16  mt_32
>>    1      5.94  2.90   6.90   8.56  11.16   8.76   6.41
>>    2      7.67  5.57   7.11  12.48  17.37  15.68  14.10
>>    4      8.01  6.04  10.25  20.14  22.52  27.79  25.28
>>    8      8.42  7.00  11.41  24.73  33.96  32.62  39.55
>>   16      9.41  6.91  12.23  27.51  43.95  49.15  51.38
>>   32     10.23  7.15  13.03  29.52  49.49  69.98  71.51
>>   64      9.40  7.37  13.88  30.38  52.00  76.89  79.41
>>  128      8.59  7.23  14.20  28.39  49.98  78.27  90.18
>>  256      8.43  7.16  14.59  28.14  48.78  76.88  92.28
>>  512      8.31  7.78  14.40  26.20  43.31  63.91  75.21
>>  768      8.30  7.86  14.83  27.41  46.25  69.85  81.31
>> 1024      8.31  7.90  14.96  27.62  46.75  71.76  83.84

> Is this done on an idle system or a busy system? For real production
> workloads, all the CPUs are likely busy. It would be great to have the
> performance data collected from a busy system too.

Yes, it was done on an idle system. I redid the experiments on a busy
system by running stress on all CPU cores; with system_unbound_wq the
results are not as good, since all CPUs are occupied. I then switched
to system_highpri_wq, and the throughput improved to almost on par with
the results on an idle machine. The numbers are below.

It becomes a trade-off between page migration throughput and user
application performance on _a busy system_. If page migration is badly
needed, system_highpri_wq can be used to retain high copy throughput;
otherwise, multiple threads should not be used.

64KB with system_unbound_wq on a busy system (GB/s):

| ---- | ------- | ---- | ---- | ---- | ---- | ----- | ----- |
|      | vanilla | mt_1 | mt_2 | mt_4 | mt_8 | mt_16 | mt_32 |
| ---- | ------- | ---- | ---- | ---- | ---- | ----- | ----- |
|   32 |    4.05 | 1.51 | 1.32 | 1.20 | 4.31 |  1.05 |  0.02 |
|  256 |    6.91 | 3.93 | 4.61 | 0.08 | 4.46 |  4.30 |  3.89 |
|  512 |    7.28 | 4.87 | 1.81 | 6.18 | 4.38 |  5.58 |  6.10 |
|  768 |    4.57 | 5.72 | 5.35 | 5.24 | 5.94 |  5.66 |  0.20 |
| 1024 |    7.88 | 5.73 | 5.81 | 6.52 | 7.29 |  6.06 |  5.62 |

2MB with system_unbound_wq on a busy system (GB/s):

| ---- | ------- | ---- | ---- | ---- | ----- | ----- | ----- |
|      | vanilla | mt_1 | mt_2 | mt_4 | mt_8  | mt_16 | mt_32 |
| ---- | ------- | ---- | ---- | ---- | ----- | ----- | ----- |
|    1 |    1.38 | 0.59 | 1.45 | 1.99 |  1.59 |  2.18 |  1.48 |
|    2 |    1.13 | 3.08 | 3.11 | 1.85 |  0.32 |  1.46 |  2.53 |
|    4 |    8.31 | 4.02 | 5.68 | 3.22 |  2.96 |  5.77 |  2.91 |
|    8 |    8.16 | 5.09 | 1.19 | 4.96 |  4.50 |  3.36 |  4.99 |
|   16 |    3.47 | 5.13 | 5.72 | 7.06 |  5.90 |  6.49 |  5.34 |
|   32 |    8.42 | 6.97 | 0.13 | 6.77 |  7.69 |  7.56 |  2.87 |
|   64 |    7.45 | 8.06 | 7.22 | 8.60 |  8.07 |  7.16 |  0.57 |
|  128 |    7.77 | 7.93 | 7.29 | 8.31 |  7.77 |  9.05 |  0.92 |
|  256 |    6.91 | 7.20 | 6.80 | 8.56 |  7.81 | 10.13 | 11.21 |
|  512 |    6.72 | 7.22 | 7.77 | 9.71 | 10.68 | 10.35 | 10.40 |
|  768 |    6.87 | 7.18 | 7.98 | 9.28 | 10.85 | 10.83 | 14.17 |
| 1024 |    6.95 | 7.23 | 8.03 | 9.59 | 10.88 | 10.22 | 20.27 |

64KB with system_highpri_wq on a busy system (GB/s):

| ---- | ------- | ---- | ---- | ----- | ----- | ----- | ----- |
|      | vanilla | mt_1 | mt_2 | mt_4  | mt_8  | mt_16 | mt_32 |
| ---- | ------- | ---- | ---- | ----- | ----- | ----- | ----- |
|   32 |    4.05 | 2.63 | 1.62 |  1.90 |  3.34 |  3.71 |  3.40 |
|  256 |    6.91 | 5.16 | 4.33 |  8.07 |  6.81 | 10.31 | 13.51 |
|  512 |    7.28 | 4.89 | 6.43 | 15.72 | 11.31 | 18.03 | 32.69 |
|  768 |    4.57 | 6.27 | 6.42 | 11.06 |  8.56 | 14.91 |  9.24 |
| 1024 |    7.88 | 6.73 | 0.49 | 17.09 | 19.34 | 23.60 | 18.12 |

2MB with system_highpri_wq on a busy system (GB/s):

| ---- | ------- | ---- | ----- | ----- | ----- | ----- | ----- |
|      | vanilla | mt_1 | mt_2  | mt_4  | mt_8  | mt_16 | mt_32 |
| ---- | ------- | ---- | ----- | ----- | ----- | ----- | ----- |
|    1 |    1.38 | 1.18 |  1.17 |  5.00 |  1.68 |  3.86 |  2.46 |
|    2 |    1.13 | 1.78 |  1.05 |  0.01 |  3.52 |  1.84 |  1.80 |
|    4 |    8.31 | 3.91 |  5.24 |  4.30 |  4.12 |  2.93 |  3.44 |
|    8 |    8.16 | 6.09 |  3.67 |  7.81 | 11.10 |  8.47 | 15.21 |
|   16 |    3.47 | 6.02 |  8.44 | 11.80 |  9.56 | 12.84 |  9.81 |
|   32 |    8.42 | 7.34 | 10.10 | 13.79 | 23.03 | 26.68 | 45.24 |
|   64 |    7.45 | 7.90 | 12.27 | 19.99 | 36.08 | 35.11 | 60.26 |
|  128 |    7.77 | 7.57 | 13.35 | 24.67 | 35.03 | 41.40 | 51.68 |
|  256 |    6.91 | 7.40 | 14.13 | 25.37 | 38.83 | 62.18 | 51.37 |
|  512 |    6.72 | 7.26 | 14.72 | 27.37 | 43.99 | 66.84 | 69.63 |
|  768 |    6.87 | 7.29 | 14.84 | 26.34 | 47.21 | 67.51 | 80.32 |
| 1024 |    6.95 | 7.26 | 14.88 | 26.98 | 47.75 | 74.99 | 85.00 |
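The only difference between the two busy-system configurations above is
which workqueue the copy work items are queued on. A minimal sketch
(the urgent flag and the cw work item are illustrative, not the actual
patch code):

	/*
	 * Pick the workqueue per migration request: system_highpri_wq
	 * runs workers at high priority, retaining copy throughput on a
	 * busy system at the cost of user workload interference;
	 * system_unbound_wq yields to userspace more readily.
	 */
	struct workqueue_struct *wq = urgent ? system_highpri_wq
					     : system_unbound_wq;

	queue_work(wq, &cw->work);
	flush_work(&cw->work);	/* wait for this copy slice to finish */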
>
>>
>> TODOs
>> ===
>> 1. The multi-threaded folio copy routine needs to look at the CPU
>> scheduler and only use idle CPUs to avoid interfering with userspace
>> workloads. Of course, more complicated policies can be applied based
>> on the priority of the thread issuing the migration.

> The other potential problem is that it is hard to attribute the cpu
> time consumed by the migration work threads to cpu cgroups. In a
> multi-tenant environment this may result in unfair cpu time
> accounting. However, properly accounting cpu time for kernel threads
> is a chronic problem; I'm not sure whether it has been solved or not.

>> 2. Eliminate memory allocation during the multi-threaded folio copy
>> routine if possible.
>>
>> 3. Add a runtime check to decide when to use the multi-threaded folio
>> copy, e.g., based on the cache hotness issue mentioned by Matthew [3].
>>
>> 4. Use non-temporal CPU instructions to avoid cache pollution.

> AFAICT, arm64 already uses non-temporal instructions for copy page.

Right. My current implementation uses memcpy(), since a huge page can
be copied by multiple threads, and memcpy() does not use non-temporal
stores on ARM64. A non-temporal memcpy can be added for this use case.
Thank you for the input.
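Each worker could instead use a non-temporal chunk copy. A rough ARM64
sketch using ldnp/stnp (non-temporal load/store pair); this is
illustrative only and assumes 16-byte-aligned buffers and a length that
is a multiple of 16:

	/*
	 * Copy one worker's chunk with non-temporal loads/stores to
	 * avoid polluting the cache. A real version would handle tails
	 * and likely unroll to cacheline-sized bursts.
	 */
	static void memcpy_nontemporal(void *dst, const void *src, size_t len)
	{
		u64 t0, t1;

		for (; len >= 16; len -= 16, dst += 16, src += 16)
			asm volatile("ldnp %0, %1, [%2]\n\t"
				     "stnp %0, %1, [%3]"
				     : "=&r"(t0), "=&r"(t1)
				     : "r"(src), "r"(dst)
				     : "memory");
	}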
>> >> >> [1] https://lore.kernel.org/linux-mm/20240614221525.19170-1-shivankg@amd.com/ >> [2] https://lore.kernel.org/linux-mm/20190404020046.32741-1-zi.yan@sent.com/ >> [3] https://lore.kernel.org/linux-mm/Zm0SWZKcRrngCUUW@casper.infradead.org/ >> >> Byungchul Park (1): >> mm: separate move/undo doing on folio list from migrate_pages_batch() >> >> Zi Yan (4): >> mm/migrate: factor out code in move_to_new_folio() and >> migrate_folio_move() >> mm/migrate: add migrate_folios_batch_move to batch the folio move >> operations >> mm/migrate: introduce multi-threaded page copy routine >> test: add sysctl for folio copy tests and adjust >> NR_MAX_BATCHED_MIGRATION >> >> include/linux/migrate.h | 3 + >> include/linux/migrate_mode.h | 2 + >> include/linux/mm.h | 4 + >> include/linux/sysctl.h | 1 + >> kernel/sysctl.c | 29 ++- >> mm/Makefile | 2 +- >> mm/copy_pages.c | 190 +++++++++++++++ >> mm/migrate.c | 443 +++++++++++++++++++++++++++-------- >> 8 files changed, 577 insertions(+), 97 deletions(-) >> create mode 100644 mm/copy_pages.c >> >> -- >> 2.45.2 >> -- Best Regards, Yan, Zi