Message-ID: <600a57ff-a462-4997-a621-f919c2c4fa84@amd.com>
Date: Thu, 9 Jan 2025 17:17:06 +0530
User-Agent: Mozilla Thunderbird
From: Shivank Garg
Subject: Re: [RFC PATCH 0/5] Accelerate page migration with batching and multi threads
To: Zi Yan, linux-mm@kvack.org
Cc: David Rientjes, Aneesh Kumar, David Hildenbrand, John Hubbard,
 Kirill Shutemov, Matthew Wilcox, Mel Gorman, "Rao, Bharata Bhasker",
 Rik van Riel, RaghavendraKT, Wei Xu, Suyeon Lee, Lei Chen,
 "Shukla, Santosh", "Grimm, Jon", sj@kernel.org, shy828301@gmail.com,
 Liam Howlett, Gregory Price, "Huang, Ying"
References: <20250103172419.4148674-1-ziy@nvidia.com>
In-Reply-To: <20250103172419.4148674-1-ziy@nvidia.com>

On 1/3/2025 10:54 PM, Zi Yan wrote:

Hi Zi,

It's interesting to see my batch page migration patchset evolve with
multi-threading support. Thanks for sharing this.

> Hi all,
>
> This patchset accelerates page migration by batching folio copy operations and
> using multiple CPU threads and is based on Shivank's Enhancements to Page
> Migration with Batch Offloading via DMA patchset[1] and my original accelerate
> page migration patchset[2]. It is on top of mm-everything-2025-01-03-05-59.
> The last patch is for testing purpose and should not be considered.
>
> The motivations are:
>
> 1. Batching folio copy increases copy throughput. Especially for base page
> migrations, folio copy throughput is low since there are kernel activities like
> moving folio metadata and updating page table entries sit between two folio
> copies. And base page sizes are relatively small, 4KB on x86_64, ARM64
> and 64KB on ARM64.
>
> 2. Single CPU thread has limited copy throughput. Using multi threads is
> a natural extension to speed up folio copy, when DMA engine is NOT
> available in a system.
>
>
> Design
> ===
>
> It is based on Shivank's patchset and revise MIGRATE_SYNC_NO_COPY
> (renamed to MIGRATE_NO_COPY) to avoid folio copy operation inside
> migrate_folio_move() and perform them in one shot afterwards. A
> copy_page_lists_mt() function is added to use multi threads to copy
> folios from src list to dst list.
>
> Changes compared to Shivank's patchset (mainly rewrote batching folio
> copy code)
> ===
>
> 1. mig_info is removed, so no memory allocation is needed during
> batching folio copies. src->private is used to store old page state and
> anon_vma after folio metadata is copied from src to dst.
>
> 2. move_to_new_folio() and migrate_folio_move() are refactored to remove
> redundant code in migrate_folios_batch_move().
>
> 3. folio_mc_copy() is used for the single threaded copy code to keep the
> original kernel behavior.
>
>
>
> TODOs
> ===
> 1. Multi-threaded folio copy routine needs to look at CPU scheduler and
> only use idle CPUs to avoid interfering userspace workloads. Of course
> more complicated policies can be used based on migration issuing thread
> priority.
>
> 2. Eliminate memory allocation during multi-threaded folio copy routine
> if possible.
>
> 3. A runtime check to decide when use multi-threaded folio copy.
> Something like cache hotness issue mentioned by Matthew[3].
>
> 4. Use non-temporal CPU instructions to avoid cache pollution issues.
>
> 5. Explicitly make multi-threaded folio copy only available to
> !HIGHMEM, since kmap_local_page() would be needed for each kernel
> folio copy work threads and expensive.
>
> 6. A better interface than copy_page_lists_mt() to allow DMA data copy
> to be used as well.

I think static calls could be a better option for this. They would give
a flexible copy interface that supports both CPU and various DMA-based
folio copy. A DMA-capable driver can then override the default CPU copy
path without any additional runtime overhead.
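
As a rough illustration of the interface shape suggested here, below is a
minimal userspace sketch. It is only an analogy: a plain function pointer
stands in for what the kernel would express with DEFINE_STATIC_CALL(),
static_call() and static_call_update(), which patch the call site instead
of making an indirect call. The names folio_copy_fn, cpu_copy,
fake_dma_copy, set_copy_backend and copy_one are hypothetical and are not
part of either patchset.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical copy hook: copies len bytes and returns 0 on success. */
typedef int (*folio_copy_fn)(void *dst, const void *src, size_t len);

/* Default path: plain CPU copy. */
static int cpu_copy(void *dst, const void *src, size_t len)
{
    memcpy(dst, src, len);
    return 0;
}

/* Counts how often the stand-in "DMA" backend was used. */
static int dma_copies;

/* Stand-in for a DMA-based backend; a real driver would program an engine. */
static int fake_dma_copy(void *dst, const void *src, size_t len)
{
    dma_copies++;
    memcpy(dst, src, len);
    return 0;
}

/* In the kernel this slot would be a static call, not a function pointer. */
static folio_copy_fn copy_backend = cpu_copy;

/* A driver would call this at registration time; NULL restores the default. */
static void set_copy_backend(folio_copy_fn fn)
{
    copy_backend = fn ? fn : cpu_copy;
}

/* Migration code calls one entry point and does not care which backend runs. */
static int copy_one(void *dst, const void *src, size_t len)
{
    return copy_backend(dst, src, len);
}
```

The point of the sketch is that callers only see copy_one(), while the
backend behind it can be swapped at registration time. With real static
calls the swap rewrites the call target, so the common CPU path should not
pay for an indirect branch.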

> Performance
> ===
>
> I benchmarked move_pages() throughput on a two socket NUMA system with two
> NVIDIA Grace CPUs. The base page size is 64KB. Both 64KB page migration and 2MB
> mTHP page migration are measured.
>
> The tables below show move_pages() throughput with different
> configurations and different numbers of copied pages. The x-axis is the
> configurations, from vanilla Linux kernel to using 1, 2, 4, 8, 16, 32
> threads with this patchset applied. And the unit is GB/s.
>
> The 32-thread copy throughput can be up to 10x of single thread serial folio
> copy. Batching folio copy not only benefits huge page but also base
> page.
>
> 64KB (GB/s):
>
>        vanilla    mt_1    mt_2    mt_4    mt_8   mt_16   mt_32
>   32      5.43    4.90    5.65    7.31    7.60    8.61    6.43
>  256      6.95    6.89    9.28   14.67   22.41   23.39   23.93
>  512      7.88    7.26   10.15   17.53   27.82   27.88   33.93
>  768      7.65    7.42   10.46   18.59   28.65   29.67   30.76
> 1024      7.46    8.01   10.90   17.77   27.04   32.18   38.80
>
> 2MB mTHP (GB/s):
>
>        vanilla    mt_1    mt_2    mt_4    mt_8   mt_16   mt_32
>    1      5.94    2.90    6.90    8.56   11.16    8.76    6.41
>    2      7.67    5.57    7.11   12.48   17.37   15.68   14.10
>    4      8.01    6.04   10.25   20.14   22.52   27.79   25.28
>    8      8.42    7.00   11.41   24.73   33.96   32.62   39.55
>   16      9.41    6.91   12.23   27.51   43.95   49.15   51.38
>   32     10.23    7.15   13.03   29.52   49.49   69.98   71.51
>   64      9.40    7.37   13.88   30.38   52.00   76.89   79.41
>  128      8.59    7.23   14.20   28.39   49.98   78.27   90.18
>  256      8.43    7.16   14.59   28.14   48.78   76.88   92.28
>  512      8.31    7.78   14.40   26.20   43.31   63.91   75.21
>  768      8.30    7.86   14.83   27.41   46.25   69.85   81.31
> 1024      8.31    7.90   14.96   27.62   46.75   71.76   83.84

I measured the throughput (in GB/s) on our AMD EPYC Zen 5 system
(2-socket, 64 cores per socket with SMT enabled, 2 NUMA nodes) with a
4KB base page size, using mm-everything-2025-01-04-04-41 as the base
kernel.

Method:
======
main() {
...

    // code snippet to measure throughput
    clock_gettime(CLOCK_MONOTONIC, &t1);
    retcode = move_pages(getpid(), num_pages, pages, nodesArray,
                         statusArray, MPOL_MF_MOVE);
    clock_gettime(CLOCK_MONOTONIC, &t2);

    // tput = num_pages*PAGE_SIZE/(t2-t1)

...
}

Measurements:
============
vanilla: base kernel without patchset
mt:0 = MT kernel with use_mt_copy=0
mt:1..mt:32 = MT kernel with use_mt_copy=1 and thread cnt = 1,2,...,32

Measured for both configurations, push_0_pull_1=0 and push_0_pull_1=1,
and for both 4KB migration and THP migration.

--------------------
#1 push_0_pull_1 = 0 (src node CPUs are used)

#1.1 THP=Never, 4KB (GB/s):
nr_pages  vanilla    mt:0    mt:1    mt:2    mt:4    mt:8   mt:16   mt:32
     512     1.28    1.28    1.92    1.80    2.24    2.35    2.22    2.17
    4096     2.40    2.40    2.51    2.58    2.83    2.72    2.99    3.25
    8192     3.18    2.88    2.83    2.69    3.49    3.46    3.57    3.80
   16384     3.17    2.94    2.96    3.17    3.63    3.68    4.06    4.15

#1.2 THP=Always, 2MB (GB/s):
nr_pages  vanilla    mt:0    mt:1    mt:2    mt:4    mt:8   mt:16   mt:32
     512     4.31    5.02    3.39    3.40    3.33    3.51    3.91    4.03
    1024     7.13    4.49    3.58    3.56    3.91    3.87    4.39    4.57
    2048     5.26    6.47    3.91    4.00    3.71    3.85    4.97    6.83
    4096     9.93    7.77    4.58    3.79    3.93    3.53    6.41    4.77
    8192     6.47    6.33    4.37    4.67    4.52    4.39    5.30    5.37
   16384     7.66    8.00    5.20    5.22    5.24    5.28    6.41    7.02
   32768     8.56    8.62    6.34    6.20    6.20    6.19    7.18    8.10
   65536     9.41    9.40    7.14    7.15    7.15    7.19    7.96    8.89
  262144    10.17   10.19    7.26    7.90    7.98    8.05    9.46   10.30
  524288    10.40    9.95    7.25    7.93    8.02    8.76    9.55   10.30

--------------------
#2 push_0_pull_1 = 1 (dst node CPUs are used):

#2.1 THP=Never, 4KB (GB/s):
nr_pages  vanilla    mt:0    mt:1    mt:2    mt:4    mt:8   mt:16   mt:32
     512     1.28    1.36    2.01    2.74    2.33    2.31    2.53    2.96
    4096     2.40    2.84    2.94    3.04    3.40    3.23    3.31    4.16
    8192     3.18    3.27    3.34    3.94    3.77    3.68    4.23    4.76
   16384     3.17    3.42    3.66    3.21    3.82    4.40    4.76    4.89

#2.2 THP=Always, 2MB (GB/s):
nr_pages  vanilla    mt:0    mt:1    mt:2    mt:4    mt:8   mt:16   mt:32
     512     4.31    5.91    4.03    3.73    4.26    4.13    4.78    3.44
    1024     7.13    6.83    4.60    5.13    5.03    5.19    5.94    7.25
    2048     5.26    7.09    5.20    5.69    5.83    5.73    6.85    8.13
    4096     9.93    9.31    4.90    4.82    4.82    5.26    8.46    8.52
    8192     6.47    7.63    5.66    5.85    5.75    6.14    7.45    8.63
   16384     7.66   10.00    6.35    6.54    6.66    6.99    8.18   10.21
   32768     8.56    9.78    7.06    7.41    7.76    9.02    9.55   11.92
   65536     9.41   10.00    8.19    9.20    9.32    8.68   11.00   13.31
  262144    10.17   11.17    9.01    9.96    9.99   10.00   11.70   14.27
  524288    10.40   11.38    9.07    9.98   10.01   10.09   11.95   14.48

Note:
1. For THP = Never: I'm using 16X the number of pages to keep the total
   size the same as in your experiment with 64KB page size.
2. For THP = Always: nr_pages = number of 4KB pages moved
   (nr_pages=512 => 512 4KB pages => 1 2MB page).

I'm seeing little (1.5X in some cases) to no benefit. The performance
scaling is relatively flat across thread counts.

Is it possible I'm missing something in my testing?

Could the base page size difference (4KB vs 64KB) be playing a role in
the scaling behavior? How does the performance vary with 4KB pages on
your system?

I'd be happy to work with you on investigating these differences.
Let me know if you'd like any additional test data or if there are
specific configurations I should try.

>
> Let me know your thoughts. Thanks.
>
>
> [1] https://lore.kernel.org/linux-mm/20240614221525.19170-1-shivankg@amd.com/
> [2] https://lore.kernel.org/linux-mm/20190404020046.32741-1-zi.yan@sent.com/
> [3] https://lore.kernel.org/linux-mm/Zm0SWZKcRrngCUUW@casper.infradead.org/
>
> Byungchul Park (1):
>   mm: separate move/undo doing on folio list from migrate_pages_batch()
>
> Zi Yan (4):
>   mm/migrate: factor out code in move_to_new_folio() and
>     migrate_folio_move()
>   mm/migrate: add migrate_folios_batch_move to batch the folio move
>     operations
>   mm/migrate: introduce multi-threaded page copy routine
>   test: add sysctl for folio copy tests and adjust
>     NR_MAX_BATCHED_MIGRATION
>
>  include/linux/migrate.h      |   3 +
>  include/linux/migrate_mode.h |   2 +
>  include/linux/mm.h           |   4 +
>  include/linux/sysctl.h       |   1 +
>  kernel/sysctl.c              |  29 ++-
>  mm/Makefile                  |   2 +-
>  mm/copy_pages.c              | 190 +++++++++++++++
>  mm/migrate.c                 | 443 +++++++++++++++++++++++++++--------
>  8 files changed, 577 insertions(+), 97 deletions(-)
>  create mode 100644 mm/copy_pages.c
>
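
The throughput formula from the Method snippet above
(tput = num_pages*PAGE_SIZE/(t2-t1)) can be sketched as a small standalone
helper. This is an illustration and not the actual benchmark code: the
helper names elapsed_sec and throughput_gbps are hypothetical, and treating
1 GB as 2^30 bytes is an assumption, since the mail does not say which
definition was used for the tables.

```c
#include <stddef.h>
#include <time.h>

/* Elapsed seconds between two CLOCK_MONOTONIC samples (t2 taken after t1). */
static double elapsed_sec(const struct timespec *t1, const struct timespec *t2)
{
    return (double)(t2->tv_sec - t1->tv_sec) +
           (double)(t2->tv_nsec - t1->tv_nsec) / 1e9;
}

/*
 * Throughput in GB/s for num_pages pages of page_size bytes each.
 * Assumption: 1 GB = 2^30 bytes.
 */
static double throughput_gbps(size_t num_pages, size_t page_size,
                              const struct timespec *t1,
                              const struct timespec *t2)
{
    double bytes = (double)num_pages * (double)page_size;

    return bytes / elapsed_sec(t1, t2) / (double)(1UL << 30);
}
```

For example, nr_pages=262144 with 4KB pages is 1 GB of data under the
assumption above; if the move_pages() call took 0.5 s, the helper would
report 2.0 GB/s.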