From: Zi Yan
To: Ryan Roberts
Cc: linux-mm@kvack.org, linux-kernel@vger.kernel.org, Andrew Morton,
    "Matthew Wilcox (Oracle)", David Hildenbrand, "Yin, Fengwei", Yu Zhao,
    Vlastimil Babka, Johannes Weiner, Baolin Wang, Kemeng Shi, Mel Gorman,
    Rohan Puri, Mcgrof Chamberlain, Adam Manzanares, John Hubbard
Subject: Re: [RFC PATCH 0/4] Enable >0 order folio memory compaction
Date: Mon, 09 Oct 2023 11:42:30 -0400
Message-ID: <694EAB05-AEE6-44E2-9EC8-586A4E3F6343@nvidia.com>
In-Reply-To: <13347394-fc63-44b2-9fa0-455f56d9b19d@arm.com>

On 9 Oct 2023, at 10:10, Ryan Roberts wrote:

> On 09/10/2023 14:24, Zi Yan wrote:
>> On 2 Oct 2023, at 8:32, Ryan Roberts wrote:
>>
>>> Hi Zi,
>>>
>>> On 12/09/2023 17:28, Zi Yan wrote:
>>>> From: Zi Yan <ziy@nvidia.com>
>>>>
>>>> Hi all,
>>>>
>>>> This patchset enables >0 order folio memory compaction, which is one of
>>>> the prerequisites for large folio support[1]. It is on top of
>>>> mm-everything-2023-09-11-22-56.
>>>
>>> I've taken a quick look at these and realize I'm not well equipped to provide
>>> much in the way of meaningful review comments; all I can say is thanks for
>>> putting this together, and yes, I think it will become even more important for
>>> my work on anonymous large folios.

>>>> Overview
>>>> ===
>>>>
>>>> To support >0 order folio compaction, the patchset changes how free pages used
>>>> for migration are kept during compaction. Free pages used to be split into
>>>> order-0 pages that are post allocation processed (i.e., PageBuddy flag cleared,
>>>> page order stored in page->private is zeroed, and page reference is set to 1).
>>>> Now all free pages are kept in a MAX_ORDER+1 array of page lists based
>>>> on their order without post allocation process. When migrate_pages() asks for
>>>> a new page, one of the free pages, based on the requested page order, is
>>>> then processed and given out.

>>>> Optimizations
>>>> ===
>>>>
>>>> 1. Free page split is added to increase migration success rate in case
>>>> a source page does not have a matched free page in the free page lists.
>>>> Free page merge is possible but not implemented, since existing
>>>> PFN-based buddy page merge algorithm requires the identification of
>>>> buddy pages, but free pages kept for memory compaction cannot have
>>>> PageBuddy set to avoid confusing other PFN scanners.
>>>>
>>>> 2. Sort source pages in ascending order before migration is added to
>>>> reduce free page split. Otherwise, high order free pages might be
>>>> prematurely split, causing undesired high order folio migration failures.

>>> Not knowing much about how compaction actually works, naively I would imagine
>>> that if you are just trying to free up a known amount of contiguous physical
>>> space, then working through the pages in PFN order is more likely to yield the
>>> result quicker? Unless all of the pages in the set must be successfully migrated
>>> in order to free up the required amount of space...
>>
>> During compaction, pages are not freed, since that is the job of page reclaim.
>
> Sorry yes - my fault for using sloppy language. When I said "free up a known
> amount of contiguous physical space", I really meant "move pages in order to
> recover an amount of contiguous physical space". But I still think the rest of
> what I said applies; wouldn't you be more likely to reach your goal quicker if
> you sort by PFN?

Not always. If the in-use folios on the left are order-2, order-2, order-4
(all contiguous in one pageblock) and the free pages on the right are order-4
(pageblock N), order-2, order-2 (pageblock N-1; they are not a single order-8,
since there are in-use folios in the middle), going in PFN order will not get
you an order-8 free page, since the first order-4 free page will be split to
provide two order-2 pages for the first two order-2 in-use folios. But if you
migrate in descending order of in-use page orders, you can get an order-8 free
page at the end.

The patchset minimizes free page splits to avoid the situation described above,
since once a high order free page is split, the opportunity of migrating a high
order in-use folio into it is gone and hardly recoverable.
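
To make the example above concrete, here is a rough userspace sketch in Python.
It is a toy model, not code from the patchset: `migrate()`, `free_lists` and
`pending` are names made up for illustration. It assumes the free scanner
isolates free pages lazily in the order given, and that a request is served
from the smallest held free page that fits, halving it when it is too large.

```python
from collections import defaultdict


def migrate(sources, scanner_free_orders):
    """Toy model of migrating folios into per-order free page lists.

    sources: orders of in-use folios, in the order they are migrated.
    scanner_free_orders: orders of free pages, in the order the
        (hypothetical) free scanner would isolate them.
    Returns (migrated orders, failed orders, number of splits).
    """
    free_lists = defaultdict(int)        # order -> number of held free pages
    pending = list(scanner_free_orders)  # free pages not isolated yet
    migrated, failed, splits = [], [], 0

    for order in sources:
        # Isolate more free pages only while nothing held is large enough.
        while pending and not any(
            o >= order and n for o, n in free_lists.items()
        ):
            free_lists[pending.pop(0)] += 1

        candidates = [o for o, n in free_lists.items() if o >= order and n]
        if not candidates:
            failed.append(order)         # no free page can hold this folio
            continue

        have = min(candidates)           # smallest free page that fits
        free_lists[have] -= 1
        while have > order:              # split: keep one half, reuse the other
            have -= 1
            free_lists[have] += 1
            splits += 1
        migrated.append(order)

    return migrated, failed, splits


in_use = [2, 2, 4]      # left side, in PFN order
free = [4, 2, 2]        # right side, in free scanner order

print(migrate(in_use, free))                        # ([2, 2], [4], 2)
print(migrate(sorted(in_use, reverse=True), free))  # ([4, 2, 2], [], 0)
```

Under these assumptions, PFN order splits the order-4 free page twice and
leaves the order-4 folio with no target, while descending order migrates all
three folios without any split.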

>> The goal of compaction is to get a high order free page without freeing existing
>> pages to avoid potential high cost IO operations. If compaction does not work,
>> page reclaim would free pages to get us there (and potentially another follow-up
>> compaction). So either pages are migrated or stay where they are during compaction.
>>
>> BTW compaction works by scanning in use pages from lower PFN to higher PFN,
>> and free pages from higher PFN to lower PFN until two scanners meet in the middle.
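
For a rough picture of the two scanners, here is a toy Python sketch that
treats a zone as a flat list of order-0 pages. It is only an illustration
under simplifying assumptions: `compact()` and the boolean encoding are made
up here, every in-use page is assumed movable, and the real code works in
pageblocks and batches rather than one page at a time.

```python
def compact(zone):
    """Toy two-scanner compaction over order-0 pages.

    zone: list of booleans, True = in-use page, False = free page.
    Returns a new list with in-use pages moved toward the high PFNs.
    """
    zone = list(zone)
    migrate_pfn, free_pfn = 0, len(zone) - 1

    while True:
        # Migration scanner: walk up from the low PFNs to an in-use page.
        while migrate_pfn < free_pfn and not zone[migrate_pfn]:
            migrate_pfn += 1
        # Free scanner: walk down from the high PFNs to a free page.
        while migrate_pfn < free_pfn and zone[free_pfn]:
            free_pfn -= 1
        if migrate_pfn >= free_pfn:
            break                        # the scanners met in the middle
        # "Migrate" the in-use page into the free slot.
        zone[migrate_pfn], zone[free_pfn] = False, True

    return zone


print(compact([True, False, True, False, False, True]))
# [False, False, False, True, True, True]
```

No page is freed along the way: the number of in-use pages is the same before
and after, only their positions change.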

>> --
>> Best Regards,
>> Yan, Zi

Best Regards,
Yan, Zi
