Message-ID: <13347394-fc63-44b2-9fa0-455f56d9b19d@arm.com>
Date: Mon, 9 Oct 2023 15:10:44 +0100
Subject: Re: [RFC PATCH 0/4] Enable >0 order folio memory compaction
To: Zi Yan
Cc: linux-mm@kvack.org, linux-kernel@vger.kernel.org, Andrew Morton,
 "Matthew Wilcox (Oracle)", David Hildenbrand, "Yin, Fengwei", Yu Zhao,
 Vlastimil Babka, Johannes Weiner, Baolin Wang, Kemeng Shi, Mel Gorman,
 Rohan Puri, Mcgrof Chamberlain, Adam Manzanares, John Hubbard
References: <20230912162815.440749-1-zi.yan@sent.com>
 <5caf5aee-9142-46f6-9a04-5b6e36880b21@arm.com>
 <3430F048-0B75-4D2F-A097-753E8B1866B2@nvidia.com>
From: Ryan Roberts
In-Reply-To: <3430F048-0B75-4D2F-A097-753E8B1866B2@nvidia.com>

On 09/10/2023 14:24, Zi Yan wrote:
> On 2 Oct 2023, at 8:32, Ryan Roberts wrote:
>
>> Hi Zi,
>>
>> On 12/09/2023 17:28, Zi Yan wrote:
>>> From: Zi Yan
>>>
>>> Hi all,
>>>
>>> This patchset enables >0 order folio memory compaction, which is one of
>>> the prerequisites for large folio support[1]. It is on top of
>>> mm-everything-2023-09-11-22-56.
>>
>> I've taken a quick look at these and realize I'm not well equipped to provide
>> much in the way of meaningful review comments; all I can say is thanks for
>> putting this together, and yes, I think it will become even more important for
>> my work on anonymous large folios.
>>
>>
>>>
>>> Overview
>>> ===
>>>
>>> To support >0 order folio compaction, the patchset changes how free pages used
>>> for migration are kept during compaction. Free pages used to be split into
>>> order-0 pages that are post allocation processed (i.e., PageBuddy flag cleared,
>>> page order stored in page->private is zeroed, and page reference is set to 1).
>>> Now all free pages are kept in a MAX_ORDER+1 array of page lists based
>>> on their order without post allocation process. When migrate_pages() asks for
>>> a new page, one of the free pages, based on the requested page order, is
>>> then processed and given out.
>>>
>>>
>>> Optimizations
>>> ===
>>>
>>> 1. Free page split is added to increase migration success rate in case
>>>    a source page does not have a matched free page in the free page lists.
>>>    Free page merge is possible but not implemented, since existing
>>>    PFN-based buddy page merge algorithm requires the identification of
>>>    buddy pages, but free pages kept for memory compaction cannot have
>>>    PageBuddy set to avoid confusing other PFN scanners.
>>>
>>> 2. Sort source pages in ascending order before migration is added to
>>>    reduce free page split. Otherwise, high order free pages might be
>>>    prematurely split, causing undesired high order folio migration failures.
>>
>> Not knowing much about how compaction actually works, naively I would imagine
>> that if you are just trying to free up a known amount of contiguous physical
>> space, then working through the pages in PFN order is more likely to yield the
>> result quicker? Unless all of the pages in the set must be successfully migrated
>> in order to free up the required amount of space...
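
For reference, the per-order free lists and the split-without-merge behaviour
described in the quoted cover letter can be modelled roughly as below. This is
only a toy sketch in Python, not the patchset's code: the FreeLists class, its
add() and alloc() methods, and the MAX_ORDER value of 10 are assumptions made
for illustration, and blocks are represented by their starting PFN only.

```python
MAX_ORDER = 10  # assumed for illustration; not taken from the patchset


class FreeLists:
    """Toy model: one list of free-block start PFNs per order."""

    def __init__(self):
        # MAX_ORDER + 1 lists, indexed by order 0..MAX_ORDER.
        self.lists = [[] for _ in range(MAX_ORDER + 1)]

    def add(self, pfn, order):
        """Record a free block of 2**order pages starting at pfn."""
        self.lists[order].append(pfn)

    def alloc(self, order):
        """Return the start PFN of a free block of the requested order.

        If no block of that order is available, split the smallest larger
        block. Blocks are never merged back. Returns None on failure.
        """
        for cur in range(order, MAX_ORDER + 1):
            if not self.lists[cur]:
                continue
            pfn = self.lists[cur].pop()
            while cur > order:
                cur -= 1
                # Keep the lower half, file the upper half under the
                # next smaller order.
                self.lists[cur].append(pfn + (1 << cur))
            return pfn
        return None


fl = FreeLists()
fl.add(0, 2)            # a single order-2 free block at PFN 0
first = fl.alloc(0)     # an order-0 request forces a split
second = fl.alloc(2)    # an order-2 request can no longer be served
```

In this model, once the order-0 request has split the only order-2 block, the
later order-2 request fails because the pieces are never merged. That is one
way to picture a split leading to a high order migration failure; whether it
matches the exact case the second optimization targets is not something this
sketch can show.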
>
> During compaction, pages are not freed, since that is the job of page reclaim.

Sorry yes - my fault for using sloppy language. When I said "free up a known
amount of contiguous physical space", I really meant "move pages in order to
recover an amount of contiguous physical space". But I still think the rest of
what I said applies; wouldn't you be more likely to reach your goal quicker if
you sort by PFN?

> The goal of compaction is to get a high order free page without freeing existing
> pages to avoid potential high cost IO operations. If compaction does not work,
> page reclaim would free pages to get us there (and potentially another follow-up
> compaction). So either pages are migrated or stay where they are during compaction.
>
> BTW compaction works by scanning in use pages from lower PFN to higher PFN,
> and free pages from higher PFN to lower PFN until two scanners meet in the middle.
>
> --
> Best Regards,
> Yan, Zi
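
The two-scanner behaviour described in the quoted text above can be pictured
with a small order-0-only model. This is a simplified Python sketch and not
kernel code: the compact() function and the list-of-pages memory map are
hypothetical, and it ignores pageblocks, page movability, migration failures
and folio orders entirely.

```python
def compact(mem):
    """Compact a toy memory map in place.

    mem is a list indexed by PFN: None marks a free page, anything else
    marks an in-use page. Returns the number of pages migrated.
    """
    migrate = 0          # migration scanner: low PFN -> high PFN
    free = len(mem) - 1  # free scanner: high PFN -> low PFN
    moved = 0
    while migrate < free:
        if mem[migrate] is None:
            migrate += 1      # nothing to migrate at this PFN
        elif mem[free] is not None:
            free -= 1         # no free target at this PFN
        else:
            # Move the in-use page to the free slot; nothing is freed.
            mem[free], mem[migrate] = mem[migrate], None
            moved += 1
            migrate += 1
            free -= 1
    return moved


mem = ["A", None, "B", None, None, "C", None, None]
moved = compact(mem)
```

In this example two pages are migrated before the scanners meet, leaving
PFNs 0-4 as one contiguous free run while "C" stays where it is. No page is
freed along the way, which matches the point that pages either migrate or stay
put during compaction.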