From mboxrd@z Thu Jan 1 00:00:00 1970
Message-ID: <03c77958-5d02-4611-b6d4-2a5c94425e70@arm.com>
Date: Tue, 10 Oct 2023 11:00:10 +0100
MIME-Version: 1.0
User-Agent: Mozilla Thunderbird
Subject: Re: [RFC PATCH 0/4] Enable >0 order folio memory compaction
Content-Language: en-GB
To: Zi Yan
Cc: linux-mm@kvack.org, linux-kernel@vger.kernel.org, Andrew Morton,
 "Matthew Wilcox (Oracle)", David Hildenbrand, "Yin, Fengwei", Yu Zhao,
 Vlastimil Babka, Johannes Weiner, Baolin Wang, Kemeng Shi, Mel Gorman,
 Rohan Puri, Mcgrof Chamberlain, Adam Manzanares, John Hubbard
References: <20230912162815.440749-1-zi.yan@sent.com>
 <5caf5aee-9142-46f6-9a04-5b6e36880b21@arm.com>
 <3430F048-0B75-4D2F-A097-753E8B1866B2@nvidia.com>
 <13347394-fc63-44b2-9fa0-455f56d9b19d@arm.com>
 <96622D29-4CC6-4281-96B1-319E5F317EDD@nvidia.com>
From: Ryan Roberts
In-Reply-To: <96622D29-4CC6-4281-96B1-319E5F317EDD@nvidia.com>
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 7bit
Sender: owner-linux-mm@kvack.org
Precedence: bulk

On 09/10/2023 16:52, Zi Yan wrote:
> (resent as plain text)
> On 9 Oct 2023, at 10:10, Ryan Roberts wrote:
>
>> On 09/10/2023 14:24, Zi Yan wrote:
>>> On 2 Oct 2023, at 8:32, Ryan Roberts wrote:
>>>
>>>> Hi Zi,
>>>>
>>>> On 12/09/2023 17:28, Zi Yan wrote:
>>>>> From: Zi Yan
>>>>>
>>>>> Hi all,
>>>>>
>>>>> This patchset enables >0 order folio memory compaction, which is one of
>>>>> the prerequisites for large folio
>>>>> support[1]. It is on top of
>>>>> mm-everything-2023-09-11-22-56.
>>>>
>>>> I've taken a quick look at these and realize I'm not well equipped to provide
>>>> much in the way of meaningful review comments; all I can say is thanks for
>>>> putting this together, and yes, I think it will become even more important for
>>>> my work on anonymous large folios.
>>>>
>>>>
>>>>>
>>>>> Overview
>>>>> ===
>>>>>
>>>>> To support >0 order folio compaction, the patchset changes how free pages used
>>>>> for migration are kept during compaction. Free pages used to be split into
>>>>> order-0 pages that are post allocation processed (i.e., PageBuddy flag cleared,
>>>>> page order stored in page->private is zeroed, and page reference is set to 1).
>>>>> Now all free pages are kept in a MAX_ORDER+1 array of page lists based
>>>>> on their order without post allocation process. When migrate_pages() asks for
>>>>> a new page, one of the free pages, based on the requested page order, is
>>>>> then processed and given out.
>>>>>
>>>>>
>>>>> Optimizations
>>>>> ===
>>>>>
>>>>> 1. Free page split is added to increase migration success rate in case
>>>>> a source page does not have a matched free page in the free page lists.
>>>>> Free page merge is possible but not implemented, since existing
>>>>> PFN-based buddy page merge algorithm requires the identification of
>>>>> buddy pages, but free pages kept for memory compaction cannot have
>>>>> PageBuddy set to avoid confusing other PFN scanners.
>>>>>
>>>>> 2. Sort source pages in ascending order before migration is added to
>>>>> reduce free page split. Otherwise, high order free pages might be
>>>>> prematurely split, causing undesired high order folio migration failures.
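
To make the quoted overview concrete, here is a minimal Python sketch of
free pages kept in one list per order, with a larger page split on demand
when no matching order is available. It is a toy model, not the patchset's
code: the FreeLists class, its add()/take() methods, the MAX_ORDER value of
10 and the choice to hand out the lower half of a split page are all
assumptions made for illustration.

```python
MAX_ORDER = 10  # assumed for illustration; the real value depends on kernel config


class FreeLists:
    """Toy model: free pages kept per order, each identified by its start PFN."""

    def __init__(self):
        # MAX_ORDER + 1 lists, one for each order 0..MAX_ORDER.
        self.lists = [[] for _ in range(MAX_ORDER + 1)]
        self.splits = 0

    def add(self, pfn, order):
        """Record a free page of 2**order base pages starting at pfn."""
        self.lists[order].append(pfn)

    def take(self, order):
        """Return the start PFN of a free page of `order`, or None.

        If no page of the requested order is held, the smallest larger
        page is split in halves until the requested order is reached.
        """
        for cur in range(order, MAX_ORDER + 1):
            if self.lists[cur]:
                break
        else:
            return None  # nothing large enough is available

        pfn = self.lists[cur].pop()
        while cur > order:
            cur -= 1
            # Keep the lower half, return the upper half to the lists.
            self.lists[cur].append(pfn + (1 << cur))
            self.splits += 1
        return pfn


fl = FreeLists()
fl.add(1024, 4)          # one order-4 free page (16 base pages)
print(fl.take(2))        # an order-2 request forces two splits
print(fl.lists[3], fl.lists[2], fl.splits)
```

Note that the model has no merge step, which mirrors the quoted statement
that free page merge is possible but not implemented: once the order-4 page
is split, a later order-4 request cannot be satisfied from its pieces.
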
>>>>
>>>> Not knowing much about how compaction actually works, naively I would imagine
>>>> that if you are just trying to free up a known amount of contiguous physical
>>>> space, then working through the pages in PFN order is more likely to yield the
>>>> result quicker? Unless all of the pages in the set must be successfully migrated
>>>> in order to free up the required amount of space...
>>>
>>> During compaction, pages are not freed, since that is the job of page reclaim.
>>
>> Sorry yes - my fault for using sloppy language. When I said "free up a known
>> amount of contiguous physical space", I really meant "move pages in order to
>> recover an amount of contiguous physical space". But I still think the rest of
>> what I said applies; wouldn't you be more likely to reach your goal quicker if
>> you sort by PFN?
>
> Not always. If the in-use folios on the left are order-2, order-2, order-4
> (all contiguous in one pageblock) and free pages on the right are order-4 (pageblock N),
> order-2, order-2 (pageblock N-1; it is not a single order-8, since there are
> in-use folios in the middle), going in PFN order will not get you an order-8 free
> page, since the first order-4 free page will be split into two order-2 for the first
> two order-2 in-use folios. But if you migrate in the descending order of
> in-use page orders, you can get an order-8 free page at the end.
>
> The patchset minimizes free page splits to avoid the situation described above,
> since once a high order free page is split, the opportunity of migrating a high order
> in-use folio into it is gone and hardly recoverable.

OK I get it now - thanks!

>
>
>>> The goal of compaction is to get a high order free page without freeing existing
>>> pages to avoid potential high cost IO operations. If compaction does not work,
>>> page reclaim would free pages to get us there (and potentially another follow-up
>>> compaction). So either pages are migrated or stay where they are during compaction.
>>>
>>> BTW compaction works by scanning in use pages from lower PFN to higher PFN,
>>> and free pages from higher PFN to lower PFN until two scanners meet in the middle.
>>>
>>> --
>>> Best Regards,
>>> Yan, Zi
>
>
> Best Regards,
> Yan, Zi
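
For anyone else following the thread, the order-2, order-2, order-4 example
quoted above can be replayed with a small Python sketch. It is only a toy
model under stated assumptions, not the patchset's algorithm: migrate() is a
hypothetical helper, free pages are tracked as per-order counts, and the free
scanner is assumed to hand over one pageblock's worth of free pages at a time
(highest PFN first), and only when nothing large enough is already on hand.

```python
def migrate(folio_orders, free_batches, max_order=10):
    """Try to migrate folios in the given sequence; return the orders that failed.

    folio_orders: orders of the in-use folios, in the sequence they are migrated.
    free_batches: lists of free page orders, one list per pageblock, in the
                  sequence the (assumed) free scanner would isolate them.
    """
    free = [0] * (max_order + 1)  # free[o] = number of free pages of order o
    batches = list(free_batches)  # free pages not isolated yet
    failed = []
    for order in folio_orders:
        # Isolate another pageblock only if nothing large enough is on hand.
        while not any(free[order:]) and batches:
            for o in batches.pop(0):
                free[o] += 1
        # Smallest free page that can hold the folio, if any.
        src = next((o for o in range(order, max_order + 1) if free[o]), None)
        if src is None:
            failed.append(order)
            continue
        free[src] -= 1
        # Splitting down leaves one unused half at each intermediate order.
        for o in range(order, src):
            free[o] += 1
    return failed


free_batches = [[4], [2, 2]]   # pageblock N, then pageblock N-1

print(migrate([2, 2, 4], free_batches))   # PFN order
print(migrate([4, 2, 2], free_batches))   # descending folio order
```

In this model the PFN-order run splits the order-4 free page for the first
order-2 folio and later has nothing left for the order-4 folio, while the
descending run places all three folios without any split.
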