From: Ryan Roberts
To: Zi Yan
Cc: David Hildenbrand, Matthew Wilcox, Andrew Morton, "Kirill A. Shutemov",
    Yin Fengwei, Yu Zhao, Catalin Marinas, Will Deacon, Anshuman Khandual,
    Yang Shi, linux-arm-kernel@lists.infradead.org,
    linux-kernel@vger.kernel.org, linux-mm@kvack.org
Subject: Re: [PATCH v2 0/5] variable-order, large folios for anonymous memory
Date: Wed, 19 Jul 2023 19:37:25 +0100
Message-ID: <7c3b347c-11a1-6250-9038-c0c58c5ebd89@arm.com>
In-Reply-To: <4DD00BE6-4141-4887-B5E5-0B7E8D1E2086@nvidia.com>
References: <20230703135330.1865927-1-ryan.roberts@arm.com>
    <78159ed0-a233-9afb-712f-2df1a4858b22@redhat.com>
    <4d4c45a2-0037-71de-b182-f516fee07e67@arm.com>
    <4DD00BE6-4141-4887-B5E5-0B7E8D1E2086@nvidia.com>

On 19/07/2023 17:05, Zi Yan
wrote:
> On 19 Jul 2023, at 11:49, Ryan Roberts wrote:
>
>> On 10/07/2023 17:53, Zi Yan wrote:
>>> On 7 Jul 2023, at 9:24, David Hildenbrand wrote:
>>>
>>>> On 07.07.23 15:12, Matthew Wilcox wrote:
>>>>> On Fri, Jul 07, 2023 at 01:40:53PM +0200, David Hildenbrand wrote:
>>>>>> On 06.07.23 10:02, Ryan Roberts wrote:
>>>>>> But can you comment on the page migration part (IOW did you try it already)?
>>>>>>
>>>>>> For example, memory hotunplug, CMA, MCE handling, compaction all rely on
>>>>>> page migration of something that was allocated using GFP_MOVABLE to actually
>>>>>> work.
>>>>>>
>>>>>> Compaction seems to skip any higher-order folios, but the question is if the
>>>>>> underlying migration itself works.
>>>>>>
>>>>>> If it already works: great! If not, this really has to be tackled early,
>>>>>> because otherwise we'll be breaking the GFP_MOVABLE semantics.
>>>>>
>>>>> I have looked at this a bit. _Migration_ should be fine. _Compaction_
>>>>> is not.
>>>>
>>>> Thanks! Very nice if at least ordinary migration works.
>>>>
>>>>>
>>>>> If you look at a function like folio_migrate_mapping(), it all seems
>>>>> appropriately folio-ised. There might be something in there that is
>>>>> slightly wrong, but that would just be a bug to fix, not a huge
>>>>> architectural problem.
>>>>>
>>>>> The problem comes in the callers of migrate_pages(). They pass a
>>>>> new_folio_t callback. alloc_migration_target() is the usual one passed
>>>>> and as far as I can tell is fine. I've seen no problems reported with it.
>>>>>
>>>>> compaction_alloc() is a disaster, and I don't know how to fix it.
>>>>> The compaction code has its own allocator which is populated with order-0
>>>>> folios. How it populates that freelist is awful ... see split_map_pages()
>>>>
>>>> Yeah, all that code was written under the assumption that we're moving order-0 pages (which is what the anon+pagecache pages part).
>>>>
>>>> From what I recall, we're allocating order-0 pages from the high memory addresses, so we can migrate from low memory addresses, effectively freeing up low memory addresses and filling high memory addresses.
>>>>
>>>> Adjusting that will be ... interesting. Instead of allocating order-0 pages from high addresses, we might want to allocate "as large as possible" ("grab what we can") from high addresses and then have our own kind of buddy for allocating from that pool a compaction destination page, depending on our source page. Nasty.
>>>
>>> We probably do not need a pool, since before migration, we have isolated folios to
>>> be migrated and can come up with stats on how many folios there are at each order.
>>> Then, we can isolate free pages based on the stats and do not split free pages
>>> all the way down to order-0. We can sort the source folios based on their orders
>>> and isolate free pages from largest order to smallest order. That could avoid
>>> a free page pool.
>>
>> Hi Zi, I just wanted to check; is this something you are working on or planning
>> to work on? I'm trying to maintain a list of all the items that need to get
>> sorted for large anon folios. It would be great to put your name against it! ;-)
>
> Sure. I can work on this one.

Awesome - thanks!

>
> --
> Best Regards,
> Yan, Zi
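
For reference, the per-order approach Zi describes above (count the isolated source folios at each order, then pick free pages from the largest order down without splitting everything to order-0) could be sketched roughly as below. This is only a user-space illustration of the bookkeeping, not kernel code and not anything from the patch series. SKETCH_MAX_ORDER, count_source_orders() and plan_targets() are hypothetical names, and treating leftover larger blocks as splittable capacity for smaller orders is an assumption of this sketch.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical cap for this sketch; not the kernel's own order limit. */
#define SKETCH_MAX_ORDER 10
#define SKETCH_NR_ORDERS (SKETCH_MAX_ORDER + 1)

/* Step 1: build per-order stats for the isolated source folios. */
static void count_source_orders(const unsigned int *src_orders, size_t nr,
                                unsigned int need[SKETCH_NR_ORDERS])
{
    for (size_t i = 0; i < SKETCH_NR_ORDERS; i++)
        need[i] = 0;
    for (size_t i = 0; i < nr; i++) {
        assert(src_orders[i] <= SKETCH_MAX_ORDER);
        need[src_orders[i]]++;
    }
}

/*
 * Step 2: walk from the largest order to the smallest and match each
 * order's demand against free blocks of that order. Larger blocks that
 * were not needed are counted as capacity for the next order down
 * (one block becomes two). A real implementation would only split a
 * block when a smaller request actually consumes it.
 *
 * got[order] receives the number of destinations found at that order.
 * Returns the number of source folios left without a destination.
 */
static unsigned int plan_targets(const unsigned int need[SKETCH_NR_ORDERS],
                                 const unsigned int free_blocks[SKETCH_NR_ORDERS],
                                 unsigned int got[SKETCH_NR_ORDERS])
{
    unsigned int carry = 0; /* unused larger blocks, expressed at this order */
    unsigned int unmet = 0;

    for (int order = SKETCH_MAX_ORDER; order >= 0; order--) {
        unsigned int avail = free_blocks[order] + carry;
        unsigned int take = need[order] < avail ? need[order] : avail;

        got[order] = take;
        unmet += need[order] - take;
        carry = (avail - take) * 2;
    }
    return unmet;
}
```

A caller would fill an array with the orders of the isolated source folios, call count_source_orders() to get the stats, then pass those stats and the per-order free block counts to plan_targets(). No separate free page pool is kept; only per-order counters are needed.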