From: Ryan Roberts
To: Andrew Morton
Cc: Matthew Wilcox, Yin Fengwei, David Hildenbrand, Yu Zhao,
 Catalin Marinas, Anshuman Khandual, Yang Shi, "Huang, Ying", Zi Yan,
 Luis Chamberlain, Itaru Kitayama, "Kirill A. Shutemov", John Hubbard,
 David Rientjes, Vlastimil Babka, Hugh Dickins, Kefeng Wang,
 Barry Song <21cnbao@gmail.com>, Alistair Popple, linux-mm@kvack.org,
 linux-arm-kernel@lists.infradead.org, linux-kernel@vger.kernel.org
Subject: Re: [PATCH v8 00/10] Multi-size THP for anonymous memory
Date: Tue, 5 Dec 2023 09:34:23 +0000
Message-ID: <2de0617e-d1d7-49ec-9cb8-206eaf37caed@arm.com>
In-Reply-To: <20231204113039.42510c23455026e40c5e2a56@linux-foundation.org>
References: <20231204102027.57185-1-ryan.roberts@arm.com>
 <20231204113039.42510c23455026e40c5e2a56@linux-foundation.org>

On 04/12/2023 19:30, Andrew Morton wrote:
> On Mon, 4 Dec 2023 10:20:17 +0000 Ryan Roberts wrote:
>
>> Hi All,
>>
>>
>> Prerequisites
>> =============
>>
>> Some work items identified as being prerequisites are listed on page 3 at [9].
>> The summary is:
>>
>> | item                          | status                  |
>> |:------------------------------|:------------------------|
>> | mlock                         | In mainline (v6.7)      |
>> | madvise                       | In mainline (v6.6)      |
>> | compaction                    | v1 posted [10]          |
>> | numa balancing                | Investigated: see below |
>> | user-triggered page migration | In mainline (v6.7)      |
>> | khugepaged collapse           | In mainline (NOP)       |
>
> What does "prerequisites" mean here? Won't compile without? Kernel
> crashes without? Nice-to-have-after? Please expand on this.

Short answer: It's supposed to mean things that either need to be done to
prevent the mm from regressing (both correctness and performance) when
multi-size THP is present but disabled, or things that need to be done to
make the mm robust (but not necessarily optimally performant) when
multi-size THP is enabled. But in reality, all of the things on the list
could really be reclassified as "nice-to-have-after", IMHO; their absence
will cause neither compilation nor runtime errors.

Longer answer: When I first started looking at this, I was advised that
there were likely a number of corners which made assumptions about large
folios always being PMD-sized, and if not found and fixed, could lead to
stability issues. At the time I was also pursuing a strategy of multi-size
THP being a compile-time feature with no runtime control, so I decided it
was important for multi-size THP to not effectively disable other features
(e.g. various madvise ops used to ignore PTE-mapped large folios).
This list represents all the things that I could find based on code review,
as well as things suggested by others, and in the end, they all fall into
that last category of "PTE-mapped large folios effectively disable existing
features". But given we now have runtime controls to opt in to multi-size
THP, I'm not sure we need to classify these as prerequisites. But I didn't
want to unilaterally make that decision, given this list has previously
been discussed and agreed by others.

It's also worth noting that in the case of compaction, that's already a
problem for large folios in the page cache; large folios will be skipped.

>
> I looked at [9], but access is denied.

Sorry about that; it's owned by David Rientjes so I can't fix that for you.
It's a PDF of a slide with the following table:

+-------------------------------+------------------------------------------------------------------------+--------------+--------------------+
| Item                          | Description                                                            | Assignee     | Status             |
+-------------------------------+------------------------------------------------------------------------+--------------+--------------------+
| mlock                         | Large, pte-mapped folios are ignored when mlock is requested.          | Yin, Fengwei | In mainline (v6.7) |
|                               | Code comment for mlock_vma_folio() says "...filter out pte mappings    |              |                    |
|                               | of THPs which cannot be consistently counted: a pte mapping of the     |              |                    |
|                               | THP head cannot be distinguished by the page alone."                   |              |                    |
| madvise                       | MADV_COLD, MADV_PAGEOUT, MADV_FREE: For large folios, code assumes     | Yin, Fengwei | In mainline (v6.6) |
|                               | exclusive only if mapcount==1, else skips remainder of operation.      |              |                    |
|                               | For large, pte-mapped folios, exclusive folios can have mapcount       |              |                    |
|                               | up to nr_pages and still be exclusive. Even better; don't split        |              |                    |
|                               | the folio if it fits entirely within the range.                        |              |                    |
| compaction                    | Raised at LSFMM: Compaction skips non-order-0 pages.                   | Zi Yan       | v1 posted          |
|                               | Already a problem for page-cache pages today.                          |              |                    |
| numa balancing                | Large, pte-mapped folios are ignored by numa-balancing code. Commit    | John Hubbard | Investigated:      |
|                               | comment (e81c480): "We're going to have THP mapped with PTEs. It       |              | Not prerequisite   |
|                               | will confuse numabalancing. Let's skip them for now."                  |              |                    |
| user-triggered page migration | mm/migrate.c (migrate_pages syscall) We don't want to migrate folio    | Kefeng Wang  | In mainline (v6.7) |
|                               | that is shared.                                                        |              |                    |
| khugepaged collapse           | collapse small-sized THP to PMD-sized THP in khugepaged/MADV_COLLAPSE. | Ryan Roberts | In mainline (NOP)  |
|                               | Kirill thinks khugepaged should already be able to collapse            |              |                    |
|                               | small large folios to PMD-sized THP; verification required.            |              |                    |
+-------------------------------+------------------------------------------------------------------------+--------------+--------------------+

Thanks,
Ryan

>
>> [9] https://drive.google.com/file/d/1GnfYFpr7_c1kA41liRUW5YtCb8Cj18Ud/view?usp=sharing&resourcekey=0-U1Mj3-RhLD1JV6EThpyPyA
>
>
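
To illustrate the madvise row of the table above, here is a toy model of why a mapcount==1 test can misclassify a large, pte-mapped folio that only one process maps. This is a sketch only, not kernel code: ToyFolio, ptes_by_proc, naive_exclusive and truly_exclusive are names made up for this example, and it assumes the mapcount is simply the total number of PTEs mapping the folio.

```python
from dataclasses import dataclass
from typing import Dict


@dataclass
class ToyFolio:
    """Toy model of a large folio; NOT a kernel structure."""
    nr_pages: int
    # Hypothetical bookkeeping: number of PTEs mapping this folio, per process.
    ptes_by_proc: Dict[str, int]

    @property
    def mapcount(self) -> int:
        # Assumption of this sketch: mapcount is the total PTE count.
        return sum(self.ptes_by_proc.values())


def naive_exclusive(folio: ToyFolio) -> bool:
    """The assumption described in the table: exclusive only if mapcount == 1."""
    return folio.mapcount == 1


def truly_exclusive(folio: ToyFolio) -> bool:
    """Ground truth in this model: exactly one process maps the folio."""
    owners = [proc for proc, n in folio.ptes_by_proc.items() if n > 0]
    return len(owners) == 1


# A 16-page folio fully pte-mapped by a single process.
private = ToyFolio(nr_pages=16, ptes_by_proc={"A": 16})
# The same folio size, with its pages split between two processes.
shared = ToyFolio(nr_pages=16, ptes_by_proc={"A": 8, "B": 8})

for name, folio in (("private", private), ("shared", shared)):
    print(name, folio.mapcount, naive_exclusive(folio), truly_exclusive(folio))
```

In this model the privately mapped folio has a mapcount equal to nr_pages, so the mapcount==1 test treats it as shared even though a single process owns it. The second folio shows that a mapcount no larger than nr_pages does not by itself prove exclusivity either.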