From: David Hildenbrand
Organization: Red Hat
Date: Wed, 27 Sep 2023 17:32:19 +0200
To: Ryan Roberts, John Hubbard, Matthew Wilcox, Yang Shi, "Yin, Fengwei",
 Yu Zhao, Zi Yan, David Rientjes, Andrew Morton, Vlastimil Babka,
 "Kirill A. Shutemov", Hugh Dickins
Cc: Linux-MM
Subject: Re: ANON_LARGE_FOLIOS meeting follow-up & refined proposal
References: <4966f496-9f71-460c-b2ab-8661384ce626@arm.com>
 <4830fb3e-4a35-4842-98f4-9e7baa0e692a@arm.com>
 <7301771f-d654-4e5a-a197-3a3d8750440c@nvidia.com>
 <92937776-1e16-47e5-bef9-4c1a04bc98c0@arm.com>
 <5fa4aa95-6982-7879-e067-69fdb8b76d01@redhat.com>

On 27.09.23 09:23, Ryan Roberts wrote:
> On 26/09/2023 19:31, David Hildenbrand wrote:
>> On 25.09.23 10:51, Ryan Roberts wrote:
>>> On 23/09/2023 01:33, John Hubbard wrote:
>>>> On 9/22/23 08:48, Ryan Roberts wrote:
>>>> ...
>>>>> I never had any feedback on the below; I'm not sure if that means everyone is
>>>>> happy or that nobody read it??
>>>>
>>>> One can never really know: zero or more people read it, and of those, no
>>>> one hated it enough to send out a quick NAK. So that's a *possible*,
>>>> lukewarm endorsement of sorts. Success! :)
>>>
>>> You really know how to fill a guy with confidence! ;-)
>>>
>>>>
>>>> ...
>>>>
>>>>> BUT I've had yet another idea on the controls front, which would enable
>>>>> exposing
>>>>> this to user space as an extension to transparent_hugepage, while continuing to
>>>>> support THP as is and also be able to control THP and ALF (anon large folio)
>>>>
>>>> The new ALF / ANON_LARGE_FOLIO naming looks good to me. The grep aspect
>>>> is a nice touch.
>>>
>>> Well if we go the route of the newest proposal, then I guess the naming is less
>>> important, because it all attaches to transparent_hugepage.
>>
>> I agree that ALF is better. But having something under "THP", that is not THP
>> and not accounted as THP ... I don't quite like it (although, before we
>> discussed that approach in the past, I did like it).
>
> I know we discussed and concluded against putting it under THP in the past, but
> I think that decision was driven by not having any proposal that would allow
> us to put it under THP without breaking the expectations of existing (PMD-sized)
> THP users, or not being able to control use of the lower orders and PMD-order.

Not only that. It was also because we didn't want to confuse users/devs
who assume that THP == PMD-sized.

I'll CC Hugh; I recall he had an opinion on that (I recall some comments
about cleanly separating both features towards the user).

> Personally I think my latest proposal is a way to solve that problem, and in
> that case, I personally think exposing it as an extension to THP is neater:
>
>  - all existing THP controls work as they did before
>  - new anon_orders and anon_always_mask files allow opt-in to
>    smaller-than-PMD-orders

As "enable" controls anon only (that's correct, right?), maybe these
should also simply be called "orders" and "always_mask". shmem could get
its own set, like "shmem_enable".

>  - All existing counters remain unchanged, and continue to count PMD-mapped THP
>    only:
>     - /proc/meminfo:AnonHugePages
>     - /sys/devices/system/node/nodeX/meminfo:AnonHugePages
>     - /proc/vmstat:nr_anon_transparent_hugepages
>     - /proc//smaps[_rollup]:AnonHugePages
>     - memory.stat(v1):rss_huge
>     - memory.stat(v2):anon_thp
>  - New counters introduced to count PTE-mapped THP/large folios:
>     - /proc/meminfo:AnonHugePteMap
>     - /sys/devices/system/node/nodeX/meminfo:AnonHugePteMap
>     - /proc/vmstat:nr_anon_thp_pte
>     - /proc//smaps[_rollup]:AnonHugePteMap
>     - memory.stat(v1):anon_thp_pte
>     - memory.stat(v2):anon_thp_pte
>  - It's a lot less code (I have an implementation for both approaches)
>
> Admittedly, I haven't spent too much time thinking about the other thp counters
> in vmstat yet (e.g. thp_fault_alloc, thp_fault_fallback, etc). Proposal is that
> for now, they would continue to be PMD-order only. But I think you could
> probably hook those up to the PTE-mapped ones as well, instead of duplicating all
> the counters.
>
> As Kirill mentioned, PTE-mapped THP is already a thing, so this approach just
> formalises it.

Not quite. PTE-mapped THPs were just a side effect of the transparency
handling. We never allocated and populated PTE-mapped PMD-sized THP on
allocation. So I don't immediately see the connection between both for
this case.

Would you account a PTE-mapped (PMD-sized) THP as anon_thp or
anon_thp_pte? What if it's mapped via PTEs and PMDs? I don't see how
that formalises that case for the existing PMD-sized THP.

>
> I also think the "huge" means PMD-size argument is a bit weak, given that THP
> supports PUD-size today for file mappings, and in the context of hugetlb, huge
> can mean contpte, pmd, contpmd, pud, etc.

I made similar statements in the past but was convinced otherwise :)

>
> I'll have the patch set ready to post by Friday. How about I post it, then we
> can continue the conversation in the context of the actual code? If the
> consensus is that this is not the way to do it, then I'll post the large_folio
> version instead?

No strong opinion from my side. I considered a "fresh start" without the
THP implication/terminology cleaner after all the previous discussions
[which I think was one of the outcomes of the previous discussions].

-- 
Cheers,

David / dhildenb