Date: Wed, 17 Jul 2024 10:50:57 +0100
Subject: Re: [PATCH v1 2/2] mm: mTHP stats for pagecache folio allocations
To: David Hildenbrand, Lance Yang, Baolin Wang
Cc: Andrew Morton, Hugh Dickins, Jonathan Corbet, "Matthew Wilcox (Oracle)",
 Barry Song, linux-kernel@vger.kernel.org, linux-mm@kvack.org
References: <20240711072929.3590000-1-ryan.roberts@arm.com>
 <20240711072929.3590000-3-ryan.roberts@arm.com>
 <9e0d84e5-2319-4425-9760-2c6bb23fc390@linux.alibaba.com>
 <756c359e-bb8f-481e-a33f-163c729afa31@redhat.com>
 <8c32a2fc-252d-406b-9fec-ce5bab0829df@arm.com>
 <5c58d9ea-8490-4ae6-b7bf-be816dab3356@redhat.com>
 <9052f430-2c5a-4d9d-b54c-bd093b797702@redhat.com>
 <5472faf5-1fbe-4a89-a17e-83716fc00b5a@redhat.com>
From: Ryan Roberts <ryan.roberts@arm.com>
In-Reply-To: <5472faf5-1fbe-4a89-a17e-83716fc00b5a@redhat.com>

On 17/07/2024 09:44, David Hildenbrand wrote:
>>>> I guess the real supported orders are:
>>>>
>>>>     anon:
>>>>       min order: 2
>>>>       max order: PMD_ORDER
>>>>     anon-shmem:
>>>>       min order: 1
>>>>       max order: MAX_PAGECACHE_ORDER
>>>>     tmpfs-shmem:
>>>>       min order: PMD_ORDER <= 11 ? PMD_ORDER : NONE
>>>>       max order: PMD_ORDER <= 11 ? PMD_ORDER : NONE
>>>>     file:
>>>>       min order: 1
>>>>       max order: MAX_PAGECACHE_ORDER
>>>
>>> That's my understanding. But not sure about anon-shmem really supporting
>>> order-1, maybe we do.
>>
>> Oh, I thought we only had the restriction for anon folios now (due to the
>> deferred split queue), so assumed it would just work. With Gavin's
>> THP_ORDERS_ALL_FILE_DEFAULT change, that certainly implies that shmem must
>> support order-1. If it doesn't, then we might want to tidy that further.
>>
>> Baolin, perhaps you can confirm either way?
>
> Currently there would not have been a way to enable it, right? (maybe I'm wrong)

__thp_vma_allowable_orders() doesn't do anything special for shmem if TVA_IN_PF
is set, so I guess it could conceivably return order-1 in that path. Not sure if
it ever gets called that way for shmem though - I don't think so. But agree that
shmem_allowable_huge_orders() will not currently return order-1.
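For concreteness, here is a quick userspace sketch (not kernel code) of how the
per-type order ranges above would map onto the hugepages-<size>kB names used by
the sysfs interface. The PMD_ORDER and MAX_PAGECACHE_ORDER values are
assumptions for a typical 4K-page config, not values taken from the patch:

/*
 * Userspace illustration only: print the hugepages-<size>kB names implied by
 * the per-type order ranges discussed above. PMD_ORDER = 9 and
 * MAX_PAGECACHE_ORDER = PMD_ORDER are assumptions for a 4K base page size.
 */
#include <stdio.h>

int main(void)
{
	const unsigned long page_kb = 4;	/* assumed 4K base pages */
	const int pmd_order = 9;		/* assumed PMD_ORDER */
	const int max_pc_order = 9;		/* assumed MAX_PAGECACHE_ORDER */

	printf("anon (2..PMD_ORDER):");
	for (int o = 2; o <= pmd_order; o++)
		printf(" hugepages-%lukB", page_kb << o);

	printf("\nfile/shmem (1..MAX_PAGECACHE_ORDER):");
	for (int o = 1; o <= max_pc_order; o++)
		printf(" hugepages-%lukB", page_kb << o);

	printf("\n");
	return 0;
}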
>
>>
>>>
>>>>
>>>> But today, controls and stats are exposed for:
>>>>
>>>>     anon:
>>>>       min order: 2
>>>>       max order: PMD_ORDER
>>>>     anon-shmem:
>>>>       min order: 2
>>>>       max order: PMD_ORDER
>>>>     tmpfs-shmem:
>>>>       min order: PMD_ORDER
>>>>       max order: PMD_ORDER
>>>>     file:
>>>>       min order: Nothing yet (this patch proposes 1)
>>>>       max order: Nothing yet (this patch proposes MAX_PAGECACHE_ORDER)
>>>>
>>>> So I think there is definitely a bug for shmem where the minimum order
>>>> control should be order-1 but it's currently order-2.
>>>
>>> Maybe, did not play with that yet. Likely order-1 will work. (although
>>> probably of questionable use :) )
>>
>> You might have to expand on why it's of "questionable use". I'd assume it has
>> the same amount of value as using order-1 for regular page cache pages? i.e.
>> half the number of objects to manage for the same amount of memory.
>
> order-1 was recently added for the pagecache to get some device setups running
> (IIRC, where we cannot use order-0, because device blocksize > PAGE_SIZE).
>
> You might be right about "half the number of objects", but likely just going
> for order-2, order-3, order-4 ... for shmem might be even better. And simply
> falling back to order-0 when you cannot get the larger orders.

Sure, but then you're into the territory of baking in policy. Remember that
originally I was only interested in 64K but the consensus was to expose all the
sizes. The same argument applies to 8K; expose it and let others decide policy.

>
> I could have sworn you mentioned something like that in your "configurable
> orders for pagecache" RFC that I only briefly skimmed so far :P

I'm exposing the 8K control for pagecache in that series.

>
> ... only enabling "order-1" and none of the other orders for shmem sounds
> rather "interesting".
>
> But yeah, maybe there is valid use for it, so I'm all for allowing it if it
> can be done.
>
>>
>>>
>>>>
>>>> I also wonder about PUD-order for DAX? We don't currently have a
>>>> stat/control. If we wanted to add it in future and we take the "expose all
>>>> stats/controls for all orders" approach, we would end up extending all the
>>>> way to PUD-order, and all the orders between PMD and PUD would be dummy for
>>>> all memory types. That really starts to feel odd, so I still favour only
>>>> populating what's really supported.
>>>
>>> I would go further and say that calling the fsdax thing a THP is borderline
>>> wrong and we should not expose any new toggles for it that way.
>>>
>>> It really behaves much more like hugetlb folios that can be PTE-mapped ...
>>> we cannot split these things, and they are not allocated from the buddy. So
>>> I wouldn't worry about fsdax for now.
>>>
>>> fsdax support for compound pages (now large folios) probably never should
>>> have been glued to any THP toggle.
>>
>> Yeah, fair enough. I wasn't really arguing for adding any dax controls; I was
>> just trying to think of examples as to why adding dummy controls might be a
>> bad idea.
>
> Yes.
>
>>>
>>>>
>>>> I propose to fix shmem (extend down to 1, stop at MAX_PAGECACHE_ORDER) and
>>>> continue with the approach of "indicating only what really exists" for v2.
>>>>
>>>> Shout if you disagree.
>>>
>>> Makes sense.
>>
>> Excellent. I posted v2, which has these changes, yesterday afternoon. :)
>
> Yes, still digging through mails ... in-between having roughly 1000 meetings a
> day :)

No problem. You're in demand. I can wait. :)
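P.S. On the "device blocksize > PAGE_SIZE" point above: the minimum folio order
a filesystem needs is just log2(blocksize / PAGE_SIZE). A trivial sketch,
assuming 4K base pages and using no kernel APIs, purely to show the arithmetic
(an 8K block size is exactly the order-1 case being discussed):

/*
 * Illustration of the "device blocksize > PAGE_SIZE" point: the smallest
 * folio order that can back a given filesystem block size. Assumes a 4K
 * PAGE_SIZE; the block sizes listed are examples, not from the thread.
 */
#include <stdio.h>

int main(void)
{
	const unsigned long page_size = 4096;	/* assumed PAGE_SIZE */
	const unsigned long block_sizes[] = { 4096, 8192, 16384, 65536 };

	for (unsigned i = 0; i < sizeof(block_sizes) / sizeof(block_sizes[0]); i++) {
		unsigned long bs = block_sizes[i];
		int order = 0;

		while ((page_size << order) < bs)
			order++;
		/* e.g. an 8K block size needs at least order-1 folios */
		printf("block size %lu -> min folio order %d\n", bs, order);
	}
	return 0;
}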