Message-ID: <486a72c6-5877-4a95-a587-2a32faa8785d@redhat.com>
Date: Tue, 22 Oct 2024 17:31:32 +0200
Subject: Re: [RFC PATCH v3 0/4] Support large folios for tmpfs
To: Baolin Wang, Daniel Gomez, "Kirill A. Shutemov"
Cc: Matthew Wilcox, akpm@linux-foundation.org, hughd@google.com,
 wangkefeng.wang@huawei.com, 21cnbao@gmail.com, ryan.roberts@arm.com,
 ioworker0@gmail.com, linux-mm@kvack.org, linux-kernel@vger.kernel.org,
 "Kirill A . Shutemov"
References: <6dohx7zna7x6hxzo4cwnwarep3a7rohx4qxubds3uujfb7gp3c@2xaubczl2n6d>
 <8e48cf24-83e1-486e-b89c-41edb7eeff3e@linux.alibaba.com>
From: David Hildenbrand
Organization: Red Hat

On 22.10.24 05:41, Baolin Wang wrote:
>
>
> On 2024/10/21 21:34, Daniel Gomez wrote:
>> On Mon Oct 21, 2024 at 10:54 AM CEST, Kirill A. Shutemov wrote:
>>> On Mon, Oct 21, 2024 at 02:24:18PM +0800, Baolin Wang wrote:
>>>>
>>>>
>>>> On 2024/10/17 19:26, Kirill A. Shutemov wrote:
>>>>> On Thu, Oct 17, 2024 at 05:34:15PM +0800, Baolin Wang wrote:
>>>>>> + Kirill
>>>>>>
>>>>>> On 2024/10/16 22:06, Matthew Wilcox wrote:
>>>>>>> On Thu, Oct 10, 2024 at 05:58:10PM +0800, Baolin Wang wrote:
>>>>>>>> Considering that tmpfs already has the 'huge=' option to control the THP
>>>>>>>> allocation, it is necessary to maintain compatibility with the 'huge='
>>>>>>>> option, as well as considering the 'deny' and 'force' option controlled
>>>>>>>> by '/sys/kernel/mm/transparent_hugepage/shmem_enabled'.
>>>>>>>
>>>>>>> No, it's not. No other filesystem honours these settings. tmpfs would
>>>>>>> not have had these settings if it were written today. It should simply
>>>>>>> ignore them, the way that NFS ignores the "intr" mount option now that
>>>>>>> we have a better solution to the original problem.
>>>>>>>
>>>>>>> To reiterate my position:
>>>>>>>
>>>>>>> - When using tmpfs as a filesystem, it should behave like other
>>>>>>> filesystems.
>>>>>>> - When using tmpfs to implement MAP_ANONYMOUS | MAP_SHARED, it should
>>>>>>> behave like anonymous memory.
>>>>>>
>>>>>> I do agree with your point to some extent, but the ‘huge=’ option has
>>>>>> existed for nearly 8 years, and the huge orders based on write size may not
>>>>>> achieve the performance of PMD-sized THP in some scenarios, such as when the
>>>>>> write length is consistently 4K. So, I am still concerned that ignoring the
>>>>>> 'huge' option could lead to compatibility issues.
>>>>>
>>>>> Yeah, I don't think we are there yet to ignore the mount option.
>>>>
>>>> OK.
>>>>
>>>>> Maybe we need to get a new generic interface to request the semantics
>>>>> tmpfs has with huge= on per-inode level on any fs. Like a set of FADV_*
>>>>> handles to make kernel allocate PMD-size folio on any allocation or on
>>>>> allocations within i_size. I think this behaviour is useful beyond tmpfs.
>>>>>
>>>>> Then huge= implementation for tmpfs can be re-defined to set these
>>>>> per-inode FADV_ flags by default. This way we can keep tmpfs compatible
>>>>> with current deployments and less special comparing to rest of
>>>>> filesystems on kernel side.
>>>>
>>>> I did a quick search, and I didn't find any other fs that require PMD-sized
>>>> huge pages, so I am not sure if FADV_* is useful for filesystems other than
>>>> tmpfs. Please correct me if I missed something.
>>>
>>> What do you mean by "require"? THPs are always opportunistic.
>>>
>>> IIUC, we don't have a way to hint kernel to use huge pages for a file on
>>> read from backing storage. Readahead is not always the right way.
>>>
>>>>> If huge= is not set, tmpfs would behave the same way as the rest of
>>>>> filesystems.
>>>>
>>>> So if 'huge=' is not set, tmpfs write()/fallocate() can still allocate large
>>>> folios based on the write size? If yes, that means it will change the
>>>> default huge behavior for tmpfs. Because previously having 'huge=' is not
>>>> set means the huge option is 'SHMEM_HUGE_NEVER', which is similar to what I
>>>> mentioned:
>>>> "Another possible choice is to make the huge pages allocation based on write
>>>> size as the *default* behavior for tmpfs, ..."
>>>
>>> I am more worried about breaking existing users of huge pages. So changing
>>> behaviour of users who don't specify huge is okay to me.
>>
>> I think moving tmpfs to allocate large folios opportunistically by
>> default (as it was proposed initially) doesn't necessary conflict with
>> the default behaviour (huge=never). We just need to clarify that in
>> the documentation.
>>
>> However, and IIRC, one of the requests from Hugh was to have a way to
>> disable large folios which is something other FS do not have control
>> of as of today. Ryan sent a proposal to actually control that globally
>> but I think it didn't move forward.
>> So, what are we missing to go back
>> to implement large folios in tmpfs in the default case, as any other fs
>> leveraging large folios?
>
> IMHO, as I discussed with Kirill, we still need maintain compatibility
> with the 'huge=' mount option. This means that if 'huge=never' is set
> for tmpfs, huge page allocation will still be prohibited (which can
> address Hugh's request?). However, if 'huge=' is not set, we can
> allocate large folios based on the write size.

I consider allocating large folios in shmem/tmpfs on the write path less
controversial than allocating them on the page fault path -- especially
as long as we stay within the size to-be-written.

I think in RHEL, THP on shmem/tmpfs is disabled by default (e.g.,
shmem_enabled=never), maybe because of some rather undesired
side effects (some of them possibly historical): I recall issues with
VMs using THP + memory ballooning, as we cannot reclaim pages of folios
if splitting fails. I assume most of these problematic use cases don't
use tmpfs as an ordinary file system (write()/read()), but mmap() the
whole thing.

Sadly, I can't find any information about shmem/tmpfs + THP in the RHEL
documentation; most documentation is only concerned with anon THP, which
makes me conclude that shmem/tmpfs THP is not recommended as of now.

I see more issues with allocating them on the page fault path and not
having a way to disable it -- compared to allocating them on the write()
path.

Getting Hugh's opinion on this would be very valuable.

-- 
Cheers,

David / dhildenb
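
For illustration, the write-path policy discussed above (no large folios
with huge=never; otherwise pick an order from the write size and stay
within the size to be written) could be sketched in userspace C roughly
as follows. This is only a sketch under stated assumptions, not kernel
code: enum huge_opt, write_order(), SKETCH_PAGE_SHIFT and
SKETCH_PMD_ORDER are hypothetical names, 4K base pages and an order-9
PMD size are assumed, and treating huge=always as "always PMD-sized" is
a simplification.

```c
#include <stddef.h>

/* Hypothetical stand-ins for the tmpfs 'huge=' mount settings. */
enum huge_opt { HUGE_UNSET, HUGE_NEVER, HUGE_ALWAYS };

#define SKETCH_PAGE_SHIFT 12u /* assumption: 4K base pages */
#define SKETCH_PMD_ORDER   9u /* assumption: 2M PMD-sized THP */

/*
 * Pick a folio order for a write of 'len' bytes at file offset 'pos'.
 * The folio never covers more pages than the write itself and must be
 * naturally aligned to its own size in the page cache.
 */
static unsigned int write_order(enum huge_opt opt, unsigned long long pos,
                                size_t len)
{
    unsigned long long index = pos >> SKETCH_PAGE_SHIFT;
    unsigned long long npages = (unsigned long long)len >> SKETCH_PAGE_SHIFT;
    unsigned int order = 0;

    if (opt == HUGE_NEVER)
        return 0;                /* large folios prohibited */
    if (opt == HUGE_ALWAYS)
        return SKETCH_PMD_ORDER; /* simplification: always try PMD size */

    /* Largest order whose page count still fits in the write, capped at PMD. */
    while (order < SKETCH_PMD_ORDER && (2ULL << order) <= npages)
        order++;

    /* Lower the order until the page index is aligned to the folio size. */
    while (order > 0 && (index & ((1ULL << order) - 1)))
        order--;

    return order;
}
```

Under these assumptions, a 64K write at offset 0 gets order 4, the same
write at offset 4K falls back to order 0 because of alignment, and a
steady stream of 4K writes never gets beyond order 0 unless huge=always
is set, which is the performance concern raised earlier in the thread.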