From: David Hildenbrand
To: Ryan Roberts, Nico Pache
Cc: linux-kernel@vger.kernel.org, linux-mm@kvack.org,
 anshuman.khandual@arm.com, catalin.marinas@arm.com, cl@gentwo.org,
 vbabka@suse.cz, mhocko@suse.com, apopple@nvidia.com,
 dave.hansen@linux.intel.com, will@kernel.org, baohua@kernel.org,
 jack@suse.cz, srivatsa@csail.mit.edu, haowenchao22@gmail.com,
 hughd@google.com, aneesh.kumar@kernel.org, yang@os.amperecomputing.com,
 peterx@redhat.com, ioworker0@gmail.com, wangkefeng.wang@huawei.com,
 ziy@nvidia.com, jglisse@google.com, surenb@google.com,
 vishal.moola@gmail.com, zokeefe@google.com, zhengqi.arch@bytedance.com,
 jhubbard@nvidia.com, 21cnbao@gmail.com, willy@infradead.org,
 kirill.shutemov@linux.intel.com, aarcange@redhat.com, raquini@redhat.com,
 dev.jain@arm.com, sunnanyong@huawei.com, usamaarif642@gmail.com,
 audra@redhat.com, akpm@linux-foundation.org
Subject: Re: [RFC 00/11] khugepaged: mTHP support
Date: Mon, 20 Jan 2025 19:39:03 +0100
Message-ID: <95472249-44f6-4764-a5fa-fac834eb5a49@redhat.com>
In-Reply-To: <9bf875ad-3e31-464d-bccd-7c737a2c53bc@arm.com>
References: <20250108233128.14484-1-npache@redhat.com>
 <40a65c5e-af98-45f9-a254-7e054b44dc95@arm.com>
 <37375ace-5601-4d6c-9dac-d1c8268698e9@redhat.com>
 <0a318ea8-7836-405a-a033-f073efdc958f@arm.com>
 <8305ddf7-1ada-4a75-a2c3-385b530b25d4@redhat.com>
 <9bf875ad-3e31-464d-bccd-7c737a2c53bc@arm.com>
Organization: Red Hat

On 20.01.25 17:27, Ryan Roberts wrote:
> On 20/01/2025 13:56, David Hildenbrand wrote:
>> On 20.01.25 14:37, Ryan Roberts wrote:
>>> On 20/01/2025 12:54, David Hildenbrand wrote:
>>>>>> I think the 1 problem that emerged during review of Dev's series, which
>>>>>> we don't have a proper solution to yet, is
>>>>>> the issue of "creep", where regions can be collapsed to progressively
>>>>>> higher orders through iterative scans. At each collapse, the required
>>>>>> thresholds (e.g. max_ptes_none) are met, and the collapse effectively
>>>>>> adds more non-none ptes so the next scan will then collapse to even
>>>>>> higher order. Does your solution suffer from this (theoretical/edge
>>>>>> case) issue? If not, how did you solve it?
>>>>>
>>>>> Yes, sadly it suffers from the same issue. Bringing max_ptes_none much
>>>>> lower as a default would "help".
>>>>
>>>> Can we just keep it simple and only support max_ptes_none = 511
>>>> ("pagefault behavior" -- PMD_NR_PAGES - 1) or max_ptes_none = 0
>>>> ("deferred behavior") and document that the other weird configurations
>>>> will make mTHP skip, because "weird and unexpected" ? :)
>
> nit: Rather than values of max_ptes_none other than 0 and max making mTHP
> skip, perhaps it's better to say we round to closest of 0 and max?

Maybe. Rounding down always implies doing something not necessarily desired.
In any case, I assume most setups just have the default values here ... :)

>
>>>>
>>>
>>> That sounds like a great simplification in principle!
>>
>> And certainly a much easier one to start with :)
>>
>> If we ever get the request to support something else, maybe that's also
>> where we can learn *why*, and what we would actually want to do with mTHP.
>>
>>> We would need to consider
>>> the swap and shared tunables too though. Perhaps we can pull a similar
>>> trick with those?
>>
>> Swapped and shared are a bit more challenging, because they are set to
>> "/ 2" or "/ 8" heuristics.
>>
>>
>> One simple starting point here is of course to say "when collapsing mTHP,
>> all have to be unshared and all have to be swapped in", so to essentially
>> ignore both tunables (in a memory friendly way, as if they are set to 0)
>> for mTHP collapse and worry about that later, when really required.
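
The two simplifications quoted above (only max_ptes_none = 0 or 511 for mTHP,
and treating the swap/shared tunables as 0) can be written down as a small toy
check. This is only a sketch to make the proposal concrete, not code from the
series: the function name and its arguments are made up for illustration, and
it assumes 4 KiB base pages (512 PTEs per PMD).

```python
PMD_ORDER = 9                     # assumption: 4 KiB base pages, 2 MiB PMD
PMD_NR_PAGES = 1 << PMD_ORDER     # 512 PTEs per PMD


def mthp_collapse_allowed(max_ptes_none, order, n_none, n_swap, n_shared):
    """Hypothetical policy check for one aligned range of 2**order PTEs.

    n_none, n_swap and n_shared are the counts of none, swapped-out and
    shared PTEs found in that range.
    """
    nr_ptes = 1 << order
    if max_ptes_none == PMD_NR_PAGES - 1:
        # "pagefault behavior": holes are fine, as long as the range is
        # not completely empty.
        none_ok = n_none < nr_ptes
    elif max_ptes_none == 0:
        # "deferred behavior": no holes allowed at all.
        none_ok = n_none == 0
    else:
        # Any other configuration: skip mTHP collapse entirely.
        return False
    # Starting point from the quoted text: behave as if max_ptes_swap and
    # max_ptes_shared were 0 for mTHP collapse.
    return none_ok and n_swap == 0 and n_shared == 0
```

With this model a 64 KiB range (order 4) with a single populated PTE is
eligible under max_ptes_none = 511, a range with any hole is rejected under
max_ptes_none = 0, and any in-between value (say 256) never collapses to mTHP.
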
>
> For swap, if we assume we start with the whole VMA swapped out, I think
> setting max_ptes_swap to 0 could still cause the "creep" problem if faulting
> pages back in sequentially? I guess that's creep due to faulting pattern
> though, so at least it's not due to collapse. Doesn't feel ideal though.
>
> I'm not sure what the semantics of "shared" are? I'm guessing it's
> specifically for private COWed pages, and khugepaged will trigger the COW on
> collapse?

Yes.

> So
> again depending on the pattern of writes we could still end up with creep in
> a similar way to swap?

I think the answer to both is "yes", so it's a simple starting point but not
necessarily what we want long term. The creep is at least "not wasting more
memory", because we don't collapse where PMD wouldn't have collapsed.

After all, right now we don't collapse mTHP at all; now we would collapse mTHP
in many scenarios, so we don't have to be perfect initially.

Deriving stuff for small THP sizes when configured for PMD THP sizes is not
easy to do right.

>
>>
>> Two alternatives I discussed with Nico for these (not sure which is
>> implemented here) is to calculate it proportionally to the folio order we
>> are collapsing:
>
> You're only listing one option here... what's the other one you discussed?
>

Ah sorry, reshuffled it and then had to rush.

The other thing I had in mind is to scan the whole PMD range, and skip the
whole PMD range if it doesn't obey the max_ptes_* stuff. Not perfect, but it
will mean that we behave just like PMD collapse would, unless I am missing
something.

>>
>> Assuming max_ptes_swap = 64 (PMD: 512 PTEs) and we are collapsing a 1 MiB
>> mTHP (256 PTEs), 32 PTEs would be allowed to be swapped out.
>
> Yeah this is exactly what Dev's version is doing at the moment. But that's
> the behaviour that leads to the "creep" problem.

Right.

-- 
Cheers,

David / dhildenb
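
For illustration only, the proportional scaling and the resulting "creep" can
be reproduced with a toy model. This is a sketch under stated assumptions, not
khugepaged code: 4 KiB base pages (512 PTEs per PMD), the scaling done as a
right shift by the order difference, a single PMD range modeled as a list of
booleans, and all function names invented here.

```python
PMD_ORDER = 9                      # assumption: 4 KiB base pages
PMD_NR_PAGES = 1 << PMD_ORDER      # 512 PTEs per PMD


def scaled_limit(limit, order):
    """Scale a PMD-sized max_ptes_* limit down to a range of 2**order PTEs."""
    return limit >> (PMD_ORDER - order)


def pmd_range_ok(present, max_ptes_none):
    """Whole-PMD gate: would a plain PMD collapse accept this range?"""
    return present.count(False) <= max_ptes_none


def scan_until_stable(present, max_ptes_none, min_order=2):
    """Repeatedly collapse aligned ranges of one PMD until nothing changes.

    `present` holds PMD_NR_PAGES booleans (True = PTE populated). Collapsing
    a range marks all of its PTEs populated. Returns the final number of
    populated PTEs.
    """
    present = list(present)
    changed = True
    while changed:
        changed = False
        for order in range(min_order, PMD_ORDER + 1):
            nr = 1 << order
            allowed_none = scaled_limit(max_ptes_none, order)
            for start in range(0, PMD_NR_PAGES, nr):
                n_none = present[start:start + nr].count(False)
                # Skip ranges that are completely empty or already full.
                if n_none == 0 or n_none == nr:
                    continue
                if n_none <= allowed_none:
                    present[start:start + nr] = [True] * nr
                    changed = True
    return present.count(True)


def seed(n):
    """One PMD range with only the first n PTEs populated."""
    return [True] * n + [False] * (PMD_NR_PAGES - n)
```

In this model scaled_limit(64, 8) is 32, matching the 1 MiB example above.
With max_ptes_none = 256, scan_until_stable(seed(2), 256) ends with all 512
PTEs populated although pmd_range_ok(seed(2), 256) is False, i.e. a plain PMD
collapse would have refused that range: that is the creep. With max_ptes_none
= 0 nothing grows, and with 511 the range fills up too, but a PMD collapse
would have accepted it anyway. Gating every mTHP collapse on pmd_range_ok()
first is the "scan the whole PMD range" alternative.
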