From: David Hildenbrand
Date: Wed, 22 Jan 2025 09:04:21 +0100
Subject: Re: mm: CMA reservations require 32MiB alignment in 16KiB page size kernels instead of 8MiB in 4KiB page size kernel.
To: Barry Song <21cnbao@gmail.com>, Juan Yescas
Cc: Zi Yan, linux-mm@kvack.org, muchun.song@linux.dev, rppt@kernel.org, osalvador@suse.de, akpm@linux-foundation.org, lorenzo.stoakes@oracle.com, Jann Horn, Liam.Howlett@oracle.com, minchan@kernel.org, jaewon31.kim@samsung.com, Suren Baghdasaryan, Kalesh Singh, "T.J. Mercier", Isaac Manjarres, iamjoonsoo.kim@lge.com, quic_charante@quicinc.com
Message-ID: <30c29ec8-1199-4aeb-828d-853ec441fee1@redhat.com>
References: <463eb421-ac16-435c-b0a0-51a6a92168f6@redhat.com> <8f36d3ca-3a31-4fc4-9eaa-c53ee84bf6e7@redhat.com>
Organization: Red Hat

On 22.01.25 07:52, Barry Song wrote:
> On Wed, Jan 22, 2025 at 5:06 PM Juan Yescas wrote:
>>
>> On Tue, Jan 21, 2025 at 6:24 PM Zi Yan wrote:
>>>
>>> On Tue Jan 21, 2025 at 9:08 PM EST, Juan Yescas wrote:
>>>> On Mon, Jan 20, 2025 at 9:59 AM David Hildenbrand wrote:
>>>>>
>>>>> On 20.01.25 16:29, Zi Yan wrote:
>>>>>> On Mon Jan 20, 2025 at 3:14 AM EST, David Hildenbrand wrote:
>>>>>>> On 20.01.25 01:39, Zi Yan wrote:
>>>>>>>> On Sun Jan 19, 2025 at 6:55 PM EST, Barry Song wrote:
>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>> However, with this workaround, we can't use transparent huge pages.
>>>>>>>>>>>>>
>>>>>>>>>>>>> Is the CMA_MIN_ALIGNMENT_BYTES requirement alignment only to support huge pages?
>>>>>>>>>> No. CMA_MIN_ALIGNMENT_BYTES is limited by CMA_MIN_ALIGNMENT_PAGES, which
>>>>>>>>>> is equal to pageblock size. Enabling THP just bumps the pageblock size.
>>>>>>>>>
>>>>
>>>> Thanks, I can see the initialization in include/linux/pageblock-flags.h
>>>>
>>>> #define pageblock_order MIN_T(unsigned int, HUGETLB_PAGE_ORDER, MAX_PAGE_ORDER)
>>>>
>>>>>>>>> Currently, THP might be mTHP, which can have a significantly smaller
>>>>>>>>> size than 32MB. For example, on arm64 systems with a 16KiB page size,
>>>>>>>>> a 2MB CONT-PTE mTHP is possible.
>>>>>>>>> Additionally, mTHP relies on the CONFIG_TRANSPARENT_HUGEPAGE configuration.
>>>>>>>>>
>>>>>>>>> I wonder if it's possible to enable CONFIG_TRANSPARENT_HUGEPAGE
>>>>>>>>> without necessarily using 32MiB THP. If we use other sizes, such as
>>>>>>>>> 64KiB, perhaps a large pageblock size wouldn't be necessary?
>>>>
>>>> Do you mean with mTHP? We haven't explored that option.
>>>
>>> Yes. Unless your applications have special demands for PMD THPs. 2MB
>>> mTHP should work.
>>>
>>>>
>>>>>>>>
>>>>>>>> I think this should work by reducing MAX_PAGE_ORDER like Juan did for
>>>>>>>> the experiment. But MAX_PAGE_ORDER is a macro right now, Kconfig needs
>>>>>>>> to be changed and kernel needs to be recompiled. Not sure if it is OK
>>>>>>>> for Juan's use case.
>>>>>>>
>>>>
>>>> The main goal is to reserve only the necessary CMA memory for the
>>>> drivers, which is usually the same for 4KiB and 16KiB page size kernels.
>>>
>>> Got it. Based on your experiment, you changed MAX_PAGE_ORDER to get the
>>> minimal CMA alignment size. Can you deploy that kernel to production?
>>
>> We can't deploy that because many Android partners are using PMD THP instead
>> of mTHP.
>>
>>> If yes, you can use mTHP instead of PMD THP and still get the CMA
>>> alignment you want.
>>>
>>>>
>>>>>>>
>>>>>>> IIRC, we set pageblock size == THP size because this is the granularity
>>>>>>> we want to optimize defragmentation for. ("try keep pageblock
>>>>>>> granularity of the same memory type: movable vs. unmovable")
>>>>>>
>>>>>> Right. In the past, it was optimized for PMD THP. Now we have mTHP. If user
>>>>>> does not care about PMD THP (32MB in ARM64 16KB base page case) and mTHP
>>>>>> (2MB mTHP here) is good enough, reducing pageblock size works.
>>>>>>
>>>>>>>
>>>>>>> However, the buddy already supports having different pagetypes for large
>>>>>>> allocations.
>>>>>>
>>>>>> Right.
>>>>>> To be clear, only MIGRATE_UNMOVABLE, MIGRATE_RECLAIMABLE, and
>>>>>> MIGRATE_MOVABLE can be merged.
>>>>>
>>>>> Yes! And a THP cannot span partial MIGRATE_CMA, which would be fine.
>>>>>
>>>>>>
>>>>>>>
>>>>>>> So we could leave MAX_ORDER alone and try adjusting the pageblock size
>>>>>>> in these setups. pageblock size is already variable on some
>>>>>>> architectures IIRC.
>>>>>>
>>>>
>>>> Which values would work for the CMA_MIN_ALIGNMENT_BYTES macro? In the
>>>> 16KiB page size kernel, I tried these 2 configurations:
>>>>
>>>> #define CMA_MIN_ALIGNMENT_BYTES (2048 * CMA_MIN_ALIGNMENT_PAGES)
>>>>
>>>> and
>>>>
>>>> #define CMA_MIN_ALIGNMENT_BYTES (4096 * CMA_MIN_ALIGNMENT_PAGES)
>>>>
>>>> with both of them, the kernel failed to boot.
>>>
>>> CMA_MIN_ALIGNMENT_BYTES needs to be PAGE_SIZE * CMA_MIN_ALIGNMENT_PAGES.
>>> So you need to adjust CMA_MIN_ALIGNMENT_PAGES, which is set by pageblock
>>> size. pageblock size is determined by pageblock order, which is
>>> affected by MAX_PAGE_ORDER.
>>>
>>>>
>>>>>> Making pageblock size a boot time variable? We might want to warn
>>>>>> sysadmin/user that >pageblock_order THP/mTHP creation will suffer.
>>>>>
>>>>> Yes, some way to configure it.
>>>>>
>>>>>>
>>>>>>>
>>>>>>> We'd only have to check if all of the THP logic can deal with pageblock
>>>>>>> size < THP size.
>>>>>>
>>>>
>>>> The reason that THP was disabled in my experiment is that this
>>>> assertion failed
>>>>
>>>> mm/huge_memory.c
>>>>     /*
>>>>      * hugepages can't be allocated by the buddy allocator
>>>>      */
>>>>     MAYBE_BUILD_BUG_ON(HPAGE_PMD_ORDER > MAX_PAGE_ORDER);
>>>>
>>>> when
>>>>
>>>> config ARCH_FORCE_MAX_ORDER
>>>>     int
>>>>     .....
>>>>     default "8" if ARM64_16K_PAGES
>>>>
>>>
>>> You can remove that BUILD_BUG_ON and turn on mTHP and see if mTHP works.
>>>
>>
>> We'll do that and post the results.
>>
>>>>
>>>>>> Probably yes, pageblock should be independent of THP logic, although
>>>>>> compaction (used to create THPs) logic is based on pageblock.
>>>>>
>>>>> Right. As raised in the past, we need a higher level mechanism that
>>>>> tries to group pageblocks together during compaction/conversion to limit
>>>>> fragmentation on a higher level.
>>>>>
>>>>> I assume that many use cases would be fine with not using 32MB/512MB
>>>>> THPs at all for now -- and instead using 2 MB ones. Of course, for very
>>>>> large installations it might be different.
>>>>>
>>>>>>>
>>>>>>> This issue is even more severe on arm64 with 64k (pageblock = 512MiB).
>>>>>>
>>>>
>>>> I agree, and if ARCH_FORCE_MAX_ORDER is configured to the max value we get:
>>>>
>>>> PAGE_SIZE | max MAX_PAGE_ORDER | CMA_MIN_ALIGNMENT_BYTES
>>>>  4KiB     |        15          |  4KiB * 32Ki = 128MiB
>>>> 16KiB     |        13          | 16KiB *  8Ki = 128MiB
>>>> 64KiB     |        13          | 64KiB *  8Ki = 512MiB
>>>>
>>>>>> This is also good for virtio-mem, since the offline memory block size
>>>>>> can also be reduced. I remember you complained about it before.
>>>>>
>>>>> Yes, yes, yes! :)
>>>>>
>>>
>>> David's proposal should work in general, but might take a non-trivial
>>> amount of work:
>>>
>>> 1. keep pageblock size always at 4MB for all arch.
>>> 2. adjust existing pageblock users, like compaction, to work on a
>>>    different range, independent of pageblock.
>>>    a. for anti-fragmentation mechanism, multiple pageblocks might have
>>>       different migratetypes but would be compacted to generate huge
>>>       pages, but how to align their migratetypes is TBD.
>>> 3. other corner case handlings.
>>>
>>>
>>> The final question is that Barry mentioned that over-reserved CMA areas
>>> can be used for movable page allocations. Why does it not work for you?
>>
>> I need to run more experiments to see what type of page allocations in
>> the system is the dominant one (unmovable or movable). If it is movable,
>> over-reserved CMA areas should be fine.
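
For reference, the numbers in the table above follow directly from the
relation Zi spelled out (CMA_MIN_ALIGNMENT_BYTES = PAGE_SIZE * pages per
pageblock). Below is a minimal userspace sketch of that arithmetic, not
kernel code: the helper names are made up for illustration, and the
(page shift, order) pairs are assumed to be the ones quoted in this thread.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

/*
 * Userspace illustration only; these helpers are hypothetical and do not
 * exist in the kernel. page_shift is log2(PAGE_SIZE); order is the
 * pageblock order (or MAX_PAGE_ORDER when that caps the pageblock order).
 */
static inline uint64_t cma_min_alignment_bytes(unsigned int page_shift,
					       unsigned int order)
{
	/* Guard against shifting past the width of uint64_t. */
	assert(page_shift + order < 64);

	/* PAGE_SIZE * (pages per pageblock) == (1 << page_shift) << order */
	return (UINT64_C(1) << page_shift) << order;
}

/* Print one row: base page size in KiB, order, resulting alignment in MiB. */
static inline void print_cma_alignment(unsigned int page_shift,
				       unsigned int order)
{
	uint64_t bytes = cma_min_alignment_bytes(page_shift, order);

	printf("%2lluKiB pages, order %2u -> %3lluMiB\n",
	       (unsigned long long)((UINT64_C(1) << page_shift) >> 10),
	       order,
	       (unsigned long long)(bytes >> 20));
}
```

With 16KiB pages (page_shift 14) and an order of 11, this gives the 32MiB
from the subject line; (12, 15), (14, 13) and (16, 13) give the 128MiB,
128MiB and 512MiB rows of the table.
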
>
> My understanding is that over-reserving 28MiB is unlikely to cause
> noticeable regression, given that we frequently handle allocations like
> GFP_HIGHUSER_MOVABLE or similar, which are significantly larger
> than 28MiB. However, David also mentioned a reservation of 512MiB
> for a 64KiB page size. In that case, 512MiB might be large enough to
> potentially impact the balance between movable and unmovable
> allocations. For instance, if we still have 512MiB reserved in CMA
> but are allocating unmovable folios (for example dma-buf), we could
> fail an allocation even when there's actually capacity. So, in any case,
> there is still work to be done here.
>
> By the way, is 512MiB truly a reasonable size for THP?

No, it's absolutely stupid for most setups.

Just think of a small VM with 4 GiB: great, you have 8 pageblocks and will
probably never get a single THP.

-- 
Cheers,

David / dhildenb