Message-ID: <3e075035-fb74-4b5b-81e4-d32b832de44f@redhat.com>
Date: Tue, 3 Jun 2025 17:42:28 +0200
Subject: Re: [PATCH v7] mm: Add CONFIG_PAGE_BLOCK_ORDER to select page block order
From: David Hildenbrand
Organization: Red Hat
To: Zi Yan
Cc: Juan Yescas, Andrew Morton, Lorenzo Stoakes, "Liam R. Howlett",
 Vlastimil Babka, Mike Rapoport, Suren Baghdasaryan, Michal Hocko,
 linux-mm@kvack.org, linux-kernel@vger.kernel.org, tjmercier@google.com,
 isaacmanjarres@google.com, kaleshsingh@google.com, masahiroy@kernel.org,
 Minchan Kim
References: <20250521215807.1860663-1-jyescas@google.com>
 <54943dbb-45fe-4b69-a6a8-96381304a268@redhat.com>
 <51739EAE-32CB-43AD-A969-B24FE3DAA351@nvidia.com>
 <74955261-748E-4E91-8D55-6ECAEBFB3F70@nvidia.com>
In-Reply-To: <74955261-748E-4E91-8D55-6ECAEBFB3F70@nvidia.com>

On 03.06.25 17:14, Zi Yan wrote:
> On 3 Jun 2025, at 10:55, Zi Yan wrote:
>
>> On 3 Jun 2025, at 9:03, David Hildenbrand wrote:
>>
>>> On 21.05.25 23:57, Juan Yescas wrote:
>>>> Problem: On large page size configurations (16KiB, 64KiB), the CMA
>>>> alignment requirement (CMA_MIN_ALIGNMENT_BYTES) increases considerably,
>>>> and this causes the CMA reservations to be larger than necessary.
>>>> This means that system will have less available MIGRATE_UNMOVABLE and
>>>> MIGRATE_RECLAIMABLE page blocks since MIGRATE_CMA can't fallback to them.
>>>>
>>>> The CMA_MIN_ALIGNMENT_BYTES increases because it depends on
>>>> MAX_PAGE_ORDER which depends on ARCH_FORCE_MAX_ORDER. The value of
>>>> ARCH_FORCE_MAX_ORDER increases on 16k and 64k kernels.
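The dependency described in the paragraph above can be sketched as plain arithmetic. This is only an illustration, assuming the minimum CMA alignment equals PAGE_SIZE << pageblock_order, which is what the tables in the commit message imply. The helper cma_min_alignment_bytes() is a made-up name for this sketch and is not a kernel function.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper for illustration only (not a kernel API).
 * Assumption: the minimum CMA alignment is PAGE_SIZE << pageblock_order.
 */
static uint64_t cma_min_alignment_bytes(uint64_t page_size,
					unsigned int pageblock_order)
{
	/* Keep the shift well defined for the small orders used here. */
	assert(pageblock_order < 32);
	return page_size << pageblock_order;
}
```

Under that assumption, a 16KiB page size with order 11 gives 32MiB, while the same page size with order 7 gives 2MiB, the same alignment as a 4KiB page size with order 9.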
>>>>
>>>> For example, in ARM, the CMA alignment requirement when:
>>>>
>>>> - CONFIG_ARCH_FORCE_MAX_ORDER default value is used
>>>> - CONFIG_TRANSPARENT_HUGEPAGE is set:
>>>>
>>>> PAGE_SIZE | MAX_PAGE_ORDER | pageblock_order | CMA_MIN_ALIGNMENT_BYTES
>>>> -----------------------------------------------------------------------
>>>>    4KiB   |       10       |        9        |  4KiB * (2 ^  9) =   2MiB
>>>>   16Kib   |       11       |       11        | 16KiB * (2 ^ 11) =  32MiB
>>>>   64KiB   |       13       |       13        | 64KiB * (2 ^ 13) = 512MiB
>>>>
>>>> There are some extreme cases for the CMA alignment requirement when:
>>>>
>>>> - CONFIG_ARCH_FORCE_MAX_ORDER maximum value is set
>>>> - CONFIG_TRANSPARENT_HUGEPAGE is NOT set:
>>>> - CONFIG_HUGETLB_PAGE is NOT set
>>>>
>>>> PAGE_SIZE | MAX_PAGE_ORDER | pageblock_order | CMA_MIN_ALIGNMENT_BYTES
>>>> ------------------------------------------------------------------------
>>>>    4KiB   |       15       |       15        |  4KiB * (2 ^ 15) = 128MiB
>>>>   16Kib   |       13       |       13        | 16KiB * (2 ^ 13) = 128MiB
>>>>   64KiB   |       13       |       13        | 64KiB * (2 ^ 13) = 512MiB
>>>>
>>>> This affects the CMA reservations for the drivers. If a driver in a
>>>> 4KiB kernel needs 4MiB of CMA memory, in a 16KiB kernel, the minimal
>>>> reservation has to be 32MiB due to the alignment requirements:
>>>>
>>>> reserved-memory {
>>>>     ...
>>>>     cma_test_reserve: cma_test_reserve {
>>>>         compatible = "shared-dma-pool";
>>>>         size = <0x0 0x400000>; /* 4 MiB */
>>>>         ...
>>>>     };
>>>> };
>>>>
>>>> reserved-memory {
>>>>     ...
>>>>     cma_test_reserve: cma_test_reserve {
>>>>         compatible = "shared-dma-pool";
>>>>         size = <0x0 0x2000000>; /* 32 MiB */
>>>>         ...
>>>>     };
>>>> };
>>>>
>>>> Solution: Add a new config CONFIG_PAGE_BLOCK_ORDER that
>>>> allows to set the page block order in all the architectures.
>>>> The maximum page block order will be given by
>>>> ARCH_FORCE_MAX_ORDER.
>>>>
>>>> By default, CONFIG_PAGE_BLOCK_ORDER will have the same
>>>> value that ARCH_FORCE_MAX_ORDER. This will make sure that
>>>> current kernel configurations won't be affected by this
>>>> change. It is a opt-in change.
>>>>
>>>> This patch will allow to have the same CMA alignment
>>>> requirements for large page sizes (16KiB, 64KiB) as that
>>>> in 4kb kernels by setting a lower pageblock_order.
>>>>
>>>> Tests:
>>>>
>>>> - Verified that HugeTLB pages work when pageblock_order is 1, 7, 10
>>>>   on 4k and 16k kernels.
>>>>
>>>> - Verified that Transparent Huge Pages work when pageblock_order
>>>>   is 1, 7, 10 on 4k and 16k kernels.
>>>>
>>>> - Verified that dma-buf heaps allocations work when pageblock_order
>>>>   is 1, 7, 10 on 4k and 16k kernels.
>>>>
>>>> Benchmarks:
>>>>
>>>> The benchmarks compare 16kb kernels with pageblock_order 10 and 7. The
>>>> reason for the pageblock_order 7 is because this value makes the min
>>>> CMA alignment requirement the same as that in 4kb kernels (2MB).
>>>>
>>>> - Perform 100K dma-buf heaps (/dev/dma_heap/system) allocations of
>>>>   SZ_8M, SZ_4M, SZ_2M, SZ_1M, SZ_64, SZ_8, SZ_4. Use simpleperf
>>>>   (https://developer.android.com/ndk/guides/simpleperf) to measure
>>>>   the # of instructions and page-faults on 16k kernels.
>>>>   The benchmark was executed 10 times. The averages are below:
>>>>
>>>>          # instructions         |    #page-faults
>>>>     order 10    |    order 7    | order 10 | order 7
>>>> --------------------------------------------------------
>>>>  13,891,765,770 | 11,425,777,314 |   220    |   217
>>>>  14,456,293,487 | 12,660,819,302 |   224    |   219
>>>>  13,924,261,018 | 13,243,970,736 |   217    |   221
>>>>  13,910,886,504 | 13,845,519,630 |   217    |   221
>>>>  14,388,071,190 | 13,498,583,098 |   223    |   224
>>>>  13,656,442,167 | 12,915,831,681 |   216    |   218
>>>>  13,300,268,343 | 12,930,484,776 |   222    |   218
>>>>  13,625,470,223 | 14,234,092,777 |   219    |   218
>>>>  13,508,964,965 | 13,432,689,094 |   225    |   219
>>>>  13,368,950,667 | 13,683,587,37  |   219    |   225
>>>> -------------------------------------------------------------------
>>>>  13,803,137,433 | 13,131,974,268 |   220    |   220    Averages
>>>>
>>>> There were 4.85% #instructions when order was 7, in comparison
>>>> with order 10.
>>>>
>>>> 13,803,137,433 - 13,131,974,268 = -671,163,166 (-4.86%)
>>>>
>>>> The number of page faults in order 7 and 10 were the same.
>>>>
>>>> These results didn't show any significant regression when the
>>>> pageblock_order is set to 7 on 16kb kernels.
>>>>
>>>> - Run speedometer 3.1 (https://browserbench.org/Speedometer3.1/) 5 times
>>>>   on the 16k kernels with pageblock_order 7 and 10.
>>>>
>>>> order 10 | order 7 | order 7 - order 10 | (order 7 - order 10) %
>>>> -------------------------------------------------------------------
>>>>   15.8   |  16.4   |        0.6         |         3.80%
>>>>   16.4   |  16.2   |       -0.2         |        -1.22%
>>>>   16.6   |  16.3   |       -0.3         |        -1.81%
>>>>   16.8   |  16.3   |       -0.5         |        -2.98%
>>>>   16.6   |  16.8   |        0.2         |         1.20%
>>>> -------------------------------------------------------------------
>>>>   16.44     16.4          -0.04                  -0.24%   Averages
>>>>
>>>> The results didn't show any significant regression when the
>>>> pageblock_order is set to 7 on 16kb kernels.
>>>>
>>>> Cc: Andrew Morton
>>>> Cc: Vlastimil Babka
>>>> Cc: Liam R. Howlett
>>>> Cc: Lorenzo Stoakes
>>>> Cc: David Hildenbrand
>>>> CC: Mike Rapoport
>>>> Cc: Zi Yan
>>>> Cc: Suren Baghdasaryan
>>>> Cc: Minchan Kim
>>>> Signed-off-by: Juan Yescas
>>>> Acked-by: Zi Yan
>>>> ---
>>>> Changes in v7:
>>>>   - Update alignment calculation to 2MiB as per David's
>>>>     observation.
>>>>   - Update page block order calculation in mm/mm_init.c for
>>>>     powerpc when CONFIG_HUGETLB_PAGE_SIZE_VARIABLE is set.
>>>>
>>>> Changes in v6:
>>>>   - Applied the change provided by Zi Yan to fix
>>>>     the Kconfig. The change consists in evaluating
>>>>     to true or false in the if expression for range:
>>>>     range 1 if .
>>>>
>>>> Changes in v5:
>>>>   - Remove the ranges for CONFIG_PAGE_BLOCK_ORDER. The
>>>>     ranges with config definitions don't work in Kconfig,
>>>>     for example (range 1 MY_CONFIG).
>>>>   - Add PAGE_BLOCK_ORDER_MANUAL config for the
>>>>     page block order number. The default value was not
>>>>     defined.
>>>>   - Fix typos reported by Andrew.
>>>>   - Test default configs in powerpc.
>>>>
>>>> Changes in v4:
>>>>   - Set PAGE_BLOCK_ORDER in incluxe/linux/mmzone.h to
>>>>     validate that MAX_PAGE_ORDER >= PAGE_BLOCK_ORDER at
>>>>     compile time.
>>>>   - This change fixes the warning in:
>>>>     https://lore.kernel.org/oe-kbuild-all/202505091548.FuKO4b4v-lkp@intel.com/
>>>>
>>>> Changes in v3:
>>>>   - Rename ARCH_FORCE_PAGE_BLOCK_ORDER to PAGE_BLOCK_ORDER
>>>>     as per Matthew's suggestion.
>>>>   - Update comments in pageblock-flags.h for pageblock_order
>>>>     value when THP or HugeTLB are not used.
>>>>
>>>> Changes in v2:
>>>>   - Add Zi's Acked-by tag.
>>>>   - Move ARCH_FORCE_PAGE_BLOCK_ORDER config to mm/Kconfig as
>>>>     per Zi and Matthew suggestion so it is available to
>>>>     all the architectures.
>>>>   - Set ARCH_FORCE_PAGE_BLOCK_ORDER to 10 by default when
>>>>     ARCH_FORCE_MAX_ORDER is not available.
>>>>
>>>>  include/linux/mmzone.h          | 16 ++++++++++++++++
>>>>  include/linux/pageblock-flags.h |  8 ++++----
>>>>  mm/Kconfig                      | 34 +++++++++++++++++++++++++++++++++
>>>>  mm/mm_init.c                    |  2 +-
>>>>  4 files changed, 55 insertions(+), 5 deletions(-)
>>>>
>>>> diff --git a/include/linux/mmzone.h b/include/linux/mmzone.h
>>>> index 6ccec1bf2896..05610337bbb6 100644
>>>> --- a/include/linux/mmzone.h
>>>> +++ b/include/linux/mmzone.h
>>>> @@ -37,6 +37,22 @@
>>>>  #define NR_PAGE_ORDERS (MAX_PAGE_ORDER + 1)
>>>> +/* Defines the order for the number of pages that have a migrate type. */
>>>> +#ifndef CONFIG_PAGE_BLOCK_ORDER
>>>> +#define PAGE_BLOCK_ORDER MAX_PAGE_ORDER
>>>> +#else
>>>> +#define PAGE_BLOCK_ORDER CONFIG_PAGE_BLOCK_ORDER
>>>> +#endif /* CONFIG_PAGE_BLOCK_ORDER */
>>>> +
>>>> +/*
>>>> + * The MAX_PAGE_ORDER, which defines the max order of pages to be allocated
>>>> + * by the buddy allocator, has to be larger or equal to the PAGE_BLOCK_ORDER,
>>>> + * which defines the order for the number of pages that can have a migrate type
>>>> + */
>>>> +#if (PAGE_BLOCK_ORDER > MAX_PAGE_ORDER)
>>>> +#error MAX_PAGE_ORDER must be >= PAGE_BLOCK_ORDER
>>>> +#endif
>>>> +
>>>>  /*
>>>>   * PAGE_ALLOC_COSTLY_ORDER is the order at which allocations are deemed
>>>>   * costly to service. That is between allocation orders which should
>>>> diff --git a/include/linux/pageblock-flags.h b/include/linux/pageblock-flags.h
>>>> index fc6b9c87cb0a..e73a4292ef02 100644
>>>> --- a/include/linux/pageblock-flags.h
>>>> +++ b/include/linux/pageblock-flags.h
>>>> @@ -41,18 +41,18 @@ extern unsigned int pageblock_order;
>>>>   * Huge pages are a constant size, but don't exceed the maximum allocation
>>>>   * granularity.
>>>>   */
>>>> -#define pageblock_order MIN_T(unsigned int, HUGETLB_PAGE_ORDER, MAX_PAGE_ORDER)
>>>> +#define pageblock_order MIN_T(unsigned int, HUGETLB_PAGE_ORDER, PAGE_BLOCK_ORDER)
>>>>  #endif /* CONFIG_HUGETLB_PAGE_SIZE_VARIABLE */
>>>>  #elif defined(CONFIG_TRANSPARENT_HUGEPAGE)
>>>> -#define pageblock_order MIN_T(unsigned int, HPAGE_PMD_ORDER, MAX_PAGE_ORDER)
>>>> +#define pageblock_order MIN_T(unsigned int, HPAGE_PMD_ORDER, PAGE_BLOCK_ORDER)
>>>>  #else /* CONFIG_TRANSPARENT_HUGEPAGE */
>>>> -/* If huge pages are not used, group by MAX_ORDER_NR_PAGES */
>>>> -#define pageblock_order MAX_PAGE_ORDER
>>>> +/* If huge pages are not used, group by PAGE_BLOCK_ORDER */
>>>> +#define pageblock_order PAGE_BLOCK_ORDER
>>>>  #endif /* CONFIG_HUGETLB_PAGE */
>>>> diff --git a/mm/Kconfig b/mm/Kconfig
>>>> index e113f713b493..13a5c4f6e6b6 100644
>>>> --- a/mm/Kconfig
>>>> +++ b/mm/Kconfig
>>>> @@ -989,6 +989,40 @@ config CMA_AREAS
>>>>  	  If unsure, leave the default value "8" in UMA and "20" in NUMA.
>>>> +#
>>>> +# Select this config option from the architecture Kconfig, if available, to set
>>>> +# the max page order for physically contiguous allocations.
>>>> +#
>>>> +config ARCH_FORCE_MAX_ORDER
>>>> +	int
>>>> +
>>>> +#
>>>> +# When ARCH_FORCE_MAX_ORDER is not defined,
>>>> +# the default page block order is MAX_PAGE_ORDER (10) as per
>>>> +# include/linux/mmzone.h.
>>>> +#
>>>> +config PAGE_BLOCK_ORDER
>>>> +	int "Page Block Order"
>>>> +	range 1 10 if ARCH_FORCE_MAX_ORDER = 0
>>>> +	default 10 if ARCH_FORCE_MAX_ORDER = 0
>>>> +	range 1 ARCH_FORCE_MAX_ORDER if ARCH_FORCE_MAX_ORDER != 0
>>>> +	default ARCH_FORCE_MAX_ORDER if ARCH_FORCE_MAX_ORDER != 0
>>>> +	help
>>>> +	  The page block order refers to the power of two number of pages that
>>>> +	  are physically contiguous and can have a migrate type associated to
>>>> +	  them. The maximum size of the page block order is limited by
>>>> +	  ARCH_FORCE_MAX_ORDER.
>>>> +
>>>> +	  This config allows overriding the default page block order when the
>>>> +	  page block order is required to be smaller than ARCH_FORCE_MAX_ORDER
>>>> +	  or MAX_PAGE_ORDER.
>>>> +
>>>> +	  Reducing pageblock order can negatively impact THP generation
>>>> +	  success rate. If your workloads uses THP heavily, please use this
>>>> +	  option with caution.
>>>> +
>>>> +	  Don't change if unsure.
>>>
>>>
>>> The semantics are now very confusing [1]. The default in x86-64 will be 10, so we'll have
>>>
>>> CONFIG_PAGE_BLOCK_ORDER=10
>>>
>>>
>>> But then, we'll do this
>>>
>>> #define pageblock_order MIN_T(unsigned int, HPAGE_PMD_ORDER, PAGE_BLOCK_ORDER)
>>>
>>>
>>> So the actual pageblock order will be different than CONFIG_PAGE_BLOCK_ORDER.
>>>
>>> Confusing.
>>>
>>> Either CONFIG_PAGE_BLOCK_ORDER is misnamed (CONFIG_PAGE_BLOCK_ORDER_CEIL ? CONFIG_PAGE_BLOCK_ORDER_LIMIT ?), or the semantics should be changed.
>>
>> IIRC, Juan's intention is to limit/lower pageblock order to reduce CMA region
>> size. CONFIG_PAGE_BLOCK_ORDER_LIMIT sounds reasonable to me.
>
> LIMIT might be still ambiguous, since it can be lower limit or upper limit.
> CONFIG_PAGE_BLOCK_ORDER_CEIL is better. Here is the patch I come up with,
> if it looks good to you, I can send it out properly.

LGTM

-- 
Cheers,

David / dhildenb
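
To spell out the mismatch raised in the quoted review, here is a minimal sketch. It assumes x86-64 with 4KiB pages and THP enabled, where a PMD-sized huge page is order 9 (2MiB). The SKETCH_* names are made up for this illustration and are not kernel symbols.

```c
#include <assert.h>

/* Illustrative stand-ins only; none of the SKETCH_* names exist in the kernel. */
#define SKETCH_MIN(a, b) ((a) < (b) ? (a) : (b))

/* Assumption: x86-64 with 4KiB pages, so a PMD-sized THP is order 9 (2MiB). */
#define SKETCH_HPAGE_PMD_ORDER 9
/* The default value the review says x86-64 would get from the v7 Kconfig entry. */
#define SKETCH_CONFIG_PAGE_BLOCK_ORDER 10

/* Mirrors the shape of the THP branch: the config value acts as an upper bound. */
#define SKETCH_PAGEBLOCK_ORDER \
	SKETCH_MIN(SKETCH_HPAGE_PMD_ORDER, SKETCH_CONFIG_PAGE_BLOCK_ORDER)

/* The configured value (10) is not the order that ends up in effect (9). */
static_assert(SKETCH_PAGEBLOCK_ORDER == 9,
	      "effective order is capped by the THP order");
static_assert(SKETCH_PAGEBLOCK_ORDER != SKETCH_CONFIG_PAGE_BLOCK_ORDER,
	      "config value and effective order differ");
```

Under these assumptions the config option only ever lowers the result, which is why a name that signals an upper bound (CEIL) describes it more accurately than a name that suggests the exact value.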