Date: Wed, 30 Apr 2025 22:25:11 -0700
X-Mailer: git-send-email 2.49.0.906.g1f30a19c02-goog
Message-ID: <20250501052532.1903125-1-jyescas@google.com>
Subject: [PATCH] mm: Add ARCH_FORCE_PAGE_BLOCK_ORDER to select page block order
From: Juan Yescas
To: Catalin Marinas, Will Deacon, Andrew Morton, Juan Yescas,
	linux-arm-kernel@lists.infradead.org, linux-kernel@vger.kernel.org,
	linux-mm@kvack.org
Cc: tjmercier@google.com, isaacmanjarres@google.com, surenb@google.com,
	kaleshsingh@google.com, Vlastimil Babka, "Liam R. Howlett",
	Lorenzo Stoakes, David Hildenbrand, Mike Rapoport, Zi Yan,
	Minchan Kim
Content-Type: text/plain; charset="UTF-8"
Problem: On large page size configurations (16KiB, 64KiB), the CMA
alignment requirement (CMA_MIN_ALIGNMENT_BYTES) increases considerably,
which causes the CMA reservations to be larger than necessary. As a
result, the system has fewer available MIGRATE_UNMOVABLE and
MIGRATE_RECLAIMABLE pageblocks, since MIGRATE_CMA can't fall back to
them.

CMA_MIN_ALIGNMENT_BYTES increases because it depends on MAX_PAGE_ORDER,
which in turn depends on ARCH_FORCE_MAX_ORDER, and the value of
ARCH_FORCE_MAX_ORDER increases on 16KiB and 64KiB kernels.
For example, the CMA alignment requirement when:

- CONFIG_ARCH_FORCE_MAX_ORDER default value is used
- CONFIG_TRANSPARENT_HUGEPAGE is set:

PAGE_SIZE | MAX_PAGE_ORDER | pageblock_order | CMA_MIN_ALIGNMENT_BYTES
-----------------------------------------------------------------------
4KiB      | 10             | 10              | 4KiB * (2 ^ 10) = 4MiB
16KiB     | 11             | 11              | 16KiB * (2 ^ 11) = 32MiB
64KiB     | 13             | 13              | 64KiB * (2 ^ 13) = 512MiB

There are some extreme cases for the CMA alignment requirement when:

- CONFIG_ARCH_FORCE_MAX_ORDER maximum value is set
- CONFIG_TRANSPARENT_HUGEPAGE is NOT set
- CONFIG_HUGETLB_PAGE is NOT set

PAGE_SIZE | MAX_PAGE_ORDER | pageblock_order | CMA_MIN_ALIGNMENT_BYTES
------------------------------------------------------------------------
4KiB      | 15             | 15              | 4KiB * (2 ^ 15) = 128MiB
16KiB     | 13             | 13              | 16KiB * (2 ^ 13) = 128MiB
64KiB     | 13             | 13              | 64KiB * (2 ^ 13) = 512MiB

This affects the CMA reservations for the drivers. If a driver in a
4KiB kernel needs 4MiB of CMA memory, then in a 16KiB kernel the
minimal reservation has to be 32MiB due to the alignment requirements:

4KiB kernel:

	reserved-memory {
		...
		cma_test_reserve: cma_test_reserve {
			compatible = "shared-dma-pool";
			size = <0x0 0x400000>; /* 4 MiB */
			...
		};
	};

16KiB kernel:

	reserved-memory {
		...
		cma_test_reserve: cma_test_reserve {
			compatible = "shared-dma-pool";
			size = <0x0 0x2000000>; /* 32 MiB */
			...
		};
	};

Solution: Add a new config ARCH_FORCE_PAGE_BLOCK_ORDER that allows
setting the pageblock order. The maximum pageblock order is given by
ARCH_FORCE_MAX_ORDER. By default, ARCH_FORCE_PAGE_BLOCK_ORDER has the
same value as ARCH_FORCE_MAX_ORDER, which makes sure that current
kernel configurations are not affected by this change. It is an
opt-in change.

This patch allows large page size kernels (16KiB, 64KiB) to have the
same CMA alignment requirements as 4KiB kernels by setting a lower
pageblock_order.

Tests:

- Verified that HugeTLB pages work when pageblock_order is 1, 7, 10
  on 4k and 16k kernels.
- Verified that Transparent Huge Pages work when pageblock_order is
  1, 7, 10 on 4k and 16k kernels.

- Verified that dma-buf heap allocations work when pageblock_order is
  1, 7, 10 on 4k and 16k kernels.

Benchmarks:

The benchmarks compare 16KiB kernels with pageblock_order 10 and 7.
pageblock_order 7 was chosen because it makes the minimum CMA
alignment requirement the same as in 4KiB kernels (2MiB).

- Perform 100K dma-buf heap (/dev/dma_heap/system) allocations of
  SZ_8M, SZ_4M, SZ_2M, SZ_1M, SZ_64, SZ_8, SZ_4. Use simpleperf
  (https://developer.android.com/ndk/guides/simpleperf) to measure
  the number of instructions and page faults on 16k kernels. The
  benchmark was executed 10 times. The averages are below:

          # instructions           |    # page-faults
     order 10    |     order 7     | order 10 | order 7
  --------------------------------------------------------
  13,891,765,770 | 11,425,777,314  |   220    |   217
  14,456,293,487 | 12,660,819,302  |   224    |   219
  13,924,261,018 | 13,243,970,736  |   217    |   221
  13,910,886,504 | 13,845,519,630  |   217    |   221
  14,388,071,190 | 13,498,583,098  |   223    |   224
  13,656,442,167 | 12,915,831,681  |   216    |   218
  13,300,268,343 | 12,930,484,776  |   222    |   218
  13,625,470,223 | 14,234,092,777  |   219    |   218
  13,508,964,965 | 13,432,689,094  |   225    |   219
  13,368,950,667 | 13,683,587,37   |   219    |   225
  --------------------------------------------------------
  13,803,137,433 | 13,131,974,268  |   220    |   220     Averages

  There were 4.86% fewer instructions when the order was 7, in
  comparison with order 10:

  13,131,974,268 - 13,803,137,433 = -671,163,165 (-4.86%)

  The average number of page faults with order 7 and order 10 was
  the same.

  These results didn't show any significant regression when
  pageblock_order is set to 7 on 16KiB kernels.

- Run Speedometer 3.1 (https://browserbench.org/Speedometer3.1/)
  5 times on the 16k kernels with pageblock_order 7 and 10.
  order 10 | order 7 | order 7 - order 10 | (order 7 - order 10) %
  -------------------------------------------------------------------
    15.8   |  16.4   |        0.6         |        3.80%
    16.4   |  16.2   |       -0.2         |       -1.22%
    16.6   |  16.3   |       -0.3         |       -1.81%
    16.8   |  16.3   |       -0.5         |       -2.98%
    16.6   |  16.8   |        0.2         |        1.20%
  -------------------------------------------------------------------
    16.44  |  16.4   |       -0.04        |       -0.24%    Averages

  The results didn't show any significant regression when
  pageblock_order is set to 7 on 16KiB kernels.

Cc: Andrew Morton
Cc: Vlastimil Babka
Cc: Liam R. Howlett
Cc: Lorenzo Stoakes
Cc: David Hildenbrand
Cc: Mike Rapoport
Cc: Zi Yan
Cc: Suren Baghdasaryan
Cc: Minchan Kim
Signed-off-by: Juan Yescas
---
 arch/arm64/Kconfig              | 14 ++++++++++++++
 include/linux/pageblock-flags.h | 12 +++++++++---
 2 files changed, 23 insertions(+), 3 deletions(-)

diff --git a/arch/arm64/Kconfig b/arch/arm64/Kconfig
index a182295e6f08..d784049e1e01 100644
--- a/arch/arm64/Kconfig
+++ b/arch/arm64/Kconfig
@@ -1658,6 +1658,20 @@ config ARCH_FORCE_MAX_ORDER
 
 	  Don't change if unsure.
 
+config ARCH_FORCE_PAGE_BLOCK_ORDER
+	int "Page Block Order"
+	range 1 ARCH_FORCE_MAX_ORDER
+	default ARCH_FORCE_MAX_ORDER
+	help
+	  The page block order refers to the power of two number of pages that
+	  are physically contiguous and can have a migrate type associated to them.
+	  The maximum size of the page block order is limited by ARCH_FORCE_MAX_ORDER.
+
+	  This option allows overriding the default setting when the page
+	  block order needs to be smaller than ARCH_FORCE_MAX_ORDER.
+
+	  Don't change if unsure.
+
 config UNMAP_KERNEL_AT_EL0
 	bool "Unmap kernel when running in userspace (KPTI)" if EXPERT
 	default y
diff --git a/include/linux/pageblock-flags.h b/include/linux/pageblock-flags.h
index fc6b9c87cb0a..ab3de96bb50c 100644
--- a/include/linux/pageblock-flags.h
+++ b/include/linux/pageblock-flags.h
@@ -28,6 +28,12 @@ enum pageblock_bits {
 	NR_PAGEBLOCK_BITS
 };
 
+#if defined(CONFIG_ARCH_FORCE_PAGE_BLOCK_ORDER)
+#define PAGE_BLOCK_ORDER CONFIG_ARCH_FORCE_PAGE_BLOCK_ORDER
+#else
+#define PAGE_BLOCK_ORDER MAX_PAGE_ORDER
+#endif /* CONFIG_ARCH_FORCE_PAGE_BLOCK_ORDER */
+
 #if defined(CONFIG_HUGETLB_PAGE)
 
 #ifdef CONFIG_HUGETLB_PAGE_SIZE_VARIABLE
@@ -41,18 +47,18 @@ extern unsigned int pageblock_order;
  * Huge pages are a constant size, but don't exceed the maximum allocation
  * granularity.
  */
-#define pageblock_order	MIN_T(unsigned int, HUGETLB_PAGE_ORDER, MAX_PAGE_ORDER)
+#define pageblock_order	MIN_T(unsigned int, HUGETLB_PAGE_ORDER, PAGE_BLOCK_ORDER)
 
 #endif /* CONFIG_HUGETLB_PAGE_SIZE_VARIABLE */
 
 #elif defined(CONFIG_TRANSPARENT_HUGEPAGE)
 
-#define pageblock_order	MIN_T(unsigned int, HPAGE_PMD_ORDER, MAX_PAGE_ORDER)
+#define pageblock_order	MIN_T(unsigned int, HPAGE_PMD_ORDER, PAGE_BLOCK_ORDER)
 
 #else /* CONFIG_TRANSPARENT_HUGEPAGE */
 
 /* If huge pages are not used, group by MAX_ORDER_NR_PAGES */
-#define pageblock_order	MAX_PAGE_ORDER
+#define pageblock_order	PAGE_BLOCK_ORDER
 
 #endif /* CONFIG_HUGETLB_PAGE */
-- 
2.49.0.906.g1f30a19c02-goog