From mboxrd@z Thu Jan  1 00:00:00 1970
From: Juan Yescas <jyescas@google.com>
Date: Thu, 1 May 2025 14:17:38 -0700
Subject: Re: [PATCH] mm: Add ARCH_FORCE_PAGE_BLOCK_ORDER to select page block order
To: Zi Yan
Cc: Catalin Marinas, Will Deacon, Andrew Morton,
 linux-arm-kernel@lists.infradead.org, linux-kernel@vger.kernel.org,
 linux-mm@kvack.org, tjmercier@google.com, isaacmanjarres@google.com,
 surenb@google.com, kaleshsingh@google.com, Vlastimil Babka,
 "Liam R. Howlett", Lorenzo Stoakes, David Hildenbrand, Mike Rapoport,
 Minchan Kim
References: <20250501052532.1903125-1-jyescas@google.com>
 <15428BF6-A7DD-44ED-B225-AECD7866394C@nvidia.com>
In-Reply-To: <15428BF6-A7DD-44ED-B225-AECD7866394C@nvidia.com>
Content-Type: text/plain; charset="UTF-8"
On Thu, May 1, 2025 at 11:49 AM Zi Yan wrote:
>
> On 1 May 2025, at 1:25, Juan Yescas wrote:
>
> > Problem: On large page size configurations (16KiB, 64KiB), the CMA
> > alignment requirement (CMA_MIN_ALIGNMENT_BYTES) increases considerably,
> > and this causes the CMA reservations to be larger than necessary.
> > This means that the system will have fewer available MIGRATE_UNMOVABLE
> > and MIGRATE_RECLAIMABLE page blocks, since MIGRATE_CMA can't fall back
> > to them.
> >
> > The CMA_MIN_ALIGNMENT_BYTES increases because it depends on
> > MAX_PAGE_ORDER, which depends on ARCH_FORCE_MAX_ORDER. The value of
> > ARCH_FORCE_MAX_ORDER increases on 16KiB and 64KiB kernels.
> >
> > For example, the CMA alignment requirement when:
> >
> > - CONFIG_ARCH_FORCE_MAX_ORDER default value is used
> > - CONFIG_TRANSPARENT_HUGEPAGE is set:
> >
> > PAGE_SIZE | MAX_PAGE_ORDER | pageblock_order | CMA_MIN_ALIGNMENT_BYTES
> > -----------------------------------------------------------------------
> > 4KiB      | 10             | 10              | 4KiB  * (2 ^ 10) = 4 MiB
> > 16KiB     | 11             | 11              | 16KiB * (2 ^ 11) = 32 MiB
> > 64KiB     | 13             | 13              | 64KiB * (2 ^ 13) = 512 MiB
> >
> > There are some extreme cases for the CMA alignment requirement when:
> >
> > - CONFIG_ARCH_FORCE_MAX_ORDER maximum value is set
> > - CONFIG_TRANSPARENT_HUGEPAGE is NOT set
> > - CONFIG_HUGETLB_PAGE is NOT set:
> >
> > PAGE_SIZE | MAX_PAGE_ORDER | pageblock_order | CMA_MIN_ALIGNMENT_BYTES
> > ------------------------------------------------------------------------
> > 4KiB      | 15             | 15              | 4KiB  * (2 ^ 15) = 128 MiB
> > 16KiB     | 13             | 13              | 16KiB * (2 ^ 13) = 128 MiB
> > 64KiB     | 13             | 13              | 64KiB * (2 ^ 13) = 512 MiB
> >
> > This affects the CMA reservations for the drivers. If a driver in a
> > 4KiB kernel needs 4MiB of CMA memory, in a 16KiB kernel, the minimal
> > reservation has to be 32MiB due to the alignment requirements:
> >
> > reserved-memory {
> >     ...
> >     cma_test_reserve: cma_test_reserve {
> >         compatible = "shared-dma-pool";
> >         size = <0x0 0x400000>; /* 4 MiB */
> >         ...
> >     };
> > };
> >
> > reserved-memory {
> >     ...
> >     cma_test_reserve: cma_test_reserve {
> >         compatible = "shared-dma-pool";
> >         size = <0x0 0x2000000>; /* 32 MiB */
> >         ...
> >     };
> > };
> >
> > Solution: Add a new config ARCH_FORCE_PAGE_BLOCK_ORDER that
> > allows setting the page block order. The maximum page block
> > order will be given by ARCH_FORCE_MAX_ORDER.
> >
> > By default, ARCH_FORCE_PAGE_BLOCK_ORDER will have the same
> > value as ARCH_FORCE_MAX_ORDER. This makes sure that current
> > kernel configurations won't be affected by this change. It is
> > an opt-in change.
> >
> > This patch allows the same CMA alignment requirements for
> > large page sizes (16KiB, 64KiB) as those in 4KiB kernels by
> > setting a lower pageblock_order.
> >
> > Tests:
> >
> > - Verified that HugeTLB pages work when pageblock_order is 1, 7, 10
> >   on 4k and 16k kernels.
> >
> > - Verified that Transparent Huge Pages work when pageblock_order
> >   is 1, 7, 10 on 4k and 16k kernels.
> >
> > - Verified that dma-buf heap allocations work when pageblock_order
> >   is 1, 7, 10 on 4k and 16k kernels.
> >
> > Benchmarks:
> >
> > The benchmarks compare 16KiB kernels with pageblock_order 10 and 7.
> > The reason for pageblock_order 7 is that this value makes the minimum
> > CMA alignment requirement the same as that in 4KiB kernels (2MiB).
> >
> > - Perform 100K dma-buf heap (/dev/dma_heap/system) allocations of
> >   SZ_8M, SZ_4M, SZ_2M, SZ_1M, SZ_64, SZ_8, SZ_4. Use simpleperf
> >   (https://developer.android.com/ndk/guides/simpleperf) to measure
> >   the # of instructions and page-faults on 16k kernels.
> >   The benchmark was executed 10 times.
> >   The averages are below:
> >
> >        # instructions           |     #page-faults
> >    order 10      |   order 7    | order 10 | order 7
> > --------------------------------------------------------
> > 13,891,765,770 | 11,425,777,314 |   220    |   217
> > 14,456,293,487 | 12,660,819,302 |   224    |   219
> > 13,924,261,018 | 13,243,970,736 |   217    |   221
> > 13,910,886,504 | 13,845,519,630 |   217    |   221
> > 14,388,071,190 | 13,498,583,098 |   223    |   224
> > 13,656,442,167 | 12,915,831,681 |   216    |   218
> > 13,300,268,343 | 12,930,484,776 |   222    |   218
> > 13,625,470,223 | 14,234,092,777 |   219    |   218
> > 13,508,964,965 | 13,432,689,094 |   225    |   219
> > 13,368,950,667 | 13,683,587,37  |   219    |   225
> > -------------------------------------------------------------------
> > 13,803,137,433 | 13,131,974,268 |   220    |   220    Averages
> >
> > There were 4.86% fewer instructions when order was 7, in comparison
> > with order 10:
> >
> > 13,131,974,268 - 13,803,137,433 = -671,163,165 (-4.86%)
> >
> > The number of page faults in order 7 and 10 were the same.
> >
> > These results didn't show any significant regression when
> > pageblock_order is set to 7 on 16KiB kernels.
> >
> > - Run speedometer 3.1 (https://browserbench.org/Speedometer3.1/) 5 times
> >   on the 16k kernels with pageblock_order 7 and 10.
> >
> > order 10 | order 7 | order 7 - order 10 | (order 7 - order 10) %
> > -------------------------------------------------------------------
> >   15.8   |  16.4   |        0.6         |        3.80%
> >   16.4   |  16.2   |       -0.2         |       -1.22%
> >   16.6   |  16.3   |       -0.3         |       -1.81%
> >   16.8   |  16.3   |       -0.5         |       -2.98%
> >   16.6   |  16.8   |        0.2         |        1.20%
> > -------------------------------------------------------------------
> >   16.44     16.4         -0.04                 -0.24%    Averages
> >
> > The results didn't show any significant regression when
> > pageblock_order is set to 7 on 16KiB kernels.
> >
> > Cc: Andrew Morton
> > Cc: Vlastimil Babka
> > Cc: Liam R. Howlett
> > Cc: Lorenzo Stoakes
> > Cc: David Hildenbrand
> > Cc: Mike Rapoport
> > Cc: Zi Yan
> > Cc: Suren Baghdasaryan
> > Cc: Minchan Kim
> > Signed-off-by: Juan Yescas
> > ---
> >  arch/arm64/Kconfig              | 14 ++++++++++++++
> >  include/linux/pageblock-flags.h | 12 +++++++++---
> >  2 files changed, 23 insertions(+), 3 deletions(-)
> >
> > diff --git a/arch/arm64/Kconfig b/arch/arm64/Kconfig
> > index a182295e6f08..d784049e1e01 100644
> > --- a/arch/arm64/Kconfig
> > +++ b/arch/arm64/Kconfig
> > @@ -1658,6 +1658,20 @@ config ARCH_FORCE_MAX_ORDER
> >
> >           Don't change if unsure.
> >
> > +config ARCH_FORCE_PAGE_BLOCK_ORDER
> > +       int "Page Block Order"
> > +       range 1 ARCH_FORCE_MAX_ORDER
> > +       default ARCH_FORCE_MAX_ORDER
> > +       help
> > +         The page block order refers to the power of two number of pages that
> > +         are physically contiguous and can have a migrate type associated to them.
> > +         The maximum size of the page block order is limited by ARCH_FORCE_MAX_ORDER.
>
> Since memory compaction operates at pageblock granularity and pageblock size
> usually matches THP size, a smaller pageblock size degrades the kernel's
> anti-fragmentation mechanism for THP significantly. Can you add something like
> the text below to the help section?
>
> "Reducing pageblock order can negatively impact THP generation success rate.
> If your workload uses THP heavily, please use this option with caution."
>

Thanks Zi for pointing this out. I will add the comment in the help section.

> Otherwise, Acked-by: Zi Yan
>
> I am also OK if you move this to mm/Kconfig.

This seems reasonable to me.

> > +
> > +         This option allows overriding the default setting when the page
> > +         block order needs to be smaller than ARCH_FORCE_MAX_ORDER.
> > +
> > +         Don't change if unsure.
> > +
> >  config UNMAP_KERNEL_AT_EL0
> >         bool "Unmap kernel when running in userspace (KPTI)" if EXPERT
> >         default y
> > diff --git a/include/linux/pageblock-flags.h b/include/linux/pageblock-flags.h
> > index fc6b9c87cb0a..ab3de96bb50c 100644
> > --- a/include/linux/pageblock-flags.h
> > +++ b/include/linux/pageblock-flags.h
> > @@ -28,6 +28,12 @@ enum pageblock_bits {
> >         NR_PAGEBLOCK_BITS
> >  };
> >
> > +#if defined(CONFIG_ARCH_FORCE_PAGE_BLOCK_ORDER)
> > +#define PAGE_BLOCK_ORDER CONFIG_ARCH_FORCE_PAGE_BLOCK_ORDER
> > +#else
> > +#define PAGE_BLOCK_ORDER MAX_PAGE_ORDER
> > +#endif /* CONFIG_ARCH_FORCE_PAGE_BLOCK_ORDER */
> > +
> >  #if defined(CONFIG_HUGETLB_PAGE)
> >
> >  #ifdef CONFIG_HUGETLB_PAGE_SIZE_VARIABLE
> > @@ -41,18 +47,18 @@ extern unsigned int pageblock_order;
> >   * Huge pages are a constant size, but don't exceed the maximum allocation
> >   * granularity.
> >   */
> > -#define pageblock_order        MIN_T(unsigned int, HUGETLB_PAGE_ORDER, MAX_PAGE_ORDER)
> > +#define pageblock_order        MIN_T(unsigned int, HUGETLB_PAGE_ORDER, PAGE_BLOCK_ORDER)
> >
> >  #endif /* CONFIG_HUGETLB_PAGE_SIZE_VARIABLE */
> >
> >  #elif defined(CONFIG_TRANSPARENT_HUGEPAGE)
> >
> > -#define pageblock_order        MIN_T(unsigned int, HPAGE_PMD_ORDER, MAX_PAGE_ORDER)
> > +#define pageblock_order        MIN_T(unsigned int, HPAGE_PMD_ORDER, PAGE_BLOCK_ORDER)
> >
> >  #else /* CONFIG_TRANSPARENT_HUGEPAGE */
> >
> >  /* If huge pages are not used, group by MAX_ORDER_NR_PAGES */
> > -#define pageblock_order        MAX_PAGE_ORDER
> > +#define pageblock_order        PAGE_BLOCK_ORDER
> >
> >  #endif /* CONFIG_HUGETLB_PAGE */
> >
> > --
> > 2.49.0.906.g1f30a19c02-goog
>
>
> --
> Best Regards,
> Yan, Zi