From: Florian Fainelli
Date: Mon, 14 Oct 2024 10:32:53 -0700
Subject: Re: [RFC PATCH v1 00/57] Boot-time page size selection for arm64
To: Ryan Roberts, Andrew Morton, Anshuman Khandual, Ard Biesheuvel,
 Catalin Marinas, David Hildenbrand, Greg Marsden, Ivan Ivanov,
 Kalesh Singh, Marc Zyngier, Mark Rutland, Matthias Brugger,
 Miroslav Benes, Will Deacon
Cc: linux-arm-kernel@lists.infradead.org, linux-kernel@vger.kernel.org,
 linux-mm@kvack.org
Message-ID: <3e742298-2f38-496c-ba63-1e30d16318c6@broadcom.com>
In-Reply-To: <20241014105514.3206191-1-ryan.roberts@arm.com>
References: <20241014105514.3206191-1-ryan.roberts@arm.com>

On 10/14/24 03:55, Ryan Roberts wrote:
> Hi All,
>
> Patch bomb incoming...
> This covers many subsystems, so I've included a core set
> of people on the full series and additionally included maintainers on relevant
> patches. I haven't included those maintainers on this cover letter since the
> numbers were far too big for it to work. But I've included a link to this cover
> letter on each patch, so they can hopefully find their way here. For follow up
> submissions I'll break it up by subsystem, but for now thought it was important
> to show the full picture.
>
> This RFC series implements support for boot-time page size selection within the
> arm64 kernel. arm64 supports 3 base page sizes (4K, 16K, 64K), but to date, page
> size has been selected at compile-time, meaning the size is baked into a given
> kernel image. As use of larger-than-4K page sizes becomes more prevalent this
> starts to present a problem for distributions. Boot-time page size selection
> enables the creation of a single kernel image, which can be told which page size
> to use on the kernel command line.
>
> Why is having an image-per-page size problematic?
> =================================================
>
> Many traditional distros are now supporting both 4K and 64K. And this means
> managing 2 kernel packages, along with drivers for each. For some, it means
> multiple installer flavours and multiple ISOs. All of this adds up to a
> less-than-ideal level of complexity. Additionally, Android now supports 4K and
> 16K kernels. I'm told having to explicitly manage their KABI for each kernel is
> painful, and the extra flash space required for both kernel images and the
> duplicated modules has been problematic. Boot-time page size selection solves
> all of this.
>
> Additionally, in starting to think about the longer term deployment story for
> D128 page tables, which Arm architecture now supports, a lot of the same
> problems need to be solved, so this work sets us up nicely for that.
>
> So what's the down side?
> ========================
>
> Well nothing's free; various static allocations in the kernel image must be
> sized for the worst case (largest supported page size), so image size is in line
> with size of 64K compile-time image. So if you're interested in 4K or 16K, there
> is a slight increase to the image size. But I expect that problem goes away if
> you're compressing the image - it's just some extra zeros. At boot-time, I expect
> we could free the unused static storage once we know the page size - although
> that would be a follow up enhancement.
>
> And then there is performance. Since PAGE_SIZE and friends are no longer
> compile-time constants, we must look up their values and do arithmetic at
> runtime instead of compile-time. My early perf testing suggests this is
> imperceptible for real-world workloads, and only has small impact on
> microbenchmarks - more on this below.
>
> Approach
> ========
>
> The basic idea is to rid the source of any assumptions that PAGE_SIZE and
> friends are compile-time constant, but in a way that allows the compiler to
> perform the same optimizations as was previously being done if they do turn out
> to be compile-time constant. Where constants are required, we use limits;
> PAGE_SIZE_MIN and PAGE_SIZE_MAX. See commit log in patch 1 for full description
> of all the classes of problems to solve.
>
> By default PAGE_SIZE_MIN=PAGE_SIZE_MAX=PAGE_SIZE. But an arch may opt-in to
> boot-time page size selection by defining PAGE_SIZE_MIN & PAGE_SIZE_MAX. arm64
> does this if the user selects the CONFIG_ARM64_BOOT_TIME_PAGE_SIZE Kconfig,
> which is an alternative to selecting a compile-time page size.
>
> When boot-time page size is active, the arch pgtable geometry macro definitions
> resolve to something that can be configured at boot. The arm64 implementation in
> this series mainly uses global, __ro_after_init variables.
> I've tried using alternatives patching, but that performs worse than loading
> from memory; I think due to code size bloat.

FWIW, this paragraph was not entirely clear to me until I looked at
patch 57 and saw that compile-time page size selection had been
retained and could continue to be used as-is. That was somewhat
implicit, and IMHO not explicit enough; not a big deal, though.

Great work, thanks for doing that! This makes me wonder whether we could
leverage any of it to have a single kernel supporting both LPAE and
!LPAE on 32-bit ARM, but that still seems somewhat more difficult,
largely due to the difference in the page table descriptor format
(long vs. short).
-- 
Florian
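
For readers skimming the thread, here is a rough user-space sketch of the
approach described in the quoted cover letter: compile-time limits
(PAGE_SIZE_MIN / PAGE_SIZE_MAX) where a constant is required, and a value
fixed once at boot everywhere else. It is not taken from the series. The
names page_shift, set_page_shift(), page_align_up() and scratch_page are
hypothetical, and a plain static variable stands in for what the series
describes as a global __ro_after_init variable.

```c
#include <assert.h>
#include <stddef.h>

/* Compile-time limits, usable wherever a true constant is required. */
#define PAGE_SHIFT_MIN	12			/* 4K  */
#define PAGE_SHIFT_MAX	16			/* 64K */
#define PAGE_SIZE_MIN	(1UL << PAGE_SHIFT_MIN)
#define PAGE_SIZE_MAX	(1UL << PAGE_SHIFT_MAX)

/*
 * Hypothetical stand-in for a __ro_after_init global: written once
 * during early boot, read-only afterwards. Defaults to 4K here.
 */
static unsigned int page_shift = PAGE_SHIFT_MIN;

/* PAGE_SIZE is no longer a constant; it is computed from the global. */
#define PAGE_SIZE	(1UL << page_shift)
#define PAGE_MASK	(~(PAGE_SIZE - 1))

/* Static storage has to be sized for the worst case (largest page). */
static unsigned char scratch_page[PAGE_SIZE_MAX];

/* Hypothetical boot-time hook: returns 0 on success, -1 if unsupported. */
static int set_page_shift(unsigned int shift)
{
	switch (shift) {
	case 12:	/* 4K  */
	case 14:	/* 16K */
	case 16:	/* 64K */
		page_shift = shift;
		return 0;
	default:
		return -1;
	}
}

/* Round addr up to the next page boundary, using the runtime page size. */
static unsigned long page_align_up(unsigned long addr)
{
	return (addr + PAGE_SIZE - 1) & PAGE_MASK;
}

/* Bytes of the worst-case buffer left unused at the selected page size. */
static size_t scratch_unused_bytes(void)
{
	return sizeof(scratch_page) - PAGE_SIZE;
}
```

With a 16K selection, scratch_unused_bytes() reports 48K of slack, which
is the kind of static storage the cover letter suggests could be freed
once the page size is known. If page_shift were a compile-time constant
instead, PAGE_SIZE and PAGE_MASK would fold to constants again, which is
the property the cover letter says the series tries to preserve.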