From: Ryan Roberts
Date: Mon, 21 Oct 2024 10:55:16 +0100
Subject: Re: [External] : Re: [RFC PATCH v1 00/57] Boot-time page size selection for arm64
To: Joseph Salisbury, David Hildenbrand, Andrew Morton, Anshuman Khandual, Ard Biesheuvel, Catalin Marinas, Greg Marsden, Ivan Ivanov, Kalesh Singh, Marc Zyngier, Mark Rutland, Matthias Brugger, Miroslav Benes, Will Deacon
Cc: linux-arm-kernel@lists.infradead.org, linux-kernel@vger.kernel.org, linux-mm@kvack.org
Message-ID: <745cb0c5-35ce-4879-9d98-52816f3241df@arm.com>
References: <20241014105514.3206191-1-ryan.roberts@arm.com> <915e2f0c-f603-4617-8429-da4dacc862c4@redhat.com> <3f096ba0-b6f0-4db7-9d65-ba0550eb98b1@redhat.com>

On 18/10/2024 21:06, Joseph Salisbury wrote:
>
>
>
> On 10/18/24 15:27, David Hildenbrand wrote:
>>
>>>>> Hi Ryan,
>>>>>
>>>>> First off, this is excellent work! Your cover page was very detailed
>>>>> and made the patch set easier to understand. Thanks!
>>>>>
>>>>> Some questions/comments:
>>>>>
>>>>> Once a kernel is booted with a certain page size, could there be issues
>>>>> if it is booted later with a different page size? How about if this is
>>>>> done frequently?
>>>>
>>>> I think that is the reason why you are only given the option in RHEL
>>>> to select the kernel (4K vs. 64K) to use at install time.
>>>>
>>>> Software can easily use a different data format for persistence based
>>>> on the base page size. I would suspect DBs might be the usual suspects.
>>>>
>>>> One example is swap space I think, where the base page size used when
>>>> formatting the device is used, and it cannot be used with a different
>>>> page size unless reformatting it.
>>>>
>>>> So ... one has to be a bit careful ...
>>>>
>>> Yes, that is what I was thinking. Once a userspace process does an I/O
>>> and if it is based on PAGE_SIZE things can go south. I think this is
>>> not an issue with THP, so maybe it's possible with boot-time page selection?
>>
>> THP is a different beast and has different semantics: the base page size
>> doesn't change: the result of getpagesize() is unmodified ("transparent").
>>
>> One would have to emulate for a given user space process a different page
>> size ... and Ryan can likely tell some stories about that.
>>
>> Not that I consider it reasonable to have dynamic page sizes in the kernel and
>> then try emulating a different one for all user space.
>
> This is probably a case of ensuring proper documentation from the distro or
> application vendor.
>
> Or maybe some type of "Safety gate" could be implemented outside of the kernel.
> Some check for the prior use of different page sizes, in the cases where it
> could cause problems.

I agree there are likely to be problems in some corner cases when switching page
size between boots, if persisted data makes assumptions about the page size. I
would argue that any problems that are observed should really be considered bugs
in the user space SW though.

But I don't think this is really any different from today; with Ubuntu, for
example, you can install both 4K and 64K kernels concurrently, then choose which
one to boot via GRUB. So the issue exists there already. This proposed boot-time
page size selection series doesn't make that any worse; it just simplifies the
distribution model, given the reality that distros are now having to support
multiple page sizes.

Thanks,
Ryan
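
As an illustration of the user-space "safety gate" idea raised in the thread, here is a minimal sketch of an application recording the page size in its own persisted data and checking it on a later boot. The file layout, the `PGSZ` magic and all function names are hypothetical and are not taken from the series or from any real application; only `os.sysconf("SC_PAGESIZE")` is a real interface.

```python
import os
import struct
import tempfile

# Hypothetical on-disk header: 4-byte magic, then the page size as a
# little-endian unsigned 32-bit integer.
MAGIC = b"PGSZ"
HEADER = struct.Struct("<4sI")


def current_page_size() -> int:
    """Return the base page size of the running kernel, in bytes."""
    return os.sysconf("SC_PAGESIZE")


def write_header(path: str, page_size: int) -> None:
    """Record the page size that the data file was laid out with."""
    with open(path, "wb") as f:
        f.write(HEADER.pack(MAGIC, page_size))


def check_header(path: str, page_size: int) -> bool:
    """Return True if the file was written with the given page size."""
    with open(path, "rb") as f:
        magic, recorded = HEADER.unpack(f.read(HEADER.size))
    if magic != MAGIC:
        raise ValueError("not a recognised data file")
    return recorded == page_size


def demo() -> tuple:
    """Write a file as if booted with 4K pages, then re-check it as 4K and 64K."""
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "data.bin")
        write_header(path, 4096)
        return check_header(path, 4096), check_header(path, 65536)


if __name__ == "__main__":
    print("running page size:", current_page_size())
    print("same size ok, different size ok:", demo())
```

A real application would pass `current_page_size()` to both calls and refuse to open, or convert, the data when the check fails. Whether that check belongs in each application or in some distro-level tool is exactly the open question above.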