Date: Mon, 23 Oct 2023 18:14:20 +0100
From: Will Deacon
To: Mike Rapoport
Cc: linux-kernel@vger.kernel.org, Andrew Morton, Björn Töpel,
	Catalin Marinas, Christophe Leroy, "David S. Miller", Dinh Nguyen,
	Heiko Carstens, Helge Deller, Huacai Chen, Kent Overstreet,
	Luis Chamberlain, Mark Rutland, Michael Ellerman, Nadav Amit,
	"Naveen N. Rao", Palmer Dabbelt, Puranjay Mohan, Rick Edgecombe,
	Russell King, Song Liu, Steven Rostedt, Thomas Bogendoerfer,
	Thomas Gleixner, bpf@vger.kernel.org,
	linux-arm-kernel@lists.infradead.org, linux-mips@vger.kernel.org,
	linux-mm@kvack.org, linux-modules@vger.kernel.org,
	linux-parisc@vger.kernel.org, linux-riscv@lists.infradead.org,
	linux-s390@vger.kernel.org, linux-trace-kernel@vger.kernel.org,
	linuxppc-dev@lists.ozlabs.org, loongarch@lists.linux.dev,
	netdev@vger.kernel.org, sparclinux@vger.kernel.org, x86@kernel.org
Subject: Re: [PATCH v3 04/13] mm/execmem, arch: convert remaining overrides of module_alloc to execmem
Message-ID: <20231023171420.GA4041@willie-the-truck>
References: <20230918072955.2507221-1-rppt@kernel.org> <20230918072955.2507221-5-rppt@kernel.org>
In-Reply-To: <20230918072955.2507221-5-rppt@kernel.org>

Hi Mike,

On Mon, Sep 18, 2023 at 10:29:46AM +0300, Mike Rapoport wrote:
> From: "Mike Rapoport (IBM)"
>
> Extend execmem parameters to accommodate more complex overrides of
> module_alloc() by architectures.
>
> This includes specification of a fallback range required by arm, arm64
> and powerpc and support for allocation of KASAN shadow required by
> arm64, s390 and x86.
>
> The core implementation of execmem_alloc() takes care of suppressing
> warnings when the initial allocation fails but there is a fallback range
> defined.
>
> Signed-off-by: Mike Rapoport (IBM)
> ---
>  arch/arm/kernel/module.c     | 38 ++++++++++++---------
>  arch/arm64/kernel/module.c   | 57 ++++++++++++++------------------
>  arch/powerpc/kernel/module.c | 52 ++++++++++++++---------------
>  arch/s390/kernel/module.c    | 52 +++++++++++------------------
>  arch/x86/kernel/module.c     | 64 +++++++++++-------------------------
>  include/linux/execmem.h      | 14 ++++++++
>  mm/execmem.c                 | 43 ++++++++++++++++++++++--
>  7 files changed, 167 insertions(+), 153 deletions(-)

[...]

> diff --git a/arch/arm64/kernel/module.c b/arch/arm64/kernel/module.c
> index dd851297596e..cd6320de1c54 100644
> --- a/arch/arm64/kernel/module.c
> +++ b/arch/arm64/kernel/module.c
> @@ -20,6 +20,7 @@
>  #include
>  #include
>  #include
> +#include
>
>  #include
>  #include
> @@ -108,46 +109,38 @@ static int __init module_init_limits(void)
>
>  	return 0;
>  }
> -subsys_initcall(module_init_limits);
>
> -void *module_alloc(unsigned long size)
> +static struct execmem_params execmem_params __ro_after_init = {
> +	.ranges = {
> +		[EXECMEM_DEFAULT] = {
> +			.flags = EXECMEM_KASAN_SHADOW,
> +			.alignment = MODULE_ALIGN,
> +		},
> +	},
> +};
> +
> +struct execmem_params __init *execmem_arch_params(void)
>  {
> -	void *p = NULL;
> +	struct execmem_range *r = &execmem_params.ranges[EXECMEM_DEFAULT];
>
> -	/*
> -	 * Where possible, prefer to allocate within direct branch range of the
> -	 * kernel such that no PLTs are necessary.
> -	 */

Why are you removing this comment? I think you could just move it next to
the part where we set a 128MiB range.

> -	if (module_direct_base) {
> -		p = __vmalloc_node_range(size, MODULE_ALIGN,
> -					 module_direct_base,
> -					 module_direct_base + SZ_128M,
> -					 GFP_KERNEL | __GFP_NOWARN,
> -					 PAGE_KERNEL, 0, NUMA_NO_NODE,
> -					 __builtin_return_address(0));
> -	}
> +	module_init_limits();

Hmm, this used to be run from subsys_initcall(), but now you're running
it _really_ early, before random_init(), so randomization of the module
space is no longer going to be very random if we don't have early entropy
from the firmware or the CPU, which is likely to be the case on most SoCs.

>
> -	if (!p && module_plt_base) {
> -		p = __vmalloc_node_range(size, MODULE_ALIGN,
> -					 module_plt_base,
> -					 module_plt_base + SZ_2G,
> -					 GFP_KERNEL | __GFP_NOWARN,
> -					 PAGE_KERNEL, 0, NUMA_NO_NODE,
> -					 __builtin_return_address(0));
> -	}
> +	r->pgprot = PAGE_KERNEL;
>
> -	if (!p) {
> -		pr_warn_ratelimited("%s: unable to allocate memory\n",
> -				    __func__);
> -	}
> +	if (module_direct_base) {
> +		r->start = module_direct_base;
> +		r->end = module_direct_base + SZ_128M;
>
> -	if (p && (kasan_alloc_module_shadow(p, size, GFP_KERNEL) < 0)) {
> -		vfree(p);
> -		return NULL;
> +		if (module_plt_base) {
> +			r->fallback_start = module_plt_base;
> +			r->fallback_end = module_plt_base + SZ_2G;
> +		}
> +	} else if (module_plt_base) {
> +		r->start = module_plt_base;
> +		r->end = module_plt_base + SZ_2G;
>  	}
>
> -	/* Memory is intended to be executable, reset the pointer tag. */
> -	return kasan_reset_tag(p);
> +	return &execmem_params;
>  }
>
>  enum aarch64_reloc_op {

[...]

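For readers following the thread, the primary-then-fallback behaviour that
the commit message describes (silence the first failure only when a fallback
range is defined) can be sketched in plain userspace C. This is a minimal
sketch under stated assumptions, not the patch's code: struct range_sketch,
toy_alloc() and alloc_with_fallback() are hypothetical names, and the toy
allocator merely checks that the request fits the window.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the primary/fallback fields of a range. */
struct range_sketch {
	uintptr_t start, end;                   /* preferred window */
	uintptr_t fallback_start, fallback_end; /* 0 start means "no fallback" */
};

/* Counts failed attempts that were not silenced. */
static int toy_warnings;

/* Toy allocator (an assumption): succeeds iff the request fits the window. */
static uintptr_t toy_alloc(size_t size, uintptr_t start, uintptr_t end,
			   bool nowarn)
{
	if (end > start && size <= end - start)
		return start;
	if (!nowarn)
		toy_warnings++;
	return 0;
}

/* Try the preferred window first, then the fallback window if one exists. */
static uintptr_t alloc_with_fallback(size_t size, const struct range_sketch *r)
{
	bool fallback = r->fallback_start != 0;
	uintptr_t p;

	/* Silence the first failure only if a second attempt will follow. */
	p = toy_alloc(size, r->start, r->end, fallback);
	if (!p && fallback)
		p = toy_alloc(size, r->fallback_start, r->fallback_end, false);

	return p;
}
```

With a 128-byte preferred window and a 4 KiB fallback window, a 512-byte
request fails silently in the first window and is served from the fallback;
a warning is counted only when both attempts fail.
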
> diff --git a/include/linux/execmem.h b/include/linux/execmem.h
> index 44e213625053..806ad1a0088d 100644
> --- a/include/linux/execmem.h
> +++ b/include/linux/execmem.h
> @@ -32,19 +32,33 @@ enum execmem_type {
>  	EXECMEM_TYPE_MAX,
>  };
>
> +/**
> + * enum execmem_module_flags - options for executable memory allocations
> + * @EXECMEM_KASAN_SHADOW: allocate kasan shadow
> + */
> +enum execmem_range_flags {
> +	EXECMEM_KASAN_SHADOW = (1 << 0),
> +};
> +
>  /**
>   * struct execmem_range - definition of a memory range suitable for code and
>   *			  related data allocations
>   * @start: address space start
>   * @end: address space end (inclusive)
> + * @fallback_start: start of the range for fallback allocations
> + * @fallback_end: end of the range for fallback allocations (inclusive)
>   * @pgprot: permissions for memory in this address space
>   * @alignment: alignment required for text allocations
> + * @flags: options for memory allocations for this range
>   */
>  struct execmem_range {
>  	unsigned long start;
>  	unsigned long end;
> +	unsigned long fallback_start;
> +	unsigned long fallback_end;
>  	pgprot_t pgprot;
>  	unsigned int alignment;
> +	enum execmem_range_flags flags;
>  };
>
>  /**
> diff --git a/mm/execmem.c b/mm/execmem.c
> index f25a5e064886..a8c2f44d0133 100644
> --- a/mm/execmem.c
> +++ b/mm/execmem.c
> @@ -11,12 +11,46 @@ static void *execmem_alloc(size_t size, struct execmem_range *range)
>  {
>  	unsigned long start = range->start;
>  	unsigned long end = range->end;
> +	unsigned long fallback_start = range->fallback_start;
> +	unsigned long fallback_end = range->fallback_end;
>  	unsigned int align = range->alignment;
>  	pgprot_t pgprot = range->pgprot;
> +	bool kasan = range->flags & EXECMEM_KASAN_SHADOW;
> +	unsigned long vm_flags = VM_FLUSH_RESET_PERMS;
> +	bool fallback = !!fallback_start;
> +	gfp_t gfp_flags = GFP_KERNEL;
> +	void *p;
>
> -	return __vmalloc_node_range(size, align, start, end,
> -				    GFP_KERNEL, pgprot, VM_FLUSH_RESET_PERMS,
> -				    NUMA_NO_NODE, __builtin_return_address(0));
> +	if (PAGE_ALIGN(size) > (end - start))
> +		return NULL;
> +
> +	if (kasan)
> +		vm_flags |= VM_DEFER_KMEMLEAK;

Hmm, I don't think we passed this before on arm64, should we have done?

Will