References: <20230601101257.530867-1-rppt@kernel.org> <20230605092040.GB3460@kernel.org> <20230608184116.GJ52412@kernel.org>
In-Reply-To: <20230608184116.GJ52412@kernel.org>
From: Song Liu
Date: Fri, 9 Jun 2023 10:02:16 -0700
Subject: Re: [PATCH 00/13] mm: jit/text allocator
To: Mike Rapoport
Cc: Mark Rutland, Kent Overstreet, linux-kernel@vger.kernel.org,
    Andrew Morton, Catalin Marinas, Christophe Leroy, "David S. Miller",
    Dinh Nguyen, Heiko Carstens, Helge Deller, Huacai Chen,
    Luis Chamberlain, Michael Ellerman, "Naveen N. Rao", Palmer Dabbelt,
    Russell King, Steven Rostedt, Thomas Bogendoerfer, Thomas Gleixner,
    Will Deacon, bpf@vger.kernel.org, linux-arm-kernel@lists.infradead.org,
    linux-mips@vger.kernel.org, linux-mm@kvack.org,
    linux-modules@vger.kernel.org, linux-parisc@vger.kernel.org,
    linux-riscv@lists.infradead.org, linux-s390@vger.kernel.org,
    linux-trace-kernel@vger.kernel.org, linuxppc-dev@lists.ozlabs.org,
    loongarch@lists.linux.dev, netdev@vger.kernel.org,
    sparclinux@vger.kernel.org, x86@kernel.org

On Thu, Jun 8, 2023 at 11:41 AM Mike Rapoport wrote:
>
> On Tue, Jun 06, 2023 at 11:21:59AM -0700, Song Liu wrote:
> > On Mon, Jun 5, 2023 at 3:09 AM Mark Rutland wrote:
> >
> > [...]
> >
> > > > > > Can you give more detail on what parameters you need? If the only extra
> > > > > > parameter is just "does this allocation need to live close to kernel
> > > > > > text", that's not that big of a deal.
> > > > >
> > > > > My thinking was that we at least need the start + end for each caller. That
> > > > > might be it, tbh.
> > > >
> > > > Do you mean that modules will have something like
> > > >
> > > >         jit_text_alloc(size, MODULES_START, MODULES_END);
> > > >
> > > > and kprobes will have
> > > >
> > > >         jit_text_alloc(size, KPROBES_START, KPROBES_END);
> > > > ?
> > >
> > > Yes.
> >
> > How about we start with two APIs:
> >      jit_text_alloc(size);
> >      jit_text_alloc_range(size, start, end);
> >
> > AFAICT, arm64 is the only arch that requires the latter API. And TBH, I am
> > not quite convinced it is needed.
>
> Right now arm64 and riscv override bpf and kprobes allocations to use the
> entire vmalloc address space, but having the ability to allocate generated
> code outside of modules area may be useful for other architectures.
>
> Still the start + end for the callers feels backwards to me because the
> callers do not define the ranges, but rather the architectures, so we still
> need a way for architectures to define how they want to allocate memory for
> the generated code.

Yeah, this makes sense.

>
> > > > It still can be achieved with a single jit_alloc_arch_params(), just by
> > > > adding enum jit_type parameter to jit_text_alloc().
> > >
> > > That feels backwards to me; it centralizes a bunch of information about
> > > distinct users to be able to shove that into a static array, when the callsites
> > > can pass that information.
> >
> > I think we only have two types of users: module and everything else (ftrace, kprobe,
> > bpf stuff). The key differences are:
> >
> >   1. module uses text and data; while everything else only uses text.
> >   2. module code is generated by the compiler, and thus has stronger
> >   requirements in address ranges; everything else is generated via some
> >   JIT or manually written assembly, so they are more flexible with address
> >   ranges (in JIT, we can avoid using instructions that require a specific
> >   address range).
> >
> > The next question is, can we have the two types of users share the same
> > address ranges? If not, we can reserve the preferred range for modules,
> > and let everything else use the other range. I don't see reasons to further
> > separate users in the "everything else" group.
>
> I agree that we can define only two types: modules and everything else and
> let the architectures define if they need different ranges for these two
> types, or want the same range for everything.
>
> With only two types we can have two API calls for alloc, and a single
> structure that defines the ranges etc. from the architecture side rather
> than spread all over.
>
> Like something along these lines:
>
>         struct execmem_range {
>                 unsigned long   start;
>                 unsigned long   end;
>                 unsigned long   fallback_start;
>                 unsigned long   fallback_end;
>                 pgprot_t        pgprot;
>                 unsigned int    alignment;
>         };
>
>         struct execmem_modules_range {
>                 enum execmem_module_flags flags;
>                 struct execmem_range text;
>                 struct execmem_range data;
>         };
>
>         struct execmem_jit_range {
>                 struct execmem_range text;
>         };
>
>         struct execmem_params {
>                 struct execmem_modules_range    modules;
>                 struct execmem_jit_range        jit;
>         };
>
>         struct execmem_params *execmem_arch_params(void);
>
>         void *execmem_text_alloc(size_t size);
>         void *execmem_data_alloc(size_t size);
>         void execmem_free(void *ptr);

With the jit variation, maybe we can just call these
module_[text|data]_alloc()?

btw: Depending on the implementation of the allocator, we may also
need separate free()s for text and data.

>
>         void *jit_text_alloc(size_t size);
>         void jit_free(void *ptr);
>

[...]

How should we move ahead from here?

AFAICT, all these changes can be easily extended and refactored
in the future, so we don't have to make it perfect the first time.
OTOH, having the interface committed (either this set or my
module_alloc_type version) can unblock work on the binpack
allocator and on the user side. Therefore, I think we can move
relatively fast here?

Thanks,
Song