From mboxrd@z Thu Jan 1 00:00:00 1970
From: Song Liu
Date: Tue, 8 Nov 2022 10:50:04 -0800
Subject: Re: [PATCH bpf-next v2 0/5] execmem_alloc for BPF programs
To: "Edgecombe, Rick P"
Cc: "rppt@kernel.org", "peterz@infradead.org", "bpf@vger.kernel.org",
 "linux-mm@kvack.org", "hch@lst.de", "x86@kernel.org",
 "akpm@linux-foundation.org", "mcgrof@kernel.org", "Lu, Aaron"
References: <20221107223921.3451913-1-song@kernel.org>
 <9e59a4e8b6f071cf380b9843cdf1e9160f798255.camel@intel.com>
In-Reply-To: <9e59a4e8b6f071cf380b9843cdf1e9160f798255.camel@intel.com>
MIME-Version: 1.0
Content-Type: text/plain; charset="UTF-8"

On Tue, Nov 8, 2022 at 8:51 AM Edgecombe, Rick P wrote:
>
> On Tue, 2022-11-08 at 13:27 +0200, Mike Rapoport wrote:
> > > Based on our experiments [5], we measured 0.5% performance
> > > improvement from bpf_prog_pack. This patchset further boosts the
> > > improvement to 0.7%.
> > > The difference is because bpf_prog_pack uses 512x 4kB pages instead
> > > of 1x 2MB page, bpf_prog_pack as-is doesn't resolve #2 above.
> > >
> > > This patchset replaces bpf_prog_pack with a better API and makes it
> > > available for other dynamic kernel text, such as modules, ftrace,
> > > kprobe.
> > >
> >
> > The proposed execmem_alloc() looks to me very much tailored for x86
> > to be used as a replacement for module_alloc().
> > Some architectures have module_alloc() that is quite different from
> > the default or x86 version, so I'd expect at least some explanation
> > how modules etc can use execmem_ APIs without breaking !x86
> > architectures.
>
> I think this is fair, but I think we should ask ourselves - how
> much should we do in one step?
>
> For non-text_poke() architectures, the way you can make it work is to
> have the API look like:
> execmem_alloc()  <- Does the allocation, but not necessarily usable yet
> execmem_write()  <- Loads the mapping, doesn't work after finish()
> execmem_finish() <- Makes the mapping live (loaded, executable, ready)
>
> So for text_poke():
> execmem_alloc()  <- reserves the mapping
> execmem_write()  <- text_pokes() to the mapping
> execmem_finish() <- does nothing
>
> And non-text_poke():
> execmem_alloc()  <- Allocates a regular RW vmalloc allocation
> execmem_write()  <- Writes normally to it
> execmem_finish() <- does set_memory_ro()/set_memory_x() on it

Yeah, some fallback mechanism like this is missing in the current
version. It is not a problem for BPF programs, as we call it from arch
code. But we do need better APIs for modules.

Thanks,
Song

>
> Non-text_poke() only gets the benefits of centralized logic, but the
> interface works for both. This is pretty much what the perm_alloc() RFC
> did to make it work with other architectures and modules. But to fit
> with the existing modules code (which is actually spread all over) and
> also handle RO sections, it also needed some additional bells and
> whistles.
>
> So the question I'm trying to ask is, how much should we target for the
> next step? I first thought that this functionality was so intertwined,
> it would be too hard to do iteratively. So if we want to try
> iteratively, I'm ok if it doesn't solve everything.
>
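
For illustration only, the non-text_poke() fallback described in the
quoted text (allocate RW, write normally, then flip permissions in
finish) can be modeled in userspace. This is a rough sketch, not kernel
code and not part of the patchset: every sketch_* name is hypothetical,
mmap() stands in for a regular RW vmalloc allocation, and mprotect()
stands in for set_memory_ro()/set_memory_x(). It assumes a Linux-like
system that permits changing an anonymous mapping to read+exec.

```c
#define _DEFAULT_SOURCE		/* MAP_ANONYMOUS on glibc with strict -std= */
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>
#include <sys/mman.h>

/* Hypothetical userspace model of the alloc/write/finish flow. */
struct sketch_execmem {
	void *addr;
	size_t size;
	bool finished;	/* true once the region is read-only + executable */
};

/* alloc: plain RW region (stand-in for a regular RW vmalloc allocation) */
static int sketch_execmem_alloc(struct sketch_execmem *m, size_t size)
{
	void *p = mmap(NULL, size, PROT_READ | PROT_WRITE,
		       MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);

	if (p == MAP_FAILED)
		return -1;
	m->addr = p;
	m->size = size;
	m->finished = false;
	return 0;
}

/* write: ordinary stores; refused after finish() or when out of bounds */
static int sketch_execmem_write(struct sketch_execmem *m, size_t off,
				const void *src, size_t len)
{
	if (m->finished || off > m->size || len > m->size - off)
		return -1;
	memcpy((char *)m->addr + off, src, len);
	return 0;
}

/* finish: drop write, add exec (stand-in for set_memory_ro()/set_memory_x()) */
static int sketch_execmem_finish(struct sketch_execmem *m)
{
	if (mprotect(m->addr, m->size, PROT_READ | PROT_EXEC))
		return -1;
	m->finished = true;
	return 0;
}

static void sketch_execmem_free(struct sketch_execmem *m)
{
	munmap(m->addr, m->size);
	m->addr = NULL;
}

/* Walk through the three steps; returns 0 if each behaves as described. */
int sketch_demo(void)
{
	static const unsigned char payload[] = { 0x90, 0xc3 };	/* never executed */
	struct sketch_execmem m;
	int ret = -1;

	if (sketch_execmem_alloc(&m, 4096))
		return -1;
	if (sketch_execmem_write(&m, 0, payload, sizeof(payload)))
		goto out;
	if (sketch_execmem_finish(&m))
		goto out;
	/* After finish(), a further write must be rejected. */
	if (sketch_execmem_write(&m, 0, payload, sizeof(payload)) == 0)
		goto out;
	/* The mapping is still readable and holds what was written. */
	if (memcmp(m.addr, payload, sizeof(payload)) == 0)
		ret = 0;
out:
	sketch_execmem_free(&m);
	return ret;
}
```

Calling sketch_demo() from a small main() exercises all three steps. On
a text_poke() architecture, the same three entry points would instead
reserve the mapping, text_poke() into it, and do nothing in finish, as
the quoted text lays out.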