From: Song Liu
Date: Mon, 7 Nov 2022 15:13:59 -0800
Subject: Re: [PATCH bpf-next v2 0/5] execmem_alloc for BPF programs
To: Luis Chamberlain
Cc: bpf@vger.kernel.org, linux-mm@kvack.org, akpm@linux-foundation.org,
 x86@kernel.org, peterz@infradead.org, hch@lst.de,
 rick.p.edgecombe@intel.com, aaron.lu@intel.com, rppt@kernel.org,
 dave@stgolabs.net, torvalds@linux-foundation.org
References: <20221107223921.3451913-1-song@kernel.org>

Hi Luis,

On Mon, Nov 7, 2022 at 2:55 PM Luis Chamberlain wrote:
>
> On Mon, Nov 07, 2022 at 02:39:16PM -0800, Song Liu wrote:
> > This patchset tries to address the following issues:
> >
> > 1. Direct map fragmentation
> >
> > On x86, STRICT_*_RWX requires the direct map of any RO+X memory to be also
> > RO+X. These set_memory_* calls cause 1GB page table entries to be split
> > into 2MB and 4kB ones. This fragmentation in direct map results in bigger
> > and slower page table, and pressure for both instruction and data TLB.
> >
> > Our previous work in bpf_prog_pack tries to address this issue from BPF
> > program side. Based on the experiments by Aaron Lu [4], bpf_prog_pack has
> > greatly reduced direct map fragmentation from BPF programs.
>
> You should be able to paste the results there into the respective commit
> from non-bpf-prog-pack to the new generalized solution here.
>
> > 2. iTLB pressure from BPF program
> >
> > Dynamic kernel text such as modules and BPF programs (even with current
> > bpf_prog_pack) use 4kB pages on x86, when the total size of modules and
> > BPF program is big, we can see visible performance drop caused by high
> > iTLB miss rate.
>
> This is arbitrary, please provide some real stat and in the commit with
> some reproducible benchmark.
>
> > 3. TLB shootdown for short-living BPF programs
> >
> > Before bpf_prog_pack, loading and unloading BPF programs requires global
> > TLB shootdown. This patchset (and bpf_prog_pack) replaces it with a local
> > TLB flush.
> >
> > 4. Reduce memory usage by BPF programs (in some cases)
> >
> > Most BPF programs and various trampolines are small, and they often
> > occupy a whole page. From a random server in our fleet, 50% of the
> > loaded BPF programs are less than 500 bytes in size, and 75% of them are
> > less than 2kB in size. Allowing these BPF programs to share 2MB pages
> > would yield some memory saving for systems with many BPF programs. For
> > systems with only small number of BPF programs, this patch may waste a
> > little memory by allocating one 2MB page, but using only part of it.
>
> Should be easy to provide some real numbers with at least selftests and
> onto the commit as well.
>
> > Based on our experiments [5], we measured 0.5% performance improvement
> > from bpf_prog_pack. This patchset further boosts the improvement to 0.7%.
> > The difference is because bpf_prog_pack uses 512x 4kB pages instead of
> > 1x 2MB page, bpf_prog_pack as-is doesn't resolve #2 above.
> >
> > This patchset replaces bpf_prog_pack with a better API and makes it
> > available for other dynamic kernel text, such as modules, ftrace, kprobe.
>
> And likewise here, please no arbitrary internal benchmark, real numbers.

The benchmark used here is identical to our web service, which runs on
a great many servers, so it represents the workload that we care about
most. Unfortunately, it is not possible to run it outside of our data
centers.

We can build some artificial workloads and probably get much higher
performance improvements. But such workloads may not represent
real-world use cases.

Thanks,
Song