MIME-Version: 1.0
References: <20220819214232.18784-1-alexei.starovoitov@gmail.com>
In-Reply-To: <20220819214232.18784-1-alexei.starovoitov@gmail.com>
From: Kumar Kartikeya Dwivedi
Date: Wed, 24 Aug 2022 22:03:13 +0200
Subject: Re: [PATCH v3 bpf-next 00/15] bpf: BPF specific memory allocator.
To: Alexei Starovoitov
Cc: davem@davemloft.net, daniel@iogearbox.net, andrii@kernel.org,
 tj@kernel.org, delyank@fb.com, linux-mm@kvack.org, bpf@vger.kernel.org,
 kernel-team@fb.com
Content-Type: text/plain; charset="UTF-8"
Sender: owner-linux-mm@kvack.org
Precedence: bulk

On Fri, 19 Aug 2022 at 23:42, Alexei Starovoitov wrote:
>
> From: Alexei Starovoitov
>
> Introduce any context BPF specific memory allocator.
>
> Tracing BPF programs can attach to kprobe and fentry. Hence they
> run in unknown context where calling plain kmalloc() might not be safe.
> Front-end kmalloc() with per-cpu cache of free elements.
> Refill this cache asynchronously from irq_work.
>
> Major achievements enabled by bpf_mem_alloc:
> - Dynamically allocated hash maps used to be 10 times slower than fully preallocated.
>   With bpf_mem_alloc and subsequent optimizations the speed of dynamic maps is equal to full prealloc.
> - Tracing bpf programs can use dynamically allocated hash maps.
>   Potentially saving lots of memory. Typical hash map is sparsely populated.
> - Sleepable bpf programs can use dynamically allocated hash maps.
>

From my side, for the whole series:
Acked-by: Kumar Kartikeya Dwivedi

> v2->v3:
> - Rewrote the free_list algorithm based on discussions with Kumar. Patch 1.
> - Allowed sleepable bpf progs to use dynamically allocated maps. Patches 13 and 14.
> - Added sysctl to force bpf_mem_alloc in hash map even if pre-alloc is
>   requested to reduce memory consumption. Patch 15.
> - Fix: zero-fill percpu allocation
> - Single rcu_barrier at the end instead of each cpu during bpf_mem_alloc destruction
>
> v2 thread:
> https://lore.kernel.org/bpf/20220817210419.95560-1-alexei.starovoitov@gmail.com/
>
> v1->v2:
> - Moved unsafe direct call_rcu() from hash map into safe place inside bpf_mem_alloc. Patches 7 and 9.
> - Optimized atomic_inc/dec in hash map with percpu_counter. Patch 6.
> - Tuned watermarks per allocation size. Patch 8
> - Adopted this approach to per-cpu allocation. Patch 10.
> - Fully converted hash map to bpf_mem_alloc. Patch 11.
> - Removed tracing prog restriction on map types. Combination of all patches and final patch 12.
>
> v1 thread:
> https://lore.kernel.org/bpf/20220623003230.37497-1-alexei.starovoitov@gmail.com/
>
> LWN article:
> https://lwn.net/Articles/899274/
>
> Future work:
> - expose bpf_mem_alloc as uapi FD to be used in dynptr_alloc, kptr_alloc
> - convert lru map to bpf_mem_alloc
>
> Alexei Starovoitov (15):
>   bpf: Introduce any context BPF specific memory allocator.
>   bpf: Convert hash map to bpf_mem_alloc.
>   selftests/bpf: Improve test coverage of test_maps
>   samples/bpf: Reduce syscall overhead in map_perf_test.
>   bpf: Relax the requirement to use preallocated hash maps in tracing
>     progs.
>   bpf: Optimize element count in non-preallocated hash map.
>   bpf: Optimize call_rcu in non-preallocated hash map.
>   bpf: Adjust low/high watermarks in bpf_mem_cache
>   bpf: Batch call_rcu callbacks instead of SLAB_TYPESAFE_BY_RCU.
>   bpf: Add percpu allocation support to bpf_mem_alloc.
>   bpf: Convert percpu hash map to per-cpu bpf_mem_alloc.
>   bpf: Remove tracing program restriction on map types
>   bpf: Prepare bpf_mem_alloc to be used by sleepable bpf programs.
>   bpf: Remove prealloc-only restriction for sleepable bpf programs.
>   bpf: Introduce sysctl kernel.bpf_force_dyn_alloc.
>
>  include/linux/bpf_mem_alloc.h             |  26 +
>  include/linux/filter.h                    |   2 +
>  kernel/bpf/Makefile                       |   2 +-
>  kernel/bpf/core.c                         |   2 +
>  kernel/bpf/hashtab.c                      | 132 +++--
>  kernel/bpf/memalloc.c                     | 601 ++++++++++++++++++++++
>  kernel/bpf/syscall.c                      |  14 +-
>  kernel/bpf/verifier.c                     |  52 --
>  samples/bpf/map_perf_test_kern.c          |  44 +-
>  samples/bpf/map_perf_test_user.c          |   2 +-
>  tools/testing/selftests/bpf/progs/timer.c |  11 -
>  tools/testing/selftests/bpf/test_maps.c   |  38 +-
>  12 files changed, 795 insertions(+), 131 deletions(-)
>  create mode 100644 include/linux/bpf_mem_alloc.h
>  create mode 100644 kernel/bpf/memalloc.c
>
> --
> 2.30.2
>
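
For anyone skimming the thread, here is a rough userspace sketch of the
idea described in the cover letter: the fast path only pops from or pushes
to a cache of free elements, and the backing allocator is called later
from deferred work once a low or high watermark is crossed. This is not
the code from kernel/bpf/memalloc.c. Every name below (struct cache,
cache_alloc, cache_deferred_work, ...) is made up for illustration, a
plain flag stands in for a queued irq_work, malloc()/free() stand in for
kmalloc()/kfree(), and a single cache stands in for the per-cpu ones.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical types and names; not the kernel's bpf_mem_alloc API. */
struct node {
	struct node *next;
};

struct cache {
	struct node *free_list;   /* singly linked list of free elements */
	int free_cnt;
	int low_wm, high_wm;      /* watermarks that trigger deferred work */
	size_t elem_size;
	bool work_pending;        /* stands in for a queued irq_work */
};

static void cache_init(struct cache *c, size_t elem_size, int low, int high)
{
	c->free_list = NULL;
	c->free_cnt = 0;
	c->low_wm = low;
	c->high_wm = high;
	/* A free element must be able to hold the list link. */
	c->elem_size = elem_size < sizeof(struct node) ?
		       sizeof(struct node) : elem_size;
	c->work_pending = false;
}

/* Fast path: never calls malloc(); may return NULL if the cache is empty. */
static void *cache_alloc(struct cache *c)
{
	struct node *n = c->free_list;

	if (n) {
		c->free_list = n->next;
		c->free_cnt--;
	}
	if (c->free_cnt < c->low_wm)
		c->work_pending = true;   /* request an asynchronous refill */
	return n;
}

/* Fast path: never calls free(); just pushes the element back. */
static void cache_free(struct cache *c, void *p)
{
	struct node *n = p;

	n->next = c->free_list;
	c->free_list = n;
	c->free_cnt++;
	if (c->free_cnt > c->high_wm)
		c->work_pending = true;   /* request an asynchronous shrink */
}

/* Deferred work: runs where calling the backing allocator is safe. */
static void cache_deferred_work(struct cache *c)
{
	int target = (c->low_wm + c->high_wm) / 2;

	while (c->free_cnt < target) {
		struct node *n = malloc(c->elem_size);

		if (!n)
			break;
		n->next = c->free_list;
		c->free_list = n;
		c->free_cnt++;
	}
	while (c->free_cnt > target) {
		struct node *n = c->free_list;

		c->free_list = n->next;
		c->free_cnt--;
		free(n);
	}
	c->work_pending = false;
}

/* Release everything still sitting in the cache. */
static void cache_destroy(struct cache *c)
{
	while (c->free_list) {
		struct node *n = c->free_list;

		c->free_list = n->next;
		free(n);
	}
	c->free_cnt = 0;
}
```

Note that in this sketch an allocation from an empty cache simply fails
and leaves the refill to the deferred work; the midpoint refill target is
also just an assumption made here to keep the example short.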