linux-mm.kvack.org archive mirror
 help / color / mirror / Atom feed
From: Alexei Starovoitov <alexei.starovoitov@gmail.com>
To: davem@davemloft.net
Cc: daniel@iogearbox.net, andrii@kernel.org, tj@kernel.org,
	memxor@gmail.com, delyank@fb.com, linux-mm@kvack.org,
	bpf@vger.kernel.org, kernel-team@fb.com
Subject: [PATCH v5 bpf-next 07/15] bpf: Optimize call_rcu in non-preallocated hash map.
Date: Thu,  1 Sep 2022 09:15:39 -0700	[thread overview]
Message-ID: <20220901161547.57722-8-alexei.starovoitov@gmail.com> (raw)
In-Reply-To: <20220901161547.57722-1-alexei.starovoitov@gmail.com>

From: Alexei Starovoitov <ast@kernel.org>

Doing call_rcu() million times a second becomes a bottle neck.
Convert non-preallocated hash map from call_rcu to SLAB_TYPESAFE_BY_RCU.
The rcu critical section is no longer observed for one htab element
which makes non-preallocated hash map behave just like preallocated hash map.
The map elements are released back to kernel memory after observing
rcu critical section.
This improves 'map_perf_test 4' performance from 100k events per second
to 250k events per second.

bpf_mem_alloc + percpu_counter + typesafe_by_rcu provide 10x performance
boost to non-preallocated hash map and make it within few % of preallocated map
while consuming fraction of memory.

Acked-by: Kumar Kartikeya Dwivedi <memxor@gmail.com>
Acked-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
---
 kernel/bpf/hashtab.c                      |  8 ++++++--
 kernel/bpf/memalloc.c                     |  2 +-
 tools/testing/selftests/bpf/progs/timer.c | 11 -----------
 3 files changed, 7 insertions(+), 14 deletions(-)

diff --git a/kernel/bpf/hashtab.c b/kernel/bpf/hashtab.c
index 36aa16dc43ad..0d888a90a805 100644
--- a/kernel/bpf/hashtab.c
+++ b/kernel/bpf/hashtab.c
@@ -953,8 +953,12 @@ static void free_htab_elem(struct bpf_htab *htab, struct htab_elem *l)
 		__pcpu_freelist_push(&htab->freelist, &l->fnode);
 	} else {
 		dec_elem_count(htab);
-		l->htab = htab;
-		call_rcu(&l->rcu, htab_elem_free_rcu);
+		if (htab->map.map_type == BPF_MAP_TYPE_PERCPU_HASH) {
+			l->htab = htab;
+			call_rcu(&l->rcu, htab_elem_free_rcu);
+		} else {
+			htab_elem_free(htab, l);
+		}
 	}
 }
 
diff --git a/kernel/bpf/memalloc.c b/kernel/bpf/memalloc.c
index 1c46763d855e..da0721f8c28f 100644
--- a/kernel/bpf/memalloc.c
+++ b/kernel/bpf/memalloc.c
@@ -281,7 +281,7 @@ int bpf_mem_alloc_init(struct bpf_mem_alloc *ma, int size)
 			return -ENOMEM;
 		size += LLIST_NODE_SZ; /* room for llist_node */
 		snprintf(buf, sizeof(buf), "bpf-%u", size);
-		kmem_cache = kmem_cache_create(buf, size, 8, 0, NULL);
+		kmem_cache = kmem_cache_create(buf, size, 8, SLAB_TYPESAFE_BY_RCU, NULL);
 		if (!kmem_cache) {
 			free_percpu(pc);
 			return -ENOMEM;
diff --git a/tools/testing/selftests/bpf/progs/timer.c b/tools/testing/selftests/bpf/progs/timer.c
index 5f5309791649..0053c5402173 100644
--- a/tools/testing/selftests/bpf/progs/timer.c
+++ b/tools/testing/selftests/bpf/progs/timer.c
@@ -208,17 +208,6 @@ static int timer_cb2(void *map, int *key, struct hmap_elem *val)
 		 */
 		bpf_map_delete_elem(map, key);
 
-		/* in non-preallocated hashmap both 'key' and 'val' are RCU
-		 * protected and still valid though this element was deleted
-		 * from the map. Arm this timer for ~35 seconds. When callback
-		 * finishes the call_rcu will invoke:
-		 * htab_elem_free_rcu
-		 *   check_and_free_timer
-		 *     bpf_timer_cancel_and_free
-		 * to cancel this 35 second sleep and delete the timer for real.
-		 */
-		if (bpf_timer_start(&val->timer, 1ull << 35, 0) != 0)
-			err |= 256;
 		ok |= 4;
 	}
 	return 0;
-- 
2.30.2



  parent reply	other threads:[~2022-09-01 16:16 UTC|newest]

Thread overview: 16+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2022-09-01 16:15 [PATCH v5 bpf-next 00/15] bpf: BPF specific memory allocator Alexei Starovoitov
2022-09-01 16:15 ` [PATCH v5 bpf-next 01/15] bpf: Introduce any context " Alexei Starovoitov
2022-09-01 16:15 ` [PATCH v5 bpf-next 02/15] bpf: Convert hash map to bpf_mem_alloc Alexei Starovoitov
2022-09-01 16:15 ` [PATCH v5 bpf-next 03/15] selftests/bpf: Improve test coverage of test_maps Alexei Starovoitov
2022-09-01 16:15 ` [PATCH v5 bpf-next 04/15] samples/bpf: Reduce syscall overhead in map_perf_test Alexei Starovoitov
2022-09-01 16:15 ` [PATCH v5 bpf-next 05/15] bpf: Relax the requirement to use preallocated hash maps in tracing progs Alexei Starovoitov
2022-09-01 16:15 ` [PATCH v5 bpf-next 06/15] bpf: Optimize element count in non-preallocated hash map Alexei Starovoitov
2022-09-01 16:15 ` Alexei Starovoitov [this message]
2022-09-01 16:15 ` [PATCH v5 bpf-next 08/15] bpf: Adjust low/high watermarks in bpf_mem_cache Alexei Starovoitov
2022-09-01 16:15 ` [PATCH v5 bpf-next 09/15] bpf: Batch call_rcu callbacks instead of SLAB_TYPESAFE_BY_RCU Alexei Starovoitov
2022-09-01 16:15 ` [PATCH v5 bpf-next 10/15] bpf: Add percpu allocation support to bpf_mem_alloc Alexei Starovoitov
2022-09-01 16:15 ` [PATCH v5 bpf-next 11/15] bpf: Convert percpu hash map to per-cpu bpf_mem_alloc Alexei Starovoitov
2022-09-01 16:15 ` [PATCH v5 bpf-next 12/15] bpf: Remove tracing program restriction on map types Alexei Starovoitov
2022-09-01 16:15 ` [PATCH v5 bpf-next 13/15] bpf: Prepare bpf_mem_alloc to be used by sleepable bpf programs Alexei Starovoitov
2022-09-01 16:15 ` [PATCH v5 bpf-next 14/15] bpf: Remove prealloc-only restriction for " Alexei Starovoitov
2022-09-01 16:15 ` [PATCH v5 bpf-next 15/15] bpf: Remove usage of kmem_cache from bpf_mem_cache Alexei Starovoitov

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20220901161547.57722-8-alexei.starovoitov@gmail.com \
    --to=alexei.starovoitov@gmail.com \
    --cc=andrii@kernel.org \
    --cc=bpf@vger.kernel.org \
    --cc=daniel@iogearbox.net \
    --cc=davem@davemloft.net \
    --cc=delyank@fb.com \
    --cc=kernel-team@fb.com \
    --cc=linux-mm@kvack.org \
    --cc=memxor@gmail.com \
    --cc=tj@kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox