Date: Thu, 7 Nov 2024 08:13:43 -0800
From: "Paul E. McKenney"
To: Andrii Nakryiko
Cc: Breno Leitao, Andrii Nakryiko, linux-trace-kernel@vger.kernel.org,
	peterz@infradead.org, oleg@redhat.com, rostedt@goodmis.org,
	mhiramat@kernel.org, bpf@vger.kernel.org, linux-kernel@vger.kernel.org,
	jolsa@kernel.org, willy@infradead.org, surenb@google.com,
	akpm@linux-foundation.org, linux-mm@kvack.org
Subject: Re: [PATCH v5 4/8] uprobes: travers uprobe's consumer list locklessly under SRCU protection
Message-ID: <4d034c81-34cd-480b-bab9-6645204fc713@paulmck-laptop>
Reply-To: paulmck@kernel.org
References: <20240903174603.3554182-1-andrii@kernel.org>
	<20240903174603.3554182-5-andrii@kernel.org>
	<20241106-transparent-athletic-ammonite-586af8@leitao>
	<20241107-uncovered-swinging-bull-1e812e@leitao>

On Thu, Nov 07, 2024 at 08:01:05AM
-0800, Andrii Nakryiko wrote:
> On Thu, Nov 7, 2024 at 3:35 AM Breno Leitao wrote:
> >
> > Hello Andrii,
> >
> > On Wed, Nov 06, 2024 at 08:25:25AM -0800, Andrii Nakryiko wrote:
> > > On Wed, Nov 6, 2024 at 4:03 AM Breno Leitao wrote:
> > > > On Tue, Sep 03, 2024 at 10:45:59AM -0700, Andrii Nakryiko wrote:
> > > > > uprobe->register_rwsem is one of a few big bottlenecks to scalability of
> > > > > uprobes, so we need to get rid of it to improve uprobe performance and
> > > > > multi-CPU scalability.
> > > > >
> > > > > First, we turn uprobe's consumer list to a typical doubly-linked list
> > > > > and utilize existing RCU-aware helpers for traversing such lists, as
> > > > > well as adding and removing elements from it.
> > > > >
> > > > > For entry uprobes we already have SRCU protection active since before
> > > > > uprobe lookup. For uretprobe we keep refcount, guaranteeing that uprobe
> > > > > won't go away from under us, but we add SRCU protection around consumer
> > > > > list traversal.
> > > >
> > > > I am seeing the following message in a kernel with RCU_PROVE_LOCKING:
> > > >
> > > >     kernel/events/uprobes.c:937 RCU-list traversed without holding the required lock!!
> > > >
> > > > It seems the SRCU is not held, when coming from mmap_region ->
> > > > uprobe_mmap. Here is the message I got in my debug kernel. (sorry for
> > > > not decoding it, but, the stack trace is clear enough).
> > > >
> > > >     WARNING: suspicious RCU usage
> > > >     6.12.0-rc5-kbuilder-01152-gc688a96c432e #26 Tainted: G W E N
> > > >     -----------------------------
> > > >     kernel/events/uprobes.c:938 RCU-list traversed without holding the required lock!!
> > > >
> > > >     other info that might help us debug this:
> > > >
> > > >     rcu_scheduler_active = 2, debug_locks = 1
> > > >     3 locks held by env/441330:
> > > >      #0: ffff00021c1bc508 (&mm->mmap_lock){++++}-{3:3}, at: vm_mmap_pgoff+0x84/0x1d0
> > > >      #1: ffff800089f3ab48 (&uprobes_mmap_mutex[i]){+.+.}-{3:3}, at: uprobe_mmap+0x20c/0x548
> > > >      #2: ffff0004e564c528 (&uprobe->consumer_rwsem){++++}-{3:3}, at: filter_chain+0x30/0xe8
> > > >
> > > >     stack backtrace:
> > > >     CPU: 4 UID: 34133 PID: 441330 Comm: env Kdump: loaded Tainted: G W E N 6.12.0-rc5-kbuilder-01152-gc688a96c432e #26
> > > >     Tainted: [W]=WARN, [E]=UNSIGNED_MODULE, [N]=TEST
> > > >     Hardware name: Quanta S7GM 20S7GCU0010/S7G MB (CG1), BIOS 3D22 07/03/2024
> > > >     Call trace:
> > > >      dump_backtrace+0x10c/0x198
> > > >      show_stack+0x24/0x38
> > > >      __dump_stack+0x28/0x38
> > > >      dump_stack_lvl+0x74/0xa8
> > > >      dump_stack+0x18/0x28
> > > >      lockdep_rcu_suspicious+0x178/0x2c8
> > > >      filter_chain+0xdc/0xe8
> > > >      uprobe_mmap+0x2e0/0x548
> > > >      mmap_region+0x510/0x988
> > > >      do_mmap+0x444/0x528
> > > >      vm_mmap_pgoff+0xf8/0x1d0
> > > >      ksys_mmap_pgoff+0x184/0x2d8
> > > >
> > > >
> > > > That said, it seems we want to hold the SRCU, before reaching the
> > > > filter_chain(). I hacked a bit, and adding the lock in uprobe_mmap()
> > > > solves the problem, but, I might be missing something, since I am not familiar
> > > > with this code.
> > > >
> > > > How does the following patch look like?
> > > >
> > > > commit 1bd7bcf03031ceca86fdddd8be2e5500497db29f
> > > > Author: Breno Leitao
> > > > Date: Mon Nov 4 06:53:31 2024 -0800
> > > >
> > > >     uprobes: Get SRCU lock before traverseing the list
> > > >
> > > >     list_for_each_entry_srcu() is being called without holding the lock,
> > > >     which causes LOCKDEP (when enabled with RCU_PROVING) to complain such
> > > >     as:
> > > >
> > > >     kernel/events/uprobes.c:937 RCU-list traversed without holding the required lock!!
> > > >
> > > >     Get the SRCU uprobes_srcu lock before calling filter_chain(), which
> > > >     needs to have the SRCU lock hold, since it is going to call
> > > >     list_for_each_entry_srcu().
> > > >
> > > >     Signed-off-by: Breno Leitao
> > > >     Fixes: cc01bd044e6a ("uprobes: travers uprobe's consumer list locklessly under SRCU protection")
> > > >
> > > > diff --git a/kernel/events/uprobes.c b/kernel/events/uprobes.c
> > > > index 4b52cb2ae6d62..cc9d4ddeea9a6 100644
> > > > --- a/kernel/events/uprobes.c
> > > > +++ b/kernel/events/uprobes.c
> > > > @@ -1391,6 +1391,7 @@ int uprobe_mmap(struct vm_area_struct *vma)
> > > >  	struct list_head tmp_list;
> > > >  	struct uprobe *uprobe, *u;
> > > >  	struct inode *inode;
> > > > +	int srcu_idx;
> > > >
> > > >  	if (no_uprobe_events())
> > > >  		return 0;
> > > > @@ -1409,6 +1410,7 @@ int uprobe_mmap(struct vm_area_struct *vma)
> > > >
> > > >  	mutex_lock(uprobes_mmap_hash(inode));
> > > >  	build_probe_list(inode, vma, vma->vm_start, vma->vm_end, &tmp_list);
> > > > +	srcu_idx = srcu_read_lock(&uprobes_srcu);
> > >
> > > Thanks for catching that (production testing FTW, right?!).
> >
> > Correct. I am running some hosts with RCU_PROVING and I am finding some
> > cases where RCU protected areas are touched without holding the RCU read
> > lock.
> >
> > > But I think you a) adding wrong RCU protection flavor (it has to be
> > > rcu_read_lock_trace()/rcu_read_unlock_trace(), see uprobe_apply() for
> > > an example) and b) I think this is the wrong place to add it. We
> > > should add it inside filter_chain(). filter_chain() is called from
> > > three places, only one of which is already RCU protected (that's the
> > > handler_chain() case). But there is also register_for_each_vma(),
> > > which needs RCU protection as well.
> >
> > Thanks for the guidance!
> >
> > My initial plan was to protect filter_chain(), but, handler_chain()
> > already has the lock. Is it OK to get into a critical section in a
> > nested form?
> >
> > The code will be something like:
> >
> > handle_swbp() {
> >     rcu_read_lock_trace();
> >     handler_chain() {
> >         filter_chain() {
> >             rcu_read_lock_trace();
> >             list_for_each_entry_rcu()
> >             rcu_read_lock_trace();
> >         }
> >     }
> >     rcu_read_lock_trace();
> > }
> >
> > Is this nested locking fine?
>
> Yes, it's totally fine to nest RCU lock regions.

As long as you don't nest them more than 255 deep in CONFIG_PREEMPT=n
kernels that also have CONFIG_PREEMPT_COUNT=y, or more than 2G deep in
CONFIG_PREEMPT=y kernels.  For a limited time only, in CONFIG_PREEMPT=n
kernels that also have CONFIG_PREEMPT_COUNT=n, you can nest as deeply
as you want.  ;-)

Sorry, couldn't resist...

							Thanx, Paul