References: <20240903174603.3554182-1-andrii@kernel.org> <20240903174603.3554182-5-andrii@kernel.org> <20241106-transparent-athletic-ammonite-586af8@leitao> <20241107-uncovered-swinging-bull-1e812e@leitao>
In-Reply-To: <20241107-uncovered-swinging-bull-1e812e@leitao>
From: Andrii Nakryiko
Date: Thu, 7 Nov 2024 08:01:05 -0800
Subject: Re: [PATCH v5 4/8] uprobes: travers uprobe's consumer list locklessly under SRCU protection
To: Breno Leitao
Cc: Andrii Nakryiko, linux-trace-kernel@vger.kernel.org, peterz@infradead.org, oleg@redhat.com, rostedt@goodmis.org, mhiramat@kernel.org, bpf@vger.kernel.org, linux-kernel@vger.kernel.org, jolsa@kernel.org, paulmck@kernel.org, willy@infradead.org, surenb@google.com, akpm@linux-foundation.org, linux-mm@kvack.org

On Thu, Nov 7, 2024 at 3:35 AM Breno Leitao wrote:
>
> Hello Andrii,
>
> On Wed, Nov 06, 2024 at 08:25:25AM -0800, Andrii Nakryiko wrote:
> > On Wed, Nov 6, 2024 at 4:03 AM Breno Leitao wrote:
> > > On Tue, Sep 03, 2024 at 10:45:59AM -0700, Andrii Nakryiko wrote:
> > > > uprobe->register_rwsem is one of a few big bottlenecks to scalability of
> > > > uprobes, so we need to get rid of it to improve uprobe performance and
> > > > multi-CPU scalability.
> > > >
> > > > First, we turn uprobe's consumer list to a typical doubly-linked list
> > > > and utilize existing RCU-aware helpers for traversing such lists, as
> > > > well as adding and removing elements from it.
> > > >
> > > > For entry uprobes we already have SRCU protection active since before
> > > > uprobe lookup. For uretprobe we keep refcount, guaranteeing that uprobe
> > > > won't go away from under us, but we add SRCU protection around consumer
> > > > list traversal.
> > >
> > > I am seeing the following message in a kernel with RCU_PROVE_LOCKING:
> > >
> > >     kernel/events/uprobes.c:937 RCU-list traversed without holding the required lock!!
> > >
> > > It seems the SRCU is not held, when coming from mmap_region ->
> > > uprobe_mmap. Here is the message I got in my debug kernel. (sorry for
> > > not decoding it, but, the stack trace is clear enough).
> > >
> > >     WARNING: suspicious RCU usage
> > >     6.12.0-rc5-kbuilder-01152-gc688a96c432e #26 Tainted: G W E N
> > >     -----------------------------
> > >     kernel/events/uprobes.c:938 RCU-list traversed without holding the required lock!!
> > >
> > >     other info that might help us debug this:
> > >
> > >     rcu_scheduler_active = 2, debug_locks = 1
> > >     3 locks held by env/441330:
> > >      #0: ffff00021c1bc508 (&mm->mmap_lock){++++}-{3:3}, at: vm_mmap_pgoff+0x84/0x1d0
> > >      #1: ffff800089f3ab48 (&uprobes_mmap_mutex[i]){+.+.}-{3:3}, at: uprobe_mmap+0x20c/0x548
> > >      #2: ffff0004e564c528 (&uprobe->consumer_rwsem){++++}-{3:3}, at: filter_chain+0x30/0xe8
> > >
> > >     stack backtrace:
> > >     CPU: 4 UID: 34133 PID: 441330 Comm: env Kdump: loaded Tainted: G W E N 6.12.0-rc5-kbuilder-01152-gc688a96c432e #26
> > >     Tainted: [W]=WARN, [E]=UNSIGNED_MODULE, [N]=TEST
> > >     Hardware name: Quanta S7GM 20S7GCU0010/S7G MB (CG1), BIOS 3D22 07/03/2024
> > >     Call trace:
> > >      dump_backtrace+0x10c/0x198
> > >      show_stack+0x24/0x38
> > >      __dump_stack+0x28/0x38
> > >      dump_stack_lvl+0x74/0xa8
> > >      dump_stack+0x18/0x28
> > >      lockdep_rcu_suspicious+0x178/0x2c8
> > >      filter_chain+0xdc/0xe8
> > >      uprobe_mmap+0x2e0/0x548
> > >      mmap_region+0x510/0x988
> > >      do_mmap+0x444/0x528
> > >      vm_mmap_pgoff+0xf8/0x1d0
> > >      ksys_mmap_pgoff+0x184/0x2d8
> > >
> > >
> > > That said, it seems we want to hold the SRCU before reaching
> > > filter_chain(). I hacked a bit, and adding the lock in uprobe_mmap()
> > > solves the problem, but I might be missing something, since I am not familiar
> > > with this code.
> > >
> > > How does the following patch look?
> > >
> > > commit 1bd7bcf03031ceca86fdddd8be2e5500497db29f
> > > Author: Breno Leitao
> > > Date:   Mon Nov 4 06:53:31 2024 -0800
> > >
> > >     uprobes: Get SRCU lock before traversing the list
> > >
> > >     list_for_each_entry_srcu() is being called without holding the lock,
> > >     which causes LOCKDEP (when enabled with RCU_PROVING) to complain such
> > >     as:
> > >
> > >         kernel/events/uprobes.c:937 RCU-list traversed without holding the required lock!!
> > >
> > >     Get the SRCU uprobes_srcu lock before calling filter_chain(), which
> > >     needs to have the SRCU lock held, since it is going to call
> > >     list_for_each_entry_srcu().
> > >
> > >     Signed-off-by: Breno Leitao
> > >     Fixes: cc01bd044e6a ("uprobes: travers uprobe's consumer list locklessly under SRCU protection")
> > >
> > > diff --git a/kernel/events/uprobes.c b/kernel/events/uprobes.c
> > > index 4b52cb2ae6d62..cc9d4ddeea9a6 100644
> > > --- a/kernel/events/uprobes.c
> > > +++ b/kernel/events/uprobes.c
> > > @@ -1391,6 +1391,7 @@ int uprobe_mmap(struct vm_area_struct *vma)
> > >         struct list_head tmp_list;
> > >         struct uprobe *uprobe, *u;
> > >         struct inode *inode;
> > > +       int srcu_idx;
> > >
> > >         if (no_uprobe_events())
> > >                 return 0;
> > > @@ -1409,6 +1410,7 @@ int uprobe_mmap(struct vm_area_struct *vma)
> > >
> > >         mutex_lock(uprobes_mmap_hash(inode));
> > >         build_probe_list(inode, vma, vma->vm_start, vma->vm_end, &tmp_list);
> > > +       srcu_idx = srcu_read_lock(&uprobes_srcu);
> >
> > Thanks for catching that (production testing FTW, right?!).
>
> Correct. I am running some hosts with RCU_PROVING and I am finding some
> cases where RCU protected areas are touched without holding the RCU read
> lock.
>
> > But I think you a) are adding the wrong RCU protection flavor (it has to be
> > rcu_read_lock_trace()/rcu_read_unlock_trace(), see uprobe_apply() for
> > an example) and b) I think this is the wrong place to add it. We
> > should add it inside filter_chain(). filter_chain() is called from
> > three places, only one of which is already RCU protected (that's the
> > handler_chain() case). But there is also register_for_each_vma(),
> > which needs RCU protection as well.
>
> Thanks for the guidance!
>
> My initial plan was to protect filter_chain(), but handler_chain()
> already has the lock. Is it OK to get into a critical section in a
> nested form?
>
> The code will be something like:
>
>         handle_swbp() {
>                 rcu_read_lock_trace();
>                 handler_chain() {
>                         filter_chain() {
>                                 rcu_read_lock_trace();
>                                 list_for_each_entry_rcu()
>                                 rcu_read_unlock_trace();
>                         }
>                 }
>                 rcu_read_unlock_trace();
>         }
>
> Is this nested locking fine?
>

Yes, it's totally fine to nest RCU lock regions.

> Thanks
> --breno
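
To illustrate the nesting answer above, here is a minimal userspace sketch of
a read-side critical section that tolerates nesting by keeping a depth
counter. It is only a model under stated assumptions, not kernel code: every
model_* name is hypothetical, a plain static counter stands in for per-task
state, and nothing here claims to match how rcu_read_lock_trace() is actually
implemented. The point is only that the inner function can take its own
protection whether or not the caller already holds it.

```c
#include <assert.h>
#include <stdbool.h>

/*
 * Hypothetical userspace model of a nestable read-side critical section.
 * A real implementation would keep this depth per task or per thread.
 */
static int model_reader_nesting;

static void model_read_lock(void)
{
	model_reader_nesting++;			/* enter, possibly nested */
}

static void model_read_unlock(void)
{
	assert(model_reader_nesting > 0);	/* unbalanced unlock is a bug */
	model_reader_nesting--;
}

static bool model_read_lock_held(void)
{
	return model_reader_nesting > 0;
}

/* Stand-in for filter_chain(): always takes its own protection. */
static bool model_filter_chain(void)
{
	bool held;

	model_read_lock();
	held = model_read_lock_held();		/* list traversal would go here */
	model_read_unlock();
	return held;
}

/* Stand-in for a caller that already holds the lock (handler_chain() path). */
static bool model_protected_caller(void)
{
	bool ok;

	model_read_lock();
	ok = model_filter_chain();		/* nested lock/unlock inside */
	ok = ok && model_read_lock_held();	/* outer section is still active */
	model_read_unlock();
	return ok;
}

/* Stand-in for a caller without protection (uprobe_mmap() path). */
static bool model_unprotected_caller(void)
{
	return model_filter_chain();
}
```

Both model_protected_caller() and model_unprotected_caller() see the lock as
held during the traversal step, and the depth returns to zero once each one
finishes, which is why putting the lock inside the callee covers all callers.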