From: Andrii Nakryiko
Date: Wed, 6 Nov 2024 08:25:25 -0800
Subject: Re: [PATCH v5 4/8] uprobes: travers uprobe's consumer list locklessly under SRCU protection
To: Breno Leitao
Cc: Andrii Nakryiko, linux-trace-kernel@vger.kernel.org,
    peterz@infradead.org, oleg@redhat.com, rostedt@goodmis.org,
    mhiramat@kernel.org, bpf@vger.kernel.org, linux-kernel@vger.kernel.org,
    jolsa@kernel.org, paulmck@kernel.org, willy@infradead.org,
    surenb@google.com, akpm@linux-foundation.org, linux-mm@kvack.org
In-Reply-To: <20241106-transparent-athletic-ammonite-586af8@leitao>
References: <20240903174603.3554182-1-andrii@kernel.org>
    <20240903174603.3554182-5-andrii@kernel.org>
    <20241106-transparent-athletic-ammonite-586af8@leitao>

On Wed, Nov 6, 2024 at 4:03 AM Breno Leitao wrote:
>
> Hello Andrii,
>
> On Tue, Sep 03, 2024 at 10:45:59AM -0700, Andrii Nakryiko wrote:
> > uprobe->register_rwsem is one of a few big bottlenecks to scalability of
> > uprobes, so we need to get rid of it to improve uprobe performance and
> > multi-CPU scalability.
> >
> > First, we turn uprobe's consumer list to a typical doubly-linked list
> > and utilize existing RCU-aware helpers for traversing such lists, as
> > well as adding and removing elements from it.
> >
> > For entry uprobes we already have SRCU protection active since before
> > uprobe lookup. For uretprobe we keep refcount, guaranteeing that uprobe
> > won't go away from under us, but we add SRCU protection around consumer
> > list traversal.
>
> I am seeing the following message in a kernel with RCU_PROVE_LOCKING:
>
>         kernel/events/uprobes.c:937 RCU-list traversed without holding the required lock!!
>
> It seems the SRCU is not held, when coming from mmap_region ->
> uprobe_mmap. Here is the message I got in my debug kernel. (sorry for
> not decoding it, but, the stack trace is clear enough).
>
>         WARNING: suspicious RCU usage
>         6.12.0-rc5-kbuilder-01152-gc688a96c432e #26 Tainted: G W E N
>         -----------------------------
>         kernel/events/uprobes.c:938 RCU-list traversed without holding the required lock!!
>
>         other info that might help us debug this:
>
>         rcu_scheduler_active = 2, debug_locks = 1
>         3 locks held by env/441330:
>          #0: ffff00021c1bc508 (&mm->mmap_lock){++++}-{3:3}, at: vm_mmap_pgoff+0x84/0x1d0
>          #1: ffff800089f3ab48 (&uprobes_mmap_mutex[i]){+.+.}-{3:3}, at: uprobe_mmap+0x20c/0x548
>          #2: ffff0004e564c528 (&uprobe->consumer_rwsem){++++}-{3:3}, at: filter_chain+0x30/0xe8
>
>         stack backtrace:
>         CPU: 4 UID: 34133 PID: 441330 Comm: env Kdump: loaded Tainted: G W E N 6.12.0-rc5-kbuilder-01152-gc688a96c432e #26
>         Tainted: [W]=WARN, [E]=UNSIGNED_MODULE, [N]=TEST
>         Hardware name: Quanta S7GM 20S7GCU0010/S7G MB (CG1), BIOS 3D22 07/03/2024
>         Call trace:
>          dump_backtrace+0x10c/0x198
>          show_stack+0x24/0x38
>          __dump_stack+0x28/0x38
>          dump_stack_lvl+0x74/0xa8
>          dump_stack+0x18/0x28
>          lockdep_rcu_suspicious+0x178/0x2c8
>          filter_chain+0xdc/0xe8
>          uprobe_mmap+0x2e0/0x548
>          mmap_region+0x510/0x988
>          do_mmap+0x444/0x528
>          vm_mmap_pgoff+0xf8/0x1d0
>          ksys_mmap_pgoff+0x184/0x2d8
>
>
> That said, it seems we want to hold the SRCU, before reaching the
> filter_chain(). I hacked a bit, and adding the lock in uprobe_mmap()
> solves the problem, but, I might be missing something, since I am not familiar
> with this code.
>
> How does the following patch look like?
>
> commit 1bd7bcf03031ceca86fdddd8be2e5500497db29f
> Author: Breno Leitao
> Date:   Mon Nov 4 06:53:31 2024 -0800
>
>     uprobes: Get SRCU lock before traverseing the list
>
>     list_for_each_entry_srcu() is being called without holding the lock,
>     which causes LOCKDEP (when enabled with RCU_PROVING) to complain such
>     as:
>
>             kernel/events/uprobes.c:937 RCU-list traversed without holding the required lock!!
>
>     Get the SRCU uprobes_srcu lock before calling filter_chain(), which
>     needs to have the SRCU lock hold, since it is going to call
>     list_for_each_entry_srcu().
>
>     Signed-off-by: Breno Leitao
>     Fixes: cc01bd044e6a ("uprobes: travers uprobe's consumer list locklessly under SRCU protection")
>
> diff --git a/kernel/events/uprobes.c b/kernel/events/uprobes.c
> index 4b52cb2ae6d62..cc9d4ddeea9a6 100644
> --- a/kernel/events/uprobes.c
> +++ b/kernel/events/uprobes.c
> @@ -1391,6 +1391,7 @@ int uprobe_mmap(struct vm_area_struct *vma)
>         struct list_head tmp_list;
>         struct uprobe *uprobe, *u;
>         struct inode *inode;
> +       int srcu_idx;
>
>         if (no_uprobe_events())
>                 return 0;
> @@ -1409,6 +1410,7 @@ int uprobe_mmap(struct vm_area_struct *vma)
>
>         mutex_lock(uprobes_mmap_hash(inode));
>         build_probe_list(inode, vma, vma->vm_start, vma->vm_end, &tmp_list);
> +       srcu_idx = srcu_read_lock(&uprobes_srcu);

Hey Breno,

Thanks for catching that (production testing FTW, right?!).

But I think you are a) adding the wrong RCU protection flavor (it has to
be rcu_read_lock_trace()/rcu_read_unlock_trace(), see uprobe_apply() for
an example) and b) adding it in the wrong place. We should add it
inside filter_chain(). filter_chain() is called from three places,
only one of which is already RCU protected (that's the handler_chain()
case). But there is also register_for_each_vma(), which needs RCU
protection as well.

So can you resend the patch as a stand-alone patch, switch to the RCU
Tasks Trace flavor, and add the protection inside filter_chain()?

Thank you!

P.S. The pending_list traversal that you (accidentally) protect as well
in your patch doesn't need RCU protection, so there is no problem with
moving the RCU protection into filter_chain().

>         /*
>          * We can race with uprobe_unregister(), this uprobe can be already
>          * removed. But in this case filter_chain() must return false, all
> @@ -1422,6 +1424,7 @@ int uprobe_mmap(struct vm_area_struct *vma)
>                 }
>                 put_uprobe(uprobe);
>         }
> +       srcu_read_unlock(&uprobes_srcu, srcu_idx);
>         mutex_unlock(uprobes_mmap_hash(inode));
>
>         return 0;
>
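
To make the suggestion in the reply above concrete, here is a rough
userspace sketch of the pattern: take the read-side lock inside
filter_chain() itself, so that every caller is covered. This is not the
kernel patch and not kernel code. The rcu_read_lock_trace(),
rcu_read_unlock_trace() and rcu_read_lock_trace_held() functions below
are stubs that only share their names with the kernel API, the structs
are heavily simplified (a singly linked list, no mm argument), and
accept_all(), reject_all() and unprotected_walks are hypothetical
helpers added purely for illustration.

```c
/* Userspace mock for illustration only; this is NOT kernel code. */
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Stub: nesting depth of the mock RCU Tasks Trace read-side section. */
static int trace_nesting;

static void rcu_read_lock_trace(void) { trace_nesting++; }

static void rcu_read_unlock_trace(void)
{
	assert(trace_nesting > 0);	/* every unlock must pair with a lock */
	trace_nesting--;
}

static bool rcu_read_lock_trace_held(void) { return trace_nesting > 0; }

/* Simplified consumer: singly linked, filter takes no mm argument. */
struct uprobe_consumer {
	struct uprobe_consumer *next;
	bool (*filter)(struct uprobe_consumer *uc);
};

struct uprobe {
	struct uprobe_consumer *consumers;
};

/* Counts list steps taken outside a read-side section (what lockdep flags). */
static int unprotected_walks;

/* Hypothetical filters, used only to exercise the chain. */
static bool accept_all(struct uprobe_consumer *uc) { (void)uc; return true; }
static bool reject_all(struct uprobe_consumer *uc) { (void)uc; return false; }

/*
 * The read-side section is entered inside filter_chain(), so callers
 * do not need to know about the locking requirement.
 */
static bool filter_chain(struct uprobe *uprobe)
{
	struct uprobe_consumer *uc;
	bool ret = false;

	rcu_read_lock_trace();
	for (uc = uprobe->consumers; uc; uc = uc->next) {
		if (!rcu_read_lock_trace_held())
			unprotected_walks++;
		/* In this mock, a consumer without a filter accepts everything. */
		ret = uc->filter ? uc->filter(uc) : true;
		if (ret)
			break;
	}
	rcu_read_unlock_trace();

	return ret;
}
```

With a list made of a reject_all consumer followed by an accept_all
consumer, filter_chain() returns true and unprotected_walks stays at 0,
no matter which caller invoked it. In the real kernel the equivalent
check is done by lockdep rather than by a counter.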