From: Andrii Nakryiko
Date: Wed, 20 Nov 2024 07:40:15 -0800
Subject: Re: [PATCH v4 tip/perf/core 0/4] uprobes,mm: speculative lockless VMA-to-uprobe lookup
To: Linus Torvalds, Ingo Molnar
Cc: linux-trace-kernel@vger.kernel.org, linux-mm@kvack.org, akpm@linux-foundation.org, peterz@infradead.org, oleg@redhat.com, rostedt@goodmis.org, mhiramat@kernel.org, bpf@vger.kernel.org, linux-kernel@vger.kernel.org, jolsa@kernel.org, paulmck@kernel.org, willy@infradead.org, surenb@google.com, mjguzik@gmail.com, brauner@kernel.org, jannh@google.com, mhocko@kernel.org, Andrii Nakryiko, vbabka@suse.cz, shakeel.butt@linux.dev, hannes@cmpxchg.org, Liam.Howlett@oracle.com, lorenzo.stoakes@oracle.com, david@redhat.com, arnd@arndb.de, richard.weiyang@gmail.com, zhangpeng.00@bytedance.com, linmiaohe@huawei.com, viro@zeniv.linux.org.uk, hca@linux.ibm.com, Mark Rutland, Will Deacon, linux-arm-kernel, Catalin Marinas, Kernel Team
References: <20241028010818.2487581-1-andrii@kernel.org>

Linus,

I'm not sure what's going on here, this patch set seems to be in some
sort of "ignore list" on Peter's side, with no indication of its fate. I'd
really like this change to go into the new release with the rest of the
uprobe improvements that happened this cycle, as they all nicely complement
each other.

This patch set has been done-done since Oct 24, when Suren sent the final
version of the mm-side changes ([0]), which I subsequently resent as part of
this mm+uprobe patch set on Oct 27, after coordinating with Andrew Morton
that this would go through the uprobe subsystem ([1]). The uprobe part has
been effectively unchanged since this summer, when this speculative uprobe
lookup logic was posted as part of an earlier RFC series ([2]). That's just
to say that this has been thoroughly reviewed, discussed, and stress-tested
in the meantime, and I see no reason to delay landing it for so long.

I've even written a separate overview email with a summary of all the
uprobe-related work and how it all fits together ([3]), realizing that there
are a few seemingly independent email threads and patch sets, trying to
engage the maintainers involved. The outcome was:

  - two patch sets did land (uretprobe + SRCU and Jiri's uprobe session
    prerequisites) after a bunch of extra pings, but that's at least
    something;
  - Liao's siglock optimization ([4]) still hasn't landed, with no
    explanation of the delay;
  - this patch set has also been stuck in limbo for weeks now;
  - there was little engagement on the arm64 front for Liao's optimization
    of uprobes on STP instructions ([5]), which is perhaps a separate topic
    for another email, but is just another instance of maintainers not
    engaging in a timely fashion.

In short, I hope to get your help with the next steps. What can I do to help
land this patch set (and hopefully also the others I mentioned above)?

More broadly, what should contributors' expectations be on the timeliness of
maintainers' engagement? A maintainer record in MAINTAINERS can't be just a
veto power, right? It is also a responsibility to others to move kernel
development along.
I'd like to understand what you think is reasonable to expect here. The same
question applies to patch handling (applying, reviewing, rejecting, etc.)
latency.

Thank you!

  [0] https://lore.kernel.org/linux-mm/20241024205231.1944747-1-surenb@google.com/
  [1] https://lore.kernel.org/linux-mm/20241028204822.6638f330fad809381eafb49c@linux-foundation.org/
  [2] https://lore.kernel.org/linux-trace-kernel/20240813042917.506057-14-andrii@kernel.org/
  [3] https://lore.kernel.org/linux-trace-kernel/CAEf4BzY-0Eu27jyT_s2kRO1UuUPOkE9_SRrBOqu2gJfmxsv+3A@mail.gmail.com/
  [4] https://lore.kernel.org/linux-trace-kernel/CAEf4BzarhiBHAQXECJzP5e-z0fbSaTpfQNPaSXwdgErz2f0vUA@mail.gmail.com/
  [5] https://lore.kernel.org/linux-trace-kernel/CAEf4BzZ3trjMWjvWX4Zy1GzW5RN1ihXZSnLZax7V-mCzAUg2cg@mail.gmail.com/
  [6] https://lore.kernel.org/all/172074397710.247544.17045299807723238107.stgit@devnote2/

On Mon, Nov 11, 2024 at 9:26 AM Andrii Nakryiko wrote:
>
> On Tue, Nov 5, 2024 at 6:01 PM Andrii Nakryiko wrote:
> >
> > On Sun, Oct 27, 2024 at 6:09 PM Andrii Nakryiko wrote:
> > >
> > > Implement speculative (lockless) resolution of VMA to inode to uprobe,
> > > bypassing the need to take mmap_lock for reads, if possible. First two patches
> > > by Suren adds mm_struct helpers that help detect whether mm_struct was
> > > changed, which is used by uprobe logic to validate that speculative results
> > > can be trusted after all the lookup logic results in a valid uprobe instance.
> > >
> > > Patch #3 is a simplification to uprobe VMA flag checking, suggested by Oleg.
> > >
> > > And, finally, patch #4 is the speculative VMA-to-uprobe resolution logic
> > > itself, and is the focal point of this patch set. It makes entry uprobes in
> > > common case scale very well with number of CPUs, as we avoid any locking or
> > > cache line bouncing between CPUs. See corresponding patch for details and
> > > benchmarking results.
> > >
> > > Note, this patch set assumes that FMODE_BACKING files were switched to have
> > > SLAB_TYPE_SAFE_BY_RCU semantics, which was recently done by Christian Brauner
> > > in [0]. This change can be pulled into perf/core through stable
> > > tags/vfs-6.13.for-bpf.file tag from [1].
> > >
> > >   [0] https://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs.git/commit/?h=vfs-6.13.for-bpf.file&id=8b1bc2590af61129b82a189e9dc7c2804c34400e
> > >   [1] git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs.git
> > >
> > > v3->v4:
> > >   - rebased and dropped data_race(), given mm_struct uses real seqcount (Peter);
> > > v2->v3:
> > >   - dropped kfree_rcu() patch (Christian);
> > >   - added data_race() annotations for fields of vma and vma->vm_file which could
> > >     be modified during speculative lookup (Oleg);
> > >   - fixed int->long problem in stubs for mmap_lock_speculation_{start,end}(),
> > >     caught by Kernel test robot;
> > > v1->v2:
> > >   - adjusted vma_end_write_all() comment to point out it should never be called
> > >     manually now, but I wasn't sure how ACQUIRE/RELEASE comments should be
> > >     reworded (previously requested by Jann), so I'd appreciate some help there
> > >     (Jann);
> > >   - int -> long change for mm_lock_seq, as agreed at LPC2024 (Jann, Suren, Liam);
> > >   - kfree_rcu_mightsleep() for FMODE_BACKING (Suren, Christian);
> > >   - vm_flags simplification in find_active_uprobe_rcu() and
> > >     find_active_uprobe_speculative() (Oleg);
> > >   - guard(rcu)() simplified find_active_uprobe_speculative() implementation.
> > >
> > > Andrii Nakryiko (2):
> > >   uprobes: simplify find_active_uprobe_rcu() VMA checks
> > >   uprobes: add speculative lockless VMA-to-inode-to-uprobe resolution
> > >
> > > Suren Baghdasaryan (2):
> > >   mm: Convert mm_lock_seq to a proper seqcount
> > >   mm: Introduce mmap_lock_speculation_{begin|end}
> > >
> > >  include/linux/mm.h               | 12 ++---
> > >  include/linux/mm_types.h         |  7 ++-
> > >  include/linux/mmap_lock.h        | 87 ++++++++++++++++++++++++--------
> > >  kernel/events/uprobes.c          | 47 ++++++++++++++++-
> > >  kernel/fork.c                    |  5 +-
> > >  mm/init-mm.c                     |  2 +-
> > >  tools/testing/vma/vma.c          |  4 +-
> > >  tools/testing/vma/vma_internal.h |  4 +-
> > >  8 files changed, 129 insertions(+), 39 deletions(-)
> > >
> > > --
> > > 2.43.5
> > >
> >
> > Hi!
> >
> > What's the status of this patch set? Are there any blockers for it to
> > be applied to perf/core? MM folks are OK with landing the first two
> > patches in perf/core, so hopefully we should be good to go?
>
> Another week, another ping. Peter, what can I do to make this land? MM
> parts are clearly ok with Andrew Morton, uprobe-side logic didn't
> change (modulo inconsequential data_race() back and forth) since at
> least August, was approved by Oleg, and seems to be very stable in
> testing. I think it's time to let me forget about this patch set and
> make actual use of it in production, please.
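
For anyone skimming the thread without the patches at hand, the general
pattern the quoted cover letter describes (read without the lock, then use a
sequence count to check that nothing changed) can be sketched in userspace C
roughly as below. This is only an illustration under stated assumptions, not
the kernel code: struct fake_mm, spec_begin(), spec_end(), writer_update()
and lookup() are hypothetical names made up for this sketch, a trivial
spinlock stands in for mmap_lock, and C11 atomics with default (sequentially
consistent) ordering stand in for the kernel's seqcount primitives.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for mm_struct; NOT the kernel's layout. */
struct fake_mm {
    atomic_flag lock;   /* plays the role of mmap_lock (simple spinlock) */
    atomic_ulong seq;   /* sequence count: odd while a writer is active */
    atomic_long value;  /* the data a reader wants to look up */
};

#define FAKE_MM_INIT { ATOMIC_FLAG_INIT, 0, 0 }

static void fake_lock(struct fake_mm *mm)
{
    while (atomic_flag_test_and_set(&mm->lock))
        ; /* spin until the flag was previously clear */
}

static void fake_unlock(struct fake_mm *mm)
{
    atomic_flag_clear(&mm->lock);
}

/* Writer: bump seq before and after the update, under the lock. */
static void writer_update(struct fake_mm *mm, long v)
{
    fake_lock(mm);
    assert((atomic_load(&mm->seq) & 1) == 0); /* no writer mid-update */
    atomic_fetch_add(&mm->seq, 1);            /* seq becomes odd */
    atomic_store(&mm->value, v);
    atomic_fetch_add(&mm->seq, 1);            /* seq becomes even again */
    fake_unlock(mm);
}

/* Start speculation: fails if a writer is currently active (odd seq). */
static bool spec_begin(struct fake_mm *mm, unsigned long *seq)
{
    *seq = atomic_load(&mm->seq);
    return (*seq & 1) == 0;
}

/* End speculation: succeeds only if no writer ran in between. */
static bool spec_end(struct fake_mm *mm, unsigned long seq)
{
    return atomic_load(&mm->seq) == seq;
}

/* Reader: try the lockless path first, fall back to taking the lock. */
static long lookup(struct fake_mm *mm, bool *was_lockless)
{
    unsigned long seq;
    long v;

    if (spec_begin(mm, &seq)) {
        v = atomic_load(&mm->value);   /* speculative read */
        if (spec_end(mm, seq)) {       /* validate before trusting it */
            *was_lockless = true;
            return v;
        }
    }

    fake_lock(mm);                     /* slow path */
    v = atomic_load(&mm->value);
    fake_unlock(mm);
    *was_lockless = false;
    return v;
}
```

In the common case no writer is active, so lookup() never touches the lock,
which is the property the cover letter credits for entry uprobes scaling with
the number of CPUs. The locked fallback keeps the result correct when a
writer does get in the way.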