Date: Wed, 23 Oct 2024 21:22:36 +0200
From: Peter Zijlstra
To: Andrii Nakryiko
Cc: linux-trace-kernel@vger.kernel.org, linux-mm@kvack.org, oleg@redhat.com,
	rostedt@goodmis.org, mhiramat@kernel.org, bpf@vger.kernel.org,
	linux-kernel@vger.kernel.org, jolsa@kernel.org, paulmck@kernel.org,
	willy@infradead.org, surenb@google.com, akpm@linux-foundation.org,
	mjguzik@gmail.com, brauner@kernel.org, jannh@google.com,
	mhocko@kernel.org, vbabka@suse.cz, shakeel.butt@linux.dev,
	hannes@cmpxchg.org, Liam.Howlett@oracle.com, lorenzo.stoakes@oracle.com
Subject: Re: [PATCH v3 tip/perf/core 4/4] uprobes: add speculative lockless VMA-to-inode-to-uprobe resolution
Message-ID: <20241023192236.GB11151@noisy.programming.kicks-ass.net>
References: <20241010205644.3831427-1-andrii@kernel.org> <20241010205644.3831427-5-andrii@kernel.org>
In-Reply-To: <20241010205644.3831427-5-andrii@kernel.org>

On Thu, Oct 10, 2024 at 01:56:44PM -0700, Andrii Nakryiko wrote:

> Suggested-by: Matthew Wilcox

I'm fairly sure I've suggested much the same :-)

> Signed-off-by: Andrii Nakryiko
> ---
>  kernel/events/uprobes.c | 50 +++++++++++++++++++++++++++++++++++++++++
>  1 file changed, 50 insertions(+)
>
> diff --git a/kernel/events/uprobes.c b/kernel/events/uprobes.c
> index fa1024aad6c4..9dc6e78975c9 100644
> --- a/kernel/events/uprobes.c
> +++ b/kernel/events/uprobes.c
> @@ -2047,6 +2047,52 @@ static int is_trap_at_addr(struct mm_struct *mm, unsigned long vaddr)
>  	return is_trap_insn(&opcode);
>  }
>
> +static struct uprobe *find_active_uprobe_speculative(unsigned long bp_vaddr)
> +{
> +	struct mm_struct *mm = current->mm;
> +	struct uprobe *uprobe = NULL;
> +	struct vm_area_struct *vma;
> +	struct file *vm_file;
> +	struct inode *vm_inode;
> +	unsigned long vm_pgoff, vm_start;
> +	loff_t offset;
> +	long seq;
> +
> +	guard(rcu)();
> +
> +	if (!mmap_lock_speculation_start(mm, &seq))
> +		return NULL;

So traditional seqcount assumed non-preemptible lock sides and would
spin-wait for the LSB to clear, but for PREEMPT_RT we added preemptible
seqcount support and that takes the lock to wait, which in this case is
exactly the same as returning NULL and doing the lookup holding
mmap_lock, so yeah.

> +
> +	vma = vma_lookup(mm, bp_vaddr);
> +	if (!vma)
> +		return NULL;
> +
> +	/* vm_file memory can be reused for another instance of struct file,

Comment style nit.

> +	 * but can't be freed from under us, so it's safe to read fields from
> +	 * it, even if the values are some garbage values; ultimately
> +	 * find_uprobe_rcu() + mmap_lock_speculation_end() check will ensure
> +	 * that whatever we speculatively found is correct
> +	 */
> +	vm_file = READ_ONCE(vma->vm_file);
> +	if (!vm_file)
> +		return NULL;
> +
> +	vm_pgoff = data_race(vma->vm_pgoff);
> +	vm_start = data_race(vma->vm_start);
> +	vm_inode = data_race(vm_file->f_inode);

So... seqcount has kcsan annotations other than data_race(). I suppose
this works, but it all feels like a bad copy with random changes.

> +
> +	offset = (loff_t)(vm_pgoff << PAGE_SHIFT) + (bp_vaddr - vm_start);
> +	uprobe = find_uprobe_rcu(vm_inode, offset);
> +	if (!uprobe)
> +		return NULL;
> +
> +	/* now double check that nothing about MM changed */
> +	if (!mmap_lock_speculation_end(mm, seq))
> +		return NULL;

Typically seqcount does a re-try here.

> +
> +	return uprobe;
> +}
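
To make the try-once versus re-try distinction concrete, here is a
minimal userspace sketch using C11 atomics. It is not the kernel's
seqcount_t, nor the mmap_lock_speculation_*() helpers from this series;
every name in it (seq_sketch, guarded, spec_begin(), spec_end(),
read_sum_try_once(), read_sum_retry()) is made up for illustration, and
it assumes a single writer. The try-once reader bails out so the caller
can fall back to a locked slow path, which is the shape the patch uses;
the retry reader loops until it sees a stable snapshot.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for a sequence counter: odd means "writer active". */
struct seq_sketch {
	atomic_uint seq;
};

/* Two values that must be observed as a consistent pair. */
struct guarded {
	struct seq_sketch sc;
	atomic_long a;
	atomic_long b;
};

static void guarded_init(struct guarded *g)
{
	atomic_init(&g->sc.seq, 0);
	atomic_init(&g->a, 0);
	atomic_init(&g->b, 0);
}

/* Writer side; assumes writers are already serialized by some outer lock. */
static void write_begin(struct seq_sketch *s)
{
	unsigned seq = atomic_load_explicit(&s->seq, memory_order_relaxed);

	assert(!(seq & 1));
	atomic_store_explicit(&s->seq, seq + 1, memory_order_relaxed);
	/* Order the odd sequence store before the data stores that follow. */
	atomic_thread_fence(memory_order_release);
}

static void write_end(struct seq_sketch *s)
{
	unsigned seq = atomic_load_explicit(&s->seq, memory_order_relaxed);

	assert(seq & 1);
	atomic_store_explicit(&s->seq, seq + 1, memory_order_release);
}

static void guarded_write(struct guarded *g, long a, long b)
{
	write_begin(&g->sc);
	atomic_store_explicit(&g->a, a, memory_order_relaxed);
	atomic_store_explicit(&g->b, b, memory_order_relaxed);
	write_end(&g->sc);
}

/* Reader side: sample the sequence; false if a writer is in progress. */
static bool spec_begin(struct seq_sketch *s, unsigned *seq)
{
	*seq = atomic_load_explicit(&s->seq, memory_order_acquire);
	return !(*seq & 1);
}

/* True only if no writer ran since the matching spec_begin(). */
static bool spec_end(struct seq_sketch *s, unsigned seq)
{
	atomic_thread_fence(memory_order_acquire);
	return atomic_load_explicit(&s->seq, memory_order_relaxed) == seq;
}

/* Try-once: on any interference, report failure so the caller can fall back. */
static bool read_sum_try_once(struct guarded *g, long *sum)
{
	unsigned seq;
	long a, b;

	if (!spec_begin(&g->sc, &seq))
		return false;
	a = atomic_load_explicit(&g->a, memory_order_relaxed);
	b = atomic_load_explicit(&g->b, memory_order_relaxed);
	if (!spec_end(&g->sc, seq))
		return false;
	*sum = a + b;
	return true;
}

/* Re-try: spin while a writer is active, then loop until the snapshot is stable. */
static long read_sum_retry(struct guarded *g)
{
	unsigned seq;
	long a, b;

	do {
		while (!spec_begin(&g->sc, &seq))
			;
		a = atomic_load_explicit(&g->a, memory_order_relaxed);
		b = atomic_load_explicit(&g->b, memory_order_relaxed);
	} while (!spec_end(&g->sc, seq));
	return a + b;
}
```

The retry variant never fails but can spin for as long as a writer holds
the odd sequence, so it only makes sense when writers are short and
cannot be preempted. The try-once variant trades that for a bounded fast
path plus a mandatory slow path in the caller.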