From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by smtp.lore.kernel.org (Postfix) with ESMTP id 4692CC3DA4A for ; Mon, 19 Aug 2024 20:35:14 +0000 (UTC) Received: by kanga.kvack.org (Postfix) id C596E6B0083; Mon, 19 Aug 2024 16:35:13 -0400 (EDT) Received: by kanga.kvack.org (Postfix, from userid 40) id C09B26B0085; Mon, 19 Aug 2024 16:35:13 -0400 (EDT) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id AD1246B0093; Mon, 19 Aug 2024 16:35:13 -0400 (EDT) X-Delivered-To: linux-mm@kvack.org Received: from relay.hostedemail.com (smtprelay0017.hostedemail.com [216.40.44.17]) by kanga.kvack.org (Postfix) with ESMTP id 8E11E6B0083 for ; Mon, 19 Aug 2024 16:35:13 -0400 (EDT) Received: from smtpin20.hostedemail.com (a10.router.float.18 [10.200.18.1]) by unirelay09.hostedemail.com (Postfix) with ESMTP id 53277814FC for ; Mon, 19 Aug 2024 20:35:13 +0000 (UTC) X-FDA: 82470149706.20.4AD1D1E Received: from mail-pg1-f176.google.com (mail-pg1-f176.google.com [209.85.215.176]) by imf11.hostedemail.com (Postfix) with ESMTP id 7F7CC40020 for ; Mon, 19 Aug 2024 20:35:11 +0000 (UTC) Authentication-Results: imf11.hostedemail.com; dkim=pass header.d=gmail.com header.s=20230601 header.b=d1GnrAD5; dmarc=pass (policy=none) header.from=gmail.com; spf=pass (imf11.hostedemail.com: domain of andrii.nakryiko@gmail.com designates 209.85.215.176 as permitted sender) smtp.mailfrom=andrii.nakryiko@gmail.com ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=hostedemail.com; s=arc-20220608; t=1724099624; h=from:from:sender:reply-to:subject:subject:date:date: message-id:message-id:to:to:cc:cc:mime-version:mime-version: content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references:dkim-signature; bh=nH6MJrjwaF1DExwqq1BAG9TBOgQJbeO0AdhOsYCARpQ=; b=rZrwcBETyF9RUQz9PuY1YUYfGmxFNZJGJbLW482D/1Iy13X83APUrlnRiRHUd6gu4Qtk+F KA1H421QiXVzstZdXa9msXbBeD6zEJOZcaLrm1O5tRg5uwGov5W3nj+4mnlhbHv/mPQCvc ZMzwvmBVIzJ7Zt6IenvFlxuNJcFFkPs= ARC-Seal: i=1; s=arc-20220608; d=hostedemail.com; t=1724099624; a=rsa-sha256; cv=none; b=spLq16d7tE3Ub3bkJA3HRtGtRv07SYItQ3Fn86wZcRoX+zCeJC11BSDl/gorDOSwihKRk7 3QFngtTFnrqoxtxrgs5vuoyMlG2UZD6jWWw98QDvOCrm7hYn6N97cE6Zyr8jZKZpx93x6x 4ZK1WJ19G6PeiIfD1m2JHLDOFryjGDc= ARC-Authentication-Results: i=1; imf11.hostedemail.com; dkim=pass header.d=gmail.com header.s=20230601 header.b=d1GnrAD5; dmarc=pass (policy=none) header.from=gmail.com; spf=pass (imf11.hostedemail.com: domain of andrii.nakryiko@gmail.com designates 209.85.215.176 as permitted sender) smtp.mailfrom=andrii.nakryiko@gmail.com Received: by mail-pg1-f176.google.com with SMTP id 41be03b00d2f7-7b8884631c4so1915875a12.2 for ; Mon, 19 Aug 2024 13:35:11 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20230601; t=1724099710; x=1724704510; darn=kvack.org; h=content-transfer-encoding:cc:to:subject:message-id:date:from :in-reply-to:references:mime-version:from:to:cc:subject:date :message-id:reply-to; bh=nH6MJrjwaF1DExwqq1BAG9TBOgQJbeO0AdhOsYCARpQ=; b=d1GnrAD5MFToTTghaN4trDU4aAtHetfOcCjOnYBTBzUep1FY54hlYBNuudb7r+Bgk5 wNR+xB0MYukNRBNNpjIOfSjJzHdkRI+BiAjzJJ9iVEDI/QJsMFvAL3L6ezmHB5sVuSBO HuylJU4H1vzXbbnsykPQKdxTdH4VQwYRC/DyhWjGKh8IIl7X2PAgwLiF0SddghXRIBZ9 8aK8xXaZ2jxavPqL501agMMKjw3ycbhFAvcpvi3jzTg+F+BwXkVW35DkosSbcwK1f0jp 9rJCWcqj0ZSfNKq0hi4WlZw43LDgWgDSgoKR06vkB8UUwmzvNEwVOG8Vpvrnlync4MIz f4Ww== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1724099710; x=1724704510; h=content-transfer-encoding:cc:to:subject:message-id:date:from :in-reply-to:references:mime-version:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=nH6MJrjwaF1DExwqq1BAG9TBOgQJbeO0AdhOsYCARpQ=; b=S9sDyBE0a7pE7IGmQPrJ9i51t40jYk7dLAjNe2toCVSGvVxV1EY8aZt8aZH6HBoZB9 evo6x6km4JdY9xN2/nEylEDqv/GYhOlq3yJdKouk2ZqCsvyaeYh7Gue3q92xQHIc2Ot+ v0zwyaBeg34yaIqHKJa44ntWFoXACe9Vl8OYUUFd2szC1KX6qmLLU4jHUy0YYnCC5Q0G L3BRQGm1jcK217744Wzw6zBaRySiBWD1poUrn55rcoENOTNtgX1AaE4G4SJuzfcxMZdY ZeMwWgpvNsNLq+UbtleqDWnLaZCWl5EwCMlvZOK1Fcxe0SSvPCarXAafGEMan9AnJZYQ i57A== X-Forwarded-Encrypted: i=1; AJvYcCVCNjNCtvxtjRbVVmY8HLLBLkGeykxeH1QMv/vF7tFJLqAvZrSRyjjhcZntZ1olRYlOZ2qAJjYGfkMnNZ//GksD6qU= X-Gm-Message-State: AOJu0YwRcAnxcp5tVegvo+Lf3ReKd0GW//SAklM/xJ4B1NDmWxgs2Jyx J6yukXbNLLxHdvWzEtHDsZGP//8SKRMeondWSr4iGXTMuSI0OvMF75FTob4B2RldynWHqrfndWw 0twr7BWyHj7+3OUcuEG1b6pIws9E= X-Google-Smtp-Source: AGHT+IGUFaAMZSC62zrt9/XK41cBx3H4cq8e6deI++UilY88ciWZUojNh5L48XclPwKgVRRXTxRm7ziIqLii28R7Rmk= X-Received: by 2002:a17:90a:bc81:b0:2c8:3f5:37d2 with SMTP id 98e67ed59e1d1-2d3dfc75e41mr13272600a91.20.1724099710147; Mon, 19 Aug 2024 13:35:10 -0700 (PDT) MIME-Version: 1.0 References: <20240813042917.506057-1-andrii@kernel.org> <20240813042917.506057-10-andrii@kernel.org> <20240819134107.GB3515@redhat.com> In-Reply-To: <20240819134107.GB3515@redhat.com> From: Andrii Nakryiko Date: Mon, 19 Aug 2024 13:34:57 -0700 Message-ID: Subject: Re: [PATCH RFC v3 09/13] uprobes: SRCU-protect uretprobe lifetime (with timeout) To: Oleg Nesterov Cc: Andrii Nakryiko , linux-trace-kernel@vger.kernel.org, peterz@infradead.org, rostedt@goodmis.org, mhiramat@kernel.org, bpf@vger.kernel.org, linux-kernel@vger.kernel.org, jolsa@kernel.org, paulmck@kernel.org, willy@infradead.org, surenb@google.com, akpm@linux-foundation.org, linux-mm@kvack.org Content-Type: text/plain; charset="UTF-8" Content-Transfer-Encoding: quoted-printable X-Rspamd-Server: rspam07 X-Rspamd-Queue-Id: 7F7CC40020 X-Stat-Signature: tmrnn8ob559i6ws76b9n5jxdbw9gzpqk X-Rspam-User: X-HE-Tag: 1724099711-966818 X-HE-Meta: U2FsdGVkX19Nzqj7ALSJe6nGcs4ZyCm922/cMm2gekX6yv3GO/0jg1lThyf7M9frMdRaT8es4ewvrvvA02AoR5EAvBC85Ff5BLaNVOJjYQtuqiLAISfi/SBo5EayDGMW280ppm0D4rzngM68V3SUHRqTMSE3aVId5lhh0NFsjyNfwjLwLLXd4fHqX1prhVH2Ja2xsVFNx48T8sVJbuURHxeq0W+9YdSzj6eTdoERsrc1C6vzQaVly9nFQWLdmsL7yY95LAkgp7vFgmFVX3zZYcukVBydIvdVnvZd1M6q6V7XcA7U5Wc0rbyQ+rQ8HCqJdRPDib4S6FaXvc2M4KmceDN4qe6O9GuSuNx0potyrxvFOEcorbZ/P5CqcogRkcKlILES/c6HE9Qtk75hKDIaiUG8MUymKP4zWmAJF708rSaHmfv6lCgCp22O8FBlWkW9hFceRxfl5OZp0gqw8MOqNSUSk+U9h6WZNhtxUEJVknya+xcBp9+otqm2y2cG7zCrg6ZtyVddbRb1K+iI3jTCUUD6zMd77MIU2GMs8mzKU1NNlO7ieAHzAtGEVbnXd0wsmYnCXUe+DRggOF8uuzXyB84FzInsZbX3JVeG6UCmnAxuBA8BYUl6M5gdJL16DrlUnXBPtDehEp4iuPNeTM7C01lBuqA4yKhGdHZ9l81gBz4VC5R//uX2ZNgVk8IFOUSbI7gMbwR8HmFosDxQdX6EJL8bRBhziuuN63WfJ3/w89EcK/PdlaD5k/29Pp7EHyxvDIEknE+qfk6kgRrgWkJtbvvMCcwnWFDbgQvGypbpqDtCpDDlwR7cAD3U0o1YrGKtUEolwP+NgEy2UfWOXoDoMss4NsV5Eo5/OhNU5QEPxpoJ7cK3EC11zYpEr29/M4YJDtQ//lbzv7m4HUmU1eYhnO2vMqLT2sVCItTqx5u1pgc+jF6hj5r51e7mwALyBvLDDcm9xUBVfWqUibkthgq E6AaMdOZ 6XcVvf+yx8PvzvzCp3TCr8eAufxuxWPMGgI7LJ7wTjRo0W3jinwd+a/WFa6rOVlKQRHZQ2DiJuPLstosHVpHkpxkg/y3fkiXN6bAn82CBqClvUIZO+cLA8Nyh281cRdULryqUVTfNY0dxCY//Gkwm+bFTvFMvsxvdSC5akbYdmmTDz2LMSIY5H+7Pw7fqeg00+fQRDzexnXlijw7J0scGGKrTELsOLNWaGEsAtZsEYlHKfd5IN80NDun+N2aOuDWcR6a/CE5DKOZbnH8CNb49EvTkekLvyHn1Zfi/ptEFUBrlnB/9EVsn5IIlNYByq+NTERpxkedP1gSaGTtmZeg4sSY6e5KGlH21mpR5 X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: List-Subscribe: List-Unsubscribe: On Mon, Aug 19, 2024 at 6:41=E2=80=AFAM Oleg Nesterov wro= te: > > On 08/12, Andrii Nakryiko wrote: > > > > Avoid taking refcount on uprobe in prepare_uretprobe(), instead take > > uretprobe-specific SRCU lock and keep it active as kernel transfers > > control back to user space. > ... > > include/linux/uprobes.h | 49 ++++++- > > kernel/events/uprobes.c | 294 ++++++++++++++++++++++++++++++++++------ > > 2 files changed, 301 insertions(+), 42 deletions(-) > > Oh. To be honest I don't like this patch. > > I would like to know what other reviewers think, but to me it adds too ma= ny > complications that I can't even fully understand... Which parts? The atomic xchg() and cmpxchg() parts? What exactly do you feel like you don't fully understand? > > And how much does it help performance-wise? A lot, as we increase uprobe parallelism. Here's a subset of benchmarks for 1-4, 8, 16, 32, 64, and 80 threads firing uretprobe. With and without this SRCU change, but including all the other changes, including the lockless VMA lookup. It's noticeable already with just two competing CPUs/threads, and it just gets much worse from there. Of course in production you shouldn't come close to such rates of uprobe/uretprobe firing, so this is definitely a microbenchmark emphasizing the sharing between CPUs, but it still adds up. And we do have production use cases that would like to fire uprobes at 100K+ per second rates. WITH SRCU for uretprobes =3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D uretprobe-nop ( 1 cpus): 1.968 =C2=B1 0.001M/s ( 1.968M/s/cpu) uretprobe-nop ( 2 cpus): 3.739 =C2=B1 0.003M/s ( 1.869M/s/cpu) uretprobe-nop ( 3 cpus): 5.616 =C2=B1 0.003M/s ( 1.872M/s/cpu) uretprobe-nop ( 4 cpus): 7.286 =C2=B1 0.002M/s ( 1.822M/s/cpu) uretprobe-nop ( 8 cpus): 13.657 =C2=B1 0.007M/s ( 1.707M/s/cpu) uretprobe-nop (32 cpus): 45.305 =C2=B1 0.066M/s ( 1.416M/s/cpu) uretprobe-nop (64 cpus): 42.390 =C2=B1 0.922M/s ( 0.662M/s/cpu) uretprobe-nop (80 cpus): 47.554 =C2=B1 2.411M/s ( 0.594M/s/cpu) WITHOUT SRCU for uretprobes =3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D= =3D=3D uretprobe-nop ( 1 cpus): 2.197 =C2=B1 0.002M/s ( 2.197M/s/cpu) uretprobe-nop ( 2 cpus): 3.325 =C2=B1 0.001M/s ( 1.662M/s/cpu) uretprobe-nop ( 3 cpus): 4.129 =C2=B1 0.002M/s ( 1.376M/s/cpu) uretprobe-nop ( 4 cpus): 6.180 =C2=B1 0.003M/s ( 1.545M/s/cpu) uretprobe-nop ( 8 cpus): 7.323 =C2=B1 0.005M/s ( 0.915M/s/cpu) uretprobe-nop (16 cpus): 6.943 =C2=B1 0.005M/s ( 0.434M/s/cpu) uretprobe-nop (32 cpus): 5.931 =C2=B1 0.014M/s ( 0.185M/s/cpu) uretprobe-nop (64 cpus): 5.145 =C2=B1 0.003M/s ( 0.080M/s/cpu) uretprobe-nop (80 cpus): 4.925 =C2=B1 0.005M/s ( 0.062M/s/cpu) > > I'll try to take another look, and I'll try to think about other approach= es, > not that I have something better in mind... Ok. > > > But lets forgets this patch for the moment. The next one adds even more > complications, and I think it doesn't make sense. > "Even more complications" is a bit of an overstatement. It just applies everything we do for uretprobes in this patch to a very straightforward single-stepped case. > As I have already mentioned in the previous discussions, we can simply ki= ll > utask->active_uprobe. And utask->auprobe. I don't have anything against that, in principle, but let's benchmark and test that thoroughly. I'm a bit uneasy about the possibility that some arch-specific code will do container_of() on this arch_uprobe in order to get to uprobe, we'd need to audit all the code to make sure that can't happen. Also it's a bit unfortunate that we have to assume that struct arch_uprobe is small on all architectures, and there is no code that assumes it can't be moved, etc, etc. (I also don't get why you need memcpy > > So can't we start with the patch below? On top of your 08/13. It doesn't = kill > utask->auprobe yet, this needs a bit more trivial changes. > > What do you think? I think that single-stepped case isn't the main use case (typically uprobe/uretprobe will be installed on nop or push %rbp, both emulated). uretprobes, though, are the main use case (along with optimized entry uprobes). So what we do about single-stepped is a bit secondary (for me, looking at production use cases). But we do need to do something with uretprobes first and foremost. > > Oleg. > > -------------------------------------------------------------------------= ------ > From d7cb674eb6f7bb891408b2b6a5fb872a6c2f0f6c Mon Sep 17 00:00:00 2001 > From: Oleg Nesterov > Date: Mon, 19 Aug 2024 15:34:55 +0200 > Subject: [RFC PATCH] uprobe: kill uprobe_task->active_uprobe > > Untested, not for inclusion yet, and I need to split it into 2 changes. > It does 2 simple things: > > 1. active_uprobe !=3D NULL is possible if and only if utask->stat= e !=3D 0, > so it turns the active_uprobe checks into the utask->state che= cks. > > 2. handle_singlestep() doesn't really need ->active_uprobe, it on= ly > needs uprobe->arch which is "const" after prepare_uprobe(). > > So this patch adds the new "arch_uprobe uarch" member into uta= sk > and changes pre_ssout() to do memcpy(&utask->uarch, &uprobe->a= rch). > --- > include/linux/uprobes.h | 2 +- > kernel/events/uprobes.c | 37 +++++++++++-------------------------- > 2 files changed, 12 insertions(+), 27 deletions(-) [...]