From: Andrii Nakryiko
Date: Mon, 6 May 2024 11:51:34 -0700
Subject: Re: [PATCH 2/5] fs/procfs: implement efficient VMA querying API for /proc//maps
To: Namhyung Kim
Cc: Arnaldo Carvalho de Melo, Jiri Olsa, Ian Rogers, Greg KH, Andrii Nakryiko, linux-fsdevel@vger.kernel.org, brauner@kernel.org, viro@zeniv.linux.org.uk, akpm@linux-foundation.org, linux-kernel@vger.kernel.org, bpf@vger.kernel.org, linux-mm@kvack.org, Daniel Müller, "linux-perf-use."
References: <20240504003006.3303334-1-andrii@kernel.org> <20240504003006.3303334-3-andrii@kernel.org> <2024050439-janitor-scoff-be04@gregkh>

On Mon, May 6, 2024 at 11:05 AM Namhyung Kim
wrote:
>
> Hello,
>
> On Mon, May 6, 2024 at 6:58 AM Arnaldo Carvalho de Melo wrote:
> >
> > On Sat, May 04, 2024 at 02:50:31PM -0700, Andrii Nakryiko wrote:
> > > On Sat, May 4, 2024 at 8:28 AM Greg KH wrote:
> > > > On Fri, May 03, 2024 at 05:30:03PM -0700, Andrii Nakryiko wrote:
> > > > > Note also, that fetching VMA name (e.g., backing file path, or special
> > > > > hard-coded or user-provided names) is optional just like build ID. If
> > > > > user sets vma_name_size to zero, kernel code won't attempt to retrieve
> > > > > it, saving resources.
> > > > >
> > > > > Signed-off-by: Andrii Nakryiko
> > > >
> > > > Where is the userspace code that uses this new api you have created?
> > >
> > > So I added a faithful comparison of existing /proc//maps vs new
> > > ioctl() API to solve a common problem (as described above) in patch
> > > #5. The plan is to put it in mentioned blazesym library at the very
> > > least.
> > >
> > > I'm sure perf would benefit from this as well (cc'ed Arnaldo and
> > > linux-perf-user), as they need to do stack symbolization as well.
>
> I think the general use case in perf is different. This ioctl API is great
> for live tracing of a single (or a small number of) process(es). And
> yes, perf tools have those tracing use cases too. But I think the
> major use case of perf tools is system-wide profiling.

The intended use case is also system-wide profiling, but I haven't
heard that opening a file per process is a big bottleneck or a
limitation, tbh.

>
> For system-wide profiling, you need to process samples of many
> different processes at a high frequency. Now perf record doesn't
> process them and just save it for offline processing (well, it does
> at the end to find out build-ID but it can be omitted).
>
> Doing it online is possible (like perf top) but it would add more
> overhead during the profiling. And we cannot move processing
> or symbolization to the end of profiling because some (short-lived)
> tasks can go away.

We do have some setups where we install a BPF program that monitors
process exit and mmap() events and proactively emits VMA information.
It's not applicable everywhere, and in some setups (like the Oculus
case) we just accept that short-lived processes will be missed, in
exchange for less interruption and simpler, less privileged "agents"
doing the profiling and address resolution logic.

So the problem space, as can be seen, is pretty vast and varied, and
there is no single API that would serve all the needs perfectly.

>
> Also it should support perf report (offline) on data from a
> different kernel or even a different machine.

We fetch the build ID (and resolve the file offset) and offload actual
symbolization to a dedicated fleet of servers, whenever possible. We
don't yet do it for kernel stack traces, but we are moving in this
direction (that has its own problems, with /proc/kallsyms being
text-based, listing everything, and being pretty big all by itself;
but that's a separate topic).

>
> So it saves the memory map of processes and symbolizes
> the stack trace with it later. Of course it needs to be updated
> as the memory map changes and that's why it tracks mmap
> or similar syscalls with PERF_RECORD_MMAP[2] records.
>
> A problem with this approach is to get the initial state of all
> (or a target for non-system-wide mode) existing processes.
> We call it synthesizing, and read /proc/PID/maps to generate
> the mmap records.
>
> I think the below comment from Arnaldo talked about how
> we can improve the synthesizing (which is sequential access
> to proc maps) using BPF.

Yep. We can also benchmark using this new ioctl() to fetch a full set
of VMAs; it might still be good enough.

>
> Thanks,
> Namhyung
>

[...]
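
For reference, the synthesizing step described above (sequential reads
of the text-based /proc/PID/maps) can be sketched roughly as follows.
This is only an illustration in Python, assuming the usual
"start-end perms offset dev inode [pathname]" line layout; it is not
perf's or blazesym's actual code, and the names Vma, parse_maps_line,
read_maps and find_vma are made up for this example.

```python
from typing import List, NamedTuple, Optional


class Vma(NamedTuple):
    """One line of /proc/PID/maps (illustrative type, not a kernel struct)."""
    start: int
    end: int
    perms: str
    offset: int
    dev: str
    inode: int
    path: Optional[str]


def parse_maps_line(line: str) -> Vma:
    # Assumed layout: start-end perms offset dev inode [pathname]
    # Split at most 5 times so a pathname containing spaces stays intact.
    fields = line.rstrip("\n").split(None, 5)
    addr, perms, offset, dev, inode = fields[:5]
    start, end = (int(x, 16) for x in addr.split("-"))
    path = fields[5] if len(fields) == 6 else None
    return Vma(start, end, perms, int(offset, 16), dev, int(inode), path)


def read_maps(pid: str = "self") -> List[Vma]:
    # Sequential read of the whole text file, as in the synthesizing step.
    with open(f"/proc/{pid}/maps") as f:
        return [parse_maps_line(line) for line in f if line.strip()]


def find_vma(vmas: List[Vma], addr: int) -> Optional[Vma]:
    # Linear scan; the end address is exclusive.
    for vma in vmas:
        if vma.start <= addr < vma.end:
            return vma
    return None
```

Resolving a single address this way means reading and parsing every
line first, which is presumably the overhead a targeted query is meant
to avoid; for fetching the full set of VMAs the difference would need
to be benchmarked.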