References: <20250127222114.1132392-1-andrii@kernel.org> <20250127164106.5f40b62e0f1cf353538c46fd@linux-foundation.org>
In-Reply-To: <20250127164106.5f40b62e0f1cf353538c46fd@linux-foundation.org>
From: Andrii Nakryiko
Date: Mon, 27 Jan 2025 17:24:30 -0800
Subject: Re: [PATCH v2] mm,procfs: allow read-only remote mm access under CAP_PERFMON
To: Andrew Morton
Cc: Andrii Nakryiko, linux-mm@kvack.org, linux-fsdevel@vger.kernel.org,
 brauner@kernel.org, viro@zeniv.linux.org.uk, linux-kernel@vger.kernel.org,
 bpf@vger.kernel.org, kernel-team@meta.com, rostedt@goodmis.org,
 peterz@infradead.org, mingo@kernel.org, linux-trace-kernel@vger.kernel.org,
 linux-perf-users@vger.kernel.org, shakeel.butt@linux.dev, rppt@kernel.org,
 liam.howlett@oracle.com, surenb@google.com, kees@kernel.org,
 jannh@google.com

On Mon, Jan 27, 2025 at 4:41 PM Andrew Morton wrote:
>
> On Mon, 27 Jan 2025 14:21:14 -0800
> Andrii Nakryiko wrote:
>
> > It's very common for various tracing and profiling tools to need to
> > access /proc/PID/maps contents for stack symbolization needs to learn
> > which shared libraries are mapped in memory, at which file offset, etc.
> > Currently, access to /proc/PID/maps requires CAP_SYS_PTRACE (unless we
> > are looking at data for our own process, which is a trivial case not too
> > relevant for profilers use cases).
> >
> > Unfortunately, CAP_SYS_PTRACE implies way more than just ability to
> > discover memory layout of another process: it allows to fully control
> > arbitrary other processes. This is problematic from security POV for
> > applications that only need read-only /proc/PID/maps (and other similar
> > read-only data) access, and in large production settings CAP_SYS_PTRACE
> > is frowned upon even for the system-wide profilers.
> >
> > On the other hand, it's already possible to access similar kind of
> > information (and more) with just CAP_PERFMON capability. E.g., setting
> > up PERF_RECORD_MMAP collection through perf_event_open() would give one
> > similar information to what /proc/PID/maps provides.
> >
> > CAP_PERFMON, together with CAP_BPF, is already a very common combination
> > for system-wide profiling and observability application. As such, it's
> > reasonable and convenient to be able to access /proc/PID/maps with
> > CAP_PERFMON capabilities instead of CAP_SYS_PTRACE.
> >
> > For procfs, these permissions are checked through common mm_access()
> > helper, and so we augment that with cap_perfmon() check *only* if
> > requested mode is PTRACE_MODE_READ. I.e., PTRACE_MODE_ATTACH wouldn't be
> > permitted by CAP_PERFMON. So /proc/PID/mem, which uses
> > PTRACE_MODE_ATTACH, won't be permitted by CAP_PERFMON, but
> > /proc/PID/maps, /proc/PID/environ, and a bunch of other read-only
> > contents will be allowable under CAP_PERFMON.
> >
> > Besides procfs itself, mm_access() is used by process_madvise() and
> > process_vm_{readv,writev}() syscalls. The former one uses
> > PTRACE_MODE_READ to avoid leaking ASLR metadata, and as such CAP_PERFMON
> > seems like a meaningful allowable capability as well.
> >
> > process_vm_{readv,writev} currently assume PTRACE_MODE_ATTACH level of
> > permissions (though for readv PTRACE_MODE_READ seems more reasonable,
> > but that's outside the scope of this change), and as such won't be
> > affected by this patch.
> >
>
> This should be documented somewhere, so we can tell our users what we
> did. Documentation/filesystems/proc.rst seems to be the place.

Wow, that's a big file :) Funny enough, that file mentions ptrace only
in the context of /proc//timerslack_ns, nothing else.

Hm.. Should I add a common section saying something about how either
CAP_SYS_PTRACE or CAP_PERFMON provides access to another process's user
space information?

If that's OK, I can send that as a follow-up patch (as I bet there will
be a bunch of iterations on the exact form, shape, wording, and
placement).
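
For anyone skimming the thread, the access rule described in the quoted
commit message can be modeled in a few lines of userspace C. This is only
a sketch under stated assumptions, not the kernel code: the enum, the
struct and the function name may_access_remote_mm() are hypothetical, the
flag values are arbitrary, and the real ptrace access check also looks at
credentials, dumpability and LSM hooks, which are collapsed here into a
single sys_ptrace flag.

```c
#include <stdbool.h>

/* Illustrative request modes. They stand in for PTRACE_MODE_READ and
 * PTRACE_MODE_ATTACH; the numeric values are arbitrary in this sketch. */
enum access_mode {
	MODE_READ   = 1 << 0,
	MODE_ATTACH = 1 << 1,
};

/* Hypothetical snapshot of what the caller is allowed to do. */
struct caller_caps {
	bool same_mm;    /* caller is looking at its own address space */
	bool sys_ptrace; /* simplified stand-in for passing the ptrace check */
	bool perfmon;    /* caller holds CAP_PERFMON */
};

/* Model of the rule from the commit message: CAP_PERFMON is honored
 * only for read-only requests, everything else needs the ptrace check. */
static bool may_access_remote_mm(const struct caller_caps *c,
				 unsigned int mode)
{
	if (c->same_mm)
		return true;
	if (mode == MODE_READ && c->perfmon)
		return true;
	return c->sys_ptrace;
}
```

With this model, a caller holding only CAP_PERFMON passes for MODE_READ
(the /proc/PID/maps and /proc/PID/environ case) and is refused for
MODE_ATTACH (the /proc/PID/mem case), while a caller with the ptrace-level
permission passes for both.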