From: Jann Horn
Date: Mon, 3 Jul 2023 17:06:42 +0200
Subject: Re: [QUESTION] Full user space process isolation?
To: Roberto Sassu
Cc: Oleg Nesterov, Paul Moore, James Morris, "Serge E. Hallyn",
    Stephen Smalley, Eric Paris, Andrew Morton, Mimi Zohar, Kees Cook,
    Casey Schaufler, David Howells, Luis Chamberlain, Eric Biederman,
    Petr Tesarik, Christoph Hellwig, Petr Mladek, Peter Zijlstra,
    Thomas Gleixner, Tejun Heo, linux-mm@kvack.org,
    linux-security-module@vger.kernel.org, linux-kernel@vger.kernel.org,
    keyrings@vger.kernel.org, linux-integrity@vger.kernel.org,
    linux-hardening@vger.kernel.org

On Thu, Jun 22, 2023 at 4:45 PM Roberto Sassu wrote:
> I wanted to execute some kernel workloads in a fully isolated user
> space process, started from a binary statically linked with klibc,
> connected to the kernel only through a pipe.

FWIW, the kernel has some infrastructure for this already, see
CONFIG_USERMODE_DRIVER and kernel/usermode_driver.c, with a usage
example in net/bpfilter/.
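
To make the "only through a pipe" part concrete, here is a rough sketch of
what the user-space side of such a helper could look like. It assumes the
helper gets one pipe end for requests and one for replies; the message
layout, DEMO_OP_DOUBLE and all demo_* names are made up for illustration
and are not the bpfilter wire format.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>
#include <unistd.h>

/* Hypothetical fixed-size messages, for illustration only. */
#define DEMO_OP_DOUBLE 1u

struct demo_req { uint32_t op; uint32_t arg; };
struct demo_rep { int32_t status; uint32_t result; };

/* Read exactly len bytes; returns 0 on success, -1 on EOF or error. */
static int demo_read_full(int fd, void *buf, size_t len)
{
    char *p = buf;

    assert(buf != NULL);
    while (len > 0) {
        ssize_t n = read(fd, p, len);
        if (n == 0)
            return -1;              /* peer closed the pipe */
        if (n < 0) {
            if (errno == EINTR)
                continue;           /* interrupted, retry */
            return -1;
        }
        p += n;
        len -= (size_t)n;
    }
    return 0;
}

/* Write exactly len bytes; returns 0 on success, -1 on error. */
static int demo_write_full(int fd, const void *buf, size_t len)
{
    const char *p = buf;

    assert(buf != NULL);
    while (len > 0) {
        ssize_t n = write(fd, p, len);
        if (n < 0) {
            if (errno == EINTR)
                continue;           /* interrupted, retry */
            return -1;
        }
        p += n;
        len -= (size_t)n;
    }
    return 0;
}

/*
 * Handle a single request from in_fd and send the reply to out_fd.
 * Returns 0 on success, -1 on EOF or I/O error.
 */
int demo_serve_one(int in_fd, int out_fd)
{
    struct demo_req req;
    struct demo_rep rep = { 0, 0 };

    if (demo_read_full(in_fd, &req, sizeof(req)) < 0)
        return -1;

    if (req.op == DEMO_OP_DOUBLE) {
        rep.status = 0;
        rep.result = req.arg * 2;
    } else {
        rep.status = -EINVAL;       /* unknown operation */
    }
    return demo_write_full(out_fd, &rep, sizeof(rep));
}
```

A helper built this way would just call demo_serve_one() in a loop on its
two pipe file descriptors until it returns -1. Fixed-size messages keep the
parsing trivial, which seems desirable if the helper is supposed to be as
trustworthy as kernel code.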

> I also wanted that, for the root user, tampering with that process is
> as hard as if the same code runs in kernel space.

I believe that actually making it that hard would probably mean that
you'd have to ensure that the process doesn't use swap (in other
words, it would have to run with all memory locked), because root can
choose where swapped pages are stored.

Other than that, if you mark it as a kthread so that no ptrace access
is allowed, you can probably get pretty close. But if you do anything
like that, please leave some way (like a kernel build config option or
such) to enable debugging for these processes.

But I'm not convinced that it makes sense to try to draw a security
boundary between fully-privileged root (with the ability to mount
things and configure swap and so on) and the kernel - my understanding
is that some kernel subsystems don't treat root-to-kernel privilege
escalation issues as security bugs that have to be fixed.
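
For the "all memory locked" point above, a minimal user-space sketch,
assuming the helper locks its own memory with mlockall(2) early at startup.
demo_lock_all_memory is a made-up name. A process doing this voluntarily is
weaker than the kernel enforcing it from the outside, so treat this only as
an illustration of the requirement, not as the isolation mechanism itself.

```c
#include <assert.h>
#include <errno.h>
#include <sys/mman.h>

/*
 * Lock all current and future mappings of the calling process into RAM
 * so that none of its pages can be written to swap.
 * Returns 0 on success or a negative errno value on failure.
 */
int demo_lock_all_memory(void)
{
    if (mlockall(MCL_CURRENT | MCL_FUTURE) != 0)
        return -errno;
    return 0;
}
```

Without CAP_IPC_LOCK the call is subject to RLIMIT_MEMLOCK, so it can fail
with ENOMEM or EPERM; a helper that depends on never being swapped should
probably refuse to continue when the return value is nonzero.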