From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by smtp.lore.kernel.org (Postfix) with ESMTP id E0413C369CB for ; Thu, 24 Apr 2025 03:17:06 +0000 (UTC) Received: by kanga.kvack.org (Postfix) id 27A9B6B0006; Wed, 23 Apr 2025 23:17:04 -0400 (EDT) Received: by kanga.kvack.org (Postfix, from userid 40) id 2015C6B0007; Wed, 23 Apr 2025 23:17:04 -0400 (EDT) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id 0A2616B0008; Wed, 23 Apr 2025 23:17:04 -0400 (EDT) X-Delivered-To: linux-mm@kvack.org Received: from relay.hostedemail.com (smtprelay0016.hostedemail.com [216.40.44.16]) by kanga.kvack.org (Postfix) with ESMTP id E24F76B0006 for ; Wed, 23 Apr 2025 23:17:03 -0400 (EDT) Received: from smtpin10.hostedemail.com (a10.router.float.18 [10.200.18.1]) by unirelay05.hostedemail.com (Postfix) with ESMTP id AEDC35F9B9 for ; Thu, 24 Apr 2025 03:17:05 +0000 (UTC) X-FDA: 83367476010.10.70AD2E9 Received: from mail-pf1-f173.google.com (mail-pf1-f173.google.com [209.85.210.173]) by imf14.hostedemail.com (Postfix) with ESMTP id A6A66100008 for ; Thu, 24 Apr 2025 03:17:03 +0000 (UTC) Authentication-Results: imf14.hostedemail.com; dkim=pass header.d=rivosinc-com.20230601.gappssmtp.com header.s=20230601 header.b=uYhwMLBZ; dmarc=none; spf=pass (imf14.hostedemail.com: domain of debug@rivosinc.com designates 209.85.210.173 as permitted sender) smtp.mailfrom=debug@rivosinc.com ARC-Seal: i=1; s=arc-20220608; d=hostedemail.com; t=1745464623; a=rsa-sha256; cv=none; b=bapWnKQwngZYQS/i2Z2LcX5NqpSJOa8DG9pRRP/ClEN3PUtjuPrDc45o25Jc3LPEnlm3jK 7yHTkNu8oBE/yvgonQNPInIXP+U7pDt2PTtl++P3N4Z8nE9ZmVXbTW7Er6X4HjgV4Htm7w 7Mqnifcw2zID97S0ElXt2mERbQiDb6w= ARC-Authentication-Results: i=1; imf14.hostedemail.com; dkim=pass header.d=rivosinc-com.20230601.gappssmtp.com header.s=20230601 header.b=uYhwMLBZ; dmarc=none; spf=pass (imf14.hostedemail.com: domain of debug@rivosinc.com designates 209.85.210.173 as permitted sender) smtp.mailfrom=debug@rivosinc.com ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=hostedemail.com; s=arc-20220608; t=1745464623; h=from:from:sender:reply-to:subject:subject:date:date: message-id:message-id:to:to:cc:cc:mime-version:mime-version: content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references:dkim-signature; bh=XUAwJhWsStgPPiYel2m5dmuZ8PujmZM+CaGE2sUsiqI=; b=LwtNYmDBPNxPJ7FhtOGu1RP8CzKmIcjgsSItm4VUBXEiwwAB58RGdgvcFX7IN1yFQ4/P5E TJw7SJxwlCVLN8dTnBGU+9sq9JV1teu/NW32Z31IOZyhA0O080nSMicS0Vxo3dolGAb2S6 eHDSP9hLY9KDRxDuzejhefhhuLYv6xE= Received: by mail-pf1-f173.google.com with SMTP id d2e1a72fcca58-72d3b48d2ffso410167b3a.2 for ; Wed, 23 Apr 2025 20:17:03 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=rivosinc-com.20230601.gappssmtp.com; s=20230601; t=1745464622; x=1746069422; darn=kvack.org; h=in-reply-to:content-transfer-encoding:content-disposition :mime-version:references:message-id:subject:cc:to:from:date:from:to :cc:subject:date:message-id:reply-to; bh=XUAwJhWsStgPPiYel2m5dmuZ8PujmZM+CaGE2sUsiqI=; b=uYhwMLBZ9egMJQy/+ag7y0b1WNo36Jwn8iiDGZvM7ygJAVloVo3nbvwjM30hEcNMP5 LwFQIj+/AWjqP6m/WshqeIvWBMfcSaxJmbuJqI+Ngl8YGkvA2z8xJ0j6PoOx+33/tBJY kZf20gEpWMsNQZ3wfrar8q00wIz2SCAcYD2UHuS7VyNzjVwAld8hiUpDQn/RQ438jiMm a9t6ytA8C98/znigld6ffjtpFM/kxIj1C3kixRxrnzDGwR39zXj04sUgXVA4kXV5Pu/d Xh76mXlVogetBeTyYvOh93S+nxekZnOsQvJ0Nb8HD943lTrkvxEuUNt9G0Cjr0/KtY0E O59w== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1745464622; x=1746069422; h=in-reply-to:content-transfer-encoding:content-disposition :mime-version:references:message-id:subject:cc:to:from:date :x-gm-message-state:from:to:cc:subject:date:message-id:reply-to; bh=XUAwJhWsStgPPiYel2m5dmuZ8PujmZM+CaGE2sUsiqI=; b=f9z6TalZ/v6Yj0p5ZjN2Zh/R73GLo/pGm8gjUuB5tErX3C040ZFpmzwib9y9negy/q yoHq882SarRd0mY95XyxTdxd4KdOvtLf4XKxqyW2Ejv87fRVFIIl1tioUjSVm0AvBows an82P+MekE/v7093slsjASCpD65ZPqCoG4KvNIQ7qe8wqLeSq/TJ2U8wW2h/jk7s3egc 5itsV77naIhcbjitomdRfVbqLhppq/SvhXf/2QYTI52uq3g4rIi1Mi0H37rmPGtfyozW ApcfWSwLOcvgNfkOdm+I0XVEipaGvIpfAp/floE6wYrtkeqEpzycO+LDehi/55+mHtv1 v+AQ== X-Forwarded-Encrypted: i=1; AJvYcCVfDQmrbX82u6J+pwc7zpb5Ujqeij1eKtPWGtqWiGvLh79qRnv9rolFy6+Vyu4oAAY0pRiIROjkWQ==@kvack.org X-Gm-Message-State: AOJu0YxrZEzEcbjouj4hgQ4eizdrEQBJIAyrJ9twow9ZjR5T0TNBRqhr WK8niQ4FQSFPmfegdYJnbA9soBO8ApSNG7avJR+mBt4KSQ6Um2JupVDzCpB0ei8= X-Gm-Gg: ASbGnctVxtAVVvO241c6/X/GI3FCXjlXMOoDEx1bqbGJ7lgOi3Z9IhUpziP3GEYr6x1 45566bTWPCCRcKPz7sc9Rv1+mdBcriSCqVs7AAVJ5rMU3kz0Lo11JoV0byu/Huu9an9lJ1SaaLh tTmCZXLSZXPRKtLt1zjJjhuBsQm/6qUEHKHwhEvm5ui4//7Cx3YIhQijRt7Th1jY8Yrkny6uR/f 5cgnhgQknFVS0lPrYAkFiiwSPWJzm1MzwfUB9/4AmaiYZRILBXzyDlKzpltpYC8u9N5tz8Q3XFa ppNDdt4RKa3MAQmcjRwqSMea8Tima7ymhIegBr58octpNbLC6eM= X-Google-Smtp-Source: AGHT+IHFdvA0DlIcVCkE1m7fhwPnPK/+qUF0hzp1UIxp5PRqy2sOJJ28/GW1knmKmyzytkf5QLtSrA== X-Received: by 2002:a05:6a20:9f4f:b0:1f5:9208:3ad6 with SMTP id adf61e73a8af0-20444f9f316mr1339520637.41.1745464622463; Wed, 23 Apr 2025 20:17:02 -0700 (PDT) Received: from debug.ba.rivosinc.com ([64.71.180.162]) by smtp.gmail.com with ESMTPSA id d2e1a72fcca58-73e25941fd9sm354190b3a.53.2025.04.23.20.16.59 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Wed, 23 Apr 2025 20:17:02 -0700 (PDT) Date: Wed, 23 Apr 2025 20:16:58 -0700 From: Deepak Gupta To: Radim =?utf-8?B?S3LEjW3DocWZ?= Cc: Thomas Gleixner , Ingo Molnar , Borislav Petkov , Dave Hansen , x86@kernel.org, "H. Peter Anvin" , Andrew Morton , "Liam R. Howlett" , Vlastimil Babka , Lorenzo Stoakes , Paul Walmsley , Palmer Dabbelt , Albert Ou , Conor Dooley , Rob Herring , Krzysztof Kozlowski , Arnd Bergmann , Christian Brauner , Peter Zijlstra , Oleg Nesterov , Eric Biederman , Kees Cook , Jonathan Corbet , Shuah Khan , Jann Horn , Conor Dooley , linux-kernel@vger.kernel.org, linux-fsdevel@vger.kernel.org, linux-mm@kvack.org, linux-riscv@lists.infradead.org, devicetree@vger.kernel.org, linux-arch@vger.kernel.org, linux-doc@vger.kernel.org, linux-kselftest@vger.kernel.org, alistair.francis@wdc.com, richard.henderson@linaro.org, jim.shu@sifive.com, andybnac@gmail.com, kito.cheng@sifive.com, charlie@rivosinc.com, atishp@rivosinc.com, evan@rivosinc.com, cleger@rivosinc.com, alexghiti@rivosinc.com, samitolvanen@google.com, broonie@kernel.org, rick.p.edgecombe@intel.com, Zong Li , linux-riscv Subject: Re: [PATCH v12 10/28] riscv/mm: Implement map_shadow_stack() syscall Message-ID: References: <20250314-v5_user_cfi_series-v12-0-e51202b53138@rivosinc.com> <20250314-v5_user_cfi_series-v12-10-e51202b53138@rivosinc.com> MIME-Version: 1.0 Content-Type: text/plain; charset=utf-8; format=flowed Content-Disposition: inline Content-Transfer-Encoding: 8bit In-Reply-To: X-Rspamd-Server: rspam11 X-Rspamd-Queue-Id: A6A66100008 X-Stat-Signature: redu45qcgr47dfdjpbw3hsbe3fpttx9b X-Rspam-User: X-HE-Tag: 1745464623-515001 X-HE-Meta: U2FsdGVkX1+0sj1Xjc3xmRY2C81yqyxLcyTjL9/R05qG06hg2L3e+7XXjE6l4iGfkzYTbmQTkIzbXA1/aAKGW+2hxFBnSKfFneTqT7681puX4fl3SZt+UQJHky/b+lLwVa32NiZOH1nTgjaJ3P8mYZFZNw/i6+D5NuwcIuAtB+V0LuiAF1l7/Ty7XJJIqs4KYwBOZbZytslsR5epc22HFlimjkSjRyRbV9Voy5v+vOf2TxpqSWmgluJuCQmuxBHuXeiQ8iIfMv5ofAyMThRcmiW7PMElP/FK404R7TTzJ+TMtFUIt6GH+rAaNa9pp1tq48S5K1mbmbLtq4ecSJ2uVnxYN9pWpHm1/GgIx5vfo+HV7kHcRPnwna/n+pgHBoUk2xtBFkIHlKMuuenrbbNbDky9PG5MN3yt19ctAvd1TIvizxJRLkYqwnsKTrZA9mmCvCrFACCNdIQTFi4lbkiRmZZIRGgdEXEf/O0Q4473JWZOJzwJjVKiK7EZYg8WaZBbPUfE+657fr0vHtu7YjZtj8zt7uPgdQCan7anRrrPDbjZxcCmbndepG9S6+eEUIBlupTid/uLrL3Ir4TdUEozMau0bYBv2c/Wp83jMBW78io2iVbc586bq2YhJRHJygEt12U9WU65Fhu2nLj3PzzdPS3KYTL//rtr5IlWZfw9ZUcd7tpjnTIlM7U5EeMPs2ve9YHV9qYIn/n3OWu334dW6bYjDLb1w/n6b+KiV8CfwWhH3OBATfHqgq3YOO3G9u5iTaZGetx+fY/clfXl7GAxUyo3DIvQBFq+Y12h002qZ4d33JSIFCBlJJ8LOYq4LuluxCD4j/+rMYJ4DgH8UtYIDpteJU6NiCTXi79pSeeFSH7glUYMTcVKhPCMlEDkNfMJ/vQO6As51Dtx28xj9gQd2K3tZA+gvHUq1HkWTTPgMZn5MpGDbYyxO1Gwuil8Fid65bRta7cJheUY8oT2ud2 unwjQ/2d fnMY2U5AeM1ewJ35RM14jGvzeV6JFwMiVDU4ba2J/mMTeKE/vlfhXjxJfp+0EmiCWHR2lUsm0TYepWVLWWMxJG2v7KLvBVooRux0m8C3JrPbBuyc/LLQ2z/Nn8whlkqU5NbbF2vG9mEUK6W/2CHgyGgtpdSQfE4DZw3eJzUKqwd9WRr7AofMpbWAO0ICMBPadovmVDAa9Xse4YFoSs1NtbW7WDVgkZpBNofX403MBv3IlD56znFiSf+Qq89/XY7mNF6M40KNdS5cr3W72kW9cU6fCYA7lloDdPVKwrlBi5NO/ifMk9g52gduZq68OBmY38GYMeK0A8fx7eAeOwmMh37J2YAU/xPDlNqIys38q7w354hq9tc17hXTA98niWhmDFAZH0Z7XIV+0ZkaMegA/eFB2g/O4qIl/M/3o69HhMv4IeoZbkayUK3yZkjnTUdFpu9A/+t0AjmCCBAXAhl3sOjrTVHuU9/uup7dCjLci3nbYmY6ZRxbE101cIpd3IJ2KEVjPDI8cPVd1/ovqsP3ZMOWq3MjZ0Gd7mCUw0W9wluljBBefJ2Ah2p8Wx/um3Bf58blm X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: List-Subscribe: List-Unsubscribe: On Thu, Apr 10, 2025 at 11:56:44AM +0200, Radim Krčmář wrote: >2025-03-14T14:39:29-07:00, Deepak Gupta : >> As discussed extensively in the changelog for the addition of this >> syscall on x86 ("x86/shstk: Introduce map_shadow_stack syscall") the >> existing mmap() and madvise() syscalls do not map entirely well onto the >> security requirements for shadow stack memory since they lead to windows >> where memory is allocated but not yet protected or stacks which are not >> properly and safely initialised. Instead a new syscall map_shadow_stack() >> has been defined which allocates and initialises a shadow stack page. >> >> This patch implements this syscall for riscv. riscv doesn't require token >> to be setup by kernel because user mode can do that by itself. However to >> provide compatibility and portability with other architectues, user mode >> can specify token set flag. > >RISC-V shadow stack could use mmap() and madvise() perfectly well. Deviating from what other arches are doing will create more thrash. I expect there will be merging of common logic between x86, arm64 and riscv. Infact I did post one such RFC patch set last year (didn't follow up on it). Using `mmap/madvise` defeats that purpose of creating common logic between arches. There are pitfalls as mentioned with respect to mmap/madivse because of unique nature of shadow stack. And thus it was accepted to create a new syscall to create such mappings. RISC-V will stick to that. >Userspace can always initialize the shadow stack properly and the shadow >stack memory is never protected from other malicious threads. Shadow stack memory is protected from inadvertent stores (be it same thread or a different thread in same address space). Malicious code which can do `sspush/ssamoswap` already assumes that code integrity policies are broken. > >I think that the compatibility argument is reasonable. We'd need to >modify the other syscalls to allow a write-only mapping anyway. > >> diff --git a/arch/riscv/kernel/usercfi.c b/arch/riscv/kernel/usercfi.c >> +static noinline unsigned long amo_user_shstk(unsigned long *addr, unsigned long val) >> +{ >> + /* >> + * Never expect -1 on shadow stack. Expect return addresses and zero >> + */ >> + unsigned long swap = -1; >> + __enable_user_access(); >> + asm goto( >> + ".option push\n" >> + ".option arch, +zicfiss\n" > >Shouldn't compiler accept ssamoswap.d opcode even without zicfiss arch? Its illegal instruction if shadow stack aren't available. Current toolchain emits it only if zicfiss is specified in march. > >> + "1: ssamoswap.d %[swap], %[val], %[addr]\n" >> + _ASM_EXTABLE(1b, %l[fault]) >> + RISCV_ACQUIRE_BARRIER > >Why is the barrier here? IIRC, I was following `arch_cmpxchg_acquire`. But I think that's not needed. What we are doing is `arch_xchg_relaxed` and barrier is not needed. I did consider adding it to arch/riscv/include/asm/cmpxchg.h but there is limited usage of this primitive and thus kept it limited to usercfi.c Anyways I'll re-spin removing the barrier. > >> + ".option pop\n" >> + : [swap] "=r" (swap), [addr] "+A" (*addr) >> + : [val] "r" (val) >> + : "memory" >> + : fault >> + ); >> + __disable_user_access(); >> + return swap; >> +fault: >> + __disable_user_access(); >> + return -1; > >I think we should return 0 and -EFAULT. >We can ignore the swapped value, or return it through a pointer. Consumer of this detects -1 and then return -EFAULT. We would eventually need this when creating shadow stack tokens for kernel shadow stack. I believe `-1` is safe return value which can't be construed as negative kernel address (-EFAULT will be) > >> +} >> + >> +static unsigned long allocate_shadow_stack(unsigned long addr, unsigned long size, >> + unsigned long token_offset, bool set_tok) >> +{ >> + int flags = MAP_ANONYMOUS | MAP_PRIVATE; > >Is MAP_GROWSDOWN pointless? Not sure. Didn't see that in x86 or arm64 shadow stack creation. Let me know if its useful. > >> + struct mm_struct *mm = current->mm; >> + unsigned long populate, tok_loc = 0; >> + >> + if (addr) >> + flags |= MAP_FIXED_NOREPLACE; >> + >> + mmap_write_lock(mm); >> + addr = do_mmap(NULL, addr, size, PROT_READ, flags, > >PROT_READ implies VM_READ, so won't this select PAGE_COPY in the >protection_map instead of PAGE_SHADOWSTACK? PROT_READ is pointless here and redundant. I haven't checked if I remove it what happens. `VM_SHADOW_STACK` takes precedence (take a look at pte_mkwrite and pmd_mkwrite. Only way `VM_SHADOW_STACK` is possible in vmflags is via `map_shadow_stack` or `fork/clone` on existing task with shadow stack enabled. In a nutshell user can't specify `VM_SHADOW_STACK` directly (indirectly via map_shadow_stack syscall or fork/clone) . But if set in vmaflags then it'll take precedence. > >Wouldn't avoiding VM_READ also allow us to get rid of the ugly hack in >pte_mkwrite? (VM_WRITE would naturally select the right XWR flags.) > >> + VM_SHADOW_STACK | VM_WRITE, 0, &populate, NULL); >> + mmap_write_unlock(mm); >> + >> +SYSCALL_DEFINE3(map_shadow_stack, unsigned long, addr, unsigned long, size, unsigned int, flags) >> +{ >> [...] >> + if (addr && (addr & (PAGE_SIZE - 1))) > >if (!PAGE_ALIGNED(addr))