From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by smtp.lore.kernel.org (Postfix) with ESMTP id 34A92C369AB for ; Thu, 24 Apr 2025 18:16:28 +0000 (UTC) Received: by kanga.kvack.org (Postfix) id 6C9D96B00C0; Thu, 24 Apr 2025 14:16:27 -0400 (EDT) Received: by kanga.kvack.org (Postfix, from userid 40) id 64EC76B00C3; Thu, 24 Apr 2025 14:16:27 -0400 (EDT) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id 42EB56B00D5; Thu, 24 Apr 2025 14:16:27 -0400 (EDT) X-Delivered-To: linux-mm@kvack.org Received: from relay.hostedemail.com (smtprelay0013.hostedemail.com [216.40.44.13]) by kanga.kvack.org (Postfix) with ESMTP id 1E29A6B00C0 for ; Thu, 24 Apr 2025 14:16:27 -0400 (EDT) Received: from smtpin08.hostedemail.com (a10.router.float.18 [10.200.18.1]) by unirelay09.hostedemail.com (Postfix) with ESMTP id 026BB817A5 for ; Thu, 24 Apr 2025 18:16:26 +0000 (UTC) X-FDA: 83369742414.08.A617630 Received: from mail-pf1-f182.google.com (mail-pf1-f182.google.com [209.85.210.182]) by imf16.hostedemail.com (Postfix) with ESMTP id 8F78E180013 for ; Thu, 24 Apr 2025 18:16:24 +0000 (UTC) Authentication-Results: imf16.hostedemail.com; dkim=pass header.d=rivosinc-com.20230601.gappssmtp.com header.s=20230601 header.b=FjxGZMey; spf=pass (imf16.hostedemail.com: domain of debug@rivosinc.com designates 209.85.210.182 as permitted sender) smtp.mailfrom=debug@rivosinc.com; dmarc=none ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=hostedemail.com; s=arc-20220608; t=1745518584; h=from:from:sender:reply-to:subject:subject:date:date: message-id:message-id:to:to:cc:cc:mime-version:mime-version: content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references:dkim-signature; bh=T9sHYezjUq6l/x/6C81DhRiO6Hkx3zNHx6RVEna5xiI=; b=fbzZXPI452pAWqx9Mu5YSd0iDwIJRcmB7AkzSoe/hxLRGdIjTgn6iKicB4WCn20SgFMN1j GtZrsG9yhP0hLkcNa9LPj2my2FAxmhEt0D3Z62ELInoKu4IzYrZYnc2+rCq5B9TpvSjXnN gBDVLxBW/cg2wPtPaRYmknmQSyq38yk= ARC-Authentication-Results: i=1; imf16.hostedemail.com; dkim=pass header.d=rivosinc-com.20230601.gappssmtp.com header.s=20230601 header.b=FjxGZMey; spf=pass (imf16.hostedemail.com: domain of debug@rivosinc.com designates 209.85.210.182 as permitted sender) smtp.mailfrom=debug@rivosinc.com; dmarc=none ARC-Seal: i=1; s=arc-20220608; d=hostedemail.com; t=1745518584; a=rsa-sha256; cv=none; b=oinKjqZCp0pezCEzLeqdFfeqCs/4KplAD51PqKObp2Y+DUEiYwghGWPeASTYJKN7gkaszA uc6NOq2BW+H2GK5ZJ/C18dDfLJnJ4YSxMOWKz0+gX3n1EkvYsQBarFEszFK5TXR3zcIhFv 18KH8v5ziSq87sghU2s/bFNEd0gTNdM= Received: by mail-pf1-f182.google.com with SMTP id d2e1a72fcca58-736b0c68092so1161780b3a.0 for ; Thu, 24 Apr 2025 11:16:24 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=rivosinc-com.20230601.gappssmtp.com; s=20230601; t=1745518583; x=1746123383; darn=kvack.org; h=in-reply-to:content-transfer-encoding:content-disposition :mime-version:references:message-id:subject:cc:to:from:date:from:to :cc:subject:date:message-id:reply-to; bh=T9sHYezjUq6l/x/6C81DhRiO6Hkx3zNHx6RVEna5xiI=; b=FjxGZMey/1pr8DNHXAmqbwmMTr0IK24DhVz7TyP29xpRkvJwVYUQywyzrssrNORIKm YufjWFMiQfFXNTIJ+JVY0oHzeIhx6+4YdVsfB50QSVuEnf1hSzy5tzwcEkgNTbFEZ7tN VYv91Z26OP5KYs/0MuUneE+wT1lpFpeCMz8T5/GQBuIPfaxdW/VyRNPEAXjWCxPG1DRm mrc3ESSOkmMNDoe/xJok7mwJhK3ab3Pziu7R8dn7btH8CeD81BkAFO8s0AMC6saqWpFN fkF1kq1JK94C2Od4dSunjnAFR8IyKNWpVQZ2FIeL1HQoiAjAyzuAZma7o5JBC7mHd7ws qKig== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1745518583; x=1746123383; h=in-reply-to:content-transfer-encoding:content-disposition :mime-version:references:message-id:subject:cc:to:from:date :x-gm-message-state:from:to:cc:subject:date:message-id:reply-to; bh=T9sHYezjUq6l/x/6C81DhRiO6Hkx3zNHx6RVEna5xiI=; b=OCVLLpFORxjxvYYNOcO1bvvXhI8VOrSb1X/1uSWbYAU+9O2I7mLQiNTOKzUigMjVK1 cbjiCLehf5EkK7+8H0yKMK7YnUvJ0aNJU5eUplRArjCDNhrT2QrLboyHJXDjT507wKyY Pq0JSALZafA7Dt51KD+JbA9Cn41xCdEymPAw7aR1tFz5It5ecpoLDWQty6fUVbUyst5A Dn2lyI65pijXJODT8lYZUCPVHa5VYp7TzyuzmlyoKbrmtHQEPuZZMLLS+xpIXUC/ugJD zWC6uUxwW9EyAB3IR68GY6C7KUiSLzegRWWrpKpvgcQuKYyB1gU9oSejA/hpJr97hhXy Qylw== X-Forwarded-Encrypted: i=1; AJvYcCWsvOrWFxjtbHBytvkL/4moaCgr6rE15FFCgc8K1f7gS7HxDRSo127upCg265+7aMW5kvrPwGdiHA==@kvack.org X-Gm-Message-State: AOJu0YwVt8S2iogFgBaM/pjcw7ct3f0X6BZm1y3B7126CU3f3inNWHty v0kOHTIIgx+JYdwCz82pT9oHI6hgi2qtbUxGYtUbTIkvnrSMrBwM3fydqrKkbPM= X-Gm-Gg: ASbGnctN0vSGg3mP4/UQhXh/raxrGrq+7f6Ofuh05GFh7AQNGAHA1s6bYaTA/j6wEge 1uT1U9z84+xMHlH8/3zmo0Zse4P+pX5bBkovptFJDgVVzijiPnOFqt0R8cCCFcdeDcYvi+6APFF fEEV7AJC68X50kUkCxDVE1oht+AYuoIgtnQ1bQuU7db6WurnH60Q3ySm96+Gvd/nIf3mJLbHhdh QST8+E7WvQb4DKxDQVl2xeswR0gcYa13oYldq28t7IHwqO4E8pSvTdzrHiHUUEcbcNY2VE4mb44 c84l/J+kwD86+LRbmfFIt2rhVdVBSzmlDFESsX5THr2CpYZ0L3c= X-Google-Smtp-Source: AGHT+IF0jVzUPU4VWGuVNQ17BhULhOztKBekv2BW5HkIJsQ1IRCAo1z8B5DG3h77AtgqUMysW+802Q== X-Received: by 2002:aa7:9315:0:b0:736:9e40:13b1 with SMTP id d2e1a72fcca58-73e24ae7b45mr5236452b3a.23.1745518583417; Thu, 24 Apr 2025 11:16:23 -0700 (PDT) Received: from debug.ba.rivosinc.com ([64.71.180.162]) by smtp.gmail.com with ESMTPSA id d2e1a72fcca58-73e259414c7sm1783697b3a.62.2025.04.24.11.16.20 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Thu, 24 Apr 2025 11:16:23 -0700 (PDT) Date: Thu, 24 Apr 2025 11:16:19 -0700 From: Deepak Gupta To: Radim =?utf-8?B?S3LEjW3DocWZ?= Cc: Thomas Gleixner , Ingo Molnar , Borislav Petkov , Dave Hansen , x86@kernel.org, "H. Peter Anvin" , Andrew Morton , "Liam R. Howlett" , Vlastimil Babka , Lorenzo Stoakes , Paul Walmsley , Palmer Dabbelt , Albert Ou , Conor Dooley , Rob Herring , Krzysztof Kozlowski , Arnd Bergmann , Christian Brauner , Peter Zijlstra , Oleg Nesterov , Eric Biederman , Kees Cook , Jonathan Corbet , Shuah Khan , Jann Horn , Conor Dooley , linux-kernel@vger.kernel.org, linux-fsdevel@vger.kernel.org, linux-mm@kvack.org, linux-riscv@lists.infradead.org, devicetree@vger.kernel.org, linux-arch@vger.kernel.org, linux-doc@vger.kernel.org, linux-kselftest@vger.kernel.org, alistair.francis@wdc.com, richard.henderson@linaro.org, jim.shu@sifive.com, andybnac@gmail.com, kito.cheng@sifive.com, charlie@rivosinc.com, atishp@rivosinc.com, evan@rivosinc.com, cleger@rivosinc.com, alexghiti@rivosinc.com, samitolvanen@google.com, broonie@kernel.org, rick.p.edgecombe@intel.com, linux-riscv Subject: Re: [PATCH v12 12/28] riscv: Implements arch agnostic shadow stack prctls Message-ID: References: <20250314-v5_user_cfi_series-v12-0-e51202b53138@rivosinc.com> <20250314-v5_user_cfi_series-v12-12-e51202b53138@rivosinc.com> MIME-Version: 1.0 Content-Type: text/plain; charset=utf-8; format=flowed Content-Disposition: inline Content-Transfer-Encoding: 8bit In-Reply-To: X-Stat-Signature: deuigfhdk4n1tythyfb3azc5fbycp4nf X-Rspamd-Server: rspam01 X-Rspamd-Queue-Id: 8F78E180013 X-Rspam-User: X-HE-Tag: 1745518584-339244 X-HE-Meta: U2FsdGVkX19Az+Pi3ughjYxhelghA7PcW7DJdjEnc0tm4yut2vuXwai7ox6pmQvonP0vCheViBCVSoy45pGcyiCn1zCNLg1Neddxl5JxFE99xwTVvoHxZvttn44G5DXHYQD2PBOfSy1256SNgUrL05eMbpeLwIsC9Qfg+x3170APQu1mxhUjoX4G+F02GDFj3sZSYRJA27wplU/Qy6IICAp2K2+VD7Mmt3gWHDNr8raDRTIiOHY26jGwNeUhVMNx2A5MV3hMJd0av3CifGHFlTd7NJhpOk1H2mJhjMZ+rM85wBsYAi+USho4GRSeSIlO2liQhDnbBuQ9d6KoM8d9T60UngG1tPcbHg9+5bYivqSt2xM2mEKIMbRmrdZA8qThTN3wC8F6jm350/l2dxRUhBdXezNwu/Lpxy8EISA0hE6fZZQDIXV3onk3CNo9RTGZGsp7SSdxLt45fHjmibZDcb8cYWRhzQIsht5Wqihze+WmVtGhGdhzyQhi8Imq5Sz9fyymOWmWLj+TjqshVd6tiWPH59taRJxUU5H3V+vDuCYuzAXPU8nLCpNdi5MCDLekFcVpe/fi/vi4rVbidexzQdi9vRPhUQhU5l2A9iPMLfWTWDISgTTC+hW5a86KWqbCd9exTnHyvOsHUV0Gmijm5VnOGVGX7//CalIv7pY9pKijKbJb4p/itSUnDgGYI5iX/CP3wMp9Wl1jQq/XdejUIt5xcH4Gn8uocSu3PA0OcCNgimCzq4XthOO2vbH5rHTCVWLf7BoTppaicUsUe3mZh0qLPDXZSUPo1eW6fpETC5tVYGuUV26wduQXhxH2u8P7jj9vfN8pSPe0bpkTN+jgzITZtnWtlZOL7oNxRGWOcab5IhLAQRK4vGsyxolh+a/k10vgFHA9lK6LqOANi8e4GCycZes6XPB2kC1jvXCD6ibojOXCIrX93LNivoSVMs5NDdwxdV7ffMAszEq9BVk 2Qv24IfV s+9ZYScALrFR9yljhDmEYpFQdGwTE00/09kJeEaPnjAbmJmEuFZQ5wZ0ouB2R5zPojba7HLswIoJ41z67va0RlVAs+3dY/1w/712DgOgX8ihEbwwRm5Pd0k3mu7qebGjp7V7pO2/GmHRLnSSMFQLwwMAA6yN5zkn72RV2bpIp92qcEhSYXxAW8sYzH3lIQ/R/2K/oLeQ+CWa20p1KMDO1jTQ5qbwm8Oo1NQzZ7E9RGT4w/1vwuvQZPSmNCA9ERsR+NKZXk5b7+nop2aCzEVnAaFgh9oz4yF5+qljCSiQOwqKJsqb2wHAvVs3++MTLttqZEdiOrU6AYQABn3l2U1M+g053YtTWem5WDN9/rUO80K9NtLNhbuZ3c0gw7d4FGPtM77nZNRHMV8knExqGKfJ9rTD1id1/I6K/qr08VIsfRjtkqvPK929CfKqkvmdv+Xt6E0nC8VV9NedoHOnj6ap9gPjdMtDCU77mTYL9z35/5TltwhR2HOnFQeDgU7zQXlkn5JHQsZ19KOh5xoIIJe5MIIIEYNff3aKvLhlgJ9awsdfBexk= X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: List-Subscribe: List-Unsubscribe: On Thu, Apr 24, 2025 at 03:36:54PM +0200, Radim Krčmář wrote: >2025-04-23T21:44:09-07:00, Deepak Gupta : >> On Thu, Apr 10, 2025 at 11:45:58AM +0200, Radim Krčmář wrote: >>>2025-03-14T14:39:31-07:00, Deepak Gupta : >>>> diff --git a/arch/riscv/include/asm/usercfi.h b/arch/riscv/include/asm/usercfi.h >>>> @@ -14,7 +15,8 @@ struct kernel_clone_args; >>>> struct cfi_status { >>>> unsigned long ubcfi_en : 1; /* Enable for backward cfi. */ >>>> - unsigned long rsvd : ((sizeof(unsigned long) * 8) - 1); >>>> + unsigned long ubcfi_locked : 1; >>>> + unsigned long rsvd : ((sizeof(unsigned long) * 8) - 2); >>> >>>The rsvd field shouldn't be necessary as the container for the bitfield >>>is 'unsigned long' sized. >>> >>>Why don't we use bools here, though? >>>It might produce a better binary and we're not hurting for struct size. >> >> If you remember one of the previous patch discussion, this goes into >> `thread_info` Don't want to bloat it. Even if we end shoving into task_struct, >> don't want to bloat that either. I can just convert it into bitmask if >> bitfields are an eyesore here. > > "unsigned long rsvd : ((sizeof(unsigned long) * 8) - 2);" > >is an eyesore that defines exactly the same as the two lines alone > > unsigned long ubcfi_en : 1; > unsigned long ubcfi_locked : 1; > >That one should be removed. > >If we have only 4 bits in 4/8 bytes, then bitfields do generate worse >code than 4 bools and a 0/4 byte hole. The struct size stays the same. > >I don't care much about the switch to bools, though, because this code >is not called often. I'll remove the bitfields, have single `unsigned long cfi_control_state` And do `#define RISCV_UBCFI_EN 1` and so on. > >>>> @@ -262,3 +292,83 @@ void shstk_release(struct task_struct *tsk) >>>> +int arch_set_shadow_stack_status(struct task_struct *t, unsigned long status) >>>> +{ >>>> + /* Request is to enable shadow stack and shadow stack is not enabled already */ >>>> + if (enable_shstk && !is_shstk_enabled(t)) { >>>> + /* shadow stack was allocated and enable request again >>>> + * no need to support such usecase and return EINVAL. >>>> + */ >>>> + if (is_shstk_allocated(t)) >>>> + return -EINVAL; >>>> + >>>> + size = calc_shstk_size(0); >>>> + addr = allocate_shadow_stack(0, size, 0, false); >>> >>>Why don't we use the userspace-allocated stack? >>> >>>I'm completely missing the design idea here... Userspace has absolute >>>over the shadow stack pointer CSR, so we don't need to do much in Linux: >>> >>>1. interface to set up page tables with -W- PTE and >>>2. interface to control senvcfg.SSE. >>> >>>Userspace can do the rest. >> >> Design is like following: >> >> When a user task wants to enable shadow stack for itself, it has to issue >> a syscall to kernel (like this prctl). Now it can be done independently by >> user task by first issuing `map_shadow_stack`, then asking kernel to light >> up envcfg bit and eventually when return to usermode happens, it can write >> to CSR. It is no different from doing all of the above together in single >> `prctl` call. They are equivalent in that nature. >> >> Background is that x86 followed this because x86 had workloads/binaries/ >> functions with (deep)recursive functions and thus by default were forced >> to always allocate shadow stack to be of the same size as data stack. To >> reduce burden on userspace for determining and then allocating same size >> (size of data stack) shadow stack, prctl would do the job of calculating >> default shadow stack size (and reduce programming error in usermode). arm64 >> followed the suite. I don't want to find out what's the compatiblity issues >> we will see and thus just following the suite (given that both approaches >> are equivalent). Take a look at static `calc_shstk_size(unsigned long size)`. >> >> Coming back to your question of why not allowing userspace to manage its >> own shadow stack. Answer is that it can manage its own shadow stack. If it >> does, it just have to be aware of size its allocating for shadow stack. > >It's just that userspace cannot prevent allocation of the default stack >when enabling it, which is the weird part to me. >The allocate and enable syscalls could have been nicely composable. > >> There is already a patch series going on to manage this using clone3. >> https://lore.kernel.org/all/20250408-clone3-shadow-stack-v15-4-3fa245c6e3be@kernel.org/ > >A new ioctl does seem to solve most of the practical issues, thanks. > >> I fully expect green thread implementations in rust/go or swapcontext >> based thread management doing this on their own. >> >> Current design is to ensure existing apps dont have to change a lot in >> userspace and by default kernel gives compatibility. Anyone else wanting >> to optimize the usage of shadow stack can do so with current design. > >Right, changing rlimit_stack around shadow stack allocation is not the >most elegant way, but it does work. > >>>> +int arch_lock_shadow_stack_status(struct task_struct *task, >>>> + unsigned long arg) >>>> +{ >>>> + /* If shtstk not supported or not enabled on task, nothing to lock here */ >>>> + if (!cpu_supports_shadow_stack() || >>>> + !is_shstk_enabled(task) || arg != 0) >>>> + return -EINVAL; >>> >>>The task might want to prevent shadow stack from being enabled? >> >> But Why would it want to do that? Task can simply not issue the prctl. There >> are glibc tunables as well using which it can be disabled. > >The task might do it as some last resort to prevent a buggy code from >enabling shadow stacks that would just crash. Or whatever complicated >reason userspace can think of. > >It's more the other way around. I wonder why we're removing this option >when we don't really care what userspace does to itself. >I think it's complicating the kernel without an obvious gain. It just feels wierd. There isn't anything like this for other features lit-up via envcfg. Does hwprobe allow this on per-task basis? I'll look into it.