From: "H.J. Lu"
Date: Tue, 4 Oct 2022 16:16:28 -0700
Subject: Re: [PATCH v2 28/39] x86/cet/shstk: Introduce map_shadow_stack syscall
To: "Edgecombe, Rick P"
Cc: "keescook@chromium.org", "bsingharora@gmail.com", "hpa@zytor.com",
 "Syromiatnikov, Eugene", "peterz@infradead.org", "rdunlap@infradead.org",
 "dave.hansen@linux.intel.com", "kirill.shutemov@linux.intel.com",
 "Eranian, Stephane", "linux-mm@kvack.org", "fweimer@redhat.com",
 "nadav.amit@gmail.com", "jannh@google.com", "dethoma@microsoft.com",
 "linux-arch@vger.kernel.org", "kcc@google.com", "bp@alien8.de",
 "oleg@redhat.com", "Yang, Weijiang", "Lutomirski, Andy", "pavel@ucw.cz",
 "arnd@arndb.de", "Moreira, Joao", "tglx@linutronix.de",
 "mike.kravetz@oracle.com", "x86@kernel.org", "linux-doc@vger.kernel.org",
 "jamorris@linux.microsoft.com", "john.allen@amd.com", "rppt@kernel.org",
 "mingo@redhat.com", "Shankar, Ravi V", "corbet@lwn.net",
 "linux-kernel@vger.kernel.org", "linux-api@vger.kernel.org",
 "gorcunov@gmail.com"
References: <20220929222936.14584-1-rick.p.edgecombe@intel.com>
 <20220929222936.14584-29-rick.p.edgecombe@intel.com>
 <202210031446.E4AD9EE66@keescook>

On Tue, Oct 4, 2022 at 3:56 PM Edgecombe, Rick P wrote:
>
> On Mon, 2022-10-03 at 15:23 -0700, Kees Cook wrote:
> > On Thu, Sep 29, 2022 at 03:29:25PM -0700, Rick Edgecombe wrote:
> > > [...]
> > > The following example demonstrates how to create a new shadow stack with
> > > map_shadow_stack:
> > > void *shstk = map_shadow_stack(adrr, stack_size, SHADOW_STACK_SET_TOKEN);
> >
> > typo: addr
>
> Yep, thanks.
>
> > >
> > > [...]
> > > +451 common map_shadow_stack sys_map_shadow_stack
> >
> > Isn't this "64", not "common"?
>
> Yes, this should have been changed after dropping 32 bit.

We don't support ia32. But this is used for x32 which is supported.

> >
> > > [...]
> > > +#define SHADOW_STACK_SET_TOKEN 0x1 /* Set up a restore token in the shadow stack */
> >
> > I think this should get an intro comment, like:
> >
> > /* Flags for map_shadow_stack(2) */
> >
> > Also, as with the other UAPI fields, please use "(1ULL << 0)" here.
>
> Ok.
>
> >
> > > @@ -62,24 +63,34 @@ static int create_rstor_token(unsigned long ssp, unsigned long *token_addr)
> > >         if (write_user_shstk_64((u64 __user *)addr, (u64)ssp))
> > >                 return -EFAULT;
> > >
> > > -       *token_addr = addr;
> > > +       if (token_addr)
> > > +               *token_addr = addr;
> > >
> > >         return 0;
> > >  }
> > >
> >
> > Can this just be collapsed into the patch that introduces
> > create_rstor_token()?
>
> I mean, yea, that would be simpler. Breaking the changes apart was left
> over from when the signals placed a token, but didn't need this extra
> bit of functionality.
>
> >
> > > -static unsigned long alloc_shstk(unsigned long size)
> > > +static unsigned long alloc_shstk(unsigned long addr, unsigned long size,
> > > +                                 unsigned long token_offset, bool set_res_tok)
> > >  {
> > >         int flags = MAP_ANONYMOUS | MAP_PRIVATE;
> > >         struct mm_struct *mm = current->mm;
> > > -       unsigned long addr, unused;
> > > +       unsigned long mapped_addr, unused;
> > >
> > >         mmap_write_lock(mm);
> > > -       addr = do_mmap(NULL, addr, size, PROT_READ, flags,
> >
> > Oops, I missed in the other patch that "addr" was being passed here.
> > (uninitialized?)
>
> Argh, yes. I'll initialize in that patch and remove it here.
>
> >
> > > -                      VM_SHADOW_STACK | VM_WRITE, 0, &unused, NULL);
> > > -
> > > +       mapped_addr = do_mmap(NULL, addr, size, PROT_READ, flags,
> > > +                             VM_SHADOW_STACK | VM_WRITE, 0, &unused, NULL);
> >
> > I don't see do_mmap() doing anything here to avoid remapping a prior vma
> > as shstk. Is the intention to allow userspace to convert existing VMAs?
> > This has caused pain in the past, perhaps force MAP_FIXED_NOREPLACE ?
>
> No that is not the intention. It should fail and MAP_FIXED_NOREPLACE
> looks like it will fit the bill. Thanks!
>
> >
> > > [...]
> > > @@ -174,6 +185,7 @@ int shstk_alloc_thread_stack(struct task_struct *tsk, unsigned long clone_flags,
> > >
> > >
> > >         stack_size = PAGE_ALIGN(stack_size);
> > > +       addr = alloc_shstk(0, stack_size, 0, false);
> > >         if (IS_ERR_VALUE(addr))
> > >                 return PTR_ERR((void *)addr);
> > >
> >
> > As mentioned earlier, I was expecting this patch to replace a (missing)
> > call to alloc_shstk. i.e. expecting:
> >
> > -       addr = alloc_shstk(stack_size);
> >
> > > @@ -395,6 +407,26 @@ int shstk_disable(void)
> > >         return 0;
> > >  }
> > >
> > > +
> > > +SYSCALL_DEFINE3(map_shadow_stack, unsigned long, addr, unsigned long, size, unsigned int, flags)
> >
> > Please add kern-doc for this, with some notes. E.g. at least one thing
> > isn't immediately obvious, maybe more: "addr" must be a multiple of 8.
>
> Ok.
>
> >
> > > +{
> > > +       unsigned long aligned_size;
> > > +
> > > +       if (!cpu_feature_enabled(X86_FEATURE_SHSTK))
> > > +               return -ENOSYS;
> >
> > This needs to explicitly reject unknown flags[1], or expanding them in the
> > future becomes very painful:
> >
> >         if (flags & ~(SHADOW_STACK_SET_TOKEN))
> >                 return -EINVAL;
> >
> >
> > [1]
> > https://docs.kernel.org/process/adding-syscalls.html#designing-the-api-planning-for-extension
> >
>
> Ok, good idea.
>
> > > +
> > > +       /*
> > > +        * An overflow would result in attempting to write the restore token
> > > +        * to the wrong location. Not catastrophic, but just return the right
> > > +        * error code and block it.
> > > +        */
> > > +       aligned_size = PAGE_ALIGN(size);
> > > +       if (aligned_size < size)
> > > +               return -EOVERFLOW;
> >
> > The intention here is to allow userspace to ask for _less_ than a page
> > size multiple, and to put the restore token there?
> >
> > Is it worth adding a check for size >= 8 here? Or, I guess it would just
> > immediately crash on the next call?
>
> Funny you should ask...
> The glibc changes were doing this and then
> looking for the token at the end of the length that it passed (not the
> page aligned length). I had changed the kernel at one point to be page
> aligned and then had the fun of debugging the results. I thought, glibc
> is just wasting shadow stack. It should ask for page aligned shadow
> stacks. But HJ argued that the kernel shouldn't second guess what
> userspace is asking for based on HW page size details that don't have
> to do with the software interface. I was convinced by that argument,
> even though glibc is still wasting space.
>
> I could still be convinced the other way though. Glibc still has time
> to (and should) change. But yea, that was actually the intention.

Glibc requests a shadow stack of a given size and expects the restore
token at a specific location: the end of the size it requested, not the
end of the page-aligned mapping. This is how glibc uses the restore token
to switch to the new shadow stack.

> >
> > > +
> > > +       return alloc_shstk(addr, aligned_size, size, flags & SHADOW_STACK_SET_TOKEN);
> > > +}
> >
> >

--
H.J.