From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-12.7 required=3.0 tests=BAYES_00, HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_PATCH,MAILING_LIST_MULTI,SIGNED_OFF_BY, SPF_HELO_NONE,SPF_PASS,URIBL_BLOCKED,USER_AGENT_GIT autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 73EDDC388F9 for ; Wed, 4 Nov 2020 14:55:51 +0000 (UTC) Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by mail.kernel.org (Postfix) with ESMTP id C574F22409 for ; Wed, 4 Nov 2020 14:55:50 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org C574F22409 Authentication-Results: mail.kernel.org; dmarc=fail (p=none dis=none) header.from=linux.intel.com Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=owner-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix) id 3C25A6B006E; Wed, 4 Nov 2020 09:55:50 -0500 (EST) Received: by kanga.kvack.org (Postfix, from userid 40) id 34A706B0070; Wed, 4 Nov 2020 09:55:50 -0500 (EST) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id 19D9C6B0071; Wed, 4 Nov 2020 09:55:50 -0500 (EST) X-Delivered-To: linux-mm@kvack.org Received: from forelay.hostedemail.com (smtprelay0123.hostedemail.com [216.40.44.123]) by kanga.kvack.org (Postfix) with ESMTP id D07946B006E for ; Wed, 4 Nov 2020 09:55:49 -0500 (EST) Received: from smtpin22.hostedemail.com (10.5.19.251.rfc1918.com [10.5.19.251]) by forelay05.hostedemail.com (Postfix) with ESMTP id 6F2E0181AC9CB for ; Wed, 4 Nov 2020 14:55:49 +0000 (UTC) X-FDA: 77447035218.22.hen04_4c06fe8272c2 Received: from filter.hostedemail.com (10.5.16.251.rfc1918.com [10.5.16.251]) by smtpin22.hostedemail.com (Postfix) with ESMTP id 4AC7C18038E68 for ; Wed, 4 Nov 2020 14:55:49 +0000 (UTC) X-HE-Tag: hen04_4c06fe8272c2 X-Filterd-Recvd-Size: 15914 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by imf13.hostedemail.com (Postfix) with ESMTP for ; Wed, 4 Nov 2020 14:55:48 +0000 (UTC) Received: from suppilovahvero.lan (83-245-197-237.elisa-laajakaista.fi [83.245.197.237]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (No client certificate requested) by mail.kernel.org (Postfix) with ESMTPSA id 7979222456; Wed, 4 Nov 2020 14:55:40 +0000 (UTC) From: Jarkko Sakkinen To: x86@kernel.org, linux-sgx@vger.kernel.org Cc: linux-kernel@vger.kernel.org, Jarkko Sakkinen , linux-security-module@vger.kernel.org, linux-mm@kvack.org, Andrew Morton , Matthew Wilcox , Jethro Beekman , Haitao Huang , Chunyang Hui , Jordan Hand , Nathaniel McCallum , Seth Moore , Darren Kenny , Sean Christopherson , Suresh Siddha , andriy.shevchenko@linux.intel.com, asapek@google.com, bp@alien8.de, cedric.xing@intel.com, chenalexchen@google.com, conradparker@google.com, cyhanish@google.com, dave.hansen@intel.com, haitao.huang@intel.com, kai.huang@intel.com, kai.svahn@intel.com, kmoy@google.com, ludloff@google.com, luto@kernel.org, nhorman@redhat.com, puiterwijk@redhat.com, rientjes@google.com, tglx@linutronix.de, yaozhangx@google.com, mikko.ylinen@intel.com Subject: [PATCH v40 11/24] x86/sgx: Add SGX misc driver interface Date: Wed, 4 Nov 2020 16:54:17 +0200 Message-Id: <20201104145430.300542-12-jarkko.sakkinen@linux.intel.com> X-Mailer: git-send-email 2.27.0 In-Reply-To: <20201104145430.300542-1-jarkko.sakkinen@linux.intel.com> References: <20201104145430.300542-1-jarkko.sakkinen@linux.intel.com> MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: quoted-printable X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: Intel(R) SGX is new hardware functionality that can be used by applicatio= ns to set aside private regions of code and data called enclaves. New hardwa= re protects enclave code and data from outside access and modification. Add a driver that presents a device file and ioctl API to build and manag= e enclaves. Subsequent patches will expend the ioctl()=E2=80=99s functiona= lity. Cc: linux-security-module@vger.kernel.org Cc: linux-mm@kvack.org Cc: Andrew Morton Cc: Matthew Wilcox Acked-by: Jethro Beekman Tested-by: Jethro Beekman Tested-by: Haitao Huang Tested-by: Chunyang Hui Tested-by: Jordan Hand Tested-by: Nathaniel McCallum Tested-by: Seth Moore Tested-by: Darren Kenny Reviewed-by: Darren Kenny Co-developed-by: Sean Christopherson Signed-off-by: Sean Christopherson Co-developed-by: Suresh Siddha Signed-off-by: Suresh Siddha Signed-off-by: Jarkko Sakkinen --- Changes from v39: * Rename /dev/sgx/enclave as /dev/sgx_enclave. * In the page fault handler, do not check for SGX_ENCL_DEAD. This allows to do forensics to the memory of debug enclaves. arch/x86/kernel/cpu/sgx/Makefile | 2 + arch/x86/kernel/cpu/sgx/driver.c | 112 ++++++++++++++++++ arch/x86/kernel/cpu/sgx/driver.h | 16 +++ arch/x86/kernel/cpu/sgx/encl.c | 188 +++++++++++++++++++++++++++++++ arch/x86/kernel/cpu/sgx/encl.h | 46 ++++++++ arch/x86/kernel/cpu/sgx/main.c | 12 +- 6 files changed, 375 insertions(+), 1 deletion(-) create mode 100644 arch/x86/kernel/cpu/sgx/driver.c create mode 100644 arch/x86/kernel/cpu/sgx/driver.h create mode 100644 arch/x86/kernel/cpu/sgx/encl.c create mode 100644 arch/x86/kernel/cpu/sgx/encl.h diff --git a/arch/x86/kernel/cpu/sgx/Makefile b/arch/x86/kernel/cpu/sgx/M= akefile index 79510ce01b3b..3fc451120735 100644 --- a/arch/x86/kernel/cpu/sgx/Makefile +++ b/arch/x86/kernel/cpu/sgx/Makefile @@ -1,2 +1,4 @@ obj-y +=3D \ + driver.o \ + encl.o \ main.o diff --git a/arch/x86/kernel/cpu/sgx/driver.c b/arch/x86/kernel/cpu/sgx/d= river.c new file mode 100644 index 000000000000..248213dea78e --- /dev/null +++ b/arch/x86/kernel/cpu/sgx/driver.c @@ -0,0 +1,112 @@ +// SPDX-License-Identifier: GPL-2.0 +/* Copyright(c) 2016-20 Intel Corporation. */ + +#include +#include +#include +#include +#include +#include +#include "driver.h" +#include "encl.h" + +static int sgx_open(struct inode *inode, struct file *file) +{ + struct sgx_encl *encl; + + encl =3D kzalloc(sizeof(*encl), GFP_KERNEL); + if (!encl) + return -ENOMEM; + + xa_init(&encl->page_array); + mutex_init(&encl->lock); + + file->private_data =3D encl; + + return 0; +} + +static int sgx_release(struct inode *inode, struct file *file) +{ + struct sgx_encl *encl =3D file->private_data; + struct sgx_encl_page *entry; + unsigned long index; + + xa_for_each(&encl->page_array, index, entry) { + if (entry->epc_page) { + sgx_free_epc_page(entry->epc_page); + encl->secs_child_cnt--; + entry->epc_page =3D NULL; + } + + kfree(entry); + } + + xa_destroy(&encl->page_array); + + if (!encl->secs_child_cnt && encl->secs.epc_page) { + sgx_free_epc_page(encl->secs.epc_page); + encl->secs.epc_page =3D NULL; + } + + /* Detect EPC page leak's. */ + WARN_ON_ONCE(encl->secs_child_cnt); + WARN_ON_ONCE(encl->secs.epc_page); + + kfree(encl); + return 0; +} + +static int sgx_mmap(struct file *file, struct vm_area_struct *vma) +{ + struct sgx_encl *encl =3D file->private_data; + int ret; + + ret =3D sgx_encl_may_map(encl, vma->vm_start, vma->vm_end, vma->vm_flag= s); + if (ret) + return ret; + + vma->vm_ops =3D &sgx_vm_ops; + vma->vm_flags |=3D VM_PFNMAP | VM_DONTEXPAND | VM_DONTDUMP | VM_IO; + vma->vm_private_data =3D encl; + + return 0; +} + +static unsigned long sgx_get_unmapped_area(struct file *file, + unsigned long addr, + unsigned long len, + unsigned long pgoff, + unsigned long flags) +{ + if ((flags & MAP_TYPE) =3D=3D MAP_PRIVATE) + return -EINVAL; + + if (flags & MAP_FIXED) + return addr; + + return current->mm->get_unmapped_area(file, addr, len, pgoff, flags); +} + +static const struct file_operations sgx_encl_fops =3D { + .owner =3D THIS_MODULE, + .open =3D sgx_open, + .release =3D sgx_release, + .mmap =3D sgx_mmap, + .get_unmapped_area =3D sgx_get_unmapped_area, +}; + +static struct miscdevice sgx_dev_enclave =3D { + .minor =3D MISC_DYNAMIC_MINOR, + .name =3D "sgx_enclave", + .nodename =3D "sgx_enclave", + .fops =3D &sgx_encl_fops, +}; + +int __init sgx_drv_init(void) +{ + if (!cpu_feature_enabled(X86_FEATURE_SGX_LC)) + return -ENODEV; + + return misc_register(&sgx_dev_enclave); +} diff --git a/arch/x86/kernel/cpu/sgx/driver.h b/arch/x86/kernel/cpu/sgx/d= river.h new file mode 100644 index 000000000000..cda9c43b7543 --- /dev/null +++ b/arch/x86/kernel/cpu/sgx/driver.h @@ -0,0 +1,16 @@ +/* SPDX-License-Identifier: GPL-2.0 */ +#ifndef __ARCH_SGX_DRIVER_H__ +#define __ARCH_SGX_DRIVER_H__ + +#include +#include +#include +#include +#include +#include +#include +#include "sgx.h" + +int sgx_drv_init(void); + +#endif /* __ARCH_X86_SGX_DRIVER_H__ */ diff --git a/arch/x86/kernel/cpu/sgx/encl.c b/arch/x86/kernel/cpu/sgx/enc= l.c new file mode 100644 index 000000000000..d47caa106350 --- /dev/null +++ b/arch/x86/kernel/cpu/sgx/encl.c @@ -0,0 +1,188 @@ +// SPDX-License-Identifier: GPL-2.0 +/* Copyright(c) 2016-20 Intel Corporation. */ + +#include +#include +#include +#include +#include +#include +#include "arch.h" +#include "encl.h" +#include "encls.h" +#include "sgx.h" + +static struct sgx_encl_page *sgx_encl_load_page(struct sgx_encl *encl, + unsigned long addr, + unsigned long vm_flags) +{ + unsigned long vm_prot_bits =3D vm_flags & (VM_READ | VM_WRITE | VM_EXEC= ); + struct sgx_encl_page *entry; + + entry =3D xa_load(&encl->page_array, PFN_DOWN(addr)); + if (!entry) + return ERR_PTR(-EFAULT); + + /* + * Verify that the faulted page has equal or higher build time + * permissions than the VMA permissions (i.e. the subset of {VM_READ, + * VM_WRITE, VM_EXECUTE} in vma->vm_flags). + */ + if ((entry->vm_max_prot_bits & vm_prot_bits) !=3D vm_prot_bits) + return ERR_PTR(-EFAULT); + + /* No page found. */ + if (!entry->epc_page) + return ERR_PTR(-EFAULT); + + /* Entry successfully located. */ + return entry; +} + +static vm_fault_t sgx_vma_fault(struct vm_fault *vmf) +{ + unsigned long addr =3D (unsigned long)vmf->address; + struct vm_area_struct *vma =3D vmf->vma; + struct sgx_encl_page *entry; + unsigned long phys_addr; + struct sgx_encl *encl; + vm_fault_t ret; + + encl =3D vma->vm_private_data; + + mutex_lock(&encl->lock); + + entry =3D sgx_encl_load_page(encl, addr, vma->vm_flags); + if (IS_ERR(entry)) { + mutex_unlock(&encl->lock); + + return VM_FAULT_SIGBUS; + } + + phys_addr =3D sgx_get_epc_phys_addr(entry->epc_page); + + ret =3D vmf_insert_pfn(vma, addr, PFN_DOWN(phys_addr)); + if (ret !=3D VM_FAULT_NOPAGE) { + mutex_unlock(&encl->lock); + + return VM_FAULT_SIGBUS; + } + + mutex_unlock(&encl->lock); + + return VM_FAULT_NOPAGE; +} + +/** + * sgx_encl_may_map() - Check if a requested VMA mapping is allowed + * @encl: an enclave pointer + * @start: lower bound of the address range, inclusive + * @end: upper bound of the address range, exclusive + * @vm_flags: VMA flags + * + * Iterate through the enclave pages contained within [@start, @end) to = verify + * that the permissions requested by a subset of {VM_READ, VM_WRITE, VM_= EXEC} + * does not contain any permissions that are not contained in the build = time + * permissions of any of the enclave pages within the given address rang= e. + * + * An enclave creator must declare the strongest permissions that will b= e + * needed for each enclave page This ensures that mappings have the id= entical + * or weaker permissions that the earlier declared permissions. + * + * Return: 0 on success, -EACCES otherwise + */ +int sgx_encl_may_map(struct sgx_encl *encl, unsigned long start, + unsigned long end, unsigned long vm_flags) +{ + unsigned long vm_prot_bits =3D vm_flags & (VM_READ | VM_WRITE | VM_EXEC= ); + struct sgx_encl_page *page; + unsigned long count =3D 0; + int ret =3D 0; + + XA_STATE(xas, &encl->page_array, PFN_DOWN(start)); + + /* + * Disallow READ_IMPLIES_EXEC tasks as their VMA permissions might + * conflict with the enclave page permissions. + */ + if (current->personality & READ_IMPLIES_EXEC) + return -EACCES; + + mutex_lock(&encl->lock); + xas_lock(&xas); + xas_for_each(&xas, page, PFN_DOWN(end - 1)) { + if (!page) + break; + + if (~page->vm_max_prot_bits & vm_prot_bits) { + ret =3D -EACCES; + break; + } + + /* Reschedule on every XA_CHECK_SCHED iteration. */ + if (!(++count % XA_CHECK_SCHED)) { + xas_pause(&xas); + xas_unlock(&xas); + mutex_unlock(&encl->lock); + + cond_resched(); + + mutex_lock(&encl->lock); + xas_lock(&xas); + } + } + xas_unlock(&xas); + mutex_unlock(&encl->lock); + + return ret; +} + +static int sgx_vma_mprotect(struct vm_area_struct *vma, + struct vm_area_struct **pprev, unsigned long start, + unsigned long end, unsigned long newflags) +{ + int ret; + + ret =3D sgx_encl_may_map(vma->vm_private_data, start, end, newflags); + if (ret) + return ret; + + return mprotect_fixup(vma, pprev, start, end, newflags); +} + +const struct vm_operations_struct sgx_vm_ops =3D { + .fault =3D sgx_vma_fault, + .mprotect =3D sgx_vma_mprotect, +}; + +/** + * sgx_encl_find - find an enclave + * @mm: mm struct of the current process + * @addr: address in the ELRANGE + * @vma: the resulting VMA + * + * Find an enclave identified by the given address. Give back a VMA that= is + * part of the enclave and located in that address. The VMA is given bac= k if it + * is a proper enclave VMA even if an &sgx_encl instance does not exist = yet + * (enclave creation has not been performed). + * + * Return: + * 0 on success, + * -EINVAL if an enclave was not found, + * -ENOENT if the enclave has not been created yet + */ +int sgx_encl_find(struct mm_struct *mm, unsigned long addr, + struct vm_area_struct **vma) +{ + struct vm_area_struct *result; + struct sgx_encl *encl; + + result =3D find_vma(mm, addr); + if (!result || result->vm_ops !=3D &sgx_vm_ops || addr < result->vm_sta= rt) + return -EINVAL; + + encl =3D result->vm_private_data; + *vma =3D result; + + return encl ? 0 : -ENOENT; +} diff --git a/arch/x86/kernel/cpu/sgx/encl.h b/arch/x86/kernel/cpu/sgx/enc= l.h new file mode 100644 index 000000000000..8eb34e95feda --- /dev/null +++ b/arch/x86/kernel/cpu/sgx/encl.h @@ -0,0 +1,46 @@ +/* SPDX-License-Identifier: GPL-2.0 */ +/** + * Copyright(c) 2016-20 Intel Corporation. + * + * Contains the software defined data structures for enclaves. + */ +#ifndef _X86_ENCL_H +#define _X86_ENCL_H + +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include "sgx.h" + +struct sgx_encl_page { + unsigned long desc; + unsigned long vm_max_prot_bits; + struct sgx_epc_page *epc_page; + struct sgx_encl *encl; +}; + +struct sgx_encl { + unsigned long base; + unsigned long size; + unsigned int page_cnt; + unsigned int secs_child_cnt; + struct mutex lock; + struct xarray page_array; + struct sgx_encl_page secs; +}; + +extern const struct vm_operations_struct sgx_vm_ops; + +int sgx_encl_find(struct mm_struct *mm, unsigned long addr, + struct vm_area_struct **vma); +int sgx_encl_may_map(struct sgx_encl *encl, unsigned long start, + unsigned long end, unsigned long vm_flags); + +#endif /* _X86_ENCL_H */ diff --git a/arch/x86/kernel/cpu/sgx/main.c b/arch/x86/kernel/cpu/sgx/mai= n.c index b9ac438a13a4..c2740e0630d1 100644 --- a/arch/x86/kernel/cpu/sgx/main.c +++ b/arch/x86/kernel/cpu/sgx/main.c @@ -9,6 +9,8 @@ #include #include #include +#include "driver.h" +#include "encl.h" #include "encls.h" =20 struct sgx_epc_section sgx_epc_sections[SGX_MAX_EPC_SECTIONS]; @@ -229,9 +231,10 @@ static bool __init sgx_page_cache_init(void) =20 static void __init sgx_init(void) { + int ret; int i; =20 - if (!boot_cpu_has(X86_FEATURE_SGX)) + if (!cpu_feature_enabled(X86_FEATURE_SGX)) return; =20 if (!sgx_page_cache_init()) @@ -240,8 +243,15 @@ static void __init sgx_init(void) if (!sgx_page_reclaimer_init()) goto err_page_cache; =20 + ret =3D sgx_drv_init(); + if (ret) + goto err_kthread; + return; =20 +err_kthread: + kthread_stop(ksgxswapd_tsk); + err_page_cache: for (i =3D 0; i < sgx_nr_epc_sections; i++) { vfree(sgx_epc_sections[i].pages); --=20 2.27.0