From: Jiri Olsa
To: stable@vger.kernel.org
Cc: Linus Torvalds, Masami Hiramatsu, x86@kernel.org, linux-mm@kvack.org,
 bpf@vger.kernel.org, linux-kernel@vger.kernel.org, Tsahee Zidenberg,
 Andrii Nakryiko, Christoph Hellwig, Daniel Borkmann, Thomas Gleixner,
 Mahé Tardy, linux-arm-kernel@lists.infradead.org
Subject: [RFC PATCH stable 5.4 1/8] uaccess: Add strict non-pagefault kernel-space read function
Date: Mon, 22 May 2023 22:33:45 +0200
Message-Id: <20230522203352.738576-2-jolsa@kernel.org>
X-Mailer: git-send-email 2.40.1
In-Reply-To: <20230522203352.738576-1-jolsa@kernel.org>
References: <20230522203352.738576-1-jolsa@kernel.org>
MIME-Version: 1.0
Content-Transfer-Encoding: 8bit

From: Daniel Borkmann

commit 75a1a607bb7e6d918be3aca11ec2214a275392f4 upstream.

Add two new probe_kernel_read_strict() and strncpy_from_unsafe_strict()
helpers which by default alias to the __probe_kernel_read() and the
__strncpy_from_unsafe(), respectively, but can be overridden by archs
which have non-overlapping address ranges for kernel space and user
space in order to bail out with -EFAULT when attempting to probe user
memory including non-canonical user access addresses [0]:

  4-level page tables:
    user-space mem: 0x0000000000000000 - 0x00007fffffffffff
    non-canonical:  0x0000800000000000 - 0xffff7fffffffffff

  5-level page tables:
    user-space mem: 0x0000000000000000 - 0x00ffffffffffffff
    non-canonical:  0x0100000000000000 - 0xfeffffffffffffff

The idea is that these helpers are complementary to the probe_user_read()
and strncpy_from_unsafe_user() which probe user-only memory. Both added
helpers here do the same, but for kernel-only addresses.

Both set of helpers are going to be used for BPF tracing. They also
explicitly avoid throwing the splat for non-canonical user addresses from
00c42373d397 ("x86-64: add warning for non-canonical user access address
dereferences").

For compat, the current probe_kernel_read() and strncpy_from_unsafe() are
left as-is.

[0] Documentation/x86/x86_64/mm.txt

Signed-off-by: Daniel Borkmann
Signed-off-by: Alexei Starovoitov
Cc: Linus Torvalds
Cc: Masami Hiramatsu
Cc: x86@kernel.org
Link: https://lore.kernel.org/bpf/eefeefd769aa5a013531f491a71f0936779e916b.1572649915.git.daniel@iogearbox.net
---
 arch/x86/mm/Makefile    |  2 +-
 arch/x86/mm/maccess.c   | 43 +++++++++++++++++++++++++++++++++++++++++
 include/linux/uaccess.h |  4 ++++
 mm/maccess.c            | 25 +++++++++++++++++++++++-
 4 files changed, 72 insertions(+), 2 deletions(-)
 create mode 100644 arch/x86/mm/maccess.c

diff --git a/arch/x86/mm/Makefile b/arch/x86/mm/Makefile
index 84373dc9b341..bbc68a54795e 100644
--- a/arch/x86/mm/Makefile
+++ b/arch/x86/mm/Makefile
@@ -13,7 +13,7 @@ CFLAGS_REMOVE_mem_encrypt_identity.o = -pg
 endif
 
 obj-y := init.o init_$(BITS).o fault.o ioremap.o extable.o pageattr.o mmap.o \
-	    pat.o pgtable.o physaddr.o setup_nx.o tlb.o cpu_entry_area.o
+	    pat.o pgtable.o physaddr.o setup_nx.o tlb.o cpu_entry_area.o maccess.o
 
 # Make sure __phys_addr has no stackprotector
 nostackp := $(call cc-option, -fno-stack-protector)
diff --git a/arch/x86/mm/maccess.c b/arch/x86/mm/maccess.c
new file mode 100644
index 000000000000..f5b85bdc0535
--- /dev/null
+++ b/arch/x86/mm/maccess.c
@@ -0,0 +1,43 @@
+// SPDX-License-Identifier: GPL-2.0-only
+
+#include
+#include
+
+#ifdef CONFIG_X86_64
+static __always_inline u64 canonical_address(u64 vaddr, u8 vaddr_bits)
+{
+	return ((s64)vaddr << (64 - vaddr_bits)) >> (64 - vaddr_bits);
+}
+
+static __always_inline bool invalid_probe_range(u64 vaddr)
+{
+	/*
+	 * Range covering the highest possible canonical userspace address
+	 * as well as non-canonical address range. For the canonical range
+	 * we also need to include the userspace guard page.
+	 */
+	return vaddr < TASK_SIZE_MAX + PAGE_SIZE ||
+	       canonical_address(vaddr, boot_cpu_data.x86_virt_bits) != vaddr;
+}
+#else
+static __always_inline bool invalid_probe_range(u64 vaddr)
+{
+	return vaddr < TASK_SIZE_MAX;
+}
+#endif
+
+long probe_kernel_read_strict(void *dst, const void *src, size_t size)
+{
+	if (unlikely(invalid_probe_range((unsigned long)src)))
+		return -EFAULT;
+
+	return __probe_kernel_read(dst, src, size);
+}
+
+long strncpy_from_unsafe_strict(char *dst, const void *unsafe_addr, long count)
+{
+	if (unlikely(invalid_probe_range((unsigned long)unsafe_addr)))
+		return -EFAULT;
+
+	return __strncpy_from_unsafe(dst, unsafe_addr, count);
+}
diff --git a/include/linux/uaccess.h b/include/linux/uaccess.h
index 70941f49d66e..25ae650dcb1a 100644
--- a/include/linux/uaccess.h
+++ b/include/linux/uaccess.h
@@ -315,6 +315,7 @@ copy_struct_from_user(void *dst, size_t ksize, const void __user *src,
  * happens, handle that and return -EFAULT.
  */
 extern long probe_kernel_read(void *dst, const void *src, size_t size);
+extern long probe_kernel_read_strict(void *dst, const void *src, size_t size);
 extern long __probe_kernel_read(void *dst, const void *src, size_t size);
 
 /*
@@ -354,6 +355,9 @@ extern long notrace probe_user_write(void __user *dst, const void *src, size_t s
 extern long notrace __probe_user_write(void __user *dst, const void *src, size_t size);
 
 extern long strncpy_from_unsafe(char *dst, const void *unsafe_addr, long count);
+extern long strncpy_from_unsafe_strict(char *dst, const void *unsafe_addr,
+				       long count);
+extern long __strncpy_from_unsafe(char *dst, const void *unsafe_addr, long count);
 extern long strncpy_from_unsafe_user(char *dst, const void __user *unsafe_addr,
 				     long count);
 extern long strnlen_unsafe_user(const void __user *unsafe_addr, long count);
diff --git a/mm/maccess.c b/mm/maccess.c
index 2d3c3d01064c..3ca8d97e5010 100644
--- a/mm/maccess.c
+++ b/mm/maccess.c
@@ -43,11 +43,20 @@ probe_write_common(void __user *dst, const void *src, size_t size)
  * do_page_fault() doesn't attempt to take mmap_sem. This makes
  * probe_kernel_read() suitable for use within regions where the caller
  * already holds mmap_sem, or other locks which nest inside mmap_sem.
+ *
+ * probe_kernel_read_strict() is the same as probe_kernel_read() except for
+ * the case where architectures have non-overlapping user and kernel address
+ * ranges: probe_kernel_read_strict() will additionally return -EFAULT for
+ * probing memory on a user address range where probe_user_read() is supposed
+ * to be used instead.
  */
 
 long __weak probe_kernel_read(void *dst, const void *src, size_t size)
     __attribute__((alias("__probe_kernel_read")));
 
+long __weak probe_kernel_read_strict(void *dst, const void *src, size_t size)
+    __attribute__((alias("__probe_kernel_read")));
+
 long __probe_kernel_read(void *dst, const void *src, size_t size)
 {
 	long ret;
@@ -157,8 +166,22 @@ EXPORT_SYMBOL_GPL(probe_user_write);
  *
  * If @count is smaller than the length of the string, copies @count-1 bytes,
  * sets the last byte of @dst buffer to NUL and returns @count.
+ *
+ * strncpy_from_unsafe_strict() is the same as strncpy_from_unsafe() except
+ * for the case where architectures have non-overlapping user and kernel address
+ * ranges: strncpy_from_unsafe_strict() will additionally return -EFAULT for
+ * probing memory on a user address range where strncpy_from_unsafe_user() is
+ * supposed to be used instead.
  */
-long strncpy_from_unsafe(char *dst, const void *unsafe_addr, long count)
+
+long __weak strncpy_from_unsafe(char *dst, const void *unsafe_addr, long count)
+    __attribute__((alias("__strncpy_from_unsafe")));
+
+long __weak strncpy_from_unsafe_strict(char *dst, const void *unsafe_addr,
+				       long count)
+    __attribute__((alias("__strncpy_from_unsafe")));
+
+long __strncpy_from_unsafe(char *dst, const void *unsafe_addr, long count)
 {
 	mm_segment_t old_fs = get_fs();
 	const void *src = unsafe_addr;
-- 
2.40.1
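
The range check done by invalid_probe_range() on x86-64 can be tried out in
user space. What follows is a minimal sketch, not kernel code. It assumes
4-level page tables (48 virtual address bits) and a 4 KiB page size, so the
kernel's TASK_SIZE_MAX + PAGE_SIZE is hard-coded as 1 << 47. The names
VADDR_BITS and USER_LIMIT are made up for illustration. Unlike the patch, it
shifts left as unsigned before the signed right shift, and it relies on the
compiler treating >> on signed integers as an arithmetic shift, as GCC and
Clang do.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumptions for this sketch (not taken from the patch): */
#define VADDR_BITS 48u                 /* 4-level page tables */
#define USER_LIMIT (UINT64_C(1) << 47) /* stands in for TASK_SIZE_MAX + PAGE_SIZE */

/* Sign-extend vaddr from bit (vaddr_bits - 1) up to bit 63. */
static uint64_t canonical_address(uint64_t vaddr, unsigned int vaddr_bits)
{
	unsigned int shift = 64 - vaddr_bits;

	/* Relies on arithmetic right shift of signed values. */
	return (uint64_t)((int64_t)(vaddr << shift) >> shift);
}

/* True if vaddr lies in the user-space range or is non-canonical. */
static bool invalid_probe_range(uint64_t vaddr)
{
	return vaddr < USER_LIMIT ||
	       canonical_address(vaddr, VADDR_BITS) != vaddr;
}
```

Under these assumptions, invalid_probe_range() is true for 0x00007fffffffffff
(user space) and for 0x0000800000000000 (non-canonical), and false for
0xffff800000000000, the lowest canonical kernel-half address.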