Date: Thu, 3 Jul 2025 09:44:17 +0100
From: David Laight
To: "Kirill A. Shutemov"
Cc: Andy Lutomirski , Thomas Gleixner , Ingo Molnar , Borislav Petkov , Dave Hansen , x86@kernel.org, "H. Peter Anvin" , Peter Zijlstra , Ard Biesheuvel , "Paul E. McKenney" , Josh Poimboeuf , Xiongwei Song , Xin Li , "Mike Rapoport (IBM)" , Brijesh Singh , Michael Roth , Tony Luck , Alexey Kardashevskiy , Alexander Shishkin , Jonathan Corbet , Sohil Mehta , Ingo Molnar , Pawan Gupta , Daniel Sneddon , Kai Huang , Sandipan Das , Breno Leitao , Rick Edgecombe , Alexei Starovoitov , Hou Tao , Juergen Gross , Vegard Nossum , Kees Cook , Eric Biggers , Jason Gunthorpe , "Masami Hiramatsu (Google)" , Andrew Morton , Luis Chamberlain , Yuntao Wang , Rasmus Villemoes , Christophe Leroy , Tejun Heo , Changbin Du , Huang Shijie , Geert Uytterhoeven , Namhyung Kim , Arnaldo Carvalho de Melo , linux-doc@vger.kernel.org, linux-kernel@vger.kernel.org, linux-efi@vger.kernel.org, linux-mm@kvack.org
Subject: Re: [PATCHv8 02/17] x86/asm: Introduce inline memcpy and memset
Message-ID: <20250703094417.165e5893@pumpkin>
In-Reply-To: <20250701095849.2360685-3-kirill.shutemov@linux.intel.com>
References: <20250701095849.2360685-1-kirill.shutemov@linux.intel.com> <20250701095849.2360685-3-kirill.shutemov@linux.intel.com>

On Tue, 1 Jul 2025 12:58:31 +0300 "Kirill A.
Shutemov" wrote:

> Extract memcpy and memset functions from copy_user_generic() and
> __clear_user().
>
> They can be used as inline memcpy and memset instead of the GCC builtins
> whenever necessary. LASS requires them to handle text_poke.

Except they contain the fault handlers so aren't generic calls.

>
> Originally-by: Peter Zijlstra
> Link: https://lore.kernel.org/all/20241029184840.GJ14555@noisy.programming.kicks-ass.net/
> Signed-off-by: Kirill A. Shutemov
> ---
>  arch/x86/include/asm/string.h     | 46 +++++++++++++++++++++++++++++++
>  arch/x86/include/asm/uaccess_64.h | 38 +++++++------------------
>  arch/x86/lib/clear_page_64.S      | 13 +++++++--
>  3 files changed, 67 insertions(+), 30 deletions(-)
>
> diff --git a/arch/x86/include/asm/string.h b/arch/x86/include/asm/string.h
> index c3c2c1914d65..17f6b5bfa8c1 100644
> --- a/arch/x86/include/asm/string.h
> +++ b/arch/x86/include/asm/string.h
> @@ -1,6 +1,52 @@
>  /* SPDX-License-Identifier: GPL-2.0 */
> +#ifndef _ASM_X86_STRING_H
> +#define _ASM_X86_STRING_H
> +
> +#include
> +#include
> +#include
> +
>  #ifdef CONFIG_X86_32
>  # include
>  #else
>  # include
>  #endif
> +
> +#ifdef CONFIG_X86_64
> +#define ALT_64(orig, alt, feat) ALTERNATIVE(orig, alt, feat)
> +#else
> +#define ALT_64(orig, alt, feat) orig "\n"
> +#endif
> +
> +static __always_inline void *__inline_memcpy(void *to, const void *from, size_t len)
> +{
> +	void *ret = to;
> +
> +	asm volatile("1:\n\t"
> +		     ALT_64("rep movsb",
> +			    "call rep_movs_alternative", ALT_NOT(X86_FEATURE_FSRM))
> +		     "2:\n\t"
> +		     _ASM_EXTABLE_UA(1b, 2b)
> +		     : "+c" (len), "+D" (to), "+S" (from), ASM_CALL_CONSTRAINT
> +		     : : "memory", _ASM_AX);
> +
> +	return ret + len;
> +}
> +
> +static __always_inline void *__inline_memset(void *addr, int v, size_t len)
> +{
> +	void *ret = addr;
> +
> +	asm volatile("1:\n\t"
> +		     ALT_64("rep stosb",
> +			    "call rep_stos_alternative", ALT_NOT(X86_FEATURE_FSRM))
> +		     "2:\n\t"
> +		     _ASM_EXTABLE_UA(1b, 2b)
> +		     : "+c" (len), "+D" (addr), ASM_CALL_CONSTRAINT
> +		     : "a" ((uint8_t)v)

You shouldn't need the (uint8_t) cast (should that be (u8) anyway).
At best it doesn't matter, at worst it will add code to mask with 0xff.

> +		     : "memory", _ASM_SI, _ASM_DX);
> +
> +	return ret + len;
> +}
> +
> +#endif /* _ASM_X86_STRING_H */
> diff --git a/arch/x86/include/asm/uaccess_64.h b/arch/x86/include/asm/uaccess_64.h
> index c8a5ae35c871..eb531e13e659 100644
> --- a/arch/x86/include/asm/uaccess_64.h
> +++ b/arch/x86/include/asm/uaccess_64.h
> @@ -13,6 +13,7 @@
>  #include
>  #include
>  #include
> +#include
>
>  /*
>   * Virtual variable: there's no actual backing store for this,
> @@ -118,21 +119,12 @@ rep_movs_alternative(void *to, const void *from, unsigned len);
>  static __always_inline __must_check unsigned long
>  copy_user_generic(void *to, const void *from, unsigned long len)
>  {
> +	void *ret;
> +
>  	stac();
> -	/*
> -	 * If CPU has FSRM feature, use 'rep movs'.
> -	 * Otherwise, use rep_movs_alternative.
> -	 */
> -	asm volatile(
> -		"1:\n\t"
> -		ALTERNATIVE("rep movsb",
> -			    "call rep_movs_alternative", ALT_NOT(X86_FEATURE_FSRM))
> -		"2:\n"
> -		_ASM_EXTABLE_UA(1b, 2b)
> -		:"+c" (len), "+D" (to), "+S" (from), ASM_CALL_CONSTRAINT
> -		: : "memory", "rax");
> +	ret = __inline_memcpy(to, from, len);
>  	clac();
> -	return len;
> +	return ret - to;
>  }
>
>  static __always_inline __must_check unsigned long
> @@ -178,25 +170,15 @@ rep_stos_alternative(void __user *addr, unsigned long len);
>
>  static __always_inline __must_check unsigned long __clear_user(void __user *addr, unsigned long size)
>  {
> +	void *ptr = (__force void *)addr;
> +	void *ret;
> +
>  	might_fault();
>  	stac();
> -
> -	/*
> -	 * No memory constraint because it doesn't change any memory gcc
> -	 * knows about.
> -	 */
> -	asm volatile(
> -		"1:\n\t"
> -		ALTERNATIVE("rep stosb",
> -			    "call rep_stos_alternative", ALT_NOT(X86_FEATURE_FSRS))
> -		"2:\n"
> -		_ASM_EXTABLE_UA(1b, 2b)
> -		: "+c" (size), "+D" (addr), ASM_CALL_CONSTRAINT
> -		: "a" (0));
> -
> +	ret = __inline_memset(ptr, 0, size);
>  	clac();
>
> -	return size;
> +	return ret - ptr;
>  }
>
>  static __always_inline unsigned long clear_user(void __user *to, unsigned long n)
> diff --git a/arch/x86/lib/clear_page_64.S b/arch/x86/lib/clear_page_64.S
> index a508e4a8c66a..47b613690f84 100644
> --- a/arch/x86/lib/clear_page_64.S
> +++ b/arch/x86/lib/clear_page_64.S
> @@ -55,17 +55,26 @@ SYM_FUNC_END(clear_page_erms)
>  EXPORT_SYMBOL_GPL(clear_page_erms)
>
>  /*
> - * Default clear user-space.
> + * Default memset.
>   * Input:
>   * rdi destination
> + * rsi scratch
>   * rcx count
> - * rax is zero
> + * al is value
>   *
>   * Output:
>   * rcx: uncleared bytes or 0 if successful.
> + * rdx: clobbered
>   */
>  SYM_FUNC_START(rep_stos_alternative)
>  	ANNOTATE_NOENDBR
> +
> +	movzbq %al, %rsi
> +	movabs $0x0101010101010101, %rax
> +
> +	/* RDX:RAX = RAX * RSI */
> +	mulq %rsi

NAK - you can't do that here.
Neither %rsi nor %rdx can be trashed.
The function has a very explicit calling convention.

It is also almost certainly a waste of time.
Pretty much all the calls will be for a constant 0x00.
Rename it all memzero() ...

	David

> +
>  	cmpq $64,%rcx
>  	jae .Lunrolled
>
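
For readers following the thread: the movzbq/movabs/mulq sequence quoted above replicates the fill byte into every byte of %rax. A minimal C sketch of the same arithmetic follows. The helper name broadcast_byte() is hypothetical and does not appear in the patch; it only illustrates the multiply, not the register usage.

```c
#include <stdint.h>

/*
 * Hypothetical helper (not part of the patch): replicate the low byte
 * of v into all eight bytes of a 64-bit word, mirroring
 *   movzbq %al, %rsi
 *   movabs $0x0101010101010101, %rax
 *   mulq   %rsi
 * The truncation to uint8_t stands in for movzbq.
 */
static inline uint64_t broadcast_byte(int v)
{
	return (uint64_t)(uint8_t)v * 0x0101010101010101ULL;
}
```

For a fill value of 0 the product is also 0, which is the value the old calling convention already required in %rax, so callers that only ever pass a constant zero gain nothing from the extra instructions.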