Date: Thu, 3 Jul 2025 17:52:20 +0100
From: David Laight
To: Vegard Nossum
Cc: "Kirill A. Shutemov", Andy Lutomirski, Thomas Gleixner, Ingo Molnar,
    Borislav Petkov, Dave Hansen, x86@kernel.org, "H. Peter Anvin",
    Peter Zijlstra, Ard Biesheuvel, "Paul E. McKenney", Josh Poimboeuf,
    Xiongwei Song, Xin Li, "Mike Rapoport (IBM)", Brijesh Singh,
    Michael Roth, Tony Luck, Alexey Kardashevskiy, Alexander Shishkin,
    Jonathan Corbet, Sohil Mehta, Ingo Molnar, Pawan Gupta, Daniel Sneddon,
    Kai Huang, Sandipan Das, Breno Leitao, Rick Edgecombe,
    Alexei Starovoitov, Hou Tao, Juergen Gross, Kees Cook, Eric Biggers,
    Jason Gunthorpe, "Masami Hiramatsu (Google)", Andrew Morton,
    Luis Chamberlain, Yuntao Wang, Rasmus Villemoes, Christophe Leroy,
    Tejun Heo, Changbin Du, Huang Shijie, Geert Uytterhoeven, Namhyung Kim,
    Arnaldo Carvalho de Melo, linux-doc@vger.kernel.org,
    linux-kernel@vger.kernel.org, linux-efi@vger.kernel.org,
    linux-mm@kvack.org
Subject: Re: [PATCHv8 02/17] x86/asm: Introduce inline memcpy and memset
Message-ID: <20250703175220.1cb05c1b@pumpkin>
In-Reply-To: <78aab15e-5bc2-47cc-ac1c-5a348bff0e17@oracle.com>
References: <20250701095849.2360685-1-kirill.shutemov@linux.intel.com>
    <20250701095849.2360685-3-kirill.shutemov@linux.intel.com>
    <20250703094417.165e5893@pumpkin>
    <20250703131552.32adf6b8@pumpkin>
    <78aab15e-5bc2-47cc-ac1c-5a348bff0e17@oracle.com>

On Thu, 3 Jul 2025 15:33:16 +0200
Vegard Nossum wrote:

> On 03/07/2025 14:15, David Laight wrote:
> > On Thu, 3 Jul 2025 13:39:57 +0300
> > "Kirill A. Shutemov" wrote:
> >> On Thu, Jul 03, 2025 at 09:44:17AM +0100, David Laight wrote:
> >>> On Tue, 1 Jul 2025 12:58:31 +0300
> >>> "Kirill A. Shutemov" wrote:
> >>>> diff --git a/arch/x86/lib/clear_page_64.S b/arch/x86/lib/clear_page_64.S
> >>>> index a508e4a8c66a..47b613690f84 100644
> >>>> --- a/arch/x86/lib/clear_page_64.S
> >>>> +++ b/arch/x86/lib/clear_page_64.S
> >>>> @@ -55,17 +55,26 @@ SYM_FUNC_END(clear_page_erms)
> >>>>  EXPORT_SYMBOL_GPL(clear_page_erms)
> >>>>
> >>>>  /*
> >>>> - * Default clear user-space.
> >>>> + * Default memset.
> >>>>   * Input:
> >>>>   * rdi destination
> >>>> + * rsi scratch
> >>>>   * rcx count
> >>>> - * rax is zero
> >>>> + * al is value
> >>>>   *
> >>>>   * Output:
> >>>>   * rcx: uncleared bytes or 0 if successful.
> >>>> + * rdx: clobbered
> >>>>   */
> >>>>  SYM_FUNC_START(rep_stos_alternative)
> >>>>          ANNOTATE_NOENDBR
> >>>> +
> >>>> +        movzbq %al, %rsi
> >>>> +        movabs $0x0101010101010101, %rax
> >>>> +
> >>>> +        /* RDX:RAX = RAX * RSI */
> >>>> +        mulq %rsi
> >>>
> >>> NAK - you can't do that here.
> >>> Neither %rsi nor %rdx can be trashed.
> >>> The function has a very explicit calling convention.
>
> That's why we have the clobbers... see below
>
> >> What calling convention? We change the only caller to conform to this.
> >
> > The one that is implicit in:
> >
> >>>> +        asm volatile("1:\n\t"
> >>>> +                     ALT_64("rep stosb",
> >>>> +                            "call rep_stos_alternative", ALT_NOT(X86_FEATURE_FSRM))
> >>>> +                     "2:\n\t"
> >>>> +                     _ASM_EXTABLE_UA(1b, 2b)
> >>>> +                     : "+c" (len), "+D" (addr), ASM_CALL_CONSTRAINT
> >>>> +                     : "a" ((uint8_t)v)
> >
> > The called function is only allowed to change the registers that
> > 'rep stosb' uses - except it can access (but not change)
> > all of %rax - not just %al.
> >
> > See: https://godbolt.org/z/3fnrT3x9r
> > In particular note that 'do_mset' must not change %rax.
> >
> > This is very specific and is done so that the compiler can use
> > all the registers.
>
> I feel like you trimmed off the clobbers from the asm() in the context
> above. For reference, it is:
>
> + : "memory", _ASM_SI, _ASM_DX);

I'm sure they weren't there...

Enough clobbers will 'un-break' it - but that isn't the point.
Linus will reject the patch if he reads it.
The whole point about the function is that it is as direct a replacement
for 'rep stos/movsb' as possible.

>
> I'm not saying this can't be optimized, but that doesn't seem to be your
> complaint -- you say "the called function is only allowed to change
> ...", but this is not true when we have the clobbers, right?

You can't change %rax either - not without a clobber.

Oh, and even with your version you only need clobbers for %rax and %rdx.
There is no need to use both %rsi and %rdx.

The performance is a different problem.
And the extra clobbers are likely to matter.
x86 really doesn't have many registers.

	David

>
> This is exactly what I fixed with my v7 fixlet to this patch:
>
> https://lore.kernel.org/all/1b96b0ca-5c14-4271-86c1-c305bf052b16@oracle.com/
>
>
> Vegard