Date: Thu, 25 May 2023 14:49:55 +0200
From: Peter Zijlstra
To: torvalds@linux-foundation.org
Cc: corbet@lwn.net, will@kernel.org, boqun.feng@gmail.com,
	mark.rutland@arm.com, catalin.marinas@arm.com, dennis@kernel.org,
	tj@kernel.org, cl@linux.com, hca@linux.ibm.com, gor@linux.ibm.com,
	agordeev@linux.ibm.com, borntraeger@linux.ibm.com,
	svens@linux.ibm.com, tglx@linutronix.de, mingo@redhat.com,
	bp@alien8.de, dave.hansen@linux.intel.com, x86@kernel.org,
	hpa@zytor.com, joro@8bytes.org, suravee.suthikulpanit@amd.com,
	robin.murphy@arm.com, dwmw2@infradead.org,
	baolu.lu@linux.intel.com, Arnd Bergmann, Herbert Xu,
	davem@davemloft.net, penberg@kernel.org, rientjes@google.com,
	iamjoonsoo.kim@lge.com, Andrew Morton, vbabka@suse.cz,
	roman.gushchin@linux.dev, 42.hyeyoo@gmail.com,
	linux-doc@vger.kernel.org, linux-kernel@vger.kernel.org,
	linux-mm@kvack.org, linux-s390@vger.kernel.org,
	iommu@lists.linux.dev, linux-arch@vger.kernel.org,
	linux-crypto@vger.kernel.org
Subject: Re: [PATCH v3 05/11] percpu: Wire up cmpxchg128
Message-ID: <20230525124955.GS83892@hirez.programming.kicks-ass.net>
References: <20230515075659.118447996@infradead.org>
	<20230515080554.248739380@infradead.org>
In-Reply-To: <20230515080554.248739380@infradead.org>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline

On Mon, May 15, 2023 at 09:57:04AM +0200, Peter Zijlstra wrote:
> In order to replace cmpxchg_double() with the newly minted
> cmpxchg128() family of functions, wire it up in this_cpu_cmpxchg().
>
> Signed-off-by: Peter Zijlstra (Intel)
> ---
> --- a/arch/x86/include/asm/percpu.h
> +++ b/arch/x86/include/asm/percpu.h
> @@ -210,6 +210,65 @@ do {	\
>  	(typeof(_var))(unsigned long) pco_old__;	\
>  })
>  
> +#if defined(CONFIG_X86_32) && defined(CONFIG_X86_CMPXCHG64)
> +#define percpu_cmpxchg64_op(size, qual, _var, _oval, _nval)	\
> +({	\
> +	union {	\
> +		u64 var;	\
> +		struct {	\
> +			u32 low, high;	\
> +		};	\
> +	} old__, new__;	\
> +	\
> +	old__.var = _oval;	\
> +	new__.var = _nval;	\
> +	\
> +	asm qual ("cmpxchg8b " __percpu_arg([var])	\
> +		  : [var] "+m" (_var),	\
> +		    "+a" (old__.low),	\
> +		    "+d" (old__.high)	\
> +		  : "b" (new__.low),	\
> +		    "c" (new__.high)	\
> +		  : "memory");	\
> +	\
> +	old__.var;	\
> +})
> +
> +#define raw_cpu_cmpxchg64(pcp, oval, nval)	percpu_cmpxchg64_op(8, , pcp, oval, nval)
> +#define this_cpu_cmpxchg64(pcp, oval, nval)	percpu_cmpxchg64_op(8, volatile, pcp, oval, nval)
> +#endif
> +
> +#ifdef CONFIG_X86_64
> +#define raw_cpu_cmpxchg64(pcp, oval, nval)	percpu_cmpxchg_op(8, , pcp, oval, nval);
> +#define this_cpu_cmpxchg64(pcp, oval, nval)	percpu_cmpxchg_op(8, volatile, pcp, oval, nval);
> +
> +#define percpu_cmpxchg128_op(size, qual, _var, _oval, _nval)	\
> +({	\
> +	union {	\
> +		u128 var;	\
> +		struct {	\
> +			u64 low, high;	\
> +		};	\
> +	} old__, new__;	\
> +	\
> +	old__.var = _oval;	\
> +	new__.var = _nval;	\
> +	\
> +	asm qual ("cmpxchg16b " __percpu_arg([var])	\
> +		  : [var] "+m" (_var),	\
> +		    "+a" (old__.low),	\
> +		    "+d" (old__.high)	\
> +		  : "b" (new__.low),	\
> +		    "c" (new__.high)	\
> +		  : "memory");	\
> +	\
> +	old__.var;	\
> +})
> +
> +#define raw_cpu_cmpxchg128(pcp, oval, nval)	percpu_cmpxchg128_op(16, , pcp, oval, nval)
> +#define this_cpu_cmpxchg128(pcp, oval, nval)	percpu_cmpxchg128_op(16, volatile, pcp, oval, nval)
> +#endif
> +
>  /*
>   * this_cpu_read() makes gcc load the percpu variable every time it is
>   * accessed while this_cpu_read_stable() allows the value to be cached.

Since this_cpu_cmpxchg*() is assumed to always be present (their usage is
not guarded with system_has_cmpxchg*()), it needs a fallback for when the
instruction is not present.

The below has been tested with clearcpuid=cx16; adding an obvious defect
in the fallback implementation crashes the kernel, while with the code as
presented it boots.

Build tested i386-defconfig.

(the things we do for museum pieces :/)

---
 include/asm/percpu.h |   25 +++++++++++---------
 lib/Makefile         |    3 +-
 lib/cmpxchg16b_emu.S |   43 ++++++++++++++++++++--------------
 lib/cmpxchg8b_emu.S  |   63 +++++++++++++++++++++++++++++++++++++++------------
 4 files changed, 90 insertions(+), 44 deletions(-)

--- a/arch/x86/include/asm/percpu.h
+++ b/arch/x86/include/asm/percpu.h
@@ -210,7 +210,7 @@ do {	\
 	(typeof(_var))(unsigned long) pco_old__;	\
 })
 
-#if defined(CONFIG_X86_32) && defined(CONFIG_X86_CMPXCHG64)
+#ifdef CONFIG_X86_32
 #define percpu_cmpxchg64_op(size, qual, _var, _oval, _nval)	\
 ({	\
 	union {	\
@@ -223,13 +223,14 @@ do {	\
 	old__.var = _oval;	\
 	new__.var = _nval;	\
 	\
-	asm qual ("cmpxchg8b " __percpu_arg([var])	\
+	asm qual (ALTERNATIVE("leal %P[var], %%esi; call this_cpu_cmpxchg8b_emu", \
+			      "cmpxchg8b " __percpu_arg([var]), X86_FEATURE_CX8) \
 		  : [var] "+m" (_var),	\
 		    "+a" (old__.low),	\
 		    "+d" (old__.high)	\
 		  : "b" (new__.low),	\
 		    "c" (new__.high)	\
-		  : "memory");	\
+		  : "memory", "esi");	\
 	\
 	old__.var;	\
 })
@@ -254,13 +255,14 @@ do {	\
 	old__.var = _oval;	\
 	new__.var = _nval;	\
 	\
-	asm qual ("cmpxchg16b " __percpu_arg([var])	\
+	asm qual (ALTERNATIVE("leaq %P[var], %%rsi; call this_cpu_cmpxchg16b_emu", \
+			      "cmpxchg16b " __percpu_arg([var]), X86_FEATURE_CX16) \
 		  : [var] "+m" (_var),	\
 		    "+a" (old__.low),	\
 		    "+d" (old__.high)	\
 		  : "b" (new__.low),	\
 		    "c" (new__.high)	\
-		  : "memory");	\
+		  : "memory", "rsi");	\
 	\
 	old__.var;	\
 })
@@ -400,12 +402,13 @@ do {	\
 	bool __ret;	\
 	typeof(pcp1) __o1 = (o1), __n1 = (n1);	\
 	typeof(pcp2) __o2 = (o2), __n2 = (n2);	\
-	alternative_io("leaq %P1,%%rsi\n\tcall this_cpu_cmpxchg16b_emu\n\t", \
-		       "cmpxchg16b " __percpu_arg(1) "\n\tsetz %0\n\t",	\
-		       X86_FEATURE_CX16,	\
-		       ASM_OUTPUT2("=a" (__ret), "+m" (pcp1),	\
-				   "+m" (pcp2), "+d" (__o2)),	\
-		       "b" (__n1), "c" (__n2), "a" (__o1) : "rsi");	\
+	asm volatile (ALTERNATIVE("leaq %P1, %%rsi; call this_cpu_cmpxchg16b_emu", \
+				  "cmpxchg16b " __percpu_arg(1), X86_FEATURE_CX16) \
+		      "setz %0"	\
+		      : "=a" (__ret), "+m" (pcp1)	\
+		      : "b" (__n1), "c" (__n2),	\
+			"a" (__o1), "d" (__o2)	\
+		      : "memory", "rsi");	\
 	__ret;	\
 })
 
--- a/arch/x86/lib/Makefile
+++ b/arch/x86/lib/Makefile
@@ -61,8 +61,9 @@ ifeq ($(CONFIG_X86_32),y)
         lib-y += strstr_32.o
         lib-y += string_32.o
         lib-y += memmove_32.o
+        lib-y += cmpxchg8b_emu.o
 ifneq ($(CONFIG_X86_CMPXCHG64),y)
-        lib-y += cmpxchg8b_emu.o atomic64_386_32.o
+        lib-y += atomic64_386_32.o
 endif
 else
         obj-y += iomap_copy_64.o
--- a/arch/x86/lib/cmpxchg16b_emu.S
+++ b/arch/x86/lib/cmpxchg16b_emu.S
@@ -1,47 +1,54 @@
 /* SPDX-License-Identifier: GPL-2.0-only */
 #include
 #include
+#include
 
 .text
 
 /*
+ * Emulate 'cmpxchg16b %gs:(%rsi)'
+ *
  * Inputs:
  * %rsi : memory location to compare
  * %rax : low 64 bits of old value
  * %rdx : high 64 bits of old value
  * %rbx : low 64 bits of new value
  * %rcx : high 64 bits of new value
- * %al  : Operation successful
+ *
+ * Notably this is not LOCK prefixed and is not safe against NMIs
  */
 SYM_FUNC_START(this_cpu_cmpxchg16b_emu)
 
-#
-# Emulate 'cmpxchg16b %gs:(%rsi)' except we return the result in %al not
-# via the ZF. Caller will access %al to get result.
-#
-# Note that this is only useful for a cpuops operation. Meaning that we
-# do *not* have a fully atomic operation but just an operation that is
-# *atomic* on a single cpu (as provided by the this_cpu_xx class of
-# macros).
-#
 	pushfq
 	cli
 
-	cmpq PER_CPU_VAR((%rsi)), %rax
-	jne .Lnot_same
-	cmpq PER_CPU_VAR(8(%rsi)), %rdx
-	jne .Lnot_same
+	/* if (*ptr == old) */
+	cmpq	PER_CPU_VAR(0(%rsi)), %rax
+	jne	.Lnot_same
+	cmpq	PER_CPU_VAR(8(%rsi)), %rdx
+	jne	.Lnot_same
+
+	/* *ptr = new */
+	movq	%rbx, PER_CPU_VAR(0(%rsi))
+	movq	%rcx, PER_CPU_VAR(8(%rsi))
 
-	movq %rbx, PER_CPU_VAR((%rsi))
-	movq %rcx, PER_CPU_VAR(8(%rsi))
+	/* set ZF in EFLAGS to indicate success */
+	orl	$X86_EFLAGS_ZF, (%rsp)
 
 	popfq
-	mov $1, %al
 	RET
 
 .Lnot_same:
+	/* *ptr != old */
+
+	/* old = *ptr */
+	movq	PER_CPU_VAR(0(%rsi)), %rax
+	movq	PER_CPU_VAR(8(%rsi)), %rdx
+
+	/* clear ZF in EFLAGS to indicate failure */
+	andl	$(~X86_EFLAGS_ZF), (%rsp)
+
 	popfq
-	xor %al,%al
 	RET
 
 SYM_FUNC_END(this_cpu_cmpxchg16b_emu)
--- a/arch/x86/lib/cmpxchg8b_emu.S
+++ b/arch/x86/lib/cmpxchg8b_emu.S
@@ -2,10 +2,16 @@
 
 #include
 #include
+#include
+#include
 
 .text
 
+#ifndef CONFIG_X86_CMPXCHG64
+
 /*
+ * Emulate 'cmpxchg8b (%esi)' on UP
+ *
  * Inputs:
  * %esi : memory location to compare
  * %eax : low 32 bits of old value
@@ -15,32 +21,61 @@
  */
 SYM_FUNC_START(cmpxchg8b_emu)
 
-#
-# Emulate 'cmpxchg8b (%esi)' on UP except we don't
-# set the whole ZF thing (caller will just compare
-# eax:edx with the expected value)
-#
 	pushfl
 	cli
 
-	cmpl (%esi), %eax
-	jne .Lnot_same
-	cmpl 4(%esi), %edx
-	jne .Lhalf_same
+	cmpl	0(%esi), %eax
+	jne	.Lnot_same
+	cmpl	4(%esi), %edx
+	jne	.Lnot_same
+
+	movl	%ebx, 0(%esi)
+	movl	%ecx, 4(%esi)
 
-	movl %ebx, (%esi)
-	movl %ecx, 4(%esi)
+	orl	$X86_EFLAGS_ZF, (%esp)
 
 	popfl
 	RET
 
 .Lnot_same:
-	movl (%esi), %eax
-.Lhalf_same:
-	movl 4(%esi), %edx
+	movl	0(%esi), %eax
+	movl	4(%esi), %edx
+
+	andl	$(~X86_EFLAGS_ZF), (%esp)
 
 	popfl
 	RET
 
 SYM_FUNC_END(cmpxchg8b_emu)
 EXPORT_SYMBOL(cmpxchg8b_emu)
+
+#endif
+
+SYM_FUNC_START(this_cpu_cmpxchg8b_emu)
+
+	pushfl
+	cli
+
+	cmpl	PER_CPU_VAR(0(%esi)), %eax
+	jne	.Lnot_same2
+	cmpl	PER_CPU_VAR(4(%esi)), %edx
+	jne	.Lnot_same2
+
+	movl	%ebx, PER_CPU_VAR(0(%esi))
+	movl	%ecx, PER_CPU_VAR(4(%esi))
+
+	orl	$X86_EFLAGS_ZF, (%esp)
+
+	popfl
+	RET
+
+.Lnot_same2:
+	movl	PER_CPU_VAR(0(%esi)), %eax
+	movl	PER_CPU_VAR(4(%esi)), %edx
+
+	andl	$(~X86_EFLAGS_ZF), (%esp)
+
+	popfl
+	RET
+
+SYM_FUNC_END(this_cpu_cmpxchg8b_emu)
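
For readers who would rather not parse the assembly, the logic of the emulation routines above can be modelled in plain C roughly as follows. This is only a userspace sketch: struct pair128 and model_cmpxchg128() are made-up names, not kernel symbols, and the boolean return value stands in for ZF. The real routines get their (CPU-local) atomicity from the pushf/cli ... popf bracket, which this model does not attempt to reproduce.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for a 16-byte per-CPU slot; not a kernel type. */
struct pair128 {
	uint64_t low;
	uint64_t high;
};

/*
 * Model of the fallback's compare-and-exchange logic.
 *
 * On a match the new value is stored and true is returned (ZF set).
 * On a mismatch the current contents are copied back into *old and
 * false is returned (ZF clear), mirroring how the instruction hands
 * the observed value back in the old-value registers.
 *
 * Assumption: the caller guarantees nothing else touches *ptr while
 * this runs; the assembly achieves that locally by disabling IRQs.
 */
static bool model_cmpxchg128(struct pair128 *ptr, struct pair128 *old,
			     struct pair128 newval)
{
	/* if (*ptr == old) */
	if (ptr->low == old->low && ptr->high == old->high) {
		/* *ptr = new */
		*ptr = newval;
		return true;
	}

	/* old = *ptr */
	*old = *ptr;
	return false;
}
```

A caller would typically loop on failure, reusing the value written back into *old as the next expected value, which is why the failure path has to load both halves.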