From mboxrd@z Thu Jan 1 00:00:00 1970
Date: Mon, 6 Feb 2023 13:48:23 +0100
From: Peter Zijlstra
To: Arnd Bergmann
Cc: Linus Torvalds, Jonathan Corbet, Will Deacon, Boqun Feng,
	Mark Rutland, Catalin Marinas, dennis@kernel.org, Tejun Heo,
	Christoph Lameter, Heiko Carstens, gor@linux.ibm.com,
	Alexander Gordeev, borntraeger@linux.ibm.com, Sven Schnelle,
	Thomas Gleixner, Ingo Molnar, Borislav Petkov, Dave Hansen,
	x86@kernel.org, "H. Peter Anvin", Joerg Roedel,
	suravee.suthikulpanit@amd.com, Robin Murphy, dwmw2@infradead.org,
	Baolu Lu, Herbert Xu, "David S. Miller", Pekka Enberg,
	David Rientjes, Joonsoo Kim, Andrew Morton, Vlastimil Babka,
	Roman Gushchin, Hyeonggon Yoo <42.hyeyoo@gmail.com>,
	linux-doc@vger.kernel.org, linux-kernel@vger.kernel.org,
	linux-mm@kvack.org, linux-s390@vger.kernel.org,
	iommu@lists.linux.dev, Linux-Arch, linux-crypto@vger.kernel.org
Subject: Re: [PATCH v2 05/10] percpu: Wire up cmpxchg128
References: <20230202145030.223740842@infradead.org>
	<20230202152655.494373332@infradead.org>
	<24007667-1ff3-4c86-9c17-a361c3f9f072@app.fastmail.com>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline

On Mon, Feb 06, 2023 at 01:14:28PM +0100, Peter Zijlstra wrote:
> On Mon, Feb 06, 2023 at 12:24:00PM +0100, Peter Zijlstra wrote:
>
> > > Unless I have misunderstood what you are doing, my concerns are
> > > still the same:
> > >
> > > >  #define this_cpu_cmpxchg(pcp, oval, nval) \
> > > > -	__pcpu_size_call_return2(this_cpu_cmpxchg_, pcp, oval, nval)
> > > > +	__pcpu_size16_call_return2(this_cpu_cmpxchg_, pcp, oval, nval)
> > > >  #define this_cpu_cmpxchg_double(pcp1, pcp2, oval1, oval2, nval1, nval2) \
> > > >  	__pcpu_double_call_return_bool(this_cpu_cmpxchg_double_, pcp1, pcp2, oval1, oval2, nval1, nval2)
> > >
> > > Having a variable-length this_cpu_cmpxchg() that turns into cmpxchg128()
> > > and cmpxchg64() even on CPUs where this traps (!X86_FEATURE_CX16) seems
> > > like a bad design to me.
> > >
> > > I would much prefer fixed-length this_cpu_cmpxchg64()/this_cpu_cmpxchg128()
> > > calls that never trap but fall back to the generic version on CPUs that
> > > are lacking the atomics.
> >
> > You're thinking accidental usage etc..? Lemme see what I can do.
>
> So looking at this I remember why I did it like this, currently 32bit
> archs silently fall back to the generics for most/all 64bit ops.
>
> And personally I would just as soon drop support for the
> !X86_FEATURE_CX* cpus... :/ Those are some serious museum pieces.
>
> One problem with silent downgrades like this is that semantics vs NMI
> change, which makes for subtle bugs on said museum pieces.
>
> Basically, using 64bit percpu ops on 32bit is already somewhat dangerous
> -- wiring up native cmpxchg64 support in that case seemed an
> improvement.
>
> Anyway... let me get on with doing explicit
> {raw,this}_cpu_cmpxchg{64,128}() thingies.

I only converted x86 and didn't do the automagic downgrade... Opinions?

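For illustration only, and not part of the patch: a minimal userspace sketch of the explicit fixed-width pattern discussed in this thread, where a 64-bit helper uses a native compare-and-swap when one is available and otherwise falls back to a generic load/compare/store. Every demo_* name is hypothetical, a plain uint64_t stands in for a per-CPU variable, and the compiler's predefined __GCC_HAVE_SYNC_COMPARE_AND_SWAP_8 macro stands in for an architecture header providing its own definition.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical userspace stand-ins; none of the demo_* names exist in the
 * kernel. A plain uint64_t plays the role of a per-CPU variable.
 */

/*
 * Generic fallback: load, compare, store. The kernel's generic this_cpu
 * version wraps this sequence in local interrupt disabling, which does not
 * mask NMIs, so it is weaker than a single native instruction.
 */
static inline uint64_t demo_generic_cmpxchg64(uint64_t *p, uint64_t oval,
					      uint64_t nval)
{
	uint64_t cur = *p;

	if (cur == oval)
		*p = nval;
	return cur;
}

#ifdef __GCC_HAVE_SYNC_COMPARE_AND_SWAP_8
/* "Arch" override: one native compare-and-swap, returns the value found. */
static inline uint64_t demo_native_cmpxchg64(uint64_t *p, uint64_t oval,
					     uint64_t nval)
{
	/* On failure the builtin writes the current value back into oval. */
	__atomic_compare_exchange_n(p, &oval, nval, 0,
				    __ATOMIC_RELAXED, __ATOMIC_RELAXED);
	return oval;
}
#define demo_cpu_cmpxchg64(p, oval, nval) demo_native_cmpxchg64(p, oval, nval)
#endif

/* Same shape as the asm-generic headers: fall back only if not overridden. */
#ifndef demo_cpu_cmpxchg64
#define demo_cpu_cmpxchg64(p, oval, nval) demo_generic_cmpxchg64(p, oval, nval)
#endif

/* Self-check: a matching compare swaps, a stale one leaves the slot alone. */
static inline void demo_selfcheck(void)
{
	uint64_t slot = 5;

	assert(demo_cpu_cmpxchg64(&slot, 5, 9) == 5 && slot == 9);
	assert(demo_cpu_cmpxchg64(&slot, 5, 1) == 9 && slot == 9);
}
```

Callers see one fixed-width name either way; which implementation backs it is decided once, at the point of definition, rather than by sizeof() dispatch at each call site.
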
---
 arch/x86/include/asm/percpu.h | 11 +++++++----
 include/asm-generic/percpu.h  | 18 ++++++++++++++----
 include/linux/percpu-defs.h   | 20 ++------------------
 mm/slab.h                     |  2 ++
 mm/slub.c                     | 19 ++++++++++---------
 5 files changed, 35 insertions(+), 35 deletions(-)

diff --git a/arch/x86/include/asm/percpu.h b/arch/x86/include/asm/percpu.h
index 4c803a1fd0e7..7515e065369b 100644
--- a/arch/x86/include/asm/percpu.h
+++ b/arch/x86/include/asm/percpu.h
@@ -214,7 +214,7 @@ do {									\
 #define percpu_cmpxchg64_op(size, qual, _var, _oval, _nval)		\
 ({									\
 	union {								\
-		typeof(_var) var;					\
+		u64 var;						\
 		struct {						\
 			u32 low, high;					\
 		};							\
@@ -234,15 +234,18 @@ do {									\
 	old__.var;							\
 })
 
-#define raw_cpu_cmpxchg_8(pcp, oval, nval)	percpu_cmpxchg64_op(8,         , pcp, oval, nval)
-#define this_cpu_cmpxchg_8(pcp, oval, nval)	percpu_cmpxchg64_op(8, volatile, pcp, oval, nval)
+#define raw_cpu_cmpxchg64(pcp, oval, nval)	percpu_cmpxchg64_op(8,         , pcp, oval, nval)
+#define this_cpu_cmpxchg64(pcp, oval, nval)	percpu_cmpxchg64_op(8, volatile, pcp, oval, nval)
 #endif
 
 #ifdef CONFIG_X86_64
+#define raw_cpu_cmpxchg64(pcp, oval, nval)	percpu_cmpxchg_op(8,         , pcp, oval, nval)
+#define this_cpu_cmpxchg64(pcp, oval, nval)	percpu_cmpxchg_op(8, volatile, pcp, oval, nval)
+
 #define percpu_cmpxchg128_op(size, qual, _var, _oval, _nval)		\
 ({									\
 	union {								\
-		typeof(_var) var;					\
+		u128 var;						\
 		struct {						\
 			u64 low, high;					\
 		};							\
diff --git a/include/asm-generic/percpu.h b/include/asm-generic/percpu.h
index ad254a20fe68..7da7d1793411 100644
--- a/include/asm-generic/percpu.h
+++ b/include/asm-generic/percpu.h
@@ -274,8 +274,13 @@ do {									\
 #define raw_cpu_cmpxchg_8(pcp, oval, nval) \
 	raw_cpu_generic_cmpxchg(pcp, oval, nval)
 #endif
-#ifndef raw_cpu_cmpxchg_16
-#define raw_cpu_cmpxchg_16(pcp, oval, nval) \
+
+#ifndef raw_cpu_cmpxchg64
+#define raw_cpu_cmpxchg64(pcp, oval, nval) \
+	raw_cpu_generic_cmpxchg(pcp, oval, nval)
+#endif
+#ifndef raw_cpu_cmpxchg128
+#define raw_cpu_cmpxchg128(pcp, oval, nval) \
 	raw_cpu_generic_cmpxchg(pcp, oval, nval)
 #endif
 
@@ -386,8 +391,13 @@ do {									\
 #define this_cpu_cmpxchg_8(pcp, oval, nval) \
 	this_cpu_generic_cmpxchg(pcp, oval, nval)
 #endif
-#ifndef this_cpu_cmpxchg_16
-#define this_cpu_cmpxchg_16(pcp, oval, nval) \
+
+#ifndef this_cpu_cmpxchg64
+#define this_cpu_cmpxchg64(pcp, oval, nval) \
+	this_cpu_generic_cmpxchg(pcp, oval, nval)
+#endif
+#ifndef this_cpu_cmpxchg128
+#define this_cpu_cmpxchg128(pcp, oval, nval) \
 	this_cpu_generic_cmpxchg(pcp, oval, nval)
 #endif
 
diff --git a/include/linux/percpu-defs.h b/include/linux/percpu-defs.h
index fe3c7fc2d411..7cd614a46af4 100644
--- a/include/linux/percpu-defs.h
+++ b/include/linux/percpu-defs.h
@@ -343,22 +343,6 @@ static inline void __this_cpu_preempt_check(const char *op) { }
 	pscr2_ret__;							\
 })
 
-#define __pcpu_size16_call_return2(stem, variable, ...)			\
-({									\
-	typeof(variable) pscr2_ret__;					\
-	__verify_pcpu_ptr(&(variable));					\
-	switch(sizeof(variable)) {					\
-	case 1: pscr2_ret__ = stem##1(variable, __VA_ARGS__); break;	\
-	case 2: pscr2_ret__ = stem##2(variable, __VA_ARGS__); break;	\
-	case 4: pscr2_ret__ = stem##4(variable, __VA_ARGS__); break;	\
-	case 8: pscr2_ret__ = stem##8(variable, __VA_ARGS__); break;	\
-	case 16: pscr2_ret__ = stem##16(variable, __VA_ARGS__); break;	\
-	default:							\
-		__bad_size_call_parameter(); break;			\
-	}								\
-	pscr2_ret__;							\
-})
-
 #define __pcpu_size_call(stem, variable, ...)				\
 do {									\
 	__verify_pcpu_ptr(&(variable));					\
@@ -414,7 +398,7 @@ do {									\
 #define raw_cpu_add_return(pcp, val)	__pcpu_size_call_return2(raw_cpu_add_return_, pcp, val)
 #define raw_cpu_xchg(pcp, nval)		__pcpu_size_call_return2(raw_cpu_xchg_, pcp, nval)
 #define raw_cpu_cmpxchg(pcp, oval, nval) \
-	__pcpu_size16_call_return2(raw_cpu_cmpxchg_, pcp, oval, nval)
+	__pcpu_size_call_return2(raw_cpu_cmpxchg_, pcp, oval, nval)
 #define raw_cpu_sub(pcp, val)		raw_cpu_add(pcp, -(val))
 #define raw_cpu_inc(pcp)		raw_cpu_add(pcp, 1)
 #define raw_cpu_dec(pcp)		raw_cpu_sub(pcp, 1)
@@ -493,7 +477,7 @@ do {									\
 #define this_cpu_add_return(pcp, val)	__pcpu_size_call_return2(this_cpu_add_return_, pcp, val)
 #define this_cpu_xchg(pcp, nval)	__pcpu_size_call_return2(this_cpu_xchg_, pcp, nval)
 #define this_cpu_cmpxchg(pcp, oval, nval) \
-	__pcpu_size16_call_return2(this_cpu_cmpxchg_, pcp, oval, nval)
+	__pcpu_size_call_return2(this_cpu_cmpxchg_, pcp, oval, nval)
 #define this_cpu_sub(pcp, val)		this_cpu_add(pcp, -(typeof(pcp))(val))
 #define this_cpu_inc(pcp)		this_cpu_add(pcp, 1)
 #define this_cpu_dec(pcp)		this_cpu_sub(pcp, 1)
diff --git a/mm/slab.h b/mm/slab.h
index 19e1899673ef..50b5edd6a950 100644
--- a/mm/slab.h
+++ b/mm/slab.h
@@ -25,11 +25,13 @@ typedef union {
 # ifdef system_has_cmpxchg128
 # define system_has_freelist_aba()	system_has_cmpxchg128()
 # define try_cmpxchg_freelist		try_cmpxchg128
+# define this_cpu_cmpxchg_freelist	this_cpu_cmpxchg128
 # endif
 #else /* CONFIG_64BIT */
 # ifdef system_has_cmpxchg64
 # define system_has_freelist_aba()	system_has_cmpxchg64()
 # define try_cmpxchg_freelist		try_cmpxchg64
+# define this_cpu_cmpxchg_freelist	this_cpu_cmpxchg64
 # endif
 #endif /* CONFIG_64BIT */
 
diff --git a/mm/slub.c b/mm/slub.c
index 45f2b28d60e1..35939c5aa28a 100644
--- a/mm/slub.c
+++ b/mm/slub.c
@@ -523,17 +523,14 @@ __update_freelist_fast(struct slab *slab,
 		      void *freelist_old, unsigned long counters_old,
 		      void *freelist_new, unsigned long counters_new)
 {
-
-	bool ret = false;
-
 #ifdef system_has_freelist_aba
 	freelist_aba_t old = { .freelist = freelist_old, .counter = counters_old };
 	freelist_aba_t new = { .freelist = freelist_new, .counter = counters_new };
 
-	ret = try_cmpxchg_freelist(&slab->freelist_counter.full, &old.full, new.full);
-#endif /* system_has_freelist_aba */
-
-	return ret;
+	return try_cmpxchg_freelist(&slab->freelist_counter.full, &old.full, new.full);
+#else
+	return false;
+#endif
 }
 
 static inline bool
@@ -3039,11 +3036,15 @@ __update_cpu_freelist_fast(struct kmem_cache *s,
 			   void *freelist_old, void *freelist_new,
 			   unsigned long tid)
 {
+#ifdef system_has_freelist_aba
 	freelist_aba_t old = { .freelist = freelist_old, .counter = tid };
 	freelist_aba_t new = { .freelist = freelist_new, .counter = next_tid(tid) };
 
-	return this_cpu_cmpxchg(s->cpu_slab->freelist_tid.full,
-				old.full, new.full) == old.full;
+	return this_cpu_cmpxchg_freelist(s->cpu_slab->freelist_tid.full,
+					 old.full, new.full) == old.full;
+#else
+	return false;
+#endif
 }
 
 /*