* [PATCH v2 1/2] percpu: Add __this_cpu_try_cmpxchg()
From: Uros Bizjak @ 2024-05-28 14:43 UTC
To: linux-mm, linux-kernel
Cc: Uros Bizjak, Andrew Morton, Uladzislau Rezki, Christoph Hellwig,
	Lorenzo Stoakes, Dennis Zhou, Tejun Heo, Christoph Lameter

Add __this_cpu_try_cmpxchg() version of the percpu op.

Signed-off-by: Uros Bizjak <ubizjak@gmail.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Uladzislau Rezki <urezki@gmail.com>
Cc: Christoph Hellwig <hch@infradead.org>
Cc: Lorenzo Stoakes <lstoakes@gmail.com>
Cc: Dennis Zhou <dennis@kernel.org>
Cc: Tejun Heo <tj@kernel.org>
Cc: Christoph Lameter <cl@linux.com>
---
 include/linux/percpu-defs.h | 6 ++++++
 1 file changed, 6 insertions(+)

diff --git a/include/linux/percpu-defs.h b/include/linux/percpu-defs.h
index ec3573119923..8efce7414fad 100644
--- a/include/linux/percpu-defs.h
+++ b/include/linux/percpu-defs.h
@@ -475,6 +475,12 @@ do {									\
 	raw_cpu_cmpxchg(pcp, oval, nval);				\
 })
 
+#define __this_cpu_try_cmpxchg(pcp, ovalp, nval)			\
+({									\
+	__this_cpu_preempt_check("try_cmpxchg");			\
+	raw_cpu_try_cmpxchg(pcp, ovalp, nval);				\
+})
+
 #define __this_cpu_sub(pcp, val)	__this_cpu_add(pcp, -(typeof(pcp))(val))
 #define __this_cpu_inc(pcp)		__this_cpu_add(pcp, 1)
 #define __this_cpu_dec(pcp)		__this_cpu_sub(pcp, 1)
-- 
2.42.0
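For readers unfamiliar with the try_cmpxchg calling convention that this macro wraps, here is a minimal user-space sketch of the two flavours. It is not kernel code: sketch_cmpxchg() and sketch_try_cmpxchg() are hypothetical names, and the GCC/Clang __atomic_compare_exchange_n() builtin is used as a stand-in for the per-CPU primitives. The builtin is a fully atomic operation, which is an assumption made for illustration and not a statement about how the per-CPU ops are implemented.

```c
#include <stdbool.h>
#include <stddef.h>

/*
 * Classic cmpxchg flavour (hypothetical user-space stand-in):
 * returns the value that was in *ptr before the operation.
 * The caller must compare the result with 'old' to learn
 * whether the exchange happened.
 */
static void *sketch_cmpxchg(void **ptr, void *old, void *new)
{
	/* On failure the builtin writes the current *ptr into 'old'. */
	__atomic_compare_exchange_n(ptr, &old, new, false,
				    __ATOMIC_SEQ_CST, __ATOMIC_SEQ_CST);
	return old;
}

/*
 * try_cmpxchg flavour (hypothetical user-space stand-in):
 * returns true on success. On failure, *oldp is updated with
 * the value actually found in *ptr.
 */
static bool sketch_try_cmpxchg(void **ptr, void **oldp, void *new)
{
	return __atomic_compare_exchange_n(ptr, oldp, new, false,
					   __ATOMIC_SEQ_CST, __ATOMIC_SEQ_CST);
}
```

With the first form, success is `sketch_cmpxchg(p, old, new) == old`; with the second, the boolean result carries that information directly and the expected value is passed by pointer.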
* [PATCH v2 2/2] mm/vmalloc: Use __this_cpu_try_cmpxchg() in preload_this_cpu_lock()
From: Uros Bizjak @ 2024-05-28 14:43 UTC
To: linux-mm, linux-kernel
Cc: Uros Bizjak, Andrew Morton, Uladzislau Rezki, Christoph Hellwig,
	Lorenzo Stoakes, Dennis Zhou, Tejun Heo, Christoph Lameter

Use __this_cpu_try_cmpxchg() instead of
__this_cpu_cmpxchg (*ptr, old, new) == old in
preload_this_cpu_lock(). x86 CMPXCHG instruction returns
success in ZF flag, so this change saves a compare after cmpxchg.

The generated code improves from:

    4bb6:	48 85 f6             	test   %rsi,%rsi
    4bb9:	0f 84 10 fa ff ff    	je     45cf <...>
    4bbf:	4c 89 e8             	mov    %r13,%rax
    4bc2:	65 48 0f b1 35 00 00 	cmpxchg %rsi,%gs:0x0(%rip)
    4bc9:	00 00
    4bcb:	48 85 c0             	test   %rax,%rax
    4bce:	0f 84 fb f9 ff ff    	je     45cf <...>

to:

    4bb6:	48 85 f6             	test   %rsi,%rsi
    4bb9:	0f 84 10 fa ff ff    	je     45cf <...>
    4bbf:	4c 89 e8             	mov    %r13,%rax
    4bc2:	65 48 0f b1 35 00 00 	cmpxchg %rsi,%gs:0x0(%rip)
    4bc9:	00 00
    4bcb:	0f 84 fe f9 ff ff    	je     45cf <...>

No functional change intended.

Signed-off-by: Uros Bizjak <ubizjak@gmail.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Uladzislau Rezki <urezki@gmail.com>
Cc: Christoph Hellwig <hch@infradead.org>
Cc: Lorenzo Stoakes <lstoakes@gmail.com>
Cc: Dennis Zhou <dennis@kernel.org>
Cc: Tejun Heo <tj@kernel.org>
Cc: Christoph Lameter <cl@linux.com>
---
v2: Show generated code improvement in the commit message.
---
 mm/vmalloc.c | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/mm/vmalloc.c b/mm/vmalloc.c
index 5d3aa2dc88a8..4f34d935d648 100644
--- a/mm/vmalloc.c
+++ b/mm/vmalloc.c
@@ -1816,7 +1816,7 @@ static void free_vmap_area(struct vmap_area *va)
 static inline void
 preload_this_cpu_lock(spinlock_t *lock, gfp_t gfp_mask, int node)
 {
-	struct vmap_area *va = NULL;
+	struct vmap_area *va = NULL, *tmp;
 
 	/*
 	 * Preload this CPU with one extra vmap_area object. It is used
@@ -1832,7 +1832,8 @@ preload_this_cpu_lock(spinlock_t *lock, gfp_t gfp_mask, int node)
 
 	spin_lock(lock);
 
-	if (va && __this_cpu_cmpxchg(ne_fit_preload_node, NULL, va))
+	tmp = NULL;
+	if (va && !__this_cpu_try_cmpxchg(ne_fit_preload_node, &tmp, va))
 		kmem_cache_free(vmap_area_cachep, va);
 }
 
-- 
2.42.0
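The before/after logic of this patch can be mirrored in user space to see why no functional change is expected. The sketch below is an illustration under stated assumptions, not the kernel implementation: struct obj, preload_slot, preload_old_style() and preload_new_style() are hypothetical names, malloc()/free() stand in for the slab cache, and the GCC/Clang __atomic_compare_exchange_n() builtin stands in for the per-CPU cmpxchg ops. The boolean return values are added only so the two variants can be compared; the kernel function returns void.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

struct obj { int payload; };

/* Hypothetical stand-in for the per-CPU ne_fit_preload_node slot. */
static struct obj *preload_slot;

/* Returns the previous slot value, like the classic cmpxchg flavour. */
static struct obj *slot_cmpxchg(struct obj *old, struct obj *new)
{
	__atomic_compare_exchange_n(&preload_slot, &old, new, false,
				    __ATOMIC_SEQ_CST, __ATOMIC_SEQ_CST);
	return old;
}

/* Old pattern: a non-NULL previous value means the slot was occupied. */
static bool preload_old_style(struct obj *o)
{
	if (o && slot_cmpxchg(NULL, o)) {
		free(o);	/* slot already populated, drop the spare */
		return false;
	}
	return o != NULL;
}

/*
 * New pattern: the expected value is passed by pointer, so a
 * temporary initialised to NULL is needed; the boolean result
 * replaces the comparison against NULL.
 */
static bool preload_new_style(struct obj *o)
{
	struct obj *tmp = NULL;

	if (o && !__atomic_compare_exchange_n(&preload_slot, &tmp, o, false,
					      __ATOMIC_SEQ_CST,
					      __ATOMIC_SEQ_CST)) {
		free(o);	/* slot already populated, drop the spare */
		return false;
	}
	return o != NULL;
}
```

Both variants install the object only when the slot is empty and free it otherwise; the second simply lets the compiler branch on the flag produced by the compare-and-exchange instead of testing the returned value.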
* Re: [PATCH v2 2/2] mm/vmalloc: Use __this_cpu_try_cmpxchg() in preload_this_cpu_lock()
From: Uladzislau Rezki @ 2024-05-28 18:36 UTC
To: Uros Bizjak
Cc: linux-mm, linux-kernel, Andrew Morton, Uladzislau Rezki,
	Christoph Hellwig, Lorenzo Stoakes, Dennis Zhou, Tejun Heo,
	Christoph Lameter

On Tue, May 28, 2024 at 04:43:14PM +0200, Uros Bizjak wrote:
> Use __this_cpu_try_cmpxchg() instead of
> __this_cpu_cmpxchg (*ptr, old, new) == old in
> preload_this_cpu_lock(). x86 CMPXCHG instruction returns
> success in ZF flag, so this change saves a compare after cmpxchg.
>
> The generated code improves from:
>
>     4bb6:	48 85 f6             	test   %rsi,%rsi
>     4bb9:	0f 84 10 fa ff ff    	je     45cf <...>
>     4bbf:	4c 89 e8             	mov    %r13,%rax
>     4bc2:	65 48 0f b1 35 00 00 	cmpxchg %rsi,%gs:0x0(%rip)
>     4bc9:	00 00
>     4bcb:	48 85 c0             	test   %rax,%rax
>     4bce:	0f 84 fb f9 ff ff    	je     45cf <...>
>
> to:
>
>     4bb6:	48 85 f6             	test   %rsi,%rsi
>     4bb9:	0f 84 10 fa ff ff    	je     45cf <...>
>     4bbf:	4c 89 e8             	mov    %r13,%rax
>     4bc2:	65 48 0f b1 35 00 00 	cmpxchg %rsi,%gs:0x0(%rip)
>     4bc9:	00 00
>     4bcb:	0f 84 fe f9 ff ff    	je     45cf <...>
>
> No functional change intended.
>
> Signed-off-by: Uros Bizjak <ubizjak@gmail.com>
> Cc: Andrew Morton <akpm@linux-foundation.org>
> Cc: Uladzislau Rezki <urezki@gmail.com>
> Cc: Christoph Hellwig <hch@infradead.org>
> Cc: Lorenzo Stoakes <lstoakes@gmail.com>
> Cc: Dennis Zhou <dennis@kernel.org>
> Cc: Tejun Heo <tj@kernel.org>
> Cc: Christoph Lameter <cl@linux.com>
> ---
> v2: Show generated code improvement in the commit message.
> ---
>  mm/vmalloc.c | 5 +++--
>  1 file changed, 3 insertions(+), 2 deletions(-)
>
> diff --git a/mm/vmalloc.c b/mm/vmalloc.c
> index 5d3aa2dc88a8..4f34d935d648 100644
> --- a/mm/vmalloc.c
> +++ b/mm/vmalloc.c
> @@ -1816,7 +1816,7 @@ static void free_vmap_area(struct vmap_area *va)
>  static inline void
>  preload_this_cpu_lock(spinlock_t *lock, gfp_t gfp_mask, int node)
>  {
> -	struct vmap_area *va = NULL;
> +	struct vmap_area *va = NULL, *tmp;
>
>  	/*
>  	 * Preload this CPU with one extra vmap_area object. It is used
> @@ -1832,7 +1832,8 @@ preload_this_cpu_lock(spinlock_t *lock, gfp_t gfp_mask, int node)
>
>  	spin_lock(lock);
>
> -	if (va && __this_cpu_cmpxchg(ne_fit_preload_node, NULL, va))
> +	tmp = NULL;
> +	if (va && !__this_cpu_try_cmpxchg(ne_fit_preload_node, &tmp, va))
>  		kmem_cache_free(vmap_area_cachep, va);
>  }
>
> --
> 2.42.0
>
Reviewed-by: Uladzislau Rezki (Sony) <urezki@gmail.com>

Thanks!

--
Uladzislau Rezki
* Re: [PATCH v2 1/2] percpu: Add __this_cpu_try_cmpxchg()
From: Uladzislau Rezki @ 2024-05-28 18:50 UTC
To: Uros Bizjak
Cc: linux-mm, linux-kernel, Andrew Morton, Uladzislau Rezki,
	Christoph Hellwig, Lorenzo Stoakes, Dennis Zhou, Tejun Heo,
	Christoph Lameter

On Tue, May 28, 2024 at 04:43:13PM +0200, Uros Bizjak wrote:
> Add __this_cpu_try_cmpxchg() version of the percpu op.
>
> Signed-off-by: Uros Bizjak <ubizjak@gmail.com>
> Cc: Andrew Morton <akpm@linux-foundation.org>
> Cc: Uladzislau Rezki <urezki@gmail.com>
> Cc: Christoph Hellwig <hch@infradead.org>
> Cc: Lorenzo Stoakes <lstoakes@gmail.com>
> Cc: Dennis Zhou <dennis@kernel.org>
> Cc: Tejun Heo <tj@kernel.org>
> Cc: Christoph Lameter <cl@linux.com>
> ---
>  include/linux/percpu-defs.h | 6 ++++++
>  1 file changed, 6 insertions(+)
>
> diff --git a/include/linux/percpu-defs.h b/include/linux/percpu-defs.h
> index ec3573119923..8efce7414fad 100644
> --- a/include/linux/percpu-defs.h
> +++ b/include/linux/percpu-defs.h
> @@ -475,6 +475,12 @@ do {									\
>  	raw_cpu_cmpxchg(pcp, oval, nval);				\
>  })
>
> +#define __this_cpu_try_cmpxchg(pcp, ovalp, nval)			\
> +({									\
> +	__this_cpu_preempt_check("try_cmpxchg");			\
> +	raw_cpu_try_cmpxchg(pcp, ovalp, nval);				\
> +})
> +
>  #define __this_cpu_sub(pcp, val)	__this_cpu_add(pcp, -(typeof(pcp))(val))
>  #define __this_cpu_inc(pcp)		__this_cpu_add(pcp, 1)
>  #define __this_cpu_dec(pcp)		__this_cpu_sub(pcp, 1)
> --
> 2.42.0
>
Acked-by: Uladzislau Rezki (Sony) <urezki@gmail.com>

Thanks!

--
Uladzislau Rezki