* [PATCH v2 1/2] percpu: Add __this_cpu_try_cmpxchg()
@ 2024-05-28 14:43 Uros Bizjak
2024-05-28 14:43 ` [PATCH v2 2/2] mm/vmalloc: Use __this_cpu_try_cmpxchg() in preload_this_cpu_lock() Uros Bizjak
2024-05-28 18:50 ` [PATCH v2 1/2] percpu: Add __this_cpu_try_cmpxchg() Uladzislau Rezki
0 siblings, 2 replies; 4+ messages in thread
From: Uros Bizjak @ 2024-05-28 14:43 UTC (permalink / raw)
To: linux-mm, linux-kernel
Cc: Uros Bizjak, Andrew Morton, Uladzislau Rezki, Christoph Hellwig,
Lorenzo Stoakes, Dennis Zhou, Tejun Heo, Christoph Lameter
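Add the __this_cpu_try_cmpxchg() variant of the percpu operation.
Like the other __this_cpu_*() ops, it runs the preemption check
and then calls raw_cpu_try_cmpxchg().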
Signed-off-by: Uros Bizjak <ubizjak@gmail.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Uladzislau Rezki <urezki@gmail.com>
Cc: Christoph Hellwig <hch@infradead.org>
Cc: Lorenzo Stoakes <lstoakes@gmail.com>
Cc: Dennis Zhou <dennis@kernel.org>
Cc: Tejun Heo <tj@kernel.org>
Cc: Christoph Lameter <cl@linux.com>
---
include/linux/percpu-defs.h | 6 ++++++
1 file changed, 6 insertions(+)
diff --git a/include/linux/percpu-defs.h b/include/linux/percpu-defs.h
index ec3573119923..8efce7414fad 100644
--- a/include/linux/percpu-defs.h
+++ b/include/linux/percpu-defs.h
@@ -475,6 +475,12 @@ do { \
 	raw_cpu_cmpxchg(pcp, oval, nval); \
 })
 
+#define __this_cpu_try_cmpxchg(pcp, ovalp, nval) \
+({ \
+	__this_cpu_preempt_check("try_cmpxchg"); \
+	raw_cpu_try_cmpxchg(pcp, ovalp, nval); \
+})
+
 #define __this_cpu_sub(pcp, val) __this_cpu_add(pcp, -(typeof(pcp))(val))
 #define __this_cpu_inc(pcp) __this_cpu_add(pcp, 1)
 #define __this_cpu_dec(pcp) __this_cpu_sub(pcp, 1)
--
2.42.0
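
For readers unfamiliar with the try_cmpxchg() calling convention, here is a minimal userspace sketch of the difference between the two styles. It is not the kernel implementation: model_cmpxchg(), model_try_cmpxchg() and model_usage() are hypothetical names, and the GCC/Clang __atomic_compare_exchange_n() builtin stands in for the percpu op. The builtin is fully atomic across CPUs, which is stronger than what a percpu operation provides.

```c
#include <assert.h>
#include <stdbool.h>

/*
 * cmpxchg() style (hypothetical model): returns the value that was
 * found in *ptr. The caller compares it with 'old' to detect success.
 */
static long model_cmpxchg(long *ptr, long old, long new)
{
	/* On failure the builtin stores the current value of *ptr in 'old'. */
	__atomic_compare_exchange_n(ptr, &old, new, false,
				    __ATOMIC_SEQ_CST, __ATOMIC_SEQ_CST);
	return old;
}

/*
 * try_cmpxchg() style (hypothetical model): returns true on success.
 * On failure, *oldp is updated with the value currently in *ptr.
 */
static bool model_try_cmpxchg(long *ptr, long *oldp, long new)
{
	return __atomic_compare_exchange_n(ptr, oldp, new, false,
					   __ATOMIC_SEQ_CST, __ATOMIC_SEQ_CST);
}

/* Small usage example exercising both helpers. */
static void model_usage(void)
{
	long slot = 5;
	long expected = 0;

	/* Mismatch: slot is left alone and 'expected' is refreshed. */
	assert(!model_try_cmpxchg(&slot, &expected, 7));
	assert(slot == 5 && expected == 5);

	/* Retrying with the refreshed 'expected' succeeds. */
	assert(model_try_cmpxchg(&slot, &expected, 7));
	assert(slot == 7);

	/* cmpxchg() style: success means the return value equals 'old'. */
	assert(model_cmpxchg(&slot, 7, 9) == 7);
	assert(slot == 9);
}
```

The boolean return is what lets the compiler branch on the flag set by the instruction instead of comparing the returned value again.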
* [PATCH v2 2/2] mm/vmalloc: Use __this_cpu_try_cmpxchg() in preload_this_cpu_lock()
2024-05-28 14:43 [PATCH v2 1/2] percpu: Add __this_cpu_try_cmpxchg() Uros Bizjak
@ 2024-05-28 14:43 ` Uros Bizjak
2024-05-28 18:36 ` Uladzislau Rezki
2024-05-28 18:50 ` [PATCH v2 1/2] percpu: Add __this_cpu_try_cmpxchg() Uladzislau Rezki
1 sibling, 1 reply; 4+ messages in thread
From: Uros Bizjak @ 2024-05-28 14:43 UTC (permalink / raw)
To: linux-mm, linux-kernel
Cc: Uros Bizjak, Andrew Morton, Uladzislau Rezki, Christoph Hellwig,
Lorenzo Stoakes, Dennis Zhou, Tejun Heo, Christoph Lameter
Use __this_cpu_try_cmpxchg() instead of
__this_cpu_cmpxchg(*ptr, old, new) == old in
preload_this_cpu_lock(). The x86 CMPXCHG instruction reports
success in the ZF flag, so this change saves a compare after the cmpxchg.
The generated code improves from:
4bb6: 48 85 f6 test %rsi,%rsi
4bb9: 0f 84 10 fa ff ff je 45cf <...>
4bbf: 4c 89 e8 mov %r13,%rax
4bc2: 65 48 0f b1 35 00 00 cmpxchg %rsi,%gs:0x0(%rip)
4bc9: 00 00
4bcb: 48 85 c0 test %rax,%rax
4bce: 0f 84 fb f9 ff ff je 45cf <...>
to:
4bb6: 48 85 f6 test %rsi,%rsi
4bb9: 0f 84 10 fa ff ff je 45cf <...>
4bbf: 4c 89 e8 mov %r13,%rax
4bc2: 65 48 0f b1 35 00 00 cmpxchg %rsi,%gs:0x0(%rip)
4bc9: 00 00
4bcb: 0f 84 fe f9 ff ff je 45cf <...>
No functional change intended.
Signed-off-by: Uros Bizjak <ubizjak@gmail.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Uladzislau Rezki <urezki@gmail.com>
Cc: Christoph Hellwig <hch@infradead.org>
Cc: Lorenzo Stoakes <lstoakes@gmail.com>
Cc: Dennis Zhou <dennis@kernel.org>
Cc: Tejun Heo <tj@kernel.org>
Cc: Christoph Lameter <cl@linux.com>
---
v2: Show generated code improvement in the commit message.
---
mm/vmalloc.c | 5 +++--
1 file changed, 3 insertions(+), 2 deletions(-)
diff --git a/mm/vmalloc.c b/mm/vmalloc.c
index 5d3aa2dc88a8..4f34d935d648 100644
--- a/mm/vmalloc.c
+++ b/mm/vmalloc.c
@@ -1816,7 +1816,7 @@ static void free_vmap_area(struct vmap_area *va)
 static inline void
 preload_this_cpu_lock(spinlock_t *lock, gfp_t gfp_mask, int node)
 {
-	struct vmap_area *va = NULL;
+	struct vmap_area *va = NULL, *tmp;
 
 	/*
 	 * Preload this CPU with one extra vmap_area object. It is used
@@ -1832,7 +1832,8 @@ preload_this_cpu_lock(spinlock_t *lock, gfp_t gfp_mask, int node)
 
 	spin_lock(lock);
 
-	if (va && __this_cpu_cmpxchg(ne_fit_preload_node, NULL, va))
+	tmp = NULL;
+	if (va && !__this_cpu_try_cmpxchg(ne_fit_preload_node, &tmp, va))
 		kmem_cache_free(vmap_area_cachep, va);
 }
--
2.42.0
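
The "install if the slot is empty, otherwise free the spare object" pattern changed by this patch can be sketched in userspace as follows. This is only an illustration under assumptions: struct obj, preload_slot() and preload_usage() are hypothetical stand-ins for struct vmap_area and the percpu ne_fit_preload_node slot, free() stands in for kmem_cache_free(), and the GCC/Clang __atomic_compare_exchange_n() builtin replaces the percpu op.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

struct obj {			/* hypothetical stand-in for struct vmap_area */
	int id;
};

/*
 * Install 'va' into '*slot' only if the slot is currently empty.
 * Returns true if 'va' was installed. If the slot was already
 * occupied, 'va' is freed and false is returned.
 */
static bool preload_slot(struct obj **slot, struct obj *va)
{
	struct obj *tmp = NULL;	/* expected value: an empty slot */

	if (va && !__atomic_compare_exchange_n(slot, &tmp, va, false,
					       __ATOMIC_SEQ_CST,
					       __ATOMIC_SEQ_CST)) {
		free(va);	/* slot already preloaded, drop the spare */
		return false;
	}
	return va != NULL;
}

/* Small usage example: the second object loses and is freed. */
static void preload_usage(void)
{
	struct obj *slot = NULL;
	struct obj *a = malloc(sizeof(*a));
	struct obj *b = malloc(sizeof(*b));

	assert(a && b);
	assert(preload_slot(&slot, a));		/* empty slot: a is installed */
	assert(!preload_slot(&slot, b));	/* occupied: b is freed */
	assert(slot == a);
	free(a);
}
```

Because the expected value is NULL, a failed exchange is exactly the case where the old cmpxchg() form returned a non-NULL pointer, so the two conditions select the same branch.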
* Re: [PATCH v2 2/2] mm/vmalloc: Use __this_cpu_try_cmpxchg() in preload_this_cpu_lock()
2024-05-28 14:43 ` [PATCH v2 2/2] mm/vmalloc: Use __this_cpu_try_cmpxchg() in preload_this_cpu_lock() Uros Bizjak
@ 2024-05-28 18:36 ` Uladzislau Rezki
0 siblings, 0 replies; 4+ messages in thread
From: Uladzislau Rezki @ 2024-05-28 18:36 UTC (permalink / raw)
To: Uros Bizjak
Cc: linux-mm, linux-kernel, Andrew Morton, Uladzislau Rezki,
Christoph Hellwig, Lorenzo Stoakes, Dennis Zhou, Tejun Heo,
Christoph Lameter
On Tue, May 28, 2024 at 04:43:14PM +0200, Uros Bizjak wrote:
> Use __this_cpu_try_cmpxchg() instead of
> __this_cpu_cmpxchg (*ptr, old, new) == old in
> preload_this_cpu_lock(). x86 CMPXCHG instruction returns
> success in ZF flag, so this change saves a compare after cmpxchg.
>
> The generated code improves from:
>
> 4bb6: 48 85 f6 test %rsi,%rsi
> 4bb9: 0f 84 10 fa ff ff je 45cf <...>
> 4bbf: 4c 89 e8 mov %r13,%rax
> 4bc2: 65 48 0f b1 35 00 00 cmpxchg %rsi,%gs:0x0(%rip)
> 4bc9: 00 00
> 4bcb: 48 85 c0 test %rax,%rax
> 4bce: 0f 84 fb f9 ff ff je 45cf <...>
>
> to:
>
> 4bb6: 48 85 f6 test %rsi,%rsi
> 4bb9: 0f 84 10 fa ff ff je 45cf <...>
> 4bbf: 4c 89 e8 mov %r13,%rax
> 4bc2: 65 48 0f b1 35 00 00 cmpxchg %rsi,%gs:0x0(%rip)
> 4bc9: 00 00
> 4bcb: 0f 84 fe f9 ff ff je 45cf <...>
>
> No functional change intended.
>
> Signed-off-by: Uros Bizjak <ubizjak@gmail.com>
> Cc: Andrew Morton <akpm@linux-foundation.org>
> Cc: Uladzislau Rezki <urezki@gmail.com>
> Cc: Christoph Hellwig <hch@infradead.org>
> Cc: Lorenzo Stoakes <lstoakes@gmail.com>
> Cc: Dennis Zhou <dennis@kernel.org>
> Cc: Tejun Heo <tj@kernel.org>
> Cc: Christoph Lameter <cl@linux.com>
> ---
> v2: Show generated code improvement in the commit message.
> ---
> mm/vmalloc.c | 5 +++--
> 1 file changed, 3 insertions(+), 2 deletions(-)
>
> diff --git a/mm/vmalloc.c b/mm/vmalloc.c
> index 5d3aa2dc88a8..4f34d935d648 100644
> --- a/mm/vmalloc.c
> +++ b/mm/vmalloc.c
> @@ -1816,7 +1816,7 @@ static void free_vmap_area(struct vmap_area *va)
> static inline void
> preload_this_cpu_lock(spinlock_t *lock, gfp_t gfp_mask, int node)
> {
> - struct vmap_area *va = NULL;
> + struct vmap_area *va = NULL, *tmp;
>
> /*
> * Preload this CPU with one extra vmap_area object. It is used
> @@ -1832,7 +1832,8 @@ preload_this_cpu_lock(spinlock_t *lock, gfp_t gfp_mask, int node)
>
> spin_lock(lock);
>
> - if (va && __this_cpu_cmpxchg(ne_fit_preload_node, NULL, va))
> + tmp = NULL;
> + if (va && !__this_cpu_try_cmpxchg(ne_fit_preload_node, &tmp, va))
> kmem_cache_free(vmap_area_cachep, va);
> }
>
> --
> 2.42.0
>
Reviewed-by: Uladzislau Rezki (Sony) <urezki@gmail.com>
Thanks!
--
Uladzislau Rezki
* Re: [PATCH v2 1/2] percpu: Add __this_cpu_try_cmpxchg()
2024-05-28 14:43 [PATCH v2 1/2] percpu: Add __this_cpu_try_cmpxchg() Uros Bizjak
2024-05-28 14:43 ` [PATCH v2 2/2] mm/vmalloc: Use __this_cpu_try_cmpxchg() in preload_this_cpu_lock() Uros Bizjak
@ 2024-05-28 18:50 ` Uladzislau Rezki
1 sibling, 0 replies; 4+ messages in thread
From: Uladzislau Rezki @ 2024-05-28 18:50 UTC (permalink / raw)
To: Uros Bizjak
Cc: linux-mm, linux-kernel, Andrew Morton, Uladzislau Rezki,
Christoph Hellwig, Lorenzo Stoakes, Dennis Zhou, Tejun Heo,
Christoph Lameter
On Tue, May 28, 2024 at 04:43:13PM +0200, Uros Bizjak wrote:
> Add __this_cpu_try_cmpxchg() version of the percpu op.
>
> Signed-off-by: Uros Bizjak <ubizjak@gmail.com>
> Cc: Andrew Morton <akpm@linux-foundation.org>
> Cc: Uladzislau Rezki <urezki@gmail.com>
> Cc: Christoph Hellwig <hch@infradead.org>
> Cc: Lorenzo Stoakes <lstoakes@gmail.com>
> Cc: Dennis Zhou <dennis@kernel.org>
> Cc: Tejun Heo <tj@kernel.org>
> Cc: Christoph Lameter <cl@linux.com>
> ---
> include/linux/percpu-defs.h | 6 ++++++
> 1 file changed, 6 insertions(+)
>
> diff --git a/include/linux/percpu-defs.h b/include/linux/percpu-defs.h
> index ec3573119923..8efce7414fad 100644
> --- a/include/linux/percpu-defs.h
> +++ b/include/linux/percpu-defs.h
> @@ -475,6 +475,12 @@ do { \
> raw_cpu_cmpxchg(pcp, oval, nval); \
> })
>
> +#define __this_cpu_try_cmpxchg(pcp, ovalp, nval) \
> +({ \
> + __this_cpu_preempt_check("try_cmpxchg"); \
> + raw_cpu_try_cmpxchg(pcp, ovalp, nval); \
> +})
> +
> #define __this_cpu_sub(pcp, val) __this_cpu_add(pcp, -(typeof(pcp))(val))
> #define __this_cpu_inc(pcp) __this_cpu_add(pcp, 1)
> #define __this_cpu_dec(pcp) __this_cpu_sub(pcp, 1)
> --
> 2.42.0
>
Acked-by: Uladzislau Rezki (Sony) <urezki@gmail.com>
Thanks!
--
Uladzislau Rezki