From: "Uladzislau Rezki (Sony)" <urezki@gmail.com>
To: Andrew Morton <akpm@linux-foundation.org>
Cc: Daniel Wagner <dwagner@suse.de>,
Sebastian Andrzej Siewior <bigeasy@linutronix.de>,
Thomas Gleixner <tglx@linutronix.de>,
linux-mm@kvack.org, LKML <linux-kernel@vger.kernel.org>,
Peter Zijlstra <peterz@infradead.org>,
Uladzislau Rezki <urezki@gmail.com>,
Hillf Danton <hdanton@sina.com>, Michal Hocko <mhocko@suse.com>,
Matthew Wilcox <willy@infradead.org>,
Oleksiy Avramchenko <oleksiy.avramchenko@sonymobile.com>,
Steven Rostedt <rostedt@goodmis.org>
Subject: [PATCH v3 1/3] mm/vmalloc: remove preempt_disable/enable when do preloading
Date: Wed, 16 Oct 2019 11:54:36 +0200 [thread overview]
Message-ID: <20191016095438.12391-1-urezki@gmail.com> (raw)
Some background. The preemption was disabled before to guarantee
that a preloaded object is available for a CPU, it was stored for.
The aim was to not allocate in atomic context when spinlock
is taken later, for regular vmap allocations. But that approach
conflicts with CONFIG_PREEMPT_RT philosophy. It means that
calling spin_lock() with disabled preemption is forbidden
in the CONFIG_PREEMPT_RT kernel.
Therefore, get rid of preempt_disable() and preempt_enable() when
the preload is done for splitting purpose. As a result we do not
guarantee now that a CPU is preloaded, instead we minimize the
case when it is not, with this change.
For example i run the special test case that follows the preload
pattern and path. 20 "unbind" threads run it and each does
1000000 allocations. Only 3.5 times among 1000000 a CPU was
not preloaded. So it can happen but the number is negligible.
V2 - > V3:
- update the commit message
V1 -> V2:
- move __this_cpu_cmpxchg check when spin_lock is taken,
as proposed by Andrew Morton
- add more explanation in regard of preloading
- adjust and move some comments
Fixes: 82dd23e84be3 ("mm/vmalloc.c: preload a CPU with one object for split purpose")
Reviewed-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
Acked-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Acked-by: Daniel Wagner <dwagner@suse.de>
Signed-off-by: Uladzislau Rezki (Sony) <urezki@gmail.com>
---
mm/vmalloc.c | 37 ++++++++++++++++++++-----------------
1 file changed, 20 insertions(+), 17 deletions(-)
diff --git a/mm/vmalloc.c b/mm/vmalloc.c
index e92ff5f7dd8b..b7b443bfdd92 100644
--- a/mm/vmalloc.c
+++ b/mm/vmalloc.c
@@ -1078,31 +1078,34 @@ static struct vmap_area *alloc_vmap_area(unsigned long size,
retry:
/*
- * Preload this CPU with one extra vmap_area object to ensure
- * that we have it available when fit type of free area is
- * NE_FIT_TYPE.
+ * Preload this CPU with one extra vmap_area object. It is used
+ * when fit type of free area is NE_FIT_TYPE. Please note, it
+ * does not guarantee that an allocation occurs on a CPU that
+ * is preloaded, instead we minimize the case when it is not.
+ * It can happen because of cpu migration, because there is a
+ * race until the below spinlock is taken.
*
* The preload is done in non-atomic context, thus it allows us
* to use more permissive allocation masks to be more stable under
- * low memory condition and high memory pressure.
+ * low memory condition and high memory pressure. In rare case,
+ * if not preloaded, GFP_NOWAIT is used.
*
- * Even if it fails we do not really care about that. Just proceed
- * as it is. "overflow" path will refill the cache we allocate from.
+ * Set "pva" to NULL here, because of "retry" path.
*/
- preempt_disable();
- if (!__this_cpu_read(ne_fit_preload_node)) {
- preempt_enable();
- pva = kmem_cache_alloc_node(vmap_area_cachep, GFP_KERNEL, node);
- preempt_disable();
+ pva = NULL;
- if (__this_cpu_cmpxchg(ne_fit_preload_node, NULL, pva)) {
- if (pva)
- kmem_cache_free(vmap_area_cachep, pva);
- }
- }
+ if (!this_cpu_read(ne_fit_preload_node))
+ /*
+ * Even if it fails we do not really care about that.
+ * Just proceed as it is. If needed "overflow" path
+ * will refill the cache we allocate from.
+ */
+ pva = kmem_cache_alloc_node(vmap_area_cachep, GFP_KERNEL, node);
spin_lock(&vmap_area_lock);
- preempt_enable();
+
+ if (pva && __this_cpu_cmpxchg(ne_fit_preload_node, NULL, pva))
+ kmem_cache_free(vmap_area_cachep, pva);
/*
* If an allocation fails, the "vend" address is
--
2.20.1
next reply other threads:[~2019-10-16 9:54 UTC|newest]
Thread overview: 11+ messages / expand[flat|nested] mbox.gz Atom feed top
2019-10-16 9:54 Uladzislau Rezki (Sony) [this message]
2019-10-16 9:54 ` [PATCH v3 2/3] mm/vmalloc: respect passed gfp_mask " Uladzislau Rezki (Sony)
2019-10-16 11:06 ` Michal Hocko
2019-10-18 9:40 ` Uladzislau Rezki
2019-10-19 1:58 ` Andrew Morton
2019-10-19 15:51 ` Uladzislau Rezki
2019-10-16 9:54 ` [PATCH v3 3/3] mm/vmalloc: add more comments to the adjust_va_to_fit_type() Uladzislau Rezki (Sony)
2019-10-16 11:07 ` Michal Hocko
2019-10-18 9:41 ` Uladzislau Rezki
2019-10-16 10:59 ` [PATCH v3 1/3] mm/vmalloc: remove preempt_disable/enable when do preloading Michal Hocko
2019-10-18 9:37 ` Uladzislau Rezki
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20191016095438.12391-1-urezki@gmail.com \
--to=urezki@gmail.com \
--cc=akpm@linux-foundation.org \
--cc=bigeasy@linutronix.de \
--cc=dwagner@suse.de \
--cc=hdanton@sina.com \
--cc=linux-kernel@vger.kernel.org \
--cc=linux-mm@kvack.org \
--cc=mhocko@suse.com \
--cc=oleksiy.avramchenko@sonymobile.com \
--cc=peterz@infradead.org \
--cc=rostedt@goodmis.org \
--cc=tglx@linutronix.de \
--cc=willy@infradead.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox