* [PATCH v3 0/4] scftorture: Avoid kfree from IRQ context.
@ 2024-11-08 10:39 Sebastian Andrzej Siewior
2024-11-08 10:39 ` [PATCH v3 1/4] scftorture: Avoid additional div operation Sebastian Andrzej Siewior
` (4 more replies)
0 siblings, 5 replies; 9+ messages in thread
From: Sebastian Andrzej Siewior @ 2024-11-08 10:39 UTC (permalink / raw)
To: kasan-dev, linux-kernel, linux-mm
Cc: Paul E. McKenney, Boqun Feng, Marco Elver, Peter Zijlstra,
Tomas Gleixner, Vlastimil Babka, akpm, cl, iamjoonsoo.kim,
longman, penberg, rientjes, sfr
Hi,
Paul reported kfree from IRQ context in scftorture which is noticed by
lockdep since the recent PROVE_RAW_LOCK_NESTING switch.
The last patch in this series addresses the issue, the other changes
happened along the way.
v2…v3:
- The clean up on module exit must not be done with thread numbers.
Reported by Boqun Feng.
- Move the clean up on module exit prior to torture_cleanup_end().
Reported by Paul.
v1…v2:
- Remove kfree_bulk(). I get more invocations per report without it.
- Pass `cpu' to scf_cleanup_free_list in scftorture_invoker() instead
of scfp->cpu. The latter is the thread number which can be larger
than the number of CPUs, leading to a crash in such a case. Reported by
Boqun Feng.
- Clean up the per-CPU lists on module exit. Reported by Boqun Feng.
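The thread-number problem from the v1…v2 notes can be modelled in userspace. This is only a minimal sketch under stated assumptions: NR_CPUS_DEMO is a hypothetical CPU count standing in for nr_cpu_ids, and pool_len, thread_to_cpu(), valid_pool_index() and pool_account() are made-up names that do not exist in scftorture. It shows why a thread number must be folded onto a CPU index before it is used to pick a per-CPU slot.

```c
#include <assert.h>

#define NR_CPUS_DEMO 4	/* hypothetical CPU count, stands in for nr_cpu_ids */

static int pool_len[NR_CPUS_DEMO];	/* one slot per CPU, not per thread */

/* Fold a thread number onto a valid CPU index. */
static unsigned int thread_to_cpu(unsigned int thread)
{
	return thread % NR_CPUS_DEMO;
}

/* Returns 1 if idx may be used to index pool_len[], 0 otherwise. */
static int valid_pool_index(unsigned int idx)
{
	return idx < NR_CPUS_DEMO;
}

/* Account one item to the pool of the CPU that the thread maps to. */
static int pool_account(unsigned int thread)
{
	unsigned int cpu = thread_to_cpu(thread);

	assert(valid_pool_index(cpu));
	return ++pool_len[cpu];
}
```

With more threads than CPUs, thread 6 is not a valid index on its own, but it maps to CPU 2 and shares that slot with thread 2.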
Sebastian
* [PATCH v3 1/4] scftorture: Avoid additional div operation.
2024-11-08 10:39 [PATCH v3 0/4] scftorture: Avoid kfree from IRQ context Sebastian Andrzej Siewior
@ 2024-11-08 10:39 ` Sebastian Andrzej Siewior
2024-11-08 10:39 ` [PATCH v3 2/4] scftorture: Wait until scf_cleanup_handler() completes Sebastian Andrzej Siewior
` (3 subsequent siblings)
4 siblings, 0 replies; 9+ messages in thread
From: Sebastian Andrzej Siewior @ 2024-11-08 10:39 UTC (permalink / raw)
To: kasan-dev, linux-kernel, linux-mm
Cc: Paul E. McKenney, Boqun Feng, Marco Elver, Peter Zijlstra,
Tomas Gleixner, Vlastimil Babka, akpm, cl, iamjoonsoo.kim,
longman, penberg, rientjes, sfr, Sebastian Andrzej Siewior
Replace "scfp->cpu % nr_cpu_ids" with "cpu", which holds the same value
and has been computed earlier.
Tested-by: Paul E. McKenney <paulmck@kernel.org>
Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
---
kernel/scftorture.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/kernel/scftorture.c b/kernel/scftorture.c
index 44e83a6462647..455cbff35a1a2 100644
--- a/kernel/scftorture.c
+++ b/kernel/scftorture.c
@@ -463,7 +463,7 @@ static int scftorture_invoker(void *arg)
// Make sure that the CPU is affinitized appropriately during testing.
curcpu = raw_smp_processor_id();
- WARN_ONCE(curcpu != scfp->cpu % nr_cpu_ids,
+ WARN_ONCE(curcpu != cpu,
"%s: Wanted CPU %d, running on %d, nr_cpu_ids = %d\n",
__func__, scfp->cpu, curcpu, nr_cpu_ids);
--
2.45.2
* [PATCH v3 2/4] scftorture: Wait until scf_cleanup_handler() completes.
2024-11-08 10:39 [PATCH v3 0/4] scftorture: Avoid kfree from IRQ context Sebastian Andrzej Siewior
2024-11-08 10:39 ` [PATCH v3 1/4] scftorture: Avoid additional div operation Sebastian Andrzej Siewior
@ 2024-11-08 10:39 ` Sebastian Andrzej Siewior
2024-11-08 10:39 ` [PATCH v3 3/4] scftorture: Move memory allocation outside of preempt_disable region Sebastian Andrzej Siewior
` (2 subsequent siblings)
4 siblings, 0 replies; 9+ messages in thread
From: Sebastian Andrzej Siewior @ 2024-11-08 10:39 UTC (permalink / raw)
To: kasan-dev, linux-kernel, linux-mm
Cc: Paul E. McKenney, Boqun Feng, Marco Elver, Peter Zijlstra,
Tomas Gleixner, Vlastimil Babka, akpm, cl, iamjoonsoo.kim,
longman, penberg, rientjes, sfr, Sebastian Andrzej Siewior
smp_call_function() needs to be invoked with the wait flag set so that
it waits until scf_cleanup_handler() is done. This ensures that all SMP
function calls that have been queued earlier have completed at this point.
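The effect of the wait flag can be illustrated with a single-threaded userspace model. This is a rough sketch under loud assumptions: it uses no real IPIs or CPUs, and call_function(), run_pending() and bump() are hypothetical names, not kernel APIs. It only models the ordering guarantee described above: a call made with wait set returns only after it and everything queued before it has run.

```c
#include <assert.h>

#define MAX_PENDING 8

typedef void (*call_fn)(void *);

struct pending_call {
	call_fn fn;
	void *arg;
};

static struct pending_call queue[MAX_PENDING];
static int n_pending;

/* Run everything queued so far, in order (the "remote CPU" side). */
static void run_pending(void)
{
	for (int i = 0; i < n_pending; i++)
		queue[i].fn(queue[i].arg);
	n_pending = 0;
}

/*
 * Queue fn. With wait != 0, return only after every queued call,
 * including this one, has run. Returns 0 on success, -1 if full.
 */
static int call_function(call_fn fn, void *arg, int wait)
{
	if (n_pending == MAX_PENDING)
		return -1;
	queue[n_pending].fn = fn;
	queue[n_pending].arg = arg;
	n_pending++;
	if (wait)
		run_pending();
	return 0;
}

/* Example handler: count how often it was invoked. */
static void bump(void *arg)
{
	(*(int *)arg)++;
}
```

In this model a call with wait == 0 may still be outstanding when the caller continues, which is the situation the patch avoids before the statistics are printed and freed.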
Tested-by: Paul E. McKenney <paulmck@kernel.org>
Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
---
kernel/scftorture.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/kernel/scftorture.c b/kernel/scftorture.c
index 455cbff35a1a2..654702f75c54b 100644
--- a/kernel/scftorture.c
+++ b/kernel/scftorture.c
@@ -523,7 +523,7 @@ static void scf_torture_cleanup(void)
torture_stop_kthread("scftorture_invoker", scf_stats_p[i].task);
else
goto end;
- smp_call_function(scf_cleanup_handler, NULL, 0);
+ smp_call_function(scf_cleanup_handler, NULL, 1);
torture_stop_kthread(scf_torture_stats, scf_torture_stats_task);
scf_torture_stats_print(); // -After- the stats thread is stopped!
kfree(scf_stats_p); // -After- the last stats print has completed!
--
2.45.2
* [PATCH v3 3/4] scftorture: Move memory allocation outside of preempt_disable region.
2024-11-08 10:39 [PATCH v3 0/4] scftorture: Avoid kfree from IRQ context Sebastian Andrzej Siewior
2024-11-08 10:39 ` [PATCH v3 1/4] scftorture: Avoid additional div operation Sebastian Andrzej Siewior
2024-11-08 10:39 ` [PATCH v3 2/4] scftorture: Wait until scf_cleanup_handler() completes Sebastian Andrzej Siewior
@ 2024-11-08 10:39 ` Sebastian Andrzej Siewior
2024-11-08 10:39 ` [PATCH v3 4/4] scftorture: Use a lock-less list to free memory Sebastian Andrzej Siewior
2024-11-08 17:46 ` [PATCH v3 0/4] scftorture: Avoid kfree from IRQ context Boqun Feng
4 siblings, 0 replies; 9+ messages in thread
From: Sebastian Andrzej Siewior @ 2024-11-08 10:39 UTC (permalink / raw)
To: kasan-dev, linux-kernel, linux-mm
Cc: Paul E. McKenney, Boqun Feng, Marco Elver, Peter Zijlstra,
Tomas Gleixner, Vlastimil Babka, akpm, cl, iamjoonsoo.kim,
longman, penberg, rientjes, sfr, Sebastian Andrzej Siewior
Memory allocations cannot happen within regions with explicitly disabled
preemption on PREEMPT_RT. The problem is that the locking structures
underneath are sleeping locks.
Move the memory allocation outside of the preempt-disabled section. Keep
the GFP_ATOMIC for the allocation to behave like an "emergency
allocation".
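The reordering can be sketched with a small userspace model. It is only an illustration under stated assumptions: atomic_depth models the preempt-disable nesting depth, and checked_alloc(), enter_atomic(), leave_atomic() and the two invoke_*() helpers are hypothetical names. In the model the allocation simply fails inside the section; the real kernel does not behave this way, the failure merely stands in for the violation that would be reported on PREEMPT_RT.

```c
#include <assert.h>
#include <stdlib.h>

static int atomic_depth;	/* models the preempt-disable nesting depth */

static void enter_atomic(void) { atomic_depth++; }
static void leave_atomic(void) { atomic_depth--; }

/* Allocation wrapper that refuses to run inside the modelled section. */
static void *checked_alloc(size_t size)
{
	if (atomic_depth > 0)
		return NULL;	/* stands in for the sleeping-lock violation */
	return malloc(size);
}

/* Old ordering: allocate inside the section. Returns 1 on success. */
static int invoke_alloc_inside(void)
{
	void *p;

	enter_atomic();
	p = checked_alloc(64);
	leave_atomic();
	free(p);		/* free(NULL) is a no-op */
	return p != NULL;
}

/* New ordering: allocate first, then enter the section. */
static int invoke_alloc_outside(void)
{
	void *p = checked_alloc(64);

	enter_atomic();
	/* ... the work that needs the section would use p here ... */
	leave_atomic();
	free(p);
	return p != NULL;
}
```

Only the second ordering gets its memory in this model, which mirrors moving the kmalloc() in front of preempt_disable().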
Tested-by: Paul E. McKenney <paulmck@kernel.org>
Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
---
kernel/scftorture.c | 8 ++++----
1 file changed, 4 insertions(+), 4 deletions(-)
diff --git a/kernel/scftorture.c b/kernel/scftorture.c
index 654702f75c54b..e3c60f6dd5477 100644
--- a/kernel/scftorture.c
+++ b/kernel/scftorture.c
@@ -320,10 +320,6 @@ static void scftorture_invoke_one(struct scf_statistics *scfp, struct torture_ra
struct scf_check *scfcp = NULL;
struct scf_selector *scfsp = scf_sel_rand(trsp);
- if (use_cpus_read_lock)
- cpus_read_lock();
- else
- preempt_disable();
if (scfsp->scfs_prim == SCF_PRIM_SINGLE || scfsp->scfs_wait) {
scfcp = kmalloc(sizeof(*scfcp), GFP_ATOMIC);
if (!scfcp) {
@@ -337,6 +333,10 @@ static void scftorture_invoke_one(struct scf_statistics *scfp, struct torture_ra
scfcp->scfc_rpc = false;
}
}
+ if (use_cpus_read_lock)
+ cpus_read_lock();
+ else
+ preempt_disable();
switch (scfsp->scfs_prim) {
case SCF_PRIM_RESCHED:
if (IS_BUILTIN(CONFIG_SCF_TORTURE_TEST)) {
--
2.45.2
* [PATCH v3 4/4] scftorture: Use a lock-less list to free memory.
2024-11-08 10:39 [PATCH v3 0/4] scftorture: Avoid kfree from IRQ context Sebastian Andrzej Siewior
` (2 preceding siblings ...)
2024-11-08 10:39 ` [PATCH v3 3/4] scftorture: Move memory allocation outside of preempt_disable region Sebastian Andrzej Siewior
@ 2024-11-08 10:39 ` Sebastian Andrzej Siewior
2024-11-08 17:46 ` [PATCH v3 0/4] scftorture: Avoid kfree from IRQ context Boqun Feng
4 siblings, 0 replies; 9+ messages in thread
From: Sebastian Andrzej Siewior @ 2024-11-08 10:39 UTC (permalink / raw)
To: kasan-dev, linux-kernel, linux-mm
Cc: Paul E. McKenney, Boqun Feng, Marco Elver, Peter Zijlstra,
Tomas Gleixner, Vlastimil Babka, akpm, cl, iamjoonsoo.kim,
longman, penberg, rientjes, sfr, Sebastian Andrzej Siewior
scf_handler() is used as an SMP function call. This function is always
invoked in IRQ context, even with forced-threading enabled. This function
frees memory, which is not allowed on PREEMPT_RT because the locking
underneath is using sleeping locks.
Add a per-CPU scf_free_pool where each SMP function adds its memory to
be freed. This memory is then freed by scftorture_invoker() on each
iteration. On the majority of invocations the number of items is less
than five. If the thread sleeps or gets delayed, the number exceeds 350 but
did not reach 400 in testing. These were the spikes during testing.
The bulk free of 64 pointers at once should improve the give-back if the
list grows. The list size is ~1.3 items per invocation.
Having one global scf_free_pool with one cleaning thread let the list
grow to over 10,000 items with 32 CPUs (again, spikes, not the average),
especially if the CPU went to sleep. The per-CPU part looks like a good
compromise.
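The deferred-free scheme described above can be sketched in userspace with C11 atomics. This is a minimal model, not the kernel's llist implementation: NPOOLS is a hypothetical CPU count, and struct node, struct list_head, pool_add(), pool_del_all(), defer_free() and drain_pool() are illustrative names. The producer pushes without blocking or freeing; the consumer detaches the whole list in one step and frees it in a context where that is allowed.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>
#include <stdlib.h>

/* Userspace stand-ins for a lock-less singly linked list. */
struct node {
	struct node *next;
};

struct list_head {
	_Atomic(struct node *) first;
};

#define NPOOLS 4	/* hypothetical CPU count for this sketch */

static struct list_head free_pool[NPOOLS];

struct check {
	int payload;
	struct node link;	/* plays the role of scf_node */
};

/* Push one node; safe against concurrent pushers, never blocks. */
static void pool_add(struct list_head *head, struct node *n)
{
	struct node *old = atomic_load(&head->first);

	do {
		n->next = old;	/* old is refreshed when the CAS fails */
	} while (!atomic_compare_exchange_weak(&head->first, &old, n));
}

/* Detach the whole list in one atomic step and hand it to the caller. */
static struct node *pool_del_all(struct list_head *head)
{
	return atomic_exchange(&head->first, (struct node *)NULL);
}

/* Producer side: what the handler would do instead of freeing. */
static void defer_free(unsigned int cpu, struct check *c)
{
	pool_add(&free_pool[cpu % NPOOLS], &c->link);
}

/* Consumer side: frees one pool; returns the number of items freed. */
static int drain_pool(unsigned int cpu)
{
	struct node *n;
	int freed = 0;

	assert(cpu < NPOOLS);
	n = pool_del_all(&free_pool[cpu]);
	while (n) {
		struct check *c = (struct check *)
			((char *)n - offsetof(struct check, link));

		n = n->next;	/* read the link before the memory goes away */
		free(c);
		freed++;
	}
	return freed;
}
```

Keeping one pool per CPU means each draining thread only has to walk the items queued for its own index, which is the compromise the commit message argues for.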
Reported-by: "Paul E. McKenney" <paulmck@kernel.org>
Closes: https://lore.kernel.org/lkml/41619255-cdc2-4573-a360-7794fc3614f7@paulmck-laptop/
Tested-by: Paul E. McKenney <paulmck@kernel.org>
Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
---
kernel/scftorture.c | 40 ++++++++++++++++++++++++++++++++++++----
1 file changed, 36 insertions(+), 4 deletions(-)
diff --git a/kernel/scftorture.c b/kernel/scftorture.c
index e3c60f6dd5477..eeafd3fc16820 100644
--- a/kernel/scftorture.c
+++ b/kernel/scftorture.c
@@ -97,6 +97,7 @@ struct scf_statistics {
static struct scf_statistics *scf_stats_p;
static struct task_struct *scf_torture_stats_task;
static DEFINE_PER_CPU(long long, scf_invoked_count);
+static DEFINE_PER_CPU(struct llist_head, scf_free_pool);
// Data for random primitive selection
#define SCF_PRIM_RESCHED 0
@@ -133,6 +134,7 @@ struct scf_check {
bool scfc_wait;
bool scfc_rpc;
struct completion scfc_completion;
+ struct llist_node scf_node;
};
// Use to wait for all threads to start.
@@ -148,6 +150,31 @@ static DEFINE_TORTURE_RANDOM_PERCPU(scf_torture_rand);
extern void resched_cpu(int cpu); // An alternative IPI vector.
+static void scf_add_to_free_list(struct scf_check *scfcp)
+{
+ struct llist_head *pool;
+ unsigned int cpu;
+
+ cpu = raw_smp_processor_id() % nthreads;
+ pool = &per_cpu(scf_free_pool, cpu);
+ llist_add(&scfcp->scf_node, pool);
+}
+
+static void scf_cleanup_free_list(unsigned int cpu)
+{
+ struct llist_head *pool;
+ struct llist_node *node;
+ struct scf_check *scfcp;
+
+ pool = &per_cpu(scf_free_pool, cpu);
+ node = llist_del_all(pool);
+ while (node) {
+ scfcp = llist_entry(node, struct scf_check, scf_node);
+ node = node->next;
+ kfree(scfcp);
+ }
+}
+
// Print torture statistics. Caller must ensure serialization.
static void scf_torture_stats_print(void)
{
@@ -296,7 +323,7 @@ static void scf_handler(void *scfc_in)
if (scfcp->scfc_rpc)
complete(&scfcp->scfc_completion);
} else {
- kfree(scfcp);
+ scf_add_to_free_list(scfcp);
}
}
@@ -363,7 +390,7 @@ static void scftorture_invoke_one(struct scf_statistics *scfp, struct torture_ra
scfp->n_single_wait_ofl++;
else
scfp->n_single_ofl++;
- kfree(scfcp);
+ scf_add_to_free_list(scfcp);
scfcp = NULL;
}
break;
@@ -391,7 +418,7 @@ static void scftorture_invoke_one(struct scf_statistics *scfp, struct torture_ra
preempt_disable();
} else {
scfp->n_single_rpc_ofl++;
- kfree(scfcp);
+ scf_add_to_free_list(scfcp);
scfcp = NULL;
}
break;
@@ -428,7 +455,7 @@ static void scftorture_invoke_one(struct scf_statistics *scfp, struct torture_ra
pr_warn("%s: Memory-ordering failure, scfs_prim: %d.\n", __func__, scfsp->scfs_prim);
atomic_inc(&n_mb_out_errs); // Leak rather than trash!
} else {
- kfree(scfcp);
+ scf_add_to_free_list(scfcp);
}
barrier(); // Prevent race-reduction compiler optimizations.
}
@@ -479,6 +506,8 @@ static int scftorture_invoker(void *arg)
VERBOSE_SCFTORTOUT("scftorture_invoker %d started", scfp->cpu);
do {
+ scf_cleanup_free_list(cpu);
+
scftorture_invoke_one(scfp, &rand);
while (cpu_is_offline(cpu) && !torture_must_stop()) {
schedule_timeout_interruptible(HZ / 5);
@@ -529,6 +558,9 @@ static void scf_torture_cleanup(void)
kfree(scf_stats_p); // -After- the last stats print has completed!
scf_stats_p = NULL;
+ for (i = 0; i < nr_cpu_ids; i++)
+ scf_cleanup_free_list(i);
+
if (atomic_read(&n_errs) || atomic_read(&n_mb_in_errs) || atomic_read(&n_mb_out_errs))
scftorture_print_module_parms("End of test: FAILURE");
else if (torture_onoff_failures())
--
2.45.2
* Re: [PATCH v3 0/4] scftorture: Avoid kfree from IRQ context.
2024-11-08 10:39 [PATCH v3 0/4] scftorture: Avoid kfree from IRQ context Sebastian Andrzej Siewior
` (3 preceding siblings ...)
2024-11-08 10:39 ` [PATCH v3 4/4] scftorture: Use a lock-less list to free memory Sebastian Andrzej Siewior
@ 2024-11-08 17:46 ` Boqun Feng
2024-11-08 18:33 ` Paul E. McKenney
4 siblings, 1 reply; 9+ messages in thread
From: Boqun Feng @ 2024-11-08 17:46 UTC (permalink / raw)
To: Sebastian Andrzej Siewior
Cc: kasan-dev, linux-kernel, linux-mm, Paul E. McKenney, Marco Elver,
Peter Zijlstra, Tomas Gleixner, Vlastimil Babka, akpm, cl,
iamjoonsoo.kim, longman, penberg, rientjes, sfr
On Fri, Nov 08, 2024 at 11:39:30AM +0100, Sebastian Andrzej Siewior wrote:
> Hi,
>
> Paul reported kfree from IRQ context in scftorture which is noticed by
> lockdep since the recent PROVE_RAW_LOCK_NESTING switch.
>
> The last patch in this series addresses the issue, the other changes
> happened along the way.
>
> v2...v3:
> - The clean up on module exit must not be done with thread numbers.
> Reported by Boqun Feng.
> - Move the clean up on module exit prior to torture_cleanup_end().
> Reported by Paul.
>
> v1...v2:
> - Remove kfree_bulk(). I get more invocations per report without it.
> - Pass `cpu' to scf_cleanup_free_list in scftorture_invoker() instead
> of scfp->cpu. The latter is the thread number which can be larger
> than the number of CPUs, leading to a crash in such a case. Reported by
> Boqun Feng.
> - Clean up the per-CPU lists on module exit. Reported by Boqun Feng.
>
> Sebastian
>
For the whole series:
Reviewed-by: Boqun Feng <boqun.feng@gmail.com>
Tested-by: Boqun Feng <boqun.feng@gmail.com>
Regards,
Boqun
* Re: [PATCH v3 0/4] scftorture: Avoid kfree from IRQ context.
2024-11-08 17:46 ` [PATCH v3 0/4] scftorture: Avoid kfree from IRQ context Boqun Feng
@ 2024-11-08 18:33 ` Paul E. McKenney
2024-11-08 18:45 ` Sebastian Andrzej Siewior
0 siblings, 1 reply; 9+ messages in thread
From: Paul E. McKenney @ 2024-11-08 18:33 UTC (permalink / raw)
To: Boqun Feng
Cc: Sebastian Andrzej Siewior, kasan-dev, linux-kernel, linux-mm,
Marco Elver, Peter Zijlstra, Tomas Gleixner, Vlastimil Babka,
akpm, cl, iamjoonsoo.kim, longman, penberg, rientjes, sfr
On Fri, Nov 08, 2024 at 09:46:07AM -0800, Boqun Feng wrote:
> On Fri, Nov 08, 2024 at 11:39:30AM +0100, Sebastian Andrzej Siewior wrote:
> > Hi,
> >
> > Paul reported kfree from IRQ context in scftorture which is noticed by
> > lockdep since the recent PROVE_RAW_LOCK_NESTING switch.
> >
> > The last patch in this series addresses the issue, the other changes
> > happened along the way.
> >
> > v2...v3:
> > - The clean up on module exit must not be done with thread numbers.
> > Reported by Boqun Feng.
> > - Move the clean up on module exit prior to torture_cleanup_end().
> > Reported by Paul.
> >
> > v1...v2:
> > - Remove kfree_bulk(). I get more invocations per report without it.
> > - Pass `cpu' to scf_cleanup_free_list in scftorture_invoker() instead
> > of scfp->cpu. The latter is the thread number which can be larger
> > than the number of CPUs, leading to a crash in such a case. Reported by
> > Boqun Feng.
> > - Clean up the per-CPU lists on module exit. Reported by Boqun Feng.
> >
> > Sebastian
> >
>
> For the whole series:
>
> Reviewed-by: Boqun Feng <boqun.feng@gmail.com>
> Tested-by: Boqun Feng <boqun.feng@gmail.com>
Thank you both!
Sebastian, I am guessing that the Kconfig change exposing the bugs fixed
by your series is headed to mainline for the upcoming merge window?
If so, I should of course push these in as well.
Thanx, Paul
* Re: [PATCH v3 0/4] scftorture: Avoid kfree from IRQ context.
2024-11-08 18:33 ` Paul E. McKenney
@ 2024-11-08 18:45 ` Sebastian Andrzej Siewior
2024-11-08 19:01 ` Paul E. McKenney
0 siblings, 1 reply; 9+ messages in thread
From: Sebastian Andrzej Siewior @ 2024-11-08 18:45 UTC (permalink / raw)
To: Paul E. McKenney
Cc: Boqun Feng, kasan-dev, linux-kernel, linux-mm, Marco Elver,
Peter Zijlstra, Tomas Gleixner, Vlastimil Babka, akpm, cl,
iamjoonsoo.kim, longman, penberg, rientjes, sfr
On 2024-11-08 10:33:29 [-0800], Paul E. McKenney wrote:
> Sebastian, I am guessing that the Kconfig change exposing the bugs fixed
> by your series is headed to mainline for the upcoming merge window?
Yes. It is in tip/locking/core.
> If so, I should of course push these in as well.
That would be nice ;)
> Thanx, Paul
Sebastian
* Re: [PATCH v3 0/4] scftorture: Avoid kfree from IRQ context.
2024-11-08 18:45 ` Sebastian Andrzej Siewior
@ 2024-11-08 19:01 ` Paul E. McKenney
0 siblings, 0 replies; 9+ messages in thread
From: Paul E. McKenney @ 2024-11-08 19:01 UTC (permalink / raw)
To: Sebastian Andrzej Siewior
Cc: Boqun Feng, kasan-dev, linux-kernel, linux-mm, Marco Elver,
Peter Zijlstra, Tomas Gleixner, Vlastimil Babka, akpm, cl,
iamjoonsoo.kim, longman, penberg, rientjes, sfr
On Fri, Nov 08, 2024 at 07:45:10PM +0100, Sebastian Andrzej Siewior wrote:
> On 2024-11-08 10:33:29 [-0800], Paul E. McKenney wrote:
> > Sebastian, I am guessing that the Kconfig change exposing the bugs fixed
> > by your series is headed to mainline for the upcoming merge window?
>
> Yes. It is in tip/locking/core.
>
> > If so, I should of course push these in as well.
>
> That would be nice ;)
Very well, I have started testing and if that goes well (as I expect
that it will), I will rebase them and put them into -next.
Thanx, Paul