* RFC [PATCH1] mm: vmscan: Limit isolate_lru_folios() IRQ holdoff latency impact
[not found] <88ae633aa910c7e320b89e5f4779d472f7cffffb.camel@gmx.de>
@ 2024-10-31 10:21 ` Mike Galbraith
[not found] ` <ad30ef4edfc22e7f29280fdec763189efddda9d0.camel@gmx.de>
1 sibling, 0 replies; 2+ messages in thread
From: Mike Galbraith @ 2024-10-31 10:21 UTC (permalink / raw)
To: LKML; +Cc: linux-mm
[-- Attachment #1: Type: text/plain, Size: 2962 bytes --]
(was regression: mm: vmscan: - size XL irqoff time increase v6.10+)
While examining a service latency regression beginning at v6.10, it was
discovered that isolate_lru_folios() runtimes could cause IRQ holdoff
spanning several ticks. The worst measured was 78ms.
Given it's called with IRQs disabled, cap its runtime at roughly a
millisecond.
With this patch, and one to mitigate move_expired_inodes() induced
wakeup latency hits of up to 66ms, the load that repeatably produced this:
(6.12.0-rc5 running osc kernel package build + ram sized bonnie)
T: 0 ( 4541) P:99 I:1000 C: 773124 Min: 1 Act: 6 Avg: 3 Max: 53381
T: 1 ( 4542) P:99 I:1500 C: 515381 Min: 1 Act: 7 Avg: 3 Max: 60119
T: 2 ( 4543) P:99 I:2000 C: 386465 Min: 1 Act: 5 Avg: 4 Max: 67469
T: 3 ( 4544) P:99 I:2500 C: 309182 Min: 1 Act: 7 Avg: 4 Max: 65534
T: 4 ( 4545) P:99 I:3000 C: 257682 Min: 1 Act: 6 Avg: 4 Max: 57776
T: 5 ( 4546) P:99 I:3500 C: 220874 Min: 1 Act: 4 Avg: 4 Max: 64737
T: 6 ( 4547) P:99 I:4000 C: 193289 Min: 1 Act: 6 Avg: 4 Max: 52567
T: 7 ( 4548) P:99 I:4500 C: 171799 Min: 1 Act: 6 Avg: 5 Max: 52606
latency went to this:
T: 0 ( 6078) P:99 I:1000 C:1031230 Min: 1 Act: 7 Avg: 4 Max: 4449
T: 1 ( 6079) P:99 I:1500 C: 687505 Min: 1 Act: 5 Avg: 4 Max: 2075
T: 2 ( 6080) P:99 I:2000 C: 515627 Min: 1 Act: 334 Avg: 5 Max: 2987
T: 3 ( 6081) P:99 I:2500 C: 412503 Min: 1 Act: 8 Avg: 5 Max: 2425
T: 4 ( 6082) P:99 I:3000 C: 343752 Min: 1 Act: 11 Avg: 5 Max: 1466
T: 5 ( 6083) P:99 I:3500 C: 294643 Min: 1 Act: 282 Avg: 5 Max: 3844
T: 6 ( 6084) P:99 I:4000 C: 257813 Min: 1 Act: 7 Avg: 5 Max: 3291
T: 7 ( 6085) P:99 I:4500 C: 229167 Min: 1 Act: 9 Avg: 5 Max: 3516
Signed-off-by: Mike Galbraith <efault@gmx.de>
---
mm/vmscan.c | 19 +++++++++++++++++++
1 file changed, 19 insertions(+)
--- a/mm/vmscan.c
+++ b/mm/vmscan.c
@@ -63,6 +63,9 @@
#include <linux/swapops.h>
#include <linux/balloon_compaction.h>
#include <linux/sched/sysctl.h>
+#ifndef CONFIG_PREEMPT_RT
+#include <linux/sched/clock.h>
+#endif
#include "internal.h"
#include "swap.h"
@@ -1657,6 +1660,10 @@ static unsigned long isolate_lru_folios(
unsigned long skipped = 0;
unsigned long scan, total_scan, nr_pages;
LIST_HEAD(folios_skipped);
+#ifndef CONFIG_PREEMPT_RT
+ u64 then = local_clock();
+ int batch = 0;
+#endif
total_scan = 0;
scan = 0;
@@ -1709,6 +1716,18 @@ static unsigned long isolate_lru_folios(
move_to = dst;
move:
list_move(&folio->lru, move_to);
+
+#ifndef CONFIG_PREEMPT_RT
+ /*
+ * IRQs are disabled, cap holdoff at a millisecond or so.
+ */
+ batch += nr_pages;
+ if (batch < 1024)
+ continue;
+ if (local_clock() - then >= NSEC_PER_MSEC)
+ break;
+ batch = 0;
+#endif
}
/*
[-- Attachment #2: irqoff-trace-6.12.0.g4236f913-master --]
[-- Type: text/plain, Size: 2159 bytes --]
# tracer: irqsoff
#
# irqsoff latency trace v1.1.5 on 6.12.0.g4236f913-master
# --------------------------------------------------------------------
# latency: 77766 us, #12/12, CPU#5 | (M:unknown VP:0, KP:0, SP:0 HP:0 #P:8)
# -----------------
# | task: kswapd0-90 (uid:0 nice:0 policy:0 rt_prio:0)
# -----------------
# => started at: _raw_spin_lock_irq+0x16/0x40
# => ended at: _raw_spin_unlock_irq+0x10/0x30
#
#
# _------=> CPU#
# / _-----=> irqs-off/BH-disabled
# | / _----=> need-resched
# || / _---=> hardirq/softirq
# ||| / _--=> preempt-depth
# |||| / _-=> migrate-disable
# ||||| / delay
# cmd pid |||||| time | caller
# \ / |||||| \ | /
kswapd0-90 5d.... 0us : _raw_spin_lock_irq+0x16/0x40
kswapd0-90 5d.... 0us : preempt_count_add+0x0/0x50 <-_raw_spin_lock_irq+0x20/0x40
kswapd0-90 5d..1. 0us*: isolate_lru_folios+0x0/0x3e0 <-shrink_inactive_list+0x10f/0x440
kswapd0-90 5d..1. 77765us : __rcu_read_lock+0x0/0x20 <-isolate_lru_folios+0x160/0x3e0
kswapd0-90 5d..1. 77765us : __rcu_read_unlock+0x0/0x180 <-isolate_lru_folios+0x199/0x3e0
kswapd0-90 5d..1. 77766us : __mod_node_page_state+0x0/0x70 <-isolate_lru_folios+0x2ee/0x3e0
kswapd0-90 5d..1. 77766us : __mod_zone_page_state+0x0/0x60 <-isolate_lru_folios+0x2fc/0x3e0
kswapd0-90 5d..1. 77766us : __mod_node_page_state+0x0/0x70 <-shrink_inactive_list+0x12a/0x440
kswapd0-90 5d..1. 77766us : _raw_spin_unlock_irq+0x0/0x30 <-shrink_inactive_list+0x16a/0x440
kswapd0-90 5d..1. 77766us : _raw_spin_unlock_irq+0x10/0x30
kswapd0-90 5d..1. 77767us : tracer_hardirqs_on+0xcb/0xe0 <-_raw_spin_unlock_irq+0x10/0x30
kswapd0-90 5d..1. 77774us : <stack trace>
=> shrink_inactive_list+0x16a/0x440
=> shrink_lruvec+0x2fc/0x470
=> shrink_node+0x1d6/0x4a0
=> balance_pgdat+0x280/0x5c0
=> kswapd+0x10f/0x350
=> kthread+0xfa/0x130
=> ret_from_fork+0x2d/0x50
=> ret_from_fork_asm+0x11/0x20
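The capping pattern the patch uses, amortizing the clock read over 1024-page batches so the common per-item path stays cheap while IRQs are off, can be sketched in userspace C. All names here are illustrative: the injectable `clk` callback stands in for the kernel's local_clock(), which the patch reads directly.

```c
#include <stdint.h>

#define NSEC_PER_MSEC 1000000ull

/*
 * Sketch of the patch's time-capped batch loop.  The clock source is
 * injected so it can be faked in tests; the clock is consulted only
 * once per 1024-item batch, keeping the per-item fast path cheap even
 * though the whole loop runs with IRQs disabled.
 */
static long process_capped(long total, uint64_t (*clk)(void),
			   long *clock_reads)
{
	uint64_t then = clk();
	long done = 0;
	int batch = 0;

	*clock_reads = 0;
	for (long i = 0; i < total; i++) {
		done++;			/* the per-item work */
		if (++batch < 1024)
			continue;
		(*clock_reads)++;
		if (clk() - then >= NSEC_PER_MSEC)
			break;		/* ~1ms elapsed: stop isolating */
		batch = 0;
	}
	return done;
}
```

With cheap per-item work the budget is never hit, so the clock is read only ⌊total/1024⌋ times; the expensive case is bounded by the millisecond budget rather than by LRU list length.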
* Re: RFC [PATCH2] fs/writeback: Mitigate move_expired_inodes() induced service latency
[not found] ` <ad30ef4edfc22e7f29280fdec763189efddda9d0.camel@gmx.de>
@ 2024-10-31 10:28 ` Mike Galbraith
0 siblings, 0 replies; 2+ messages in thread
From: Mike Galbraith @ 2024-10-31 10:28 UTC (permalink / raw)
To: LKML; +Cc: Ryan Roberts, linux-mm
(grr.. CC was supposed to be kvack)
On Thu, 2024-10-31 at 11:21 +0100, Mike Galbraith wrote:
> (was regression: mm: vmscan: - size XL irqoff time increase v6.10+)
>
>
> Break off queueing of IO after we've been at it for a ms or so and a
> preemption is due, to keep writeback latency impact at least reasonable.
> The IO we're queueing under spinlock still has to be started under that
> same lock.
>
> wakeup_rt tracing caught this function spanning 66ms on an i7-4790 box.
>
> With this patch applied on top of one to mitigate even worse IRQ holdoff
> induced hits (78ms) by isolate_lru_folios(), the same trivial load that
> leads to this and worse (osc kernel package build + bonnie):
> T: 1 ( 6211) P:99 I:1500 C: 639971 Min: 1 Act: 7 Avg: 12 Max: 66696
>
> resulted in this perfectly reasonable max:
> T: 0 ( 6078) P:99 I:1000 C:1031230 Min: 1 Act: 7 Avg: 4 Max: 4449
>
> Note: cyclictest -Smp99 is only the messenger. This is not an RT issue,
> RT is fingering bad generic behavior.
>
> Signed-off-by: Mike Galbraith <efault@gmx.de>
> ---
> fs/fs-writeback.c | 18 ++++++++++++++++++
> 1 file changed, 18 insertions(+)
>
> --- a/fs/fs-writeback.c
> +++ b/fs/fs-writeback.c
> @@ -29,6 +29,7 @@
> #include <linux/tracepoint.h>
> #include <linux/device.h>
> #include <linux/memcontrol.h>
> +#include <linux/sched/clock.h>
> #include "internal.h"
>
> /*
> @@ -1424,6 +1425,10 @@ static int move_expired_inodes(struct li
> struct inode *inode;
> int do_sb_sort = 0;
> int moved = 0;
> +#ifndef CONFIG_PREEMPT_RT
> + u64 then = local_clock();
> + int iter = 0;
> +#endif
>
> while (!list_empty(delaying_queue)) {
> inode = wb_inode(delaying_queue->prev);
> @@ -1439,6 +1444,19 @@ static int move_expired_inodes(struct li
> if (sb && sb != inode->i_sb)
> do_sb_sort = 1;
> sb = inode->i_sb;
> +#ifndef CONFIG_PREEMPT_RT
> + /*
> + * We're under ->list_lock here, and the IO being queued
> + * still has to be started. Stop queueing when we've been
> + * at it for a ms or so and a preemption is due, to keep
> + * latency impact reasonable.
> + */
> + if (iter++ < 100 || !need_resched())
> + continue;
> + if (local_clock() - then > NSEC_PER_MSEC)
> + break;
> + iter = 0;
> +#endif
> }
>
> /* just one sb in list, splice to dispatch_queue and we're done */
>
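The bail-out gating in this hunk — skip all checks for the first ~100 inodes, and consult the clock only when a preemption is actually pending — can be sketched in plain C. Names are illustrative; the injectable `clk` and `resched_pending` callbacks stand in for the kernel's local_clock() and need_resched().

```c
#include <stdbool.h>
#include <stdint.h>

#define NSEC_PER_MSEC 1000000ull

/*
 * Sketch of the patch's bail-out logic.  Items keep being "moved"
 * until roughly a millisecond has passed *and* a reschedule is
 * pending; the clock is read at most once per 100 items, and never
 * at all if nobody is waiting for the CPU.
 */
static long move_capped(long total, uint64_t (*clk)(void),
			bool (*resched_pending)(void), long *clock_reads)
{
	uint64_t then = clk();
	long moved = 0;
	int iter = 0;

	*clock_reads = 0;
	for (long i = 0; i < total; i++) {
		moved++;		/* queue one inode's worth of IO */
		if (iter++ < 100 || !resched_pending())
			continue;
		(*clock_reads)++;
		if (clk() - then > NSEC_PER_MSEC)
			break;		/* budget spent and someone waits */
		iter = 0;
	}
	return moved;
}
```

The need_resched() gate is what makes this cheaper than the vmscan variant: when nothing is waiting for the CPU, the loop never pays for a clock read at all.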