* Re: Re: Re: [PATCH] mm/vmscan: Fix hard LOCKUP in function isolate_lru_folios
[not found] <20240815025226.8973-1-liuye@kylinos.cn>
@ 2024-08-23 2:04 ` liuye
2024-09-03 2:34 ` liuye
` (2 more replies)
0 siblings, 3 replies; 12+ messages in thread
From: liuye @ 2024-08-23 2:04 UTC (permalink / raw)
To: akpm; +Cc: linux-mm, linux-kernel, liuye
I'm sorry to bother you, but it looks like the following email, sent 7 days ago,
did not receive a response from you. Would you mind having a look at this
when you have a bit of free time, please?
> > > Fixes: b2e18757f2c9 ("mm, vmscan: begin reclaiming pages on a per-node basis")
> >
> > Merged in 2016.
> >
> > Under what circumstances does it occur?
>
> User processes request a large amount of memory and keep the pages active.
> Then a module continuously requests memory from the ZONE_DMA32 area.
> Memory reclaim is triggered once the ZONE_DMA32 watermark is reached.
> However, the pages on the LRU (active_anon) list are mostly from
> the ZONE_NORMAL area.
>
> > Can you please describe how to reproduce this?
>
> Terminal 1: Continuously increase the number of active (anon) pages.
> mkdir /tmp/memory
> mount -t tmpfs -o size=1024000M tmpfs /tmp/memory
> dd if=/dev/zero of=/tmp/memory/block bs=4M
> tail /tmp/memory/block
>
> Terminal 2:
> vmstat -a 1
> The "active" column will increase.
> procs -----------memory---------- ---swap-- -----io---- -system-- -------cpu-------
> r b swpd free inact active si so bi bo in cs us sy id wa st gu
> 1 0 0 1445623076 45898836 83646008 0 0 0 0 1807 1682 0 0 100 0 0 0
> 1 0 0 1445623076 43450228 86094616 0 0 0 0 1677 1468 0 0 100 0 0 0
> 1 0 0 1445623076 41003480 88541364 0 0 0 0 1985 2022 0 0 100 0 0 0
> 1 0 0 1445623076 38557088 90987756 0 0 0 4 1731 1544 0 0 100 0 0 0
> 1 0 0 1445623076 36109688 93435156 0 0 0 0 1755 1501 0 0 100 0 0 0
> 1 0 0 1445619552 33663256 95881632 0 0 0 0 2015 1678 0 0 100 0 0 0
> 1 0 0 1445619804 31217140 98327792 0 0 0 0 2058 2212 0 0 100 0 0 0
> 1 0 0 1445619804 28769988 100774944 0 0 0 0 1729 1585 0 0 100 0 0 0
> 1 0 0 1445619804 26322348 103222584 0 0 0 0 1774 1575 0 0 100 0 0 0
> 1 0 0 1445619804 23875592 105669340 0 0 0 4 1738 1604 0 0 100 0 0 0
>
> cat /proc/meminfo | head
> Active(anon) increases.
> MemTotal: 1579941036 kB
> MemFree: 1445618500 kB
> MemAvailable: 1453013224 kB
> Buffers: 6516 kB
> Cached: 128653956 kB
> SwapCached: 0 kB
> Active: 118110812 kB
> Inactive: 11436620 kB
> Active(anon): 115345744 kB
> Inactive(anon): 945292 kB
>
> When Active(anon) is 115345744 kB, loading the module with insmod triggers the ZONE_DMA32 watermark.
>
> perf shows nr_scanned=28835844.
> 28835844 * 4 kB = 115343376 kB, approximately equal to 115345744 kB.
>
> perf record -e vmscan:mm_vmscan_lru_isolate -aR
> perf script
> isolate_mode=0 classzone=1 order=1 nr_requested=32 nr_scanned=2 nr_skipped=2 nr_taken=0 lru=active_anon
> isolate_mode=0 classzone=1 order=1 nr_requested=32 nr_scanned=0 nr_skipped=0 nr_taken=0 lru=active_anon
> isolate_mode=0 classzone=1 order=0 nr_requested=32 nr_scanned=28835844 nr_skipped=28835844 nr_taken=0 lru=active_anon
> isolate_mode=0 classzone=1 order=1 nr_requested=32 nr_scanned=28835844 nr_skipped=28835844 nr_taken=0 lru=active_anon
> isolate_mode=0 classzone=1 order=0 nr_requested=32 nr_scanned=29 nr_skipped=29 nr_taken=0 lru=active_anon
> isolate_mode=0 classzone=1 order=0 nr_requested=32 nr_scanned=0 nr_skipped=0 nr_taken=0 lru=active_anon
>
> If Active(anon) is increased to 1000G and the module is then loaded with insmod to trigger the ZONE_DMA32 watermark, a hard lockup occurs.
>
> On my device, nr_scanned = 0x0000000003e3e937 at the time of the hard lockup. Converted to memory size: 0x0000000003e3e937 * 4 kB = 261072092 kB.
>
> #5 [ffffc90006fb7c28] isolate_lru_folios at ffffffffa597df53
> ffffc90006fb7c30: 0000000000000020 0000000000000000
> ffffc90006fb7c40: ffffc90006fb7d40 ffff88812cbd3000
> ffffc90006fb7c50: ffffc90006fb7d30 0000000106fb7de8
> ffffc90006fb7c60: ffffea04a2197008 ffffea0006ed4a48
> ffffc90006fb7c70: 0000000000000000 0000000000000000
> ffffc90006fb7c80: 0000000000000000 0000000000000000
> ffffc90006fb7c90: 0000000000000000 0000000000000000
> ffffc90006fb7ca0: 0000000000000000 0000000003e3e937
> ffffc90006fb7cb0: 0000000000000000 0000000000000000
> ffffc90006fb7cc0: 8d7c0b56b7874b00 ffff88812cbd3000
>
> > Why do you think it took eight years to be discovered?
>
> The problem requires the following conditions to occur:
> 1. The device memory should be large enough.
> 2. Pages in the LRU(active_anon) list are mostly from the ZONE_NORMAL area.
> 3. The memory in ZONE_DMA32 needs to reach the watermark.
>
> If the machine does not have enough memory, or if ZONE_DMA32 memory is used sensibly, this problem is difficult to hit.
>
> Note:
> The problem is most likely to occur with ZONE_DMA32 and ZONE_NORMAL, but other comparable scenarios may also trigger it.
>
> > It looks like that will fix, but perhaps something more fundamental
> > needs to be done - we're doing a tremendous amount of pretty pointless
> > work here. Answers to my above questions will help us resolve this.
> >
> > Thanks.
>
> Please refer to the above explanation for details.
>
> Thanks.
Thanks.
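For readers who want to re-check the numbers quoted above, here is a minimal Python sketch. It parses one `mm_vmscan_lru_isolate` line as printed by `perf script`, converts `nr_scanned` to kB, and flags a scan in which every page was skipped. The helper names (`parse_isolate_line`, `scanned_kb`, `is_futile_scan`) are made up for illustration, and the 4 kB page size is an assumption taken from the "* 4k" arithmetic in the report.

```python
import re

PAGE_KB = 4  # assumption: 4 kB base pages, as in the arithmetic in the report

FIELD_RE = re.compile(r"(\w+)=(\w+)")


def parse_isolate_line(line):
    """Return the key=value fields of one trace line as a dict."""
    fields = {}
    for key, value in FIELD_RE.findall(line):
        # Numeric fields become ints; names such as lru=active_anon stay strings.
        fields[key] = int(value) if value.isdigit() else value
    return fields


def scanned_kb(fields):
    """Memory covered by the scan, in kB."""
    return fields["nr_scanned"] * PAGE_KB


def is_futile_scan(fields):
    """True when pages were scanned, all of them were skipped, and none taken."""
    return (
        fields["nr_scanned"] > 0
        and fields["nr_skipped"] == fields["nr_scanned"]
        and fields["nr_taken"] == 0
    )


sample = (
    "isolate_mode=0 classzone=1 order=0 nr_requested=32 "
    "nr_scanned=28835844 nr_skipped=28835844 nr_taken=0 lru=active_anon"
)
fields = parse_isolate_line(sample)
print(scanned_kb(fields))      # 115343376
print(is_futile_scan(fields))  # True
print(0x3E3E937 * PAGE_KB)     # 261072092, the value seen at the hard lockup
```

The first figure is close to the 115345744 kB reported for Active(anon), which is consistent with the whole active_anon list being walked for a request of only 32 pages.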
^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [PATCH] mm/vmscan: Fix hard LOCKUP in function isolate_lru_folios
2024-08-23 2:04 ` Re: Re: [PATCH] mm/vmscan: Fix hard LOCKUP in function isolate_lru_folios liuye
@ 2024-09-03 2:34 ` liuye
2024-09-03 3:03 ` liuye
2024-09-06 1:16 ` liuye
2 siblings, 0 replies; 12+ messages in thread
From: liuye @ 2024-09-03 2:34 UTC (permalink / raw)
To: akpm; +Cc: linux-mm, linux-kernel
Friendly ping.
^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [PATCH] mm/vmscan: Fix hard LOCKUP in function isolate_lru_folios
2024-08-23 2:04 ` Re: Re: [PATCH] mm/vmscan: Fix hard LOCKUP in function isolate_lru_folios liuye
2024-09-03 2:34 ` liuye
@ 2024-09-03 3:03 ` liuye
2024-09-06 1:16 ` liuye
2 siblings, 0 replies; 12+ messages in thread
From: liuye @ 2024-09-03 3:03 UTC (permalink / raw)
To: akpm; +Cc: linux-mm, linux-kernel
Friendly ping.
^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [PATCH] mm/vmscan: Fix hard LOCKUP in function isolate_lru_folios
2024-08-23 2:04 ` Re: Re: [PATCH] mm/vmscan: Fix hard LOCKUP in function isolate_lru_folios liuye
2024-09-03 2:34 ` liuye
2024-09-03 3:03 ` liuye
@ 2024-09-06 1:16 ` liuye
2024-09-11 2:56 ` liuye
2 siblings, 1 reply; 12+ messages in thread
From: liuye @ 2024-09-06 1:16 UTC (permalink / raw)
To: akpm; +Cc: linux-mm, linux-kernel
Friendly ping.
^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [PATCH] mm/vmscan: Fix hard LOCKUP in function isolate_lru_folios
2024-09-06 1:16 ` liuye
@ 2024-09-11 2:56 ` liuye
2024-12-05 19:17 ` Yu Zhao
0 siblings, 1 reply; 12+ messages in thread
From: liuye @ 2024-09-11 2:56 UTC (permalink / raw)
To: akpm; +Cc: linux-mm, linux-kernel
Friendly ping.
Thanks.
^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [PATCH] mm/vmscan: Fix hard LOCKUP in function isolate_lru_folios
2024-09-11 2:56 ` liuye
@ 2024-12-05 19:17 ` Yu Zhao
0 siblings, 0 replies; 12+ messages in thread
From: Yu Zhao @ 2024-12-05 19:17 UTC (permalink / raw)
To: liuye, Hugh Dickins; +Cc: akpm, linux-mm, linux-kernel
On Thu, Dec 5, 2024 at 8:19 AM liuye <liuye@kylinos.cn> wrote:
>
>
> Friendly ping.
>
> Thanks.
Hugh has responded on your "v2 RESEND":
https://lore.kernel.org/linux-mm/dae8ea77-2bc1-8ee9-b94b-207e2c8e1b8d@google.com/
> On 2024/9/6 上午9:16, liuye wrote:
> >
> >
> > On 2024/8/23 上午10:04, liuye wrote:
> >> I'm sorry to bother you about that, but it looks like the following email send 7 days ago,
> >> did not receive a response from you. Do you mind having a look at this
> >> when you have a bit of free time please?
> >>
> >>>>> Fixes: b2e18757f2c9 ("mm, vmscan: begin reclaiming pages on a per-node basis")
> >>>>
> >>>> Merged in 2016.
> >>>>
> >>>> Under what circumstances does it occur?
> >>>
> >>> User processe are requesting a large amount of memory and keep page active.
> >>> Then a module continuously requests memory from ZONE_DMA32 area.
> >>> Memory reclaim will be triggered due to ZONE_DMA32 watermark alarm reached.
> >>> However pages in the LRU(active_anon) list are mostly from
> >>> the ZONE_NORMAL area.
> >>>
> >>>> Can you please describe how to reproduce this?
> >>>
> >>> Terminal 1: Construct to continuously increase pages active(anon).
> >>> mkdir /tmp/memory
> >>> mount -t tmpfs -o size=1024000M tmpfs /tmp/memory
> >>> dd if=/dev/zero of=/tmp/memory/block bs=4M
> >>> tail /tmp/memory/block
> >>>
> >>> Terminal 2:
> >>> vmstat -a 1
> >>> active will increase.
> >>> procs -----------memory---------- ---swap-- -----io---- -system-- -------cpu-------
> >>> r b swpd free inact active si so bi bo in cs us sy id wa st gu
> >>> 1 0 0 1445623076 45898836 83646008 0 0 0 0 1807 1682 0 0 100 0 0 0
> >>> 1 0 0 1445623076 43450228 86094616 0 0 0 0 1677 1468 0 0 100 0 0 0
> >>> 1 0 0 1445623076 41003480 88541364 0 0 0 0 1985 2022 0 0 100 0 0 0
> >>> 1 0 0 1445623076 38557088 90987756 0 0 0 4 1731 1544 0 0 100 0 0 0
> >>> 1 0 0 1445623076 36109688 93435156 0 0 0 0 1755 1501 0 0 100 0 0 0
> >>> 1 0 0 1445619552 33663256 95881632 0 0 0 0 2015 1678 0 0 100 0 0 0
> >>> 1 0 0 1445619804 31217140 98327792 0 0 0 0 2058 2212 0 0 100 0 0 0
> >>> 1 0 0 1445619804 28769988 100774944 0 0 0 0 1729 1585 0 0 100 0 0 0
> >>> 1 0 0 1445619804 26322348 103222584 0 0 0 0 1774 1575 0 0 100 0 0 0
> >>> 1 0 0 1445619804 23875592 105669340 0 0 0 4 1738 1604 0 0 100 0 0 0
> >>>
> >>> cat /proc/meminfo | head
> >>> Active(anon) increase.
> >>> MemTotal: 1579941036 kB
> >>> MemFree: 1445618500 kB
> >>> MemAvailable: 1453013224 kB
> >>> Buffers: 6516 kB
> >>> Cached: 128653956 kB
> >>> SwapCached: 0 kB
> >>> Active: 118110812 kB
> >>> Inactive: 11436620 kB
> >>> Active(anon): 115345744 kB
> >>> Inactive(anon): 945292 kB
> >>>
> >>> When the Active(anon) is 115345744 kB, insmod module triggers the ZONE_DMA32 watermark.
> >>>
> >>> perf show nr_scanned=28835844.
> >>> 28835844 * 4k = 115343376KB approximately equal to 115345744 kB.
> >>>
> >>> perf record -e vmscan:mm_vmscan_lru_isolate -aR
> >>> perf script
> >>> isolate_mode=0 classzone=1 order=1 nr_requested=32 nr_scanned=2 nr_skipped=2 nr_taken=0 lru=active_anon
> >>> isolate_mode=0 classzone=1 order=1 nr_requested=32 nr_scanned=0 nr_skipped=0 nr_taken=0 lru=active_anon
> >>> isolate_mode=0 classzone=1 order=0 nr_requested=32 nr_scanned=28835844 nr_skipped=28835844 nr_taken=0 lru=active_anon
> >>> isolate_mode=0 classzone=1 order=1 nr_requested=32 nr_scanned=28835844 nr_skipped=28835844 nr_taken=0 lru=active_anon
> >>> isolate_mode=0 classzone=1 order=0 nr_requested=32 nr_scanned=29 nr_skipped=29 nr_taken=0 lru=active_anon
> >>> isolate_mode=0 classzone=1 order=0 nr_requested=32 nr_scanned=0 nr_skipped=0 nr_taken=0 lru=active_anon
> >>>
> >>> If increase Active(anon) to 1000G then insmod module triggers the ZONE_DMA32 watermark. hard lockup will occur.
> >>>
> >>> In my device nr_scanned = 0000000003e3e937 when hard lockup. Convert to memory size 0x0000000003e3e937 * 4KB = 261072092 KB.
> >>>
> >>> #5 [ffffc90006fb7c28] isolate_lru_folios at ffffffffa597df53
> >>> ffffc90006fb7c30: 0000000000000020 0000000000000000
> >>> ffffc90006fb7c40: ffffc90006fb7d40 ffff88812cbd3000
> >>> ffffc90006fb7c50: ffffc90006fb7d30 0000000106fb7de8
> >>> ffffc90006fb7c60: ffffea04a2197008 ffffea0006ed4a48
> >>> ffffc90006fb7c70: 0000000000000000 0000000000000000
> >>> ffffc90006fb7c80: 0000000000000000 0000000000000000
> >>> ffffc90006fb7c90: 0000000000000000 0000000000000000
> >>> ffffc90006fb7ca0: 0000000000000000 0000000003e3e937
> >>> ffffc90006fb7cb0: 0000000000000000 0000000000000000
> >>> ffffc90006fb7cc0: 8d7c0b56b7874b00 ffff88812cbd3000
> >>>
> >>>> Why do you think it took eight years to be discovered?
> >>>
> >>> The problem requires the following conditions to occur:
> >>> 1. The device memory should be large enough.
> >>> 2. Pages in the LRU(active_anon) list are mostly from the ZONE_NORMAL area.
> >>> 3. The memory in ZONE_DMA32 needs to reach the watermark.
> >>>
> >>> If the memory is not large enough, or if ZONE_DMA32 memory usage is reasonably designed, this problem is difficult to detect.
> >>>
> >>> Note:
> >>> The problem is most likely to occur with ZONE_DMA32 and ZONE_NORMAL, but other suitable scenarios may also trigger it.
> >>>
> >>>> It looks like that will fix, but perhaps something more fundamental
> >>>> needs to be done - we're doing a tremendous amount of pretty pointless
> >>>> work here. Answers to my above questions will help us resolve this.
> >>>>
> >>>> Thanks.
> >>>
> >>> Please refer to the above explanation for details.
> >>>
> >>> Thanks.
> >>
> >> Thanks.
> >>
> > Friendly ping.
> >
>
* Re: [PATCH] mm/vmscan: Fix hard LOCKUP in function isolate_lru_folios
2024-09-25 9:29 ` Andrew Morton
@ 2024-09-25 9:53 ` liuye
0 siblings, 0 replies; 12+ messages in thread
From: liuye @ 2024-09-25 9:53 UTC (permalink / raw)
To: Andrew Morton; +Cc: linux-mm, linux-kernel
On 2024/9/25 5:29 PM, Andrew Morton wrote:
> On Wed, 25 Sep 2024 16:37:14 +0800 liuye <liuye@kylinos.cn> wrote:
>
>>
>>
>> On 2024/9/25 8:22 AM, Andrew Morton wrote:
>>> On Wed, 14 Aug 2024 17:18:25 +0800 liuye <liuye@kylinos.cn> wrote:
>>>
>>>> @@ -1669,10 +1670,12 @@ static unsigned long isolate_lru_folios(unsigned long nr_to_scan,
>>>> nr_pages = folio_nr_pages(folio);
>>>> total_scan += nr_pages;
>>>>
>>>> - if (folio_zonenum(folio) > sc->reclaim_idx ||
>>>> - skip_cma(folio, sc)) {
>>>> + /* Using max_nr_skipped to prevent hard LOCKUP*/
>>>> + if ((max_nr_skipped < SWAP_CLUSTER_MAX_SKIPPED) &&
>>>> + (folio_zonenum(folio) > sc->reclaim_idx || skip_cma(folio, sc))) {
>>>> nr_skipped[folio_zonenum(folio)] += nr_pages;
>>>> move_to = &folios_skipped;
>>>> + max_nr_skipped++;
>>>> goto move;
>>>
>>> This hunk is not applicable to current mainline.
>>>
>>
>> Please see the PATCH v2 in link [1], and the related discussion in link [2].
>> Then please explain why it is not applicable. Thank you.
>
> What I mean is that the patch doesn't apply.
>
> Current mainline has
>
> if (folio_zonenum(folio) > sc->reclaim_idx) {
> nr_skipped[folio_zonenum(folio)] += nr_pages;
> move_to = &folios_skipped;
> goto move;
> }
>
PATCH v2 is based on current mainline:
@@ -1650,9 +1651,12 @@ static unsigned long isolate_lru_folios(unsigned long nr_to_scan,
nr_pages = folio_nr_pages(folio);
total_scan += nr_pages;
- if (folio_zonenum(folio) > sc->reclaim_idx) {
+ /* Using max_nr_skipped to prevent hard LOCKUP*/
+ if (max_nr_skipped < SWAP_CLUSTER_MAX_SKIPPED &&
+ (folio_zonenum(folio) > sc->reclaim_idx)) {
nr_skipped[folio_zonenum(folio)] += nr_pages;
move_to = &folios_skipped;
+ max_nr_skipped++;
goto move;
}
* Re: [PATCH] mm/vmscan: Fix hard LOCKUP in function isolate_lru_folios
2024-09-25 8:37 ` liuye
@ 2024-09-25 9:29 ` Andrew Morton
2024-09-25 9:53 ` liuye
0 siblings, 1 reply; 12+ messages in thread
From: Andrew Morton @ 2024-09-25 9:29 UTC (permalink / raw)
To: liuye; +Cc: linux-mm, linux-kernel
On Wed, 25 Sep 2024 16:37:14 +0800 liuye <liuye@kylinos.cn> wrote:
>
>
> On 2024/9/25 8:22 AM, Andrew Morton wrote:
> > On Wed, 14 Aug 2024 17:18:25 +0800 liuye <liuye@kylinos.cn> wrote:
> >
> >> @@ -1669,10 +1670,12 @@ static unsigned long isolate_lru_folios(unsigned long nr_to_scan,
> >> nr_pages = folio_nr_pages(folio);
> >> total_scan += nr_pages;
> >>
> >> - if (folio_zonenum(folio) > sc->reclaim_idx ||
> >> - skip_cma(folio, sc)) {
> >> + /* Using max_nr_skipped to prevent hard LOCKUP*/
> >> + if ((max_nr_skipped < SWAP_CLUSTER_MAX_SKIPPED) &&
> >> + (folio_zonenum(folio) > sc->reclaim_idx || skip_cma(folio, sc))) {
> >> nr_skipped[folio_zonenum(folio)] += nr_pages;
> >> move_to = &folios_skipped;
> >> + max_nr_skipped++;
> >> goto move;
> >
> > This hunk is not applicable to current mainline.
> >
>
> Please see the PATCH v2 in link [1], and the related discussion in link [2].
> Then please explain why it is not applicable. Thank you.
What I mean is that the patch doesn't apply.
Current mainline has
if (folio_zonenum(folio) > sc->reclaim_idx) {
nr_skipped[folio_zonenum(folio)] += nr_pages;
move_to = &folios_skipped;
goto move;
}
* Re: [PATCH] mm/vmscan: Fix hard LOCKUP in function isolate_lru_folios
2024-09-25 0:22 ` Andrew Morton
@ 2024-09-25 8:37 ` liuye
2024-09-25 9:29 ` Andrew Morton
0 siblings, 1 reply; 12+ messages in thread
From: liuye @ 2024-09-25 8:37 UTC (permalink / raw)
To: Andrew Morton; +Cc: linux-mm, linux-kernel
On 2024/9/25 8:22 AM, Andrew Morton wrote:
> On Wed, 14 Aug 2024 17:18:25 +0800 liuye <liuye@kylinos.cn> wrote:
>
>> @@ -1669,10 +1670,12 @@ static unsigned long isolate_lru_folios(unsigned long nr_to_scan,
>> nr_pages = folio_nr_pages(folio);
>> total_scan += nr_pages;
>>
>> - if (folio_zonenum(folio) > sc->reclaim_idx ||
>> - skip_cma(folio, sc)) {
>> + /* Using max_nr_skipped to prevent hard LOCKUP*/
>> + if ((max_nr_skipped < SWAP_CLUSTER_MAX_SKIPPED) &&
>> + (folio_zonenum(folio) > sc->reclaim_idx || skip_cma(folio, sc))) {
>> nr_skipped[folio_zonenum(folio)] += nr_pages;
>> move_to = &folios_skipped;
>> + max_nr_skipped++;
>> goto move;
>
> This hunk is not applicable to current mainline.
>
Please see the PATCH v2 in link [1], and the related discussion in link [2].
Then please explain why it is not applicable. Thank you.
[1]:https://lore.kernel.org/all/20240919021443.9170-1-liuye@kylinos.cn/
[2]:https://lore.kernel.org/all/e878653e-d380-81c2-90a8-fd2d1d4e7287@kylinos.cn/
Thanks,
liuye
* Re: [PATCH] mm/vmscan: Fix hard LOCKUP in function isolate_lru_folios
2024-08-14 9:18 liuye
2024-08-14 21:27 ` Andrew Morton
@ 2024-09-25 0:22 ` Andrew Morton
2024-09-25 8:37 ` liuye
1 sibling, 1 reply; 12+ messages in thread
From: Andrew Morton @ 2024-09-25 0:22 UTC (permalink / raw)
To: liuye; +Cc: linux-mm, linux-kernel
On Wed, 14 Aug 2024 17:18:25 +0800 liuye <liuye@kylinos.cn> wrote:
> @@ -1669,10 +1670,12 @@ static unsigned long isolate_lru_folios(unsigned long nr_to_scan,
> nr_pages = folio_nr_pages(folio);
> total_scan += nr_pages;
>
> - if (folio_zonenum(folio) > sc->reclaim_idx ||
> - skip_cma(folio, sc)) {
> + /* Using max_nr_skipped to prevent hard LOCKUP*/
> + if ((max_nr_skipped < SWAP_CLUSTER_MAX_SKIPPED) &&
> + (folio_zonenum(folio) > sc->reclaim_idx || skip_cma(folio, sc))) {
> nr_skipped[folio_zonenum(folio)] += nr_pages;
> move_to = &folios_skipped;
> + max_nr_skipped++;
> goto move;
This hunk is not applicable to current mainline.
* Re: [PATCH] mm/vmscan: Fix hard LOCKUP in function isolate_lru_folios
2024-08-14 9:18 liuye
@ 2024-08-14 21:27 ` Andrew Morton
2024-09-25 0:22 ` Andrew Morton
1 sibling, 0 replies; 12+ messages in thread
From: Andrew Morton @ 2024-08-14 21:27 UTC (permalink / raw)
To: liuye; +Cc: linux-mm, linux-kernel
On Wed, 14 Aug 2024 17:18:25 +0800 liuye <liuye@kylinos.cn> wrote:
> This fixes the following hard lockup in isolate_lru_folios() during
> memory reclaim. If the LRU mostly contains ineligible folios, the
> scan may trigger the watchdog.
>
> watchdog: Watchdog detected hard LOCKUP on cpu 173
> RIP: 0010:native_queued_spin_lock_slowpath+0x255/0x2a0
> Call Trace:
> _raw_spin_lock_irqsave+0x31/0x40
> folio_lruvec_lock_irqsave+0x5f/0x90
> folio_batch_move_lru+0x91/0x150
> lru_add_drain_per_cpu+0x1c/0x40
> process_one_work+0x17d/0x350
> worker_thread+0x27b/0x3a0
> kthread+0xe8/0x120
> ret_from_fork+0x34/0x50
> ret_from_fork_asm+0x1b/0x30
>
> lruvec->lru_lock owner:
>
> PID: 2865 TASK: ffff888139214d40 CPU: 40 COMMAND: "kswapd0"
> #0 [fffffe0000945e60] crash_nmi_callback at ffffffffa567a555
> #1 [fffffe0000945e68] nmi_handle at ffffffffa563b171
> #2 [fffffe0000945eb0] default_do_nmi at ffffffffa6575920
> #3 [fffffe0000945ed0] exc_nmi at ffffffffa6575af4
> #4 [fffffe0000945ef0] end_repeat_nmi at ffffffffa6601dde
> [exception RIP: isolate_lru_folios+403]
> RIP: ffffffffa597df53 RSP: ffffc90006fb7c28 RFLAGS: 00000002
> RAX: 0000000000000001 RBX: ffffc90006fb7c60 RCX: ffffea04a2196f88
> RDX: ffffc90006fb7c60 RSI: ffffc90006fb7c60 RDI: ffffea04a2197048
> RBP: ffff88812cbd3010 R8: ffffea04a2197008 R9: 0000000000000001
> R10: 0000000000000000 R11: 0000000000000001 R12: ffffea04a2197008
> R13: ffffea04a2197048 R14: ffffc90006fb7de8 R15: 0000000003e3e937
> ORIG_RAX: ffffffffffffffff CS: 0010 SS: 0018
> <NMI exception stack>
> #5 [ffffc90006fb7c28] isolate_lru_folios at ffffffffa597df53
> #6 [ffffc90006fb7cf8] shrink_active_list at ffffffffa597f788
> #7 [ffffc90006fb7da8] balance_pgdat at ffffffffa5986db0
> #8 [ffffc90006fb7ec0] kswapd at ffffffffa5987354
> #9 [ffffc90006fb7ef8] kthread at ffffffffa5748238
> crash>
Well that's bad.
> Fixes: b2e18757f2c9 ("mm, vmscan: begin reclaiming pages on a per-node basis")
Merged in 2016.
Can you please describe how to reproduce this? Under what circumstances
does it occur? Why do you think it took eight years to be discovered?
> --- a/mm/vmscan.c
> +++ b/mm/vmscan.c
> @@ -1655,6 +1655,7 @@ static unsigned long isolate_lru_folios(unsigned long nr_to_scan,
> unsigned long nr_skipped[MAX_NR_ZONES] = { 0, };
> unsigned long skipped = 0;
> unsigned long scan, total_scan, nr_pages;
> + unsigned long max_nr_skipped = 0;
> LIST_HEAD(folios_skipped);
>
> total_scan = 0;
> @@ -1669,10 +1670,12 @@ static unsigned long isolate_lru_folios(unsigned long nr_to_scan,
> nr_pages = folio_nr_pages(folio);
> total_scan += nr_pages;
>
> - if (folio_zonenum(folio) > sc->reclaim_idx ||
> - skip_cma(folio, sc)) {
> + /* Using max_nr_skipped to prevent hard LOCKUP*/
> + if ((max_nr_skipped < SWAP_CLUSTER_MAX_SKIPPED) &&
> + (folio_zonenum(folio) > sc->reclaim_idx || skip_cma(folio, sc))) {
> nr_skipped[folio_zonenum(folio)] += nr_pages;
> move_to = &folios_skipped;
> + max_nr_skipped++;
> goto move;
> }
It looks like that will fix, but perhaps something more fundamental
needs to be done - we're doing a tremendous amount of pretty pointless
work here. Answers to my above questions will help us resolve this.
Thanks.
* [PATCH] mm/vmscan: Fix hard LOCKUP in function isolate_lru_folios
@ 2024-08-14 9:18 liuye
2024-08-14 21:27 ` Andrew Morton
2024-09-25 0:22 ` Andrew Morton
0 siblings, 2 replies; 12+ messages in thread
From: liuye @ 2024-08-14 9:18 UTC (permalink / raw)
To: akpm; +Cc: linux-mm, linux-kernel, liuye
This fixes the following hard lockup in isolate_lru_folios() during
memory reclaim. If the LRU mostly contains ineligible folios, the
scan may trigger the watchdog.
watchdog: Watchdog detected hard LOCKUP on cpu 173
RIP: 0010:native_queued_spin_lock_slowpath+0x255/0x2a0
Call Trace:
_raw_spin_lock_irqsave+0x31/0x40
folio_lruvec_lock_irqsave+0x5f/0x90
folio_batch_move_lru+0x91/0x150
lru_add_drain_per_cpu+0x1c/0x40
process_one_work+0x17d/0x350
worker_thread+0x27b/0x3a0
kthread+0xe8/0x120
ret_from_fork+0x34/0x50
ret_from_fork_asm+0x1b/0x30
lruvec->lru_lock owner:
PID: 2865 TASK: ffff888139214d40 CPU: 40 COMMAND: "kswapd0"
#0 [fffffe0000945e60] crash_nmi_callback at ffffffffa567a555
#1 [fffffe0000945e68] nmi_handle at ffffffffa563b171
#2 [fffffe0000945eb0] default_do_nmi at ffffffffa6575920
#3 [fffffe0000945ed0] exc_nmi at ffffffffa6575af4
#4 [fffffe0000945ef0] end_repeat_nmi at ffffffffa6601dde
[exception RIP: isolate_lru_folios+403]
RIP: ffffffffa597df53 RSP: ffffc90006fb7c28 RFLAGS: 00000002
RAX: 0000000000000001 RBX: ffffc90006fb7c60 RCX: ffffea04a2196f88
RDX: ffffc90006fb7c60 RSI: ffffc90006fb7c60 RDI: ffffea04a2197048
RBP: ffff88812cbd3010 R8: ffffea04a2197008 R9: 0000000000000001
R10: 0000000000000000 R11: 0000000000000001 R12: ffffea04a2197008
R13: ffffea04a2197048 R14: ffffc90006fb7de8 R15: 0000000003e3e937
ORIG_RAX: ffffffffffffffff CS: 0010 SS: 0018
<NMI exception stack>
#5 [ffffc90006fb7c28] isolate_lru_folios at ffffffffa597df53
#6 [ffffc90006fb7cf8] shrink_active_list at ffffffffa597f788
#7 [ffffc90006fb7da8] balance_pgdat at ffffffffa5986db0
#8 [ffffc90006fb7ec0] kswapd at ffffffffa5987354
#9 [ffffc90006fb7ef8] kthread at ffffffffa5748238
crash>
Fixes: b2e18757f2c9 ("mm, vmscan: begin reclaiming pages on a per-node basis")
Signed-off-by: liuye <liuye@kylinos.cn>
---
include/linux/swap.h | 1 +
mm/vmscan.c | 7 +++++--
2 files changed, 6 insertions(+), 2 deletions(-)
diff --git a/include/linux/swap.h b/include/linux/swap.h
index ba7ea95d1c57..afb3274c90ef 100644
--- a/include/linux/swap.h
+++ b/include/linux/swap.h
@@ -223,6 +223,7 @@ enum {
};
#define SWAP_CLUSTER_MAX 32UL
+#define SWAP_CLUSTER_MAX_SKIPPED (SWAP_CLUSTER_MAX << 10)
#define COMPACT_CLUSTER_MAX SWAP_CLUSTER_MAX
/* Bit flag in swap_map */
diff --git a/mm/vmscan.c b/mm/vmscan.c
index cfa839284b92..02a8f86d4883 100644
--- a/mm/vmscan.c
+++ b/mm/vmscan.c
@@ -1655,6 +1655,7 @@ static unsigned long isolate_lru_folios(unsigned long nr_to_scan,
unsigned long nr_skipped[MAX_NR_ZONES] = { 0, };
unsigned long skipped = 0;
unsigned long scan, total_scan, nr_pages;
+ unsigned long max_nr_skipped = 0;
LIST_HEAD(folios_skipped);
total_scan = 0;
@@ -1669,10 +1670,12 @@ static unsigned long isolate_lru_folios(unsigned long nr_to_scan,
nr_pages = folio_nr_pages(folio);
total_scan += nr_pages;
- if (folio_zonenum(folio) > sc->reclaim_idx ||
- skip_cma(folio, sc)) {
+ /* Using max_nr_skipped to prevent hard LOCKUP*/
+ if ((max_nr_skipped < SWAP_CLUSTER_MAX_SKIPPED) &&
+ (folio_zonenum(folio) > sc->reclaim_idx || skip_cma(folio, sc))) {
nr_skipped[folio_zonenum(folio)] += nr_pages;
move_to = &folios_skipped;
+ max_nr_skipped++;
goto move;
}
--
2.25.1
end of thread, other threads:[~2024-12-05 19:17 UTC | newest]
Thread overview: 12+ messages
[not found] <20240815025226.8973-1-liuye@kylinos.cn>
2024-08-23 2:04 ` Re: Re: [PATCH] mm/vmscan: Fix hard LOCKUP in function isolate_lru_folios liuye
2024-09-03 2:34 ` liuye
2024-09-03 3:03 ` liuye
2024-09-06 1:16 ` liuye
2024-09-11 2:56 ` liuye
2024-12-05 19:17 ` Yu Zhao
2024-08-14 9:18 liuye
2024-08-14 21:27 ` Andrew Morton
2024-09-25 0:22 ` Andrew Morton
2024-09-25 8:37 ` liuye
2024-09-25 9:29 ` Andrew Morton
2024-09-25 9:53 ` liuye