* reply: [PATCHv5] mm: skip CMA pages when they are not available
@ 2024-08-13 9:58 黄朝阳 (Zhaoyang Huang)
2024-08-16 17:20 ` Breno Leitao
0 siblings, 1 reply; 2+ messages in thread
From: 黄朝阳 (Zhaoyang Huang) @ 2024-08-13 9:58 UTC (permalink / raw)
To: Breno Leitao
Cc: Andrew Morton, Matthew Wilcox, Suren Baghdasaryan, Minchan Kim,
linux-mm, linux-kernel, Zhaoyang Huang,
王科 (Ke Wang),
usamaarif642, riel, hannes, nphamcs
>
>On Wed, May 31, 2023 at 10:51:01AM +0800, zhaoyang.huang wrote:
>> From: Zhaoyang Huang <zhaoyang.huang@unisoc.com>
>>
>> This patch fixes unproductive reclaiming of CMA pages by skipping them
>> when they are not available for the current context. It arises from
>> the OOM issue below, which is caused by a large proportion of MIGRATE_CMA
>> pages among free pages.
>
>Hello,
>
>I've been looking into a problem with high memory pressure causing OOMs in
>some of our workloads, and it seems that this change may have introduced lock
>contention when there is high memory pressure.
>
>I've collected some metrics for my specific workload that suggest this change
>has increased the lruvec->lru_lock waittime-max by 500x and the
>waittime-avg by 20x.
>
>Experiment
>==========
>
>The experiment involved 100 hosts, each with 64GB of memory and a single
>Xeon 8321HC CPU. The experiment ran for over 80 hours.
>
>Half of the hosts (50) were configured with the patch reverted and lock stat
>enabled, while the other half was run against the upstream version.
>All machines had hugetlb_cma=6G set as a command-line argument.
>
>In this context, "upstream" refers to kernel release 6.9 with some minor
>changes that should not impact the results.
>
>Workload
>========
>
>The workload is a Java-based application that fully utilizes the memory; in fact,
>the JVM runs with `-Xms50735m -Xmx50735m` arguments.
>
>Results:
>=======
>
>A few values from lockstat:
>
>                  waittime-max  waittime-total  waittime-avg  holdtime-max
>6.9:                    242889     15618873933           715         17485
>6.9-with-revert:           487       688563299            34           464
>
>The full data can be seen at:
>https://docs.google.com/spreadsheets/d/1Dl-8ImlE4OZrfKjbyWAIWWuQtgD3fwEEl9INaZQZ4e8/edit?usp=sharing
>
>Possible causes:
>================
>
>I've been discussing this with colleagues and we're speculating that the high
>contention might be linked to the fact that CMA regions are now being skipped.
>This could potentially extend the duration of the
>isolate_lru_folios() 'while' loop, resulting in increased pressure on the lock.
>
>However, I want to emphasize that I'm not an expert in this area and I am
>simply sharing the data I collected.
Could you please try the patch below? It may help:
https://lore.kernel.org/linux-mm/CAOUHufa7OBtNHKMhfu8wOOE4f0w3b0_2KzzV7-hrc9rVL8e=iw@mail.gmail.com/
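
The "500x" and "20x" figures in the quoted report can be checked against the lockstat values it lists. The following is a minimal sketch that only recomputes the upstream/revert ratios from those four columns; the names `LOCKSTAT` and `ratios` are made up for illustration and are not part of any kernel tooling.

```python
# Lockstat values for lruvec->lru_lock, copied from the quoted report.
LOCKSTAT = {
    "6.9": {
        "waittime-max": 242889,
        "waittime-total": 15618873933,
        "waittime-avg": 715,
        "holdtime-max": 17485,
    },
    "6.9-with-revert": {
        "waittime-max": 487,
        "waittime-total": 688563299,
        "waittime-avg": 34,
        "holdtime-max": 464,
    },
}


def ratios(upstream, reverted):
    """Return upstream/reverted for every column present in `upstream`."""
    return {name: upstream[name] / reverted[name] for name in upstream}


if __name__ == "__main__":
    result = ratios(LOCKSTAT["6.9"], LOCKSTAT["6.9-with-revert"])
    for name, value in result.items():
        print(f"{name}: {value:.1f}x")
```

With these inputs, waittime-max comes out at roughly 499x and waittime-avg at roughly 21x, which is consistent with the rounded figures in the report.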
* Re: reply: [PATCHv5] mm: skip CMA pages when they are not available
2024-08-13 9:58 reply: [PATCHv5] mm: skip CMA pages when they are not available 黄朝阳 (Zhaoyang Huang)
@ 2024-08-16 17:20 ` Breno Leitao
0 siblings, 0 replies; 2+ messages in thread
From: Breno Leitao @ 2024-08-16 17:20 UTC (permalink / raw)
To: 黄朝阳 (Zhaoyang Huang)
Cc: Andrew Morton, Matthew Wilcox, Suren Baghdasaryan, Minchan Kim,
linux-mm, linux-kernel, Zhaoyang Huang,
王科 (Ke Wang),
usamaarif642, riel, hannes, nphamcs
Hello Zhaoyang,
On Tue, Aug 13, 2024 at 09:58:32AM +0000, 黄朝阳 (Zhaoyang Huang) wrote:
> >I've been discussing this with colleagues and we're speculating that the high
> >contention might be linked to the fact that CMA regions are now being skipped.
> >This could potentially extend the duration of the
> >isolate_lru_folios() 'while' loop, resulting in increased pressure on the lock.
> >However, I want to emphasize that I'm not an expert in this area and I am
> >simply sharing the data I collected.
> Could you please try the patch below? It may help:
>
> https://lore.kernel.org/linux-mm/CAOUHufa7OBtNHKMhfu8wOOE4f0w3b0_2KzzV7-hrc9rVL8e=iw@mail.gmail.com/
Yes, my colleague Usama has tried it, and it solved the problem. Thanks
for the heads-up, it was very useful.
--breno
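
The hypothesis quoted above (skipping CMA folios lengthens the isolate_lru_folios() loop) can be illustrated with a toy model. This is only a sketch, not kernel code: the function `isolate`, the list-of-booleans LRU, and the assumption that a skipped entry does not consume the scan budget are all simplifications introduced here for illustration.

```python
from typing import List, Tuple


def isolate(lru: List[bool], nr_to_scan: int, skip_cma: bool) -> Tuple[int, int]:
    """Walk a toy LRU list and isolate up to `nr_to_scan` entries.

    Each entry is True for a folio in a CMA region and False otherwise.
    Returns (isolated, visited); `visited` stands in for the amount of
    work done while the LRU lock would be held.
    """
    scan = isolated = visited = 0
    index = 0
    while scan < nr_to_scan and index < len(lru):
        is_cma = lru[index]
        index += 1
        visited += 1
        if skip_cma and is_cma:
            # Assumption of this model: a skipped entry is not charged
            # against the scan budget, so the loop keeps going.
            continue
        scan += 1
        isolated += 1
    return isolated, visited


if __name__ == "__main__":
    # Hypothetical LRU where three out of every four folios are in CMA.
    lru = [True, True, True, False] * 100
    for skip in (False, True):
        isolated, visited = isolate(lru, nr_to_scan=32, skip_cma=skip)
        print(f"skip_cma={skip}: isolated={isolated} visited={visited}")
```

In this model the number of entries visited per call grows with the share of skipped folios, which is the mechanism the report speculates about; it says nothing about how the linked patch addresses it.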