linux-mm.kvack.org archive mirror
* [Question] About the PCP free_high heuristic
@ 2025-07-29  8:08 史嘉成
  2025-07-29  9:59 ` Huang, Ying
From: 史嘉成 @ 2025-07-29  8:08 UTC
  To: ying.huang; +Cc: linux-mm

Hi,

I ran the bw_unix benchmark in lmbench on my test machine (EPYC-7T83, 32 vCPUs,
64 GB of memory):
    bin/x86_64-linux-gnu/bw_unix -P 16
The measured bandwidth was 30511.63 MB/s when percpu_pagelist_high_fraction
was set to 8; however, it dropped to 21595.98 MB/s when
percpu_pagelist_high_fraction was set to 0 (which enables PCP high auto-tuning).

I first inspected the auto-tuning code, but the root cause of the performance
degradation turned out to be the trigger threshold of the free_high heuristic:
    pcp->free_count >= (batch + pcp->high_min / 2)
I noticed that commit c544a95 increased this threshold, but pcp->high_min is
relatively small when auto-tuning is enabled, so the threshold is still low,
and the resulting PCP draining causes the performance degradation.

The problem went away when I raised the threshold to (batch + pcp->high / 2).
Is it intended to use high_min instead of high in the threshold? Would it be
more adaptive to introduce some new tunables for the free_high threshold?

Best,
Shi, Jiacheng



Thread overview: 5+ messages
2025-07-29  8:08 [Question] About the PCP free_high heuristic 史嘉成
2025-07-29  9:59 ` Huang, Ying
2025-07-29 11:29   ` Shi, Jiacheng
2025-07-30  1:26     ` Huang, Ying
2025-07-30  1:33       ` 史嘉成
