From: "Huang, Ying" <ying.huang@intel.com>
To: Yafang Shao <laoar.shao@gmail.com>
Cc: akpm@linux-foundation.org, mgorman@techsingularity.net,
linux-mm@kvack.org, Matthew Wilcox <willy@infradead.org>,
David Rientjes <rientjes@google.com>
Subject: Re: [PATCH 3/3] mm/page_alloc: Introduce a new sysctl knob vm.pcp_batch_scale_max
Date: Thu, 11 Jul 2024 18:49:41 +0800
Message-ID: <877cds9pa2.fsf@yhuang6-desk2.ccr.corp.intel.com>
In-Reply-To: <CALOAHbBEsmF3_udHNzpOTrRWscweysBgrGweH1s0SSueMhYP7A@mail.gmail.com> (Yafang Shao's message of "Thu, 11 Jul 2024 17:51:38 +0800")
Yafang Shao <laoar.shao@gmail.com> writes:
> On Thu, Jul 11, 2024 at 4:20 PM Huang, Ying <ying.huang@intel.com> wrote:
>>
>> Yafang Shao <laoar.shao@gmail.com> writes:
>>
>> > On Thu, Jul 11, 2024 at 2:44 PM Huang, Ying <ying.huang@intel.com> wrote:
>> >>
>> >> Yafang Shao <laoar.shao@gmail.com> writes:
>> >>
>> >> > On Wed, Jul 10, 2024 at 10:51 AM Huang, Ying <ying.huang@intel.com> wrote:
>> >> >>
>> >> >> Yafang Shao <laoar.shao@gmail.com> writes:
>> >> >>
>> >> >> > The configuration parameter PCP_BATCH_SCALE_MAX poses challenges for
>> >> >> > quickly experimenting with specific workloads in a production environment,
>> >> >> > particularly when monitoring latency spikes caused by contention on the
>> >> >> > zone->lock. To address this, a new sysctl parameter vm.pcp_batch_scale_max
>> >> >> > is introduced as a more practical alternative.
>> >> >>
>> >> >> In general, I'm neutral to the change. I can understand that kernel
>> >> >> configuration isn't as flexible as sysctl knob. But, sysctl knob is ABI
>> >> >> too.
>> >> >>
>> >> >> > To ultimately mitigate the zone->lock contention issue, several suggestions
>> >> >> > have been proposed. One approach involves dividing large zones into multi
>> >> >> > smaller zones, as suggested by Matthew[0], while another entails splitting
>> >> >> > the zone->lock using a mechanism similar to memory arenas and shifting away
>> >> >> > from relying solely on zone_id to identify the range of free lists a
>> >> >> > particular page belongs to[1]. However, implementing these solutions is
>> >> >> > likely to necessitate a more extended development effort.
>> >> >>
>> >> >> Per my understanding, the change will hurt instead of improve zone->lock
>> >> >> contention. Instead, it will reduce page allocation/freeing latency.
>> >> >
>> >> > I'm quite perplexed by your recent comment. You introduced a
>> >> > configuration that has proven to be difficult to use, and you have
>> >> > been resistant to suggestions for modifying it to a more user-friendly
>> >> > and practical tuning approach. May I inquire about the rationale
>> >> > behind introducing this configuration in the beginning?
>> >>
>> >> Sorry, I don't understand your words. Do you need me to explain what is
>> >> "neutral"?
>> >
>> > No, thanks.
>> > After consulting with ChatGPT, I received a clear and comprehensive
>> > explanation of what "neutral" means, providing me with a better
>> > understanding of the concept.
>> >
>> > So, can you explain why you introduced it as a config in the beginning ?
>>
>> I think that I have explained it in the commit log of commit
>> 52166607ecc9 ("mm: restrict the pcp batch scale factor to avoid too long
>> latency"). Which introduces the config.
>
> What specifically are your expectations for how users should utilize
> this config in real production workload?
>
>>
>> Sysctl knob is ABI, which needs to be maintained forever. Can you
>> explain why you need it? Why cannot you use a fixed value after initial
>> experiments.
>
> Given the extensive scale of our production environment, with hundreds
> of thousands of servers, it begs the question: how do you propose we
> efficiently manage the various workloads that remain unaffected by the
> sysctl change implemented on just a few thousand servers? Is it
> feasible to expect us to recompile and release a new kernel for every
> instance where the default value falls short? Surely, there must be
> more practical and efficient approaches we can explore together to
> ensure optimal performance across all workloads.
>
> When making improvements or modifications, kindly ensure that they are
> not solely confined to a test or lab environment. It's vital to also
> consider the needs and requirements of our actual users, along with
> the diverse workloads they encounter in their daily operations.

Have you already found that your different systems require different
CONFIG_PCP_BATCH_SCALE_MAX values? If not, I think that it's
better for you to keep this patch in your downstream kernel for now.
When you find that it is a common requirement, we can evaluate whether
to make it a sysctl knob.
--
Best Regards,
Huang, Ying