From: Stefan Roesch <shr@devkernel.io>
To: David Hildenbrand <david@redhat.com>
Cc: kernel-team@fb.com, akpm@linux-foundation.org,
hannes@cmpxchg.org, riel@surriel.com,
linux-kernel@vger.kernel.org, linux-mm@kvack.org
Subject: Re: [PATCH v1 0/4] mm/ksm: Add ksm advisor
Date: Tue, 10 Oct 2023 09:02:38 -0700
Message-ID: <87bkd61n12.fsf@devkernel.io>
In-Reply-To: <d9e28b8a-dc03-42cf-a6f8-69b2d993cc8d@redhat.com>
David Hildenbrand <david@redhat.com> writes:
> On 06.10.23 18:17, Stefan Roesch wrote:
>> David Hildenbrand <david@redhat.com> writes:
>>
>>> On 04.10.23 21:02, Stefan Roesch wrote:
>>>> What is the KSM advisor?
>>>> =========================
>>>> The ksm advisor automatically manages the pages_to_scan setting to
>>>> achieve a target scan time. The target scan time defines how many seconds
>>>> it should take to scan all the candidate KSM pages. In other words the
>>>> pages_to_scan rate is changed by the advisor to achieve the target scan
>>>> time.
>>>> Why do we need a KSM advisor?
>>>> ==============================
>>>> The number of candidate pages for KSM is dynamic. It can often be observed
>>>> that during the startup of an application more candidate pages need to be
>>>> processed. Without an advisor the pages_to_scan parameter needs to be
>>>> sized for the maximum number of candidate pages. With the scan time
>>>> advisor the pages_to_scan parameter can be changed based on demand.
>>>> Algorithm
>>>> ==========
>>>> The algorithm calculates the change value based on the target scan time
>>>> and the previous scan time. To avoid perturbations an exponentially
>>>> weighted moving average is applied.
>>>> The algorithm has a max and min
>>>> value to:
>>>> - guarantee responsiveness to changes
>>>> - avoid spending too much CPU
>>>> Parameters to influence the KSM scan advisor
>>>> =============================================
>>>> The respective parameters are:
>>>> - ksm_advisor_mode
>>>> 0: None (default), 1: scan time advisor
>>>> - ksm_advisor_target_scan_time
>>>> how many seconds a scan of all candidate pages should take
>>>> - ksm_advisor_min_pages
>>>> minimum value for pages_to_scan per batch
>>>> - ksm_advisor_max_pages
>>>> maximum value for pages_to_scan per batch
>>>> The parameters are exposed as knobs in /sys/kernel/mm/ksm.
>>>> By default the scan time advisor is disabled.
>>>
>>> What would be the main reason to not have this enabled as default?
>>>
>> There might already be existing users which directly set pages_to_scan
>> and tuned the KSM settings accordingly, as the default setting of 100 for
>> pages_to_scan is too low for typical workloads.
>
> Good point.
>
>>
>>> IIUC, it is kind-of an auto-tuning of pages_to_scan. Would "auto-tuning"
>>> describe it better than "advisor" ?
>>>
>>> [...]
>>>
>> I'm fine with auto-tune. I was also thinking about that name, but I
>> chose advisor, it's a bit less strong and it needs input from the user.
>>
>
> I'm not a native speaker, but "adviser" to me implies that no action is taken,
> only advice is given :) But again, no native speaker.
>
>>>> How is defining a target scan time better?
>>>> ===========================================
>>>> For an administrator it is more logical to set a target scan time. The
>>>> administrator can determine how many pages are scanned on each scan.
>>>> Therefore setting a target scan time makes more sense.
>>>> In addition the administrator might have a good idea about the
>>>> memory sizing of their respective workloads.
>>>
>>> Is there any way you could imagine where we could have this just do something
>>> reasonable without any user input? IOW, true auto-tuning?
>>>
>> True auto-tuning might be difficult as users might want to be able to
>> choose how aggressive KSM is. Some might want it to be as aggressive as
>> possible to get the maximum de-duplication rate. Others might want a
>> more balanced approach that takes CPU-consumption into consideration.
>> I guess it depends if you are memory-bound, cpu-bound or both.
>
> Agreed, more below.
>
>>
>>> I read above:
>>>> - guarantee responsiveness to changes
>>>> - avoid spending too much CPU
>>>
>>> whereby both things are accountable/measurable to use that as the input for
>>> auto-tuning?
>>>
>> I'm not sure a true auto-tuning can be achieved. I think we need
>> some input from the user
>> - How many resources to consume
>> - How fast memory changes or how stable memory is
>> (this we might be able to detect)
>
> Setting the pages_to_scan is a bit mystical. Setting upper/lower pages_to_scan
> bounds is similarly mystical, and highly workload dependent.
>
> So I agree that a better abstraction to automatically tune the scanning is
> reasonable. I wonder if we can let the user give better inputs that are less
> workload dependent.
>
> For example, do we need min/max values for pages_to_scan, or can we replace it
> by something better to the auto-tuning algorithm?
>
> IMHO "target scan time" goes in the right direction, but it can still be
> fairly workload dependent. Maybe a "max CPU consumption" or sth. like that would
> similarly help to limit CPU waste, and it could be fairly workload dependent.
I can look into replacing min/max values for pages_to_scan with min/max
cpu utilization. This might be easier for users to decide on. However I
still think that we need a target value like scan time to optimize for.
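
To make the discussion concrete, here is a rough sketch of the scan-time
idea described in the cover letter: scale pages_to_scan by how far the
last scan time was from the target, smooth the result with an
exponentially weighted moving average, and clamp it to the min/max
bounds. This is only an illustration in Python, not the code from the
patch series. The function names, the proportional scaling rule and the
smoothing weight of 0.3 are all assumptions made for the example.

```python
def ewma(previous, sample, weight=0.3):
    """Exponentially weighted moving average.

    `weight` is a hypothetical smoothing factor chosen for this sketch.
    """
    return weight * sample + (1.0 - weight) * previous


def next_pages_to_scan(pages_to_scan, last_scan_time, target_scan_time,
                       min_pages, max_pages, weight=0.3):
    """Return a new pages_to_scan value for the next scan.

    Assumption: a scan that took twice the target time calls for roughly
    twice the scan rate, and one that took half the time for half the rate.
    """
    if target_scan_time <= 0:
        raise ValueError("target_scan_time must be positive")

    # Rate that would have hit the target for the last scan.
    wanted = pages_to_scan * (last_scan_time / target_scan_time)

    # Smooth the change so one unusual scan does not cause a large jump.
    smoothed = ewma(pages_to_scan, wanted, weight)

    # Keep the result inside the configured bounds.
    return round(min(max_pages, max(min_pages, smoothed)))
```

With a target of 200 seconds, a scan that took 400 seconds moves
pages_to_scan from 1000 towards 2000 in smoothed steps rather than in
one jump. Replacing the page bounds with CPU utilization bounds would
change only the clamping step in a sketch like this.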
Thread overview: 12+ messages
2023-10-04 19:02 Stefan Roesch
2023-10-04 19:02 ` [PATCH v1 1/4] mm/ksm: add " Stefan Roesch
2023-10-04 19:02 ` [PATCH v1 2/4] mm/ksm: add sysfs knobs for advisor Stefan Roesch
2023-10-05 17:57 ` kernel test robot
2023-10-05 21:36 ` kernel test robot
2023-10-04 19:02 ` [PATCH v1 3/4] mm/ksm: add tracepoint for ksm advisor Stefan Roesch
2023-10-04 19:02 ` [PATCH v1 4/4] mm/ksm: document ksm advisor and its sysfs knobs Stefan Roesch
2023-10-06 12:01 ` [PATCH v1 0/4] mm/ksm: Add ksm advisor David Hildenbrand
2023-10-06 16:17 ` Stefan Roesch
2023-10-09 9:48 ` David Hildenbrand
2023-10-10 16:02 ` Stefan Roesch [this message]
2023-10-17 15:28 ` David Hildenbrand