Re: [PATCH v4 1/4] mm/ksm: add ksm advisor

From: David Hildenbrand <david@redhat.com>
To: Stefan Roesch <shr@devkernel.io>, kernel-team@fb.com
Cc: akpm@linux-foundation.org, hannes@cmpxchg.org, riel@surriel.com,
	linux-kernel@vger.kernel.org, linux-mm@kvack.org
Subject: Re: [PATCH v4 1/4] mm/ksm: add ksm advisor
Date: Mon, 18 Dec 2023 12:29:40 +0100
Message-ID: <07c3d204-8285-46d2-b7fa-c63800bd7073@redhat.com>
In-Reply-To: <20231213182729.587081-2-shr@devkernel.io>

On 13.12.23 19:27, Stefan Roesch wrote:
> This adds the ksm advisor. The ksm advisor automatically manages the
> pages_to_scan setting to achieve a target scan time. The target scan
> time defines how many seconds it should take to scan all the candidate
> KSM pages. In other words the pages_to_scan rate is changed by the
> advisor to achieve the target scan time. The algorithm has a max and min
> value to:
> - guarantee responsiveness to changes
> - limit CPU resource consumption
> 
> The respective parameters are:
> - ksm_advisor_target_scan_time (how many seconds a scan should take)
> - ksm_advisor_max_cpu (maximum value for cpu percent usage)
> 
> - ksm_advisor_min_pages (minimum value for pages_to_scan per batch)
> - ksm_advisor_max_pages (maximum value for pages_to_scan per batch)
> 
> The algorithm calculates the change value based on the target scan time
> and the previous scan time. To avoid perturbations an exponentially
> weighted moving average is applied.
> 
> The advisor is managed by two main parameters: target scan time,
> cpu max time for the ksmd background thread. These parameters determine
> how aggressively ksmd scans.
> 
> In addition there are min and max values for the pages_to_scan parameter
> to make sure that its initial and max values are not set too low or too
> high. This ensures that it is able to react to changes quickly enough.
> 
> The default values are:
> - target scan time: 200 secs
> - max cpu: 70%
> - min pages: 500
> - max pages: 30000
> 
> By default the advisor is disabled. Currently there are two advisors:
> none and scan-time.
> 
> Tests with various workloads have shown considerable CPU savings. Most
> of the workloads I have investigated have more candidate pages during
> startup, once the workload is stable in terms of memory, the number of
> candidate pages is reduced. Without the advisor, the pages_to_scan needs
> to be sized for the maximum number of candidate pages. So having this
> advisor definitely helps in reducing CPU consumption.
> 
> For the instagram workload, the advisor achieves a 25% CPU reduction.
> Once the memory is stable, the pages_to_scan parameter gets reduced to
> about 40% of its max value.
> 
> Signed-off-by: Stefan Roesch <shr@devkernel.io>
> ---
>   mm/ksm.c | 161 ++++++++++++++++++++++++++++++++++++++++++++++++++++++-
>   1 file changed, 160 insertions(+), 1 deletion(-)
> 
> diff --git a/mm/ksm.c b/mm/ksm.c
> index 7efcc68ccc6ea..4f7b71a1f3112 100644
> --- a/mm/ksm.c
> +++ b/mm/ksm.c
> @@ -21,6 +21,7 @@
>   #include <linux/sched.h>
>   #include <linux/sched/mm.h>
>   #include <linux/sched/coredump.h>
> +#include <linux/sched/cputime.h>
>   #include <linux/rwsem.h>
>   #include <linux/pagemap.h>
>   #include <linux/rmap.h>
> @@ -248,6 +249,9 @@ static struct kmem_cache *rmap_item_cache;
>   static struct kmem_cache *stable_node_cache;
>   static struct kmem_cache *mm_slot_cache;
>   
> +/* Default number of pages to scan per batch */
> +#define DEFAULT_PAGES_TO_SCAN 100
> +
>   /* The number of pages scanned */
>   static unsigned long ksm_pages_scanned;
>   
> @@ -276,7 +280,7 @@ static unsigned int ksm_stable_node_chains_prune_millisecs = 2000;
>   static int ksm_max_page_sharing = 256;
>   
>   /* Number of pages ksmd should scan in one batch */
> -static unsigned int ksm_thread_pages_to_scan = 100;
> +static unsigned int ksm_thread_pages_to_scan = DEFAULT_PAGES_TO_SCAN;
>   
>   /* Milliseconds ksmd should sleep between batches */
>   static unsigned int ksm_thread_sleep_millisecs = 20;
> @@ -297,6 +301,155 @@ unsigned long ksm_zero_pages;
>   /* The number of pages that have been skipped due to "smart scanning" */
>   static unsigned long ksm_pages_skipped;
>   
> +/* Don't scan more than max pages per batch. */
> +static unsigned long ksm_advisor_max_pages = 30000;
> +
> +/* At least scan this many pages per batch. */
> +static unsigned long ksm_advisor_min_pages = 500;
> +
> +/* Min CPU for scanning pages per scan */
> +static unsigned int ksm_advisor_min_cpu =  10;

That will never be modified, right? Either mark it const or just turn it 
into a define.
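
For illustration, the two options could look roughly like the sketch below. This is a standalone userspace example, not code from the patch: the macro name KSM_ADVISOR_MIN_CPU and the helper apply_min_cpu() are hypothetical and exist only to show a consumer of the constant.

```c
/* Option A: read-only variable; any later assignment fails to compile. */
static const unsigned int ksm_advisor_min_cpu = 10;

/* Option B: preprocessor constant (hypothetical name, not from the patch). */
#define KSM_ADVISOR_MIN_CPU 10

/*
 * Hypothetical consumer: raise a CPU percentage to the minimum if it is
 * lower, leave it unchanged otherwise.
 */
static unsigned int apply_min_cpu(unsigned int cpu_percent)
{
	return cpu_percent < ksm_advisor_min_cpu ? ksm_advisor_min_cpu
						 : cpu_percent;
}
```

Either form documents that the value is fixed; the const variant keeps a typed symbol, the define avoids a storage location altogether.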

[...]

> +/*
> + * The scan time advisor is based on the current scan rate and the target
> + * scan rate.
> + *
> + *      new_pages_to_scan = pages_to_scan * (scan_time / target_scan_time)
> + *
> + * To avoid perturbations it calculates a change factor of previous changes.
> + * A new change factor is calculated for each iteration and it uses an
> + * exponentially weighted moving average. The new pages_to_scan value is
> + * multiplied with that change factor:
> + *
> + *      new_pages_to_scan *= change factor
> + *
> + * The new_pages_to_scan value is limited by the cpu min and max values. It
> + * calculates the cpu percent for the last scan and calculates the new
> + * estimated cpu percent cost for the next scan. That value is capped by the
> + * cpu min and max setting.
> + *
> + * In addition the new pages_to_scan value is capped by the max and min
> + * limits.
> + */
> +static void scan_time_advisor(void)
> +{
> +	unsigned int cpu_percent;
> +	unsigned long cpu_time;
> +	unsigned long cpu_time_diff;
> +	unsigned long cpu_time_diff_ms;
> +	unsigned long pages;
> +	unsigned long per_page_cost;
> +	unsigned long factor;
> +	unsigned long change;
> +	unsigned long last_scan_time;
> +	unsigned long scan_time;
> +
> +	/* Convert scan time to seconds */
> +	scan_time = div_s64(ktime_ms_delta(ktime_get(), advisor_ctx.start_scan),
> +			    MSEC_PER_SEC);
> +	scan_time = scan_time ? scan_time : 1;
> +
> +	/* Calculate CPU consumption of ksmd background thread */
> +	cpu_time = task_sched_runtime(current);
> +	cpu_time_diff = cpu_time - advisor_ctx.cpu_time;
> +	cpu_time_diff_ms = cpu_time_diff / 1000 / 1000;
> +
> +	cpu_percent = (cpu_time_diff_ms * 100) / (scan_time * 1000);
> +	cpu_percent = cpu_percent ? cpu_percent : 1;
> +	last_scan_time = prev_scan_time(&advisor_ctx, scan_time);

I'd simply inline prev_scan_time() here and get rid of it. Whatever you 
think is best.
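
As a reading aid, the arithmetic described in the quoted comment and code can be sketched in plain userspace C as below. This is only a sketch under stated assumptions, not the patch's implementation: the elided parts of the function are not reproduced, the smoothing weight EWMA_WEIGHT_PCT and all helper names are hypothetical, and the CPU min/max capping step is left out. The target scan time and the min/max pages are the defaults quoted in the commit message.

```c
/* Defaults quoted in the commit message. */
#define TARGET_SCAN_TIME_SECS 200UL
#define MIN_PAGES             500UL
#define MAX_PAGES             30000UL

/* Hypothetical smoothing weight in percent; NOT taken from the patch. */
#define EWMA_WEIGHT_PCT       30UL

/* Generic exponentially weighted moving average in integer math. */
static unsigned long ewma(unsigned long prev, unsigned long curr)
{
	return (EWMA_WEIGHT_PCT * curr +
		(100UL - EWMA_WEIGHT_PCT) * prev) / 100UL;
}

static unsigned long clamp_ul(unsigned long v, unsigned long lo,
			      unsigned long hi)
{
	return v < lo ? lo : (v > hi ? hi : v);
}

/*
 * new_pages_to_scan = pages_to_scan * (scan_time / target_scan_time),
 * multiplying before dividing to keep integer precision, then capped by
 * the min and max page limits.
 */
static unsigned long next_pages_to_scan(unsigned long pages_to_scan,
					unsigned long scan_time_secs)
{
	unsigned long next = pages_to_scan * scan_time_secs /
			     TARGET_SCAN_TIME_SECS;

	return clamp_ul(next, MIN_PAGES, MAX_PAGES);
}

/*
 * Mirrors the quoted CPU calculation: runtime delta in nanoseconds is
 * converted to milliseconds and expressed as a percentage of the scan
 * time, with a floor of 1.
 */
static unsigned int cpu_percent(unsigned long long cpu_time_diff_ns,
				unsigned long scan_time_secs)
{
	unsigned long long ms = cpu_time_diff_ns / 1000 / 1000;
	unsigned long long pct = (ms * 100) / (scan_time_secs * 1000ULL);

	return pct ? (unsigned int)pct : 1;
}
```

With these defaults, a scan that took 400 seconds at 1000 pages per batch would double the batch size to 2000, while a scan that took only 50 seconds would be held at the 500-page minimum.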


Acked-by: David Hildenbrand <david@redhat.com>

-- 
Cheers,

David / dhildenb

Thread overview: 9+ messages
2023-12-13 18:27 [PATCH v4 0/4] mm/ksm: Add " Stefan Roesch
2023-12-13 18:27 ` [PATCH v4 1/4] mm/ksm: add " Stefan Roesch
2023-12-18 11:29   ` David Hildenbrand [this message]
2023-12-18 17:27     ` Stefan Roesch
2023-12-13 18:27 ` [PATCH v4 2/4] mm/ksm: add sysfs knobs for advisor Stefan Roesch
2023-12-18 11:25   ` David Hildenbrand
2023-12-18 17:44     ` Stefan Roesch
2023-12-13 18:27 ` [PATCH v4 3/4] mm/ksm: add tracepoint for ksm advisor Stefan Roesch
2023-12-13 18:27 ` [PATCH v4 4/4] mm/ksm: document ksm advisor and its sysfs knobs Stefan Roesch
