From: Stefan Roesch
To: kernel-team@fb.com
Cc: shr@devkernel.io, akpm@linux-foundation.org, david@redhat.com,
    hannes@cmpxchg.org, riel@surriel.com, linux-kernel@vger.kernel.org,
    linux-mm@kvack.org
Subject: [PATCH v2 0/4] mm/ksm: Add ksm advisor
Date: Fri, 27 Oct 2023 17:09:41 -0700
Message-Id: <20231028000945.2428830-1-shr@devkernel.io>

What is the KSM advisor?
=========================
The ksm advisor automatically manages the pages_to_scan setting to
achieve a target scan time. The target scan time defines how many seconds
it should take to scan all the candidate KSM pages. In other words the
pages_to_scan rate is changed by the advisor to achieve the target scan
time.

Why do we need a KSM advisor?
==============================
The number of candidate pages for KSM is dynamic. It can often be observed
that during the startup of an application more candidate pages need to be
processed.
Without an advisor the pages_to_scan parameter needs to be
sized for the maximum number of candidate pages. With the scan time
advisor the pages_to_scan parameter can be changed based on demand.

Algorithm
==========
The algorithm calculates the change value based on the target scan time
and the previous scan time. To avoid perturbations an exponentially
weighted moving average is applied.

The algorithm has a max and min value to:
- guarantee responsiveness to changes
- avoid spending too much CPU

Parameters to influence the KSM scan advisor
=============================================
The respective parameters are:
- ksm_advisor_mode
  0: None (default), 1: scan time advisor
- ksm_advisor_target_scan_time
  how many seconds a scan of all candidate pages should take
- ksm_advisor_min_cpu
  lower limit for the cpu usage in percent of the ksmd background thread
- ksm_advisor_max_cpu
  upper limit for the cpu usage in percent of the ksmd background thread

The initial value and the max value for the pages_to_scan parameter can
be limited with:
- ksm_advisor_min_pages
  minimum value for pages_to_scan per batch
- ksm_advisor_max_pages
  maximum value for pages_to_scan per batch

The default settings for the above two parameters should be suitable for
most workloads.

The parameters are exposed as knobs in /sys/kernel/mm/ksm. By default the
scan time advisor is disabled.

Currently there are two advisors:
- none and
- scan time.

Resource savings
=================
Tests with various workloads have shown considerable CPU savings. Most of
the workloads I have investigated have more candidate pages during
startup. Once the workload is stable in terms of memory, the number of
candidate pages is reduced. Without the advisor, the pages_to_scan needs
to be sized for the maximum number of candidate pages.
So having this advisor definitely helps in reducing CPU consumption.

For the Instagram workload, the advisor achieves a 25% CPU reduction.
Once the memory is stable, the pages_to_scan parameter gets reduced to
about 40% of its max value.

The new advisor works especially well if the smart scan feature is also
enabled.

How is defining a target scan time better?
===========================================
For an administrator it is more logical to set a target scan time. The
administrator can determine how many pages are scanned on each scan.
Therefore setting a target scan time makes more sense.

In addition the administrator might have a good idea about the memory
sizing of the respective workloads.

Setting cpu limits is easier than setting the pages_to_scan parameter. The
pages_to_scan parameter is per batch. For the administrator it is
difficult to set the pages_to_scan parameter.

Tracing
=======
A new tracing event has been added for the scan time advisor. The new
trace event is called ksm_advisor. It reports the scan time, the new
pages_to_scan setting and the cpu usage of the ksmd background thread.

Other approaches
=================

Approach 1: Adapt pages_to_scan after processing each batch. If KSM
  merges pages, increase the scan rate; if it merges fewer pages, reduce
  the pages_to_scan rate. This doesn't work too well. While it increases
  the pages_to_scan for a short period, it generally ends up with a
  pages_to_scan rate that is too low.

Approach 2: Adapt pages_to_scan after each scan. The problem with that
  approach is that the calculated scan rate tends to be high. The more
  aggressively KSM scans, the more pages it can de-duplicate.
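
For contrast with the two approaches above, here is a rough sketch of the
scan time advisor idea from the Algorithm section. It is an illustration
only, not the code from this series: it scales pages_to_scan by the ratio
of the last scan time to the target scan time, smooths the result with an
exponentially weighted moving average and clamps it. The function names,
the 30% weight and the page-based bounds are assumptions made for the
example; the series itself bounds the advisor through the cpu min and max
settings.

```c
#include <stdint.h>

/* Assumed weight (in percent) of the newest sample; hypothetical value. */
#define EWMA_WEIGHT_PERCENT 30

/* Exponentially weighted moving average of the previous and new value. */
uint64_t ewma(uint64_t prev, uint64_t curr)
{
	return ((100 - EWMA_WEIGHT_PERCENT) * prev +
		EWMA_WEIGHT_PERCENT * curr) / 100;
}

/*
 * Hypothetical helper: propose the next pages_to_scan value.
 *
 * cur_pages:        current pages_to_scan setting
 * scan_time:        duration of the previous full scan (seconds)
 * target_scan_time: desired duration of a full scan (seconds)
 * min_pages/max_pages: bounds on the result (page counts, for simplicity)
 */
uint64_t advise_pages_to_scan(uint64_t cur_pages, uint64_t scan_time,
			      uint64_t target_scan_time,
			      uint64_t min_pages, uint64_t max_pages)
{
	uint64_t wanted, next;

	/* Without a target there is nothing to steer towards. */
	if (target_scan_time == 0)
		return cur_pages;

	/* A scan that took twice the target needs twice the scan rate. */
	wanted = cur_pages * scan_time / target_scan_time;

	/* Smooth the change so a single outlier scan does not dominate. */
	next = ewma(cur_pages, wanted);

	/* Keep the result inside the configured bounds. */
	if (next < min_pages)
		next = min_pages;
	if (next > max_pages)
		next = max_pages;
	return next;
}
```

With these assumed values, a scan that took 400 seconds against a target
of 200 seconds moves pages_to_scan from 1000 to 1300 rather than jumping
straight to 2000, which is the damping effect the moving average is meant
to provide.
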

There have been earlier attempts at an advisor:
propose auto-run mode of ksm and its tests
(https://marc.info/?l=linux-mm&m=166029880214485&w=2)

Changes:
========
V2:
  - Use functions for long long calculations to support 32 bit platforms
  - Use cpu min and cpu max settings for the advisor instead of the pages
    min and max parameters.
  - pages min and max values are now used for the initial and max values.
    Generally they are not required to be changed.
  - Add cpu percent usage value to tracepoint definition
  - Update documentation for cpu min and cpu max values
  - Update commit messages for the above changes

Stefan Roesch (4):
  mm/ksm: add ksm advisor
  mm/ksm: add sysfs knobs for advisor
  mm/ksm: add tracepoint for ksm advisor
  mm/ksm: document ksm advisor and its sysfs knobs

 Documentation/admin-guide/mm/ksm.rst |  66 ++++++
 include/trace/events/ksm.h           |  33 +++
 mm/ksm.c                             | 314 ++++++++++++++++++++++++++-
 3 files changed, 412 insertions(+), 1 deletion(-)

base-commit: 12d04a7bf0da67321229d2bc8b1a7074d65415a9
-- 
2.39.3