From: Stefan Roesch
To: kernel-team@fb.com
Cc: shr@devkernel.io, akpm@linux-foundation.org, david@redhat.com,
    hannes@cmpxchg.org, riel@surriel.com, linux-kernel@vger.kernel.org,
    linux-mm@kvack.org
Subject: [PATCH v1 0/4] Smart scanning mode for KSM
Date: Tue, 12 Sep 2023 10:52:24 -0700
Message-Id: <20230912175228.952039-1-shr@devkernel.io>

This patch series adds "smart scanning" for KSM.

What is smart scanning?
=======================
KSM evaluates all the candidate pages for each scan. It does not use historic
information from previous scans. As a result, candidate pages that could not
be used for KSM de-duplication continue to be evaluated for each scan.

The idea of "smart scanning" is to keep historic information. With the
historic information we can temporarily skip a candidate page for one or
several scans.

Details:
========
"Smart scanning" keeps two small counters to track whether the page has been
used for KSM.
One counter stores how often we have already tried to use the page for
KSM and the other counter stores when the page will be used as a candidate
page again.

How often we skip the candidate page depends on how often the page failed KSM
de-duplication. The code skips a maximum of 8 times. During testing this has
shown to be a good compromise for different workloads.

New sysfs knob:
===============
Smart scanning is not enabled by default. It can be enabled with
/sys/kernel/mm/ksm/smart_scan.

Monitoring:
===========
To monitor how effective smart scanning is, a new sysfs knob has been
introduced. /sys/kernel/mm/ksm/pages_skipped reports how many pages have been
skipped by smart scanning.

Results:
========
- Various workloads have shown a 20% - 25% reduction in page scans.
  For the Instagram workload, for instance, the number of pages scanned has
  been reduced from over 20M pages per scan to less than 15M pages.
- Fewer page scans also resulted in an overall higher de-duplication rate, as
  some shorter lived pages could be de-duplicated additionally.
- Scanning fewer pages makes it possible to reduce the pages_to_scan
  parameter, which resulted in a 25% reduction in CPU usage.
- The improvements have been observed for workloads that enable KSM with
  madvise as well as prctl.


Stefan Roesch (4):
  mm/ksm: add "smart" page scanning mode
  mm/ksm: add pages_skipped metric
  mm/ksm: document smart scan mode
  mm/ksm: document pages_skipped sysfs knob

 Documentation/admin-guide/mm/ksm.rst | 11 ++++
 mm/ksm.c                             | 87 ++++++++++++++++++++++++++++
 2 files changed, 98 insertions(+)


base-commit: 15bcc9730fcd7526a3b92eff105d6701767a53bb
-- 
2.39.3
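
As a rough illustration of the two-counter idea described under "Details",
here is a small userspace sketch in C. It is not the code from mm/ksm.c: the
struct, the function names and the back-off schedule (skip as many scans as
there were failed attempts, capped at 8) are assumptions made up for this
example. Only the cap of 8 skips comes from the text above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Cap taken from the cover letter: a page is skipped at most 8 times. */
#define MAX_SKIPS 8

/* Hypothetical per-candidate state; not the kernel's data structure. */
struct candidate {
	uint8_t attempts;        /* how often the page was already tried */
	uint8_t remaining_skips; /* scans left until it is a candidate again */
};

struct scan_stats {
	unsigned int evaluated;
	unsigned int skipped;
};

/* Assumed schedule: one skip per failed attempt, capped at MAX_SKIPS. */
static uint8_t skips_for(uint8_t attempts)
{
	return attempts < MAX_SKIPS ? attempts : MAX_SKIPS;
}

/*
 * Returns true if the page should be skipped in this scan. Otherwise the
 * page is evaluated; this sketch assumes the evaluation fails, so the
 * attempt is counted and the next skip window is armed.
 */
static bool should_skip(struct candidate *c)
{
	assert(c->remaining_skips <= MAX_SKIPS);

	if (c->remaining_skips > 0) {
		c->remaining_skips--;
		return true;
	}
	if (c->attempts < UINT8_MAX)
		c->attempts++; /* saturate instead of wrapping */
	c->remaining_skips = skips_for(c->attempts);
	return false;
}

/* Runs `scans` scans over a single page that never merges. */
static struct scan_stats simulate_unmergeable_page(unsigned int scans)
{
	struct candidate c = { 0, 0 };
	struct scan_stats s = { 0, 0 };

	for (unsigned int i = 0; i < scans; i++) {
		if (should_skip(&c))
			s.skipped++;
		else
			s.evaluated++;
	}
	return s;
}
```

Calling simulate_unmergeable_page() with a number of scans returns how many of
those scans evaluated the page and how many skipped it. The sketch leaves out
what happens when a page does merge; a real implementation would presumably
stop skipping such a page.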