From: Stefan Roesch
To: David Hildenbrand
Cc: kernel-team@fb.com, akpm@linux-foundation.org, hannes@cmpxchg.org, riel@surriel.com, linux-kernel@vger.kernel.org, linux-mm@kvack.org
Subject: Re: [PATCH v1 0/4] mm/ksm: Add ksm advisor
Date: Tue, 10 Oct 2023 09:02:38 -0700
Message-ID: <87bkd61n12.fsf@devkernel.io>
References: <20231004190249.829015-1-shr@devkernel.io> <4509a3b4-16a6-f63e-1dd5-e20c7eadf87d@redhat.com> <87fs2nhg14.fsf@devkernel.io>

David Hildenbrand writes:

> On 06.10.23 18:17, Stefan Roesch wrote:
>> David Hildenbrand writes:
>>
>>> On 04.10.23 21:02, Stefan Roesch wrote:
>>>> What is the KSM advisor?
>>>> =========================
>>>> The ksm advisor automatically manages the pages_to_scan setting to
>>>> achieve a target scan time. The target scan time defines how many seconds
>>>> it should take to scan all the candidate KSM pages. In other words the
>>>> pages_to_scan rate is changed by the advisor to achieve the target scan
>>>> time.
>>>> Why do we need a KSM advisor?
>>>> ==============================
>>>> The number of candidate pages for KSM is dynamic. It can often be observed
>>>> that during the startup of an application more candidate pages need to be
>>>> processed. Without an advisor the pages_to_scan parameter needs to be
>>>> sized for the maximum number of candidate pages. With the scan time
>>>> advisor the pages_to_scan parameter can be changed based on demand.
>>>> Algorithm
>>>> ==========
>>>> The algorithm calculates the change value based on the target scan time
>>>> and the previous scan time. To avoid perturbations an exponentially
>>>> weighted moving average is applied.
>>>> The algorithm has a max and min
>>>> value to:
>>>> - guarantee responsiveness to changes
>>>> - avoid spending too much CPU
>>>> Parameters to influence the KSM scan advisor
>>>> =============================================
>>>> The respective parameters are:
>>>> - ksm_advisor_mode
>>>>   0: None (default), 1: scan time advisor
>>>> - ksm_advisor_target_scan_time
>>>>   how many seconds a scan of all candidate pages should take
>>>> - ksm_advisor_min_pages
>>>>   minimum value for pages_to_scan per batch
>>>> - ksm_advisor_max_pages
>>>>   maximum value for pages_to_scan per batch
>>>> The parameters are exposed as knobs in /sys/kernel/mm/ksm.
>>>> By default the scan time advisor is disabled.
>>>
>>> What would be the main reason to not have this enabled as default?
>>>
>> There might be already existing users who directly set pages_to_scan
>> and tuned the KSM settings accordingly, as the default setting of 100 for
>> pages_to_scan is too low for typical workloads.
>
> Good point.
>
>>
>>> IIUC, it is kind-of an auto-tuning of pages_to_scan. Would "auto-tuning"
>>> describe it better than "advisor"?
>>>
>>> [...]
>>>
>> I'm fine with auto-tune. I was also thinking about that name, but I
>> chose advisor, it's a bit less strong and it needs input from the user.
>>
>
> I'm not a native speaker, but "adviser" to me implies that no action is taken,
> only advice is given :) But again, no native speaker.
>
>>>> How is defining a target scan time better?
>>>> ===========================================
>>>> For an administrator it is more logical to set a target scan time. The
>>>> administrator can determine how many pages are scanned on each scan.
>>>> Therefore setting a target scan time makes more sense.
>>>> In addition the administrator might have a good idea about the
>>>> memory sizing of their respective workloads.
>>>
>>> Is there any way you could imagine where we could have this just do something
>>> reasonable without any user input? IOW, true auto-tuning?
>>>
>> True auto-tuning might be difficult as users might want to be able to
>> choose how aggressive KSM is. Some might want it to be as aggressive as
>> possible to get the maximum de-duplication rate. Others might want a
>> more balanced approach that takes CPU consumption into consideration.
>> I guess it depends on whether you are memory-bound, CPU-bound or both.
>
> Agreed, more below.
>
>>
>>> I read above:
>>>> - guarantee responsiveness to changes
>>>> - avoid spending too much CPU
>>>
>>> whereby both things are accountable/measurable to use that as the input for
>>> auto-tuning?
>>>
>> I'm not sure a true auto-tuning can be achieved. I think we need
>> some input from the user
>> - How many resources to consume
>> - How fast memory changes or how stable memory is
>>   (this we might be able to detect)
>
> Setting the pages_to_scan is a bit mystical. Setting upper/lower pages_to_scan
> bounds is similarly mystical, and highly workload dependent.
>
> So I agree that a better abstraction to automatically tune the scanning is
> reasonable. I wonder if we can let the user give better inputs that are less
> workload dependent.
>
> For example, do we need min/max values for pages_to_scan, or can we replace them
> with something better for the auto-tuning algorithm?
>
> IMHO "target scan time" goes in the right direction, but it can still be
> fairly workload dependent. Maybe a "max CPU consumption" or sth. like that would
> similarly help to limit CPU waste, and it could be fairly workload dependent.

I can look into replacing min/max values for pages_to_scan with min/max CPU
utilization. This might be easier for users to decide on. However, I still think
that we need a target value like scan time to optimize for.
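
To make the "target value" idea concrete, here is a rough Python sketch of how
a scan time advisor of this kind could pick the next pages_to_scan value. It is
only an illustration under stated assumptions, not the code from the patch
series: the function name advise_pages_to_scan, its arguments, the proportional
scaling step and the 30% smoothing weight are all hypothetical. The sketch
scales the current value by the ratio of the previous scan time to the target
scan time, blends that proposal into the current value with an exponentially
weighted moving average, and clamps the result to the min/max bounds.

```python
def advise_pages_to_scan(current_pages, last_scan_ms, target_scan_ms,
                         min_pages, max_pages, new_weight_pct=30):
    """Return the next pages_to_scan value (hypothetical helper).

    All names and the 30% default weight are assumptions made for this
    sketch; they are not taken from the patch series.
    """
    if target_scan_ms <= 0:
        raise ValueError("target_scan_ms must be positive")
    if last_scan_ms < 0:
        raise ValueError("last_scan_ms must not be negative")
    if min_pages > max_pages:
        raise ValueError("min_pages must not exceed max_pages")

    # Proportional proposal: a scan that took twice the target time
    # suggests scanning twice as many pages per batch.
    proposed = current_pages * last_scan_ms // target_scan_ms

    # Exponentially weighted moving average, in integer arithmetic:
    # new_weight_pct percent of the proposal, the rest from the current value.
    smoothed = (new_weight_pct * proposed
                + (100 - new_weight_pct) * current_pages) // 100

    # Clamp to the configured bounds to limit CPU use and keep the
    # scanner responsive.
    return max(min_pages, min(max_pages, smoothed))


# Target of 200 s, previous full scan took 400 s, currently 100 pages per batch.
next_pages = advise_pages_to_scan(100, 400_000, 200_000,
                                  min_pages=50, max_pages=5000)
print(next_pages)  # 130 with the default 30% weight
```

With these inputs the value moves from 100 to 130 instead of jumping straight
to 200, which is the damping effect the moving average is meant to provide. If
the min/max page bounds were replaced by CPU utilization limits, only the final
clamping step of this sketch would need to change.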