Date: Fri, 14 Jul 2023 10:59:19 +0200
From: Michal Hocko
To: "Huang, Ying"
Cc: linux-mm@kvack.org, linux-kernel@vger.kernel.org, Arjan Van De Ven,
    Andrew Morton, Mel Gorman, Vlastimil Babka, David Hildenbrand,
    Johannes Weiner, Dave Hansen, Pavel Tatashin, Matthew Wilcox
Subject: Re: [RFC 1/2] mm: add framework for PCP high auto-tuning
References: <20230710065325.290366-1-ying.huang@intel.com>
    <20230710065325.290366-2-ying.huang@intel.com>
    <87edldefnt.fsf@yhuang6-desk2.ccr.corp.intel.com>
In-Reply-To: <87edldefnt.fsf@yhuang6-desk2.ccr.corp.intel.com>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline

On Wed 12-07-23 15:45:58, Huang, Ying wrote:
> Michal Hocko writes:
>
> > On Mon 10-07-23 14:53:24, Huang Ying wrote:
> >> The page allocation performance requirements of different workloads
> >> are usually different. So, we often need to tune PCP (per-CPU
> >> pageset) high to optimize the workload page allocation performance.
> >> Now, we have a system wide sysctl knob (percpu_pagelist_high_fraction)
> >> to tune PCP high by hand. But, it's hard to find out the best value
> >> by hand. And one global configuration may not work best for the
> >> different workloads that run on the same system. One solution to
> >> these issues is to tune PCP high of each CPU automatically.
> >>
> >> This patch adds the framework for PCP high auto-tuning. With it,
> >> pcp->high will be changed automatically by tuning algorithm at
> >> runtime. Its default value (pcp->high_def) is the original PCP high
> >> value calculated based on low watermark pages or
> >> percpu_pagelist_high_fraction sysctl knob. To avoid putting too many
> >> pages in PCP, the original limit of percpu_pagelist_high_fraction
> >> sysctl knob, MIN_PERCPU_PAGELIST_HIGH_FRACTION, is used to calculate
> >> the max PCP high value (pcp->high_max).
> >
> > It would have been very helpful to describe the basic entry points to
> > the auto-tuning. AFAICS the central place of the tuning is tune_pcp_high
> > which is called from the freeing path. Why? Is this really a good place
> > considering this is a hot path? What about the allocation path? Isn't
> > that a good spot to watch for the allocation demand?
>
> Yes.
> The main entry point to the auto-tuning is tune_pcp_high(). Which
> is called from the freeing path because pcp->high is only used by page
> freeing. It's possible to call it in allocation path instead. The
> drawback is that the pcp->high may be updated a little later in some
> situations. For example, if there are many page freeing but no page
> allocation for quite long time. But I don't think this is a serious
> problem.

I consider it a serious flaw in the framework as it cannot cope with the
transition of the allocation pattern (e.g. increasing the allocation
pressure).

> > Also this framework seems to be enabled by default. Is this really
> > desirable? What about workloads tuning the pcp batch size manually?
> > Shouldn't they override any auto-tuning?
>
> In the current implementation, the pcp->high will be tuned between
> original pcp high (default or tuned manually) and the max pcp high (via
> MIN_PERCPU_PAGELIST_HIGH_FRACTION). So the high value tuned manually is
> respected at some degree.
>
> So you think that it's better to disable auto-tuning if PCP high is
> tuned manually?

Yes, I think this is a much safer option. For two reasons 1) it is less
surprising to setups which know what they are doing by configuring the
batching and 2) the auto-tuning needs a way to get disabled in case
there are pathological patterns in behavior.

-- 
Michal Hocko
SUSE Labs
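
For readers following the thread, the two behaviours under discussion can be
illustrated with a small userspace model. This is only a sketch and not code
from the patch: the field names high, high_def and high_max are taken from
the quoted description, while struct pcp_model, the manual flag,
clamp_int() and pcp_model_tune() are hypothetical names invented for the
illustration. It assumes a tuning step is expressed as a signed delta and
that a manually configured value switches auto-tuning off, as suggested
in the reply.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical userspace stand-in for the per-CPU pageset limits. */
struct pcp_model {
	int high;      /* limit currently applied on the freeing path */
	int high_def;  /* default: from low watermark or the sysctl knob */
	int high_max;  /* cap derived from MIN_PERCPU_PAGELIST_HIGH_FRACTION */
	bool manual;   /* assumption: true once the admin set the knob by hand */
};

static int clamp_int(int v, int lo, int hi)
{
	return v < lo ? lo : (v > hi ? hi : v);
}

/*
 * Apply one tuning step and return the resulting limit.
 * With manual set, the configured value wins and the step is ignored.
 * Otherwise the result stays inside [high_def, high_max].
 */
static int pcp_model_tune(struct pcp_model *pcp, int delta)
{
	assert(pcp->high_def <= pcp->high_max);

	if (pcp->manual) {
		pcp->high = pcp->high_def;
		return pcp->high;
	}

	pcp->high = clamp_int(pcp->high + delta, pcp->high_def, pcp->high_max);
	return pcp->high;
}
```

In this model a large positive step saturates at high_max, a large negative
step falls back to high_def, and setting manual pins the limit to the
configured value regardless of the step.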