Subject: Re: [Resend PATCH] psi : calc cfs task memstall time more precisely
To: Xuewen Yan
Cc: Zhaoyang Huang, Johannes Weiner, Andrew Morton, Michal Hocko,
 Vladimir Davydov, Zhaoyang Huang, "open list:MEMORY MANAGEMENT", LKML,
 Peter Zijlstra, Vincent Guittot, xuewen.yan@unisoc.com, Ke Wang
References: <1634278612-17055-1-git-send-email-huangzhaoyang@gmail.com>
 <78b3f72b-3fe7-f2e0-0e6b-32f28b8ce777@arm.com>
 <85c81ab7-49ed-aba5-6221-ea6a8f37f8ad@arm.com>
From: Dietmar Eggemann
Message-ID: <05a2e61e-9c55-8f8d-5e72-9854613e53c9@arm.com>
Date: Tue, 9 Nov 2021 10:43:37 +0100

On 08/11/2021 09:49, Xuewen Yan wrote:
> Hi Dietmar
>
> On Sat, Nov 6, 2021 at 1:20 AM Dietmar Eggemann wrote:
>>
>> On 05/11/2021 06:58, Zhaoyang Huang wrote:
>>>> I don't understand the EAS (probably asymmetric CPU capacity is meant
>>>> here) angle of the story. Pressure on CPU capacity which is usable for
>>>> CFS happens on SMP as well?
>>> Mentioning EAS here mainly about RT tasks preempting small CFS tasks
>>> (big CFS tasks could be scheduled to big core), which would introduce
>>> more proportion of preempted time within PSI_MEM_STALL than SMP does.
>>
>> What's your CPU layout?
>> Do you have the little before the big CPUs? Like
>> Hikey 960?

[...]

>> And I guess rt class prefers lower CPU numbers hence you see this?
>>
> our CPU layout is:
> xuewen.yan:/ # cat /sys/devices/system/cpu/cpu*/cpu_capacity
> 544
> 544
> 544
> 544
> 544
> 544
> 1024
> 1024
>
> And in our platform, we use the kernel in mobile phones with Android.
> And we prefer power, so we prefer the RT class to run on little cores.

Ah, OK, out-of-tree extensions.

[...]

>>>>>>>> +     if (current->in_memstall)
>>>>>>>> +             growth_fixed = div64_ul((1024 - rq->avg_rt.util_avg - rq->avg_dl.util_avg
>>>>>>>> +                                     - rq->avg_irq.util_avg + 1) * growth, 1024);
>>>>>>>> +
>>>>
>>>> We do this slightly different in scale_rt_capacity() [fair.c]:
>>>>
>>>> max = arch_scale_cpu_capacity(cpu_of(rq)) /* instead of 1024 to support
>>>> asymmetric CPU capacity */
>>> Is it possible that the SUM of rqs' util_avg is larger than
>>> arch_scale_cpu_capacity because of task migration things?
>>
>> I assume you meant if the rq (cpu_rq(CPUx)) util_avg sum (CFS, RT, DL,
>> IRQ and thermal part) can be larger than arch_scale_cpu_capacity(CPUx)?
>>
>> Yes it can.
>>
>> Have a look at
>>
>> effective_cpu_util(..., max, ...) {
>>
>>     if (foo >= max)
>>         return max;
>>
>> }
>>
>> Even the CFS part (cpu_rq(CPUx)->cfs.avg.util_avg) can be larger than
>> the original cpu capacity (rq->cpu_capacity_orig).
>>
>> Have a look at cpu_util(). capacity_orig_of(CPUx) and
>> arch_scale_cpu_capacity(CPUx) both return rq->cpu_capacity_orig.
>>
>
> Well, you mean we should not use the 1024 and should use the
> original cpu capacity?
> And maybe using "sched_cpu_util()" is a good choice, just like this:
>
> +       if (current->in_memstall)
> +               growth_fixed = div64_ul(cpu_util_cfs(rq) * growth,
> sched_cpu_util(rq->cpu, capacity_orig_of(rq->cpu)));

Not sure about this. In case util_cfs=0 you would get scale=0.

IMHO, you need

cap      = rq->cpu_capacity
cap_orig = rq->cpu_capacity_orig

scale = (cap * X) / cap_orig

or, if the update of these rq values happens too infrequently for you,
then you have to calculate the pressure every time. Something like:

pressure = cpu_util_rt(rq) + cpu_util_dl(rq)

irq = cpu_util_irq(rq)

if (irq >= cap_orig)
    pressure = cap_orig
else
    pressure = scale_irq_capacity(pressure, irq, cap_orig)
    pressure += irq

scale = ((cap_orig - pressure) * X) / cap_orig
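
To make the second variant concrete, here is a rough userspace sketch of the per-call pressure calculation, not kernel code. It assumes X is the value being scaled (for instance the patch's growth), passes the utilization values in as plain integers instead of reading them from the rq, and models scale_irq_capacity() as a simple proportional scaling. The name scale_by_cfs_capacity() is hypothetical, and the final clamp is an addition of mine to keep the unsigned subtraction from wrapping when RT + DL utilization exceeds cap_orig.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Userspace model of the kernel's scale_irq_capacity() helper:
 * scale util by the fraction of capacity not consumed by IRQ time.
 */
static uint64_t scale_irq_capacity(uint64_t util, uint64_t irq, uint64_t max)
{
	return util * (max - irq) / max;
}

/*
 * Hypothetical helper (not a kernel function): scale x by the share of
 * cap_orig that is left for CFS after RT, DL and IRQ pressure.
 * util_rt, util_dl and util_irq stand in for cpu_util_rt(rq),
 * cpu_util_dl(rq) and cpu_util_irq(rq).
 */
static uint64_t scale_by_cfs_capacity(uint64_t x, uint64_t cap_orig,
				      uint64_t util_rt, uint64_t util_dl,
				      uint64_t util_irq)
{
	uint64_t pressure = util_rt + util_dl;

	assert(cap_orig > 0);

	if (util_irq >= cap_orig) {
		/* IRQ time alone eats the whole CPU. */
		pressure = cap_orig;
	} else {
		pressure = scale_irq_capacity(pressure, util_irq, cap_orig);
		pressure += util_irq;
	}

	/* Assumption: clamp, since RT + DL utilization may exceed cap_orig. */
	if (pressure > cap_orig)
		pressure = cap_orig;

	return (cap_orig - pressure) * x / cap_orig;
}
```

With made-up inputs for one of the little CPUs above (cap_orig = 544, RT utilization 100, IRQ utilization 44, x = 1000), scale_by_cfs_capacity() returns 751. With no RT/DL/IRQ pressure at all it returns x unchanged, unlike the cpu_util_cfs()-based proposal, which would give 0 for an idle CFS rq.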