From: Xuewen Yan
Date: Wed, 10 Nov 2021 13:36:45 +0800
Subject: Re: [Resend PATCH] psi : calc cfs task memstall time more precisely
To: Dietmar Eggemann
Cc: Zhaoyang Huang, Johannes Weiner, Andrew Morton, Michal Hocko,
    Vladimir Davydov, Zhaoyang Huang, "open list:MEMORY MANAGEMENT", LKML,
    Peter Zijlstra, Vincent Guittot, xuewen.yan@unisoc.com, Ke Wang

Hi Dietmar

On Tue, Nov 9, 2021 at 5:43 PM Dietmar Eggemann wrote:
>
> On 08/11/2021 09:49, Xuewen Yan wrote:
> > Hi Dietmar
> >
> > On Sat, Nov 6, 2021 at 1:20 AM Dietmar Eggemann
> > wrote:
> >>
> >> On 05/11/2021 06:58, Zhaoyang Huang wrote:
> >>>> I don't understand the EAS (probably asymmetric CPU capacity is meant
> >>>> here) angle of the story. Pressure on CPU capacity which is usable for
> >>>> CFS happens on SMP as well?
> >>> Mentioning EAS here is mainly about RT tasks preempting small CFS tasks
> >>> (big CFS tasks could be scheduled to big core), which would introduce
> >>> a larger proportion of preempted time within PSI_MEM_STALL than SMP does.
> >>
> >> What's your CPU layout? Do you have the little before the big CPUs? Like
> >> Hikey 960?
>
> [...]
>
> >> And I guess rt class prefers lower CPU numbers hence you see this?
> >>
> > our CPU layout is:
> > xuewen.yan:/ # cat /sys/devices/system/cpu/cpu*/cpu_capacity
> > 544
> > 544
> > 544
> > 544
> > 544
> > 544
> > 1024
> > 1024
> >
> > And in our platform, we use the kernel in mobile phones with Android.
> > And we prefer power, so we prefer the RT class to run on little cores.
>
> Ah, OK, out-of-tree extensions.
>
> [...]
>
> >>>>>>>> +       if (current->in_memstall)
> >>>>>>>> +               growth_fixed = div64_ul((1024 - rq->avg_rt.util_avg - rq->avg_dl.util_avg
> >>>>>>>> +                                       - rq->avg_irq.util_avg + 1) * growth, 1024);
> >>>>>>>> +
> >>>>
> >>>> We do this slightly differently in scale_rt_capacity() [fair.c]:
> >>>>
> >>>> max = arch_scale_cpu_capacity(cpu_of(rq)) /* instead of 1024 to support
> >>>> asymmetric CPU capacity */
> >>> Is it possible that the SUM of rqs' util_avg is larger than
> >>> arch_scale_cpu_capacity because of task migration things?
> >>
> >> I assume you meant if the rq (cpu_rq(CPUx)) util_avg sum (CFS, RT, DL,
> >> IRQ and thermal part) can be larger than arch_scale_cpu_capacity(CPUx)?
> >>
> >> Yes it can.
> >>
> >> Have a look at
> >>
> >> effective_cpu_util(..., max, ...) {
> >>
> >>         if (foo >= max)
> >>                 return max;
> >>
> >> }
> >>
> >> Even the CFS part (cpu_rq(CPUx)->cfs.avg.util_avg) can be larger than
> >> the original cpu capacity (rq->cpu_capacity_orig).
> >>
> >> Have a look at cpu_util(). capacity_orig_of(CPUx) and
> >> arch_scale_cpu_capacity(CPUx) both return rq->cpu_capacity_orig.
> >>
> >
> > Well, do you mean we should not use the 1024 and should use the
> > original cpu capacity?
> > And maybe using "sched_cpu_util()" is a good choice, just like this:
> >
> > +       if (current->in_memstall)
> > +               growth_fixed = div64_ul(cpu_util_cfs(rq) * growth,
> > sched_cpu_util(rq->cpu, capacity_orig_of(rq->cpu)));
>
> Not sure about this. In case util_cfs=0 you would get scale=0.

Yes, we should consider that case. In addition, the case where
util_cfs > capacity_orig (because of UTIL_EST) should be considered too.

>
> IMHO, you need
>
> cap      = rq->cpu_capacity
> cap_orig = rq->cpu_capacity_orig
>
> scale = (cap * X) / cap_orig
>
> or if the update of these rq values happens too infrequently for you then
> you have to calc the pressure every time.
> Something like:
>
> pressure = cpu_util_rt(rq) + cpu_util_dl(rq)
>
> irq = cpu_util_irq(rq)
>
> if (irq >= cap_orig)
>         pressure = cap_orig
> else
>         pressure = scale_irq_capacity(pressure, irq, cap_orig)
>         pressure += irq
>
> scale = ((cap_orig - pressure) * X) / cap_orig

Why rescale the util there? sched_cpu_util() already invokes
effective_cpu_util(), so I don't think it's necessary to rescale it.

Thanks!
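
For reference, the pressure calculation quoted above can be modeled in
user space. This is only a sketch under stated assumptions, not the code
from the patch: memstall_scale() and its parameter names are hypothetical,
scale_irq_capacity() is re-implemented here after the kernel helper of the
same name, the utilization values are passed in as plain integers rather
than read from a runqueue, and the final clamp is an addition that the
quoted pseudocode does not contain.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Modeled on the kernel helper of the same name: shrink util by the
 * fraction of capacity that is left once IRQ time is taken out.
 * The caller must ensure irq <= max and max > 0.
 */
static uint64_t scale_irq_capacity(uint64_t util, uint64_t irq, uint64_t max)
{
	return util * (max - irq) / max;
}

/*
 * Hypothetical helper: scale x by the share of cap_orig that is not
 * consumed by RT, DL and IRQ utilization, following the quoted steps.
 */
static uint64_t memstall_scale(uint64_t util_rt, uint64_t util_dl,
			       uint64_t util_irq, uint64_t cap_orig,
			       uint64_t x)
{
	uint64_t pressure = util_rt + util_dl;

	assert(cap_orig > 0);	/* guards the divisions below */

	if (util_irq >= cap_orig) {
		/* IRQ alone saturates the CPU: nothing is left for CFS. */
		pressure = cap_orig;
	} else {
		pressure = scale_irq_capacity(pressure, util_irq, cap_orig);
		pressure += util_irq;
	}

	/*
	 * Assumption, not in the quoted pseudocode: clamp the pressure so
	 * that the unsigned subtraction below cannot wrap around.
	 */
	if (pressure > cap_orig)
		pressure = cap_orig;

	return (cap_orig - pressure) * x / cap_orig;
}
```

With the little-core capacity of 544 from the layout above,
memstall_scale(100, 0, 44, 544, 1000) evaluates to 751, so about 75% of
the growth would be kept. With no RT/DL/IRQ utilization the result is x
unchanged, and once the pressure reaches cap_orig the result is 0.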