From: Hillf Danton
To: Pavan Kondeti
Cc: Johannes Weiner, Suren Baghdasaryan, linux-mm@kvack.org, linux-kernel@vger.kernel.org, quic_charante@quicinc.com
Subject: Re: PSI idle-shutoff
Date: Mon, 10 Oct 2022 18:57:10 +0800
Message-Id: <20221010105710.171-1-hdanton@sina.com>
In-Reply-To: <20220913140817.GA9091@hu-pkondeti-hyd.qualcomm.com>

On 13 Sep 2022 19:38:17 +0530 Pavan Kondeti wrote:
> Hi
>
> Since psi_avgs_work()->collect_percpu_times()->get_recent_times() runs
> from a kworker thread, the PSI_NONIDLE condition would be observed as
> there is a RUNNING task. So we would always end up re-arming the work.
>
> If the work is re-armed from psi_avgs_work() itself, the backing off
> logic in psi_task_change() (will be moved to psi_task_switch soon) can't
> help. The work is already scheduled, so we don't do anything there.
>
> Probably I am missing something here. Can you please clarify how we
> shut off re-arming the psi avg work?

Instead of open-coding schedule_delayed_work() in a bid to check whether
the timer hit the idle task (see delayed_work_timer_fn()), the idle task
is tracked in psi_task_switch(), and the kworker checks that record to see
whether it preempted the idle task. If it did, the work is not re-armed.

Only for thoughts now.

Hillf

+++ b/kernel/sched/psi.c
@@ -412,6 +412,8 @@ static u64 update_averages(struct psi_gr
 	return avg_next_update;
 }
 
+static DEFINE_PER_CPU(int, prev_task_is_idle);
+
 static void psi_avgs_work(struct work_struct *work)
 {
 	struct delayed_work *dwork;
@@ -439,7 +441,7 @@ static void psi_avgs_work(struct work_st
 	if (now >= group->avg_next_update)
 		group->avg_next_update = update_averages(group, now);
 
-	if (nonidle) {
+	if (nonidle && 0 == per_cpu(prev_task_is_idle, raw_smp_processor_id())) {
 		schedule_delayed_work(dwork, nsecs_to_jiffies(
 				group->avg_next_update - now) + 1);
 	}
@@ -859,6 +861,7 @@ void psi_task_switch(struct task_struct
 	if (prev->pid) {
 		int clear = TSK_ONCPU, set = 0;
 
+		per_cpu(prev_task_is_idle, cpu) = 0;
 		/*
 		 * When we're going to sleep, psi_dequeue() lets us
 		 * handle TSK_RUNNING, TSK_MEMSTALL_RUNNING and
@@ -888,7 +891,8 @@ void psi_task_switch(struct task_struct
 			for (; group; group = iterate_groups(prev, &iter))
 				psi_group_change(group, cpu, clear, set, now, true);
 		}
-	}
+	} else
+		per_cpu(prev_task_is_idle, cpu) = 1;
 }
 
 /**
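
To make the intended decision concrete, below is a rough userspace model of
the check, not kernel code. The names model_task_switch(),
model_should_rearm() and MODEL_NR_CPUS are made up for illustration, a plain
array stands in for the per-CPU variable, and pid 0 stands in for the idle
task, mirroring the prev->pid test in psi_task_switch(). It only models the
re-arm condition and says nothing about races or about which CPU the kworker
ends up running on.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical CPU count for this model only. */
#define MODEL_NR_CPUS 4

/* Stand-in for DEFINE_PER_CPU(int, prev_task_is_idle). */
static int model_prev_task_is_idle[MODEL_NR_CPUS];

/*
 * Stand-in for the bookkeeping added to psi_task_switch():
 * remember whether the task being switched out was the idle
 * task (pid 0) on this CPU.
 */
static void model_task_switch(int cpu, int prev_pid)
{
	assert(cpu >= 0 && cpu < MODEL_NR_CPUS);
	model_prev_task_is_idle[cpu] = (prev_pid == 0);
}

/*
 * Stand-in for the condition guarding schedule_delayed_work()
 * in psi_avgs_work(): re-arm only if there was non-idle
 * activity and the kworker did not preempt the idle task.
 */
static bool model_should_rearm(int cpu, bool nonidle)
{
	assert(cpu >= 0 && cpu < MODEL_NR_CPUS);
	return nonidle && !model_prev_task_is_idle[cpu];
}
```

In this model, calling model_task_switch(0, 0) followed by
model_should_rearm(0, true) yields false, since the kworker replaced the
idle task, while model_task_switch(1, 42) followed by
model_should_rearm(1, true) yields true.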