Date: Wed, 1 Feb 2023 15:28:46 +0100
From: Frederic Weisbecker
To: Hillf Danton
Cc: Thomas Gleixner, Yu Liao, fweisbec@gmail.com, mingo@kernel.org, liwei391@huawei.com, adobriyan@gmail.com, mirsad.todorovac@alu.unizg.hr, linux-kernel@vger.kernel.org, linux-mm@kvack.org, Peter Zijlstra
Subject: Re: [PATCH RFC] tick/nohz: fix data races in get_cpu_idle_time_us()
References: <20230128020051.2328465-1-liaoyu15@huawei.com> <20230201045302.316-1-hdanton@sina.com> <20230201140117.539-1-hdanton@sina.com>
In-Reply-To: <20230201140117.539-1-hdanton@sina.com>

On Wed, Feb 01, 2023 at 10:01:17PM +0800, Hillf Danton wrote:
> > > +++ b/kernel/time/tick-sched.c
> > > @@ -640,13 +640,26 @@ static void tick_nohz_update_jiffies(kti
> > >  /*
> > >   * Updates the per-CPU time idle statistics counters
> > >   */
> > > -static void
> > > -update_ts_time_stats(int cpu, struct tick_sched *ts, ktime_t now, u64 *last_update_time)
> > > +static u64 update_ts_time_stats(int cpu, struct tick_sched *ts, ktime_t now,
> > > +				int io, u64 *last_update_time)
> > >  {
> > >  	ktime_t delta;
> > >
> > > +	if (last_update_time)
> > > +		*last_update_time = ktime_to_us(now);
> > > +
> > >  	if (ts->idle_active) {
> > >  		delta = ktime_sub(now, ts->idle_entrytime);
> > > +
> > > +		/* update is only expected on the local CPU */
> > > +		if (cpu != smp_processor_id()) {
> >
> > Why not just updating it only on idle exit then?
>
> This aligns to idle exit as much as it can by disallowing remote update.

I mean why bother updating if idle does it for us already?

One possibility is that we get some more precise values if we read during
long idle periods with nr_iowait_cpu() changes in the middle.

> >
> > > +			if (io)
> >
> > I fear it's not up to the caller to decides if the idle time is IO or not.
>
> Could you specify a bit on your concern, given the callers of this function?

You are randomly stating if the elapsing idle time is IO or not depending on
the caller, without verifying nr_iowait_cpu(). Or am I missing something?

> >
> > > +				delta = ktime_add(ts->iowait_sleeptime, delta);
> > > +			else
> > > +				delta = ktime_add(ts->idle_sleeptime, delta);
> > > +			return ktime_to_us(delta);
>
> Based on the above comments, I guest you missed this line which prevents
> get_cpu_idle_time_us() and get_cpu_iowait_time_us() from updating ts.

Right...

> > But then you may race with the local updater, risking to return
> > the delta added twice. So you need at least a seqcount.
>
> Add seqcount if needed. No problem.
> >
> > But in the end, nr_iowait_cpu() is broken because that counter can be
> > decremented remotely and so the whole thing is beyond repair:
> >
> > CPU 0                     CPU 1                          CPU 2
> > -----                     -----                          ------
> > //io_schedule() TASK A
> > current->in_iowait = 1
> > rq(0)->nr_iowait++
> > //switch to idle
> >                           // READ /proc/stat
> >                           // See nr_iowait_cpu(0) == 1
> >                           return ts->iowait_sleeptime + ktime_sub(ktime_get(), ts->idle_entrytime)
> >
> >                                                          //try_to_wake_up(TASK A)
> >                                                          rq(0)->nr_iowait--
> > //idle exit
> > // See nr_iowait_cpu(0) == 0
> > ts->idle_sleeptime += ktime_sub(ktime_get(), ts->idle_entrytime)
>
> Ah see your point.
>
> The diff disallows remotely updating ts, and it is updated in idle exit
> after my proposal, so what nr_iowait_cpu() breaks is mitigated.

Only halfway mitigated. This doesn't prevent backward or forward jumps at all
when non-updating readers are involved.

Thanks.

>
> Thanks for taking a look, particularly the race linked to nr_iowait_cpu().
>
> Hillf
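
For illustration, the CPU 0 / CPU 1 / CPU 2 interleaving quoted above can be replayed in a small user-space model. This is only a sketch under stated assumptions: struct idle_stats, idle_enter(), read_iowait() and idle_exit() are hypothetical names, timestamps are plain integers, and the nr_iowait field stands in for nr_iowait_cpu(). It is not the kernel code, it only mirrors the accounting rule described in the thread.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical user-space model of the per-CPU idle accounting state. */
struct idle_stats {
	uint64_t idle_entrytime;   /* timestamp taken at idle entry */
	uint64_t idle_sleeptime;   /* accumulated non-IO idle time */
	uint64_t iowait_sleeptime; /* accumulated IO-wait idle time */
	int idle_active;           /* non-zero while the CPU is idle */
	int nr_iowait;             /* stands in for nr_iowait_cpu() */
};

/* CPU 0 enters idle: only the entry timestamp is recorded. */
static void idle_enter(struct idle_stats *ts, uint64_t now)
{
	assert(!ts->idle_active);
	ts->idle_entrytime = now;
	ts->idle_active = 1;
}

/*
 * Non-updating remote reader (CPU 1 in the diagram): reports the iowait
 * time, adding the in-flight idle period if tasks are currently in iowait.
 */
static uint64_t read_iowait(struct idle_stats *ts, uint64_t now)
{
	uint64_t total = ts->iowait_sleeptime;

	if (ts->idle_active && ts->nr_iowait > 0)
		total += now - ts->idle_entrytime;
	return total;
}

/*
 * Local update at idle exit: the whole idle period is classified by the
 * value of nr_iowait seen at exit time, whatever it was during the period.
 */
static void idle_exit(struct idle_stats *ts, uint64_t now)
{
	uint64_t delta = now - ts->idle_entrytime;

	if (ts->nr_iowait > 0)
		ts->iowait_sleeptime += delta;
	else
		ts->idle_sleeptime += delta;
	ts->idle_active = 0;
}
```

Replaying the diagram with this model: set nr_iowait to 1, enter idle at t=100, read at t=150 (the reader sees 50 units of iowait), clear nr_iowait as the remote wakeup would, exit idle at t=200, then read again. The second read reports 0, so the value a reader observes jumps backward even though ts itself was only ever updated locally.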
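
As a rough sketch of the seqcount idea mentioned above for the "delta added twice" race, here is a user-space approximation built on C11 atomics rather than the kernel's seqcount_t API. All names (struct ts_model, write_begin(), read_retry(), and so on) are hypothetical, a single writer is assumed, and the model only covers idle_sleeptime. It shows how a reader can detect that its snapshot overlapped an update and retry. It does nothing about the nr_iowait_cpu() problem discussed in the thread.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical model: seq is even when stable, odd while an update runs. */
struct ts_model {
	atomic_uint seq;
	_Atomic uint64_t idle_entrytime;
	_Atomic uint64_t idle_sleeptime;
	atomic_bool idle_active;
};

static void write_begin(struct ts_model *ts)
{
	unsigned int s = atomic_load_explicit(&ts->seq, memory_order_relaxed);

	assert((s & 1) == 0); /* single writer: no update already in flight */
	atomic_store_explicit(&ts->seq, s + 1, memory_order_relaxed);
	atomic_thread_fence(memory_order_release);
}

static void write_end(struct ts_model *ts)
{
	unsigned int s = atomic_load_explicit(&ts->seq, memory_order_relaxed);

	atomic_store_explicit(&ts->seq, s + 1, memory_order_release);
}

/* Local CPU enters idle. */
static void idle_enter(struct ts_model *ts, uint64_t now)
{
	write_begin(ts);
	atomic_store_explicit(&ts->idle_entrytime, now, memory_order_relaxed);
	atomic_store_explicit(&ts->idle_active, true, memory_order_relaxed);
	write_end(ts);
}

/* Local CPU leaves idle: fold the elapsed period into idle_sleeptime. */
static void idle_exit_update(struct ts_model *ts, uint64_t now)
{
	uint64_t entry = atomic_load_explicit(&ts->idle_entrytime, memory_order_relaxed);
	uint64_t sleep = atomic_load_explicit(&ts->idle_sleeptime, memory_order_relaxed);

	write_begin(ts);
	atomic_store_explicit(&ts->idle_sleeptime, sleep + (now - entry), memory_order_relaxed);
	atomic_store_explicit(&ts->idle_active, false, memory_order_relaxed);
	write_end(ts);
}

static unsigned int read_begin(struct ts_model *ts)
{
	return atomic_load_explicit(&ts->seq, memory_order_acquire);
}

/* True if the snapshot taken since read_begin() may be inconsistent. */
static bool read_retry(struct ts_model *ts, unsigned int start)
{
	atomic_thread_fence(memory_order_acquire);
	return (start & 1) ||
	       atomic_load_explicit(&ts->seq, memory_order_relaxed) != start;
}

/* Remote reader: never writes ts, retries if it raced with an update. */
static uint64_t read_idle_time(struct ts_model *ts, uint64_t now)
{
	unsigned int start;
	uint64_t total;

	do {
		start = read_begin(ts);
		total = atomic_load_explicit(&ts->idle_sleeptime, memory_order_relaxed);
		if (atomic_load_explicit(&ts->idle_active, memory_order_relaxed))
			total += now - atomic_load_explicit(&ts->idle_entrytime, memory_order_relaxed);
	} while (read_retry(ts, start));
	return total;
}
```

Without the retry loop, a reader could load the already updated idle_sleeptime together with a stale idle_active flag and add the same delta a second time. With it, a snapshot that overlaps idle_exit_update() is discarded and taken again.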