From mboxrd@z Thu Jan 1 00:00:00 1970
From: John Stultz
Date: Fri, 13 Sep 2024 11:59:56 -0700
Subject: Re: [PATCH v7 01/11] timekeeping: move multigrain timestamp floor handling into timekeeper
To: Jeff Layton
Cc: Thomas Gleixner, Stephen Boyd, Alexander Viro, Christian Brauner,
 Jan Kara, Steven Rostedt, Masami Hiramatsu, Mathieu Desnoyers,
 Jonathan Corbet, Chandan Babu R, "Darrick J. Wong", "Theodore Ts'o",
 Andreas Dilger, Chris Mason, Josef Bacik, David Sterba, Hugh Dickins,
 Andrew Morton, Chuck Lever, Vadim Fedorenko,
 linux-kernel@vger.kernel.org, linux-fsdevel@vger.kernel.org,
 linux-trace-kernel@vger.kernel.org, linux-doc@vger.kernel.org,
 linux-xfs@vger.kernel.org, linux-ext4@vger.kernel.org,
 linux-btrfs@vger.kernel.org, linux-nfs@vger.kernel.org,
 linux-mm@kvack.org
References: <20240913-mgtime-v7-0-92d4020e3b00@kernel.org>
 <20240913-mgtime-v7-1-92d4020e3b00@kernel.org>
In-Reply-To: <20240913-mgtime-v7-1-92d4020e3b00@kernel.org>

On Fri, Sep 13, 2024 at 6:54 AM Jeff Layton wrote:
>
> For multigrain timestamps, we must keep track of the latest timestamp
> that has ever been handed out, and never hand out a coarse time below
> that value.
>
> Add a static singleton atomic64_t into timekeeper.c that we can use to
> keep track of the latest fine-grained time ever handed out. This is

Maybe drop "ever" and add "handed out through a specific interface",
as timestamps can be accessed in a lot of ways that don't keep track
of what was returned.

> tracked as a monotonic ktime_t value to ensure that it isn't affected by
> clock jumps.
>
> Add two new public interfaces:
>
> - ktime_get_coarse_real_ts64_mg() fills a timespec64 with the later of the
>   coarse-grained clock and the floor time
>
> - ktime_get_real_ts64_mg() gets the fine-grained clock value, and tries
>   to swap it into the floor. A timespec64 is filled with the result.
>
> Since the floor is global, we take great pains to avoid updating it
> unless it's absolutely necessary. If we do the cmpxchg and find that the
> value has been updated since we fetched it, then we discard the
> fine-grained time that was fetched in favor of the recent update.
>
> To maximize the window of this occurring when multiple tasks are racing
> to update the floor, ktime_get_coarse_real_ts64_mg returns a cookie
> value that represents the state of the floor tracking word, and
> ktime_get_real_ts64_mg accepts a cookie value that it uses as the "old"
> value when calling cmpxchg().
>
> Signed-off-by: Jeff Layton
> ---
>  include/linux/timekeeping.h |  4 +++
>  kernel/time/timekeeping.c   | 81 +++++++++++++++++++++++++++++++++++++++++++++
>  2 files changed, 85 insertions(+)
>
> diff --git a/include/linux/timekeeping.h b/include/linux/timekeeping.h
> index fc12a9ba2c88..cf2293158c65 100644
> --- a/include/linux/timekeeping.h
> +++ b/include/linux/timekeeping.h
> @@ -45,6 +45,10 @@ extern void ktime_get_real_ts64(struct timespec64 *tv);
>  extern void ktime_get_coarse_ts64(struct timespec64 *ts);
>  extern void ktime_get_coarse_real_ts64(struct timespec64 *ts);
>
> +/* Multigrain timestamp interfaces */
> +extern u64 ktime_get_coarse_real_ts64_mg(struct timespec64 *ts);
> +extern void ktime_get_real_ts64_mg(struct timespec64 *ts, u64 cookie);
> +
>  void getboottime64(struct timespec64 *ts);
>
>  /*
> diff --git a/kernel/time/timekeeping.c b/kernel/time/timekeeping.c
> index 5391e4167d60..ee11006a224f 100644
> --- a/kernel/time/timekeeping.c
> +++ b/kernel/time/timekeeping.c
> @@ -114,6 +114,13 @@ static struct tk_fast tk_fast_raw ____cacheline_aligned = {
>          .base[1] = FAST_TK_INIT,
>  };
>
> +/*
> + * This represents the latest fine-grained time that we have handed out as a
> + * timestamp on the system. Tracked as a monotonic ktime_t, and converted to the
> + * realtime clock on an as-needed basis.
> + */
> +static __cacheline_aligned_in_smp atomic64_t mg_floor;
> +
>  static inline void tk_normalize_xtime(struct timekeeper *tk)
>  {
>          while (tk->tkr_mono.xtime_nsec >= ((u64)NSEC_PER_SEC << tk->tkr_mono.shift)) {
> @@ -2394,6 +2401,80 @@ void ktime_get_coarse_real_ts64(struct timespec64 *ts)
>  }
>  EXPORT_SYMBOL(ktime_get_coarse_real_ts64);
>
> +/**
> + * ktime_get_coarse_real_ts64_mg - get later of coarse grained time or floor
> + * @ts: timespec64 to be filled
> + *
> + * Adjust floor to realtime and compare it to the coarse time. Fill
> + * @ts with the latest one. Returns opaque cookie suitable for passing
> + * to ktime_get_real_ts64_mg().
> + */
> +u64 ktime_get_coarse_real_ts64_mg(struct timespec64 *ts)
> +{
> +        struct timekeeper *tk = &tk_core.timekeeper;
> +        u64 floor = atomic64_read(&mg_floor);
> +        ktime_t f_real, offset, coarse;
> +        unsigned int seq;
> +
> +        WARN_ON(timekeeping_suspended);
> +
> +        do {
> +                seq = read_seqcount_begin(&tk_core.seq);
> +                *ts = tk_xtime(tk);
> +                offset = *offsets[TK_OFFS_REAL];
> +        } while (read_seqcount_retry(&tk_core.seq, seq));
> +
> +        coarse = timespec64_to_ktime(*ts);
> +        f_real = ktime_add(floor, offset);
> +        if (ktime_after(f_real, coarse))
> +                *ts = ktime_to_timespec64(f_real);
> +        return floor;
> +}
> +EXPORT_SYMBOL_GPL(ktime_get_coarse_real_ts64_mg);
> +
> +/**
> + * ktime_get_real_ts64_mg - attempt to update floor value and return result
> + * @ts:         pointer to the timespec to be set
> + * @cookie:     opaque cookie from earlier call to ktime_get_coarse_real_ts64_mg()
> + *
> + * Get a current monotonic fine-grained time value and attempt to swap
> + * it into the floor using @cookie as the "old" value. @ts will be
> + * filled with the resulting floor value, regardless of the outcome of
> + * the swap.

I'd add more detail here to clarify that this can return a coarse
floor value if the cookie is stale.

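To make the stale-cookie case concrete, here is a rough userspace
sketch using C11 atomics. It is only an illustration, not the kernel
code: demo_floor, demo_get_coarse(), demo_get_fine() and
demo_stale_cookie() are made-up names, the realtime offset and the
seqcount loop are left out, and plain int64_t nanosecond values stand
in for ktime_t. It also assumes the failure path simply returns
whatever value the failed compare-and-exchange left in "old".

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

/* Hypothetical userspace stand-in for the patch's mg_floor. */
static _Atomic int64_t demo_floor;

/*
 * Fill *out with the later of the coarse time and the floor.
 * The floor value that was read is returned as the cookie.
 */
static int64_t demo_get_coarse(int64_t coarse_now, int64_t *out)
{
	int64_t floor = atomic_load(&demo_floor);

	*out = floor > coarse_now ? floor : coarse_now;
	return floor;
}

/*
 * Try to install fine_now as the new floor, using the cookie as the
 * expected old value. If the cookie is stale, the compare-and-exchange
 * fails and writes the current floor into "old", so the caller gets
 * that floor instead of its own fine-grained time.
 */
static void demo_get_fine(int64_t fine_now, int64_t cookie, int64_t *out)
{
	int64_t old = cookie;

	if (atomic_compare_exchange_strong(&demo_floor, &old, fine_now))
		*out = fine_now;
	else
		*out = old;
}

/* Two callers share one cookie; returns what the second one sees. */
static int64_t demo_stale_cookie(void)
{
	int64_t ts, cookie;

	atomic_store(&demo_floor, 100);
	cookie = demo_get_coarse(90, &ts);	/* floor is later than coarse */
	assert(ts == 100);

	demo_get_fine(150, cookie, &ts);	/* first caller wins the swap */
	demo_get_fine(155, cookie, &ts);	/* second caller's cookie is stale */
	return ts;				/* the first caller's 150, not 155 */
}
```

In this toy run the second caller asked for a fine-grained time but
gets back the value the first caller installed, which is the behavior
I think the kernel-doc should spell out.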
> +void ktime_get_real_ts64_mg(struct timespec64 *ts, u64 cookie)
> +{
> +        struct timekeeper *tk = &tk_core.timekeeper;
> +        ktime_t offset, mono, old = (ktime_t)cookie;
> +        unsigned int seq;
> +        u64 nsecs;
> +
> +        WARN_ON(timekeeping_suspended);
> +
> +        do {
> +                seq = read_seqcount_begin(&tk_core.seq);
> +
> +                ts->tv_sec = tk->xtime_sec;
> +                mono = tk->tkr_mono.base;
> +                nsecs = timekeeping_get_ns(&tk->tkr_mono);
> +                offset = *offsets[TK_OFFS_REAL];
> +        } while (read_seqcount_retry(&tk_core.seq, seq));
> +
> +        mono = ktime_add_ns(mono, nsecs);
> +
> +        if (atomic64_try_cmpxchg(&mg_floor, &old, mono)) {
> +                ts->tv_nsec = 0;
> +                timespec64_add_ns(ts, nsecs);
> +        } else {
> +                /*
> +                 * Something has changed mg_floor since "old" was
> +                 * fetched. That value is just as valid, so accept it.
> +                 */

Mostly because I embarrassingly tripped over this in front of
everyone, I might suggest:

  /*
   * mg_floor was updated since the cookie was fetched, so the
   * try_cmpxchg failed. However, try_cmpxchg updated old
   * with the current mg_floor, so use that to return the current
   * coarse floor value.
   */

:)

thanks
-john