From: Suren Baghdasaryan
Date: Thu, 2 Mar 2023 17:16:33 -0800
Subject: Re: [PATCH v2 1/1] psi: remove 500ms min window size limitation for triggers
To: peterz@infradead.org
Cc: tj@kernel.org, hannes@cmpxchg.org, lizefan.x@bytedance.com, johunt@akamai.com, mhocko@suse.com, keescook@chromium.org, quic_sudaraja@quicinc.com, cgroups@vger.kernel.org, linux-mm@kvack.org, linux-kernel@vger.kernel.org
References: <20230303011346.3342233-1-surenb@google.com>
In-Reply-To: <20230303011346.3342233-1-surenb@google.com>

On Thu, Mar 2, 2023 at 5:13 PM Suren Baghdasaryan wrote:
>
> Current 500ms min window size for psi triggers limits polling interval
> to 50ms to prevent polling threads from using too much cpu bandwidth by
> polling too frequently.
> However the number of cgroups with triggers is
> unlimited, so this protection can be defeated by creating multiple
> cgroups with psi triggers (triggers in each cgroup are served by a single
> "psimon" kernel thread).
> Instead of limiting min polling period, which also limits the latency of
> psi events, it's better to limit psi trigger creation to authorized users
> only, like we do for system-wide psi triggers (/proc/pressure/* files can
> be written only by processes with CAP_SYS_RESOURCE capability). This also
> makes access rules for cgroup psi files consistent with system-wide ones.
> Add a CAP_SYS_RESOURCE capability check for cgroup psi file writers and
> remove the psi window min size limitation.
>
> Suggested-by: Sudarshan Rajagopalan
> Link: https://lore.kernel.org/all/cover.1676067791.git.quic_sudaraja@quicinc.com/
> Signed-off-by: Suren Baghdasaryan
> Acked-by: Michal Hocko
> Acked-by: Johannes Weiner

Forgot to change the --to field from Tejun to PeterZ.
Peter, just to clarify, this change is targeted for inclusion in your tree.
Thanks!

> ---
>  kernel/cgroup/cgroup.c | 10 ++++++++++
>  kernel/sched/psi.c     |  4 +---
>  2 files changed, 11 insertions(+), 3 deletions(-)
>
> diff --git a/kernel/cgroup/cgroup.c b/kernel/cgroup/cgroup.c
> index 935e8121b21e..b600a6baaeca 100644
> --- a/kernel/cgroup/cgroup.c
> +++ b/kernel/cgroup/cgroup.c
> @@ -3867,6 +3867,12 @@ static __poll_t cgroup_pressure_poll(struct kernfs_open_file *of,
>         return psi_trigger_poll(&ctx->psi.trigger, of->file, pt);
>  }
>
> +static int cgroup_pressure_open(struct kernfs_open_file *of)
> +{
> +       return (of->file->f_mode & FMODE_WRITE && !capable(CAP_SYS_RESOURCE)) ?
> +               -EPERM : 0;
> +}
> +
>  static void cgroup_pressure_release(struct kernfs_open_file *of)
>  {
>         struct cgroup_file_ctx *ctx = of->priv;
> @@ -5266,6 +5272,7 @@ static struct cftype cgroup_psi_files[] = {
>         {
>                 .name = "io.pressure",
>                 .file_offset = offsetof(struct cgroup, psi_files[PSI_IO]),
> +               .open = cgroup_pressure_open,
>                 .seq_show = cgroup_io_pressure_show,
>                 .write = cgroup_io_pressure_write,
>                 .poll = cgroup_pressure_poll,
> @@ -5274,6 +5281,7 @@ static struct cftype cgroup_psi_files[] = {
>         {
>                 .name = "memory.pressure",
>                 .file_offset = offsetof(struct cgroup, psi_files[PSI_MEM]),
> +               .open = cgroup_pressure_open,
>                 .seq_show = cgroup_memory_pressure_show,
>                 .write = cgroup_memory_pressure_write,
>                 .poll = cgroup_pressure_poll,
> @@ -5282,6 +5290,7 @@ static struct cftype cgroup_psi_files[] = {
>         {
>                 .name = "cpu.pressure",
>                 .file_offset = offsetof(struct cgroup, psi_files[PSI_CPU]),
> +               .open = cgroup_pressure_open,
>                 .seq_show = cgroup_cpu_pressure_show,
>                 .write = cgroup_cpu_pressure_write,
>                 .poll = cgroup_pressure_poll,
> @@ -5291,6 +5300,7 @@ static struct cftype cgroup_psi_files[] = {
>         {
>                 .name = "irq.pressure",
>                 .file_offset = offsetof(struct cgroup, psi_files[PSI_IRQ]),
> +               .open = cgroup_pressure_open,
>                 .seq_show = cgroup_irq_pressure_show,
>                 .write = cgroup_irq_pressure_write,
>                 .poll = cgroup_pressure_poll,
> diff --git a/kernel/sched/psi.c b/kernel/sched/psi.c
> index 02e011cabe91..0945f956bf80 100644
> --- a/kernel/sched/psi.c
> +++ b/kernel/sched/psi.c
> @@ -160,7 +160,6 @@ __setup("psi=", setup_psi);
>  #define EXP_300s       2034            /* 1/exp(2s/300s) */
>
>  /* PSI trigger definitions */
> -#define WINDOW_MIN_US 500000   /* Min window size is 500ms */
>  #define WINDOW_MAX_US 10000000 /* Max window size is 10s */
>  #define UPDATES_PER_WINDOW 10  /* 10 updates per window */
>
> @@ -1278,8 +1277,7 @@ struct psi_trigger *psi_trigger_create(struct psi_group *group,
>         if (state >= PSI_NONIDLE)
>                 return ERR_PTR(-EINVAL);
>
> -       if (window_us < WINDOW_MIN_US ||
> -               window_us > WINDOW_MAX_US)
> +       if (window_us == 0 || window_us > WINDOW_MAX_US)
>                 return ERR_PTR(-EINVAL);
>
>         /* Check threshold */
> --
> 2.40.0.rc0.216.gc4246ad0f0-goog
>
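
For reference, the window check from the last hunk and the polling period
it implies can be sketched in userspace C. This is only a minimal sketch,
not kernel code: psi_window_check() and psi_poll_period_us() are
hypothetical names, and the period formula (window divided by
UPDATES_PER_WINDOW) is an assumption inferred from the 500ms -> 50ms
relation stated in the changelog.

```c
#include <errno.h>
#include <stdint.h>

/* Constants mirrored from the quoted kernel/sched/psi.c hunk. */
#define WINDOW_MAX_US      10000000u /* Max window size is 10s */
#define UPDATES_PER_WINDOW 10u       /* 10 updates per window */

/*
 * Hypothetical helper: applies the post-patch window check.
 * Returns 0 if the window is acceptable, -EINVAL otherwise.
 */
int psi_window_check(uint32_t window_us)
{
	if (window_us == 0 || window_us > WINDOW_MAX_US)
		return -EINVAL;
	return 0;
}

/*
 * Hypothetical helper: polling period implied by a window, assuming
 * the window is split into UPDATES_PER_WINDOW equal updates.
 */
uint32_t psi_poll_period_us(uint32_t window_us)
{
	return window_us / UPDATES_PER_WINDOW;
}
```

Under these assumptions a 500ms window maps to a 50ms period, as the
changelog states, while a 100ms window, rejected before this patch, now
passes the check and maps to a 10ms period.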