From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by smtp.lore.kernel.org (Postfix) with ESMTP id B57C7C678D4 for ; Thu, 2 Mar 2023 16:14:09 +0000 (UTC) Received: by kanga.kvack.org (Postfix) id 293E96B0071; Thu, 2 Mar 2023 11:14:09 -0500 (EST) Received: by kanga.kvack.org (Postfix, from userid 40) id 244AC6B0073; Thu, 2 Mar 2023 11:14:09 -0500 (EST) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id 10CE46B0074; Thu, 2 Mar 2023 11:14:09 -0500 (EST) X-Delivered-To: linux-mm@kvack.org Received: from relay.hostedemail.com (smtprelay0014.hostedemail.com [216.40.44.14]) by kanga.kvack.org (Postfix) with ESMTP id F2ED86B0071 for ; Thu, 2 Mar 2023 11:14:08 -0500 (EST) Received: from smtpin19.hostedemail.com (a10.router.float.18 [10.200.18.1]) by unirelay09.hostedemail.com (Postfix) with ESMTP id BE63780320 for ; Thu, 2 Mar 2023 16:14:08 +0000 (UTC) X-FDA: 80524454976.19.A65FDAC Received: from mail-yb1-f180.google.com (mail-yb1-f180.google.com [209.85.219.180]) by imf22.hostedemail.com (Postfix) with ESMTP id E7B1FC0011 for ; Thu, 2 Mar 2023 16:14:05 +0000 (UTC) Authentication-Results: imf22.hostedemail.com; dkim=pass header.d=google.com header.s=20210112 header.b=HOLaGjYG; dmarc=pass (policy=reject) header.from=google.com; spf=pass (imf22.hostedemail.com: domain of surenb@google.com designates 209.85.219.180 as permitted sender) smtp.mailfrom=surenb@google.com ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=hostedemail.com; s=arc-20220608; t=1677773645; h=from:from:sender:reply-to:subject:subject:date:date: message-id:message-id:to:to:cc:cc:mime-version:mime-version: content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references:dkim-signature; bh=pk3jxwGhDgrO97j71qZC7hN119s30TroFhBoZ90aiOw=; b=1Tug8LtvsEd7MXWG2CTOca/OFpba8D5gOjtENdAQzvhKULKDStWNIyt+g738V0NyZdcXrn vsIFTZuz8mOd+DY2ogY2rqp3C+Tj5OhebD71AjFfVSvHmZKb8EHio13Ww8v4woAN6nvO4b jYNX8cFxjUstC0Sd0YguM1gzNYEAy9s= ARC-Authentication-Results: i=1; imf22.hostedemail.com; dkim=pass header.d=google.com header.s=20210112 header.b=HOLaGjYG; dmarc=pass (policy=reject) header.from=google.com; spf=pass (imf22.hostedemail.com: domain of surenb@google.com designates 209.85.219.180 as permitted sender) smtp.mailfrom=surenb@google.com ARC-Seal: i=1; s=arc-20220608; d=hostedemail.com; t=1677773645; a=rsa-sha256; cv=none; b=oTB73HnwbRVKBm7o+IZpRho4AfKTePWWPBM7b7cRvAeddfoyg1os84fhiQlSlYm1QqsbMp dWlcYbFlCC8MfjrFupp/y84QEq7NKgTO4xEGNYhetyXHVrpFT21g8wy3H9w5z2hdIUlA4N W+c8Iv2x7MvdJdoI3mMZMwbsdDgOQSo= Received: by mail-yb1-f180.google.com with SMTP id e82so4065291ybh.9 for ; Thu, 02 Mar 2023 08:14:05 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=20210112; t=1677773645; h=content-transfer-encoding:cc:to:subject:message-id:date:from :in-reply-to:references:mime-version:from:to:cc:subject:date :message-id:reply-to; bh=pk3jxwGhDgrO97j71qZC7hN119s30TroFhBoZ90aiOw=; b=HOLaGjYG+E2moOTroF7Lrc+FT8kjPg8eO+2zq2UinSgu1L3KwflN5gtuQz0aDv9ZXP KckvhiQgN1sVCm2Hc5Yn1+cw9i8SMozFfdHsbL+AU7QVUr4jkT41uTH8LZGQmb0a6Zyi no4FFwYyuvgxDpfoo1taagMzOvHW+ePVVwvWXz9LTXbSNMqosnxaF+IsYpUHOzyH/Tkz /mwwZKGFNHwd/jt6Ud2+aiJBOGFURPV4LXpc9DU863O/dmd53NCqR/t6vQL5cmqrwzq8 BjAV+kKZjlzQjPbo/AWcbYqHbGcwwvSf8THi1bV9ZERZZazHoRYC2q3QV6kNacTBeMmq XXQw== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20210112; t=1677773645; h=content-transfer-encoding:cc:to:subject:message-id:date:from :in-reply-to:references:mime-version:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=pk3jxwGhDgrO97j71qZC7hN119s30TroFhBoZ90aiOw=; b=14cmRMxcCKp4M+jXDXMx7Qr9+KKkD8Vff2bQ59V4EV+i+G8qle1RpqaOEeE+vrY6ru G1ejE8i4lWUf3CjJbpRiJp22eb51Yjck4QocX6SsKWpsDhce05Fxo2g2NanijkWOKoOB AS4WR7Amp/BvWTZHmRsjbzXKWj2bWsdlpJxmSuoeoiRCeo/+0PlGS+ighRS3WMyOBm9j ig7vymPl4l7iv6DTnLxbE4YLEGPAXnNhoXuQEThnHKceVzZ92A9rqjtUjTd5clRyxS9m TG0gn7qCDrckTw4KlwsPZcUqPb4icWxgT2k1cLK9jnZzQNP+YOqCiImQtEKa1FEOwidS 7ElA== X-Gm-Message-State: AO0yUKXR4yh1UCNJF0BVLoBqWlcPHtG6nX54hCAOIot7Uonz1Nm0b7cn pTeMALr0aszZ77bxhBkY2lG0PPeAwRn4utF6ynySWg== X-Google-Smtp-Source: AK7set9+bTrTQ51qMnNEYy8g/1SOF00A9AhrNkytx6kl+1TS0DOeSRD0r4Sq47w1fENuH5Bq/GsXLlw477NZXu+8vAM= X-Received: by 2002:a5b:650:0:b0:997:bdfe:78c5 with SMTP id o16-20020a5b0650000000b00997bdfe78c5mr5867671ybq.6.1677773644842; Thu, 02 Mar 2023 08:14:04 -0800 (PST) MIME-Version: 1.0 References: <20230301193403.1507484-1-surenb@google.com> In-Reply-To: From: Suren Baghdasaryan Date: Thu, 2 Mar 2023 08:13:54 -0800 Message-ID: Subject: Re: [PATCH 1/1] psi: remove 500ms min window size limitation for triggers To: Johannes Weiner Cc: tj@kernel.org, lizefan.x@bytedance.com, peterz@infradead.org, johunt@akamai.com, mhocko@suse.com, keescook@chromium.org, quic_sudaraja@quicinc.com, cgroups@vger.kernel.org, linux-mm@kvack.org, linux-kernel@vger.kernel.org Content-Type: text/plain; charset="UTF-8" Content-Transfer-Encoding: quoted-printable X-Rspam-User: X-Rspamd-Server: rspam02 X-Rspamd-Queue-Id: E7B1FC0011 X-Stat-Signature: mbxzsippfjbykhjdf1peuegg31dexkac X-HE-Tag: 1677773645-587227 X-HE-Meta: U2FsdGVkX18ooxwRGNWEh0hL1+sRKigclyZpohpNqyAk7bcCyo1nZKNeao+FUsD/y2+2i/d5eE2/hZ0pXBSdswEtahRgH1vf23RXY8Ox7kzlGtWMXe6VeDgwwzTRLXVXECZE+ndoXBzJeida3V9Jqcd1nsZy3Wum4G1Y3p+a9uvVhgR3w9rjGgKwcLzzjry9VPlKwqlXrWY/iYd+DrU5kI94/bSnJCkbI87y7rSAg4sEVawlk2eNFAbTmkIZdK9IoLtW0Hhf6UqEDf40H6ivumyTbQ0F1N0TA3H3QZpzLtE2DoeeGmNEQfDKz87IlPsJVFiUMcxZ9OTIUsP2hZ7T+o5pcnm+5ThsHAklf4Lv51DVzMON+OjYXJF1yD1kiEYCSm3X6tcvoKhcC3q612kVLV94WA75EfS90Cz844RdrmGoLkFRGfPr0R5d8BsLCq/ZS4ly6CrEhPV2ZsTT/Bmn2wpFqhV6BZNpvakpPOyDy77PBmfJZVWQRchMUyYrHBnVlJP25ujOlttmWPHNY8lObg05Eq62WY9nX9An5XujXkMGm7IMbfOtf92e4c8gwwzeiQFnbiegD3ydpQZIlSGsiQNKVReTj8bfvOtrql6SE9TDhmXEFcSlzxxsFDtxUwol73qtCTrTBDbI7/rfb4TV+Wj5q9G9usPUjkEAyf5cs9ECg1ywwnDmcqXH1OtRp75T54A+3S4EHNwyn0W+mqbhGf7YHYP70WXXM392p4PNKU34UzrH9zjRONBpxJEvJLdCpEGX5t6cPHkoausv+0vOiN4nn9O0KRmBXn5RyfMX15FBlJYh+nIYn+qzPsYr5JX+Ksehip1A/aprG5RkTC7OMF8Mo82VC2StpxuuZzQOaMoC4/BwWHbNKIandSDTGeDO/whJaqmNmOc9JSCgHy9WioaeyNBZMrEYVZYZF0mjHLlZtHcJr3z4KdcgZW+kYUtQD2W3TDpPhfuXV164WpF LF+FLpLS bokc6ANAx4JHJUF9v3wc4saHRdUcg9jLLFTmtNSau9jGe43NKh6l08ogPYp82ZYCSBYVKCCpJ2LgztY4Whn/eReOENNaV1l5BpVtbUSkwzOQ+OTL4G9KXkSu2vBdyufZD+LMLU6rRLUZb+k4btO/iAcEEwVeKV2AjRrTv5k5uL/wnvhAtBOfNzkLPo8g9jITe7UMCXhNpl2Yp5WyPSheV0qIm/ijkttfklNVd8exoaSHyEN1g4hGSE2cSQEIC8a68UI4KKmWnKlVa23vpJ0BKujZUqKVlICCNEFRDjIL6jeIJ+RKFIIlggJ+FFSwEM5XKMeEdaJejCS0NVVz0d6cVTHbsF3Ynr1BZHa8210uH4zwsLAhK2GfI4ker2BwNuETVcRZLDYP31TMD9V2J0rkdG6UfhScddMMgoezcFvGAv+Z2PM3l9ElZ34kR0x8Rd321Ust3 X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: On Thu, Mar 2, 2023 at 7:30 AM Johannes Weiner wrote: > > On Wed, Mar 01, 2023 at 12:48:38PM -0800, Suren Baghdasaryan wrote: > > On Wed, Mar 1, 2023 at 12:07=E2=80=AFPM Johannes Weiner wrote: > > > > > > On Wed, Mar 01, 2023 at 11:34:03AM -0800, Suren Baghdasaryan wrote: > > > > Current 500ms min window size for psi triggers limits polling inter= val > > > > to 50ms to prevent polling threads from using too much cpu bandwidt= h by > > > > polling too frequently. However the number of cgroups with triggers= is > > > > unlimited, so this protection can be defeated by creating multiple > > > > cgroups with psi triggers (triggers in each cgroup are served by a = single > > > > "psimon" kernel thread). > > > > Instead of limiting min polling period, which also limits the laten= cy of > > > > psi events, it's better to limit psi trigger creation to authorized= users > > > > only, like we do for system-wide psi triggers (/proc/pressure/* fil= es can > > > > be written only by processes with CAP_SYS_RESOURCE capability). Thi= s also > > > > makes access rules for cgroup psi files consistent with system-wide= ones. > > > > Add a CAP_SYS_RESOURCE capability check for cgroup psi file writers= and > > > > remove the psi window min size limitation. > > > > > > > > Suggested-by: Sudarshan Rajagopalan > > > > Link: https://lore.kernel.org/all/cover.1676067791.git.quic_sudaraj= a@quicinc.com/ > > > > Signed-off-by: Suren Baghdasaryan > > > > --- > > > > kernel/cgroup/cgroup.c | 10 ++++++++++ > > > > kernel/sched/psi.c | 4 +--- > > > > 2 files changed, 11 insertions(+), 3 deletions(-) > > > > > > > > diff --git a/kernel/cgroup/cgroup.c b/kernel/cgroup/cgroup.c > > > > index 935e8121b21e..b600a6baaeca 100644 > > > > --- a/kernel/cgroup/cgroup.c > > > > +++ b/kernel/cgroup/cgroup.c > > > > @@ -3867,6 +3867,12 @@ static __poll_t cgroup_pressure_poll(struct = kernfs_open_file *of, > > > > return psi_trigger_poll(&ctx->psi.trigger, of->file, pt); > > > > } > > > > > > > > +static int cgroup_pressure_open(struct kernfs_open_file *of) > > > > +{ > > > > + return (of->file->f_mode & FMODE_WRITE && !capable(CAP_SYS_RE= SOURCE)) ? > > > > + -EPERM : 0; > > > > +} > > > > > > I agree with the change, but it's a bit unfortunate that this check i= s > > > duplicated between system and cgroup. > > > > > > What do you think about psi_trigger_create() taking the file and > > > checking FMODE_WRITE and CAP_SYS_RESOURCE against file->f_cred? > > > > That's definitely doable and we don't even need to pass file to > > psi_trigger_create() since it's called only when we write to the file. > > However by moving the capability check into psi_trigger_create() we > > also postpone the check until write() instead of failing early in > > open(). I always assumed failing early is preferable but if > > consolidating the code here makes more sense then I can make the > > switch. Please let me know if you still prefer to move the check. > > Just for context, a person on our team is working on allowing > unprivileged polls with windows that are multiples of 2s, which can be > triggered from the regular aggregator threads. This should be useful > for container delegation, and also for the desktop monitor app usecase > that Chris Down brought up some time ago. At that point, everybody can > open the file for write, and permissions are checked against the > trigger parameters. > > So I don't think it's a big deal to check this particular permission > at write time. But if you prefer we can also merge your patch as-is > and do the refactor as part of the other series. Let's roll this check without additional changes and then consolidate the checking inside psi_trigger_create() in a separate patch. If anybody objects to the late permission check we will just revert that last change without affecting anything else. > > Your call. In either case, please feel free to add > > Acked-by: Johannes Weiner Thanks! Will post the final patch with Ack's later today. Originally it was purely cgroup-related change but now it's more of a PSI change. Therefore Peter's tree will probably be the right place for it.