From: Zhaoyang Huang
Date: Tue, 21 Dec 2021 10:51:01 +0800
Subject: Re: [PATCH] psi: fix possible trigger missing in the window
To: Suren Baghdasaryan
Cc: Johannes Weiner, Zhaoyang Huang, "open list:MEMORY MANAGEMENT", LKML
References: <1639721264-12294-1-git-send-email-huangzhaoyang@gmail.com>

On Tue, Dec 21, 2021 at 10:30 AM Suren Baghdasaryan wrote:
>
> On Mon, Dec 20, 2021 at 5:57 PM Zhaoyang Huang wrote:
> >
> > On Tue, Dec 21, 2021 at 3:58 AM Suren Baghdasaryan wrote:
> > >
> > > On Fri, Dec 17, 2021 at 10:03 PM Zhaoyang Huang wrote:
> > > >
> > > > loop Suren
> > >
> > > Thanks.
> > >
> > > >
> > > > On Fri, Dec 17, 2021 at 2:08 PM Huangzhaoyang wrote:
> > > > >
> > > > > From: Zhaoyang Huang
> > > > >
> > > > > There could be missing wake up if the rest of the window remain the
> > > > > same stall states as the polling_total updates for every polling_min_period.
> > >
> > > Could you please expand on this description? I'm unclear what the
> > > problem is. I assume "polling_min_period" in this description refers
> > > to the group->poll_min_period.
> > >
> > > From the code, looks like the change results in update_triggers()
> > > calling window_update() once there was a new stall recorded for the
> > > trigger state and until the tracking window is complete. I don't see
> > > the point of calling window_update() if there was no stall change
> > > since the last call to window_update(). The resulting growth will not
> > > increase if there is no new stall.
> > > Maybe what you want to achieve here is more than one trigger per
> > > window if the stall limit was breached? If so, then this goes against
> > > the design for psi triggers in which we want to rate-limit the number
> > > of generated triggers per tracking window (see:
> > > https://elixir.bootlin.com/linux/latest/source/kernel/sched/psi.c#L545).
> > > Please clarify the issue and the intentions here.
> > > Thanks!
> > Please correct me if I am wrong. Imagine that there is a new stall
> > during the 1st polling_min_period among 10 of them in the window and
> > group->polling_total will be updated to total without trigger. If the
> > rest of 9 polling_min_periods remain the same states, the trigger will
> > be missed when window timing is reached.
>
> I don't see why updating group->polling_total after the first
> registered stall is an issue here. window_update() calculates growth
> using current group->total[] and win->start_value (see:
> https://elixir.bootlin.com/linux/latest/source/kernel/sched/psi.c#L483)
> which is set at the beginning of the window (see:
> https://elixir.bootlin.com/linux/latest/source/kernel/sched/psi.c#L462).
> If the calculated growth did not reach t->threshold then the trigger
> should not be fired (see:
> https://elixir.bootlin.com/linux/latest/source/kernel/sched/psi.c#L542).
> We fire the trigger only if growth within a given window is higher
> than the threshold.
>
> In your scenario if the stall recorded in the 1st polling_min_period
> was less than the threshold and in the other 9 polling_min_periods no
> new stalls were registered then there should be no triggers fired in
> that window. This is intended behavior. Trigger is fired only when the

Even if the stall in the 1st polling_min_period is *LARGER* than the
threshold, it will also be ignored here.

> recorded stall within the window breaches the threshold. And there
> will be only one trigger generated per window, no matter how much
> stall is being recorded after the threshold was breached.
> Hopefully this clarifies the behavior?

I don't think so. By your reasoning, if the total does not change in
the last polling_min_period, then growth during the 1-9 min_periods
that is much larger than the threshold will also be ignored. That does
not make sense.
https://elixir.bootlin.com/linux/latest/source/kernel/sched/psi.c#L529

>
> > >
> > > > >
> > > > > Signed-off-by: Zhaoyang Huang
> > > > > ---
> > > > >  include/linux/psi_types.h |  2 ++
> > > > >  kernel/sched/psi.c        | 30 ++++++++++++++++++------------
> > > > >  2 files changed, 20 insertions(+), 12 deletions(-)
> > > > >
> > > > > diff --git a/include/linux/psi_types.h b/include/linux/psi_types.h
> > > > > index 0a23300..9533d2e 100644
> > > > > --- a/include/linux/psi_types.h
> > > > > +++ b/include/linux/psi_types.h
> > > > > @@ -132,6 +132,8 @@ struct psi_trigger {
> > > > >
> > > > >  	/* Refcounting to prevent premature destruction */
> > > > >  	struct kref refcount;
> > > > > +
> > > > > +	bool new_stall;
> > > > >  };
> > > > >
> > > > >  struct psi_group {
> > > > > diff --git a/kernel/sched/psi.c b/kernel/sched/psi.c
> > > > > index 1652f2b..402718c 100644
> > > > > --- a/kernel/sched/psi.c
> > > > > +++ b/kernel/sched/psi.c
> > > > > @@ -458,9 +458,12 @@ static void psi_avgs_work(struct work_struct *work)
> > > > >  static void window_reset(struct psi_window *win, u64 now, u64 value,
> > > > >  			 u64 prev_growth)
> > > > >  {
> > > > > +	struct psi_trigger *t = container_of(win, struct psi_trigger, win);
> > > > > +
> > > > >  	win->start_time = now;
> > > > >  	win->start_value = value;
> > > > >  	win->prev_growth = prev_growth;
> > > > > +	t->new_stall = false;
> > > > >  }
> > > > >
> > > > >  /*
> > > > > @@ -515,7 +518,6 @@ static void init_triggers(struct psi_group *group, u64 now)
> > > > >  static u64 update_triggers(struct psi_group *group, u64 now)
> > > > >  {
> > > > >  	struct psi_trigger *t;
> > > > > -	bool new_stall = false;
> > > > >  	u64 *total = group->total[PSI_POLL];
> > > > >
> > > > >  	/*
> > > > > @@ -523,19 +525,26 @@ static u64 update_triggers(struct psi_group *group, u64 now)
> > > > >  	 * watchers know when their specified thresholds are exceeded.
> > > > >  	 */
> > > > >  	list_for_each_entry(t, &group->triggers, node) {
> > > > > -		u64 growth;
> > > > > -
> > > > >  		/* Check for stall activity */
> > > > >  		if (group->polling_total[t->state] == total[t->state])
> > > > >  			continue;
> > > > >
> > > > >  		/*
> > > > > -		 * Multiple triggers might be looking at the same state,
> > > > > -		 * remember to update group->polling_total[] once we've
> > > > > -		 * been through all of them. Also remember to extend the
> > > > > -		 * polling time if we see new stall activity.
> > > > > +		 * update the trigger if there is new stall which will be
> > > > > +		 * reset when run out of the window
> > > > >  		 */
> > > > > -		new_stall = true;
> > > > > +		t->new_stall = true;
> > > > > +
> > > > > +		memcpy(&group->polling_total[t->state], &total[t->state],
> > > > > +				sizeof(group->polling_total[t->state]));
> > > > > +	}
> > > > > +
> > > > > +	list_for_each_entry(t, &group->triggers, node) {
> > > > > +		u64 growth;
> > > > > +
> > > > > +		/* check if new stall happened during this window*/
> > > > > +		if (!t->new_stall)
> > > > > +			continue;
> > > > >
> > > > >  		/* Calculate growth since last update */
> > > > >  		growth = window_update(&t->win, now, total[t->state]);
> > > > > @@ -552,10 +561,6 @@ static u64 update_triggers(struct psi_group *group, u64 now)
> > > > >  		t->last_event_time = now;
> > > > >  	}
> > > > >
> > > > > -	if (new_stall)
> > > > > -		memcpy(group->polling_total, total,
> > > > > -				sizeof(group->polling_total));
> > > > > -
> > > > >  	return now + group->poll_min_period;
> > > > >  }
> > > > >
> > > > > @@ -1152,6 +1157,7 @@ struct psi_trigger *psi_trigger_create(struct psi_group *group,
> > > > >  	t->last_event_time = 0;
> > > > >  	init_waitqueue_head(&t->event_wait);
> > > > >  	kref_init(&t->refcount);
> > > > > +	t->new_stall = false;
> > > > >
> > > > >  	mutex_lock(&group->trigger_lock);
> > > > >
> > > > > --
> > > > > 1.9.1
> > > > >
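
To make the two checks discussed in this thread concrete, here is a minimal user-space sketch in C. It only models what is described above: a trigger is skipped when polling_total equals the current total, and otherwise it fires only if growth since the window's start_value reaches the threshold. This is an illustration under assumptions, not the kernel code: every sketch_* name is hypothetical, and the real window_update() also handles window expiry and prev_growth, which are left out here.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical, simplified model; not the kernel's struct psi_window. */
struct sketch_window {
	uint64_t start_value;	/* total stall time at window start */
};

/* Hypothetical, simplified model; not the kernel's struct psi_trigger. */
struct sketch_trigger {
	struct sketch_window win;
	uint64_t threshold;	/* growth that must be reached to fire */
	uint64_t polling_total;	/* total observed at the previous poll */
};

/* Growth within the current window: current total minus the start value. */
static uint64_t sketch_growth(const struct sketch_window *win, uint64_t total)
{
	/* Stall totals are assumed to only ever increase. */
	assert(total >= win->start_value);
	return total - win->start_value;
}

/*
 * One polling step; returns true if the trigger would fire.
 * First check: skip when no new stall was recorded since the last poll.
 * Second check: fire only when window growth reaches the threshold.
 */
static bool sketch_poll(struct sketch_trigger *t, uint64_t total)
{
	if (t->polling_total == total)
		return false;	/* no new stall, growth is not evaluated */

	t->polling_total = total;
	return sketch_growth(&t->win, total) >= t->threshold;
}
```

With start_value = 100 and threshold = 50, polling at total = 120 gives growth 20 and no event, polling again at 120 is skipped by the equality check, and polling at 160 gives growth 60 and fires. Rate limiting via last_event_time is deliberately not modeled.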