From: Yosry Ahmed <yosryahmed@google.com>
Date: Wed, 16 Mar 2022 09:11:07 -0700
Subject: Re: [RFC bpf-next] Hierarchical Cgroup Stats Collection Using BPF
To: Song Liu <song@kernel.org>
Cc: Alexei Starovoitov, Andrii Nakryiko, Daniel Borkmann, Johannes Weiner,
    Hao Luo, Shakeel Butt, Stanislav Fomichev, David Rientjes, bpf,
    KP Singh, cgroups@vger.kernel.org, Linux-MM

On Tue, Mar 15, 2022 at 11:05 PM Song Liu <song@kernel.org> wrote:
> On Wed, Mar 9, 2022 at 12:27 PM Yosry Ahmed <yosryahmed@google.com> wrote:
> >
> [...]
> >
> > The map usage by BPF programs and integration with rstat can be as follows:
> > - Internally, each map entry has per-cpu arrays, a total array, and a
> >   pending array. BPF programs and user space only see one array.
> > - The update interface is disabled. BPF programs use helpers to modify
> >   elements. Internally, the modifications are made to per-cpu arrays,
> >   and invoke a call to cgroup_bpf_updated() or an equivalent.
> > - Lookups (from BPF programs or user space) invoke an rstat flush and
> >   read from the total array.
>
> Lookups invoke a rstat flush, so we still walk every node of a subtree for
> each lookup, no? So the actual cost should be similar than walking the
> subtree with some BPF program? Did I miss something?
>

Hi Song,

Thanks for taking the time to read my proposal.

The rstat framework maintains a tree that contains only updated cgroups.
An rstat flush only traverses this tree, not the cgroup subtree/hierarchy.

This also ensures that consecutive readers do not have to do any
traversals unless new updates happened, because the first reader will
have already flushed the stats.

> Thanks,
> Song
>
> > - In cgroup_rstat_flush_locked() flush BPF stats as well.
> >
> [...]
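
To make the point about flush cost concrete, here is a toy user-space model in Python. It is only a sketch under simplifying assumptions, not kernel code: a single updated tree stands in for the per-cpu state, and the names Cgroup, updated(), flush() and lookup() are made up for illustration. It shows that a flush visits only cgroups that were updated (plus their ancestors), and that a second reader with no new updates visits nothing.

```python
class Cgroup:
    """Toy cgroup node; one updated tree stands in for the kernel's per-cpu state."""

    def __init__(self, name, parent=None):
        self.name = name
        self.parent = parent
        self.pending = 0               # delta not yet folded into total
        self.total = 0                 # hierarchical total: self plus descendants
        self.updated_children = set()  # children currently on the updated tree
        self.on_updated_tree = False


def updated(cg, delta):
    """Record a stat change and link cg (and its ancestors) into the updated tree."""
    cg.pending += delta
    node = cg
    # Stop at the first ancestor already linked: everything above it is linked too.
    while node is not None and not node.on_updated_tree:
        node.on_updated_tree = True
        if node.parent is not None:
            node.parent.updated_children.add(node)
        node = node.parent


def flush(cg):
    """Fold pending deltas in cg's subtree into totals; return cgroups visited."""
    if not cg.on_updated_tree:
        return 0                       # nothing changed below cg since last flush
    visited = 1
    for child in list(cg.updated_children):
        visited += flush(child)        # children push their deltas into cg.pending
    cg.total += cg.pending
    if cg.parent is not None:
        cg.parent.pending += cg.pending
        cg.parent.updated_children.discard(cg)
    cg.pending = 0
    cg.on_updated_tree = False
    return visited


def lookup(cg):
    """Flush first, then read the total, mirroring the lookup path in the proposal."""
    visited = flush(cg)
    return cg.total, visited


# Seven cgroups in the hierarchy, but only a1 is updated.
root = Cgroup("root")
a, b = Cgroup("a", root), Cgroup("b", root)
a1, a2 = Cgroup("a1", a), Cgroup("a2", a)
b1, b2 = Cgroup("b1", b), Cgroup("b2", b)

updated(a1, 5)
print(lookup(root))  # (5, 3): only root, a and a1 are visited
print(lookup(root))  # (5, 0): second reader has nothing to traverse
```

In this model the cost of a lookup scales with the number of cgroups updated since the last flush, not with the size of the subtree being read.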