References: <20200913070010.44053-1-songmuchun@bytedance.com>
From: Muchun Song
Date: Tue, 15 Sep 2020 10:46:22 +0800
Subject: Re: [External] Re: [PATCH v3] mm: memcontrol: Add the missing numa_stat interface for cgroup v2
To: Shakeel Butt
Cc: Tejun Heo, Li Zefan, Johannes Weiner, Jonathan Corbet, Michal Hocko, Vladimir Davydov, Andrew Morton, Roman Gushchin, Cgroups, linux-doc@vger.kernel.org, LKML, Linux MM, kernel test robot

On Tue, Sep 15, 2020 at 6:57 AM Shakeel Butt wrote:
>
> On Mon, Sep 14, 2020 at 9:55 AM Muchun Song wrote:
> >
> > On Tue, Sep 15, 2020 at 12:07 AM Shakeel Butt wrote:
> > >
> > > On Sun, Sep 13, 2020 at 12:01 AM Muchun Song wrote:
> > > >
> > > > In the cgroup v1, we have a numa_stat interface. This is useful for
> > > > providing visibility into the numa locality information within an
> > > > memcg since the pages are allowed to be allocated from any physical
> > > > node. One of the use cases is evaluating application performance by
> > > > combining this information with the application's CPU allocation.
> > > > But the cgroup v2 does not. So this patch adds the missing information.
> > > >
> > > > Signed-off-by: Muchun Song
> > > > Suggested-by: Shakeel Butt
> > > > Reported-by: kernel test robot
> > > > ---
> > > [snip]
> > > > +
> > > > +static struct numa_stat numa_stats[] = {
> > > > +	{ "anon", PAGE_SIZE, NR_ANON_MAPPED },
> > > > +	{ "file", PAGE_SIZE, NR_FILE_PAGES },
> > > > +	{ "kernel_stack", 1024, NR_KERNEL_STACK_KB },
> > > > +	{ "shmem", PAGE_SIZE, NR_SHMEM },
> > > > +	{ "file_mapped", PAGE_SIZE, NR_FILE_MAPPED },
> > > > +	{ "file_dirty", PAGE_SIZE, NR_FILE_DIRTY },
> > > > +	{ "file_writeback", PAGE_SIZE, NR_WRITEBACK },
> > > > +#ifdef CONFIG_TRANSPARENT_HUGEPAGE
> > > > +	/*
> > > > +	 * The ratio will be initialized in numa_stats_init(). Because
> > > > +	 * on some architectures, the macro of HPAGE_PMD_SIZE is not
> > > > +	 * constant(e.g. powerpc).
> > > > +	 */
> > > > +	{ "anon_thp", 0, NR_ANON_THPS },
> > > > +#endif
> > > > +	{ "inactive_anon", PAGE_SIZE, NR_INACTIVE_ANON },
> > > > +	{ "active_anon", PAGE_SIZE, NR_ACTIVE_ANON },
> > > > +	{ "inactive_file", PAGE_SIZE, NR_INACTIVE_FILE },
> > > > +	{ "active_file", PAGE_SIZE, NR_ACTIVE_FILE },
> > > > +	{ "unevictable", PAGE_SIZE, NR_UNEVICTABLE },
> > > > +	{ "slab_reclaimable", 1, NR_SLAB_RECLAIMABLE_B },
> > > > +	{ "slab_unreclaimable", 1, NR_SLAB_UNRECLAIMABLE_B },
> > > > +};
> > > > +
> > > > +static int __init numa_stats_init(void)
> > > > +{
> > > > +	int i;
> > > > +
> > > > +	for (i = 0; i < ARRAY_SIZE(numa_stats); i++) {
> > > > +#ifdef CONFIG_TRANSPARENT_HUGEPAGE
> > > > +		if (numa_stats[i].idx == NR_ANON_THPS)
> > > > +			numa_stats[i].ratio = HPAGE_PMD_SIZE;
> > > > +#endif
> > > > +	}
> > >
> > > The for loop seems excessive but I don't really have a good alternative.
> >
> > Yeah, I also have no good alternative. The numa_stats is only initialized
> > once. So there may be no problem :).
> >
> > >
> > > > +
> > > > +	return 0;
> > > > +}
> > > > +pure_initcall(numa_stats_init);
> > > > +
> > > > +static unsigned long memcg_node_page_state(struct mem_cgroup *memcg,
> > > > +					   unsigned int nid,
> > > > +					   enum node_stat_item idx)
> > > > +{
> > > > +	VM_BUG_ON(nid >= nr_node_ids);
> > > > +	return lruvec_page_state(mem_cgroup_lruvec(memcg, NODE_DATA(nid)), idx);
> > > > +}
> > > > +
> > > > +static const char *memory_numa_stat_format(struct mem_cgroup *memcg)
> > > > +{
> > > > +	int i;
> > > > +	struct seq_buf s;
> > > > +
> > > > +	/* Reserve a byte for the trailing null */
> > > > +	seq_buf_init(&s, kmalloc(PAGE_SIZE, GFP_KERNEL), PAGE_SIZE - 1);
> > > > +	if (!s.buffer)
> > > > +		return NULL;
> > > > +
> > > > +	for (i = 0; i < ARRAY_SIZE(numa_stats); i++) {
> > > > +		int nid;
> > > > +
> > > > +		seq_buf_printf(&s, "%s", numa_stats[i].name);
> > > > +		for_each_node_state(nid, N_MEMORY) {
> > > > +			u64 size;
> > > > +
> > > > +			size = memcg_node_page_state(memcg, nid,
> > > > +						     numa_stats[i].idx);
> > > > +			size *= numa_stats[i].ratio;
> > > > +			seq_buf_printf(&s, " N%d=%llu", nid, size);
> > > > +		}
> > > > +		seq_buf_putc(&s, '\n');
> > > > +	}
> > > > +
> > > > +	/* The above should easily fit into one page */
> > > > +	if (WARN_ON_ONCE(seq_buf_putc(&s, '\0')))
> > > > +		s.buffer[PAGE_SIZE - 1] = '\0';
> > >
> > > I think you should follow Michal's recommendation at
> > > http://lkml.kernel.org/r/20200914115724.GO16999@dhcp22.suse.cz
> >
> > Here is different, because the seq_buf_putc(&s, '\n') will not add \0 unless
> > we use seq_buf_puts(&s, "\n").
> >
>
> Why a separate memory_numa_stat_format()? For memory_stat_format(), it
> is called from two places. There is no need to have a separate
> memory_numa_stat_format(). Similarly why not just call seq_printf()
> instead of formatting into a seq_buf?

I was indeed influenced by memory_stat_format(). Thank you for clearing
that up.

-- 
Yours,
Muchun
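
For illustration only, here is a minimal userspace sketch of the direct-printing approach Shakeel suggests above, where each stat line is written straight to the output instead of being formatted into an intermediate seq_buf. It is not kernel code: fprintf() on a FILE * stands in for seq_printf() on a seq_file, and NR_NODES, SKETCH_PAGE_SIZE, the STAT_* indices, node_stat_bytes() and numa_stat_show() are hypothetical names made up for this example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Assumptions for this sketch only; the kernel gets these from its headers. */
#define NR_NODES 2
#define SKETCH_PAGE_SIZE 4096u

/* Hypothetical stand-ins for the node_stat_item indices used in the patch. */
enum sketch_stat_item { STAT_ANON, STAT_FILE, STAT_KERNEL_STACK, NR_STATS };

/* Same shape as the patch's table: name, bytes-per-unit ratio, counter index. */
struct numa_stat {
	const char *name;
	unsigned int ratio;
	enum sketch_stat_item idx;
};

static const struct numa_stat numa_stats[] = {
	{ "anon", SKETCH_PAGE_SIZE, STAT_ANON },
	{ "file", SKETCH_PAGE_SIZE, STAT_FILE },
	{ "kernel_stack", 1024, STAT_KERNEL_STACK },
};

/* Convert one raw per-node counter to bytes (mirrors "size *= ratio"). */
static unsigned long long node_stat_bytes(unsigned long counters[NR_NODES][NR_STATS],
					  unsigned int nid,
					  const struct numa_stat *stat)
{
	assert(nid < NR_NODES);	/* plays the role of VM_BUG_ON(nid >= nr_node_ids) */
	return (unsigned long long)counters[nid][stat->idx] * stat->ratio;
}

/* Write every stat line directly to the stream; no intermediate buffer. */
static void numa_stat_show(FILE *m, unsigned long counters[NR_NODES][NR_STATS])
{
	size_t i;
	unsigned int nid;

	for (i = 0; i < sizeof(numa_stats) / sizeof(numa_stats[0]); i++) {
		fprintf(m, "%s", numa_stats[i].name);
		for (nid = 0; nid < NR_NODES; nid++)
			fprintf(m, " N%u=%llu", nid,
				node_stat_bytes(counters, nid, &numa_stats[i]));
		fputc('\n', m);
	}
}
```

Calling numa_stat_show(stdout, counters) with a per-node counter table prints one line per stat in the "name N0=... N1=..." form used by the patch. Because nothing is staged in a fixed-size buffer, the trailing-NUL question around seq_buf_putc() does not come up in this variant.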