References: <20210421062644.68331-1-songmuchun@bytedance.com>
From: Muchun Song
Date: Wed, 21 Apr 2021 17:50:06 +0800
Subject: Re: [External] Re: [PATCH] mm: memcontrol: fix root_mem_cgroup charging
To: Michal Hocko
Cc: Roman Gushchin, Johannes Weiner, Andrew Morton,
 Shakeel Butt, Vladimir Davydov, LKML, Linux Memory Management List,
 Xiongchun duan, fam.zheng@bytedance.com

On Wed, Apr 21, 2021 at 3:34 PM Michal Hocko wrote:
>
> On Wed 21-04-21 14:26:44, Muchun Song wrote:
> > The below scenario can cause the page counters of the root_mem_cgroup
> > to be out of balance.
> >
> > CPU0:                                   CPU1:
> >
> > objcg = get_obj_cgroup_from_current()
> > obj_cgroup_charge_pages(objcg)
> >                                         memcg_reparent_objcgs()
> >                                             // reparent to root_mem_cgroup
> >                                             WRITE_ONCE(iter->memcg, parent)
> >     // memcg == root_mem_cgroup
> >     memcg = get_mem_cgroup_from_objcg(objcg)
> >     // do not charge to the root_mem_cgroup
> >     try_charge(memcg)
> >
> > obj_cgroup_uncharge_pages(objcg)
> >     memcg = get_mem_cgroup_from_objcg(objcg)
> >     // uncharge from the root_mem_cgroup
> >     page_counter_uncharge(&memcg->memory)
> >
> > This can cause the page counter to be less than the actual value,
> > Although we do not display the value (mem_cgroup_usage) so there
> > shouldn't be any actual problem, but there is a WARN_ON_ONCE in
> > the page_counter_cancel(). Who knows if it will trigger? So it
> > is better to fix it.
>
> The changelog doesn't explain the fix and why you have chosen to charge
> kmem objects to root memcg and left all other try_charge users intact.

The object cgroup is special because its pages can be reparented.
Only the users of the objcg APIs need to be fixed.

> The reason is likely that those are not reparented now but that just
> adds an inconsistency.
>
> Is there any reason you haven't simply matched obj_cgroup_uncharge_pages
> to check for the root memcg and bail out early?

Because obj_cgroup_uncharge_pages() has to uncharge pages from the root
memcg unconditionally. Some pages can be reparented to the root memcg,
and to keep the root memcg's page counter correct we have to uncharge
those pages from it. So the uncharge path does not check whether the
page belongs to the root memcg. Given that, we have to make sure that
the root memcg's page counter is also incremented when such a page is
charged. I think the diagram in the commit log illustrates the problem
well.

Thanks.

>
> > Signed-off-by: Muchun Song
> > ---
> >  mm/memcontrol.c | 17 ++++++++++++-----
> >  1 file changed, 12 insertions(+), 5 deletions(-)
> >
> > diff --git a/mm/memcontrol.c b/mm/memcontrol.c
> > index 1e68a9992b01..81b54bd9b9e0 100644
> > --- a/mm/memcontrol.c
> > +++ b/mm/memcontrol.c
> > @@ -2686,8 +2686,8 @@ void mem_cgroup_handle_over_high(void)
> >  	css_put(&memcg->css);
> >  }
> >
> > -static int try_charge(struct mem_cgroup *memcg, gfp_t gfp_mask,
> > -		      unsigned int nr_pages)
> > +static int __try_charge(struct mem_cgroup *memcg, gfp_t gfp_mask,
> > +			unsigned int nr_pages)
> >  {
> >  	unsigned int batch = max(MEMCG_CHARGE_BATCH, nr_pages);
> >  	int nr_retries = MAX_RECLAIM_RETRIES;
> > @@ -2699,8 +2699,6 @@ static int try_charge(struct mem_cgroup *memcg, gfp_t gfp_mask,
> >  	bool drained = false;
> >  	unsigned long pflags;
> >
> > -	if (mem_cgroup_is_root(memcg))
> > -		return 0;
> >  retry:
> >  	if (consume_stock(memcg, nr_pages))
> >  		return 0;
> > @@ -2880,6 +2878,15 @@ static int try_charge(struct mem_cgroup *memcg, gfp_t gfp_mask,
> >  	return 0;
> >  }
> >
> > +static inline int try_charge(struct mem_cgroup *memcg, gfp_t gfp_mask,
> > +			     unsigned int nr_pages)
> > +{
> > +	if (mem_cgroup_is_root(memcg))
> > +		return 0;
> > +
> > +	return __try_charge(memcg, gfp_mask, nr_pages);
> > +}
> > +
> >  #if defined(CONFIG_MEMCG_KMEM) || defined(CONFIG_MMU)
> >  static void cancel_charge(struct mem_cgroup *memcg, unsigned int nr_pages)
> >  {
> > @@ -3125,7 +3132,7 @@ static int obj_cgroup_charge_pages(struct obj_cgroup *objcg, gfp_t gfp,
> >
> >  	memcg = get_mem_cgroup_from_objcg(objcg);
> >
> > -	ret = try_charge(memcg, gfp, nr_pages);
> > +	ret = __try_charge(memcg, gfp, nr_pages);
> >  	if (ret)
> >  		goto out;
> >
> > --
> > 2.11.0
>
> --
> Michal Hocko
> SUSE Labs