From: Muchun Song
Date: Sat, 18 Sep 2021 15:55:02 +0800
Subject: Re: [PATCH v2 00/13] Use obj_cgroup APIs to charge the LRU pages
To: Roman Gushchin
Cc: Johannes Weiner, Michal Hocko, Andrew Morton, Shakeel Butt,
    Vladimir Davydov, LKML, Linux Memory Management List, Xiongchun duan,
    fam.zheng@bytedance.com, "Singh, Balbir", Yang Shi, Alex Shi,
    Muchun Song, Qi Zheng
References: <20210916134748.67712-1-songmuchun@bytedance.com>

On Sat, Sep 18, 2021 at 8:13 AM Roman Gushchin wrote:
>
> On Fri, Sep 17, 2021 at 06:49:21PM +0800, Muchun Song wrote:
> > On Fri, Sep 17, 2021 at 9:29 AM Roman Gushchin wrote:
> > >
> > > Hi Muchun!
> > >
> > > On Thu, Sep 16, 2021 at 09:47:35PM +0800, Muchun Song wrote:
> > > > This version is rebased over linux 5.15-rc1, because Shakeel has asked me
> > > > if I could do that. I have also reworked some code as Roman suggested in
> > > > this version. I have not removed the Acked-by tags which are from Roman,
> > > > because this version is not based on the folio-related changes. If Roman
> > > > wants me to do this, please let me know, thanks.
> > >
> > > I'm fine with this, thanks for clarifying.
> > >
> > >
> > > > Since the following patchsets were applied, all kernel memory is charged
> > > > with the new obj_cgroup APIs.
> > > >
> > > > [v17,00/19] The new cgroup slab memory controller[1]
> > > > [v5,0/7] Use obj_cgroup APIs to charge kmem pages[2]
> > > >
> > > > But user memory allocations (LRU pages) pinning memcgs for a long time -
> > > > it exists at a larger scale and is causing recurring problems in the real
> > > > world: page cache doesn't get reclaimed for a long time, or is used by the
> > > > second, third, fourth, ... instance of the same job that was restarted into
> > > > a new cgroup every time. Unreclaimable dying cgroups pile up, waste memory,
> > > > and make page reclaim very inefficient.
> > >
> > > I've an idea: what if we use struct list_lru_memcg as an intermediate object
> > > between an individual page and struct mem_cgroup?
> > >
> > > It could contain a pointer to a memory cgroup structure (not even sure if a
> > > reference is needed), and a lru page can contain a pointer to the lruvec instead
> > > of memcg/objcg.
>
> lruvec_memcg I mean.

Thanks for your clarification.

>
> >
> > Hi Roman,
> >
> > If I understand properly, here you mean the struct page has a pointer
> > to the struct lruvec not struct list_lru_memcg. What's the functionality
> > of the struct list_lru_memcg? Would you mind exposing more details?
>
> So the basic idea is simple: a lru page charged to a memcg is associated with
> a per-memcg lruvec (list_lru_memcg), which is associated with a memory cgroup.
> And after your patches there is a second link of associations: page to objcg
> to memcg:
>
> 1) page->objcg->memcg
> 2) page->list_lru_memcg->memcg
>
> (those are not necessarily direct pointers, but generally speaking, relations).
>
> My gut feeling is that if we can merge them into just 2) and use list_lru_memcg
> as an intermediate object between pages and memory cgroups, the whole thing can
> be more efficient and beautiful.
>
> Yes, on reparenting we'd need to scan over all pages in the lru list, but
> hopefully we can do it from a worker context.
> And it's not such a big deal as with slab objects, where we simply had no
> list of all objects.

struct list_lru_memcg seems redundant: it just contains a pointer to
struct mem_cgroup. Since we would need to update each page->lruvec_memcg
on reparenting anyway, why not update page->memcg_data directly to point
to the parent memcg? The update of page->lruvec_memcg would have to be
done under both the child's and the parent's lruvec locks, right? I
suppose scanning over all pages may be a problem if there are many of
them.

Thanks.

>
> Again, I'm not 100% sure if it's possible and worth it, so it shouldn't block
> your patchset if everybody else likes it.
>
> Thanks
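
To make the cost difference discussed above concrete, here is a minimal
user-space sketch of the two reparenting strategies. It is only an
illustration under simplifying assumptions: the struct layouts, the
direct_memcg field, and the helpers reparent_objcg() and reparent_pages()
are hypothetical and are not the kernel's actual definitions. Locking is
omitted entirely.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified stand-ins; NOT the kernel's real structures. */
struct memcg {
	const char *name;
	struct memcg *parent;
};

/* Shared indirection object: many pages point at the same objcg. */
struct objcg {
	struct memcg *memcg;
};

struct page {
	struct objcg *objcg;        /* page -> objcg -> memcg */
	struct memcg *direct_memcg; /* illustrative per-page pointer */
};

/*
 * Indirection approach: redirect the shared objcg to the parent once.
 * Every page that points at this objcg follows automatically.
 * Returns the number of pointer updates performed.
 */
size_t reparent_objcg(struct objcg *objcg)
{
	assert(objcg->memcg->parent != NULL);
	objcg->memcg = objcg->memcg->parent;
	return 1;
}

/*
 * Per-page approach: walk all pages and rewrite each pointer that still
 * refers to the dying child. In the kernel this walk would presumably
 * need both the child's and the parent's lruvec locks, as raised above;
 * that is not modelled here.
 * Returns the number of pointer updates performed.
 */
size_t reparent_pages(struct page *pages, size_t n, struct memcg *child)
{
	size_t updates = 0;

	assert(child->parent != NULL);
	for (size_t i = 0; i < n; i++) {
		if (pages[i].direct_memcg == child) {
			pages[i].direct_memcg = child->parent;
			updates++;
		}
	}
	return updates;
}
```

In this toy model the first helper always does one update regardless of
how many pages are charged, while the second does one update per page,
which is the scanning cost the reply above is worried about.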