From: Shakeel Butt
Date: Wed, 28 Apr 2021 16:32:07 -0700
Subject: Re: [PATCH 0/9] Shrink the list lru size on memory cgroup removal
To: Muchun Song
Cc: Matthew Wilcox, Andrew Morton, Johannes Weiner, Michal Hocko,
    Vladimir Davydov, Roman Gushchin, Yang Shi, alexs@kernel.org,
    Alexander Duyck, Wei Yang, linux-fsdevel, LKML, Linux MM
References: <20210428094949.43579-1-songmuchun@bytedance.com>
In-Reply-To: <20210428094949.43579-1-songmuchun@bytedance.com>

On Wed, Apr 28, 2021 at 2:54 AM Muchun Song wrote:
>
> In our server, we found a suspected memory leak problem. The kmalloc-32
> consumes more than 6GB of memory. Other kmem_caches consume less than 2GB
> memory.
>
> After our in-depth analysis, the memory consumption of kmalloc-32 slab
> cache is the cause of list_lru_one allocation.
>
> crash> p memcg_nr_cache_ids
> memcg_nr_cache_ids = $2 = 24574
>
> memcg_nr_cache_ids is very large and memory consumption of each list_lru
> can be calculated with the following formula.
>
> num_numa_node * memcg_nr_cache_ids * 32 (kmalloc-32)
>
> There are 4 numa nodes in our system, so each list_lru consumes ~3MB.
>
> crash> list super_blocks | wc -l
> 952
>
> Every mount will register 2 list lrus, one is for inode, another is for
> dentry. There are 952 super_blocks. So the total memory is 952 * 2 * 3
> MB (~5.6GB). But the number of memory cgroup is less than 500. So I
> guess more than 12286 containers have been deployed on this machine (I
> do not know why there are so many containers, it may be a user's bug or
> the user really want to do that). But now there are less than 500
> containers in the system. And memcg_nr_cache_ids has not been reduced
> to a suitable value. This can waste a lot of memory. If we want to reduce
> memcg_nr_cache_ids, we have to reboot the server.
> This is not what we want.
>
> So this patchset will dynamically adjust the value of memcg_nr_cache_ids
> to keep healthy memory consumption. In this case, we may be able to restore
> a healthy environment even if the users have created tens of thousands of
> memory cgroups and then destroyed those memory cgroups. This patchset also
> contains some code simplification.
>

There was a recent discussion [1] on the same issue. Did you get a chance
to take a look at it? I have not gone through this patch series yet but
will do so in the next couple of weeks.

[1] https://lore.kernel.org/linux-fsdevel/20210405054848.GA1077931@in.ibm.com/
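
As a rough check of the numbers in the quoted cover letter, the sketch below
plugs the reported values (4 NUMA nodes, memcg_nr_cache_ids = 24574, 952
super_blocks, 2 list lrus per mount) into the quoted formula. The function
names are hypothetical and exist only for illustration. The result is an
estimate that assumes one 32-byte kmalloc-32 object per (node, cache id)
pair and ignores any slab overhead.

```python
# Back-of-the-envelope estimate based on the formula quoted above:
#   num_numa_node * memcg_nr_cache_ids * 32 (kmalloc-32)
# Function names are hypothetical; the input values come from the quoted text.

KMALLOC_32_BYTES = 32  # object size of the kmalloc-32 slab cache


def list_lru_bytes(numa_nodes, memcg_nr_cache_ids, obj_size=KMALLOC_32_BYTES):
    """Estimated bytes consumed by a single list_lru."""
    return numa_nodes * memcg_nr_cache_ids * obj_size


def total_list_lru_bytes(super_blocks, per_lru_bytes, lrus_per_mount=2):
    """Estimated bytes across all mounts (one inode lru and one dentry lru each)."""
    return super_blocks * lrus_per_mount * per_lru_bytes


per_lru = list_lru_bytes(numa_nodes=4, memcg_nr_cache_ids=24574)
total = total_list_lru_bytes(super_blocks=952, per_lru_bytes=per_lru)

print(f"per list_lru: {per_lru} bytes ({per_lru / 2**20:.2f} MiB)")
print(f"total:        {total} bytes ({total / 2**30:.2f} GiB)")
```

This comes out at roughly 3 MiB per list_lru and about 5.6 GiB in total,
which agrees with the ~3MB and ~5.6GB figures in the quoted text.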