References: <2E04DD7753BE0E4ABABF0B664610AD6F2620CAF7@dggeml528-mbx.china.huawei.com>
 <20200721174126.GA271870@cmpxchg.org>
 <20200721184959.GA8266@carbon.DHCP.thefacebook.com>
In-Reply-To: <20200721184959.GA8266@carbon.DHCP.thefacebook.com>
From: Shakeel Butt
Date: Tue, 21 Jul 2020 12:12:58 -0700
Subject: Re: PROBLEM: cgroup cost too much memory when transfer small files to tmpfs
To: Roman Gushchin
Cc: Johannes Weiner, jingrui, "tj@kernel.org", Lizefan, "mhocko@kernel.org",
 "vdavydov.dev@gmail.com", "akpm@linux-foundation.org", "linux-mm@kvack.org",
 "cgroups@vger.kernel.org", "linux-kernel@vger.kernel.org", caihaomin,
 "Weiwei (N)", guro@cmpxchg.org

On Tue, Jul 21, 2020 at 11:51 AM Roman Gushchin wrote:
>
> On Tue, Jul 21, 2020 at 01:41:26PM -0400, Johannes Weiner wrote:
> > On Tue, Jul 21, 2020 at 11:19:52AM +0000, jingrui wrote:
> > > Cc: Johannes Weiner; Michal Hocko; Vladimir Davydov
> > >
> > > Thanks.
> > >
> > > ---
> > > PROBLEM: cgroup cost too much memory when transfer small files to tmpfs.
> > >
> > > keywords: cgroup PERCPU/memory cost too much.
> > >
> > > description:
> > >
> > > We send small files from node-A to node-B tmpfs /tmp directory using sftp. On
> > > node-B the systemd configured with pam on like below.
> > >
> > > cat /etc/pam.d/password-auth | grep systemd
> > > -session optional pam_systemd.so
> > >
> > > So when transfer a file, a systemd session is created, that means a cgroup is
> > > created, then file saved at /tmp will associated with a cgroup object. After
> > > file transferred, session and cgroup-dir will be removed, but the file in /tmp
> > > still associated with the cgroup object. The PERCPU memory in cgroup/css object
> > > cost a lot(about 0.5MB/per-cgroup-object) on 200/cpus machine.
> >
> > CC Roman who had a patch series to free all this extended (percpu)
> > memory upon cgroup deletion:
> >
> > https://lore.kernel.org/patchwork/cover/1050508/
> >
> > It looks like it never got merged for some reason.
>
> The mentioned patchset can make the problem less noticeable, but can't solve it completely.
> It has never been merged, because the dying cgroup problem was mostly solved by other methods:
> slab memory reparenting and various reclaim fixes. So there was no more reason to complicate
> the code to release the memcg memory early.
>
> The overhead of creating and destroying a new memory cgroup for a transfer of a small
> file will be noticeable anyway. So IMO the solution is to use a single cgroup for all
> transfers. I don't know if systemd supports such mode out of the box, but it shouldn't
> be hard to add it.
>
> But also I wonder if we need a special tmpfs mount option, something like "noaccount".
> Not only for this specific case, but also for the case when tmpfs is extensively
> shared between multiple cgroups or if it's used to pass some data from one cgroup
> to another, or if we care about the performance more than about the accounting;
> in other words for cases where the accounting makes more harm than good.
>

Internally we actually have a tmpfs mount option "memcg=" which charges
all the memory of the tmpfs files on that mount to the given memcg. The
motivation is tmpfs files that are shared between multiple cgroups. One
concrete use case is the shared memory used for communication between
the application and the user space network driver [1].

"memcg=root" can be used as a "noaccount" option.

[1] https://sosp19.rcs.uwaterloo.ca/slides/marty.pdf
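To get a rough idea of how much memory such removed-but-still-referenced cgroups might be pinning on a given machine, something like the sketch below could help. It assumes cgroup v2 mounted at /sys/fs/cgroup (the original report may well be on cgroup v1, where this file does not exist) and reads the nr_dying_descendants counter from cgroup.stat. The 0.5MB per cgroup figure is just the number quoted in the report for a 200-CPU machine, not a measured constant, and the function names are made up for illustration.

```python
from pathlib import Path

# Rough per-cgroup percpu cost quoted in the report for a 200-CPU machine.
# This is an assumption taken from the thread, not a kernel constant.
PERCPU_BYTES_PER_CGROUP = 512 * 1024


def parse_cgroup_stat(text):
    """Parse the 'key value' lines of a cgroup v2 cgroup.stat file into a dict."""
    stats = {}
    for line in text.splitlines():
        parts = line.split()
        if len(parts) == 2 and parts[1].isdigit():
            stats[parts[0]] = int(parts[1])
    return stats


def estimate_dying_overhead(stats, bytes_per_cgroup=PERCPU_BYTES_PER_CGROUP):
    """Estimate bytes pinned by dying cgroups (removed but still referenced)."""
    return stats.get("nr_dying_descendants", 0) * bytes_per_cgroup


if __name__ == "__main__":
    # Assumes the cgroup v2 hierarchy is mounted at the usual location.
    stat_file = Path("/sys/fs/cgroup/cgroup.stat")
    if stat_file.exists():
        stats = parse_cgroup_stat(stat_file.read_text())
        mib = estimate_dying_overhead(stats) / (1024 * 1024)
        print(f"dying cgroups: {stats.get('nr_dying_descendants', 0)}, "
              f"estimated pinned memory: {mib:.1f} MiB")
    else:
        print("cgroup.stat not found; is cgroup v2 mounted at /sys/fs/cgroup?")
```

This is only an upper-level estimate: the real per-cgroup cost depends on the kernel version, the number of possible CPUs and which controllers are enabled.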