From: Vasily Averin
Date: Mon, 30 May 2022 16:09:00 +0300
Subject: Re: [PATCH mm v3 0/9] memcg: accounting for objects allocated by mkdir cgroup
To: Michal Hocko
Cc: Andrew Morton, kernel@openvz.org, linux-kernel@vger.kernel.org, linux-mm@kvack.org, Shakeel Butt, Roman Gushchin, Michal Koutný, Vlastimil Babka, Muchun Song, cgroups@vger.kernel.org
References: <06505918-3b8a-0ad5-5951-89ecb510138e@openvz.org> <3e1d6eab-57c7-ba3d-67e1-c45aa0dfa2ab@openvz.org>

On 5/30/22 14:55, Michal Hocko wrote:
> On Mon 30-05-22 14:25:45, Vasily Averin wrote:
>> Below are the tracing results of mkdir /sys/fs/cgroup/vvs.test on a
>> 4-CPU VM with Fedora and a self-compiled upstream kernel. The calculations
>> are not precise: they depend on kernel config options, number of CPUs
>> and enabled controllers, and they ignore possible page allocations etc.
>> However, this is enough to clarify the general situation.
>> All allocations are split into:
>> - common part, always called for each cgroup type
>> - per-cgroup allocations
>>
>> In each group we consider 2 corner cases:
>> - usual allocations, important for 1-2 CPU nodes/VMs
>> - percpu allocations, important for 'big irons'
>>
>> common part:  ~11Kb  +  318 bytes percpu
>> memcg:        ~17Kb  + 4692 bytes percpu
>> cpu:          ~2.5Kb + 1036 bytes percpu
>> cpuset:       ~3Kb   +   12 bytes percpu
>> blkcg:        ~3Kb   +   12 bytes percpu
>> pid:          ~1.5Kb +   12 bytes percpu
>> perf:         ~320b  +   60 bytes percpu
>> -------------------------------------------
>> total:        ~38Kb  + 6142 bytes percpu
>> currently accounted:   4668 bytes percpu
>>
>> - It is important to account the usual allocations made in the
>> common part, because almost all cgroup-specific allocations
>> are small. The one exception is the memory cgroup, which allocates
>> a few huge objects that should be accounted.
>> - Percpu allocations made in the common part and in the memcg and cpu
>> cgroups should be accounted; the rest are small and can be ignored.
>> - KERNFS objects are allocated both in the common part and in most
>> cgroups.
>>
>> Details can be found here:
>> https://lore.kernel.org/all/d28233ee-bccb-7bc3-c2ec-461fd7f95e6a@openvz.org/
>>
>> I checked the other cgroup types and found that they can all be ignored.
>> Additionally, I found an allocation of struct rt_rq made in the cpu cgroup
>> when CONFIG_RT_GROUP_SCHED is enabled; it allocates a huge (~1700 bytes)
>> percpu structure and should be accounted too.
>
> One thing that the changelog is missing is an explanation of why we need
> to account those objects. Users are usually not empowered to create
> cgroups arbitrarily. Or at least they shouldn't be, because we can expect
> more problems to happen.
>
> Could you clarify this please?

The problem is relevant for OS-level containers such as LXC or OpenVZ.
They are widely used for hosting and allow untrusted end users to run
containers. Root inside such a container is able to create cgroups inside
its own container and consume host memory without proper accounting.

Thank you,
	Vasily Averin
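
For a rough sense of scale, the figures quoted above can be turned into a per-mkdir estimate as a function of CPU count. This is only a sketch under stated assumptions: 1Kb is taken as 1024 bytes, every percpu figure (including the "currently accounted" one) is assumed to be multiplied by the number of CPUs, and the names ALLOCATIONS, ACCOUNTED_PERCPU and mkdir_cost are hypothetical, not kernel interfaces. The input numbers are the approximate ones from the quoted trace, so the results are approximate as well.

```python
# Approximate figures from the quoted trace: (regular bytes, percpu bytes).
# Assumption: 1Kb = 1024 bytes.
ALLOCATIONS = {
    "common": (11 * 1024, 318),
    "memcg": (17 * 1024, 4692),
    "cpu": (2560, 1036),       # ~2.5Kb
    "cpuset": (3 * 1024, 12),
    "blkcg": (3 * 1024, 12),
    "pid": (1536, 12),         # ~1.5Kb
    "perf": (320, 60),
}

# "currently accounted: 4668 bytes percpu" from the quoted summary.
ACCOUNTED_PERCPU = 4668


def mkdir_cost(ncpus):
    """Return (total_bytes, accounted_bytes) for one cgroup mkdir.

    Assumption: each percpu figure is consumed once per CPU.
    """
    regular = sum(r for r, _ in ALLOCATIONS.values())
    percpu = sum(p for _, p in ALLOCATIONS.values())
    total = regular + percpu * ncpus
    accounted = ACCOUNTED_PERCPU * ncpus
    return total, accounted


if __name__ == "__main__":
    for ncpus in (1, 4, 64):
        total, accounted = mkdir_cost(ncpus)
        print(f"{ncpus:3d} CPUs: total ~{total} bytes, "
              f"accounted {accounted}, unaccounted ~{total - accounted}")
```

Under these assumptions the percpu column sums to the 6142 bytes given in the quoted total, and the unaccounted part is dominated by the regular allocations on small machines, which matches the two corner cases described in the quoted text.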