Date: Thu, 8 Sep 2022 09:13:07 -0700
From: Roman Gushchin
To: Yafang Shao
Cc: Tejun Heo, Alexei Starovoitov, Daniel Borkmann, Andrii Nakryiko, Martin Lau, Song Liu, Yonghong Song, john fastabend, KP Singh, Stanislav Fomichev, Hao Luo, jolsa@kernel.org, Johannes Weiner, Michal Hocko, Shakeel Butt, Muchun Song, Andrew Morton, Zefan Li, Cgroups, netdev, bpf, Linux MM
Subject: Re: [PATCH bpf-next v3 00/13] bpf: Introduce selectable memcg for bpf map
References: <20220902023003.47124-1-laoar.shao@gmail.com>

On Thu, Sep 08, 2022 at 10:37:02AM +0800, Yafang Shao wrote:
> On Thu, Sep 8, 2022 at 6:29 AM Roman Gushchin wrote:
> >
> > On Wed, Sep 07, 2022 at 05:43:31AM -1000, Tejun Heo wrote:
> > > Hello,
> > >
> > > On Fri, Sep 02, 2022 at 02:29:50AM +0000, Yafang Shao wrote:
> > > ...
> > > > This patchset tries to resolve the above two issues by introducing a
> > > > selectable memcg to limit the bpf memory. Currently we only allow to
> > > > select its ancestor to avoid breaking the memcg hierarchy further.
> > > > Possible use cases of the selectable memcg as follows,
> > >
> > > As discussed in the following thread, there are clear downsides to an
> > > interface which requires the users to specify the cgroups directly.
> > >
> > >  https://lkml.kernel.org/r/YwNold0GMOappUxc@slm.duckdns.org
> > >
> > > So, I don't really think this is an interface we wanna go for. I was hoping
> > > to hear more from memcg folks in the above thread. Maybe ping them in that
> > > thread and continue there?
> > >
>
> Hi Roman,
>
> > As I said previously, I don't like it, because it's an attempt to solve a non
> > bpf-specific problem in a bpf-specific way.
> >
>
> Why do you still insist that bpf_map->memcg is not a bpf-specific
> issue after so many discussions?
> Do you charge the bpf-map's memory the same way as you charge the page
> caches or slabs ?
> No, you don't. You charge it in a bpf-specific way.

The only difference is that we charge the cgroup of the process that
created the map, not the process doing a specific allocation.
Your patchset doesn't change this.
There are pros and cons to this approach; we discussed them back
when bpf memcg accounting was developed. If you want to revisit this,
it's maybe possible (given that a really strong and likely new
motivation appears), but I haven't seen any complaints yet except
from you.

>
> > Yes, memory cgroups are not great for accounting of shared resources, it's well
> > known. This patchset looks like an attempt to "fix" it specifically for bpf maps
> > in a particular cgroup setup. Honestly, I don't think it's worth the added
> > complexity. Especially because a similar behaviour can be achieved simple
> > by placing the task which creates the map into the desired cgroup.
>
> Are you serious ?
> Have you ever read the cgroup doc? Which clearly describe the "No
> Internal Process Constraint".[1]
> Obviously you can't place the task in the desired cgroup, i.e. the parent memcg.

But you can place it into another leaf cgroup. You can delete this leaf
cgroup and your memcg will get reparented. You can attach this process
to the parent cgroup and create a bpf map there before it gets child
cgroups. You can revisit the idea of shared bpf maps that outlive
specific cgroups. Lots of options.

>
> [1] https://www.kernel.org/doc/Documentation/cgroup-v2.txt
>
> > Beatiful? Not. Neither is the proposed solution.
> >
>
> Is it really hard to admit a fault?

Yafang, you posted several versions and so far I haven't seen much
support or excitement from anyone (please correct me if I'm wrong).
It's not like I'm nacking a patchset with many acks, reviews and
supporters.

Still think you're solving an important problem in a reasonable way?
It seems like not many are convinced yet. I'd recommend focusing on
this instead of blaming me.

Thanks!