Date: Mon, 11 Jul 2022 14:22:23 +0200
From: Michal Hocko
To: Alexei Starovoitov
Cc: Matthew Wilcox, Christoph Hellwig, davem@davemloft.net,
    daniel@iogearbox.net, andrii@kernel.org, tj@kernel.org, kafai@fb.com,
    bpf@vger.kernel.org, kernel-team@fb.com, linux-mm@kvack.org,
    Christoph Lameter, Pekka Enberg, David Rientjes, Joonsoo Kim,
    Andrew Morton, Vlastimil Babka
Subject: Re: [PATCH bpf-next 0/5] bpf: BPF specific memory allocator.
References: <20220623003230.37497-1-alexei.starovoitov@gmail.com>
    <20220706175034.y4hw5gfbswxya36z@MacBook-Pro-3.local>
    <20220706180525.ozkxnbifgd4vzxym@MacBook-Pro-3.local.dhcp.thefacebook.com>
    <20220708174858.6gl2ag3asmoimpoe@macbook-pro-3.dhcp.thefacebook.com>
In-Reply-To: <20220708174858.6gl2ag3asmoimpoe@macbook-pro-3.dhcp.thefacebook.com>

On Fri 08-07-22 10:48:58, Alexei Starovoitov wrote:
> On Fri, Jul 08, 2022 at 03:41:47PM +0200, Michal Hocko wrote:
[...]
> > Finally it is not really clear to what kind of entity is the life time
> > of these caches bound to. Let's say the system goes OOM, is any process
> > responsible for it and a clean up would be done if it gets killed?
>
> We've been asking these questions for years and have been trying to
> come up with a solution.
> bpf progs are not analogous to user space processes.
> There are bpf progs that function completely without user space component.
> bpf progs are pretty close to be full featured kernel modules with
> the difference that bpf progs are safe, portable and users have
> full visibility into them (source code, line info, type info, etc)
> They are not binary blobs unlike kernel modules.
> But from OOM perspective they're pretty much like .ko-s.
> Which kernel module would you force unload when system is OOMing ?
> Force unloading ko-s will likely crash the system.
> Force unloading bpf progs maybe equally bad. The system won't crash,
> but it may be a sorrow state.
> The bpf could have been doing security
> enforcement or network firewall or providing key insights to critical
> user space components like systemd or health check daemon.
> We've been discussing ideas on how to rank and auto cleanup
> the system state when progs have to be unloaded. Some sort of
> destructor mechanism. Fingers crossed we will have it eventually.
> bpf infra keeps track of everything, of course.
> Technically we can detach, unpin and unload everything and all memory
> will be returned back to the system.
> Anyhow not a new problem. Orthogonal to this patch set.
> bpf progs have been doing memory allocation from day one. 8 years ago.
> This patch set is trying to make it 100% safe.
> Currently it's 99% safe.

OK, thanks for the clarification. There is still one thing that is not
really clear to me. Without proper ownership bound to any process, why
is it desirable/helpful to account the memory to a memcg?

We have discussed something similar in a different email thread and I
still haven't managed to find time to put all the parts together. But if
the initiator (or however you call the process which loads the program)
exits, then it might have been the last process in that specific cgroup,
so the cgroup can be offlined and become mostly invisible to an admin.

As you have explained, there is nothing really actionable on this memory
by the OOM killer either. So does it actually buy us much to account?
--
Michal Hocko
SUSE Labs