Date: Mon, 11 Jul 2022 21:39:14 -0700
From: Alexei Starovoitov
To: Michal Hocko
Cc: Shakeel Butt, Matthew Wilcox, Christoph Hellwig, "David S. Miller",
    Daniel Borkmann, Andrii Nakryiko, Tejun Heo, Martin KaFai Lau, bpf,
    Kernel Team, linux-mm, Christoph Lameter, Pekka Enberg, David Rientjes,
    Joonsoo Kim, Andrew Morton, Vlastimil Babka
Subject: Re: [PATCH bpf-next 0/5] bpf: BPF specific memory allocator.
Message-ID: <20220712043914.pxmbm7vockuvpmmh@macbook-pro-3.dhcp.thefacebook.com>
References: <20220706175034.y4hw5gfbswxya36z@MacBook-Pro-3.local>
    <20220706180525.ozkxnbifgd4vzxym@MacBook-Pro-3.local.dhcp.thefacebook.com>
    <20220708174858.6gl2ag3asmoimpoe@macbook-pro-3.dhcp.thefacebook.com>
    <20220708215536.pqclxdqvtrfll2y4@google.com>
    <20220710073213.bkkdweiqrlnr35sv@google.com>

On Mon, Jul 11, 2022 at 02:15:07PM +0200, Michal Hocko wrote:
> On Sun 10-07-22 07:32:13, Shakeel Butt wrote:
> > On Sat, Jul 09, 2022 at 10:26:23PM -0700, Alexei Starovoitov wrote:
> > > On Fri, Jul 8, 2022 at 2:55 PM Shakeel Butt wrote:
> > [...]
> > > >
> > > > Most probably Michal's comment was on free objects sitting in the caches
> > > > (also pointed out by Yosry). Should we drain them on memory pressure /
> > > > OOM or should we ignore them as the amount of memory is not significant?
> > >
> > > Are you suggesting to design a shrinker for 0.01% of the memory
> > > consumed by bpf?
> >
> > No, just claim that the memory sitting on such caches is insignificant.
>
> yes, that is not really clear from the patch description. Earlier you
> have said that the memory consumed might go into GBs. If that is a
> memory that is actively used and not really reclaimable then bad luck.
> There are other users like that in the kernel and this is not a new
> problem. I think it would really help to add a counter to describe both
> the overall memory claimed by the bpf allocator and actively used
> portion of it. If you use our standard vmstat infrastructure then we can
> easily show that information in the OOM report.

OOM report can potentially be extended with info about bpf consumed memory,
but it's not clear whether it will help OOM analysis.
bpftool map show
prints all map data already.
Some devs use bpf to inspect bpf maps for finer details in run-time.
drgn scripts pull that data from crash dumps.
There is no need for new counters.
The idea of bpf specific counters/limits was rejected by memcg folks.

> OK, thanks for the clarification.
> There is still one thing that is not
> really clear to me. Without a proper ownership bound to any process why
> is it desired/helpful to account the memory to a memcg?

The first step is to have a limit. memcg provides it.

> We have discussed something similar in a different email thread and I
> still didn't manage to find time to put all the parts together. But if
> the initiator (or however you call the process which loads the program)
> exits then this might be the last process in the specific cgroup and so
> it can be offlined and mostly invisible to an admin.

Roman already sent a reparenting fix:
https://patchwork.kernel.org/project/netdevbpf/patch/20220711162827.184743-1-roman.gushchin@linux.dev/

> As you have explained there is nothing really actionable on this memory
> by the OOM killer either. So does it actually buy us much to account?

It will be actionable. One step at a time.
In the other thread we've discussed an idea to make memcg selectable when
bpf objects are created. The user might create a special memcg and use it
for all things bpf. This might be the way to provide bpf specific
accounting and limits.
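
Editorial sketch, not part of the original message: one way the
"special memcg for all things bpf" idea could look from user space today.
It assumes cgroup v2 is mounted at /sys/fs/cgroup, that the memory
controller is enabled for the parent cgroup, and that bpf memory is charged
to the memcg of the process that creates the objects. The cgroup name "bpf",
the 1 GiB limit, and both helper names are hypothetical. Selecting a memcg
at object creation time, as discussed above, is only an idea in this thread
and is not shown.

```python
import os
from pathlib import Path
from typing import Optional

# Assumption: cgroup v2 unified hierarchy is mounted here.
CGROUP_ROOT = Path("/sys/fs/cgroup")


def create_bpf_memcg(name: str, limit_bytes: int,
                     root: Path = CGROUP_ROOT) -> Path:
    """Create cgroup `name` under `root` and set its memory.max limit."""
    cg = root / name
    cg.mkdir(exist_ok=True)
    # memory.max is the cgroup v2 hard limit, in bytes. On a real cgroupfs
    # this file only exists if the parent enables the memory controller
    # in its cgroup.subtree_control.
    (cg / "memory.max").write_text(f"{limit_bytes}\n")
    return cg


def enter_memcg(cg: Path, pid: Optional[int] = None) -> None:
    """Move a process (default: the caller) into the cgroup.

    Allocations the process makes afterwards are charged to this cgroup.
    """
    pid = os.getpid() if pid is None else pid
    (cg / "cgroup.procs").write_text(f"{pid}\n")


if __name__ == "__main__":
    # Needs root and a real cgroup v2 mount; "bpf" is a made-up name.
    cg = create_bpf_memcg("bpf", 1 << 30)
    enter_memcg(cg)
    # A loader started from here would create its bpf objects
    # while running inside the "bpf" cgroup.
```

Passing a scratch directory as `root` exercises the same file writes without
touching the real hierarchy, which is handy for a dry run.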