Date: Tue, 12 Jul 2022 11:40:18 -0700
From: Alexei Starovoitov
To: Michal Hocko
Cc: Shakeel Butt, Matthew Wilcox, Christoph Hellwig, "David S. Miller",
 Daniel Borkmann, Andrii Nakryiko, Tejun Heo, Martin KaFai Lau, bpf,
 Kernel Team, linux-mm, Christoph Lameter, Pekka Enberg, David Rientjes,
 Joonsoo Kim, Andrew Morton, Vlastimil Babka
Subject: Re: [PATCH bpf-next 0/5] bpf: BPF specific memory allocator.
Message-ID: <20220712184018.i3cisffxr7k3aei7@MacBook-Pro-3.local.dhcp.thefacebook.com>
References: <20220706180525.ozkxnbifgd4vzxym@MacBook-Pro-3.local.dhcp.thefacebook.com>
 <20220708174858.6gl2ag3asmoimpoe@macbook-pro-3.dhcp.thefacebook.com>
 <20220708215536.pqclxdqvtrfll2y4@google.com>
 <20220710073213.bkkdweiqrlnr35sv@google.com>
 <20220712043914.pxmbm7vockuvpmmh@macbook-pro-3.dhcp.thefacebook.com>

On Tue, Jul 12, 2022 at 09:40:13AM +0200, Michal Hocko wrote:
> On Mon 11-07-22 21:39:14, Alexei Starovoitov wrote:
> > On Mon, Jul 11, 2022 at 02:15:07PM +0200, Michal Hocko wrote:
> > > On Sun 10-07-22 07:32:13, Shakeel Butt wrote:
> > > > On Sat, Jul 09, 2022 at 10:26:23PM -0700, Alexei Starovoitov wrote:
> > > > > On Fri, Jul 8, 2022 at 2:55 PM Shakeel Butt wrote:
> > > > [...]
> > > > > >
> > > > > > Most probably Michal's comment was on free objects sitting in the caches
> > > > > > (also pointed out by Yosry). Should we drain them on memory pressure /
> > > > > > OOM or should we ignore them as the amount of memory is not significant?
> > > > >
> > > > > Are you suggesting to design a shrinker for 0.01% of the memory
> > > > > consumed by bpf?
> > > >
> > > > No, just claim that the memory sitting on such caches is insignificant.
> > >
> > > yes, that is not really clear from the patch description. Earlier you
> > > have said that the memory consumed might go into GBs. If that is a
> > > memory that is actively used and not really reclaimable then bad luck.
> > > There are other users like that in the kernel and this is not a new
> > > problem. I think it would really help to add a counter to describe both
> > > the overall memory claimed by the bpf allocator and actively used
> > > portion of it. If you use our standard vmstat infrastructure then we can
> > > easily show that information in the OOM report.
> >
> > OOM report can potentially be extended with info about bpf consumed
> > memory, but it's not clear whether it will help OOM analysis.
>
> If GBs of memory can be sitting there then it is surely an interesting
> information to have when seeing OOM. One of the big shortcomings of the
> OOM analysis is unaccounted memory.
>
> > bpftool map show
> > prints all map data already.
> > Some devs use bpf to inspect bpf maps for finer details in run-time.
> > drgn scripts pull that data from crash dumps.
> > There is no need for new counters.
> > The idea of bpf specific counters/limits was rejected by memcg folks.
>
> I would argue that integration into vmstat is useful not only for oom
> analysis but also for regular health check scripts watching /proc/vmstat
> content. I do not think most of those generic tools are BPF aware. So
> unless there is a good reason to not account this memory there then I
> would vote for adding them. They are cheap and easy to integrate.

We've seen enough performance issues with such counters.
So, no, they are not cheap.
Remember that bpf has to be optimized for all use cases.
Some of them process millions of packets per second.
Others do millions of map updates/deletes per second, which means
millions of alloc/free operations.

> > > OK, thanks for the clarification. There is still one thing that is not
> > > really clear to me. Without a proper ownership bound to any process why
> > > is it desired/helpful to account the memory to a memcg?
> >
> > The first step is to have a limit. memcg provides it.
>
> I am sorry but this doesn't really explain it. Could you elaborate
> please? Is the limit supposed to protect against adversaries? Or is it
> just to prevent from accidental runaways?

Yes to both of the above.

> Is it purely for accounting
> purposes?

Also a soft yes. Once the user is able to select the memcg, it will
become a strong yes.
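
For reference, the kind of generic health check script mentioned in the
quoted text could look roughly like the sketch below. It is not part of
the patch set. It only assumes the usual /proc/vmstat layout of one
"name value" pair per line. The counter name nr_bpf_alloc_pages is
hypothetical: no such vmstat entry is defined anywhere in this thread,
and the helper names parse_vmstat and over_limit are made up for
illustration.

```python
from pathlib import Path


def parse_vmstat(text: str) -> dict[str, int]:
    """Parse /proc/vmstat-style text: one "name value" pair per line."""
    counters = {}
    for line in text.splitlines():
        parts = line.split()
        # Skip anything that is not exactly "name <unsigned integer>".
        if len(parts) == 2 and parts[1].isdigit():
            counters[parts[0]] = int(parts[1])
    return counters


def over_limit(counters: dict[str, int], name: str, limit: int) -> bool:
    """Return True if the counter is present and exceeds the limit."""
    return counters.get(name, 0) > limit


if __name__ == "__main__":
    # HYPOTHETICAL counter name, used only to show the shape of the check.
    name = "nr_bpf_alloc_pages"
    vmstat = Path("/proc/vmstat")
    counters = parse_vmstat(vmstat.read_text()) if vmstat.exists() else {}
    print(name, counters.get(name, "not exported"))
```

A script like this only reads the counter. The cost discussed above is
on the kernel side, where the counter would have to be updated on every
alloc/free.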