From: Alexei Starovoitov
Date: Wed, 25 Jan 2023 21:45:14 -0800
Subject: Re: [RFC PATCH bpf-next v2 00/11] mm, bpf: Add BPF into /proc/meminfo
To: Yafang Shao
Cc: Hyeonggon Yoo <42.hyeyoo@gmail.com>, Vlastimil Babka,
 Alexei Starovoitov, Daniel Borkmann, Andrii Nakryiko, Martin KaFai Lau,
 Song Liu, Yonghong Song, John Fastabend, KP Singh, Stanislav Fomichev,
 Hao Luo, Jiri Olsa, Tejun Heo, dennis@kernel.org, Chris Lameter,
 Andrew Morton, Pekka Enberg, David Rientjes, Joonsoo Kim, Roman Gushchin,
 linux-mm, bpf
References: <20230112155326.26902-1-laoar.shao@gmail.com>

On Tue, Jan 17, 2023 at 10:49 PM Yafang Shao wrote:
> > > I just don't want to add many if-elses or switch-cases into
> > > bpf_map_memory_footprint(), because I think it is a little ugly.
> > > Introducing a new map ops could make it more clear.
> > > For example,
> > > static unsigned long bpf_map_memory_footprint(const struct bpf_map *map)
> > > {
> > >     unsigned long size;
> > >
> > >     if (map->ops->map_mem_footprint)
> > >         return map->ops->map_mem_footprint(map);
> > >
> > >     size = round_up(map->key_size + bpf_map_value_size(map), 8);
> > >     return round_up(map->max_entries * size, PAGE_SIZE);
> > > }
> >
> > It is also ugly, because bpf_map_value_size() already has if-stmt.
> > I prefer to keep all estimates in one place.
> > There is no need to be 100% accurate.
>
> Per my investigation, it can be almost accurate with little effort.
> Take the htab for example,
> static unsigned long htab_mem_footprint(const struct bpf_map *map)
> {
>     struct bpf_htab *htab = container_of(map, struct bpf_htab, map);
>     unsigned long size = 0;
>
>     if (!htab_is_prealloc(htab)) {
>         size += htab_elements_size(htab);
>     }
>     size += kvsize(htab->elems);
>     size += percpu_size(htab->extra_elems);
>     size += kvsize(htab->buckets);
>     size += bpf_mem_alloc_size(&htab->pcpu_ma);
>     size += bpf_mem_alloc_size(&htab->ma);
>     if (htab->use_percpu_counter)
>         size += percpu_size(htab->pcount.counters);
>     size += percpu_size(htab->map_locked[i]) * HASHTAB_MAP_LOCK_COUNT;
>     size += kvsize(htab);
>     return size;
> }

Please don't.
The above doesn't look maintainable.
Look at kvsize(htab). Do you really care about a hundred bytes?
Just accept that there will be a small constant difference
between what show_fdinfo reports and the real memory.
You cannot make it 100% accurate.
There is kfence, which will allocate 4k even though you asked for kmalloc(8).

> We just need to get the real memory size from the pointer instead of
> calculating the size again.
> For non-preallocated htab, it is a little trouble to get the element
> size (not the unit_size), but it won't be a big deal.

You'd have to convince mm folks that kvsize() is worth doing.
I don't think it will be easy.

> > With a callback devs will start thinking that this is somehow
> > a requirement to report precise memory.
> >
> > > > > > bpf side tracks all of its allocation. There is no need to do that
> > > > > > in generic mm side.
> > > > > > Exposing an aggregated single number in /proc/meminfo also looks wrong.
> > > > >
> > > > > Do you mean that we shouldn't expose it in /proc/meminfo ?
> > > >
> > > > We should not because it helps one particular use case only.
> > > > Somebody else might want map mem info per container,
> > > > then somebody would need it per user, etc.
> > >
> > > It seems we should show memcg info and user info in bpftool map show.
> >
> > Show memcg info? What do you have in mind?
> >
>
> Each bpf map is charged to a memcg. If we know a bpf map belongs to
> which memcg, we can know the map mem info per container.
> Currently we can get the memcg info from the process which loads it,
> but it can't apply to pinned-bpf-map.
> So it would be better if we can show it in bpftool-map-show.

That sounds useful.
Have you looked at bpf iterators and how bpftool is using
them to figure out which process loaded a bpf prog and created
a particular map?
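
For reference, the generic fallback estimate quoted earlier in this thread
can be tried out in userspace. The following is only a minimal sketch, not
kernel code: ROUND_UP, SKETCH_PAGE_SIZE (assumed to be 4096) and
estimate_map_footprint() are hypothetical names, and value_size is passed in
directly rather than derived via bpf_map_value_size().

```c
#include <stdint.h>

/* Assumed page size for illustration only; the kernel uses PAGE_SIZE. */
#define SKETCH_PAGE_SIZE 4096ULL

/* Round x up to the next multiple of a (a must be non-zero). */
#define ROUND_UP(x, a) ((((x) + (a) - 1) / (a)) * (a))

/*
 * Hypothetical userspace mirror of the quoted fallback estimate:
 * per-element size rounded up to 8 bytes, multiplied by max_entries,
 * then rounded up to a whole number of pages.
 */
static uint64_t estimate_map_footprint(uint32_t key_size,
                                       uint32_t value_size,
                                       uint32_t max_entries)
{
    uint64_t elem = ROUND_UP((uint64_t)key_size + value_size, 8ULL);

    return ROUND_UP((uint64_t)max_entries * elem, SKETCH_PAGE_SIZE);
}
```

Under these assumptions, a map with 4-byte keys, 8-byte values and 1024
entries is estimated at 16384 bytes. Whatever the allocator adds on top of
that is the kind of small difference the reply above suggests accepting.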