From: Alexei Starovoitov
Date: Thu, 12 Jan 2023 13:05:17 -0800
Subject: Re: [RFC PATCH bpf-next v2 00/11] mm, bpf: Add BPF into /proc/meminfo
To: Yafang Shao
Cc: Hyeonggon Yoo <42.hyeyoo@gmail.com>, Vlastimil Babka, Alexei Starovoitov, Daniel Borkmann, Andrii Nakryiko, Martin KaFai Lau, Song Liu, Yonghong Song, John Fastabend, KP Singh, Stanislav Fomichev, Hao Luo, Jiri Olsa, Tejun Heo, dennis@kernel.org, Chris Lameter, Andrew Morton, Pekka Enberg, David Rientjes, Joonsoo Kim, Roman Gushchin, linux-mm, bpf
In-Reply-To: <20230112155326.26902-1-laoar.shao@gmail.com>

On Thu, Jan 12, 2023 at 7:53 AM Yafang Shao wrote:
>
> Currently there's no way to get BPF memory usage; we can only
> estimate the usage by bpftool or memcg, both of which are not reliable.
>
> - bpftool
>   `bpftool {map,prog} show` can show us the memlock of each map and
>   prog, but the memlock differs from the real memory size. The memlock
>   of a bpf object is approximately
>   `round_up(key_size + value_size, 8) * max_entries`,
>   so 1) it can't apply to the non-preallocated bpf map which may
>   increase or decrease the real memory size dynamically.
>   2) the element size of some bpf map is not `key_size + value_size`,
>   for example the element size of htab is
>   `sizeof(struct htab_elem) + round_up(key_size, 8) + round_up(value_size, 8)`
>   That said the difference between these two values may be very great if
>   the key_size and value_size are small. For example in my verification,
>   the size of memlock and real memory of a preallocated hash map are,
>
>   $ grep BPF /proc/meminfo
>   BPF:                 350 kB  <<< the size of preallocated memalloc pool
>
>   (create hash map)
>
>   $ bpftool map show
>   41549: hash  name count_map  flags 0x0
>         key 4B  value 4B  max_entries 1048576  memlock 8388608B
>
>   $ grep BPF /proc/meminfo
>   BPF:               82284 kB
>
>   So the real memory size is $((82284 - 350)) which is 81934 kB
>   while the memlock is only 8192 kB.

A hashmap with key 4B and value 4B looks artificial to me,
but since you're concerned with the accuracy of bpftool reporting,
please fix the estimation in bpf_map_memory_footprint().
You're correct that:

> size of some bpf map is not `key_size + value_size`, for example the
> element size of htab is
> `sizeof(struct htab_elem) + round_up(key_size, 8) + round_up(value_size, 8)`

So just teach bpf_map_memory_footprint() to do this more accurately.
Add bucket size to it as well.
Make it even more accurate with prealloc vs not.
That is a much simpler change than adding run-time overhead to every
alloc/free on the bpf side.

Higher level point:
the bpf side tracks all of its allocations. There is no need to do that
on the generic mm side.
Exposing a single aggregated number in /proc/meminfo also looks wrong.
People should be able to "bpftool map show|awk sum of fields"
and get the same number.
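
For illustration only, the arithmetic discussed in this thread can be sketched in Python. This is not the kernel's bpf_map_memory_footprint(); it only restates the two formulas quoted above and the "sum of fields" idea. The names `memlock_estimate`, `htab_estimate` and `sum_memlock` are hypothetical, and `elem_header_size` (standing in for `sizeof(struct htab_elem)`), `bucket_size` and `n_buckets` are caller-supplied assumptions, since the thread does not give their values.

```python
import re


def round_up(n, align):
    """Round n up to the next multiple of align (align must be > 0)."""
    return (n + align - 1) // align * align


def memlock_estimate(key_size, value_size, max_entries):
    """Approximation quoted in the thread:
    round_up(key_size + value_size, 8) * max_entries."""
    return round_up(key_size + value_size, 8) * max_entries


def htab_estimate(key_size, value_size, max_entries,
                  elem_header_size, bucket_size, n_buckets):
    """Per-element formula quoted in the thread, plus a bucket array.

    elem_header_size, bucket_size and n_buckets are assumptions supplied
    by the caller; they are not taken from the kernel source.
    """
    elem_size = (elem_header_size
                 + round_up(key_size, 8)
                 + round_up(value_size, 8))
    return elem_size * max_entries + bucket_size * n_buckets


def sum_memlock(bpftool_output):
    """Sum every 'memlock <N>B' field found in plain `bpftool map show` text."""
    return sum(int(m) for m in re.findall(r"\bmemlock\s+(\d+)B\b", bpftool_output))


if __name__ == "__main__":
    # The map from the quoted example: key 4B, value 4B, 1048576 entries.
    print(memlock_estimate(4, 4, 1048576))  # 8388608, the memlock bpftool printed

    # Placeholder header and bucket sizes, chosen only to show the shape
    # of the calculation.
    print(htab_estimate(4, 4, 1048576,
                        elem_header_size=48, bucket_size=16,
                        n_buckets=1048576))
```

The second estimate depends entirely on the placeholder sizes, so it shows how much per-element overhead can matter for small keys and values rather than giving a measured figure. Feeding the text of `bpftool map show` to `sum_memlock` gives the kind of per-object total that could be compared against any aggregated counter.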