From mboxrd@z Thu Jan 1 00:00:00 1970
References: <20240206220441.38311-1-alexei.starovoitov@gmail.com> <20240206220441.38311-12-alexei.starovoitov@gmail.com>
From: Alexei Starovoitov
Date: Wed, 7 Feb 2024 17:38:47 -0800
Subject: Re: [PATCH bpf-next 11/16] libbpf: Add support for bpf_arena.
To: Andrii Nakryiko
Cc: bpf , Daniel Borkmann , Andrii Nakryiko , Martin KaFai Lau , Kumar Kartikeya Dwivedi , Eddy Z , Tejun Heo , Barret Rhoden , Johannes Weiner , linux-mm , Kernel Team

On Wed, Feb 7, 2024 at 5:15 PM Andrii Nakryiko wrote:
>
> On Tue, Feb 6, 2024 at 2:05 PM Alexei Starovoitov wrote:
> >
> > From: Alexei Starovoitov
> >
> > mmap() bpf_arena right after creation, since the kernel needs to
> > remember the address returned from mmap. This is user_vm_start.
> > LLVM will generate bpf_arena_cast_user() instructions where
> > necessary and JIT will add upper 32-bit of user_vm_start
> > to such pointers.
> >
> > Use traditional map->value_size * map->max_entries to calculate mmap sz,
> > though it's not the best fit.
>
> We should probably make bpf_map_mmap_sz() aware of specific map type
> and do different calculations based on that. It makes sense to have
> round_up(PAGE_SIZE) for BPF map arena, and use just value_size or
> max_entries to specify the size (fixing the other to be zero).

I went with value_size == key_size == 8 in order to be able to extend
it in the future and allow map_lookup/update/delete to do something useful.
Ex: lookup/delete can behave just like arena_alloc/free_pages.

Are you proposing to force key/value_size to zero?
That was my first attempt.
key_size can be zero, but the syscall side of lookup/update expects
a non-zero value_size for all maps regardless of type.
We can modify bpf/syscall.c, of course, but it feels like arena would be
too different of a map if generic map handling code would need
to be specialized.

Since value_size is > 0, what sizes make sense?
When it's 8 it can be an indirection to anything.
key/value would be user pointers to other structs that would be
meaningful for an arena.
Right now it costs nothing to force both to 8 and pick any logic
when we decide what lookup/update should do.

But then when value_size == 8, making max_entries mean the size
of the arena in bytes or pages starts to look odd and different from
all other maps.

We could go with max_entries == 0 and value_size to mean the size of
the arena in bytes, but it will prevent us from defining lookup/update
in the future, which doesn't feel right.

Considering all this I went with the map->value_size * map->max_entries
choice, though it's not pretty.

> > @@ -4908,6 +4910,22 @@ static int bpf_object__create_map(struct bpf_object *obj, struct bpf_map *map, b
> >         if (map->fd == map_fd)
> >                 return 0;
> >
> > +       if (def->type == BPF_MAP_TYPE_ARENA) {
> > +               size_t mmap_sz;
> > +
> > +               mmap_sz = bpf_map_mmap_sz(def->value_size, def->max_entries);
> > +               map->mmaped = mmap((void *)map->map_extra, mmap_sz, PROT_READ | PROT_WRITE,
> > +                                  map->map_extra ? MAP_SHARED | MAP_FIXED : MAP_SHARED,
> > +                                  map_fd, 0);
> > +               if (map->mmaped == MAP_FAILED) {
> > +                       err = -errno;
> > +                       map->mmaped = NULL;
> > +                       pr_warn("map '%s': failed to mmap bpf_arena: %d\n",
> > +                               bpf_map__name(map), err);
> > +                       return err;
>
> leaking map_fd here, you need to close(map_fd) before erroring out

ahh. good catch.
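
For illustration only, here is a rough standalone sketch of the two points
discussed above: sizing the mapping as value_size * max_entries rounded up
to whole pages, and closing map_fd when mmap() fails. The helper names
arena_mmap_sz() and arena_mmap_or_close() are hypothetical and are not
libbpf functions; the real patch may well handle this differently.

```c
#include <errno.h>
#include <stddef.h>
#include <sys/mman.h>
#include <unistd.h>

/* Hypothetical helper (not libbpf's bpf_map_mmap_sz()): number of bytes to
 * mmap for value_size * max_entries, rounded up to a whole number of pages.
 */
static size_t arena_mmap_sz(unsigned int value_size, unsigned int max_entries)
{
	size_t page_sz = (size_t)sysconf(_SC_PAGE_SIZE);
	size_t sz = (size_t)value_size * max_entries;

	return (sz + page_sz - 1) / page_sz * page_sz;
}

/* Hypothetical helper: map the arena at 'hint' (MAP_FIXED when non-NULL).
 * On failure, close map_fd so the caller does not leak it.
 * Returns 0 on success or a negative errno value.
 */
static int arena_mmap_or_close(int map_fd, void *hint, size_t sz, void **out)
{
	int flags = hint ? MAP_SHARED | MAP_FIXED : MAP_SHARED;
	void *p = mmap(hint, sz, PROT_READ | PROT_WRITE, flags, map_fd, 0);

	if (p == MAP_FAILED) {
		int err = -errno; /* save before close() can overwrite errno */

		close(map_fd);
		*out = NULL;
		return err;
	}

	*out = p;
	return 0;
}
```

Saving errno before close() matters, since close() may itself fail and
overwrite it.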