From: Alexei Starovoitov
Date: Fri, 23 Feb 2024 09:27:14 -0800
Subject: Re: [PATCH bpf-next] mm: Introduce vm_area_[un]map_pages().
References: <20240220192613.8840-1-alexei.starovoitov@gmail.com>
To: Christoph Hellwig
Cc: bpf, Daniel Borkmann, Andrii Nakryiko, Linus Torvalds, Barret Rhoden, Johannes Weiner, Lorenzo Stoakes, Andrew Morton, Uladzislau Rezki, linux-mm, Kernel Team

On Fri, Feb 23, 2024 at 9:14 AM Christoph Hellwig wrote:
>
> On Wed, Feb 21, 2024 at 11:05:09AM -0800, Alexei Starovoitov wrote:
> > +#define VM_BPF 0x00000800 /* bpf_arena pages */
> >
> > +static inline struct vm_struct *get_bpf_vm_area(unsigned long size)
> > +{
> > +       return get_vm_area(size, VM_BPF);
> > +}
> >
> > and enforce that flag in vm_area_[un]map_pages() ?
> >
> > vmallocinfo can display it or skip it.
> > Things like find_vm_area() can do something different with such an area
> > (if that was the concern).
>
> Well, a growing allocation is a generally useful feature.  I'd
> rather not limit it to bpf if we can.

Sure. See the VM_SPARSE proposal in the other email.

> > > For the dynamically growing part do you need a special allocator or
> > > can we just go straight to the page allocator and implement this
> > > in common code?
> >
> > It's a bit special allocator that is using maple tree to manage
> > range within 4G region and
> > alloc_pages_node(GFP_KERNEL | __GFP_ZERO | __GFP_ACCOUNT)
> > to grab pages.
> > With extra dance:
> >     memcg = bpf_map_get_memcg(map);
> >     old_memcg = set_active_memcg(memcg);
> > to make sure memcg accounting is done the common way for all bpf maps.
>
> Ok, so it's not just a growing allocation but actually sparse and
> all over the place?  That doesn't really make it easier to come
> up with a good enough interface.

Yep.

> How do you decide what gets placed
> where?

See the proposal in the other email in this thread.
tl;dr: it's a user space mmap()-like interface:
either give me N pages at any addr,
or give me N pages at this addr if this range is still free.

> > struct vm_struct *area = get_sparse_vm_area(size);
> > vm_area_alloc_pages(struct vm_struct *area, ulong addr, int page_cnt,
> >                     int numa_id);
> >
> > and vm_area_alloc_pages() will allocate pages and vmap_pages_range()
> > them while all code in mm/vmalloc.c ?
>
> My vague hope was that we could just start out with an area and
> grow it.  But it sounds like you need something much more complex
> that that.

Yes. With bpf specific tricks due to lower 32-bit wrap around.

> But yes, a more specific API is probably a better idea.  And maybe
> the cookie should be a VM area either but a structure dedicated to
> this.

Right. See the 'struct sparse_vm_area' proposal in the other email.
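
To make the "mmap()-like" placement rule above concrete, here is a minimal
user-space sketch. It is only an illustration of the two request types
("N pages at any addr" and "N pages at this addr if this range is still
free"), not the bpf_arena code. Every name in it (toy_sparse_area,
toy_alloc_at, toy_alloc_any, toy_free, TOY_NR_PAGES) is hypothetical, and
it assumes a plain bool array with first-fit search in place of the maple
tree the real allocator uses. It does not model page allocation, memcg
accounting, or the 32-bit wrap-around handling.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical, tiny region size chosen only for illustration. */
#define TOY_NR_PAGES 64

/* Toy stand-in for a sparse area: one "in use" flag per page slot. */
struct toy_sparse_area {
	bool used[TOY_NR_PAGES];
};

static void toy_area_init(struct toy_sparse_area *a)
{
	assert(a != NULL);
	memset(a->used, 0, sizeof(a->used));
}

/* True if [start, start + cnt) lies inside the region and is entirely free. */
static bool toy_range_free(const struct toy_sparse_area *a,
			   size_t start, size_t cnt)
{
	if (cnt == 0 || start >= TOY_NR_PAGES || cnt > TOY_NR_PAGES - start)
		return false;
	for (size_t i = start; i < start + cnt; i++)
		if (a->used[i])
			return false;
	return true;
}

/*
 * "Give me N pages at this addr if this range is still free."
 * Returns the starting page index, or -1 if any page is taken.
 */
static long toy_alloc_at(struct toy_sparse_area *a, size_t start, size_t cnt)
{
	assert(a != NULL);
	if (!toy_range_free(a, start, cnt))
		return -1;
	for (size_t i = start; i < start + cnt; i++)
		a->used[i] = true;
	return (long)start;
}

/*
 * "Give me N pages at any addr."
 * First-fit scan; returns the starting page index, or -1 if nothing fits.
 */
static long toy_alloc_any(struct toy_sparse_area *a, size_t cnt)
{
	assert(a != NULL);
	if (cnt == 0 || cnt > TOY_NR_PAGES)
		return -1;
	for (size_t start = 0; start + cnt <= TOY_NR_PAGES; start++)
		if (toy_range_free(a, start, cnt))
			return toy_alloc_at(a, start, cnt);
	return -1;
}

/* Release [start, start + cnt); out-of-range requests are ignored. */
static void toy_free(struct toy_sparse_area *a, size_t start, size_t cnt)
{
	assert(a != NULL);
	if (start >= TOY_NR_PAGES || cnt > TOY_NR_PAGES - start)
		return;
	for (size_t i = start; i < start + cnt; i++)
		a->used[i] = false;
}
```

In this sketch a fixed-address request fails outright on any overlap and
never relocates, which is the behaviour the "if this range is still free"
wording describes. The first-fit policy for the "any addr" case is an
arbitrary choice made for the example.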