From: Alexei Starovoitov
Date: Mon, 11 Mar 2024 17:47:39 -0700
Subject: Re: [PATCH v3 bpf-next 01/14] bpf: Introduce bpf_arena.
To: Andrii Nakryiko
Cc: bpf, Daniel Borkmann, Andrii Nakryiko, Linus Torvalds, Barret Rhoden, Johannes Weiner, Andrew Morton, Uladzislau Rezki, Christoph Hellwig, linux-mm, Kernel Team
References: <20240308010812.89848-1-alexei.starovoitov@gmail.com> <20240308010812.89848-2-alexei.starovoitov@gmail.com>

On Mon, Mar 11, 2024 at 3:59 PM Andrii Nakryiko wrote:
>
> no, I get that, my point was a bit different and purely pedantic. It
> doesn't overflow 32-bit only when viewed from user-space addresses
> POV.
>
> It seems like it can overflow when we translate it into kernel address
> by adding kernel_vm_start (`kern_vm = get_vm_area(KERN_VM_SZ,
> VM_SPARSE | VM_USERMAP)` doesn't guarantee 4GB alignment, IIUC). But I
> don't see what kernel-side overflow matters (yet the comment is next
> to the code that does kernel-side range mapping, which is why I
> commented, the placement of the comment is what makes it a bit more
> confusing).

Got it. Ok. Will rephrase the comment.

> But I was pointing out that if the user-requested area is exactly 4GB
> and user_vm_start is aligned at the 4GB boundary, then user_vm_start +
> 4GB, technically is incrementing the upper 32 bits of user_vm_start.
> Which I don't think matters because it's the exclusive end of range.

yes. should be fine.
I didn't add a selftest for 4GB, because not every CI has runners
with this much memory.
I guess the selftest can try to allocate and skip the test
on ENOMEM without failing it.
Will add it in the follow up.

> > > The way you split these zap_pages for page_cnt == 1 and page_cnt > 1
> > > is quite confusing. Why can't you just unconditionally zap_pages()
> > > regardless of page_cnt before this loop? And why for page_cnt == 1 we
> > > have `page_mapped(page)` check, but it's ok to not check this for
> > > page_cnt>1 case?
> > >
> > > This asymmetric handling is confusing and suggests something more is
> > > going on here. Or am I overthinking it?
> >
> > It's an important optimization for the common case of page_cnt==1.
> > If page wasn't mapped into some user vma there is no need to call zap_pages
> > which is slow.
> > But when page_cnt is big it's much faster to do the batched zap
> > which is what this code does.
> > For the case of page_cnt=2 or small number there is no good optimization
> > to do other than try to count whether all pages in this range are
> > not page_mapped() and omit zap_page().
> > I don't think it's worth doing such optimization at this point,
> > since page_cnt=1 is likely the most common case.
> > If it changes, it can be optimized later.
>
> yep, makes sense, and a small comment stating that would be useful, IMO :)

Ok. Will add that too.
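
For reference, the 4GB-boundary arithmetic discussed above can be sketched in plain userspace C. This is only an illustration, not code from the patch: the helper names (upper32, range_shares_upper32, user_to_kern, check_examples) and all address values are made up, and user_to_kern() assumes the translation is "kernel base plus the low 32 bits of the user address", which is my reading of "adding kernel_vm_start" and may not match the final code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define SZ_4G (1ULL << 32)

/* Upper 32 bits of a 64-bit address. */
static inline uint32_t upper32(uint64_t addr)
{
	return (uint32_t)(addr >> 32);
}

/*
 * True if every byte in [start, start + size) has the same upper 32 bits.
 * Only the last valid byte (end - 1) is compared, because the end address
 * is exclusive and is never dereferenced.
 */
static inline bool range_shares_upper32(uint64_t start, uint64_t size)
{
	return size != 0 && upper32(start) == upper32(start + size - 1);
}

/*
 * Hypothetical translation (an assumption, see above): use the low 32 bits
 * of the user address as an offset from the kernel-side base.
 */
static inline uint64_t user_to_kern(uint64_t kern_vm_start, uint64_t uaddr)
{
	return kern_vm_start + (uint32_t)uaddr;
}

/* Example values only; none of these addresses come from the patch. */
static inline void check_examples(void)
{
	uint64_t user_vm_start = 0x7f0000000000ULL; /* 4GB aligned */
	uint64_t kern_vm_start = 0xffffc90080000000ULL; /* not 4GB aligned */

	/* The exclusive end increments the upper 32 bits... */
	assert(upper32(user_vm_start + SZ_4G) == upper32(user_vm_start) + 1);
	/* ...but no valid byte inside the 4GB range does. */
	assert(range_shares_upper32(user_vm_start, SZ_4G));
	/* On the kernel side the sum can carry past a 4GB boundary. */
	assert(upper32(user_to_kern(kern_vm_start, user_vm_start + 0x90000000ULL)) !=
	       upper32(kern_vm_start));
}
```

Calling check_examples() from a test main() exercises all three cases. The point of checking end - 1 rather than end is the one made in the thread: the exclusive end is the only address whose upper bits change.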