From: Alexei Starovoitov <alexei.starovoitov@gmail.com>
Date: Wed, 18 Dec 2024 17:18:51 -0800
Subject: Re: [PATCH bpf-next v3 1/6] mm, bpf: Introduce try_alloc_pages() for opportunistic page allocation
To: Michal Hocko
Cc: bpf, Andrii Nakryiko, Kumar Kartikeya Dwivedi, Andrew Morton, Peter Zijlstra,
 Vlastimil Babka, Sebastian Sewior, Steven Rostedt, Hou Tao, Johannes Weiner,
 shakeel.butt@linux.dev, Matthew Wilcox, Thomas Gleixner, Jann Horn, Tejun Heo,
 linux-mm, Kernel Team
References: <20241218030720.1602449-1-alexei.starovoitov@gmail.com>
 <20241218030720.1602449-2-alexei.starovoitov@gmail.com>
On Wed, Dec 18, 2024 at 3:32 AM Michal Hocko wrote:
>
> I like this proposal better. I am still not convinced that we really
> need internal __GFP_TRYLOCK though.
>
> If we reduce try_alloc_pages to the gfp usage we are at the following
>
> On Tue 17-12-24 19:07:14, alexei.starovoitov@gmail.com wrote:
> [...]
> > +struct page *try_alloc_pages_noprof(int nid, unsigned int order)
> > +{
> > +	gfp_t alloc_gfp = __GFP_NOWARN | __GFP_ZERO |
> > +			  __GFP_NOMEMALLOC | __GFP_TRYLOCK;
> > +	unsigned int alloc_flags = ALLOC_TRYLOCK;
> [...]
> > +	prepare_alloc_pages(alloc_gfp, order, nid, NULL, &ac,
> > +			    &alloc_gfp, &alloc_flags);
> [...]
> > +	page = get_page_from_freelist(alloc_gfp, order, alloc_flags, &ac);
> > +
> > +	/* Unlike regular alloc_pages() there is no __alloc_pages_slowpath(). */
> > +
> > +	trace_mm_page_alloc(page, order, alloc_gfp & ~__GFP_TRYLOCK, ac.migratetype);
> > +	kmsan_alloc_page(page, order, alloc_gfp);
> [...]
>
> From those that care about __GFP_TRYLOCK only kmsan_alloc_page doesn't
> have alloc_flags. Those could make the locking decision based on
> ALLOC_TRYLOCK.

__GFP_TRYLOCK here sets a baseline and is used in patch 4 by the inner bits
of memcg's consume_stock() logic when called from try_alloc_pages() in
patch 5. We cannot pass alloc_flags into it; that is just too much overhead.

__memcg_kmem_charge_page()
  -> obj_cgroup_charge_pages()
    -> try_charge_memcg()
      -> consume_stock()

All of them would need an extra 'u32 alloc_flags' argument. That is too high
a cost just to avoid ___GFP_TRYLOCK_BIT in gfp_types.h (see the
consume_stock() sketch at the end of this mail).

> I am not familiar with kmsan internals and my main question is whether
> this specific usecase really needs a dedicated reentrant
> kmsan_alloc_page rather than rely on gfp flag to be sufficient.
> Currently kmsan_in_runtime bails out early in some contexts. The
> associated comment about hooks is not completely clear to me though.
> Memory allocation down the road is one of those but it is not really
> clear to me whether this is the only one.

As I mentioned in the giant v2 thread, I'm not touching kasan/kmsan in this
patch set, since that needs its own review from experts in those bits, but
when it happens gfp & __GFP_TRYLOCK would be the way to adjust whatever is
necessary in kasan/kmsan internals.

As Shakeel mentioned, kmsan_alloc_page() is currently gutted, since I'm
using __GFP_ZERO unconditionally here; we don't even get to the
kmsan_in_runtime() check. For bpf use cases __GFP_ZERO and __GFP_ACCOUNT are
pretty much mandatory. When there is a 2nd user of this try_alloc_pages()
API we can consider making flags for these two, and at that point a full
analysis of kmsan reentrance would be necessary. It works in this patch
because of __GFP_ZERO.

So __GFP_TRYLOCK is needed in several places:
- to make decisions in consume_stock(),
- in the future in kasan/kmsan,
- and in slab kmalloc.
There I'm going to introduce try_kmalloc() (or kmalloc_nolock(); naming is
hard) that will use this internal __GFP_TRYLOCK flag to avoid taking locks,
and when it gets to new_slab() -> allocate_slab() -> alloc_slab_page(), the
latter will use try_alloc_pages() instead of alloc_pages().
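
To make the consume_stock() point concrete, here is a rough sketch (not the
actual patch 4 hunk) of what the flag-based check looks like, assuming
consume_stock() is handed the gfp mask that try_charge_memcg() already
receives; local_trylock_irqsave() is a stand-in for whatever trylock
primitive the series ends up using:

/*
 * Sketch only.  The reentrancy decision is a single test on the gfp
 * mask that already flows down from __memcg_kmem_charge_page(), so no
 * extra 'u32 alloc_flags' argument has to be threaded through
 * obj_cgroup_charge_pages() and try_charge_memcg().
 */
static bool consume_stock(struct mem_cgroup *memcg, unsigned int nr_pages,
			  gfp_t gfp_mask)
{
	struct memcg_stock_pcp *stock;
	unsigned int stock_pages;
	unsigned long flags;
	bool ret = false;

	if (nr_pages > MEMCG_CHARGE_BATCH)
		return ret;

	if (gfp_mask & __GFP_TRYLOCK) {
		/* Reentrant caller: only try the lock, never wait for it. */
		if (!local_trylock_irqsave(&memcg_stock.stock_lock, flags))
			return ret;
	} else {
		local_lock_irqsave(&memcg_stock.stock_lock, flags);
	}

	stock = this_cpu_ptr(&memcg_stock);
	stock_pages = READ_ONCE(stock->nr_pages);
	if (memcg == READ_ONCE(stock->cached) && stock_pages >= nr_pages) {
		WRITE_ONCE(stock->nr_pages, stock_pages - nr_pages);
		ret = true;
	}

	local_unlock_irqrestore(&memcg_stock.stock_lock, flags);
	return ret;
}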
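
And a minimal sketch of the slab routing described above. The real
alloc_slab_page() in mm/slub.c does more than this (slab metadata setup,
accounting), and try_kmalloc()/kmalloc_nolock() doesn't exist yet, so the
helper name below is purely illustrative:

/*
 * Sketch of the future routing only.  The rest of allocate_slab()
 * stays as-is; the only change is that a __GFP_TRYLOCK allocation goes
 * through try_alloc_pages(), which may fail instead of sleeping or
 * spinning on zone locks.
 */
static struct page *alloc_slab_page_sketch(gfp_t flags, int node,
					   unsigned int order)
{
	if (flags & __GFP_TRYLOCK)
		return try_alloc_pages(node, order);

	if (node == NUMA_NO_NODE)
		return alloc_pages(flags, order);

	return __alloc_pages_node(node, flags, order);
}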