Date: Thu, 19 Dec 2024 08:13:59 +0100
From: Michal Hocko
To: Alexei Starovoitov
Cc: bpf, Andrii Nakryiko, Kumar Kartikeya Dwivedi, Andrew Morton,
 Peter Zijlstra, Vlastimil Babka, Sebastian Sewior, Steven Rostedt,
 Hou Tao, Johannes Weiner, shakeel.butt@linux.dev, Matthew Wilcox,
 Thomas Gleixner, Jann Horn, Tejun Heo, linux-mm, Kernel Team
Subject: Re: [PATCH bpf-next v3 1/6] mm, bpf: Introduce try_alloc_pages() for opportunistic page allocation
References: <20241218030720.1602449-1-alexei.starovoitov@gmail.com>
 <20241218030720.1602449-2-alexei.starovoitov@gmail.com>

On Wed 18-12-24 17:18:51, Alexei Starovoitov wrote:
> On Wed, Dec 18, 2024 at 3:32 AM Michal Hocko wrote:
> >
> > I like this proposal better. I am still not convinced that we really
> > need internal __GFP_TRYLOCK though.
> >
> > If we reduce try_alloc_pages to the gfp usage we are at the following
> >
> > On Tue 17-12-24 19:07:14, alexei.starovoitov@gmail.com wrote:
> > [...]
> > > +struct page *try_alloc_pages_noprof(int nid, unsigned int order)
> > > +{
> > > +	gfp_t alloc_gfp = __GFP_NOWARN | __GFP_ZERO |
> > > +			  __GFP_NOMEMALLOC | __GFP_TRYLOCK;
> > > +	unsigned int alloc_flags = ALLOC_TRYLOCK;
> > [...]
> > > +	prepare_alloc_pages(alloc_gfp, order, nid, NULL, &ac,
> > > +			    &alloc_gfp, &alloc_flags);
> > [...]
> > > +	page = get_page_from_freelist(alloc_gfp, order, alloc_flags, &ac);
> > > +
> > > +	/* Unlike regular alloc_pages() there is no __alloc_pages_slowpath(). */
> > > +
> > > +	trace_mm_page_alloc(page, order, alloc_gfp & ~__GFP_TRYLOCK, ac.migratetype);
> > > +	kmsan_alloc_page(page, order, alloc_gfp);
> > [...]
> >
> > From those that care about __GFP_TRYLOCK only kmsan_alloc_page doesn't
> > have alloc_flags. Those could make the locking decision based on
> > ALLOC_TRYLOCK.
>
> __GFP_TRYLOCK here sets a baseline and is used in patch 4 by inner
> bits of memcg's consume_stock() logic while called from
> try_alloc_pages() in patch 5.

Yes, I have addressed that part in a reply. In short, I believe we can
achieve reentrancy for NOWAIT/ATOMIC charges without a dedicated gfp
flag.

[...]
> > I am not familiar with kmsan internals and my main question is whether
> > this specific usecase really needs a dedicated reentrant
> > kmsan_alloc_page rather than rely on gfp flag to be sufficient.
> > Currently kmsan_in_runtime bails out early in some contexts. The
> > associated comment about hooks is not completely clear to me though.
> > Memory allocation down the road is one of those but it is not really
> > clear to me whether this is the only one.
>
> As I mentioned in giant v2 thread I'm not touching kasan/kmsan
> in this patch set, since it needs its own eyes
> from experts in those bits,
> but when it happens gfp & __GFP_TRYLOCK would be the way
> to adjust whatever is necessary in kasan/kmsan internals.
>
> As Shakeel mentioned, currently kmsan_alloc_page() is gutted,
> since I'm using __GFP_ZERO unconditionally here.
> We don't even get to kmsan_in_runtime() check.

I have missed that part! That means that you can drop kmsan_alloc_page
altogether, no?

[...]
> - and in slab kmalloc. There I'm going to introduce try_kmalloc()
> (or kmalloc_nolock(), naming is hard) that will use this
> internal __GFP_TRYLOCK flag to avoid locks and when it gets
> to new_slab()->allocate_slab()->alloc_slab_page()
> the latter will use try_alloc_pages() instead of alloc_pages().

I cannot really comment on the slab side of things. All I am saying is
that we should _try_ to avoid __GFP_TRYLOCK if possible/feasible. It
seems that the page allocator can do without that. Maybe slab side can
as well.
-- 
Michal Hocko
SUSE Labs
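
Illustration (not part of the original message): the suggestion above,
that the locking decision can key off the allocator-internal
alloc_flags rather than a dedicated gfp bit, can be modelled roughly in
user space as below. This is only a sketch under stated assumptions.
MODEL_ALLOC_TRYLOCK, model_lock(), model_unlock() and model_get_page()
are hypothetical names, and a C11 atomic_flag stands in for the zone
lock. None of it is kernel code or taken from the patch set.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical internal flag playing the role of ALLOC_TRYLOCK. */
#define MODEL_ALLOC_TRYLOCK 0x1u

/* Stand-in for the zone lock and the free list it protects. */
static atomic_flag zone_lock = ATOMIC_FLAG_INIT;
static int free_pages = 4;

/*
 * Take the lock according to the caller's internal alloc_flags.
 * With MODEL_ALLOC_TRYLOCK set we never wait: if the lock is busy,
 * report failure and let the caller give up.
 */
static bool model_lock(unsigned int alloc_flags)
{
	if (alloc_flags & MODEL_ALLOC_TRYLOCK)
		return !atomic_flag_test_and_set_explicit(&zone_lock,
							  memory_order_acquire);

	/* Regular path: spin until the lock becomes available. */
	while (atomic_flag_test_and_set_explicit(&zone_lock,
						 memory_order_acquire))
		;
	return true;
}

static void model_unlock(void)
{
	atomic_flag_clear_explicit(&zone_lock, memory_order_release);
}

/* Returns 1 if a page was taken, 0 if the lock was busy or no pages left. */
static int model_get_page(unsigned int alloc_flags)
{
	int got = 0;

	if (!model_lock(alloc_flags))
		return 0;	/* opportunistic caller: fail, do not wait */

	if (free_pages > 0) {
		free_pages--;
		got = 1;
	}
	model_unlock();
	return got;
}
```

In this model no gfp-style mask is consulted on the locking path at
all. A caller that cannot wait passes MODEL_ALLOC_TRYLOCK and simply
gets a failure back when the lock is contended. Whether the real memcg
and slab paths can be handled the same way is exactly the open question
in the thread.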