From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by smtp.lore.kernel.org (Postfix) with ESMTP id 2946DE7718B for ; Fri, 20 Dec 2024 19:46:01 +0000 (UTC) Received: by kanga.kvack.org (Postfix) id 7EA346B007B; Fri, 20 Dec 2024 14:46:00 -0500 (EST) Received: by kanga.kvack.org (Postfix, from userid 40) id 799F06B0082; Fri, 20 Dec 2024 14:46:00 -0500 (EST) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id 661E86B0083; Fri, 20 Dec 2024 14:46:00 -0500 (EST) X-Delivered-To: linux-mm@kvack.org Received: from relay.hostedemail.com (smtprelay0014.hostedemail.com [216.40.44.14]) by kanga.kvack.org (Postfix) with ESMTP id 457626B007B for ; Fri, 20 Dec 2024 14:46:00 -0500 (EST) Received: from smtpin28.hostedemail.com (a10.router.float.18 [10.200.18.1]) by unirelay05.hostedemail.com (Postfix) with ESMTP id A9E8B44895 for ; Fri, 20 Dec 2024 19:45:59 +0000 (UTC) X-FDA: 82916367660.28.80595FB Received: from out-178.mta0.migadu.com (out-178.mta0.migadu.com [91.218.175.178]) by imf20.hostedemail.com (Postfix) with ESMTP id 4015C1C001A for ; Fri, 20 Dec 2024 19:45:21 +0000 (UTC) Authentication-Results: imf20.hostedemail.com; dkim=pass header.d=linux.dev header.s=key1 header.b=wTFs2ICp; spf=pass (imf20.hostedemail.com: domain of shakeel.butt@linux.dev designates 91.218.175.178 as permitted sender) smtp.mailfrom=shakeel.butt@linux.dev; dmarc=pass (policy=none) header.from=linux.dev ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=hostedemail.com; s=arc-20220608; t=1734723934; h=from:from:sender:reply-to:subject:subject:date:date: message-id:message-id:to:to:cc:cc:mime-version:mime-version: content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references:dkim-signature; bh=zmhkYguEhBw74e/yDb6B2nnf/MGyGWg0Z2UpSuQG8Kc=; b=jV4yaANeJoiDgNTwr94uEbNF+oLihIuu54+YPomqpeLndAUm+5zFpR03NkQqUsHxjA06AV +fWHa32sLEsi26f+eWUD8VZmfGNzjH365Xb/W7ISZy/ztquckRg6ObFJ8RkB3jgLKNOKEI /q4a776W6FhuPhotAZxlbPnuENQ2Uzs= ARC-Authentication-Results: i=1; imf20.hostedemail.com; dkim=pass header.d=linux.dev header.s=key1 header.b=wTFs2ICp; spf=pass (imf20.hostedemail.com: domain of shakeel.butt@linux.dev designates 91.218.175.178 as permitted sender) smtp.mailfrom=shakeel.butt@linux.dev; dmarc=pass (policy=none) header.from=linux.dev ARC-Seal: i=1; s=arc-20220608; d=hostedemail.com; t=1734723934; a=rsa-sha256; cv=none; b=5ZI+JSQt6i7FwkuxQSlUH2zlH3zA0DD4MpOHLo3jB3+3dpX+69zeoi2lnWjY+1furgTTaI mAGg1m50uNFEzL+zHED20kJVzNGF53rbBBKHNR0dVfjqfMhHK+uDfXgZDOaPfIwHPMw2Ns d0KSWWEbJDpm4t+sthN6G+mZsMwiga4= Date: Fri, 20 Dec 2024 11:45:47 -0800 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linux.dev; s=key1; t=1734723953; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version:content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=zmhkYguEhBw74e/yDb6B2nnf/MGyGWg0Z2UpSuQG8Kc=; b=wTFs2ICpGBlStxOTOUroqnmX1Y53plE0WbmZD+qWiyaxGVfmD8NuaozVoxz1LPg9Kby4qS WRHyv7u8X1wJLh3NN4Ol6pgf8J2icXmn0Vbu/hScw2Vq2bLhxd+6V48fyZr6GJkC7gotZ1 XSFP+ILPmPDUzCd9IaUZ4fPuJZeFM2g= X-Report-Abuse: Please report any abuse attempt to abuse@migadu.com and include these headers. From: Shakeel Butt To: Alexei Starovoitov Cc: Michal Hocko , bpf , Andrii Nakryiko , Kumar Kartikeya Dwivedi , Andrew Morton , Peter Zijlstra , Vlastimil Babka , Sebastian Sewior , Steven Rostedt , Hou Tao , Johannes Weiner , Matthew Wilcox , Thomas Gleixner , Jann Horn , Tejun Heo , linux-mm , Kernel Team Subject: Re: [PATCH bpf-next v3 4/6] memcg: Use trylock to access memcg stock_lock. Message-ID: <7s6fbpwsynadnzybhdqg3jwhls4pq2sptyxuyghxpaufhissj5@iadb6ibzscjj> References: <20241218030720.1602449-1-alexei.starovoitov@gmail.com> <20241218030720.1602449-5-alexei.starovoitov@gmail.com> MIME-Version: 1.0 Content-Type: text/plain; charset=utf-8 Content-Disposition: inline Content-Transfer-Encoding: 8bit In-Reply-To: X-Migadu-Flow: FLOW_OUT X-Rspamd-Server: rspam05 X-Stat-Signature: t6gowncipya9xigjdf3noe7nuh9n38be X-Rspamd-Queue-Id: 4015C1C001A X-Rspam-User: X-HE-Tag: 1734723921-872999 X-HE-Meta: U2FsdGVkX1/XmDona8LY5thlj2o7jElMA1QrGsa3bhNqE9CoXuYrQKFsgUarULAEjcdftvNSU0kSu+VKk2VfsFtcBCes6uApGGWnJZWfeBgg0PHHeA2UE2b4/Vjh+jWdyDFWJHRPU9/92EfmgmynCYEngiPK9axK8SnfD6txFpTRePSzqbd3bKgVKMsrLqaXTuO5dY1IIfPfFnUGP2D6SsEjjm8+7ny+fBMELlPegYd41aTqCzwu+eSiPsTl+WdnxA3vN4DPYRLLO5MIlI8wIpf+nK38P1saklD+j/VY5i7/lXD3xa/11aSY+TrLTH66fAyL5hd9u5jYjbTTEhYF/4gYHdLRflrmEEgyxKmN9X8BrCDd4xMklkVAhLGD7ZQ+g7/e0tEseW0KQuz7YEnCpEEi87rlhW5palvGfPWmBFx7QLQJqHFvalL5ozx6EDB3LGSnxgYB/eDMvq7Fe17QC+rVwhxtyqSxrB9lMERjcv78lbKGKLqwXwgPMT2l5JXqZlyOQ8KisJWl7GCwXUjhcFcnE7FX8PLYor0KnqaiLxVXjWGPysD0dAwkXg0KBg27lFvUvmuG7SPcm6HGil9ouQfpFImvwrNaXer5kSXV3zrv8vSYln1hFJCR+3GRyapr8xDEi1tgQrY26WhhbXbfhiksAdOt3XqikEF/t3WMKsdBXCWRqnzGVyXJsYgmW5wIPdFWazY+3rn04p9aQw5q43MZPZd9EEQnKdtX4hJ53ZhSzxjvbYaLKbNKNl7JA/gPjAJpFwfZIzIcCTC5sEoOKI6UZUUbVwRGenlYBVHa+QRWW9wFMSY8yoYgTHtwlC7oVsPZULJzTHkf487yU2LTVYto5c0zh4piimjwUr2YH0NvP6OrC7FbrblaBGstMsuYnwoXI2BDvel9njXOcAYnAAHxSPr7LXOzVUMSxUiMj1UY6r6aMnz/FY/T/htbJ38fZ3SGowWkOSLf15J3BkB +cnmj54m VjG5GOAoq1QP7eMYVQD74ZZwMNc29H5HzS2J9zMOaPDDKmpjdqhj7Dnm9ZNQBKVeMjCE6iasQZap3Q76k4e9WQCM29RhrAMnFoUIAnlzRYP/zBfo8x2pyWFl0FO1E7E1Vxsw05g7Dd/p49pM3W0L3HBoWO6F8iipRSlJEG43XYbkfq54bce+60VT6pzML8X2LiLgmyxiic7ymcBNv/1OVY7CNwgvZ6UCm71U1SEzaFB6zKwFQ3q3qqSupnY5789Xg+7xkDWMqeF1LCD87vYrfT+A7WTFw0ESXn7SvmvnWMbSD8xSW3tnmwhRJ2gQwkxHnHslB1T4BGW6PN3RVvoazI444qyaFYuOrlQIb X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: List-Subscribe: List-Unsubscribe: On Fri, Dec 20, 2024 at 08:10:47AM -0800, Alexei Starovoitov wrote: > On Fri, Dec 20, 2024 at 12:24 AM Michal Hocko wrote: > > > > > +static inline bool gfpflags_allow_spinning(const gfp_t gfp_flags) > > > +{ > > > + /* > > > + * !__GFP_DIRECT_RECLAIM -> direct claim is not allowed. > > > + * !__GFP_KSWAPD_RECLAIM -> it's not safe to wake up kswapd. > > > + * All GFP_* flags including GFP_NOWAIT use one or both flags. > > > + * try_alloc_pages() is the only API that doesn't specify either flag. > > > > I wouldn't be surprised if we had other allocations like that. git grep > > is generally not very helpful as many/most allocations use gfp argument > > of a sort. I would slightly reword this to be more explicit. > > /* > > * This is stronger than GFP_NOWAIT or GFP_ATOMIC because > > * those are guaranteed to never block on a sleeping lock. > > * Here we are enforcing that the allaaction doesn't ever spin > > * on any locks (i.e. only trylocks). There is no highlevel > > * GFP_$FOO flag for this use try_alloc_pages as the > > * regular page allocator doesn't fully support this > > * allocation mode. > > Makes sense. I like this new wording. Will incorporate. > > > > + */ > > > + return !(gfp_flags & __GFP_RECLAIM); > > > +} > > > + > > > #ifdef CONFIG_HIGHMEM > > > #define OPT_ZONE_HIGHMEM ZONE_HIGHMEM > > > #else > > > diff --git a/mm/memcontrol.c b/mm/memcontrol.c > > > index f168d223375f..545d345c22de 100644 > > > --- a/mm/memcontrol.c > > > +++ b/mm/memcontrol.c > > > @@ -1768,7 +1768,7 @@ static bool consume_stock(struct mem_cgroup > > > *memcg, unsigned int nr_pages, > > > return ret; > > > > > > if (!local_trylock_irqsave(&memcg_stock.stock_lock, flags)) { > > > - if (gfp_mask & __GFP_TRYLOCK) > > > + if (!gfpflags_allow_spinning(gfp_mask)) > > > return ret; > > > local_lock_irqsave(&memcg_stock.stock_lock, flags); > > > } > > > > > > If that's acceptable then such an approach will work for > > > my slub.c reentrance changes too. > > > > It certainly is acceptable for me. > > Great. > > > Do not forget to add another hunk to > > avoid charging the full batch in this case. > > Well. It looks like you spotted the existing bug ? > > Instead of > + if (!gfpflags_allow_blockingk(gfp_mask)) > + batch = nr_pages; > > it should be unconditional: > > + batch = nr_pages; > > after consume_stock() returns false. > > Consider: > stock_pages = READ_ONCE(stock->nr_pages); > if (memcg == READ_ONCE(stock->cached) && stock_pages >= nr_pages) { > > stock_pages == 10 > nr_pages == 20 > > so after consume_stock() returns false > the batch will stay == MEMCG_CHARGE_BATCH == 64 > and > page_counter_try_charge(&memcg->memsw, batch,... > > will charge too much ? > > and the bug was there for a long time. > > Johaness, > > looks like it's mostly your code? > > Pls help us out. I think the code is fine as the overcharge amount will be refilled into the stock (old one will be flushed). if (gfpflags_allow_spinning(gfp_mask)) batch = nr_pages; The above code will just avoid the refill and flushing the older stock. Maybe Michal's suggestion is due to that reason. BTW after the done_restock tag in try_charge_memcg(), we will another gfpflags_allow_spinning() check to avoid schedule_work() and mem_cgroup_handle_over_high(). Maybe simply return early for gfpflags_allow_spinning() without checking high marks.