From mboxrd@z Thu Jan 1 00:00:00 1970
Date: Mon, 5 May 2025 10:13:37 -0700
From: Shakeel Butt <shakeel.butt@linux.dev>
To: Vlastimil Babka
Cc: Shakeel Butt, Alexei Starovoitov, Andrew Morton, Johannes Weiner,
	Michal Hocko, Roman Gushchin, Muchun Song, linux-mm,
	"open list:CONTROL GROUP (CGROUP)", bpf, LKML, Meta kernel team,
	Sebastian Andrzej Siewior, Eric Dumazet, Jakub Kicinski,
	Paolo Abeni, "David S. Miller", netdev@vger.kernel.org
Subject: Re: [PATCH v2 3/3] memcg: no irq disable for memcg stock lock
Message-ID:
References: <20250502001742.3087558-1-shakeel.butt@linux.dev>
	<20250502001742.3087558-4-shakeel.butt@linux.dev>
	<81a2e692-dd10-4253-afbc-062e0be67ca4@suse.cz>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To: <81a2e692-dd10-4253-afbc-062e0be67ca4@suse.cz>

Ccing networking folks.
Background: https://lore.kernel.org/dvyyqubghf67b3qsuoreegqk4qnuuqfkk7plpfhhrck5yeeuic@xbn4c6c7yc42/

On Mon, May 05, 2025 at 12:28:43PM +0200, Vlastimil Babka wrote:
> On 5/3/25 01:03, Shakeel Butt wrote:
> >> > index cd81c70d144b..f8b9c7aa6771 100644
> >> > --- a/mm/memcontrol.c
> >> > +++ b/mm/memcontrol.c
> >> > @@ -1858,7 +1858,6 @@ static bool consume_stock(struct mem_cgroup *memcg, unsigned int nr_pages,
> >> >  {
> >> >  	struct memcg_stock_pcp *stock;
> >> >  	uint8_t stock_pages;
> >> > -	unsigned long flags;
> >> >  	bool ret = false;
> >> >  	int i;
> >> >
> >> > @@ -1866,8 +1865,8 @@ static bool consume_stock(struct mem_cgroup *memcg, unsigned int nr_pages,
> >> >  		return ret;
> >> >
> >> >  	if (gfpflags_allow_spinning(gfp_mask))
> >> > -		local_lock_irqsave(&memcg_stock.lock, flags);
> >> > -	else if (!local_trylock_irqsave(&memcg_stock.lock, flags))
> >> > +		local_lock(&memcg_stock.lock);
> >> > +	else if (!local_trylock(&memcg_stock.lock))
> >> >  		return ret;
> >>
> >> I don't think it works.
> >> When there is a normal irq and something doing regular GFP_NOWAIT
> >> allocation gfpflags_allow_spinning() will be true and
> >> local_lock() will reenter and complain that lock->acquired is
> >> already set... but only with lockdep on.
> >
> > Yes indeed. I dropped the first patch and didn't fix this one
> > accordingly. I think the fix can be as simple as checking for
> > in_task() here instead of gfp_mask. That should work for both RT and
> > non-RT kernels.
>
> These in_task() checks seem hacky to me. I think the patch 1 in v1 was the
> correct way how to use the local_trylock() to avoid these.
>
> As for the RT concerns, AFAIK RT isn't about being fast, but about being
> preemptible, and the v1 approach didn't violate that - taking the slowpaths
> more often shouldn't be an issue.
>
> Let me quote Shakeel's scenario from the v1 thread:
>
> > I didn't really think too much about PREEMPT_RT kernels as I assume
> > performance is not top priority but I think I get your point. Let me
>
> Agreed.
>
> > explain and correct me if I am wrong. On PREEMPT_RT kernel, the local
> > lock is a spin lock which is actually a mutex but with priority
> > inheritance. A task having the local lock can still get context switched
>
> Let's say (seems implied already) this is a low prio task.
>
> > (but will remain on same CPU run queue) and the newer task can try to
>
> And this is a high prio task.
>
> > acquire the memcg stock local lock. If we just do trylock, it will
> > always go to the slow path but if we do local_lock() then it will sleeps
> > and possibly gives its priority to the task owning the lock and possibly
> > make that task to get the CPU. Later the task slept on memcg stock lock
> > will wake up and go through fast path.
>
> I think from RT latency perspective it could very much be better for the
> high prio task just skip the fast path and go for the slowpath, instead of
> going to sleep while boosting the low prio task to let the high prio task
> use the fast path later. It's not really a fast path anymore I'd say.

Thanks Vlastimil, this is actually a very good point. The slow path of
memcg charging is a couple of atomic operations, while the alternative
here is at least two context switches (and possibly scheduler delay), so
it does not really look like a fast path anymore. I have cc'ed the
networking folks to get their take as well. Orthogonally, I will do some
netperf benchmarking on v1 with an RT kernel.