Date: Thu, 13 Mar 2025 09:02:50 -0700
From: Shakeel Butt
To: Vlastimil Babka
Cc: Michal Hocko, Alexei Starovoitov, Andrew Morton, bpf,
 Andrii Nakryiko, Kumar Kartikeya Dwivedi, Peter Zijlstra,
 Sebastian Sewior, Steven Rostedt, Hou Tao, Johannes Weiner,
 Matthew Wilcox, Thomas Gleixner, Jann Horn, Tejun Heo, linux-mm,
 Kernel Team
Subject: Re: [PATCH bpf-next v9 2/6] mm, bpf: Introduce try_alloc_pages() for opportunistic page allocation
References: <20250222024427.30294-1-alexei.starovoitov@gmail.com>
 <20250222024427.30294-3-alexei.starovoitov@gmail.com>
 <20250310190427.32ce3ba9adb3771198fe2a5c@linux-foundation.org>
 <4d75c5a8-a538-4d7d-aaf4-8ecf1d1be6b9@suse.cz>
 <4a52db5b-f5fe-4a60-ba17-a634a2d0b7af@suse.cz>
In-Reply-To: <4a52db5b-f5fe-4a60-ba17-a634a2d0b7af@suse.cz>

On Thu, Mar 13, 2025 at 03:21:48PM +0100, Vlastimil Babka wrote:
> On 3/13/25 09:44, Michal Hocko wrote:
> > On Wed 12-03-25 12:06:10, Shakeel Butt wrote:
> >> On Wed, Mar 12, 2025 at 11:00:20AM +0100, Vlastimil Babka wrote:
> >> [...]
> >> >
> >> > But if we can achieve the same without such reserved objects, I think it's
> >> > even better. Performance and maintainability doesn't need to necessarily
> >> > suffer. Maybe it can even improve in the process. E.g. if we build upon
> >> > patches 1+4 and switch memcg stock locking to the non-irqsave variant, we
> >> > should avoid some overhead there (something similar was tried there in the
> >> > past but reverted when making it RT compatible).
> >>
> >> In hindsight that revert was the bad decision.
> >> We accepted so much
> >> complexity in memcg code for RT without questioning about a real world
> >> use-case. Are there really RT users who want memcg or are using memcg? I
> >> can not think of some RT user fine with memcg limits enforcement
> >> (reclaim and throttling).
> >
> > I do not think that there is any reasonable RT workload that would use
> > memcg limits or other memcg features. On the other hand it is not
> > unusual to have RT and non-RT workloads mixed on the same machine. They
> > usually use some sort of CPU isolation to prevent from CPU contention
> > but that doesn't help much if there are other resources they need to
> > contend for (like shared locks).
> >
> >> I am on the path to bypass per-cpu memcg stocks for RT kernels.
> >
> > That would cause regressions for non-RT tasks running on PREEMPT_RT
> > kernels, right?

I would say it would be more predictable for RT-kernel users, but
anyway I am in the process of prototyping this and will share what it
looks like.

>
> For the context, this is about commit 559271146efc ("mm/memcg: optimize user
> context object stock access")
>
> reverted in fead2b869764 ("mm/memcg: revert ("mm/memcg: optimize user
> context object stock access")")
>
> I think at this point we don't have to recreate the full approach of the
> first commit and introduce separate in_task() and in-interrupt stocks again.
>
> The localtry_lock itself should make it possible to avoid the
> irqsave/restore overhead (which was the main performance benefit of
> 559271146efc [1]) and only end up bypassing the stock when an allocation
> from irq context actually interrupts an allocation from task context - which
> would be very rare. And it should be already RT compatible. Let me see how
> hard it would be on top of patch 4/6 "memcg: Use trylock to access memcg
> stock_lock" to switch to the variant without _irqsave...

I am already changing code in this area, so I will give this idea a try
as well.

>
> [1] the revert cites benchmarks that irqsave/restore can be actually cheaper
> than preempt disable/enable, but I believe those were flawed
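
To make the trylock idea discussed above concrete, here is a minimal
user-space sketch of it. It is only a model under stated assumptions,
not the memcg code and not the localtry_lock API: struct stock_model,
stock_trylock(), stock_unlock(), consume_stock_model() and
irq_during_task_consume() are all hypothetical names, and a plain bool
stands in for the per-CPU lock. The point it shows is that the lock is
taken without saving or restoring IRQ state, and that a consumer which
finds the lock already held on its own CPU (the "IRQ interrupted task
context" case) simply bypasses the stock instead of waiting.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical user-space model of one CPU's memcg stock; not kernel code. */
struct stock_model {
	bool locked;           /* stands in for the per-CPU lock */
	unsigned int cached;   /* pages already charged and cached in the stock */
	unsigned int bypassed; /* consume attempts that skipped the stock */
};

/* Trylock: never spins and never touches IRQ state. */
static bool stock_trylock(struct stock_model *s)
{
	if (s->locked)
		return false; /* we interrupted the holder on this CPU */
	s->locked = true;
	return true;
}

static void stock_unlock(struct stock_model *s)
{
	assert(s->locked);
	s->locked = false;
}

/*
 * Returns true if nr_pages were served from the stock. Returns false if
 * the caller has to fall back to the slow path, either because the lock
 * was busy (bypass) or because the stock did not hold enough pages.
 */
static bool consume_stock_model(struct stock_model *s, unsigned int nr_pages)
{
	bool served = false;

	if (!stock_trylock(s)) {
		s->bypassed++;
		return false;
	}
	if (s->cached >= nr_pages) {
		s->cached -= nr_pages;
		served = true;
	}
	stock_unlock(s);
	return served;
}

/*
 * Models an allocation from irq context arriving while task context
 * holds the lock. Returns whether the "irq" consumer was served.
 */
static bool irq_during_task_consume(struct stock_model *s,
				    unsigned int nr_pages)
{
	bool irq_served;

	if (!stock_trylock(s)) /* task context takes the lock */
		return false;
	irq_served = consume_stock_model(s, nr_pages); /* "irq" fires here */
	stock_unlock(s);
	return irq_served;
}
```

In this model an uncontended consume is served from the stock, while
the nested "irq" consume is counted in bypassed and leaves cached
untouched, which is the rare case the quoted text expects.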