Date: Thu, 20 Feb 2025 18:36:14 -0800
From: Dennis Zhou
To: Michal Hocko
Cc: Tejun Heo, Filipe Manana, Andrew Morton, linux-mm@kvack.org, LKML
Subject: Re: [PATCH] mm, percpu: do not consider sleepable allocations atomic
References: <20250206122633.167896-1-mhocko@kernel.org>

On Fri, Feb 14, 2025 at 04:52:42PM +0100, Michal Hocko wrote:
> On Wed 12-02-25 13:39:31, Dennis Zhou wrote:
> > Hello,
> >
> > On Wed, Feb 12, 2025 at 11:30:08AM -1000, Tejun Heo wrote:
> > > Hello,
> > >
> > > On Wed, Feb 12, 2025 at 09:53:20PM +0100, Michal Hocko wrote:
> > > ...
> > > > > Hmm... you'd a better judge on whether that'd be okay or not but it does
> > > > > bother me that we might be increasing the chance of allocation failures for
> > > > > GFP_KERNEL users at least under memory pressure.
> > > >
> > > > Nope, this will not change the allocation failure mode. Reclaim
> > > > constrains do not change the failure mode they just change how much the
> > > > allocation might struggle to reclaim to succeed.
> > > >
> > > > My undocumented assumption (another dept on my end) is that pcp
> > > > allocations are no hot paths. So the worst case is that GFP_KERNEL
> > > > pcp_allocation could have been satisfied _easier_ (i.e. faster) because
> > > > it could have reclaimed fs/io caches and now it needs to rely on kswapd
> > > > to do that on memory tight situations. On the other hand we have a
> > > > situation when NOIO/FS allocations fail prematurely so there is
> > > > certainly some pros and cons.
> > >
> > > I'm having a hard time following. Are you saying that it won't increase the
> > > likelihood of allocation failures even under memory pressure but that it
> > > might just make allocations take longer to succeed?
> > >
> > > NOFS/IO prevents allocation attempt from entering fs/io reclaim paths,
> > > right? It would still trigger kswapd for reclaim but can the allocation
> > > attempt wait for that to finish? If so, wouldn't that constitute a
> > > dependency cycle all the same?
> > >
> > > All in all, percpu allocations taking longer under memory pressure is fine.
> > > Becoming more prone to allocation failures, especially for GFP_KERNEL
> > > callers, probably isn't great.
> > >
> >
> > Wait, I think I'm interpreting this change differently. This is
> > preventing the worker from allocating backing pages via GFP_KERNEL. It
> > isn't preventing an allocation via alloc_percpu() from being GFP_KERNEL
> > and providing those flags down to the backing page code.
> > alloc_percpu() for GFP_KERNEL allocations will populate the pages
> > before returning.
>
> Correct.
>
> > I'm reading this as potentially making atomic percpu allocations fail as
> > we might be low on backing pages. This change makes the worker now need
> > to wait for kswapd to give it pages. Consequently, if there are a lot of
> > allocations coming in when it's low, we might burn a bit of cpu from the
> > worker now.
>
> Yes, this is potential side effect. On the other hand NOFS/NOIO requests
> wouldn't be considered atomic anymore and they wouldn't fail that
> easily. Maybe that is an odd case not worth the additional worker
> overhead. As I've said I am not familiar with the pcp internals to know
> how often the worker is really required
>

I've thought about this in the back of my head for the past few weeks. I
think I have 2 questions about this change.

1. Back to what TJ said earlier about probing. I feel like GFP_KERNEL
   allocations should be okay because that more or less is control plane
   time? I'm not sure dropping PR_SET_IO_FLUSHER is all that big of a
   workaround?

2. This change breaks the feedback loop as we discussed above.
   Historically we've targeted 2-4 free pages worth of percpu memory.
   This is done by kicking the percpu work off. That does GFP_KERNEL
   allocations and if that requires reclaim then it goes and does it.
   However, now we're saying kswapd is going to work in parallel while
   we try to get pages in the worker thread.

   Given you're more versed in the reclaim side, I presume it must be
   pretty bad if we're failing to get order-0 pages even if we have
   NOFS/NOIO set?

My feeling is that we should add back some knowledge of the dependency
so that if the worker fails to get pages, it doesn't reschedule
immediately. Maybe it's as simple as adding a sleep in the worker or
playing with delayed work...

Thanks,
Dennis
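
For illustration only, and not taken from any posted patch: the "don't
reschedule immediately" idea above can be modeled as a small backoff
policy in userspace C. The function name and the 10 ms / 1000 ms bounds
are made up for this sketch. In the kernel the returned delay would
presumably be handed to something like schedule_delayed_work() via
msecs_to_jiffies(), but how that would be wired into the percpu worker
is an open question here.

```c
#include <assert.h>

/* Hypothetical bounds, chosen only for this illustration. */
#define REFILL_DELAY_MIN_MS 10u
#define REFILL_DELAY_MAX_MS 1000u

/*
 * Decide how long the (hypothetical) refill worker should wait before
 * its next attempt to get backing pages.
 *
 * prev_delay_ms: delay used before the attempt that just finished
 *                (0 if it ran without any delay)
 * got_pages:     non-zero if that attempt obtained the pages it wanted
 *
 * Returns 0 when the worker may be requeued immediately, otherwise a
 * delay in milliseconds that doubles on each consecutive failure and
 * is capped at REFILL_DELAY_MAX_MS.
 */
static unsigned int next_refill_delay_ms(unsigned int prev_delay_ms,
					 int got_pages)
{
	assert(prev_delay_ms <= REFILL_DELAY_MAX_MS);

	if (got_pages)
		return 0;			/* success: no backoff needed */
	if (prev_delay_ms == 0)
		return REFILL_DELAY_MIN_MS;	/* first failure */
	if (prev_delay_ms >= REFILL_DELAY_MAX_MS / 2)
		return REFILL_DELAY_MAX_MS;	/* cap the backoff */
	return prev_delay_ms * 2;		/* repeated failure: back off */
}
```

The point of the sketch is just that a failed attempt gives kswapd some
time to make progress before the worker tries again, while a successful
attempt resets the delay so the usual refill behaviour is unchanged.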