From: Yosry Ahmed
Date: Wed, 5 Jun 2024 16:41:31 -0700
Subject: Re: kswapd0: page allocation failure: order:0, mode:0x820(GFP_ATOMIC), nodemask=(null),cpuset=/,mems_allowed=0 (Kernel v6.5.9, 32bit ppc)
To: Erhard Furtner
Cc: Yu Zhao, linux-mm@kvack.org, linux-kernel@vger.kernel.org, linuxppc-dev@lists.ozlabs.org, Johannes Weiner, Nhat Pham, Chengming Zhou, Sergey Senozhatsky, Minchan Kim, "Vlastimil Babka (SUSE)"
In-Reply-To: <20240606010431.2b33318c@yea>
References: <20240508202111.768b7a4d@yea> <20240515224524.1c8befbe@yea> <20240602200332.3e531ff1@yea> <20240604001304.5420284f@yea> <20240604134458.3ae4396a@yea> <20240604231019.18e2f373@yea> <20240606010431.2b33318c@yea>

On Wed, Jun 5, 2024 at 4:04 PM Erhard Furtner wrote:
>
> On Tue, 4 Jun 2024 20:03:27 -0700
> Yosry Ahmed wrote:
>
> > Could you check if the attached patch helps? It basically changes the
> > number of zpools from 32 to min(32, nr_cpus).
>
> Thanks! The patch does not fix the issue but it helps.
>
> Means I still get to see the 'kswapd0: page allocation failure' in the dmesg, a 'stress-ng-vm: page allocation failure' later on, another kswapd0 error later on, etc. _but_ the machine keeps running the workload, stays usable via VNC and I get no hard crash any longer.
>
> Without patch kswapd0 error and hard crash (need to power-cycle) <3min. With patch several kswapd0 errors but running for 2 hrs now. I double checked this to be sure.

Thanks for trying this out. This is interesting: even two zpools cause
too much fragmentation for your use case.

I think there are multiple ways to go forward here:

(a) Make the number of zpools a config option, leave the default as
32, but allow special use cases to set it to 1 or similar. This is
probably not preferable because it is not clear to users how to set
it, but the idea is that no one will have to set it except special use
cases such as Erhard's (who will want to set it to 1 in this case).

(b) Make the number of zpools scale linearly with the number of CPUs.
Maybe something like nr_cpus/4 or nr_cpus/8. The problem with this
approach is that with a large number of CPUs, adding more zpools gives
diminishing returns. Fragmentation will keep increasing, while the
scalability/concurrency gains will diminish.

(c) Make the number of zpools scale logarithmically with the number of
CPUs. Maybe something like 4 * log2(nr_cpus). This keeps the number of
zpools from growing too much and stays close to the status quo. The
problem is that at a small number of CPUs (e.g. 2), 4 * log2(nr_cpus)
will actually give nr_zpools > nr_cpus. So we will need to come up
with a fancier magic equation (e.g. 4 * log2(nr_cpus/4)).

(d) Make the number of zpools scale linearly with memory. This makes
more sense than scaling with CPUs because increasing the number of
zpools increases fragmentation, so it makes sense to limit it by the
available memory. This is also more consistent with other magic
numbers we have (e.g. SWAP_ADDRESS_SPACE_SHIFT).

The problem is that unlike zswap trees, the zswap pool is not
connected to the swapfile size, so we don't have an indication of how
much memory will be in the zswap pool. We can scale the number of
zpools with the entire memory on the machine during boot, but this
seems like it would be difficult to figure out, and it would not take
into account memory hotplugging or changes to the zswap global limit.

(e) A creative mix of the above.

(f) Something else (probably simpler).

I am personally leaning toward (c), but I want to hear the opinions of
other people here. Yu, Vlastimil, Johannes, Nhat? Anyone else?

In the long term, I think we may want to address the lock contention
in zsmalloc itself instead of having zswap spawn multiple zpools.

>
> The patch did not apply cleanly on v6.9.3 so I applied it on v6.10-rc2. dmesg of the current v6.10-rc2 run attached.
>
> Regards,
> Erhard
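
For comparison, here is a rough userspace sketch (Python, not kernel code) of how the CPU-based formulas from (b) and (c) behave next to the tested min(32, nr_cpus) patch. The function names, the floor of 1 zpool, and the use of an integer floor log2 (what the kernel's ilog2() would give) are assumptions made for illustration; none of them were proposed in the thread.

```python
MAX_ZPOOLS = 32  # the current fixed number of zpools mentioned above


def ilog2(n):
    """Floor of log2(n) for a positive integer (assumed rounding)."""
    return n.bit_length() - 1


def zpools_patch(nr_cpus):
    """The tested patch: min(32, nr_cpus)."""
    return min(MAX_ZPOOLS, nr_cpus)


def zpools_linear(nr_cpus, divisor=4):
    """Option (b): nr_cpus / divisor. The floor of 1 is an assumption."""
    return max(1, nr_cpus // divisor)


def zpools_log(nr_cpus, factor=4):
    """Option (c): factor * log2(nr_cpus). The floor of 1 is an assumption."""
    return max(1, factor * ilog2(nr_cpus))


def zpools_log_shifted(nr_cpus, factor=4):
    """Option (c) variant: factor * log2(nr_cpus / 4).

    log2(n / 4) == log2(n) - 2, which is zero or negative for
    nr_cpus <= 4, so the result is floored at 1 (an assumption).
    """
    return max(1, factor * (ilog2(nr_cpus) - 2))


if __name__ == "__main__":
    print(f"{'cpus':>5} {'patch':>6} {'linear':>7} {'log':>4} {'log/4':>6}")
    for cpus in (1, 2, 4, 8, 16, 64, 256, 1024):
        print(f"{cpus:>5} {zpools_patch(cpus):>6} {zpools_linear(cpus):>7} "
              f"{zpools_log(cpus):>4} {zpools_log_shifted(cpus):>6}")
```

With 2 CPUs the plain logarithmic form yields 4 zpools, which is the nr_zpools > nr_cpus case described in (c), while the shifted form needs a lower bound because it goes to zero or below for 4 CPUs or fewer.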