From: Yosry Ahmed <yosryahmed@google.com>
Date: Thu, 6 Jun 2024 09:42:21 -0700
Subject: Re: kswapd0: page allocation failure: order:0, mode:0x820(GFP_ATOMIC), nodemask=(null),cpuset=/,mems_allowed=0 (Kernel v6.5.9, 32bit ppc)
To: Erhard Furtner
Cc: Yu Zhao, linux-mm@kvack.org, linux-kernel@vger.kernel.org, linuxppc-dev@lists.ozlabs.org, Johannes Weiner, Nhat Pham, Chengming Zhou, Sergey Senozhatsky, Minchan Kim, "Vlastimil Babka (SUSE)"
In-Reply-To: <20240606152802.28a38817@yea>
References: <20240508202111.768b7a4d@yea> <20240602200332.3e531ff1@yea> <20240604001304.5420284f@yea> <20240604134458.3ae4396a@yea> <20240604231019.18e2f373@yea> <20240606010431.2b33318c@yea> <20240606152802.28a38817@yea>
On Thu, Jun 6, 2024 at 6:28 AM Erhard Furtner wrote:
>
> On Wed, 5 Jun 2024 16:58:11 -0700
> Yosry Ahmed wrote:
>
> > On Wed, Jun 5, 2024 at 4:53 PM Yu Zhao wrote:
> > >
> > > On Wed, Jun 5, 2024 at 5:42 PM Yosry Ahmed wrote:
> > > >
> > > > On Wed, Jun 5, 2024 at 4:04 PM Erhard Furtner wrote:
> > > > >
> > > > > On Tue, 4 Jun 2024 20:03:27 -0700
> > > > > Yosry Ahmed wrote:
> > > > >
> > > > > > Could you check if the attached patch helps? It basically changes the
> > > > > > number of zpools from 32 to min(32, nr_cpus).
> > > > >
> > > > > Thanks! The patch does not fix the issue but it helps.
> > > > >
> > > > > Means I still get to see the 'kswapd0: page allocation failure' in the
> > > > > dmesg, a 'stress-ng-vm: page allocation failure' later on, another
> > > > > kswapd0 error later on, etc. _but_ the machine keeps running the
> > > > > workload, stays usable via VNC and I get no hard crash any longer.
> > > > >
> > > > > Without patch kswapd0 error and hard crash (need to power-cycle) <3min.
> > > > > With patch several kswapd0 errors but running for 2 hrs now. I double
> > > > > checked this to be sure.
> > > >
> > > > Thanks for trying this out. This is interesting, so even two zpools is
> > > > too much fragmentation for your use case.
> > >
> > > Now I'm a little bit skeptical that the problem is due to fragmentation.
> > >
> > > > I think there are multiple ways to go forward here:
> > > >
> > > > (a) Make the number of zpools a config option, leave the default as
> > > > 32, but allow special use cases to set it to 1 or similar. This is
> > > > probably not preferable because it is not clear to users how to set
> > > > it, but the idea is that no one will have to set it except special use
> > > > cases such as Erhard's (who will want to set it to 1 in this case).
> > > >
> > > > (b) Make the number of zpools scale linearly with the number of CPUs.
> > > > Maybe something like nr_cpus/4 or nr_cpus/8. The problem with this
> > > > approach is that with a large number of CPUs, too many zpools will
> > > > start having diminishing returns. Fragmentation will keep increasing,
> > > > while the scalability/concurrency gains will diminish.
> > > >
> > > > (c) Make the number of zpools scale logarithmically with the number of
> > > > CPUs. Maybe something like 4log2(nr_cpus). This will keep the number
> > > > of zpools from increasing too much and close to the status quo.
> > > > The problem is that at a small number of CPUs (e.g. 2), 4log2(nr_cpus)
> > > > will actually give a nr_zpools > nr_cpus. So we will need to come up
> > > > with a more fancy magic equation (e.g. 4log2(nr_cpus/4)).
> > > >
> > > > (d) Make the number of zpools scale linearly with memory. This makes
> > > > more sense than scaling with CPUs because increasing the number of
> > > > zpools increases fragmentation, so it makes sense to limit it by the
> > > > available memory. This is also more consistent with other magic
> > > > numbers we have (e.g. SWAP_ADDRESS_SPACE_SHIFT).
> > > >
> > > > The problem is that unlike zswap trees, the zswap pool is not
> > > > connected to the swapfile size, so we don't have an indication for how
> > > > much memory will be in the zswap pool. We can scale the number of
> > > > zpools with the entire memory on the machine during boot, but this
> > > > seems like it would be difficult to figure out, and will not take into
> > > > consideration memory hotplugging and the zswap global limit changing.
> > > >
> > > > (e) A creative mix of the above.
> > > >
> > > > (f) Something else (probably simpler).
> > > >
> > > > I am personally leaning toward (c), but I want to hear the opinions of
> > > > other people here. Yu, Vlastimil, Johannes, Nhat? Anyone else?
> > >
> > > I double checked that commit and didn't find anything wrong. If we are
> > > all in the mood of getting to the bottom, can we try using only 1
> > > zpool while there are 2 available? I.e.,
> >
> > Erhard, do you mind checking if Yu's diff below to use a single zpool
> > fixes the problem completely? There is also an attached patch that
> > does the same thing if this is easier to apply for you.
>
> No, setting ZSWAP_NR_ZPOOLS to 1 does not fix the problem unfortunately
> (that being the only patch applied on v6.10-rc2).

This confirms Yu's theory that the zpools fragmentation is not the main
reason for the problem.
As Vlastimil said, the setup is already tight on memory and that commit may
have just pushed it over the edge. Since setting ZSWAP_NR_ZPOOLS to 1
(which effectively reverts the commit) does not help in v6.10-rc2,
something else that came after the commit would have pushed it over the
edge anyway.

> Trying to alter the lowmem and virtual mem limits next as Michael suggested.

I saw that this worked. So it seems like we don't need to worry about the
number of zpools, for now at least :)

Thanks for helping with the testing, and thanks to everyone else who
helped on this thread.

> Regards,
> Erhard