From: Yosry Ahmed
Date: Thu, 6 Jun 2024 11:03:13 -0700
Subject: Re: kswapd0: page allocation failure: order:0, mode:0x820(GFP_ATOMIC), nodemask=(null),cpuset=/,mems_allowed=0 (Kernel v6.5.9, 32bit ppc)
To: Yu Zhao
Cc: Takero Funaki, Erhard Furtner, linux-mm@kvack.org,
 linux-kernel@vger.kernel.org, linuxppc-dev@lists.ozlabs.org,
 Johannes Weiner, Nhat Pham, Chengming Zhou, Sergey Senozhatsky,
 Minchan Kim, "Vlastimil Babka (SUSE)"
References: <20240508202111.768b7a4d@yea> <20240515224524.1c8befbe@yea>
 <20240602200332.3e531ff1@yea> <20240604001304.5420284f@yea>
 <20240604134458.3ae4396a@yea> <20240604231019.18e2f373@yea>
 <20240606010431.2b33318c@yea>

On Thu, Jun 6, 2024 at 10:55 AM Yu Zhao wrote:
>
> On Thu, Jun 6, 2024 at 11:42 AM Yosry Ahmed wrote:
> >
> > On Thu, Jun 6, 2024 at 10:14 AM Takero Funaki wrote:
> > >
> > > On Thu, Jun 6, 2024 at 8:42 Yosry Ahmed wrote:
> > >
> > > > I think there are multiple ways to go forward here:
> > > > (a) Make the number of zpools a config option, leave the default as
> > > > 32, but allow special use cases to set it to 1 or similar.
> > > > This is probably not preferable because it is not clear to users how
> > > > to set it, but the idea is that no one will have to set it except
> > > > special use cases such as Erhard's (who will want to set it to 1 in
> > > > this case).
> > > >
> > > > (b) Make the number of zpools scale linearly with the number of CPUs.
> > > > Maybe something like nr_cpus/4 or nr_cpus/8. The problem with this
> > > > approach is that with a large number of CPUs, too many zpools will
> > > > start having diminishing returns. Fragmentation will keep increasing,
> > > > while the scalability/concurrency gains will diminish.
> > > >
> > > > (c) Make the number of zpools scale logarithmically with the number of
> > > > CPUs. Maybe something like 4log2(nr_cpus). This will keep the number
> > > > of zpools from increasing too much and close to the status quo. The
> > > > problem is that at a small number of CPUs (e.g. 2), 4log2(nr_cpus)
> > > > will actually give a nr_zpools > nr_cpus. So we will need to come up
> > > > with a more fancy magic equation (e.g. 4log2(nr_cpus/4)).
> > > >
> > >
> > > I just posted a patch to limit the number of zpools, with some
> > > theoretical background explained in the code comments. I believe that
> > > 2 * CPU linearly is sufficient to reduce contention, but the scale can
> > > be reduced further. All CPUs are trying to allocate/free zswap is
> > > unlikely to happen.
> > > How many concurrent accesses were the original 32 zpools supposed to
> > > handle? I think it was for 16 cpu or more. or nr_cpus/4 would be
> > > enough?
> >
> > We use 32 zpools on machines with 100s of CPUs. Two zpools per CPU is
> > an overkill imo.
>
> Not to choose a camp; just a friendly note on why I strongly disagree
> with the N zpools per CPU approach:
> 1. It is fundamentally flawed to assume the system is linear;
> 2. Nonlinear systems usually have diminishing returns.
>
> For Google data centers, using nr_cpus as the scaling factor had long
> passed the acceptable ROI threshold. Per-CPU data, especially when
> compounded per memcg or even per process, is probably the number-one
> overhead in terms of DRAM efficiency.

100% agreed. If you look at option (b) above, I specifically called
out that scaling the number of zpools linearly with the number of CPUs
has diminishing returns :)
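
As a rough illustration of the scaling options quoted above, the
following Python sketch tabulates how many zpools each formula would
give for a few CPU counts. It is a back-of-the-envelope comparison, not
kernel code: every function name here is hypothetical, and the
integer division, the floor-based log2, and the clamping to at least
one zpool are assumptions of this sketch rather than anything stated in
the thread.

```python
# Hypothetical helpers for comparing the zpool-count formulas discussed
# in this thread. None of these names exist in the kernel. Assumptions:
# integer division, floor(log2), and a minimum of 1 zpool.


def ilog2(n: int) -> int:
    """Floor of log2(n) for n >= 1."""
    if n < 1:
        raise ValueError("n must be >= 1")
    return n.bit_length() - 1


def zpools_fixed(nr_cpus: int) -> int:
    """Status quo mentioned in the thread: a constant 32 zpools."""
    return 32


def zpools_linear(nr_cpus: int, divisor: int = 4) -> int:
    """Option (b): nr_cpus / divisor, clamped to at least 1 (assumed)."""
    return max(1, nr_cpus // divisor)


def zpools_log(nr_cpus: int) -> int:
    """Option (c): 4 * log2(nr_cpus), clamped to at least 1 (assumed)."""
    return max(1, 4 * ilog2(nr_cpus))


def zpools_log_shifted(nr_cpus: int) -> int:
    """Option (c) variant: 4 * log2(nr_cpus / 4), with clamping (assumed)."""
    return max(1, 4 * ilog2(max(1, nr_cpus // 4)))


def zpools_two_per_cpu(nr_cpus: int) -> int:
    """The 2 * CPU linear scaling suggested in the quoted reply."""
    return 2 * nr_cpus


if __name__ == "__main__":
    header = ("cpus", "fixed", "cpus/4", "cpus/8", "4log2", "4log2(/4)", "2*cpus")
    print(" ".join(f"{h:>9}" for h in header))
    for n in (1, 2, 4, 16, 64, 256):
        row = (
            n,
            zpools_fixed(n),
            zpools_linear(n, 4),
            zpools_linear(n, 8),
            zpools_log(n),
            zpools_log_shifted(n),
            zpools_two_per_cpu(n),
        )
        print(" ".join(f"{v:>9}" for v in row))
```

Under these assumptions, 4log2(nr_cpus) gives 4 zpools on a 2-CPU
machine, which is more zpools than CPUs (the problem noted under option
(c)), and reaches 32 at 256 CPUs, while 2 * nr_cpus already reaches 32
at 16 CPUs.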