From: Yu Zhao <yuzhao@google.com>
Date: Thu, 6 Jun 2024 11:55:00 -0600
Subject: Re: kswapd0: page allocation failure: order:0, mode:0x820(GFP_ATOMIC), nodemask=(null),cpuset=/,mems_allowed=0 (Kernel v6.5.9, 32bit ppc)
To: Yosry Ahmed, Takero Funaki
Cc: Erhard Furtner, linux-mm@kvack.org, linux-kernel@vger.kernel.org, linuxppc-dev@lists.ozlabs.org, Johannes Weiner, Nhat Pham, Chengming Zhou, Sergey Senozhatsky, Minchan Kim, "Vlastimil Babka (SUSE)"
References: <20240508202111.768b7a4d@yea> <20240515224524.1c8befbe@yea> <20240602200332.3e531ff1@yea> <20240604001304.5420284f@yea> <20240604134458.3ae4396a@yea> <20240604231019.18e2f373@yea> <20240606010431.2b33318c@yea>
On Thu, Jun 6, 2024 at 11:42 AM Yosry Ahmed wrote:
>
> On Thu, Jun 6, 2024 at 10:14 AM Takero Funaki wrote:
> >
> > On Thu, Jun 6, 2024 at 8:42 Yosry Ahmed wrote:
> >
> > > I think there are multiple ways to go forward here:
> > >
> > > (a) Make the number of zpools a config option, leave the default as
> > > 32, but allow special use cases to set it to 1 or similar.
> > > This is probably not preferable because it is not clear to users how
> > > to set it, but the idea is that no one will have to set it except
> > > special use cases such as Erhard's (who will want to set it to 1 in
> > > this case).
> > >
> > > (b) Make the number of zpools scale linearly with the number of CPUs,
> > > maybe something like nr_cpus/4 or nr_cpus/8. The problem with this
> > > approach is that with a large number of CPUs, too many zpools will
> > > start having diminishing returns: fragmentation will keep increasing,
> > > while the scalability/concurrency gains will diminish.
> > >
> > > (c) Make the number of zpools scale logarithmically with the number
> > > of CPUs, maybe something like 4*log2(nr_cpus). This will keep the
> > > number of zpools from increasing too much, staying close to the
> > > status quo. The problem is that at a small number of CPUs (e.g. 2),
> > > 4*log2(nr_cpus) will actually give nr_zpools > nr_cpus, so we would
> > > need to come up with a fancier equation (e.g. 4*log2(nr_cpus/4)).
> >
> > I just posted a patch to limit the number of zpools, with some
> > theoretical background explained in the code comments. I believe that
> > scaling linearly at 2 zpools per CPU is sufficient to reduce
> > contention, but the scale could be reduced further; it is unlikely
> > that all CPUs will be allocating/freeing zswap entries at the same
> > time. How many concurrent accesses were the original 32 zpools
> > supposed to handle? I think it was for 16 CPUs or more; perhaps
> > nr_cpus/4 would be enough?
>
> We use 32 zpools on machines with 100s of CPUs. Two zpools per CPU is
> overkill imo.

Not to choose a camp; just a friendly note on why I strongly disagree
with the N zpools per CPU approach:

1. It is fundamentally flawed to assume the system scales linearly.
2. Nonlinear systems usually have diminishing returns.

For Google data centers, using nr_cpus as the scaling factor had long
passed the acceptable ROI threshold.
Per-CPU data, especially when compounded per memcg or even per process,
is probably the number-one overhead in terms of DRAM efficiency.

> I have further comments that I will leave on the patch, but I mainly
> think this should be driven by real data, not by the theoretical
> possibility of lock contention.