From mboxrd@z Thu Jan 1 00:00:00 1970
MIME-Version: 1.0
References: <20240524-swap-allocator-v1-0-47861b423b26@kernel.org>
From: Chris Li
Date: Fri, 7 Jun 2024 11:57:05 -0700
Subject: Re: [PATCH 0/2] mm: swap: mTHP swap allocator base on swap cluster order
To: Ryan Roberts
Cc: Barry Song, Andrew Morton, Kairui Song, "Huang, Ying",
 linux-kernel@vger.kernel.org, linux-mm@kvack.org
Content-Type: text/plain; charset="UTF-8"
Sender: owner-linux-mm@kvack.org
Precedence: bulk

On Fri, Jun 7, 2024 at 3:49 AM Ryan Roberts wrote:
>
> On 30/05/2024 08:49, Barry Song wrote:
> > On Wed, May 29, 2024 at 9:04 AM Chris Li wrote:
> >>
> >> I am spinning a new version for this series to address two issues
> >> found in this series:
> >>
> >> 1) Oppo discovered a bug in the following line:
> >> + ci = si->cluster_info + tmp;
> >> Should be "tmp / SWAPFILE_CLUSTER" instead of "tmp".
> >> That is a serious bug but trivial to fix.
> >>
> >> 2) order 0 allocation currently blindly scans swap_map disregarding
> >> the cluster->order. Given enough order 0 swap allocations(close to the
> >> swap file size) the order 0 allocation head will eventually sweep
> >> across the whole swapfile and destroy other cluster order allocations.
> >>
> >> The short term fix is just skipping clusters that are already assigned
> >> to higher orders.
> >>
> >> In the long term, I want to unify the non-SSD to use clusters for
> >> locking and allocations as well, just try to follow the last
> >> allocation (less seeking) as much as possible.
> >
> > Hi Chris,
> >
> > I am sharing some new test results with you. This time, we used two
> > zRAM devices by modifying get_swap_pages().
> >
> > zram0 -> dedicated for order-0 swpout
> > zram1 -> dedicated for order-4 swpout
> >
> > We allocate a generous amount of space for zRAM1 to ensure it never gets full
> > and always has ample free space. However, we found that Ryan's approach
> > does not perform well even in this straightforward scenario. Despite zRAM1
> > having 80% of its space remaining, we still experience issues obtaining
> > contiguous swap slots and encounter a high swpout_fallback ratio.
> >
> > Sorry for the report, Ryan :-)
>
> No problem; clearly it needs to be fixed, and I'll help where I can. I'm pretty
> sure that this is due to fragmentation preventing clusters from being freed back
> to the free list.
>
> >
> > In contrast, with your patch, we consistently see the thp_swpout_fallback ratio
> > at 0%, indicating a significant improvement in the situation.
>
> Unless I've misunderstood something critical, Chris's change is just allowing a
> cpu to steal a block from another cpu's current cluster for that order. So it

No, that is not the main change.
The main change is to allow a CPU to allocate from nonfull, non-empty
clusters that are neither any CPU's current cluster nor on the empty
list.

The current patch does not prevent a CPU from stealing from another
CPU's current cluster for the same order. That will be addressed in V2.

> just takes longer (approx by a factor of the number of cpus in the system) to
> get to the state where fragmentation is causing fallbacks? As I said in the
> other thread, I think the more robust solution is to implement scanning for high
> order blocks.

That will introduce more fragmentation into the high-order clusters and
make it harder to allocate high-order swap entries later.

Please see my previous email for the use case and the goal of the
change:

https://lore.kernel.org/linux-mm/CANeU7QnVzqGKXp9pKDLWiuhqTvBxXupgFCRXejYhshAjw6uDyQ@mail.gmail.com/T/#mf431a743e458896c2ab4a4077b103341313c9cf4

Let's discuss whether the use case and the goal make sense.

Chris
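
For readers following the thread, the two technical points above can be
illustrated with a toy userspace model in C: mapping a slot offset to its
cluster by dividing by SWAPFILE_CLUSTER, and keeping partly used clusters
on a per-order nonfull list so that they are only reused for the same
order. This is a rough sketch, not the code from the patch. The names
toy_cluster, toy_swap, toy_park and toy_pick, the value 512 for
SWAPFILE_CLUSTER, the number of orders, and the order of preference (free
list first, then the nonfull list) are all assumptions made for
illustration.

```c
#include <assert.h>
#include <stddef.h>

#define SWAPFILE_CLUSTER 512 /* slots per cluster; illustrative value */
#define TOY_NR_ORDERS 10     /* hypothetical number of supported orders */

struct toy_cluster {
	unsigned int count;       /* slots currently in use */
	int order;                /* order this cluster serves once assigned */
	struct toy_cluster *next; /* link on the free list or a nonfull list */
};

struct toy_swap {
	struct toy_cluster *cluster_info;           /* one entry per cluster */
	struct toy_cluster *free_list;              /* completely empty clusters */
	struct toy_cluster *nonfull[TOY_NR_ORDERS]; /* partly used, per order */
};

/* Map a slot offset to its cluster; the index is offset / SWAPFILE_CLUSTER. */
static struct toy_cluster *offset_to_cluster(struct toy_swap *si,
					     unsigned long offset)
{
	return si->cluster_info + offset / SWAPFILE_CLUSTER;
}

/* Detach and return the head of a singly linked list, or NULL if empty. */
static struct toy_cluster *toy_pop(struct toy_cluster **head)
{
	struct toy_cluster *ci = *head;

	if (ci) {
		*head = ci->next;
		ci->next = NULL;
	}
	return ci;
}

/*
 * Park a cluster that no CPU is currently using. Empty clusters go to the
 * free list, partly used ones go to the nonfull list of their own order,
 * and full clusters are left off every list.
 */
static void toy_park(struct toy_swap *si, struct toy_cluster *ci)
{
	struct toy_cluster **head;

	if (ci->count == SWAPFILE_CLUSTER)
		return;
	head = ci->count ? &si->nonfull[ci->order] : &si->free_list;
	ci->next = *head;
	*head = ci;
}

/*
 * Pick a cluster for an allocation of the given order. A free cluster is
 * assigned to that order; otherwise only a nonfull cluster of the same
 * order is eligible. NULL means the caller has to fall back.
 */
static struct toy_cluster *toy_pick(struct toy_swap *si, int order)
{
	struct toy_cluster *ci;

	assert(order >= 0 && order < TOY_NR_ORDERS);
	ci = toy_pop(&si->free_list);
	if (ci) {
		ci->order = order;
		return ci;
	}
	return toy_pop(&si->nonfull[order]);
}
```

In this model an order-0 request never lands in a cluster already
assigned to order 4, which is the property the short term fix in the
quoted message is after.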