From mboxrd@z Thu Jan 1 00:00:00 1970
From: Chris Li
Date: Thu, 25 Jul 2024 21:50:56 -0700
Subject: Re: [PATCH v4 2/3] mm: swap: mTHP allocate swap entries from nonfull list
To: "Huang, Ying"
Cc: Ryan Roberts, Andrew Morton, Kairui Song, Hugh Dickins, Kalesh Singh,
 linux-kernel@vger.kernel.org, linux-mm@kvack.org, Barry Song
References: <20240711-swap-allocator-v4-0-0295a4d4c7aa@kernel.org>
 <20240711-swap-allocator-v4-2-0295a4d4c7aa@kernel.org>
 <874j8nxhiq.fsf@yhuang6-desk2.ccr.corp.intel.com>
 <87o76qjhqs.fsf@yhuang6-desk2.ccr.corp.intel.com>
 <43f73463-af42-4a00-8996-5f63bdf264a3@arm.com>
 <87jzhdkdzv.fsf@yhuang6-desk2.ccr.corp.intel.com>
 <87sew0ei84.fsf@yhuang6-desk2.ccr.corp.intel.com>
 <4ec149fc-7c13-4777-bc97-58ee455a3d7e@arm.com>
 <87le1q6jyo.fsf@yhuang6-desk2.ccr.corp.intel.com>
 <87zfq43o4n.fsf@yhuang6-desk2.ccr.corp.intel.com>
In-Reply-To: <87zfq43o4n.fsf@yhuang6-desk2.ccr.corp.intel.com>

On Thu, Jul 25, 2024 at 7:07 PM Huang, Ying wrote:
> > If the freeing of swap entry is random distribution. You need 16
> > continuous swap entries free at the same time at aligned 16 base
> > locations. The total number of order 4 free swap space add up together
> > is much lower than the order 0 allocatable swap space.
> > If having one entry free is 50% probability(swapfile half full), then
> > having 16 swap entries is continually free is (0.5) EXP 16 = 1.5 E-5.
> > If the swapfile is 80% full, that number drops to 6.5 E -12.
>
> This depends on workloads.  Quite some workloads will show some degree
> of spatial locality.  For a workload with no spatial locality at all as
> above, mTHP may be not a good choice at the first place.

The fragmentation comes from the order 0 entries, not from the mTHP.
mTHP has its own valid use cases, and those should be kept separate
from how the order 0 entries are used. That is why I consider that
this kind of strategy only works in the lucky case. I would much
prefer a strategy that is guaranteed to work and does not depend on
luck.

> >> - Order-4 pages need to be swapped out, but no enough order-4 non-full
> >> clusters available.
> >
> > Exactly.
> >
> >>
> >> So, we need a way to migrate non-full clusters among orders to adjust to
> >> the various situations automatically.
> >
> > There is no easy way to migrate swap entries to different locations.
> > That is why I like to have discontiguous swap entries allocation for
> > mTHP.
>
> We suggest to migrate non-full swap clusters among different lists, not
> swap entries.

Then you have the downside of reducing the total number of high order
clusters. By chance, it is much easier to fragment a cluster than to
defragment one. Given a long enough time of random access, the order
of a cluster has a natural tendency to move down rather than up. We
will likely run out of high order clusters in the long run if we don't
have any separation of orders.

> >> But yes, data is needed for any performance related change.
>
> BTW: I think non-full cluster isn't a good name.  Partial cluster is
> much better and follows the same convention as partial slab.

I am not opposed to it. The only reason I am holding off on the rename
is that there are patches from Kairui that I am testing which depend
on it. Let's finish up the V5 patch with the swap cache reclaim code
path, then do the renaming as one batch job.

We actually have more than one list that holds partially full clusters.
Having more than one list helps reduce repeated scanning of clusters
that are not full but also cannot satisfy an allocation for this
order. Naming just one of them "partial" is not precise either,
because the other lists are also partially full. We'd better name them
precisely and systematically.

Chris
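
For reference, the back-of-the-envelope numbers quoted earlier in this
mail (1.5 E-5 at 50% full, 6.5 E-12 at 80% full) can be reproduced with
a small sketch. It assumes each swap entry is free independently with
the same probability, which is only the random-distribution case
discussed above and not a model of any real workload. The function
name is made up for illustration and does not exist in the kernel.

```python
def p_aligned_block_free(fill_ratio: float, order: int) -> float:
    """Probability that one aligned block of 2**order swap entries is
    entirely free, ASSUMING every entry is free independently with
    probability (1 - fill_ratio). Hypothetical helper, illustration only."""
    if not 0.0 <= fill_ratio <= 1.0:
        raise ValueError("fill_ratio must be between 0 and 1")
    if order < 0:
        raise ValueError("order must be non-negative")
    nr_entries = 1 << order          # order 4 -> 16 entries
    return (1.0 - fill_ratio) ** nr_entries


if __name__ == "__main__":
    # Compare order 0 (single entry) with order 4 (16 aligned entries)
    # at the two fill levels used in the discussion.
    for fill in (0.5, 0.8):
        p0 = p_aligned_block_free(fill, 0)
        p4 = p_aligned_block_free(fill, 4)
        print(f"swapfile {fill:.0%} full: order-0 {p0:.3g}, order-4 {p4:.3g}")
```

The gap between the order-0 and order-4 columns is the point: under
this assumption a single free entry stays easy to find, while a fully
free aligned order-4 block becomes vanishingly rare as the swapfile
fills up.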