From: Johannes Weiner
To: Andrew Morton
Cc: Vlastimil Babka, Mel Gorman, Zi Yan, linux-mm@kvack.org, linux-kernel@vger.kernel.org
Subject: [PATCH 4/5] mm: page_alloc: defrag_mode kswapd/kcompactd assistance
Date: Thu, 13 Mar 2025 17:05:35 -0400
Message-ID: <20250313210647.1314586-5-hannes@cmpxchg.org>
In-Reply-To: <20250313210647.1314586-1-hannes@cmpxchg.org>
References: <20250313210647.1314586-1-hannes@cmpxchg.org>

When defrag_mode is enabled, allocation fallbacks strongly prefer
whole block conversions instead of polluting or stealing partially
used blocks.
This means there is a demand for pageblocks even from sub-block
requests. Let kswapd/kcompactd help produce them.

By the time kswapd gets woken up, normal rmqueue and block conversion
fallbacks have been attempted and failed. So always wake kswapd with
the block order; it will take care of producing a suitable compaction
gap and then chain-wake kcompactd with the block order when it's done.

                                   VANILLA        DEFRAGMODE-ASYNC
Hugealloc Time mean               52739.45 (  +0.00%)     34300.36 ( -34.96%)
Hugealloc Time stddev             56541.26 (  +0.00%)     36390.42 ( -35.64%)
Kbuild Real time                    197.47 (  +0.00%)       196.13 (  -0.67%)
Kbuild User time                   1240.49 (  +0.00%)      1234.74 (  -0.46%)
Kbuild System time                   70.08 (  +0.00%)        62.62 ( -10.50%)
THP fault alloc                   46727.07 (  +0.00%)     57054.53 ( +22.10%)
THP fault fallback                21910.60 (  +0.00%)     11581.40 ( -47.14%)
Direct compact fail                 195.80 (  +0.00%)       107.80 ( -44.72%)
Direct compact success                7.93 (  +0.00%)         4.53 ( -38.06%)
Direct compact success rate %         3.51 (  +0.00%)         3.20 (  -6.89%)
Compact daemon scanned migrate  3369601.27 (  +0.00%)   5461033.93 ( +62.07%)
Compact daemon scanned free     5075474.47 (  +0.00%)   5824897.93 ( +14.77%)
Compact direct scanned migrate   161787.27 (  +0.00%)     58336.93 ( -63.94%)
Compact direct scanned free      163467.53 (  +0.00%)     32791.87 ( -79.94%)
Compact total migrate scanned   3531388.53 (  +0.00%)   5519370.87 ( +56.29%)
Compact total free scanned      5238942.00 (  +0.00%)   5857689.80 ( +11.81%)
Alloc stall                        2371.07 (  +0.00%)      2424.60 (  +2.26%)
Pages kswapd scanned            2160926.73 (  +0.00%)   2657018.33 ( +22.96%)
Pages kswapd reclaimed           533191.07 (  +0.00%)    559583.07 (  +4.95%)
Pages direct scanned             400450.33 (  +0.00%)    722094.07 ( +80.32%)
Pages direct reclaimed            94441.73 (  +0.00%)    107257.80 ( +13.57%)
Pages total scanned             2561377.07 (  +0.00%)   3379112.40 ( +31.93%)
Pages total reclaimed            627632.80 (  +0.00%)    666840.87 (  +6.25%)
Swap out                          47959.53 (  +0.00%)     77238.20 ( +61.05%)
Swap in                            7276.00 (  +0.00%)     11712.80 ( +60.97%)
File refaults                    138043.00 (  +0.00%)    143438.80 (  +3.91%)

With this patch,
defrag_mode=1 beats the vanilla kernel in THP success rates and
allocation latencies. The trend holds over time:

  thp_fault_alloc
  VANILLA  DEFRAGMODE-ASYNC
    61988             52066
    56474             58844
    57258             58233
    50187             58476
    52388             54516
    55409             59938
    52925             57204
    47648             60238
    43669             55733
    40621             56211
    36077             59861
    41721             57771
    36685             58579
    34641             51868
    33215             56280

DEFRAGMODE-ASYNC also wins on %sys as ~3/4 of the direct compaction
work is shifted to kcompactd.

Reclaim activity is higher. Part of that is simply due to the
increased memory footprint from higher THP use. The other aspect is
that *direct* reclaim/compaction are still going for requested orders
rather than targeting the page blocks required for fallbacks, which is
less efficient than it could be. However, this is already a useful
tradeoff to make, as in many environments peak periods are short and
retaining the ability to produce THP through them is more important.

Signed-off-by: Johannes Weiner
---
 mm/page_alloc.c | 14 ++++++++++----
 1 file changed, 10 insertions(+), 4 deletions(-)

diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index 9a02772c2461..4a0d8f871e56 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -4076,15 +4076,21 @@ static void wake_all_kswapds(unsigned int order, gfp_t gfp_mask,
 	struct zone *zone;
 	pg_data_t *last_pgdat = NULL;
 	enum zone_type highest_zoneidx = ac->highest_zoneidx;
+	unsigned int reclaim_order;
+
+	if (defrag_mode)
+		reclaim_order = max(order, pageblock_order);
+	else
+		reclaim_order = order;
 
 	for_each_zone_zonelist_nodemask(zone, z, ac->zonelist, highest_zoneidx,
 					ac->nodemask) {
 		if (!managed_zone(zone))
 			continue;
-		if (last_pgdat != zone->zone_pgdat) {
-			wakeup_kswapd(zone, gfp_mask, order, highest_zoneidx);
-			last_pgdat = zone->zone_pgdat;
-		}
+		if (last_pgdat == zone->zone_pgdat)
+			continue;
+		wakeup_kswapd(zone, gfp_mask, reclaim_order, highest_zoneidx);
+		last_pgdat = zone->zone_pgdat;
 	}
 }

-- 
2.48.1
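
Illustration only, not part of the patch: a minimal userspace sketch of
the order selection that the hunk above adds to wake_all_kswapds(). The
helper name kswapd_wake_order() and the fixed PAGEBLOCK_ORDER value of 9
are assumptions made for this example; in the kernel, pageblock_order
depends on the architecture and configuration, and defrag_mode is a
runtime setting rather than a function argument.

```c
#include <stdbool.h>

/*
 * Assumed for illustration only. The real pageblock_order is chosen by
 * the kernel based on architecture and config; 9 is a common value with
 * 4K base pages and 2M pageblocks.
 */
#define PAGEBLOCK_ORDER 9u

/*
 * Hypothetical helper modelling the reclaim_order computation in the
 * patch: with defrag_mode enabled, kswapd is woken for at least a whole
 * pageblock, even when the failed request was for a smaller order.
 */
static unsigned int kswapd_wake_order(bool defrag_mode, unsigned int order)
{
	if (defrag_mode)
		return order > PAGEBLOCK_ORDER ? order : PAGEBLOCK_ORDER;
	return order;
}
```

Under these assumptions, an order-0 request wakes kswapd with order 9
when defrag_mode is on and with order 0 when it is off, while an order-10
request is passed through unchanged in both cases.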