From: Johannes Weiner <hannes@cmpxchg.org>
To: Andrew Morton
Cc: Vlastimil Babka, Mel Gorman, Zi Yan, linux-mm@kvack.org,
	linux-kernel@vger.kernel.org
Subject: [PATCH 1/5] mm: compaction: push watermark into compaction_suitable() callers
Date: Thu, 13 Mar 2025 17:05:32 -0400
Message-ID: <20250313210647.1314586-2-hannes@cmpxchg.org>
In-Reply-To: <20250313210647.1314586-1-hannes@cmpxchg.org>
References: <20250313210647.1314586-1-hannes@cmpxchg.org>
MIME-Version: 1.0
compaction_suitable() hardcodes the min watermark, with a boost to the
low watermark for costly orders. However, compaction_ready() requires
order-0 at the high watermark. It currently checks the marks twice.

Make the watermark a parameter to compaction_suitable() and have the
callers pass in what they require:

- compaction_zonelist_suitable() is used by the direct reclaim path,
  so use the min watermark.

- compaction_suit_allocation_order() has a watermark in context derived
  from cc->alloc_flags. The only quirk is that kcompactd doesn't
  initialize cc->alloc_flags explicitly. There is a direct check in
  kcompactd_do_work() that passes ALLOC_WMARK_MIN, but another check
  downstack in compact_zone() ends up passing the unset alloc_flags.
  Since they default to 0, and that coincides with ALLOC_WMARK_MIN, it
  is correct. But it's subtle. Set cc->alloc_flags explicitly.

- should_continue_reclaim() is direct reclaim, so use the min
  watermark.

- Finally, consolidate the two checks in compaction_ready() to a
  single compaction_suitable() call passing the high watermark.

There is a tiny change in behavior: before, compaction_suitable()
would check order-0 against min or low, depending on costly order.
Then there'd be another high watermark check. Now, the high watermark
is passed to compaction_suitable(), and the costly-order boost
(low - min) is added on top. This means compaction_ready() sets a
marginally higher target for free pages.

In a kernel build + THP pressure test, though, this didn't show any
measurable negative effects on memory pressure or reclaim rates. As
the comment above the check says, reclaim is usually stopped short by
should_continue_reclaim(), and this just defines the worst-case
reclaim cutoff in case compaction is not making any headway.
Signed-off-by: Johannes Weiner <hannes@cmpxchg.org>
---
 include/linux/compaction.h |  5 ++--
 mm/compaction.c            | 52 ++++++++++++++++++++------------------
 mm/vmscan.c                | 26 ++++++++++---------
 3 files changed, 45 insertions(+), 38 deletions(-)

diff --git a/include/linux/compaction.h b/include/linux/compaction.h
index 7bf0c521db63..173d9c07a895 100644
--- a/include/linux/compaction.h
+++ b/include/linux/compaction.h
@@ -95,7 +95,7 @@ extern enum compact_result try_to_compact_pages(gfp_t gfp_mask,
 		struct page **page);
 extern void reset_isolation_suitable(pg_data_t *pgdat);
 extern bool compaction_suitable(struct zone *zone, int order,
-				int highest_zoneidx);
+				unsigned long watermark, int highest_zoneidx);
 
 extern void compaction_defer_reset(struct zone *zone, int order,
 		bool alloc_success);
@@ -113,7 +113,8 @@ static inline void reset_isolation_suitable(pg_data_t *pgdat)
 }
 
 static inline bool compaction_suitable(struct zone *zone, int order,
-				       int highest_zoneidx)
+				       unsigned long watermark,
+				       int highest_zoneidx)
 {
 	return false;
 }
diff --git a/mm/compaction.c b/mm/compaction.c
index 550ce5021807..036353ef1878 100644
--- a/mm/compaction.c
+++ b/mm/compaction.c
@@ -2382,40 +2382,42 @@ static enum compact_result compact_finished(struct compact_control *cc)
 }
 
 static bool __compaction_suitable(struct zone *zone, int order,
-				  int highest_zoneidx,
-				  unsigned long wmark_target)
+				  unsigned long watermark, int highest_zoneidx,
+				  unsigned long free_pages)
 {
-	unsigned long watermark;
 	/*
 	 * Watermarks for order-0 must be met for compaction to be able to
 	 * isolate free pages for migration targets. This means that the
-	 * watermark and alloc_flags have to match, or be more pessimistic than
-	 * the check in __isolate_free_page(). We don't use the direct
-	 * compactor's alloc_flags, as they are not relevant for freepage
-	 * isolation. We however do use the direct compactor's highest_zoneidx
-	 * to skip over zones where lowmem reserves would prevent allocation
-	 * even if compaction succeeds.
-	 * For costly orders, we require low watermark instead of min for
-	 * compaction to proceed to increase its chances.
+	 * watermark has to match, or be more pessimistic than the check in
+	 * __isolate_free_page().
+	 *
+	 * For costly orders, we require a higher watermark for compaction to
+	 * proceed to increase its chances.
+	 *
+	 * We use the direct compactor's highest_zoneidx to skip over zones
+	 * where lowmem reserves would prevent allocation even if compaction
+	 * succeeds.
+	 *
 	 * ALLOC_CMA is used, as pages in CMA pageblocks are considered
-	 * suitable migration targets
+	 * suitable migration targets.
 	 */
-	watermark = (order > PAGE_ALLOC_COSTLY_ORDER) ?
-				low_wmark_pages(zone) : min_wmark_pages(zone);
 	watermark += compact_gap(order);
+	if (order > PAGE_ALLOC_COSTLY_ORDER)
+		watermark += low_wmark_pages(zone) - min_wmark_pages(zone);
 	return __zone_watermark_ok(zone, 0, watermark, highest_zoneidx,
-				   ALLOC_CMA, wmark_target);
+				   ALLOC_CMA, free_pages);
 }
 
 /*
  * compaction_suitable: Is this suitable to run compaction on this zone now?
  */
-bool compaction_suitable(struct zone *zone, int order, int highest_zoneidx)
+bool compaction_suitable(struct zone *zone, int order, unsigned long watermark,
+			 int highest_zoneidx)
 {
 	enum compact_result compact_result;
 	bool suitable;
 
-	suitable = __compaction_suitable(zone, order, highest_zoneidx,
+	suitable = __compaction_suitable(zone, order, watermark, highest_zoneidx,
 					 zone_page_state(zone, NR_FREE_PAGES));
 	/*
 	 * fragmentation index determines if allocation failures are due to
@@ -2453,6 +2455,7 @@ bool compaction_suitable(struct zone *zone, int order, int highest_zoneidx)
 	return suitable;
 }
 
+/* Used by direct reclaimers */
 bool compaction_zonelist_suitable(struct alloc_context *ac, int order,
 				  int alloc_flags)
 {
@@ -2475,8 +2478,8 @@ bool compaction_zonelist_suitable(struct alloc_context *ac, int order,
 		 */
 		available = zone_reclaimable_pages(zone) / order;
 		available += zone_page_state_snapshot(zone, NR_FREE_PAGES);
-		if (__compaction_suitable(zone, order, ac->highest_zoneidx,
-					  available))
+		if (__compaction_suitable(zone, order, min_wmark_pages(zone),
+					  ac->highest_zoneidx, available))
 			return true;
 	}
 
@@ -2513,13 +2516,13 @@ compaction_suit_allocation_order(struct zone *zone, unsigned int order,
 	 */
 	if (order > PAGE_ALLOC_COSTLY_ORDER && async &&
 	    !(alloc_flags & ALLOC_CMA)) {
-		watermark = low_wmark_pages(zone) + compact_gap(order);
-		if (!__zone_watermark_ok(zone, 0, watermark, highest_zoneidx,
-					 0, zone_page_state(zone, NR_FREE_PAGES)))
+		if (!__zone_watermark_ok(zone, 0, watermark + compact_gap(order),
+					 highest_zoneidx, 0,
+					 zone_page_state(zone, NR_FREE_PAGES)))
 			return COMPACT_SKIPPED;
 	}
 
-	if (!compaction_suitable(zone, order, highest_zoneidx))
+	if (!compaction_suitable(zone, order, watermark, highest_zoneidx))
 		return COMPACT_SKIPPED;
 
 	return COMPACT_CONTINUE;
@@ -3082,6 +3085,7 @@ static void kcompactd_do_work(pg_data_t *pgdat)
 		.mode = MIGRATE_SYNC_LIGHT,
 		.ignore_skip_hint = false,
 		.gfp_mask = GFP_KERNEL,
+		.alloc_flags = ALLOC_WMARK_MIN,
 	};
 	enum compact_result ret;
 
@@ -3100,7 +3104,7 @@ static void kcompactd_do_work(pg_data_t *pgdat)
 			continue;
 
 		ret = compaction_suit_allocation_order(zone,
-				cc.order, zoneid, ALLOC_WMARK_MIN,
+				cc.order, zoneid, cc.alloc_flags,
 				false);
 		if (ret != COMPACT_CONTINUE)
 			continue;
diff --git a/mm/vmscan.c b/mm/vmscan.c
index 2bc740637a6c..3370bdca6868 100644
--- a/mm/vmscan.c
+++ b/mm/vmscan.c
@@ -5890,12 +5890,15 @@ static inline bool should_continue_reclaim(struct pglist_data *pgdat,
 
 	/* If compaction would go ahead or the allocation would succeed, stop */
 	for_each_managed_zone_pgdat(zone, pgdat, z, sc->reclaim_idx) {
+		unsigned long watermark = min_wmark_pages(zone);
+
 		/* Allocation can already succeed, nothing to do */
-		if (zone_watermark_ok(zone, sc->order, min_wmark_pages(zone),
+		if (zone_watermark_ok(zone, sc->order, watermark,
 				      sc->reclaim_idx, 0))
 			return false;
 
-		if (compaction_suitable(zone, sc->order, sc->reclaim_idx))
+		if (compaction_suitable(zone, sc->order, watermark,
+					sc->reclaim_idx))
 			return false;
 	}
 
@@ -6122,22 +6125,21 @@ static inline bool compaction_ready(struct zone *zone, struct scan_control *sc)
 			      sc->reclaim_idx, 0))
 		return true;
 
-	/* Compaction cannot yet proceed. Do reclaim. */
-	if (!compaction_suitable(zone, sc->order, sc->reclaim_idx))
-		return false;
-
 	/*
-	 * Compaction is already possible, but it takes time to run and there
-	 * are potentially other callers using the pages just freed. So proceed
-	 * with reclaim to make a buffer of free pages available to give
-	 * compaction a reasonable chance of completing and allocating the page.
+	 * Direct reclaim usually targets the min watermark, but compaction
+	 * takes time to run and there are potentially other callers using the
+	 * pages just freed. So target a higher buffer to give compaction a
+	 * reasonable chance of completing and allocating the pages.
+	 *
 	 * Note that we won't actually reclaim the whole buffer in one attempt
 	 * as the target watermark in should_continue_reclaim() is lower. But if
 	 * we are already above the high+gap watermark, don't reclaim at all.
 	 */
-	watermark = high_wmark_pages(zone) + compact_gap(sc->order);
+	watermark = high_wmark_pages(zone);
+	if (compaction_suitable(zone, sc->order, watermark, sc->reclaim_idx))
+		return true;
 
-	return zone_watermark_ok_safe(zone, 0, watermark, sc->reclaim_idx);
+	return false;
 }
 
 static void consider_reclaim_throttle(pg_data_t *pgdat, struct scan_control *sc)
-- 
2.48.1