From: Hasan Al Maruf <hasanalmaruf@fb.com>
To: ying.huang@intel.com
Cc: akpm@linux-foundation.org, dave.hansen@linux.intel.com, feng.tang@intel.com,
    hasanalmaruf@fb.com, linux-kernel@vger.kernel.org, linux-mm@kvack.org,
    mgorman@suse.de, mgorman@techsingularity.net, mhocko@suse.com,
    osalvador@suse.de, peterz@infradead.org, riel@surriel.com,
    shakeelb@google.com, shy828301@gmail.com, weixugc@google.com, ziy@nvidia.com
Subject: Re: [PATCH -V10 RESEND 2/6] NUMA balancing: optimize page placement for memory tiering system
Date: Tue, 7 Dec 2021 01:36:39 -0500
Message-Id: <20211207063639.83762-1-hasanalmaruf@fb.com>
In-Reply-To: <20211207022757.2523359-3-ying.huang@intel.com>
References: <20211207022757.2523359-3-ying.huang@intel.com>

Hi Huang,

>+void set_numabalancing_state(bool enabled)
>+{
>+        if (enabled)
>+                sysctl_numa_balancing_mode = NUMA_BALANCING_NORMAL;
>+        else
>+                sysctl_numa_balancing_mode = NUMA_BALANCING_DISABLED;
>+        __set_numabalancing_state(enabled);
>+}
>+

One of the properties of the optimized NUMA balancing for tiered memory is
that we do not scan top-tier nodes, as promotion doesn't make sense there
(implemented in the next patch [3/6]). However, if a system has only a
single memory node with CPU, does it make sense to run in
`NUMA_BALANCING_NORMAL` mode there? What do you think about downgrading to
`NUMA_BALANCING_MEMORY_TIERING` mode if a user enables NUMA balancing in
the default `NUMA_BALANCING_NORMAL` mode on a single top-tier memory node?
(A rough, untested sketch of what I have in mind is in the P.S. at the end
of this mail.)

>diff --git a/mm/vmscan.c b/mm/vmscan.c
>index c266e64d2f7e..5edb5dfa8900 100644
>--- a/mm/vmscan.c
>+++ b/mm/vmscan.c
>@@ -56,6 +56,7 @@
>
> #include
> #include
>+#include
>
> #include "internal.h"
>
>@@ -3919,6 +3920,12 @@ static bool pgdat_watermark_boosted(pg_data_t *pgdat, int highest_zoneidx)
>         return false;
> }
>
>+/*
>+ * Keep the free pages on fast memory node a little more than the high
>+ * watermark to accommodate the promoted pages.
>+ */
>+#define NUMA_BALANCING_PROMOTE_WATERMARK        (10UL * 1024 * 1024 >> PAGE_SHIFT)
>+
> /*
>  * Returns true if there is an eligible zone balanced for the request order
>  * and highest_zoneidx
>@@ -3940,6 +3947,15 @@ static bool pgdat_balanced(pg_data_t *pgdat, int order, int highest_zoneidx)
>                         continue;
>
>                 mark = high_wmark_pages(zone);
>+                if (sysctl_numa_balancing_mode & NUMA_BALANCING_MEMORY_TIERING &&
>+                    numa_demotion_enabled &&
>+                    next_demotion_node(pgdat->node_id) != NUMA_NO_NODE) {
>+                        unsigned long promote_mark;
>+
>+                        promote_mark = min(NUMA_BALANCING_PROMOTE_WATERMARK,
>+                                           pgdat->node_present_pages >> 6);
>+                        mark += promote_mark;
>+                }
>                 if (zone_watermark_ok_safe(zone, order, mark, highest_zoneidx))
>                         return true;
>         }

This can be moved to a different patch. I think this patch [2/6] can be
split into two basic patches -- 1. the NUMA balancing interface for tiered
memory, and 2. maintaining a headroom for promotion.

Instead of having a static value for `NUMA_BALANCING_PROMOTE_WATERMARK`,
what about decoupling allocation and reclamation and adding a user-space
interface for controlling them?
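For example -- completely untested, and both the knob name
`numa_balancing_promote_watermark_mb` and its backing variable are made up
here just to illustrate the idea -- the fixed define could become a vm
sysctl that admins tune per workload:

/* mm/vmscan.c: hypothetical replacement for the fixed define */
unsigned int sysctl_numa_balancing_promote_watermark_mb = 10;

/* kernel/sysctl.c: hypothetical entry in vm_table[] */
{
        .procname       = "numa_balancing_promote_watermark_mb",
        .data           = &sysctl_numa_balancing_promote_watermark_mb,
        .maxlen         = sizeof(unsigned int),
        .mode           = 0644,
        .proc_handler   = proc_douintvec_minmax,
        .extra1         = SYSCTL_ZERO,
},

/* mm/vmscan.c, pgdat_balanced(): use the tunable (in MB, converted to
 * pages) instead of the compile-time watermark */
promote_mark = min((unsigned long)sysctl_numa_balancing_promote_watermark_mb
                        << (20 - PAGE_SHIFT),
                   pgdat->node_present_pages >> 6);
mark += promote_mark;

That way the promotion headroom on the fast node is not hard-coded to 10MB
and can be raised on machines where promotion traffic is heavy.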
Do you think patches [2/5] and [3/5] of this series can be merged into your
current patchset?
https://lore.kernel.org/all/cover.1637778851.git.hasanalmaruf@fb.com/

Best,
Hasan
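P.S. For the single-toptier-node question above, this is roughly what I
had in mind -- an untested sketch only, not a concrete proposal:

/* kernel/sched/core.c: fall back to the tiering mode when the only
 * memory node is a toptier (CPU) node, so that the scanning changes in
 * patch [3/6] end up skipping it. */
void set_numabalancing_state(bool enabled)
{
        if (!enabled)
                sysctl_numa_balancing_mode = NUMA_BALANCING_DISABLED;
        else if (num_node_state(N_MEMORY) == 1 && num_node_state(N_CPU) == 1)
                sysctl_numa_balancing_mode = NUMA_BALANCING_MEMORY_TIERING;
        else
                sysctl_numa_balancing_mode = NUMA_BALANCING_NORMAL;
        __set_numabalancing_state(enabled);
}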