From: Kefeng Wang
Date: Tue, 14 Nov 2023 21:12:51 +0800
Subject: Re: [RFC PATCH] mm: support large folio numa balancing
To: David Hildenbrand, John Hubbard, Baolin Wang
Message-ID: <6f953202-b29c-4274-943f-f1a93b1b6ea5@huawei.com>
In-Reply-To: <1b4de866-df27-46fa-81fa-6818a48d8cc1@redhat.com>
References: <606d2d7a-d937-4ffe-a6f2-dfe3ae5a0c91@redhat.com>
 <70973a55-63a0-4a85-abe5-d8681fdb3886@huawei.com>
 <00372b9e-6020-64b7-1381-e88d9744ed05@nvidia.com>
 <1b4de866-df27-46fa-81fa-6818a48d8cc1@redhat.com>

On 2023/11/14 19:35, David Hildenbrand wrote:
> On 13.11.23 23:15, John Hubbard wrote:
>> On 11/13/23 5:01 AM, Baolin Wang wrote:
>>>
>>>
>>> On 11/13/2023 8:10 PM, Kefeng Wang wrote:
>>>>
>>>>
>>>> On 2023/11/13 18:53, David Hildenbrand wrote:
>>>>> On 13.11.23 11:45, Baolin Wang wrote:
>>>>>> Currently, the file pages already support large folio, and
>>>>>> supporting for
>>>>>> anonymous pages is also under discussion[1].
>>>>>> Moreover, the numa
>>>>>> balancing
>>>>>> code are converted to use a folio by previous thread[2], and the
>>>>>> migrate_pages
>>>>>> function also already supports the large folio migration.
>>>>>>
>>>>>> So now I did not see any reason to continue restricting NUMA
>>>>>> balancing for
>>>>>> large folio.
>>>>>
>>>>> I recall John wanted to look into that. CCing him.
>>>>>
>>>>> I'll note that the "head page mapcount" heuristic to detect sharers
>>>>> will
>>>>> now strike on the PTE path and make us believe that a large folios is
>>>>> exclusive, although it isn't.
>>>>>
>>>>> As spelled out in the commit you are referencing:
>>>>>
>>>>> commit 6695cf68b15c215d33b8add64c33e01e3cbe236c
>>>>> Author: Kefeng Wang
>>>>> Date:   Thu Sep 21 15:44:14 2023 +0800
>>>>>
>>>>>       mm: memory: use a folio in do_numa_page()
>>>>>       Numa balancing only try to migrate non-compound page in
>>>>> do_numa_page(),
>>>>>       use a folio in it to save several compound_head calls, note
>>>>> we use
>>>>>       folio_estimated_sharers(), it is enough to check the folio
>>>>> sharers since
>>>>>       only normal page is handled, if large folio numa balancing is
>>>>> supported, a
>>>>>       precise folio sharers check would be used, no functional change
>>>>> intended.
>>>>>
>>>>>
>>>>> I'll send WIP patches for one approach that can improve the situation
>>>>> soonish.
>>
>> To be honest, I'm still catching up on the approximate vs. exact
>> sharers case. It wasn't clear to me why a precise sharers count
>> is needed in order to do this. Perhaps the cost of making a wrong
>> decision is considered just too high?
>
> Good question, I didn't really look into the impact for the NUMA hinting
> case where we might end up not setting TNF_SHARED although it is shared.
> For other folio_estimate_sharers() users it's more obvious.

task_numa_group() checks TNF_SHARED: if processes share the same
page/folio, they are packed into a single NUMA group, and the group's
fault statistics are then used in should_numa_migrate_memory() to
decide whether to migrate or not. If TNF_SHARED is not set for a folio
that is actually shared, the tasks may not be grouped, which may lead
to more page/folio migration.

>
> As a side note, it could have happened already in corner cases (e.g.,
> concurrent page migration of a small folio).
>
> If precision as documented in that commit is really required remains to
> be seen -- just wanted to spell it out.
>
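
For reference, the grouping decision described above can be modelled
roughly as below. This is only a simplified userspace sketch under my
own assumptions, not the kernel code: struct sketch_task,
sketch_should_join() and the SKETCH_TNF_SHARED value are made up for
illustration, and the real task_numa_group() does more checks.

```c
#include <stdbool.h>

/* Illustrative flag value for this sketch only (assumption). */
#define SKETCH_TNF_SHARED 0x04

/* Hypothetical stand-in for a task; mm_id models the address space. */
struct sketch_task {
	int mm_id;
};

/*
 * Simplified model of the join decision: tasks in the same address
 * space always join; otherwise they join only when the hinting fault
 * was flagged as hitting a shared page/folio.
 */
static bool sketch_should_join(const struct sketch_task *cur,
			       const struct sketch_task *other,
			       int flags)
{
	if (cur->mm_id == other->mm_id)
		return true;
	if (flags & SKETCH_TNF_SHARED)
		return true;
	return false;
}
```

In this model, two tasks from different processes that fault on the
same folio are grouped only if the flag is passed, so a sharers
estimate that misses the sharing leaves them ungrouped.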