From: Baolin Wang
Date: Tue, 14 Nov 2023 19:11:56 +0800
Subject: Re: [RFC PATCH] mm: support large folio numa balancing
To: "Huang, Ying", David Hildenbrand
Cc: akpm@linux-foundation.org, wangkefeng.wang@huawei.com, willy@infradead.org, linux-mm@kvack.org, linux-kernel@vger.kernel.org, John Hubbard, Mel Gorman, Peter Zijlstra
In-Reply-To: <871qctf89m.fsf@yhuang6-desk2.ccr.corp.intel.com>

On 11/14/2023 9:12 AM, Huang, Ying wrote:
> David Hildenbrand writes:
>
>> On 13.11.23 11:45, Baolin Wang wrote:
>>> Currently, the file pages already support large folio, and supporting for
>>> anonymous pages is also under discussion[1]. Moreover, the numa balancing
>>> code are converted to use a folio by previous thread[2], and the migrate_pages
>>> function also already supports the large folio migration.
>>> So now I did not see any reason to continue restricting NUMA
>>> balancing for
>>> large folio.
>>
>> I recall John wanted to look into that. CCing him.
>>
>> I'll note that the "head page mapcount" heuristic to detect sharers will
>> now strike on the PTE path and make us believe that a large folios is
>> exclusive, although it isn't.
>
> Even 4k folio may be shared by multiple processes/threads. So, numa
> balancing uses a multi-stage node selection algorithm (mostly
> implemented in should_numa_migrate_memory()) to identify shared folios.
> I think that the algorithm needs to be adjusted for PTE mapped large
> folio for shared folios.

I'm not sure I follow. should_numa_migrate_memory() uses the last CPU
ID, the last PID and the group NUMA faults to decide whether a page can
be migrated to the target node. So for a large folio, a precise check of
the folio's sharers would make the group's NUMA fault statistics more
accurate. Isn't that enough for should_numa_migrate_memory() to make a
decision?

Could you describe in more detail which part of the algorithm you would
like to change for large folios? Thanks.

> And, as a performance improvement patch, some performance data needs to

Do you have a benchmark to recommend? I know that autonuma cannot
support large folios at the moment.

> be provided. And, the effect of shared folio detection needs to be
> tested too
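
To make the "head page mapcount" concern quoted above concrete, here is
a small user-space sketch. It is a toy model and not kernel code:
struct toy_folio, toy_shared_by_head() and toy_shared_by_scan() are
hypothetical names made up for this illustration, and the per-page
mapcount array is an assumption about how one might model a PTE-mapped
large folio. It only shows why looking at the head page alone can report
a folio as exclusive when another base page has extra mappings.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy model, not kernel code: one mapcount per base page of a folio. */
#define TOY_FOLIO_MAX_PAGES 16

struct toy_folio {
	size_t nr_pages;                   /* base pages in the folio */
	int mapcount[TOY_FOLIO_MAX_PAGES]; /* PTE mappings per base page */
};

/* Head-page heuristic: only the first base page's mapcount is examined. */
static bool toy_shared_by_head(const struct toy_folio *f)
{
	return f->mapcount[0] > 1;
}

/*
 * Per-page scan: report sharers if any base page is mapped more than once.
 * This is still an approximation. Two processes that map disjoint base
 * pages of the same folio leave every mapcount at 1 and are not detected.
 */
static bool toy_shared_by_scan(const struct toy_folio *f)
{
	assert(f->nr_pages <= TOY_FOLIO_MAX_PAGES);

	for (size_t i = 0; i < f->nr_pages; i++)
		if (f->mapcount[i] > 1)
			return true;
	return false;
}
```

With a four-page folio whose mapcounts are {1, 1, 2, 1}, the head-page
check reports the folio as exclusive while the scan reports sharers.
The scan costs one pass over the base pages, which is the trade-off any
more precise check would have to justify with performance data.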