Date: Wed, 24 Nov 2021 18:47:54 +0800
Subject: Re: [PATCH 2/3] mm: migrate: Correct the hugetlb migration stats
To: Mike Kravetz, Andrew Morton
Cc: ziy@nvidia.com,
    shy828301@gmail.com, linux-mm@kvack.org, linux-kernel@vger.kernel.org
References: <71a4b6c22f208728fe8c78ad26375436c4ff9704.1636275127.git.baolin.wang@linux.alibaba.com>
    <20211115202146.473fff2404d7fb200dd48bd3@linux-foundation.org>
    <71816b8f-93e5-5a2a-e616-d52a1c4d354c@linux.alibaba.com>
    <3e6dcac6-c947-5f94-cd94-b59a8247dbcf@oracle.com>
From: Baolin Wang
In-Reply-To: <3e6dcac6-c947-5f94-cd94-b59a8247dbcf@oracle.com>

On 2021/11/24 3:25, Mike Kravetz wrote:
> On 11/15/21 22:03, Baolin Wang wrote:
>>
>>
>> On 2021/11/16 12:21, Andrew Morton wrote:
>>> On Sun, 7 Nov 2021 16:57:26 +0800 Baolin Wang wrote:
>>>
>>>> Correct the migration stats for hugetlb with using compound_nr() instead
>>>> of thp_nr_pages(),
>>>
>>> It would be helpful to explain why using thp_nr_pages() was wrong.
>>
>> Sure. Using thp_nr_pages() to get the number of subpages for a hugetlb is incorrect, since the number of subpages in the hugetlb is not always HPAGE_PMD_NR.
>>
>
> Correct.  However, prior to this patch the return value from thp_nr_pages
> was never used for hugetlb pages; only THP.  So, this really did not have any
> bad side effects prior to this patch that I can see.
>
>>> And to explain the end user visible effects of this bug so we can
>>
>> Actually not only the user visible effects, but also the hugetlb migration stats in the kernel are incorrect. For the end user visible effects, like I described in patch 1, the syscall move_pages() can return a non-migrated number larger than the number of pages the users tried to migrate, when a THP fails to migrate. This is confusing for users.
>>
>
> It looks like hugetlb pages were never taken into account when originally
> defining the migration stats.  In the documentation (page_migration.rst) it
> only talks about Normal and THP pages.  It does not mention how hugetlb pages
> are counted.
>
> Currently, hugetlb pages count as 'a single page' in the stats
> PGMIGRATE_SUCCESS/FAIL.  Correct?  After this change we will increment these
> stats by the number of sub-pages.  Correct?

Right.

>
> I 'think' this is OK since the behavior is not really defined today.  But, we
> are changing user visible output.

Actually we did not change the user visible output for a hugetlb
migration, since we still return the number of hugetlb pages that failed
to migrate, as before (though the previous hugetlb behavior is not
reasonable), not the number of hugetlb subpages. We just correct the
in-kernel hugetlb migration stats, such as PGMIGRATE_SUCCESS/FAIL.

>
> Perhaps we should go ahead and document the hugetlb behavior when making these
> changes?

Sure. How about adding the modification below for hugetlb?

diff --git a/Documentation/vm/page_migration.rst b/Documentation/vm/page_migration.rst
index 08810f5..8c5cb81 100644
--- a/Documentation/vm/page_migration.rst
+++ b/Documentation/vm/page_migration.rst
@@ -263,15 +263,15 @@ Monitoring Migration
 The following events (counters) can be used to monitor page migration.

 1. PGMIGRATE_SUCCESS: Normal page migration success. Each count means that a
-   page was migrated. If the page was a non-THP page, then this counter is
-   increased by one. If the page was a THP, then this counter is increased by
-   the number of THP subpages. For example, migration of a single 2MB THP that
-   has 4KB-size base pages (subpages) will cause this counter to increase by
-   512.
+   page was migrated. If the page was a non-THP and non-hugetlb page, then
+   this counter is increased by one. If the page was a THP or hugetlb, then
+   this counter is increased by the number of THP or hugetlb subpages.
+   For example, migration of a single 2MB THP that has 4KB-size base pages
+   (subpages) will cause this counter to increase by 512.

 2. PGMIGRATE_FAIL: Normal page migration failure. Same counting rules as for
    PGMIGRATE_SUCCESS, above: this will be increased by the number of subpages,
-   if it was a THP.
+   if it was a THP or hugetlb.

 3. THP_MIGRATION_SUCCESS: A THP was migrated without being split.
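
As a rough illustration of why a fixed HPAGE_PMD_NR count and compound_nr() can disagree for hugetlb, here is a small userspace C model of the counting rule. It is only a sketch and not kernel code: model_compound_nr(), model_thp_nr_pages() and MODEL_HPAGE_PMD_NR are hypothetical names, and it assumes 4KB base pages with a 2MB PMD size, so that a 2MB huge page has order 9 and a 1GB hugetlb page has order 18.

```c
#include <assert.h>

/*
 * Hypothetical userspace stand-in for HPAGE_PMD_NR, assuming 4KB base
 * pages and a 2MB PMD-sized huge page (2MB / 4KB = 512).
 */
#define MODEL_HPAGE_PMD_NR 512UL

/*
 * Models the idea behind compound_nr(): a compound page of order N
 * spans 1 << N base pages, whatever its size is.
 */
static inline unsigned long model_compound_nr(unsigned int order)
{
	/* Guard against shifting past the width of unsigned long. */
	assert(order < 8 * sizeof(unsigned long));
	return 1UL << order;
}

/*
 * Models a THP-only count: it always reports the PMD-sized number of
 * subpages, which is only right when the huge page is PMD-sized.
 */
static inline unsigned long model_thp_nr_pages(void)
{
	return MODEL_HPAGE_PMD_NR;
}
```

Under these assumptions, model_compound_nr(9) and model_thp_nr_pages() both give 512 for a 2MB page, while for a 1GB hugetlb page model_compound_nr(18) gives 262144, a value that a fixed PMD-sized count cannot represent.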