Subject: Re: [PATCH] mm: migrate: Correct the hugetlb migration stats
To: Zi Yan
Cc: akpm@linux-foundation.org, shy828301@gmail.com, linux-mm@kvack.org, linux-kernel@vger.kernel.org
References: <677EF981-F33E-4002-AA38-DD669C319284@nvidia.com> <29aa9c6e-7191-71bb-d8a3-e2695b18fa3e@linux.alibaba.com>
From: Baolin Wang
Message-ID: <7f45b2c8-fd2c-345a-ec6c-43b8b1c06de1@linux.alibaba.com>
Date: Tue, 2 Nov 2021 14:08:38 +0800

On 2021/11/1 23:12, Zi Yan wrote:
> On 1 Nov 2021, at 2:54, Baolin Wang wrote:
>
>> On 2021/10/29 23:43, Zi Yan wrote:
>>> On 29 Oct 2021, at 3:42, Baolin Wang wrote:
>>>
>>>> Now hugetlb migration is also available for some scenarios, such as
>>>> soft offlining or memory compaction. So we should correct the migration
>>>
>>> hugetlb migration has been available since the if (PageHuge(page)) branch
>>> was added. I am not sure what is new here.
>>
>> Nothing new actually, sorry for the confusion; I will update the commit message in the next version.
>>
>>>
>>>> stats for hugetlb by using compound_nr() instead of thp_nr_pages()
>>>> to get the number of pages.
>>>
>>> nr_failed records the number of pages, not subpages. It is returned to
>>
>> I also think nr_failed should record the number of pages, not the number of hugetlb, if I understand you correctly.
>>
>>> user space when move_pages() syscall is used. After your change,
>>> if users try to migrate a list of pages including THPs and/or hugetlb
>>> pages and some of THPs and/or hugetlb fail to migrate, move_pages()
>>> will return a number larger than the number of pages the users tried
>>
>> OK, thanks for pointing out the issue.
>>
>> But before my patch, we already returned the number of pages that succeeded or failed for THP migration, instead of the number of THPs. That means if we move only 1 page by
>
> Ah, you are right.
>
>> move_pages() and this page is a 2M THP, move_pages() will return 512 if it fails to migrate, which is larger than the page count specified by the user.
>>
>> if (err > 0)
>> 	err += nr_pages - i - 1;
>
> I am not sure this is right for user-space.
>
>>
>> On the other hand, the stats of PGMIGRATE_SUCCESS/PGMIGRATE_FAIL should stand for the number of pages, instead of the number of hugetlb. Also, for hugetlb migration during memory compaction, we have already counted the number of pages for a hugetlb into cc->nr_migratepages; if the hugetlb migration fails, the compaction trace stat will be confusing if we return the number of hugetlb.
>>
>> trace_mm_compaction_migratepages(cc->nr_migratepages, err, &cc->migratepages);
>>
>> So I think the stats of hugetlb migration should be consistent with THP.
>
> It makes sense to me.
>
>>
>>> to migrate. I am not sure this is the change we want. Or at least,
>>> the comment of migrate_pages() and the manpage of move_pages() need
>>> to be changed and linux-api mailing list should be cc’d.
>>
>> I don't think we should update the comments of migrate_pages(); "Returns the number of pages that were not migrated" makes sense to me, if I understand correctly.
>>
>> For the manpage of move_pages(), as you said, the returned number of non-migrated pages can be larger than the number specified by the user if a THP or a hugetlb fails to migrate. I am not sure if we should change the manpage, since THP already behaves this way, but I can send a patch to update the manpage if you think this is still necessary. Thanks.
>
> I am not sure changing manpage would help the users of move_pages() after
> thinking about it again, since users might not know all the THP and/or hugetlb
> information when they call move_pages() and they just pass a list of N pages.
>
> I just wonder if we could fix the rc value of migrate_pages to return
> the number of {base page, THP, hugetlb} instead, so that move_pages()
> can get its return value right.

IMO it will break the usage in other places if we change the rc value of
migrate_pages(), for example, the page migration when doing memory
compaction as I said before, which expects the number of normal
pages. Meanwhile, a THP can be split into normal pages during
migration, so it will not be consistent if we return the number of THPs.

Changing the return value of migrate_pages() will make things more
complicated, and I am not sure whether it is worth doing. Any
suggestions? Thanks.