Subject: Re: [PATCH 2/2] mm/mempolicy: Skip walking HUGETLB vma if MPOL_MF_STRICT is specified alone
To: Mike Kravetz, Michal Hocko
Cc: Li Xinhai, "linux-mm@kvack.org", akpm, n-horiguchi
References: <1578993378-10860-1-git-send-email-lixinhai.lxh@gmail.com> <1578993378-10860-2-git-send-email-lixinhai.lxh@gmail.com> <2020011422092314671410@gmail.com> <20200116075933.GN19428@dhcp22.suse.cz>
From: Yang Shi
Message-ID: <481db4ca-9377-598e-b2f0-8e7c54c35f37@linux.alibaba.com>
Date: Thu, 16 Jan 2020 18:32:59 -0800

On 1/16/20 11:22 AM, Mike Kravetz wrote:
> On 1/15/20 11:59 PM, Michal Hocko wrote:
>> On Wed 15-01-20 13:07:17, Mike Kravetz wrote:
>>> What should we do?
>>> ==================
>>> 1) Nothing more than optimizations by Li Xinhai. Behavior that could be
>>>    seen as conflicting with man page has existed since v3.12 and I am
>>>    not aware of any complaints.
>>> 2) In addition to optimizations by Li Xinhai, modify code to truly ignore
>>>    MPOL_MF_STRICT for huge page mappings. This would be fairly easy to do
>>>    after a failure of migrate_pages(). We could simply traverse the list
>>>    of pages that were not migrated looking for any non-hugetlb page.
>>> 3) Remove the statement "MPOL_MF_STRICT is ignored on huge page mappings."
>>>    and modify code accordingly.
>>>
>>> My suggestion would be for 1 or 2. Thoughts?
>> And why do we exactly need to do anything at all? There is an
>> inconsistency that has been there for years without anybody noticing.
>> NUMA API is a mess on its own and unfixable at this stage, there will
>> always be some corner cases. If there is no real workload hitting this
>> inconsistency and suffering, I would rather not touch this at all.
>> Unless the change would clean up the code or make it more maintainable.
> That is a very valid point. Sometimes we as developers get focused on the
> actual code changes and fail to ask the question "does this really need to
> be changed?" or "what value do the code changes provide?".
>
> Li Xinhai came up with two optimizations in how the mbind code deals with
> hugetlb pages. This 'sub-optimal' code has existed for more than 6 years.
> Unless I am mistaken, nobody has actually complained or noticed this behavior.
> I believe Li Xinhai noticed this inefficient code via code inspection. Of
> course, based on what we know today one could write a test program to show
> the inefficient behavior. However, no real users have noticed this during
> the past 6 years.
>
> The proposed code changes are fairly simple. However, I would not say that
> they clean up the code or make it more maintainable. They essentially add
> or modify two checks to bail out early for hugetlb vma's if the flag which
> is documented to not apply to hugetlb pages (MPOL_MF_STRICT) is specified.
> If one is trying to follow the entire mbind code path for hugetlb pages,
> these patches will make that easier to follow/understand. That is simply
> because one can ignore downstream code/functionality.
>
> Based on Michal's criteria above, I now believe the code changes should not
> be made. Yes, they are fairly simple. However, even simple changes have
> the potential to break something (build breakage with v1 of patch). We should
> leave this code as is unless issues are reported by users.

I tend to agree with you. And, according to what Horiguchi explained,
the intention was not to ignore hugetlb mappings when hugetlb migration
was first added. I also suppose all of us agree that hugetlb pages
should not be treated specially, although this is not a good time and
there is no strong motivation to fix it right now (we may correct
the behavior in the future). The patch may convey the wrong information.
And the code path is definitely not a hot path, so I'm fine with dropping it.

I'm also wondering whether we need to add some comments in the code to
explain the edge case, so that nobody else has to repeat all the tedious
history digging.