From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-4.0 required=3.0 tests=BAYES_00, HEADER_FROM_DIFFERENT_DOMAINS,MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS autolearn=no autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 2008DC433E7 for ; Thu, 3 Sep 2020 08:02:36 +0000 (UTC) Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by mail.kernel.org (Postfix) with ESMTP id B2C1C206C0 for ; Thu, 3 Sep 2020 08:02:35 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org B2C1C206C0 Authentication-Results: mail.kernel.org; dmarc=none (p=none dis=none) header.from=suse.com Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=owner-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix) id E54338E0001; Thu, 3 Sep 2020 04:02:34 -0400 (EDT) Received: by kanga.kvack.org (Postfix, from userid 40) id E05BF6B0003; Thu, 3 Sep 2020 04:02:34 -0400 (EDT) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id CF2C78E0001; Thu, 3 Sep 2020 04:02:34 -0400 (EDT) X-Delivered-To: linux-mm@kvack.org Received: from forelay.hostedemail.com (smtprelay0194.hostedemail.com [216.40.44.194]) by kanga.kvack.org (Postfix) with ESMTP id BBB8C6B0002 for ; Thu, 3 Sep 2020 04:02:34 -0400 (EDT) Received: from smtpin19.hostedemail.com (10.5.19.251.rfc1918.com [10.5.19.251]) by forelay05.hostedemail.com (Postfix) with ESMTP id 7F13B181AEF0B for ; Thu, 3 Sep 2020 08:02:34 +0000 (UTC) X-FDA: 77221008228.19.queen79_120695f270a8 Received: from filter.hostedemail.com (10.5.16.251.rfc1918.com [10.5.16.251]) by smtpin19.hostedemail.com (Postfix) with ESMTP id 497E61AD1B9 for ; Thu, 3 Sep 2020 08:02:34 +0000 (UTC) X-HE-Tag: queen79_120695f270a8 X-Filterd-Recvd-Size: 6880 Received: from mx2.suse.de (mx2.suse.de [195.135.220.15]) by imf46.hostedemail.com (Postfix) with ESMTP for ; Thu, 3 Sep 2020 08:02:33 +0000 (UTC) X-Virus-Scanned: by amavisd-new at test-mx.suse.de Received: from relay2.suse.de (unknown [195.135.221.27]) by mx2.suse.de (Postfix) with ESMTP id 92CDCAED8; Thu, 3 Sep 2020 08:02:33 +0000 (UTC) Date: Thu, 3 Sep 2020 10:02:31 +0200 From: Michal Hocko To: Yang Shi Cc: Mike Kravetz , Li Xinhai , "linux-mm@kvack.org" , akpm , "kirill.shutemov" Subject: Re: [PATCH v4] mm/page_vma_mapped.c: Explicitly compare pfn for normal, hugetlbfs and THP page Message-ID: <20200903080231.GQ4617@dhcp22.suse.cz> References: <1578737885-8890-1-git-send-email-lixinhai.lxh@gmail.com> <20200114092545.GF19428@dhcp22.suse.cz> <202001142147485028116@gmail.com> <20200115093148.GZ19428@dhcp22.suse.cz> <986d0f55-ae54-f3af-0d50-3e3b6621a863@oracle.com> <20200116102944.GQ19428@dhcp22.suse.cz> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: X-Rspamd-Queue-Id: 497E61AD1B9 X-Spamd-Result: default: False [0.00 / 100.00] X-Rspamd-Server: rspam01 X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: On Wed 02-09-20 15:26:39, Yang Shi wrote: > On Thu, Jan 16, 2020 at 2:29 AM Michal Hocko wrote: > > > > On Wed 15-01-20 16:40:26, Mike Kravetz wrote: > > > On 1/15/20 1:31 AM, Michal Hocko wrote: > > > > On Tue 14-01-20 21:47:51, Li Xinhai wrote: > > > >> On 2020-01-14 at 17:25 Michal Hocko wrote: > > > >>> On Sat 11-01-20 10:18:05, Li Xinhai wrote: > > > >>>> When check_pte, pfn of normal, hugetlbfs and THP page need be compared. > > > >>>> The current implementation apply comparison as > > > >>>> - normal 4K page: page_pfn <= pfn < page_pfn + 1 > > > >>>> - hugetlbfs page: page_pfn <= pfn < page_pfn + HPAGE_PMD_NR > > > >>>> - THP page: page_pfn <= pfn < page_pfn + HPAGE_PMD_NR > > > >>>> in pfn_in_hpage. For hugetlbfs page, it should be > > > >>>> page_pfn == pfn > > > >>>> > > > >>>> Now, change pfn_in_hpage to pfn_is_match to highlight that comparison > > > >>>> is not only for THP and explicitly compare for these cases. > > > >>> > > > >>> Why is this important to do. I have asked and Mike had the same feeling > > > >>> that the patch is missing any real justification. Why do we need this > > > >>> change? It is great that you have dropped VM_BUG_ON btw. > > > >>> > > > >> I think it is important to make the code clear, as said, comparing hugetlbfs page > > > >> in range page_pfn <= pfn < page_pfn + HPAGE_PMD_NR is confusion. > > > > > > > > I am sorry but I do not really see a big confusion here. It is probably > > > > a matter of taste. If others consider this an improvement then I will > > > > not stand in the way but I consider the justification insuficient for > > > > merging. > > > > > > Perhaps I am confused, but the patch does update the check done for > > > hugetlb pages. IIUC, previously there was no distinction between THP > > > pages and hugetlb pages in the check. It is valid to pass in a sub- > > > page of a THP page, but not that of a hugetlb page. > > > > > > I do not believe it is possible for existing code to pass in a sub-page > > > of a hugetlb page. And, no one has ever seen this as an issue. This > > > is why I was questioning the need for the patch. > > > > Exactly and that is the reason I fail the see a point. > > > > > With all that said, the new code/check is better (more complete) than > > > the original. It may not do anything for the current code base, but > > > it 'could' catch potential errors in future code. Because of this, I > > > do consider the code an improvement. > > > > It adds a branch for something that doesn't happen and also to a code > > path which is quite internal to the MM AFAICS. That being said, if you > > believe this is an improvement then I will not stand in the way. But > > there are so many other places which could add checks that are not > > exercised... > > I just saw a bad page bug on one our host with 4.14 kernel: > > backtrace: > :BUG: Bad page map in process python3.6 pte:1036e24025 pmd:1dd743d067 > :page:fffff62840db8900 count:105 mapcount:-35 mapping:ffff97d432e97ad0 index:0x1 > :flags: 0xbfffe000001006c(referenced|uptodate|lru|active|mappedtodisk) > :raw: 0bfffe000001006c ffff97d432e97ad0 0000000000000001 00000069ffffffdc > :raw: dead000000000100 dead000000000200 0000000000000000 ffff97c58fc0e000 > :page dumped because: bad pte > :page->mem_cgroup:ffff97c58fc0e000 > :addr:00007f2ddffcc000 vm_flags:00000075 anon_vma: (null) > mapping:ffff97d432e97ad0 index:7f > :file:libc-2.17.so fault:xfs_filemap_fault [xfs] mmap:xfs_file_mmap > [xfs] readpage:xfs_vm_readpage [xfs] > :do_exit+0x563/0xbb0 > :? vfs_write+0x162/0x1a0 > :do_group_exit+0x3a/0xa0 > :file:libc-2.17.so fault:xfs_filemap_fault [xfs] mmap:xfs_file_mmap > [xfs] readpage:xfs_vm_readpage [xfs] > :SyS_exit_group+0x10/0x10 > :do_syscall_64+0x60/0x110 > :entry_SYSCALL_64_after_hwframe+0x3d/0xa2 > :RIP: 0033:0x7fb2504091d9 > :RSP: 002b:00007ffcf035db88 EFLAGS: 00000246 ORIG_RAX: 00000000000000e7 > :RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 00007fb2504091d9 > :RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000000000000 > :RBP: 00007fb250706838 R08: 000000000000003c R09: 00000000000000e7 > :R10: ffffffffffffff70 R11: 0000000000000246 R12: 00007fb250706838 > :R13: 00007fb25070be80 R14: 0000000000000000 R15: 0000000000000000 > :CPU: 42 PID: 454567 Comm: python3.6 Tainted: G B W > 4.14.98-t5.el7.twitter.x86_64 #1 > > I just saw it once and it didn't happen on the newer kernel (maybe > just not happen yet). I can't tell why the mapcount could reach -35 > since all page_remove_rmap() is protected by page table lock. Then I > looked into pvmw code, and suspected the PTE might be changed after > acquiring ptl. > > With the old check it was fine as long as "page_pfn <= pfn < page_pfn > + 1", but for the base page case, it means we might be dec'ing > mapcount for another page. > > Then I came across this commit. I guess this should be able to solve > the problem, but unfortunately the problem is very rare so basically I > can't prove this commit could solve it. > > If you agree my analysis, we may consider to backport this to stable tree. I am not sure I follow. Your page looks like a normal page cache and the patch you are referring to is only clarifying hugetlb pages. I do not remember details but it shouldn't have any functional effect. Are those even used in your environemnt? -- Michal Hocko SUSE Labs