In-Reply-To: <20210120013049.311822-3-mike.kravetz@oracle.com>
From: Muchun Song
Date: Thu, 21 Jan 2021 20:01:15 +0800
Subject: Re: [External] [PATCH v2 2/5] hugetlb: convert page_huge_active() HPageMigratable flag
To: Mike Kravetz
Cc: LKML, Linux Memory Management List, Michal Hocko, Naoya Horiguchi,
	David Hildenbrand, Oscar Salvador, Matthew Wilcox, Andrew Morton

On Wed, Jan 20, 2021 at 9:33 AM Mike Kravetz wrote:
>
> Use the new hugetlb page specific flag HPageMigratable to replace the
> page_huge_active interfaces. By its name, page_huge_active implied
> that a huge page was on the active list. However, that is not really
> what code checking the flag wanted to know. It really wanted to determine
> if the huge page could be migrated. This happens when the page is actually
> added to the page cache and/or task page table. This is the reasoning behind
> the name change.
>
> The VM_BUG_ON_PAGE() calls in the *_huge_active() interfaces are not
> really necessary as we KNOW the page is a hugetlb page. Therefore, they
> are removed.
>
> The routine page_huge_active checked for PageHeadHuge before testing the
> active bit. This is unnecessary in the case where we hold a reference or
> lock and know it is a hugetlb head page. page_huge_active is also called
> without holding a reference or lock (scan_movable_pages), and can race with
> code freeing the page. The extra check in page_huge_active shortened the
> race window, but did not prevent the race. Offline code calling
> scan_movable_pages already deals with these races, so removing the check
> is acceptable. Add comment to racy code.
>
> Signed-off-by: Mike Kravetz

Reviewed-by: Muchun Song

Thanks.
> ---
>  fs/hugetlbfs/inode.c       |  2 +-
>  include/linux/hugetlb.h    |  5 +++++
>  include/linux/page-flags.h |  6 ------
>  mm/hugetlb.c               | 45 ++++++++++----------------------------
>  mm/memory_hotplug.c        |  8 ++++++-
>  5 files changed, 24 insertions(+), 42 deletions(-)
>
> diff --git a/fs/hugetlbfs/inode.c b/fs/hugetlbfs/inode.c
> index b8a661780c4a..ec9f03aa2738 100644
> --- a/fs/hugetlbfs/inode.c
> +++ b/fs/hugetlbfs/inode.c
> @@ -735,7 +735,7 @@ static long hugetlbfs_fallocate(struct file *file, int mode, loff_t offset,
>
>  		mutex_unlock(&hugetlb_fault_mutex_table[hash]);
>
> -		set_page_huge_active(page);
> +		SetHPageMigratable(page);
>  		/*
>  		 * unlock_page because locked by add_to_page_cache()
>  		 * put_page() due to reference from alloc_huge_page()
> diff --git a/include/linux/hugetlb.h b/include/linux/hugetlb.h
> index be71a00ee2a0..ce3d03da0133 100644
> --- a/include/linux/hugetlb.h
> +++ b/include/linux/hugetlb.h
> @@ -480,9 +480,13 @@ unsigned long hugetlb_get_unmapped_area(struct file *file, unsigned long addr,
>   * HPG_restore_reserve - Set when a hugetlb page consumes a reservation at
>   *	allocation time. Cleared when page is fully instantiated. Free
>   *	routine checks flag to restore a reservation on error paths.
> + * HPG_migratable - Set after a newly allocated page is added to the page
> + *	cache and/or page tables. Indicates the page is a candidate for
> + *	migration.
>   */
>  enum hugetlb_page_flags {
>  	HPG_restore_reserve = 0,
> +	HPG_migratable,
>  	__NR_HPAGEFLAGS,
>  };
>
> @@ -529,6 +533,7 @@ static inline void ClearHPage##uname(struct page *page)	\
>   * Create functions associated with hugetlb page flags
>   */
>  HPAGEFLAG(RestoreReserve, restore_reserve)
> +HPAGEFLAG(Migratable, migratable)
>
>  #ifdef CONFIG_HUGETLB_PAGE
>
> diff --git a/include/linux/page-flags.h b/include/linux/page-flags.h
> index bc6fd1ee7dd6..04a34c08e0a6 100644
> --- a/include/linux/page-flags.h
> +++ b/include/linux/page-flags.h
> @@ -592,15 +592,9 @@ static inline void ClearPageCompound(struct page *page)
>  #ifdef CONFIG_HUGETLB_PAGE
>  int PageHuge(struct page *page);
>  int PageHeadHuge(struct page *page);
> -bool page_huge_active(struct page *page);
>  #else
>  TESTPAGEFLAG_FALSE(Huge)
>  TESTPAGEFLAG_FALSE(HeadHuge)
> -
> -static inline bool page_huge_active(struct page *page)
> -{
> -	return 0;
> -}
>  #endif
>
>
> diff --git a/mm/hugetlb.c b/mm/hugetlb.c
> index 8bed6b5202d2..c24da40626d3 100644
> --- a/mm/hugetlb.c
> +++ b/mm/hugetlb.c
> @@ -1353,30 +1353,6 @@ struct hstate *size_to_hstate(unsigned long size)
>  	return NULL;
>  }
>
> -/*
> - * Test to determine whether the hugepage is "active/in-use" (i.e. being linked
> - * to hstate->hugepage_activelist.)
> - *
> - * This function can be called for tail pages, but never returns true for them.
> - */
> -bool page_huge_active(struct page *page)
> -{
> -	return PageHeadHuge(page) && PagePrivate(&page[1]);
> -}
> -
> -/* never called for tail page */
> -void set_page_huge_active(struct page *page)
> -{
> -	VM_BUG_ON_PAGE(!PageHeadHuge(page), page);
> -	SetPagePrivate(&page[1]);
> -}
> -
> -static void clear_page_huge_active(struct page *page)
> -{
> -	VM_BUG_ON_PAGE(!PageHeadHuge(page), page);
> -	ClearPagePrivate(&page[1]);
> -}
> -
>  /*
>   * Internal hugetlb specific page flag. Do not use outside of the hugetlb
>   * code
> @@ -1438,7 +1414,7 @@ static void __free_huge_page(struct page *page)
>  	}
>
>  	spin_lock(&hugetlb_lock);
> -	clear_page_huge_active(page);
> +	ClearHPageMigratable(page);
>  	hugetlb_cgroup_uncharge_page(hstate_index(h),
>  				     pages_per_huge_page(h), page);
>  	hugetlb_cgroup_uncharge_page_rsvd(hstate_index(h),
> @@ -4220,7 +4196,7 @@ static vm_fault_t hugetlb_cow(struct mm_struct *mm, struct vm_area_struct *vma,
>  				make_huge_pte(vma, new_page, 1));
>  		page_remove_rmap(old_page, true);
>  		hugepage_add_new_anon_rmap(new_page, vma, haddr);
> -		set_page_huge_active(new_page);
> +		SetHPageMigratable(new_page);
>  		/* Make the old page be freed below */
>  		new_page = old_page;
>  	}
> @@ -4457,12 +4433,12 @@ static vm_fault_t hugetlb_no_page(struct mm_struct *mm,
>  	spin_unlock(ptl);
>
>  	/*
> -	 * Only make newly allocated pages active. Existing pages found
> -	 * in the pagecache could be !page_huge_active() if they have been
> -	 * isolated for migration.
> +	 * Only set HPageMigratable in newly allocated pages. Existing pages
> +	 * found in the pagecache may not have HPageMigratable set if they
> +	 * have been isolated for migration.
>  	 */
>  	if (new_page)
> -		set_page_huge_active(page);
> +		SetHPageMigratable(page);
>
>  	unlock_page(page);
> out:
> @@ -4773,7 +4749,7 @@ int hugetlb_mcopy_atomic_pte(struct mm_struct *dst_mm,
>  	update_mmu_cache(dst_vma, dst_addr, dst_pte);
>
>  	spin_unlock(ptl);
> -	set_page_huge_active(page);
> +	SetHPageMigratable(page);
>  	if (vm_shared)
>  		unlock_page(page);
>  	ret = 0;
> @@ -5591,12 +5567,13 @@ bool isolate_huge_page(struct page *page, struct list_head *list)
>  	bool ret = true;
>
>  	spin_lock(&hugetlb_lock);
> -	if (!PageHeadHuge(page) || !page_huge_active(page) ||
> +	if (!PageHeadHuge(page) ||
> +	    !HPageMigratable(page) ||
>  	    !get_page_unless_zero(page)) {
>  		ret = false;
>  		goto unlock;
>  	}
> -	clear_page_huge_active(page);
> +	ClearHPageMigratable(page);
>  	list_move_tail(&page->lru, list);
> unlock:
>  	spin_unlock(&hugetlb_lock);
> @@ -5607,7 +5584,7 @@ void putback_active_hugepage(struct page *page)
>  {
>  	VM_BUG_ON_PAGE(!PageHead(page), page);
>  	spin_lock(&hugetlb_lock);
> -	set_page_huge_active(page);
> +	SetHPageMigratable(page);
>  	list_move_tail(&page->lru, &(page_hstate(page))->hugepage_activelist);
>  	spin_unlock(&hugetlb_lock);
>  	put_page(page);
> diff --git a/mm/memory_hotplug.c b/mm/memory_hotplug.c
> index f9d57b9be8c7..563da803e0e0 100644
> --- a/mm/memory_hotplug.c
> +++ b/mm/memory_hotplug.c
> @@ -1260,7 +1260,13 @@ static int scan_movable_pages(unsigned long start, unsigned long end,
>  		if (!PageHuge(page))
>  			continue;
>  		head = compound_head(page);
> -		if (page_huge_active(head))
> +		/*
> +		 * This test is racy as we hold no reference or lock. The
> +		 * hugetlb page could have been freed and head is no longer
> +		 * a hugetlb page before the following check. In such unlikely
> +		 * cases false positives and negatives are possible.
> +		 */
> +		if (HPageMigratable(head))
>  			goto found;
>  		skip = compound_nr(head) - (page - head);
>  		pfn += skip - 1;
> --
> 2.29.2
>