From: Muchun Song
Date: Fri, 20 Nov 2020 17:30:27 +0800
Subject: Re: [External] Re: [PATCH v5 13/21] mm/hugetlb: Use PG_slab to indicate split pmd
To: Michal Hocko
Cc: Jonathan Corbet, Mike Kravetz, Thomas Gleixner, mingo@redhat.com,
    bp@alien8.de, x86@kernel.org, hpa@zytor.com,
    dave.hansen@linux.intel.com, luto@kernel.org, Peter Zijlstra,
    viro@zeniv.linux.org.uk, Andrew Morton, paulmck@kernel.org,
    mchehab+huawei@kernel.org, pawan.kumar.gupta@linux.intel.com,
    Randy Dunlap, oneukum@suse.com, anshuman.khandual@arm.com,
    jroedel@suse.de, Mina Almasry, David Rientjes, Matthew Wilcox,
    Oscar Salvador, "Song Bao Hua (Barry Song)", Xiongchun duan,
    linux-doc@vger.kernel.org, LKML, Linux Memory Management List,
    linux-fsdevel
References: <20201120064325.34492-1-songmuchun@bytedance.com>
    <20201120064325.34492-14-songmuchun@bytedance.com>
    <20201120081638.GD3200@dhcp22.suse.cz>
In-Reply-To: <20201120081638.GD3200@dhcp22.suse.cz>

On Fri, Nov 20, 2020 at 4:16 PM Michal Hocko wrote:
>
> On Fri 20-11-20 14:43:17, Muchun Song wrote:
> > When we allocate hugetlb page from buddy, we may need split huge pmd
> > to pte. When we free the hugetlb page, we can merge pte to pmd. So
> > we need to distinguish whether the previous pmd has been split. The
> > page table is not allocated from slab. So we can reuse the PG_slab
> > to indicate that the pmd has been split.
>
> PageSlab is used outside of the slab allocator proper and that code
> might get confused by this AFAICS.

I understand your concern. Maybe we can use PG_private instead of
PG_slab.

>
> From the above description it is not really clear why this is needed
> though. Who is supposed to use this?
> Say you are allocating a fresh
> hugetlb page. Once you have it, nobody else can be interfering. It is
> exclusive to the caller. The later machinery can check the vmemmap page
> tables to find out whether a split is needed or not. Or do I miss
> something?

Yeah, the commit log needs some improvement. The vmemmap pages can be
mapped with a huge page mapping or a base page (e.g. 4KB) mapping. Both
cases may exist at the same time, so I want to know which page size the
vmemmap pages are mapped with. If we have split a PMD page table, we
set the flag. When we free the HugeTLB page and the flag is set, we
want to merge the PTE page table back into a PMD. If the flag is not
set, we do nothing to the PTE page table.

Thanks.

>
> > Signed-off-by: Muchun Song
> > ---
> >  mm/hugetlb_vmemmap.c | 26 ++++++++++++++++++++++++--
> >  1 file changed, 24 insertions(+), 2 deletions(-)
> >
> > diff --git a/mm/hugetlb_vmemmap.c b/mm/hugetlb_vmemmap.c
> > index 06e2b8a7b7c8..e2ddc73ce25f 100644
> > --- a/mm/hugetlb_vmemmap.c
> > +++ b/mm/hugetlb_vmemmap.c
> > @@ -293,6 +293,25 @@ static void remap_huge_page_pmd_vmemmap(struct hstate *h, pmd_t *pmd,
> >  	flush_tlb_kernel_range(start, end);
> >  }
> >
> > +static inline bool pmd_split(pmd_t *pmd)
> > +{
> > +	return PageSlab(pmd_page(*pmd));
> > +}
> > +
> > +static inline void set_pmd_split(pmd_t *pmd)
> > +{
> > +	/*
> > +	 * We should not use slab for page table allocation. So we can set
> > +	 * PG_slab to indicate that the pmd has been split.
> > +	 */
> > +	__SetPageSlab(pmd_page(*pmd));
> > +}
> > +
> > +static inline void clear_pmd_split(pmd_t *pmd)
> > +{
> > +	__ClearPageSlab(pmd_page(*pmd));
> > +}
> > +
> >  static void __remap_huge_page_pte_vmemmap(struct page *reuse, pte_t *ptep,
> >  					  unsigned long start,
> >  					  unsigned long end,
> > @@ -357,11 +376,12 @@ void alloc_huge_page_vmemmap(struct hstate *h, struct page *head)
> >  	ptl = vmemmap_pmd_lock(pmd);
> >  	remap_huge_page_pmd_vmemmap(h, pmd, (unsigned long)head, &remap_pages,
> >  				    __remap_huge_page_pte_vmemmap);
> > -	if (!freed_vmemmap_hpage_dec(pmd_page(*pmd))) {
> > +	if (!freed_vmemmap_hpage_dec(pmd_page(*pmd)) && pmd_split(pmd)) {
> >  		/*
> >  		 * Todo:
> >  		 * Merge pte to huge pmd if it has ever been split.
> >  		 */
> > +		clear_pmd_split(pmd);
> >  	}
> >  	spin_unlock(ptl);
> >  }
> > @@ -443,8 +463,10 @@ void free_huge_page_vmemmap(struct hstate *h, struct page *head)
> >  	BUG_ON(!pmd);
> >
> >  	ptl = vmemmap_pmd_lock(pmd);
> > -	if (vmemmap_pmd_huge(pmd))
> > +	if (vmemmap_pmd_huge(pmd)) {
> >  		split_vmemmap_huge_page(head, pmd);
> > +		set_pmd_split(pmd);
> > +	}
> >
> >  	remap_huge_page_pmd_vmemmap(h, pmd, (unsigned long)head, &free_pages,
> >  				    __free_huge_page_pte_vmemmap);
> > --
> > 2.11.0
> >
>
> --
> Michal Hocko
> SUSE Labs

--
Yours,
Muchun
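
The flag bookkeeping discussed in this thread can be sketched as a small
userspace model. This is only an illustration under stated assumptions,
not the kernel implementation: struct pt_page, struct vmemmap_pmd,
model_free_vmemmap() and model_alloc_vmemmap() are hypothetical names, a
plain bool stands in for the page flag (PG_slab in the patch, PG_private
as floated in the reply), freed_vmemmap_hpage_dec() is assumed to return
the remaining count, and the merge that the patch leaves as a Todo is
modeled by simply flipping the mapping back to huge.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the struct page behind a vmemmap PTE table. */
struct pt_page {
	bool split_flag;	/* models the page flag marking "this PMD was split" */
	unsigned int freed;	/* HugeTLB pages whose vmemmap is currently freed */
};

/* Hypothetical stand-in for one vmemmap PMD entry. */
struct vmemmap_pmd {
	bool huge;		/* true: huge mapping, false: base page (PTE) mapping */
	struct pt_page table;
};

/*
 * Models free_huge_page_vmemmap(): if the PMD is still a huge mapping,
 * split it and remember that we did so, then account for one more
 * HugeTLB page whose vmemmap pages have been freed.
 */
static inline void model_free_vmemmap(struct vmemmap_pmd *pmd)
{
	if (pmd->huge) {
		pmd->huge = false;		/* split_vmemmap_huge_page() */
		pmd->table.split_flag = true;	/* set_pmd_split() */
	}
	pmd->table.freed++;
}

/*
 * Models alloc_huge_page_vmemmap(): once the last user under this PMD is
 * gone and the flag says we split it earlier, merge back and clear the
 * flag. A PMD that was never split by us is left untouched.
 */
static inline void model_alloc_vmemmap(struct vmemmap_pmd *pmd)
{
	assert(pmd->table.freed > 0);
	pmd->table.freed--;
	if (pmd->table.freed == 0 && pmd->table.split_flag) {
		pmd->huge = true;		/* the merge left as a Todo in the patch */
		pmd->table.split_flag = false;	/* clear_pmd_split() */
	}
}
```

In this model, two calls to model_free_vmemmap() followed by two calls to
model_alloc_vmemmap() on a PMD that starts out huge end with the huge
mapping restored and the flag cleared. A PMD that starts out with a base
page mapping never gets the flag set, so nothing is merged for it.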