From: Dev Jain <dev.jain@arm.com>
Date: Tue, 24 Sep 2024 17:47:13 +0530
Message-ID: <91892582-1063-4757-9cc6-664d57b9d828@arm.com>
Subject: Re: [PATCH v5 1/2] mm: Abstract THP allocation
To: Kefeng Wang, akpm@linux-foundation.org, david@redhat.com,
    willy@infradead.org, kirill.shutemov@linux.intel.com
Cc: ryan.roberts@arm.com, anshuman.khandual@arm.com, catalin.marinas@arm.com,
    cl@gentwo.org, vbabka@suse.cz, mhocko@suse.com, apopple@nvidia.com,
    dave.hansen@linux.intel.com, will@kernel.org, baohua@kernel.org,
    jack@suse.cz, mark.rutland@arm.com, hughd@google.com,
    aneesh.kumar@kernel.org, yang@os.amperecomputing.com, peterx@redhat.com,
    ioworker0@gmail.com, jglisse@google.com, ziy@nvidia.com,
    linux-arm-kernel@lists.infradead.org, linux-kernel@vger.kernel.org,
    linux-mm@kvack.org
References: <20240924101654.1777697-1-dev.jain@arm.com>
 <20240924101654.1777697-2-dev.jain@arm.com>
 <61b16640-49e0-4f84-8587-ae9b90a78887@huawei.com>
In-Reply-To: <61b16640-49e0-4f84-8587-ae9b90a78887@huawei.com>

On 9/24/24 16:50, Kefeng Wang wrote:
>
>
> On 2024/9/24 18:16, Dev Jain wrote:
>> In preparation for the second patch, abstract away the THP allocation
>> logic present in the create_huge_pmd() path, which corresponds to the
>> faulting case when no page is present.
>>
>> There should be no functional change as a result of applying this patch,
>> except that, as David notes at [1], a PMD-aligned address should
>> be passed to update_mmu_cache_pmd().
>>
>> [1]: https://lore.kernel.org/all/ddd3fcd2-48b3-4170-bcaa-2fe66e093f43@redhat.com/
>>
>> Acked-by: David Hildenbrand
>> Signed-off-by: Dev Jain
>> ---
>>   mm/huge_memory.c | 98 ++++++++++++++++++++++++++++--------------------
>>   1 file changed, 57 insertions(+), 41 deletions(-)
>>
>> diff --git a/mm/huge_memory.c b/mm/huge_memory.c
>> index 4e34b7f89daf..bdbf67c18f6c 100644
>> --- a/mm/huge_memory.c
>> +++ b/mm/huge_memory.c
>> @@ -1148,47 +1148,81 @@ unsigned long thp_get_unmapped_area(struct file *filp, unsigned long addr,
>>   }
>>   EXPORT_SYMBOL_GPL(thp_get_unmapped_area);
>>
>> -static vm_fault_t __do_huge_pmd_anonymous_page(struct vm_fault *vmf,
>> -            struct page *page, gfp_t gfp)
>> +static struct folio *vma_alloc_anon_folio_pmd(struct vm_area_struct *vma,
>> +                          unsigned long addr)
>>   {
>> -    struct vm_area_struct *vma = vmf->vma;
>> -    struct folio *folio = page_folio(page);
>> -    pgtable_t pgtable;
>> -    unsigned long haddr = vmf->address & HPAGE_PMD_MASK;
>> -    vm_fault_t ret = 0;
>> +    unsigned long haddr = addr & HPAGE_PMD_MASK;
>> +    gfp_t gfp = vma_thp_gfp_mask(vma);
>> +    const int order = HPAGE_PMD_ORDER;
>> +    struct folio *folio = vma_alloc_folio(gfp, order, vma, haddr, true);
>
> There is a warning without NUMA,
>
> ../mm/huge_memory.c: In function ‘vma_alloc_anon_folio_pmd’:
> ../mm/huge_memory.c:1154:16: warning: unused variable ‘haddr’ [-Wunused-variable]
>  1154 |  unsigned long haddr = addr & HPAGE_PMD_MASK;
>       |                ^~~~~

But why is this happening?

>
> diff --git a/mm/huge_memory.c b/mm/huge_memory.c
> index c584e77efe10..147a6e069c71 100644
> --- a/mm/huge_memory.c
> +++ b/mm/huge_memory.c
> @@ -1151,11 +1151,11 @@ EXPORT_SYMBOL_GPL(thp_get_unmapped_area);
>  static struct folio *vma_alloc_anon_folio_pmd(struct vm_area_struct *vma,
>                                               unsigned long addr)
>  {
> -       unsigned long haddr = addr & HPAGE_PMD_MASK;
>         gfp_t gfp = vma_thp_gfp_mask(vma);
>         const int order = HPAGE_PMD_ORDER;
> -       struct folio *folio = vma_alloc_folio(gfp, order, vma, haddr, true);
> +       struct folio *folio;
>
> +       folio = vma_alloc_folio(gfp, order, vma, addr & HPAGE_PMD_MASK, true);
>         if (unlikely(!folio)) {
>                 count_vm_event(THP_FAULT_FALLBACK);
>                 count_mthp_stat(order, MTHP_STAT_ANON_FAULT_FALLBACK);
>
>> -    VM_BUG_ON_FOLIO(!folio_test_large(folio), folio);
>> +    if (unlikely(!folio)) {
>> +        count_vm_event(THP_FAULT_FALLBACK);
>> +        count_mthp_stat(order, MTHP_STAT_ANON_FAULT_FALLBACK);
>> +        goto out;
>
> Maybe return NULL to omit the out?

Ah sorry, I have made a mess of unnecessary delayed returns :)

>
>
> Reviewed-by: Kefeng Wang

Thanks!
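(For reference, the warning Kefeng reports above is most plausibly explained
by the !CONFIG_NUMA build: there, vma_alloc_folio() is not a real function
but a stub macro in include/linux/gfp.h that never evaluates its addr
argument, so a local whose only use is forming that argument is genuinely
unused. A paraphrased sketch of the two definitions, not verbatim kernel
source since the exact stub varies by release:

#ifdef CONFIG_NUMA
/* Real function: the fault address feeds mempolicy-aware placement. */
struct folio *vma_alloc_folio(gfp_t gfp, int order, struct vm_area_struct *vma,
		unsigned long addr, bool hugepage);
#else
/*
 * Stub for !NUMA kernels: the addr argument is dropped at preprocessing
 * time, so a variable used only to build it triggers -Wunused-variable.
 */
#define vma_alloc_folio(gfp, order, vma, addr, hugepage) \
	folio_alloc(gfp, order)
#endif

This would also be why folding the mask into the call, as in Kefeng's diff
above, silences the warning: the expression vanishes together with the
macro argument.)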
>
>
>> +    }
>>
>> +    VM_BUG_ON_FOLIO(!folio_test_large(folio), folio);
>>       if (mem_cgroup_charge(folio, vma->vm_mm, gfp)) {
>>           folio_put(folio);
>>           count_vm_event(THP_FAULT_FALLBACK);
>>           count_vm_event(THP_FAULT_FALLBACK_CHARGE);
>> -        count_mthp_stat(HPAGE_PMD_ORDER, MTHP_STAT_ANON_FAULT_FALLBACK);
>> -        count_mthp_stat(HPAGE_PMD_ORDER, MTHP_STAT_ANON_FAULT_FALLBACK_CHARGE);
>> -        return VM_FAULT_FALLBACK;
>> +        count_mthp_stat(order, MTHP_STAT_ANON_FAULT_FALLBACK);
>> +        count_mthp_stat(order, MTHP_STAT_ANON_FAULT_FALLBACK_CHARGE);
>> +        return NULL;
>>       }
>>       folio_throttle_swaprate(folio, gfp);
>>
>> -    pgtable = pte_alloc_one(vma->vm_mm);
>> -    if (unlikely(!pgtable)) {
>> -        ret = VM_FAULT_OOM;
>> -        goto release;
>> -    }
>> -
>> -    folio_zero_user(folio, vmf->address);
>> +    folio_zero_user(folio, addr);
>>       /*
>>        * The memory barrier inside __folio_mark_uptodate makes sure that
>>        * folio_zero_user writes become visible before the set_pmd_at()
>>        * write.
>>        */
>>       __folio_mark_uptodate(folio);
>> +out:
>> +    return folio;
>> +}
>> +
>> +static void map_anon_folio_pmd(struct folio *folio, pmd_t *pmd,
>> +                   struct vm_area_struct *vma, unsigned long haddr)
>> +{
>> +    pmd_t entry;
>> +
>> +    entry = mk_huge_pmd(&folio->page, vma->vm_page_prot);
>> +    entry = maybe_pmd_mkwrite(pmd_mkdirty(entry), vma);
>> +    folio_add_new_anon_rmap(folio, vma, haddr, RMAP_EXCLUSIVE);
>> +    folio_add_lru_vma(folio, vma);
>> +    set_pmd_at(vma->vm_mm, haddr, pmd, entry);
>> +    update_mmu_cache_pmd(vma, haddr, pmd);
>> +    add_mm_counter(vma->vm_mm, MM_ANONPAGES, HPAGE_PMD_NR);
>> +    count_vm_event(THP_FAULT_ALLOC);
>> +    count_mthp_stat(HPAGE_PMD_ORDER, MTHP_STAT_ANON_FAULT_ALLOC);
>> +    count_memcg_event_mm(vma->vm_mm, THP_FAULT_ALLOC);
>> +}
>> +
>> +static vm_fault_t __do_huge_pmd_anonymous_page(struct vm_fault *vmf)
>> +{
>> +    unsigned long haddr = vmf->address & HPAGE_PMD_MASK;
>> +    struct vm_area_struct *vma = vmf->vma;
>> +    struct folio *folio;
>> +    pgtable_t pgtable;
>> +    vm_fault_t ret = 0;
>> +
>> +    folio = vma_alloc_anon_folio_pmd(vma, vmf->address);
>> +    if (unlikely(!folio))
>> +        return VM_FAULT_FALLBACK;
>> +
>> +    pgtable = pte_alloc_one(vma->vm_mm);
>> +    if (unlikely(!pgtable)) {
>> +        ret = VM_FAULT_OOM;
>> +        goto release;
>> +    }
>>
>>       vmf->ptl = pmd_lock(vma->vm_mm, vmf->pmd);
>>       if (unlikely(!pmd_none(*vmf->pmd))) {
>>           goto unlock_release;
>>       } else {
>> -        pmd_t entry;
>> -
>>           ret = check_stable_address_space(vma->vm_mm);
>>           if (ret)
>>               goto unlock_release;
>> @@ -1202,21 +1236,11 @@ static vm_fault_t __do_huge_pmd_anonymous_page(struct vm_fault *vmf,
>>               VM_BUG_ON(ret & VM_FAULT_FALLBACK);
>>               return ret;
>>           }
>> -
>> -        entry = mk_huge_pmd(page, vma->vm_page_prot);
>> -        entry = maybe_pmd_mkwrite(pmd_mkdirty(entry), vma);
>> -        folio_add_new_anon_rmap(folio, vma, haddr, RMAP_EXCLUSIVE);
>> -        folio_add_lru_vma(folio, vma);
>>           pgtable_trans_huge_deposit(vma->vm_mm, vmf->pmd, pgtable);
>> -        set_pmd_at(vma->vm_mm, haddr, vmf->pmd, entry);
>> -        update_mmu_cache_pmd(vma, vmf->address, vmf->pmd);
>> -        add_mm_counter(vma->vm_mm, MM_ANONPAGES, HPAGE_PMD_NR);
>> +        map_anon_folio_pmd(folio, vmf->pmd, vma, haddr);
>>           mm_inc_nr_ptes(vma->vm_mm);
>>           deferred_split_folio(folio, false);
>>           spin_unlock(vmf->ptl);
>> -        count_vm_event(THP_FAULT_ALLOC);
>> -        count_mthp_stat(HPAGE_PMD_ORDER, MTHP_STAT_ANON_FAULT_ALLOC);
>> -        count_memcg_event_mm(vma->vm_mm, THP_FAULT_ALLOC);
>>       }
>>
>>       return 0;
>> @@ -1283,8 +1307,6 @@ static void set_huge_zero_folio(pgtable_t pgtable, struct mm_struct *mm,
>>   vm_fault_t do_huge_pmd_anonymous_page(struct vm_fault *vmf)
>>   {
>>       struct vm_area_struct *vma = vmf->vma;
>> -    gfp_t gfp;
>> -    struct folio *folio;
>>       unsigned long haddr = vmf->address & HPAGE_PMD_MASK;
>>       vm_fault_t ret;
>>
>> @@ -1335,14 +1357,8 @@ vm_fault_t do_huge_pmd_anonymous_page(struct vm_fault *vmf)
>>           }
>>           return ret;
>>       }
>> -    gfp = vma_thp_gfp_mask(vma);
>> -    folio = vma_alloc_folio(gfp, HPAGE_PMD_ORDER, vma, haddr, true);
>> -    if (unlikely(!folio)) {
>> -        count_vm_event(THP_FAULT_FALLBACK);
>> -        count_mthp_stat(HPAGE_PMD_ORDER, MTHP_STAT_ANON_FAULT_FALLBACK);
>> -        return VM_FAULT_FALLBACK;
>> -    }
>> -    return __do_huge_pmd_anonymous_page(vmf, &folio->page, gfp);
>> +
>> +    return __do_huge_pmd_anonymous_page(vmf);
>>   }
>>
>>   static void insert_pfn_pmd(struct vm_area_struct *vma, unsigned long addr,
>
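As a follow-up to the review points above, here is a minimal sketch, not the
committed version, of how the allocation helper might look with both
suggestions folded in: no haddr local (which avoids the !CONFIG_NUMA
-Wunused-variable warning) and direct returns instead of the delayed
"goto out":

static struct folio *vma_alloc_anon_folio_pmd(struct vm_area_struct *vma,
					      unsigned long addr)
{
	gfp_t gfp = vma_thp_gfp_mask(vma);
	const int order = HPAGE_PMD_ORDER;
	struct folio *folio;

	/* Mask to the PMD-aligned address at the call site; no local needed. */
	folio = vma_alloc_folio(gfp, order, vma, addr & HPAGE_PMD_MASK, true);
	if (unlikely(!folio)) {
		count_vm_event(THP_FAULT_FALLBACK);
		count_mthp_stat(order, MTHP_STAT_ANON_FAULT_FALLBACK);
		return NULL;
	}

	VM_BUG_ON_FOLIO(!folio_test_large(folio), folio);
	if (mem_cgroup_charge(folio, vma->vm_mm, gfp)) {
		folio_put(folio);
		count_vm_event(THP_FAULT_FALLBACK);
		count_vm_event(THP_FAULT_FALLBACK_CHARGE);
		count_mthp_stat(order, MTHP_STAT_ANON_FAULT_FALLBACK);
		count_mthp_stat(order, MTHP_STAT_ANON_FAULT_FALLBACK_CHARGE);
		return NULL;
	}
	folio_throttle_swaprate(folio, gfp);

	folio_zero_user(folio, addr);
	/*
	 * The memory barrier inside __folio_mark_uptodate makes sure that
	 * folio_zero_user writes become visible before the set_pmd_at()
	 * write.
	 */
	__folio_mark_uptodate(folio);
	return folio;
}

The caller is unchanged: __do_huge_pmd_anonymous_page() still maps a NULL
return to VM_FAULT_FALLBACK.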