From: Ryan Roberts <ryan.roberts@arm.com>
Date: Thu, 5 Sep 2024 12:08:28 +0100
Subject: Re: [PATCH v2 1/2] mm: Abstract THP allocation
Message-ID: <336ce914-43dc-4613-a339-1a33f16f71ad@arm.com>
To: Dev Jain <dev.jain@arm.com>, akpm@linux-foundation.org, david@redhat.com,
 willy@infradead.org, kirill.shutemov@linux.intel.com
Cc: anshuman.khandual@arm.com, catalin.marinas@arm.com, cl@gentwo.org,
 vbabka@suse.cz, mhocko@suse.com, apopple@nvidia.com,
 dave.hansen@linux.intel.com, will@kernel.org, baohua@kernel.org,
 jack@suse.cz, mark.rutland@arm.com,
 hughd@google.com, aneesh.kumar@kernel.org, yang@os.amperecomputing.com,
 peterx@redhat.com, ioworker0@gmail.com, jglisse@google.com,
 linux-arm-kernel@lists.infradead.org, linux-kernel@vger.kernel.org,
 linux-mm@kvack.org
In-Reply-To: <20240904100923.290042-2-dev.jain@arm.com>
References: <20240904100923.290042-1-dev.jain@arm.com>
 <20240904100923.290042-2-dev.jain@arm.com>

On 04/09/2024 11:09, Dev Jain wrote:
> In preparation for the second patch, abstract away the THP allocation
> logic present in the create_huge_pmd() path, which corresponds to the
> faulting case when no page is present.
>
> There should be no functional change as a result of applying
> this patch.
>
> Signed-off-by: Dev Jain <dev.jain@arm.com>
> ---
>  mm/huge_memory.c | 110 +++++++++++++++++++++++++++++------------------
>  1 file changed, 67 insertions(+), 43 deletions(-)
>
> diff --git a/mm/huge_memory.c b/mm/huge_memory.c
> index 67c86a5d64a6..58125fbcc532 100644
> --- a/mm/huge_memory.c
> +++ b/mm/huge_memory.c
> @@ -943,47 +943,89 @@ unsigned long thp_get_unmapped_area(struct file *filp, unsigned long addr,
>  }
>  EXPORT_SYMBOL_GPL(thp_get_unmapped_area);
>
> -static vm_fault_t __do_huge_pmd_anonymous_page(struct vm_fault *vmf,
> -                struct page *page, gfp_t gfp)
> +static vm_fault_t thp_fault_alloc(gfp_t gfp, int order, struct vm_area_struct *vma,

Is there a reason for specifying order as a parameter? Previously it was
hardcoded to HPAGE_PMD_ORDER, but now thp_fault_alloc() and
__thp_fault_success_stats() both take an order while map_pmd_thp() still
implicitly maps a PMD-sized block. Unless there is a reason you need this
parameter in the next patch (I don't think there is?), I suggest
simplifying.

> +                                  unsigned long haddr, struct folio **foliop,

FWIW, I agree with Kirill's suggestion to just return folio* and drop the
output param.
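Something like the below, perhaps? Untested sketch only, reassembled from
the hunk above, just to illustrate the shape I mean: order goes back to
being hardcoded as HPAGE_PMD_ORDER inside the helper, and the folio becomes
the return value, with NULL meaning "fall back"; __thp_fault_success_stats()
would drop its order parameter in the same way.

static struct folio *thp_fault_alloc(gfp_t gfp, struct vm_area_struct *vma,
                                     unsigned long haddr, unsigned long addr)
{
        struct folio *folio = vma_alloc_folio(gfp, HPAGE_PMD_ORDER, vma,
                                              haddr, true);

        /* NULL tells the caller to fall back to order-0. */
        if (unlikely(!folio)) {
                count_vm_event(THP_FAULT_FALLBACK);
                count_mthp_stat(HPAGE_PMD_ORDER, MTHP_STAT_ANON_FAULT_FALLBACK);
                return NULL;
        }

        VM_BUG_ON_FOLIO(!folio_test_large(folio), folio);
        if (mem_cgroup_charge(folio, vma->vm_mm, gfp)) {
                folio_put(folio);
                count_vm_event(THP_FAULT_FALLBACK);
                count_vm_event(THP_FAULT_FALLBACK_CHARGE);
                count_mthp_stat(HPAGE_PMD_ORDER, MTHP_STAT_ANON_FAULT_FALLBACK);
                count_mthp_stat(HPAGE_PMD_ORDER,
                                MTHP_STAT_ANON_FAULT_FALLBACK_CHARGE);
                return NULL;
        }
        folio_throttle_swaprate(folio, gfp);

        folio_zero_user(folio, addr);
        /*
         * The memory barrier inside __folio_mark_uptodate makes sure that
         * folio_zero_user writes become visible before the set_pmd_at()
         * write.
         */
        __folio_mark_uptodate(folio);
        return folio;
}

Then the caller in __do_huge_pmd_anonymous_page() maps NULL onto
VM_FAULT_FALLBACK itself, which also fits nicely with the if (folio) check
you already have in the release: path:

        folio = thp_fault_alloc(gfp, vma, haddr, vmf->address);
        if (unlikely(!folio)) {
                ret = VM_FAULT_FALLBACK;
                goto release;
        }
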
Thanks,
Ryan

> +                                  unsigned long addr)
>  {
> -        struct vm_area_struct *vma = vmf->vma;
> -        struct folio *folio = page_folio(page);
> -        pgtable_t pgtable;
> -        unsigned long haddr = vmf->address & HPAGE_PMD_MASK;
> -        vm_fault_t ret = 0;
> +        struct folio *folio = vma_alloc_folio(gfp, order, vma, haddr, true);
>
> -        VM_BUG_ON_FOLIO(!folio_test_large(folio), folio);
> +        *foliop = folio;
> +        if (unlikely(!folio)) {
> +                count_vm_event(THP_FAULT_FALLBACK);
> +                count_mthp_stat(order, MTHP_STAT_ANON_FAULT_FALLBACK);
> +                return VM_FAULT_FALLBACK;
> +        }
>
> +        VM_BUG_ON_FOLIO(!folio_test_large(folio), folio);
>          if (mem_cgroup_charge(folio, vma->vm_mm, gfp)) {
>                  folio_put(folio);
>                  count_vm_event(THP_FAULT_FALLBACK);
>                  count_vm_event(THP_FAULT_FALLBACK_CHARGE);
> -                count_mthp_stat(HPAGE_PMD_ORDER, MTHP_STAT_ANON_FAULT_FALLBACK);
> -                count_mthp_stat(HPAGE_PMD_ORDER, MTHP_STAT_ANON_FAULT_FALLBACK_CHARGE);
> +                count_mthp_stat(order, MTHP_STAT_ANON_FAULT_FALLBACK);
> +                count_mthp_stat(order, MTHP_STAT_ANON_FAULT_FALLBACK_CHARGE);
>                  return VM_FAULT_FALLBACK;
>          }
>          folio_throttle_swaprate(folio, gfp);
>
> -        pgtable = pte_alloc_one(vma->vm_mm);
> -        if (unlikely(!pgtable)) {
> -                ret = VM_FAULT_OOM;
> -                goto release;
> -        }
> -
> -        folio_zero_user(folio, vmf->address);
> +        folio_zero_user(folio, addr);
>          /*
>           * The memory barrier inside __folio_mark_uptodate makes sure that
>           * folio_zero_user writes become visible before the set_pmd_at()
>           * write.
>           */
>          __folio_mark_uptodate(folio);
> +        return 0;
> +}
> +
> +static void __thp_fault_success_stats(struct vm_area_struct *vma, int order)
> +{
> +        count_vm_event(THP_FAULT_ALLOC);
> +        count_mthp_stat(order, MTHP_STAT_ANON_FAULT_ALLOC);
> +        count_memcg_event_mm(vma->vm_mm, THP_FAULT_ALLOC);
> +}
> +
> +static void map_pmd_thp(struct folio *folio, struct vm_fault *vmf,
> +                        struct vm_area_struct *vma, unsigned long haddr,
> +                        pgtable_t pgtable)
> +{
> +        pmd_t entry;
> +
> +        entry = mk_huge_pmd(&folio->page, vma->vm_page_prot);
> +        entry = maybe_pmd_mkwrite(pmd_mkdirty(entry), vma);
> +        folio_add_new_anon_rmap(folio, vma, haddr, RMAP_EXCLUSIVE);
> +        folio_add_lru_vma(folio, vma);
> +        pgtable_trans_huge_deposit(vma->vm_mm, vmf->pmd, pgtable);
> +        set_pmd_at(vma->vm_mm, haddr, vmf->pmd, entry);
> +        update_mmu_cache_pmd(vma, vmf->address, vmf->pmd);
> +        add_mm_counter(vma->vm_mm, MM_ANONPAGES, HPAGE_PMD_NR);
> +        mm_inc_nr_ptes(vma->vm_mm);
> +}
> +
> +static vm_fault_t __do_huge_pmd_anonymous_page(struct vm_fault *vmf)
> +{
> +        struct vm_area_struct *vma = vmf->vma;
> +        struct folio *folio = NULL;
> +        pgtable_t pgtable;
> +        unsigned long haddr = vmf->address & HPAGE_PMD_MASK;
> +        vm_fault_t ret = 0;
> +        gfp_t gfp = vma_thp_gfp_mask(vma);
> +
> +        pgtable = pte_alloc_one(vma->vm_mm);
> +        if (unlikely(!pgtable)) {
> +                ret = VM_FAULT_OOM;
> +                goto release;
> +        }
> +
> +        ret = thp_fault_alloc(gfp, HPAGE_PMD_ORDER, vma, haddr, &folio,
> +                              vmf->address);
> +        if (ret)
> +                goto release;
>
>          vmf->ptl = pmd_lock(vma->vm_mm, vmf->pmd);
> +
>          if (unlikely(!pmd_none(*vmf->pmd))) {
>                  goto unlock_release;
>          } else {
> -                pmd_t entry;
> -
>                  ret = check_stable_address_space(vma->vm_mm);
>                  if (ret)
>                          goto unlock_release;
> @@ -997,20 +1039,9 @@ static vm_fault_t __do_huge_pmd_anonymous_page(struct vm_fault *vmf,
>                          VM_BUG_ON(ret & VM_FAULT_FALLBACK);
>                          return ret;
>                  }
> -
> -                entry = mk_huge_pmd(page, vma->vm_page_prot);
> -                entry = maybe_pmd_mkwrite(pmd_mkdirty(entry), vma);
> -                folio_add_new_anon_rmap(folio, vma, haddr, RMAP_EXCLUSIVE);
> -                folio_add_lru_vma(folio, vma);
> -                pgtable_trans_huge_deposit(vma->vm_mm, vmf->pmd, pgtable);
> -                set_pmd_at(vma->vm_mm, haddr, vmf->pmd, entry);
> -                update_mmu_cache_pmd(vma, vmf->address, vmf->pmd);
> -                add_mm_counter(vma->vm_mm, MM_ANONPAGES, HPAGE_PMD_NR);
> -                mm_inc_nr_ptes(vma->vm_mm);
> +                map_pmd_thp(folio, vmf, vma, haddr, pgtable);
>                  spin_unlock(vmf->ptl);
> -                count_vm_event(THP_FAULT_ALLOC);
> -                count_mthp_stat(HPAGE_PMD_ORDER, MTHP_STAT_ANON_FAULT_ALLOC);
> -                count_memcg_event_mm(vma->vm_mm, THP_FAULT_ALLOC);
> +                __thp_fault_success_stats(vma, HPAGE_PMD_ORDER);
>          }
>
>          return 0;
> @@ -1019,7 +1050,8 @@ static vm_fault_t __do_huge_pmd_anonymous_page(struct vm_fault *vmf,
>  release:
>          if (pgtable)
>                  pte_free(vma->vm_mm, pgtable);
> -        folio_put(folio);
> +        if (folio)
> +                folio_put(folio);
>          return ret;
>
>  }
> @@ -1077,8 +1109,6 @@ static void set_huge_zero_folio(pgtable_t pgtable, struct mm_struct *mm,
>  vm_fault_t do_huge_pmd_anonymous_page(struct vm_fault *vmf)
>  {
>          struct vm_area_struct *vma = vmf->vma;
> -        gfp_t gfp;
> -        struct folio *folio;
>          unsigned long haddr = vmf->address & HPAGE_PMD_MASK;
>          vm_fault_t ret;
>
> @@ -1129,14 +1159,8 @@ vm_fault_t do_huge_pmd_anonymous_page(struct vm_fault *vmf)
>          }
>                  return ret;
>          }
> -        gfp = vma_thp_gfp_mask(vma);
> -        folio = vma_alloc_folio(gfp, HPAGE_PMD_ORDER, vma, haddr, true);
> -        if (unlikely(!folio)) {
> -                count_vm_event(THP_FAULT_FALLBACK);
> -                count_mthp_stat(HPAGE_PMD_ORDER, MTHP_STAT_ANON_FAULT_FALLBACK);
> -                return VM_FAULT_FALLBACK;
> -        }
> -        return __do_huge_pmd_anonymous_page(vmf, &folio->page, gfp);
> +
> +        return __do_huge_pmd_anonymous_page(vmf);
>  }
>
>  static void insert_pfn_pmd(struct vm_area_struct *vma, unsigned long addr,