From: Muchun Song <songmuchun@bytedance.com>
Date: Mon, 30 May 2022 23:36:43 +0800
To: Mike Kravetz <mike.kravetz@oracle.com>
Cc: linux-kernel@vger.kernel.org, linux-mm@kvack.org, Michal Hocko,
	Peter Xu, Naoya Horiguchi, James Houghton, Mina Almasry,
	"Aneesh Kumar K. V", Anshuman Khandual, Paul Walmsley,
	Christian Borntraeger, Andrew Morton
Subject: Re: [RFC PATCH 2/3] hugetlb: do not update address in huge_pmd_unshare
In-Reply-To: <20220527225849.284839-3-mike.kravetz@oracle.com>
References: <20220527225849.284839-1-mike.kravetz@oracle.com>
 <20220527225849.284839-3-mike.kravetz@oracle.com>

On Fri, May 27, 2022 at 03:58:48PM -0700, Mike Kravetz wrote:
> As an optimization for loops sequentially processing hugetlb address
> ranges, huge_pmd_unshare would update a passed address if it unshared a
> pmd. Updating a loop control variable outside the loop like this is
> generally a bad idea. These loops are now using hugetlb_mask_last_hp

Totally agree.

> to optimize scanning when non-present ptes are discovered. The same
> can be done when huge_pmd_unshare returns 1, indicating a pmd was
> unshared.
>
> Remove the address update from huge_pmd_unshare. Change the passed
> argument type and update all callers. In loops sequentially processing
> addresses, use hugetlb_mask_last_hp to update the address if a pmd is
> unshared.
>
> Signed-off-by: Mike Kravetz <mike.kravetz@oracle.com>

Acked-by: Muchun Song <songmuchun@bytedance.com>

Some nits below.
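One general note before the diff, since the arithmetic is easy to miss:
ORing last_addr_mask into the address moves it to the last PMD slot of
the PUD_SIZE area that was just unmapped by clearing the pud, so the
loop's own "+= PMD_SIZE" step then lands on the first address past that
area. A minimal userspace sketch of the arithmetic (my illustration
using x86_64-style constants, not code from the patch):

#include <stdio.h>

/* Assumed x86_64 sizes: 2 MiB PMD, 1 GiB PUD. */
#define PMD_SIZE (1UL << 21)
#define PUD_SIZE (1UL << 30)

int main(void)
{
	/* What hugetlb_mask_last_hp() returns for a PMD-sized hstate. */
	unsigned long last_addr_mask = PUD_SIZE - PMD_SIZE;
	/* A PMD-aligned address where a shared pmd was just unshared. */
	unsigned long address = 0x80000000UL;

	address |= last_addr_mask;	/* 0xbfe00000, last PMD slot */
	address += PMD_SIZE;		/* 0xc0000000, next PUD boundary */

	printf("next address scanned: %#lx\n", address);
	return 0;
}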
> ---
>  include/linux/hugetlb.h |  4 ++--
>  mm/hugetlb.c            | 46 ++++++++++++++++++-----------------------
>  mm/rmap.c               |  4 ++--
>  3 files changed, 24 insertions(+), 30 deletions(-)
>
> diff --git a/include/linux/hugetlb.h b/include/linux/hugetlb.h
> index 25078a0ea1d8..307c8f6e6752 100644
> --- a/include/linux/hugetlb.h
> +++ b/include/linux/hugetlb.h
> @@ -196,7 +196,7 @@ pte_t *huge_pte_offset(struct mm_struct *mm,
>  			unsigned long addr, unsigned long sz);
>  unsigned long hugetlb_mask_last_hp(struct hstate *h);
>  int huge_pmd_unshare(struct mm_struct *mm, struct vm_area_struct *vma,
> -				unsigned long *addr, pte_t *ptep);
> +				unsigned long addr, pte_t *ptep);
>  void adjust_range_if_pmd_sharing_possible(struct vm_area_struct *vma,
>  				unsigned long *start, unsigned long *end);
>  struct page *follow_huge_addr(struct mm_struct *mm, unsigned long address,
> @@ -243,7 +243,7 @@ static inline struct address_space *hugetlb_page_mapping_lock_write(
>
>  static inline int huge_pmd_unshare(struct mm_struct *mm,
>  					struct vm_area_struct *vma,
> -					unsigned long *addr, pte_t *ptep)
> +					unsigned long addr, pte_t *ptep)
>  {
>  	return 0;
>  }
> diff --git a/mm/hugetlb.c b/mm/hugetlb.c
> index a2db878b2255..c7d3fbf3ec05 100644
> --- a/mm/hugetlb.c
> +++ b/mm/hugetlb.c
> @@ -4940,7 +4940,6 @@ int move_hugetlb_page_tables(struct vm_area_struct *vma,
>  	struct mm_struct *mm = vma->vm_mm;
>  	unsigned long old_end = old_addr + len;
>  	unsigned long last_addr_mask;
> -	unsigned long old_addr_copy;
>  	pte_t *src_pte, *dst_pte;
>  	struct mmu_notifier_range range;
>  	bool shared_pmd = false;
> @@ -4968,14 +4967,10 @@ int move_hugetlb_page_tables(struct vm_area_struct *vma,
>  		if (huge_pte_none(huge_ptep_get(src_pte)))
>  			continue;
>
> -		/* old_addr arg to huge_pmd_unshare() is a pointer and so the
> -		 * arg may be modified. Pass a copy instead to preserve the
> -		 * value in old_addr.
> -		 */
> -		old_addr_copy = old_addr;
> -
> -		if (huge_pmd_unshare(mm, vma, &old_addr_copy, src_pte)) {
> +		if (huge_pmd_unshare(mm, vma, old_addr, src_pte)) {
>  			shared_pmd = true;
> +			old_addr |= last_addr_mask;
> +			new_addr |= last_addr_mask;
>  			continue;
>  		}
>
> @@ -5040,10 +5035,11 @@ static void __unmap_hugepage_range(struct mmu_gather *tlb, struct vm_area_struct
>  		}
>
>  		ptl = huge_pte_lock(h, mm, ptep);
> -		if (huge_pmd_unshare(mm, vma, &address, ptep)) {
> +		if (huge_pmd_unshare(mm, vma, address, ptep)) {
>  			spin_unlock(ptl);
>  			tlb_flush_pmd_range(tlb, address & PUD_MASK, PUD_SIZE);
>  			force_flush = true;
> +			address |= last_addr_mask;
>  			continue;
>  		}
>
> @@ -6327,7 +6323,7 @@ unsigned long hugetlb_change_protection(struct vm_area_struct *vma,
>  			continue;
>  		}
>  		ptl = huge_pte_lock(h, mm, ptep);
> -		if (huge_pmd_unshare(mm, vma, &address, ptep)) {
> +		if (huge_pmd_unshare(mm, vma, address, ptep)) {
>  			/*
>  			 * When uffd-wp is enabled on the vma, unshare
>  			 * shouldn't happen at all.  Warn about it if it
> @@ -6337,6 +6333,7 @@ unsigned long hugetlb_change_protection(struct vm_area_struct *vma,
>  			pages++;
>  			spin_unlock(ptl);
>  			shared_pmd = true;
> +			address |= last_addr_mask;
>  			continue;
>  		}
>  		pte = huge_ptep_get(ptep);
> @@ -6760,11 +6757,11 @@ pte_t *huge_pmd_share(struct mm_struct *mm, struct vm_area_struct *vma,
>   * 0 the underlying pte page is not shared, or it is the last user
>   */
>  int huge_pmd_unshare(struct mm_struct *mm, struct vm_area_struct *vma,
> -					unsigned long *addr, pte_t *ptep)
> +					unsigned long addr, pte_t *ptep)
>  {
> -	pgd_t *pgd = pgd_offset(mm, *addr);
> -	p4d_t *p4d = p4d_offset(pgd, *addr);
> -	pud_t *pud = pud_offset(p4d, *addr);
> +	pgd_t *pgd = pgd_offset(mm, addr);
> +	p4d_t *p4d = p4d_offset(pgd, addr);
> +	pud_t *pud = pud_offset(p4d, addr);
>
>  	i_mmap_assert_write_locked(vma->vm_file->f_mapping);
>  	BUG_ON(page_count(virt_to_page(ptep)) == 0);
> @@ -6774,14 +6771,6 @@ int huge_pmd_unshare(struct mm_struct *mm, struct vm_area_struct *vma,
>  	pud_clear(pud);
>  	put_page(virt_to_page(ptep));
>  	mm_dec_nr_pmds(mm);
> -	/*
> -	 * This update of passed address optimizes loops sequentially
> -	 * processing addresses in increments of huge page size (PMD_SIZE
> -	 * in this case). By clearing the pud, a PUD_SIZE area is unmapped.
> -	 * Update address to the 'last page' in the cleared area so that
> -	 * calling loop can move to first page past this area.
> -	 */
> -	*addr |= PUD_SIZE - PMD_SIZE;
>  	return 1;
>  }
>
> @@ -6793,7 +6782,7 @@ pte_t *huge_pmd_share(struct mm_struct *mm, struct vm_area_struct *vma,
>  }
>
>  int huge_pmd_unshare(struct mm_struct *mm, struct vm_area_struct *vma,
> -				unsigned long *addr, pte_t *ptep)
> +				unsigned long addr, pte_t *ptep)
>  {
>  	return 0;
>  }
> @@ -6902,6 +6891,13 @@ unsigned long hugetlb_mask_last_hp(struct hstate *h)
>  /* See description above.  Architectures can provide their own version. */
>  __weak unsigned long hugetlb_mask_last_hp(struct hstate *h)
>  {
> +	unsigned long hp_size = huge_page_size(h);
> +
> +#ifdef CONFIG_ARCH_WANT_HUGE_PMD_SHARE
> +	if (hp_size == PMD_SIZE)	/* required for pmd sharing */
> +		return PUD_SIZE - PMD_SIZE;
> +#endif
> +
>  	return ~(0);

Should be ~0UL (though the fix belongs in the previous patch, which
introduced this line). See the quick check after the diff for why this
is a type cleanup rather than a behavior change.

>  }
>
> @@ -7128,14 +7124,12 @@ void hugetlb_unshare_all_pmds(struct vm_area_struct *vma)
>  	mmu_notifier_invalidate_range_start(&range);
>  	i_mmap_lock_write(vma->vm_file->f_mapping);
>  	for (address = start; address < end; address += PUD_SIZE) {
> -		unsigned long tmp = address;
> -
>  		ptep = huge_pte_offset(mm, address, sz);
>  		if (!ptep)
>  			continue;
>  		ptl = huge_pte_lock(h, mm, ptep);
>  		/* We don't want 'address' to be changed */

This comment is now dead and should be removed as well.

> -		huge_pmd_unshare(mm, vma, &tmp, ptep);
> +		huge_pmd_unshare(mm, vma, address, ptep);
>  		spin_unlock(ptl);
>  	}
>  	flush_hugetlb_tlb_range(vma, start, end);
> diff --git a/mm/rmap.c b/mm/rmap.c
> index 5bcb334cd6f2..45b04e2e83ab 100644
> --- a/mm/rmap.c
> +++ b/mm/rmap.c
> @@ -1559,7 +1559,7 @@ static bool try_to_unmap_one(struct folio *folio, struct vm_area_struct *vma,
>  			 */
>  			VM_BUG_ON(!(flags & TTU_RMAP_LOCKED));
>
> -			if (huge_pmd_unshare(mm, vma, &address, pvmw.pte)) {
> +			if (huge_pmd_unshare(mm, vma, address, pvmw.pte)) {
>  				flush_tlb_range(vma, range.start, range.end);
>  				mmu_notifier_invalidate_range(mm, range.start,
>  							      range.end);
> @@ -1923,7 +1923,7 @@ static bool try_to_migrate_one(struct folio *folio, struct vm_area_struct *vma,
>  			 */
>  			VM_BUG_ON(!(flags & TTU_RMAP_LOCKED));
>
> -			if (huge_pmd_unshare(mm, vma, &address, pvmw.pte)) {
> +			if (huge_pmd_unshare(mm, vma, address, pvmw.pte)) {
>  				flush_tlb_range(vma, range.start, range.end);
>  				mmu_notifier_invalidate_range(mm, range.start,
>  							      range.end);
> --
> 2.35.3
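Regarding the ~(0) nit above: on LP64 both spellings end up as
ULONG_MAX, since ~0 is an int with value -1 that gets converted when
returned as an unsigned long, so this is a type-cleanliness fix rather
than a behavior change. A quick userspace check (again my illustration,
not part of the patch):

#include <stdio.h>

int main(void)
{
	unsigned long a = ~(0);	/* int -1, implicitly converted */
	unsigned long b = ~0UL;	/* already an unsigned long */

	/* Both print 0xffffffffffffffff on LP64; ~0UL just states the
	 * intended type instead of relying on the implicit conversion. */
	printf("~(0) -> %#lx\n", a);
	printf("~0UL -> %#lx\n", b);
	return 0;
}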