linux-mm.kvack.org archive mirror
From: Qi Zheng <zhengqi.arch@bytedance.com>
To: David Hildenbrand <david@redhat.com>,
	akpm@linux-foundation.org, tglx@linutronix.de,
	hannes@cmpxchg.org, mhocko@kernel.org, vdavydov.dev@gmail.com,
	kirill.shutemov@linux.intel.com, mika.penttila@nextfour.com
Cc: linux-doc@vger.kernel.org, linux-kernel@vger.kernel.org,
	linux-mm@kvack.org, songmuchun@bytedance.com
Subject: Re: [PATCH v1 2/2] mm: remove redundant smp_wmb()
Date: Tue, 31 Aug 2021 20:36:51 +0800
Message-ID: <3f8e9805-b90f-7df3-8514-139afa653671@bytedance.com>
In-Reply-To: <9da807d4-1fcc-72e0-dc9e-91ab9fbeb7c6@redhat.com>



On 2021/8/31 PM6:02, David Hildenbrand wrote:
> On 28.08.21 06:23, Qi Zheng wrote:
>> The smp_wmb() in __pte_alloc() ensures that all pte setup is
>> visible before the pte is made visible to other CPUs by being
>> put into the page tables. We only need this when the pte is
>> actually populated, so move it to pmd_install().
>> __pte_alloc_kernel(), __p4d_alloc(), __pud_alloc() and
>> __pmd_alloc() are similar to this case.
>>
>> We can also defer smp_wmb() to the place where the pmd entry
>> is really populated by the preallocated pte. There are two
>> kinds of users of the preallocated pte: one is filemap &
>> finish_fault(), the other is THP. The former does not need
>> another smp_wmb() because the smp_wmb() has been done by
>> pmd_install(). Fortunately, the latter does not need another
>> smp_wmb() either, because there is already a smp_wmb() before
>> populating the new pte when THP uses a preallocated pte to
>> split a huge pmd.
>>
>> Signed-off-by: Qi Zheng <zhengqi.arch@bytedance.com>
>> Reviewed-by: Muchun Song <songmuchun@bytedance.com>
>> ---
>>   mm/memory.c         | 47 ++++++++++++++++++++---------------------------
>>   mm/sparse-vmemmap.c |  2 +-
>>   2 files changed, 21 insertions(+), 28 deletions(-)
>>
>> diff --git a/mm/memory.c b/mm/memory.c
>> index ef7b1762e996..9c7534187454 100644
>> --- a/mm/memory.c
>> +++ b/mm/memory.c
>> @@ -439,6 +439,20 @@ void pmd_install(struct mm_struct *mm, pmd_t *pmd, pgtable_t *pte)
>>       if (likely(pmd_none(*pmd))) {    /* Has another populated it ? */
>>           mm_inc_nr_ptes(mm);
>> +        /*
>> +         * Ensure all pte setup (eg. pte page lock and page clearing) are
>> +         * visible before the pte is made visible to other CPUs by being
>> +         * put into page tables.
>> +         *
>> +         * The other side of the story is the pointer chasing in the page
>> +         * table walking code (when walking the page table without locking;
>> +         * ie. most of the time). Fortunately, these data accesses consist
>> +         * of a chain of data-dependent loads, meaning most CPUs (alpha
>> +         * being the notable exception) will already guarantee loads are
>> +         * seen in-order. See the alpha page table accessors for the
>> +         * smp_rmb() barriers in page table walking code.
>> +         */
>> +        smp_wmb(); /* Could be smp_wmb__xxx(before|after)_spin_lock */
>>           pmd_populate(mm, pmd, *pte);
>>           *pte = NULL;
>>       }
>> @@ -451,21 +465,6 @@ int __pte_alloc(struct mm_struct *mm, pmd_t *pmd)
>>       if (!new)
>>           return -ENOMEM;
>> -    /*
>> -     * Ensure all pte setup (eg. pte page lock and page clearing) are
>> -     * visible before the pte is made visible to other CPUs by being
>> -     * put into page tables.
>> -     *
>> -     * The other side of the story is the pointer chasing in the page
>> -     * table walking code (when walking the page table without locking;
>> -     * ie. most of the time). Fortunately, these data accesses consist
>> -     * of a chain of data-dependent loads, meaning most CPUs (alpha
>> -     * being the notable exception) will already guarantee loads are
>> -     * seen in-order. See the alpha page table accessors for the
>> -     * smp_rmb() barriers in page table walking code.
>> -     */
>> -    smp_wmb(); /* Could be smp_wmb__xxx(before|after)_spin_lock */
>> -
>>       pmd_install(mm, pmd, &new);
>>       if (new)
>>           pte_free(mm, new);
>> @@ -478,10 +477,9 @@ int __pte_alloc_kernel(pmd_t *pmd)
>>       if (!new)
>>           return -ENOMEM;
>> -    smp_wmb(); /* See comment in __pte_alloc */
>> -
>>       spin_lock(&init_mm.page_table_lock);
>>       if (likely(pmd_none(*pmd))) {    /* Has another populated it ? */
>> +        smp_wmb(); /* See comment in pmd_install() */
>>           pmd_populate_kernel(&init_mm, pmd, new);
>>           new = NULL;
>>       }
>> @@ -3857,7 +3855,6 @@ static vm_fault_t __do_fault(struct vm_fault *vmf)
>>           vmf->prealloc_pte = pte_alloc_one(vma->vm_mm);
>>           if (!vmf->prealloc_pte)
>>               return VM_FAULT_OOM;
>> -        smp_wmb(); /* See comment in __pte_alloc() */
>>       }
>>       ret = vma->vm_ops->fault(vmf);
>> @@ -3919,7 +3916,6 @@ vm_fault_t do_set_pmd(struct vm_fault *vmf, struct page *page)
>>           vmf->prealloc_pte = pte_alloc_one(vma->vm_mm);
>>           if (!vmf->prealloc_pte)
>>               return VM_FAULT_OOM;
>> -        smp_wmb(); /* See comment in __pte_alloc() */
>>       }
>>       vmf->ptl = pmd_lock(vma->vm_mm, vmf->pmd);
>> @@ -4144,7 +4140,6 @@ static vm_fault_t do_fault_around(struct vm_fault *vmf)
>>           vmf->prealloc_pte = pte_alloc_one(vmf->vma->vm_mm);
>>           if (!vmf->prealloc_pte)
>>               return VM_FAULT_OOM;
>> -        smp_wmb(); /* See comment in __pte_alloc() */
>>       }
>>       return vmf->vma->vm_ops->map_pages(vmf, start_pgoff, end_pgoff);
>> @@ -4819,13 +4814,13 @@ int __p4d_alloc(struct mm_struct *mm, pgd_t *pgd, unsigned long address)
>>       if (!new)
>>           return -ENOMEM;
>> -    smp_wmb(); /* See comment in __pte_alloc */
>> -
>>       spin_lock(&mm->page_table_lock);
>>       if (pgd_present(*pgd))        /* Another has populated it */
>>           p4d_free(mm, new);
>> -    else
>> +    else {
>> +        smp_wmb(); /* See comment in pmd_install() */
>>           pgd_populate(mm, pgd, new);
>> +    }
> 
> Nit:
> 
> if () {
> 
> } else {
> 
> }
> 
> see Documentation/process/coding-style.rst
> 
> "This does not apply if only one branch of a conditional statement is a 
> single statement; in the latter case use braces in both branches:"

Got it.
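
For reference, a minimal sketch of the brace rule quoted above, using a
hypothetical `struct counter` and `record()` helper that are not part of
the patch. Only the else branch has more than one statement, so both
branches get braces:

```c
/* Hypothetical example type; not taken from mm/memory.c. */
struct counter {
	int hits;
	int misses;
	int last_miss;
};

/* Hypothetical helper showing braces on both branches. */
static void record(struct counter *c, int hit, int value)
{
	if (hit) {
		/* Single statement, but braced because the else branch is. */
		c->hits++;
	} else {
		c->misses++;
		c->last_miss = value;
	}
}
```

The same shape applies to the __p4d_alloc() hunk: once the else branch
gains the smp_wmb(), the if branch takes braces as well.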

> 
> 
> Apart from that, I think this is fine,
> 
> Acked-by: David Hildenbrand <david@redhat.com>
> 
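
For readers following the thread, the ordering the patch relies on can
be sketched in userspace with C11 atomics. This is only an analogy under
stated assumptions: `struct table`, `slot`, `install()` and `lookup()`
are hypothetical names, the release fence stands in for smp_wmb() (it is
somewhat stronger), and the acquire load stands in for the
data-dependent loads done by the page table walker. The point it
illustrates is that the barrier is only needed on the path that
actually publishes the new table:

```c
#include <stdatomic.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for a freshly allocated pte page. */
struct table {
	int entries[4];
};

/* Hypothetical stand-in for the pmd entry that publishes the table. */
static _Atomic(struct table *) slot;

/* Minimal spinlock standing in for the page table lock. */
static atomic_flag slot_lock = ATOMIC_FLAG_INIT;

static void lock_slot(void)
{
	while (atomic_flag_test_and_set_explicit(&slot_lock,
						 memory_order_acquire))
		;
}

static void unlock_slot(void)
{
	atomic_flag_clear_explicit(&slot_lock, memory_order_release);
}

/*
 * Publish *newp if the slot is still empty. Returns 1 and sets *newp to
 * NULL on success; returns 0 and leaves *newp for the caller to free if
 * the slot was already populated.
 */
static int install(struct table **newp)
{
	int installed = 0;

	lock_slot();
	if (atomic_load_explicit(&slot, memory_order_relaxed) == NULL) {
		/* Analogue of smp_wmb(): order table setup before publication. */
		atomic_thread_fence(memory_order_release);
		atomic_store_explicit(&slot, *newp, memory_order_relaxed);
		*newp = NULL;
		installed = 1;
	}
	unlock_slot();
	return installed;
}

/* Lockless reader; acquire is a conservative stand-in for dependent loads. */
static struct table *lookup(void)
{
	return atomic_load_explicit(&slot, memory_order_acquire);
}
```

A caller that loses the race skips the fence entirely and frees its
unused table, which mirrors the "if (new) pte_free(mm, new)" path in
__pte_alloc().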

Thanks,
Qi



