From: "Aneesh Kumar K.V" <aneesh.kumar@linux.vnet.ibm.com>
To: "Kirill A. Shutemov" <kirill@shutemov.name>
Cc: benh@kernel.crashing.org, paulus@samba.org, mpe@ellerman.id.au,
"Kirill A . Shutemov" <kirill.shutemov@linux.intel.com>,
linuxppc-dev@lists.ozlabs.org, linux-mm@kvack.org
Subject: Re: [RFC PATCH] powerpc/mm: THP page cache support
Date: Mon, 26 Sep 2016 21:23:53 +0530 [thread overview]
Message-ID: <87wphy8xny.fsf@linux.vnet.ibm.com> (raw)
In-Reply-To: <20160926105054.GA16074@node.shutemov.name>
"Kirill A. Shutemov" <kirill@shutemov.name> writes:
> On Thu, Sep 22, 2016 at 09:32:40PM +0530, Aneesh Kumar K.V wrote:
>> Update arch hook in the generic THP page cache code, that will
>> deposit and withdarw preallocated page table. Archs like ppc64 use
>> this preallocated table to store the hash pte slot information.
>>
>> This is an RFC patch and I am sharing this early to get feedback on the
>> approach taken. I have used stress-ng mmap-file operation and that
>> resulted in some thp_file_mmap as show below.
>>
>> [/mnt/stress]$ grep thp_file /proc/vmstat
>> thp_file_alloc 25403
>> thp_file_mapped 16967
>> [/mnt/stress]$
>>
>> I did observe wrong nr_ptes count once. I need to recreate the problem
>> again.
>
> I don't see anything that could cause that.
>
I still need to debug this.
> The patch looks good to me (apart from nr_ptes issue). Few minor nitpicks
> below.
>
>> Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
>> ---
>> arch/powerpc/include/asm/book3s/64/pgtable.h | 3 ++
>> include/asm-generic/pgtable.h | 8 +++-
>> mm/Kconfig | 6 +--
>> mm/huge_memory.c | 19 +++++++++-
>> mm/khugepaged.c | 21 ++++++++++-
>> mm/memory.c | 56 +++++++++++++++++++++++-----
>> 6 files changed, 93 insertions(+), 20 deletions(-)
>>
>> diff --git a/arch/powerpc/include/asm/book3s/64/pgtable.h b/arch/powerpc/include/asm/book3s/64/pgtable.h
>> index 263bf39ced40..1f45b06ce78e 100644
>> --- a/arch/powerpc/include/asm/book3s/64/pgtable.h
>> +++ b/arch/powerpc/include/asm/book3s/64/pgtable.h
>> @@ -1017,6 +1017,9 @@ static inline int pmd_move_must_withdraw(struct spinlock *new_pmd_ptl,
>> */
>> return true;
>> }
>> +
>> +#define arch_needs_pgtable_deposit() (true)
>> +
>> #endif /* CONFIG_TRANSPARENT_HUGEPAGE */
>> #endif /* __ASSEMBLY__ */
>> #endif /* _ASM_POWERPC_BOOK3S_64_PGTABLE_H_ */
>> diff --git a/include/asm-generic/pgtable.h b/include/asm-generic/pgtable.h
>> index d4458b6dbfb4..0d1e400e82a2 100644
>> --- a/include/asm-generic/pgtable.h
>> +++ b/include/asm-generic/pgtable.h
>> @@ -660,11 +660,17 @@ static inline int pmd_move_must_withdraw(spinlock_t *new_pmd_ptl,
>> /*
>> * With split pmd lock we also need to move preallocated
>> * PTE page table if new_pmd is on different PMD page table.
>> + *
>> + * We also don't deposit and withdraw tables for file pages.
>> */
>> - return new_pmd_ptl != old_pmd_ptl;
>> + return (new_pmd_ptl != old_pmd_ptl) && vma_is_anonymous(vma);
>> }
>> #endif
>>
>> +#ifndef arch_needs_pgtable_deposit
>> +#define arch_needs_pgtable_deposit() (false)
>> +#endif
>> +
>> /*
>> * This function is meant to be used by sites walking pagetables with
>> * the mmap_sem hold in read mode to protect against MADV_DONTNEED and
>> diff --git a/mm/Kconfig b/mm/Kconfig
>> index be0ee11fa0d9..0a279d399722 100644
>> --- a/mm/Kconfig
>> +++ b/mm/Kconfig
>> @@ -447,13 +447,9 @@ choice
>> benefit.
>> endchoice
>>
>> -#
>> -# We don't deposit page tables on file THP mapping,
>> -# but Power makes use of them to address MMU quirk.
>> -#
>> config TRANSPARENT_HUGE_PAGECACHE
>> def_bool y
>> - depends on TRANSPARENT_HUGEPAGE && !PPC
>> + depends on TRANSPARENT_HUGEPAGE
>>
>> #
>> # UP and nommu archs use km based percpu allocator
>> diff --git a/mm/huge_memory.c b/mm/huge_memory.c
>> index a6abd76baa72..37176f455d16 100644
>> --- a/mm/huge_memory.c
>> +++ b/mm/huge_memory.c
>> @@ -1320,6 +1320,14 @@ out_unlocked:
>> return ret;
>> }
>>
>> +void zap_deposited_table(struct mm_struct *mm, pmd_t *pmd)
>
> static?
Ok I will fix that.
>
>> +{
>> + pgtable_t pgtable;
>> + pgtable = pgtable_trans_huge_withdraw(mm, pmd);
>> + pte_free(mm, pgtable);
>> + atomic_long_dec(&mm->nr_ptes);
>> +}
>> +
>> int zap_huge_pmd(struct mmu_gather *tlb, struct vm_area_struct *vma,
>> pmd_t *pmd, unsigned long addr)
>> {
>> @@ -1359,6 +1367,8 @@ int zap_huge_pmd(struct mmu_gather *tlb, struct vm_area_struct *vma,
>> atomic_long_dec(&tlb->mm->nr_ptes);
>> add_mm_counter(tlb->mm, MM_ANONPAGES, -HPAGE_PMD_NR);
>> } else {
>> + if (arch_needs_pgtable_deposit())
>
> Just hide the arch_needs_pgtable_deposit() check in zap_deposited_table().
ok.
>
>> + zap_deposited_table(tlb->mm, pmd);
>> add_mm_counter(tlb->mm, MM_FILEPAGES, -HPAGE_PMD_NR);
>> }
>> spin_unlock(ptl);
-aneesh
--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org. For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>
next prev parent reply other threads:[~2016-09-26 15:54 UTC|newest]
Thread overview: 4+ messages / expand[flat|nested] mbox.gz Atom feed top
2016-09-22 16:02 Aneesh Kumar K.V
2016-09-26 10:50 ` Kirill A. Shutemov
2016-09-26 15:53 ` Aneesh Kumar K.V [this message]
2016-09-27 1:37 ` Balbir Singh
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=87wphy8xny.fsf@linux.vnet.ibm.com \
--to=aneesh.kumar@linux.vnet.ibm.com \
--cc=benh@kernel.crashing.org \
--cc=kirill.shutemov@linux.intel.com \
--cc=kirill@shutemov.name \
--cc=linux-mm@kvack.org \
--cc=linuxppc-dev@lists.ozlabs.org \
--cc=mpe@ellerman.id.au \
--cc=paulus@samba.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox