From mboxrd@z Thu Jan 1 00:00:00 1970
From: Xuefeng Wang
Subject: [PATCH 2/2] arm64: mm: rework the pmd protect changing flow
Date: Thu, 16 Jan 2020 11:09:17 +0800
Message-ID: <1579144157-7736-3-git-send-email-wxf.wang@hisilicon.com>
In-Reply-To: <1579144157-7736-1-git-send-email-wxf.wang@hisilicon.com>
References: <1579144157-7736-1-git-send-email-wxf.wang@hisilicon.com>
MIME-Version: 1.0
Content-Type: text/plain

On the Kunpeng 920 board, when changing the permissions of a large memory
region with hugepages, pmdp_invalidate() accounts for about 65% of the
profile of a JIT tool. The kernel flushes the TLB twice: the first flush
happens in pmdp_invalidate(), the second at the end of
change_protection_range(). The first pmdp_invalidate() is not necessary if
the hardware supports atomic pmd updates: atomically changing the pmd to
zero prevents the hardware from updating the entry asynchronously. So rework
the flow and remove the first pmdp_invalidate(); the second TLB flush makes
sure the new TLB entry is valid.
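
For context, the rework relies on a pmdp_modify_prot_start()/_commit()
transaction on the generic side (introduced by patch 1/2, not shown here).
A minimal sketch of how a caller such as change_huge_pmd() would use it
follows; the commit helper's name and arguments are assumptions modeled on
the existing ptep_modify_prot_start()/ptep_modify_prot_commit() convention:

	pmd_t oldpmd, entry;

	/*
	 * Start: atomically read and clear the pmd so the hardware cannot
	 * update the entry behind our back. No TLB flush is done here.
	 */
	oldpmd = pmdp_modify_prot_start(vma, addr, pmdp);

	/* Apply the new protection bits to the saved value. */
	entry = pmd_modify(oldpmd, newprot);

	/*
	 * Commit: install the updated pmd. The single TLB flush at the end
	 * of change_protection_range() makes the new entry visible.
	 */
	pmdp_modify_prot_commit(vma, addr, pmdp, oldpmd, entry);
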
Add pmdp_modify_prot_start() on arm64, which uses pmdp_huge_get_and_clear()
to fetch the pmd and zero the entry, preventing races with any hardware
updates. After the rework, mprotect gets a 3x to 13x performance gain over
the 64M to 512M range.

4K granule/THP on
memory size(M)    64      128     256     320     448     512
pre-patch         0.77    1.40    2.64    3.23    4.49    5.10
post-patch        0.20    0.23    0.28    0.31    0.37    0.39

Signed-off-by: Xuefeng Wang
Signed-off-by: Chen Zhou
---
 arch/arm64/include/asm/pgtable.h | 14 ++++++++++++++
 1 file changed, 14 insertions(+)

diff --git a/arch/arm64/include/asm/pgtable.h b/arch/arm64/include/asm/pgtable.h
index cd5de0e40bfa..bccdaa5bd5f2 100644
--- a/arch/arm64/include/asm/pgtable.h
+++ b/arch/arm64/include/asm/pgtable.h
@@ -769,6 +769,20 @@ static inline pmd_t pmdp_huge_get_and_clear(struct mm_struct *mm,
 }
 #endif /* CONFIG_TRANSPARENT_HUGEPAGE */
 
+#ifdef CONFIG_TRANSPARENT_HUGEPAGE
+#define __HAVE_ARCH_PMDP_MODIFY_PROT_TRANSACTION
+static inline pmd_t pmdp_modify_prot_start(struct vm_area_struct *vma,
+					    unsigned long addr,
+					    pmd_t *pmdp)
+{
+	/*
+	 * Atomically change the pmd to zero, to prevent the hardware
+	 * from updating it asynchronously.
+	 */
+	return pmdp_huge_get_and_clear(vma->vm_mm, addr, pmdp);
+}
+#endif /* CONFIG_TRANSPARENT_HUGEPAGE */
+
 /*
  * ptep_set_wrprotect - mark read-only while trasferring potential hardware
  * dirty status (PTE_DBM && !PTE_RDONLY) to the software PTE_DIRTY bit.
-- 
2.17.1