Date: Thu, 24 Aug 2023 09:12:45 +0200
From: Alexander Gordeev
To: Kefeng Wang
Cc: Andrew Morton, linux-mm@kvack.org, surenb@google.com, willy@infradead.org,
    Russell King, Catalin Marinas, Will Deacon, Huacai Chen, WANG Xuerui,
    Michael Ellerman, Nicholas Piggin, Christophe Leroy, Paul Walmsley,
    Palmer Dabbelt, Albert Ou, Gerald Schaefer, Heiko Carstens, Vasily Gorbik,
    Christian Borntraeger, Sven Schnelle, Dave Hansen, Andy Lutomirski,
    Peter Zijlstra, Thomas Gleixner, Ingo Molnar, Borislav Petkov,
    x86@kernel.org, "H. Peter Anvin", linux-arm-kernel@lists.infradead.org,
    linux-kernel@vger.kernel.org, loongarch@lists.linux.dev,
    linuxppc-dev@lists.ozlabs.org, linux-riscv@lists.infradead.org,
    linux-s390@vger.kernel.org
Subject: Re: [PATCH rfc v2 01/10] mm: add a generic VMA lock-based page fault handler
References: <20230821123056.2109942-1-wangkefeng.wang@huawei.com>
    <20230821123056.2109942-2-wangkefeng.wang@huawei.com>
In-Reply-To: <20230821123056.2109942-2-wangkefeng.wang@huawei.com>

On Mon, Aug 21, 2023 at 08:30:47PM +0800, Kefeng Wang wrote:

Hi Kefeng,

> The ARCH_SUPPORTS_PER_VMA_LOCK are enabled by more and more architectures,
> eg, x86, arm64, powerpc and s390, and riscv, those implementation are very
> similar which results in some duplicated codes, let's add a generic VMA
> lock-based page fault handler try_to_vma_locked_page_fault() to eliminate
> them, and which also make us easy to support this on new architectures.
>
> Since different architectures use different way to check vma whether is
> accessable or not, the struct pt_regs, page fault error code and vma flags
> are added into struct vm_fault, then, the architecture's page fault code
> could re-use struct vm_fault to record and check vma accessable by each
> own implementation.
>
> Signed-off-by: Kefeng Wang
> ---
>  include/linux/mm.h       | 17 +++++++++++++++++
>  include/linux/mm_types.h |  2 ++
>  mm/memory.c              | 39 +++++++++++++++++++++++++++++++++++++++
>  3 files changed, 58 insertions(+)
>
> diff --git a/include/linux/mm.h b/include/linux/mm.h
> index 3f764e84e567..22a6f4c56ff3 100644
> --- a/include/linux/mm.h
> +++ b/include/linux/mm.h
> @@ -512,9 +512,12 @@ struct vm_fault {
>                  pgoff_t pgoff;                  /* Logical page offset based on vma */
>                  unsigned long address;          /* Faulting virtual address - masked */
>                  unsigned long real_address;     /* Faulting virtual address - unmasked */
> +                unsigned long fault_code;       /* Faulting error code during page fault */
> +                struct pt_regs *regs;           /* The registers stored during page fault */
>          };
>          enum fault_flag flags;          /* FAULT_FLAG_xxx flags
>                                           * XXX: should really be 'const' */
> +        vm_flags_t vm_flags;            /* VMA flags to be used for access checking */
>          pmd_t *pmd;                     /* Pointer to pmd entry matching
>                                           * the 'address' */
>          pud_t *pud;                     /* Pointer to pud entry matching
> @@ -774,6 +777,9 @@ static inline void assert_fault_locked(struct vm_fault *vmf)
>  struct vm_area_struct *lock_vma_under_rcu(struct mm_struct *mm,
>                                            unsigned long address);
>
> +bool arch_vma_access_error(struct vm_area_struct *vma, struct vm_fault *vmf);
> +vm_fault_t try_vma_locked_page_fault(struct vm_fault *vmf);
> +
>  #else /* CONFIG_PER_VMA_LOCK */
>
>  static inline bool vma_start_read(struct vm_area_struct *vma)
> @@ -801,6 +807,17 @@ static inline void assert_fault_locked(struct vm_fault *vmf)
>          mmap_assert_locked(vmf->vma->vm_mm);
>  }
>
> +static inline struct vm_area_struct *lock_vma_under_rcu(struct mm_struct *mm,
> +                unsigned long address)
> +{
> +        return NULL;
> +}
> +
> +static inline vm_fault_t try_vma_locked_page_fault(struct vm_fault *vmf)
> +{
> +        return VM_FAULT_NONE;
> +}
> +
>  #endif /* CONFIG_PER_VMA_LOCK */
>
>  extern const struct vm_operations_struct vma_dummy_vm_ops;
> diff --git a/include/linux/mm_types.h b/include/linux/mm_types.h
> index f5ba5b0bc836..702820cea3f9 100644
> --- a/include/linux/mm_types.h
> +++ b/include/linux/mm_types.h
> @@ -1119,6 +1119,7 @@ typedef __bitwise unsigned int vm_fault_t;
>   * fault. Used to decide whether a process gets delivered SIGBUS or
>   * just gets major/minor fault counters bumped up.
>   *
> + * @VM_FAULT_NONE:              Special case, not starting to handle fault
>   * @VM_FAULT_OOM:               Out Of Memory
>   * @VM_FAULT_SIGBUS:            Bad access
>   * @VM_FAULT_MAJOR:             Page read from storage
> @@ -1139,6 +1140,7 @@ typedef __bitwise unsigned int vm_fault_t;
>   *
>   */
>  enum vm_fault_reason {
> +        VM_FAULT_NONE           = (__force vm_fault_t)0x000000,
>          VM_FAULT_OOM            = (__force vm_fault_t)0x000001,
>          VM_FAULT_SIGBUS         = (__force vm_fault_t)0x000002,
>          VM_FAULT_MAJOR          = (__force vm_fault_t)0x000004,
> diff --git a/mm/memory.c b/mm/memory.c
> index 3b4aaa0d2fff..60fe35db5134 100644
> --- a/mm/memory.c
> +++ b/mm/memory.c
> @@ -5510,6 +5510,45 @@ struct vm_area_struct *lock_vma_under_rcu(struct mm_struct *mm,
>          count_vm_vma_lock_event(VMA_LOCK_ABORT);
>          return NULL;
>  }
> +
> +#ifdef CONFIG_PER_VMA_LOCK
> +bool __weak arch_vma_access_error(struct vm_area_struct *vma, struct vm_fault *vmf)
> +{
> +        return (vma->vm_flags & vmf->vm_flags) == 0;
> +}
> +#endif
> +
> +vm_fault_t try_vma_locked_page_fault(struct vm_fault *vmf)
> +{
> +        vm_fault_t fault = VM_FAULT_NONE;
> +        struct vm_area_struct *vma;
> +
> +        if (!(vmf->flags & FAULT_FLAG_USER))
> +                return fault;
> +
> +        vma = lock_vma_under_rcu(current->mm, vmf->real_address);
> +        if (!vma)
> +                return fault;
> +
> +        if (arch_vma_access_error(vma, vmf)) {
> +                vma_end_read(vma);
> +                return fault;
> +        }
> +
> +        fault = handle_mm_fault(vma, vmf->real_address,
> +                                vmf->flags | FAULT_FLAG_VMA_LOCK, vmf->regs);
> +
> +        if (!(fault & (VM_FAULT_RETRY | VM_FAULT_COMPLETED)))
> +                vma_end_read(vma);

Could you please explain how the vma_end_read() call could be conditional?

> +
> +        if (fault & VM_FAULT_RETRY)
> +                count_vm_vma_lock_event(VMA_LOCK_RETRY);
> +        else
> +                count_vm_vma_lock_event(VMA_LOCK_SUCCESS);
> +
> +        return fault;
> +}
> +
>  #endif /* CONFIG_PER_VMA_LOCK */
>
>  #ifndef __PAGETABLE_P4D_FOLDED
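
For reference while reading the question above, here is one way the
conditional vma_end_read() in the quoted hunk could stay balanced. This is a
sketch under a stated assumption, not the patch author's answer: it assumes
the fault handler itself drops the per-VMA read lock before returning
VM_FAULT_RETRY or VM_FAULT_COMPLETED, so the caller must unlock only in the
remaining cases. Everything prefixed toy_ or TOY_ is a hypothetical
user-space stand-in and not a kernel API.

```c
#include <assert.h>

/* Hypothetical stand-ins for the kernel's vm_fault_t result bits. */
enum {
    TOY_FAULT_NONE      = 0,
    TOY_FAULT_RETRY     = 1 << 0,
    TOY_FAULT_COMPLETED = 1 << 1
};

/* Toy VMA that only tracks how many read locks are currently held. */
struct toy_vma {
    int read_locks;
};

static void toy_vma_start_read(struct toy_vma *vma)
{
    vma->read_locks++;
}

static void toy_vma_end_read(struct toy_vma *vma)
{
    assert(vma->read_locks > 0);    /* aborts on a double unlock */
    vma->read_locks--;
}

/*
 * ASSUMPTION: models a handler that releases the lock on its own before
 * reporting RETRY or COMPLETED, and leaves it held otherwise.
 */
static unsigned int toy_handle_fault(struct toy_vma *vma, unsigned int outcome)
{
    if (outcome & (TOY_FAULT_RETRY | TOY_FAULT_COMPLETED))
        toy_vma_end_read(vma);
    return outcome;
}

/* Mirrors the shape of the quoted hunk: unlock only if still locked. */
static unsigned int toy_try_locked_fault(struct toy_vma *vma,
                                         unsigned int outcome)
{
    unsigned int fault;

    toy_vma_start_read(vma);
    fault = toy_handle_fault(vma, outcome);
    if (!(fault & (TOY_FAULT_RETRY | TOY_FAULT_COMPLETED)))
        toy_vma_end_read(vma);
    return fault;
}

/* Returns the number of read locks left held after one simulated fault. */
int toy_locks_left(unsigned int outcome)
{
    struct toy_vma vma = { 0 };

    toy_try_locked_fault(&vma, outcome);
    return vma.read_locks;
}
```

Under that assumption, toy_locks_left() leaves no lock held for any of the
three outcomes. If the handler did not drop the lock itself, the RETRY and
COMPLETED paths would leak it, which is the case the question is probing.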