From: Kefeng Wang
To: Catalin Marinas
CC: Andrew Morton, Russell King, Will Deacon
Date: Wed, 10 Apr 2024 18:58:27 +0800
Subject: Re: [PATCH 1/2] arm64: mm: drop VM_FAULT_BADMAP/VM_FAULT_BADACCESS
References: <20240407081211.2292362-1-wangkefeng.wang@huawei.com> <20240407081211.2292362-2-wangkefeng.wang@huawei.com>

On 2024/4/10 9:30, Kefeng Wang wrote:
>
>
> On 2024/4/9 22:28, Catalin Marinas wrote:
>> Hi Kefeng,
>>
>> On Sun, Apr 07, 2024 at 04:12:10PM +0800, Kefeng Wang wrote:
>>> diff --git a/arch/arm64/mm/fault.c b/arch/arm64/mm/fault.c
>>> index 405f9aa831bd..61a2acae0dca 100644
>>> --- a/arch/arm64/mm/fault.c
>>> +++ b/arch/arm64/mm/fault.c
>>> @@ -500,9 +500,6 @@ static bool is_write_abort(unsigned long esr)
>>>       return (esr & ESR_ELx_WNR) && !(esr & ESR_ELx_CM);
>>>   }
>>> -#define VM_FAULT_BADMAP        ((__force vm_fault_t)0x010000)
>>> -#define VM_FAULT_BADACCESS    ((__force vm_fault_t)0x020000)
>>> -
>>>   static int __kprobes do_page_fault(unsigned long far, unsigned long esr,
>>>                      struct pt_regs *regs)
>>>   {
>>> @@ -513,6 +510,7 @@ static int __kprobes do_page_fault(unsigned long far, unsigned long esr,
>>>       unsigned int mm_flags = FAULT_FLAG_DEFAULT;
>>>       unsigned long addr = untagged_addr(far);
>>>       struct vm_area_struct *vma;
>>> +    int si_code;
>>
>> I think we should initialise this to 0. Currently all paths seem to set
>> si_code to something meaningful but I'm not sure the last 'else' clause
>> in this patch is guaranteed to always cover exactly those earlier code
>> paths updating si_code. I'm not talking about the 'goto bad_area' paths
>> since they set 'fault' to 0 but the fall through after the second (under
>> the mm lock) handle_mm_fault().
>
> I rechecked it: without this patch, the second handle_mm_fault() never
> returns VM_FAULT_BADACCESS, but it could return VM_FAULT_SIGSEGV (and
> maybe others), which is not handled in the other error paths:
>
>  handle_mm_fault
>     ret = sanitize_fault_flags(vma, &flags);
>     if (!arch_vma_access_permitted())
>      ret = VM_FAULT_SIGSEGV;
>
> so the original logic would set si_code to SEGV_MAPERR:
>
>   fault == VM_FAULT_BADACCESS ? SEGV_ACCERR : SEGV_MAPERR,
>
> Therefore, I think we should set the default si_code to SEGV_MAPERR.
>
>
>>
>>>       if (kprobe_page_fault(regs, esr))
>>>           return 0;
>>> @@ -572,9 +570,10 @@ static int __kprobes do_page_fault(unsigned long far, unsigned long esr,
>>>       if (!(vma->vm_flags & vm_flags)) {
>>>           vma_end_read(vma);
>>> -        fault = VM_FAULT_BADACCESS;
>>> +        fault = 0;
>>> +        si_code = SEGV_ACCERR;
>>>           count_vm_vma_lock_event(VMA_LOCK_SUCCESS);
>>> -        goto done;
>>> +        goto bad_area;
>>>       }
>>>       fault = handle_mm_fault(vma, addr, mm_flags | FAULT_FLAG_VMA_LOCK, regs);
>>>       if (!(fault & (VM_FAULT_RETRY | VM_FAULT_COMPLETED)))
>>> @@ -599,15 +598,18 @@ static int __kprobes do_page_fault(unsigned long far, unsigned long esr,
>>>   retry:
>>>       vma = lock_mm_and_find_vma(mm, addr, regs);
>>>       if (unlikely(!vma)) {
>>> -        fault = VM_FAULT_BADMAP;
>>> -        goto done;
>>> +        fault = 0;
>>> +        si_code = SEGV_MAPERR;
>>> +        goto bad_area;
>>>       }
>>> -    if (!(vma->vm_flags & vm_flags))
>>> -        fault = VM_FAULT_BADACCESS;
>>> -    else
>>> -        fault = handle_mm_fault(vma, addr, mm_flags, regs);
>>> +    if (!(vma->vm_flags & vm_flags)) {
>>> +        fault = 0;
>>> +        si_code = SEGV_ACCERR;
>>> +        goto bad_area;
>>> +    }
>>
>> What's releasing the mm lock here? Prior to this change, it is falling
>> through to mmap_read_unlock() below or handle_mm_fault() was releasing
>> the lock (VM_FAULT_RETRY, VM_FAULT_COMPLETED).
>
> Indeed, will fix.
>
>>
>>> +    fault = handle_mm_fault(vma, addr, mm_flags, regs);
>>>       /* Quick path to respond to signals */
>>>       if (fault_signal_pending(fault, regs)) {
>>>           if (!user_mode(regs))
>>> @@ -626,13 +628,11 @@ static int __kprobes do_page_fault(unsigned long far, unsigned long esr,
>>>       mmap_read_unlock(mm);
>>>   done:
>>> -    /*
>>> -     * Handle the "normal" (no error) case first.
>>> -     */
>>> -    if (likely(!(fault & (VM_FAULT_ERROR | VM_FAULT_BADMAP |
>>> -                  VM_FAULT_BADACCESS))))
>>> +    /* Handle the "normal" (no error) case first. */
>>> +    if (likely(!(fault & VM_FAULT_ERROR)))
>>>           return 0;

Another option: we set si_code = SEGV_MAPERR here. The normal page fault
path doesn't use si_code, so only the error path needs to initialise it.

>>> +bad_area:
>>>       /*
>>>        * If we are in kernel mode at this point, we have no context to
>>>        * handle this fault with.
>>> @@ -667,13 +667,8 @@ static int __kprobes do_page_fault(unsigned long far, unsigned long esr,
>>>           arm64_force_sig_mceerr(BUS_MCEERR_AR, far, lsb, inf->name);
>>>       } else {
>>> -        /*
>>> -         * Something tried to access memory that isn't in our memory
>>> -         * map.
>>> -         */
>>> -        arm64_force_sig_fault(SIGSEGV,
>>> -                      fault == VM_FAULT_BADACCESS ? SEGV_ACCERR : SEGV_MAPERR,
>>> -                      far, inf->name);
>>> +        /* Something tried to access memory that out of memory map */
>>> +        arm64_force_sig_fault(SIGSEGV, si_code, far, inf->name);
>>>       }
>>
>> We can get to the 'else' clause after the second handle_mm_fault(). Do we
>> guarantee that 'fault == 0' in this last block? If not, maybe a warning
>> and some safe initialisation for 'si_code' to avoid leaking stack data.
>
> As analysed above, it is sufficient to make si_code default to
> SEGV_MAPERR, right?
>
> Thanks.
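
The si_code question discussed above can be illustrated with a small
userspace model. This is only a sketch under stated assumptions, not the
kernel code: model_si_code() and every MODEL_* constant are hypothetical
names invented here, the constant values are illustrative, and only the
final SIGSEGV 'else' branch of the error path is modelled. It shows why a
SEGV_MAPERR default keeps si_code defined when the fault handler returns
an error without any earlier path having set si_code.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins for SEGV_MAPERR / SEGV_ACCERR (illustrative values). */
enum { MODEL_SEGV_MAPERR = 1, MODEL_SEGV_ACCERR = 2 };

/* Hypothetical stand-ins for VM_FAULT_SIGSEGV and the VM_FAULT_ERROR mask. */
enum { MODEL_FAULT_SIGSEGV = 0x40, MODEL_FAULT_ERROR = MODEL_FAULT_SIGSEGV };

/*
 * Model of the control flow under discussion (assumption: only the SIGSEGV
 * branch matters). Returns the si_code that would accompany SIGSEGV, or 0
 * when the fault is handled without an error.
 */
int model_si_code(bool vma_found, bool access_ok, unsigned int mm_fault)
{
    /* Default chosen so every error path has a defined value. */
    int si_code = MODEL_SEGV_MAPERR;

    /* This model only understands the bits in its own error mask. */
    assert((mm_fault & ~(unsigned int)MODEL_FAULT_ERROR) == 0);

    if (!vma_found)
        goto bad_area;              /* no mapping: keep the MAPERR default */

    if (!access_ok) {
        si_code = MODEL_SEGV_ACCERR; /* mapping exists, permissions do not */
        goto bad_area;
    }

    /* mm_fault models the return value of the second handle_mm_fault(). */
    if (!(mm_fault & MODEL_FAULT_ERROR))
        return 0;                   /* the "normal" (no error) case */

    /* Error from the handler: si_code was never updated, default applies. */
bad_area:
    return si_code;
}
```

For example, model_si_code(true, true, MODEL_FAULT_SIGSEGV) takes the
fall-through path that Catalin asked about and yields the MAPERR default,
which matches what the old
"fault == VM_FAULT_BADACCESS ? SEGV_ACCERR : SEGV_MAPERR" expression
produced for that case.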