From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-12.8 required=3.0 tests=BAYES_00,DKIM_INVALID, DKIM_SIGNED,INCLUDES_PATCH,MAILING_LIST_MULTI,SIGNED_OFF_BY,SPF_HELO_NONE, SPF_PASS,URIBL_BLOCKED,USER_AGENT_GIT autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id A25F5C43464 for ; Fri, 18 Sep 2020 02:10:05 +0000 (UTC) Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by mail.kernel.org (Postfix) with ESMTP id 4E0B52395C for ; Fri, 18 Sep 2020 02:10:05 +0000 (UTC) Authentication-Results: mail.kernel.org; dkim=fail reason="signature verification failed" (1024-bit key) header.d=kernel.org header.i=@kernel.org header.b="WKZK8Qdb" DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org 4E0B52395C Authentication-Results: mail.kernel.org; dmarc=fail (p=none dis=none) header.from=kernel.org Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=owner-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix) id D27316B0089; Thu, 17 Sep 2020 22:10:04 -0400 (EDT) Received: by kanga.kvack.org (Postfix, from userid 40) id CD9876B008A; Thu, 17 Sep 2020 22:10:04 -0400 (EDT) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id BC7966B008C; Thu, 17 Sep 2020 22:10:04 -0400 (EDT) X-Delivered-To: linux-mm@kvack.org Received: from forelay.hostedemail.com (smtprelay0136.hostedemail.com [216.40.44.136]) by kanga.kvack.org (Postfix) with ESMTP id A64706B0089 for ; Thu, 17 Sep 2020 22:10:04 -0400 (EDT) Received: from smtpin14.hostedemail.com (10.5.19.251.rfc1918.com [10.5.19.251]) by forelay01.hostedemail.com (Postfix) with ESMTP id 73FA0180AD81F for ; Fri, 18 Sep 2020 02:10:04 +0000 (UTC) X-FDA: 77274551928.14.screw31_0a0fc9927127 Received: from filter.hostedemail.com (10.5.16.251.rfc1918.com [10.5.16.251]) by smtpin14.hostedemail.com (Postfix) with ESMTP id 514AC18229818 for ; Fri, 18 Sep 2020 02:10:04 +0000 (UTC) X-HE-Tag: screw31_0a0fc9927127 X-Filterd-Recvd-Size: 5883 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by imf30.hostedemail.com (Postfix) with ESMTP for ; Fri, 18 Sep 2020 02:10:03 +0000 (UTC) Received: from sasha-vm.mshome.net (c-73-47-72-35.hsd1.nh.comcast.net [73.47.72.35]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (No client certificate requested) by mail.kernel.org (Postfix) with ESMTPSA id E197A238A1; Fri, 18 Sep 2020 02:10:01 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=default; t=1600395003; bh=JowzN/bIxpyOXVHhRnnXQMBRiOFn8vAfZy4Obc1DDjg=; h=From:To:Cc:Subject:Date:In-Reply-To:References:From; b=WKZK8QdbRi+sReTREx6VdWMvo7NdMN4c42wGOgL14GOmLB6Hb/5RW49yW2GlQsf0+ mc4yvnUlJ/9oxd2+4B0UlrChWcOM5t2gIOiMc+naIv7SZQqiZyzaoCide9Mi2+A5JE AY9Pe4z9I7U2DxYIfQQeOSWPMc6gn9+1+RqZXj6c= From: Sasha Levin To: linux-kernel@vger.kernel.org, stable@vger.kernel.org Cc: "Kirill A. Shutemov" , Jeff Moyer , Andrew Morton , "Kirill A . Shutemov" , Justin He , Dan Williams , Linus Torvalds , Sasha Levin , linux-mm@kvack.org Subject: [PATCH AUTOSEL 4.19 099/206] mm: avoid data corruption on CoW fault into PFN-mapped VMA Date: Thu, 17 Sep 2020 22:06:15 -0400 Message-Id: <20200918020802.2065198-99-sashal@kernel.org> X-Mailer: git-send-email 2.25.1 In-Reply-To: <20200918020802.2065198-1-sashal@kernel.org> References: <20200918020802.2065198-1-sashal@kernel.org> MIME-Version: 1.0 X-stable: review X-Patchwork-Hint: Ignore Content-Transfer-Encoding: quoted-printable X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: From: "Kirill A. Shutemov" [ Upstream commit c3e5ea6ee574ae5e845a40ac8198de1fb63bb3ab ] Jeff Moyer has reported that one of xfstests triggers a warning when run on DAX-enabled filesystem: WARNING: CPU: 76 PID: 51024 at mm/memory.c:2317 wp_page_copy+0xc40/0xd50 ... wp_page_copy+0x98c/0xd50 (unreliable) do_wp_page+0xd8/0xad0 __handle_mm_fault+0x748/0x1b90 handle_mm_fault+0x120/0x1f0 __do_page_fault+0x240/0xd70 do_page_fault+0x38/0xd0 handle_page_fault+0x10/0x30 The warning happens on failed __copy_from_user_inatomic() which tries to copy data into a CoW page. This happens because of race between MADV_DONTNEED and CoW page fault: CPU0 CPU1 handle_mm_fault() do_wp_page() wp_page_copy() do_wp_page() madvise(MADV_DONTNEED) zap_page_range() zap_pte_range() ptep_get_and_clear_full() __copy_from_user_inatomic() sees empty PTE and fails WARN_ON_ONCE(1) clear_page() The solution is to re-try __copy_from_user_inatomic() under PTL after checking that PTE is matches the orig_pte. The second copy attempt can still fail, like due to non-readable PTE, but there's nothing reasonable we can do about, except clearing the CoW page. Reported-by: Jeff Moyer Signed-off-by: Andrew Morton Signed-off-by: Kirill A. Shutemov Tested-by: Jeff Moyer Cc: Cc: Justin He Cc: Dan Williams Link: http://lkml.kernel.org/r/20200218154151.13349-1-kirill.shutemov@lin= ux.intel.com Signed-off-by: Linus Torvalds Signed-off-by: Sasha Levin --- mm/memory.c | 35 +++++++++++++++++++++++++++-------- 1 file changed, 27 insertions(+), 8 deletions(-) diff --git a/mm/memory.c b/mm/memory.c index fcad8a0d943d3..eeae63bd95027 100644 --- a/mm/memory.c +++ b/mm/memory.c @@ -2353,7 +2353,7 @@ static inline bool cow_user_page(struct page *dst, = struct page *src, bool ret; void *kaddr; void __user *uaddr; - bool force_mkyoung; + bool locked =3D false; struct vm_area_struct *vma =3D vmf->vma; struct mm_struct *mm =3D vma->vm_mm; unsigned long addr =3D vmf->address; @@ -2378,11 +2378,11 @@ static inline bool cow_user_page(struct page *dst= , struct page *src, * On architectures with software "accessed" bits, we would * take a double page fault, so mark it accessed here. */ - force_mkyoung =3D arch_faults_on_old_pte() && !pte_young(vmf->orig_pte)= ; - if (force_mkyoung) { + if (arch_faults_on_old_pte() && !pte_young(vmf->orig_pte)) { pte_t entry; =20 vmf->pte =3D pte_offset_map_lock(mm, vmf->pmd, addr, &vmf->ptl); + locked =3D true; if (!likely(pte_same(*vmf->pte, vmf->orig_pte))) { /* * Other thread has already handled the fault @@ -2406,18 +2406,37 @@ static inline bool cow_user_page(struct page *dst= , struct page *src, * zeroes. */ if (__copy_from_user_inatomic(kaddr, uaddr, PAGE_SIZE)) { + if (locked) + goto warn; + + /* Re-validate under PTL if the page is still mapped */ + vmf->pte =3D pte_offset_map_lock(mm, vmf->pmd, addr, &vmf->ptl); + locked =3D true; + if (!likely(pte_same(*vmf->pte, vmf->orig_pte))) { + /* The PTE changed under us. Retry page fault. */ + ret =3D false; + goto pte_unlock; + } + /* - * Give a warn in case there can be some obscure - * use-case + * The same page can be mapped back since last copy attampt. + * Try to copy again under PTL. */ - WARN_ON_ONCE(1); - clear_page(kaddr); + if (__copy_from_user_inatomic(kaddr, uaddr, PAGE_SIZE)) { + /* + * Give a warn in case there can be some obscure + * use-case + */ +warn: + WARN_ON_ONCE(1); + clear_page(kaddr); + } } =20 ret =3D true; =20 pte_unlock: - if (force_mkyoung) + if (locked) pte_unmap_unlock(vmf->pte, vmf->ptl); kunmap_atomic(kaddr); flush_dcache_page(dst); --=20 2.25.1