From: David Hildenbrand
To: linux-kernel@vger.kernel.org
Cc: linux-mm@kvack.org, David Hildenbrand, Muchun Song, Andrew Morton, Peter Xu, Guillaume Morin
Subject: [PATCH v1] mm/hugetlb: don't map folios writable without VM_WRITE when copying during fork()
Date: Wed, 4 Dec 2024 16:31:00 +0100
Message-ID: <20241204153100.1967364-1-david@redhat.com>
X-Mailer: git-send-email 2.47.1

If we have to trigger a hugetlb folio copy during fork() because the anon
folio might be pinned, we currently unconditionally create a writable PTE.
However, the VMA might not have write permissions (VM_WRITE) at that
point.

Fix it by checking the VMA for VM_WRITE. Make the code less error prone
by moving the VM_WRITE check into make_huge_pte(), and letting callers
only specify whether we should try making it writable.

A simple reproducer that longterm-pins the folios using liburing and then
mprotect(PROT_READ)s the folios before fork() [1] results in:

Before:
[FAIL] access should not have worked

After:
[PASS] access did not work as expected

[1] https://gitlab.com/davidhildenbrand/scratchspace/-/raw/main/reproducers/hugetlb-mkwrite-fork.c

This is rather a corner case, so stable might not be warranted.

Fixes: 4eae4efa2c29 ("hugetlb: do early cow when page pinned on src mm")
Cc: Muchun Song
Cc: Andrew Morton
Cc: Peter Xu
Cc: Guillaume Morin
Signed-off-by: David Hildenbrand
---
 mm/hugetlb.c | 18 ++++++------------
 1 file changed, 6 insertions(+), 12 deletions(-)

diff --git a/mm/hugetlb.c b/mm/hugetlb.c
index 5c8de0f5c760..6db4e8176303 100644
--- a/mm/hugetlb.c
+++ b/mm/hugetlb.c
@@ -5141,12 +5141,12 @@ const struct vm_operations_struct hugetlb_vm_ops = {
 };
 
 static pte_t make_huge_pte(struct vm_area_struct *vma, struct page *page,
-				int writable)
+			   bool try_mkwrite)
 {
 	pte_t entry;
 	unsigned int shift = huge_page_shift(hstate_vma(vma));
 
-	if (writable) {
+	if (try_mkwrite && (vma->vm_flags & VM_WRITE)) {
 		entry = huge_pte_mkwrite(huge_pte_mkdirty(mk_huge_pte(page,
 					 vma->vm_page_prot)));
 	} else {
@@ -5199,7 +5199,7 @@ static void
 hugetlb_install_folio(struct vm_area_struct *vma, pte_t *ptep, unsigned long addr,
 		      struct folio *new_folio, pte_t old, unsigned long sz)
 {
-	pte_t newpte = make_huge_pte(vma, &new_folio->page, 1);
+	pte_t newpte = make_huge_pte(vma, &new_folio->page, true);
 
 	__folio_mark_uptodate(new_folio);
 	hugetlb_add_new_anon_rmap(new_folio, vma, addr);
@@ -6223,8 +6223,7 @@ static vm_fault_t hugetlb_no_page(struct address_space *mapping,
 		hugetlb_add_new_anon_rmap(folio, vma, vmf->address);
 	else
 		hugetlb_add_file_rmap(folio);
-	new_pte = make_huge_pte(vma, &folio->page, ((vma->vm_flags & VM_WRITE)
-				&& (vma->vm_flags & VM_SHARED)));
+	new_pte = make_huge_pte(vma, &folio->page, vma->vm_flags & VM_SHARED);
 	/*
 	 * If this pte was previously wr-protected, keep it wr-protected even
 	 * if populated.
@@ -6556,7 +6555,6 @@ int hugetlb_mfill_atomic_pte(pte_t *dst_pte,
 	spinlock_t *ptl;
 	int ret = -ENOMEM;
 	struct folio *folio;
-	int writable;
 	bool folio_in_pagecache = false;
 
 	if (uffd_flags_mode_is(flags, MFILL_ATOMIC_POISON)) {
@@ -6710,12 +6708,8 @@ int hugetlb_mfill_atomic_pte(pte_t *dst_pte,
 	 * For either: (1) CONTINUE on a non-shared VMA, or (2) UFFDIO_COPY
 	 * with wp flag set, don't set pte write bit.
 	 */
-	if (wp_enabled || (is_continue && !vm_shared))
-		writable = 0;
-	else
-		writable = dst_vma->vm_flags & VM_WRITE;
-
-	_dst_pte = make_huge_pte(dst_vma, &folio->page, writable);
+	_dst_pte = make_huge_pte(dst_vma, &folio->page,
+				 !wp_enabled && !(is_continue && !vm_shared));
 	/*
 	 * Always mark UFFDIO_COPY page dirty; note that this may not be
 	 * extremely important for hugetlbfs for now since swapping is not
-- 
2.47.1
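
Illustration (not part of the patch): the change moves the VM_WRITE check
out of the callers and into make_huge_pte(). The userspace C sketch below
models only that boolean decision, to show why the fork() copy path used
to end up writable and no longer does. It is not kernel code. The MODEL_*
flag values and all model_* function names are assumptions invented for
this example.

```c
#include <stdbool.h>

/* Illustrative flag bits for this model only (assumption, not kernel headers). */
#define MODEL_VM_WRITE  0x2UL
#define MODEL_VM_SHARED 0x8UL

/*
 * Models the patched check in make_huge_pte(): the PTE becomes writable
 * only if the caller asks for it AND the VMA has write permission.
 */
bool model_pte_writable(unsigned long vm_flags, bool try_mkwrite)
{
	return try_mkwrite && (vm_flags & MODEL_VM_WRITE) != 0;
}

/* Pre-patch fork() copy path: the caller passed 1, VM_WRITE was never checked. */
bool model_old_fork_copy_writable(unsigned long vm_flags)
{
	(void)vm_flags;
	return true;
}

/* Post-patch fork() copy path: the caller passes true, the helper checks VM_WRITE. */
bool model_new_fork_copy_writable(unsigned long vm_flags)
{
	return model_pte_writable(vm_flags, true);
}

/* Pre-patch hugetlb_no_page(): the caller tested VM_WRITE and VM_SHARED itself. */
bool model_old_no_page_writable(unsigned long vm_flags)
{
	return (vm_flags & MODEL_VM_WRITE) && (vm_flags & MODEL_VM_SHARED);
}

/* Post-patch hugetlb_no_page(): the caller passes only the VM_SHARED test. */
bool model_new_no_page_writable(unsigned long vm_flags)
{
	return model_pte_writable(vm_flags, (vm_flags & MODEL_VM_SHARED) != 0);
}
```

In this model, a VMA without MODEL_VM_WRITE (for example after
mprotect(PROT_READ)) is writable on the old fork() copy path and
read-only on the new one, while the hugetlb_no_page() result is the same
before and after for every flag combination.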