From: David Hildenbrand
To: linux-kernel@vger.kernel.org
Cc: linux-mm@kvack.org, x86@kernel.org, David Hildenbrand, Andrew Morton, Lorenzo Stoakes, Ingo Molnar, Dave Hansen, Andy Lutomirski, Peter Zijlstra, Thomas Gleixner, Borislav Petkov, Rik van Riel, "H. Peter Anvin", Linus Torvalds
Subject: [PATCH v1] kernel/fork: only call untrack_pfn_clear() on VMAs duplicated for fork()
Date: Tue, 22 Apr 2025 16:49:42 +0200
Message-ID: <20250422144942.2871395-1-david@redhat.com>
X-Mailer: git-send-email 2.49.0
MIME-Version: 1.0

Not intuitive, but vm_area_dup() located in kernel/fork.c is not only
used for duplicating VMAs during fork(), but also for duplicating VMAs
when splitting VMAs or when mremap()'ing them.

VM_PFNMAP mappings can at least get ordinarily mremap()'ed (no change in
size) and apparently also shrunk during mremap(), which implies
duplicating the VMA in __split_vma() first.

In case of ordinary mremap() (no change in size), we first duplicate the
VMA in copy_vma_and_data()->copy_vma() to then call untrack_pfn_clear()
on the old VMA: we effectively move the VM_PAT reservation. So the
untrack_pfn_clear() call on the newly duplicated VMA is wrong in that
context.

Splitting of VMAs seems problematic, because we don't duplicate/adjust
the reservation when splitting the VMA. Instead, in memtype_erase() --
called during zapping/munmap -- we shrink a reservation in case only the
end address matches: Assume we split a VMA into A and B, both would
share a reservation until B is unmapped.

So when unmapping B, the reservation would be updated to cover only A.
When unmapping A, we would properly remove the now-shrunk reservation.
That scenario describes the mremap() shrinking (old_size > new_size),
where we split + unmap B, and the untrack_pfn_clear() on the new VMA is
wrong.

What if we manage to split a VM_PFNMAP VMA into A and B and unmap A
first? It would be broken because we would never free the reservation.
Likely, there are ways to trigger such a VMA split outside of mremap().

Affecting other VMA duplication was not intended; vm_area_dup() being
used outside of kernel/fork.c was an oversight. So let's fix that for
now; how to handle VMA splits better should be investigated separately.

This was found by code inspection only, while staring at yet another
VM_PAT problem.

Fixes: dc84bc2aba85 ("x86/mm/pat: Fix VM_PAT handling when fork() fails in copy_page_range()")
Cc: Andrew Morton
Cc: Lorenzo Stoakes
Cc: Ingo Molnar
Cc: Dave Hansen
Cc: Andy Lutomirski
Cc: Peter Zijlstra
Cc: Thomas Gleixner
Cc: Borislav Petkov
Cc: Rik van Riel
Cc: "H. Peter Anvin"
Cc: Linus Torvalds
Signed-off-by: David Hildenbrand
---

This VM_PAT code really wants me to scream at my computer. So far it didn't
succeed, but I am close. Well, at least now I understand how it interacts
with VMA splitting ...

---
 kernel/fork.c | 9 +++++----
 1 file changed, 5 insertions(+), 4 deletions(-)

diff --git a/kernel/fork.c b/kernel/fork.c
index c4b26cd8998b8..168681fc4b25a 100644
--- a/kernel/fork.c
+++ b/kernel/fork.c
@@ -498,10 +498,6 @@ struct vm_area_struct *vm_area_dup(struct vm_area_struct *orig)
 	vma_numab_state_init(new);
 	dup_anon_vma_name(orig, new);
 
-	/* track_pfn_copy() will later take care of copying internal state. */
-	if (unlikely(new->vm_flags & VM_PFNMAP))
-		untrack_pfn_clear(new);
-
 	return new;
 }
 
@@ -672,6 +668,11 @@ static __latent_entropy int dup_mmap(struct mm_struct *mm,
 		tmp = vm_area_dup(mpnt);
 		if (!tmp)
 			goto fail_nomem;
+
+		/* track_pfn_copy() will later take care of copying internal state. */
+		if (unlikely(tmp->vm_flags & VM_PFNMAP))
+			untrack_pfn_clear(tmp);
+
 		retval = vma_dup_policy(mpnt, tmp);
 		if (retval)
 			goto fail_nomem_policy;
-- 
2.49.0
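
For readers who want to play with the A/B split scenario from the commit
message, here is a small user-space sketch. It is not kernel code and does
not call memtype_erase(); it only models the behaviour as described above
(an exact match removes the reservation, an end-only match shrinks it,
anything else leaves it alone). The names toy_resv, toy_erase() and
toy_leaks() and the addresses are made up for illustration.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for one reservation covering [start, end). */
struct toy_resv {
	unsigned long start;
	unsigned long end;
	bool live;
};

/*
 * Toy erase rule, modelled only on the commit message's description:
 *  - exact match:          remove the reservation
 *  - only the end matches: shrink the reservation so it ends at 'start'
 *  - anything else:        no match, reservation is left untouched
 * Returns true if the reservation was removed or shrunk.
 */
static bool toy_erase(struct toy_resv *r, unsigned long start,
		      unsigned long end)
{
	assert(start < end);

	if (!r->live)
		return false;
	if (r->start == start && r->end == end) {
		r->live = false;
		return true;
	}
	if (r->start < start && r->end == end) {
		r->end = start;
		return true;
	}
	return false;
}

/*
 * Split a mapping [0x1000, 0x3000) into A = [0x1000, 0x2000) and
 * B = [0x2000, 0x3000), unmap both in the requested order, and report
 * whether the (shared) reservation is still live afterwards.
 */
static bool toy_leaks(bool unmap_a_first)
{
	struct toy_resv r = { .start = 0x1000, .end = 0x3000, .live = true };
	const unsigned long lo = 0x1000, mid = 0x2000, hi = 0x3000;

	if (unmap_a_first) {
		toy_erase(&r, lo, mid);	/* A: neither exact nor end match */
		toy_erase(&r, mid, hi);	/* B: end match, shrinks to A's range */
	} else {
		toy_erase(&r, mid, hi);	/* B: end match, shrinks to A's range */
		toy_erase(&r, lo, mid);	/* A: now an exact match, removed */
	}
	return r.live;
}
```

In this model, toy_leaks(false) corresponds to the mremap() shrink case
(B unmapped first) and leaves nothing behind, while toy_leaks(true)
corresponds to unmapping A first and leaves the reservation live.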