Date: Mon, 15 Jul 2024 10:29:59 -0400
From: Peter Xu
To: Yan Zhao
Cc: linux-kernel@vger.kernel.org, linux-mm@kvack.org, Andrew Morton, Alex Williamson, Jason Gunthorpe, Al Viro, Dave Hansen, Andy Lutomirski, Peter Zijlstra, Thomas Gleixner, Ingo Molnar, Borislav Petkov, "Kirill A . Shutemov", x86@kernel.org, Kevin Tian, Pei Li, David Hildenbrand, David Wang <00107082@163.com>, Bert Karwatzki, Sergey Senozhatsky
Subject: Re: [PATCH] mm/x86/pat: Only untrack the pfn range if unmap region
References: <20240712144244.3090089-1-peterx@redhat.com>

On Mon, Jul 15, 2024 at 03:08:58PM +0800, Yan Zhao wrote:
> On Fri, Jul 12, 2024 at 10:42:44AM -0400, Peter Xu wrote:
> > This patch is one patch from an old series [1], reposted standalone
> > here in the hope of fixing some untrack_pfn() issues reported
> > recently [2,3].  There was another fix [4], but unfortunately it looks
> > like it causes other issues.  The hope is that this patch can fix it
> > the right way.
> >
> > X86 uses pfn tracking to do pfnmaps.  AFAICT, the tracking should normally
> > start at mmap() of device drivers, then be untracked at munmap().  However,
> > in the current code the untrack is done in unmap_single_vma().  This might
> > be problematic.
> >
> > For example, unmap_single_vma() can be used nowadays even for zapping a
> > single page rather than the whole vma.
> > It's very confusing to do whole-vma untracking in this function even
> > if the caller only wants to zap one page.  It could simply be wrong.
> >
> > Such an issue won't be exposed by things like MADV_DONTNEED, which never
> > works for pfnmaps: the madvise() already fails before reaching here.
> > However, it looks like it can be triggered, as in the reports, when
> > invoked from an unmap request from a file vma.
> >
> > There's also work [5] on VFIO (merged now) to allow tearing down MMIO
> > pgtables before an munmap(), in which case we may not want to untrack the
> > pfns if we're only tearing down the pgtables.  IOW, we may want to keep the
> > pfn tracking information, as those pfn mappings can be restored later with
> > the same vma object.  Currently it's not an immediate problem for VFIO, as
> > VFIO uses UC- by default, but it looks like there's a plan to extend that
> > in the near future.
> >
> > IIUC, this was overlooked when zap_page_range_single() was introduced,
> > while in the past it was only used in the munmap() path, which always
> > wants to unmap the region completely.  E.g., commit f5cc4eef9987 ("VM: make
> > zap_page_range() callers that act on a single VMA use separate helper") is
> > the initial commit that introduced unmap_single_vma(), in which the chunk
> > of untrack_pfn() was moved over from unmap_vmas().
> >
> > Restore that behavior: untrack the pfnmap only when unmapping regions.
> >
> > [1] https://lore.kernel.org/r/20240523223745.395337-1-peterx@redhat.com
> > [2] https://groups.google.com/g/syzkaller-bugs/c/FeQZvSbqWbQ/m/tHFmoZthAAAJ
> > [3] https://lore.kernel.org/r/20240712131931.20207-1-00107082@163.com
> > [4] https://lore.kernel.org/all/20240710-bug12-v1-1-0e5440f9b8d3@gmail.com/
> > [5] https://lore.kernel.org/r/20240523195629.218043-1-alex.williamson@redhat.com
> >
> > Cc: Alex Williamson
> > Cc: Jason Gunthorpe
> > Cc: Al Viro
> > Cc: Dave Hansen
> > Cc: Andy Lutomirski
> > Cc: Peter Zijlstra
> > Cc: Thomas Gleixner
> > Cc: Ingo Molnar
> > Cc: Borislav Petkov
> > Cc: Kirill A. Shutemov
> > Cc: x86@kernel.org
> > Cc: Yan Zhao
> > Cc: Kevin Tian
> > Cc: Pei Li
> > Cc: David Hildenbrand
> > Cc: David Wang <00107082@163.com>
> > Cc: Bert Karwatzki
> > Cc: Sergey Senozhatsky
> > Signed-off-by: Peter Xu
> > ---
> >
> > NOTE: I massaged the commit message compared to the RFC post [1]; the
> > patch itself is untouched.  Also removed the RFC tag and added more people
> > to the loop.  Please kindly help test this patch if you have a reproducer,
> > as I can't reproduce it myself even with the syzbot reproducer on top of
> > mm-unstable.  Instead of checking the reproducer further, I decided to send
> > this out first, as we have a bunch of reproducers on the list now..
> > ---
> >  mm/memory.c | 5 ++---
> >  1 file changed, 2 insertions(+), 3 deletions(-)
> >
> > diff --git a/mm/memory.c b/mm/memory.c
> > index 4bcd79619574..f57cc304b318 100644
> > --- a/mm/memory.c
> > +++ b/mm/memory.c
> > @@ -1827,9 +1827,6 @@ static void unmap_single_vma(struct mmu_gather *tlb,
> >  	if (vma->vm_file)
> >  		uprobe_munmap(vma, start, end);
> >
> > -	if (unlikely(vma->vm_flags & VM_PFNMAP))
> > -		untrack_pfn(vma, 0, 0, mm_wr_locked);
> > -
> Specifically in VFIO's case, it looks like it doesn't matter whether
> untrack_pfn() is called here, since remap_pfn_range() is not called in
> mmap() or the fault handler, and therefore (vma->vm_flags & VM_PAT) is
> always 0.

Right, with the current code, but I'm thinking maybe we should have VM_PAT
there..

See what reserve_pfn_range() does right now: it makes sure only one owner
reserves this area, e.g. memtype_reserve() will fail if it has already been
reserved.  Then, having succeeded as the first one to reserve the region, it
makes sure this memtype is also synchronized to the kernel map
(memtype_kernel_map_sync).

So I feel like when MMIO is disabled for a VFIO card, we may need to call
reserve_pfn_range(), telling the kernel the memtype VFIO uses, even if the
pgtable will be empty, and even if it's always UC- for now:

  vfio_pci_core_mmap():
	vma->vm_page_prot = pgprot_noncached(vma->vm_page_prot);

Then the memtype will be reserved even if it cannot be used.  Otherwise I
don't know whether there are side effects from the kernel identity mapping
mismatching what is later mapped in userspace via the vma once MMIO gets
enabled again: the pgtable would start to be populated with a different
memtype that is specified only in the vma's pgprot.

>
> >  	if (start != end) {
> >  		if (unlikely(is_vm_hugetlb_page(vma))) {
> >  			/*
> > @@ -1894,6 +1891,8 @@ void unmap_vmas(struct mmu_gather *tlb, struct ma_state *mas,
> >  		unsigned long start = start_addr;
> >  		unsigned long end = end_addr;
> >  		hugetlb_zap_begin(vma, &start, &end);
> > +		if (unlikely(vma->vm_flags & VM_PFNMAP))
> > +			untrack_pfn(vma, 0, 0, mm_wr_locked);
> Same here.
>
> >  		unmap_single_vma(tlb, vma, start, end, &details,
> >  				 mm_wr_locked);
> >  		hugetlb_zap_end(vma, &details);
> > --
> > 2.45.0
> >
>

-- 
Peter Xu
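
As a rough illustration of the behaviour the patch aims for, here is a toy
userspace model in C.  It is only a sketch under stated assumptions, not kernel
code: struct toy_vma, toy_mmap(), toy_zap_range(), toy_unmap_vmas() and
toy_present_count() are hypothetical names made up for this example, with
toy_zap_range() standing in for unmap_single_vma() and toy_unmap_vmas() for
unmap_vmas().  What it tries to show is that a partial zap leaves the pfn
tracking alone, and only the full unmap path drops it.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical toy model; none of these names exist in the kernel. */
#define TOY_NPAGES 4

struct toy_vma {
	bool pfn_tracked;          /* models the PAT memtype reservation */
	bool present[TOY_NPAGES];  /* models populated page table entries */
};

/* Models mmap() of a pfnmap: start tracking and populate every page. */
static void toy_mmap(struct toy_vma *vma)
{
	vma->pfn_tracked = true;
	for (size_t i = 0; i < TOY_NPAGES; i++)
		vma->present[i] = true;
}

/*
 * Models unmap_single_vma() after the patch: zap the entries in
 * [first, last) and leave the tracking state untouched.
 */
static void toy_zap_range(struct toy_vma *vma, size_t first, size_t last)
{
	if (last > TOY_NPAGES)
		last = TOY_NPAGES;
	for (size_t i = first; i < last; i++)
		vma->present[i] = false;
}

/* Models unmap_vmas() after the patch: untrack, then zap the whole vma. */
static void toy_unmap_vmas(struct toy_vma *vma)
{
	vma->pfn_tracked = false;
	toy_zap_range(vma, 0, TOY_NPAGES);
}

/* Helper: count the entries that are still populated. */
static size_t toy_present_count(const struct toy_vma *vma)
{
	size_t n = 0;

	for (size_t i = 0; i < TOY_NPAGES; i++)
		n += vma->present[i] ? 1 : 0;
	return n;
}
```

In this model, calling toy_mmap(&vma) and then toy_zap_range(&vma, 1, 2)
leaves vma.pfn_tracked set with three pages still present; only
toy_unmap_vmas(&vma) clears the tracking.  With the pre-patch placement, the
untrack step would conceptually sit inside toy_zap_range() and run on every
partial zap as well.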