Date: Wed, 28 May 2025 10:54:26 -0400
From: Peter Xu
To: David Hildenbrand
Cc: Jinjiang Tu, akpm@linux-foundation.org, lorenzo.stoakes@oracle.com, Liam.Howlett@oracle.com, vbabka@suse.cz, rppt@kernel.org, surenb@google.com, mhocko@suse.com, linux-mm@kvack.org, wangkefeng.wang@huawei.com
Subject: Re: [PATCH] mm: fix COW mapping handing in generic_access_phys
References: <20250528015617.302681-1-tujinjiang@huawei.com> <0d4f0180-52e6-47c9-b141-54e7e7c86880@redhat.com>

[Add Jason]

On Wed, May 28, 2025 at 11:59:56AM +0200, David Hildenbrand wrote:
> On 28.05.25 10:59, David Hildenbrand wrote:
> > On 28.05.25 03:56, Jinjiang Tu wrote:
> > > Syzkaller reports a below BUG:
> > > ioremap on RAM at 0x0000000022727000 - 0x0000000022727fff
> > > WARNING: CPU: 3 PID: 3609 at arch/x86/mm/ioremap.c:216 __ioremap_caller+0x644/0x7f0 arch/x86/mm/ioremap.c:216
> > > Modules linked in:
> > > CPU: 3 PID: 3609 Comm: syz.2.577 Not tainted 6.6.0+ #63
> > > Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.14.0-0-g155821a1990b-prebuilt.qemu.org 04/01/2014
> > > RIP: 0010:__ioremap_caller+0x644/0x7f0 arch/x86/mm/ioremap.c:216
> > > Call Trace:
> > >
> > > generic_access_phys+0x241/0x480 mm/memory.c:6458
> > > __access_remote_vm+0x6af/0x970 mm/memory.c:6535
> > > access_process_vm+0x53/0x80 mm/memory.c:6600
> > > get_cmdline+0x192/0x380 mm/util.c:1041
> > > audit_log_proctitle kernel/auditsc.c:1620 [inline]
> > > audit_log_exit+0x1424/0x18c0 kernel/auditsc.c:1811
> > > __audit_syscall_exit+0x252/0x2f0 kernel/auditsc.c:2079
> > > audit_syscall_exit include/linux/audit.h:356 [inline]
> > > syscall_exit_work+0x10f/0x130 kernel/entry/common.c:166
> > > __syscall_exit_to_user_mode_work kernel/entry/common.c:205 [inline]
> > > syscall_exit_to_user_mode+0x10/0x1e0 kernel/entry/common.c:218
> > > do_syscall_64+0x66/0x110 arch/x86/entry/common.c:87
> > > entry_SYSCALL_64_after_hwframe+0x78/0xe2
> > >
> > > The /dev/mem is mapped with COW mapping, and mremap at the mm->args_start.
> > > The special pfn mapping is replaced by anon folios due to COW.
> > > generic_access_phys() is supposed to handle iomem, instead of RAM pfn,
> > > thus trigger a WARN_ON.
> > >
> > > Similar to commit 04c35ab3bdae ("x86/mm/pat: fix VM_PAT handling in
> > > COW mappings"). check if the pte is special to reject Cowed anon folios.
> > >
> > > Signed-off-by: Jinjiang Tu
> > > ---
> > > mm/memory.c | 7 +++++++
> > > 1 file changed, 7 insertions(+)
> > >
> > > diff --git a/mm/memory.c b/mm/memory.c
> > > index 49199410805c..e1dac84536ee 100644
> > > --- a/mm/memory.c
> > > +++ b/mm/memory.c
> > > @@ -6840,6 +6840,13 @@ int generic_access_phys(struct vm_area_struct *vma, unsigned long addr,
> > >  retry:
> > >  	if (follow_pfnmap_start(&args))
> > >  		return -EINVAL;
> > > +
> > > +	/* Never return PFNs of anon folios in COW mappings. */
> > > +	if (!args.special) {
> > > +		follow_pfnmap_end(&args);
> > > +		return -EINVAL;
> > > +	}
> > > +
> > >  	prot = args.pgprot;
> > >  	phys_addr = (resource_size_t)args.pfn << PAGE_SHIFT;
> > >  	writable = args.writable;
> >
> > I assume we trigger this through vma->vm_ops->access, when the vm_ops have generic_access_phys set.
> >
> > I still dislike exposing the "special" bit here, as it is absolutely not what we should care about in the caller.
> >
> > In case our arch does not support pte_special, you fix will not catch that case ...
> >
> > The following might be better:
> >
> > diff --git a/mm/memory.c b/mm/memory.c
> > index 37d8738f5e12e..810adb8d1a53b 100644
> > --- a/mm/memory.c
> > +++ b/mm/memory.c
> > @@ -6681,6 +6681,14 @@ int generic_access_phys(struct vm_area_struct *vma, unsigned long addr,
> >  	prot = args.pgprot;
> >  	phys_addr = (resource_size_t)args.pfn << PAGE_SHIFT;
> >  	writable = args.writable;
> > +
> > +	/* Refuse (refcounted) anonymous pages in CoW mappings. */
> > +	if (is_cow_mapping(vma->vm_flags) &&
> > +	    vm_normal_page(vma, addr, ptep_get(args.ptep))) {
> > +		follow_pfnmap_end(&args);
> > +		return -EINVAL;
> > +	}
> > +
>
> Thinking again, we might have a PMD/PUD mapping, so maybe
> follow_pfnmap_start() should really just refuse any refcounted pages.

We may want to be careful about this. I feel like we can still
potentially break drivers for which follow_pfnmap_start() used to work
on debatable things like RAM page injections, unless breaking them is
the intention.

OTOH, I also see at least two in-tree drivers that set
VM_IO|VM_MIXEDMAP:

*** drivers/gpu/drm/gma500/fbdev.c:
psb_fbdev_fb_mmap[110]         vm_flags_set(vma, VM_IO | VM_MIXEDMAP | VM_DONTEXPAND | VM_DONTDUMP);

*** drivers/gpu/drm/omapdrm/omap_gem.c:
omap_gem_object_mmap[538]      vm_flags_set(vma, VM_DONTEXPAND | VM_DONTDUMP | VM_IO | VM_MIXEDMAP);

AFAIU, these MIXEDMAP users will still rely on follow_pfnmap_start() to
work on e.g. RAM pages, because GUP will simply fail them.

Slightly off-topic: it's also a bit unclear to me whether a driver
should set VM_IO for VM_MIXEDMAP. I think it should, because VM_IO
says:

  #define VM_IO           0x00004000      /* Memory mapped I/O or similar */

IIUC it implies an IO mapping, so e.g. the cache behavior can be
different. But I'm not so sure now, seeing that both kinds of drivers
exist: some set only MIXEDMAP, while the two above set MIXEDMAP|IO.

-- 
Peter Xu
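
For readers following the thread, the difference between the two ideas
quoted above (refuse refcounted pages only in CoW mappings, versus
refuse any refcounted page) can be modelled in user space. This is only
a sketch under stated assumptions: the sketch_* helpers are
hypothetical names and not kernel functions, the flag values other than
VM_IO are reproduced from include/linux/mm.h from memory and should be
checked against the tree, and the maps_normal_page argument merely
stands in for "vm_normal_page() returned a page".

```c
#include <stdbool.h>

/*
 * Flag values as I recall them from include/linux/mm.h. VM_IO matches
 * the definition quoted in the mail; treat the rest as assumptions.
 */
#define VM_SHARED	0x00000008UL
#define VM_MAYWRITE	0x00000020UL
#define VM_PFNMAP	0x00000400UL
#define VM_IO		0x00004000UL
#define VM_MIXEDMAP	0x10000000UL

/*
 * User-space copy of the is_cow_mapping() predicate: the mapping is
 * private (VM_SHARED clear) but may become writable (VM_MAYWRITE set).
 */
static inline bool sketch_is_cow_mapping(unsigned long flags)
{
	return (flags & (VM_SHARED | VM_MAYWRITE)) == VM_MAYWRITE;
}

/*
 * Hypothetical model of the first idea: refuse only when the mapping
 * is CoW *and* the PTE maps a normal (refcounted) page.
 */
static inline bool sketch_refuse_cow_only(unsigned long flags,
					  bool maps_normal_page)
{
	return sketch_is_cow_mapping(flags) && maps_normal_page;
}

/*
 * Hypothetical model of the second idea: refuse every refcounted
 * page, regardless of the mapping flags.
 */
static inline bool sketch_refuse_any_refcounted(unsigned long flags,
						bool maps_normal_page)
{
	(void)flags;	/* flags deliberately ignored in this variant */
	return maps_normal_page;
}
```

Under this model, a private /dev/mem-style mapping (VM_PFNMAP |
VM_MAYWRITE) whose PTE now points at a CoW'ed anon page is refused by
both variants. A driver mapping with VM_IO | VM_MIXEDMAP that is also
shared (an assumption about how such mappings are typically created)
and backed by a RAM page is refused only by the second variant, which
is the behaviour change the reply above is worried about.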