Date: Wed, 28 May 2025 11:25:04 -0400
From: Peter Xu
To: David Hildenbrand
Cc: Jinjiang Tu, akpm@linux-foundation.org, lorenzo.stoakes@oracle.com,
 Liam.Howlett@oracle.com, vbabka@suse.cz, rppt@kernel.org,
 surenb@google.com, mhocko@suse.com, linux-mm@kvack.org,
 wangkefeng.wang@huawei.com
Subject: Re: [PATCH] mm: fix COW mapping handing in generic_access_phys
References: <20250528015617.302681-1-tujinjiang@huawei.com>
 <0d4f0180-52e6-47c9-b141-54e7e7c86880@redhat.com>
 <5b9f5952-9979-426f-857a-dffa9b7963af@redhat.com>
In-Reply-To: <5b9f5952-9979-426f-857a-dffa9b7963af@redhat.com>

On Wed, May 28, 2025 at 05:02:15PM +0200, David Hildenbrand wrote:
> On 28.05.25 16:54, Peter Xu wrote:
> > [Add Jason]
> >
> > On Wed, May 28, 2025 at 11:59:56AM +0200, David Hildenbrand wrote:
> > > On 28.05.25 10:59, David Hildenbrand wrote:
> > > > On 28.05.25 03:56, Jinjiang Tu wrote:
> > > > > Syzkaller reports a below BUG:
> > > > > ioremap on RAM at 0x0000000022727000 - 0x0000000022727fff
> > > > > WARNING: CPU: 3 PID: 3609 at arch/x86/mm/ioremap.c:216 __ioremap_caller+0x644/0x7f0 arch/x86/mm/ioremap.c:216
> > > > > Modules linked in:
> > > > > CPU: 3 PID: 3609 Comm: syz.2.577 Not tainted 6.6.0+ #63
> > > > > Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.14.0-0-g155821a1990b-prebuilt.qemu.org 04/01/2014
> > > > > RIP: 0010:__ioremap_caller+0x644/0x7f0 arch/x86/mm/ioremap.c:216
> > > > > Call Trace:
> > > > >
> > > > >  generic_access_phys+0x241/0x480 mm/memory.c:6458
> > > > >  __access_remote_vm+0x6af/0x970 mm/memory.c:6535
> > > > >  access_process_vm+0x53/0x80 mm/memory.c:6600
> > > > >  get_cmdline+0x192/0x380 mm/util.c:1041
> > > > >  audit_log_proctitle kernel/auditsc.c:1620 [inline]
> > > > >  audit_log_exit+0x1424/0x18c0 kernel/auditsc.c:1811
> > > > >  __audit_syscall_exit+0x252/0x2f0 kernel/auditsc.c:2079
> > > > >  audit_syscall_exit include/linux/audit.h:356 [inline]
> > > > >  syscall_exit_work+0x10f/0x130 kernel/entry/common.c:166
> > > > >  __syscall_exit_to_user_mode_work kernel/entry/common.c:205 [inline]
> > > > >  syscall_exit_to_user_mode+0x10/0x1e0 kernel/entry/common.c:218
> > > > >  do_syscall_64+0x66/0x110 arch/x86/entry/common.c:87
> > > > >  entry_SYSCALL_64_after_hwframe+0x78/0xe2
> > > > >
> > > > > The /dev/mem is mapped with COW mapping, and mremap at the mm->args_start.
> > > > > The special pfn mapping is replaced by anon folios due to COW.
> > > > > generic_access_phys() is supposed to handle iomem, instead of RAM pfn,
> > > > > thus trigger a WARN_ON.
> > > > >
> > > > > Similar to commit 04c35ab3bdae ("x86/mm/pat: fix VM_PAT handling in
> > > > > COW mappings"). check if the pte is special to reject Cowed anon folios.
> > > > >
> > > > > Signed-off-by: Jinjiang Tu
> > > > > ---
> > > > >  mm/memory.c | 7 +++++++
> > > > >  1 file changed, 7 insertions(+)
> > > > >
> > > > > diff --git a/mm/memory.c b/mm/memory.c
> > > > > index 49199410805c..e1dac84536ee 100644
> > > > > --- a/mm/memory.c
> > > > > +++ b/mm/memory.c
> > > > > @@ -6840,6 +6840,13 @@ int generic_access_phys(struct vm_area_struct *vma, unsigned long addr,
> > > > >  retry:
> > > > >          if (follow_pfnmap_start(&args))
> > > > >                  return -EINVAL;
> > > > > +
> > > > > +        /* Never return PFNs of anon folios in COW mappings. */
> > > > > +        if (!args.special) {
> > > > > +                follow_pfnmap_end(&args);
> > > > > +                return -EINVAL;
> > > > > +        }
> > > > > +
> > > > >          prot = args.pgprot;
> > > > >          phys_addr = (resource_size_t)args.pfn << PAGE_SHIFT;
> > > > >          writable = args.writable;
> > > > I assume we trigger this through vma->vm_ops->access, when the vm_ops have generic_access_phys set.
> > > >
> > > > I still dislike exposing the "special" bit here, as it is absolutely not what we should care about in the caller.
> > > >
> > > > In case our arch does not support pte_special, your fix will not catch that case ...
> > > >
> > > > The following might be better:
> > > >
> > > > diff --git a/mm/memory.c b/mm/memory.c
> > > > index 37d8738f5e12e..810adb8d1a53b 100644
> > > > --- a/mm/memory.c
> > > > +++ b/mm/memory.c
> > > > @@ -6681,6 +6681,14 @@ int generic_access_phys(struct vm_area_struct *vma, unsigned long addr,
> > > >          prot = args.pgprot;
> > > >          phys_addr = (resource_size_t)args.pfn << PAGE_SHIFT;
> > > >          writable = args.writable;
> > > > +
> > > > +        /* Refuse (refcounted) anonymous pages in CoW mappings. */
> > > > +        if (is_cow_mapping(vma->vm_flags) &&
> > > > +            vm_normal_page(vma, addr, ptep_get(args.ptep))) {
> > > > +                follow_pfnmap_end(&args);
> > > > +                return -EINVAL;
> > > > +        }
> > > > +
> > > >
> > > Thinking again, we might have a PMD/PUD mapping, so maybe
> > > follow_pfnmap_start() should really just refuse any refcounted pages. [1]
> > We may want to be careful on this.
> >
> > I feel like we can still potentially break drivers that
> > follow_pfnmap_start() used to work on debateable things like RAM page
> > injections, unless breaking them is the intention.
>
> Yes, that all needs a cleanup likely; it's all very confusing and
> inconsistent.
>
> >
> > OTOH, I also see at least two in-tree drivers set VM_IO|VM_MIXEDMAP:
> >
> > *** drivers/gpu/drm/gma500/fbdev.c:
> > psb_fbdev_fb_mmap[110]         vm_flags_set(vma, VM_IO | VM_MIXEDMAP | VM_DONTEXPAND | VM_DONTDUMP);
> >
> > *** drivers/gpu/drm/omapdrm/omap_gem.c:
> > omap_gem_object_mmap[538]      vm_flags_set(vma, VM_DONTEXPAND | VM_DONTDUMP | VM_IO | VM_MIXEDMAP);
> >
> > AFAIU, these MIXEDMAP users will still rely on follow_pfnmap_start() to
> > work on e.g. RAM pages, because GUP will simply fail them..
>
> Right.
>
> VM_IO essentially tells us "don't touch this memory, it might have side
> effects", such as MMIO, that's why GUP outright refuses VM_IO VMAs.
>
> I am not sure why generic_access_phys() should be allowed to ... touch that
> memory instead?

I'm looking at:

commit 28b2ee20c7cba812b6f2ccf6d722cf86d00a84dc
Author: Rik van Riel
Date:   Wed Jul 23 21:27:05 2008 -0700

    access_process_vm device memory infrastructure

VM_IO is also intentionally mentioned in the doc:

Documentation/filesystems/locking.rst

  ->access() is called when get_user_pages() fails in
  access_process_vm(), typically used to debug a process through
  /proc/pid/mem or ptrace.  This function is needed only for
  VM_IO | VM_PFNMAP VMAs.

So it definitely looks intentional, though I know nothing about PPC
Cell SPUs..

>
> Confusing.
>
> >
> > Slightly off-topic: it's also a bit confusing to me on whether the driver
> > should set VM_IO for VM_MIXEDMAP.  I think it should because VM_IO says:
>
> Depends on the driver. Some VM_MIXEDMAP users only map ordinary kernel
> allocations, where VM_IO is not required. For these drivers, I really don't
> know.
>
> --
> Cheers,
>
> David / dhildenb
>

--
Peter Xu
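
For anyone following the thread, here is a rough userspace model of the check David suggested above (refuse refcounted pages in CoW mappings). It is only a sketch and not kernel code. The flag values and the is_cow_mapping() test mirror include/linux/mm.h, but struct pte_model and should_refuse_phys_access() are hypothetical names made up for illustration, with a plain bool standing in for what vm_normal_page() would report.

```c
#include <stdbool.h>

/* Flag values as defined in include/linux/mm.h. */
#define VM_SHARED   0x00000008UL
#define VM_MAYWRITE 0x00000020UL
#define VM_PFNMAP   0x00000400UL
#define VM_IO       0x00004000UL

/* Same test as the kernel helper: private mapping that may become writable. */
static bool is_cow_mapping(unsigned long vm_flags)
{
	return (vm_flags & (VM_SHARED | VM_MAYWRITE)) == VM_MAYWRITE;
}

/*
 * Hypothetical stand-in for a PTE: the only thing modelled is whether
 * vm_normal_page() would return a refcounted page for it (assumption).
 */
struct pte_model {
	bool maps_refcounted_page;
};

/*
 * Hypothetical helper: returns true when a generic_access_phys()-style
 * caller should bail out with -EINVAL under the proposed check.
 */
static bool should_refuse_phys_access(unsigned long vm_flags,
				      struct pte_model pte)
{
	return is_cow_mapping(vm_flags) && pte.maps_refcounted_page;
}
```

In this model a private /dev/mem-style mapping (VM_PFNMAP | VM_IO | VM_MAYWRITE) is refused only once CoW has replaced the raw PFN with an anon page, while a shared mapping that maps refcounted pages still passes. The PMD/PUD case raised later in the thread is not modelled here.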