From: Peter Xu
To: Jason Gunthorpe
Cc: David Hildenbrand, Jinjiang Tu, akpm@linux-foundation.org,
    lorenzo.stoakes@oracle.com, Liam.Howlett@oracle.com, vbabka@suse.cz,
    rppt@kernel.org, surenb@google.com, mhocko@suse.com, linux-mm@kvack.org,
    wangkefeng.wang@huawei.com
Subject: Re: [PATCH] mm: fix COW mapping handing in generic_access_phys
Date: Wed, 28 May 2025 13:14:58 -0400
References: <20250528015617.302681-1-tujinjiang@huawei.com>
    <0d4f0180-52e6-47c9-b141-54e7e7c86880@redhat.com>
    <5b9f5952-9979-426f-857a-dffa9b7963af@redhat.com>
    <20250528162915.GR61950@nvidia.com>
In-Reply-To: <20250528162915.GR61950@nvidia.com>

On Wed, May 28, 2025 at 01:29:15PM -0300, Jason Gunthorpe wrote:
> On Wed, May 28, 2025 at 12:06:07PM -0400, Peter Xu wrote:
> > #define VM_PFNMAP	0x00000400	/* Page-ranges managed without "struct page", just pure PFN */
> >
> > I'm not confident to blame any driver yet to have those special cases for
> > VM_PFNMAP, because it only says "managed without struct page", it didn't
> > say "it must not contain struct page".. Hence it hints the core mm "please
> > do not manage these mappings with struct page at all". Still sounds fair
> > contract, even if not ideal.
>
> I think it is pretty clear, if a VMA has VM_PFNMAP then nothing must
> ever try to obtain a struct page from any PTEs in it, for any reason,
> even if things in it might have a struct page. In practice it means
> nothing can call vm_normal_page() on a VM_PFNMAP.
>
> It would be nice to update the comment to make it clearer.

Yes that would help. Maybe the hard part is making sure how it is
documented will be how it is used..

>
> If the VMA owner wanted to permit access to the struct page then it
> should have used VM_MIXEDMAP.
>
> The fundamental difference between PFNMAP and MIXEDMAP is that
> vm_normal_page() is allowed on MIXEDMAP. That comes with some extra
> rules and restrictions to support arches without the special pte bit.

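To make the rule quoted above concrete, here is a rough userspace toy
model of it. This is not kernel code and not how vm_normal_page() is
implemented: every toy_* name is made up for the sketch, the
TOY_VM_MIXEDMAP value is an arbitrary placeholder, and only the
VM_PFNMAP value is taken from the #define quoted earlier. It only
shows the contract being discussed: never hand out a struct page from
a PFNMAP VMA, even when one exists behind the pfn.

```c
#include <assert.h>
#include <stddef.h>

/* Value copied from the #define quoted in this thread. */
#define TOY_VM_PFNMAP   0x00000400UL
/* Placeholder bit for this sketch only, not the kernel's value. */
#define TOY_VM_MIXEDMAP 0x00000800UL

/* Hypothetical stand-in for struct page. */
struct toy_page {
	int id;
};

/* Hypothetical PTE: a pfn plus an optional backing page. */
struct toy_pte {
	unsigned long pfn;
	struct toy_page *page;	/* NULL when the pfn has no struct page */
};

/* Hypothetical VMA carrying only the flags we care about. */
struct toy_vma {
	unsigned long flags;
};

/*
 * Toy version of the stricter contract:
 *  - PFNMAP:   never return a page, even if the pfn has one.
 *  - otherwise (e.g. MIXEDMAP): return the page if there is one.
 */
struct toy_page *toy_normal_page(const struct toy_vma *vma,
				 const struct toy_pte *pte)
{
	if (vma->flags & TOY_VM_PFNMAP)
		return NULL;
	return pte->page;
}
```

With a page-backed pte, toy_normal_page() returns NULL for a
TOY_VM_PFNMAP vma and the page for a TOY_VM_MIXEDMAP one; a pte
without a page yields NULL in both cases.
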
In an ideal world where VM_PFNMAP has stricter semantics, it sounds
fair to disable vm_normal_page() on VM_PFNMAP, yes.

>
> VM_IO | VM_PFNMAP further means that all the pfns in the VMA require
> the use of io accessors (writel/readl) to access them.

Hmm.. I'm not 100% sure on this one. E.g., the vDSO is VM_IO now, but
it's definitely accessible once it is mapped into userspace. The same
goes for, e.g., mmap()ing a VM_IO region and then accessing it through
the virtual addresses.

What you said sounds more like what __iomem declares than what VM_IO
means. But I confess I don't know why VM_IO exists in the first place,
considering there're also VM_*MAP and VM_DONTDUMP.

>
> No idea what VM_IO | VM_MIXEDMAP is supposed to mean. Only the special
> ptes need io accessors?
>
> In either case GUP doesn't really work on the VMA. PFNMAP is totally
> blocked, and for MIXEDMAP userspace has no way to discover which
> subset of the VMA is GUPable. I think that GUP is supported on
> MIXEDMAP at all is a bit of a weirdo thing.

Does it imply that in the ideal case one should use
follow_pfnmap_start() for MIXEDMAP?

I don't have a strong feeling yet on how GUP should treat MIXEDMAP.
Either (1) fail GUP on MIXEDMAP like you said, falling back to
follow_pfnmap_start(), or (2) allow GUP on MIXEDMAP only for
page-backed mappings, and fall back to follow_pfnmap_start() only for
non-page-backed mappings.

>
> The main value of MIXEDMAP having access to the struct page is to
> allow the page table itself to manage the life-cycle of memory that
> was allocated specifically and only for the VMA. It will do fork and
> unmap properly and free the memory at the right time.

Agree.

>
> If a driver is managing the memory lifecylce on its own and using zap
> to clear the PTEs (because it must to manage the non-struct page PTE
> life-cycle), then there isn't much point in using MIXEDMAP, IMHO. I
> wonder if some of the GPU drivers are confused like this.
>
> Jason
>

-- 
Peter Xu