Date: Tue, 4 Feb 2025 09:26:15 -0400
From: Jason Gunthorpe
To: Thomas Hellström
Cc: Yonatan Maman, kherbst@redhat.com, lyude@redhat.com, dakr@redhat.com, airlied@gmail.com, simona@ffwll.ch, leon@kernel.org, jglisse@redhat.com, akpm@linux-foundation.org, GalShalom@nvidia.com, dri-devel@lists.freedesktop.org, nouveau@lists.freedesktop.org, linux-kernel@vger.kernel.org, linux-rdma@vger.kernel.org, linux-mm@kvack.org, linux-tegra@vger.kernel.org
Subject: Re: [RFC 1/5] mm/hmm: HMM API to enable P2P DMA for device private pages
Message-ID: <20250204132615.GI2296753@ziepe.ca>
References: <20250128172123.GD1524382@ziepe.ca> <20250129134757.GA2120662@ziepe.ca> <20250130132317.GG2120662@ziepe.ca> <20250130174217.GA2296753@ziepe.ca> <20250203150805.GC2296753@ziepe.ca> <7b7a15fb1f59acc60393eb01cefddf4dc1f32c00.camel@linux.intel.com>
In-Reply-To: <7b7a15fb1f59acc60393eb01cefddf4dc1f32c00.camel@linux.intel.com>

On Tue, Feb 04, 2025 at 10:32:32AM +0100, Thomas Hellström wrote:

> > I would not be happy to see this. Please improve pagemap directly if
> > you think you need more things.
>
> These are mainly helpers to migrate and populate a range of cpu memory
> space (struct mm_struct) with GPU device_private memory, migrate to
> system on gpu memory shortage and implement the migrate_to_vram pagemap
> op, tied to gpu device memory allocations, so I don't think there is
> anything we should be exposing at the dev_pagemap level at this point?

Maybe that belongs in mm/hmm then?

> > Neither really match the expected design here. The owner should be
> > entirely based on reachability. Devices that cannot reach each other
> > directly should have different owners.
>
> Actually what I'm putting together is a small helper to allocate and
> assign an "owner" based on devices that are previously registered to a
> "registry". The caller has to indicate using a callback function for
> each struct device pair whether there is a fast interconnect available,
> and this is expected to be done at pagemap creation time, so I think
> this aligns with the above. Initially a "registry" (which is a list of
> device-owner pairs) will be driver-local, but could easily have a wider
> scope.

Yeah, that seems like a workable idea.

> This means we handle access control, unplug checks and similar at
> migration time, typically before hmm_range_fault(), and the role of
> hmm_range_fault() will be to over pfns whose backing memory is directly
> accessible to the device, else migrate to system.

Yes, that sounds right.

> 1) Existing users would never use the callback. They can still rely on
> the owner check, only if that fails we check for callback existence.
> 2) By simply caching the result from the last checked dev_pagemap, most
> callback calls could typically be eliminated.

But then you are not in the locked region, so your cache is racy and
invalid.

> 3) As mentioned before, a callback call would typically always be
> followed by either migration to ram or a page-table update. Compared to
> these, the callback overhead would IMO be unnoticeable.

Why?
Surely the normal case should be a callback saying the memory can
be accessed?

> 4) pcie_p2p is already planning a dev_pagemap callback?

Yes, but it is not a racy validation callback, and it is already
creating a complicated lifecycle problem inside the exporting driver.

Jason
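
To make the owner-by-reachability idea concrete, here is a minimal
userspace sketch of a registry that hands out owners from a per-pair
interconnect callback. It is not kernel code and not the helper being
proposed in this thread: the names owner_registry, registry_add,
has_interconnect_fn, example_same_fabric and MAX_DEVS are all
hypothetical, and devices are plain integers instead of struct device.
The sketch assumes a device may share an owner only if it can reach
every device that already holds that owner.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MAX_DEVS 16 /* hypothetical fixed capacity for the sketch */

/* Hypothetical callback: true if dev_a and dev_b share a fast interconnect. */
typedef bool (*has_interconnect_fn)(int dev_a, int dev_b);

/* A "registry" here is just a list of device-owner pairs. */
struct owner_registry {
	int dev[MAX_DEVS];   /* registered device ids */
	int owner[MAX_DEVS]; /* owner id assigned to dev[i] */
	size_t count;        /* number of registered devices */
	int next_owner;      /* next unused owner id */
};

/*
 * Register a device and return its owner id, or -1 if the registry is
 * full. The device joins an existing owner only if it can reach every
 * device already holding that owner; otherwise it gets a fresh owner,
 * so devices that cannot reach each other never share one.
 */
int registry_add(struct owner_registry *r, int dev, has_interconnect_fn reach)
{
	int chosen = -1;

	assert(r && reach);
	if (r->count >= MAX_DEVS)
		return -1;

	for (int o = 0; o < r->next_owner && chosen < 0; o++) {
		bool any = false, all = true;

		for (size_t i = 0; i < r->count; i++) {
			if (r->owner[i] != o)
				continue;
			any = true;
			if (!reach(dev, r->dev[i])) {
				all = false;
				break;
			}
		}
		if (any && all)
			chosen = o;
	}

	if (chosen < 0)
		chosen = r->next_owner++;

	r->dev[r->count] = dev;
	r->owner[r->count] = chosen;
	r->count++;
	return chosen;
}

/* Example callback (an assumption): ids in the same block of 4 share a fabric. */
bool example_same_fabric(int dev_a, int dev_b)
{
	return dev_a / 4 == dev_b / 4;
}
```

With a zero-initialised registry and example_same_fabric as the
callback, devices 0 and 1 end up with one owner and device 4 gets a
different one. The owner is computed once at registration time, which
matches the point above about doing this at pagemap creation rather
than through a validation callback on the fault path.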