From mboxrd@z Thu Jan 1 00:00:00 1970
Date: Wed, 29 Jan 2025 14:38:58 +0100
From: Simona Vetter
To: Jason Gunthorpe
Cc: Thomas Hellström, Yonatan Maman, kherbst@redhat.com, lyude@redhat.com,
	dakr@redhat.com, airlied@gmail.com, simona@ffwll.ch, leon@kernel.org,
	jglisse@redhat.com, akpm@linux-foundation.org, GalShalom@nvidia.com,
	dri-devel@lists.freedesktop.org, nouveau@lists.freedesktop.org,
	linux-kernel@vger.kernel.org, linux-rdma@vger.kernel.org,
	linux-mm@kvack.org, linux-tegra@vger.kernel.org
Subject: Re: [RFC 1/5] mm/hmm: HMM API to enable P2P DMA for device private pages
References: <20241201103659.420677-1-ymaman@nvidia.com>
	<20241201103659.420677-2-ymaman@nvidia.com>
	<7282ac68c47886caa2bc2a2813d41a04adf938e1.camel@linux.intel.com>
	<20250128132034.GA1524382@ziepe.ca>
	<20250128151610.GC1524382@ziepe.ca>
	<20250128172123.GD1524382@ziepe.ca>
In-Reply-To: <20250128172123.GD1524382@ziepe.ca>

On Tue, Jan 28, 2025 at 01:21:23PM -0400, Jason Gunthorpe wrote:
> On Tue, Jan 28, 2025 at 05:32:23PM +0100, Thomas Hellström wrote:
> > > This series supports three case:
> > >
> > >  1) pgmap->owner == range->dev_private_owner
> > >     This is "driver private fast interconnect" in this case HMM should
> > >     immediately return the page. The calling driver understands the
> > >     private parts of the pgmap and computes the private interconnect
> > >     address.
> > >
> > >     This requires organizing your driver so that all private
> > >     interconnect has the same pgmap->owner.
> >
> > Yes, although that makes this map static, since pgmap->owner has to be
> > set at pgmap creation time. and we were during initial discussions
> > looking at something dynamic here. However I think we can probably do
> > with a per-driver owner for now and get back if that's not sufficient.
>
> The pgmap->owner doesn't *have* to fixed, certainly during early boot
> before you hand out any page references it can be changed. I wouldn't be
> surprised if this is useful to some requirements to build up the
> private interconnect topology?

The trouble I'm seeing is device probe and the fundamental issue that you
never know when you're done. And so if we entirely rely on pgmap->owner to
figure out the driver private interconnect topology, that's going to be
messy.
That's why I'm also leaning towards both comparing owners and
having an additional check whether the interconnect is actually there or
not yet. You can fake that by doing these checks after hmm_range_fault
returned, and if you get a bunch of unsuitable pages, toss it back to
hmm_range_fault asking for an unconditional migration to system memory for
those. But that's kinda not great and I think goes at least against the
spirit of how you want to handle pci p2p in step 2 below?

Cheers, Sima

> > >  2) The page is DEVICE_PRIVATE and get_dma_pfn_for_device() exists.
> > >     The exporting driver has the option to return a P2P struct page
> > >     that can be used for PCI P2P without any migration. In a PCI GPU
> > >     context this means the GPU has mapped its local memory to a PCI
> > >     address. The assumption is that P2P always works and so this
> > >     address can be DMA'd from.
> >
> > So do I understand it correctly, that the driver then needs to set up
> > one device_private struct page and one pcie_p2p struct page for each
> > page of device memory participating in this way?
>
> Yes, for now. I hope to remove the p2p page eventually.
>
> > > If you are just talking about your private multi-path, then that is
> > > already handled..
> >
> > No, the issue I'm having with this is really why would
> > hmm_range_fault() need the new pfn when it could easily be obtained
> > from the device-private pfn by the hmm_range_fault() caller?
>
> That isn't the API of HMM, the caller uses hmm to get PFNs it can use.
>
> Deliberately returning PFNs the caller cannot use is nonsensical to
> it's purpose :)
>
> > So anyway what we'll do is to try to use an interconnect-common owner
> > for now and revisit the problem if that's not sufficient so we can come
> > up with an acceptable solution.
>
> That is the intention for sure. The idea was that the drivers under
> the private pages would somehow generate unique owners for shared
> private interconnect segments.
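
A rough userspace sketch of the "compare owners and also check whether
the interconnect is actually there" idea might look like the following.
Only pgmap->owner and range->dev_private_owner come from this thread;
struct model_pgmap, struct model_range, the link_up flag and classify()
are hypothetical names made up for illustration and are not kernel API.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical outcome of inspecting one device-private page. */
enum page_action {
	USE_PRIVATE_INTERCONNECT,	/* caller can address the page directly */
	MIGRATE_TO_SYSTEM,		/* fall back to migration to system memory */
};

/* Hypothetical stand-in for the parts of a pgmap that matter here. */
struct model_pgmap {
	const void *owner;	/* models pgmap->owner */
	bool link_up;		/* assumption: set once the interconnect is probed */
};

/* Hypothetical stand-in for the caller's range. */
struct model_range {
	const void *dev_private_owner;	/* models range->dev_private_owner */
};

/*
 * Owner match alone is not enough in this model: the interconnect must
 * also be known to be up, otherwise the page is treated as unsuitable.
 */
static enum page_action classify(const struct model_range *range,
				 const struct model_pgmap *pgmap)
{
	if (pgmap->owner != range->dev_private_owner)
		return MIGRATE_TO_SYSTEM;
	if (!pgmap->link_up)
		return MIGRATE_TO_SYSTEM;
	return USE_PRIVATE_INTERCONNECT;
}
```

In this model a page whose owner matches but whose link is not yet up is
handled the same way as a foreign page, which is the case a pure owner
comparison cannot express.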
>
> I wouldn't say this is the end all of the idea, if there are better
> ways to handle accepting private pages they can certainly be
> explored..
>
> Jason

-- 
Simona Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch