Date: Fri, 31 Jan 2025 17:59:26 +0100
From: Simona Vetter
To: Jason Gunthorpe
Cc: Thomas Hellström, Yonatan Maman, kherbst@redhat.com, lyude@redhat.com,
	dakr@redhat.com, airlied@gmail.com, simona@ffwll.ch, leon@kernel.org,
	jglisse@redhat.com, akpm@linux-foundation.org, GalShalom@nvidia.com,
	dri-devel@lists.freedesktop.org, nouveau@lists.freedesktop.org,
	linux-kernel@vger.kernel.org, linux-rdma@vger.kernel.org,
	linux-mm@kvack.org, linux-tegra@vger.kernel.org
Subject: Re: [RFC 1/5] mm/hmm: HMM API to enable P2P DMA for device private pages
References: <20250128151610.GC1524382@ziepe.ca>
	<20250128172123.GD1524382@ziepe.ca>
	<20250129134757.GA2120662@ziepe.ca>
	<20250130132317.GG2120662@ziepe.ca>
	<20250130174217.GA2296753@ziepe.ca>
In-Reply-To: <20250130174217.GA2296753@ziepe.ca>

On Thu, Jan 30, 2025 at 01:42:17PM -0400, Jason Gunthorpe wrote:
> On Thu, Jan 30, 2025 at 05:09:44PM +0100, Simona Vetter wrote:
> > > > An optional callback is a lot less scary to me here (or redoing
> > > > hmm_range_fault or whacking the migration helpers a few times) looks a lot
> > > > less scary than making pgmap->owner mutable in some fashion.
> > >
> > > It extra for every single 4k page on every user :\
> > >
> > > And what are you going to do better inside this callback?
> >
> > Having more comfy illusions :-P
>
> Exactly!
>
> > Slightly more seriously, I can grab some locks and make life easier,
>
> Yes, but then see my concern about performance again. Now you are
> locking/unlocking every 4k? And then it still races since it can
> change after hmm_range_fault returns. That's not small, and not really
> better.

Hm yeah, I think that's the killer argument against the callback. Consider
me convinced that it's a bad idea.

> > whereas sprinkling locking or even barriers over pgmap->owner in core mm
> > is not going to fly. Plus more flexibility, e.g. when the interconnect
> > doesn't work for atomics or some other funny reason it only works for some
> > of the traffic, where you need to more dynamically decide where memory is
> > ok to sit.
>
> Sure, an asymmetric mess could be problematic, and we might need more
> later, but lets get to that first..
>
> > Or checking p2pdma connectivity and all that stuff.
>
> Should be done in the core code, don't want drivers open coding this
> stuff.

Yeah, so after amdkfd and nouveau I agree that letting drivers mess this up
isn't great. But see below, I'm not sure whether going all the way to core
code is the right approach, at least for gpu-internal needs.

> > Also note that fundamentally you can't protect against the hotunplug or
> > driver unload case for hardware access. So some access will go to nowhere
> > when that happens, until we've torn down all the mappings and migrated
> > memory out.
>
> I think a literal (safe!) hot unplug must always use the page map
> revoke, and that should be safely locked between hmm_range_fault and
> the notifier.
>
> If the underlying fabric is loosing operations during an unplanned hot
> unplug I expect things will need resets to recover..

So one aspect where I don't like the pgmap->owner approach much is that
it's a big thing to get right, and it feels to me a bit like we don't yet
know the right questions.

A bit related is that we'll have to do some driver-specific migration
after hmm_range_fault anyway for allocation policies. With a coherent
interconnect that'd be up to numactl, but for driver-private memory it's
all up to the driver. And once we have that, we can also migrate memory
around that's misplaced for functional and not just performance reasons.

The plan I discussed with Thomas a while back, at least for gpus, was to
have that as a drm_devpagemap library, which would have a common owner (or
maybe one per driver or so, as Thomas suggested). Then it would still not
be in drivers, but it would also be a bit easier to mess around with for
experiments. And once we have some clear things that hmm_range_fault
should do instead for everyone, we can lift them up.

Doing this at a pagemap level should also be much more efficient, since I
think we can make the assumption that access limitations are uniform for a
given dev_pagemap (and if they're not, e.g. because not the entire vram is
bus visible, drivers can handle that by splitting things up).

But speccing all this out upfront doesn't seem like a good idea to me,
because I honestly don't know what all we need.

Cheers, Sima
-- 
Simona Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch
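
To make the pagemap-level efficiency argument concrete, here is a rough userspace sketch, not kernel code. It assumes, as suggested above, that access limitations are uniform per pagemap, so the "is this memory ok where it sits?" decision only has to be re-evaluated when the pagemap changes between consecutive pages rather than for every 4k page. All names (toy_pagemap, toy_page, toy_count_misplaced, peer_reachable) are hypothetical and exist only for this illustration; only the idea of an owner cookie is borrowed from pgmap->owner.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical userspace model; none of these names are kernel API. */
struct toy_pagemap {
	const void *owner;	/* stands in for pgmap->owner */
	bool peer_reachable;	/* assumed uniform for the whole pagemap */
};

struct toy_page {
	const struct toy_pagemap *pgmap;	/* NULL models system memory */
};

/*
 * Count the pages that would need migrating because the accessing device
 * cannot use them in place. The decision is re-evaluated only when the
 * pagemap differs from the previous page's pagemap; *decisions reports
 * how often that happened.
 */
static size_t toy_count_misplaced(const struct toy_page *pages, size_t n,
				  const void *owner, size_t *decisions)
{
	const struct toy_pagemap *last = NULL;
	bool ok = true;
	size_t misplaced = 0;

	*decisions = 0;
	for (size_t i = 0; i < n; i++) {
		if (i == 0 || pages[i].pgmap != last) {
			last = pages[i].pgmap;
			/* System memory (no pagemap) is treated as always fine. */
			ok = !last ||
			     (last->owner == owner && last->peer_reachable);
			(*decisions)++;
		}
		if (!ok)
			misplaced++;
	}
	return misplaced;
}
```

For a large range backed by a handful of pagemaps the number of decisions tracks the number of pagemap transitions, not the number of pages, which is the property a per-page callback would give up.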