Date: Thu, 30 Jan 2025 17:09:44 +0100
From: Simona Vetter
To: Jason Gunthorpe
Cc: Thomas Hellström, Yonatan Maman, kherbst@redhat.com, lyude@redhat.com,
    dakr@redhat.com, airlied@gmail.com, simona@ffwll.ch, leon@kernel.org,
    jglisse@redhat.com, akpm@linux-foundation.org, GalShalom@nvidia.com,
    dri-devel@lists.freedesktop.org, nouveau@lists.freedesktop.org,
    linux-kernel@vger.kernel.org, linux-rdma@vger.kernel.org,
    linux-mm@kvack.org, linux-tegra@vger.kernel.org
Subject: Re: [RFC 1/5] mm/hmm: HMM API to enable P2P DMA for device private pages
In-Reply-To: <20250130132317.GG2120662@ziepe.ca>
References: <7282ac68c47886caa2bc2a2813d41a04adf938e1.camel@linux.intel.com>
    <20250128132034.GA1524382@ziepe.ca> <20250128151610.GC1524382@ziepe.ca>
    <20250128172123.GD1524382@ziepe.ca> <20250129134757.GA2120662@ziepe.ca>
    <20250130132317.GG2120662@ziepe.ca>

On Thu, Jan 30, 2025 at 09:23:17AM -0400, Jason Gunthorpe wrote:
> On Thu, Jan 30, 2025 at 11:50:27AM +0100, Simona Vetter wrote:
> > On Wed, Jan 29, 2025 at 09:47:57AM -0400, Jason Gunthorpe wrote:
> > > On Wed, Jan 29, 2025 at 02:38:58PM +0100, Simona Vetter wrote:
> > >
> > > > > The pgmap->owner doesn't *have* to be fixed, certainly during early boot before
> > > > > you hand out any page references it can be changed. I wouldn't be
> > > > > surprised if this is useful to some requirements to build up the
> > > > > private interconnect topology?
> > > >
> > > > The trouble I'm seeing is device probe and the fundamental issue that you
> > > > never know when you're done. And so if we entirely rely on pgmap->owner to
> > > > figure out the driver private interconnect topology, that's going to be
> > > > messy. That's why I'm also leaning towards both comparing owners and
> > > > having an additional check whether the interconnect is actually there or
> > > > not yet.
> > >
> > > Honestly, I'd rather invest more effort into being able to update
> > > owner for those special corner cases than to slow down the fast path
> > > in hmm_range_fault..
> >
> > I'm not sure how you want to make the owner mutable.
>
> You'd probably have to use a system where you never free them until
> all the page maps are destroyed.
>
> You could also use an integer instead of a pointer to indicate the
> cluster of interconnect, I think there are many options..
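
[Illustration: a minimal userspace sketch of the "integer instead of a
pointer" idea quoted above, assuming IDs come from a monotonically
increasing atomic counter and are never reused. Nothing here is kernel
API: struct pagemap_sketch, cluster_id_alloc(), same_cluster() and
CLUSTER_ID_NONE are hypothetical names made up for this example, and the
code uses C11 <stdatomic.h> rather than the kernel's atomic helpers.]

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical: 0 is reserved to mean "not part of any cluster". */
#define CLUSTER_ID_NONE 0u

/* Next ID to hand out. IDs are never returned to a pool, so a stale ID
 * held by an old page map can never alias a newer cluster. */
static _Atomic uint32_t next_cluster_id = 1;

/* Hypothetical stand-in for a page map that carries an integer cluster
 * ID where a real dev_pagemap carries an owner pointer. */
struct pagemap_sketch {
	uint32_t cluster_id;
};

/* Allocate a fresh cluster ID. */
static uint32_t cluster_id_alloc(void)
{
	uint32_t id = atomic_fetch_add(&next_cluster_id, 1);

	/* Trips only once the 32-bit counter has wrapped around. */
	assert(id != CLUSTER_ID_NONE);
	return id;
}

/* Two page maps share a private interconnect only if both carry the same
 * valid cluster ID. */
static bool same_cluster(const struct pagemap_sketch *a,
			 const struct pagemap_sketch *b)
{
	return a->cluster_id != CLUSTER_ID_NONE &&
	       a->cluster_id == b->cluster_id;
}
```

[Because an ID is only ever compared by value, nothing has to stay
allocated for the comparison to remain safe. The sketch does not address
how a changed ID would be published to concurrent readers.]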

Hm yeah I guess an integer allocator of the atomic_inc kind plus "surely
32bit is enough" could work. But I don't think it's needed, if we can
reliably just unregister the entire dev_pagemap and then just set up a new
one. Plus that avoids thinking about which barriers we might need where
exactly all over mm code that looks at the owner field.

> > And I've looked at the lifetime fun of unregistering a dev_pagemap for
> > device hotunplug and pretty firmly concluded it's unfixable and that I
> > should run away to do something else :-P
>
> ? It is supposed to work, it blocks until all the pages are freed, but
> AFAIK there is no fundamental life time issue. The driver is
> responsible to free all its usage.

Hm I looked at it again, and I guess with the fixes to make migration to
system memory work reliably in Matt Brost's latest series it should indeed
work reliably. The devm_ version still freaks me out because of how easily
people misuse these for things that are memory allocations.

> > An optional callback here (or redoing hmm_range_fault or whacking the
> > migration helpers a few times) looks a lot less scary to me than
> > making pgmap->owner mutable in some fashion.
>
> It's extra for every single 4k page on every user :\
>
> And what are you going to do better inside this callback?

Having more comfy illusions :-P

Slightly more seriously, I can grab some locks and make life easier,
whereas sprinkling locking or even barriers over pgmap->owner in core mm
is not going to fly. Plus more flexibility, e.g. when the interconnect
doesn't work for atomics or, for some other funny reason, only works for
some of the traffic, so you need to decide more dynamically where memory
is ok to sit.

Or checking p2pdma connectivity and all that stuff.

But we can also do all that stuff by checking afterwards or migrating
memory around as needed. At least for drivers who cooperate and all set
the same owner, which I think is Thomas' current plan.
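
[Illustration: a minimal userspace sketch of the two-step check discussed
in this thread, i.e. compare owners first and only then ask an optional
per-driver callback whether the interconnect is actually usable. All names
(struct pagemap_sketch, interconnect_ok, can_access_directly,
link_state_check, link_up) are hypothetical and exist only for this
example; this is not the hmm_range_fault code and not a proposed API.]

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct pagemap_sketch;

/* Hypothetical optional hook: returns true if the private interconnect
 * between the two page maps is usable right now. A driver could take its
 * own locks in here. */
typedef bool (*interconnect_ok_fn)(const struct pagemap_sketch *src,
				   const struct pagemap_sketch *dst);

/* Hypothetical stand-in for a page map with an owner cookie. */
struct pagemap_sketch {
	const void *owner;                  /* compared by address only */
	interconnect_ok_fn interconnect_ok; /* may be NULL */
	bool link_up;                       /* toy driver state for the hook */
};

/* Decide whether src may access dst's memory directly. */
static bool can_access_directly(const struct pagemap_sketch *src,
				const struct pagemap_sketch *dst)
{
	assert(src != NULL && dst != NULL);

	/* Cheap check first: different owners never share an interconnect. */
	if (src->owner == NULL || src->owner != dst->owner)
		return false;

	/* No hook registered: an owner match is considered sufficient. */
	if (src->interconnect_ok == NULL)
		return true;

	/* Slow path: let the driver decide. */
	return src->interconnect_ok(src, dst);
}

/* Toy hook: the link is usable only if both ends report it as up. */
static bool link_state_check(const struct pagemap_sketch *src,
			     const struct pagemap_sketch *dst)
{
	return src->link_up && dst->link_up;
}
```

[The hook is only reached after the owner comparison succeeds, so page
maps without a hook pay for a pointer compare and a NULL test. Whether
even that is acceptable per 4k page is exactly the objection quoted
above.]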

Also note that fundamentally you can't protect against the hotunplug or
driver unload case for hardware access. So some access will go to nowhere
when that happens, until we've torn down all the mappings and migrated
memory out.
-Sima
-- 
Simona Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch