Date: Thu, 9 May 2024 14:42:04 -0300
From: Jason Gunthorpe
To: Logan Gunthorpe
Cc: Martin Oliveira, Christoph Hellwig, Dan Williams, LKML, "linux-rdma@vger.kernel.org", "linux-mm@kvack.org"
Subject: Re: P2PDMA in Userspace RDMA
Message-ID: <20240509174204.GV4718@ziepe.ca>

On Thu, May 09, 2024 at 11:31:33AM -0600, Logan Gunthorpe wrote:
> Hi Jason,
>
> We've become interested again in enabling P2PDMA transactions with
> userspace RDMA and the NVMe CMBs we are already exporting to userspace
> from our previous work.
>
> Enabling FOLL_PCI_P2PDMA in ib_umem_get is almost a trivial change, but
> there are two issues holding us back.
>
> The biggest issue is that we disallowed FOLL_LONGTERM with
> FOLL_PCI_P2PDMA out of concern that P2PDMA had the same problem as
> fs-dax.

Yeah, it was not a great outcome of that issue.

> See [1] to review the discussion from 2 years ago. However, in
> trying to understand the problem again, I'm not sure that concern was
> valid. In P2PDMA, unmap_mapping_range() is strictly only called on
> driver unbind when everything is getting torn down[2]. The next thing
> that happens immediately after the unmap is the tear down of the pgmap
> which drops the elevated reference on the pages and waits for all pages'
> reference counts to go back to zero. This will effectively wait until
> all longterm pins involving the memory have been released. This can
> cause a hang on unbind but, in your words, it's "annoying not critical".

Yes

But you are looking at the code as it is right now, and stuff has been
quietly fixed with the pgmap refcount area since. I think it is
probably good now.

IIRC it was pushed over the finish line when the ZONE_DEVICE/PRIVATE
pages were converted to have normal reference counting. If p2p is
following the new ZONE_DEVICE scheme then it should be fine.

It would be good to read over Alistair's latest series fixing up fsdax
refcounts to see if anything pops out as problematic specifically with
the P2P case. Otherwise a careful check through is probably all that
is needed.

> The other issue we hit when enabling this feature is the check for
> vma_needs_dirty_tracking() in writable_file_mapping_allowed() during the
> gup flow. This hits because the p2pdma code is using the common
> sysfs/kernfs infrastructure to create the VMA which installs a
> page_mkwrite operator()[4] to change the file update time on write.

Ah.
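
To make the failing check concrete, here is a rough userspace model of
the logic described above. It is only a sketch and not the kernel code:
every model_* / MODEL_* name and the flag values are hypothetical, and
it assumes the check reduces to "long-term pin on a shared writable VMA
whose vm_ops install page_mkwrite". The real helpers may consider more
than that (writeback-capable filesystems, for example), so check the
tree before relying on it.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical flag values for this model; the kernel's values differ. */
#define MODEL_FOLL_PIN      0x1u
#define MODEL_FOLL_LONGTERM 0x2u

/* Stand-in for the vm_ops of a mapping; only the hook of interest. */
struct model_vm_ops {
    int (*page_mkwrite)(void); /* non-NULL: file wants notifying on write */
};

/* Stand-in for a VMA: sharing/permission bits plus its ops. */
struct model_vma {
    bool shared;
    bool writable;
    const struct model_vm_ops *ops;
};

/* Models a page_mkwrite hook that would update the file time. */
int model_touch_mtime(void)
{
    return 0;
}

/* Assumed rule: shared + writable + page_mkwrite => dirty tracking. */
bool model_needs_dirty_tracking(const struct model_vma *vma)
{
    if (!(vma->shared && vma->writable))
        return false;
    return vma->ops != NULL && vma->ops->page_mkwrite != NULL;
}

/* Assumed rule: only a long-term pin is refused, and only when the
 * VMA needs dirty tracking. */
bool model_writable_file_mapping_allowed(const struct model_vma *vma,
                                         unsigned int gup_flags)
{
    const unsigned int longterm_pin = MODEL_FOLL_PIN | MODEL_FOLL_LONGTERM;

    if ((gup_flags & longterm_pin) != longterm_pin)
        return true;
    return !model_needs_dirty_tracking(vma);
}

/* Two example mappings: one with a page_mkwrite hook, one without. */
const struct model_vm_ops model_ops_with_mkwrite = {
    .page_mkwrite = model_touch_mtime,
};
const struct model_vm_ops model_ops_without_mkwrite = {
    .page_mkwrite = NULL,
};
const struct model_vma model_vma_with_mkwrite = {
    true, true, &model_ops_with_mkwrite,
};
const struct model_vma model_vma_without_mkwrite = {
    true, true, &model_ops_without_mkwrite,
};
```

In this model a long-term pin on model_vma_with_mkwrite is refused,
while the same pin on model_vma_without_mkwrite is allowed, and a
short-term pin is allowed on either. That is why dropping the mkwrite
hook from the p2pdma mapping would get past the check.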

> I don't think this feature really makes any sense for the P2PDMA
> sysfs file which is really operating as an allocator in userspace --
> the time on the file does not really need to reflect the last write
> of some process that wrote to memory allocated using it.

Right, you shouldn't have mkwrite for these pages.

Jason