linux-mm.kvack.org archive mirror
From: Logan Gunthorpe <logang@deltatee.com>
To: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: linux-kernel@vger.kernel.org, linux-nvme@lists.infradead.org,
	linux-block@vger.kernel.org, linux-pci@vger.kernel.org,
	linux-mm@kvack.org, "Christoph Hellwig" <hch@lst.de>,
	"Dan Williams" <dan.j.williams@intel.com>,
	"Jason Gunthorpe" <jgg@ziepe.ca>,
	"Christian König" <christian.koenig@amd.com>,
	"John Hubbard" <jhubbard@nvidia.com>,
	"Don Dutile" <ddutile@redhat.com>,
	"Matthew Wilcox" <willy@infradead.org>,
	"Daniel Vetter" <daniel.vetter@ffwll.ch>,
	"Minturn Dave B" <dave.b.minturn@intel.com>,
	"Jason Ekstrand" <jason@jlekstrand.net>,
	"Dave Hansen" <dave.hansen@linux.intel.com>,
	"Xiong Jianxin" <jianxin.xiong@intel.com>,
	"Bjorn Helgaas" <helgaas@kernel.org>,
	"Ira Weiny" <ira.weiny@intel.com>,
	"Robin Murphy" <robin.murphy@arm.com>,
	"Martin Oliveira" <martin.oliveira@eideticom.com>,
	"Chaitanya Kulkarni" <ckulkarnilinux@gmail.com>,
	"Ralph Campbell" <rcampbell@nvidia.com>,
	"Stephen Bates" <sbates@raithlin.com>
Subject: Re: [PATCH v9 7/8] PCI/P2PDMA: Allow userspace VMA allocations through sysfs
Date: Fri, 2 Sep 2022 12:46:54 -0600
Message-ID: <db8cd049-c78b-1aa0-dcd0-0feb8c6cb25c@deltatee.com>
In-Reply-To: <YxGad5h2Nn/Ejslc@kroah.com>



On 2022-09-01 23:53, Greg Kroah-Hartman wrote:
> On Thu, Sep 01, 2022 at 01:16:54PM -0600, Logan Gunthorpe wrote:
>> This surprises me. Can you elaborate on this classic issue?
> 
> There's long threads about it on the ksummit discuss mailing list and
> other places.

I've managed to find one such thread dealing with lifetime issues of
different objects and the common bugs that come from mistakes in its usage.
I've dealt with similar issues in the past, but as best as I can see
there are no lifetime issues in this code.

> I have never used devm_add_action_or_reset() so I can't say why it is
> there.  I am just pointing out that manually messing with a sysfs group
> from a driver is a huge flag that something is wrong.  A driver should
> almost never be touching a raw kobject or calling any sysfs_* call if
> all is normal, which is why I questioned this.

In this case we need to remove the specific sysfs file to tear down any
vmas earlier in the remove sequence than would normally happen. Whether
we do that through devm or remove() doesn't change the fact that we need
to access dev->kobj to do it early.
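
The ordering argument can be illustrated with a small userspace model.
This is only a sketch, not kernel code: action_stack, add_action(),
release_all() and the two teardown_*() helpers are hypothetical names
invented for illustration. It assumes the devres behaviour that
registered actions run in reverse order of registration, and that they
run after the remove path has finished. Under those assumptions both
approaches unmap the userspace mappings before the pages are released:

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical userspace model of a devm-style action list; not kernel code. */
#define MAX_ACTIONS 8

typedef void (*action_fn)(void *data);

struct action_stack {
	action_fn fn[MAX_ACTIONS];
	void *data[MAX_ACTIONS];
	int count;
};

/* Register a cleanup action; returns 0 on success, -1 if the stack is full. */
static int add_action(struct action_stack *s, action_fn fn, void *data)
{
	if (s->count >= MAX_ACTIONS)
		return -1;
	s->fn[s->count] = fn;
	s->data[s->count] = data;
	s->count++;
	return 0;
}

/* Run the actions in reverse registration order (last added, first run). */
static void release_all(struct action_stack *s)
{
	while (s->count > 0) {
		s->count--;
		assert(s->fn[s->count] != NULL);
		s->fn[s->count](s->data[s->count]);
	}
}

/* Each action appends its name to a log so the order can be inspected. */
struct log {
	char buf[128];
};

static void log_append(struct log *l, const char *name)
{
	size_t used = strlen(l->buf);

	snprintf(l->buf + used, sizeof(l->buf) - used, "%s%s",
		 used ? "," : "", name);
}

static void release_pages(void *data)
{
	log_append(data, "release_pages");
}

static void unmap_mappings(void *data)
{
	log_append(data, "unmap_mappings");
}

/* devm variant: the unmap action is registered last, so it runs first. */
static void teardown_with_devm(struct log *l)
{
	struct action_stack s = { .count = 0 };

	add_action(&s, release_pages, l);  /* stands in for the pages release */
	add_action(&s, unmap_mappings, l);
	release_all(&s);
}

/* remove() variant: unmap is called explicitly before the release step. */
static void teardown_with_remove(struct log *l)
{
	struct action_stack s = { .count = 0 };

	add_action(&s, release_pages, l);
	unmap_mappings(l);  /* explicit call from the remove path */
	release_all(&s);    /* modelled as running after remove */
}
```

Calling either teardown helper on a zero-initialized struct log leaves
the same sequence in buf, which is the point: the choice between the two
is about where the call lives, not about the resulting order.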

>> But if it's that important I can make the change to these patches for v10.
> 
> Try it the way I suggest, with a remove() callback, and see if that
> looks simpler and easier to follow and maintain over time.

See the diff at the bottom of this email. I can apply it on top of this
patch, but IMO it is neither easier to follow nor easier to maintain.
Unless you have a different suggestion...

Thanks,

Logan

--

diff --git a/drivers/pci/p2pdma.c b/drivers/pci/p2pdma.c
index a6ed6bbca214..4e1211a2a6cd 100644
--- a/drivers/pci/p2pdma.c
+++ b/drivers/pci/p2pdma.c
@@ -206,6 +206,23 @@ static const struct dev_pagemap_ops p2pdma_pgmap_ops = {
 	.page_free = p2pdma_page_free,
 };
 
+void pci_p2pdma_remove(struct pci_dev *pdev)
+{
+	if (!rcu_access_pointer(pdev->p2pdma))
+		return;
+
+	/*
+	 * Any userspace mappings must be unmapped before the
+	 * devm_memremap_pages() release happens, otherwise a device remove
+	 * will hang on any processes that have pages mapped. To avoid this,
+	 * remove the alloc attribute from sysfs which will call
+	 * unmap_mapping_range() on the inode and teardown any existing
+	 * userspace mappings.
+	 */
+	sysfs_remove_file_from_group(&pdev->dev.kobj, &p2pmem_alloc_attr.attr,
+				     p2pmem_group.name);
+}
+
 static void pci_p2pdma_release(void *data)
 {
 	struct pci_dev *pdev = data;
@@ -257,19 +274,6 @@ static int pci_p2pdma_setup(struct pci_dev *pdev)
 	return error;
 }
 
-static void pci_p2pdma_unmap_mappings(void *data)
-{
-	struct pci_dev *pdev = data;
-
-	/*
-	 * Removing the alloc attribute from sysfs will call
-	 * unmap_mapping_range() on the inode, teardown any existing userspace
-	 * mappings and prevent new ones from being created.
-	 */
-	sysfs_remove_file_from_group(&pdev->dev.kobj, &p2pmem_alloc_attr.attr,
-				     p2pmem_group.name);
-}
-
 /**
  * pci_p2pdma_add_resource - add memory for use as p2p memory
  * @pdev: the device to add the memory to
@@ -328,11 +332,6 @@ int pci_p2pdma_add_resource(struct pci_dev *pdev, int bar, size_t size,
 		goto pgmap_free;
 	}
 
-	error = devm_add_action_or_reset(&pdev->dev, pci_p2pdma_unmap_mappings,
-					 pdev);
-	if (error)
-		goto pages_free;
-
 	p2pdma = rcu_dereference_protected(pdev->p2pdma, 1);
 	error = gen_pool_add_owner(p2pdma->pool, (unsigned long)addr,
 			pci_bus_address(pdev, bar) + offset,
diff --git a/drivers/pci/pci-driver.c b/drivers/pci/pci-driver.c
index 49238ddd39ee..a096f2723eac 100644
--- a/drivers/pci/pci-driver.c
+++ b/drivers/pci/pci-driver.c
@@ -471,6 +471,8 @@ static void pci_device_remove(struct device *dev)
 	struct pci_dev *pci_dev = to_pci_dev(dev);
 	struct pci_driver *drv = pci_dev->driver;
 
+	pci_p2pdma_remove(pci_dev);
+
 	if (drv->remove) {
 		pm_runtime_get_sync(dev);
 		drv->remove(pci_dev);
diff --git a/drivers/pci/pci.h b/drivers/pci/pci.h
index 785f31086313..1c5c901a2fcc 100644
--- a/drivers/pci/pci.h
+++ b/drivers/pci/pci.h
@@ -774,4 +774,12 @@ static inline pci_power_t mid_pci_get_power_state(struct pci_dev *pdev)
 }
 #endif
 
+#ifdef CONFIG_PCI_P2PDMA
+void pci_p2pdma_remove(struct pci_dev *dev);
+#else
+static inline void pci_p2pdma_remove(struct pci_dev *dev)
+{
+}
+#endif
+
 #endif /* DRIVERS_PCI_H */









