linux-mm.kvack.org archive mirror
* RFC: dev_pagemap reference counting
@ 2017-12-05  0:34 Christoph Hellwig
  2017-12-05  0:34 ` [PATCH 1/2] mm: move get_dev_pagemap out of line Christoph Hellwig
  2017-12-05  0:34 ` [PATCH 2/2] mm: fix dev_pagemap reference counting around get_dev_pagemap Christoph Hellwig
  0 siblings, 2 replies; 6+ messages in thread
From: Christoph Hellwig @ 2017-12-05  0:34 UTC (permalink / raw)
  To: dan.j.williams; +Cc: linux-nvdimm, linux-mm

Hi Dan,

maybe I'm missing something, but it seems like we release the reference
to the previously found pgmap before passing it to get_dev_pagemap again.
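
To illustrate, this is roughly the pattern I'm worried about in the gup
fast path helpers (a simplified sketch, not the literal code):

	struct dev_pagemap *pgmap = NULL;

	do {
		/* looks up the pgmap and takes a reference on it */
		pgmap = get_dev_pagemap(pfn, pgmap);
		if (unlikely(!pgmap))
			break;

		get_page(page);
		/* the reference is dropped again inside the loop ... */
		put_dev_pagemap(pgmap);
	} while (addr += PAGE_SIZE, addr != end);
	/*
	 * ... yet the now unreferenced pgmap pointer is passed back into
	 * get_dev_pagemap on the next iteration.
	 */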

Can you check if my findings make sense?



* [PATCH 1/2] mm: move get_dev_pagemap out of line
  2017-12-05  0:34 RFC: dev_pagemap reference counting Christoph Hellwig
@ 2017-12-05  0:34 ` Christoph Hellwig
  2017-12-05  0:34 ` [PATCH 2/2] mm: fix dev_pagemap reference counting around get_dev_pagemap Christoph Hellwig
  1 sibling, 0 replies; 6+ messages in thread
From: Christoph Hellwig @ 2017-12-05  0:34 UTC (permalink / raw)
  To: dan.j.williams; +Cc: linux-nvdimm, linux-mm

This is a pretty big function, which should be out of line in general,
and a no-op stub if CONFIG_ZONE_DEVICE is not set.

Signed-off-by: Christoph Hellwig <hch@lst.de>
---
 include/linux/memremap.h | 42 +++++-------------------------------------
 kernel/memremap.c        | 36 ++++++++++++++++++++++++++++++++++--
 2 files changed, 39 insertions(+), 39 deletions(-)

diff --git a/include/linux/memremap.h b/include/linux/memremap.h
index 10d23c367048..f24e0c71d6a6 100644
--- a/include/linux/memremap.h
+++ b/include/linux/memremap.h
@@ -136,8 +136,8 @@ struct dev_pagemap {
 #ifdef CONFIG_ZONE_DEVICE
 void *devm_memremap_pages(struct device *dev, struct resource *res,
 		struct percpu_ref *ref, struct vmem_altmap *altmap);
-struct dev_pagemap *find_dev_pagemap(resource_size_t phys);
-
+struct dev_pagemap *get_dev_pagemap(unsigned long pfn,
+		struct dev_pagemap *pgmap);
 static inline bool is_zone_device_page(const struct page *page);
 #else
 static inline void *devm_memremap_pages(struct device *dev,
@@ -153,11 +153,12 @@ static inline void *devm_memremap_pages(struct device *dev,
 	return ERR_PTR(-ENXIO);
 }
 
-static inline struct dev_pagemap *find_dev_pagemap(resource_size_t phys)
+static inline struct dev_pagemap *get_dev_pagemap(unsigned long pfn,
+		struct dev_pagemap *pgmap)
 {
 	return NULL;
 }
-#endif
+#endif /* CONFIG_ZONE_DEVICE */
 
 #if defined(CONFIG_DEVICE_PRIVATE) || defined(CONFIG_DEVICE_PUBLIC)
 static inline bool is_device_private_page(const struct page *page)
@@ -173,39 +174,6 @@ static inline bool is_device_public_page(const struct page *page)
 }
 #endif /* CONFIG_DEVICE_PRIVATE || CONFIG_DEVICE_PUBLIC */
 
-/**
- * get_dev_pagemap() - take a new live reference on the dev_pagemap for @pfn
- * @pfn: page frame number to lookup page_map
- * @pgmap: optional known pgmap that already has a reference
- *
- * @pgmap allows the overhead of a lookup to be bypassed when @pfn lands in the
- * same mapping.
- */
-static inline struct dev_pagemap *get_dev_pagemap(unsigned long pfn,
-		struct dev_pagemap *pgmap)
-{
-	const struct resource *res = pgmap ? pgmap->res : NULL;
-	resource_size_t phys = PFN_PHYS(pfn);
-
-	/*
-	 * In the cached case we're already holding a live reference so
-	 * we can simply do a blind increment
-	 */
-	if (res && phys >= res->start && phys <= res->end) {
-		percpu_ref_get(pgmap->ref);
-		return pgmap;
-	}
-
-	/* fall back to slow path lookup */
-	rcu_read_lock();
-	pgmap = find_dev_pagemap(phys);
-	if (pgmap && !percpu_ref_tryget_live(pgmap->ref))
-		pgmap = NULL;
-	rcu_read_unlock();
-
-	return pgmap;
-}
-
 static inline void put_dev_pagemap(struct dev_pagemap *pgmap)
 {
 	if (pgmap)
diff --git a/kernel/memremap.c b/kernel/memremap.c
index 403ab9cdb949..f0b54eca85b0 100644
--- a/kernel/memremap.c
+++ b/kernel/memremap.c
@@ -314,7 +314,7 @@ static void devm_memremap_pages_release(struct device *dev, void *data)
 }
 
 /* assumes rcu_read_lock() held at entry */
-struct dev_pagemap *find_dev_pagemap(resource_size_t phys)
+static struct dev_pagemap *find_dev_pagemap(resource_size_t phys)
 {
 	struct page_map *page_map;
 
@@ -500,8 +500,40 @@ struct vmem_altmap *to_vmem_altmap(unsigned long memmap_start)
 
 	return pgmap ? pgmap->altmap : NULL;
 }
-#endif /* CONFIG_ZONE_DEVICE */
 
+/**
+ * get_dev_pagemap() - take a new live reference on the dev_pagemap for @pfn
+ * @pfn: page frame number to lookup page_map
+ * @pgmap: optional known pgmap that already has a reference
+ *
+ * @pgmap allows the overhead of a lookup to be bypassed when @pfn lands in the
+ * same mapping.
+ */
+struct dev_pagemap *get_dev_pagemap(unsigned long pfn,
+		struct dev_pagemap *pgmap)
+{
+	const struct resource *res = pgmap ? pgmap->res : NULL;
+	resource_size_t phys = PFN_PHYS(pfn);
+
+	/*
+	 * In the cached case we're already holding a live reference so
+	 * we can simply do a blind increment
+	 */
+	if (res && phys >= res->start && phys <= res->end) {
+		percpu_ref_get(pgmap->ref);
+		return pgmap;
+	}
+
+	/* fall back to slow path lookup */
+	rcu_read_lock();
+	pgmap = find_dev_pagemap(phys);
+	if (pgmap && !percpu_ref_tryget_live(pgmap->ref))
+		pgmap = NULL;
+	rcu_read_unlock();
+
+	return pgmap;
+}
+#endif /* CONFIG_ZONE_DEVICE */
 
 #if IS_ENABLED(CONFIG_DEVICE_PRIVATE) ||  IS_ENABLED(CONFIG_DEVICE_PUBLIC)
 void put_zone_device_private_or_public_page(struct page *page)
-- 
2.14.2



* [PATCH 2/2] mm: fix dev_pagemap reference counting around get_dev_pagemap
  2017-12-05  0:34 RFC: dev_pagemap reference counting Christoph Hellwig
  2017-12-05  0:34 ` [PATCH 1/2] mm: move get_dev_pagemap out of line Christoph Hellwig
@ 2017-12-05  0:34 ` Christoph Hellwig
  2017-12-06  2:43   ` Dan Williams
  1 sibling, 1 reply; 6+ messages in thread
From: Christoph Hellwig @ 2017-12-05  0:34 UTC (permalink / raw)
  To: dan.j.williams; +Cc: linux-nvdimm, linux-mm

Both callers of get_dev_pagemap that pass in a pgmap don't actually hold a
reference to the pgmap they pass in, contrary to the comment in the function.

Change the calling convention so that get_dev_pagemap always consumes the
previous reference, instead of relying on an explicit earlier call to
put_dev_pagemap in the callers.

The callers will still need to put the final reference after finishing the
loop over the pages.
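
For illustration, the intended caller pattern after this change is roughly
the following (a sketch only; the real changes are in the gup fast path
below):

	struct dev_pagemap *pgmap = NULL;

	do {
		/* consumes any previous reference and returns a new one */
		pgmap = get_dev_pagemap(pfn, pgmap);
		if (unlikely(!pgmap))
			break;
		get_page(page);
	} while (addr += PAGE_SIZE, addr != end);

	/* put the final reference once the loop is done */
	if (pgmap)
		put_dev_pagemap(pgmap);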

Signed-off-by: Christoph Hellwig <hch@lst.de>
---
 kernel/memremap.c | 17 +++++++++--------
 mm/gup.c          |  7 +++++--
 2 files changed, 14 insertions(+), 10 deletions(-)

diff --git a/kernel/memremap.c b/kernel/memremap.c
index f0b54eca85b0..502fa107a585 100644
--- a/kernel/memremap.c
+++ b/kernel/memremap.c
@@ -506,22 +506,23 @@ struct vmem_altmap *to_vmem_altmap(unsigned long memmap_start)
  * @pfn: page frame number to lookup page_map
  * @pgmap: optional known pgmap that already has a reference
  *
- * @pgmap allows the overhead of a lookup to be bypassed when @pfn lands in the
- * same mapping.
+ * If @pgmap is non-NULL and covers @pfn it will be returned as-is.  If @pgmap
+ * is non-NULL but does not cover @pfn the reference to it will be released.
  */
 struct dev_pagemap *get_dev_pagemap(unsigned long pfn,
 		struct dev_pagemap *pgmap)
 {
-	const struct resource *res = pgmap ? pgmap->res : NULL;
 	resource_size_t phys = PFN_PHYS(pfn);
 
 	/*
-	 * In the cached case we're already holding a live reference so
-	 * we can simply do a blind increment
+	 * In the cached case we're already holding a live reference.
 	 */
-	if (res && phys >= res->start && phys <= res->end) {
-		percpu_ref_get(pgmap->ref);
-		return pgmap;
+	if (pgmap) {
+		const struct resource *res = pgmap ? pgmap->res : NULL;
+
+		if (res && phys >= res->start && phys <= res->end)
+			return pgmap;
+		put_dev_pagemap(pgmap);
 	}
 
 	/* fall back to slow path lookup */
diff --git a/mm/gup.c b/mm/gup.c
index d3fb60e5bfac..9d142eb9e2e9 100644
--- a/mm/gup.c
+++ b/mm/gup.c
@@ -1410,7 +1410,6 @@ static int gup_pte_range(pmd_t pmd, unsigned long addr, unsigned long end,
 
 		VM_BUG_ON_PAGE(compound_head(page) != head, page);
 
-		put_dev_pagemap(pgmap);
 		SetPageReferenced(page);
 		pages[*nr] = page;
 		(*nr)++;
@@ -1420,6 +1419,8 @@ static int gup_pte_range(pmd_t pmd, unsigned long addr, unsigned long end,
 	ret = 1;
 
 pte_unmap:
+	if (pgmap)
+		put_dev_pagemap(pgmap);
 	pte_unmap(ptem);
 	return ret;
 }
@@ -1459,10 +1460,12 @@ static int __gup_device_huge(unsigned long pfn, unsigned long addr,
 		SetPageReferenced(page);
 		pages[*nr] = page;
 		get_page(page);
-		put_dev_pagemap(pgmap);
 		(*nr)++;
 		pfn++;
 	} while (addr += PAGE_SIZE, addr != end);
+
+	if (pgmap)
+		put_dev_pagemap(pgmap);
 	return 1;
 }
 
-- 
2.14.2



* Re: [PATCH 2/2] mm: fix dev_pagemap reference counting around get_dev_pagemap
  2017-12-05  0:34 ` [PATCH 2/2] mm: fix dev_pagemap reference counting around get_dev_pagemap Christoph Hellwig
@ 2017-12-06  2:43   ` Dan Williams
  2017-12-06 22:44     ` Christoph Hellwig
  0 siblings, 1 reply; 6+ messages in thread
From: Dan Williams @ 2017-12-06  2:43 UTC (permalink / raw)
  To: Christoph Hellwig; +Cc: linux-nvdimm, Linux MM

On Mon, Dec 4, 2017 at 4:34 PM, Christoph Hellwig <hch@lst.de> wrote:
> Both callers of get_dev_pagemap that pass in a pgmap don't actually hold a
> reference to the pgmap they pass in, contrary to the comment in the function.
>
> Change the calling convention so that get_dev_pagemap always consumes the
> previous reference, instead of relying on an explicit earlier call to
> put_dev_pagemap in the callers.
>
> The callers will still need to put the final reference after finishing the
> loop over the pages.

I don't think we need this change, but perhaps the reasoning should be
added to the code as a comment... details below.

>
> Signed-off-by: Christoph Hellwig <hch@lst.de>
> ---
>  kernel/memremap.c | 17 +++++++++--------
>  mm/gup.c          |  7 +++++--
>  2 files changed, 14 insertions(+), 10 deletions(-)
>
> diff --git a/kernel/memremap.c b/kernel/memremap.c
> index f0b54eca85b0..502fa107a585 100644
> --- a/kernel/memremap.c
> +++ b/kernel/memremap.c
> @@ -506,22 +506,23 @@ struct vmem_altmap *to_vmem_altmap(unsigned long memmap_start)
>   * @pfn: page frame number to lookup page_map
>   * @pgmap: optional known pgmap that already has a reference
>   *
> - * @pgmap allows the overhead of a lookup to be bypassed when @pfn lands in the
> - * same mapping.
> + * If @pgmap is non-NULL and covers @pfn it will be returned as-is.  If @pgmap
> + * is non-NULL but does not cover @pfn the reference to it will be released.
>   */
>  struct dev_pagemap *get_dev_pagemap(unsigned long pfn,
>                 struct dev_pagemap *pgmap)
>  {
> -       const struct resource *res = pgmap ? pgmap->res : NULL;
>         resource_size_t phys = PFN_PHYS(pfn);
>
>         /*
> -        * In the cached case we're already holding a live reference so
> -        * we can simply do a blind increment
> +        * In the cached case we're already holding a live reference.
>          */
> -       if (res && phys >= res->start && phys <= res->end) {
> -               percpu_ref_get(pgmap->ref);
> -               return pgmap;
> +       if (pgmap) {
> +               const struct resource *res = pgmap ? pgmap->res : NULL;
> +
> +               if (res && phys >= res->start && phys <= res->end)
> +                       return pgmap;
> +               put_dev_pagemap(pgmap);
>         }
>
>         /* fall back to slow path lookup */
> diff --git a/mm/gup.c b/mm/gup.c
> index d3fb60e5bfac..9d142eb9e2e9 100644
> --- a/mm/gup.c
> +++ b/mm/gup.c
> @@ -1410,7 +1410,6 @@ static int gup_pte_range(pmd_t pmd, unsigned long addr, unsigned long end,
>
>                 VM_BUG_ON_PAGE(compound_head(page) != head, page);
>
> -               put_dev_pagemap(pgmap);
>                 SetPageReferenced(page);
>                 pages[*nr] = page;
>                 (*nr)++;
> @@ -1420,6 +1419,8 @@ static int gup_pte_range(pmd_t pmd, unsigned long addr, unsigned long end,
>         ret = 1;
>
>  pte_unmap:
> +       if (pgmap)
> +               put_dev_pagemap(pgmap);
>         pte_unmap(ptem);
>         return ret;
>  }
> @@ -1459,10 +1460,12 @@ static int __gup_device_huge(unsigned long pfn, unsigned long addr,
>                 SetPageReferenced(page);
>                 pages[*nr] = page;
>                 get_page(page);
> -               put_dev_pagemap(pgmap);

It's safe to do the put_dev_pagemap() here because the pgmap cannot be
released until the corresponding put_page() for that get_page() we
just did occurs. So we're only holding the pgmap reference long enough
to take individual page references.
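
In other words the existing loop body is effectively (simplified):

	get_page(page);		/* pins the pgmap until the matching put_page() */
	put_dev_pagemap(pgmap);	/* safe: the pgmap cannot be released while
				 * the page reference above is still held */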

We used to take and put individual pgmap references inside get_page()
/ put_page(), but that got simplified in this commit to just take and
put a single reference at devm_memremap_pages() setup / teardown time:

71389703839e mm, zone_device: Replace {get, put}_zone_device_page()
with a single reference to fix pmem crash



* Re: [PATCH 2/2] mm: fix dev_pagemap reference counting around get_dev_pagemap
  2017-12-06  2:43   ` Dan Williams
@ 2017-12-06 22:44     ` Christoph Hellwig
  2017-12-06 22:52       ` Dan Williams
  0 siblings, 1 reply; 6+ messages in thread
From: Christoph Hellwig @ 2017-12-06 22:44 UTC (permalink / raw)
  To: Dan Williams; +Cc: Christoph Hellwig, linux-nvdimm, Linux MM

On Tue, Dec 05, 2017 at 06:43:36PM -0800, Dan Williams wrote:
> I don't think we need this change, but perhaps the reasoning should be
> added to the code as a comment... details below.

Hmm, looks like we are ok at least.  But even if it's not a correctness
issue, there is no point in decrementing and incrementing the
reference count every time.



* Re: [PATCH 2/2] mm: fix dev_pagemap reference counting around get_dev_pagemap
  2017-12-06 22:44     ` Christoph Hellwig
@ 2017-12-06 22:52       ` Dan Williams
  0 siblings, 0 replies; 6+ messages in thread
From: Dan Williams @ 2017-12-06 22:52 UTC (permalink / raw)
  To: Christoph Hellwig; +Cc: linux-nvdimm, Linux MM

On Wed, Dec 6, 2017 at 2:44 PM, Christoph Hellwig <hch@lst.de> wrote:
> On Tue, Dec 05, 2017 at 06:43:36PM -0800, Dan Williams wrote:
>> I don't think we need this change, but perhaps the reasoning should be
>> added to the code as a comment... details below.
>
> Hmm, looks like we are ok at least.  But even if it's not a correctness
> issue there is no good point in decrementing and incrementing the
> reference count every time.

True, we can take it once and drop it at the end when all the related
page references have been taken.



end of thread

Thread overview: 6+ messages
2017-12-05  0:34 RFC: dev_pagemap reference counting Christoph Hellwig
2017-12-05  0:34 ` [PATCH 1/2] mm: move get_dev_pagemap out of line Christoph Hellwig
2017-12-05  0:34 ` [PATCH 2/2] mm: fix dev_pagemap reference counting around get_dev_pagemap Christoph Hellwig
2017-12-06  2:43   ` Dan Williams
2017-12-06 22:44     ` Christoph Hellwig
2017-12-06 22:52       ` Dan Williams
