linux-mm.kvack.org archive mirror
 help / color / mirror / Atom feed
* [PATCH v2 0/2] mm: remove total_mapcount()
@ 2024-02-26 14:13 David Hildenbrand
  2024-02-26 14:13 ` [PATCH v2 1/2] mm/memfd: refactor memfd_tag_pins() and memfd_wait_for_pins() David Hildenbrand
  2024-02-26 14:13 ` [PATCH v2 2/2] mm: remove total_mapcount() David Hildenbrand
  0 siblings, 2 replies; 7+ messages in thread
From: David Hildenbrand @ 2024-02-26 14:13 UTC (permalink / raw)
  To: linux-kernel; +Cc: linux-mm, David Hildenbrand, Andrew Morton, Matthew Wilcox

Let's remove the remaining user from mm/memfd.c so we can get rid of
total_mapcount().

v1 -> v2:
* Adjust mm/memfd.c separately, and as suggested by Willy, clean it up
  properly.

Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Matthew Wilcox (Oracle) <willy@infradead.org>

David Hildenbrand (2):
  mm/memfd: refactor memfd_tag_pins() and memfd_wait_for_pins()
  mm: remove total_mapcount()

 include/linux/mm.h |  9 +--------
 mm/memfd.c         | 47 ++++++++++++++++++----------------------------
 2 files changed, 19 insertions(+), 37 deletions(-)

-- 
2.43.2



^ permalink raw reply	[flat|nested] 7+ messages in thread

* [PATCH v2 1/2] mm/memfd: refactor memfd_tag_pins() and memfd_wait_for_pins()
  2024-02-26 14:13 [PATCH v2 0/2] mm: remove total_mapcount() David Hildenbrand
@ 2024-02-26 14:13 ` David Hildenbrand
  2024-02-26 16:07   ` Matthew Wilcox
  2024-02-26 14:13 ` [PATCH v2 2/2] mm: remove total_mapcount() David Hildenbrand
  1 sibling, 1 reply; 7+ messages in thread
From: David Hildenbrand @ 2024-02-26 14:13 UTC (permalink / raw)
  To: linux-kernel; +Cc: linux-mm, David Hildenbrand, Andrew Morton, Matthew Wilcox

Both functions are the remaining users of total_mapcount(). Let's get
rid of the calls by converting the code to folios.

As it turns out, the code is unnecessarily complicated, especially:

1) We can query the number of pagecache references for a folio simply via
   folio_nr_pages(). This will handle other folio sizes in the future
   correctly.

2) The xas_set(xas, page->index + cache_count) call to increment the
   iterator for large folios is not required. Remove it.

Further, simplify the XA_CHECK_SCHED check, counting each entry exactly
once.

Memfd pages can be swapped out when using shmem; leave xa_is_value()
checks in place.

Co-developed-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Signed-off-by: David Hildenbrand <david@redhat.com>
---
 mm/memfd.c | 47 ++++++++++++++++++-----------------------------
 1 file changed, 18 insertions(+), 29 deletions(-)

diff --git a/mm/memfd.c b/mm/memfd.c
index d3a1ba4208c90..7d8d3ab3fa378 100644
--- a/mm/memfd.c
+++ b/mm/memfd.c
@@ -29,29 +29,25 @@
 #define MEMFD_TAG_PINNED        PAGECACHE_TAG_TOWRITE
 #define LAST_SCAN               4       /* about 150ms max */
 
+static bool memfd_folio_has_extra_refs(struct folio *folio)
+{
+	return folio_ref_count(folio) - folio_mapcount(folio) !=
+	       folio_nr_pages(folio);
+}
+
 static void memfd_tag_pins(struct xa_state *xas)
 {
-	struct page *page;
+	struct folio *folio;
 	int latency = 0;
-	int cache_count;
 
 	lru_add_drain();
 
 	xas_lock_irq(xas);
-	xas_for_each(xas, page, ULONG_MAX) {
-		cache_count = 1;
-		if (!xa_is_value(page) &&
-		    PageTransHuge(page) && !PageHuge(page))
-			cache_count = HPAGE_PMD_NR;
-
-		if (!xa_is_value(page) &&
-		    page_count(page) - total_mapcount(page) != cache_count)
+	xas_for_each(xas, folio, ULONG_MAX) {
+		if (!xa_is_value(folio) && memfd_folio_has_extra_refs(folio))
 			xas_set_mark(xas, MEMFD_TAG_PINNED);
-		if (cache_count != 1)
-			xas_set(xas, page->index + cache_count);
 
-		latency += cache_count;
-		if (latency < XA_CHECK_SCHED)
+		if (++latency < XA_CHECK_SCHED)
 			continue;
 		latency = 0;
 
@@ -66,16 +62,16 @@ static void memfd_tag_pins(struct xa_state *xas)
 /*
  * Setting SEAL_WRITE requires us to verify there's no pending writer. However,
  * via get_user_pages(), drivers might have some pending I/O without any active
- * user-space mappings (eg., direct-IO, AIO). Therefore, we look at all pages
+ * user-space mappings (eg., direct-IO, AIO). Therefore, we look at all folios
  * and see whether it has an elevated ref-count. If so, we tag them and wait for
  * them to be dropped.
  * The caller must guarantee that no new user will acquire writable references
- * to those pages to avoid races.
+ * to those folios to avoid races.
  */
 static int memfd_wait_for_pins(struct address_space *mapping)
 {
 	XA_STATE(xas, &mapping->i_pages, 0);
-	struct page *page;
+	struct folio *folio;
 	int error, scan;
 
 	memfd_tag_pins(&xas);
@@ -83,7 +79,6 @@ static int memfd_wait_for_pins(struct address_space *mapping)
 	error = 0;
 	for (scan = 0; scan <= LAST_SCAN; scan++) {
 		int latency = 0;
-		int cache_count;
 
 		if (!xas_marked(&xas, MEMFD_TAG_PINNED))
 			break;
@@ -95,20 +90,15 @@ static int memfd_wait_for_pins(struct address_space *mapping)
 
 		xas_set(&xas, 0);
 		xas_lock_irq(&xas);
-		xas_for_each_marked(&xas, page, ULONG_MAX, MEMFD_TAG_PINNED) {
+		xas_for_each_marked(&xas, folio, ULONG_MAX, MEMFD_TAG_PINNED) {
 			bool clear = true;
 
-			cache_count = 1;
-			if (!xa_is_value(page) &&
-			    PageTransHuge(page) && !PageHuge(page))
-				cache_count = HPAGE_PMD_NR;
-
-			if (!xa_is_value(page) && cache_count !=
-			    page_count(page) - total_mapcount(page)) {
+			if (!xa_is_value(folio) &&
+			    memfd_folio_has_extra_refs(folio)) {
 				/*
 				 * On the last scan, we clean up all those tags
 				 * we inserted; but make a note that we still
-				 * found pages pinned.
+				 * found folios pinned.
 				 */
 				if (scan == LAST_SCAN)
 					error = -EBUSY;
@@ -118,8 +108,7 @@ static int memfd_wait_for_pins(struct address_space *mapping)
 			if (clear)
 				xas_clear_mark(&xas, MEMFD_TAG_PINNED);
 
-			latency += cache_count;
-			if (latency < XA_CHECK_SCHED)
+			if (++latency < XA_CHECK_SCHED)
 				continue;
 			latency = 0;
 
-- 
2.43.2



^ permalink raw reply	[flat|nested] 7+ messages in thread

* [PATCH v2 2/2] mm: remove total_mapcount()
  2024-02-26 14:13 [PATCH v2 0/2] mm: remove total_mapcount() David Hildenbrand
  2024-02-26 14:13 ` [PATCH v2 1/2] mm/memfd: refactor memfd_tag_pins() and memfd_wait_for_pins() David Hildenbrand
@ 2024-02-26 14:13 ` David Hildenbrand
  2024-02-26 16:09   ` Matthew Wilcox
  1 sibling, 1 reply; 7+ messages in thread
From: David Hildenbrand @ 2024-02-26 14:13 UTC (permalink / raw)
  To: linux-kernel; +Cc: linux-mm, David Hildenbrand, Andrew Morton, Matthew Wilcox

All users of total_mapcount() are gone, let's remove it.

Signed-off-by: David Hildenbrand <david@redhat.com>
---
 include/linux/mm.h | 9 +--------
 1 file changed, 1 insertion(+), 8 deletions(-)

diff --git a/include/linux/mm.h b/include/linux/mm.h
index 6f4825d829656..49e22a2f6cccc 100644
--- a/include/linux/mm.h
+++ b/include/linux/mm.h
@@ -1183,7 +1183,7 @@ static inline int is_vmalloc_or_module_addr(const void *x)
  * How many times the entire folio is mapped as a single unit (eg by a
  * PMD or PUD entry).  This is probably not what you want, except for
  * debugging purposes - it does not include PTE-mapped sub-pages; look
- * at folio_mapcount() or page_mapcount() or total_mapcount() instead.
+ * at folio_mapcount() or page_mapcount() instead.
  */
 static inline int folio_entire_mapcount(struct folio *folio)
 {
@@ -1243,13 +1243,6 @@ static inline int folio_mapcount(struct folio *folio)
 	return folio_total_mapcount(folio);
 }
 
-static inline int total_mapcount(struct page *page)
-{
-	if (likely(!PageCompound(page)))
-		return atomic_read(&page->_mapcount) + 1;
-	return folio_total_mapcount(page_folio(page));
-}
-
 static inline bool folio_large_is_mapped(struct folio *folio)
 {
 	/*
-- 
2.43.2



^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH v2 1/2] mm/memfd: refactor memfd_tag_pins() and memfd_wait_for_pins()
  2024-02-26 14:13 ` [PATCH v2 1/2] mm/memfd: refactor memfd_tag_pins() and memfd_wait_for_pins() David Hildenbrand
@ 2024-02-26 16:07   ` Matthew Wilcox
  2024-02-26 16:56     ` David Hildenbrand
  0 siblings, 1 reply; 7+ messages in thread
From: Matthew Wilcox @ 2024-02-26 16:07 UTC (permalink / raw)
  To: David Hildenbrand; +Cc: linux-kernel, linux-mm, Andrew Morton

On Mon, Feb 26, 2024 at 03:13:23PM +0100, David Hildenbrand wrote:
> +	xas_for_each(xas, folio, ULONG_MAX) {
> +		if (!xa_is_value(folio) && memfd_folio_has_extra_refs(folio))
>  			xas_set_mark(xas, MEMFD_TAG_PINNED);

... we decline to tag value entries here ...

> @@ -95,20 +90,15 @@ static int memfd_wait_for_pins(struct address_space *mapping)
>  
>  		xas_set(&xas, 0);
>  		xas_lock_irq(&xas);
> -		xas_for_each_marked(&xas, page, ULONG_MAX, MEMFD_TAG_PINNED) {
> +		xas_for_each_marked(&xas, folio, ULONG_MAX, MEMFD_TAG_PINNED) {
>  			bool clear = true;
>  
> -			cache_count = 1;
> -			if (!xa_is_value(page) &&
> -			    PageTransHuge(page) && !PageHuge(page))
> -				cache_count = HPAGE_PMD_NR;
> -
> -			if (!xa_is_value(page) && cache_count !=
> -			    page_count(page) - total_mapcount(page)) {
> +			if (!xa_is_value(folio) &&
> +			    memfd_folio_has_extra_refs(folio)) {

... so we don't need to test it here because we'll never see any value
entries.  No?



^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH v2 2/2] mm: remove total_mapcount()
  2024-02-26 14:13 ` [PATCH v2 2/2] mm: remove total_mapcount() David Hildenbrand
@ 2024-02-26 16:09   ` Matthew Wilcox
  0 siblings, 0 replies; 7+ messages in thread
From: Matthew Wilcox @ 2024-02-26 16:09 UTC (permalink / raw)
  To: David Hildenbrand; +Cc: linux-kernel, linux-mm, Andrew Morton

On Mon, Feb 26, 2024 at 03:13:24PM +0100, David Hildenbrand wrote:
> All users of total_mapcount() are gone, let's remove it.
> 
> Signed-off-by: David Hildenbrand <david@redhat.com>

Reviewed-by: Matthew Wilcox (Oracle) <willy@infradead.org>


^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH v2 1/2] mm/memfd: refactor memfd_tag_pins() and memfd_wait_for_pins()
  2024-02-26 16:07   ` Matthew Wilcox
@ 2024-02-26 16:56     ` David Hildenbrand
  2024-02-27 15:27       ` Matthew Wilcox
  0 siblings, 1 reply; 7+ messages in thread
From: David Hildenbrand @ 2024-02-26 16:56 UTC (permalink / raw)
  To: Matthew Wilcox; +Cc: linux-kernel, linux-mm, Andrew Morton

On 26.02.24 17:07, Matthew Wilcox wrote:
> On Mon, Feb 26, 2024 at 03:13:23PM +0100, David Hildenbrand wrote:
>> +	xas_for_each(xas, folio, ULONG_MAX) {
>> +		if (!xa_is_value(folio) && memfd_folio_has_extra_refs(folio))
>>   			xas_set_mark(xas, MEMFD_TAG_PINNED);
> 
> ... we decline to tag value entries here ...
> 
>> @@ -95,20 +90,15 @@ static int memfd_wait_for_pins(struct address_space *mapping)
>>   
>>   		xas_set(&xas, 0);
>>   		xas_lock_irq(&xas);
>> -		xas_for_each_marked(&xas, page, ULONG_MAX, MEMFD_TAG_PINNED) {
>> +		xas_for_each_marked(&xas, folio, ULONG_MAX, MEMFD_TAG_PINNED) {
>>   			bool clear = true;
>>   
>> -			cache_count = 1;
>> -			if (!xa_is_value(page) &&
>> -			    PageTransHuge(page) && !PageHuge(page))
>> -				cache_count = HPAGE_PMD_NR;
>> -
>> -			if (!xa_is_value(page) && cache_count !=
>> -			    page_count(page) - total_mapcount(page)) {
>> +			if (!xa_is_value(folio) &&
>> +			    memfd_folio_has_extra_refs(folio)) {
> 
> ... so we don't need to test it here because we'll never see any value
> entries.  No?

I was not able to convince myself that swapout code would clear the mark 
when replacing the entry.

shmem_writepage()->shmem_delete_from_page_cache()->shmem_replace_entry()

will perform a xas_store() with swp_to_radix_entry(swap) under 
xa_lock_irq().

Reading the doc, and staring at the code for a bit too long, I think 
xas_store() would only clear tags when deleting an entry (passing NULL).

But maybe xas_store() will always clear tags?

In memfd code, I think we could see swapout between memfd_tag_pins() and 
the check for tags, where we drop the xa_lock. Unless some other lock 
(inode lock?) protects us.

Thanks!

-- 
Cheers,

David / dhildenb



^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH v2 1/2] mm/memfd: refactor memfd_tag_pins() and memfd_wait_for_pins()
  2024-02-26 16:56     ` David Hildenbrand
@ 2024-02-27 15:27       ` Matthew Wilcox
  0 siblings, 0 replies; 7+ messages in thread
From: Matthew Wilcox @ 2024-02-27 15:27 UTC (permalink / raw)
  To: David Hildenbrand; +Cc: linux-kernel, linux-mm, Andrew Morton

On Mon, Feb 26, 2024 at 05:56:09PM +0100, David Hildenbrand wrote:
> > > @@ -95,20 +90,15 @@ static int memfd_wait_for_pins(struct address_space *mapping)
> > >   		xas_set(&xas, 0);
> > >   		xas_lock_irq(&xas);
> > > -		xas_for_each_marked(&xas, page, ULONG_MAX, MEMFD_TAG_PINNED) {
> > > +		xas_for_each_marked(&xas, folio, ULONG_MAX, MEMFD_TAG_PINNED) {
> > >   			bool clear = true;
> > > -			cache_count = 1;
> > > -			if (!xa_is_value(page) &&
> > > -			    PageTransHuge(page) && !PageHuge(page))
> > > -				cache_count = HPAGE_PMD_NR;
> > > -
> > > -			if (!xa_is_value(page) && cache_count !=
> > > -			    page_count(page) - total_mapcount(page)) {
> > > +			if (!xa_is_value(folio) &&
> > > +			    memfd_folio_has_extra_refs(folio)) {
> > 
> > ... so we don't need to test it here because we'll never see any value
> > entries.  No?
> 
> I was not able to convince myself that swapout code would clear the mark
> when replacing the entry.
> 
> shmem_writepage()->shmem_delete_from_page_cache()->shmem_replace_entry()
> 
> will perform a xas_store() with swp_to_radix_entry(swap) under
> xa_lock_irq().
> 
> Reading the doc, and staring at the code for a bit too long, I think
> xas_store() would only clear tags when deleting an entry (passing NULL).
> 
> But maybe xas_store() will always clear tags?

No, xas_store() will leave the tag alone ... this is the right thing to
do for the pagecache because we always clear the tags before removing
a folio from the cache.

> In memfd code, I think we could see swapout between memfd_tag_pins() and the
> check for tags, where we drop the xa_lock. Unless some other lock (inode
> lock?) protects us.

... and if it does happen, we see the value entry tagged and clear the
tag on it.  OK.

Reviewed-by: Matthew Wilcox (Oracle) <willy@infradead.org>


^ permalink raw reply	[flat|nested] 7+ messages in thread

end of thread, other threads:[~2024-02-27 15:27 UTC | newest]

Thread overview: 7+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2024-02-26 14:13 [PATCH v2 0/2] mm: remove total_mapcount() David Hildenbrand
2024-02-26 14:13 ` [PATCH v2 1/2] mm/memfd: refactor memfd_tag_pins() and memfd_wait_for_pins() David Hildenbrand
2024-02-26 16:07   ` Matthew Wilcox
2024-02-26 16:56     ` David Hildenbrand
2024-02-27 15:27       ` Matthew Wilcox
2024-02-26 14:13 ` [PATCH v2 2/2] mm: remove total_mapcount() David Hildenbrand
2024-02-26 16:09   ` Matthew Wilcox

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox