linux-mm.kvack.org archive mirror
 help / color / mirror / Atom feed
* [PATCH v5.1] fs: Allow fine-grained control of folio sizes
@ 2024-05-27 21:01 Matthew Wilcox (Oracle)
  2024-05-27 22:09 ` Pankaj Raghav (Samsung)
  2024-05-28 10:20 ` Hannes Reinecke
  0 siblings, 2 replies; 12+ messages in thread
From: Matthew Wilcox (Oracle) @ 2024-05-27 21:01 UTC (permalink / raw)
  To: akpm, willy, djwong, brauner, david, chandan.babu
  Cc: hare, ritesh.list, john.g.garry, ziy, linux-fsdevel, linux-xfs,
	linux-mm, linux-block, gost.dev, p.raghav, kernel, mcgrof

We need filesystems to be able to communicate acceptable folio sizes
to the pagecache for a variety of uses (e.g. large block sizes).
Support a range of folio sizes between order-0 and order-31.

Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Co-developed-by: Pankaj Raghav <p.raghav@samsung.com>
Signed-off-by: Pankaj Raghav <p.raghav@samsung.com>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
Reviewed-by: Hannes Reinecke <hare@suse.de>
---
For this version, I fixed the TODO that the maximum folio size was not
being honoured.  I made some other changes too like adding const, moving
the location of the constants, checking CONFIG_TRANSPARENT_HUGEPAGE, and
dropping some of the functions which aren't needed until later patches.
(They can be added in the commits that need them).  Also rebased against
current Linus tree, so MAX_PAGECACHE_ORDER no longer needs to be moved).

 include/linux/pagemap.h | 81 +++++++++++++++++++++++++++++++++++------
 mm/filemap.c            |  6 +--
 mm/readahead.c          |  4 +-
 3 files changed, 73 insertions(+), 18 deletions(-)

diff --git a/include/linux/pagemap.h b/include/linux/pagemap.h
index 1ed9274a0deb..c6aaceed0de6 100644
--- a/include/linux/pagemap.h
+++ b/include/linux/pagemap.h
@@ -204,13 +204,18 @@ enum mapping_flags {
 	AS_EXITING	= 4, 	/* final truncate in progress */
 	/* writeback related tags are not used */
 	AS_NO_WRITEBACK_TAGS = 5,
-	AS_LARGE_FOLIO_SUPPORT = 6,
-	AS_RELEASE_ALWAYS,	/* Call ->release_folio(), even if no private data */
-	AS_STABLE_WRITES,	/* must wait for writeback before modifying
+	AS_RELEASE_ALWAYS = 6,	/* Call ->release_folio(), even if no private data */
+	AS_STABLE_WRITES = 7,	/* must wait for writeback before modifying
 				   folio contents */
-	AS_UNMOVABLE,		/* The mapping cannot be moved, ever */
+	AS_UNMOVABLE = 8,	/* The mapping cannot be moved, ever */
+	AS_FOLIO_ORDER_MIN = 16,
+	AS_FOLIO_ORDER_MAX = 21, /* Bits 16-25 are used for FOLIO_ORDER */
 };
 
+#define AS_FOLIO_ORDER_MIN_MASK 0x001f0000
+#define AS_FOLIO_ORDER_MAX_MASK 0x03e00000
+#define AS_FOLIO_ORDER_MASK (AS_FOLIO_ORDER_MIN_MASK | AS_FOLIO_ORDER_MAX_MASK)
+
 /**
  * mapping_set_error - record a writeback error in the address_space
  * @mapping: the mapping in which an error should be set
@@ -359,9 +364,48 @@ static inline void mapping_set_gfp_mask(struct address_space *m, gfp_t mask)
 #define MAX_PAGECACHE_ORDER	8
 #endif
 
+/*
+ * mapping_set_folio_order_range() - Set the orders supported by a file.
+ * @mapping: The address space of the file.
+ * @min: Minimum folio order (between 0-MAX_PAGECACHE_ORDER inclusive).
+ * @max: Maximum folio order (between @min-MAX_PAGECACHE_ORDER inclusive).
+ *
+ * The filesystem should call this function in its inode constructor to
+ * indicate which base size (min) and maximum size (max) of folio the VFS
+ * can use to cache the contents of the file.  This should only be used
+ * if the filesystem needs special handling of folio sizes (ie there is
+ * something the core cannot know).
+ * Do not tune it based on, eg, i_size.
+ *
+ * Context: This should not be called while the inode is active as it
+ * is non-atomic.
+ */
+static inline void mapping_set_folio_order_range(struct address_space *mapping,
+		unsigned int min, unsigned int max)
+{
+	if (IS_ENABLED(CONFIG_TRANSPARENT_HUGEPAGE))
+		return;
+
+	if (min > MAX_PAGECACHE_ORDER)
+		min = MAX_PAGECACHE_ORDER;
+	if (max > MAX_PAGECACHE_ORDER)
+		max = MAX_PAGECACHE_ORDER;
+	if (max < min)
+		max = min;
+
+	mapping->flags = (mapping->flags & ~AS_FOLIO_ORDER_MASK) |
+		(min << AS_FOLIO_ORDER_MIN) | (max << AS_FOLIO_ORDER_MAX);
+}
+
+static inline void mapping_set_folio_min_order(struct address_space *mapping,
+		unsigned int min)
+{
+	mapping_set_folio_order_range(mapping, min, MAX_PAGECACHE_ORDER);
+}
+
 /**
  * mapping_set_large_folios() - Indicate the file supports large folios.
- * @mapping: The file.
+ * @mapping: The address space of the file.
  *
  * The filesystem should call this function in its inode constructor to
  * indicate that the VFS can use large folios to cache the contents of
@@ -372,7 +416,23 @@ static inline void mapping_set_gfp_mask(struct address_space *m, gfp_t mask)
  */
 static inline void mapping_set_large_folios(struct address_space *mapping)
 {
-	__set_bit(AS_LARGE_FOLIO_SUPPORT, &mapping->flags);
+	mapping_set_folio_order_range(mapping, 0, MAX_PAGECACHE_ORDER);
+}
+
+static inline
+unsigned int mapping_max_folio_order(const struct address_space *mapping)
+{
+	if (!IS_ENABLED(CONFIG_TRANSPARENT_HUGEPAGE))
+		return 0;
+	return (mapping->flags & AS_FOLIO_ORDER_MAX_MASK) >> AS_FOLIO_ORDER_MAX;
+}
+
+static inline
+unsigned int mapping_min_folio_order(const struct address_space *mapping)
+{
+	if (!IS_ENABLED(CONFIG_TRANSPARENT_HUGEPAGE))
+		return 0;
+	return (mapping->flags & AS_FOLIO_ORDER_MIN_MASK) >> AS_FOLIO_ORDER_MIN;
 }
 
 /*
@@ -381,16 +441,13 @@ static inline void mapping_set_large_folios(struct address_space *mapping)
  */
 static inline bool mapping_large_folio_support(struct address_space *mapping)
 {
-	return IS_ENABLED(CONFIG_TRANSPARENT_HUGEPAGE) &&
-		test_bit(AS_LARGE_FOLIO_SUPPORT, &mapping->flags);
+	return mapping_max_folio_order(mapping) > 0;
 }
 
 /* Return the maximum folio size for this pagecache mapping, in bytes. */
-static inline size_t mapping_max_folio_size(struct address_space *mapping)
+static inline size_t mapping_max_folio_size(const struct address_space *mapping)
 {
-	if (mapping_large_folio_support(mapping))
-		return PAGE_SIZE << MAX_PAGECACHE_ORDER;
-	return PAGE_SIZE;
+	return PAGE_SIZE << mapping_max_folio_order(mapping);
 }
 
 static inline int filemap_nr_thps(struct address_space *mapping)
diff --git a/mm/filemap.c b/mm/filemap.c
index 382c3d06bfb1..0557020f130e 100644
--- a/mm/filemap.c
+++ b/mm/filemap.c
@@ -1933,10 +1933,8 @@ struct folio *__filemap_get_folio(struct address_space *mapping, pgoff_t index,
 		if (WARN_ON_ONCE(!(fgp_flags & (FGP_LOCK | FGP_FOR_MMAP))))
 			fgp_flags |= FGP_LOCK;
 
-		if (!mapping_large_folio_support(mapping))
-			order = 0;
-		if (order > MAX_PAGECACHE_ORDER)
-			order = MAX_PAGECACHE_ORDER;
+		if (order > mapping_max_folio_order(mapping))
+			order = mapping_max_folio_order(mapping);
 		/* If we're not aligned, allocate a smaller folio */
 		if (index & ((1UL << order) - 1))
 			order = __ffs(index);
diff --git a/mm/readahead.c b/mm/readahead.c
index c1b23989d9ca..66058ae02f2e 100644
--- a/mm/readahead.c
+++ b/mm/readahead.c
@@ -503,9 +503,9 @@ void page_cache_ra_order(struct readahead_control *ractl,
 
 	limit = min(limit, index + ra->size - 1);
 
-	if (new_order < MAX_PAGECACHE_ORDER) {
+	if (new_order < mapping_max_folio_order(mapping)) {
 		new_order += 2;
-		new_order = min_t(unsigned int, MAX_PAGECACHE_ORDER, new_order);
+		new_order = min(mapping_max_folio_order(mapping), new_order);
 		new_order = min_t(unsigned int, new_order, ilog2(ra->size));
 	}
 
-- 
2.43.0



^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH v5.1] fs: Allow fine-grained control of folio sizes
  2024-05-27 21:01 [PATCH v5.1] fs: Allow fine-grained control of folio sizes Matthew Wilcox (Oracle)
@ 2024-05-27 22:09 ` Pankaj Raghav (Samsung)
  2024-05-27 22:39   ` Matthew Wilcox
  2024-05-28 10:20 ` Hannes Reinecke
  1 sibling, 1 reply; 12+ messages in thread
From: Pankaj Raghav (Samsung) @ 2024-05-27 22:09 UTC (permalink / raw)
  To: Matthew Wilcox (Oracle)
  Cc: akpm, djwong, brauner, david, chandan.babu, hare, ritesh.list,
	john.g.garry, ziy, linux-fsdevel, linux-xfs, linux-mm,
	linux-block, gost.dev, p.raghav, mcgrof

On Mon, May 27, 2024 at 10:01:23PM +0100, Matthew Wilcox (Oracle) wrote:
> We need filesystems to be able to communicate acceptable folio sizes
> to the pagecache for a variety of uses (e.g. large block sizes).
> Support a range of folio sizes between order-0 and order-31.
> 
> Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
> Co-developed-by: Pankaj Raghav <p.raghav@samsung.com>
> Signed-off-by: Pankaj Raghav <p.raghav@samsung.com>
> Reviewed-by: Darrick J. Wong <djwong@kernel.org>
> Reviewed-by: Hannes Reinecke <hare@suse.de>
> ---
> For this version, I fixed the TODO that the maximum folio size was not
> being honoured.  I made some other changes too like adding const, moving
> the location of the constants, checking CONFIG_TRANSPARENT_HUGEPAGE, and
> dropping some of the functions which aren't needed until later patches.
> (They can be added in the commits that need them).  Also rebased against
> current Linus tree, so MAX_PAGECACHE_ORDER no longer needs to be moved).

Thanks for this! So I am currently running my xfstests on the new series
I am planning to send in a day or two based on next-20240523.

I assume this patch is intended to be folded in to the next LBS series?

> 
> diff --git a/include/linux/pagemap.h b/include/linux/pagemap.h
> index 1ed9274a0deb..c6aaceed0de6 100644
> --- a/include/linux/pagemap.h
> +++ b/include/linux/pagemap.h
> @@ -204,13 +204,18 @@ enum mapping_flags {
>  	AS_EXITING	= 4, 	/* final truncate in progress */
>  	/* writeback related tags are not used */
>  	AS_NO_WRITEBACK_TAGS = 5,
> -	AS_LARGE_FOLIO_SUPPORT = 6,
> -	AS_RELEASE_ALWAYS,	/* Call ->release_folio(), even if no private data */
> -	AS_STABLE_WRITES,	/* must wait for writeback before modifying
> +	AS_RELEASE_ALWAYS = 6,	/* Call ->release_folio(), even if no private data */
> +	AS_STABLE_WRITES = 7,	/* must wait for writeback before modifying
>  				   folio contents */
> -	AS_UNMOVABLE,		/* The mapping cannot be moved, ever */
> +	AS_UNMOVABLE = 8,	/* The mapping cannot be moved, ever */
> +	AS_FOLIO_ORDER_MIN = 16,
> +	AS_FOLIO_ORDER_MAX = 21, /* Bits 16-25 are used for FOLIO_ORDER */
>  };
>  
> +#define AS_FOLIO_ORDER_MIN_MASK 0x001f0000
> +#define AS_FOLIO_ORDER_MAX_MASK 0x03e00000

As you changed the mapping flag offset, these masks also needs to be
changed accordingly.

I moved(pun intended) the AS_UNMOVABLE and kept the
AS_FOLIO_ORDER_(MIN|MAX) value the same.

	AS_FOLIO_ORDER_MIN = 8,
	AS_FOLIO_ORDER_MAX = 13, /* Bit 8-17 are used for FOLIO_ORDER */
	AS_UNMOVABLE = 18,	 /* The mapping cannot be moved, ever */
	AS_INACCESSIBLE,	/* Do not attempt direct R/W access to the mapping */

> +#define AS_FOLIO_ORDER_MASK (AS_FOLIO_ORDER_MIN_MASK | AS_FOLIO_ORDER_MAX_MASK)
> +
>  
--
Pankaj


^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH v5.1] fs: Allow fine-grained control of folio sizes
  2024-05-27 22:09 ` Pankaj Raghav (Samsung)
@ 2024-05-27 22:39   ` Matthew Wilcox
  2024-05-27 22:43     ` Matthew Wilcox
  2024-05-28 10:13     ` Pankaj Raghav (Samsung)
  0 siblings, 2 replies; 12+ messages in thread
From: Matthew Wilcox @ 2024-05-27 22:39 UTC (permalink / raw)
  To: Pankaj Raghav (Samsung)
  Cc: akpm, djwong, brauner, david, chandan.babu, hare, ritesh.list,
	john.g.garry, ziy, linux-fsdevel, linux-xfs, linux-mm,
	linux-block, gost.dev, p.raghav, mcgrof

On Mon, May 27, 2024 at 10:09:26PM +0000, Pankaj Raghav (Samsung) wrote:
> > For this version, I fixed the TODO that the maximum folio size was not
> > being honoured.  I made some other changes too like adding const, moving
> > the location of the constants, checking CONFIG_TRANSPARENT_HUGEPAGE, and
> > dropping some of the functions which aren't needed until later patches.
> > (They can be added in the commits that need them).  Also rebased against
> > current Linus tree, so MAX_PAGECACHE_ORDER no longer needs to be moved).
> 
> Thanks for this! So I am currently running my xfstests on the new series
> I am planning to send in a day or two based on next-20240523.
> 
> I assume this patch is intended to be folded in to the next LBS series?

Right, that was why I numbered it as 5.1 so as to not preempt your v6.

> > diff --git a/include/linux/pagemap.h b/include/linux/pagemap.h
> > index 1ed9274a0deb..c6aaceed0de6 100644
> > --- a/include/linux/pagemap.h
> > +++ b/include/linux/pagemap.h
> > @@ -204,13 +204,18 @@ enum mapping_flags {
> >  	AS_EXITING	= 4, 	/* final truncate in progress */
> >  	/* writeback related tags are not used */
> >  	AS_NO_WRITEBACK_TAGS = 5,
> > -	AS_LARGE_FOLIO_SUPPORT = 6,
> > -	AS_RELEASE_ALWAYS,	/* Call ->release_folio(), even if no private data */
> > -	AS_STABLE_WRITES,	/* must wait for writeback before modifying
> > +	AS_RELEASE_ALWAYS = 6,	/* Call ->release_folio(), even if no private data */
> > +	AS_STABLE_WRITES = 7,	/* must wait for writeback before modifying
> >  				   folio contents */
> > -	AS_UNMOVABLE,		/* The mapping cannot be moved, ever */
> > +	AS_UNMOVABLE = 8,	/* The mapping cannot be moved, ever */
> > +	AS_FOLIO_ORDER_MIN = 16,
> > +	AS_FOLIO_ORDER_MAX = 21, /* Bits 16-25 are used for FOLIO_ORDER */
> >  };
> >  
> > +#define AS_FOLIO_ORDER_MIN_MASK 0x001f0000
> > +#define AS_FOLIO_ORDER_MAX_MASK 0x03e00000
> 
> As you changed the mapping flag offset, these masks also needs to be
> changed accordingly.

That's why I did change them?



^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH v5.1] fs: Allow fine-grained control of folio sizes
  2024-05-27 22:39   ` Matthew Wilcox
@ 2024-05-27 22:43     ` Matthew Wilcox
  2024-05-27 23:00       ` Dave Chinner
  2024-05-28 10:13     ` Pankaj Raghav (Samsung)
  1 sibling, 1 reply; 12+ messages in thread
From: Matthew Wilcox @ 2024-05-27 22:43 UTC (permalink / raw)
  To: Pankaj Raghav (Samsung)
  Cc: akpm, djwong, brauner, david, chandan.babu, hare, ritesh.list,
	john.g.garry, ziy, linux-fsdevel, linux-xfs, linux-mm,
	linux-block, gost.dev, p.raghav, mcgrof

On Mon, May 27, 2024 at 11:39:47PM +0100, Matthew Wilcox wrote:
> > > +	AS_FOLIO_ORDER_MIN = 16,
> > > +	AS_FOLIO_ORDER_MAX = 21, /* Bits 16-25 are used for FOLIO_ORDER */
> > >  };
> > >  
> > > +#define AS_FOLIO_ORDER_MIN_MASK 0x001f0000
> > > +#define AS_FOLIO_ORDER_MAX_MASK 0x03e00000
> > 
> > As you changed the mapping flag offset, these masks also needs to be
> > changed accordingly.
> 
> That's why I did change them?

How about:

-#define AS_FOLIO_ORDER_MIN_MASK 0x001f0000
-#define AS_FOLIO_ORDER_MAX_MASK 0x03e00000
+#define AS_FOLIO_ORDER_MIN_MASK (31 << AS_FOLIO_ORDER_MIN)
+#define AS_FOLIO_ORDER_MAX_MASK (31 << AS_FOLIO_ORDER_MAX)



^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH v5.1] fs: Allow fine-grained control of folio sizes
  2024-05-27 22:43     ` Matthew Wilcox
@ 2024-05-27 23:00       ` Dave Chinner
  2024-05-28  9:12         ` Pankaj Raghav (Samsung)
  0 siblings, 1 reply; 12+ messages in thread
From: Dave Chinner @ 2024-05-27 23:00 UTC (permalink / raw)
  To: Matthew Wilcox
  Cc: Pankaj Raghav (Samsung),
	akpm, djwong, brauner, chandan.babu, hare, ritesh.list,
	john.g.garry, ziy, linux-fsdevel, linux-xfs, linux-mm,
	linux-block, gost.dev, p.raghav, mcgrof

On Mon, May 27, 2024 at 11:43:43PM +0100, Matthew Wilcox wrote:
> On Mon, May 27, 2024 at 11:39:47PM +0100, Matthew Wilcox wrote:
> > > > +	AS_FOLIO_ORDER_MIN = 16,
> > > > +	AS_FOLIO_ORDER_MAX = 21, /* Bits 16-25 are used for FOLIO_ORDER */
> > > >  };
> > > >  
> > > > +#define AS_FOLIO_ORDER_MIN_MASK 0x001f0000
> > > > +#define AS_FOLIO_ORDER_MAX_MASK 0x03e00000
> > > 
> > > As you changed the mapping flag offset, these masks also needs to be
> > > changed accordingly.
> > 
> > That's why I did change them?
> 
> How about:
> 
> -#define AS_FOLIO_ORDER_MIN_MASK 0x001f0000
> -#define AS_FOLIO_ORDER_MAX_MASK 0x03e00000
> +#define AS_FOLIO_ORDER_MIN_MASK (31 << AS_FOLIO_ORDER_MIN)
> +#define AS_FOLIO_ORDER_MAX_MASK (31 << AS_FOLIO_ORDER_MAX)

Lots of magic numbers based on the order having only having 5 bits
of resolution. Removing that magic looks like this:

	AS_FOLIO_ORDER_BITS = 5,
	AS_FOLIO_ORDER_MIN = 16,
	AS_FOLIO_ORDER_MAX = AS_FOLIO_ORDER_MIN + AS_FOLIO_ORDER_BITS,
};

#define AS_FOLIO_ORDER_MASK	((1u << AS_FOLIO_ORDER_BITS) - 1)
#define AS_FOLIO_ORDER_MIN_MASK	(AS_FOLIO_ORDER_MASK << AS_FOLIO_ORDER_MIN)
#define AS_FOLIO_ORDER_MAX_MASK	(AS_FOLIO_ORDER_MASK << AS_FOLIO_ORDER_MAX)

This way if we want to increase the order mask, we only need to
change AS_FOLIO_ORDER_BITS and everything else automatically
recalculates.

Doing this means We could also easily use the high bits of the flag
word for the folio orders, rather than putting them in the middle of
the flag space...

-Dave.
-- 
Dave Chinner
david@fromorbit.com


^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH v5.1] fs: Allow fine-grained control of folio sizes
  2024-05-27 23:00       ` Dave Chinner
@ 2024-05-28  9:12         ` Pankaj Raghav (Samsung)
  2024-05-28  9:45           ` Dave Chinner
  0 siblings, 1 reply; 12+ messages in thread
From: Pankaj Raghav (Samsung) @ 2024-05-28  9:12 UTC (permalink / raw)
  To: Dave Chinner
  Cc: Matthew Wilcox, akpm, djwong, brauner, chandan.babu, hare,
	ritesh.list, john.g.garry, ziy, linux-fsdevel, linux-xfs,
	linux-mm, linux-block, gost.dev, p.raghav, mcgrof

On Tue, May 28, 2024 at 09:00:00AM +1000, Dave Chinner wrote:
> On Mon, May 27, 2024 at 11:43:43PM +0100, Matthew Wilcox wrote:
> > On Mon, May 27, 2024 at 11:39:47PM +0100, Matthew Wilcox wrote:
> > > > > +	AS_FOLIO_ORDER_MIN = 16,
> > > > > +	AS_FOLIO_ORDER_MAX = 21, /* Bits 16-25 are used for FOLIO_ORDER */
> > > > >  };
> > > > >  
> > > > > +#define AS_FOLIO_ORDER_MIN_MASK 0x001f0000
> > > > > +#define AS_FOLIO_ORDER_MAX_MASK 0x03e00000
> > > > 
> > > > As you changed the mapping flag offset, these masks also needs to be
> > > > changed accordingly.
> > > 
> > > That's why I did change them?
> > 
> > How about:
> > 
> > -#define AS_FOLIO_ORDER_MIN_MASK 0x001f0000
> > -#define AS_FOLIO_ORDER_MAX_MASK 0x03e00000
> > +#define AS_FOLIO_ORDER_MIN_MASK (31 << AS_FOLIO_ORDER_MIN)
> > +#define AS_FOLIO_ORDER_MAX_MASK (31 << AS_FOLIO_ORDER_MAX)
> 
> Lots of magic numbers based on the order having only having 5 bits
> of resolution. Removing that magic looks like this:
> 
> 	AS_FOLIO_ORDER_BITS = 5,

I think this needs to be defined outside of the enum as 5 is already
taken by AS_NO_WRITEBACK_TAGS? But I like the idea of making it generic
like this.

Something like this?

#define AS_FOLIO_ORDER_BITS 5
/*
 * Bits in mapping->flags.
 */
enum mapping_flags {
	AS_EIO		= 0,	/* IO error on async write */
	AS_ENOSPC	= 1,	/* ENOSPC on async write */
	AS_MM_ALL_LOCKS	= 2,	/* under mm_take_all_locks() */
	AS_UNEVICTABLE	= 3,	/* e.g., ramdisk, SHM_LOCK */
	AS_EXITING	= 4, 	/* final truncate in progress */
	/* writeback related tags are not used */
	AS_NO_WRITEBACK_TAGS = 5,
	AS_RELEASE_ALWAYS = 6,	/* Call ->release_folio(), even if no private data */
	AS_STABLE_WRITES = 7,	/* must wait for writeback before modifying
				   folio contents */
	AS_UNMOVABLE = 8,	 /* The mapping cannot be moved, ever */
	/* Bit 16-21 are used for FOLIO_ORDER */
	AS_FOLIO_ORDER_MIN = 16,
	AS_FOLIO_ORDER_MAX = AS_FOLIO_ORDER_MIN + AS_FOLIO_ORDER_BITS, 
};

@willy: I can fold this change that Chinner is proposing if you are fine
with this.

> 	AS_FOLIO_ORDER_MIN = 16,
> 	AS_FOLIO_ORDER_MAX = AS_FOLIO_ORDER_MIN + AS_FOLIO_ORDER_BITS,
> };
> 
> #define AS_FOLIO_ORDER_MASK	((1u << AS_FOLIO_ORDER_BITS) - 1)
> #define AS_FOLIO_ORDER_MIN_MASK	(AS_FOLIO_ORDER_MASK << AS_FOLIO_ORDER_MIN)
> #define AS_FOLIO_ORDER_MAX_MASK	(AS_FOLIO_ORDER_MASK << AS_FOLIO_ORDER_MAX)
> 
> This way if we want to increase the order mask, we only need to
> change AS_FOLIO_ORDER_BITS and everything else automatically
> recalculates.
> 
> Doing this means We could also easily use the high bits of the flag
> word for the folio orders, rather than putting them in the middle of
> the flag space...
> 
> -Dave.
> -- 
> Dave Chinner
> david@fromorbit.com


^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH v5.1] fs: Allow fine-grained control of folio sizes
  2024-05-28  9:12         ` Pankaj Raghav (Samsung)
@ 2024-05-28  9:45           ` Dave Chinner
  0 siblings, 0 replies; 12+ messages in thread
From: Dave Chinner @ 2024-05-28  9:45 UTC (permalink / raw)
  To: Pankaj Raghav (Samsung)
  Cc: Matthew Wilcox, akpm, djwong, brauner, chandan.babu, hare,
	ritesh.list, john.g.garry, ziy, linux-fsdevel, linux-xfs,
	linux-mm, linux-block, gost.dev, p.raghav, mcgrof

On Tue, May 28, 2024 at 09:12:02AM +0000, Pankaj Raghav (Samsung) wrote:
> On Tue, May 28, 2024 at 09:00:00AM +1000, Dave Chinner wrote:
> > On Mon, May 27, 2024 at 11:43:43PM +0100, Matthew Wilcox wrote:
> > > On Mon, May 27, 2024 at 11:39:47PM +0100, Matthew Wilcox wrote:
> > > > > > +	AS_FOLIO_ORDER_MIN = 16,
> > > > > > +	AS_FOLIO_ORDER_MAX = 21, /* Bits 16-25 are used for FOLIO_ORDER */
> > > > > >  };
> > > > > >  
> > > > > > +#define AS_FOLIO_ORDER_MIN_MASK 0x001f0000
> > > > > > +#define AS_FOLIO_ORDER_MAX_MASK 0x03e00000
> > > > > 
> > > > > As you changed the mapping flag offset, these masks also needs to be
> > > > > changed accordingly.
> > > > 
> > > > That's why I did change them?
> > > 
> > > How about:
> > > 
> > > -#define AS_FOLIO_ORDER_MIN_MASK 0x001f0000
> > > -#define AS_FOLIO_ORDER_MAX_MASK 0x03e00000
> > > +#define AS_FOLIO_ORDER_MIN_MASK (31 << AS_FOLIO_ORDER_MIN)
> > > +#define AS_FOLIO_ORDER_MAX_MASK (31 << AS_FOLIO_ORDER_MAX)
> > 
> > Lots of magic numbers based on the order having only having 5 bits
> > of resolution. Removing that magic looks like this:
> > 
> > 	AS_FOLIO_ORDER_BITS = 5,
> 
> I think this needs to be defined outside of the enum as 5 is already
> taken by AS_NO_WRITEBACK_TAGS? But I like the idea of making it generic
> like this.

Duplicate values in assigned enums are legal and fairly common.
This:

enum {
        FOO = 1,
        BAR = 2,
        BAZ = 1,
};

int main(int argc, char *argv[])
{
        printf("foo %d, bar %d, baz %d\n", FOO, BAR, BAZ);
}

compiles without warnings or errors and gives the output:

foo 1, bar 2, baz 1

-Dave.
-- 
Dave Chinner
david@fromorbit.com


^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH v5.1] fs: Allow fine-grained control of folio sizes
  2024-05-27 22:39   ` Matthew Wilcox
  2024-05-27 22:43     ` Matthew Wilcox
@ 2024-05-28 10:13     ` Pankaj Raghav (Samsung)
  2024-05-28 12:13       ` Matthew Wilcox
  1 sibling, 1 reply; 12+ messages in thread
From: Pankaj Raghav (Samsung) @ 2024-05-28 10:13 UTC (permalink / raw)
  To: Matthew Wilcox
  Cc: akpm, djwong, brauner, david, chandan.babu, hare, ritesh.list,
	john.g.garry, ziy, linux-fsdevel, linux-xfs, linux-mm,
	linux-block, gost.dev, p.raghav, mcgrof

On Mon, May 27, 2024 at 11:39:47PM +0100, Matthew Wilcox wrote:
> On Mon, May 27, 2024 at 10:09:26PM +0000, Pankaj Raghav (Samsung) wrote:
> > > For this version, I fixed the TODO that the maximum folio size was not
> > > being honoured.  I made some other changes too like adding const, moving
> > > the location of the constants, checking CONFIG_TRANSPARENT_HUGEPAGE, and
> > > dropping some of the functions which aren't needed until later patches.
> > > (They can be added in the commits that need them).  Also rebased against
> > > current Linus tree, so MAX_PAGECACHE_ORDER no longer needs to be moved).
> > 
> > Thanks for this! So I am currently running my xfstests on the new series
> > I am planning to send in a day or two based on next-20240523.
> > 
> > I assume this patch is intended to be folded in to the next LBS series?
> 
> Right, that was why I numbered it as 5.1 so as to not preempt your v6.
> 
> > > diff --git a/include/linux/pagemap.h b/include/linux/pagemap.h
> > > index 1ed9274a0deb..c6aaceed0de6 100644
> > > --- a/include/linux/pagemap.h
> > > +++ b/include/linux/pagemap.h
> > > @@ -204,13 +204,18 @@ enum mapping_flags {
> > >  	AS_EXITING	= 4, 	/* final truncate in progress */
> > >  	/* writeback related tags are not used */
> > >  	AS_NO_WRITEBACK_TAGS = 5,
> > > -	AS_LARGE_FOLIO_SUPPORT = 6,
> > > -	AS_RELEASE_ALWAYS,	/* Call ->release_folio(), even if no private data */
> > > -	AS_STABLE_WRITES,	/* must wait for writeback before modifying
> > > +	AS_RELEASE_ALWAYS = 6,	/* Call ->release_folio(), even if no private data */
> > > +	AS_STABLE_WRITES = 7,	/* must wait for writeback before modifying
> > >  				   folio contents */
> > > -	AS_UNMOVABLE,		/* The mapping cannot be moved, ever */
> > > +	AS_UNMOVABLE = 8,	/* The mapping cannot be moved, ever */
> > > +	AS_FOLIO_ORDER_MIN = 16,
> > > +	AS_FOLIO_ORDER_MAX = 21, /* Bits 16-25 are used for FOLIO_ORDER */
> > >  };
> > >  
> > > +#define AS_FOLIO_ORDER_MIN_MASK 0x001f0000
> > > +#define AS_FOLIO_ORDER_MAX_MASK 0x03e00000
> > 
> > As you changed the mapping flag offset, these masks also needs to be
> > changed accordingly.
> 
> That's why I did change them?

Oops, I missed the zeroes at the end.


Btw, I noticed you have removed mapping_align_start_index(). I will add
it back in.

-- 
Pankaj Raghav


^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH v5.1] fs: Allow fine-grained control of folio sizes
  2024-05-27 21:01 [PATCH v5.1] fs: Allow fine-grained control of folio sizes Matthew Wilcox (Oracle)
  2024-05-27 22:09 ` Pankaj Raghav (Samsung)
@ 2024-05-28 10:20 ` Hannes Reinecke
  2024-05-28 11:37   ` Pankaj Raghav (Samsung)
  2024-05-28 11:40   ` Pankaj Raghav (Samsung)
  1 sibling, 2 replies; 12+ messages in thread
From: Hannes Reinecke @ 2024-05-28 10:20 UTC (permalink / raw)
  To: Matthew Wilcox (Oracle), akpm, djwong, brauner, david, chandan.babu
  Cc: ritesh.list, john.g.garry, ziy, linux-fsdevel, linux-xfs,
	linux-mm, linux-block, gost.dev, p.raghav, kernel, mcgrof

On 5/27/24 23:01, Matthew Wilcox (Oracle) wrote:
> We need filesystems to be able to communicate acceptable folio sizes
> to the pagecache for a variety of uses (e.g. large block sizes).
> Support a range of folio sizes between order-0 and order-31.
> 
> Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
> Co-developed-by: Pankaj Raghav <p.raghav@samsung.com>
> Signed-off-by: Pankaj Raghav <p.raghav@samsung.com>
> Reviewed-by: Darrick J. Wong <djwong@kernel.org>
> Reviewed-by: Hannes Reinecke <hare@suse.de>
> ---
> For this version, I fixed the TODO that the maximum folio size was not
> being honoured.  I made some other changes too like adding const, moving
> the location of the constants, checking CONFIG_TRANSPARENT_HUGEPAGE, and
> dropping some of the functions which aren't needed until later patches.
> (They can be added in the commits that need them).  Also rebased against
> current Linus tree, so MAX_PAGECACHE_ORDER no longer needs to be moved).
> 
>   include/linux/pagemap.h | 81 +++++++++++++++++++++++++++++++++++------
>   mm/filemap.c            |  6 +--
>   mm/readahead.c          |  4 +-
>   3 files changed, 73 insertions(+), 18 deletions(-)
> 
> diff --git a/include/linux/pagemap.h b/include/linux/pagemap.h
> index 1ed9274a0deb..c6aaceed0de6 100644
> --- a/include/linux/pagemap.h
> +++ b/include/linux/pagemap.h
> @@ -204,13 +204,18 @@ enum mapping_flags {
>   	AS_EXITING	= 4, 	/* final truncate in progress */
>   	/* writeback related tags are not used */
>   	AS_NO_WRITEBACK_TAGS = 5,
> -	AS_LARGE_FOLIO_SUPPORT = 6,
> -	AS_RELEASE_ALWAYS,	/* Call ->release_folio(), even if no private data */
> -	AS_STABLE_WRITES,	/* must wait for writeback before modifying
> +	AS_RELEASE_ALWAYS = 6,	/* Call ->release_folio(), even if no private data */
> +	AS_STABLE_WRITES = 7,	/* must wait for writeback before modifying
>   				   folio contents */
> -	AS_UNMOVABLE,		/* The mapping cannot be moved, ever */
> +	AS_UNMOVABLE = 8,	/* The mapping cannot be moved, ever */
> +	AS_FOLIO_ORDER_MIN = 16,
> +	AS_FOLIO_ORDER_MAX = 21, /* Bits 16-25 are used for FOLIO_ORDER */
>   };
>   
> +#define AS_FOLIO_ORDER_MIN_MASK 0x001f0000
> +#define AS_FOLIO_ORDER_MAX_MASK 0x03e00000
> +#define AS_FOLIO_ORDER_MASK (AS_FOLIO_ORDER_MIN_MASK | AS_FOLIO_ORDER_MAX_MASK)
> +
>   /**
>    * mapping_set_error - record a writeback error in the address_space
>    * @mapping: the mapping in which an error should be set
> @@ -359,9 +364,48 @@ static inline void mapping_set_gfp_mask(struct address_space *m, gfp_t mask)
>   #define MAX_PAGECACHE_ORDER	8
>   #endif
>   
> +/*
> + * mapping_set_folio_order_range() - Set the orders supported by a file.
> + * @mapping: The address space of the file.
> + * @min: Minimum folio order (between 0-MAX_PAGECACHE_ORDER inclusive).
> + * @max: Maximum folio order (between @min-MAX_PAGECACHE_ORDER inclusive).
> + *
> + * The filesystem should call this function in its inode constructor to
> + * indicate which base size (min) and maximum size (max) of folio the VFS
> + * can use to cache the contents of the file.  This should only be used
> + * if the filesystem needs special handling of folio sizes (ie there is
> + * something the core cannot know).
> + * Do not tune it based on, eg, i_size.
> + *
> + * Context: This should not be called while the inode is active as it
> + * is non-atomic.
> + */
> +static inline void mapping_set_folio_order_range(struct address_space *mapping,
> +		unsigned int min, unsigned int max)
> +{
> +	if (IS_ENABLED(CONFIG_TRANSPARENT_HUGEPAGE))
> +		return;
> +
Errm. Sure? When transparent hugepages are _enabled_ we don't support 
this feature?
Confused.

Cheers,

Hannes
-- 
Dr. Hannes Reinecke                  Kernel Storage Architect
hare@suse.de                                +49 911 74053 688
SUSE Software Solutions GmbH, Frankenstr. 146, 90461 Nürnberg
HRB 36809 (AG Nürnberg), GF: I. Totev, A. McDonald, W. Knoblich



^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH v5.1] fs: Allow fine-grained control of folio sizes
  2024-05-28 10:20 ` Hannes Reinecke
@ 2024-05-28 11:37   ` Pankaj Raghav (Samsung)
  2024-05-28 11:40   ` Pankaj Raghav (Samsung)
  1 sibling, 0 replies; 12+ messages in thread
From: Pankaj Raghav (Samsung) @ 2024-05-28 11:37 UTC (permalink / raw)
  To: Hannes Reinecke
  Cc: Matthew Wilcox (Oracle),
	akpm, djwong, brauner, david, chandan.babu, ritesh.list,
	john.g.garry, ziy, linux-fsdevel, linux-xfs, linux-mm,
	linux-block, gost.dev, p.raghav, mcgrof

> > + * is non-atomic.
> > + */
> > +static inline void mapping_set_folio_order_range(struct address_space *mapping,
> > +		unsigned int min, unsigned int max)
> > +{
> > +	if (IS_ENABLED(CONFIG_TRANSPARENT_HUGEPAGE))
> > +		return;
> > +
> Errm. Sure? When transparent hugepages are _enabled_ we don't support this
> feature?
> Confused.

I think large folio support depends on THP, at least for now. I remember
willy mentioning that in a thread. The future plan is to get rid of
this dependency.

> 
> Cheers,
> 
> Hannes
> -- 
> Dr. Hannes Reinecke                  Kernel Storage Architect
> hare@suse.de                                +49 911 74053 688
> SUSE Software Solutions GmbH, Frankenstr. 146, 90461 Nürnberg
> HRB 36809 (AG Nürnberg), GF: I. Totev, A. McDonald, W. Knoblich
> 

-- 
Pankaj Raghav


^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH v5.1] fs: Allow fine-grained control of folio sizes
  2024-05-28 10:20 ` Hannes Reinecke
  2024-05-28 11:37   ` Pankaj Raghav (Samsung)
@ 2024-05-28 11:40   ` Pankaj Raghav (Samsung)
  1 sibling, 0 replies; 12+ messages in thread
From: Pankaj Raghav (Samsung) @ 2024-05-28 11:40 UTC (permalink / raw)
  To: Hannes Reinecke
  Cc: Matthew Wilcox (Oracle),
	akpm, djwong, brauner, david, chandan.babu, ritesh.list,
	john.g.garry, ziy, linux-fsdevel, linux-xfs, linux-mm,
	linux-block, gost.dev, p.raghav, mcgrof

> > + */
> > +static inline void mapping_set_folio_order_range(struct address_space *mapping,
> > +		unsigned int min, unsigned int max)
> > +{
> > +	if (IS_ENABLED(CONFIG_TRANSPARENT_HUGEPAGE))
> > +		return;
> > +
> Errm. Sure? When transparent hugepages are _enabled_ we don't support this
> feature?
> Confused.
> 
Replied too quickly. I think this is a mistake. Thanks. I will fold this
change.

> Cheers,
> 
> Hannes
> -- 
> Dr. Hannes Reinecke                  Kernel Storage Architect
> hare@suse.de                                +49 911 74053 688
> SUSE Software Solutions GmbH, Frankenstr. 146, 90461 Nürnberg
> HRB 36809 (AG Nürnberg), GF: I. Totev, A. McDonald, W. Knoblich
> 

-- 
Pankaj Raghav


^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH v5.1] fs: Allow fine-grained control of folio sizes
  2024-05-28 10:13     ` Pankaj Raghav (Samsung)
@ 2024-05-28 12:13       ` Matthew Wilcox
  0 siblings, 0 replies; 12+ messages in thread
From: Matthew Wilcox @ 2024-05-28 12:13 UTC (permalink / raw)
  To: Pankaj Raghav (Samsung)
  Cc: akpm, djwong, brauner, david, chandan.babu, hare, ritesh.list,
	john.g.garry, ziy, linux-fsdevel, linux-xfs, linux-mm,
	linux-block, gost.dev, p.raghav, mcgrof

On Tue, May 28, 2024 at 10:13:32AM +0000, Pankaj Raghav (Samsung) wrote:
> Btw, I noticed you have removed mapping_align_start_index(). I will add
> it back in.

Add it in the patch which uses it, not this patch.


^ permalink raw reply	[flat|nested] 12+ messages in thread

end of thread, other threads:[~2024-05-28 12:13 UTC | newest]

Thread overview: 12+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2024-05-27 21:01 [PATCH v5.1] fs: Allow fine-grained control of folio sizes Matthew Wilcox (Oracle)
2024-05-27 22:09 ` Pankaj Raghav (Samsung)
2024-05-27 22:39   ` Matthew Wilcox
2024-05-27 22:43     ` Matthew Wilcox
2024-05-27 23:00       ` Dave Chinner
2024-05-28  9:12         ` Pankaj Raghav (Samsung)
2024-05-28  9:45           ` Dave Chinner
2024-05-28 10:13     ` Pankaj Raghav (Samsung)
2024-05-28 12:13       ` Matthew Wilcox
2024-05-28 10:20 ` Hannes Reinecke
2024-05-28 11:37   ` Pankaj Raghav (Samsung)
2024-05-28 11:40   ` Pankaj Raghav (Samsung)

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox