[PATCH 0/4] Support large folios for tmpfs

* [PATCH 0/4] Support large folios for tmpfs
@ 2024-11-08  4:12 Baolin Wang
  2024-11-08  4:12 ` [PATCH 1/4] mm: factor out the order calculation into a new helper Baolin Wang
                   ` (4 more replies)
  0 siblings, 5 replies; 13+ messages in thread
From: Baolin Wang @ 2024-11-08  4:12 UTC (permalink / raw)
  To: akpm, hughd
  Cc: willy, david, wangkefeng.wang, 21cnbao, ryan.roberts, ioworker0,
	da.gomez, baolin.wang, linux-mm, linux-kernel

Traditionally, tmpfs only supported PMD-sized huge folios. However, now that
other file systems support large folios of any size and anonymous memory has
been extended to support mTHP, we should not restrict tmpfs to allocating only
PMD-sized huge folios, which makes it a special case. Instead, we should allow
tmpfs to allocate large folios of any size.

Considering that tmpfs already has the 'huge=' option to control huge folio
allocation, we can extend the 'huge=' option to allow huge folios of any
size. The semantics of the 'huge=' mount option are:

huge=never: no huge folios of any size
huge=always: huge folios of any size
huge=within_size: like 'always' but respect the i_size
huge=advise: like 'always' if requested with fadvise()/madvise()

Note: for tmpfs mmap() faults, there is no write size hint, so tmpfs still
allocates PMD-sized huge folios if huge=always/within_size/advise is set.

Moreover, the 'deny' and 'force' testing options controlled by
'/sys/kernel/mm/transparent_hugepage/shmem_enabled' retain the same
semantics: 'deny' disables large folios of any size for tmpfs, while
'force' enables PMD-sized large folios for tmpfs.

Any comments and suggestions are appreciated. Thanks.

Hi David,
I did not add a new Kconfig option to control the default behavior of 'huge='
in the current version. I have not changed the default behavior at this
time, and let's see if there is a need for this.

Changes from RFC v3:
 - Drop the huge=write_size option.
 - Allow huge folios of any size for the 'huge=' option.
 - Update the documentation, per David.

Changes from RFC v2:
 - Drop mTHP interfaces to control huge page allocation, per Matthew.
 - Add a new helper to calculate the order, suggested by Matthew.
 - Add a new huge=write_size option to allocate large folios based on
   the write size.
 - Add a new patch to update the documentation.

Changes from RFC v1:
 - Drop patch 1.
 - Use 'write_end' to calculate the length in shmem_allowable_huge_orders().
 - Update shmem_mapping_size_order() per Daniel.

Baolin Wang (3):
  mm: factor out the order calculation into a new helper
  mm: shmem: change shmem_huge_global_enabled() to return huge order
    bitmap
  mm: shmem: add large folio support for tmpfs

David Hildenbrand (1):
  docs: tmpfs: update the huge folios policy for tmpfs and shmem

 Documentation/admin-guide/mm/transhuge.rst |  52 ++++++---
 include/linux/pagemap.h                    |  16 ++-
 mm/shmem.c                                 | 128 ++++++++++++++++-----
 3 files changed, 146 insertions(+), 50 deletions(-)

-- 
2.39.3


* [PATCH 1/4] mm: factor out the order calculation into a new helper
  2024-11-08  4:12 [PATCH 0/4] Support large folios for tmpfs Baolin Wang
@ 2024-11-08  4:12 ` Baolin Wang
  2024-11-08  4:29   ` Barry Song
  2024-11-11 19:51   ` David Hildenbrand
  2024-11-08  4:12 ` [PATCH 2/4] mm: shmem: change shmem_huge_global_enabled() to return huge order bitmap Baolin Wang
                   ` (3 subsequent siblings)
  4 siblings, 2 replies; 13+ messages in thread
From: Baolin Wang @ 2024-11-08  4:12 UTC (permalink / raw)
  To: akpm, hughd
  Cc: willy, david, wangkefeng.wang, 21cnbao, ryan.roberts, ioworker0,
	da.gomez, baolin.wang, linux-mm, linux-kernel

Factor out the order calculation into a new helper, which can be reused
by shmem in the following patch.

Suggested-by: Matthew Wilcox <willy@infradead.org>
Signed-off-by: Baolin Wang <baolin.wang@linux.alibaba.com>
---
 include/linux/pagemap.h | 16 +++++++++++++---
 1 file changed, 13 insertions(+), 3 deletions(-)

diff --git a/include/linux/pagemap.h b/include/linux/pagemap.h
index bcf0865a38ae..d796c8a33647 100644
--- a/include/linux/pagemap.h
+++ b/include/linux/pagemap.h
@@ -727,6 +727,16 @@ typedef unsigned int __bitwise fgf_t;
 
 #define FGP_WRITEBEGIN		(FGP_LOCK | FGP_WRITE | FGP_CREAT | FGP_STABLE)
 
+static inline unsigned int filemap_get_order(size_t size)
+{
+	unsigned int shift = ilog2(size);
+
+	if (shift <= PAGE_SHIFT)
+		return 0;
+
+	return shift - PAGE_SHIFT;
+}
+
 /**
  * fgf_set_order - Encode a length in the fgf_t flags.
  * @size: The suggested size of the folio to create.
@@ -740,11 +750,11 @@ typedef unsigned int __bitwise fgf_t;
  */
 static inline fgf_t fgf_set_order(size_t size)
 {
-	unsigned int shift = ilog2(size);
+	unsigned int order = filemap_get_order(size);
 
-	if (shift <= PAGE_SHIFT)
+	if (!order)
 		return 0;
-	return (__force fgf_t)((shift - PAGE_SHIFT) << 26);
+	return (__force fgf_t)(order << 26);
 }
 
 void *filemap_get_entry(struct address_space *mapping, pgoff_t index);
-- 
2.39.3




* Re: [PATCH 1/4] mm: factor out the order calculation into a new helper
  2024-11-08  4:12 ` [PATCH 1/4] mm: factor out the order calculation into a new helper Baolin Wang
@ 2024-11-08  4:29   ` Barry Song
  2024-11-11 19:51   ` David Hildenbrand
  1 sibling, 0 replies; 13+ messages in thread
From: Barry Song @ 2024-11-08  4:29 UTC (permalink / raw)
  To: Baolin Wang
  Cc: akpm, hughd, willy, david, wangkefeng.wang, ryan.roberts,
	ioworker0, da.gomez, linux-mm, linux-kernel

On Fri, Nov 8, 2024 at 5:13 PM Baolin Wang
<baolin.wang@linux.alibaba.com> wrote:
>
> Factor out the order calculation into a new helper, which can be reused
> by shmem in the following patch.
>
> Suggested-by: Matthew Wilcox <willy@infradead.org>
> Signed-off-by: Baolin Wang <baolin.wang@linux.alibaba.com>

Reviewed-by: Barry Song <baohua@kernel.org>

> ---
>  include/linux/pagemap.h | 16 +++++++++++++---
>  1 file changed, 13 insertions(+), 3 deletions(-)
>
> diff --git a/include/linux/pagemap.h b/include/linux/pagemap.h
> index bcf0865a38ae..d796c8a33647 100644
> --- a/include/linux/pagemap.h
> +++ b/include/linux/pagemap.h
> @@ -727,6 +727,16 @@ typedef unsigned int __bitwise fgf_t;
>
>  #define FGP_WRITEBEGIN         (FGP_LOCK | FGP_WRITE | FGP_CREAT | FGP_STABLE)
>
> +static inline unsigned int filemap_get_order(size_t size)
> +{
> +       unsigned int shift = ilog2(size);
> +
> +       if (shift <= PAGE_SHIFT)
> +               return 0;
> +
> +       return shift - PAGE_SHIFT;
> +}
> +
>  /**
>   * fgf_set_order - Encode a length in the fgf_t flags.
>   * @size: The suggested size of the folio to create.
> @@ -740,11 +750,11 @@ typedef unsigned int __bitwise fgf_t;
>   */
>  static inline fgf_t fgf_set_order(size_t size)
>  {
> -       unsigned int shift = ilog2(size);
> +       unsigned int order = filemap_get_order(size);
>
> -       if (shift <= PAGE_SHIFT)
> +       if (!order)
>                 return 0;
> -       return (__force fgf_t)((shift - PAGE_SHIFT) << 26);
> +       return (__force fgf_t)(order << 26);
>  }
>
>  void *filemap_get_entry(struct address_space *mapping, pgoff_t index);
> --
> 2.39.3
>



* Re: [PATCH 1/4] mm: factor out the order calculation into a new helper
  2024-11-08  4:12 ` [PATCH 1/4] mm: factor out the order calculation into a new helper Baolin Wang
  2024-11-08  4:29   ` Barry Song
@ 2024-11-11 19:51   ` David Hildenbrand
  1 sibling, 0 replies; 13+ messages in thread
From: David Hildenbrand @ 2024-11-11 19:51 UTC (permalink / raw)
  To: Baolin Wang, akpm, hughd
  Cc: willy, wangkefeng.wang, 21cnbao, ryan.roberts, ioworker0,
	da.gomez, linux-mm, linux-kernel

On 08.11.24 05:12, Baolin Wang wrote:
> Factor out the order calculation into a new helper, which can be reused
> by shmem in the following patch.
> 
> Suggested-by: Matthew Wilcox <willy@infradead.org>
> Signed-off-by: Baolin Wang <baolin.wang@linux.alibaba.com>
> ---
>   include/linux/pagemap.h | 16 +++++++++++++---
>   1 file changed, 13 insertions(+), 3 deletions(-)
> 
> diff --git a/include/linux/pagemap.h b/include/linux/pagemap.h
> index bcf0865a38ae..d796c8a33647 100644
> --- a/include/linux/pagemap.h
> +++ b/include/linux/pagemap.h
> @@ -727,6 +727,16 @@ typedef unsigned int __bitwise fgf_t;
>   
>   #define FGP_WRITEBEGIN		(FGP_LOCK | FGP_WRITE | FGP_CREAT | FGP_STABLE)
>   
> +static inline unsigned int filemap_get_order(size_t size)
> +{
> +	unsigned int shift = ilog2(size);
> +
> +	if (shift <= PAGE_SHIFT)
> +		return 0;
> +
> +	return shift - PAGE_SHIFT;
> +}
> +

I'd have added some words somewhere, how this differs to get_order() 
[calculated order might not have space to fit the complete size] and why 
[avoid over-allocation].
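
The difference described above can be sketched in userspace as follows. This is only a model under stated assumptions: PAGE_SHIFT is fixed at 12 (4 KiB pages), ilog2_floor() is a hypothetical stand-in for the kernel's ilog2(), and order_round_up() imitates the rounding behaviour of get_order() rather than reproducing its implementation.

```c
#include <assert.h>
#include <stddef.h>

#define PAGE_SHIFT 12	/* assumption: 4 KiB base pages */

/* floor(log2(size)); hypothetical stand-in for the kernel's ilog2() */
static unsigned int ilog2_floor(size_t size)
{
	unsigned int shift = 0;

	assert(size > 0);
	while ((size >>= 1) != 0)
		shift++;
	return shift;
}

/* Same arithmetic as filemap_get_order() in the patch: rounds down. */
static unsigned int order_round_down(size_t size)
{
	unsigned int shift = ilog2_floor(size);

	if (shift <= PAGE_SHIFT)
		return 0;
	return shift - PAGE_SHIFT;
}

/* Models get_order(): the smallest order whose folio covers all of size. */
static unsigned int order_round_up(size_t size)
{
	unsigned int order = 0;

	while (((size_t)1 << (order + PAGE_SHIFT)) < size)
		order++;
	return order;
}
```

For a 12 KiB size, the round-down variant yields order 1 (8 KiB, which does not hold the whole size), while the round-up variant yields order 2 (16 KiB, which over-allocates by 4 KiB). For exact powers of two the two agree.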

Reviewed-by: David Hildenbrand <david@redhat.com>

-- 
Cheers,

David / dhildenb




* [PATCH 2/4] mm: shmem: change shmem_huge_global_enabled() to return huge order bitmap
  2024-11-08  4:12 [PATCH 0/4] Support large folios for tmpfs Baolin Wang
  2024-11-08  4:12 ` [PATCH 1/4] mm: factor out the order calculation into a new helper Baolin Wang
@ 2024-11-08  4:12 ` Baolin Wang
  2024-11-08 15:11   ` kernel test robot
  2024-11-08  4:12 ` [PATCH 3/4] mm: shmem: add large folio support for tmpfs Baolin Wang
                   ` (2 subsequent siblings)
  4 siblings, 1 reply; 13+ messages in thread
From: Baolin Wang @ 2024-11-08  4:12 UTC (permalink / raw)
  To: akpm, hughd
  Cc: willy, david, wangkefeng.wang, 21cnbao, ryan.roberts, ioworker0,
	da.gomez, baolin.wang, linux-mm, linux-kernel

Change shmem_huge_global_enabled() to return the bitmap of suitable huge
orders, or 0 if huge pages are not allowed. This prepares for supporting
the allocation of various huge orders for tmpfs in the following
patches.

No functional changes.

Signed-off-by: Baolin Wang <baolin.wang@linux.alibaba.com>
---
 mm/shmem.c | 45 ++++++++++++++++++++++++---------------------
 1 file changed, 24 insertions(+), 21 deletions(-)

diff --git a/mm/shmem.c b/mm/shmem.c
index 579e58cb3262..361da46c4bd5 100644
--- a/mm/shmem.c
+++ b/mm/shmem.c
@@ -549,37 +549,37 @@ static bool shmem_confirm_swap(struct address_space *mapping,
 
 static int shmem_huge __read_mostly = SHMEM_HUGE_NEVER;
 
-static bool shmem_huge_global_enabled(struct inode *inode, pgoff_t index,
-				      loff_t write_end, bool shmem_huge_force,
-				      unsigned long vm_flags)
+static unsigned int shmem_huge_global_enabled(struct inode *inode, pgoff_t index,
+					      loff_t write_end, bool shmem_huge_force,
+					      unsigned long vm_flags)
 {
 	loff_t i_size;
 
 	if (HPAGE_PMD_ORDER > MAX_PAGECACHE_ORDER)
-		return false;
+		return 0;
 	if (!S_ISREG(inode->i_mode))
-		return false;
+		return 0;
 	if (shmem_huge == SHMEM_HUGE_DENY)
-		return false;
+		return 0;
 	if (shmem_huge_force || shmem_huge == SHMEM_HUGE_FORCE)
-		return true;
+		return BIT(HPAGE_PMD_ORDER);
 
 	switch (SHMEM_SB(inode->i_sb)->huge) {
 	case SHMEM_HUGE_ALWAYS:
-		return true;
+		return BIT(HPAGE_PMD_ORDER);
 	case SHMEM_HUGE_WITHIN_SIZE:
 		index = round_up(index + 1, HPAGE_PMD_NR);
 		i_size = max(write_end, i_size_read(inode));
 		i_size = round_up(i_size, PAGE_SIZE);
 		if (i_size >> PAGE_SHIFT >= index)
-			return true;
+			return BIT(HPAGE_PMD_ORDER);
 		fallthrough;
 	case SHMEM_HUGE_ADVISE:
 		if (vm_flags & VM_HUGEPAGE)
-			return true;
+			return BIT(HPAGE_PMD_ORDER);
 		fallthrough;
 	default:
-		return false;
+		return 0;
 	}
 }
 
@@ -774,11 +774,11 @@ static unsigned long shmem_unused_huge_shrink(struct shmem_sb_info *sbinfo,
 	return 0;
 }
 
-static bool shmem_huge_global_enabled(struct inode *inode, pgoff_t index,
-				      loff_t write_end, bool shmem_huge_force,
-				      unsigned long vm_flags)
+static unsigned int shmem_huge_global_enabled(struct inode *inode, pgoff_t index,
+					      loff_t write_end, bool shmem_huge_force,
+					      unsigned long vm_flags)
 {
-	return false;
+	return 0;
 }
 #endif /* CONFIG_TRANSPARENT_HUGEPAGE */
 
@@ -1173,8 +1173,11 @@ static int shmem_getattr(struct mnt_idmap *idmap,
 	generic_fillattr(idmap, request_mask, inode, stat);
 	inode_unlock_shared(inode);
 
-	if (shmem_huge_global_enabled(inode, 0, 0, false, 0))
+#ifdef CONFIG_TRANSPARENT_HUGEPAGE
+	if (shmem_huge_global_enabled(inode, 0, 0, false, 0) ==
+	    BIT(HPAGE_PMD_ORDER))
 		stat->blksize = HPAGE_PMD_SIZE;
+#endif
 
 	if (request_mask & STATX_BTIME) {
 		stat->result_mask |= STATX_BTIME;
@@ -1682,21 +1685,21 @@ unsigned long shmem_allowable_huge_orders(struct inode *inode,
 	unsigned long mask = READ_ONCE(huge_shmem_orders_always);
 	unsigned long within_size_orders = READ_ONCE(huge_shmem_orders_within_size);
 	unsigned long vm_flags = vma ? vma->vm_flags : 0;
-	bool global_huge;
+	unsigned int global_orders;
 	loff_t i_size;
 	int order;
 
 	if (thp_disabled_by_hw() || (vma && vma_thp_disabled(vma, vm_flags)))
 		return 0;
 
-	global_huge = shmem_huge_global_enabled(inode, index, write_end,
-						shmem_huge_force, vm_flags);
+	global_orders = shmem_huge_global_enabled(inode, index, write_end,
+						  shmem_huge_force, vm_flags);
 	if (!vma || !vma_is_anon_shmem(vma)) {
 		/*
 		 * For tmpfs, we now only support PMD sized THP if huge page
 		 * is enabled, otherwise fallback to order 0.
 		 */
-		return global_huge ? BIT(HPAGE_PMD_ORDER) : 0;
+		return global_orders;
 	}
 
 	/*
@@ -1729,7 +1732,7 @@ unsigned long shmem_allowable_huge_orders(struct inode *inode,
 	if (vm_flags & VM_HUGEPAGE)
 		mask |= READ_ONCE(huge_shmem_orders_madvise);
 
-	if (global_huge)
+	if (global_orders > 0)
 		mask |= READ_ONCE(huge_shmem_orders_inherit);
 
 	return THP_ORDERS_ALL_FILE_DEFAULT & mask;
-- 
2.39.3




* Re: [PATCH 2/4] mm: shmem: change shmem_huge_global_enabled() to return huge order bitmap
  2024-11-08  4:12 ` [PATCH 2/4] mm: shmem: change shmem_huge_global_enabled() to return huge order bitmap Baolin Wang
@ 2024-11-08 15:11   ` kernel test robot
  0 siblings, 0 replies; 13+ messages in thread
From: kernel test robot @ 2024-11-08 15:11 UTC (permalink / raw)
  To: Baolin Wang, akpm, hughd
  Cc: oe-kbuild-all, willy, david, wangkefeng.wang, 21cnbao,
	ryan.roberts, ioworker0, da.gomez, baolin.wang, linux-mm,
	linux-kernel

Hi Baolin,

kernel test robot noticed the following build warnings:

[auto build test WARNING on akpm-mm/mm-everything]
[also build test WARNING on next-20241108]
[cannot apply to linus/master v6.12-rc6]
[If your patch is applied to the wrong git tree, kindly drop us a note.
And when submitting patch, we suggest to use '--base' as documented in
https://git-scm.com/docs/git-format-patch#_base_tree_information]

url:    https://github.com/intel-lab-lkp/linux/commits/Baolin-Wang/mm-factor-out-the-order-calculation-into-a-new-helper/20241108-121545
base:   https://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm.git mm-everything
patch link:    https://lore.kernel.org/r/a0d41cdc3491878260277e8c18a3e71deb2bc1fb.1731038280.git.baolin.wang%40linux.alibaba.com
patch subject: [PATCH 2/4] mm: shmem: change shmem_huge_global_enabled() to return huge order bitmap
config: arc-allnoconfig (https://download.01.org/0day-ci/archive/20241108/202411082236.7mwWSsNe-lkp@intel.com/config)
compiler: arc-elf-gcc (GCC) 13.2.0
reproduce (this is a W=1 build): (https://download.01.org/0day-ci/archive/20241108/202411082236.7mwWSsNe-lkp@intel.com/reproduce)

If you fix the issue in a separate patch/commit (i.e. not just a new version of
the same patch/commit), kindly add following tags
| Reported-by: kernel test robot <lkp@intel.com>
| Closes: https://lore.kernel.org/oe-kbuild-all/202411082236.7mwWSsNe-lkp@intel.com/

All warnings (new ones prefixed by >>):

>> mm/shmem.c:777:21: warning: 'shmem_huge_global_enabled' defined but not used [-Wunused-function]
     777 | static unsigned int shmem_huge_global_enabled(struct inode *inode, pgoff_t index,
         |                     ^~~~~~~~~~~~~~~~~~~~~~~~~


vim +/shmem_huge_global_enabled +777 mm/shmem.c

   776	
 > 777	static unsigned int shmem_huge_global_enabled(struct inode *inode, pgoff_t index,
   778						      loff_t write_end, bool shmem_huge_force,
   779						      unsigned long vm_flags)
   780	{
   781		return 0;
   782	}
   783	#endif /* CONFIG_TRANSPARENT_HUGEPAGE */
   784	
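
The warning appears because, with CONFIG_TRANSPARENT_HUGEPAGE disabled, the patch wraps the shmem_getattr() call in an #ifdef and the stub reported above is left without a caller. The sketch below models one common way to silence this kind of warning, marking the stub 'static inline'; it is an illustration only, not necessarily how the series resolved it. CONFIG_THP_MODEL, PMD_ORDER_MODEL, huge_global_enabled_model() and blksize_model() are hypothetical names.

```c
#include <assert.h>
#include <stdbool.h>

#define PMD_ORDER_MODEL 9	/* assumption: x86-64 with 4 KiB pages */

/* Leave CONFIG_THP_MODEL undefined to model a build without THP. */
#ifdef CONFIG_THP_MODEL
static unsigned int huge_global_enabled_model(bool force)
{
	return force ? 1u << PMD_ORDER_MODEL : 0;
}
#else
/* An unused 'static inline' function does not trigger -Wunused-function. */
static inline unsigned int huge_global_enabled_model(bool force)
{
	(void)force;
	return 0;
}
#endif

/* Mirrors the #ifdef'ed caller: without THP nothing here uses the stub. */
unsigned long blksize_model(unsigned long base_blksize)
{
	unsigned long blksize = base_blksize;

	assert(base_blksize != 0);
#ifdef CONFIG_THP_MODEL
	if (huge_global_enabled_model(true) == 1u << PMD_ORDER_MODEL)
		blksize = 1ul << (PMD_ORDER_MODEL + 12);
#endif
	return blksize;
}
```

Other options would be to keep the caller outside the #ifdef or to drop the stub entirely when no caller remains.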

-- 
0-DAY CI Kernel Test Service
https://github.com/intel/lkp-tests/wiki



* [PATCH 3/4] mm: shmem: add large folio support for tmpfs
  2024-11-08  4:12 [PATCH 0/4] Support large folios for tmpfs Baolin Wang
  2024-11-08  4:12 ` [PATCH 1/4] mm: factor out the order calculation into a new helper Baolin Wang
  2024-11-08  4:12 ` [PATCH 2/4] mm: shmem: change shmem_huge_global_enabled() to return huge order bitmap Baolin Wang
@ 2024-11-08  4:12 ` Baolin Wang
  2024-11-08 11:25   ` kernel test robot
  2024-11-08  4:12 ` [PATCH 4/4] docs: tmpfs: update the huge folios policy for tmpfs and shmem Baolin Wang
  2024-11-08 15:30 ` [PATCH 0/4] Support large folios for tmpfs David Hildenbrand
  4 siblings, 1 reply; 13+ messages in thread
From: Baolin Wang @ 2024-11-08  4:12 UTC (permalink / raw)
  To: akpm, hughd
  Cc: willy, david, wangkefeng.wang, 21cnbao, ryan.roberts, ioworker0,
	da.gomez, baolin.wang, linux-mm, linux-kernel

Add large folio support for the tmpfs write and fallocate paths, matching
the high-order preference mechanism that the iomap buffered IO path uses
through __filemap_get_folio().

Add shmem_mapping_size_orders() to get a hint for the orders of the folio
based on the file size, which takes care of the mapping requirements.

Traditionally, tmpfs only supported PMD-sized huge folios. However, now that
other file systems support large folios of any size and anonymous memory has
been extended to support mTHP, we should not restrict tmpfs to allocating only
PMD-sized huge folios, which makes it a special case. Instead, we should allow
tmpfs to allocate large folios of any size.

Considering that tmpfs already has the 'huge=' option to control huge folio
allocation, we can extend the 'huge=' option to allow huge folios of any
size. The semantics of the 'huge=' mount option are:

huge=never: no huge folios of any size
huge=always: huge folios of any size
huge=within_size: like 'always' but respect the i_size
huge=advise: like 'always' if requested with fadvise()/madvise()

Note: for tmpfs mmap() faults, there is no write size hint, so tmpfs still
allocates PMD-sized huge folios if huge=always/within_size/advise is set.

Moreover, the 'deny' and 'force' testing options controlled by
'/sys/kernel/mm/transparent_hugepage/shmem_enabled' retain the same
semantics: 'deny' disables large folios of any size for tmpfs, while
'force' enables PMD-sized large folios for tmpfs.

Co-developed-by: Daniel Gomez <da.gomez@samsung.com>
Signed-off-by: Daniel Gomez <da.gomez@samsung.com>
Signed-off-by: Baolin Wang <baolin.wang@linux.alibaba.com>
---
 mm/shmem.c | 91 +++++++++++++++++++++++++++++++++++++++++++++---------
 1 file changed, 77 insertions(+), 14 deletions(-)

diff --git a/mm/shmem.c b/mm/shmem.c
index 361da46c4bd5..98503a93a404 100644
--- a/mm/shmem.c
+++ b/mm/shmem.c
@@ -549,10 +549,50 @@ static bool shmem_confirm_swap(struct address_space *mapping,
 
 static int shmem_huge __read_mostly = SHMEM_HUGE_NEVER;
 
+/**
+ * shmem_mapping_size_orders - Get allowable folio orders for the given file size.
+ * @mapping: Target address_space.
+ * @index: The page index.
+ * @size: The suggested size of the folio to create.
+ *
+ * This returns a high order for folios (when supported) based on the file size
+ * which the mapping currently allows at the given index. The index is relevant
+ * due to alignment considerations the mapping might have. The returned order
+ * may be less than the size passed.
+ *
+ * Return: The orders.
+ */
+static inline unsigned int
+shmem_mapping_size_orders(struct address_space *mapping, pgoff_t index, loff_t write_end)
+{
+	unsigned int order;
+	size_t size;
+
+	if (!mapping_large_folio_support(mapping) || !write_end)
+		return 0;
+
+	/* Calculate the write size based on the write_end */
+	size = write_end - (index << PAGE_SHIFT);
+	order = filemap_get_order(size);
+	if (!order)
+		return 0;
+
+	/* If we're not aligned, allocate a smaller folio */
+	if (index & ((1UL << order) - 1))
+		order = __ffs(index);
+
+	order = min_t(size_t, order, MAX_PAGECACHE_ORDER);
+	return order > 0 ? BIT(order + 1) - 1 : 0;
+}
+
 static unsigned int shmem_huge_global_enabled(struct inode *inode, pgoff_t index,
 					      loff_t write_end, bool shmem_huge_force,
+					      struct vm_area_struct *vma,
 					      unsigned long vm_flags)
 {
+	unsigned long within_size_orders;
+	unsigned int order;
+	pgoff_t aligned_index;
 	loff_t i_size;
 
 	if (HPAGE_PMD_ORDER > MAX_PAGECACHE_ORDER)
@@ -564,15 +604,41 @@ static unsigned int shmem_huge_global_enabled(struct inode *inode, pgoff_t index
 	if (shmem_huge_force || shmem_huge == SHMEM_HUGE_FORCE)
 		return BIT(HPAGE_PMD_ORDER);
 
+	/*
+	 * The huge order allocation for anon shmem is controlled through
+	 * the mTHP interface, so we still use PMD-sized huge order to
+	 * check whether global control is enabled.
+	 *
+	 * For tmpfs mmap()'s huge order, we still use PMD-sized order to
+	 * allocate huge pages due to lack of a write size hint.
+	 *
+	 * Otherwise, tmpfs will allow getting a highest order hint based on
+	 * the size of write and fallocate paths, then will try each allowable
+	 * huge orders.
+	 */
 	switch (SHMEM_SB(inode->i_sb)->huge) {
 	case SHMEM_HUGE_ALWAYS:
-		return BIT(HPAGE_PMD_ORDER);
-	case SHMEM_HUGE_WITHIN_SIZE:
-		index = round_up(index + 1, HPAGE_PMD_NR);
-		i_size = max(write_end, i_size_read(inode));
-		i_size = round_up(i_size, PAGE_SIZE);
-		if (i_size >> PAGE_SHIFT >= index)
+		if (vma)
 			return BIT(HPAGE_PMD_ORDER);
+
+		return shmem_mapping_size_orders(inode->i_mapping, index, write_end);
+	case SHMEM_HUGE_WITHIN_SIZE:
+		if (vma)
+			within_size_orders = BIT(HPAGE_PMD_ORDER);
+		else
+			within_size_orders = shmem_mapping_size_orders(inode->i_mapping,
+								       index, write_end);
+
+		order = highest_order(within_size_orders);
+		while (within_size_orders) {
+			aligned_index = round_up(index + 1, 1 << order);
+			i_size = max(write_end, i_size_read(inode));
+			i_size = round_up(i_size, PAGE_SIZE);
+			if (i_size >> PAGE_SHIFT >= aligned_index)
+				return within_size_orders;
+
+			order = next_order(&within_size_orders, order);
+		}
 		fallthrough;
 	case SHMEM_HUGE_ADVISE:
 		if (vm_flags & VM_HUGEPAGE)
@@ -776,6 +842,7 @@ static unsigned long shmem_unused_huge_shrink(struct shmem_sb_info *sbinfo,
 
 static unsigned int shmem_huge_global_enabled(struct inode *inode, pgoff_t index,
 					      loff_t write_end, bool shmem_huge_force,
+					      struct vm_area_struct *vma,
 					      unsigned long vm_flags)
 {
 	return 0;
@@ -1174,7 +1241,7 @@ static int shmem_getattr(struct mnt_idmap *idmap,
 	inode_unlock_shared(inode);
 
 #ifdef CONFIG_TRANSPARENT_HUGEPAGE
-	if (shmem_huge_global_enabled(inode, 0, 0, false, 0) ==
+	if (shmem_huge_global_enabled(inode, 0, 0, false, NULL, 0) ==
 	    BIT(HPAGE_PMD_ORDER))
 		stat->blksize = HPAGE_PMD_SIZE;
 #endif
@@ -1693,14 +1760,10 @@ unsigned long shmem_allowable_huge_orders(struct inode *inode,
 		return 0;
 
 	global_orders = shmem_huge_global_enabled(inode, index, write_end,
-						  shmem_huge_force, vm_flags);
-	if (!vma || !vma_is_anon_shmem(vma)) {
-		/*
-		 * For tmpfs, we now only support PMD sized THP if huge page
-		 * is enabled, otherwise fallback to order 0.
-		 */
+						  shmem_huge_force, vma, vm_flags);
+	/* Tmpfs huge pages allocation? */
+	if (!vma || !vma_is_anon_shmem(vma))
 		return global_orders;
-	}
 
 	/*
 	 * Following the 'deny' semantics of the top level, force the huge
-- 
2.39.3
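
The within_size loop in the patch above can be followed with a small userspace model. This is a sketch under assumptions: 4 KiB pages, highest_order() and next_order() re-implemented here to behave like the kernel helpers of the same name, and hypothetical names within_size_model() and round_up_ul(). The fallthrough to the 'advise' case is left out; the model returns 0 instead.

```c
#include <assert.h>

#define PAGE_SHIFT	12			/* assumption: 4 KiB base pages */
#define PAGE_SIZE	(1UL << PAGE_SHIFT)

/* Index of the highest set bit, or -1 for an empty bitmap. */
static int highest_order(unsigned long orders)
{
	int order = -1;

	while (orders) {
		orders >>= 1;
		order++;
	}
	return order;
}

/* Clear the order just tried and return the next lower candidate. */
static int next_order(unsigned long *orders, int prev)
{
	*orders &= ~(1UL << prev);
	return highest_order(*orders);
}

/* Round x up to a power-of-two alignment. */
static unsigned long round_up_ul(unsigned long x, unsigned long align)
{
	assert(align && !(align & (align - 1)));
	return (x + align - 1) & ~(align - 1);
}

/*
 * Walk the candidate orders from highest to lowest and return the
 * remaining bitmap once a folio of the current order, aligned at the
 * index, would end within max(write_end, i_size).
 */
static unsigned long within_size_model(unsigned long orders,
				       unsigned long index,
				       unsigned long write_end,
				       unsigned long i_size)
{
	int order = highest_order(orders);
	unsigned long aligned_index, size;

	while (orders) {
		aligned_index = round_up_ul(index + 1, 1UL << order);
		size = write_end > i_size ? write_end : i_size;
		size = round_up_ul(size, PAGE_SIZE);
		if ((size >> PAGE_SHIFT) >= aligned_index)
			return orders;
		order = next_order(&orders, order);
	}
	return 0;
}
```

With candidate orders 0 to 4 (bitmap 0x1f) at index 0, a 64 KiB file keeps the full bitmap, while a 20 KiB file (5 pages) rejects orders 4 and 3 and returns 0x07. A bitmap holding only order 9 with a 1 MiB file returns 0.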




* Re: [PATCH 3/4] mm: shmem: add large folio support for tmpfs
  2024-11-08  4:12 ` [PATCH 3/4] mm: shmem: add large folio support for tmpfs Baolin Wang
@ 2024-11-08 11:25   ` kernel test robot
  0 siblings, 0 replies; 13+ messages in thread
From: kernel test robot @ 2024-11-08 11:25 UTC (permalink / raw)
  To: Baolin Wang, akpm, hughd
  Cc: oe-kbuild-all, willy, david, wangkefeng.wang, 21cnbao,
	ryan.roberts, ioworker0, da.gomez, baolin.wang, linux-mm,
	linux-kernel

Hi Baolin,

kernel test robot noticed the following build warnings:

[auto build test WARNING on akpm-mm/mm-everything]
[also build test WARNING on next-20241108]
[cannot apply to linus/master v6.12-rc6]
[If your patch is applied to the wrong git tree, kindly drop us a note.
And when submitting patch, we suggest to use '--base' as documented in
https://git-scm.com/docs/git-format-patch#_base_tree_information]

url:    https://github.com/intel-lab-lkp/linux/commits/Baolin-Wang/mm-factor-out-the-order-calculation-into-a-new-helper/20241108-121545
base:   https://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm.git mm-everything
patch link:    https://lore.kernel.org/r/e2f4e483f75e54be0654fafb2147822faacac16d.1731038280.git.baolin.wang%40linux.alibaba.com
patch subject: [PATCH 3/4] mm: shmem: add large folio support for tmpfs
config: x86_64-rhel-8.3 (https://download.01.org/0day-ci/archive/20241108/202411081926.LQ3wEK7l-lkp@intel.com/config)
compiler: gcc-12 (Debian 12.2.0-14) 12.2.0
reproduce (this is a W=1 build): (https://download.01.org/0day-ci/archive/20241108/202411081926.LQ3wEK7l-lkp@intel.com/reproduce)

If you fix the issue in a separate patch/commit (i.e. not just a new version of
the same patch/commit), kindly add following tags
| Reported-by: kernel test robot <lkp@intel.com>
| Closes: https://lore.kernel.org/oe-kbuild-all/202411081926.LQ3wEK7l-lkp@intel.com/

All warnings (new ones prefixed by >>):

>> mm/shmem.c:567: warning: Function parameter or struct member 'write_end' not described in 'shmem_mapping_size_orders'
>> mm/shmem.c:567: warning: Excess function parameter 'size' description in 'shmem_mapping_size_orders'


vim +567 mm/shmem.c

   551	
   552	/**
   553	 * shmem_mapping_size_orders - Get allowable folio orders for the given file size.
   554	 * @mapping: Target address_space.
   555	 * @index: The page index.
   556	 * @size: The suggested size of the folio to create.
   557	 *
   558	 * This returns a high order for folios (when supported) based on the file size
   559	 * which the mapping currently allows at the given index. The index is relevant
   560	 * due to alignment considerations the mapping might have. The returned order
   561	 * may be less than the size passed.
   562	 *
   563	 * Return: The orders.
   564	 */
   565	static inline unsigned int
   566	shmem_mapping_size_orders(struct address_space *mapping, pgoff_t index, loff_t write_end)
 > 567	{
   568		unsigned int order;
   569		size_t size;
   570	
   571		if (!mapping_large_folio_support(mapping) || !write_end)
   572			return 0;
   573	
   574		/* Calculate the write size based on the write_end */
   575		size = write_end - (index << PAGE_SHIFT);
   576		order = filemap_get_order(size);
   577		if (!order)
   578			return 0;
   579	
   580		/* If we're not aligned, allocate a smaller folio */
   581		if (index & ((1UL << order) - 1))
   582			order = __ffs(index);
   583	
   584		order = min_t(size_t, order, MAX_PAGECACHE_ORDER);
   585		return order > 0 ? BIT(order + 1) - 1 : 0;
   586	}
   587	

-- 
0-DAY CI Kernel Test Service
https://github.com/intel/lkp-tests/wiki



* [PATCH 4/4] docs: tmpfs: update the huge folios policy for tmpfs and shmem
  2024-11-08  4:12 [PATCH 0/4] Support large folios for tmpfs Baolin Wang
                   ` (2 preceding siblings ...)
  2024-11-08  4:12 ` [PATCH 3/4] mm: shmem: add large folio support for tmpfs Baolin Wang
@ 2024-11-08  4:12 ` Baolin Wang
  2024-11-08 15:30 ` [PATCH 0/4] Support large folios for tmpfs David Hildenbrand
  4 siblings, 0 replies; 13+ messages in thread
From: Baolin Wang @ 2024-11-08  4:12 UTC (permalink / raw)
  To: akpm, hughd
  Cc: willy, david, wangkefeng.wang, 21cnbao, ryan.roberts, ioworker0,
	da.gomez, baolin.wang, linux-mm, linux-kernel

From: David Hildenbrand <david@redhat.com>

Update the huge folios policy for tmpfs and shmem.

Signed-off-by: David Hildenbrand <david@redhat.com>
Signed-off-by: Baolin Wang <baolin.wang@linux.alibaba.com>
---
 Documentation/admin-guide/mm/transhuge.rst | 52 +++++++++++++++-------
 1 file changed, 36 insertions(+), 16 deletions(-)

diff --git a/Documentation/admin-guide/mm/transhuge.rst b/Documentation/admin-guide/mm/transhuge.rst
index 5034915f4e8e..2a7705bf622d 100644
--- a/Documentation/admin-guide/mm/transhuge.rst
+++ b/Documentation/admin-guide/mm/transhuge.rst
@@ -352,8 +352,21 @@ default to ``never``.
 Hugepages in tmpfs/shmem
 ========================
 
-You can control hugepage allocation policy in tmpfs with mount option
-``huge=``. It can have following values:
+Traditionally, tmpfs only supported a single huge page size ("PMD"). Today,
+it also supports smaller sizes just like anonymous memory, often referred
+to as "multi-size THP" (mTHP). Huge pages of any size are commonly
+represented in the kernel as "large folios".
+
+While there is fine control over the huge page sizes to use for the internal
+shmem mount (see below), ordinary tmpfs mounts will make use of all available
+huge page sizes without any control over the exact sizes, behaving more like
+other file systems.
+
+tmpfs mounts
+------------
+
+The THP allocation policy for tmpfs mounts can be adjusted using the mount
+option: ``huge=``. It can have the following values:
 
 always
     Attempt to allocate huge pages every time we need a new page;
@@ -374,13 +387,9 @@ The default policy is ``never``.
 ``huge=never`` will not attempt to break up huge pages at all, just stop more
 from being allocated.
 
-There's also sysfs knob to control hugepage allocation policy for internal
-shmem mount: /sys/kernel/mm/transparent_hugepage/shmem_enabled. The mount
-is used for SysV SHM, memfds, shared anonymous mmaps (of /dev/zero or
-MAP_ANONYMOUS), GPU drivers' DRM objects, Ashmem.
-
-In addition to policies listed above, shmem_enabled allows two further
-values:
+In addition to policies listed above, the sysfs knob
+/sys/kernel/mm/transparent_hugepage/shmem_enabled will affect the
+allocation policy of tmpfs mounts, when set to the following values:
 
 deny
     For use in emergencies, to force the huge option off from
@@ -388,13 +397,24 @@ deny
 force
     Force the huge option on for all - very useful for testing;
 
-Shmem can also use "multi-size THP" (mTHP) by adding a new sysfs knob to
-control mTHP allocation:
-'/sys/kernel/mm/transparent_hugepage/hugepages-<size>kB/shmem_enabled',
-and its value for each mTHP is essentially consistent with the global
-setting.  An 'inherit' option is added to ensure compatibility with these
-global settings.  Conversely, the options 'force' and 'deny' are dropped,
-which are rather testing artifacts from the old ages.
+shmem / internal tmpfs
+----------------------
+The internal tmpfs mount is used for SysV SHM, memfds, shared anonymous
+mmaps (of /dev/zero or MAP_ANONYMOUS), GPU drivers' DRM objects, Ashmem.
+
+To control the THP allocation policy for this internal tmpfs mount, the
+sysfs knob /sys/kernel/mm/transparent_hugepage/shmem_enabled and the knobs
+per THP size in
+'/sys/kernel/mm/transparent_hugepage/hugepages-<size>kB/shmem_enabled'
+can be used.
+
+The global knob has the same semantics as the ``huge=`` mount options
+for tmpfs mounts, except that the different huge page sizes can be controlled
+individually, and will only use the setting of the global knob when the
+per-size knob is set to 'inherit'.
+
+The options 'force' and 'deny' are dropped for the individual sizes, since
+they are legacy testing artifacts.
 
 always
     Attempt to allocate <size> huge pages every time we need a new page;
-- 
2.39.3
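
For illustration only (not part of the patch): the documentation hunk above says that a per-size shmem_enabled knob falls back to the global knob only when it is set to 'inherit', and that 'force' and 'deny' are not accepted per size. A minimal Python sketch of that resolution rule follows. The function name and the value sets are assumptions made for this example; this is not the kernel implementation.

```python
# Hypothetical model of the knob resolution described in the documentation.
# Value sets are taken from the text; the names are illustrative only.
GLOBAL_VALUES = {"always", "never", "within_size", "advise", "deny", "force"}
PER_SIZE_VALUES = {"always", "inherit", "never", "within_size", "advise"}


def effective_shmem_policy(global_knob, per_size_knob):
    """Return the policy that applies to one THP size of the internal mount."""
    if global_knob not in GLOBAL_VALUES:
        raise ValueError(f"unknown global value: {global_knob}")
    if per_size_knob not in PER_SIZE_VALUES:
        # 'force' and 'deny' are dropped for the individual sizes.
        raise ValueError(f"unknown per-size value: {per_size_knob}")
    # Only 'inherit' defers to the global setting.
    if per_size_knob == "inherit":
        return global_knob
    return per_size_knob


print(effective_shmem_policy("within_size", "inherit"))
print(effective_shmem_policy("never", "always"))
```

In this model a per-size value other than 'inherit' always wins over the global knob, which matches the "controlled individually" wording of the documentation.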




* Re: [PATCH 0/4] Support large folios for tmpfs
  2024-11-08  4:12 [PATCH 0/4] Support large folios for tmpfs Baolin Wang
                   ` (3 preceding siblings ...)
  2024-11-08  4:12 ` [PATCH 4/4] docs: tmpfs: update the huge folios policy for tmpfs and shmem Baolin Wang
@ 2024-11-08 15:30 ` David Hildenbrand
  2024-11-09  7:12   ` Baolin Wang
  4 siblings, 1 reply; 13+ messages in thread
From: David Hildenbrand @ 2024-11-08 15:30 UTC (permalink / raw)
  To: Baolin Wang, akpm, hughd
  Cc: willy, wangkefeng.wang, 21cnbao, ryan.roberts, ioworker0,
	da.gomez, linux-mm, linux-kernel

On 08.11.24 05:12, Baolin Wang wrote:
> Traditionally, tmpfs only supported PMD-sized huge folios. However nowadays
> with other file systems supporting any sized large folios, and extending
> anonymous to support mTHP, we should not restrict tmpfs to allocating only
> PMD-sized huge folios, making it more special. Instead, we should allow
> tmpfs can allocate any sized large folios.
> 
> Considering that tmpfs already has the 'huge=' option to control the huge
> folios allocation, we can extend the 'huge=' option to allow any sized huge
> folios. The semantics of the 'huge=' mount option are:
> 
> huge=never: no any sized huge folios
> huge=always: any sized huge folios
> huge=within_size: like 'always' but respect the i_size
> huge=advise: like 'always' if requested with fadvise()/madvise()
> 
> Note: for tmpfs mmap() faults, due to the lack of a write size hint, still
> allocate the PMD-sized huge folios if huge=always/within_size/advise is set.

So, no fallback to smaller sizes for now in case we fail to allocate a 
PMD one? Of course, this can be added later fairly easily.

> 
> Moreover, the 'deny' and 'force' testing options controlled by
> '/sys/kernel/mm/transparent_hugepage/shmem_enabled', still retain the same
> semantics. The 'deny' can disable any sized large folios for tmpfs, while
> the 'force' can enable PMD sized large folios for tmpfs.
> 
> Any comments and suggestions are appreciated. Thanks.
> 
> Hi David,
> I did not add a new Kconfig option to control the default behavior of 'huge='
> in the current version. I have not changed the default behavior at this
> time, and let's see if there is a need for this.

Likely we want to change the default at some point so people might get a 
benefit in more scenarios automatically. But I did not investigate how 
/tmp is mounted by default on Fedora, for example.

-- 
Cheers,

David / dhildenb




* Re: [PATCH 0/4] Support large folios for tmpfs
  2024-11-08 15:30 ` [PATCH 0/4] Support large folios for tmpfs David Hildenbrand
@ 2024-11-09  7:12   ` Baolin Wang
  2024-11-11 19:47     ` David Hildenbrand
  0 siblings, 1 reply; 13+ messages in thread
From: Baolin Wang @ 2024-11-09  7:12 UTC (permalink / raw)
  To: David Hildenbrand, akpm, hughd
  Cc: willy, wangkefeng.wang, 21cnbao, ryan.roberts, ioworker0,
	da.gomez, linux-mm, linux-kernel



On 2024/11/8 23:30, David Hildenbrand wrote:
> On 08.11.24 05:12, Baolin Wang wrote:
>> Traditionally, tmpfs only supported PMD-sized huge folios. However 
>> nowadays
>> with other file systems supporting any sized large folios, and extending
>> anonymous to support mTHP, we should not restrict tmpfs to allocating 
>> only
>> PMD-sized huge folios, making it more special. Instead, we should allow
>> tmpfs can allocate any sized large folios.
>>
>> Considering that tmpfs already has the 'huge=' option to control the huge
>> folios allocation, we can extend the 'huge=' option to allow any sized 
>> huge
>> folios. The semantics of the 'huge=' mount option are:
>>
>> huge=never: no any sized huge folios
>> huge=always: any sized huge folios
>> huge=within_size: like 'always' but respect the i_size
>> huge=advise: like 'always' if requested with fadvise()/madvise()
>>
>> Note: for tmpfs mmap() faults, due to the lack of a write size hint, 
>> still
>> allocate the PMD-sized huge folios if huge=always/within_size/advise 
>> is set.
> 
> So, no fallback to smaller sizes for now in case we fail to allocate a 
> PMD one? Of course, this can be added later fairly easily.

Right. I have no strong preference on this. If no one objects, I can add 
a fallback to smaller large folios if the PMD sized allocation fails in 
the next version.
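
A rough sketch of what such a fallback could look like, written in Python purely for illustration. It is not code from this series: `alloc_with_fallback`, `toy_alloc` and the PMD order of 9 (2M with 4K base pages) are assumptions made for the example.

```python
def alloc_with_fallback(pmd_order, try_alloc, min_order=0):
    """Try the PMD order first, then each smaller order down to min_order.

    try_alloc(order) returns an object on success and None on failure.
    Returns (order, folio) for the first success, or None if all fail.
    """
    for order in range(pmd_order, min_order - 1, -1):
        folio = try_alloc(order)
        if folio is not None:
            return order, folio
    return None


# Toy allocator: pretend that only orders <= 4 can be satisfied right now.
def toy_alloc(order):
    return f"folio(order={order})" if order <= 4 else None


print(alloc_with_fallback(9, toy_alloc))
```

The loop simply walks the orders from largest to smallest; a real implementation would restrict itself to the orders enabled by the 'huge=' policy rather than trying every order.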

>> Moreover, the 'deny' and 'force' testing options controlled by
>> '/sys/kernel/mm/transparent_hugepage/shmem_enabled', still retain the 
>> same
>> semantics. The 'deny' can disable any sized large folios for tmpfs, while
>> the 'force' can enable PMD sized large folios for tmpfs.
>>
>> Any comments and suggestions are appreciated. Thanks.
>>
>> Hi David,
>> I did not add a new Kconfig option to control the default behavior of 
>> 'huge='
>> in the current version. I have not changed the default behavior at this
>> time, and let's see if there is a need for this.
> 
> Likely we want to change the default at some point so people might get a 
> benefit in more scenarios automatically. But I did not investigate how 
> /tmp is mapped as default by Fedora, for example.

Personally, I think adding a kernel command-line option to change the 
default value might be more useful than a Kconfig option. Anyway, I still 
want to investigate if there is a real need.



* Re: [PATCH 0/4] Support large folios for tmpfs
  2024-11-09  7:12   ` Baolin Wang
@ 2024-11-11 19:47     ` David Hildenbrand
  2024-11-12  3:19       ` Baolin Wang
  0 siblings, 1 reply; 13+ messages in thread
From: David Hildenbrand @ 2024-11-11 19:47 UTC (permalink / raw)
  To: Baolin Wang, akpm, hughd
  Cc: willy, wangkefeng.wang, 21cnbao, ryan.roberts, ioworker0,
	da.gomez, linux-mm, linux-kernel

On 09.11.24 08:12, Baolin Wang wrote:
> 
> 
> On 2024/11/8 23:30, David Hildenbrand wrote:
>> On 08.11.24 05:12, Baolin Wang wrote:
>>> Traditionally, tmpfs only supported PMD-sized huge folios. However
>>> nowadays
>>> with other file systems supporting any sized large folios, and extending
>>> anonymous to support mTHP, we should not restrict tmpfs to allocating
>>> only
>>> PMD-sized huge folios, making it more special. Instead, we should allow
>>> tmpfs can allocate any sized large folios.
>>>
>>> Considering that tmpfs already has the 'huge=' option to control the huge
>>> folios allocation, we can extend the 'huge=' option to allow any sized
>>> huge
>>> folios. The semantics of the 'huge=' mount option are:
>>>
>>> huge=never: no any sized huge folios
>>> huge=always: any sized huge folios
>>> huge=within_size: like 'always' but respect the i_size
>>> huge=advise: like 'always' if requested with fadvise()/madvise()
>>>
>>> Note: for tmpfs mmap() faults, due to the lack of a write size hint,
>>> still
>>> allocate the PMD-sized huge folios if huge=always/within_size/advise
>>> is set.
>>
>> So, no fallback to smaller sizes for now in case we fail to allocate a
>> PMD one? Of course, this can be added later fairly easily.
> 
> Right. I have no strong preference on this. If no one objects, I can add
> a fallback to smaller large folios if the PMD sized allocation fails in
> the next version.

I'm fine with a staged approach, to perform this change separately.

> 
>>> Moreover, the 'deny' and 'force' testing options controlled by
>>> '/sys/kernel/mm/transparent_hugepage/shmem_enabled', still retain the
>>> same
>>> semantics. The 'deny' can disable any sized large folios for tmpfs, while
>>> the 'force' can enable PMD sized large folios for tmpfs.
>>>
>>> Any comments and suggestions are appreciated. Thanks.
>>>
>>> Hi David,
>>> I did not add a new Kconfig option to control the default behavior of
>>> 'huge='
>>> in the current version. I have not changed the default behavior at this
>>> time, and let's see if there is a need for this.
>>
>> Likely we want to change the default at some point so people might get a
>> benefit in more scenarios automatically. But I did not investigate how
>> /tmp is mapped as default by Fedora, for example.
> 
> Personally, adding a cmdline to change the default value might be more
> useful than the Kconfig. Anyway, I still want to investigate if there is
> a real need.

Likely both will be reasonable to have.

FWIW, "systemctl cat tmp.mount" on a Fedora40 system tells me
"Options=mode=1777,strictatime,nosuid,nodev,size=50%%,nr_inodes=1m"

To be precise:

$ grep tmpfs /etc/mtab
vendorfw /usr/lib/firmware/vendor tmpfs rw,relatime,mode=755,inode64 0 0
devtmpfs /dev devtmpfs rw,nosuid,size=4096k,nr_inodes=4063361,mode=755,inode64 0 0
tmpfs /dev/shm tmpfs rw,nosuid,nodev,inode64 0 0
tmpfs /run tmpfs rw,nosuid,nodev,size=6511156k,nr_inodes=819200,mode=755,inode64 0 0
tmpfs /tmp tmpfs rw,nosuid,nodev,size=16277892k,nr_inodes=1048576,inode64 0 0
tmpfs /run/user/100813 tmpfs rw,nosuid,nodev,relatime,size=3255576k,nr_inodes=813894,mode=700,uid=100813,gid=100813,inode64 0 0


Having a way to change the default will likely be extremely helpful.
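
As an illustration of why the default matters here, a small Python sketch (not part of the series) that reads mtab-style lines and reports the 'huge=' policy of each tmpfs mount. It assumes that a mount without an explicit 'huge=' option uses the documented default, 'never'. The function name and the shortened sample are made up for this example.

```python
def tmpfs_huge_policies(mtab_text, default="never"):
    """Return {mount_point: huge_policy} for every tmpfs entry."""
    policies = {}
    for line in mtab_text.splitlines():
        fields = line.split()
        # mtab fields: device, mount point, fs type, options, dump, pass
        if len(fields) < 4 or fields[2] != "tmpfs":
            continue
        policy = default
        for opt in fields[3].split(","):
            if opt.startswith("huge="):
                policy = opt[len("huge="):]
        policies[fields[1]] = policy
    return policies


# Shortened sample taken from the output above.
SAMPLE = """\
devtmpfs /dev devtmpfs rw,nosuid,size=4096k,nr_inodes=4063361,mode=755,inode64 0 0
tmpfs /dev/shm tmpfs rw,nosuid,nodev,inode64 0 0
tmpfs /tmp tmpfs rw,nosuid,nodev,size=16277892k,nr_inodes=1048576,inode64 0 0
"""

for mount_point, policy in tmpfs_huge_policies(SAMPLE).items():
    print(f"{mount_point}: huge={policy}")
```

None of the sample mounts pass 'huge=', so under the stated assumption all of them end up with 'never' unless the default itself can be changed.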

-- 
Cheers,

David / dhildenb




* Re: [PATCH 0/4] Support large folios for tmpfs
  2024-11-11 19:47     ` David Hildenbrand
@ 2024-11-12  3:19       ` Baolin Wang
  0 siblings, 0 replies; 13+ messages in thread
From: Baolin Wang @ 2024-11-12  3:19 UTC (permalink / raw)
  To: David Hildenbrand, akpm, hughd
  Cc: willy, wangkefeng.wang, 21cnbao, ryan.roberts, ioworker0,
	da.gomez, linux-mm, linux-kernel



On 2024/11/12 03:47, David Hildenbrand wrote:
> On 09.11.24 08:12, Baolin Wang wrote:
>>
>>
>> On 2024/11/8 23:30, David Hildenbrand wrote:
>>> On 08.11.24 05:12, Baolin Wang wrote:
>>>> Traditionally, tmpfs only supported PMD-sized huge folios. However
>>>> nowadays
>>>> with other file systems supporting any sized large folios, and 
>>>> extending
>>>> anonymous to support mTHP, we should not restrict tmpfs to allocating
>>>> only
>>>> PMD-sized huge folios, making it more special. Instead, we should allow
>>>> tmpfs can allocate any sized large folios.
>>>>
>>>> Considering that tmpfs already has the 'huge=' option to control the 
>>>> huge
>>>> folios allocation, we can extend the 'huge=' option to allow any sized
>>>> huge
>>>> folios. The semantics of the 'huge=' mount option are:
>>>>
>>>> huge=never: no any sized huge folios
>>>> huge=always: any sized huge folios
>>>> huge=within_size: like 'always' but respect the i_size
>>>> huge=advise: like 'always' if requested with fadvise()/madvise()
>>>>
>>>> Note: for tmpfs mmap() faults, due to the lack of a write size hint,
>>>> still
>>>> allocate the PMD-sized huge folios if huge=always/within_size/advise
>>>> is set.
>>>
>>> So, no fallback to smaller sizes for now in case we fail to allocate a
>>> PMD one? Of course, this can be added later fairly easily.
>>
>> Right. I have no strong preference on this. If no one objects, I can add
>> a fallback to smaller large folios if the PMD sized allocation fails in
>> the next version.
> 
> I'm fine with a staged approach, to perform this change separately.

Sure.

>>>> Moreover, the 'deny' and 'force' testing options controlled by
>>>> '/sys/kernel/mm/transparent_hugepage/shmem_enabled', still retain the
>>>> same
>>>> semantics. The 'deny' can disable any sized large folios for tmpfs, 
>>>> while
>>>> the 'force' can enable PMD sized large folios for tmpfs.
>>>>
>>>> Any comments and suggestions are appreciated. Thanks.
>>>>
>>>> Hi David,
>>>> I did not add a new Kconfig option to control the default behavior of
>>>> 'huge='
>>>> in the current version. I have not changed the default behavior at this
>>>> time, and let's see if there is a need for this.
>>>
>>> Likely we want to change the default at some point so people might get a
>>> benefit in more scenarios automatically. But I did not investigate how
>>> /tmp is mapped as default by Fedora, for example.
>>
>> Personally, adding a cmdline to change the default value might be more
>> useful than the Kconfig. Anyway, I still want to investigate if there is
>> a real need.
> 
> Likely both will be reasonable to have.
> 
> FWIW, "systemctl cat tmp.mount" on a Fedora40 system tells me
> "Options=mode=1777,strictatime,nosuid,nodev,size=50%%,nr_inodes=1m"
> 
> To be precise:
> 
> $ grep tmpfs /etc/mtab
> vendorfw /usr/lib/firmware/vendor tmpfs rw,relatime,mode=755,inode64 0 0
> devtmpfs /dev devtmpfs 
> rw,nosuid,size=4096k,nr_inodes=4063361,mode=755,inode64 0 0
> tmpfs /dev/shm tmpfs rw,nosuid,nodev,inode64 0 0
> tmpfs /run tmpfs 
> rw,nosuid,nodev,size=6511156k,nr_inodes=819200,mode=755,inode64 0 0
> tmpfs /tmp tmpfs 
> rw,nosuid,nodev,size=16277892k,nr_inodes=1048576,inode64 0 0
> tmpfs /run/user/100813 tmpfs 
> rw,nosuid,nodev,relatime,size=3255576k,nr_inodes=813894,mode=700,uid=100813,gid=100813,inode64 0 0
> 
> 
> Having a way to change the default will likely be extremely helpful.

Thanks. I'd like to add a command line option like 
'transparent_hugepage_shmem' to control the default value.



end of thread, other threads:[~2024-11-12  3:20 UTC | newest]

Thread overview: 13+ messages
2024-11-08  4:12 [PATCH 0/4] Support large folios for tmpfs Baolin Wang
2024-11-08  4:12 ` [PATCH 1/4] mm: factor out the order calculation into a new helper Baolin Wang
2024-11-08  4:29   ` Barry Song
2024-11-11 19:51   ` David Hildenbrand
2024-11-08  4:12 ` [PATCH 2/4] mm: shmem: change shmem_huge_global_enabled() to return huge order bitmap Baolin Wang
2024-11-08 15:11   ` kernel test robot
2024-11-08  4:12 ` [PATCH 3/4] mm: shmem: add large folio support for tmpfs Baolin Wang
2024-11-08 11:25   ` kernel test robot
2024-11-08  4:12 ` [PATCH 4/4] docs: tmpfs: update the huge folios policy for tmpfs and shmem Baolin Wang
2024-11-08 15:30 ` [PATCH 0/4] Support large folios for tmpfs David Hildenbrand
2024-11-09  7:12   ` Baolin Wang
2024-11-11 19:47     ` David Hildenbrand
2024-11-12  3:19       ` Baolin Wang
