linux-mm.kvack.org archive mirror
 help / color / mirror / Atom feed
* [PATCH] mm/page_alloc: make the maximum number of highatomic pageblocks resizable
@ 2025-02-26  2:41 zhongjinji
  2025-02-26 23:17 ` Andrew Morton
                   ` (4 more replies)
  0 siblings, 5 replies; 6+ messages in thread
From: zhongjinji @ 2025-02-26  2:41 UTC (permalink / raw)
  To: akpm
  Cc: yuzhao, linux-kernel, linux-mm, rientjes, vbabka, zhongjinji,
	yipengxiang, liulu.liu, feng.han

From: zhongjinji <zhongjinji@honor.com>

In the past, nr_reserved_highatomic was considered to be part of
unusable_free when there was no ALLOC_RESERVES flag. To prevent
unusable_free from being too large, it is reasonable to set a
fixed maximum highatomic value.

Even if the maximum number of highatomic pageblocks is set to be larger,
unusable_free may not increase, since Yu Zhao provided the modification
about nr_free_highatomic in
https://lore.kernel.org/all/20241028182653.3420139-1-yuzhao@google.com/T/#u

More highatomic pageblocks are beneficial for the successful allocation
of high-order page, which is helpful in some devices. Therefore, use
highatomic_reserve_ratio to adjust the maximum number of highatomic
pageblocks.

Signed-off-by: zhongjinji <zhongjinji@honor.com>
---
 mm/page_alloc.c | 11 ++++++++++-
 1 file changed, 10 insertions(+), 1 deletion(-)

diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index 579789600a3c..dbdce6a0f694 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -273,6 +273,7 @@ int min_free_kbytes = 1024;
 int user_min_free_kbytes = -1;
 static int watermark_boost_factor __read_mostly = 15000;
 static int watermark_scale_factor = 10;
+static int highatomic_reserve_ratio = 10;
 
 /* movable_zone is the "real" zone pages in ZONE_MOVABLE are taken from */
 int movable_zone;
@@ -2046,7 +2047,8 @@ static void reserve_highatomic_pageblock(struct page *page, int order,
 	 */
 	if ((zone_managed_pages(zone) / 100) < pageblock_nr_pages)
 		return;
-	max_managed = ALIGN((zone_managed_pages(zone) / 100), pageblock_nr_pages);
+	max_managed = ALIGN((zone_managed_pages(zone) * highatomic_reserve_ratio / 1000),
+		      pageblock_nr_pages);
 	if (zone->nr_reserved_highatomic >= max_managed)
 		return;
 
@@ -6199,6 +6201,13 @@ static const struct ctl_table page_alloc_sysctl_table[] = {
 		.mode		= 0644,
 		.proc_handler	= percpu_pagelist_high_fraction_sysctl_handler,
 		.extra1		= SYSCTL_ZERO,
+	},
+		.procname	= "highatomic_reserve_ratio",
+		.data		= &highatomic_reserve_ratio,
+		.maxlen		= sizeof(highatomic_reserve_ratio),
+		.mode		= 0644,
+		.proc_handler	= proc_dointvec_minmax,
+		.extra1		= SYSCTL_ZERO,
 	},
 	{
 		.procname	= "lowmem_reserve_ratio",
-- 
2.17.1



^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [PATCH] mm/page_alloc: make the maximum number of highatomic pageblocks resizable
  2025-02-26  2:41 [PATCH] mm/page_alloc: make the maximum number of highatomic pageblocks resizable zhongjinji
@ 2025-02-26 23:17 ` Andrew Morton
  2025-02-27  4:53 ` kernel test robot
                   ` (3 subsequent siblings)
  4 siblings, 0 replies; 6+ messages in thread
From: Andrew Morton @ 2025-02-26 23:17 UTC (permalink / raw)
  To: zhongjinji
  Cc: yuzhao, linux-kernel, linux-mm, rientjes, vbabka, yipengxiang,
	liulu.liu, feng.han

On Wed, 26 Feb 2025 10:41:25 +0800 <zhongjinji@honor.com> wrote:

> From: zhongjinji <zhongjinji@honor.com>
> 
> In the past, nr_reserved_highatomic was considered to be part of
> unusable_free when there was no ALLOC_RESERVES flag. To prevent
> unusable_free from being too large, it is reasonable to set a
> fixed maximum highatomic value.

Can you explain further?  Why is it "reasonable"?  Please fully describe
the userspace-visible problem which this patch fixes.

> Even if the maximum number of highatomic pageblocks is set to be larger,
> unusable_free may not increase, since Yu Zhao provided the modification
> about nr_free_highatomic in
> https://lore.kernel.org/all/20241028182653.3420139-1-yuzhao@google.com/T/#u
> 
> More highatomic pageblocks are beneficial for the successful allocation
> of high-order page, which is helpful in some devices. Therefore, use
> highatomic_reserve_ratio to adjust the maximum number of highatomic
> pageblocks.

Can you provide testcases and measurements which help us to understand
and to quantify the benefits of this change?

> @@ -6199,6 +6201,13 @@ static const struct ctl_table page_alloc_sysctl_table[] = {
>  		.mode		= 0644,
>  		.proc_handler	= percpu_pagelist_high_fraction_sysctl_handler,
>  		.extra1		= SYSCTL_ZERO,
> +	},
> +		.procname	= "highatomic_reserve_ratio",
> +		.data		= &highatomic_reserve_ratio,
> +		.maxlen		= sizeof(highatomic_reserve_ratio),
> +		.mode		= 0644,
> +		.proc_handler	= proc_dointvec_minmax,
> +		.extra1		= SYSCTL_ZERO,
>  	},

Is there any way at all in which we can address the issue you're seeing
without adding yet another tunable?

Also, a new highatomic_reserve_ratio should be documented in
Documentation/admin-guide/sysctl/vm.rst, please.



^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [PATCH] mm/page_alloc: make the maximum number of highatomic pageblocks resizable
  2025-02-26  2:41 [PATCH] mm/page_alloc: make the maximum number of highatomic pageblocks resizable zhongjinji
  2025-02-26 23:17 ` Andrew Morton
@ 2025-02-27  4:53 ` kernel test robot
  2025-02-27  5:25 ` kernel test robot
                   ` (2 subsequent siblings)
  4 siblings, 0 replies; 6+ messages in thread
From: kernel test robot @ 2025-02-27  4:53 UTC (permalink / raw)
  To: zhongjinji, akpm
  Cc: oe-kbuild-all, yuzhao, linux-kernel, linux-mm, rientjes, vbabka,
	zhongjinji, yipengxiang, liulu.liu, feng.han

Hi,

kernel test robot noticed the following build warnings:

[auto build test WARNING on akpm-mm/mm-everything]

url:    https://github.com/intel-lab-lkp/linux/commits/zhongjinji-honor-com/mm-page_alloc-make-the-maximum-number-of-highatomic-pageblocks-resizable/20250226-121712
base:   https://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm.git mm-everything
patch link:    https://lore.kernel.org/r/20250226024126.3718-1-zhongjinji%40honor.com
patch subject: [PATCH] mm/page_alloc: make the maximum number of highatomic pageblocks resizable
config: x86_64-buildonly-randconfig-006-20250227 (https://download.01.org/0day-ci/archive/20250227/202502271248.bHUExwvw-lkp@intel.com/config)
compiler: gcc-12 (Debian 12.2.0-14) 12.2.0
reproduce (this is a W=1 build): (https://download.01.org/0day-ci/archive/20250227/202502271248.bHUExwvw-lkp@intel.com/reproduce)

If you fix the issue in a separate patch/commit (i.e. not just a new version of
the same patch/commit), kindly add following tags
| Reported-by: kernel test robot <lkp@intel.com>
| Closes: https://lore.kernel.org/oe-kbuild-all/202502271248.bHUExwvw-lkp@intel.com/

All warnings (new ones prefixed by >>):

   mm/page_alloc.c:6262:17: error: field name not in record or union initializer
    6262 |                 .procname       = "highatomic_reserve_ratio",
         |                 ^
   mm/page_alloc.c:6262:17: note: (near initialization for 'page_alloc_sysctl_table')
   mm/page_alloc.c:6263:17: error: field name not in record or union initializer
    6263 |                 .data           = &highatomic_reserve_ratio,
         |                 ^
   mm/page_alloc.c:6263:17: note: (near initialization for 'page_alloc_sysctl_table')
   mm/page_alloc.c:6263:35: error: initialization of 'const char *' from incompatible pointer type 'int *' [-Werror=incompatible-pointer-types]
    6263 |                 .data           = &highatomic_reserve_ratio,
         |                                   ^
   mm/page_alloc.c:6263:35: note: (near initialization for 'page_alloc_sysctl_table[5].procname')
   mm/page_alloc.c:6264:17: error: field name not in record or union initializer
    6264 |                 .maxlen         = sizeof(highatomic_reserve_ratio),
         |                 ^
   mm/page_alloc.c:6264:17: note: (near initialization for 'page_alloc_sysctl_table')
>> mm/page_alloc.c:6264:35: warning: initialization of 'const char *' from 'long unsigned int' makes pointer from integer without a cast [-Wint-conversion]
    6264 |                 .maxlen         = sizeof(highatomic_reserve_ratio),
         |                                   ^~~~~~
   mm/page_alloc.c:6264:35: note: (near initialization for 'page_alloc_sysctl_table[6].procname')
   mm/page_alloc.c:6265:17: error: field name not in record or union initializer
    6265 |                 .mode           = 0644,
         |                 ^
   mm/page_alloc.c:6265:17: note: (near initialization for 'page_alloc_sysctl_table')
>> mm/page_alloc.c:6265:35: warning: initialization of 'const char *' from 'int' makes pointer from integer without a cast [-Wint-conversion]
    6265 |                 .mode           = 0644,
         |                                   ^~~~
   mm/page_alloc.c:6265:35: note: (near initialization for 'page_alloc_sysctl_table[7].procname')
   mm/page_alloc.c:6266:17: error: field name not in record or union initializer
    6266 |                 .proc_handler   = proc_dointvec_minmax,
         |                 ^
   mm/page_alloc.c:6266:17: note: (near initialization for 'page_alloc_sysctl_table')
   mm/page_alloc.c:6266:35: error: initialization of 'const char *' from incompatible pointer type 'int (*)(const struct ctl_table *, int,  void *, size_t *, loff_t *)' {aka 'int (*)(const struct ctl_table *, int,  void *, long unsigned int *, long long int *)'} [-Werror=incompatible-pointer-types]
    6266 |                 .proc_handler   = proc_dointvec_minmax,
         |                                   ^~~~~~~~~~~~~~~~~~~~
   mm/page_alloc.c:6266:35: note: (near initialization for 'page_alloc_sysctl_table[8].procname')
   mm/page_alloc.c:6267:17: error: field name not in record or union initializer
    6267 |                 .extra1         = SYSCTL_ZERO,
         |                 ^
   mm/page_alloc.c:6267:17: note: (near initialization for 'page_alloc_sysctl_table')
   mm/page_alloc.c:6228:59: warning: missing braces around initializer [-Wmissing-braces]
    6228 | static const struct ctl_table page_alloc_sysctl_table[] = {
         |                                                           ^
   mm/page_alloc.c:6269:9: error: expected identifier or '(' before '{' token
    6269 |         {
         |         ^
   mm/page_alloc.c:6275:10: error: expected identifier or '(' before ',' token
    6275 |         },
         |          ^
   mm/page_alloc.c:6174:12: warning: 'lowmem_reserve_ratio_sysctl_handler' defined but not used [-Wunused-function]
    6174 | static int lowmem_reserve_ratio_sysctl_handler(const struct ctl_table *table,
         |            ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
   In file included from include/linux/suspend.h:5,
                    from mm/page_alloc.c:28:
   include/linux/swap.h:590:12: warning: 'folio_alloc_swap' defined but not used [-Wunused-function]
     590 | static int folio_alloc_swap(struct folio *folio, gfp_t gfp_mask)
         |            ^~~~~~~~~~~~~~~~
   cc1: some warnings being treated as errors


vim +6264 mm/page_alloc.c

  6227	
  6228	static const struct ctl_table page_alloc_sysctl_table[] = {
  6229		{
  6230			.procname	= "min_free_kbytes",
  6231			.data		= &min_free_kbytes,
  6232			.maxlen		= sizeof(min_free_kbytes),
  6233			.mode		= 0644,
  6234			.proc_handler	= min_free_kbytes_sysctl_handler,
  6235			.extra1		= SYSCTL_ZERO,
  6236		},
  6237		{
  6238			.procname	= "watermark_boost_factor",
  6239			.data		= &watermark_boost_factor,
  6240			.maxlen		= sizeof(watermark_boost_factor),
  6241			.mode		= 0644,
  6242			.proc_handler	= proc_dointvec_minmax,
  6243			.extra1		= SYSCTL_ZERO,
  6244		},
  6245		{
  6246			.procname	= "watermark_scale_factor",
  6247			.data		= &watermark_scale_factor,
  6248			.maxlen		= sizeof(watermark_scale_factor),
  6249			.mode		= 0644,
  6250			.proc_handler	= watermark_scale_factor_sysctl_handler,
  6251			.extra1		= SYSCTL_ONE,
  6252			.extra2		= SYSCTL_THREE_THOUSAND,
  6253		},
  6254		{
  6255			.procname	= "percpu_pagelist_high_fraction",
  6256			.data		= &percpu_pagelist_high_fraction,
  6257			.maxlen		= sizeof(percpu_pagelist_high_fraction),
  6258			.mode		= 0644,
  6259			.proc_handler	= percpu_pagelist_high_fraction_sysctl_handler,
  6260			.extra1		= SYSCTL_ZERO,
  6261		},
  6262			.procname	= "highatomic_reserve_ratio",
> 6263			.data		= &highatomic_reserve_ratio,
> 6264			.maxlen		= sizeof(highatomic_reserve_ratio),
> 6265			.mode		= 0644,
  6266			.proc_handler	= proc_dointvec_minmax,
  6267			.extra1		= SYSCTL_ZERO,
  6268		},
  6269		{
  6270			.procname	= "lowmem_reserve_ratio",
  6271			.data		= &sysctl_lowmem_reserve_ratio,
  6272			.maxlen		= sizeof(sysctl_lowmem_reserve_ratio),
  6273			.mode		= 0644,
  6274			.proc_handler	= lowmem_reserve_ratio_sysctl_handler,
  6275		},
  6276	#ifdef CONFIG_NUMA
  6277		{
  6278			.procname	= "numa_zonelist_order",
  6279			.data		= &numa_zonelist_order,
  6280			.maxlen		= NUMA_ZONELIST_ORDER_LEN,
  6281			.mode		= 0644,
  6282			.proc_handler	= numa_zonelist_order_handler,
  6283		},
  6284		{
  6285			.procname	= "min_unmapped_ratio",
  6286			.data		= &sysctl_min_unmapped_ratio,
  6287			.maxlen		= sizeof(sysctl_min_unmapped_ratio),
  6288			.mode		= 0644,
  6289			.proc_handler	= sysctl_min_unmapped_ratio_sysctl_handler,
  6290			.extra1		= SYSCTL_ZERO,
  6291			.extra2		= SYSCTL_ONE_HUNDRED,
  6292		},
  6293		{
  6294			.procname	= "min_slab_ratio",
  6295			.data		= &sysctl_min_slab_ratio,
  6296			.maxlen		= sizeof(sysctl_min_slab_ratio),
  6297			.mode		= 0644,
  6298			.proc_handler	= sysctl_min_slab_ratio_sysctl_handler,
  6299			.extra1		= SYSCTL_ZERO,
  6300			.extra2		= SYSCTL_ONE_HUNDRED,
  6301		},
  6302	#endif
  6303	};
  6304	

-- 
0-DAY CI Kernel Test Service
https://github.com/intel/lkp-tests/wiki


^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [PATCH] mm/page_alloc: make the maximum number of highatomic pageblocks resizable
  2025-02-26  2:41 [PATCH] mm/page_alloc: make the maximum number of highatomic pageblocks resizable zhongjinji
  2025-02-26 23:17 ` Andrew Morton
  2025-02-27  4:53 ` kernel test robot
@ 2025-02-27  5:25 ` kernel test robot
  2025-02-27  5:35 ` kernel test robot
  2025-02-28 11:01 ` Vlastimil Babka
  4 siblings, 0 replies; 6+ messages in thread
From: kernel test robot @ 2025-02-27  5:25 UTC (permalink / raw)
  To: zhongjinji, akpm
  Cc: llvm, oe-kbuild-all, yuzhao, linux-kernel, linux-mm, rientjes,
	vbabka, zhongjinji, yipengxiang, liulu.liu, feng.han

Hi,

kernel test robot noticed the following build errors:

[auto build test ERROR on akpm-mm/mm-everything]

url:    https://github.com/intel-lab-lkp/linux/commits/zhongjinji-honor-com/mm-page_alloc-make-the-maximum-number-of-highatomic-pageblocks-resizable/20250226-121712
base:   https://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm.git mm-everything
patch link:    https://lore.kernel.org/r/20250226024126.3718-1-zhongjinji%40honor.com
patch subject: [PATCH] mm/page_alloc: make the maximum number of highatomic pageblocks resizable
config: x86_64-buildonly-randconfig-001-20250227 (https://download.01.org/0day-ci/archive/20250227/202502271250.KUfFnxnl-lkp@intel.com/config)
compiler: clang version 19.1.7 (https://github.com/llvm/llvm-project cd708029e0b2869e80abe31ddb175f7c35361f90)
reproduce (this is a W=1 build): (https://download.01.org/0day-ci/archive/20250227/202502271250.KUfFnxnl-lkp@intel.com/reproduce)

If you fix the issue in a separate patch/commit (i.e. not just a new version of
the same patch/commit), kindly add following tags
| Reported-by: kernel test robot <lkp@intel.com>
| Closes: https://lore.kernel.org/oe-kbuild-all/202502271250.KUfFnxnl-lkp@intel.com/

All errors (new ones prefixed by >>):

   In file included from mm/page_alloc.c:19:
   In file included from include/linux/mm.h:2302:
   include/linux/vmstat.h:518:36: warning: arithmetic between different enumeration types ('enum node_stat_item' and 'enum lru_list') [-Wenum-enum-conversion]
     518 |         return node_stat_name(NR_LRU_BASE + lru) + 3; // skip "nr_"
         |                               ~~~~~~~~~~~ ^ ~~~
   In file included from mm/page_alloc.c:44:
   include/linux/mm_inline.h:47:41: warning: arithmetic between different enumeration types ('enum node_stat_item' and 'enum lru_list') [-Wenum-enum-conversion]
      47 |         __mod_lruvec_state(lruvec, NR_LRU_BASE + lru, nr_pages);
         |                                    ~~~~~~~~~~~ ^ ~~~
   include/linux/mm_inline.h:49:22: warning: arithmetic between different enumeration types ('enum zone_stat_item' and 'enum lru_list') [-Wenum-enum-conversion]
      49 |                                 NR_ZONE_LRU_BASE + lru, nr_pages);
         |                                 ~~~~~~~~~~~~~~~~ ^ ~~~
   mm/page_alloc.c:2854:2: warning: arithmetic between different enumeration types ('enum vm_event_item' and 'enum zone_type') [-Wenum-enum-conversion]
    2854 |         __count_zid_vm_events(PGALLOC, page_zonenum(page), 1 << order);
         |         ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
   include/linux/vmstat.h:139:34: note: expanded from macro '__count_zid_vm_events'
     139 |         __count_vm_events(item##_NORMAL - ZONE_NORMAL + zid, delta)
         |                           ~~~~~~~~~~~~~ ^ ~~~~~~~~~~~
   mm/page_alloc.c:2971:3: warning: arithmetic between different enumeration types ('enum vm_event_item' and 'enum zone_type') [-Wenum-enum-conversion]
    2971 |                 __count_zid_vm_events(PGALLOC, page_zonenum(page), 1 << order);
         |                 ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
   include/linux/vmstat.h:139:34: note: expanded from macro '__count_zid_vm_events'
     139 |         __count_vm_events(item##_NORMAL - ZONE_NORMAL + zid, delta)
         |                           ~~~~~~~~~~~~~ ^ ~~~~~~~~~~~
   mm/page_alloc.c:4743:2: warning: arithmetic between different enumeration types ('enum vm_event_item' and 'enum zone_type') [-Wenum-enum-conversion]
    4743 |         __count_zid_vm_events(PGALLOC, zone_idx(zone), nr_account);
         |         ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
   include/linux/vmstat.h:139:34: note: expanded from macro '__count_zid_vm_events'
     139 |         __count_vm_events(item##_NORMAL - ZONE_NORMAL + zid, delta)
         |                           ~~~~~~~~~~~~~ ^ ~~~~~~~~~~~
>> mm/page_alloc.c:6262:3: error: field designator cannot initialize a non-struct, non-union type 'const struct ctl_table[]'
    6262 |                 .procname       = "highatomic_reserve_ratio",
         |                 ^
   mm/page_alloc.c:6263:3: error: field designator cannot initialize a non-struct, non-union type 'const struct ctl_table[]'
    6263 |                 .data           = &highatomic_reserve_ratio,
         |                 ^
   mm/page_alloc.c:6264:3: error: field designator cannot initialize a non-struct, non-union type 'const struct ctl_table[]'
    6264 |                 .maxlen         = sizeof(highatomic_reserve_ratio),
         |                 ^
   mm/page_alloc.c:6265:3: error: field designator cannot initialize a non-struct, non-union type 'const struct ctl_table[]'
    6265 |                 .mode           = 0644,
         |                 ^
   mm/page_alloc.c:6266:3: error: field designator cannot initialize a non-struct, non-union type 'const struct ctl_table[]'
    6266 |                 .proc_handler   = proc_dointvec_minmax,
         |                 ^
   mm/page_alloc.c:6267:3: error: field designator cannot initialize a non-struct, non-union type 'const struct ctl_table[]'
    6267 |                 .extra1         = SYSCTL_ZERO,
         |                 ^
>> mm/page_alloc.c:6268:3: error: expected ';' at end of declaration
    6268 |         },
         |          ^
         |          ;
>> mm/page_alloc.c:6269:2: error: expected identifier or '('
    6269 |         {
         |         ^
>> mm/page_alloc.c:6303:1: error: extraneous closing brace ('}')
    6303 | };
         | ^
   6 warnings and 9 errors generated.


vim +6262 mm/page_alloc.c

  6227	
  6228	static const struct ctl_table page_alloc_sysctl_table[] = {
  6229		{
  6230			.procname	= "min_free_kbytes",
  6231			.data		= &min_free_kbytes,
  6232			.maxlen		= sizeof(min_free_kbytes),
  6233			.mode		= 0644,
  6234			.proc_handler	= min_free_kbytes_sysctl_handler,
  6235			.extra1		= SYSCTL_ZERO,
  6236		},
  6237		{
  6238			.procname	= "watermark_boost_factor",
  6239			.data		= &watermark_boost_factor,
  6240			.maxlen		= sizeof(watermark_boost_factor),
  6241			.mode		= 0644,
  6242			.proc_handler	= proc_dointvec_minmax,
  6243			.extra1		= SYSCTL_ZERO,
  6244		},
  6245		{
  6246			.procname	= "watermark_scale_factor",
  6247			.data		= &watermark_scale_factor,
  6248			.maxlen		= sizeof(watermark_scale_factor),
  6249			.mode		= 0644,
  6250			.proc_handler	= watermark_scale_factor_sysctl_handler,
  6251			.extra1		= SYSCTL_ONE,
  6252			.extra2		= SYSCTL_THREE_THOUSAND,
  6253		},
  6254		{
  6255			.procname	= "percpu_pagelist_high_fraction",
  6256			.data		= &percpu_pagelist_high_fraction,
  6257			.maxlen		= sizeof(percpu_pagelist_high_fraction),
  6258			.mode		= 0644,
  6259			.proc_handler	= percpu_pagelist_high_fraction_sysctl_handler,
  6260			.extra1		= SYSCTL_ZERO,
  6261		},
> 6262			.procname	= "highatomic_reserve_ratio",
  6263			.data		= &highatomic_reserve_ratio,
  6264			.maxlen		= sizeof(highatomic_reserve_ratio),
  6265			.mode		= 0644,
  6266			.proc_handler	= proc_dointvec_minmax,
  6267			.extra1		= SYSCTL_ZERO,
> 6268		},
> 6269		{
  6270			.procname	= "lowmem_reserve_ratio",
  6271			.data		= &sysctl_lowmem_reserve_ratio,
  6272			.maxlen		= sizeof(sysctl_lowmem_reserve_ratio),
  6273			.mode		= 0644,
  6274			.proc_handler	= lowmem_reserve_ratio_sysctl_handler,
  6275		},
  6276	#ifdef CONFIG_NUMA
  6277		{
  6278			.procname	= "numa_zonelist_order",
  6279			.data		= &numa_zonelist_order,
  6280			.maxlen		= NUMA_ZONELIST_ORDER_LEN,
  6281			.mode		= 0644,
  6282			.proc_handler	= numa_zonelist_order_handler,
  6283		},
  6284		{
  6285			.procname	= "min_unmapped_ratio",
  6286			.data		= &sysctl_min_unmapped_ratio,
  6287			.maxlen		= sizeof(sysctl_min_unmapped_ratio),
  6288			.mode		= 0644,
  6289			.proc_handler	= sysctl_min_unmapped_ratio_sysctl_handler,
  6290			.extra1		= SYSCTL_ZERO,
  6291			.extra2		= SYSCTL_ONE_HUNDRED,
  6292		},
  6293		{
  6294			.procname	= "min_slab_ratio",
  6295			.data		= &sysctl_min_slab_ratio,
  6296			.maxlen		= sizeof(sysctl_min_slab_ratio),
  6297			.mode		= 0644,
  6298			.proc_handler	= sysctl_min_slab_ratio_sysctl_handler,
  6299			.extra1		= SYSCTL_ZERO,
  6300			.extra2		= SYSCTL_ONE_HUNDRED,
  6301		},
  6302	#endif
> 6303	};
  6304	

-- 
0-DAY CI Kernel Test Service
https://github.com/intel/lkp-tests/wiki


^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [PATCH] mm/page_alloc: make the maximum number of highatomic pageblocks resizable
  2025-02-26  2:41 [PATCH] mm/page_alloc: make the maximum number of highatomic pageblocks resizable zhongjinji
                   ` (2 preceding siblings ...)
  2025-02-27  5:25 ` kernel test robot
@ 2025-02-27  5:35 ` kernel test robot
  2025-02-28 11:01 ` Vlastimil Babka
  4 siblings, 0 replies; 6+ messages in thread
From: kernel test robot @ 2025-02-27  5:35 UTC (permalink / raw)
  To: zhongjinji, akpm
  Cc: oe-kbuild-all, yuzhao, linux-kernel, linux-mm, rientjes, vbabka,
	zhongjinji, yipengxiang, liulu.liu, feng.han

Hi,

kernel test robot noticed the following build errors:

[auto build test ERROR on akpm-mm/mm-everything]

url:    https://github.com/intel-lab-lkp/linux/commits/zhongjinji-honor-com/mm-page_alloc-make-the-maximum-number-of-highatomic-pageblocks-resizable/20250226-121712
base:   https://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm.git mm-everything
patch link:    https://lore.kernel.org/r/20250226024126.3718-1-zhongjinji%40honor.com
patch subject: [PATCH] mm/page_alloc: make the maximum number of highatomic pageblocks resizable
config: x86_64-buildonly-randconfig-003-20250227 (https://download.01.org/0day-ci/archive/20250227/202502271338.7VyyFcoF-lkp@intel.com/config)
compiler: gcc-12 (Debian 12.2.0-14) 12.2.0
reproduce (this is a W=1 build): (https://download.01.org/0day-ci/archive/20250227/202502271338.7VyyFcoF-lkp@intel.com/reproduce)

If you fix the issue in a separate patch/commit (i.e. not just a new version of
the same patch/commit), kindly add following tags
| Reported-by: kernel test robot <lkp@intel.com>
| Closes: https://lore.kernel.org/oe-kbuild-all/202502271338.7VyyFcoF-lkp@intel.com/

All error/warnings (new ones prefixed by >>):

>> mm/page_alloc.c:6262:17: error: field name not in record or union initializer
    6262 |                 .procname       = "highatomic_reserve_ratio",
         |                 ^
   mm/page_alloc.c:6262:17: note: (near initialization for 'page_alloc_sysctl_table')
>> mm/page_alloc.c:6262:35: error: invalid initializer
    6262 |                 .procname       = "highatomic_reserve_ratio",
         |                                   ^~~~~~~~~~~~~~~~~~~~~~~~~~
   mm/page_alloc.c:6262:35: note: (near initialization for 'page_alloc_sysctl_table[4].<anonymous>')
   mm/page_alloc.c:6263:17: error: field name not in record or union initializer
    6263 |                 .data           = &highatomic_reserve_ratio,
         |                 ^
   mm/page_alloc.c:6263:17: note: (near initialization for 'page_alloc_sysctl_table')
   mm/page_alloc.c:6263:35: error: invalid initializer
    6263 |                 .data           = &highatomic_reserve_ratio,
         |                                   ^
   mm/page_alloc.c:6263:35: note: (near initialization for 'page_alloc_sysctl_table[5].<anonymous>')
   mm/page_alloc.c:6264:17: error: field name not in record or union initializer
    6264 |                 .maxlen         = sizeof(highatomic_reserve_ratio),
         |                 ^
   mm/page_alloc.c:6264:17: note: (near initialization for 'page_alloc_sysctl_table')
   mm/page_alloc.c:6264:35: error: invalid initializer
    6264 |                 .maxlen         = sizeof(highatomic_reserve_ratio),
         |                                   ^~~~~~
   mm/page_alloc.c:6264:35: note: (near initialization for 'page_alloc_sysctl_table[6].<anonymous>')
   mm/page_alloc.c:6265:17: error: field name not in record or union initializer
    6265 |                 .mode           = 0644,
         |                 ^
   mm/page_alloc.c:6265:17: note: (near initialization for 'page_alloc_sysctl_table')
   mm/page_alloc.c:6265:35: error: invalid initializer
    6265 |                 .mode           = 0644,
         |                                   ^~~~
   mm/page_alloc.c:6265:35: note: (near initialization for 'page_alloc_sysctl_table[7].<anonymous>')
   mm/page_alloc.c:6266:17: error: field name not in record or union initializer
    6266 |                 .proc_handler   = proc_dointvec_minmax,
         |                 ^
   mm/page_alloc.c:6266:17: note: (near initialization for 'page_alloc_sysctl_table')
   mm/page_alloc.c:6266:35: error: invalid initializer
    6266 |                 .proc_handler   = proc_dointvec_minmax,
         |                                   ^~~~~~~~~~~~~~~~~~~~
   mm/page_alloc.c:6266:35: note: (near initialization for 'page_alloc_sysctl_table[8].<anonymous>')
   mm/page_alloc.c:6267:17: error: field name not in record or union initializer
    6267 |                 .extra1         = SYSCTL_ZERO,
         |                 ^
   mm/page_alloc.c:6267:17: note: (near initialization for 'page_alloc_sysctl_table')
   In file included from include/linux/key.h:17,
                    from include/linux/cred.h:13,
                    from include/linux/sched/signal.h:10,
                    from include/linux/rcuwait.h:6,
                    from include/linux/mm.h:35,
                    from mm/page_alloc.c:19:
>> include/linux/sysctl.h:41:41: error: invalid initializer
      41 | #define SYSCTL_ZERO                     ((void *)&sysctl_vals[0])
         |                                         ^
   mm/page_alloc.c:6267:35: note: in expansion of macro 'SYSCTL_ZERO'
    6267 |                 .extra1         = SYSCTL_ZERO,
         |                                   ^~~~~~~~~~~
   include/linux/sysctl.h:41:41: note: (near initialization for 'page_alloc_sysctl_table[9].<anonymous>')
      41 | #define SYSCTL_ZERO                     ((void *)&sysctl_vals[0])
         |                                         ^
   mm/page_alloc.c:6267:35: note: in expansion of macro 'SYSCTL_ZERO'
    6267 |                 .extra1         = SYSCTL_ZERO,
         |                                   ^~~~~~~~~~~
>> mm/page_alloc.c:6228:59: warning: missing braces around initializer [-Wmissing-braces]
    6228 | static const struct ctl_table page_alloc_sysctl_table[] = {
         |                                                           ^
>> mm/page_alloc.c:6269:9: error: expected identifier or '(' before '{' token
    6269 |         {
         |         ^
   mm/page_alloc.c:6275:10: error: expected identifier or '(' before ',' token
    6275 |         },
         |          ^
>> mm/page_alloc.c:6174:12: warning: 'lowmem_reserve_ratio_sysctl_handler' defined but not used [-Wunused-function]
    6174 | static int lowmem_reserve_ratio_sysctl_handler(const struct ctl_table *table,
         |            ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
   In file included from include/linux/suspend.h:5,
                    from mm/page_alloc.c:28:
   include/linux/swap.h:590:12: warning: 'folio_alloc_swap' defined but not used [-Wunused-function]
     590 | static int folio_alloc_swap(struct folio *folio, gfp_t gfp_mask)
         |            ^~~~~~~~~~~~~~~~


vim +6262 mm/page_alloc.c

  6164	
  6165	/*
  6166	 * lowmem_reserve_ratio_sysctl_handler - just a wrapper around
  6167	 *	proc_dointvec() so that we can call setup_per_zone_lowmem_reserve()
  6168	 *	whenever sysctl_lowmem_reserve_ratio changes.
  6169	 *
  6170	 * The reserve ratio obviously has absolutely no relation with the
  6171	 * minimum watermarks. The lowmem reserve ratio can only make sense
  6172	 * if in function of the boot time zone sizes.
  6173	 */
> 6174	static int lowmem_reserve_ratio_sysctl_handler(const struct ctl_table *table,
  6175			int write, void *buffer, size_t *length, loff_t *ppos)
  6176	{
  6177		int i;
  6178	
  6179		proc_dointvec_minmax(table, write, buffer, length, ppos);
  6180	
  6181		for (i = 0; i < MAX_NR_ZONES; i++) {
  6182			if (sysctl_lowmem_reserve_ratio[i] < 1)
  6183				sysctl_lowmem_reserve_ratio[i] = 0;
  6184		}
  6185	
  6186		setup_per_zone_lowmem_reserve();
  6187		return 0;
  6188	}
  6189	
  6190	/*
  6191	 * percpu_pagelist_high_fraction - changes the pcp->high for each zone on each
  6192	 * cpu. It is the fraction of total pages in each zone that a hot per cpu
  6193	 * pagelist can have before it gets flushed back to buddy allocator.
  6194	 */
  6195	static int percpu_pagelist_high_fraction_sysctl_handler(const struct ctl_table *table,
  6196			int write, void *buffer, size_t *length, loff_t *ppos)
  6197	{
  6198		struct zone *zone;
  6199		int old_percpu_pagelist_high_fraction;
  6200		int ret;
  6201	
  6202		mutex_lock(&pcp_batch_high_lock);
  6203		old_percpu_pagelist_high_fraction = percpu_pagelist_high_fraction;
  6204	
  6205		ret = proc_dointvec_minmax(table, write, buffer, length, ppos);
  6206		if (!write || ret < 0)
  6207			goto out;
  6208	
  6209		/* Sanity checking to avoid pcp imbalance */
  6210		if (percpu_pagelist_high_fraction &&
  6211		    percpu_pagelist_high_fraction < MIN_PERCPU_PAGELIST_HIGH_FRACTION) {
  6212			percpu_pagelist_high_fraction = old_percpu_pagelist_high_fraction;
  6213			ret = -EINVAL;
  6214			goto out;
  6215		}
  6216	
  6217		/* No change? */
  6218		if (percpu_pagelist_high_fraction == old_percpu_pagelist_high_fraction)
  6219			goto out;
  6220	
  6221		for_each_populated_zone(zone)
  6222			zone_set_pageset_high_and_batch(zone, 0);
  6223	out:
  6224		mutex_unlock(&pcp_batch_high_lock);
  6225		return ret;
  6226	}
  6227	
> 6228	static const struct ctl_table page_alloc_sysctl_table[] = {
  6229		{
  6230			.procname	= "min_free_kbytes",
  6231			.data		= &min_free_kbytes,
  6232			.maxlen		= sizeof(min_free_kbytes),
  6233			.mode		= 0644,
  6234			.proc_handler	= min_free_kbytes_sysctl_handler,
  6235			.extra1		= SYSCTL_ZERO,
  6236		},
  6237		{
  6238			.procname	= "watermark_boost_factor",
  6239			.data		= &watermark_boost_factor,
  6240			.maxlen		= sizeof(watermark_boost_factor),
  6241			.mode		= 0644,
  6242			.proc_handler	= proc_dointvec_minmax,
  6243			.extra1		= SYSCTL_ZERO,
  6244		},
  6245		{
  6246			.procname	= "watermark_scale_factor",
  6247			.data		= &watermark_scale_factor,
  6248			.maxlen		= sizeof(watermark_scale_factor),
  6249			.mode		= 0644,
  6250			.proc_handler	= watermark_scale_factor_sysctl_handler,
  6251			.extra1		= SYSCTL_ONE,
  6252			.extra2		= SYSCTL_THREE_THOUSAND,
  6253		},
  6254		{
  6255			.procname	= "percpu_pagelist_high_fraction",
  6256			.data		= &percpu_pagelist_high_fraction,
  6257			.maxlen		= sizeof(percpu_pagelist_high_fraction),
  6258			.mode		= 0644,
  6259			.proc_handler	= percpu_pagelist_high_fraction_sysctl_handler,
  6260			.extra1		= SYSCTL_ZERO,
  6261		},
> 6262			.procname	= "highatomic_reserve_ratio",
  6263			.data		= &highatomic_reserve_ratio,
  6264			.maxlen		= sizeof(highatomic_reserve_ratio),
  6265			.mode		= 0644,
  6266			.proc_handler	= proc_dointvec_minmax,
  6267			.extra1		= SYSCTL_ZERO,
  6268		},
> 6269		{
  6270			.procname	= "lowmem_reserve_ratio",
  6271			.data		= &sysctl_lowmem_reserve_ratio,
  6272			.maxlen		= sizeof(sysctl_lowmem_reserve_ratio),
  6273			.mode		= 0644,
  6274			.proc_handler	= lowmem_reserve_ratio_sysctl_handler,
  6275		},
  6276	#ifdef CONFIG_NUMA
  6277		{
  6278			.procname	= "numa_zonelist_order",
  6279			.data		= &numa_zonelist_order,
  6280			.maxlen		= NUMA_ZONELIST_ORDER_LEN,
  6281			.mode		= 0644,
  6282			.proc_handler	= numa_zonelist_order_handler,
  6283		},
  6284		{
  6285			.procname	= "min_unmapped_ratio",
  6286			.data		= &sysctl_min_unmapped_ratio,
  6287			.maxlen		= sizeof(sysctl_min_unmapped_ratio),
  6288			.mode		= 0644,
  6289			.proc_handler	= sysctl_min_unmapped_ratio_sysctl_handler,
  6290			.extra1		= SYSCTL_ZERO,
  6291			.extra2		= SYSCTL_ONE_HUNDRED,
  6292		},
  6293		{
  6294			.procname	= "min_slab_ratio",
  6295			.data		= &sysctl_min_slab_ratio,
  6296			.maxlen		= sizeof(sysctl_min_slab_ratio),
  6297			.mode		= 0644,
  6298			.proc_handler	= sysctl_min_slab_ratio_sysctl_handler,
  6299			.extra1		= SYSCTL_ZERO,
  6300			.extra2		= SYSCTL_ONE_HUNDRED,
  6301		},
  6302	#endif
  6303	};
  6304	

-- 
0-DAY CI Kernel Test Service
https://github.com/intel/lkp-tests/wiki


^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [PATCH] mm/page_alloc: make the maximum number of highatomic pageblocks resizable
  2025-02-26  2:41 [PATCH] mm/page_alloc: make the maximum number of highatomic pageblocks resizable zhongjinji
                   ` (3 preceding siblings ...)
  2025-02-27  5:35 ` kernel test robot
@ 2025-02-28 11:01 ` Vlastimil Babka
  4 siblings, 0 replies; 6+ messages in thread
From: Vlastimil Babka @ 2025-02-28 11:01 UTC (permalink / raw)
  To: zhongjinji, akpm, Mel Gorman, Johannes Weiner, Brendan Jackman
  Cc: yuzhao, linux-kernel, linux-mm, rientjes, yipengxiang, liulu.liu,
	feng.han

+Cc Mel, Johannes and Brendan, please keep on future submissions

On 2/26/25 03:41, zhongjinji@honor.com wrote:
> From: zhongjinji <zhongjinji@honor.com>
> 
> In the past, nr_reserved_highatomic was considered to be part of
> unusable_free when there was no ALLOC_RESERVES flag. To prevent
> unusable_free from being too large, it is reasonable to set a
> fixed maximum highatomic value.
> 
> Even if the maximum number of highatomic pageblocks is set to be larger,
> unusable_free may not increase, since Yu Zhao provided the modification
> about nr_free_highatomic in
> https://lore.kernel.org/all/20241028182653.3420139-1-yuzhao@google.com/T/#u

I think I raised in discussing that patch the fact that
unreserve_highatomic_pageblock() is flawed in that it will only find a
highatomic block to unreserve if it has some free pages as it searches via
freelist. So if the existing reserve becomes fully used up, it's dead
weight, unless the highatomic allocations are shortlived. It could make
sense to remove the HIGHATOMIC marking so other, more free blocks might be
reserved instead.

I fear if this approach isn't just working around that limitation.

> More highatomic pageblocks are beneficial for the successful allocation
> of high-order page, which is helpful in some devices. Therefore, use

Can you elaborate what is it helpful with? And what kind of values you
expect to be using? Do you find that your existing HIGHATOMIC blocks are
fully depleted as I speculate above? Are your highatomic allocations short
or long lived?

Maybe simply at this point nr_free_highatomic would be a better metric than
nr_reserved_highatomic to decide if we should reserve another pageblock? The
counter didn't exist when highatomic reserves were introduced.

But maybe also some mechanism (compaction maybe?) could also be looking for
highatomic blocks that are fully used and unreserve them?

> highatomic_reserve_ratio to adjust the maximum number of highatomic
> pageblocks.
> 
> Signed-off-by: zhongjinji <zhongjinji@honor.com>

Minimally this needs to be documented in e.g.
Documentation/admin-guide/sysctl/vm.rst, it's not obvious that the value is
divided by 1000 and not 100 for example.

> ---
>  mm/page_alloc.c | 11 ++++++++++-
>  1 file changed, 10 insertions(+), 1 deletion(-)
> 
> diff --git a/mm/page_alloc.c b/mm/page_alloc.c
> index 579789600a3c..dbdce6a0f694 100644
> --- a/mm/page_alloc.c
> +++ b/mm/page_alloc.c
> @@ -273,6 +273,7 @@ int min_free_kbytes = 1024;
>  int user_min_free_kbytes = -1;
>  static int watermark_boost_factor __read_mostly = 15000;
>  static int watermark_scale_factor = 10;
> +static int highatomic_reserve_ratio = 10;
>  
>  /* movable_zone is the "real" zone pages in ZONE_MOVABLE are taken from */
>  int movable_zone;
> @@ -2046,7 +2047,8 @@ static void reserve_highatomic_pageblock(struct page *page, int order,
>  	 */
>  	if ((zone_managed_pages(zone) / 100) < pageblock_nr_pages)
>  		return;
> -	max_managed = ALIGN((zone_managed_pages(zone) / 100), pageblock_nr_pages);
> +	max_managed = ALIGN((zone_managed_pages(zone) * highatomic_reserve_ratio / 1000),
> +		      pageblock_nr_pages);
>  	if (zone->nr_reserved_highatomic >= max_managed)
>  		return;
>  
> @@ -6199,6 +6201,13 @@ static const struct ctl_table page_alloc_sysctl_table[] = {
>  		.mode		= 0644,
>  		.proc_handler	= percpu_pagelist_high_fraction_sysctl_handler,
>  		.extra1		= SYSCTL_ZERO,
> +	},
> +		.procname	= "highatomic_reserve_ratio",
> +		.data		= &highatomic_reserve_ratio,
> +		.maxlen		= sizeof(highatomic_reserve_ratio),
> +		.mode		= 0644,
> +		.proc_handler	= proc_dointvec_minmax,
> +		.extra1		= SYSCTL_ZERO,
>  	},
>  	{
>  		.procname	= "lowmem_reserve_ratio",



^ permalink raw reply	[flat|nested] 6+ messages in thread

end of thread, other threads:[~2025-02-28 11:02 UTC | newest]

Thread overview: 6+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2025-02-26  2:41 [PATCH] mm/page_alloc: make the maximum number of highatomic pageblocks resizable zhongjinji
2025-02-26 23:17 ` Andrew Morton
2025-02-27  4:53 ` kernel test robot
2025-02-27  5:25 ` kernel test robot
2025-02-27  5:35 ` kernel test robot
2025-02-28 11:01 ` Vlastimil Babka

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox