* [PATCH] mm/page_alloc: make the maximum number of highatomic pageblocks resizable
@ 2025-02-26 2:41 zhongjinji
2025-02-26 23:17 ` Andrew Morton
` (4 more replies)
0 siblings, 5 replies; 6+ messages in thread
From: zhongjinji @ 2025-02-26 2:41 UTC (permalink / raw)
To: akpm
Cc: yuzhao, linux-kernel, linux-mm, rientjes, vbabka, zhongjinji,
yipengxiang, liulu.liu, feng.han
From: zhongjinji <zhongjinji@honor.com>
In the past, nr_reserved_highatomic was considered to be part of
unusable_free when there was no ALLOC_RESERVES flag. To prevent
unusable_free from being too large, it is reasonable to set a
fixed maximum highatomic value.
Even if the maximum number of highatomic pageblocks is set to be larger,
unusable_free may not increase, since Yu Zhao provided the modification
about nr_free_highatomic in
https://lore.kernel.org/all/20241028182653.3420139-1-yuzhao@google.com/T/#u
More highatomic pageblocks are beneficial for the successful allocation
of high-order page, which is helpful in some devices. Therefore, use
highatomic_reserve_ratio to adjust the maximum number of highatomic
pageblocks.
Signed-off-by: zhongjinji <zhongjinji@honor.com>
---
mm/page_alloc.c | 11 ++++++++++-
1 file changed, 10 insertions(+), 1 deletion(-)
diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index 579789600a3c..dbdce6a0f694 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -273,6 +273,7 @@ int min_free_kbytes = 1024;
int user_min_free_kbytes = -1;
static int watermark_boost_factor __read_mostly = 15000;
static int watermark_scale_factor = 10;
+static int highatomic_reserve_ratio = 10;
/* movable_zone is the "real" zone pages in ZONE_MOVABLE are taken from */
int movable_zone;
@@ -2046,7 +2047,8 @@ static void reserve_highatomic_pageblock(struct page *page, int order,
*/
if ((zone_managed_pages(zone) / 100) < pageblock_nr_pages)
return;
- max_managed = ALIGN((zone_managed_pages(zone) / 100), pageblock_nr_pages);
+ max_managed = ALIGN((zone_managed_pages(zone) * highatomic_reserve_ratio / 1000),
+ pageblock_nr_pages);
if (zone->nr_reserved_highatomic >= max_managed)
return;
@@ -6199,6 +6201,13 @@ static const struct ctl_table page_alloc_sysctl_table[] = {
.mode = 0644,
.proc_handler = percpu_pagelist_high_fraction_sysctl_handler,
.extra1 = SYSCTL_ZERO,
+ },
+ .procname = "highatomic_reserve_ratio",
+ .data = &highatomic_reserve_ratio,
+ .maxlen = sizeof(highatomic_reserve_ratio),
+ .mode = 0644,
+ .proc_handler = proc_dointvec_minmax,
+ .extra1 = SYSCTL_ZERO,
},
{
.procname = "lowmem_reserve_ratio",
--
2.17.1
^ permalink raw reply [flat|nested] 6+ messages in thread
* Re: [PATCH] mm/page_alloc: make the maximum number of highatomic pageblocks resizable
2025-02-26 2:41 [PATCH] mm/page_alloc: make the maximum number of highatomic pageblocks resizable zhongjinji
@ 2025-02-26 23:17 ` Andrew Morton
2025-02-27 4:53 ` kernel test robot
` (3 subsequent siblings)
4 siblings, 0 replies; 6+ messages in thread
From: Andrew Morton @ 2025-02-26 23:17 UTC (permalink / raw)
To: zhongjinji
Cc: yuzhao, linux-kernel, linux-mm, rientjes, vbabka, yipengxiang,
liulu.liu, feng.han
On Wed, 26 Feb 2025 10:41:25 +0800 <zhongjinji@honor.com> wrote:
> From: zhongjinji <zhongjinji@honor.com>
>
> In the past, nr_reserved_highatomic was considered to be part of
> unusable_free when there was no ALLOC_RESERVES flag. To prevent
> unusable_free from being too large, it is reasonable to set a
> fixed maximum highatomic value.
Can you explain further? Why is it "reasonable"? Please fully describe
the userspace-visible problem which this patch fixes.
> Even if the maximum number of highatomic pageblocks is set to be larger,
> unusable_free may not increase, since Yu Zhao provided the modification
> about nr_free_highatomic in
> https://lore.kernel.org/all/20241028182653.3420139-1-yuzhao@google.com/T/#u
>
> More highatomic pageblocks are beneficial for the successful allocation
> of high-order page, which is helpful in some devices. Therefore, use
> highatomic_reserve_ratio to adjust the maximum number of highatomic
> pageblocks.
Can you provide testcases and measurements which help us to understand
and to quantify the benefits of this change?
> @@ -6199,6 +6201,13 @@ static const struct ctl_table page_alloc_sysctl_table[] = {
> .mode = 0644,
> .proc_handler = percpu_pagelist_high_fraction_sysctl_handler,
> .extra1 = SYSCTL_ZERO,
> + },
> + .procname = "highatomic_reserve_ratio",
> + .data = &highatomic_reserve_ratio,
> + .maxlen = sizeof(highatomic_reserve_ratio),
> + .mode = 0644,
> + .proc_handler = proc_dointvec_minmax,
> + .extra1 = SYSCTL_ZERO,
> },
Is there any way at all in which we can address the issue you're seeing
without adding yet another tunable?
Also, a new highatomic_reserve_ratio should be documented in
Documentation/admin-guide/sysctl/vm.rst, please.
^ permalink raw reply [flat|nested] 6+ messages in thread
* Re: [PATCH] mm/page_alloc: make the maximum number of highatomic pageblocks resizable
2025-02-26 2:41 [PATCH] mm/page_alloc: make the maximum number of highatomic pageblocks resizable zhongjinji
2025-02-26 23:17 ` Andrew Morton
@ 2025-02-27 4:53 ` kernel test robot
2025-02-27 5:25 ` kernel test robot
` (2 subsequent siblings)
4 siblings, 0 replies; 6+ messages in thread
From: kernel test robot @ 2025-02-27 4:53 UTC (permalink / raw)
To: zhongjinji, akpm
Cc: oe-kbuild-all, yuzhao, linux-kernel, linux-mm, rientjes, vbabka,
zhongjinji, yipengxiang, liulu.liu, feng.han
Hi,
kernel test robot noticed the following build warnings:
[auto build test WARNING on akpm-mm/mm-everything]
url: https://github.com/intel-lab-lkp/linux/commits/zhongjinji-honor-com/mm-page_alloc-make-the-maximum-number-of-highatomic-pageblocks-resizable/20250226-121712
base: https://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm.git mm-everything
patch link: https://lore.kernel.org/r/20250226024126.3718-1-zhongjinji%40honor.com
patch subject: [PATCH] mm/page_alloc: make the maximum number of highatomic pageblocks resizable
config: x86_64-buildonly-randconfig-006-20250227 (https://download.01.org/0day-ci/archive/20250227/202502271248.bHUExwvw-lkp@intel.com/config)
compiler: gcc-12 (Debian 12.2.0-14) 12.2.0
reproduce (this is a W=1 build): (https://download.01.org/0day-ci/archive/20250227/202502271248.bHUExwvw-lkp@intel.com/reproduce)
If you fix the issue in a separate patch/commit (i.e. not just a new version of
the same patch/commit), kindly add following tags
| Reported-by: kernel test robot <lkp@intel.com>
| Closes: https://lore.kernel.org/oe-kbuild-all/202502271248.bHUExwvw-lkp@intel.com/
All warnings (new ones prefixed by >>):
mm/page_alloc.c:6262:17: error: field name not in record or union initializer
6262 | .procname = "highatomic_reserve_ratio",
| ^
mm/page_alloc.c:6262:17: note: (near initialization for 'page_alloc_sysctl_table')
mm/page_alloc.c:6263:17: error: field name not in record or union initializer
6263 | .data = &highatomic_reserve_ratio,
| ^
mm/page_alloc.c:6263:17: note: (near initialization for 'page_alloc_sysctl_table')
mm/page_alloc.c:6263:35: error: initialization of 'const char *' from incompatible pointer type 'int *' [-Werror=incompatible-pointer-types]
6263 | .data = &highatomic_reserve_ratio,
| ^
mm/page_alloc.c:6263:35: note: (near initialization for 'page_alloc_sysctl_table[5].procname')
mm/page_alloc.c:6264:17: error: field name not in record or union initializer
6264 | .maxlen = sizeof(highatomic_reserve_ratio),
| ^
mm/page_alloc.c:6264:17: note: (near initialization for 'page_alloc_sysctl_table')
>> mm/page_alloc.c:6264:35: warning: initialization of 'const char *' from 'long unsigned int' makes pointer from integer without a cast [-Wint-conversion]
6264 | .maxlen = sizeof(highatomic_reserve_ratio),
| ^~~~~~
mm/page_alloc.c:6264:35: note: (near initialization for 'page_alloc_sysctl_table[6].procname')
mm/page_alloc.c:6265:17: error: field name not in record or union initializer
6265 | .mode = 0644,
| ^
mm/page_alloc.c:6265:17: note: (near initialization for 'page_alloc_sysctl_table')
>> mm/page_alloc.c:6265:35: warning: initialization of 'const char *' from 'int' makes pointer from integer without a cast [-Wint-conversion]
6265 | .mode = 0644,
| ^~~~
mm/page_alloc.c:6265:35: note: (near initialization for 'page_alloc_sysctl_table[7].procname')
mm/page_alloc.c:6266:17: error: field name not in record or union initializer
6266 | .proc_handler = proc_dointvec_minmax,
| ^
mm/page_alloc.c:6266:17: note: (near initialization for 'page_alloc_sysctl_table')
mm/page_alloc.c:6266:35: error: initialization of 'const char *' from incompatible pointer type 'int (*)(const struct ctl_table *, int, void *, size_t *, loff_t *)' {aka 'int (*)(const struct ctl_table *, int, void *, long unsigned int *, long long int *)'} [-Werror=incompatible-pointer-types]
6266 | .proc_handler = proc_dointvec_minmax,
| ^~~~~~~~~~~~~~~~~~~~
mm/page_alloc.c:6266:35: note: (near initialization for 'page_alloc_sysctl_table[8].procname')
mm/page_alloc.c:6267:17: error: field name not in record or union initializer
6267 | .extra1 = SYSCTL_ZERO,
| ^
mm/page_alloc.c:6267:17: note: (near initialization for 'page_alloc_sysctl_table')
mm/page_alloc.c:6228:59: warning: missing braces around initializer [-Wmissing-braces]
6228 | static const struct ctl_table page_alloc_sysctl_table[] = {
| ^
mm/page_alloc.c:6269:9: error: expected identifier or '(' before '{' token
6269 | {
| ^
mm/page_alloc.c:6275:10: error: expected identifier or '(' before ',' token
6275 | },
| ^
mm/page_alloc.c:6174:12: warning: 'lowmem_reserve_ratio_sysctl_handler' defined but not used [-Wunused-function]
6174 | static int lowmem_reserve_ratio_sysctl_handler(const struct ctl_table *table,
| ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
In file included from include/linux/suspend.h:5,
from mm/page_alloc.c:28:
include/linux/swap.h:590:12: warning: 'folio_alloc_swap' defined but not used [-Wunused-function]
590 | static int folio_alloc_swap(struct folio *folio, gfp_t gfp_mask)
| ^~~~~~~~~~~~~~~~
cc1: some warnings being treated as errors
vim +6264 mm/page_alloc.c
6227
6228 static const struct ctl_table page_alloc_sysctl_table[] = {
6229 {
6230 .procname = "min_free_kbytes",
6231 .data = &min_free_kbytes,
6232 .maxlen = sizeof(min_free_kbytes),
6233 .mode = 0644,
6234 .proc_handler = min_free_kbytes_sysctl_handler,
6235 .extra1 = SYSCTL_ZERO,
6236 },
6237 {
6238 .procname = "watermark_boost_factor",
6239 .data = &watermark_boost_factor,
6240 .maxlen = sizeof(watermark_boost_factor),
6241 .mode = 0644,
6242 .proc_handler = proc_dointvec_minmax,
6243 .extra1 = SYSCTL_ZERO,
6244 },
6245 {
6246 .procname = "watermark_scale_factor",
6247 .data = &watermark_scale_factor,
6248 .maxlen = sizeof(watermark_scale_factor),
6249 .mode = 0644,
6250 .proc_handler = watermark_scale_factor_sysctl_handler,
6251 .extra1 = SYSCTL_ONE,
6252 .extra2 = SYSCTL_THREE_THOUSAND,
6253 },
6254 {
6255 .procname = "percpu_pagelist_high_fraction",
6256 .data = &percpu_pagelist_high_fraction,
6257 .maxlen = sizeof(percpu_pagelist_high_fraction),
6258 .mode = 0644,
6259 .proc_handler = percpu_pagelist_high_fraction_sysctl_handler,
6260 .extra1 = SYSCTL_ZERO,
6261 },
6262 .procname = "highatomic_reserve_ratio",
> 6263 .data = &highatomic_reserve_ratio,
> 6264 .maxlen = sizeof(highatomic_reserve_ratio),
> 6265 .mode = 0644,
6266 .proc_handler = proc_dointvec_minmax,
6267 .extra1 = SYSCTL_ZERO,
6268 },
6269 {
6270 .procname = "lowmem_reserve_ratio",
6271 .data = &sysctl_lowmem_reserve_ratio,
6272 .maxlen = sizeof(sysctl_lowmem_reserve_ratio),
6273 .mode = 0644,
6274 .proc_handler = lowmem_reserve_ratio_sysctl_handler,
6275 },
6276 #ifdef CONFIG_NUMA
6277 {
6278 .procname = "numa_zonelist_order",
6279 .data = &numa_zonelist_order,
6280 .maxlen = NUMA_ZONELIST_ORDER_LEN,
6281 .mode = 0644,
6282 .proc_handler = numa_zonelist_order_handler,
6283 },
6284 {
6285 .procname = "min_unmapped_ratio",
6286 .data = &sysctl_min_unmapped_ratio,
6287 .maxlen = sizeof(sysctl_min_unmapped_ratio),
6288 .mode = 0644,
6289 .proc_handler = sysctl_min_unmapped_ratio_sysctl_handler,
6290 .extra1 = SYSCTL_ZERO,
6291 .extra2 = SYSCTL_ONE_HUNDRED,
6292 },
6293 {
6294 .procname = "min_slab_ratio",
6295 .data = &sysctl_min_slab_ratio,
6296 .maxlen = sizeof(sysctl_min_slab_ratio),
6297 .mode = 0644,
6298 .proc_handler = sysctl_min_slab_ratio_sysctl_handler,
6299 .extra1 = SYSCTL_ZERO,
6300 .extra2 = SYSCTL_ONE_HUNDRED,
6301 },
6302 #endif
6303 };
6304
--
0-DAY CI Kernel Test Service
https://github.com/intel/lkp-tests/wiki
^ permalink raw reply [flat|nested] 6+ messages in thread
* Re: [PATCH] mm/page_alloc: make the maximum number of highatomic pageblocks resizable
2025-02-26 2:41 [PATCH] mm/page_alloc: make the maximum number of highatomic pageblocks resizable zhongjinji
2025-02-26 23:17 ` Andrew Morton
2025-02-27 4:53 ` kernel test robot
@ 2025-02-27 5:25 ` kernel test robot
2025-02-27 5:35 ` kernel test robot
2025-02-28 11:01 ` Vlastimil Babka
4 siblings, 0 replies; 6+ messages in thread
From: kernel test robot @ 2025-02-27 5:25 UTC (permalink / raw)
To: zhongjinji, akpm
Cc: llvm, oe-kbuild-all, yuzhao, linux-kernel, linux-mm, rientjes,
vbabka, zhongjinji, yipengxiang, liulu.liu, feng.han
Hi,
kernel test robot noticed the following build errors:
[auto build test ERROR on akpm-mm/mm-everything]
url: https://github.com/intel-lab-lkp/linux/commits/zhongjinji-honor-com/mm-page_alloc-make-the-maximum-number-of-highatomic-pageblocks-resizable/20250226-121712
base: https://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm.git mm-everything
patch link: https://lore.kernel.org/r/20250226024126.3718-1-zhongjinji%40honor.com
patch subject: [PATCH] mm/page_alloc: make the maximum number of highatomic pageblocks resizable
config: x86_64-buildonly-randconfig-001-20250227 (https://download.01.org/0day-ci/archive/20250227/202502271250.KUfFnxnl-lkp@intel.com/config)
compiler: clang version 19.1.7 (https://github.com/llvm/llvm-project cd708029e0b2869e80abe31ddb175f7c35361f90)
reproduce (this is a W=1 build): (https://download.01.org/0day-ci/archive/20250227/202502271250.KUfFnxnl-lkp@intel.com/reproduce)
If you fix the issue in a separate patch/commit (i.e. not just a new version of
the same patch/commit), kindly add following tags
| Reported-by: kernel test robot <lkp@intel.com>
| Closes: https://lore.kernel.org/oe-kbuild-all/202502271250.KUfFnxnl-lkp@intel.com/
All errors (new ones prefixed by >>):
In file included from mm/page_alloc.c:19:
In file included from include/linux/mm.h:2302:
include/linux/vmstat.h:518:36: warning: arithmetic between different enumeration types ('enum node_stat_item' and 'enum lru_list') [-Wenum-enum-conversion]
518 | return node_stat_name(NR_LRU_BASE + lru) + 3; // skip "nr_"
| ~~~~~~~~~~~ ^ ~~~
In file included from mm/page_alloc.c:44:
include/linux/mm_inline.h:47:41: warning: arithmetic between different enumeration types ('enum node_stat_item' and 'enum lru_list') [-Wenum-enum-conversion]
47 | __mod_lruvec_state(lruvec, NR_LRU_BASE + lru, nr_pages);
| ~~~~~~~~~~~ ^ ~~~
include/linux/mm_inline.h:49:22: warning: arithmetic between different enumeration types ('enum zone_stat_item' and 'enum lru_list') [-Wenum-enum-conversion]
49 | NR_ZONE_LRU_BASE + lru, nr_pages);
| ~~~~~~~~~~~~~~~~ ^ ~~~
mm/page_alloc.c:2854:2: warning: arithmetic between different enumeration types ('enum vm_event_item' and 'enum zone_type') [-Wenum-enum-conversion]
2854 | __count_zid_vm_events(PGALLOC, page_zonenum(page), 1 << order);
| ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
include/linux/vmstat.h:139:34: note: expanded from macro '__count_zid_vm_events'
139 | __count_vm_events(item##_NORMAL - ZONE_NORMAL + zid, delta)
| ~~~~~~~~~~~~~ ^ ~~~~~~~~~~~
mm/page_alloc.c:2971:3: warning: arithmetic between different enumeration types ('enum vm_event_item' and 'enum zone_type') [-Wenum-enum-conversion]
2971 | __count_zid_vm_events(PGALLOC, page_zonenum(page), 1 << order);
| ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
include/linux/vmstat.h:139:34: note: expanded from macro '__count_zid_vm_events'
139 | __count_vm_events(item##_NORMAL - ZONE_NORMAL + zid, delta)
| ~~~~~~~~~~~~~ ^ ~~~~~~~~~~~
mm/page_alloc.c:4743:2: warning: arithmetic between different enumeration types ('enum vm_event_item' and 'enum zone_type') [-Wenum-enum-conversion]
4743 | __count_zid_vm_events(PGALLOC, zone_idx(zone), nr_account);
| ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
include/linux/vmstat.h:139:34: note: expanded from macro '__count_zid_vm_events'
139 | __count_vm_events(item##_NORMAL - ZONE_NORMAL + zid, delta)
| ~~~~~~~~~~~~~ ^ ~~~~~~~~~~~
>> mm/page_alloc.c:6262:3: error: field designator cannot initialize a non-struct, non-union type 'const struct ctl_table[]'
6262 | .procname = "highatomic_reserve_ratio",
| ^
mm/page_alloc.c:6263:3: error: field designator cannot initialize a non-struct, non-union type 'const struct ctl_table[]'
6263 | .data = &highatomic_reserve_ratio,
| ^
mm/page_alloc.c:6264:3: error: field designator cannot initialize a non-struct, non-union type 'const struct ctl_table[]'
6264 | .maxlen = sizeof(highatomic_reserve_ratio),
| ^
mm/page_alloc.c:6265:3: error: field designator cannot initialize a non-struct, non-union type 'const struct ctl_table[]'
6265 | .mode = 0644,
| ^
mm/page_alloc.c:6266:3: error: field designator cannot initialize a non-struct, non-union type 'const struct ctl_table[]'
6266 | .proc_handler = proc_dointvec_minmax,
| ^
mm/page_alloc.c:6267:3: error: field designator cannot initialize a non-struct, non-union type 'const struct ctl_table[]'
6267 | .extra1 = SYSCTL_ZERO,
| ^
>> mm/page_alloc.c:6268:3: error: expected ';' at end of declaration
6268 | },
| ^
| ;
>> mm/page_alloc.c:6269:2: error: expected identifier or '('
6269 | {
| ^
>> mm/page_alloc.c:6303:1: error: extraneous closing brace ('}')
6303 | };
| ^
6 warnings and 9 errors generated.
vim +6262 mm/page_alloc.c
6227
6228 static const struct ctl_table page_alloc_sysctl_table[] = {
6229 {
6230 .procname = "min_free_kbytes",
6231 .data = &min_free_kbytes,
6232 .maxlen = sizeof(min_free_kbytes),
6233 .mode = 0644,
6234 .proc_handler = min_free_kbytes_sysctl_handler,
6235 .extra1 = SYSCTL_ZERO,
6236 },
6237 {
6238 .procname = "watermark_boost_factor",
6239 .data = &watermark_boost_factor,
6240 .maxlen = sizeof(watermark_boost_factor),
6241 .mode = 0644,
6242 .proc_handler = proc_dointvec_minmax,
6243 .extra1 = SYSCTL_ZERO,
6244 },
6245 {
6246 .procname = "watermark_scale_factor",
6247 .data = &watermark_scale_factor,
6248 .maxlen = sizeof(watermark_scale_factor),
6249 .mode = 0644,
6250 .proc_handler = watermark_scale_factor_sysctl_handler,
6251 .extra1 = SYSCTL_ONE,
6252 .extra2 = SYSCTL_THREE_THOUSAND,
6253 },
6254 {
6255 .procname = "percpu_pagelist_high_fraction",
6256 .data = &percpu_pagelist_high_fraction,
6257 .maxlen = sizeof(percpu_pagelist_high_fraction),
6258 .mode = 0644,
6259 .proc_handler = percpu_pagelist_high_fraction_sysctl_handler,
6260 .extra1 = SYSCTL_ZERO,
6261 },
> 6262 .procname = "highatomic_reserve_ratio",
6263 .data = &highatomic_reserve_ratio,
6264 .maxlen = sizeof(highatomic_reserve_ratio),
6265 .mode = 0644,
6266 .proc_handler = proc_dointvec_minmax,
6267 .extra1 = SYSCTL_ZERO,
> 6268 },
> 6269 {
6270 .procname = "lowmem_reserve_ratio",
6271 .data = &sysctl_lowmem_reserve_ratio,
6272 .maxlen = sizeof(sysctl_lowmem_reserve_ratio),
6273 .mode = 0644,
6274 .proc_handler = lowmem_reserve_ratio_sysctl_handler,
6275 },
6276 #ifdef CONFIG_NUMA
6277 {
6278 .procname = "numa_zonelist_order",
6279 .data = &numa_zonelist_order,
6280 .maxlen = NUMA_ZONELIST_ORDER_LEN,
6281 .mode = 0644,
6282 .proc_handler = numa_zonelist_order_handler,
6283 },
6284 {
6285 .procname = "min_unmapped_ratio",
6286 .data = &sysctl_min_unmapped_ratio,
6287 .maxlen = sizeof(sysctl_min_unmapped_ratio),
6288 .mode = 0644,
6289 .proc_handler = sysctl_min_unmapped_ratio_sysctl_handler,
6290 .extra1 = SYSCTL_ZERO,
6291 .extra2 = SYSCTL_ONE_HUNDRED,
6292 },
6293 {
6294 .procname = "min_slab_ratio",
6295 .data = &sysctl_min_slab_ratio,
6296 .maxlen = sizeof(sysctl_min_slab_ratio),
6297 .mode = 0644,
6298 .proc_handler = sysctl_min_slab_ratio_sysctl_handler,
6299 .extra1 = SYSCTL_ZERO,
6300 .extra2 = SYSCTL_ONE_HUNDRED,
6301 },
6302 #endif
> 6303 };
6304
--
0-DAY CI Kernel Test Service
https://github.com/intel/lkp-tests/wiki
^ permalink raw reply [flat|nested] 6+ messages in thread
* Re: [PATCH] mm/page_alloc: make the maximum number of highatomic pageblocks resizable
2025-02-26 2:41 [PATCH] mm/page_alloc: make the maximum number of highatomic pageblocks resizable zhongjinji
` (2 preceding siblings ...)
2025-02-27 5:25 ` kernel test robot
@ 2025-02-27 5:35 ` kernel test robot
2025-02-28 11:01 ` Vlastimil Babka
4 siblings, 0 replies; 6+ messages in thread
From: kernel test robot @ 2025-02-27 5:35 UTC (permalink / raw)
To: zhongjinji, akpm
Cc: oe-kbuild-all, yuzhao, linux-kernel, linux-mm, rientjes, vbabka,
zhongjinji, yipengxiang, liulu.liu, feng.han
Hi,
kernel test robot noticed the following build errors:
[auto build test ERROR on akpm-mm/mm-everything]
url: https://github.com/intel-lab-lkp/linux/commits/zhongjinji-honor-com/mm-page_alloc-make-the-maximum-number-of-highatomic-pageblocks-resizable/20250226-121712
base: https://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm.git mm-everything
patch link: https://lore.kernel.org/r/20250226024126.3718-1-zhongjinji%40honor.com
patch subject: [PATCH] mm/page_alloc: make the maximum number of highatomic pageblocks resizable
config: x86_64-buildonly-randconfig-003-20250227 (https://download.01.org/0day-ci/archive/20250227/202502271338.7VyyFcoF-lkp@intel.com/config)
compiler: gcc-12 (Debian 12.2.0-14) 12.2.0
reproduce (this is a W=1 build): (https://download.01.org/0day-ci/archive/20250227/202502271338.7VyyFcoF-lkp@intel.com/reproduce)
If you fix the issue in a separate patch/commit (i.e. not just a new version of
the same patch/commit), kindly add following tags
| Reported-by: kernel test robot <lkp@intel.com>
| Closes: https://lore.kernel.org/oe-kbuild-all/202502271338.7VyyFcoF-lkp@intel.com/
All error/warnings (new ones prefixed by >>):
>> mm/page_alloc.c:6262:17: error: field name not in record or union initializer
6262 | .procname = "highatomic_reserve_ratio",
| ^
mm/page_alloc.c:6262:17: note: (near initialization for 'page_alloc_sysctl_table')
>> mm/page_alloc.c:6262:35: error: invalid initializer
6262 | .procname = "highatomic_reserve_ratio",
| ^~~~~~~~~~~~~~~~~~~~~~~~~~
mm/page_alloc.c:6262:35: note: (near initialization for 'page_alloc_sysctl_table[4].<anonymous>')
mm/page_alloc.c:6263:17: error: field name not in record or union initializer
6263 | .data = &highatomic_reserve_ratio,
| ^
mm/page_alloc.c:6263:17: note: (near initialization for 'page_alloc_sysctl_table')
mm/page_alloc.c:6263:35: error: invalid initializer
6263 | .data = &highatomic_reserve_ratio,
| ^
mm/page_alloc.c:6263:35: note: (near initialization for 'page_alloc_sysctl_table[5].<anonymous>')
mm/page_alloc.c:6264:17: error: field name not in record or union initializer
6264 | .maxlen = sizeof(highatomic_reserve_ratio),
| ^
mm/page_alloc.c:6264:17: note: (near initialization for 'page_alloc_sysctl_table')
mm/page_alloc.c:6264:35: error: invalid initializer
6264 | .maxlen = sizeof(highatomic_reserve_ratio),
| ^~~~~~
mm/page_alloc.c:6264:35: note: (near initialization for 'page_alloc_sysctl_table[6].<anonymous>')
mm/page_alloc.c:6265:17: error: field name not in record or union initializer
6265 | .mode = 0644,
| ^
mm/page_alloc.c:6265:17: note: (near initialization for 'page_alloc_sysctl_table')
mm/page_alloc.c:6265:35: error: invalid initializer
6265 | .mode = 0644,
| ^~~~
mm/page_alloc.c:6265:35: note: (near initialization for 'page_alloc_sysctl_table[7].<anonymous>')
mm/page_alloc.c:6266:17: error: field name not in record or union initializer
6266 | .proc_handler = proc_dointvec_minmax,
| ^
mm/page_alloc.c:6266:17: note: (near initialization for 'page_alloc_sysctl_table')
mm/page_alloc.c:6266:35: error: invalid initializer
6266 | .proc_handler = proc_dointvec_minmax,
| ^~~~~~~~~~~~~~~~~~~~
mm/page_alloc.c:6266:35: note: (near initialization for 'page_alloc_sysctl_table[8].<anonymous>')
mm/page_alloc.c:6267:17: error: field name not in record or union initializer
6267 | .extra1 = SYSCTL_ZERO,
| ^
mm/page_alloc.c:6267:17: note: (near initialization for 'page_alloc_sysctl_table')
In file included from include/linux/key.h:17,
from include/linux/cred.h:13,
from include/linux/sched/signal.h:10,
from include/linux/rcuwait.h:6,
from include/linux/mm.h:35,
from mm/page_alloc.c:19:
>> include/linux/sysctl.h:41:41: error: invalid initializer
41 | #define SYSCTL_ZERO ((void *)&sysctl_vals[0])
| ^
mm/page_alloc.c:6267:35: note: in expansion of macro 'SYSCTL_ZERO'
6267 | .extra1 = SYSCTL_ZERO,
| ^~~~~~~~~~~
include/linux/sysctl.h:41:41: note: (near initialization for 'page_alloc_sysctl_table[9].<anonymous>')
41 | #define SYSCTL_ZERO ((void *)&sysctl_vals[0])
| ^
mm/page_alloc.c:6267:35: note: in expansion of macro 'SYSCTL_ZERO'
6267 | .extra1 = SYSCTL_ZERO,
| ^~~~~~~~~~~
>> mm/page_alloc.c:6228:59: warning: missing braces around initializer [-Wmissing-braces]
6228 | static const struct ctl_table page_alloc_sysctl_table[] = {
| ^
>> mm/page_alloc.c:6269:9: error: expected identifier or '(' before '{' token
6269 | {
| ^
mm/page_alloc.c:6275:10: error: expected identifier or '(' before ',' token
6275 | },
| ^
>> mm/page_alloc.c:6174:12: warning: 'lowmem_reserve_ratio_sysctl_handler' defined but not used [-Wunused-function]
6174 | static int lowmem_reserve_ratio_sysctl_handler(const struct ctl_table *table,
| ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
In file included from include/linux/suspend.h:5,
from mm/page_alloc.c:28:
include/linux/swap.h:590:12: warning: 'folio_alloc_swap' defined but not used [-Wunused-function]
590 | static int folio_alloc_swap(struct folio *folio, gfp_t gfp_mask)
| ^~~~~~~~~~~~~~~~
vim +6262 mm/page_alloc.c
6164
6165 /*
6166 * lowmem_reserve_ratio_sysctl_handler - just a wrapper around
6167 * proc_dointvec() so that we can call setup_per_zone_lowmem_reserve()
6168 * whenever sysctl_lowmem_reserve_ratio changes.
6169 *
6170 * The reserve ratio obviously has absolutely no relation with the
6171 * minimum watermarks. The lowmem reserve ratio can only make sense
6172 * if in function of the boot time zone sizes.
6173 */
> 6174 static int lowmem_reserve_ratio_sysctl_handler(const struct ctl_table *table,
6175 int write, void *buffer, size_t *length, loff_t *ppos)
6176 {
6177 int i;
6178
6179 proc_dointvec_minmax(table, write, buffer, length, ppos);
6180
6181 for (i = 0; i < MAX_NR_ZONES; i++) {
6182 if (sysctl_lowmem_reserve_ratio[i] < 1)
6183 sysctl_lowmem_reserve_ratio[i] = 0;
6184 }
6185
6186 setup_per_zone_lowmem_reserve();
6187 return 0;
6188 }
6189
6190 /*
6191 * percpu_pagelist_high_fraction - changes the pcp->high for each zone on each
6192 * cpu. It is the fraction of total pages in each zone that a hot per cpu
6193 * pagelist can have before it gets flushed back to buddy allocator.
6194 */
6195 static int percpu_pagelist_high_fraction_sysctl_handler(const struct ctl_table *table,
6196 int write, void *buffer, size_t *length, loff_t *ppos)
6197 {
6198 struct zone *zone;
6199 int old_percpu_pagelist_high_fraction;
6200 int ret;
6201
6202 mutex_lock(&pcp_batch_high_lock);
6203 old_percpu_pagelist_high_fraction = percpu_pagelist_high_fraction;
6204
6205 ret = proc_dointvec_minmax(table, write, buffer, length, ppos);
6206 if (!write || ret < 0)
6207 goto out;
6208
6209 /* Sanity checking to avoid pcp imbalance */
6210 if (percpu_pagelist_high_fraction &&
6211 percpu_pagelist_high_fraction < MIN_PERCPU_PAGELIST_HIGH_FRACTION) {
6212 percpu_pagelist_high_fraction = old_percpu_pagelist_high_fraction;
6213 ret = -EINVAL;
6214 goto out;
6215 }
6216
6217 /* No change? */
6218 if (percpu_pagelist_high_fraction == old_percpu_pagelist_high_fraction)
6219 goto out;
6220
6221 for_each_populated_zone(zone)
6222 zone_set_pageset_high_and_batch(zone, 0);
6223 out:
6224 mutex_unlock(&pcp_batch_high_lock);
6225 return ret;
6226 }
6227
> 6228 static const struct ctl_table page_alloc_sysctl_table[] = {
6229 {
6230 .procname = "min_free_kbytes",
6231 .data = &min_free_kbytes,
6232 .maxlen = sizeof(min_free_kbytes),
6233 .mode = 0644,
6234 .proc_handler = min_free_kbytes_sysctl_handler,
6235 .extra1 = SYSCTL_ZERO,
6236 },
6237 {
6238 .procname = "watermark_boost_factor",
6239 .data = &watermark_boost_factor,
6240 .maxlen = sizeof(watermark_boost_factor),
6241 .mode = 0644,
6242 .proc_handler = proc_dointvec_minmax,
6243 .extra1 = SYSCTL_ZERO,
6244 },
6245 {
6246 .procname = "watermark_scale_factor",
6247 .data = &watermark_scale_factor,
6248 .maxlen = sizeof(watermark_scale_factor),
6249 .mode = 0644,
6250 .proc_handler = watermark_scale_factor_sysctl_handler,
6251 .extra1 = SYSCTL_ONE,
6252 .extra2 = SYSCTL_THREE_THOUSAND,
6253 },
6254 {
6255 .procname = "percpu_pagelist_high_fraction",
6256 .data = &percpu_pagelist_high_fraction,
6257 .maxlen = sizeof(percpu_pagelist_high_fraction),
6258 .mode = 0644,
6259 .proc_handler = percpu_pagelist_high_fraction_sysctl_handler,
6260 .extra1 = SYSCTL_ZERO,
6261 },
> 6262 .procname = "highatomic_reserve_ratio",
6263 .data = &highatomic_reserve_ratio,
6264 .maxlen = sizeof(highatomic_reserve_ratio),
6265 .mode = 0644,
6266 .proc_handler = proc_dointvec_minmax,
6267 .extra1 = SYSCTL_ZERO,
6268 },
> 6269 {
6270 .procname = "lowmem_reserve_ratio",
6271 .data = &sysctl_lowmem_reserve_ratio,
6272 .maxlen = sizeof(sysctl_lowmem_reserve_ratio),
6273 .mode = 0644,
6274 .proc_handler = lowmem_reserve_ratio_sysctl_handler,
6275 },
6276 #ifdef CONFIG_NUMA
6277 {
6278 .procname = "numa_zonelist_order",
6279 .data = &numa_zonelist_order,
6280 .maxlen = NUMA_ZONELIST_ORDER_LEN,
6281 .mode = 0644,
6282 .proc_handler = numa_zonelist_order_handler,
6283 },
6284 {
6285 .procname = "min_unmapped_ratio",
6286 .data = &sysctl_min_unmapped_ratio,
6287 .maxlen = sizeof(sysctl_min_unmapped_ratio),
6288 .mode = 0644,
6289 .proc_handler = sysctl_min_unmapped_ratio_sysctl_handler,
6290 .extra1 = SYSCTL_ZERO,
6291 .extra2 = SYSCTL_ONE_HUNDRED,
6292 },
6293 {
6294 .procname = "min_slab_ratio",
6295 .data = &sysctl_min_slab_ratio,
6296 .maxlen = sizeof(sysctl_min_slab_ratio),
6297 .mode = 0644,
6298 .proc_handler = sysctl_min_slab_ratio_sysctl_handler,
6299 .extra1 = SYSCTL_ZERO,
6300 .extra2 = SYSCTL_ONE_HUNDRED,
6301 },
6302 #endif
6303 };
6304
--
0-DAY CI Kernel Test Service
https://github.com/intel/lkp-tests/wiki
^ permalink raw reply [flat|nested] 6+ messages in thread
* Re: [PATCH] mm/page_alloc: make the maximum number of highatomic pageblocks resizable
2025-02-26 2:41 [PATCH] mm/page_alloc: make the maximum number of highatomic pageblocks resizable zhongjinji
` (3 preceding siblings ...)
2025-02-27 5:35 ` kernel test robot
@ 2025-02-28 11:01 ` Vlastimil Babka
4 siblings, 0 replies; 6+ messages in thread
From: Vlastimil Babka @ 2025-02-28 11:01 UTC (permalink / raw)
To: zhongjinji, akpm, Mel Gorman, Johannes Weiner, Brendan Jackman
Cc: yuzhao, linux-kernel, linux-mm, rientjes, yipengxiang, liulu.liu,
feng.han
+Cc Mel, Johannes and Brendan, please keep on future submissions
On 2/26/25 03:41, zhongjinji@honor.com wrote:
> From: zhongjinji <zhongjinji@honor.com>
>
> In the past, nr_reserved_highatomic was considered to be part of
> unusable_free when there was no ALLOC_RESERVES flag. To prevent
> unusable_free from being too large, it is reasonable to set a
> fixed maximum highatomic value.
>
> Even if the maximum number of highatomic pageblocks is set to be larger,
> unusable_free may not increase, since Yu Zhao provided the modification
> about nr_free_highatomic in
> https://lore.kernel.org/all/20241028182653.3420139-1-yuzhao@google.com/T/#u
I think I raised in discussing that patch the fact that
unreserve_highatomic_pageblock() is flawed in that it will only find a
highatomic block to unreserve if it has some free pages as it searches via
freelist. So if the existing reserve becomes fully used up, it's dead
weight, unless the highatomic allocations are shortlived. It could make
sense to remove the HIGHATOMIC marking so other, more free blocks might be
reserved instead.
I fear if this approach isn't just working around that limitation.
> More highatomic pageblocks are beneficial for the successful allocation
> of high-order page, which is helpful in some devices. Therefore, use
Can you elaborate what is it helpful with? And what kind of values you
expect to be using? Do you find that your existing HIGHATOMIC blocks are
fully depleted as I speculate above? Are your highatomic allocations short
or long lived?
Maybe simply at this point nr_free_highatomic would be a better metric than
nr_reserved_highatomic to decide if we should reserve another pageblock? The
counter didn't exist when highatomic reserves were introduced.
But maybe also some mechanism (compaction maybe?) could also be looking for
highatomic blocks that are fully used and unreserve them?
> highatomic_reserve_ratio to adjust the maximum number of highatomic
> pageblocks.
>
> Signed-off-by: zhongjinji <zhongjinji@honor.com>
Minimally this needs to be documented in e.g.
Documentation/admin-guide/sysctl/vm.rst, it's not obvious that the value is
divided by 1000 and not 100 for example.
> ---
> mm/page_alloc.c | 11 ++++++++++-
> 1 file changed, 10 insertions(+), 1 deletion(-)
>
> diff --git a/mm/page_alloc.c b/mm/page_alloc.c
> index 579789600a3c..dbdce6a0f694 100644
> --- a/mm/page_alloc.c
> +++ b/mm/page_alloc.c
> @@ -273,6 +273,7 @@ int min_free_kbytes = 1024;
> int user_min_free_kbytes = -1;
> static int watermark_boost_factor __read_mostly = 15000;
> static int watermark_scale_factor = 10;
> +static int highatomic_reserve_ratio = 10;
>
> /* movable_zone is the "real" zone pages in ZONE_MOVABLE are taken from */
> int movable_zone;
> @@ -2046,7 +2047,8 @@ static void reserve_highatomic_pageblock(struct page *page, int order,
> */
> if ((zone_managed_pages(zone) / 100) < pageblock_nr_pages)
> return;
> - max_managed = ALIGN((zone_managed_pages(zone) / 100), pageblock_nr_pages);
> + max_managed = ALIGN((zone_managed_pages(zone) * highatomic_reserve_ratio / 1000),
> + pageblock_nr_pages);
> if (zone->nr_reserved_highatomic >= max_managed)
> return;
>
> @@ -6199,6 +6201,13 @@ static const struct ctl_table page_alloc_sysctl_table[] = {
> .mode = 0644,
> .proc_handler = percpu_pagelist_high_fraction_sysctl_handler,
> .extra1 = SYSCTL_ZERO,
> + },
> + .procname = "highatomic_reserve_ratio",
> + .data = &highatomic_reserve_ratio,
> + .maxlen = sizeof(highatomic_reserve_ratio),
> + .mode = 0644,
> + .proc_handler = proc_dointvec_minmax,
> + .extra1 = SYSCTL_ZERO,
> },
> {
> .procname = "lowmem_reserve_ratio",
^ permalink raw reply [flat|nested] 6+ messages in thread
end of thread, other threads:[~2025-02-28 11:02 UTC | newest]
Thread overview: 6+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2025-02-26 2:41 [PATCH] mm/page_alloc: make the maximum number of highatomic pageblocks resizable zhongjinji
2025-02-26 23:17 ` Andrew Morton
2025-02-27 4:53 ` kernel test robot
2025-02-27 5:25 ` kernel test robot
2025-02-27 5:35 ` kernel test robot
2025-02-28 11:01 ` Vlastimil Babka
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox