* Re: [PATCH v2] mm: Override mTHP "enabled" defaults at kernel cmdline [not found] <20240808101700.571701-1-ryan.roberts@arm.com> @ 2024-08-08 21:17 ` Barry Song 2024-08-09 7:58 ` Ryan Roberts 2024-08-09 8:49 ` Baolin Wang ` (2 subsequent siblings) 3 siblings, 1 reply; 17+ messages in thread From: Barry Song @ 2024-08-08 21:17 UTC (permalink / raw) To: Ryan Roberts Cc: Andrew Morton, Jonathan Corbet, David Hildenbrand, Lance Yang, Baolin Wang, linux-kernel, linux-mm On Thu, Aug 8, 2024 at 10:17 PM Ryan Roberts <ryan.roberts@arm.com> wrote: > > Add thp_anon= cmdline parameter to allow specifying the default > enablement of each supported anon THP size. The parameter accepts the > following format and can be provided multiple times to configure each > size: > > thp_anon=<size>[KMG]:<value> > > See Documentation/admin-guide/mm/transhuge.rst for more details. > > Configuring the defaults at boot time is useful to allow early user > space to take advantage of mTHP before its been configured through > sysfs. > > Signed-off-by: Ryan Roberts <ryan.roberts@arm.com> > --- > > Hi All, > > I've split this off from my RFC at [1] because Barry highlighted that he would > benefit from it immediately [2]. There are no changes vs the version in that > series. > > It applies against today's mm-unstable (275d686abcb59). (although I had to fix a > minor build bug in stackdepot.c due to MIN() not being defined in this tree). > > Thanks, > Ryan > > > .../admin-guide/kernel-parameters.txt | 8 +++ > Documentation/admin-guide/mm/transhuge.rst | 26 +++++++-- > mm/huge_memory.c | 55 ++++++++++++++++++- > 3 files changed, 82 insertions(+), 7 deletions(-) > > diff --git a/Documentation/admin-guide/kernel-parameters.txt b/Documentation/admin-guide/kernel-parameters.txt > index bcdee8984e1f0..5c79b58c108ec 100644 > --- a/Documentation/admin-guide/kernel-parameters.txt > +++ b/Documentation/admin-guide/kernel-parameters.txt > @@ -6631,6 +6631,14 @@ > <deci-seconds>: poll all this frequency > 0: no polling (default) > > + thp_anon= [KNL] > + Format: <size>[KMG]:always|madvise|never|inherit > + Can be used to control the default behavior of the > + system with respect to anonymous transparent hugepages. > + Can be used multiple times for multiple anon THP sizes. > + See Documentation/admin-guide/mm/transhuge.rst for more > + details. > + > threadirqs [KNL,EARLY] > Force threading of all interrupt handlers except those > marked explicitly IRQF_NO_THREAD. > diff --git a/Documentation/admin-guide/mm/transhuge.rst b/Documentation/admin-guide/mm/transhuge.rst > index 24eec1c03ad88..f63b0717366c6 100644 > --- a/Documentation/admin-guide/mm/transhuge.rst > +++ b/Documentation/admin-guide/mm/transhuge.rst > @@ -284,13 +284,27 @@ that THP is shared. Exceeding the number would block the collapse:: > > A higher value may increase memory footprint for some workloads. > > -Boot parameter > -============== > +Boot parameters > +=============== > > -You can change the sysfs boot time defaults of Transparent Hugepage > -Support by passing the parameter ``transparent_hugepage=always`` or > -``transparent_hugepage=madvise`` or ``transparent_hugepage=never`` > -to the kernel command line. > +You can change the sysfs boot time default for the top-level "enabled" > +control by passing the parameter ``transparent_hugepage=always`` or > +``transparent_hugepage=madvise`` or ``transparent_hugepage=never`` to the > +kernel command line. > + > +Alternatively, each supported anonymous THP size can be controlled by > +passing ``thp_anon=<size>[KMG]:<state>``, where ``<size>`` is the THP size > +and ``<state>`` is one of ``always``, ``madvise``, ``never`` or > +``inherit``. > + > +For example, the following will set 64K THP to ``always``:: > + > + thp_anon=64K:always > + > +``thp_anon=`` may be specified multiple times to configure all THP sizes as > +required. If ``thp_anon=`` is specified at least once, any anon THP sizes > +not explicitly configured on the command line are implicitly set to > +``never``. > > Hugepages in tmpfs/shmem > ======================== > diff --git a/mm/huge_memory.c b/mm/huge_memory.c > index 0c3075ee00012..c2c0da1eb94e6 100644 > --- a/mm/huge_memory.c > +++ b/mm/huge_memory.c > @@ -82,6 +82,7 @@ unsigned long huge_zero_pfn __read_mostly = ~0UL; > unsigned long huge_anon_orders_always __read_mostly; > unsigned long huge_anon_orders_madvise __read_mostly; > unsigned long huge_anon_orders_inherit __read_mostly; > +static bool anon_orders_configured; > > unsigned long __thp_vma_allowable_orders(struct vm_area_struct *vma, > unsigned long vm_flags, > @@ -672,7 +673,10 @@ static int __init hugepage_init_sysfs(struct kobject **hugepage_kobj) > * disable all other sizes. powerpc's PMD_ORDER isn't a compile-time > * constant so we have to do this here. > */ > - huge_anon_orders_inherit = BIT(PMD_ORDER); > + if (!anon_orders_configured) { > + huge_anon_orders_inherit = BIT(PMD_ORDER); > + anon_orders_configured = true; > + } If a user configures 64KB and doesn't adjust anything for PMD_ORDER, then PMD_ORDER will be set to "never", correct? This seems to change the default behavior of PMD_ORDER. Could we instead achieve this by checking if PMD_ORDER has been explicitly configured? > > *hugepage_kobj = kobject_create_and_add("transparent_hugepage", mm_kobj); > if (unlikely(!*hugepage_kobj)) { > @@ -857,6 +861,55 @@ static int __init setup_transparent_hugepage(char *str) > } > __setup("transparent_hugepage=", setup_transparent_hugepage); > > +static int __init setup_thp_anon(char *str) > +{ > + unsigned long size; > + char *state; > + int order; > + int ret = 0; > + > + if (!str) > + goto out; > + > + size = (unsigned long)memparse(str, &state); > + order = ilog2(size >> PAGE_SHIFT); > + if (*state != ':' || !is_power_of_2(size) || size <= PAGE_SIZE || > + !(BIT(order) & THP_ORDERS_ALL_ANON)) > + goto out; > + > + state++; > + > + if (!strcmp(state, "always")) { > + clear_bit(order, &huge_anon_orders_inherit); > + clear_bit(order, &huge_anon_orders_madvise); > + set_bit(order, &huge_anon_orders_always); > + ret = 1; > + } else if (!strcmp(state, "inherit")) { > + clear_bit(order, &huge_anon_orders_always); > + clear_bit(order, &huge_anon_orders_madvise); > + set_bit(order, &huge_anon_orders_inherit); > + ret = 1; > + } else if (!strcmp(state, "madvise")) { > + clear_bit(order, &huge_anon_orders_always); > + clear_bit(order, &huge_anon_orders_inherit); > + set_bit(order, &huge_anon_orders_madvise); > + ret = 1; > + } else if (!strcmp(state, "never")) { > + clear_bit(order, &huge_anon_orders_always); > + clear_bit(order, &huge_anon_orders_inherit); > + clear_bit(order, &huge_anon_orders_madvise); > + ret = 1; > + } > + > + if (ret) > + anon_orders_configured = true; I mean: if (ret && order == PMD_ORDER) anon_pmd_order_configured = true; > +out: > + if (!ret) > + pr_warn("thp_anon=%s: cannot parse, ignored\n", str); > + return ret; > +} > +__setup("thp_anon=", setup_thp_anon); > + > pmd_t maybe_pmd_mkwrite(pmd_t pmd, struct vm_area_struct *vma) > { > if (likely(vma->vm_flags & VM_WRITE)) > -- > 2.43.0 > Thanks Barry ^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: [PATCH v2] mm: Override mTHP "enabled" defaults at kernel cmdline 2024-08-08 21:17 ` [PATCH v2] mm: Override mTHP "enabled" defaults at kernel cmdline Barry Song @ 2024-08-09 7:58 ` Ryan Roberts 2024-08-09 8:31 ` Barry Song 0 siblings, 1 reply; 17+ messages in thread From: Ryan Roberts @ 2024-08-09 7:58 UTC (permalink / raw) To: Barry Song Cc: Andrew Morton, Jonathan Corbet, David Hildenbrand, Lance Yang, Baolin Wang, linux-kernel, linux-mm On 08/08/2024 22:17, Barry Song wrote: > On Thu, Aug 8, 2024 at 10:17 PM Ryan Roberts <ryan.roberts@arm.com> wrote: >> >> Add thp_anon= cmdline parameter to allow specifying the default >> enablement of each supported anon THP size. The parameter accepts the >> following format and can be provided multiple times to configure each >> size: >> >> thp_anon=<size>[KMG]:<value> >> >> See Documentation/admin-guide/mm/transhuge.rst for more details. >> >> Configuring the defaults at boot time is useful to allow early user >> space to take advantage of mTHP before its been configured through >> sysfs. >> >> Signed-off-by: Ryan Roberts <ryan.roberts@arm.com> >> --- >> >> Hi All, >> >> I've split this off from my RFC at [1] because Barry highlighted that he would >> benefit from it immediately [2]. There are no changes vs the version in that >> series. >> >> It applies against today's mm-unstable (275d686abcb59). (although I had to fix a >> minor build bug in stackdepot.c due to MIN() not being defined in this tree). >> >> Thanks, >> Ryan >> >> >> .../admin-guide/kernel-parameters.txt | 8 +++ >> Documentation/admin-guide/mm/transhuge.rst | 26 +++++++-- >> mm/huge_memory.c | 55 ++++++++++++++++++- >> 3 files changed, 82 insertions(+), 7 deletions(-) >> >> diff --git a/Documentation/admin-guide/kernel-parameters.txt b/Documentation/admin-guide/kernel-parameters.txt >> index bcdee8984e1f0..5c79b58c108ec 100644 >> --- a/Documentation/admin-guide/kernel-parameters.txt >> +++ b/Documentation/admin-guide/kernel-parameters.txt >> @@ -6631,6 +6631,14 @@ >> <deci-seconds>: poll all this frequency >> 0: no polling (default) >> >> + thp_anon= [KNL] >> + Format: <size>[KMG]:always|madvise|never|inherit >> + Can be used to control the default behavior of the >> + system with respect to anonymous transparent hugepages. >> + Can be used multiple times for multiple anon THP sizes. >> + See Documentation/admin-guide/mm/transhuge.rst for more >> + details. >> + >> threadirqs [KNL,EARLY] >> Force threading of all interrupt handlers except those >> marked explicitly IRQF_NO_THREAD. >> diff --git a/Documentation/admin-guide/mm/transhuge.rst b/Documentation/admin-guide/mm/transhuge.rst >> index 24eec1c03ad88..f63b0717366c6 100644 >> --- a/Documentation/admin-guide/mm/transhuge.rst >> +++ b/Documentation/admin-guide/mm/transhuge.rst >> @@ -284,13 +284,27 @@ that THP is shared. Exceeding the number would block the collapse:: >> >> A higher value may increase memory footprint for some workloads. >> >> -Boot parameter >> -============== >> +Boot parameters >> +=============== >> >> -You can change the sysfs boot time defaults of Transparent Hugepage >> -Support by passing the parameter ``transparent_hugepage=always`` or >> -``transparent_hugepage=madvise`` or ``transparent_hugepage=never`` >> -to the kernel command line. >> +You can change the sysfs boot time default for the top-level "enabled" >> +control by passing the parameter ``transparent_hugepage=always`` or >> +``transparent_hugepage=madvise`` or ``transparent_hugepage=never`` to the >> +kernel command line. >> + >> +Alternatively, each supported anonymous THP size can be controlled by >> +passing ``thp_anon=<size>[KMG]:<state>``, where ``<size>`` is the THP size >> +and ``<state>`` is one of ``always``, ``madvise``, ``never`` or >> +``inherit``. >> + >> +For example, the following will set 64K THP to ``always``:: >> + >> + thp_anon=64K:always >> + >> +``thp_anon=`` may be specified multiple times to configure all THP sizes as >> +required. If ``thp_anon=`` is specified at least once, any anon THP sizes >> +not explicitly configured on the command line are implicitly set to >> +``never``. >> >> Hugepages in tmpfs/shmem >> ======================== >> diff --git a/mm/huge_memory.c b/mm/huge_memory.c >> index 0c3075ee00012..c2c0da1eb94e6 100644 >> --- a/mm/huge_memory.c >> +++ b/mm/huge_memory.c >> @@ -82,6 +82,7 @@ unsigned long huge_zero_pfn __read_mostly = ~0UL; >> unsigned long huge_anon_orders_always __read_mostly; >> unsigned long huge_anon_orders_madvise __read_mostly; >> unsigned long huge_anon_orders_inherit __read_mostly; >> +static bool anon_orders_configured; >> >> unsigned long __thp_vma_allowable_orders(struct vm_area_struct *vma, >> unsigned long vm_flags, >> @@ -672,7 +673,10 @@ static int __init hugepage_init_sysfs(struct kobject **hugepage_kobj) >> * disable all other sizes. powerpc's PMD_ORDER isn't a compile-time >> * constant so we have to do this here. >> */ >> - huge_anon_orders_inherit = BIT(PMD_ORDER); >> + if (!anon_orders_configured) { >> + huge_anon_orders_inherit = BIT(PMD_ORDER); >> + anon_orders_configured = true; >> + } > > If a user configures 64KB and doesn't adjust anything for PMD_ORDER, > then PMD_ORDER will be set to "never", correct? This seems to change > the default behavior of PMD_ORDER. Could we instead achieve this by > checking if PMD_ORDER has been explicitly configured? Yes, that's how it's implemented in this patch, and the accompanying docs also state: If ``thp_anon=`` is specified at least once, any anon THP sizes not explicitly configured on the command line are implicitly set to ``never``. My initial approach did exactly as you suggest. But in the original series, I also had a similar patch to configure file thp with "thp_file=". And for file, all of the orders default to `always`. So if taking the same approach with that control, the user would have to explicitly opt-out of all supported orders rather than just opt-in to the orders they want. And I thought that could get tricky in future if support is added for more orders. I felt that was potentially very confusing so decided it was clearer to have the above rule and make both controls consistent. What do you think? > >> >> *hugepage_kobj = kobject_create_and_add("transparent_hugepage", mm_kobj); >> if (unlikely(!*hugepage_kobj)) { >> @@ -857,6 +861,55 @@ static int __init setup_transparent_hugepage(char *str) >> } >> __setup("transparent_hugepage=", setup_transparent_hugepage); >> >> +static int __init setup_thp_anon(char *str) >> +{ >> + unsigned long size; >> + char *state; >> + int order; >> + int ret = 0; >> + >> + if (!str) >> + goto out; >> + >> + size = (unsigned long)memparse(str, &state); >> + order = ilog2(size >> PAGE_SHIFT); >> + if (*state != ':' || !is_power_of_2(size) || size <= PAGE_SIZE || >> + !(BIT(order) & THP_ORDERS_ALL_ANON)) >> + goto out; >> + >> + state++; >> + >> + if (!strcmp(state, "always")) { >> + clear_bit(order, &huge_anon_orders_inherit); >> + clear_bit(order, &huge_anon_orders_madvise); >> + set_bit(order, &huge_anon_orders_always); >> + ret = 1; >> + } else if (!strcmp(state, "inherit")) { >> + clear_bit(order, &huge_anon_orders_always); >> + clear_bit(order, &huge_anon_orders_madvise); >> + set_bit(order, &huge_anon_orders_inherit); >> + ret = 1; >> + } else if (!strcmp(state, "madvise")) { >> + clear_bit(order, &huge_anon_orders_always); >> + clear_bit(order, &huge_anon_orders_inherit); >> + set_bit(order, &huge_anon_orders_madvise); >> + ret = 1; >> + } else if (!strcmp(state, "never")) { >> + clear_bit(order, &huge_anon_orders_always); >> + clear_bit(order, &huge_anon_orders_inherit); >> + clear_bit(order, &huge_anon_orders_madvise); >> + ret = 1; >> + } >> + >> + if (ret) >> + anon_orders_configured = true; > > I mean: > > if (ret && order == PMD_ORDER) > anon_pmd_order_configured = true; > >> +out: >> + if (!ret) >> + pr_warn("thp_anon=%s: cannot parse, ignored\n", str); >> + return ret; >> +} >> +__setup("thp_anon=", setup_thp_anon); >> + >> pmd_t maybe_pmd_mkwrite(pmd_t pmd, struct vm_area_struct *vma) >> { >> if (likely(vma->vm_flags & VM_WRITE)) >> -- >> 2.43.0 >> > > Thanks > Barry ^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: [PATCH v2] mm: Override mTHP "enabled" defaults at kernel cmdline 2024-08-09 7:58 ` Ryan Roberts @ 2024-08-09 8:31 ` Barry Song 2024-08-09 8:57 ` Ryan Roberts 0 siblings, 1 reply; 17+ messages in thread From: Barry Song @ 2024-08-09 8:31 UTC (permalink / raw) To: Ryan Roberts Cc: Andrew Morton, Jonathan Corbet, David Hildenbrand, Lance Yang, Baolin Wang, linux-kernel, linux-mm On Fri, Aug 9, 2024 at 3:58 PM Ryan Roberts <ryan.roberts@arm.com> wrote: > > On 08/08/2024 22:17, Barry Song wrote: > > On Thu, Aug 8, 2024 at 10:17 PM Ryan Roberts <ryan.roberts@arm.com> wrote: > >> > >> Add thp_anon= cmdline parameter to allow specifying the default > >> enablement of each supported anon THP size. The parameter accepts the > >> following format and can be provided multiple times to configure each > >> size: > >> > >> thp_anon=<size>[KMG]:<value> > >> > >> See Documentation/admin-guide/mm/transhuge.rst for more details. > >> > >> Configuring the defaults at boot time is useful to allow early user > >> space to take advantage of mTHP before its been configured through > >> sysfs. > >> > >> Signed-off-by: Ryan Roberts <ryan.roberts@arm.com> > >> --- > >> > >> Hi All, > >> > >> I've split this off from my RFC at [1] because Barry highlighted that he would > >> benefit from it immediately [2]. There are no changes vs the version in that > >> series. > >> > >> It applies against today's mm-unstable (275d686abcb59). (although I had to fix a > >> minor build bug in stackdepot.c due to MIN() not being defined in this tree). > >> > >> Thanks, > >> Ryan > >> > >> > >> .../admin-guide/kernel-parameters.txt | 8 +++ > >> Documentation/admin-guide/mm/transhuge.rst | 26 +++++++-- > >> mm/huge_memory.c | 55 ++++++++++++++++++- > >> 3 files changed, 82 insertions(+), 7 deletions(-) > >> > >> diff --git a/Documentation/admin-guide/kernel-parameters.txt b/Documentation/admin-guide/kernel-parameters.txt > >> index bcdee8984e1f0..5c79b58c108ec 100644 > >> --- a/Documentation/admin-guide/kernel-parameters.txt > >> +++ b/Documentation/admin-guide/kernel-parameters.txt > >> @@ -6631,6 +6631,14 @@ > >> <deci-seconds>: poll all this frequency > >> 0: no polling (default) > >> > >> + thp_anon= [KNL] > >> + Format: <size>[KMG]:always|madvise|never|inherit > >> + Can be used to control the default behavior of the > >> + system with respect to anonymous transparent hugepages. > >> + Can be used multiple times for multiple anon THP sizes. > >> + See Documentation/admin-guide/mm/transhuge.rst for more > >> + details. > >> + > >> threadirqs [KNL,EARLY] > >> Force threading of all interrupt handlers except those > >> marked explicitly IRQF_NO_THREAD. > >> diff --git a/Documentation/admin-guide/mm/transhuge.rst b/Documentation/admin-guide/mm/transhuge.rst > >> index 24eec1c03ad88..f63b0717366c6 100644 > >> --- a/Documentation/admin-guide/mm/transhuge.rst > >> +++ b/Documentation/admin-guide/mm/transhuge.rst > >> @@ -284,13 +284,27 @@ that THP is shared. Exceeding the number would block the collapse:: > >> > >> A higher value may increase memory footprint for some workloads. > >> > >> -Boot parameter > >> -============== > >> +Boot parameters > >> +=============== > >> > >> -You can change the sysfs boot time defaults of Transparent Hugepage > >> -Support by passing the parameter ``transparent_hugepage=always`` or > >> -``transparent_hugepage=madvise`` or ``transparent_hugepage=never`` > >> -to the kernel command line. > >> +You can change the sysfs boot time default for the top-level "enabled" > >> +control by passing the parameter ``transparent_hugepage=always`` or > >> +``transparent_hugepage=madvise`` or ``transparent_hugepage=never`` to the > >> +kernel command line. > >> + > >> +Alternatively, each supported anonymous THP size can be controlled by > >> +passing ``thp_anon=<size>[KMG]:<state>``, where ``<size>`` is the THP size > >> +and ``<state>`` is one of ``always``, ``madvise``, ``never`` or > >> +``inherit``. > >> + > >> +For example, the following will set 64K THP to ``always``:: > >> + > >> + thp_anon=64K:always > >> + > >> +``thp_anon=`` may be specified multiple times to configure all THP sizes as > >> +required. If ``thp_anon=`` is specified at least once, any anon THP sizes > >> +not explicitly configured on the command line are implicitly set to > >> +``never``. > >> > >> Hugepages in tmpfs/shmem > >> ======================== > >> diff --git a/mm/huge_memory.c b/mm/huge_memory.c > >> index 0c3075ee00012..c2c0da1eb94e6 100644 > >> --- a/mm/huge_memory.c > >> +++ b/mm/huge_memory.c > >> @@ -82,6 +82,7 @@ unsigned long huge_zero_pfn __read_mostly = ~0UL; > >> unsigned long huge_anon_orders_always __read_mostly; > >> unsigned long huge_anon_orders_madvise __read_mostly; > >> unsigned long huge_anon_orders_inherit __read_mostly; > >> +static bool anon_orders_configured; > >> > >> unsigned long __thp_vma_allowable_orders(struct vm_area_struct *vma, > >> unsigned long vm_flags, > >> @@ -672,7 +673,10 @@ static int __init hugepage_init_sysfs(struct kobject **hugepage_kobj) > >> * disable all other sizes. powerpc's PMD_ORDER isn't a compile-time > >> * constant so we have to do this here. > >> */ > >> - huge_anon_orders_inherit = BIT(PMD_ORDER); > >> + if (!anon_orders_configured) { > >> + huge_anon_orders_inherit = BIT(PMD_ORDER); > >> + anon_orders_configured = true; > >> + } > > > > If a user configures 64KB and doesn't adjust anything for PMD_ORDER, > > then PMD_ORDER will be set to "never", correct? This seems to change > > the default behavior of PMD_ORDER. Could we instead achieve this by > > checking if PMD_ORDER has been explicitly configured? > > Yes, that's how it's implemented in this patch, and the accompanying docs also > state: > > If ``thp_anon=`` is specified at least once, any anon THP sizes > not explicitly configured on the command line are implicitly set to > ``never``. > > My initial approach did exactly as you suggest. But in the original series, I > also had a similar patch to configure file thp with "thp_file=". And for file, > all of the orders default to `always`. So if taking the same approach with that > control, the user would have to explicitly opt-out of all supported orders > rather than just opt-in to the orders they want. And I thought that could get > tricky in future if support is added for more orders. I felt that was > potentially very confusing so decided it was clearer to have the above rule and > make both controls consistent. > > What do you think? If this is the intention, once the user sets the command line, they should realize that the default settings have been overridden. I am perfectly fine with this strategy. with the below cmdline: thp_anon=64K:always thp_anon=8K:inherit thp_anon=32K:madvise thp_anon=1M:inherit thp_anon=2M:always I am getting: / # cat /sys/kernel/mm/transparent_hugepage/hugepages-64kB/enabled [always] inherit madvise never / # cat /sys/kernel/mm/transparent_hugepage/hugepages-32kB/enabled always inherit [madvise] never / # cat /sys/kernel/mm/transparent_hugepage/hugepages-1024kB/enabled always [inherit] madvise never / # cat /sys/kernel/mm/transparent_hugepage/hugepages-2048kB/enabled [always] inherit madvise never Thus, Tested-by: Barry Song <baohua@kernel.org> > > > > > >> > >> *hugepage_kobj = kobject_create_and_add("transparent_hugepage", mm_kobj); > >> if (unlikely(!*hugepage_kobj)) { > >> @@ -857,6 +861,55 @@ static int __init setup_transparent_hugepage(char *str) > >> } > >> __setup("transparent_hugepage=", setup_transparent_hugepage); > >> > >> +static int __init setup_thp_anon(char *str) > >> +{ > >> + unsigned long size; > >> + char *state; > >> + int order; > >> + int ret = 0; > >> + > >> + if (!str) > >> + goto out; > >> + > >> + size = (unsigned long)memparse(str, &state); > >> + order = ilog2(size >> PAGE_SHIFT); > >> + if (*state != ':' || !is_power_of_2(size) || size <= PAGE_SIZE || > >> + !(BIT(order) & THP_ORDERS_ALL_ANON)) > >> + goto out; > >> + > >> + state++; > >> + > >> + if (!strcmp(state, "always")) { > >> + clear_bit(order, &huge_anon_orders_inherit); > >> + clear_bit(order, &huge_anon_orders_madvise); > >> + set_bit(order, &huge_anon_orders_always); > >> + ret = 1; > >> + } else if (!strcmp(state, "inherit")) { > >> + clear_bit(order, &huge_anon_orders_always); > >> + clear_bit(order, &huge_anon_orders_madvise); > >> + set_bit(order, &huge_anon_orders_inherit); > >> + ret = 1; > >> + } else if (!strcmp(state, "madvise")) { > >> + clear_bit(order, &huge_anon_orders_always); > >> + clear_bit(order, &huge_anon_orders_inherit); > >> + set_bit(order, &huge_anon_orders_madvise); > >> + ret = 1; > >> + } else if (!strcmp(state, "never")) { > >> + clear_bit(order, &huge_anon_orders_always); > >> + clear_bit(order, &huge_anon_orders_inherit); > >> + clear_bit(order, &huge_anon_orders_madvise); > >> + ret = 1; > >> + } > >> + > >> + if (ret) > >> + anon_orders_configured = true; > > > > I mean: > > > > if (ret && order == PMD_ORDER) > > anon_pmd_order_configured = true; > > > >> +out: > >> + if (!ret) > >> + pr_warn("thp_anon=%s: cannot parse, ignored\n", str); > >> + return ret; > >> +} > >> +__setup("thp_anon=", setup_thp_anon); > >> + > >> pmd_t maybe_pmd_mkwrite(pmd_t pmd, struct vm_area_struct *vma) > >> { > >> if (likely(vma->vm_flags & VM_WRITE)) > >> -- > >> 2.43.0 > >> > > > > Thanks > > Barry > ^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: [PATCH v2] mm: Override mTHP "enabled" defaults at kernel cmdline 2024-08-09 8:31 ` Barry Song @ 2024-08-09 8:57 ` Ryan Roberts 2024-08-09 9:04 ` Barry Song 0 siblings, 1 reply; 17+ messages in thread From: Ryan Roberts @ 2024-08-09 8:57 UTC (permalink / raw) To: Barry Song Cc: Andrew Morton, Jonathan Corbet, David Hildenbrand, Lance Yang, Baolin Wang, linux-kernel, linux-mm On 09/08/2024 09:31, Barry Song wrote: > On Fri, Aug 9, 2024 at 3:58 PM Ryan Roberts <ryan.roberts@arm.com> wrote: >> >> On 08/08/2024 22:17, Barry Song wrote: >>> On Thu, Aug 8, 2024 at 10:17 PM Ryan Roberts <ryan.roberts@arm.com> wrote: >>>> >>>> Add thp_anon= cmdline parameter to allow specifying the default >>>> enablement of each supported anon THP size. The parameter accepts the >>>> following format and can be provided multiple times to configure each >>>> size: >>>> >>>> thp_anon=<size>[KMG]:<value> >>>> >>>> See Documentation/admin-guide/mm/transhuge.rst for more details. >>>> >>>> Configuring the defaults at boot time is useful to allow early user >>>> space to take advantage of mTHP before its been configured through >>>> sysfs. >>>> >>>> Signed-off-by: Ryan Roberts <ryan.roberts@arm.com> >>>> --- >>>> >>>> Hi All, >>>> >>>> I've split this off from my RFC at [1] because Barry highlighted that he would >>>> benefit from it immediately [2]. There are no changes vs the version in that >>>> series. >>>> >>>> It applies against today's mm-unstable (275d686abcb59). (although I had to fix a >>>> minor build bug in stackdepot.c due to MIN() not being defined in this tree). >>>> >>>> Thanks, >>>> Ryan >>>> >>>> >>>> .../admin-guide/kernel-parameters.txt | 8 +++ >>>> Documentation/admin-guide/mm/transhuge.rst | 26 +++++++-- >>>> mm/huge_memory.c | 55 ++++++++++++++++++- >>>> 3 files changed, 82 insertions(+), 7 deletions(-) >>>> >>>> diff --git a/Documentation/admin-guide/kernel-parameters.txt b/Documentation/admin-guide/kernel-parameters.txt >>>> index bcdee8984e1f0..5c79b58c108ec 100644 >>>> --- a/Documentation/admin-guide/kernel-parameters.txt >>>> +++ b/Documentation/admin-guide/kernel-parameters.txt >>>> @@ -6631,6 +6631,14 @@ >>>> <deci-seconds>: poll all this frequency >>>> 0: no polling (default) >>>> >>>> + thp_anon= [KNL] >>>> + Format: <size>[KMG]:always|madvise|never|inherit >>>> + Can be used to control the default behavior of the >>>> + system with respect to anonymous transparent hugepages. >>>> + Can be used multiple times for multiple anon THP sizes. >>>> + See Documentation/admin-guide/mm/transhuge.rst for more >>>> + details. >>>> + >>>> threadirqs [KNL,EARLY] >>>> Force threading of all interrupt handlers except those >>>> marked explicitly IRQF_NO_THREAD. >>>> diff --git a/Documentation/admin-guide/mm/transhuge.rst b/Documentation/admin-guide/mm/transhuge.rst >>>> index 24eec1c03ad88..f63b0717366c6 100644 >>>> --- a/Documentation/admin-guide/mm/transhuge.rst >>>> +++ b/Documentation/admin-guide/mm/transhuge.rst >>>> @@ -284,13 +284,27 @@ that THP is shared. Exceeding the number would block the collapse:: >>>> >>>> A higher value may increase memory footprint for some workloads. >>>> >>>> -Boot parameter >>>> -============== >>>> +Boot parameters >>>> +=============== >>>> >>>> -You can change the sysfs boot time defaults of Transparent Hugepage >>>> -Support by passing the parameter ``transparent_hugepage=always`` or >>>> -``transparent_hugepage=madvise`` or ``transparent_hugepage=never`` >>>> -to the kernel command line. >>>> +You can change the sysfs boot time default for the top-level "enabled" >>>> +control by passing the parameter ``transparent_hugepage=always`` or >>>> +``transparent_hugepage=madvise`` or ``transparent_hugepage=never`` to the >>>> +kernel command line. >>>> + >>>> +Alternatively, each supported anonymous THP size can be controlled by >>>> +passing ``thp_anon=<size>[KMG]:<state>``, where ``<size>`` is the THP size >>>> +and ``<state>`` is one of ``always``, ``madvise``, ``never`` or >>>> +``inherit``. >>>> + >>>> +For example, the following will set 64K THP to ``always``:: >>>> + >>>> + thp_anon=64K:always >>>> + >>>> +``thp_anon=`` may be specified multiple times to configure all THP sizes as >>>> +required. If ``thp_anon=`` is specified at least once, any anon THP sizes >>>> +not explicitly configured on the command line are implicitly set to >>>> +``never``. >>>> >>>> Hugepages in tmpfs/shmem >>>> ======================== >>>> diff --git a/mm/huge_memory.c b/mm/huge_memory.c >>>> index 0c3075ee00012..c2c0da1eb94e6 100644 >>>> --- a/mm/huge_memory.c >>>> +++ b/mm/huge_memory.c >>>> @@ -82,6 +82,7 @@ unsigned long huge_zero_pfn __read_mostly = ~0UL; >>>> unsigned long huge_anon_orders_always __read_mostly; >>>> unsigned long huge_anon_orders_madvise __read_mostly; >>>> unsigned long huge_anon_orders_inherit __read_mostly; >>>> +static bool anon_orders_configured; >>>> >>>> unsigned long __thp_vma_allowable_orders(struct vm_area_struct *vma, >>>> unsigned long vm_flags, >>>> @@ -672,7 +673,10 @@ static int __init hugepage_init_sysfs(struct kobject **hugepage_kobj) >>>> * disable all other sizes. powerpc's PMD_ORDER isn't a compile-time >>>> * constant so we have to do this here. >>>> */ >>>> - huge_anon_orders_inherit = BIT(PMD_ORDER); >>>> + if (!anon_orders_configured) { >>>> + huge_anon_orders_inherit = BIT(PMD_ORDER); >>>> + anon_orders_configured = true; >>>> + } >>> >>> If a user configures 64KB and doesn't adjust anything for PMD_ORDER, >>> then PMD_ORDER will be set to "never", correct? This seems to change >>> the default behavior of PMD_ORDER. Could we instead achieve this by >>> checking if PMD_ORDER has been explicitly configured? >> >> Yes, that's how it's implemented in this patch, and the accompanying docs also >> state: >> >> If ``thp_anon=`` is specified at least once, any anon THP sizes >> not explicitly configured on the command line are implicitly set to >> ``never``. >> >> My initial approach did exactly as you suggest. But in the original series, I >> also had a similar patch to configure file thp with "thp_file=". And for file, >> all of the orders default to `always`. So if taking the same approach with that >> control, the user would have to explicitly opt-out of all supported orders >> rather than just opt-in to the orders they want. And I thought that could get >> tricky in future if support is added for more orders. I felt that was >> potentially very confusing so decided it was clearer to have the above rule and >> make both controls consistent. >> >> What do you think? > > If this is the intention, once the user sets the command line, they should > realize that the default settings have been overridden. I am perfectly fine > with this strategy. OK. I can see it from both sides to be honest. Let's see if anyone else has issue with this approach. > > with the below cmdline: > thp_anon=64K:always thp_anon=8K:inherit thp_anon=32K:madvise > thp_anon=1M:inherit thp_anon=2M:always > > I am getting: > / # cat /sys/kernel/mm/transparent_hugepage/hugepages-64kB/enabled > [always] inherit madvise never > / # cat /sys/kernel/mm/transparent_hugepage/hugepages-32kB/enabled > always inherit [madvise] never > / # cat /sys/kernel/mm/transparent_hugepage/hugepages-1024kB/enabled > always [inherit] madvise never > / # cat /sys/kernel/mm/transparent_hugepage/hugepages-2048kB/enabled > [always] inherit madvise never And you should also be seeing a warning in the boot log that thp_anon=8K:inherit is unrecognised (since anon doesn't support order-1). > > Thus, > > Tested-by: Barry Song <baohua@kernel.org> Thanks! > >> >> >>> >>>> >>>> *hugepage_kobj = kobject_create_and_add("transparent_hugepage", mm_kobj); >>>> if (unlikely(!*hugepage_kobj)) { >>>> @@ -857,6 +861,55 @@ static int __init setup_transparent_hugepage(char *str) >>>> } >>>> __setup("transparent_hugepage=", setup_transparent_hugepage); >>>> >>>> +static int __init setup_thp_anon(char *str) >>>> +{ >>>> + unsigned long size; >>>> + char *state; >>>> + int order; >>>> + int ret = 0; >>>> + >>>> + if (!str) >>>> + goto out; >>>> + >>>> + size = (unsigned long)memparse(str, &state); >>>> + order = ilog2(size >> PAGE_SHIFT); >>>> + if (*state != ':' || !is_power_of_2(size) || size <= PAGE_SIZE || >>>> + !(BIT(order) & THP_ORDERS_ALL_ANON)) >>>> + goto out; >>>> + >>>> + state++; >>>> + >>>> + if (!strcmp(state, "always")) { >>>> + clear_bit(order, &huge_anon_orders_inherit); >>>> + clear_bit(order, &huge_anon_orders_madvise); >>>> + set_bit(order, &huge_anon_orders_always); >>>> + ret = 1; >>>> + } else if (!strcmp(state, "inherit")) { >>>> + clear_bit(order, &huge_anon_orders_always); >>>> + clear_bit(order, &huge_anon_orders_madvise); >>>> + set_bit(order, &huge_anon_orders_inherit); >>>> + ret = 1; >>>> + } else if (!strcmp(state, "madvise")) { >>>> + clear_bit(order, &huge_anon_orders_always); >>>> + clear_bit(order, &huge_anon_orders_inherit); >>>> + set_bit(order, &huge_anon_orders_madvise); >>>> + ret = 1; >>>> + } else if (!strcmp(state, "never")) { >>>> + clear_bit(order, &huge_anon_orders_always); >>>> + clear_bit(order, &huge_anon_orders_inherit); >>>> + clear_bit(order, &huge_anon_orders_madvise); >>>> + ret = 1; >>>> + } >>>> + >>>> + if (ret) >>>> + anon_orders_configured = true; >>> >>> I mean: >>> >>> if (ret && order == PMD_ORDER) >>> anon_pmd_order_configured = true; >>> >>>> +out: >>>> + if (!ret) >>>> + pr_warn("thp_anon=%s: cannot parse, ignored\n", str); >>>> + return ret; >>>> +} >>>> +__setup("thp_anon=", setup_thp_anon); >>>> + >>>> pmd_t maybe_pmd_mkwrite(pmd_t pmd, struct vm_area_struct *vma) >>>> { >>>> if (likely(vma->vm_flags & VM_WRITE)) >>>> -- >>>> 2.43.0 >>>> >>> >>> Thanks >>> Barry >> ^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: [PATCH v2] mm: Override mTHP "enabled" defaults at kernel cmdline 2024-08-09 8:57 ` Ryan Roberts @ 2024-08-09 9:04 ` Barry Song 0 siblings, 0 replies; 17+ messages in thread From: Barry Song @ 2024-08-09 9:04 UTC (permalink / raw) To: Ryan Roberts Cc: Andrew Morton, Jonathan Corbet, David Hildenbrand, Lance Yang, Baolin Wang, linux-kernel, linux-mm On Fri, Aug 9, 2024 at 4:57 PM Ryan Roberts <ryan.roberts@arm.com> wrote: > > On 09/08/2024 09:31, Barry Song wrote: > > On Fri, Aug 9, 2024 at 3:58 PM Ryan Roberts <ryan.roberts@arm.com> wrote: > >> > >> On 08/08/2024 22:17, Barry Song wrote: > >>> On Thu, Aug 8, 2024 at 10:17 PM Ryan Roberts <ryan.roberts@arm.com> wrote: > >>>> > >>>> Add thp_anon= cmdline parameter to allow specifying the default > >>>> enablement of each supported anon THP size. The parameter accepts the > >>>> following format and can be provided multiple times to configure each > >>>> size: > >>>> > >>>> thp_anon=<size>[KMG]:<value> > >>>> > >>>> See Documentation/admin-guide/mm/transhuge.rst for more details. > >>>> > >>>> Configuring the defaults at boot time is useful to allow early user > >>>> space to take advantage of mTHP before its been configured through > >>>> sysfs. > >>>> > >>>> Signed-off-by: Ryan Roberts <ryan.roberts@arm.com> > >>>> --- > >>>> > >>>> Hi All, > >>>> > >>>> I've split this off from my RFC at [1] because Barry highlighted that he would > >>>> benefit from it immediately [2]. There are no changes vs the version in that > >>>> series. > >>>> > >>>> It applies against today's mm-unstable (275d686abcb59). (although I had to fix a > >>>> minor build bug in stackdepot.c due to MIN() not being defined in this tree). > >>>> > >>>> Thanks, > >>>> Ryan > >>>> > >>>> > >>>> .../admin-guide/kernel-parameters.txt | 8 +++ > >>>> Documentation/admin-guide/mm/transhuge.rst | 26 +++++++-- > >>>> mm/huge_memory.c | 55 ++++++++++++++++++- > >>>> 3 files changed, 82 insertions(+), 7 deletions(-) > >>>> > >>>> diff --git a/Documentation/admin-guide/kernel-parameters.txt b/Documentation/admin-guide/kernel-parameters.txt > >>>> index bcdee8984e1f0..5c79b58c108ec 100644 > >>>> --- a/Documentation/admin-guide/kernel-parameters.txt > >>>> +++ b/Documentation/admin-guide/kernel-parameters.txt > >>>> @@ -6631,6 +6631,14 @@ > >>>> <deci-seconds>: poll all this frequency > >>>> 0: no polling (default) > >>>> > >>>> + thp_anon= [KNL] > >>>> + Format: <size>[KMG]:always|madvise|never|inherit > >>>> + Can be used to control the default behavior of the > >>>> + system with respect to anonymous transparent hugepages. > >>>> + Can be used multiple times for multiple anon THP sizes. > >>>> + See Documentation/admin-guide/mm/transhuge.rst for more > >>>> + details. > >>>> + > >>>> threadirqs [KNL,EARLY] > >>>> Force threading of all interrupt handlers except those > >>>> marked explicitly IRQF_NO_THREAD. > >>>> diff --git a/Documentation/admin-guide/mm/transhuge.rst b/Documentation/admin-guide/mm/transhuge.rst > >>>> index 24eec1c03ad88..f63b0717366c6 100644 > >>>> --- a/Documentation/admin-guide/mm/transhuge.rst > >>>> +++ b/Documentation/admin-guide/mm/transhuge.rst > >>>> @@ -284,13 +284,27 @@ that THP is shared. Exceeding the number would block the collapse:: > >>>> > >>>> A higher value may increase memory footprint for some workloads. > >>>> > >>>> -Boot parameter > >>>> -============== > >>>> +Boot parameters > >>>> +=============== > >>>> > >>>> -You can change the sysfs boot time defaults of Transparent Hugepage > >>>> -Support by passing the parameter ``transparent_hugepage=always`` or > >>>> -``transparent_hugepage=madvise`` or ``transparent_hugepage=never`` > >>>> -to the kernel command line. > >>>> +You can change the sysfs boot time default for the top-level "enabled" > >>>> +control by passing the parameter ``transparent_hugepage=always`` or > >>>> +``transparent_hugepage=madvise`` or ``transparent_hugepage=never`` to the > >>>> +kernel command line. > >>>> + > >>>> +Alternatively, each supported anonymous THP size can be controlled by > >>>> +passing ``thp_anon=<size>[KMG]:<state>``, where ``<size>`` is the THP size > >>>> +and ``<state>`` is one of ``always``, ``madvise``, ``never`` or > >>>> +``inherit``. > >>>> + > >>>> +For example, the following will set 64K THP to ``always``:: > >>>> + > >>>> + thp_anon=64K:always > >>>> + > >>>> +``thp_anon=`` may be specified multiple times to configure all THP sizes as > >>>> +required. If ``thp_anon=`` is specified at least once, any anon THP sizes > >>>> +not explicitly configured on the command line are implicitly set to > >>>> +``never``. > >>>> > >>>> Hugepages in tmpfs/shmem > >>>> ======================== > >>>> diff --git a/mm/huge_memory.c b/mm/huge_memory.c > >>>> index 0c3075ee00012..c2c0da1eb94e6 100644 > >>>> --- a/mm/huge_memory.c > >>>> +++ b/mm/huge_memory.c > >>>> @@ -82,6 +82,7 @@ unsigned long huge_zero_pfn __read_mostly = ~0UL; > >>>> unsigned long huge_anon_orders_always __read_mostly; > >>>> unsigned long huge_anon_orders_madvise __read_mostly; > >>>> unsigned long huge_anon_orders_inherit __read_mostly; > >>>> +static bool anon_orders_configured; > >>>> > >>>> unsigned long __thp_vma_allowable_orders(struct vm_area_struct *vma, > >>>> unsigned long vm_flags, > >>>> @@ -672,7 +673,10 @@ static int __init hugepage_init_sysfs(struct kobject **hugepage_kobj) > >>>> * disable all other sizes. powerpc's PMD_ORDER isn't a compile-time > >>>> * constant so we have to do this here. > >>>> */ > >>>> - huge_anon_orders_inherit = BIT(PMD_ORDER); > >>>> + if (!anon_orders_configured) { > >>>> + huge_anon_orders_inherit = BIT(PMD_ORDER); > >>>> + anon_orders_configured = true; > >>>> + } > >>> > >>> If a user configures 64KB and doesn't adjust anything for PMD_ORDER, > >>> then PMD_ORDER will be set to "never", correct? This seems to change > >>> the default behavior of PMD_ORDER. Could we instead achieve this by > >>> checking if PMD_ORDER has been explicitly configured? > >> > >> Yes, that's how it's implemented in this patch, and the accompanying docs also > >> state: > >> > >> If ``thp_anon=`` is specified at least once, any anon THP sizes > >> not explicitly configured on the command line are implicitly set to > >> ``never``. > >> > >> My initial approach did exactly as you suggest. But in the original series, I > >> also had a similar patch to configure file thp with "thp_file=". And for file, > >> all of the orders default to `always`. So if taking the same approach with that > >> control, the user would have to explicitly opt-out of all supported orders > >> rather than just opt-in to the orders they want. And I thought that could get > >> tricky in future if support is added for more orders. I felt that was > >> potentially very confusing so decided it was clearer to have the above rule and > >> make both controls consistent. > >> > >> What do you think? > > > > If this is the intention, once the user sets the command line, they should > > realize that the default settings have been overridden. I am perfectly fine > > with this strategy. > > OK. I can see it from both sides to be honest. Let's see if anyone else has > issue with this approach. > > > > > with the below cmdline: > > thp_anon=64K:always thp_anon=8K:inherit thp_anon=32K:madvise > > thp_anon=1M:inherit thp_anon=2M:always > > > > I am getting: > > / # cat /sys/kernel/mm/transparent_hugepage/hugepages-64kB/enabled > > [always] inherit madvise never > > / # cat /sys/kernel/mm/transparent_hugepage/hugepages-32kB/enabled > > always inherit [madvise] never > > / # cat /sys/kernel/mm/transparent_hugepage/hugepages-1024kB/enabled > > always [inherit] madvise never > > / # cat /sys/kernel/mm/transparent_hugepage/hugepages-2048kB/enabled > > [always] inherit madvise never > > And you should also be seeing a warning in the boot log that thp_anon=8K:inherit > is unrecognised (since anon doesn't support order-1). Indeed. I am even seeing "Unknown kernel command line parameters "thp_anon=8K:inherit", will be passed to user space." [ 0.000000] huge_memory: thp_anon=8K:inherit: cannot parse, ignored [ 0.000000] Unknown kernel command line parameters "thp_anon=8K:inherit", will be passed to user space. [ 0.000000] random: crng init done [ 0.000000] Dentry cache hash table entries: 65536 (order: 7, 524288 bytes, linear) [ 0.000000] Inode-cache hash table entries: 32768 (order: 6, 262144 bytes, linear) [ 0.000000] Fallback order for Node 0: 0 [ 0.000000] Built 1 zonelists, mobility grouping on. Total pages: 131072 [ 0.000000] Policy zone: DMA > > > > > Thus, > > > > Tested-by: Barry Song <baohua@kernel.org> > > Thanks! > > > > >> > >> > >>> > >>>> > >>>> *hugepage_kobj = kobject_create_and_add("transparent_hugepage", mm_kobj); > >>>> if (unlikely(!*hugepage_kobj)) { > >>>> @@ -857,6 +861,55 @@ static int __init setup_transparent_hugepage(char *str) > >>>> } > >>>> __setup("transparent_hugepage=", setup_transparent_hugepage); > >>>> > >>>> +static int __init setup_thp_anon(char *str) > >>>> +{ > >>>> + unsigned long size; > >>>> + char *state; > >>>> + int order; > >>>> + int ret = 0; > >>>> + > >>>> + if (!str) > >>>> + goto out; > >>>> + > >>>> + size = (unsigned long)memparse(str, &state); > >>>> + order = ilog2(size >> PAGE_SHIFT); > >>>> + if (*state != ':' || !is_power_of_2(size) || size <= PAGE_SIZE || > >>>> + !(BIT(order) & THP_ORDERS_ALL_ANON)) > >>>> + goto out; > >>>> + > >>>> + state++; > >>>> + > >>>> + if (!strcmp(state, "always")) { > >>>> + clear_bit(order, &huge_anon_orders_inherit); > >>>> + clear_bit(order, &huge_anon_orders_madvise); > >>>> + set_bit(order, &huge_anon_orders_always); > >>>> + ret = 1; > >>>> + } else if (!strcmp(state, "inherit")) { > >>>> + clear_bit(order, &huge_anon_orders_always); > >>>> + clear_bit(order, &huge_anon_orders_madvise); > >>>> + set_bit(order, &huge_anon_orders_inherit); > >>>> + ret = 1; > >>>> + } else if (!strcmp(state, "madvise")) { > >>>> + clear_bit(order, &huge_anon_orders_always); > >>>> + clear_bit(order, &huge_anon_orders_inherit); > >>>> + set_bit(order, &huge_anon_orders_madvise); > >>>> + ret = 1; > >>>> + } else if (!strcmp(state, "never")) { > >>>> + clear_bit(order, &huge_anon_orders_always); > >>>> + clear_bit(order, &huge_anon_orders_inherit); > >>>> + clear_bit(order, &huge_anon_orders_madvise); > >>>> + ret = 1; > >>>> + } > >>>> + > >>>> + if (ret) > >>>> + anon_orders_configured = true; > >>> > >>> I mean: > >>> > >>> if (ret && order == PMD_ORDER) > >>> anon_pmd_order_configured = true; > >>> > >>>> +out: > >>>> + if (!ret) > >>>> + pr_warn("thp_anon=%s: cannot parse, ignored\n", str); > >>>> + return ret; > >>>> +} > >>>> +__setup("thp_anon=", setup_thp_anon); > >>>> + > >>>> pmd_t maybe_pmd_mkwrite(pmd_t pmd, struct vm_area_struct *vma) > >>>> { > >>>> if (likely(vma->vm_flags & VM_WRITE)) > >>>> -- > >>>> 2.43.0 > >>>> > >>> > >>> Thanks > >>> Barry > >> > > ^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: [PATCH v2] mm: Override mTHP "enabled" defaults at kernel cmdline [not found] <20240808101700.571701-1-ryan.roberts@arm.com> 2024-08-08 21:17 ` [PATCH v2] mm: Override mTHP "enabled" defaults at kernel cmdline Barry Song @ 2024-08-09 8:49 ` Baolin Wang 2024-08-09 8:52 ` Barry Song 2024-08-09 9:19 ` David Hildenbrand 3 siblings, 0 replies; 17+ messages in thread From: Baolin Wang @ 2024-08-09 8:49 UTC (permalink / raw) To: Ryan Roberts, Andrew Morton, Jonathan Corbet, David Hildenbrand, Barry Song, Lance Yang Cc: linux-kernel, linux-mm On 2024/8/8 18:16, Ryan Roberts wrote: > Add thp_anon= cmdline parameter to allow specifying the default > enablement of each supported anon THP size. The parameter accepts the > following format and can be provided multiple times to configure each > size: > > thp_anon=<size>[KMG]:<value> > > See Documentation/admin-guide/mm/transhuge.rst for more details. > > Configuring the defaults at boot time is useful to allow early user > space to take advantage of mTHP before its been configured through > sysfs. > > Signed-off-by: Ryan Roberts <ryan.roberts@arm.com> Make sense to me. Thanks. Reviewed-by: Baolin Wang <baolin.wang@linux.alibaba.com> ^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: [PATCH v2] mm: Override mTHP "enabled" defaults at kernel cmdline [not found] <20240808101700.571701-1-ryan.roberts@arm.com> 2024-08-08 21:17 ` [PATCH v2] mm: Override mTHP "enabled" defaults at kernel cmdline Barry Song 2024-08-09 8:49 ` Baolin Wang @ 2024-08-09 8:52 ` Barry Song 2024-08-09 9:19 ` David Hildenbrand 3 siblings, 0 replies; 17+ messages in thread From: Barry Song @ 2024-08-09 8:52 UTC (permalink / raw) To: Ryan Roberts Cc: Andrew Morton, Jonathan Corbet, David Hildenbrand, Lance Yang, Baolin Wang, linux-kernel, linux-mm On Thu, Aug 8, 2024 at 6:17 PM Ryan Roberts <ryan.roberts@arm.com> wrote: > > Add thp_anon= cmdline parameter to allow specifying the default > enablement of each supported anon THP size. The parameter accepts the > following format and can be provided multiple times to configure each > size: > > thp_anon=<size>[KMG]:<value> > > See Documentation/admin-guide/mm/transhuge.rst for more details. > > Configuring the defaults at boot time is useful to allow early user > space to take advantage of mTHP before its been configured through > sysfs. > > Signed-off-by: Ryan Roberts <ryan.roberts@arm.com> Tested-by: Barry Song <baohua@kernel.org> Reviewed-by: Barry Song <baohua@kernel.org> Thanks Barry ^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: [PATCH v2] mm: Override mTHP "enabled" defaults at kernel cmdline [not found] <20240808101700.571701-1-ryan.roberts@arm.com> ` (2 preceding siblings ...) 2024-08-09 8:52 ` Barry Song @ 2024-08-09 9:19 ` David Hildenbrand 2024-08-09 9:24 ` Barry Song 3 siblings, 1 reply; 17+ messages in thread From: David Hildenbrand @ 2024-08-09 9:19 UTC (permalink / raw) To: Ryan Roberts, Andrew Morton, Jonathan Corbet, Barry Song, Lance Yang, Baolin Wang Cc: linux-kernel, linux-mm On 08.08.24 12:16, Ryan Roberts wrote: > Add thp_anon= cmdline parameter to allow specifying the default > enablement of each supported anon THP size. The parameter accepts the > following format and can be provided multiple times to configure each > size: > > thp_anon=<size>[KMG]:<value> > > See Documentation/admin-guide/mm/transhuge.rst for more details. > > Configuring the defaults at boot time is useful to allow early user > space to take advantage of mTHP before its been configured through > sysfs. I suspect a khugeapged enhancement and/or kernel-config-dependant defaults and/or early system settings will also be able to mitigate that without getting kernel cmdlines involved in the future. > > Signed-off-by: Ryan Roberts <ryan.roberts@arm.com> > --- > > Hi All, > > I've split this off from my RFC at [1] because Barry highlighted that he would > benefit from it immediately [2]. There are no changes vs the version in that > series. > > It applies against today's mm-unstable (275d686abcb59). (although I had to fix a > minor build bug in stackdepot.c due to MIN() not being defined in this tree). > > Thanks, > Ryan > > > .../admin-guide/kernel-parameters.txt | 8 +++ > Documentation/admin-guide/mm/transhuge.rst | 26 +++++++-- > mm/huge_memory.c | 55 ++++++++++++++++++- > 3 files changed, 82 insertions(+), 7 deletions(-) > > diff --git a/Documentation/admin-guide/kernel-parameters.txt b/Documentation/admin-guide/kernel-parameters.txt > index bcdee8984e1f0..5c79b58c108ec 100644 > --- a/Documentation/admin-guide/kernel-parameters.txt > +++ b/Documentation/admin-guide/kernel-parameters.txt > @@ -6631,6 +6631,14 @@ > <deci-seconds>: poll all this frequency > 0: no polling (default) > > + thp_anon= [KNL] > + Format: <size>[KMG]:always|madvise|never|inherit > + Can be used to control the default behavior of the > + system with respect to anonymous transparent hugepages. > + Can be used multiple times for multiple anon THP sizes. > + See Documentation/admin-guide/mm/transhuge.rst for more > + details. > + > threadirqs [KNL,EARLY] > Force threading of all interrupt handlers except those > marked explicitly IRQF_NO_THREAD. > diff --git a/Documentation/admin-guide/mm/transhuge.rst b/Documentation/admin-guide/mm/transhuge.rst > index 24eec1c03ad88..f63b0717366c6 100644 > --- a/Documentation/admin-guide/mm/transhuge.rst > +++ b/Documentation/admin-guide/mm/transhuge.rst > @@ -284,13 +284,27 @@ that THP is shared. Exceeding the number would block the collapse:: > > A higher value may increase memory footprint for some workloads. > > -Boot parameter > -============== > +Boot parameters > +=============== > > -You can change the sysfs boot time defaults of Transparent Hugepage > -Support by passing the parameter ``transparent_hugepage=always`` or > -``transparent_hugepage=madvise`` or ``transparent_hugepage=never`` > -to the kernel command line. > +You can change the sysfs boot time default for the top-level "enabled" > +control by passing the parameter ``transparent_hugepage=always`` or > +``transparent_hugepage=madvise`` or ``transparent_hugepage=never`` to the > +kernel command line. > + > +Alternatively, each supported anonymous THP size can be controlled by > +passing ``thp_anon=<size>[KMG]:<state>``, where ``<size>`` is the THP size > +and ``<state>`` is one of ``always``, ``madvise``, ``never`` or > +``inherit``. > + > +For example, the following will set 64K THP to ``always``:: > + > + thp_anon=64K:always > + > +``thp_anon=`` may be specified multiple times to configure all THP sizes as > +required. If ``thp_anon=`` is specified at least once, any anon THP sizes > +not explicitly configured on the command line are implicitly set to > +``never``. I suggest documenting that "thp_anon=" will not effect the value of "transparent_hugepage=", or any configured default. Wondering if a syntax like thp_anon=16K,32K,64K:always;1048K,2048K:madvise (one could also support ranges, like "16K-64K") Would be even better. Then, maybe only allow a single instance. Maybe consider it if it's not too crazy to parse ;) -- Cheers, David / dhildenb ^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: [PATCH v2] mm: Override mTHP "enabled" defaults at kernel cmdline 2024-08-09 9:19 ` David Hildenbrand @ 2024-08-09 9:24 ` Barry Song 2024-08-09 9:32 ` David Hildenbrand 0 siblings, 1 reply; 17+ messages in thread From: Barry Song @ 2024-08-09 9:24 UTC (permalink / raw) To: David Hildenbrand Cc: Ryan Roberts, Andrew Morton, Jonathan Corbet, Lance Yang, Baolin Wang, linux-kernel, linux-mm On Fri, Aug 9, 2024 at 5:19 PM David Hildenbrand <david@redhat.com> wrote: > > On 08.08.24 12:16, Ryan Roberts wrote: > > Add thp_anon= cmdline parameter to allow specifying the default > > enablement of each supported anon THP size. The parameter accepts the > > following format and can be provided multiple times to configure each > > size: > > > > thp_anon=<size>[KMG]:<value> > > > > See Documentation/admin-guide/mm/transhuge.rst for more details. > > > > Configuring the defaults at boot time is useful to allow early user > > space to take advantage of mTHP before its been configured through > > sysfs. > > I suspect a khugeapged enhancement and/or kernel-config-dependant > defaults and/or early system settings will also be able to mitigate that > without getting kernel cmdlines involved in the future. > > > > > Signed-off-by: Ryan Roberts <ryan.roberts@arm.com> > > --- > > > > Hi All, > > > > I've split this off from my RFC at [1] because Barry highlighted that he would > > benefit from it immediately [2]. There are no changes vs the version in that > > series. > > > > It applies against today's mm-unstable (275d686abcb59). (although I had to fix a > > minor build bug in stackdepot.c due to MIN() not being defined in this tree). > > > > Thanks, > > Ryan > > > > > > .../admin-guide/kernel-parameters.txt | 8 +++ > > Documentation/admin-guide/mm/transhuge.rst | 26 +++++++-- > > mm/huge_memory.c | 55 ++++++++++++++++++- > > 3 files changed, 82 insertions(+), 7 deletions(-) > > > > diff --git a/Documentation/admin-guide/kernel-parameters.txt b/Documentation/admin-guide/kernel-parameters.txt > > index bcdee8984e1f0..5c79b58c108ec 100644 > > --- a/Documentation/admin-guide/kernel-parameters.txt > > +++ b/Documentation/admin-guide/kernel-parameters.txt > > @@ -6631,6 +6631,14 @@ > > <deci-seconds>: poll all this frequency > > 0: no polling (default) > > > > + thp_anon= [KNL] > > + Format: <size>[KMG]:always|madvise|never|inherit > > + Can be used to control the default behavior of the > > + system with respect to anonymous transparent hugepages. > > + Can be used multiple times for multiple anon THP sizes. > > + See Documentation/admin-guide/mm/transhuge.rst for more > > + details. > > + > > threadirqs [KNL,EARLY] > > Force threading of all interrupt handlers except those > > marked explicitly IRQF_NO_THREAD. > > diff --git a/Documentation/admin-guide/mm/transhuge.rst b/Documentation/admin-guide/mm/transhuge.rst > > index 24eec1c03ad88..f63b0717366c6 100644 > > --- a/Documentation/admin-guide/mm/transhuge.rst > > +++ b/Documentation/admin-guide/mm/transhuge.rst > > @@ -284,13 +284,27 @@ that THP is shared. Exceeding the number would block the collapse:: > > > > A higher value may increase memory footprint for some workloads. > > > > -Boot parameter > > -============== > > +Boot parameters > > +=============== > > > > -You can change the sysfs boot time defaults of Transparent Hugepage > > -Support by passing the parameter ``transparent_hugepage=always`` or > > -``transparent_hugepage=madvise`` or ``transparent_hugepage=never`` > > -to the kernel command line. > > +You can change the sysfs boot time default for the top-level "enabled" > > +control by passing the parameter ``transparent_hugepage=always`` or > > +``transparent_hugepage=madvise`` or ``transparent_hugepage=never`` to the > > +kernel command line. > > + > > +Alternatively, each supported anonymous THP size can be controlled by > > +passing ``thp_anon=<size>[KMG]:<state>``, where ``<size>`` is the THP size > > +and ``<state>`` is one of ``always``, ``madvise``, ``never`` or > > +``inherit``. > > + > > +For example, the following will set 64K THP to ``always``:: > > + > > + thp_anon=64K:always > > + > > +``thp_anon=`` may be specified multiple times to configure all THP sizes as > > +required. If ``thp_anon=`` is specified at least once, any anon THP sizes > > +not explicitly configured on the command line are implicitly set to > > +``never``. > > I suggest documenting that "thp_anon=" will not effect the value of > "transparent_hugepage=", or any configured default. > > Wondering if a syntax like > > thp_anon=16K,32K,64K:always;1048K,2048K:madvise > > (one could also support ranges, like "16K-64K") > > Would be even better. Then, maybe only allow a single instance. > > Maybe consider it if it's not too crazy to parse ;) I prefer the current approach because it effectively filters cases like this. [ 0.000000] huge_memory: thp_anon=8K:inherit: cannot parse, ignored [ 0.000000] Unknown kernel command line parameters "thp_anon=8K:inherit", will be passed to user space. if we put multiple sizes together, 8K,32K,64K:always We can't determine whether this command line is legal or illegal as it is partially legal and partially illegal. Just my two cents :-) > > -- > Cheers, > > David / dhildenb Thanks Barry ^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: [PATCH v2] mm: Override mTHP "enabled" defaults at kernel cmdline 2024-08-09 9:24 ` Barry Song @ 2024-08-09 9:32 ` David Hildenbrand 2024-08-09 10:37 ` Ryan Roberts 0 siblings, 1 reply; 17+ messages in thread From: David Hildenbrand @ 2024-08-09 9:32 UTC (permalink / raw) To: Barry Song Cc: Ryan Roberts, Andrew Morton, Jonathan Corbet, Lance Yang, Baolin Wang, linux-kernel, linux-mm On 09.08.24 11:24, Barry Song wrote: > On Fri, Aug 9, 2024 at 5:19 PM David Hildenbrand <david@redhat.com> wrote: >> >> On 08.08.24 12:16, Ryan Roberts wrote: >>> Add thp_anon= cmdline parameter to allow specifying the default >>> enablement of each supported anon THP size. The parameter accepts the >>> following format and can be provided multiple times to configure each >>> size: >>> >>> thp_anon=<size>[KMG]:<value> >>> >>> See Documentation/admin-guide/mm/transhuge.rst for more details. >>> >>> Configuring the defaults at boot time is useful to allow early user >>> space to take advantage of mTHP before its been configured through >>> sysfs. >> >> I suspect a khugeapged enhancement and/or kernel-config-dependant >> defaults and/or early system settings will also be able to mitigate that >> without getting kernel cmdlines involved in the future. >> >>> >>> Signed-off-by: Ryan Roberts <ryan.roberts@arm.com> >>> --- >>> >>> Hi All, >>> >>> I've split this off from my RFC at [1] because Barry highlighted that he would >>> benefit from it immediately [2]. There are no changes vs the version in that >>> series. >>> >>> It applies against today's mm-unstable (275d686abcb59). (although I had to fix a >>> minor build bug in stackdepot.c due to MIN() not being defined in this tree). >>> >>> Thanks, >>> Ryan >>> >>> >>> .../admin-guide/kernel-parameters.txt | 8 +++ >>> Documentation/admin-guide/mm/transhuge.rst | 26 +++++++-- >>> mm/huge_memory.c | 55 ++++++++++++++++++- >>> 3 files changed, 82 insertions(+), 7 deletions(-) >>> >>> diff --git a/Documentation/admin-guide/kernel-parameters.txt b/Documentation/admin-guide/kernel-parameters.txt >>> index bcdee8984e1f0..5c79b58c108ec 100644 >>> --- a/Documentation/admin-guide/kernel-parameters.txt >>> +++ b/Documentation/admin-guide/kernel-parameters.txt >>> @@ -6631,6 +6631,14 @@ >>> <deci-seconds>: poll all this frequency >>> 0: no polling (default) >>> >>> + thp_anon= [KNL] >>> + Format: <size>[KMG]:always|madvise|never|inherit >>> + Can be used to control the default behavior of the >>> + system with respect to anonymous transparent hugepages. >>> + Can be used multiple times for multiple anon THP sizes. >>> + See Documentation/admin-guide/mm/transhuge.rst for more >>> + details. >>> + >>> threadirqs [KNL,EARLY] >>> Force threading of all interrupt handlers except those >>> marked explicitly IRQF_NO_THREAD. >>> diff --git a/Documentation/admin-guide/mm/transhuge.rst b/Documentation/admin-guide/mm/transhuge.rst >>> index 24eec1c03ad88..f63b0717366c6 100644 >>> --- a/Documentation/admin-guide/mm/transhuge.rst >>> +++ b/Documentation/admin-guide/mm/transhuge.rst >>> @@ -284,13 +284,27 @@ that THP is shared. Exceeding the number would block the collapse:: >>> >>> A higher value may increase memory footprint for some workloads. >>> >>> -Boot parameter >>> -============== >>> +Boot parameters >>> +=============== >>> >>> -You can change the sysfs boot time defaults of Transparent Hugepage >>> -Support by passing the parameter ``transparent_hugepage=always`` or >>> -``transparent_hugepage=madvise`` or ``transparent_hugepage=never`` >>> -to the kernel command line. >>> +You can change the sysfs boot time default for the top-level "enabled" >>> +control by passing the parameter ``transparent_hugepage=always`` or >>> +``transparent_hugepage=madvise`` or ``transparent_hugepage=never`` to the >>> +kernel command line. >>> + >>> +Alternatively, each supported anonymous THP size can be controlled by >>> +passing ``thp_anon=<size>[KMG]:<state>``, where ``<size>`` is the THP size >>> +and ``<state>`` is one of ``always``, ``madvise``, ``never`` or >>> +``inherit``. >>> + >>> +For example, the following will set 64K THP to ``always``:: >>> + >>> + thp_anon=64K:always >>> + >>> +``thp_anon=`` may be specified multiple times to configure all THP sizes as >>> +required. If ``thp_anon=`` is specified at least once, any anon THP sizes >>> +not explicitly configured on the command line are implicitly set to >>> +``never``. >> >> I suggest documenting that "thp_anon=" will not effect the value of >> "transparent_hugepage=", or any configured default. >> >> Wondering if a syntax like >> >> thp_anon=16K,32K,64K:always;1048K,2048K:madvise >> >> (one could also support ranges, like "16K-64K") >> >> Would be even better. Then, maybe only allow a single instance. >> >> Maybe consider it if it's not too crazy to parse ;) > > I prefer the current approach because it effectively filters cases like this. > > [ 0.000000] huge_memory: thp_anon=8K:inherit: cannot parse, ignored > [ 0.000000] Unknown kernel command line parameters > "thp_anon=8K:inherit", will be passed to user space. > > if we put multiple sizes together, 8K,32K,64K:always > > We can't determine whether this command line is legal or illegal as it > is partially legal and partially illegal. Besides: I wouldn't bother about this "user does something stupid" scenario that much. But yes, once we support more sizes a cmdline might turn invalid on an older kernel. However, I don't see the problem here. User passed a non-existant size. Ignore that one but handle the others, like you would with multiple commands? It can be well defined and documented. The command line is legal, just one size does not exist. The world will continue turning :) -- Cheers, David / dhildenb ^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: [PATCH v2] mm: Override mTHP "enabled" defaults at kernel cmdline 2024-08-09 9:32 ` David Hildenbrand @ 2024-08-09 10:37 ` Ryan Roberts 2024-08-09 11:37 ` Barry Song 2024-08-09 13:15 ` David Hildenbrand 0 siblings, 2 replies; 17+ messages in thread From: Ryan Roberts @ 2024-08-09 10:37 UTC (permalink / raw) To: David Hildenbrand, Barry Song Cc: Andrew Morton, Jonathan Corbet, Lance Yang, Baolin Wang, linux-kernel, linux-mm On 09/08/2024 10:32, David Hildenbrand wrote: > On 09.08.24 11:24, Barry Song wrote: >> On Fri, Aug 9, 2024 at 5:19 PM David Hildenbrand <david@redhat.com> wrote: >>> >>> On 08.08.24 12:16, Ryan Roberts wrote: >>>> Add thp_anon= cmdline parameter to allow specifying the default >>>> enablement of each supported anon THP size. The parameter accepts the >>>> following format and can be provided multiple times to configure each >>>> size: >>>> >>>> thp_anon=<size>[KMG]:<value> >>>> >>>> See Documentation/admin-guide/mm/transhuge.rst for more details. >>>> >>>> Configuring the defaults at boot time is useful to allow early user >>>> space to take advantage of mTHP before its been configured through >>>> sysfs. >>> >>> I suspect a khugeapged enhancement and/or kernel-config-dependant >>> defaults and/or early system settings will also be able to mitigate that >>> without getting kernel cmdlines involved in the future. >>> >>>> >>>> Signed-off-by: Ryan Roberts <ryan.roberts@arm.com> >>>> --- >>>> >>>> Hi All, >>>> >>>> I've split this off from my RFC at [1] because Barry highlighted that he would >>>> benefit from it immediately [2]. There are no changes vs the version in that >>>> series. >>>> >>>> It applies against today's mm-unstable (275d686abcb59). (although I had to >>>> fix a >>>> minor build bug in stackdepot.c due to MIN() not being defined in this tree). >>>> >>>> Thanks, >>>> Ryan >>>> >>>> >>>> .../admin-guide/kernel-parameters.txt | 8 +++ >>>> Documentation/admin-guide/mm/transhuge.rst | 26 +++++++-- >>>> mm/huge_memory.c | 55 ++++++++++++++++++- >>>> 3 files changed, 82 insertions(+), 7 deletions(-) >>>> >>>> diff --git a/Documentation/admin-guide/kernel-parameters.txt >>>> b/Documentation/admin-guide/kernel-parameters.txt >>>> index bcdee8984e1f0..5c79b58c108ec 100644 >>>> --- a/Documentation/admin-guide/kernel-parameters.txt >>>> +++ b/Documentation/admin-guide/kernel-parameters.txt >>>> @@ -6631,6 +6631,14 @@ >>>> <deci-seconds>: poll all this frequency >>>> 0: no polling (default) >>>> >>>> + thp_anon= [KNL] >>>> + Format: <size>[KMG]:always|madvise|never|inherit >>>> + Can be used to control the default behavior of the >>>> + system with respect to anonymous transparent hugepages. >>>> + Can be used multiple times for multiple anon THP sizes. >>>> + See Documentation/admin-guide/mm/transhuge.rst for more >>>> + details. >>>> + >>>> threadirqs [KNL,EARLY] >>>> Force threading of all interrupt handlers except those >>>> marked explicitly IRQF_NO_THREAD. >>>> diff --git a/Documentation/admin-guide/mm/transhuge.rst >>>> b/Documentation/admin-guide/mm/transhuge.rst >>>> index 24eec1c03ad88..f63b0717366c6 100644 >>>> --- a/Documentation/admin-guide/mm/transhuge.rst >>>> +++ b/Documentation/admin-guide/mm/transhuge.rst >>>> @@ -284,13 +284,27 @@ that THP is shared. Exceeding the number would block >>>> the collapse:: >>>> >>>> A higher value may increase memory footprint for some workloads. >>>> >>>> -Boot parameter >>>> -============== >>>> +Boot parameters >>>> +=============== >>>> >>>> -You can change the sysfs boot time defaults of Transparent Hugepage >>>> -Support by passing the parameter ``transparent_hugepage=always`` or >>>> -``transparent_hugepage=madvise`` or ``transparent_hugepage=never`` >>>> -to the kernel command line. >>>> +You can change the sysfs boot time default for the top-level "enabled" >>>> +control by passing the parameter ``transparent_hugepage=always`` or >>>> +``transparent_hugepage=madvise`` or ``transparent_hugepage=never`` to the >>>> +kernel command line. >>>> + >>>> +Alternatively, each supported anonymous THP size can be controlled by >>>> +passing ``thp_anon=<size>[KMG]:<state>``, where ``<size>`` is the THP size >>>> +and ``<state>`` is one of ``always``, ``madvise``, ``never`` or >>>> +``inherit``. >>>> + >>>> +For example, the following will set 64K THP to ``always``:: >>>> + >>>> + thp_anon=64K:always >>>> + >>>> +``thp_anon=`` may be specified multiple times to configure all THP sizes as >>>> +required. If ``thp_anon=`` is specified at least once, any anon THP sizes >>>> +not explicitly configured on the command line are implicitly set to >>>> +``never``. >>> >>> I suggest documenting that "thp_anon=" will not effect the value of >>> "transparent_hugepage=", or any configured default. Did you see the previous conversation with Barry about whether or not to honour configured defaults when any thp_anon= is provided [1]? Sounds like you also think we should honour the PMD "inherit" default if not explicitly provided on the command line? (see link for justification for the approach I'm currently taking). [1] https://lore.kernel.org/linux-mm/CAGsJ_4x8ruPspuk_FQVggJMWcXLbRuZFq44gg-Dt7Ewt3ExqTw@mail.gmail.com/ >>> >>> Wondering if a syntax like >>> >>> thp_anon=16K,32K,64K:always;1048K,2048K:madvise Are there examples of that syntax already or have you just made it up? I found examples with the colon (:) but nothing this fancy. I guess that's not a reason not to do it though (other than the risk of screwing up the parser in a subtle way). >>> >>> (one could also support ranges, like "16K-64K") >>> >>> Would be even better. Then, maybe only allow a single instance. >>> >>> Maybe consider it if it's not too crazy to parse ;) I'll take a look. I'm going to be out for 3 weeks from end of Monday though, so probably won't get around to that until I'm back. I know Barry is keen to get this merged, so Barry, if you'd like to take it over that's fine by me (I'm sure you have enough on your plate though). >> >> I prefer the current approach because it effectively filters cases like this. >> >> [ 0.000000] huge_memory: thp_anon=8K:inherit: cannot parse, ignored >> [ 0.000000] Unknown kernel command line parameters >> "thp_anon=8K:inherit", will be passed to user space. >> >> if we put multiple sizes together, 8K,32K,64K:always >> >> We can't determine whether this command line is legal or illegal as it >> is partially legal and partially illegal. > > Besides: I wouldn't bother about this "user does something stupid" scenario that > much. > > But yes, once we support more sizes a cmdline might turn invalid on an older > kernel. > > However, I don't see the problem here. User passed a non-existant size. Ignore > that one but handle the others, like you would with multiple commands? Yep, the parser could emit a warning for the size and move on. > > It can be well defined and documented. The command line is legal, just one size > does not exist. > > The world will continue turning :) > ^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: [PATCH v2] mm: Override mTHP "enabled" defaults at kernel cmdline 2024-08-09 10:37 ` Ryan Roberts @ 2024-08-09 11:37 ` Barry Song 2024-08-09 13:15 ` David Hildenbrand 1 sibling, 0 replies; 17+ messages in thread From: Barry Song @ 2024-08-09 11:37 UTC (permalink / raw) To: Ryan Roberts Cc: David Hildenbrand, Andrew Morton, Jonathan Corbet, Lance Yang, Baolin Wang, linux-kernel, linux-mm On Fri, Aug 9, 2024 at 6:37 PM Ryan Roberts <ryan.roberts@arm.com> wrote: > > On 09/08/2024 10:32, David Hildenbrand wrote: > > On 09.08.24 11:24, Barry Song wrote: > >> On Fri, Aug 9, 2024 at 5:19 PM David Hildenbrand <david@redhat.com> wrote: > >>> > >>> On 08.08.24 12:16, Ryan Roberts wrote: > >>>> Add thp_anon= cmdline parameter to allow specifying the default > >>>> enablement of each supported anon THP size. The parameter accepts the > >>>> following format and can be provided multiple times to configure each > >>>> size: > >>>> > >>>> thp_anon=<size>[KMG]:<value> > >>>> > >>>> See Documentation/admin-guide/mm/transhuge.rst for more details. > >>>> > >>>> Configuring the defaults at boot time is useful to allow early user > >>>> space to take advantage of mTHP before its been configured through > >>>> sysfs. > >>> > >>> I suspect a khugeapged enhancement and/or kernel-config-dependant > >>> defaults and/or early system settings will also be able to mitigate that > >>> without getting kernel cmdlines involved in the future. > >>> > >>>> > >>>> Signed-off-by: Ryan Roberts <ryan.roberts@arm.com> > >>>> --- > >>>> > >>>> Hi All, > >>>> > >>>> I've split this off from my RFC at [1] because Barry highlighted that he would > >>>> benefit from it immediately [2]. There are no changes vs the version in that > >>>> series. > >>>> > >>>> It applies against today's mm-unstable (275d686abcb59). (although I had to > >>>> fix a > >>>> minor build bug in stackdepot.c due to MIN() not being defined in this tree). > >>>> > >>>> Thanks, > >>>> Ryan > >>>> > >>>> > >>>> .../admin-guide/kernel-parameters.txt | 8 +++ > >>>> Documentation/admin-guide/mm/transhuge.rst | 26 +++++++-- > >>>> mm/huge_memory.c | 55 ++++++++++++++++++- > >>>> 3 files changed, 82 insertions(+), 7 deletions(-) > >>>> > >>>> diff --git a/Documentation/admin-guide/kernel-parameters.txt > >>>> b/Documentation/admin-guide/kernel-parameters.txt > >>>> index bcdee8984e1f0..5c79b58c108ec 100644 > >>>> --- a/Documentation/admin-guide/kernel-parameters.txt > >>>> +++ b/Documentation/admin-guide/kernel-parameters.txt > >>>> @@ -6631,6 +6631,14 @@ > >>>> <deci-seconds>: poll all this frequency > >>>> 0: no polling (default) > >>>> > >>>> + thp_anon= [KNL] > >>>> + Format: <size>[KMG]:always|madvise|never|inherit > >>>> + Can be used to control the default behavior of the > >>>> + system with respect to anonymous transparent hugepages. > >>>> + Can be used multiple times for multiple anon THP sizes. > >>>> + See Documentation/admin-guide/mm/transhuge.rst for more > >>>> + details. > >>>> + > >>>> threadirqs [KNL,EARLY] > >>>> Force threading of all interrupt handlers except those > >>>> marked explicitly IRQF_NO_THREAD. > >>>> diff --git a/Documentation/admin-guide/mm/transhuge.rst > >>>> b/Documentation/admin-guide/mm/transhuge.rst > >>>> index 24eec1c03ad88..f63b0717366c6 100644 > >>>> --- a/Documentation/admin-guide/mm/transhuge.rst > >>>> +++ b/Documentation/admin-guide/mm/transhuge.rst > >>>> @@ -284,13 +284,27 @@ that THP is shared. Exceeding the number would block > >>>> the collapse:: > >>>> > >>>> A higher value may increase memory footprint for some workloads. > >>>> > >>>> -Boot parameter > >>>> -============== > >>>> +Boot parameters > >>>> +=============== > >>>> > >>>> -You can change the sysfs boot time defaults of Transparent Hugepage > >>>> -Support by passing the parameter ``transparent_hugepage=always`` or > >>>> -``transparent_hugepage=madvise`` or ``transparent_hugepage=never`` > >>>> -to the kernel command line. > >>>> +You can change the sysfs boot time default for the top-level "enabled" > >>>> +control by passing the parameter ``transparent_hugepage=always`` or > >>>> +``transparent_hugepage=madvise`` or ``transparent_hugepage=never`` to the > >>>> +kernel command line. > >>>> + > >>>> +Alternatively, each supported anonymous THP size can be controlled by > >>>> +passing ``thp_anon=<size>[KMG]:<state>``, where ``<size>`` is the THP size > >>>> +and ``<state>`` is one of ``always``, ``madvise``, ``never`` or > >>>> +``inherit``. > >>>> + > >>>> +For example, the following will set 64K THP to ``always``:: > >>>> + > >>>> + thp_anon=64K:always > >>>> + > >>>> +``thp_anon=`` may be specified multiple times to configure all THP sizes as > >>>> +required. If ``thp_anon=`` is specified at least once, any anon THP sizes > >>>> +not explicitly configured on the command line are implicitly set to > >>>> +``never``. > >>> > >>> I suggest documenting that "thp_anon=" will not effect the value of > >>> "transparent_hugepage=", or any configured default. > > Did you see the previous conversation with Barry about whether or not to honour > configured defaults when any thp_anon= is provided [1]? Sounds like you also > think we should honour the PMD "inherit" default if not explicitly provided on > the command line? (see link for justification for the approach I'm currently > taking). > > [1] > https://lore.kernel.org/linux-mm/CAGsJ_4x8ruPspuk_FQVggJMWcXLbRuZFq44gg-Dt7Ewt3ExqTw@mail.gmail.com/ > > >>> > >>> Wondering if a syntax like > >>> > >>> thp_anon=16K,32K,64K:always;1048K,2048K:madvise > > Are there examples of that syntax already or have you just made it up? I found > examples with the colon (:) but nothing this fancy. I guess that's not a reason > not to do it though (other than the risk of screwing up the parser in a subtle way). > > >>> > >>> (one could also support ranges, like "16K-64K") > >>> > >>> Would be even better. Then, maybe only allow a single instance. > >>> > >>> Maybe consider it if it's not too crazy to parse ;) > I'll take a look. I'm going to be out for 3 weeks from end of Monday though, so > probably won't get around to that until I'm back. I know Barry is keen to get > this merged, so Barry, if you'd like to take it over that's fine by me (I'm sure > you have enough on your plate though). Yes, this command line is definitely needed. However, I also agree with David that it might be worth considering options like '16K-64K' or '16K, 32K, 64K' after further reflection. For example, if users want to enable six sizes, having six separate commands could end up being quite silly :-) I can address this after the mTHP number counters series. > > >> > >> I prefer the current approach because it effectively filters cases like this. > >> > >> [ 0.000000] huge_memory: thp_anon=8K:inherit: cannot parse, ignored > >> [ 0.000000] Unknown kernel command line parameters > >> "thp_anon=8K:inherit", will be passed to user space. > >> > >> if we put multiple sizes together, 8K,32K,64K:always > >> > >> We can't determine whether this command line is legal or illegal as it > >> is partially legal and partially illegal. > > > > Besides: I wouldn't bother about this "user does something stupid" scenario that > > much. > > > > But yes, once we support more sizes a cmdline might turn invalid on an older > > kernel. > > > > However, I don't see the problem here. User passed a non-existant size. Ignore > > that one but handle the others, like you would with multiple commands? > > Yep, the parser could emit a warning for the size and move on. > > > > > It can be well defined and documented. The command line is legal, just one size > > does not exist. > > > > The world will continue turning :) Right. Thanks Barry ^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: [PATCH v2] mm: Override mTHP "enabled" defaults at kernel cmdline 2024-08-09 10:37 ` Ryan Roberts 2024-08-09 11:37 ` Barry Song @ 2024-08-09 13:15 ` David Hildenbrand 2024-08-12 5:36 ` Barry Song 1 sibling, 1 reply; 17+ messages in thread From: David Hildenbrand @ 2024-08-09 13:15 UTC (permalink / raw) To: Ryan Roberts, Barry Song Cc: Andrew Morton, Jonathan Corbet, Lance Yang, Baolin Wang, linux-kernel, linux-mm >>>>> -You can change the sysfs boot time defaults of Transparent Hugepage >>>>> -Support by passing the parameter ``transparent_hugepage=always`` or >>>>> -``transparent_hugepage=madvise`` or ``transparent_hugepage=never`` >>>>> -to the kernel command line. >>>>> +You can change the sysfs boot time default for the top-level "enabled" >>>>> +control by passing the parameter ``transparent_hugepage=always`` or >>>>> +``transparent_hugepage=madvise`` or ``transparent_hugepage=never`` to the >>>>> +kernel command line. >>>>> + >>>>> +Alternatively, each supported anonymous THP size can be controlled by >>>>> +passing ``thp_anon=<size>[KMG]:<state>``, where ``<size>`` is the THP size >>>>> +and ``<state>`` is one of ``always``, ``madvise``, ``never`` or >>>>> +``inherit``. >>>>> + >>>>> +For example, the following will set 64K THP to ``always``:: >>>>> + >>>>> + thp_anon=64K:always >>>>> + >>>>> +``thp_anon=`` may be specified multiple times to configure all THP sizes as >>>>> +required. If ``thp_anon=`` is specified at least once, any anon THP sizes >>>>> +not explicitly configured on the command line are implicitly set to >>>>> +``never``. >>>> >>>> I suggest documenting that "thp_anon=" will not effect the value of >>>> "transparent_hugepage=", or any configured default. > > Did you see the previous conversation with Barry about whether or not to honour > configured defaults when any thp_anon= is provided [1]? Sounds like you also > think we should honour the PMD "inherit" default if not explicitly provided on > the command line? (see link for justification for the approach I'm currently > taking). I primarily think that we should document it :) What if someone passes "transparent_hugepage=always" and "thp_anon=..."? I would assume that transparent_hugepage would only affect the global toggle then? > > [1] > https://lore.kernel.org/linux-mm/CAGsJ_4x8ruPspuk_FQVggJMWcXLbRuZFq44gg-Dt7Ewt3ExqTw@mail.gmail.com/ > >>>> >>>> Wondering if a syntax like >>>> >>>> thp_anon=16K,32K,64K:always;1048K,2048K:madvise > > Are there examples of that syntax already or have you just made it up? I found > examples with the colon (:) but nothing this fancy. I guess that's not a reason > not to do it though (other than the risk of screwing up the parser in a subtle way). I made it up -- mostly ;) I think we are quite flexible on what we can do. As always, maybe we can keep it bit consistent with existing stuff. For hugetlb_cma we have things like "<node>:nn[KMGTPE|[,<node>:nn[KMGTPE]] "memmap=" options are more ... advanced, including memory ranges. There are a bunch more documented in kernel-parameters.txt that have more elaborate formats. Ranges would probably be the most valuable addition. So maybe we should start with: thp_anon=16K-64K:always,1048K-2048K:madvise So to enable all THPs it would simply be thp_anon=16K-2M:always Interesting question what would happen if someone passes: thp_anon=8K-2M:always Likely we simply would apply it to any size in the range, even if start/end is not a THP size. But we would want to complain to the user if someone only specifies a single one (or a range does not even select a single one) that does not exist: thp_anon=8K:always > >>>> >>>> (one could also support ranges, like "16K-64K") >>>> >>>> Would be even better. Then, maybe only allow a single instance. >>>> >>>> Maybe consider it if it's not too crazy to parse ;) > I'll take a look. I'm going to be out for 3 weeks from end of Monday though, so Oh, lucky you! Enjoy! > probably won't get around to that until I'm back. I know Barry is keen to get > this merged, so Barry, if you'd like to take it over that's fine by me (I'm sure > you have enough on your plate though). -- Cheers, David / dhildenb ^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: [PATCH v2] mm: Override mTHP "enabled" defaults at kernel cmdline 2024-08-09 13:15 ` David Hildenbrand @ 2024-08-12 5:36 ` Barry Song 2024-08-12 8:20 ` David Hildenbrand 0 siblings, 1 reply; 17+ messages in thread From: Barry Song @ 2024-08-12 5:36 UTC (permalink / raw) To: david Cc: akpm, baohua, baolin.wang, corbet, ioworker0, linux-kernel, linux-mm, ryan.roberts On Sat, Aug 10, 2024 at 1:15 AM David Hildenbrand <david@redhat.com> wrote: > > >>>>> -You can change the sysfs boot time defaults of Transparent Hugepage > >>>>> -Support by passing the parameter ``transparent_hugepage=always`` or > >>>>> -``transparent_hugepage=madvise`` or ``transparent_hugepage=never`` > >>>>> -to the kernel command line. > >>>>> +You can change the sysfs boot time default for the top-level "enabled" > >>>>> +control by passing the parameter ``transparent_hugepage=always`` or > >>>>> +``transparent_hugepage=madvise`` or ``transparent_hugepage=never`` to the > >>>>> +kernel command line. > >>>>> + > >>>>> +Alternatively, each supported anonymous THP size can be controlled by > >>>>> +passing ``thp_anon=<size>[KMG]:<state>``, where ``<size>`` is the THP size > >>>>> +and ``<state>`` is one of ``always``, ``madvise``, ``never`` or > >>>>> +``inherit``. > >>>>> + > >>>>> +For example, the following will set 64K THP to ``always``:: > >>>>> + > >>>>> + thp_anon=64K:always > >>>>> + > >>>>> +``thp_anon=`` may be specified multiple times to configure all THP sizes as > >>>>> +required. If ``thp_anon=`` is specified at least once, any anon THP sizes > >>>>> +not explicitly configured on the command line are implicitly set to > >>>>> +``never``. > >>>> > >>>> I suggest documenting that "thp_anon=" will not effect the value of > >>>> "transparent_hugepage=", or any configured default. > > > > Did you see the previous conversation with Barry about whether or not to honour > > configured defaults when any thp_anon= is provided [1]? Sounds like you also > > think we should honour the PMD "inherit" default if not explicitly provided on > > the command line? (see link for justification for the approach I'm currently > > taking). > > I primarily think that we should document it :) > > What if someone passes "transparent_hugepage=always" and "thp_anon=..."? > I would assume that transparent_hugepage would only affect the global > toggle then? > > > > > [1] > > https://lore.kernel.org/linux-mm/CAGsJ_4x8ruPspuk_FQVggJMWcXLbRuZFq44gg-Dt7Ewt3ExqTw@mail.gmail.com/ > > > >>>> > >>>> Wondering if a syntax like > >>>> > >>>> thp_anon=16K,32K,64K:always;1048K,2048K:madvise > > > > Are there examples of that syntax already or have you just made it up? I found > > examples with the colon (:) but nothing this fancy. I guess that's not a reason > > not to do it though (other than the risk of screwing up the parser in a subtle way). > > I made it up -- mostly ;) I think we are quite flexible on what we can > do. As always, maybe we can keep it bit consistent with existing stuff. > > For hugetlb_cma we have things like > "<node>:nn[KMGTPE|[,<node>:nn[KMGTPE]] > > "memmap=" options are more ... advanced, including memory ranges. There > are a bunch more documented in kernel-parameters.txt that have more > elaborate formats. > > Ranges would probably be the most valuable addition. So maybe we should > start with: > > thp_anon=16K-64K:always,1048K-2048K:madvise > > So to enable all THPs it would simply be > > thp_anon=16K-2M:always > > Interesting question what would happen if someone passes: > > thp_anon=8K-2M:always > > Likely we simply would apply it to any size in the range, even if > start/end is not a THP size. > > But we would want to complain to the user if someone only specifies a > single one (or a range does not even select a single one) that does not > exist: > > thp_anon=8K:always How about rejecting settings with any illegal size or policy? I found there are too many corner cases to say "yes" or "no". It seems the code could be much cleaner if we simply reject illegal settings. On the other hand, we should rely on smart users to provide correct settings, shouldn’t we? While working one the code, I felt that extracting partial correct settings from incorrect ones and supporting them might be a bit of over-design. I have tested the below code by "thp_anon=16K-64K:always;128K,512K:inherit;256K:madvise;1M-2M:never" diff --git a/mm/huge_memory.c b/mm/huge_memory.c index 1a12c011e2df..6a1f54cff7f9 100644 --- a/mm/huge_memory.c +++ b/mm/huge_memory.c @@ -81,6 +81,7 @@ unsigned long huge_zero_pfn __read_mostly = ~0UL; unsigned long huge_anon_orders_always __read_mostly; unsigned long huge_anon_orders_madvise __read_mostly; unsigned long huge_anon_orders_inherit __read_mostly; +static bool anon_orders_configured; unsigned long __thp_vma_allowable_orders(struct vm_area_struct *vma, unsigned long vm_flags, @@ -737,7 +738,10 @@ static int __init hugepage_init_sysfs(struct kobject **hugepage_kobj) * disable all other sizes. powerpc's PMD_ORDER isn't a compile-time * constant so we have to do this here. */ - huge_anon_orders_inherit = BIT(PMD_ORDER); + if (!anon_orders_configured) { + huge_anon_orders_inherit = BIT(PMD_ORDER); + anon_orders_configured = true; + } *hugepage_kobj = kobject_create_and_add("transparent_hugepage", mm_kobj); if (unlikely(!*hugepage_kobj)) { @@ -922,6 +926,103 @@ static int __init setup_transparent_hugepage(char *str) } __setup("transparent_hugepage=", setup_transparent_hugepage); +static inline int get_order_from_str(const char *size_str) +{ + unsigned long size; + char *endptr; + int order; + + size = memparse(size_str, &endptr); + order = fls(size >> PAGE_SHIFT) - 1; + if (order & ~THP_ORDERS_ALL_ANON) { + pr_err("invalid size in thp_anon= %ld\n", size); + return -EINVAL; + } + + return order; +} + +static inline void set_bits(unsigned long *var, int start, int end) +{ + int i; + + for (i = start; i <= end; i++) + set_bit(i, var); +} + +static inline void clear_bits(unsigned long *var, int start, int end) +{ + int i; + + for (i = start; i <= end; i++) + clear_bit(i, var); +} + +static char str_dup[PAGE_SIZE] __meminitdata; +static int __init setup_thp_anon(char *str) +{ + char *token, *range, *policy, *subtoken; + char *start_size, *end_size; + int start, end; + char *p; + + if (!str || strlen(str) + 1 > PAGE_SIZE) + goto err; + strcpy(str_dup, str); + + p = str_dup; + while ((token = strsep(&p, ";")) != NULL) { + range = strsep(&token, ":"); + policy = token; + + if (!policy) + goto err; + + while ((subtoken = strsep(&range, ",")) != NULL) { + if (strchr(subtoken, '-')) { + start_size = strsep(&subtoken, "-"); + end_size = subtoken; + + start = get_order_from_str(start_size); + end = get_order_from_str(end_size); + } else { + start = end = get_order_from_str(subtoken); + } + + if (start < 0 || end < 0 || start > end) + goto err; + + if (!strcmp(policy, "always")) { + set_bits(&huge_anon_orders_always, start, end); + clear_bits(&huge_anon_orders_inherit, start, end); + clear_bits(&huge_anon_orders_madvise, start, end); + } else if (!strcmp(policy, "madvise")) { + set_bits(&huge_anon_orders_madvise, start, end); + clear_bits(&huge_anon_orders_inherit, start, end); + clear_bits(&huge_anon_orders_always, start, end); + } else if (!strcmp(policy, "inherit")) { + set_bits(&huge_anon_orders_inherit, start, end); + clear_bits(&huge_anon_orders_madvise, start, end); + clear_bits(&huge_anon_orders_always, start, end); + } else if (!strcmp(policy, "never")) { + clear_bits(&huge_anon_orders_inherit, start, end); + clear_bits(&huge_anon_orders_madvise, start, end); + clear_bits(&huge_anon_orders_always, start, end); + } else { + goto err; + } + } + } + + anon_orders_configured = true; + return 1; + +err: + pr_warn("thp_anon=%s: cannot parse(invalid size or policy), ignored\n", str); + return 0; +} +__setup("thp_anon=", setup_thp_anon); + pmd_t maybe_pmd_mkwrite(pmd_t pmd, struct vm_area_struct *vma) { if (likely(vma->vm_flags & VM_WRITE)) -- 2.34.1 Everything seems to be correct: / # cat /sys/kernel/mm/transparent_hugepage/hugepages-16kB/enabled [always] inherit madvise never / # cat /sys/kernel/mm/transparent_hugepage/hugepages-32kB/enabled [always] inherit madvise never / # cat /sys/kernel/mm/transparent_hugepage/hugepages-64kB/enabled [always] inherit madvise never / # cat /sys/kernel/mm/transparent_hugepage/hugepages-128kB/enabled always [inherit] madvise never / # cat /sys/kernel/mm/transparent_hugepage/hugepages-256kB/enabled always inherit [madvise] never / # cat /sys/kernel/mm/transparent_hugepage/hugepages-512kB/enabled always [inherit] madvise never / # cat /sys/kernel/mm/transparent_hugepage/hugepages-1024kB/enabled always inherit madvise [never] / # cat /sys/kernel/mm/transparent_hugepage/hugepages-2048kB/enabled always inherit madvise [never] > > > > >>>> > >>>> (one could also support ranges, like "16K-64K") > >>>> > >>>> Would be even better. Then, maybe only allow a single instance. > >>>> > >>>> Maybe consider it if it's not too crazy to parse ;) > > I'll take a look. I'm going to be out for 3 weeks from end of Monday though, so > > Oh, lucky you! Enjoy! > > > probably won't get around to that until I'm back. I know Barry is keen to get > > this merged, so Barry, if you'd like to take it over that's fine by me (I'm sure > > you have enough on your plate though). > > > -- > Cheers, > > David / dhildenb > Thanks Barry ^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: [PATCH v2] mm: Override mTHP "enabled" defaults at kernel cmdline 2024-08-12 5:36 ` Barry Song @ 2024-08-12 8:20 ` David Hildenbrand 2024-08-12 8:36 ` Barry Song 0 siblings, 1 reply; 17+ messages in thread From: David Hildenbrand @ 2024-08-12 8:20 UTC (permalink / raw) To: Barry Song Cc: akpm, baohua, baolin.wang, corbet, ioworker0, linux-kernel, linux-mm, ryan.roberts On 12.08.24 07:36, Barry Song wrote: > On Sat, Aug 10, 2024 at 1:15 AM David Hildenbrand <david@redhat.com> wrote: >> >>>>>>> -You can change the sysfs boot time defaults of Transparent Hugepage >>>>>>> -Support by passing the parameter ``transparent_hugepage=always`` or >>>>>>> -``transparent_hugepage=madvise`` or ``transparent_hugepage=never`` >>>>>>> -to the kernel command line. >>>>>>> +You can change the sysfs boot time default for the top-level "enabled" >>>>>>> +control by passing the parameter ``transparent_hugepage=always`` or >>>>>>> +``transparent_hugepage=madvise`` or ``transparent_hugepage=never`` to the >>>>>>> +kernel command line. >>>>>>> + >>>>>>> +Alternatively, each supported anonymous THP size can be controlled by >>>>>>> +passing ``thp_anon=<size>[KMG]:<state>``, where ``<size>`` is the THP size >>>>>>> +and ``<state>`` is one of ``always``, ``madvise``, ``never`` or >>>>>>> +``inherit``. >>>>>>> + >>>>>>> +For example, the following will set 64K THP to ``always``:: >>>>>>> + >>>>>>> + thp_anon=64K:always >>>>>>> + >>>>>>> +``thp_anon=`` may be specified multiple times to configure all THP sizes as >>>>>>> +required. If ``thp_anon=`` is specified at least once, any anon THP sizes >>>>>>> +not explicitly configured on the command line are implicitly set to >>>>>>> +``never``. >>>>>> >>>>>> I suggest documenting that "thp_anon=" will not effect the value of >>>>>> "transparent_hugepage=", or any configured default. >>> >>> Did you see the previous conversation with Barry about whether or not to honour >>> configured defaults when any thp_anon= is provided [1]? Sounds like you also >>> think we should honour the PMD "inherit" default if not explicitly provided on >>> the command line? (see link for justification for the approach I'm currently >>> taking). >> >> I primarily think that we should document it :) >> >> What if someone passes "transparent_hugepage=always" and "thp_anon=..."? >> I would assume that transparent_hugepage would only affect the global >> toggle then? >> >>> >>> [1] >>> https://lore.kernel.org/linux-mm/CAGsJ_4x8ruPspuk_FQVggJMWcXLbRuZFq44gg-Dt7Ewt3ExqTw@mail.gmail.com/ >>> >>>>>> >>>>>> Wondering if a syntax like >>>>>> >>>>>> thp_anon=16K,32K,64K:always;1048K,2048K:madvise >>> >>> Are there examples of that syntax already or have you just made it up? I found >>> examples with the colon (:) but nothing this fancy. I guess that's not a reason >>> not to do it though (other than the risk of screwing up the parser in a subtle way). >> >> I made it up -- mostly ;) I think we are quite flexible on what we can >> do. As always, maybe we can keep it bit consistent with existing stuff. >> >> For hugetlb_cma we have things like >> "<node>:nn[KMGTPE|[,<node>:nn[KMGTPE]] >> >> "memmap=" options are more ... advanced, including memory ranges. There >> are a bunch more documented in kernel-parameters.txt that have more >> elaborate formats. >> >> Ranges would probably be the most valuable addition. So maybe we should >> start with: >> >> thp_anon=16K-64K:always,1048K-2048K:madvise >> >> So to enable all THPs it would simply be >> >> thp_anon=16K-2M:always >> >> Interesting question what would happen if someone passes: >> >> thp_anon=8K-2M:always >> >> Likely we simply would apply it to any size in the range, even if >> start/end is not a THP size. >> >> But we would want to complain to the user if someone only specifies a >> single one (or a range does not even select a single one) that does not >> exist: >> >> thp_anon=8K:always > > How about rejecting settings with any illegal size or policy? > > I found there are too many corner cases to say "yes" or "no". It seems > the code could be much cleaner if we simply reject illegal settings. > On the other hand, we should rely on smart users to provide correct > settings, shouldn’t we? While working one the code, I felt that > extracting partial correct settings from incorrect ones and supporting > them might be a bit of over-design. No strong opinion on failing configs. Ignoring non-existing sizes might lead to more portable cmdlines between kernel versions. > > I have tested the below code by > > "thp_anon=16K-64K:always;128K,512K:inherit;256K:madvise;1M-2M:never" There are some parameters that separate individual options by ";" already (config_acs). Most parameters use ",". No idea what's better here, and which separator to use for sizes instead when using "," for options. No strong opinion. -- Cheers, David / dhildenb ^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: [PATCH v2] mm: Override mTHP "enabled" defaults at kernel cmdline 2024-08-12 8:20 ` David Hildenbrand @ 2024-08-12 8:36 ` Barry Song 2024-08-12 9:05 ` David Hildenbrand 0 siblings, 1 reply; 17+ messages in thread From: Barry Song @ 2024-08-12 8:36 UTC (permalink / raw) To: David Hildenbrand Cc: akpm, baolin.wang, corbet, ioworker0, linux-kernel, linux-mm, ryan.roberts On Mon, Aug 12, 2024 at 8:20 PM David Hildenbrand <david@redhat.com> wrote: > > On 12.08.24 07:36, Barry Song wrote: > > On Sat, Aug 10, 2024 at 1:15 AM David Hildenbrand <david@redhat.com> wrote: > >> > >>>>>>> -You can change the sysfs boot time defaults of Transparent Hugepage > >>>>>>> -Support by passing the parameter ``transparent_hugepage=always`` or > >>>>>>> -``transparent_hugepage=madvise`` or ``transparent_hugepage=never`` > >>>>>>> -to the kernel command line. > >>>>>>> +You can change the sysfs boot time default for the top-level "enabled" > >>>>>>> +control by passing the parameter ``transparent_hugepage=always`` or > >>>>>>> +``transparent_hugepage=madvise`` or ``transparent_hugepage=never`` to the > >>>>>>> +kernel command line. > >>>>>>> + > >>>>>>> +Alternatively, each supported anonymous THP size can be controlled by > >>>>>>> +passing ``thp_anon=<size>[KMG]:<state>``, where ``<size>`` is the THP size > >>>>>>> +and ``<state>`` is one of ``always``, ``madvise``, ``never`` or > >>>>>>> +``inherit``. > >>>>>>> + > >>>>>>> +For example, the following will set 64K THP to ``always``:: > >>>>>>> + > >>>>>>> + thp_anon=64K:always > >>>>>>> + > >>>>>>> +``thp_anon=`` may be specified multiple times to configure all THP sizes as > >>>>>>> +required. If ``thp_anon=`` is specified at least once, any anon THP sizes > >>>>>>> +not explicitly configured on the command line are implicitly set to > >>>>>>> +``never``. > >>>>>> > >>>>>> I suggest documenting that "thp_anon=" will not effect the value of > >>>>>> "transparent_hugepage=", or any configured default. > >>> > >>> Did you see the previous conversation with Barry about whether or not to honour > >>> configured defaults when any thp_anon= is provided [1]? Sounds like you also > >>> think we should honour the PMD "inherit" default if not explicitly provided on > >>> the command line? (see link for justification for the approach I'm currently > >>> taking). > >> > >> I primarily think that we should document it :) > >> > >> What if someone passes "transparent_hugepage=always" and "thp_anon=..."? > >> I would assume that transparent_hugepage would only affect the global > >> toggle then? > >> > >>> > >>> [1] > >>> https://lore.kernel.org/linux-mm/CAGsJ_4x8ruPspuk_FQVggJMWcXLbRuZFq44gg-Dt7Ewt3ExqTw@mail.gmail.com/ > >>> > >>>>>> > >>>>>> Wondering if a syntax like > >>>>>> > >>>>>> thp_anon=16K,32K,64K:always;1048K,2048K:madvise > >>> > >>> Are there examples of that syntax already or have you just made it up? I found > >>> examples with the colon (:) but nothing this fancy. I guess that's not a reason > >>> not to do it though (other than the risk of screwing up the parser in a subtle way). > >> > >> I made it up -- mostly ;) I think we are quite flexible on what we can > >> do. As always, maybe we can keep it bit consistent with existing stuff. > >> > >> For hugetlb_cma we have things like > >> "<node>:nn[KMGTPE|[,<node>:nn[KMGTPE]] > >> > >> "memmap=" options are more ... advanced, including memory ranges. There > >> are a bunch more documented in kernel-parameters.txt that have more > >> elaborate formats. > >> > >> Ranges would probably be the most valuable addition. So maybe we should > >> start with: > >> > >> thp_anon=16K-64K:always,1048K-2048K:madvise > >> > >> So to enable all THPs it would simply be > >> > >> thp_anon=16K-2M:always > >> > >> Interesting question what would happen if someone passes: > >> > >> thp_anon=8K-2M:always > >> > >> Likely we simply would apply it to any size in the range, even if > >> start/end is not a THP size. > >> > >> But we would want to complain to the user if someone only specifies a > >> single one (or a range does not even select a single one) that does not > >> exist: > >> > >> thp_anon=8K:always > > > > How about rejecting settings with any illegal size or policy? > > > > I found there are too many corner cases to say "yes" or "no". It seems > > the code could be much cleaner if we simply reject illegal settings. > > On the other hand, we should rely on smart users to provide correct > > settings, shouldn’t we? While working one the code, I felt that > > extracting partial correct settings from incorrect ones and supporting > > them might be a bit of over-design. > > No strong opinion on failing configs. Ignoring non-existing sizes might > lead to more portable cmdlines between kernel versions. Well, I realized that the code I posted has actually applied the correct parts because it modifies the global variable huge_anon_orders_xxx. Unless I use temporary variables and then copy the results to global ones, the code has been this issue to some extent. maybe I should remove the below "goto err", instead, ignore the incorrect strings and go to the next strings to parse. static int __init setup_thp_anon(char *str) { ... if (!strcmp(policy, "always")) { set_bits(&huge_anon_orders_always, start, end); clear_bits(&huge_anon_orders_inherit, start, end); clear_bits(&huge_anon_orders_madvise, start, end); } else if (!strcmp(policy, "madvise")) { ... } else if (!strcmp(policy, "inherit")) { ... } else if (!strcmp(policy, "never")) { ... } else { /* goto err; */ -> pr_err } } } ... err: pr_warn("thp_anon=%s: cannot parse(invalid size or policy), ignored\n", str); return 0; } > > > > > I have tested the below code by > > > > "thp_anon=16K-64K:always;128K,512K:inherit;256K:madvise;1M-2M:never" > > There are some parameters that separate individual options by ";" > already (config_acs). Most parameters use ",". No idea what's better > here, and which separator to use for sizes instead when using "," for > options. No strong opinion. But we have used "," for the purpose "128K,512K:inherit". so here we have to use ";" for "16K-64K:always;128K,512K:inherit" > > > -- > Cheers, > > David / dhildenb Thanks Barry ^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: [PATCH v2] mm: Override mTHP "enabled" defaults at kernel cmdline 2024-08-12 8:36 ` Barry Song @ 2024-08-12 9:05 ` David Hildenbrand 0 siblings, 0 replies; 17+ messages in thread From: David Hildenbrand @ 2024-08-12 9:05 UTC (permalink / raw) To: Barry Song Cc: akpm, baolin.wang, corbet, ioworker0, linux-kernel, linux-mm, ryan.roberts On 12.08.24 10:36, Barry Song wrote: > On Mon, Aug 12, 2024 at 8:20 PM David Hildenbrand <david@redhat.com> wrote: >> >> On 12.08.24 07:36, Barry Song wrote: >>> On Sat, Aug 10, 2024 at 1:15 AM David Hildenbrand <david@redhat.com> wrote: >>>> >>>>>>>>> -You can change the sysfs boot time defaults of Transparent Hugepage >>>>>>>>> -Support by passing the parameter ``transparent_hugepage=always`` or >>>>>>>>> -``transparent_hugepage=madvise`` or ``transparent_hugepage=never`` >>>>>>>>> -to the kernel command line. >>>>>>>>> +You can change the sysfs boot time default for the top-level "enabled" >>>>>>>>> +control by passing the parameter ``transparent_hugepage=always`` or >>>>>>>>> +``transparent_hugepage=madvise`` or ``transparent_hugepage=never`` to the >>>>>>>>> +kernel command line. >>>>>>>>> + >>>>>>>>> +Alternatively, each supported anonymous THP size can be controlled by >>>>>>>>> +passing ``thp_anon=<size>[KMG]:<state>``, where ``<size>`` is the THP size >>>>>>>>> +and ``<state>`` is one of ``always``, ``madvise``, ``never`` or >>>>>>>>> +``inherit``. >>>>>>>>> + >>>>>>>>> +For example, the following will set 64K THP to ``always``:: >>>>>>>>> + >>>>>>>>> + thp_anon=64K:always >>>>>>>>> + >>>>>>>>> +``thp_anon=`` may be specified multiple times to configure all THP sizes as >>>>>>>>> +required. If ``thp_anon=`` is specified at least once, any anon THP sizes >>>>>>>>> +not explicitly configured on the command line are implicitly set to >>>>>>>>> +``never``. >>>>>>>> >>>>>>>> I suggest documenting that "thp_anon=" will not effect the value of >>>>>>>> "transparent_hugepage=", or any configured default. >>>>> >>>>> Did you see the previous conversation with Barry about whether or not to honour >>>>> configured defaults when any thp_anon= is provided [1]? Sounds like you also >>>>> think we should honour the PMD "inherit" default if not explicitly provided on >>>>> the command line? (see link for justification for the approach I'm currently >>>>> taking). >>>> >>>> I primarily think that we should document it :) >>>> >>>> What if someone passes "transparent_hugepage=always" and "thp_anon=..."? >>>> I would assume that transparent_hugepage would only affect the global >>>> toggle then? >>>> >>>>> >>>>> [1] >>>>> https://lore.kernel.org/linux-mm/CAGsJ_4x8ruPspuk_FQVggJMWcXLbRuZFq44gg-Dt7Ewt3ExqTw@mail.gmail.com/ >>>>> >>>>>>>> >>>>>>>> Wondering if a syntax like >>>>>>>> >>>>>>>> thp_anon=16K,32K,64K:always;1048K,2048K:madvise >>>>> >>>>> Are there examples of that syntax already or have you just made it up? I found >>>>> examples with the colon (:) but nothing this fancy. I guess that's not a reason >>>>> not to do it though (other than the risk of screwing up the parser in a subtle way). >>>> >>>> I made it up -- mostly ;) I think we are quite flexible on what we can >>>> do. As always, maybe we can keep it bit consistent with existing stuff. >>>> >>>> For hugetlb_cma we have things like >>>> "<node>:nn[KMGTPE|[,<node>:nn[KMGTPE]] >>>> >>>> "memmap=" options are more ... advanced, including memory ranges. There >>>> are a bunch more documented in kernel-parameters.txt that have more >>>> elaborate formats. >>>> >>>> Ranges would probably be the most valuable addition. So maybe we should >>>> start with: >>>> >>>> thp_anon=16K-64K:always,1048K-2048K:madvise >>>> >>>> So to enable all THPs it would simply be >>>> >>>> thp_anon=16K-2M:always >>>> >>>> Interesting question what would happen if someone passes: >>>> >>>> thp_anon=8K-2M:always >>>> >>>> Likely we simply would apply it to any size in the range, even if >>>> start/end is not a THP size. >>>> >>>> But we would want to complain to the user if someone only specifies a >>>> single one (or a range does not even select a single one) that does not >>>> exist: >>>> >>>> thp_anon=8K:always >>> >>> How about rejecting settings with any illegal size or policy? >>> >>> I found there are too many corner cases to say "yes" or "no". It seems >>> the code could be much cleaner if we simply reject illegal settings. >>> On the other hand, we should rely on smart users to provide correct >>> settings, shouldn’t we? While working one the code, I felt that >>> extracting partial correct settings from incorrect ones and supporting >>> them might be a bit of over-design. >> >> No strong opinion on failing configs. Ignoring non-existing sizes might >> lead to more portable cmdlines between kernel versions. > > Well, I realized that the code I posted has actually applied the correct > parts because it modifies the global variable huge_anon_orders_xxx. > Unless I use temporary variables and then copy the results to > global ones, the code has been this issue to some extent. > > maybe I should remove the below "goto err", instead, ignore the > incorrect strings and go to the next strings to parse. I'll have to look at the full patch to be able to have an opinion :) >>> >>> I have tested the below code by >>> >>> "thp_anon=16K-64K:always;128K,512K:inherit;256K:madvise;1M-2M:never" >> >> There are some parameters that separate individual options by ";" >> already (config_acs). Most parameters use ",". No idea what's better >> here, and which separator to use for sizes instead when using "," for >> options. No strong opinion. > > But we have used "," for the purpose "128K,512K:inherit". so here > we have to use ";" for "16K-64K:always;128K,512K:inherit" I know -- I proposed that format -- but I wonder if there is an alternative that would let use use "," in-between options. Staring at various formats in Documentation/admin-guide/kernel-parameters.txt, I'm not able to come up with something "obviously better". -- Cheers, David / dhildenb ^ permalink raw reply [flat|nested] 17+ messages in thread
end of thread, other threads:[~2024-08-12 9:05 UTC | newest]
Thread overview: 17+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
[not found] <20240808101700.571701-1-ryan.roberts@arm.com>
2024-08-08 21:17 ` [PATCH v2] mm: Override mTHP "enabled" defaults at kernel cmdline Barry Song
2024-08-09 7:58 ` Ryan Roberts
2024-08-09 8:31 ` Barry Song
2024-08-09 8:57 ` Ryan Roberts
2024-08-09 9:04 ` Barry Song
2024-08-09 8:49 ` Baolin Wang
2024-08-09 8:52 ` Barry Song
2024-08-09 9:19 ` David Hildenbrand
2024-08-09 9:24 ` Barry Song
2024-08-09 9:32 ` David Hildenbrand
2024-08-09 10:37 ` Ryan Roberts
2024-08-09 11:37 ` Barry Song
2024-08-09 13:15 ` David Hildenbrand
2024-08-12 5:36 ` Barry Song
2024-08-12 8:20 ` David Hildenbrand
2024-08-12 8:36 ` Barry Song
2024-08-12 9:05 ` David Hildenbrand
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox