From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: from mail-pg0-f71.google.com (mail-pg0-f71.google.com [74.125.83.71])
	by kanga.kvack.org (Postfix) with ESMTP id 780226B0033
	for ; Fri, 24 Nov 2017 05:15:03 -0500 (EST)
Received: by mail-pg0-f71.google.com with SMTP id 199so15459110pgg.20
	for ; Fri, 24 Nov 2017 02:15:03 -0800 (PST)
Received: from mx2.suse.de (mx2.suse.de. [195.135.220.15])
	by mx.google.com with ESMTPS id h3si18334697plh.592.2017.11.24.02.15.01
	for (version=TLS1 cipher=AES128-SHA bits=128/128);
	Fri, 24 Nov 2017 02:15:01 -0800 (PST)
Date: Fri, 24 Nov 2017 11:14:57 +0100
From: Michal Hocko
Subject: Re: [PATCH] mm:Add watermark slope for high mark
Message-ID: <20171124101457.by7eoblmk357jwnz@dhcp22.suse.cz>
References: <20171124100707.24190-1-peter.enderborg@sony.com>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To: <20171124100707.24190-1-peter.enderborg@sony.com>
Sender: owner-linux-mm@kvack.org
List-ID:
To: Peter Enderborg
Cc: linux-doc@vger.kernel.org, linux-kernel@vger.kernel.org,
	linux-mm@kvack.org, linux-fsdevel@vger.kernel.org, Jonathan Corbet,
	"Luis R . Rodriguez", Kees Cook, Alex Deucher, "David S . Miller",
	Harry Wentland, Greg Kroah-Hartman, Tony Cheng, David Rientjes,
	Andrew Morton, Jan Kara, "Kirill A . Shutemov", Dave Jiang,
	Jérôme Glisse, Ross Zwisler, Matthew Wilcox, Hugh Dickins,
	Johannes Weiner, Kemi Wang, Vlastimil Babka, YASUAKI ISHIMATSU,
	Nikolay Borisov, Mel Gorman, Pavel Tatashin

On Fri 24-11-17 11:07:07, Peter Enderborg wrote:
> When tuning the watermark_scale_factor to reduce stalls and compactions
> the high mark is also changed, it changed a bit too much. So this
> patch introduces a slope that can reduce this overhead a bit, or
> increase it if needed.

This doesn't explain what the problem is, why it is a problem and why we
need yet another tunable to address it.
Users shouldn't really care about internal stuff like watermark tuning for
each watermark independently. This looks like a gross hack. Please start
over with the problem description and then we can move on to an
appropriate fix. Piling up tuning knobs to work around problems is simply
not acceptable.

> Signed-off-by: Peter Enderborg
> ---
>  Documentation/sysctl/vm.txt | 15 +++++++++++++++
>  include/linux/mm.h          |  1 +
>  include/linux/mmzone.h      |  2 ++
>  kernel/sysctl.c             |  9 +++++++++
>  mm/page_alloc.c             |  6 +++++-
>  5 files changed, 32 insertions(+), 1 deletion(-)
>
> diff --git a/Documentation/sysctl/vm.txt b/Documentation/sysctl/vm.txt
> index eda628c..aecff6c 100644
> --- a/Documentation/sysctl/vm.txt
> +++ b/Documentation/sysctl/vm.txt
> @@ -62,6 +62,7 @@ Currently, these files are in /proc/sys/vm:
>  - user_reserve_kbytes
>  - vfs_cache_pressure
>  - watermark_scale_factor
> +- watermark_high_factor_slope
>  - zone_reclaim_mode
>
>  ==============================================================
> @@ -857,6 +858,20 @@ that the number of free pages kswapd maintains for latency reasons is
>  too small for the allocation bursts occurring in the system. This knob
>  can then be used to tune kswapd aggressiveness accordingly.
>
> +=============================================================
> +
> +watermark_high_factor_slope:
> +
> +This factor is high mark for watermark_scale_factor.
> +The unit is in percent.
> +Max value is 1000 and min value is 100. (High watermark is the same as
> +low water mark) Low watermark is min_wmark_pages + watermark_scale_factor.
> +and high watermark is
> +min_wmark_pages+(watermark_scale_factor * watermark_high_factor_slope).
> +
> +The default value is 200.
> +
> +
>  ==============================================================
>
>  zone_reclaim_mode:
> diff --git a/include/linux/mm.h b/include/linux/mm.h
> index 7661156..c89536b 100644
> --- a/include/linux/mm.h
> +++ b/include/linux/mm.h
> @@ -2094,6 +2094,7 @@ extern void zone_pcp_reset(struct zone *zone);
>  /* page_alloc.c */
>  extern int min_free_kbytes;
>  extern int watermark_scale_factor;
> +extern int watermark_high_factor_slope;
>
>  /* nommu.c */
>  extern atomic_long_t mmap_pages_allocated;
> diff --git a/include/linux/mmzone.h b/include/linux/mmzone.h
> index 67f2e3c..91bf842 100644
> --- a/include/linux/mmzone.h
> +++ b/include/linux/mmzone.h
> @@ -886,6 +886,8 @@ int min_free_kbytes_sysctl_handler(struct ctl_table *, int,
>  					void __user *, size_t *, loff_t *);
>  int watermark_scale_factor_sysctl_handler(struct ctl_table *, int,
>  					void __user *, size_t *, loff_t *);
> +//int watermark_high_factor_tilt_sysctl_handler(struct ctl_table *, int,
> +//					void __user *, size_t *, loff_t *);
>  extern int sysctl_lowmem_reserve_ratio[MAX_NR_ZONES-1];
>  int lowmem_reserve_ratio_sysctl_handler(struct ctl_table *, int,
>  					void __user *, size_t *, loff_t *);
> diff --git a/kernel/sysctl.c b/kernel/sysctl.c
> index 2fb4e27..83c48c9 100644
> --- a/kernel/sysctl.c
> +++ b/kernel/sysctl.c
> @@ -1444,6 +1444,15 @@ static struct ctl_table vm_table[] = {
>  		.extra2		= &one_thousand,
>  	},
>  	{
> +		.procname	= "watermark_high_factor_slope",
> +		.data		= &watermark_high_factor_slope,
> +		.maxlen		= sizeof(watermark_high_factor_slope),
> +		.mode		= 0644,
> +		.proc_handler	= watermark_scale_factor_sysctl_handler,
> +		.extra1		= &one_hundred,
> +		.extra2		= &one_thousand,
> +	},
> +	{
>  		.procname	= "percpu_pagelist_fraction",
>  		.data		= &percpu_pagelist_fraction,
>  		.maxlen		= sizeof(percpu_pagelist_fraction),
> diff --git a/mm/page_alloc.c b/mm/page_alloc.c
> index 48b5b01..3dc50ff 100644
> --- a/mm/page_alloc.c
> +++ b/mm/page_alloc.c
> @@ -263,6 +263,7 @@ compound_page_dtor * const compound_page_dtors[] = {
>  int min_free_kbytes = 1024;
>  int user_min_free_kbytes = -1;
>  int watermark_scale_factor = 10;
> +int watermark_high_factor_slope = 200;
>
>  static unsigned long __meminitdata nr_kernel_pages;
>  static unsigned long __meminitdata nr_all_pages;
> @@ -6989,6 +6990,7 @@ static void __setup_per_zone_wmarks(void)
>
>  	for_each_zone(zone) {
>  		u64 tmp;
> +		u64 tmp_high;
>
>  		spin_lock_irqsave(&zone->lock, flags);
>  		tmp = (u64)pages_min * zone->managed_pages;
> @@ -7026,7 +7028,9 @@ static void __setup_per_zone_wmarks(void)
>  					watermark_scale_factor, 10000));
>
>  		zone->watermark[WMARK_LOW] = min_wmark_pages(zone) + tmp;
> -		zone->watermark[WMARK_HIGH] = min_wmark_pages(zone) + tmp * 2;
> +		tmp_high = mult_frac(tmp, watermark_high_factor_slope, 100);
> +		zone->watermark[WMARK_HIGH] = min_wmark_pages(zone) + tmp_high;
> +
>
>  		spin_unlock_irqrestore(&zone->lock, flags);
>  	}
> --
> 2.7.4
>

--
Michal Hocko
SUSE Labs
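
For readers following the arithmetic in the quoted mm/page_alloc.c hunk, here is a minimal userspace sketch (not kernel code) of how the proposed slope would move the high watermark. The helper names mult_frac_u64, low_wmark and high_wmark are hypothetical; mult_frac_u64 is a stand-in that mirrors the kernel's mult_frac() macro, and the min_wmark and tmp arguments stand for min_wmark_pages(zone) and the per-zone distance computed just above the changed lines.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Userspace stand-in (assumption) for the kernel's mult_frac():
 * x * numer / denom, split into quotient and remainder parts to
 * limit intermediate overflow.
 */
static uint64_t mult_frac_u64(uint64_t x, uint64_t numer, uint64_t denom)
{
	assert(denom != 0);
	uint64_t quot = x / denom;
	uint64_t rem = x % denom;

	return quot * numer + (rem * numer) / denom;
}

/* WMARK_LOW as in the quoted hunk: min watermark plus the distance tmp. */
static uint64_t low_wmark(uint64_t min_wmark, uint64_t tmp)
{
	return min_wmark + tmp;
}

/*
 * WMARK_HIGH as proposed in the patch: min watermark plus tmp scaled
 * by the slope, where the slope is a percentage (100..1000).
 */
static uint64_t high_wmark(uint64_t min_wmark, uint64_t tmp, uint64_t slope)
{
	return min_wmark + mult_frac_u64(tmp, slope, 100);
}
```

With hypothetical values of min_wmark = 1000 and tmp = 250 pages, a slope of 200 puts the high watermark at 1500 pages, the same as the existing tmp * 2, while a slope of 100 puts it at 1250 pages, equal to the low watermark.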