From: Tariq Toukan
To: Aaron Lu, Tariq Toukan
Cc: Jesper Dangaard Brouer, David Miller, Mel Gorman, Eric Dumazet,
 Alexei Starovoitov, Saeed Mahameed, Eran Ben Elisha,
 Linux Kernel Network Developers, Andrew Morton, Michal Hocko,
 linux-mm, Dave Hansen
Subject: Re: Page allocator bottleneck
Date: Mon, 18 Sep 2017 18:33:20 +0300
Message-ID: <082e7901-7842-e9d9-221d-45322da0fcff@mellanox.com>
In-Reply-To: <20170918074404.GD4107@intel.com>
References: <20170915092839.690ea9e9@redhat.com>
 <6069fd36-ed0e-145c-3134-35232bf951a7@mellanox.com>
 <20170918073447.GB4107@intel.com>
 <20170918074404.GD4107@intel.com>

On 18/09/2017 10:44 AM, Aaron Lu wrote:
> On Mon, Sep 18, 2017 at 03:34:47PM +0800, Aaron Lu wrote:
>> On Sun, Sep 17, 2017 at 07:16:15PM +0300, Tariq Toukan wrote:
>>>
>>> It's nice to have the option to dynamically play with the parameter.
>>> But maybe we should also think of changing the default fraction
>>> guaranteed to the PCP, so that unaware admins of networking servers
>>> would also benefit.
>>
>> I collected some performance data with will-it-scale/page_fault1 process
>> mode on different machines with different pcp->batch sizes, starting
>> from the default of 31 (calculated by zone_batchsize(); 31 is the
>> standard value for any zone that has more than 1/2 MiB of memory),
>> then incremented by 31 upwards until 527. The PCP's upper limit is
>> 6*batch.
>>
>> An image is plotted and attached: batch_full.png (full here means the
>> number of processes started equals the CPU count).
>
> To be clear: the X-axis is the batch size (31, 62, 93, ..., 527) and the
> Y-axis is per_process_ops as reported by will-it-scale; higher is better.
>
>>
>> From the image:
>> - The EX machines all see throughput increase with increased batch
>> size, peaking at around batch_size=310, then falling;
>> - Among the EP machines, Haswell-EP and Broadwell-EP also see throughput
>> increase with increased batch size, peaking at batch_size=279, then
>> falling; batch_size=310 also delivers a pretty good result. Skylake-EP
>> is quite different in that it doesn't see any obvious throughput
>> increase after batch_size=93; the trend is still upward, but only
>> slightly, and it finally peaks at batch_size=403, then falls.
>> Ivybridge-EP behaves much like the desktop machines.
>> - The desktop machines do not see any obvious change with increased
>> batch_size.
>>
>> So the default batch size (31) doesn't deliver a good enough result; we
>> probably should change the default value.

Thanks Aaron for sharing your experiment results. That's a good analysis
of the effect of the batch value. I agree with your conclusion.
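For context, a minimal user-space sketch of how the quoted default of 31
is derived: it approximates the zone_batchsize() logic in mm/page_alloc.c
as I understand it (the kernel source is authoritative), and the zone size
in main() is only an illustrative example. It also shows the 6*batch upper
limit mentioned above.

/*
 * Sketch of the per-cpu-pages batch calculation: roughly 1/1024 of the
 * zone, capped at 1/2 MiB worth of pages, divided by 4 and clamped to a
 * 2^n - 1 value.  Not the kernel implementation, just an approximation.
 */
#include <stdio.h>

#define PAGE_SIZE 4096UL	/* assumed 4 KiB pages */

static unsigned long rounddown_pow_of_two(unsigned long n)
{
	unsigned long p = 1;

	while (p * 2 <= n)
		p *= 2;
	return p;
}

static long zone_batchsize(unsigned long managed_pages)
{
	unsigned long batch;

	/* Aim for ~1/1024 of the zone, but no more than 1/2 MiB. */
	batch = managed_pages / 1024;
	if (batch * PAGE_SIZE > 512 * 1024)
		batch = (512 * 1024) / PAGE_SIZE;
	batch /= 4;
	if (batch < 1)
		batch = 1;

	/* Clamp to a 2^n - 1 value. */
	return rounddown_pow_of_two(batch + batch / 2) - 1;
}

int main(void)
{
	unsigned long zone_pages = 4UL << 20;	/* e.g. a 16 GiB zone */
	long batch = zone_batchsize(zone_pages);

	/* pcp->high defaults to 6 * batch. */
	printf("batch = %ld, high = %ld\n", batch, 6 * batch);
	return 0;
}

For any zone larger than 1/2 MiB this prints "batch = 31, high = 186",
matching the default discussed above.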
From a networking perspective, we should reconsider the defaults so the
allocator can keep up with the increasing NIC line rates. Not only for
pcp->batch, but also for pcp->high.

Regards,
Tariq
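As a rough sense of the memory cost of such a change (my own
back-of-envelope numbers, assuming 4 KiB pages and the high = 6*batch
relation quoted above):

/*
 * How much memory the PCP lists can hold per CPU, per zone, with the
 * current default batch (31) vs. the ~310 value that peaked in Aaron's
 * measurements.  Illustrative only.
 */
#include <stdio.h>

int main(void)
{
	const unsigned long page_kib = 4;	/* assumed page size */
	const int batches[] = { 31, 310 };

	for (int i = 0; i < 2; i++) {
		int batch = batches[i];
		int high = 6 * batch;

		printf("batch=%3d  high=%4d  up to %lu KiB per CPU per zone\n",
		       batch, high, high * page_kib);
	}
	return 0;
}

That is roughly 744 KiB per CPU per zone today versus about 7.3 MiB with
batch=310, which seems a reasonable trade-off on the kind of servers that
drive high NIC line rates.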