Subject: Re: [PATCH v18 00/32] per memcg lru_lock: reviews
From: Alex Shi
Date: Wed, 16 Sep 2020 20:44:41 +0800
To: Daniel Jordan, Hugh Dickins
Cc: Andrew Morton, mgorman@techsingularity.net, tj@kernel.org,
 khlebnikov@yandex-team.ru, willy@infradead.org, hannes@cmpxchg.org,
 lkp@intel.com, linux-mm@kvack.org, linux-kernel@vger.kernel.org,
 cgroups@vger.kernel.org, shakeelb@google.com, iamjoonsoo.kim@lge.com,
 richard.weiyang@gmail.com, kirill@shutemov.name, alexander.duyck@gmail.com,
 rong.a.chen@intel.com, mhocko@suse.com, vdavydov.dev@gmail.com,
 shy828301@gmail.com, vbabka@suse.cz, minchan@kernel.org, cai@lca.pw
Message-ID: <82711200-7b71-44a6-f238-b8e8fcb879e9@linux.alibaba.com>
In-Reply-To: <20200915165807.kpp7uhiw7l3loofu@ca-dmjordan1.us.oracle.com>
References: <20200824114204.cc796ca182db95809dd70a47@linux-foundation.org>
 <61a42a87-eec9-e300-f710-992756f70de6@linux.alibaba.com>
 <855ad6ee-dba4-9729-78bd-23e392905cf6@linux.alibaba.com>
 <5cfc6142-752d-26e6-0108-38d13009268b@linux.alibaba.com>
 <20200915165807.kpp7uhiw7l3loofu@ca-dmjordan1.us.oracle.com>

On 2020/9/16 at 12:58 AM, Daniel Jordan wrote:
> On Tue, Sep 15, 2020 at 01:21:56AM -0700, Hugh Dickins wrote:
>> On Sun, 13 Sep 2020, Alex Shi wrote:
>>> Uh, I updated the testing with some new results here:
>>> https://lkml.org/lkml/2020/8/26/212
>>
>> Right, I missed that, that's better, thanks.  Any other test results?
>
> Alex, you were doing some will-it-scale runs earlier.  Are you planning to do
> more of those?  Otherwise I can add them in.
>

Hi Daniel,

I am happy to see your testing result. :)

> This is what I have so far.
>
>
> sysbench oltp read-only
> -----------------------
>
> The goal was to run a real world benchmark, at least more so than something
> like vm-scalability, with the memory controller enabled but unused to check for
> regressions.
>
> I chose sysbench because it was relatively straightforward to run, but I'm open
> to ideas for other high level benchmarks that might be more sensitive to this
> series.
>
> CoeffVar shows the test was pretty noisy overall.  It's nice to see there's no
> significant difference between the kernels for low thread counts (1-12), but
> I'm not sure what to make of the 18 and 20 thread cases.  At 20 threads, the
> CPUs of the node that the test was confined to were saturated and the variance
> is especially high.  I'm tempted to write the 18 and 20 thread cases off as
> noise.
>
> - 2-socket * 10-core * 2-hyperthread broadwell server
> - test bound to node 1 to lower variance
> - 251G memory, divided evenly between the nodes (memory size of test shrunk to
>   accommodate confining to one node)
> - 12 iterations per thread count per kernel
> - THP enabled

Thanks a lot for the results!
Alex

>
> export OLTP_CACHESIZE=$(($MEMTOTAL_BYTES/4))
> export OLTP_SHAREDBUFFERS=$((MEMTOTAL_BYTES/8))
> export OLTP_PAGESIZES="default"
> export SYSBENCH_DRIVER=postgres
> export SYSBENCH_MAX_TRANSACTIONS=auto
> export SYSBENCH_READONLY=yes
> export SYSBENCH_MAX_THREADS=$((NUMCPUS / 2))
> export SYSBENCH_ITERATIONS=12
> export SYSBENCH_WORKLOAD_SIZE=$((MEMTOTAL_BYTES*3/8))
> export SYSBENCH_CACHE_COLD=no
> export DATABASE_INIT_ONCE=yes
>
> export MMTESTS_NUMA_POLICY=fullbind_single_instance_node
> numactl --cpunodebind=1 --membind=1
>
> sysbench Transactions per second
>                          5.9-rc2         5.9-rc2-lru-v18
> Min        1      593.23 (   0.00%)      583.37 (  -1.66%)
> Min        4     1897.34 (   0.00%)     1871.77 (  -1.35%)
> Min        7     2471.14 (   0.00%)     2449.77 (  -0.86%)
> Min       12     2680.00 (   0.00%)     2853.25 (   6.46%)
> Min       18     2183.82 (   0.00%)     1191.43 ( -45.44%)
> Min       20      924.96 (   0.00%)      526.66 ( -43.06%)
> Hmean      1      912.08 (   0.00%)      904.24 (  -0.86%)
> Hmean      4     2057.11 (   0.00%)     2044.69 (  -0.60%)
> Hmean      7     2817.59 (   0.00%)     2812.80 (  -0.17%)
> Hmean     12     3201.05 (   0.00%)     3171.09 (  -0.94%)
> Hmean     18     2529.10 (   0.00%)     2009.99 * -20.53%*
> Hmean     20     1742.29 (   0.00%)     1127.77 * -35.27%*
> Stddev     1      219.21 (   0.00%)      220.92 (  -0.78%)
> Stddev     4       94.94 (   0.00%)       84.34 (  11.17%)
> Stddev     7      189.42 (   0.00%)      167.58 (  11.53%)
> Stddev    12      372.13 (   0.00%)      199.40 (  46.42%)
> Stddev    18      248.42 (   0.00%)      574.66 (-131.32%)
> Stddev    20      757.69 (   0.00%)      666.87 (  11.99%)
> CoeffVar   1       22.54 (   0.00%)       22.86 (  -1.42%)
> CoeffVar   4        4.61 (   0.00%)        4.12 (  10.60%)
> CoeffVar   7        6.69 (   0.00%)        5.94 (  11.30%)
> CoeffVar  12       11.49 (   0.00%)        6.27 (  45.46%)
> CoeffVar  18        9.74 (   0.00%)       26.22 (-169.23%)
> CoeffVar  20       36.32 (   0.00%)       47.18 ( -29.89%)
> Max        1     1117.45 (   0.00%)     1107.33 (  -0.91%)
> Max        4     2184.92 (   0.00%)     2136.65 (  -2.21%)
> Max        7     3086.81 (   0.00%)     3049.52 (  -1.21%)
> Max       12     4020.07 (   0.00%)     3580.95 ( -10.92%)
> Max       18     3032.30 (   0.00%)     2810.85 (  -7.30%)
> Max       20     2891.27 (   0.00%)     2675.80 (  -7.45%)
> BHmean-50  1     1098.77 (   0.00%)     1093.58 (  -0.47%)
> BHmean-50  4     2139.76 (   0.00%)     2107.13 (  -1.52%)
> BHmean-50  7     2972.18 (   0.00%)     2953.94 (  -0.61%)
> BHmean-50 12     3494.73 (   0.00%)     3311.33 (  -5.25%)
> BHmean-50 18     2729.70 (   0.00%)     2606.32 (  -4.52%)
> BHmean-50 20     2668.72 (   0.00%)     1779.87 ( -33.31%)
> BHmean-95  1      958.94 (   0.00%)      951.84 (  -0.74%)
> BHmean-95  4     2072.98 (   0.00%)     2062.01 (  -0.53%)
> BHmean-95  7     2853.96 (   0.00%)     2851.21 (  -0.10%)
> BHmean-95 12     3258.65 (   0.00%)     3203.53 (  -1.69%)
> BHmean-95 18     2565.99 (   0.00%)     2143.90 ( -16.45%)
> BHmean-95 20     1894.47 (   0.00%)     1258.34 ( -33.58%)
> BHmean-99  1      958.94 (   0.00%)      951.84 (  -0.74%)
> BHmean-99  4     2072.98 (   0.00%)     2062.01 (  -0.53%)
> BHmean-99  7     2853.96 (   0.00%)     2851.21 (  -0.10%)
> BHmean-99 12     3258.65 (   0.00%)     3203.53 (  -1.69%)
> BHmean-99 18     2565.99 (   0.00%)     2143.90 ( -16.45%)
> BHmean-99 20     1894.47 (   0.00%)     1258.34 ( -33.58%)
>
> sysbench Time
>                          5.9-rc2             5.9-rc2-lru
> Min        1        8.96 (   0.00%)        9.04 (  -0.89%)
> Min        4        4.63 (   0.00%)        4.74 (  -2.38%)
> Min        7        3.34 (   0.00%)        3.38 (  -1.20%)
> Min       12        2.65 (   0.00%)        2.95 ( -11.32%)
> Min       18        3.54 (   0.00%)        3.80 (  -7.34%)
> Min       20        3.74 (   0.00%)        4.02 (  -7.49%)
> Amean      1       11.00 (   0.00%)       11.11 (  -0.98%)
> Amean      4        4.92 (   0.00%)        4.95 (  -0.59%)
> Amean      7        3.65 (   0.00%)        3.65 (  -0.16%)
> Amean     12        3.29 (   0.00%)        3.32 (  -0.89%)
> Amean     18        4.20 (   0.00%)        5.22 * -24.39%*
> Amean     20        6.02 (   0.00%)        9.14 * -51.98%*
> Stddev     1        3.33 (   0.00%)        3.45 (  -3.40%)
> Stddev     4        0.23 (   0.00%)        0.21 (   7.89%)
> Stddev     7        0.25 (   0.00%)        0.22 (   9.87%)
> Stddev    12        0.35 (   0.00%)        0.19 (  45.09%)
> Stddev    18        0.38 (   0.00%)        1.75 (-354.74%)
> Stddev    20        2.93 (   0.00%)        4.73 ( -61.72%)
> CoeffVar   1       30.30 (   0.00%)       31.02 (  -2.40%)
> CoeffVar   4        4.63 (   0.00%)        4.24 (   8.43%)
> CoeffVar   7        6.77 (   0.00%)        6.10 (  10.02%)
> CoeffVar  12       10.74 (   0.00%)        5.85 (  45.57%)
> CoeffVar  18        9.15 (   0.00%)       33.45 (-265.58%)
> CoeffVar  20       48.64 (   0.00%)       51.75 (  -6.41%)
> Max        1       17.01 (   0.00%)       17.36 (  -2.06%)
> Max        4        5.33 (   0.00%)        5.40 (  -1.31%)
> Max        7        4.14 (   0.00%)        4.18 (  -0.97%)
> Max       12        3.89 (   0.00%)        3.67 (   5.66%)
> Max       18        4.82 (   0.00%)        8.64 ( -79.25%)
> Max       20       11.09 (   0.00%)       19.26 ( -73.67%)
> BAmean-50  1        9.12 (   0.00%)        9.16 (  -0.49%)
> BAmean-50  4        4.73 (   0.00%)        4.80 (  -1.55%)
> BAmean-50  7        3.46 (   0.00%)        3.48 (  -0.58%)
> BAmean-50 12        3.02 (   0.00%)        3.18 (  -5.24%)
> BAmean-50 18        3.90 (   0.00%)        4.08 (  -4.52%)
> BAmean-50 20        4.02 (   0.00%)        5.90 ( -46.56%)
> BAmean-95  1       10.45 (   0.00%)       10.54 (  -0.82%)
> BAmean-95  4        4.88 (   0.00%)        4.91 (  -0.52%)
> BAmean-95  7        3.60 (   0.00%)        3.60 (  -0.08%)
> BAmean-95 12        3.23 (   0.00%)        3.28 (  -1.60%)
> BAmean-95 18        4.14 (   0.00%)        4.91 ( -18.58%)
> BAmean-95 20        5.56 (   0.00%)        8.22 ( -48.04%)
> BAmean-99  1       10.45 (   0.00%)       10.54 (  -0.82%)
> BAmean-99  4        4.88 (   0.00%)        4.91 (  -0.52%)
> BAmean-99  7        3.60 (   0.00%)        3.60 (  -0.08%)
> BAmean-99 12        3.23 (   0.00%)        3.28 (  -1.60%)
> BAmean-99 18        4.14 (   0.00%)        4.91 ( -18.58%)
> BAmean-99 20        5.56 (   0.00%)        8.22 ( -48.04%)
>
>
> docker-ized readtwice microbenchmark
> ------------------------------------
>
> This is Alex's modified readtwice case.  Needed a few fixes, and I made it into
> a script.  Updated version attached.
>
> Same machine, three runs per kernel, 40 containers per test.  This is average
> MB/s over all containers.
>
>      5.9-rc2          5.9-rc2-lru
>    -----------        -----------
>    220.5 (3.3)        356.9 (0.5)
>
> That's a 62% improvement.
>
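
For readers who want to re-derive the relative figures quoted above, here is a minimal sketch of the percent-change arithmetic. It is not taken from mmtests or from the thread; the helper name `pct_change` is hypothetical, and the inputs are the rounded values printed in the tables, so results can differ from the reported ones in the last digit.

```python
def pct_change(baseline: float, patched: float) -> float:
    """Relative change of `patched` versus `baseline`, in percent.

    Positive means the patched value is larger. Whether larger is
    better depends on the metric (throughput: yes, elapsed time: no).
    """
    if baseline == 0:
        raise ValueError("baseline must be non-zero")
    return (patched - baseline) / baseline * 100.0


# readtwice average MB/s: 5.9-rc2 vs 5.9-rc2-lru (values from the email)
readtwice = pct_change(220.5, 356.9)

# sysbench transactions/s, Hmean at 20 threads (values from the email)
hmean_20 = pct_change(1742.29, 1127.77)

print(f"readtwice: {readtwice:+.2f}%")
print(f"Hmean 20 threads: {hmean_20:+.2f}%")
```

With the rounded inputs, the readtwice figure comes out just under 62%, consistent with the improvement stated in the email, and the 20-thread Hmean matches the flagged regression in the transactions table.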