Subject: Re: [PATCH v13 00/18] per memcg lru lock
To: Andrew Morton
Cc: mgorman@techsingularity.net, tj@kernel.org, hughd@google.com,
 khlebnikov@yandex-team.ru, daniel.m.jordan@oracle.com,
 yang.shi@linux.alibaba.com, willy@infradead.org, hannes@cmpxchg.org,
 lkp@intel.com, linux-mm@kvack.org, linux-kernel@vger.kernel.org,
 cgroups@vger.kernel.org, shakeelb@google.com, iamjoonsoo.kim@lge.com,
 richard.weiyang@gmail.com
References: <1592555636-115095-1-git-send-email-alex.shi@linux.alibaba.com>
 <20200620160807.0e0997c3e0e3ca1b18e68a53@linux-foundation.org>
From: Alex Shi
Message-ID: <5561f72b-8f9a-f84e-94a4-600c66084f29@linux.alibaba.com>
Date: Sun, 21 Jun 2020 23:44:47 +0800
In-Reply-To: <20200620160807.0e0997c3e0e3ca1b18e68a53@linux-foundation.org>

On 2020/6/21 at 7:08 AM, Andrew Morton wrote:
> On Fri, 19 Jun 2020 16:33:38 +0800 Alex Shi wrote:
>
>> This is a new version which bases on
>> linux-next, merged much suggestion
>> from Hugh Dickins, from compaction fix to less TestClearPageLRU and
>> comments reverse etc. Thank a lot, Hugh!
>>
>> Johannes Weiner has suggested:
>> "So here is a crazy idea that may be worth exploring:
>>
>> Right now, pgdat->lru_lock protects both PageLRU *and* the lruvec's
>> linked list.
>>
>> Can we make PageLRU atomic and use it to stabilize the lru_lock
>> instead, and then use the lru_lock only serialize list operations?
>
> I don't understand this sentence.  How can a per-page flag stabilize a
> per-pgdat spinlock?  Perhaps some additional description will help.

Hi Andrew,

Sorry, the comment above is missing some context: the lru_lock it
refers to is the new lru_lock in each memcg, not the current per-node
lru_lock.

Currently the PageLRU bit is only changed under lru_lock, so isolating
a page from the LRU just needs to take lru_lock. The new patches change
the bit with an atomic operation, independent of lru_lock, so isolating
a page now needs two steps: TestClearPageLRU first, then taking the
lru_lock, as in the isolate_lru_page() changes below.

The main reason for this comes from isolate_migratepages_block() in
compaction.c: we have to take the LRU bit before the lru lock. That
serializes page isolation against memcg page charge/migration, which
changes the page's lruvec and therefore the new lru_lock in it. The
current isolation code just takes the lru lock directly, which fails to
guard against a change of the page's lruvec (a memcg change).
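The ordering described above can be sketched in userspace C. This is
only a model under stated assumptions, not the kernel code: struct page,
struct lruvec, test_clear_page_lru(), lru_add() and isolate_page() are
simplified, hypothetical stand-ins, a pthread mutex plays the role of
the per-memcg lru_lock, and a counter replaces the LRU list.

```c
/* Userspace sketch: models the ordering only, not kernel data structures. */
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

#define PG_LRU 0x1u                 /* stand-in for the kernel's LRU page flag */

struct lruvec {                     /* hypothetical, heavily simplified */
    pthread_mutex_t lru_lock;       /* models the per-memcg lru_lock */
    int nr_pages;                   /* models the length of the LRU list */
};

struct page {                       /* hypothetical, heavily simplified */
    atomic_uint flags;
    struct lruvec *lruvec;          /* would change on a memcg move */
};

/* Atomically clear PG_LRU; true only for the caller that cleared it. */
static bool test_clear_page_lru(struct page *page)
{
    return atomic_fetch_and(&page->flags, ~PG_LRU) & PG_LRU;
}

/* Put the page on an lruvec, then mark it as being on the LRU. */
static void lru_add(struct page *page, struct lruvec *lruvec)
{
    pthread_mutex_lock(&lruvec->lru_lock);
    page->lruvec = lruvec;
    lruvec->nr_pages++;
    atomic_fetch_or(&page->flags, PG_LRU);
    pthread_mutex_unlock(&lruvec->lru_lock);
}

/* Returns 0 on success, -1 if the page is not on an LRU or already claimed. */
static int isolate_page(struct page *page)
{
    struct lruvec *lruvec;

    /* Step 1: claim the page by winning the atomic test-and-clear. */
    if (!test_clear_page_lru(page))
        return -1;

    /*
     * Step 2: in this model, owning the bit is what makes it safe to read
     * page->lruvec and pick the matching lock; the lock itself then only
     * serializes the list update.
     */
    lruvec = page->lruvec;
    pthread_mutex_lock(&lruvec->lru_lock);
    assert(lruvec->nr_pages > 0);
    lruvec->nr_pages--;
    pthread_mutex_unlock(&lruvec->lru_lock);
    return 0;
}
```

Only one caller can win the test-and-clear, so a second isolate_page()
on the same page returns -1 without taking any lock.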
changes in isolate_lru_page():

-       if (PageLRU(page)) {
+       if (TestClearPageLRU(page)) {
                pg_data_t *pgdat = page_pgdat(page);
                struct lruvec *lruvec;
+               int lru = page_lru(page);

-               spin_lock_irq(&pgdat->lru_lock);
+               get_page(page);
                lruvec = mem_cgroup_page_lruvec(page, pgdat);
-               if (PageLRU(page)) {
-                       int lru = page_lru(page);
-                       get_page(page);
-                       ClearPageLRU(page);
-                       del_page_from_lru_list(page, lruvec, lru);
-                       ret = 0;
-               }
+               spin_lock_irq(&pgdat->lru_lock);
+               del_page_from_lru_list(page, lruvec, lru);
                spin_unlock_irq(&pgdat->lru_lock);
+               ret = 0;
        }

>
>>
>> Following Daniel Jordan's suggestion, I have run 208 'dd' with on 104
>> containers on a 2s * 26cores * HT box with a modified case:
>> https://git.kernel.org/pub/scm/linux/kernel/git/wfg/vm-scalability.git/tree/case-lru-file-readtwice
>>
>> With this patchset, the readtwice performance increased about 80%
>> in concurrent containers.
>>
>> Thanks Hugh Dickins and Konstantin Khlebnikov, they both brought this
>> idea 8 years ago, and others who give comments as well: Daniel Jordan,
>> Mel Gorman, Shakeel Butt, Matthew Wilcox etc.
>>
>> Thanks for Testing support from Intel 0day and Rong Chen, Fengguang Wu,
>> and Yun Wang. Hugh Dickins also shared his kbuild-swap case. Thanks!
>>
>> ...
>>
>> 24 files changed, 500 insertions(+), 357 deletions(-)
>
> It's a large patchset and afaict the whole point is performance gain.
> 80% in one specialized test sounds nice, but is there a plan for more
> extensive quantification?

I once measured a 5% aim7 performance gain on a 16-core machine, and
about a 20+% readtwice performance gain. The gain grows considerably as
the number of cores increases. Do you have any suggestions for further
tests?

>
> There isn't much sign of completed review activity here, so I'll go
> into hiding for a while.
>

Yes, it's relatively big; also, much of the change comes from the
comments part. :)
Anyway, thanks for looking into it!

Thanks
Alex