From: Alex Shi <alex.shi@linux.alibaba.com>
To: cgroups@vger.kernel.org, linux-kernel@vger.kernel.org, linux-mm@kvack.org, akpm@linux-foundation.org, mgorman@techsingularity.net, tj@kernel.org, hughd@google.com, khlebnikov@yandex-team.ru, daniel.m.jordan@oracle.com, yang.shi@linux.alibaba.com, willy@infradead.org, shakeelb@google.com, hannes@cmpxchg.org
Cc: Alex Shi <alex.shi@linux.alibaba.com>
Subject: [PATCH v6 00/10] per lruvec lru_lock for memcg
Date: Mon, 16 Dec 2019 17:26:16 +0800
Message-Id: <1576488386-32544-1-git-send-email-alex.shi@linux.alibaba.com>

This patchset moves lru_lock into lruvec, giving each lruvec its own lru_lock and thus one lru_lock per memcg per node.

We introduce the function lock_page_lruvec(), which locks the page's memcg and then that memcg's lruvec->lru_lock (thanks to Johannes Weiner, Hugh Dickins and Konstantin Khlebnikov for the suggestion/reminder), replacing the old pgdat->lru_lock.
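For orientation, here is a minimal sketch of what such a helper could look like; it is not the patch code itself, just an illustration that assumes the existing kernel helpers lock_page_memcg()/unlock_page_memcg(), mem_cgroup_page_lruvec() and page_pgdat(), plus the lruvec->lru_lock field this series adds:

  /*
   * Illustrative sketch, not the actual patch: pin the page's memcg
   * first so page->mem_cgroup cannot change under us, then take the
   * lru_lock of the lruvec that (memcg, node) maps to.
   */
  static struct lruvec *lock_page_lruvec_sketch(struct page *page)
  {
          struct lruvec *lruvec;

          lock_page_memcg(page);          /* stabilize page->mem_cgroup */
          lruvec = mem_cgroup_page_lruvec(page, page_pgdat(page));
          spin_lock(&lruvec->lru_lock);

          return lruvec;
  }

  static void unlock_page_lruvec_sketch(struct lruvec *lruvec, struct page *page)
  {
          spin_unlock(&lruvec->lru_lock);
          unlock_page_memcg(page);
  }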
Following Daniel Jordan's suggestion, I ran 208 'dd' tasks in 104 containers on a 2-socket * 26-core * HT box with a modified case:
  https://git.kernel.org/pub/scm/linux/kernel/git/wfg/vm-scalability.git/tree/case-lru-file-readtwice

With this patchset, readtwice performance increased by about 80% with containers, and there is no performance drop without containers. The previously suspected slight drop is gone in the v5.5-rc version, and we even got another 5% performance increase; the readtwice case also improved from 5.4 to 5.5-rc.

Another way to guard move_account is by lru_lock instead of move_lock.
Considering the memcg move task path:
   mem_cgroup_move_task:
     mem_cgroup_move_charge:
        lru_add_drain_all();
        atomic_inc(&mc.from->moving_account);  //ask memcg's move_lock
        synchronize_rcu();
        walk_page_range: do charge_walk_ops(mem_cgroup_move_charge_pte_range):
           isolate_lru_page();
           mem_cgroup_move_account(page,)
              spin_lock(&from->move_lock)
              page->mem_cgroup = to;
              spin_unlock(&from->move_lock)
           putback_lru_page(page)

Guarding 'page->mem_cgroup = to' with to_vec->lru_lock has a similar effect to move_lock, so performance-wise the two solutions are the same. Thanks to Hugh Dickins and Konstantin Khlebnikov, who both brought up the same idea 7 years ago.
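As an illustration only (the _sketch name is hypothetical, and the real move_account path does more bookkeeping than shown), the alternative above would roughly swap the move_lock critical section for the destination lruvec's lru_lock; it assumes the v5.5 helper mem_cgroup_lruvec(memcg, pgdat) and the lruvec->lru_lock added by this series:

  static void move_account_under_lru_lock_sketch(struct page *page,
                                                 struct mem_cgroup *to)
  {
          /* caller has already isolated the page from its old LRU */
          struct lruvec *to_vec = mem_cgroup_lruvec(to, page_pgdat(page));

          spin_lock(&to_vec->lru_lock);
          page->mem_cgroup = to;          /* the store move_lock used to guard */
          spin_unlock(&to_vec->lru_lock);
  }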
Thanks for all the comments from Hugh Dickins, Konstantin Khlebnikov, Daniel Jordan, Johannes Weiner, Mel Gorman, Shakeel Butt, Rong Chen, Fengguang Wu, Yun Wang etc., and for the testing support from Intel 0day!

v6:
  a, rebase on v5.5-rc2, and redo the testing.
  b, pick up Johannes' comment changes and a lock_page_lru cleanup.
v5:
  a, lock the page's memcg, per Johannes Weiner's suggestion.
  b, use a macro for the non-memcg case, per Johannes Weiner's and Matthew Wilcox's suggestions.
v4:
  a, fix the page->mem_cgroup dereferencing issue, thanks Johannes Weiner.
  b, remove the irqsave flags changes, thanks Matthew Wilcox.
  c, merge/split patches for better understanding and bisection purposes.
v3: rebase on linux-next, and fold the relock fix patch into the introducing patch.
v2: bypass a performance regression bug and fix some function issues.
v1: initial version; aim testing showed a 5% performance increase.

Alex Shi (8):
  mm/vmscan: remove unnecessary lruvec adding
  mm/lru: replace pgdat lru_lock with lruvec lock
  mm/lru: introduce the relock_page_lruvec function
  mm/mlock: optimize munlock_pagevec by relocking
  mm/swap: only change the lru_lock iff page's lruvec is different
  mm/pgdat: remove pgdat lru_lock
  mm/lru: debug checking for page memcg moving and lock_page_memcg
  mm/memcg: fold lock in lock_page_lru

Hugh Dickins (1):
  mm/lru: revise the comments of lru_lock

Johannes Weiner (1):
  mm: revise the comments of mem_cgroup_page_lruvec

 Documentation/admin-guide/cgroup-v1/memcg_test.rst |  15 +---
 Documentation/admin-guide/cgroup-v1/memory.rst     |   6 +-
 Documentation/trace/events-kmem.rst                |   2 +-
 Documentation/vm/unevictable-lru.rst               |  22 ++---
 include/linux/memcontrol.h                         |  63 ++++++++++++++
 include/linux/mm_types.h                           |   2 +-
 include/linux/mmzone.h                             |   5 +-
 mm/compaction.c                                    |  59 ++++++++-----
 mm/filemap.c                                       |   4 +-
 mm/huge_memory.c                                   |  18 ++--
 mm/memcontrol.c                                    |  95 ++++++++++++++++----
 mm/mlock.c                                         |  28 +++----
 mm/mmzone.c                                        |   1 +
 mm/page_alloc.c                                    |   1 -
 mm/page_idle.c                                     |   7 +-
 mm/rmap.c                                          |   2 +-
 mm/swap.c                                          |  75 +++++++----------
 mm/vmscan.c                                        |  97 ++++++++++-----------
 18 files changed, 309 insertions(+), 193 deletions(-)

-- 
1.8.3.1