From: Alex Shi <alex.shi@linux.alibaba.com>
To: cgroups@vger.kernel.org, linux-kernel@vger.kernel.org, linux-mm@kvack.org, akpm@linux-foundation.org, mgorman@techsingularity.net, tj@kernel.org, hughd@google.com, khlebnikov@yandex-team.ru, daniel.m.jordan@oracle.com, yang.shi@linux.alibaba.com, willy@infradead.org, shakeelb@google.com, hannes@cmpxchg.org
Subject: [PATCH v7 00/10] per lruvec lru_lock for memcg
Date: Wed, 25 Dec 2019 17:04:16 +0800
Message-Id: <1577264666-246071-1-git-send-email-alex.shi@linux.alibaba.com>

Hi all,

Merry Christmas! :)

This patchset moves lru_lock into lruvec, giving each lruvec its own
lru_lock, and thus each memcg a per-node lru_lock.
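To illustrate the structural change, a rough sketch (field layout
simplified and assumed, not the exact definitions in the series):

    /* Before this series: a single lock in struct pglist_data
     * serializes every LRU list on the node (other fields omitted). */
    struct pglist_data {
            spinlock_t      lru_lock;
            struct lruvec   lruvec;
    };

    /* After this series: each lruvec carries its own lock, i.e. one
     * lock per memcg per node (other fields omitted). */
    struct lruvec {
            struct list_head        lists[NR_LRU_LISTS];
            spinlock_t              lru_lock;
    };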
We introduce the function lock_page_lruvec, which locks the page's
memcg and then the memcg's lruvec->lru_lock, replacing the old
pgdat->lru_lock (thanks to Johannes Weiner, Hugh Dickins and
Konstantin Khlebnikov for the suggestions/reminders). A rough sketch
of the new helpers follows the diffstat at the end of this mail.

Following Daniel Jordan's suggestion, I ran 208 'dd' tasks in 104
containers on a 2-socket * 26-core * HT box with a modified case:
  https://git.kernel.org/pub/scm/linux/kernel/git/wfg/vm-scalability.git/tree/case-lru-file-readtwice

With this patchset, readtwice performance increased by about 80% with
containers, and there is no performance drop without containers.

Another way to guard move_account is by lru_lock instead of move_lock.
Consider the memcg move task path:

   mem_cgroup_move_task:
     mem_cgroup_move_charge:
       lru_add_drain_all();
       atomic_inc(&mc.from->moving_account);  //ask lruvec's move_lock
       synchronize_rcu();
       walk_page_range: do charge_walk_ops(mem_cgroup_move_charge_pte_range):
         isolate_lru_page();
         mem_cgroup_move_account(page,)
           spin_lock(&from->move_lock)
           page->mem_cgroup = to;
           spin_unlock(&from->move_lock)
         putback_lru_page(page)

Guarding 'page->mem_cgroup = to' with the target lruvec's lru_lock has
a similar effect to move_lock, so performance-wise the two solutions
are equivalent.

Thanks to Hugh Dickins and Konstantin Khlebnikov, who both proposed
the same idea 7 years ago.

Thanks for all the comments from Hugh Dickins, Konstantin Khlebnikov,
Daniel Jordan, Johannes Weiner, Mel Gorman, Shakeel Butt, Rong Chen,
Fengguang Wu, Yun Wang and others, and for the testing support from
Intel 0day!

v7:
  a, rebase on v5.5-rc3
  b, move the lock_page_lru() cleanup before the lock replacement

v6:
  a, rebase on v5.5-rc2, and retest
  b, pick up Johannes' comments and do a lock_page_lru cleanup

v5:
  a, lock the page's memcg, according to Johannes Weiner's suggestion
  b, use macros for the non-memcg case, according to Johannes' and
     Matthew's suggestions

v4:
  a, fix the page->mem_cgroup dereferencing issue, thanks Johannes Weiner
  b, remove the irqsave flags changes, thanks Matthew Wilcox
  c, merge/split patches for better understanding and bisection purposes

v3: rebase on linux-next, and fold the relock fix patch into the
    introducing patch

v2: bypass a performance regression bug and fix some function issues

v1: initial version; aim testing showed a 5% performance increase on a
    16-thread box.

Alex Shi (9):
  mm/vmscan: remove unnecessary lruvec adding
  mm/memcg: fold lru_lock in lock_page_lru
  mm/lru: replace pgdat lru_lock with lruvec lock
  mm/lru: introduce the relock_page_lruvec function
  mm/mlock: optimize munlock_pagevec by relocking
  mm/swap: only change the lru_lock iff page's lruvec is different
  mm/pgdat: remove pgdat lru_lock
  mm/lru: add debug checking for page memcg moving
  mm/memcg: add debug checking in lock_page_memcg

Hugh Dickins (1):
  mm/lru: revise the comments of lru_lock

 Documentation/admin-guide/cgroup-v1/memcg_test.rst | 15 +---
 Documentation/admin-guide/cgroup-v1/memory.rst     |  6 +-
 Documentation/trace/events-kmem.rst                |  2 +-
 Documentation/vm/unevictable-lru.rst               | 22 ++---
 include/linux/memcontrol.h                         | 63 ++++++++++++++
 include/linux/mm_types.h                           |  2 +-
 include/linux/mmzone.h                             |  5 +-
 mm/compaction.c                                    | 59 ++++++++-----
 mm/filemap.c                                       |  4 +-
 mm/huge_memory.c                                   | 18 ++--
 mm/memcontrol.c                                    | 84 +++++++++++++++-----
 mm/mlock.c                                         | 28 +++----
 mm/mmzone.c                                        |  1 +
 mm/page_alloc.c                                    |  1 -
 mm/page_idle.c                                     |  7 +-
 mm/rmap.c                                          |  2 +-
 mm/swap.c                                          | 75 +++++++----------
 mm/vmscan.c                                        | 98 ++++++++++++----------
 18 files changed, 297 insertions(+), 195 deletions(-)
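To give a feel for the new API without reading the patches, here is a
heavily simplified sketch of the two central helpers (assumed shapes
only: the real code must also pin page->mem_cgroup, the "lock the
page's memcg" step above, so the lruvec cannot change underneath us,
and it supplies _irq/_irqsave variants):

    static struct lruvec *lock_page_lruvec(struct page *page)
    {
            struct lruvec *lruvec;

            /* find the per-memcg, per-node lruvec this page sits on */
            lruvec = mem_cgroup_page_lruvec(page, page_pgdat(page));
            spin_lock(&lruvec->lru_lock);
            return lruvec;
    }

    /* Batched users (pagevec drains, reclaim, compaction) keep one
     * lruvec's lock held and only switch locks when the next page
     * belongs to a different lruvec: */
    static struct lruvec *relock_page_lruvec(struct page *page,
                                             struct lruvec *locked)
    {
            struct lruvec *lruvec;

            lruvec = mem_cgroup_page_lruvec(page, page_pgdat(page));
            if (lruvec == locked)
                    return locked;

            if (locked)
                    spin_unlock(&locked->lru_lock);
            spin_lock(&lruvec->lru_lock);
            return lruvec;
    }

This relocking is why the mlock and swap patches help: a pagevec whose
pages all sit in the same lruvec takes the lock once instead of once
per page.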