From: Alex Shi
To: cgroups@vger.kernel.org, linux-kernel@vger.kernel.org, linux-mm@kvack.org, akpm@linux-foundation.org, mgorman@techsingularity.net, tj@kernel.org, hughd@google.com, khlebnikov@yandex-team.ru, daniel.m.jordan@oracle.com, yang.shi@linux.alibaba.com, willy@infradead.org, shakeelb@google.com, hannes@cmpxchg.org
Cc: Alex Shi
Subject: [PATCH v5 0/8] per lruvec lru_lock for memcg
Date: Tue, 10 Dec 2019 19:46:16 +0800
Message-Id: <1575978384-222381-1-git-send-email-alex.shi@linux.alibaba.com>

Hi all,

Sorry for sending this out late.

This patchset moves lru_lock into lruvec, giving each lruvec its own
lru_lock, and thus one lru_lock per memcg per node.

The main patch replaces the per-node lru_lock with a per-memcg lruvec lock.
It introduces the function lock_page_lruvec(), which locks the page's memcg
and then that memcg's lruvec->lru_lock.
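To illustrate the locking granularity described above, here is a minimal
user-space sketch, not the kernel code. It assumes pthread mutexes in place
of spinlocks, a hypothetical MAX_NODES constant, and simplified stand-ins
for struct page, struct mem_cgroup and struct lruvec. It also leaves out
the step of locking the page's memcg against concurrent moves, so it only
shows that pages of different memcgs (or different nodes) take different
locks.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

/* Hypothetical user-space model; these are NOT the kernel definitions. */
#define MAX_NODES 2

struct lruvec {
	pthread_mutex_t lru_lock;	/* one lock per (memcg, node) pair */
	unsigned long nr_pages;		/* stand-in for the LRU lists */
};

struct mem_cgroup {
	struct lruvec lruvecs[MAX_NODES];
};

struct page {
	struct mem_cgroup *mem_cgroup;	/* owning memcg */
	int nid;			/* node the page lives on */
};

static void memcg_init(struct mem_cgroup *memcg)
{
	for (int nid = 0; nid < MAX_NODES; nid++) {
		pthread_mutex_init(&memcg->lruvecs[nid].lru_lock, NULL);
		memcg->lruvecs[nid].nr_pages = 0;
	}
}

/*
 * Find the lruvec through the page's memcg and node, then take that
 * lruvec's lock. The memcg pinning step is omitted in this model.
 */
static struct lruvec *lock_page_lruvec(struct page *page)
{
	assert(page->nid >= 0 && page->nid < MAX_NODES);

	struct lruvec *lruvec = &page->mem_cgroup->lruvecs[page->nid];

	pthread_mutex_lock(&lruvec->lru_lock);
	return lruvec;
}

static void unlock_page_lruvec(struct lruvec *lruvec)
{
	pthread_mutex_unlock(&lruvec->lru_lock);
}

/* Account one page on its lruvec while holding only that lruvec's lock. */
static void lru_add_page(struct page *page)
{
	struct lruvec *lruvec = lock_page_lruvec(page);

	lruvec->nr_pages++;
	unlock_page_lruvec(lruvec);
}
```

In this model two tasks working on pages of different memcgs never contend
on the same lock, which is the effect the readtwice numbers with containers
are meant to measure.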
(Thanks to Johannes Weiner, Hugh Dickins and Konstantin Khlebnikov for the
suggestions and reminders.)

Following Daniel Jordan's suggestion, I ran 208 'dd' tasks on 104 containers
on a 2-socket * 26-core * HT box with a modified case:
  https://git.kernel.org/pub/scm/linux/kernel/git/wfg/vm-scalability.git/tree/case-lru-file-readtwice

With this and the later patches, readtwice performance increases by about
80% with containers, but without memcg it drops by about 5% (and by another
5% with the last debug patch). That is slightly better than v4, which
dropped about 6% without memcg.

Considering the memcg move task path:
   mem_cgroup_move_task:
     mem_cgroup_move_charge:
	lru_add_drain_all();
	atomic_inc(&mc.from->moving_account); //ask lruvec's move_lock
	synchronize_rcu();
	walk_page_range: do charge_walk_ops(mem_cgroup_move_charge_pte_range):
	   isolate_lru_page();
	   mem_cgroup_move_account(page,)
		spin_lock(&from->move_lock)
		page->mem_cgroup = to;
		spin_unlock(&from->move_lock)
	   putback_lru_page(page)

Guarding 'page->mem_cgroup = to' with to_vec->lru_lock has a similar effect
to move_lock, so performance-wise both solutions are the same.

Thanks to Hugh Dickins and Konstantin Khlebnikov, who both brought up the
same idea 7 years ago.

Thanks for all the comments from Hugh Dickins, Konstantin Khlebnikov, Daniel
Jordan, Johannes Weiner, Mel Gorman, Shakeel Butt, Rong Chen, Fengguang Wu,
Yun Wang etc., and for the testing support from Intel 0day!

v5:
  a, lock the page's memcg, according to Johannes Weiner's suggestion
  b, use a macro for the non-memcg case, according to Johannes's and
     Matthew's suggestion
v4:
  a, fix the page->mem_cgroup dereferencing issue, thanks Johannes Weiner
  b, remove the irqsave flags changes, thanks Matthew Wilcox
  c, merge/split patches for better understanding and bisection purposes
v3: rebase on linux-next, and fold the relock fix patch into the
    introducing patch
v2: bypass a performance regression bug and fix some function issues
v1: initial version; aim testing shows a 5% performance increase

Alex Shi (7):
  mm/vmscan: remove unnecessary lruvec adding
  mm/lru: replace pgdat lru_lock with lruvec lock
  mm/lru: introduce the relock_page_lruvec function
  mm/mlock: optimize munlock_pagevec by relocking
  mm/swap: only change the lru_lock iff page's lruvec is different
  mm/pgdat: remove pgdat lru_lock
  mm/lru: debug checking for page memcg moving and lock_page_memcg

Hugh Dickins (1):
  mm/lru: revise the comments of lru_lock

 Documentation/admin-guide/cgroup-v1/memcg_test.rst | 15 +---
 Documentation/admin-guide/cgroup-v1/memory.rst     |  6 +-
 Documentation/trace/events-kmem.rst                |  2 +-
 Documentation/vm/unevictable-lru.rst               | 22 ++---
 include/linux/memcontrol.h                         | 63 ++++++++++++++
 include/linux/mm_types.h                           |  2 +-
 include/linux/mmzone.h                             |  5 +-
 mm/compaction.c                                    | 59 ++++++++-----
 mm/filemap.c                                       |  4 +-
 mm/huge_memory.c                                   | 18 ++--
 mm/memcontrol.c                                    | 88 +++++++++++++++-----
 mm/mlock.c                                         | 28 +++----
 mm/mmzone.c                                        |  1 +
 mm/page_alloc.c                                    |  1 -
 mm/page_idle.c                                     |  7 +-
 mm/rmap.c                                          |  2 +-
 mm/swap.c                                          | 75 +++++++----------
 mm/vmscan.c                                        | 97 ++++++++++++----------
 18 files changed, 300 insertions(+), 195 deletions(-)

-- 
1.8.3.1