From: Alex Shi <alex.shi@linux.alibaba.com>
To: cgroups@vger.kernel.org, linux-kernel@vger.kernel.org,
linux-mm@kvack.org, akpm@linux-foundation.org,
mgorman@techsingularity.net, tj@kernel.org, hughd@google.com,
khlebnikov@yandex-team.ru, daniel.m.jordan@oracle.com,
yang.shi@linux.alibaba.com, willy@infradead.org,
shakeelb@google.com, hannes@cmpxchg.org
Subject: [PATCH v7 00/10] per lruvec lru_lock for memcg
Date: Wed, 25 Dec 2019 17:04:16 +0800
Message-ID: <1577264666-246071-1-git-send-email-alex.shi@linux.alibaba.com>
Hi all,
Merry Christmas! :)
This patchset moves lru_lock into lruvec, giving each lruvec its own
lru_lock, and thus one lru_lock per memcg per node.
We introduce the function lock_page_lruvec, which locks the page's
memcg and then the memcg's lruvec->lru_lock (thanks to Johannes Weiner,
Hugh Dickins and Konstantin Khlebnikov for the suggestions/reminders),
to replace the old pgdat->lru_lock.
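To make the locking granularity concrete, here is a minimal userspace sketch of the idea, not the code from the patches. It assumes pthread mutexes in place of kernel spinlocks, a hypothetical MAX_NODES constant, and simplified struct layouts. The model_ prefixed names are invented for illustration, and the step of locking the page's memcg first is omitted.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

#define MAX_NODES 2	/* hypothetical node count, for the model only */

struct lruvec {
	pthread_mutex_t lru_lock;	/* stands in for lruvec->lru_lock */
	unsigned long nr_pages;
};

struct mem_cgroup {
	struct lruvec lruvecs[MAX_NODES];	/* one lruvec per node */
};

struct page {
	struct mem_cgroup *mem_cgroup;	/* owning memcg */
	int nid;			/* node the page lives on */
};

/* Initialise every per-node lruvec, each with its own lock. */
static void model_memcg_init(struct mem_cgroup *memcg)
{
	for (int nid = 0; nid < MAX_NODES; nid++) {
		pthread_mutex_init(&memcg->lruvecs[nid].lru_lock, NULL);
		memcg->lruvecs[nid].nr_pages = 0;
	}
}

/*
 * Look up the lruvec for (page's memcg, page's node) and take only that
 * lruvec's lock, instead of a single node-wide lock.
 */
static struct lruvec *model_lock_page_lruvec(struct page *page)
{
	struct lruvec *lruvec;

	assert(page->nid >= 0 && page->nid < MAX_NODES);
	lruvec = &page->mem_cgroup->lruvecs[page->nid];
	pthread_mutex_lock(&lruvec->lru_lock);
	return lruvec;
}

static void model_unlock_lruvec(struct lruvec *lruvec)
{
	pthread_mutex_unlock(&lruvec->lru_lock);
}
```

In this model, two pages on the same node but in different memcgs take different locks, which is where the reduced contention with many containers would come from.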
Following Daniel Jordan's suggestion, I ran 208 'dd' tasks in 104
containers on a 2s * 26cores * HT box with a modified case:
https://git.kernel.org/pub/scm/linux/kernel/git/wfg/vm-scalability.git/tree/case-lru-file-readtwice
With this patchset, readtwice performance increased by about 80%
with containers, and there is no performance drop without containers.
Another way to guard move_account is to use lru_lock instead of move_lock.
Consider the memcg move task path:
   mem_cgroup_move_task:
     mem_cgroup_move_charge:
	lru_add_drain_all();
	atomic_inc(&mc.from->moving_account); //ask lruvec's move_lock
	synchronize_rcu();
	walk_page_range: do charge_walk_ops(mem_cgroup_move_charge_pte_range):
	   isolate_lru_page();
	   mem_cgroup_move_account(page,)
		spin_lock(&from->move_lock)
		page->mem_cgroup = to;
		spin_unlock(&from->move_lock)
	   putback_lru_page(page)
Guarding 'page->mem_cgroup = to' with to_vec->lru_lock has a similar effect
to move_lock, so in terms of performance both solutions are the same.
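The ordering in the path above (isolate, reassign the owner under the destination's lru_lock, put back) can be sketched as a single-threaded userspace model. This is an illustration under assumptions, not the kernel implementation: pthread mutexes replace spinlocks, each memcg has a single lruvec, and the model_ functions and the on_lru flag are invented names. It does not show how a concurrent locker would revalidate the page's memcg.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>
#include <stddef.h>

struct lruvec {
	pthread_mutex_t lru_lock;	/* stands in for lruvec->lru_lock */
	unsigned long nr_pages;
};

struct mem_cgroup {
	struct lruvec lruvec;		/* single node, for brevity */
};

struct page {
	struct mem_cgroup *mem_cgroup;
	bool on_lru;			/* hypothetical flag for the model */
};

static void model_memcg_init(struct mem_cgroup *memcg)
{
	pthread_mutex_init(&memcg->lruvec.lru_lock, NULL);
	memcg->lruvec.nr_pages = 0;
}

/* Take the page off its current LRU under that lruvec's lock. */
static bool model_isolate_lru_page(struct page *page)
{
	struct lruvec *lruvec = &page->mem_cgroup->lruvec;
	bool isolated = false;

	pthread_mutex_lock(&lruvec->lru_lock);
	if (page->on_lru) {
		page->on_lru = false;
		lruvec->nr_pages--;
		isolated = true;
	}
	pthread_mutex_unlock(&lruvec->lru_lock);
	return isolated;
}

/* Reassign an isolated page; the store is guarded by to's lru_lock. */
static void model_move_account(struct page *page, struct mem_cgroup *to)
{
	assert(!page->on_lru);	/* must already be isolated */
	pthread_mutex_lock(&to->lruvec.lru_lock);
	page->mem_cgroup = to;
	pthread_mutex_unlock(&to->lruvec.lru_lock);
}

/* Put the page on the LRU of whichever memcg owns it now. */
static void model_putback_lru_page(struct page *page)
{
	struct lruvec *lruvec = &page->mem_cgroup->lruvec;

	pthread_mutex_lock(&lruvec->lru_lock);
	page->on_lru = true;
	lruvec->nr_pages++;
	pthread_mutex_unlock(&lruvec->lru_lock);
}
```

Because the page is off every LRU list while its owner changes, the put-back step naturally lands it on the destination memcg's list under the destination's lock.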
Thanks to Hugh Dickins and Konstantin Khlebnikov, who both proposed the same
idea 7 years ago.
Thanks for all the comments from Hugh Dickins, Konstantin Khlebnikov, Daniel Jordan,
Johannes Weiner, Mel Gorman, Shakeel Butt, Rong Chen, Fengguang Wu, Yun Wang etc.,
and for the testing support from Intel 0day!
v7,
a, rebase on v5.5-rc3,
b, move the lock_page_lru() cleanup before the lock replacement.
v6,
a, rebase on v5.5-rc2, and do retesting.
b, pick up the changes from Johannes' comments and a lock_page_lru cleanup.
v5,
a, lock the page's memcg, according to Johannes Weiner's suggestion
b, use a macro for non-memcg, according to Johannes' and Matthew's suggestions.
v4:
a, fix the page->mem_cgroup dereferencing issue, thanks Johannes Weiner
b, remove the irqsave flags changes, thanks Matthew Wilcox
c, merge/split patches for better understanding and bisection purposes
v3: rebase on linux-next, and fold the relock fix patch into the introducing patch
v2: bypass a performance regression bug and fix some function issues
v1: initial version; aim testing shows a 5% performance increase on a 16-thread box.
Alex Shi (9):
  mm/vmscan: remove unnecessary lruvec adding
  mm/memcg: fold lru_lock in lock_page_lru
  mm/lru: replace pgdat lru_lock with lruvec lock
  mm/lru: introduce the relock_page_lruvec function
  mm/mlock: optimize munlock_pagevec by relocking
  mm/swap: only change the lru_lock iff page's lruvec is different
  mm/pgdat: remove pgdat lru_lock
  mm/lru: add debug checking for page memcg moving
  mm/memcg: add debug checking in lock_page_memcg
Hugh Dickins (1):
  mm/lru: revise the comments of lru_lock
 Documentation/admin-guide/cgroup-v1/memcg_test.rst | 15 +---
 Documentation/admin-guide/cgroup-v1/memory.rst     |  6 +-
 Documentation/trace/events-kmem.rst                |  2 +-
 Documentation/vm/unevictable-lru.rst               | 22 ++---
 include/linux/memcontrol.h                         | 63 ++++++++++++++
 include/linux/mm_types.h                           |  2 +-
 include/linux/mmzone.h                             |  5 +-
 mm/compaction.c                                    | 59 ++++++++-----
 mm/filemap.c                                       |  4 +-
 mm/huge_memory.c                                   | 18 ++--
 mm/memcontrol.c                                    | 84 +++++++++++++++----
 mm/mlock.c                                         | 28 +++----
 mm/mmzone.c                                        |  1 +
 mm/page_alloc.c                                    |  1 -
 mm/page_idle.c                                     |  7 +-
 mm/rmap.c                                          |  2 +-
 mm/swap.c                                          | 75 +++++++----------
 mm/vmscan.c                                        | 98 ++++++++++++----------
 18 files changed, 297 insertions(+), 195 deletions(-)
--
1.8.3.1