From: Liang Zhang <zhangliang5@huawei.com>
To: <akpm@linux-foundation.org>
CC: <linux-mm@kvack.org>, <linux-kernel@vger.kernel.org>,
	<wangzhigang17@huawei.com>, <zhangliang5@huawei.com>
Subject: [PATCH] mm: reuse the unshared swapcache page in do_wp_page
Date: Thu, 13 Jan 2022 22:03:18 +0800
Message-ID: <20220113140318.11117-1-zhangliang5@huawei.com>
X-Mailer: git-send-email 2.30.0
MIME-Version: 1.0
Content-Type: text/plain
List-ID: <linux-mm.kvack.org>

In the current implementation, a process's read requests fault in pages
with write-protected (WP) PTEs. If the process then issues a write
request, it goes into do_wp_page() and copies the data from the old page
to a newly allocated one because the refcount is > 1 (one reference from
the page table mapping and one from the swapcache), which can degrade
performance. In fact, the page is exclusively owned by this process, so
copying it to a newly allocated page is unnecessary.

So in this situation, let the owning process reuse these unshared pages.

Signed-off-by: Liang Zhang <zhangliang5@huawei.com>
---
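To illustrate the decision described in the changelog, here is a minimal
user-space sketch. It is not kernel code: struct toy_page, its fields and
the toy_* functions are hypothetical names invented for this example,
toy_reuse_swap_page() is only a simplified stand-in for reuse_swap_page(),
and page locking is left out entirely.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the page state that the write-fault check reads. */
struct toy_page {
	bool ksm;       /* models PageKsm() */
	bool swapcache; /* models PageSwapCache() */
	int refcount;   /* models page_count() */
	int mapcount;   /* models page_mapcount() */
	int swap_refs;  /* assumption: swap references still held by others */
};

enum toy_action { TOY_COPY, TOY_REUSE };

/*
 * Simplified stand-in for reuse_swap_page(): the page counts as unshared
 * when at most one mapping or swap reference to it exists in total.
 */
static bool toy_reuse_swap_page(const struct toy_page *p)
{
	return p->mapcount + p->swap_refs <= 1;
}

/* Before the patch: any reference beyond the single mapping forces a copy. */
static enum toy_action toy_wp_before(const struct toy_page *p)
{
	assert(p != NULL);
	if (p->ksm || p->refcount != 1 || p->mapcount != 1)
		return TOY_COPY;
	return TOY_REUSE;
}

/* With the patch: an unshared swapcache page is reused despite its extra reference. */
static enum toy_action toy_wp_after(const struct toy_page *p)
{
	assert(p != NULL);
	if (p->ksm)
		return TOY_COPY;
	if (p->swapcache && toy_reuse_swap_page(p))
		return TOY_REUSE;
	if (p->mapcount != 1 || p->refcount != 1)
		return TOY_COPY;
	return TOY_REUSE;
}
```

For a page with swapcache set, refcount 2, mapcount 1 and swap_refs 0,
toy_wp_before() returns TOY_COPY while toy_wp_after() returns TOY_REUSE,
which is the case this patch targets.
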
This patch has been tested with a redis benchmark. Here are the test
results.

Hardware
========
Memory (GB): 512
CPU (total #): 88
NVMe SSD (GB): 1024

OS
==
kernel 5.10.0

Testcase
========
step 1:
  Run 16 VMs (4U8G), each running redis-server, in a cgroup
  limiting memory.limit_in_bytes to 100G.
step 2:
  Run memtier_benchmark on the host with params "--threads=1 --clients=1 \
--pipeline=256 --data-size=2048 --requests=allkeys --key-minimum=1 \
--key-maximum=30000000 --key-prefix=memtier-benchmark-prefix-redistests"
  to test every VM concurrently.

Working set size
================
cat memory.memsw.usage_in_bytes
125403303936

Result
======
Compared with the baseline, this patch achieves over 41% more Ops/sec,
Hits/sec, Misses/sec and KB/sec, and about 30% lower latency.

  Index(average)        Baseline kernel        Patched kernel
  Ops/sec               109497                 155428
  Hits/sec              8653                   12283
  Misses/sec            90889                  129014
  Latency               2.297                  1.603
  KB/sec                44569                  63186
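
The percentages quoted above can be re-derived from the table. The sketch
below is only a convenience for checking that arithmetic; struct bench_row,
rows, percent_change() and print_changes() are hypothetical names chosen
for this example, and the figures are copied from the table as posted.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* One row of the results table: metric name, baseline value, patched value. */
struct bench_row {
	const char *name;
	double baseline;
	double patched;
};

/* Averages copied verbatim from the table in this mail. */
static const struct bench_row rows[] = {
	{ "Ops/sec",    109497.0, 155428.0 },
	{ "Hits/sec",     8653.0,  12283.0 },
	{ "Misses/sec",  90889.0, 129014.0 },
	{ "Latency",        2.297,    1.603 },
	{ "KB/sec",      44569.0,  63186.0 },
};

/* Relative change of the patched value over the baseline, in percent. */
static double percent_change(double baseline, double patched)
{
	assert(baseline != 0.0);
	return (patched - baseline) / baseline * 100.0;
}

/* Print one line per metric with its signed relative change. */
static void print_changes(FILE *out)
{
	for (size_t i = 0; i < sizeof(rows) / sizeof(rows[0]); i++)
		fprintf(out, "%-12s %+6.1f%%\n", rows[i].name,
			percent_change(rows[i].baseline, rows[i].patched));
}
```

Calling print_changes(stdout) gives a gain of a little under 42% for each
of the four throughput rows and a drop of about 30% for latency.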


 mm/memory.c | 9 ++++++++-
 1 file changed, 8 insertions(+), 1 deletion(-)

diff --git a/mm/memory.c b/mm/memory.c
index 23f2f1300d42..fd4d868b1c2d 100644
--- a/mm/memory.c
+++ b/mm/memory.c
@@ -3291,10 +3291,16 @@ static vm_fault_t do_wp_page(struct vm_fault *vmf)
 		struct page *page = vmf->page;
 
 		/* PageKsm() doesn't necessarily raise the page refcount */
-		if (PageKsm(page) || page_count(page) != 1)
+		if (PageKsm(page))
 			goto copy;
 		if (!trylock_page(page))
 			goto copy;
+
+		/* reuse the unshared swapcache page */
+		if (PageSwapCache(page) && reuse_swap_page(page, NULL)) {
+			goto reuse;
+		}
+
 		if (PageKsm(page) || page_mapcount(page) != 1 || page_count(page) != 1) {
 			unlock_page(page);
 			goto copy;
@@ -3304,6 +3310,7 @@ static vm_fault_t do_wp_page(struct vm_fault *vmf)
 		 * page count reference, and the page is locked,
 		 * it's dark out, and we're wearing sunglasses. Hit it.
 		 */
+reuse:
 		unlock_page(page);
 		wp_page_reuse(vmf);
 		return VM_FAULT_WRITE;
-- 
2.30.0