linux-mm.kvack.org archive mirror
From: Chih-En Lin <shiyn.lin@gmail.com>
To: Andrew Morton <akpm@linux-foundation.org>,
	Qi Zheng <zhengqi.arch@bytedance.com>,
	David Hildenbrand <david@redhat.com>,
	Matthew Wilcox <willy@infradead.org>,
	Christophe Leroy <christophe.leroy@csgroup.eu>
Cc: linux-kernel@vger.kernel.org, linux-mm@kvack.org,
	Luis Chamberlain <mcgrof@kernel.org>,
	Kees Cook <keescook@chromium.org>,
	Iurii Zaikin <yzaikin@google.com>,
	Vlastimil Babka <vbabka@suse.cz>,
	William Kucharski <william.kucharski@oracle.com>,
	"Kirill A . Shutemov" <kirill.shutemov@linux.intel.com>,
	Peter Xu <peterx@redhat.com>,
	Suren Baghdasaryan <surenb@google.com>,
	Arnd Bergmann <arnd@arndb.de>,
	Tong Tiangen <tongtiangen@huawei.com>,
	Pasha Tatashin <pasha.tatashin@soleen.com>,
	Li kunyu <kunyu@nfschina.com>, Nadav Amit <namit@vmware.com>,
	Anshuman Khandual <anshuman.khandual@arm.com>,
	Minchan Kim <minchan@kernel.org>, Yang Shi <shy828301@gmail.com>,
	Song Liu <song@kernel.org>, Miaohe Lin <linmiaohe@huawei.com>,
	Thomas Gleixner <tglx@linutronix.de>,
	Sebastian Andrzej Siewior <bigeasy@linutronix.de>,
	Andy Lutomirski <luto@kernel.org>,
	Fenghua Yu <fenghua.yu@intel.com>,
	Dinglan Peng <peng301@purdue.edu>,
	Pedro Fonseca <pfonseca@purdue.edu>,
	Jim Huang <jserv@ccns.ncku.edu.tw>,
	Huichun Feng <foxhoundsk.tw@gmail.com>,
	Chih-En Lin <shiyn.lin@gmail.com>
Subject: [RFC PATCH v2 5/9] mm, pgtable: Add a refcount to PTE table
Date: Wed, 28 Sep 2022 00:29:53 +0800
Message-ID: <20220927162957.270460-6-shiyn.lin@gmail.com>
In-Reply-To: <20220927162957.270460-1-shiyn.lin@gmail.com>

Reuse the _refcount field of struct page for PTE table pages to track
how many processes reference a COWed PTE table. Before decrementing the
refcount, check whether it is one; if so, the caller holds the last
reference and can reuse the shared PTE table instead of dropping it.

Signed-off-by: Chih-En Lin <shiyn.lin@gmail.com>
---
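Note for readers: the get/put rule described in the commit message can
be modelled in userspace. The sketch below is only an illustration
under stated assumptions, not the kernel code. It uses C11 atomics in
place of the page_ref helpers, and struct table_model, model_get(),
model_add_unless() and model_put() are hypothetical names that do not
exist in the kernel. The fallback itself is not modelled; model_put()
only reports when it would run.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for the struct page that backs a PTE table. */
struct table_model {
	atomic_int refcount;
};

/* Take a reference and return the new count. */
static int model_get(struct table_model *t)
{
	return atomic_fetch_add(&t->refcount, 1) + 1;
}

/*
 * Add nr to the count unless the count currently equals u.
 * Returns true if the add happened, false if the count was u.
 */
static bool model_add_unless(struct table_model *t, int nr, int u)
{
	int old = atomic_load(&t->refcount);

	while (old != u) {
		/* On failure, old is refreshed with the current value. */
		if (atomic_compare_exchange_weak(&t->refcount, &old, old + nr))
			return true;
	}
	return false;
}

/*
 * Drop a reference unless it is the last one. Returns 1 when the count
 * was 1 and reuse is true (the point where the real code would fall
 * back and reuse the table), otherwise 0.
 */
static int model_put(struct table_model *t, bool reuse)
{
	assert(atomic_load(&t->refcount) >= 1);

	if (!model_add_unless(t, -1, 1) && reuse)
		return 1;
	return 0;
}
```

In this model a table starts at count 1, each additional sharer calls
model_get(), and the count never drops below 1: the last holder keeps
the table rather than releasing it.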
 include/linux/mm.h      |  1 +
 include/linux/pgtable.h | 28 ++++++++++++++++++++++++++++
 mm/memory.c             |  1 +
 3 files changed, 30 insertions(+)

diff --git a/include/linux/mm.h b/include/linux/mm.h
index 965523dcca3b8..bfe6a8c7ab9ed 100644
--- a/include/linux/mm.h
+++ b/include/linux/mm.h
@@ -2290,6 +2290,7 @@ static inline bool pgtable_pte_page_ctor(struct page *page)
 	__SetPageTable(page);
 	inc_lruvec_page_state(page, NR_PAGETABLE);
 	page->cow_pte_owner = NULL;
+	set_page_count(page, 1);
 	return true;
 }
 
diff --git a/include/linux/pgtable.h b/include/linux/pgtable.h
index 25c1e5c42fdf3..8b497d7d800ed 100644
--- a/include/linux/pgtable.h
+++ b/include/linux/pgtable.h
@@ -9,6 +9,7 @@
 #ifdef CONFIG_MMU
 
 #include <linux/mm_types.h>
+#include <linux/page_ref.h>
 #include <linux/bug.h>
 #include <linux/errno.h>
 #include <asm-generic/pgtable_uffd.h>
@@ -628,6 +629,33 @@ static inline bool cow_pte_owner_is_same(pmd_t *pmd, pmd_t *owner)
 	return smp_load_acquire(&pmd_page(*pmd)->cow_pte_owner) == owner;
 }
 
+static inline int pmd_get_pte(pmd_t *pmd)
+{
+	return page_ref_inc_return(pmd_page(*pmd));
+}
+
+/*
+ * If the COW PTE refcount is 1, do not decrease the counter. Instead,
+ * clear the write protection of the corresponding PMD entry and reset
+ * the COW PTE owner so that the table can be reused.
+ * If the reuse parameter is false, do nothing. This lets callers skip
+ * the fallback for a PTE table that has already been handled.
+ */
+static inline int pmd_put_pte(struct vm_area_struct *vma, pmd_t *pmd,
+			      unsigned long addr, bool reuse)
+{
+	if (!page_ref_add_unless(pmd_page(*pmd), -1, 1) && reuse) {
+		cow_pte_fallback(vma, pmd, addr);
+		return 1;
+	}
+	return 0;
+}
+
+static inline int cow_pte_count(pmd_t *pmd)
+{
+	return page_count(pmd_page(*pmd));
+}
+
 #ifndef pte_access_permitted
 #define pte_access_permitted(pte, write) \
 	(pte_present(pte) && (!(write) || pte_write(pte)))
diff --git a/mm/memory.c b/mm/memory.c
index d29f84801f3cd..3e66e229f4169 100644
--- a/mm/memory.c
+++ b/mm/memory.c
@@ -2875,6 +2875,7 @@ void cow_pte_fallback(struct vm_area_struct *vma, pmd_t *pmd,
 	pmd_t new;
 
 	VM_WARN_ON(pmd_write(*pmd));
+	VM_WARN_ON(cow_pte_count(pmd) != 1);
 
 	start = addr & PMD_MASK;
 	end = (addr + PMD_SIZE) & PMD_MASK;
-- 
2.37.3



