From: "Kirill A. Shutemov" <kirill@shutemov.name>
To: akpm@linux-foundation.org, Andrea Arcangeli <aarcange@redhat.com>
Cc: linux-mm@kvack.org, linux-kernel@vger.kernel.org,
	"Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>
Subject: [PATCH 4/7] khugepaged: Allow to collapse a page shared across fork
Date: Fri, 27 Mar 2020 20:05:58 +0300
Message-ID: <20200327170601.18563-5-kirill.shutemov@linux.intel.com>
In-Reply-To: <20200327170601.18563-1-kirill.shutemov@linux.intel.com>

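A page can be included in the collapse as long as it has no extra pins
(from GUP or otherwise), even if it is shared across fork. Detect extra
pins by comparing page_count() against total_mapcount() plus the swap
cache reference, instead of requiring exactly one reference from the
scanned process.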

Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
---
 mm/khugepaged.c | 28 ++++++++++++++++------------
 1 file changed, 16 insertions(+), 12 deletions(-)
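
Note (not part of the commit): the difference between the old and the
new pin check can be sketched with a small userspace model. This is only
an illustration under stated assumptions, not kernel code: struct
model_page and all model_* / *_check_rejects names are hypothetical, and
the model assumes each mapping and the swap cache hold exactly one
reference, with everything beyond that counted as an extra pin.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the three values the check reads. */
struct model_page {
	int refcount;	/* models page_count(page) */
	int mapcount;	/* models total_mapcount(page) */
	bool swapcache;	/* models PageSwapCache(page) */
};

/*
 * Build a model page. Assumption of this sketch: one reference per
 * mapping, one for the swap cache, plus any extra (GUP-like) pins.
 */
static inline struct model_page model_page_make(int mappings, bool swapcache,
						int extra_pins)
{
	struct model_page p;

	assert(mappings >= 0 && extra_pins >= 0);
	p.mapcount = mappings;
	p.swapcache = swapcache;
	p.refcount = mappings + (swapcache ? 1 : 0) + extra_pins;
	return p;
}

/* Old check: only one mapping (plus swap cache) may hold a reference. */
static inline bool old_check_rejects(const struct model_page *p)
{
	return p->refcount != 1 + (p->swapcache ? 1 : 0);
}

/* New check: every reference must be explained by a mapping or swap cache. */
static inline bool new_check_rejects(const struct model_page *p)
{
	return p->mapcount + (p->swapcache ? 1 : 0) != p->refcount;
}
```

In this model, model_page_make(2, false, 0) describes a page mapped by
both parent and child after fork with no pins: the old check rejects it
and the new check accepts it. Adding one extra pin makes the new check
reject it as well.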

diff --git a/mm/khugepaged.c b/mm/khugepaged.c
index 39e0994abeb8..b47edfe57f7b 100644
--- a/mm/khugepaged.c
+++ b/mm/khugepaged.c
@@ -581,18 +581,26 @@ static int __collapse_huge_page_isolate(struct vm_area_struct *vma,
 		}
 
 		/*
-		 * cannot use mapcount: can't collapse if there's a gup pin.
-		 * The page must only be referenced by the scanned process
-		 * and page swap cache.
+		 * Check if the page has any GUP (or other external) pins.
+		 *
+		 * The page table that maps the page has already been unlinked
+		 * from the page table tree, so this process cannot get an
+		 * additional pin on the page.
+		 *
+		 * New pins can come later if the page is shared across fork,
+		 * but not from this process. That is fine: the other process
+		 * cannot write to the page, only trigger CoW.
 		 */
-		if (page_count(page) != 1 + PageSwapCache(page)) {
+		if (total_mapcount(page) + PageSwapCache(page) !=
+				page_count(page)) {
 			/*
 			 * Drain pagevec and retry just in case we can get rid
 			 * of the extra pin, like in swapin case.
 			 */
 			lru_add_drain();
 		}
-		if (page_count(page) != 1 + PageSwapCache(page)) {
+		if (total_mapcount(page) + PageSwapCache(page) !=
+				page_count(page)) {
 			unlock_page(page);
 			result = SCAN_PAGE_COUNT;
 			goto out;
@@ -680,7 +688,6 @@ static void __collapse_huge_page_copy(pte_t *pte, struct page *page,
 		} else {
 			src_page = pte_page(pteval);
 			copy_user_highpage(page, src_page, address, vma);
-			VM_BUG_ON_PAGE(page_mapcount(src_page) != 1, src_page);
 			release_pte_page(src_page);
 			/*
 			 * ptl mostly unnecessary, but preempt has to
@@ -1209,12 +1216,9 @@ static int khugepaged_scan_pmd(struct mm_struct *mm,
 			goto out_unmap;
 		}
 
-		/*
-		 * cannot use mapcount: can't collapse if there's a gup pin.
-		 * The page must only be referenced by the scanned process
-		 * and page swap cache.
-		 */
-		if (page_count(page) != 1 + PageSwapCache(page)) {
+		/* Check if the page has any GUP (or other external) pins */
+		if (total_mapcount(page) + PageSwapCache(page) !=
+				page_count(page)) {
 			result = SCAN_PAGE_COUNT;
 			goto out_unmap;
 		}
-- 
2.26.0


