From mboxrd@z Thu Jan  1 00:00:00 1970
Date: Thu, 5 Jan 2023 10:18:28 +0000
In-Reply-To: <20230105101844.1893104-1-jthoughton@google.com>
Mime-Version: 1.0
References: <20230105101844.1893104-1-jthoughton@google.com>
X-Mailer: git-send-email 2.39.0.314.g84b9a713c41-goog
Message-ID: <20230105101844.1893104-31-jthoughton@google.com>
Subject: [PATCH 30/46] hugetlb: add high-granularity migration support
From: James Houghton <jthoughton@google.com>
To: Mike Kravetz, Muchun Song, Peter Xu
Cc: David Hildenbrand, David Rientjes, Axel Rasmussen, Mina Almasry,
    "Zach O'Keefe", Manish Mishra, Naoya Horiguchi,
    "Dr. David Alan Gilbert", "Matthew Wilcox (Oracle)", Vlastimil Babka,
    Baolin Wang, Miaohe Lin, Yang Shi, Andrew Morton,
    linux-mm@kvack.org, linux-kernel@vger.kernel.org, James Houghton
Content-Type: text/plain; charset="UTF-8"

To prevent queueing a hugepage for migration
multiple times, we use last_page to keep track of the last page we saw in
queue_pages_hugetlb, and if the page we're looking at is last_page, then
we skip it.

For the non-hugetlb cases, last_page, although unused, is still updated
so that it has a consistent meaning with the hugetlb case.

This commit adds a check in hugetlb_fault for high-granularity migration
PTEs.

Signed-off-by: James Houghton <jthoughton@google.com>
---
 include/linux/swapops.h |  8 ++++++--
 mm/hugetlb.c            |  2 +-
 mm/mempolicy.c          | 24 +++++++++++++++++++-----
 mm/migrate.c            | 18 ++++++++++--------
 4 files changed, 36 insertions(+), 16 deletions(-)

diff --git a/include/linux/swapops.h b/include/linux/swapops.h
index 3a451b7afcb3..6ef80763e629 100644
--- a/include/linux/swapops.h
+++ b/include/linux/swapops.h
@@ -68,6 +68,8 @@
 
 static inline bool is_pfn_swap_entry(swp_entry_t entry);
 
+struct hugetlb_pte;
+
 /* Clear all flags but only keep swp_entry_t related information */
 static inline pte_t pte_swp_clear_flags(pte_t pte)
 {
@@ -339,7 +341,8 @@ extern void migration_entry_wait(struct mm_struct *mm, pmd_t *pmd,
 #ifdef CONFIG_HUGETLB_PAGE
 extern void __migration_entry_wait_huge(struct vm_area_struct *vma,
 					pte_t *ptep, spinlock_t *ptl);
-extern void migration_entry_wait_huge(struct vm_area_struct *vma, pte_t *pte);
+extern void migration_entry_wait_huge(struct vm_area_struct *vma,
+				      struct hugetlb_pte *hpte);
 #endif /* CONFIG_HUGETLB_PAGE */
 #else  /* CONFIG_MIGRATION */
 static inline swp_entry_t make_readable_migration_entry(pgoff_t offset)
@@ -369,7 +372,8 @@ static inline void migration_entry_wait(struct mm_struct *mm, pmd_t *pmd,
 #ifdef CONFIG_HUGETLB_PAGE
 static inline void __migration_entry_wait_huge(struct vm_area_struct *vma,
 					       pte_t *ptep, spinlock_t *ptl) { }
-static inline void migration_entry_wait_huge(struct vm_area_struct *vma, pte_t *pte) { }
+static inline void migration_entry_wait_huge(struct vm_area_struct *vma,
+					     struct hugetlb_pte *hpte) { }
 #endif /* CONFIG_HUGETLB_PAGE */
 static inline int is_writable_migration_entry(swp_entry_t entry)
 {
diff --git a/mm/hugetlb.c b/mm/hugetlb.c
index 8e690a22456a..2fb95ecafc63 100644
--- a/mm/hugetlb.c
+++ b/mm/hugetlb.c
@@ -6269,7 +6269,7 @@ vm_fault_t hugetlb_fault(struct mm_struct *mm, struct vm_area_struct *vma,
 		 * be released there.
 		 */
 		mutex_unlock(&hugetlb_fault_mutex_table[hash]);
-		migration_entry_wait_huge(vma, hpte.ptep);
+		migration_entry_wait_huge(vma, &hpte);
 		return 0;
 	} else if (unlikely(is_hugetlb_entry_hwpoisoned(entry)))
 		ret = VM_FAULT_HWPOISON_LARGE |
diff --git a/mm/mempolicy.c b/mm/mempolicy.c
index e5859ed34e90..6c4c3c923fa2 100644
--- a/mm/mempolicy.c
+++ b/mm/mempolicy.c
@@ -424,6 +424,7 @@ struct queue_pages {
 	unsigned long start;
 	unsigned long end;
 	struct vm_area_struct *first;
+	struct page *last_page;
 };
 
 /*
@@ -475,6 +476,7 @@ static int queue_pages_pmd(pmd_t *pmd, spinlock_t *ptl, unsigned long addr,
 	flags = qp->flags;
 	/* go to thp migration */
 	if (flags & (MPOL_MF_MOVE | MPOL_MF_MOVE_ALL)) {
+		qp->last_page = page;
 		if (!vma_migratable(walk->vma) ||
 		    migrate_page_add(page, qp->pagelist, flags)) {
 			ret = 1;
@@ -532,6 +534,7 @@ static int queue_pages_pte_range(pmd_t *pmd, unsigned long addr,
 			continue;
 		if (!queue_pages_required(page, qp))
 			continue;
+
 		if (flags & (MPOL_MF_MOVE | MPOL_MF_MOVE_ALL)) {
 			/* MPOL_MF_STRICT must be specified if we get here */
 			if (!vma_migratable(vma)) {
@@ -539,6 +542,8 @@ static int queue_pages_pte_range(pmd_t *pmd, unsigned long addr,
 				break;
 			}
 
+			qp->last_page = page;
+
 			/*
 			 * Do not abort immediately since there may be
 			 * temporary off LRU pages in the range.  Still
@@ -570,15 +575,22 @@ static int queue_pages_hugetlb(struct hugetlb_pte *hpte,
 	spinlock_t *ptl;
 	pte_t entry;
 
-	/* We don't migrate high-granularity HugeTLB mappings for now. */
-	if (hugetlb_hgm_enabled(walk->vma))
-		return -EINVAL;
-
 	ptl = hugetlb_pte_lock(hpte);
 	entry = huge_ptep_get(hpte->ptep);
 	if (!pte_present(entry))
 		goto unlock;
-	page = pte_page(entry);
+
+	if (!hugetlb_pte_present_leaf(hpte, entry)) {
+		ret = -EAGAIN;
+		goto unlock;
+	}
+
+	page = compound_head(pte_page(entry));
+
+	/* We already queued this page with another high-granularity PTE. */
+	if (page == qp->last_page)
+		goto unlock;
+
 	if (!queue_pages_required(page, qp))
 		goto unlock;
@@ -605,6 +617,7 @@ static int queue_pages_hugetlb(struct hugetlb_pte *hpte,
 	/* With MPOL_MF_MOVE, we migrate only unshared hugepage. */
 	if (flags & (MPOL_MF_MOVE_ALL) ||
 	    (flags & MPOL_MF_MOVE && page_mapcount(page) == 1)) {
+		qp->last_page = page;
 		if (isolate_hugetlb(page, qp->pagelist) &&
 		    (flags & MPOL_MF_STRICT))
 			/*
@@ -739,6 +752,7 @@ queue_pages_range(struct mm_struct *mm, unsigned long start, unsigned long end,
 		.start = start,
 		.end = end,
 		.first = NULL,
+		.last_page = NULL,
 	};
 
 	err = walk_page_range(mm, start, end, &queue_pages_walk_ops, &qp);
diff --git a/mm/migrate.c b/mm/migrate.c
index 0062689f4878..c30647b75459 100644
--- a/mm/migrate.c
+++ b/mm/migrate.c
@@ -195,6 +195,9 @@ static bool remove_migration_pte(struct folio *folio,
 		/* pgoff is invalid for ksm pages, but they are never large */
 		if (folio_test_large(folio) && !folio_test_hugetlb(folio))
 			idx = linear_page_index(vma, pvmw.address) - pvmw.pgoff;
+		else if (folio_test_hugetlb(folio))
+			idx = (pvmw.address & ~huge_page_mask(hstate_vma(vma)))/
+				PAGE_SIZE;
 		new = folio_page(folio, idx);
 
 #ifdef CONFIG_ARCH_ENABLE_THP_MIGRATION
@@ -244,14 +247,15 @@ static bool remove_migration_pte(struct folio *folio,
 
 #ifdef CONFIG_HUGETLB_PAGE
 		if (folio_test_hugetlb(folio)) {
+			struct page *hpage = folio_page(folio, 0);
 			unsigned int shift = pvmw.pte_order + PAGE_SHIFT;
 
 			pte = arch_make_huge_pte(pte, shift, vma->vm_flags);
 			if (folio_test_anon(folio))
-				hugepage_add_anon_rmap(new, vma, pvmw.address,
+				hugepage_add_anon_rmap(hpage, vma, pvmw.address,
 						       rmap_flags);
 			else
-				page_dup_file_rmap(new, true);
+				page_dup_file_rmap(hpage, true);
 			set_huge_pte_at(vma->vm_mm, pvmw.address, pvmw.pte, pte);
 		} else
 #endif
@@ -267,7 +271,7 @@ static bool remove_migration_pte(struct folio *folio,
 		mlock_page_drain_local();
 
 		trace_remove_migration_pte(pvmw.address, pte_val(pte),
-					   compound_order(new));
+					   pvmw.pte_order);
 
 		/* No need to invalidate - it was non-present before */
 		update_mmu_cache(vma, pvmw.address, pvmw.pte);
@@ -358,12 +362,10 @@ void __migration_entry_wait_huge(struct vm_area_struct *vma,
 	}
 }
 
-void migration_entry_wait_huge(struct vm_area_struct *vma, pte_t *pte)
+void migration_entry_wait_huge(struct vm_area_struct *vma,
+			       struct hugetlb_pte *hpte)
 {
-	spinlock_t *ptl = huge_pte_lockptr(huge_page_shift(hstate_vma(vma)),
-					   vma->vm_mm, pte);
-
-	__migration_entry_wait_huge(vma, pte, ptl);
+	__migration_entry_wait_huge(vma, hpte->ptep, hpte->ptl);
 }
 #endif
-- 
2.39.0.314.g84b9a713c41-goog