From: Miaohe Lin <linmiaohe@huawei.com>
To: <akpm@linux-foundation.org>, <mike.kravetz@oracle.com>,
<songmuchun@bytedance.com>
Cc: <lukas.bulwahn@gmail.com>, <linux-mm@kvack.org>,
<linux-kernel@vger.kernel.org>, <linmiaohe@huawei.com>
Subject: [PATCH v2 6/6] mm/hugetlb: make detecting shared pte more reliable
Date: Tue, 23 Aug 2022 11:02:09 +0800
Message-ID: <20220823030209.57434-7-linmiaohe@huawei.com>
In-Reply-To: <20220823030209.57434-1-linmiaohe@huawei.com>
If the pagetables are shared, we shouldn't copy or take references. Since
src could have unshared while dst shares with another vma, huge_pte_none()
is used to determine whether dst_pte is shared. But this check isn't
reliable: a shared pte can in fact still be none in the pagetable. Check
the page count of the ptep page instead to reliably determine whether the
pte is shared.
[Thanks to Lukas for cleaning up the unused local variable dst_entry.]
Signed-off-by: Miaohe Lin <linmiaohe@huawei.com>
Reviewed-by: Mike Kravetz <mike.kravetz@oracle.com>
Cc: Lukas Bulwahn <lukas.bulwahn@gmail.com>
---
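For readers who want to see why the old check can miss a shared pte, here
is a minimal user-space sketch. It is not kernel code: struct model_pt_page,
old_check_shared(), new_check_shared() and shared_pte_demo() are hypothetical
names, the refcount field merely stands in for page_count() of the ptep page,
and a zero entry stands in for a none pte.

```c
#include <assert.h>
#include <stdbool.h>

#define MODEL_PTRS_PER_PAGE 4	/* hypothetical; tiny for illustration */

/* Hypothetical stand-in for a page-table page plus its reference count. */
struct model_pt_page {
	int refcount;				/* models page_count() of the ptep page */
	unsigned long pte[MODEL_PTRS_PER_PAGE];	/* 0 models a none pte */
};

/* Old heuristic: shared if dst aliases src or dst is already populated. */
static bool old_check_shared(const unsigned long *dst_pte,
			     const unsigned long *src_pte)
{
	return dst_pte == src_pte || *dst_pte != 0;
}

/* New check: shared if the page-table page has more than one reference. */
static bool new_check_shared(const struct model_pt_page *dst_page)
{
	return dst_page->refcount > 1;
}

void shared_pte_demo(void)
{
	/* src has 'unshared': it owns its page-table page exclusively. */
	struct model_pt_page src_page = { .refcount = 1, .pte = { 0x1000 } };
	/* dst shares a page with another vma, but the entry is still none. */
	struct model_pt_page dst_page = { .refcount = 2 };

	/* The old heuristic misses the sharing because the entry is none. */
	assert(!old_check_shared(&dst_page.pte[0], &src_page.pte[0]));
	/* The reference count still reveals it. */
	assert(new_check_shared(&dst_page));
	/* An exclusively owned page is not reported as shared. */
	assert(!new_check_shared(&src_page));
}
```

The model only captures the decision logic; locking and how the reference
on a shared page-table page is actually taken are out of scope here.
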
mm/hugetlb.c | 21 ++++++++-------------
1 file changed, 8 insertions(+), 13 deletions(-)
diff --git a/mm/hugetlb.c b/mm/hugetlb.c
index 2dfd10599f98..8aa62765a055 100644
--- a/mm/hugetlb.c
+++ b/mm/hugetlb.c
@@ -4763,7 +4763,7 @@ int copy_hugetlb_page_range(struct mm_struct *dst, struct mm_struct *src,
struct vm_area_struct *dst_vma,
struct vm_area_struct *src_vma)
{
- pte_t *src_pte, *dst_pte, entry, dst_entry;
+ pte_t *src_pte, *dst_pte, entry;
struct page *ptepage;
unsigned long addr;
bool cow = is_cow_mapping(src_vma->vm_flags);
@@ -4808,15 +4808,13 @@ int copy_hugetlb_page_range(struct mm_struct *dst, struct mm_struct *src,
/*
* If the pagetables are shared don't copy or take references.
- * dst_pte == src_pte is the common case of src/dest sharing.
*
+ * dst_pte == src_pte is the common case of src/dest sharing.
* However, src could have 'unshared' and dst shares with
- * another vma. If dst_pte !none, this implies sharing.
- * Check here before taking page table lock, and once again
- * after taking the lock below.
+ * another vma. So page_count of ptep page is checked instead
+ * to reliably determine whether pte is shared.
*/
- dst_entry = huge_ptep_get(dst_pte);
- if ((dst_pte == src_pte) || !huge_pte_none(dst_entry)) {
+ if (page_count(virt_to_page(dst_pte)) > 1) {
addr |= last_addr_mask;
continue;
}
@@ -4825,13 +4823,10 @@ int copy_hugetlb_page_range(struct mm_struct *dst, struct mm_struct *src,
src_ptl = huge_pte_lockptr(h, src, src_pte);
spin_lock_nested(src_ptl, SINGLE_DEPTH_NESTING);
entry = huge_ptep_get(src_pte);
- dst_entry = huge_ptep_get(dst_pte);
again:
- if (huge_pte_none(entry) || !huge_pte_none(dst_entry)) {
+ if (huge_pte_none(entry)) {
/*
- * Skip if src entry none. Also, skip in the
- * unlikely case dst entry !none as this implies
- * sharing with another vma.
+ * Skip if src entry none.
*/
;
} else if (unlikely(is_hugetlb_entry_hwpoisoned(entry))) {
@@ -4910,7 +4905,7 @@ int copy_hugetlb_page_range(struct mm_struct *dst, struct mm_struct *src,
restore_reserve_on_error(h, dst_vma, addr,
new);
put_page(new);
- /* dst_entry won't change as in child */
+ /* huge_ptep of dst_pte won't change as in child */
goto again;
}
hugetlb_install_page(dst_vma, dst_pte, addr, new);
--
2.23.0