From: Jane Chu <jane.chu@oracle.com>
To: Miaohe Lin <linmiaohe@huawei.com>,
	linux-mm@kvack.org, linux-kernel@vger.kernel.org
Cc: akpm@linux-foundation.org, brauner@kernel.org, oleg@redhat.com,
	tandersen@netflix.com, mjguzik@gmail.com, willy@infradead.org,
	kent.overstreet@linux.dev, zhangpeng.00@bytedance.com,
	hca@linux.ibm.com, mike.kravetz@oracle.com,
	muchun.song@linux.dev, thorvald@google.com,
	Liam.Howlett@Oracle.com
Subject: Re: [PATCH] fork: defer linking file vma until vma is fully initialized
Date: Mon, 15 Apr 2024 16:32:28 -0700
Message-ID: <28976a8e-678e-4cfa-8748-e566c9c29053@oracle.com>
In-Reply-To: <20240410091441.3539905-1-linmiaohe@huawei.com>
On 4/10/2024 2:14 AM, Miaohe Lin wrote:
> Thorvald reported a WARNING [1]. The root cause is the race below:
>
> CPU 1                                   CPU 2
> fork                                    hugetlbfs_fallocate
>  dup_mmap                                hugetlbfs_punch_hole
>   i_mmap_lock_write(mapping);
>   vma_interval_tree_insert_after -- Child vma is visible through i_mmap tree.
>   i_mmap_unlock_write(mapping);
>   hugetlb_dup_vma_private -- Clear vma_lock outside i_mmap_rwsem!
>                                          i_mmap_lock_write(mapping);
>                                          hugetlb_vmdelete_list
>                                           vma_interval_tree_foreach
>                                            hugetlb_vma_trylock_write -- Vma_lock is cleared.
>   tmp->vm_ops->open -- Alloc new vma_lock outside i_mmap_rwsem!
>                                            hugetlb_vma_unlock_write -- Vma_lock is assigned!!!
>                                          i_mmap_unlock_write(mapping);
>
> hugetlb_dup_vma_private() and hugetlb_vm_op_open() are called outside
> the i_mmap_rwsem lock, while the vma lock can be used at the same time.
> Fix this by deferring linking the file vma until the vma is fully
> initialized. Those vmas should be initialized before they can be used.
>
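
For anyone skimming the thread, the ordering rule the patch applies
("initialize fully, then publish under the lock") can be sketched in
userspace roughly as below. This is only an illustration under stated
assumptions: struct node, struct shared_list, node_init(),
list_publish(), list_count_uninitialized() and add_node_fixed() are all
hypothetical names, a pthread mutex stands in for i_mmap_rwsem, and a
plain pointer stands in for the hugetlb vma_lock. None of it is kernel
code or part of the patch.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-ins for illustration; these are not kernel types. */
struct node {
	int *private_lock;	/* plays the role of the per-vma hugetlb lock */
	struct node *next;
};

struct shared_list {
	pthread_mutex_t lock;	/* plays the role of i_mmap_rwsem */
	struct node *head;
};

/* Set up every field a concurrent walker might touch. */
static int node_init(struct node *n)
{
	n->private_lock = calloc(1, sizeof(*n->private_lock));
	n->next = NULL;
	return n->private_lock ? 0 : -1;
}

/* Make the node visible to walkers; only legal once it is initialized. */
static void list_publish(struct shared_list *l, struct node *n)
{
	assert(n->private_lock != NULL);
	pthread_mutex_lock(&l->lock);
	n->next = l->head;
	l->head = n;
	pthread_mutex_unlock(&l->lock);
}

/* Walker: count visible nodes whose private lock is still missing. */
static int list_count_uninitialized(struct shared_list *l)
{
	int bad = 0;

	pthread_mutex_lock(&l->lock);
	for (struct node *n = l->head; n; n = n->next)
		if (!n->private_lock)
			bad++;
	pthread_mutex_unlock(&l->lock);
	return bad;
}

/* Fixed ordering: initialize first, publish last. */
static int add_node_fixed(struct shared_list *l, struct node *n)
{
	if (node_init(n))
		return -1;
	list_publish(l, n);
	return 0;
}
```

With this ordering a walker holding the list lock can never observe a
node whose private lock is unset, because nothing modifies the node
outside the lock once it has been published. The racy variant would
publish first and call node_init() afterwards, which is the shape of the
window shown in the diagram above.
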
> Reported-by: Thorvald Natvig <thorvald@google.com>
> Closes: https://lore.kernel.org/linux-mm/20240129161735.6gmjsswx62o4pbja@revolver/T/ [1]
> Fixes: 8d9bfb260814 ("hugetlb: add vma based lock for pmd sharing")
> Signed-off-by: Miaohe Lin <linmiaohe@huawei.com>
> ---
> kernel/fork.c | 33 +++++++++++++++++----------------
> 1 file changed, 17 insertions(+), 16 deletions(-)
>
> diff --git a/kernel/fork.c b/kernel/fork.c
> index 84de5faa8c9a..99076dbe27d8 100644
> --- a/kernel/fork.c
> +++ b/kernel/fork.c
> @@ -714,6 +714,23 @@ static __latent_entropy int dup_mmap(struct mm_struct *mm,
>  		} else if (anon_vma_fork(tmp, mpnt))
>  			goto fail_nomem_anon_vma_fork;
>  		vm_flags_clear(tmp, VM_LOCKED_MASK);
> +		/*
> +		 * Copy/update hugetlb private vma information.
> +		 */
> +		if (is_vm_hugetlb_page(tmp))
> +			hugetlb_dup_vma_private(tmp);
> +
> +		/*
> +		 * Link the vma into the MT. After using __mt_dup(), memory
> +		 * allocation is not necessary here, so it cannot fail.
> +		 */
> +		vma_iter_bulk_store(&vmi, tmp);
> +
> +		mm->map_count++;
> +
> +		if (tmp->vm_ops && tmp->vm_ops->open)
> +			tmp->vm_ops->open(tmp);
> +
>  		file = tmp->vm_file;
>  		if (file) {
>  			struct address_space *mapping = file->f_mapping;
> @@ -730,25 +747,9 @@ static __latent_entropy int dup_mmap(struct mm_struct *mm,
>  			i_mmap_unlock_write(mapping);
>  		}
>
> -		/*
> -		 * Copy/update hugetlb private vma information.
> -		 */
> -		if (is_vm_hugetlb_page(tmp))
> -			hugetlb_dup_vma_private(tmp);
> -
> -		/*
> -		 * Link the vma into the MT. After using __mt_dup(), memory
> -		 * allocation is not necessary here, so it cannot fail.
> -		 */
> -		vma_iter_bulk_store(&vmi, tmp);
> -
> -		mm->map_count++;
>  		if (!(tmp->vm_flags & VM_WIPEONFORK))
>  			retval = copy_page_range(tmp, mpnt);
>
> -		if (tmp->vm_ops && tmp->vm_ops->open)
> -			tmp->vm_ops->open(tmp);
> -
>  		if (retval) {
>  			mpnt = vma_next(&vmi);
>  			goto loop_out;
Looks good.
Reviewed-by: Jane Chu <jane.chu@oracle.com>
-jane