From mboxrd@z Thu Jan 1 00:00:00 1970
Subject: Re: [RFC PATCH v4 6/8] hugetlb: add vma based lock for pmd sharing synchronization
To: Mike Kravetz
CC: Muchun Song, Michal Hocko, Peter Xu, Naoya Horiguchi, David Hildenbrand, Aneesh Kumar K . V, Andrea Arcangeli, Kirill A . Shutemov, Davidlohr Bueso, Prakash Sangappa, James Houghton, Mina Almasry, Pasha Tatashin, Axel Rasmussen, Ray Fucillo, Andrew Morton
From: Miaohe Lin <linmiaohe@huawei.com>
Message-ID: <5b8c6b49-e17a-2c0b-4440-ccf3c5493cb2@huawei.com>
Date: Fri, 29 Jul 2022 10:55:15 +0800
In-Reply-To: <20220706202347.95150-7-mike.kravetz@oracle.com>
References: <20220706202347.95150-1-mike.kravetz@oracle.com> <20220706202347.95150-7-mike.kravetz@oracle.com>

On 2022/7/7 4:23, Mike Kravetz wrote:
> Allocate a rw semaphore and hang off vm_private_data for
> synchronization use by vmas that could be involved in pmd sharing.
> Only add infrastructure for the new lock here.  Actual use will be
> added in subsequent patch.
>
> Signed-off-by: Mike Kravetz
> ---
>  include/linux/hugetlb.h |  36 +++++++++-
>  kernel/fork.c           |   6 +-
>  mm/hugetlb.c            | 150 ++++++++++++++++++++++++++++++++++++----
>  mm/rmap.c               |   8 ++-
>  4 files changed, 178 insertions(+), 22 deletions(-)
>
>  /* Forward declaration */
>  static int hugetlb_acct_memory(struct hstate *h, long delta);
> +static bool vma_pmd_shareable(struct vm_area_struct *vma);
>
>  static inline bool subpool_is_free(struct hugepage_subpool *spool)
>  {
> @@ -904,6 +905,89 @@ resv_map_set_hugetlb_cgroup_uncharge_info(struct resv_map *resv_map,
>  #endif
>  }
>
> +static bool __vma_shareable_flags_pmd(struct vm_area_struct *vma)
> +{
> +	return vma->vm_flags & (VM_MAYSHARE | VM_SHARED) &&

Should we make the __vma_aligned_range_pmd_shareable check use (VM_MAYSHARE | VM_SHARED) like above, instead of VM_MAYSHARE alone, to make the code more consistent?
> +	       vma->vm_private_data;
> +}
> +
> +void hugetlb_vma_lock_read(struct vm_area_struct *vma)
> +{
> +	if (__vma_shareable_flags_pmd(vma))
> +		down_read((struct rw_semaphore *)vma->vm_private_data);
> +}
> +
> +void hugetlb_vma_unlock_read(struct vm_area_struct *vma)
> +{
> +	if (__vma_shareable_flags_pmd(vma))
> +		up_read((struct rw_semaphore *)vma->vm_private_data);
> +}
> +
> +void hugetlb_vma_lock_write(struct vm_area_struct *vma)
> +{
> +	if (__vma_shareable_flags_pmd(vma))
> +		down_write((struct rw_semaphore *)vma->vm_private_data);
> +}
> +
> +void hugetlb_vma_unlock_write(struct vm_area_struct *vma)
> +{
> +	if (__vma_shareable_flags_pmd(vma))
> +		up_write((struct rw_semaphore *)vma->vm_private_data);
> +}
> +
> +int hugetlb_vma_trylock_write(struct vm_area_struct *vma)
> +{
> +	if (!__vma_shareable_flags_pmd(vma))
> +		return 1;
> +
> +	return down_write_trylock((struct rw_semaphore *)vma->vm_private_data);
> +}
> +
> +void hugetlb_vma_assert_locked(struct vm_area_struct *vma)
> +{
> +	if (__vma_shareable_flags_pmd(vma))
> +		lockdep_assert_held((struct rw_semaphore *)
> +				    vma->vm_private_data);
> +}
> +
> +static void hugetlb_free_vma_lock(struct vm_area_struct *vma)
> +{
> +	/* Only present in sharable vmas */
> +	if (!vma || !(vma->vm_flags & (VM_MAYSHARE | VM_SHARED)))
> +		return;
> +
> +	if (vma->vm_private_data) {
> +		kfree(vma->vm_private_data);
> +		vma->vm_private_data = NULL;
> +	}
> +}
> +
> +static void hugetlb_alloc_vma_lock(struct vm_area_struct *vma)
> +{
> +	struct rw_semaphore *vma_sema;
> +
> +	/* Only establish in (flags) sharable vmas */
> +	if (!vma || !(vma->vm_flags & (VM_MAYSHARE | VM_SHARED)))
> +		return;
> +
> +	if (!vma_pmd_shareable(vma)) {
> +		vma->vm_private_data = NULL;
> +		return;
> +	}
> +
> +	vma_sema = kmalloc(sizeof(*vma_sema), GFP_KERNEL);
> +	if (!vma_sema) {
> +		/*
> +		 * If we can not allocate semaphore, then vma can not
> +		 * participate in pmd sharing.
> +		 */
> +		vma->vm_private_data = NULL;
> +	} else {
> +		init_rwsem(vma_sema);
> +		vma->vm_private_data = vma_sema;
> +	}

This code is really subtle. If it's called from hugetlb_vm_op_open during fork, after hugetlb_dup_vma_private is done, there should already be a kmalloc-ed vma_sema for this vma (because hugetlb_alloc_vma_lock is also called by hugetlb_dup_vma_private). So we can't simply overwrite vm_private_data here, or that vma_sema will be leaked? But when hugetlb_alloc_vma_lock is called from hugetlb_reserve_pages, it should work fine. Or am I missing something?

Thanks.