From: Qi Zheng
To: Muchun Song
Cc: david@redhat.com, hughd@google.com, willy@infradead.org,
 vbabka@kernel.org, akpm@linux-foundation.org, rppt@kernel.org,
 vishal.moola@gmail.com, peterx@redhat.com, ryan.roberts@arm.com,
 christophe.leroy2@cs-soprasteria.com, linux-kernel@vger.kernel.org,
 linux-mm@kvack.org, linux-arm-kernel@lists.infradead.org,
 linuxppc-dev@lists.ozlabs.org
Subject: Re: [PATCH v4 07/13] mm: khugepaged: collapse_pte_mapped_thp() use pte_offset_map_rw_nolock()
Date: Tue, 24 Sep 2024 15:29:14 +0800
Message-ID: <2343da2e-f91f-4861-bb22-28f77db98c52@bytedance.com>
In-Reply-To: <79699B24-0D99-4051-91F3-5695D32D62AC@linux.dev>
References: <07d975c50fe09c246e087303b39998430b1a66bd.1727148662.git.zhengqi.arch@bytedance.com>
 <79699B24-0D99-4051-91F3-5695D32D62AC@linux.dev>

On 2024/9/24 15:14, Muchun Song wrote:
> 
> 
>> On Sep 24, 2024, at 14:11, Qi Zheng wrote:
>> In collapse_pte_mapped_thp(), we may modify the pte and pmd entry after
>> acquiring the ptl, so convert it to using pte_offset_map_rw_nolock(). At
>> this time, the pte_same() check is not performed after the PTL is held.
>> So we should get pgt_pmd and do a pmd_same() check after the ptl is held.
>> 
>> Signed-off-by: Qi Zheng
>> ---
>>  mm/khugepaged.c | 14 +++++++++++---
>>  1 file changed, 11 insertions(+), 3 deletions(-)
>> 
>> diff --git a/mm/khugepaged.c b/mm/khugepaged.c
>> index 6498721d4783a..8ab79c13d077f 100644
>> --- a/mm/khugepaged.c
>> +++ b/mm/khugepaged.c
>> @@ -1605,7 +1605,7 @@ int collapse_pte_mapped_thp(struct mm_struct *mm, unsigned long addr,
>>  	if (userfaultfd_armed(vma) && !(vma->vm_flags & VM_SHARED))
>>  		pml = pmd_lock(mm, pmd);
>> 
>> -	start_pte = pte_offset_map_nolock(mm, pmd, haddr, &ptl);
>> +	start_pte = pte_offset_map_rw_nolock(mm, pmd, haddr, &pgt_pmd, &ptl);
>>  	if (!start_pte)		/* mmap_lock + page lock should prevent this */
>>  		goto abort;
>>  	if (!pml)
>> @@ -1613,6 +1613,9 @@ int collapse_pte_mapped_thp(struct mm_struct *mm, unsigned long addr,
>>  	else if (ptl != pml)
>>  		spin_lock_nested(ptl, SINGLE_DEPTH_NESTING);
>> 
>> +	if (unlikely(!pmd_same(pgt_pmd, pmdp_get_lockless(pmd))))
>> +		goto abort;
>> +
>>  	/* step 2: clear page table and adjust rmap */
>>  	for (i = 0, addr = haddr, pte = start_pte;
>>  	     i < HPAGE_PMD_NR; i++, addr += PAGE_SIZE, pte++) {
>> @@ -1645,7 +1648,6 @@ int collapse_pte_mapped_thp(struct mm_struct *mm, unsigned long addr,
>>  		nr_ptes++;
>>  	}
>> 
>> -	pte_unmap(start_pte);
>>  	if (!pml)
>>  		spin_unlock(ptl);
>> 
>> @@ -1658,13 +1660,19 @@ int collapse_pte_mapped_thp(struct mm_struct *mm, unsigned long addr,
>>  	/* step 4: remove empty page table */
>>  	if (!pml) {
>>  		pml = pmd_lock(mm, pmd);
>> -		if (ptl != pml)
>> +		if (ptl != pml) {
>>  			spin_lock_nested(ptl, SINGLE_DEPTH_NESTING);
>> +			if (unlikely(!pmd_same(pgt_pmd, pmdp_get_lockless(pmd)))) {
>> +				spin_unlock(pml);
>> +				goto abort;
> 
> This drops the folio reference and the mm counter twice: once in step 3
> and again at the abort label.

My bad. We should set nr_ptes to 0 and call flush_tlb_mm() here, right?

> 
>> +			}
>> +		}
>>  	}
>>  	pgt_pmd = pmdp_collapse_flush(vma, haddr, pmd);
>>  	pmdp_get_lockless_sync();
>>  	if (ptl != pml)
>>  		spin_unlock(ptl);
>> +	pte_unmap(start_pte);
>>  	spin_unlock(pml);
> 
> Why not?
> 
> pte_unmap_unlock(start_pte, ptl);
> if (pml != ptl)
> 	spin_unlock(pml);

Both are fine, will do.

Thanks,
Qi

> 
>> 
>>  	mmu_notifier_invalidate_range_end(&range);
>> -- 
>> 2.20.1
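
For readers following the double-drop point raised in the review, here is a
minimal userspace sketch of the accounting, not kernel code. It assumes the
abort label flushes the TLB and drops nr_ptes references whenever nr_ptes is
nonzero, which is what the review comment implies. The names collapse_model,
drop_refs(), collapse_tail(), pmd_changed and apply_fix are hypothetical and
exist only for this illustration.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins for the state touched by steps 3 and 4. */
struct collapse_model {
	int folio_refs;		/* models the folio reference count */
	int mm_counter;		/* models the mm file-page counter */
	int tlb_flushes;	/* counts modelled flush_tlb_mm() calls */
};

/* Models folio_ref_sub() plus add_mm_counter(..., -nr_ptes). */
static void drop_refs(struct collapse_model *m, int nr_ptes)
{
	assert(nr_ptes >= 0);
	m->folio_refs -= nr_ptes;
	m->mm_counter -= nr_ptes;
}

/*
 * Models the tail of the function after step 2 cleared nr_ptes PTEs.
 * pmd_changed models a failing pmd_same() recheck in step 4.
 * apply_fix models the proposed fix: flush, then reset nr_ptes to 0
 * before jumping to abort.
 */
static void collapse_tail(struct collapse_model *m, int nr_ptes,
			  bool pmd_changed, bool apply_fix)
{
	/* step 3: set proper refcount and mm counters */
	if (nr_ptes)
		drop_refs(m, nr_ptes);

	/* step 4: recheck the PMD under the lock */
	if (pmd_changed) {
		if (apply_fix) {
			/* references are already dropped, so only flush */
			m->tlb_flushes++;
			nr_ptes = 0;
		}
		goto abort;
	}
	return;

abort:
	/* assumed abort behaviour: flush and drop if nr_ptes is nonzero */
	if (nr_ptes) {
		m->tlb_flushes++;
		drop_refs(m, nr_ptes);
	}
}
```

Starting from 10 references with nr_ptes = 4 and a failing recheck, the
model without the fix ends at 2 references (dropped twice), while the model
with the fix ends at 6 and still performs exactly one flush.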