Date: Fri, 21 Feb 2025 15:41:56 +0000
From: Will Deacon
To: Yu Zhao
Cc: Andrew Morton, David Hildenbrand, Mateusz Guzik, "Matthew Wilcox (Oracle)", Muchun Song, linux-mm@kvack.org, linux-kernel@vger.kernel.org
Subject: Re: [PATCH mm-unstable v2] mm/hugetlb_vmemmap: fix memory loads ordering
Message-ID: <20250221154155.GE20567@willie-the-truck>
References: <20250108074822.722696-1-yuzhao@google.com>
In-Reply-To: <20250108074822.722696-1-yuzhao@google.com>

On Wed, Jan 08, 2025 at 12:48:21AM -0700, Yu Zhao wrote:
> Using x86_64 as an example, for a 32KB struct page[] area describing a
> 2MB hugeTLB, HVO
> reduces the area to 4KB by the following steps:
> 1. Split the (r/w vmemmap) PMD mapping the area into 512 (r/w) PTEs;
> 2. For the 8 PTEs mapping the area, remap PTE 1-7 to the page mapped
>    by PTE 0, and at the same time change the permission from r/w to
>    r/o;
> 3. Free the pages PTE 1-7 used to map, hence the reduction from 32KB
>    to 4KB.
>
> However, the following race can happen due to improperly memory loads
> ordering:
>   CPU 1 (HVO)                     CPU 2 (speculative PFN walker)
>
>   page_ref_freeze()
>   synchronize_rcu()
>                                   rcu_read_lock()
>                                   page_is_fake_head() is false
>   vmemmap_remap_pte()
>   XXX: struct page[] becomes r/o
>
>   page_ref_unfreeze()
>                                   page_ref_count() is not zero
>
>                                   atomic_add_unless(&page->_refcount)
>                                   XXX: try to modify r/o struct page[]
>
> Specifically, page_is_fake_head() must be ordered after
> page_ref_count() on CPU 2 so that it can only return true for this
> case, to avoid the later attempt to modify r/o struct page[].
>
> This patch adds the missing memory barrier and makes the tests on
> page_is_fake_head() and page_ref_count() done in the proper order.
>
> Fixes: bd225530a4c7 ("mm/hugetlb_vmemmap: fix race with speculative PFN walkers")
> Reported-by: Will Deacon
> Closes: https://lore.kernel.org/20241128142028.GA3506@willie-the-truck/
> Signed-off-by: Yu Zhao
> ---
>  include/linux/page-flags.h | 37 +++++++++++++++++++++++++++++++++++++
>  include/linux/page_ref.h   |  2 +-
>  2 files changed, 38 insertions(+), 1 deletion(-)

Sorry for the very late reply, but I finally found time to sit down and
go through this. I think it resolves the problem I pointed out, so:

Acked-by: Will Deacon

Thanks!

Will
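
For anyone reading the thread later, the load ordering described in the
quoted commit message can be modelled in userspace with C11 atomics. This
is only a rough sketch and not the code from the patch: struct model_page,
its fake_head flag and every model_* function are hypothetical stand-ins,
and the r/o remap is represented by a plain flag rather than a page-table
change. The point it illustrates is that the walker reads the refcount
first, with acquire semantics, and only then looks at the fake-head state,
pairing with a release store on the unfreeze side.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for struct page; not the kernel's layout. */
struct model_page {
	atomic_int refcount;    /* 0 while frozen by the optimizer */
	atomic_bool fake_head;  /* models "struct page[] is now r/o" */
};

/*
 * Writer side (CPU 1 in the diagram): mark the page as r/o, then
 * unfreeze. The release store orders the flag update before the
 * refcount becomes visible as non-zero.
 */
static inline void model_remap_and_unfreeze(struct model_page *p, int count)
{
	atomic_store_explicit(&p->fake_head, true, memory_order_relaxed);
	atomic_store_explicit(&p->refcount, count, memory_order_release);
}

/*
 * Reader side (CPU 2): load the refcount with acquire semantics
 * first, and only then test the fake-head flag. If the non-zero
 * refcount came from the release store above, the flag update is
 * guaranteed to be visible to the second load.
 */
static inline bool model_count_writable(struct model_page *p, int unless)
{
	if (atomic_load_explicit(&p->refcount, memory_order_acquire) == unless)
		return false;
	return !atomic_load_explicit(&p->fake_head, memory_order_relaxed);
}

/* Take a reference unless the count is zero or the page is r/o. */
static inline bool model_get_unless_zero(struct model_page *p)
{
	int old;

	if (!model_count_writable(p, 0))
		return false;

	old = atomic_load_explicit(&p->refcount, memory_order_relaxed);
	while (old != 0) {
		/* On failure, 'old' is refreshed with the current value. */
		if (atomic_compare_exchange_weak_explicit(&p->refcount, &old,
							  old + 1,
							  memory_order_acq_rel,
							  memory_order_relaxed))
			return true;
	}
	return false;
}
```

In this model, swapping the two loads in model_count_writable() would
reintroduce the window shown in the diagram: the flag could be read as
false before the remap, and the refcount as non-zero after the unfreeze.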