From: Muchun Song
Subject: Re: [PATCH 4/6] mm: hugetlb_vmemmap: add missing smp_wmb() before set_pte_at()
Date: Thu, 18 Aug 2022 17:18:09 +0800
To: "Yin, Fengwei", Miaohe Lin
Cc: Andrew Morton, Mike Kravetz, Muchun Song, Linux MM, linux-kernel@vger.kernel.org
Message-Id: <15DD6DCA-39BC-4EA2-984F-D488E94CC4FF@linux.dev>
In-Reply-To: <7408156a-f708-5e73-d0a2-69b1acca9b96@intel.com>
References: <20220816130553.31406-1-linmiaohe@huawei.com>
 <20220816130553.31406-5-linmiaohe@huawei.com>
 <0EAF1279-6A1C-41FA-9A32-414C36B3792A@linux.dev>
 <019c1272-9d01-9d51-91a0-2d656b25c318@intel.com>
 <18adbf89-473e-7ba6-9a2b-522e1592bdc6@huawei.com>
 <9c791de0-b702-1bbe-38a4-30e87d9d1b95@intel.com>
 <931536E2-3948-40AB-88A7-E36F67954AAA@linux.dev>
 <7be98c64-88a1-3bee-7f92-67bb1f4f495b@huawei.com>
 <3B1463C2-9DC4-43D0-93EC-2D2334A20502@linux.dev>
 <7fa5b2b2-dcef-f264-7932-c4fdbb9619d0@intel.com>
 <7408156a-f708-5e73-d0a2-69b1acca9b96@intel.com>

> On Aug 18, 2022, at 16:54, Yin, Fengwei wrote:
>
> On 8/18/2022 4:40 PM, Muchun Song wrote:
>>
>>> On Aug 18, 2022, at 16:32, Yin, Fengwei wrote:
>>>
>>> On 8/18/2022 3:59 PM, Muchun Song wrote:
>>>>
>>>>> On Aug 18, 2022, at 15:52, Miaohe Lin wrote:
>>>>>
>>>>> On 2022/8/18 10:47, Muchun Song wrote:
>>>>>>
>>>>>>> On Aug 18, 2022, at 10:00, Yin, Fengwei wrote:
>>>>>>>
>>>>>>> On 8/18/2022 9:55 AM, Miaohe Lin wrote:
>>>>>>>>>>> 	/*
>>>>>>>>>>> 	 * The memory barrier inside __SetPageUptodate makes sure that
>>>>>>>>>>> 	 * preceding stores to the page contents become visible before
>>>>>>>>>>> 	 * the set_pte_at() write.
>>>>>>>>>>> 	 */
>>>>>>>>>>> 	__SetPageUptodate(page);
>>>>>>>>>> IIUC, in this case we should make sure other CPUs can see the new page’s
>>>>>>>>>> contents after they have seen that PG_uptodate is set. I think commit
>>>>>>>>>> 0ed361dec369 can tell us more details.
>>>>>>>>>>
>>>>>>>>>> I also looked at commit 52f37629fd3c to see why we need a barrier before
>>>>>>>>>> set_pte_at(), but I didn’t find any info to explain why. I guess we want
>>>>>>>>>> to guarantee the ordering between the stores to the page’s contents and
>>>>>>>>>> subsequent memory accesses through the corresponding virtual address.
>>>>>>>>>> Do you agree with this?
>>>>>>>>> This is my understanding also. Thanks.
>>>>>>>> That's also my understanding. Thanks both.
>>>>>>> One thing is unclear to me (not directly related to this patch): who is
>>>>>>> responsible for the read barrier on the read side in this case?
>>>>>>>
>>>>>>> For SetPageUptodate, there are paired write/read memory barriers.
>>>>>>>
>>>>>>
>>>>>> I have the same question. So I think the example proposed by Miaohe is a little
>>>>>> different from the case (hugetlb_vmemmap) here.
>>>>>
>>>>> Per my understanding, the memory barrier in PageUptodate() is needed because the user
>>>>> might soon access the page contents using page_address() (the corresponding pagetable
>>>>> entry already exists). But for the case proposed above, if the user wants to access the
>>>>> page contents, the corresponding pagetable entry must be visible first, or the page
>>>>> contents can't be accessed. So there should be a data dependency acting as a memory
>>>>> barrier between loading the pagetable entry and accessing the page contents.
>>>>> Or am I missing something?
>>>>
>>>> Yep, it is a data dependency. The difference between hugetlb_vmemmap and PageUptodate()
>>>> is that the page table entry (a pointer to the mapped page frame) is loaded by the MMU,
>>>> while PageUptodate() is loaded by the CPU. It seems the data dependency should sit
>>>> between the MMU access and the CPU access. Maybe it is the hardware’s guarantee?
>>> I just found the comment in pmd_install() that explains why most arches have no read
>>
>> I think pmd_install() is a little different as well.
>> We should make sure the page table walker (like GUP) sees the correct PTE
>> entry after it sees the pmd entry.
>
> The difference I can see is that the pmd/pte case has both a hardware page walker and a
> software page walker (like GUP) on the read side, while the case here only has a hardware
> page walker on the read side. But I suppose the memory barrier requirement still applies
> here. I am not against this change. I just want to get a better understanding of the
> hardware behavior.
>
> Maybe we could do a test: add a large delay between reset_struct_page() and set_pte_at()?

Hi Miaohe,

Would you mind doing this test? One thread does vmemmap_restore_pte(), and another thread
checks whether it can still see a tail page with PG_head after the first thread has
executed set_pte_at().

Thanks.

>
> Regards
> Yin, Fengwei
>
>>
>>> side memory barrier except alpha which has read side memory barrier.
>>
>> Right. Only alpha has data dependency barrier.
>>
>>>
>>> Regards
>>> Yin, Fengwei
>>>
>>>>
>>>>>
>>>>> Thanks,
>>>>> Miaohe Lin
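
For reference, the ordering discussed in this thread (reset the struct page contents, smp_wmb(), then set_pte_at()) can be sketched in userspace C11 as a rough analogy. This is not kernel code and it cannot exercise a hardware page walker: the reader below is an ordinary CPU thread, and every name in it (fake_page, fake_pte, FAKE_PG_HEAD, count_stale_reads) is hypothetical. A release fence stands in for smp_wmb(), a relaxed pointer store stands in for set_pte_at(), and an acquire load stands in for the address dependency the kernel relies on, since C11 has no portable equivalent of that.

```c
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>
#include <stddef.h>

#define FAKE_PG_HEAD 1UL	/* hypothetical stale flag, stands in for PG_head */

/* Hypothetical stand-in for a tail struct page. */
struct fake_page {
	unsigned long flags;	/* plain (non-atomic) contents */
};

static struct fake_page page_slot;
/* Hypothetical stand-in for the PTE: NULL means "not mapped yet". */
static _Atomic(struct fake_page *) fake_pte;

/* Read side: wait for the "PTE", then look at the page contents. */
static void *reader(void *arg)
{
	int *saw_stale = arg;
	struct fake_page *p;

	/* Acquire load used in place of the kernel's address dependency. */
	while (!(p = atomic_load_explicit(&fake_pte, memory_order_acquire)))
		;
	*saw_stale = (p->flags & FAKE_PG_HEAD) != 0;
	return NULL;
}

/* Write side: reset the contents, order them, then publish. */
static void writer(void)
{
	page_slot.flags = 0;				/* like reset_struct_page() */
	atomic_thread_fence(memory_order_release);	/* plays the role of smp_wmb() */
	atomic_store_explicit(&fake_pte, &page_slot,
			      memory_order_relaxed);	/* like set_pte_at() */
}

/* Returns how many rounds observed the stale flag, or -1 on thread errors. */
int count_stale_reads(int rounds)
{
	int stale = 0;

	assert(rounds > 0);
	for (int i = 0; i < rounds; i++) {
		pthread_t tid;
		int saw_stale = 0;

		page_slot.flags = FAKE_PG_HEAD;	/* stale state before the reset */
		atomic_store(&fake_pte, (struct fake_page *)NULL);
		if (pthread_create(&tid, NULL, reader, &saw_stale))
			return -1;
		writer();
		if (pthread_join(tid, NULL))
			return -1;
		stale += saw_stale;
	}
	return stale;
}
```

Calling count_stale_reads() from a main() built with -pthread should report zero stale reads, because the release fence orders the flags reset before the pointer store. Dropping the fence makes the program a data race under C11, but a run without failures would prove little, since x86 hardware does not reorder two stores anyway. Any real test of the MMU read side would have to be done in the kernel, as proposed above.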