From mboxrd@z Thu Jan 1 00:00:00 1970
Subject: Re: [PATCH 09/12] hugetlb_vmemmap: Optimistically set Optimized flag
From: Muchun Song <muchun.song@linux.dev>
Date: Thu, 31 Aug 2023 11:27:52 +0800
To: Mike Kravetz
Cc: Linux-MM, LKML, Muchun Song, Joao Martins, Oscar Salvador,
 David Hildenbrand, Miaohe Lin, David Rientjes, Anshuman Khandual,
 Naoya Horiguchi, Michal Hocko, Matthew Wilcox, Xiongchun Duan,
 Andrew Morton
Message-Id: <7D01FB21-5182-428D-BCD8-89C679ABEEC8@linux.dev>
In-Reply-To: <20230830224706.GC55006@monkey>
References: <20230825190436.55045-1-mike.kravetz@oracle.com>
 <20230825190436.55045-10-mike.kravetz@oracle.com>
 <8e298c9f-1ef3-5c99-d7b5-47fd6703cf83@linux.dev>
 <20230830224706.GC55006@monkey>

> On Aug 31, 2023, at 06:47, Mike Kravetz wrote:
>
> On 08/30/23 15:26, Muchun Song wrote:
>>
>>
>> On 2023/8/26 03:04, Mike Kravetz wrote:
>>> At the beginning of hugetlb_vmemmap_optimize, optimistically set
>>> the HPageVmemmapOptimized flag in the head page. Clear the flag
>>> if the operation fails.
>>>
>>> No change in behavior. However, this will become important in
>>> subsequent patches where we batch delay TLB flushing. We need to
>>> make sure the content in the old and new vmemmap pages are the same.
>>
>> Sorry, I didn't get the point here. Could you elaborate it?
>>
>
> Sorry, this really could use a better explanation.
>
>>>
>>> Signed-off-by: Mike Kravetz
>>> ---
>>>  mm/hugetlb_vmemmap.c | 8 +++++---
>>>  1 file changed, 5 insertions(+), 3 deletions(-)
>>>
>>> diff --git a/mm/hugetlb_vmemmap.c b/mm/hugetlb_vmemmap.c
>>> index e390170c0887..500a118915ff 100644
>>> --- a/mm/hugetlb_vmemmap.c
>>> +++ b/mm/hugetlb_vmemmap.c
>>> @@ -566,7 +566,9 @@ static void __hugetlb_vmemmap_optimize(const struct hstate *h,
>>>      if (!vmemmap_should_optimize(h, head))
>>>          return;
>>> +    /* Optimistically assume success */
>>>      static_branch_inc(&hugetlb_optimize_vmemmap_key);
>>> +    SetHPageVmemmapOptimized(head);
>>>      vmemmap_end = vmemmap_start + hugetlb_vmemmap_size(h);
>>>      vmemmap_reuse = vmemmap_start;
>>> @@ -577,10 +579,10 @@ static void __hugetlb_vmemmap_optimize(const struct hstate *h,
>>>       * to the page which @vmemmap_reuse is mapped to, then free the pages
>>>       * which the range [@vmemmap_start, @vmemmap_end] is mapped to.
>>>       */
>>> -    if (vmemmap_remap_free(vmemmap_start, vmemmap_end, vmemmap_reuse, bulk_pages))
>>> +    if (vmemmap_remap_free(vmemmap_start, vmemmap_end, vmemmap_reuse, bulk_pages)) {
>>>          static_branch_dec(&hugetlb_optimize_vmemmap_key);
>>> -    else
>>> -        SetHPageVmemmapOptimized(head);
>>> +        ClearHPageVmemmapOptimized(head);
>>> +    }
>
> Consider the case where we have successfully remapped vmemmap AND
> - we have replaced the page table page (pte page) containing the struct
>   page of the hugetlb head page. Joao's commit 11aad2631bf7
>   'mm/hugetlb_vmemmap: remap head page to newly allocated page'.
> - we have NOT flushed the TLB after remapping due to batching the
>   operations before flush.
>
> In this case, it is possible that the old head page is still in the TLB
> and caches and SetHPageVmemmapOptimized(head) will actually set the flag
> in the old pte page. We then have an optimized hugetlb page without the
> HPageVmemmapOptimized flag set. When developing this series, we
> experienced various BUGs as a result of this situation.

Now I get it. Thanks for the explanation.

>
> In the case of an error during optimization, we do a TLB flush so if
> we need to clear the flag we will write to the correct pte page.

Right.

>
> Hope that makes sense.
>
> I add an explanation like this to the commit message and perhaps put
> this closer to/or squash with the patch that batches operations before
> flushing TLB.

Yes. I'd also like a larger comment here that explains what is going on,
instead of just "Optimistically assume success". That wording makes me
think this is an optimization rather than a mandatory precondition.

Thanks.

> --
> Mike Kravetz
>
>>>  }
>>>  /**
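
To make the ordering requirement concrete, here is a small user-space
sketch of the scenario Mike describes. It is only a model, not kernel
code: every name in it (struct fake_page, struct fake_mmu, the fake_*
helpers, flag_survives) is hypothetical, and the "TLB" is just a pointer
that keeps referring to the old page until it is explicitly flushed.
It assumes the remap copies the old page's contents into the new page
at the time the page table is switched.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Hypothetical stand-ins; none of these names exist in the kernel. */
struct fake_page {
	unsigned long flags;
};

#define FAKE_OPTIMIZED 0x1UL

struct fake_mmu {
	struct fake_page old_page;	/* page originally backing the head struct page */
	struct fake_page new_page;	/* newly allocated replacement page */
	struct fake_page *pte;		/* what the page table currently maps */
	struct fake_page *tlb;		/* cached translation, may be stale */
};

static void fake_init(struct fake_mmu *m)
{
	memset(m, 0, sizeof(*m));
	m->pte = &m->old_page;
	m->tlb = &m->old_page;
}

/* Copy the old contents, switch the page table, but do NOT flush the TLB. */
static void fake_remap_no_flush(struct fake_mmu *m)
{
	m->new_page = m->old_page;
	m->pte = &m->new_page;
}

/* The batched flush: the cached translation catches up with the page table. */
static void fake_flush_tlb(struct fake_mmu *m)
{
	m->tlb = m->pte;
}

/* All accesses go through the (possibly stale) cached translation. */
static void fake_set_flag(struct fake_mmu *m)
{
	m->tlb->flags |= FAKE_OPTIMIZED;
}

static bool fake_test_flag(const struct fake_mmu *m)
{
	return (m->tlb->flags & FAKE_OPTIMIZED) != 0;
}

/*
 * Returns true if the flag is still visible after the deferred flush.
 * set_before_remap == true models the patched ordering, false models
 * setting the flag only after a successful remap.
 */
static bool flag_survives(bool set_before_remap)
{
	struct fake_mmu m;

	fake_init(&m);
	if (set_before_remap)
		fake_set_flag(&m);	/* copied into new_page by the remap */
	fake_remap_no_flush(&m);
	if (!set_before_remap)
		fake_set_flag(&m);	/* lands in old_page via the stale entry */
	fake_flush_tlb(&m);
	return fake_test_flag(&m);
}
```

In this model flag_survives(true) reports the flag as set, while
flag_survives(false) loses it, because the write went to the old page
that is no longer mapped once the flush happens. That is why setting
the flag up front reads to me as a correctness requirement and not an
optimization.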