Date: Wed, 3 Aug 2022 15:42:15 -0700
From: Mike Kravetz
To: Muchun Song
Cc: Joao Martins, linux-mm@kvack.org, Andrew Morton
Subject: Re: [PATCH v1] mm/hugetlb_vmemmap: remap head page to newly allocated page
References: <20220802180309.19340-1-joao.m.martins@oracle.com> <0b085bb1-b5f7-dfc2-588a-880de0d77ea2@oracle.com>

On 08/03/22 18:44, Muchun Song wrote:
> On Wed, Aug 03, 2022 at 10:52:13AM +0100, Joao Martins wrote:
> > On 8/3/22 05:11, Muchun Song wrote:
> > > On Tue, Aug 02, 2022 at 07:03:09PM +0100, Joao Martins wrote:
> > >> Today with `hugetlb_free_vmemmap=on` the struct page memory that is
> > >> freed back to page allocator is as follows: for a 2M hugetlb page it
> > >> will reuse the first 4K vmemmap page to remap the remaining 7 vmemmap
> > >> pages, and for a 1G hugetlb it will remap the remaining 4095 vmemmap
> > >> pages. Essentially, that means that it breaks the first 4K of a
> > >> potentially contiguous chunk of memory of 32K (for 2M hugetlb pages) or
> > >> 16M (for 1G hugetlb pages).
> > >> For this reason the memory that is freed
> > >> back to page allocator cannot be used for hugetlb to allocate huge pages
> > >> of the same size, but rather only of a smaller huge page size:
> > >>
> > >
> > > Hi Joao,
> > >
> > > Thanks for your work on this. I admit you are right. The current mechanism
> > > prevented the freed vmemmap pages from being merged into a potential
> > > contiguous page. Allocating a new head page is a straightforward approach,
> > > however, it is very dangerous at runtime after system booting up. Why
> > > dangerous? Because you should first 1) copy the content from the head vmemmap
> > > page to the targeted (newly allocated) page, and then 2) change the PTE
> > > entry to the new page. However, the content (especially the refcount) of the
> > > old head vmemmap page could be changed elsewhere (e.g. other modules)
> > > between steps 1) and 2). Eventually, the newly allocated vmemmap page is
> > > corrupted. Unluckily, we don't have an easy approach to prevent it.
> > >
> > OK, I see what I missed. You mean the refcount (or any other data) on the
> > preceding struct pages to this head struct page unrelated to the hugetlb
> > page being remapped. Meaning when the first struct page in the old vmemmap
> > page is *not* the head page we are trying to remap, right?
> >
> > See further below in your patch but I wonder if we could actually check this
> > from the hugetlb head va being aligned to PAGE_SIZE. Meaning that we would be
> > checking that this head page is the first struct page in the vmemmap page
> > and that we are still safe to remap? If so, the patch could be simpler, more
> > like mine, without the special freeing path you added below.
> >
> > If I'm right, see at the end.
> >
> I am not sure we are on the same page (it seems that we are not after I saw your
> below patch). So let me make it clearer.

Thanks Muchun!  I told Joao that you would be the expert in this area and I
was correct.
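
As a quick sanity check of the sizes quoted above (7 of 8 vmemmap pages for
2M, 4095 of 4096 for 1G), here is a small user-space sketch of the arithmetic
together with the PAGE_SIZE alignment test Joao suggests. The 4K page size
and 64-byte struct page are assumptions (typical x86-64 values), and the
MODEL_* constants and function names are made up for illustration; none of
this is kernel code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumed values (typical x86-64), not taken from the patch under review. */
#define MODEL_PAGE_SIZE        4096UL
#define MODEL_STRUCT_PAGE_SIZE 64UL

/* Number of 4K vmemmap pages holding the struct pages of one huge page. */
static unsigned long vmemmap_pages(unsigned long hugepage_size)
{
    unsigned long nr_struct_pages = hugepage_size / MODEL_PAGE_SIZE;

    return nr_struct_pages * MODEL_STRUCT_PAGE_SIZE / MODEL_PAGE_SIZE;
}

/* vmemmap pages that can be freed when only the first one is kept. */
static unsigned long vmemmap_pages_freed(unsigned long hugepage_size)
{
    unsigned long total = vmemmap_pages(hugepage_size);

    assert(total >= 1);
    return total - 1;
}

/* Hypothetical form of the proposed check: is the head struct page the
 * first struct page in its vmemmap page? */
static bool head_is_page_aligned(uintptr_t head_struct_page_va)
{
    return (head_struct_page_va & (MODEL_PAGE_SIZE - 1)) == 0;
}
```

With these assumptions, vmemmap_pages(2UL << 20) is 8 (32K) and
vmemmap_pages(1UL << 30) is 4096 (16M), matching the figures in the patch
description.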
>
> CPU0:                                            CPU1:
>
> vmemmap_remap_free(start, end, reuse)
>   // alloc a new page used to be the head vmemmap page
>   page = alloc_pages_node();
>
>   memcpy(page_address(page), reuse, PAGE_SIZE);
>                                                  // Now the @reuse address is mapped to the original
>                                                  // page frame. So the change will be reflected on the
>                                                  // original page frame.
>                                                  get_page(reuse);

Here is a thought.

This code gets called early after allocating a new hugetlb page.  This new
compound page has a ref count of 1 on the head page and 0 on all the tail
pages.  If the ref count was 0 on the head page, get_page() would not succeed.

I cannot think of a reason why we NEED to have a ref count of 1 on the
head page.  It does make it more convenient to simply call put_page() on
the newly initialized hugetlb page and have the page be added to the hugetlb
free lists, but this could easily be modified.

Matthew Wilcox recently pointed me at this:
https://lore.kernel.org/linux-mm/20220531150611.1303156-1-willy@infradead.org/T/#m98fb9f9bd476155b4951339da51a0887b2377476

That would only work for hugetlb pages allocated from buddy.  For gigantic
pages, we manually 'freeze' (zero the ref count of) tail pages and check for
an unexpectedly increased ref count.  We could do the same with the head page.

Would having zero ref count on the head page eliminate this race?
-- 
Mike Kravetz

>   vmemmap_remap_pte();
>     // remap to the above new allocated page
>     set_pte_at();
>
>   flush_tlb_kernel_range();
>                                                  // Now the @reuse address is mapped to the new allocated
>                                                  // page frame. So the change will be reflected on the
>                                                  // new page frame and it is corrupted.
>                                                  put_page(reuse);
>
> So we should make 1) memcpy, 2) remap and 3) TLB flush atomic on CPU0, however
> it is not easy.
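
To make the 'freeze' idea above concrete, here is a minimal user-space model
using C11 atomics. It is only a sketch: struct model_page and the model_*
helpers are hypothetical names, loosely patterned on the kernel's
get_page_unless_zero() and page_ref_freeze()/page_ref_unfreeze(). It assumes
that whoever races on CPU1 takes its reference through a helper that refuses
to increment a zero count; whether that holds for every path that can reach
the head page is the open question, and the sketch does not settle it.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for the ref count field of a struct page. */
struct model_page {
    atomic_int refcount;
};

/* Take a reference only if the count is non-zero. */
static bool model_get_unless_zero(struct model_page *p)
{
    int old = atomic_load(&p->refcount);

    while (old != 0) {
        /* On failure, 'old' is reloaded with the current value. */
        if (atomic_compare_exchange_weak(&p->refcount, &old, old + 1))
            return true;
    }
    return false;
}

/* Freeze: move the count from 'expected' to 0. Fails if the count is not
 * exactly 'expected', i.e. someone holds an unexpected reference. */
static bool model_freeze(struct model_page *p, int expected)
{
    return atomic_compare_exchange_strong(&p->refcount, &expected, 0);
}

/* Unfreeze: make the page visible again with 'count' references. */
static void model_unfreeze(struct model_page *p, int count)
{
    atomic_store(&p->refcount, count);
}
```

In this model, CPU0 would freeze the head page before the memcpy, do the
remap and TLB flush, and only then unfreeze. A concurrent
model_get_unless_zero() fails while the page is frozen, so the count cannot
change between the copy and the remap.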