Message-ID: <1579c011-eb6f-09ad-bf9b-5b3df3e1b182@oracle.com>
Date: Thu, 4 Aug 2022 10:23:48 +0100
Subject: Re: [PATCH v1] mm/hugetlb_vmemmap: remap head page to newly allocated page
From: Joao Martins
To: Muchun Song
Cc: linux-mm@kvack.org, Mike Kravetz, Andrew Morton
References: <20220802180309.19340-1-joao.m.martins@oracle.com> <0b085bb1-b5f7-dfc2-588a-880de0d77ea2@oracle.com>

On 8/4/22 08:17, Muchun Song wrote:
> On Wed, Aug 03, 2022 at 01:22:21PM +0100, Joao Martins wrote:
>>
>>
>> On 8/3/22 11:44, Muchun Song wrote:
>>> On Wed, Aug 03, 2022 at 10:52:13AM +0100, Joao Martins wrote:
>>>> On 8/3/22 05:11, Muchun Song wrote:
>>>>> On Tue, Aug 02, 2022 at 07:03:09PM +0100, Joao Martins wrote:
>>>>>> Today with `hugetlb_free_vmemmap=on` the struct page memory that is
>>>>>> freed back to page allocator is as follows: for a 2M hugetlb page it
>>>>>> will reuse the first 4K vmemmap page to remap the remaining 7 vmemmap
>>>>>> pages, and for a 1G hugetlb it will remap the remaining 4095 vmemmap
>>>>>> pages. Essentially, that means that it breaks the first 4K of a
>>>>>> potentially contiguous chunk of memory of 32K (for 2M hugetlb pages) or
>>>>>> 16M (for 1G hugetlb pages). For this reason the memory that is freed
>>>>>> back to page allocator cannot be used for hugetlb to allocate huge pages
>>>>>> of the same size, but rather only of a smaller huge page size:
>>>>>>
>>>>>
>>>>> Hi Joao,
>>>>>
>>>>> Thanks for your work on this. I admit you are right. The current mechanism
>>>>> prevents the freed vmemmap pages from being merged into a potential
>>>>> contiguous page. Allocating a new head page is a straightforward approach,
>>>>> however, it is very dangerous at runtime after system booting up. Why
>>>>> dangerous? Because you should first 1) copy the content from the head vmemmap
>>>>> page to the targeted (newly allocated) page, and then 2) change the PTE
>>>>> entry to the new page. However, the content (especially the refcount) of the
>>>>> old head vmemmap page could be changed elsewhere (e.g. other modules)
>>>>> between the step 1) and 2). Eventually, the new allocated vmemmap page is
>>>>> corrupted. Unluckily, we don't have an easy approach to prevent it.
>>>>>
>>>> OK, I see what I missed. You mean the refcount (or any other data) on the
>>>> preceding struct pages to this head struct page unrelated to the hugetlb
>>>> page being remapped. Meaning when the first struct page in the old vmemmap
>>>> page is *not* the head page we are trying to remap, right?
>>>>
>>>> See further below in your patch but I wonder if we could actually check this
>>>> from the hugetlb head va being aligned to PAGE_SIZE. Meaning that we would be checking
>>>> that this head page is the first struct page in the vmemmap page and that
>>>> we are still safe to remap? If so, the patch could be simpler, more
>>>> like mine, without the special freeing path you added below.
>>>>
>>>> If I'm right, see at the end.
>>>
>>> I am not sure we are on the same page (it seems that we are not after I saw your
>>> below patch).
>>
>> Even though I misunderstood you it might still look like a possible scenario.
>>
>>> So let me clarify.
>>>
>> Thanks
>>
>>> CPU0:                                            CPU1:
>>>
>>> vmemmap_remap_free(start, end, reuse)
>>>   // alloc a new page used to be the head vmemmap page
>>>   page = alloc_pages_node();
>>>
>>>   memcpy(page_address(page), reuse, PAGE_SIZE);
>>>                                                  // Now the @reuse address is mapped to the original
>>>                                                  // page frame. So the change will be reflected on the
>>>                                                  // original page frame.
>>>                                                  get_page(reuse);
>>>   vmemmap_remap_pte();
>>>    // remap to the above new allocated page
>>>    set_pte_at();
>>>
>>>   flush_tlb_kernel_range();
>>
>> note-to-self: totally missed to change the flush_tlb_kernel_range() to include the full range.
>>
>
> Right. I have noticed that as well.
>
>>>                                                  // Now the @reuse address is mapped to the new allocated
>>>                                                  // page frame. So the change will be reflected on the
>>>                                                  // new page frame and it is corrupted.
>>>                                                  put_page(reuse);
>>>
>>> So we should make 1) memcpy, 2) remap and 3) TLB flush atomic on CPU0, however
>>> it is not easy.
>>>
>> OK, I understand what you mean now. However, I am trying to follow if this race is
>> possible? Note that given your previous answer above, I am assuming in your race scenario
>> that the vmemmap page only ever stores metadata (struct page) related to the hugetlb-page
>> currently being remapped. If this assumption is wrong, then your race would be possible
>> (but it wouldn't be from a get_page in the reuse_addr)
>>
>> So, how would we get into doing a get_page() on the head-page that we are remapping (and
>> its put_page() for that matter) from somewhere else ... considering we are at
>> prep_new_huge_page() when we call vmemmap_remap_free() and hence ... we already got it
>> from page allocator ... but hugetlb hasn't yet finished initializing the head page
>> metadata. Hence it isn't yet accounted for someone to grab either e.g. in
>> demote/page-fault/migration/etc?
>>
>
> As I know, at least there are two places which could get the refcount.
> 1) GUP and 2) Memory failure.
>
So I am aware of the means to grab the page refcount. What I wasn't sure about
is how we get into such a situation in the first place:

> For 1), you can refer to the commit 7118fc2906e2925d7edb5ed9c8a57f2a5f23b849.

Good pointer. This one sheds some light too:

https://lore.kernel.org/linux-mm/CAG48ez23q0Jy9cuVnwAe7t_fdhMk2S7N5Hdi-GLcCeq5bsfLxw@mail.gmail.com/

I wonder how we get into a situation of doing a GUP on a user VA referring to a
just-allocated *hugetlb* page from buddy (but not yet in the hugetlb free lists),
even if temporarily. Unless the page was still sitting in page tables that were
about to be torn down. Maybe it is specific to gigantic pages. Hmm

> For 2), I think you can refer to the function of __get_unpoison_page().
>
Ah, yes. Good point.

> Both places could grab an extra refcount to the processing HugeTLB's struct page.
>
Thanks for the pointers ;)
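
For readers following the thread, the CPU0/CPU1 interleaving quoted above can be
replayed deterministically in a small userspace C model. This is only a sketch
under stated assumptions: toy_frame, toy_state and toy_replay() are hypothetical
names, a plain pointer stands in for the PTE of the @reuse address, and a single
int stands in for the head struct page's refcount. It is not kernel code, and it
says nothing about whether the race is actually reachable from
prep_new_huge_page().

```c
#include <assert.h>
#include <string.h>

/* Hypothetical stand-in for a vmemmap page: just the head page's refcount. */
struct toy_frame {
	int refcount;
};

/* Hypothetical model state: two physical frames and one "PTE". */
struct toy_state {
	struct toy_frame frames[2];	/* [0] original frame, [1] newly allocated */
	struct toy_frame *pte;		/* frame the @reuse address currently maps to */
};

/* "CPU1" side: both helpers go through whatever the mapping points at. */
static void toy_get_page(struct toy_state *s) { s->pte->refcount++; }
static void toy_put_page(struct toy_state *s) { s->pte->refcount--; }

/*
 * Replay the sequence and return the refcount visible through the mapping
 * at the end. With racy != 0 the get lands between the copy and the remap,
 * and the put lands after the remap, as in the quoted diagram.
 */
int toy_replay(int racy)
{
	struct toy_state s;

	s.frames[0].refcount = 1;	/* reference held by the allocating path */
	s.frames[1].refcount = 0;
	s.pte = &s.frames[0];
	assert(s.pte == &s.frames[0]);

	if (racy) {
		/* CPU0: copy the old frame into the new one (memcpy step). */
		memcpy(&s.frames[1], &s.frames[0], sizeof(s.frames[0]));
		/* CPU1: increment lands on the ORIGINAL frame, after the copy. */
		toy_get_page(&s);
		/* CPU0: remap @reuse to the new frame (set_pte_at step). */
		s.pte = &s.frames[1];
		/* CPU1: decrement lands on the NEW frame, which never saw the get. */
		toy_put_page(&s);
	} else {
		/* The get/put pair completes before the copy: nothing is lost. */
		toy_get_page(&s);
		toy_put_page(&s);
		memcpy(&s.frames[1], &s.frames[0], sizeof(s.frames[0]));
		s.pte = &s.frames[1];
	}

	return s.pte->refcount;
}
```

In this model toy_replay(1) ends with a refcount one lower than toy_replay(0),
because the increment was applied to the old frame after the copy while the
matching decrement was applied to the new one. That is the corruption the
diagram describes, and why the copy, remap and TLB flush would need to appear
atomic to other CPUs.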