From: Prakash Sangappa <prakash.sangappa@oracle.com>
To: linux-mm@kvack.org, linux-kernel@vger.kernel.org
Cc: muchun.song@linux.dev, mike.kravetz@oracle.com, akpm@linux-foundation.org
Subject: [PATCH v3] Hugetlb pages should not be reserved by shmat() if SHM_NORESERVE
Date: Tue, 23 Jan 2024 12:04:42 -0800
Message-Id: <1706040282-12388-1-git-send-email-prakash.sangappa@oracle.com>
X-Mailer: git-send-email 2.7.4

For shared memory of type SHM_HUGETLB, hugetlb pages are reserved in the
shmget() call. If the SHM_NORESERVE flag is specified, the hugetlb pages
are not reserved. However, when the shared memory is attached with the
shmat() call, hugetlb pages are incorrectly reserved for SHM_HUGETLB
shared memory created with SHM_NORESERVE. This is a bug.

-------------------------------
Following test shows the issue.

$cat shmhtb.c

#include <errno.h>
#include <stdio.h>
#include <stdlib.h>
#include <sys/ipc.h>
#include <sys/shm.h>

/* SKEY and SHMSZ are not defined in the listing as posted. */

int main()
{
	int shmflags = 0660 | IPC_CREAT | SHM_HUGETLB | SHM_NORESERVE;
	int shmid;

	shmid = shmget(SKEY, SHMSZ, shmflags);
	if (shmid < 0) {
		printf("shmat: shmget() failed, %d\n", errno);
		return 1;
	}
	printf("After shmget()\n");
	system("cat /proc/meminfo | grep -i hugepages_");

	shmat(shmid, NULL, 0);
	printf("\nAfter shmat()\n");
	system("cat /proc/meminfo | grep -i hugepages_");

	shmctl(shmid, IPC_RMID, NULL);
	return 0;
}

#sysctl -w vm.nr_hugepages=20
#./shmhtb

After shmget()
HugePages_Total:      20
HugePages_Free:       20
HugePages_Rsvd:        0
HugePages_Surp:        0

After shmat()
HugePages_Total:      20
HugePages_Free:       20
HugePages_Rsvd:        5 <--
HugePages_Surp:        0
--------------------------------

Fix is to ensure that hugetlb pages are not reserved for SHM_HUGETLB
shared memory in the shmat() call.

Signed-off-by: Prakash Sangappa <prakash.sangappa@oracle.com>
---
v2: Modified the fix to call hugetlb_reserve_pages() with VM_NORESERVE
    instead, since the per-vma lock is allocated in
    hugetlb_reserve_pages().
v3: Updated the change log to describe the user-visible effect of the
    bug with a test case, as suggested by Andrew Morton.

 fs/hugetlbfs/inode.c | 13 ++++++++++++-
 1 file changed, 12 insertions(+), 1 deletion(-)

diff --git a/fs/hugetlbfs/inode.c b/fs/hugetlbfs/inode.c
index f757d4f..40b12b0 100644
--- a/fs/hugetlbfs/inode.c
+++ b/fs/hugetlbfs/inode.c
@@ -100,6 +100,7 @@ static int hugetlbfs_file_mmap(struct file *file, struct vm_area_struct *vma)
 	loff_t len, vma_len;
 	int ret;
 	struct hstate *h = hstate_file(file);
+	vm_flags_t vm_flags;
 
 	/*
 	 * vma address alignment (but not the pgoff alignment) has
@@ -141,10 +142,20 @@ static int hugetlbfs_file_mmap(struct file *file, struct vm_area_struct *vma)
 	file_accessed(file);
 
 	ret = -ENOMEM;
+
+	vm_flags = vma->vm_flags;
+	/*
+	 * for SHM_HUGETLB, the pages are reserved in the shmget() call so skip
+	 * reserving here. Note: only for SHM hugetlbfs file, the inode
+	 * flag S_PRIVATE is set.
+	 */
+	if (inode->i_flags & S_PRIVATE)
+		vm_flags |= VM_NORESERVE;
+
 	if (!hugetlb_reserve_pages(inode,
 				vma->vm_pgoff >> huge_page_order(h),
 				len >> huge_page_shift(h), vma,
-				vma->vm_flags))
+				vm_flags))
 		goto out;
 
 	ret = 0;
-- 
2.7.4
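
The test above checks HugePages_Rsvd by eye in the grep output. For an
automated check, the counter could be read programmatically before and
after shmat(). The following is only a sketch and not part of the patch:
meminfo_field() and read_hugepages_rsvd() are hypothetical helper names,
and the code assumes the usual "Name:   value" layout of /proc/meminfo.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical helper (not part of the patch): return the numeric value
 * of a "Name:   value" field in /proc/meminfo-style text, or -1 if the
 * field is not present.
 */
long meminfo_field(const char *text, const char *name)
{
	size_t name_len = strlen(name);
	const char *line = text;

	while (line && *line) {
		/* Match the whole field name, immediately followed by ':' */
		if (strncmp(line, name, name_len) == 0 && line[name_len] == ':')
			return strtol(line + name_len + 1, NULL, 10);
		/* Advance to the start of the next line, if any */
		line = strchr(line, '\n');
		if (line)
			line++;
	}
	return -1;
}

/* Read /proc/meminfo and return HugePages_Rsvd, or -1 on error. */
long read_hugepages_rsvd(void)
{
	char buf[8192];
	size_t n;
	FILE *fp = fopen("/proc/meminfo", "r");

	if (!fp)
		return -1;
	n = fread(buf, 1, sizeof(buf) - 1, fp);
	fclose(fp);
	buf[n] = '\0';
	return meminfo_field(buf, "HugePages_Rsvd");
}
```

A test could call read_hugepages_rsvd() once after shmget() and once
after shmat() and compare the two values. With SHM_NORESERVE set, the
expectation described in this patch is that the value does not increase
across shmat().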