From: Daniel Gomez <da.gomez@samsung.com>
To: Baolin Wang
Cc: akpm@linux-foundation.org, hughd@google.com, willy@infradead.org,
	david@redhat.com, wangkefeng.wang@huawei.com, ying.huang@intel.com,
	21cnbao@gmail.com, ryan.roberts@arm.com, shy828301@gmail.com,
	ziy@nvidia.com, ioworker0@gmail.com, Pankaj Raghav,
	linux-mm@kvack.org, linux-kernel@vger.kernel.org
Subject: Re: [PATCH v4 4/6] mm: shmem: add mTHP support for anonymous shmem
Date: Mon, 10 Jun 2024 13:27:53 +0000
In-Reply-To: <9be6eeacd0304c82a1cb1b7487977a3e14d2b5df.1717495894.git.baolin.wang@linux.alibaba.com>
On Tue, Jun 04, 2024 at 06:17:48PM +0800, Baolin Wang wrote:
> Commit 19eaf44954df adds multi-size THP (mTHP) for anonymous pages, that
> can allow THP to be configured through the sysfs interface located at
> '/sys/kernel/mm/transparent_hugepage/hugepage-XXkb/enabled'.
> 
> However, the anonymous shmem will ignore the anonymous mTHP rule
> configured through the sysfs interface, and can only use the PMD-mapped
> THP, that is not reasonable. Users expect to apply the mTHP rule for
> all anonymous pages, including the anonymous shmem, in order to enjoy
> the benefits of mTHP. For example, lower latency than PMD-mapped THP,
> smaller memory bloat than PMD-mapped THP, contiguous PTEs on ARM architecture
> to reduce TLB miss etc. In addition, the mTHP interfaces can be extended
> to support all shmem/tmpfs scenarios in the future, especially for the
> shmem mmap() case.
> 
> The primary strategy is similar to supporting anonymous mTHP. Introduce
> a new interface '/mm/transparent_hugepage/hugepage-XXkb/shmem_enabled',
> which can have almost the same values as the top-level
> '/sys/kernel/mm/transparent_hugepage/shmem_enabled', with adding a new
> additional "inherit" option and dropping the testing options 'force' and
> 'deny'. By default all sizes will be set to "never" except PMD size,
> which is set to "inherit". This ensures backward compatibility with the
> anonymous shmem enabled of the top level, meanwhile also allows independent
> control of anonymous shmem enabled for each mTHP.
> 
> Signed-off-by: Baolin Wang
> ---
>  include/linux/huge_mm.h |  10 +++
>  mm/shmem.c              | 187 +++++++++++++++++++++++++++++++++-------
>  2 files changed, 167 insertions(+), 30 deletions(-)
> 
> diff --git a/include/linux/huge_mm.h b/include/linux/huge_mm.h
> index fac21548c5de..909cfc67521d 100644
> --- a/include/linux/huge_mm.h
> +++ b/include/linux/huge_mm.h
> @@ -575,6 +575,16 @@ static inline bool thp_migration_supported(void)
>  {
>  	return false;
>  }
> +
> +static inline int highest_order(unsigned long orders)
> +{
> +	return 0;
> +}
> +
> +static inline int next_order(unsigned long *orders, int prev)
> +{
> +	return 0;
> +}
>  #endif /* CONFIG_TRANSPARENT_HUGEPAGE */
>  
>  static inline int split_folio_to_list_to_order(struct folio *folio,
> diff --git a/mm/shmem.c b/mm/shmem.c
> index 643ff7516b4d..9a8533482208 100644
> --- a/mm/shmem.c
> +++ b/mm/shmem.c
> @@ -1611,6 +1611,107 @@ static gfp_t limit_gfp_mask(gfp_t huge_gfp, gfp_t limit_gfp)
>  	return result;
>  }
>  
> +#ifdef CONFIG_TRANSPARENT_HUGEPAGE
> +static unsigned long anon_shmem_allowable_huge_orders(struct inode *inode,

We want to get mTHP orders for tmpfs as well, so could we make this work
for both paths? If so, I'd drop the "anon" prefix.

> +					struct vm_area_struct *vma, pgoff_t index,
> +					bool global_huge)

Why did you rename the 'huge' variable to 'global_huge'? We were using
'huge' in shmem_alloc_and_add_folio() before this commit. The rename just
seems odd to me, since there is no name conflict inside the function.
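
To illustrate merging both paths, this is roughly the shape I have in mind
(untested sketch, not a tested implementation; allowing a NULL vma for the
tmpfs path is my assumption):

	/*
	 * Untested sketch: drop the anon_ prefix and allow vma == NULL so
	 * the tmpfs (page cache) path can reuse this helper; the vma-based
	 * checks then only run when a vma is present.
	 */
	static unsigned long shmem_allowable_huge_orders(struct inode *inode,
					struct vm_area_struct *vma, pgoff_t index,
					bool huge)
	{
		unsigned long vm_flags = vma ? vma->vm_flags : 0;

		if (vma && ((vm_flags & VM_NOHUGEPAGE) ||
		    test_bit(MMF_DISABLE_THP, &vma->vm_mm->flags)))
			return 0;
		/* ... rest as in this patch ... */
	}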
> +{
> +	unsigned long mask = READ_ONCE(huge_anon_shmem_orders_always);
> +	unsigned long within_size_orders = READ_ONCE(huge_anon_shmem_orders_within_size);
> +	unsigned long vm_flags = vma->vm_flags;
> +	/*
> +	 * Check all the (large) orders below HPAGE_PMD_ORDER + 1 that
> +	 * are enabled for this vma.
> +	 */
> +	unsigned long orders = BIT(PMD_ORDER + 1) - 1;
> +	loff_t i_size;
> +	int order;
> +

We can start the mm anon path here, but we should exclude the checks that
do not apply to tmpfs.

> +	if ((vm_flags & VM_NOHUGEPAGE) ||
> +	    test_bit(MMF_DISABLE_THP, &vma->vm_mm->flags))
> +		return 0;
> +
> +	/* If the hardware/firmware marked hugepage support disabled. */
> +	if (transparent_hugepage_flags & (1 << TRANSPARENT_HUGEPAGE_UNSUPPORTED))
> +		return 0;
> +
> +	/*
> +	 * Following the 'deny' semantics of the top level, force the huge
> +	 * option off from all mounts.
> +	 */
> +	if (shmem_huge == SHMEM_HUGE_DENY)
> +		return 0;
> +
> +	/*
> +	 * Only allow inherit orders if the top-level value is 'force', which
> +	 * means non-PMD sized THP can not override 'huge' mount option now.
> +	 */
> +	if (shmem_huge == SHMEM_HUGE_FORCE)
> +		return READ_ONCE(huge_anon_shmem_orders_inherit);
> +
> +	/* Allow mTHP that will be fully within i_size. */
> +	order = highest_order(within_size_orders);
> +	while (within_size_orders) {
> +		index = round_up(index + 1, order);
> +		i_size = round_up(i_size_read(inode), PAGE_SIZE);
> +		if (i_size >> PAGE_SHIFT >= index) {
> +			mask |= within_size_orders;
> +			break;
> +		}
> +
> +		order = next_order(&within_size_orders, order);
> +	}
> +
> +	if (vm_flags & VM_HUGEPAGE)
> +		mask |= READ_ONCE(huge_anon_shmem_orders_madvise);
> +
> +	if (global_huge)
> +		mask |= READ_ONCE(huge_anon_shmem_orders_inherit);
> +
> +	return orders & mask;
> +}
> +
> +static unsigned long anon_shmem_suitable_orders(struct inode *inode, struct vm_fault *vmf,
> +					struct address_space *mapping, pgoff_t index,
> +					unsigned long orders)
> +{
> +	struct vm_area_struct *vma = vmf->vma;
> +	unsigned long pages;
> +	int order;
> +
> +	orders = thp_vma_suitable_orders(vma, vmf->address, orders);

This won't apply to tmpfs. I'm wondering if we can apply
shmem_mapping_size_order() [1] here for the tmpfs path, so we get the same
suitable orders for both paths.

[1] https://lore.kernel.org/all/v5acpezkt4ml3j3ufmbgnq5b335envea7xfobvowtaetvbt3an@v3pfkwly5jh2/#t
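
Roughly what I mean, as an untested sketch: shmem_mapping_size_order() is
the helper proposed in [1] (its exact signature may differ), and I'm
assuming the write length ('len') would be plumbed through from the caller:

	/*
	 * Untested sketch: keep thp_vma_suitable_orders() for the fault
	 * path and cap 'orders' by a size-based order for the tmpfs
	 * write path, where there is no vma.
	 */
	if (vma)
		orders = thp_vma_suitable_orders(vma, vmf->address, orders);
	else
		orders &= BIT(shmem_mapping_size_order(mapping, index, len) + 1) - 1;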
> +	if (!orders)
> +		return 0;
> +
> +	/* Find the highest order that can add into the page cache */
> +	order = highest_order(orders);
> +	while (orders) {
> +		pages = 1UL << order;
> +		index = round_down(index, pages);
> +		if (!xa_find(&mapping->i_pages, &index,
> +				index + pages - 1, XA_PRESENT))
> +			break;
> +		order = next_order(&orders, order);
> +	}
> +
> +	return orders;
> +}
> +#else
> +static unsigned long anon_shmem_allowable_huge_orders(struct inode *inode,
> +					struct vm_area_struct *vma, pgoff_t index,
> +					bool global_huge)
> +{
> +	return 0;
> +}
> +
> +static unsigned long anon_shmem_suitable_orders(struct inode *inode, struct vm_fault *vmf,
> +					struct address_space *mapping, pgoff_t index,
> +					unsigned long orders)
> +{
> +	return 0;
> +}
> +#endif /* CONFIG_TRANSPARENT_HUGEPAGE */
> +
>  static struct folio *shmem_alloc_folio(gfp_t gfp, int order,
>  		struct shmem_inode_info *info, pgoff_t index)
>  {
> @@ -1625,38 +1726,55 @@ static struct folio *shmem_alloc_folio(gfp_t gfp, int order,
>  	return folio;
>  }
>  
> -static struct folio *shmem_alloc_and_add_folio(gfp_t gfp,
> -		struct inode *inode, pgoff_t index,
> -		struct mm_struct *fault_mm, bool huge)
> +static struct folio *shmem_alloc_and_add_folio(struct vm_fault *vmf,
> +		gfp_t gfp, struct inode *inode, pgoff_t index,
> +		struct mm_struct *fault_mm, unsigned long orders)
>  {
>  	struct address_space *mapping = inode->i_mapping;
>  	struct shmem_inode_info *info = SHMEM_I(inode);
> -	struct folio *folio;
> +	struct vm_area_struct *vma = vmf ? vmf->vma : NULL;
> +	unsigned long suitable_orders = 0;
> +	struct folio *folio = NULL;
>  	long pages;
> -	int error;
> +	int error, order;
>  
>  	if (!IS_ENABLED(CONFIG_TRANSPARENT_HUGEPAGE))
> -		huge = false;
> +		orders = 0;
>  
> -	if (huge) {
> -		pages = HPAGE_PMD_NR;
> -		index = round_down(index, HPAGE_PMD_NR);
> +	if (orders > 0) {

Can we get rid of this condition if we handle all allowable orders in
'orders', including order-0 and PMD order? I agree we do not need the huge
flag anymore, since you have handled all cases in
shmem_allowable_huge_orders().
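
For instance (untested sketch), if order-0 is always part of 'orders', the
allocation fallback loop below can be the only path and the special case
goes away (shmem_suitable_orders() being the un-prefixed rename suggested
above):

	/*
	 * Untested sketch: with BIT(0) always included, the loop ends at
	 * order-0 and the 'orders > 0' / 'else' split is no longer needed.
	 */
	suitable_orders = shmem_suitable_orders(inode, vmf, mapping, index,
						orders) | BIT(0);
	order = highest_order(suitable_orders);
	while (suitable_orders) {
		pages = 1UL << order;
		index = round_down(index, pages);
		folio = shmem_alloc_folio(gfp, order, info, index);
		if (folio)
			goto allocated;
		if (pages == HPAGE_PMD_NR)
			count_vm_event(THP_FILE_FALLBACK);
		order = next_order(&suitable_orders, order);
	}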
> +		if (vma && vma_is_anon_shmem(vma)) {
> +			suitable_orders = anon_shmem_suitable_orders(inode, vmf,
> +							mapping, index, orders);
> +		} else if (orders & BIT(HPAGE_PMD_ORDER)) {
> +			pages = HPAGE_PMD_NR;
> +			suitable_orders = BIT(HPAGE_PMD_ORDER);
> +			index = round_down(index, HPAGE_PMD_NR);
>  
> -		/*
> -		 * Check for conflict before waiting on a huge allocation.
> -		 * Conflict might be that a huge page has just been allocated
> -		 * and added to page cache by a racing thread, or that there
> -		 * is already at least one small page in the huge extent.
> -		 * Be careful to retry when appropriate, but not forever!
> -		 * Elsewhere -EEXIST would be the right code, but not here.
> -		 */
> -		if (xa_find(&mapping->i_pages, &index,
> -				index + HPAGE_PMD_NR - 1, XA_PRESENT))
> -			return ERR_PTR(-E2BIG);
> +			/*
> +			 * Check for conflict before waiting on a huge allocation.
> +			 * Conflict might be that a huge page has just been allocated
> +			 * and added to page cache by a racing thread, or that there
> +			 * is already at least one small page in the huge extent.
> +			 * Be careful to retry when appropriate, but not forever!
> +			 * Elsewhere -EEXIST would be the right code, but not here.
> +			 */
> +			if (xa_find(&mapping->i_pages, &index,
> +					index + HPAGE_PMD_NR - 1, XA_PRESENT))
> +				return ERR_PTR(-E2BIG);
> +		}
>  
> -		folio = shmem_alloc_folio(gfp, HPAGE_PMD_ORDER, info, index);
> -		if (!folio && pages == HPAGE_PMD_NR)
> -			count_vm_event(THP_FILE_FALLBACK);
> +		order = highest_order(suitable_orders);
> +		while (suitable_orders) {
> +			pages = 1UL << order;
> +			index = round_down(index, pages);
> +			folio = shmem_alloc_folio(gfp, order, info, index);
> +			if (folio)
> +				goto allocated;
> +
> +			if (pages == HPAGE_PMD_NR)
> +				count_vm_event(THP_FILE_FALLBACK);
> +			order = next_order(&suitable_orders, order);
> +		}
>  	} else {
>  		pages = 1;
>  		folio = shmem_alloc_folio(gfp, 0, info, index);
> @@ -1664,6 +1782,7 @@ static struct folio *shmem_alloc_and_add_folio(gfp_t gfp,
>  	if (!folio)
>  		return ERR_PTR(-ENOMEM);
>  
> +allocated:
>  	__folio_set_locked(folio);
>  	__folio_set_swapbacked(folio);
>  
> @@ -1958,7 +2077,8 @@ static int shmem_get_folio_gfp(struct inode *inode, pgoff_t index,
>  	struct mm_struct *fault_mm;
>  	struct folio *folio;
>  	int error;
> -	bool alloced;
> +	bool alloced, huge;
> +	unsigned long orders = 0;
>  
>  	if (WARN_ON_ONCE(!shmem_mapping(inode->i_mapping)))
>  		return -EINVAL;
> @@ -2030,14 +2150,21 @@ static int shmem_get_folio_gfp(struct inode *inode, pgoff_t index,
>  		return 0;
>  	}
>  
> -	if (shmem_is_huge(inode, index, false, fault_mm,
> -			  vma ? vma->vm_flags : 0)) {
> +	huge = shmem_is_huge(inode, index, false, fault_mm,
> +			     vma ? vma->vm_flags : 0);
> +	/* Find hugepage orders that are allowed for anonymous shmem. */
> +	if (vma && vma_is_anon_shmem(vma))

I guess we do not want to check the anon path here either (in case you
agree to merge this with the tmpfs path).

> +		orders = anon_shmem_allowable_huge_orders(inode, vma, index, huge);
> +	else if (huge)
> +		orders = BIT(HPAGE_PMD_ORDER);

Why not handle this case inside allowable_huge_orders()?

> +
> +	if (orders > 0) {

Does it make sense to handle this case anymore? Before, we had the huge
path and order-0. If we handle all cases in allowable_orders(), perhaps we
can simplify this.

>  		gfp_t huge_gfp;
>  
>  		huge_gfp = vma_thp_gfp_mask(vma);

We are also setting this flag regardless of the final order, meaning that
suitable_orders() might return order-0 and yet we keep the huge gfp flags.
Is that right?

>  		huge_gfp = limit_gfp_mask(huge_gfp, gfp);
> -		folio = shmem_alloc_and_add_folio(huge_gfp,
> -				inode, index, fault_mm, true);
> +		folio = shmem_alloc_and_add_folio(vmf, huge_gfp,
> +				inode, index, fault_mm, orders);
>  		if (!IS_ERR(folio)) {
>  			if (folio_test_pmd_mappable(folio))
>  				count_vm_event(THP_FILE_ALLOC);
> @@ -2047,7 +2174,7 @@ static int shmem_get_folio_gfp(struct inode *inode, pgoff_t index,
>  		goto repeat;
>  	}
>  
> -	folio = shmem_alloc_and_add_folio(gfp, inode, index, fault_mm, false);
> +	folio = shmem_alloc_and_add_folio(vmf, gfp, inode, index, fault_mm, 0);
>  	if (IS_ERR(folio)) {
>  		error = PTR_ERR(folio);
>  		if (error == -EEXIST)
> @@ -2058,7 +2185,7 @@ static int shmem_get_folio_gfp(struct inode *inode, pgoff_t index,
>  
>  alloced:
>  	alloced = true;
> -	if (folio_test_pmd_mappable(folio) &&
> +	if (folio_test_large(folio) &&
>  	    DIV_ROUND_UP(i_size_read(inode), PAGE_SIZE) <
>  					folio_next_index(folio) - 1) {
>  		struct shmem_sb_info *sbinfo = SHMEM_SB(inode->i_sb);
> -- 
> 2.39.3
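
Putting the above together, the flow in shmem_get_folio_gfp() could then
look roughly like this (untested sketch; it assumes allowable_huge_orders()
calls shmem_is_huge() internally and folds BIT(HPAGE_PMD_ORDER) in for the
non-anon mount case):

	/*
	 * Untested sketch: no vma_is_anon_shmem() or 'huge' branching in
	 * the caller; an empty order mask falls through to the order-0
	 * allocation below.
	 */
	orders = shmem_allowable_huge_orders(inode, vma, index);
	if (orders) {
		gfp_t huge_gfp = limit_gfp_mask(vma_thp_gfp_mask(vma), gfp);

		folio = shmem_alloc_and_add_folio(vmf, huge_gfp,
				inode, index, fault_mm, orders);
		if (!IS_ERR(folio)) {
			if (folio_test_pmd_mappable(folio))
				count_vm_event(THP_FILE_ALLOC);
			goto alloced;
		}
		if (PTR_ERR(folio) == -EEXIST)
			goto repeat;
	}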