From: Daniel Gomez
To: "Kirill A. Shutemov", Baolin Wang
CC: Matthew Wilcox, 21cnbao@gmail.com, "Kirill A . Shutemov"
Subject: Re: [RFC PATCH v3 0/4] Support large folios for tmpfs
Date: Mon, 21 Oct 2024 15:34:03 +0200
References: <6dohx7zna7x6hxzo4cwnwarep3a7rohx4qxubds3uujfb7gp3c@2xaubczl2n6d> <8e48cf24-83e1-486e-b89c-41edb7eeff3e@linux.alibaba.com>

On Mon Oct 21, 2024 at 10:54 AM CEST, Kirill A. Shutemov wrote:
> On Mon, Oct 21, 2024 at 02:24:18PM +0800, Baolin Wang wrote:
>>
>>
>> On 2024/10/17 19:26, Kirill A.
>> Shutemov wrote:
>> > On Thu, Oct 17, 2024 at 05:34:15PM +0800, Baolin Wang wrote:
>> > > + Kirill
>> > >
>> > > On 2024/10/16 22:06, Matthew Wilcox wrote:
>> > > > On Thu, Oct 10, 2024 at 05:58:10PM +0800, Baolin Wang wrote:
>> > > > > Considering that tmpfs already has the 'huge=' option to control the THP
>> > > > > allocation, it is necessary to maintain compatibility with the 'huge='
>> > > > > option, as well as considering the 'deny' and 'force' option controlled
>> > > > > by '/sys/kernel/mm/transparent_hugepage/shmem_enabled'.
>> > > >
>> > > > No, it's not. No other filesystem honours these settings. tmpfs would
>> > > > not have had these settings if it were written today. It should simply
>> > > > ignore them, the way that NFS ignores the "intr" mount option now that
>> > > > we have a better solution to the original problem.
>> > > >
>> > > > To reiterate my position:
>> > > >
>> > > > - When using tmpfs as a filesystem, it should behave like other
>> > > >   filesystems.
>> > > > - When using tmpfs to implement MAP_ANONYMOUS | MAP_SHARED, it should
>> > > >   behave like anonymous memory.
>> > >
>> > > I do agree with your point to some extent, but the ‘huge=’ option has
>> > > existed for nearly 8 years, and the huge orders based on write size may not
>> > > achieve the performance of PMD-sized THP in some scenarios, such as when the
>> > > write length is consistently 4K. So, I am still concerned that ignoring the
>> > > 'huge' option could lead to compatibility issues.
>> >
>> > Yeah, I don't think we are there yet to ignore the mount option.
>>
>> OK.
>>
>> > Maybe we need to get a new generic interface to request the semantics
>> > tmpfs has with huge= on per-inode level on any fs. Like a set of FADV_*
>> > handles to make kernel allocate PMD-size folio on any allocation or on
>> > allocations within i_size. I think this behaviour is useful beyond tmpfs.
>> >
>> > Then huge= implementation for tmpfs can be re-defined to set these
>> > per-inode FADV_ flags by default. This way we can keep tmpfs compatible
>> > with current deployments and less special comparing to rest of
>> > filesystems on kernel side.
>>
>> I did a quick search, and I didn't find any other fs that require PMD-sized
>> huge pages, so I am not sure if FADV_* is useful for filesystems other than
>> tmpfs. Please correct me if I missed something.
>
> What do you mean by "require"? THPs are always opportunistic.
>
> IIUC, we don't have a way to hint kernel to use huge pages for a file on
> read from backing storage. Readahead is not always the right way.
>
>> > If huge= is not set, tmpfs would behave the same way as the rest of
>> > filesystems.
>>
>> So if 'huge=' is not set, tmpfs write()/fallocate() can still allocate large
>> folios based on the write size? If yes, that means it will change the
>> default huge behavior for tmpfs. Because previously having 'huge=' is not
>> set means the huge option is 'SHMEM_HUGE_NEVER', which is similar to what I
>> mentioned:
>> "Another possible choice is to make the huge pages allocation based on write
>> size as the *default* behavior for tmpfs, ..."
>
> I am more worried about breaking existing users of huge pages. So changing
> behaviour of users who don't specify huge is okay to me.

I think moving tmpfs to allocate large folios opportunistically by default
(as it was proposed initially) doesn't necessarily conflict with the default
behaviour (huge=never). We just need to clarify that in the documentation.

However, IIRC, one of the requests from Hugh was to have a way to disable
large folios, which is something other filesystems do not have control over
as of today. Ryan sent a proposal to control that globally, but I think it
didn't move forward.

So, what are we missing to go back to implementing large folios in tmpfs for
the default case, like any other fs leveraging large folios?
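
For anyone following along who wants to check the global knob discussed
above, here is a minimal sketch in Python. It assumes the usual sysfs
convention where the selected value is wrapped in square brackets (e.g.
"always within_size advise [never] deny force"). The helper names
active_policy() and read_shmem_enabled() are made up for illustration and
are not part of any kernel or library interface.

```python
import re
from pathlib import Path
from typing import Optional

# Path taken from the thread; present only on kernels built with THP support.
SHMEM_ENABLED = Path("/sys/kernel/mm/transparent_hugepage/shmem_enabled")


def active_policy(text: str) -> str:
    """Return the bracketed (currently selected) value of a sysfs choice string."""
    match = re.search(r"\[([^\]]+)\]", text)
    if match is None:
        raise ValueError(f"no bracketed value in {text!r}")
    return match.group(1)


def read_shmem_enabled(path: Path = SHMEM_ENABLED) -> Optional[str]:
    """Read the active shmem_enabled policy, or None if it cannot be determined."""
    try:
        return active_policy(path.read_text())
    except (OSError, ValueError):
        return None


if __name__ == "__main__":
    print("shmem_enabled:", read_shmem_enabled())
```

This only reports the global setting; the per-mount huge= value is a
separate mount option and is not covered by this sketch.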