Date: Thu, 21 Dec 2023 19:50:55 +0000
From: Matthew Wilcox
To: Fangrui Song
Cc: Yang Shi, Andrew Morton, linux-mm@kvack.org, Song Liu, Miaohe Lin,
 linux-kernel@vger.kernel.org, Zhouyi Zhou
Subject: Re: [PATCH] mm: remove VM_EXEC requirement for THP eligibility
References: <20231220054123.1266001-1-maskray@google.com>

On Wed, Dec 20, 2023 at 08:53:38PM -0800, Fangrui Song wrote:
> Thanks for the comment. Frankly, I am not familiar with huge pages...
> I noticed this VM_EXEC condition when I was writing this
> hugepage-related section in
> https://maskray.me/blog/2023-12-17-exploring-the-section-layout-in-linker-output#transparent-huge-pages-for-mapped-files
> (Thanks to Alexander Monakov's comment about
> CONFIG_READ_ONLY_THP_FOR_FS in
> https://mazzo.li/posts/check-huge-page.html).

CONFIG_READ_ONLY_THP_FOR_FS is a preliminary hack which solves some
problems.  The real solution is using large folios, which at the moment
means that you should test on XFS or AFS; filesystem authors have not
been enthusiastic about adding support to their filesystems so far.

In your blog, you write:

: In -z noseparate-code layouts, the file content starts somewhere at
: the first page, potentially wasting half a huge page on unrelated
: content. Switching to -z separate-code allows reclaiming the benefits
: of the half huge page but increases the file size. Balancing
: these aspects poses a challenge. One potential solution is using
: fallocate(FALLOC_FL_PUNCH_HOLE), which introduces complexity into the
: linker. However, this approach feels like a workaround to address a
: kernel limitation. It would be preferable if a file-backed huge page
: didn't necessitate a file offset aligned to a huge page boundary.

You should distinguish between file size (i.e. st_size in stat(3)) and
the amount of space occupied on storage (st_blocks).  The linker should
be fine with creating a sparse file.  If it isn't, cp --sparse=always
will do the trick.

Yes, it's a kernel limitation that folios have to be aligned within
the file as well as in both virtual and physical address space.  It's
a huge complexity win to do that; I don't think we'd be able to tile the
page cache effectively if we allowed folios to be placed at arbitrary
offsets (I think it turns into a knapsack problem at that point).
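
To make the st_size versus st_blocks distinction concrete, here is a
minimal sketch, assuming Linux (where st_blocks is counted in 512-byte
units) and a filesystem that supports holes.  The helper name
make_sparse, the file name padded.bin and the 2 MiB padding are
hypothetical, chosen only for illustration; this is not what any linker
actually does.

```python
import os
import tempfile

# Assumed padding: one 2 MiB huge page (the x86-64 PMD size).
HOLE = 2 * 1024 * 1024


def make_sparse(path, hole=HOLE, payload=b"x" * 4096):
    """Write `payload` after `hole` unwritten bytes.

    Returns (st_size, allocated_bytes) for the resulting file.
    """
    with open(path, "wb") as f:
        # Seeking past EOF and then writing leaves a hole where the
        # filesystem supports sparse files; nothing is stored for it.
        f.seek(hole)
        f.write(payload)
    st = os.stat(path)
    # On Linux, st_blocks is in 512-byte units, independent of the
    # filesystem block size.
    return st.st_size, st.st_blocks * 512


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as d:
        size, allocated = make_sparse(os.path.join(d, "padded.bin"))
        print(f"st_size={size} allocated={allocated}")
```

On a filesystem with sparse-file support, the allocated figure should
come out far smaller than st_size, since only the payload occupies
storage; on one without it, the two will be roughly equal.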

> As dTLB for read-only data is also an important optimization of
> file-backed THP, it seems straightforward that we should drop the
> VM_EXEC condition :)

I'm not particularly enthusiastic about making
CONFIG_READ_ONLY_THP_FOR_FS better.  Large folios are the future.
Indeed, I'd like to see CONFIG_READ_ONLY_THP_FOR_FS go away in the next
year or two once btrfs and ext4 have support for large folios.