Date: Fri, 28 Mar 2025 19:14:30 +0000
From: Matthew Wilcox
To: Ryan Roberts
Cc: Kalesh Singh, Andrew Morton, David Hildenbrand, Dave Chinner,
 Catalin Marinas, Will Deacon, linux-arm-kernel@lists.infradead.org,
 linux-kernel@vger.kernel.org, linux-fsdevel@vger.kernel.org,
 linux-mm@kvack.org
Subject: Re: [PATCH v3] mm/filemap: Allow arch to request folio size for exec memory
References: <20250327160700.1147155-1-ryan.roberts@arm.com>
 <5131c7ad-cc37-44fc-8672-5866ecbef65b@arm.com>
In-Reply-To: <5131c7ad-cc37-44fc-8672-5866ecbef65b@arm.com>

On Thu, Mar 27, 2025 at 04:23:14PM -0400, Ryan Roberts wrote:
> + Kalesh
>
> On 27/03/2025 12:44, Matthew Wilcox wrote:
> > On Thu, Mar 27, 2025 at 04:06:58PM +0000, Ryan Roberts wrote:
> >> So let's special-case the
> >> read(ahead) logic for executable mappings. The
> >> trade-off is performance improvement (due to more efficient storage of
> >> the translations in iTLB) vs potential read amplification (due to
> >> reading too much data around the fault which won't be used), and the
> >> latter is independent of base page size. I've chosen 64K folio size for
> >> arm64 which benefits both the 4K and 16K base page size configs and
> >> shouldn't lead to any read amplification in practice since the old
> >> read-around path was (usually) reading blocks of 128K. I don't
> >> anticipate any write amplification because text is always RO.
> >
> > Is there not also the potential for wasted memory due to ELF alignment?
>
> I think this is an orthogonal issue? My change isn't making that any worse.

To a certain extent, it is. If readahead was doing order-2 allocations
before and is now doing order-4, you're tying up 0-12 extra pages which
happen to be filled with zeroes due to being used to cache the contents
of a hole.

> > Kalesh talked about it in the MM BOF at the same time that Ted and I
> > were discussing it in the FS BOF. Some coordination required (like
> > maybe Kalesh could have mentioned it to me rather than assuming I'd be
> > there?)
>
> I was at Kalesh's talk. David H suggested that a potential solution might be for
> readahead to ask the fs where the next hole is and then truncate readahead to
> avoid reading the hole. Given it's padding, nothing should directly fault it in
> so it never ends up in the page cache. Not sure if you discussed anything like
> that if you were talking in parallel?

Ted said that he and Kalesh had talked about that solution. I have a
bolder solution in mind which lifts the ext4 extent cache to the
VFS inode so that the readahead code can interrogate it.

> Anyway, I'm not sure if you're suggesting these changes need to be considered as
> one somehow or if you're just mentioning it given it is loosely related? My view
> is that this change is an improvement independently and could go in much sooner.

This is not a reason to delay this patch. It's just a downside which
should be mentioned in the commit message.

> >> +static inline int arch_exec_folio_order(void)
> >> +{
> >> +	return -1;
> >> +}
> >
> > This feels a bit fragile. I often expect to be able to store an order
> > in an unsigned int. Why not return 0 instead?
>
> Well 0 is a valid order, no? I think we have had the "is order signed or
> unsigned" argument before. get_order() returns a signed int :)

But why not always return a valid order? I don't think we need a
sentinel. The default value can be 0 to do what we do today.
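
As a rough sketch of the no-sentinel variant (an illustration only, not
code from the patch): the generic fallback returns order 0 as an unsigned
value, so a caller can use the result directly without testing for -1.
The ARCH_HAS_EXEC_FOLIO_ORDER guard and the exec_folio_bytes() helper are
hypothetical names made up for this example.

```c
#include <stddef.h>

/*
 * Generic fallback. Assumes an architecture that wants larger exec
 * folios defines ARCH_HAS_EXEC_FOLIO_ORDER (hypothetical guard) and
 * supplies its own arch_exec_folio_order().
 */
#ifndef ARCH_HAS_EXEC_FOLIO_ORDER
static inline unsigned int arch_exec_folio_order(void)
{
	return 0;	/* order 0: single-page folios, always a valid order */
}
#endif

/*
 * Hypothetical caller. The order is always valid, so it can be used in
 * a shift without a sentinel check.
 */
static inline size_t exec_folio_bytes(size_t page_size)
{
	return page_size << arch_exec_folio_order();
}
```

With the fallback, exec_folio_bytes() just returns the base page size.
An arch asking for 64K folios would return order 4 with 4K base pages,
or order 2 with 16K base pages.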