Date: Thu, 15 Feb 2024 14:48:49 -0800
From: Andrew Morton
To: Ryan Roberts
Cc: Catalin Marinas, Will Deacon, Mark Rutland, "Matthew Wilcox (Oracle)", David Hildenbrand, Barry Song <21cnbao@gmail.com>, John Hubbard, linux-arm-kernel@lists.infradead.org, linux-kernel@vger.kernel.org, linux-fsdevel@vger.kernel.org, linux-mm@kvack.org
Subject: Re: [PATCH v2] mm/filemap: Allow arch to request folio size for exec memory
Message-Id: <20240215144849.aba06863acc08b8ded09a187@linux-foundation.org>
In-Reply-To: <20240215154059.2863126-1-ryan.roberts@arm.com>
References: <20240215154059.2863126-1-ryan.roberts@arm.com>

On Thu, 15 Feb 2024 15:40:59 +0000 Ryan Roberts wrote:

> Change the readahead config so that if it is being requested for an
> executable mapping, do a
> synchronous read of an arch-specified size in a
> naturally aligned manner.

Some nits:

> --- a/arch/arm64/include/asm/pgtable.h
> +++ b/arch/arm64/include/asm/pgtable.h
> @@ -1115,6 +1115,18 @@ static inline void update_mmu_cache_range(struct vm_fault *vmf,
>   */
>  #define arch_wants_old_prefaulted_pte	cpu_has_hw_af
>
> +/*
> + * Request exec memory is read into pagecache in at least 64K folios. The
> + * trade-off here is performance improvement due to storing translations more
> + * effciently in the iTLB vs the potential for read amplification due to reading

"efficiently"

> + * data from disk that won't be used. The latter is independent of base page
> + * size, so we set a page-size independent block size of 64K. This size can be
> + * contpte-mapped when 4K base pages are in use (16 pages into 1 iTLB entry),
> + * and HPA can coalesce it (4 pages into 1 TLB entry) when 16K base pages are in
> + * use.
> + */
> +#define arch_wants_exec_folio_order() ilog2(SZ_64K >> PAGE_SHIFT)
> +

To my eye, "arch_wants_foo" and "arch_want_foo" are booleans.  Either
this arch wants a particular treatment or it does not want it.  I
suggest a better name would be "arch_exec_folio_order".

>  static inline bool pud_sect_supported(void)
>  {
>  	return PAGE_SIZE == SZ_4K;
> diff --git a/include/linux/pgtable.h b/include/linux/pgtable.h
> index aab227e12493..6cdd145cbbb9 100644
> --- a/include/linux/pgtable.h
> +++ b/include/linux/pgtable.h
> @@ -407,6 +407,18 @@ static inline bool arch_has_hw_pte_young(void)
>  }
>  #endif
>
> +#ifndef arch_wants_exec_folio_order
> +/*
> + * Returns preferred minimum folio order for executable file-backed memory. Must
> + * be in range [0, PMD_ORDER]. Negative value implies that the HW has no
> + * preference and mm will not special-case executable memory in the pagecache.
> + */

I think this comment contains material which would be useful above the
other arch_wants_exec_folio_order() implementation - the "must be in
range" part.
So I suggest all this material be incorporated into a single comment
which describes arch_wants_exec_folio_order().  Then this comment can
be removed entirely.  Assume the reader knows to go seek the other
definition for the commentary.

> +static inline int arch_wants_exec_folio_order(void)
> +{
> +	return -1;
> +}
> +#endif
> +
>  #ifndef arch_check_zapped_pte
>  static inline void arch_check_zapped_pte(struct vm_area_struct *vma,
>  					 pte_t pte)
> diff --git a/mm/filemap.c b/mm/filemap.c
> index 142864338ca4..7954274de11c 100644
> --- a/mm/filemap.c
> +++ b/mm/filemap.c
> @@ -3118,6 +3118,25 @@ static struct file *do_sync_mmap_readahead(struct vm_fault *vmf)
>  	}
>  #endif
>
> +	/*
> +	 * Allow arch to request a preferred minimum folio order for executable
> +	 * memory. This can often be beneficial to performance if (e.g.) arm64
> +	 * can contpte-map the folio. Executable memory rarely benefits from
> +	 * read-ahead anyway, due to its random access nature.

"readahead"

> +	 */
> +	if (vm_flags & VM_EXEC) {
> +		int order = arch_wants_exec_folio_order();
> +
> +		if (order >= 0) {
> +			fpin = maybe_unlock_mmap_for_io(vmf, fpin);
> +			ra->size = 1UL << order;
> +			ra->async_size = 0;
> +			ractl._index &= ~((unsigned long)ra->size - 1);
> +			page_cache_ra_order(&ractl, ra, order);
> +			return fpin;
> +		}
> +	}
> +
>  	/* If we don't want any read-ahead, don't bother */
>  	if (vm_flags & VM_RAND_READ)
>  		return fpin;
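
For anyone following the arithmetic in the quoted hunks, here is a
minimal userspace sketch. It is not kernel code. It assumes only what
the patch text shows, namely ilog2(SZ_64K >> PAGE_SHIFT) for the order
and the "_index &= ~(size - 1)" step for natural alignment. The helper
names ilog2_ul(), exec_folio_order() and align_index() are hypothetical,
and page_shift is passed as a parameter instead of coming from
PAGE_SHIFT.

```c
#include <assert.h>

#define SZ_64K 0x10000UL

/* Hypothetical stand-in for the kernel's ilog2(): floor(log2(n)), n != 0. */
static unsigned int ilog2_ul(unsigned long n)
{
	unsigned int order = 0;

	assert(n != 0);
	while (n >>= 1)
		order++;
	return order;
}

/*
 * Mirrors ilog2(SZ_64K >> PAGE_SHIFT) from the quoted arm64 hunk, with
 * the page shift supplied by the caller (12 = 4K, 14 = 16K, 16 = 64K).
 */
static unsigned int exec_folio_order(unsigned int page_shift)
{
	assert(page_shift <= 16);
	return ilog2_ul(SZ_64K >> page_shift);
}

/*
 * Mirrors "ractl._index &= ~((unsigned long)ra->size - 1)": round a page
 * index down to the start of its naturally aligned block of 2^order pages.
 */
static unsigned long align_index(unsigned long index, unsigned int order)
{
	unsigned long size = 1UL << order;

	return index & ~(size - 1);
}
```

Under these assumptions, 4K base pages give order 4 (16 pages per 64K
folio), 16K base pages give order 2 (4 pages), and 64K base pages give
order 0, which matches the page counts in the quoted arm64 comment. A
fault at page index 37 with order 4 would start the read at index 32.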