Date: Mon, 5 May 2025 12:09:57 +0200
From: Jan Kara
To: David Hildenbrand
Cc: Ryan Roberts, Andrew Morton, "Matthew Wilcox (Oracle)",
 Alexander Viro, Christian Brauner, Jan Kara, Dave Chinner,
 Catalin Marinas, Will Deacon, Kalesh Singh, Zi Yan,
 linux-arm-kernel@lists.infradead.org, linux-kernel@vger.kernel.org,
 linux-fsdevel@vger.kernel.org, linux-mm@kvack.org
Subject: Re: [RFC PATCH v4 1/5] mm/readahead: Honour new_order in page_cache_ra_order()
References: <20250430145920.3748738-1-ryan.roberts@arm.com>
 <20250430145920.3748738-2-ryan.roberts@arm.com>
 <48b4aa79-943b-46bc-ac24-604fdf998566@redhat.com>
In-Reply-To: <48b4aa79-943b-46bc-ac24-604fdf998566@redhat.com>

On Mon 05-05-25 11:51:43, David Hildenbrand wrote:
> On 30.04.25 16:59, Ryan Roberts wrote:
> > page_cache_ra_order() takes a parameter called new_order, which is
> > intended to express the preferred order of the folios that will be
> > allocated for the readahead operation. Most callers indeed call this
> > with their preferred new order. But page_cache_async_ra() calls it with
> > the preferred order of the previous readahead request (actually the
> > order of the folio that had the readahead marker, which may be smaller
> > when alignment comes into play).
> >
> > And despite the parameter name, page_cache_ra_order() always treats it
> > as the old order, adding 2 to it on entry. As a result, a cold readahead
> > always starts with order-2 folios.
> >
> > Let's fix this behaviour by always passing in the *new* order.
> >
> > Worked example:
> >
> > Prior to the change, mmapping an 8MB file and touching each page
> > sequentially resulted in the following, where we start with order-2
> > folios for the first 128K then ramp up to order-4 for the next 128K,
> > then get clamped to order-5 for the rest of the file because ra_pages is
> > limited to 128K:
> >
> > TYPE    STARTOFFS     ENDOFFS       SIZE  STARTPG    ENDPG   NRPG  ORDER
> > -----  ----------  ----------  ---------  -------  -------  -----  -----
> > FOLIO  0x00000000  0x00004000      16384        0        4      4      2
> > FOLIO  0x00004000  0x00008000      16384        4        8      4      2
> > FOLIO  0x00008000  0x0000c000      16384        8       12      4      2
> > FOLIO  0x0000c000  0x00010000      16384       12       16      4      2
> > FOLIO  0x00010000  0x00014000      16384       16       20      4      2
> > FOLIO  0x00014000  0x00018000      16384       20       24      4      2
> > FOLIO  0x00018000  0x0001c000      16384       24       28      4      2
> > FOLIO  0x0001c000  0x00020000      16384       28       32      4      2
> > FOLIO  0x00020000  0x00030000      65536       32       48     16      4
> > FOLIO  0x00030000  0x00040000      65536       48       64     16      4
> > FOLIO  0x00040000  0x00060000     131072       64       96     32      5
> > FOLIO  0x00060000  0x00080000     131072       96      128     32      5
> > FOLIO  0x00080000  0x000a0000     131072      128      160     32      5
> > FOLIO  0x000a0000  0x000c0000     131072      160      192     32      5
>
> Interesting, I would have thought we'd ramp up earlier.
>
> > ...
> >
> > After the change, the same operation results in the first 128K being
> > order-0, then we start ramping up to order-2, -4, and finally get
> > clamped at order-5:
> >
> > TYPE    STARTOFFS     ENDOFFS       SIZE  STARTPG    ENDPG   NRPG  ORDER
> > -----  ----------  ----------  ---------  -------  -------  -----  -----
> > FOLIO  0x00000000  0x00001000       4096        0        1      1      0
> > FOLIO  0x00001000  0x00002000       4096        1        2      1      0
> > FOLIO  0x00002000  0x00003000       4096        2        3      1      0
> > FOLIO  0x00003000  0x00004000       4096        3        4      1      0
> > FOLIO  0x00004000  0x00005000       4096        4        5      1      0
> > FOLIO  0x00005000  0x00006000       4096        5        6      1      0
> > FOLIO  0x00006000  0x00007000       4096        6        7      1      0
> > FOLIO  0x00007000  0x00008000       4096        7        8      1      0
> > FOLIO  0x00008000  0x00009000       4096        8        9      1      0
> > FOLIO  0x00009000  0x0000a000       4096        9       10      1      0
> > FOLIO  0x0000a000  0x0000b000       4096       10       11      1      0
> > FOLIO  0x0000b000  0x0000c000       4096       11       12      1      0
> > FOLIO  0x0000c000  0x0000d000       4096       12       13      1      0
> > FOLIO  0x0000d000  0x0000e000       4096       13       14      1      0
> > FOLIO  0x0000e000  0x0000f000       4096       14       15      1      0
> > FOLIO  0x0000f000  0x00010000       4096       15       16      1      0
> > FOLIO  0x00010000  0x00011000       4096       16       17      1      0
> > FOLIO  0x00011000  0x00012000       4096       17       18      1      0
> > FOLIO  0x00012000  0x00013000       4096       18       19      1      0
> > FOLIO  0x00013000  0x00014000       4096       19       20      1      0
> > FOLIO  0x00014000  0x00015000       4096       20       21      1      0
> > FOLIO  0x00015000  0x00016000       4096       21       22      1      0
> > FOLIO  0x00016000  0x00017000       4096       22       23      1      0
> > FOLIO  0x00017000  0x00018000       4096       23       24      1      0
> > FOLIO  0x00018000  0x00019000       4096       24       25      1      0
> > FOLIO  0x00019000  0x0001a000       4096       25       26      1      0
> > FOLIO  0x0001a000  0x0001b000       4096       26       27      1      0
> > FOLIO  0x0001b000  0x0001c000       4096       27       28      1      0
> > FOLIO  0x0001c000  0x0001d000       4096       28       29      1      0
> > FOLIO  0x0001d000  0x0001e000       4096       29       30      1      0
> > FOLIO  0x0001e000  0x0001f000       4096       30       31      1      0
> > FOLIO  0x0001f000  0x00020000       4096       31       32      1      0
> > FOLIO  0x00020000  0x00024000      16384       32       36      4      2
> > FOLIO  0x00024000  0x00028000      16384       36       40      4      2
> > FOLIO  0x00028000  0x0002c000      16384       40       44      4      2
> > FOLIO  0x0002c000  0x00030000      16384       44       48      4      2
> > FOLIO  0x00030000  0x00034000      16384       48       52      4      2
> > FOLIO  0x00034000  0x00038000      16384       52       56      4      2
> > FOLIO  0x00038000  0x0003c000      16384       56       60      4      2
> > FOLIO  0x0003c000  0x00040000      16384       60       64      4      2
> > FOLIO  0x00040000  0x00050000      65536       64       80     16      4
> > FOLIO  0x00050000  0x00060000      65536       80       96     16      4
> > FOLIO  0x00060000  0x00080000     131072       96      128     32      5
> > FOLIO  0x00080000  0x000a0000     131072      128      160     32      5
> > FOLIO  0x000a0000  0x000c0000     131072      160      192     32      5
> > FOLIO  0x000c0000  0x000e0000     131072      192      224     32      5
>
> Similar here, do you know why we don't ramp up earlier? Allocating that many
> order-0 + order-2 pages looks a bit suboptimal to me for a sequential read.

Note that this is reading through mmap using the mmap readahead code. If
you use standard read(2), the readahead window starts small as well and
ramps up along with the desired order so we don't allocate that many small
order pages in that case.

Honza
--
Jan Kara
SUSE Labs, CR