Date: Wed, 14 Feb 2024 16:10:51 +0100
From: "Pankaj Raghav (Samsung)"
To: Dave Chinner
Cc: linux-xfs@vger.kernel.org, linux-fsdevel@vger.kernel.org,
	mcgrof@kernel.org, gost.dev@samsung.com, akpm@linux-foundation.org,
	kbusch@kernel.org, djwong@kernel.org, chandan.babu@oracle.com,
	p.raghav@samsung.com, linux-kernel@vger.kernel.org, hare@suse.de,
	willy@infradead.org, linux-mm@kvack.org
Subject: Re: [RFC v2 05/14] readahead: align index to mapping_min_order in ondemand_ra and force_ra
References: <20240213093713.1753368-1-kernel@pankajraghav.com>
	<20240213093713.1753368-6-kernel@pankajraghav.com>

> > @@ -324,6 +325,13 @@ void force_page_cache_ra(struct readahead_control *ractl,
> >  	 * be up to the optimal hardware IO size
> >  	 */
> >  	index = readahead_index(ractl);
> > +	if (!IS_ALIGNED(index, min_nrpages)) {
> > +		unsigned long old_index = index;
> > +
> > +		index = round_down(index, min_nrpages);
> > +		nr_to_read += (old_index - index);
> > +	}
>
> 	new_index = mapping_align_start_index(mapping, index);
> 	if (new_index != index) {
> 		nr_to_read += index - new_index;
> 		index = new_index

Looks good.

> 	}
>
> > +
> >  	max_pages = max_t(unsigned long, bdi->io_pages, ra->ra_pages);
> >  	nr_to_read = min_t(unsigned long, nr_to_read, max_pages);
>
> This needs to have a size of at least the minimum folio order size
> so readahead can fill entire folios, not get neutered to the maximum
> IO size the underlying storage supports.

So something like:

> >  	max_pages = max_t(unsigned long, bdi->io_pages, ra->ra_pages);
> >  	nr_to_read = min_t(unsigned long, nr_to_read, max_pages);
	nr_to_read = max(nr_to_read, min_order);

>
> > + * For higher order address space requirements we ensure no initial reads
> > + * are ever less than the min number of pages required.
> > + *
> > + * We *always* cap the max io size allowed by the device.
> >   */
> > -static unsigned long get_init_ra_size(unsigned long size, unsigned long max)
> > +static unsigned long get_init_ra_size(unsigned long size,
> > +				      unsigned int min_nrpages,
> > +				      unsigned long max)
> >  {
> >  	unsigned long newsize = roundup_pow_of_two(size);
> >
> > +	newsize = max_t(unsigned long, newsize, min_nrpages);
>
> This really doesn't need to care about min_nrpages. That rounding
> can be done in the caller when the new size is returned.

Sounds good.
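
Going back to the force_page_cache_ra() hunk above, here is a rough
userspace sketch of what I have in mind, not kernel code. The names
align_down(), struct ra_window and clamp_forced_ra() are made up for
illustration. It assumes min_nrpages is a power of two, and it assumes
the floor is expressed in pages (min_nrpages) so that it has the same
unit as nr_to_read:

```c
#include <assert.h>

/* Hypothetical result type: where the forced readahead starts and its length. */
struct ra_window {
	unsigned long index;       /* start index, in pages */
	unsigned long nr_to_read;  /* length, in pages */
};

/* Userspace stand-in for round_down(); assumes align is a power of two. */
static unsigned long align_down(unsigned long x, unsigned long align)
{
	assert(align != 0 && (align & (align - 1)) == 0);
	return x & ~(align - 1);
}

/*
 * Align the start index down to min_nrpages, grow the request by the
 * pages we stepped back over, cap it at max_pages, and then make sure
 * it never drops below one minimum-order folio.
 */
struct ra_window clamp_forced_ra(unsigned long index, unsigned long nr_to_read,
				 unsigned long min_nrpages,
				 unsigned long max_pages)
{
	struct ra_window w;
	unsigned long aligned = align_down(index, min_nrpages);

	/* Still cover every page that was originally requested. */
	nr_to_read += index - aligned;

	/* Cap at the usual limit ... */
	if (nr_to_read > max_pages)
		nr_to_read = max_pages;
	/* ... but never read less than a whole minimum-order folio. */
	if (nr_to_read < min_nrpages)
		nr_to_read = min_nrpages;

	w.index = aligned;
	w.nr_to_read = nr_to_read;
	return w;
}
```

With min_nrpages = 16 and max_pages = 8, a request for 100 pages at
index 18 ends up starting at index 16 with 16 pages, so the floor wins
over the cap and a full folio can still be filled.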
>
> >  	if (newsize <= max / 32)
> >  		newsize = newsize * 4;
> >
>
> >
> > @@ -561,7 +583,11 @@ static void ondemand_readahead(struct readahead_control *ractl,
> >  	unsigned long add_pages;
> >  	pgoff_t index = readahead_index(ractl);
> >  	pgoff_t expected, prev_index;
> > -	unsigned int order = folio ? folio_order(folio) : 0;
> > +	unsigned int min_order = mapping_min_folio_order(ractl->mapping);
> > +	unsigned int min_nrpages = mapping_min_folio_nrpages(ractl->mapping);
> > +	unsigned int order = folio ? folio_order(folio) : min_order;
>
> Huh? If we have a folio, then the order is whatever that folio is,
> otherwise we use min_order. What if the folio is larger than
> min_order? Doesn't that mean that this:
>
> > @@ -583,8 +609,8 @@ static void ondemand_readahead(struct readahead_control *ractl,
> >  	expected = round_down(ra->start + ra->size - ra->async_size,
> >  			1UL << order);
> >  	if (index == expected || index == (ra->start + ra->size)) {
> > -		ra->start += ra->size;
> > -		ra->size = get_next_ra_size(ra, max_pages);
> > +		ra->start += round_down(ra->size, min_nrpages);
> > +		ra->size = get_next_ra_size(ra, min_nrpages, max_pages);
>
> may set up the incorrect readahead range because the folio order is
> larger than min_nrpages?

Hmm... So I think we should just increment ra->start by ra->size and
make sure to round up the new size we get from get_next_ra_size() to a
multiple of min_nrpages.
Then we will not disturb the readahead range and will always increase
it in multiples of min_nrpages:

		ra->start += ra->size;
		ra->size = round_up(get_next_ra_size(ra, max_pages), min_nrpages);

>
> >  		ra->async_size = ra->size;
> >  		goto readit;
> >  	}
> > @@ -603,13 +629,18 @@ static void ondemand_readahead(struct readahead_control *ractl,
> >  				max_pages);
> >  		rcu_read_unlock();
> >
> > +		start = round_down(start, min_nrpages);
>
> 	start = mapping_align_start_index(mapping, start);
> > +
> > +		VM_BUG_ON(folio->index & (folio_nr_pages(folio) - 1));
> > +
> >  		if (!start || start - index > max_pages)
> >  			return;
> >
> >  		ra->start = start;
> >  		ra->size = start - index;	/* old async_size */
> > +
> >  		ra->size += req_size;
> > -		ra->size = get_next_ra_size(ra, max_pages);
> > +		ra->size = get_next_ra_size(ra, min_nrpages, max_pages);
>
> 	ra->size = max(min_nrpages, get_next_ra_size(ra, max_pages));

If we use round_up() on the size here instead of max(), we can always
keep ra->start aligned to min_nrpages, because every later
ra->start += ra->size then advances by a multiple of min_nrpages. See
my reasoning in the previous comment.

-- 
Pankaj
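
P.S. To illustrate the round_up() vs. max() point, here is a small
userspace sketch, not kernel code. round_up_pow2() and
start_after_hits() are hypothetical names, raw_next_size stands in for
whatever get_next_ra_size() would return, the max_pages cap is ignored,
and min_nrpages is assumed to be a power of two:

```c
#include <assert.h>

/* Userspace stand-in for round_up(); assumes align is a power of two. */
static unsigned long round_up_pow2(unsigned long x, unsigned long align)
{
	assert(align != 0 && (align & (align - 1)) == 0);
	return (x + align - 1) & ~(align - 1);
}

/*
 * Replay `hits` sequential readahead hits on a window that starts at
 * `start` with `size` pages. On each hit the window advances by its
 * current size and then takes a new size derived from raw_next_size,
 * either clamped with max() or rounded up to min_nrpages.
 * Returns the final start index.
 */
unsigned long start_after_hits(unsigned long start, unsigned long size,
			       unsigned long raw_next_size,
			       unsigned long min_nrpages,
			       unsigned int hits, int use_round_up)
{
	unsigned int i;

	for (i = 0; i < hits; i++) {
		start += size;
		if (use_round_up)
			size = round_up_pow2(raw_next_size, min_nrpages);
		else
			size = raw_next_size > min_nrpages ?
			       raw_next_size : min_nrpages;
	}
	return start;
}
```

Starting from an aligned window (start 0, size 16, min_nrpages 16) and
a made-up next size of 40 pages, two hits with max() leave the start at
56, which is no longer a multiple of 16, while round_up() leaves it at
64.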