From: Dave Chinner <david@fromorbit.com>
Date: Wed, 14 Feb 2024 09:29:36 +1100
To: "Pankaj Raghav (Samsung)"
Cc: linux-xfs@vger.kernel.org, linux-fsdevel@vger.kernel.org, mcgrof@kernel.org, gost.dev@samsung.com, akpm@linux-foundation.org, kbusch@kernel.org, djwong@kernel.org, chandan.babu@oracle.com, p.raghav@samsung.com, linux-kernel@vger.kernel.org, hare@suse.de, willy@infradead.org, linux-mm@kvack.org
Subject: Re: [RFC v2 05/14] readahead: align index to mapping_min_order in ondemand_ra and force_ra
References: <20240213093713.1753368-1-kernel@pankajraghav.com> <20240213093713.1753368-6-kernel@pankajraghav.com>
In-Reply-To: <20240213093713.1753368-6-kernel@pankajraghav.com>

On Tue, Feb 13, 2024 at 10:37:04AM +0100, Pankaj Raghav (Samsung) wrote:
> From: Luis Chamberlain
>
> Align the ra->start and ra->size to mapping_min_order in
> ondemand_readahead(), and align the index to mapping_min_order in
> force_page_cache_ra(). This will ensure that the folios allocated for
> readahead that are added to the page cache are aligned to
> mapping_min_order.
>
> Signed-off-by: Luis Chamberlain
> Signed-off-by: Pankaj Raghav
> ---
>  mm/readahead.c | 48 ++++++++++++++++++++++++++++++++++++++++--------
>  1 file changed, 40 insertions(+), 8 deletions(-)
>
> diff --git a/mm/readahead.c b/mm/readahead.c
> index 4fa7d0e65706..5e1ec7705c78 100644
> --- a/mm/readahead.c
> +++ b/mm/readahead.c
> @@ -315,6 +315,7 @@ void force_page_cache_ra(struct readahead_control *ractl,
>  	struct file_ra_state *ra = ractl->ra;
>  	struct backing_dev_info *bdi = inode_to_bdi(mapping->host);
>  	unsigned long max_pages, index;
> +	unsigned int min_nrpages = mapping_min_folio_nrpages(mapping);
>
>  	if (unlikely(!mapping->a_ops->read_folio && !mapping->a_ops->readahead))
>  		return;
> @@ -324,6 +325,13 @@ void force_page_cache_ra(struct readahead_control *ractl,
>  	 * be up to the optimal hardware IO size
>  	 */
>  	index = readahead_index(ractl);
> +	if (!IS_ALIGNED(index, min_nrpages)) {
> +		unsigned long old_index = index;
> +
> +		index = round_down(index, min_nrpages);
> +		nr_to_read += (old_index - index);
> +	}

	new_index = mapping_align_start_index(mapping, index);
	if (new_index != index) {
		nr_to_read += index - new_index;
		index = new_index;
	}

> +
>  	max_pages = max_t(unsigned long, bdi->io_pages, ra->ra_pages);
>  	nr_to_read = min_t(unsigned long, nr_to_read, max_pages);

This needs to have a size of at least the minimum folio order size so
readahead can fill entire folios, not get neutered to the maximum IO
size the underlying storage supports.
>  	while (nr_to_read) {
> @@ -332,6 +340,7 @@ void force_page_cache_ra(struct readahead_control *ractl,
>  		if (this_chunk > nr_to_read)
>  			this_chunk = nr_to_read;
>  		ractl->_index = index;
> +		VM_BUG_ON(!IS_ALIGNED(index, min_nrpages));
>  		do_page_cache_ra(ractl, this_chunk, 0);
>
>  		index += this_chunk;
> @@ -344,11 +353,20 @@ void force_page_cache_ra(struct readahead_control *ractl,
>   *  for small size, x 4 for medium, and x 2 for large
>   * for 128k (32 page) max ra
>   * 1-2 page = 16k, 3-4 page 32k, 5-8 page = 64k, > 8 page = 128k initial
> + *
> + * For higher order address space requirements we ensure no initial reads
> + * are ever less than the min number of pages required.
> + *
> + * We *always* cap the max io size allowed by the device.
>   */
> -static unsigned long get_init_ra_size(unsigned long size, unsigned long max)
> +static unsigned long get_init_ra_size(unsigned long size,
> +				      unsigned int min_nrpages,
> +				      unsigned long max)
>  {
>  	unsigned long newsize = roundup_pow_of_two(size);
>
> +	newsize = max_t(unsigned long, newsize, min_nrpages);

This really doesn't need to care about min_nrpages. That rounding can
be done in the caller when the new size is returned.

>  	if (newsize <= max / 32)
>  		newsize = newsize * 4;
>  	else if (newsize <= max / 4)
> @@ -356,6 +374,8 @@ static unsigned long get_init_ra_size(unsigned long size, unsigned long max)
>  	else
>  		newsize = max;
>
> +	VM_BUG_ON(newsize & (min_nrpages - 1));
> +
>  	return newsize;
>  }
>
> @@ -364,14 +384,16 @@ static unsigned long get_init_ra_size(unsigned long size, unsigned long max)
>   * return it as the new window size.
>   */
>  static unsigned long get_next_ra_size(struct file_ra_state *ra,
> +				      unsigned int min_nrpages,
>  				      unsigned long max)
>  {
> -	unsigned long cur = ra->size;
> +	unsigned long cur = max(ra->size, min_nrpages);
>
>  	if (cur < max / 16)
>  		return 4 * cur;
>  	if (cur <= max / 2)
>  		return 2 * cur;
> +
>  	return max;

Ditto.
>  }
>
> @@ -561,7 +583,11 @@ static void ondemand_readahead(struct readahead_control *ractl,
>  	unsigned long add_pages;
>  	pgoff_t index = readahead_index(ractl);
>  	pgoff_t expected, prev_index;
> -	unsigned int order = folio ? folio_order(folio) : 0;
> +	unsigned int min_order = mapping_min_folio_order(ractl->mapping);
> +	unsigned int min_nrpages = mapping_min_folio_nrpages(ractl->mapping);
> +	unsigned int order = folio ? folio_order(folio) : min_order;

Huh? If we have a folio, then the order is whatever that folio is,
otherwise we use min_order. What if the folio is larger than min_order?
Doesn't that mean that this:

> @@ -583,8 +609,8 @@ static void ondemand_readahead(struct readahead_control *ractl,
>  	expected = round_down(ra->start + ra->size - ra->async_size,
>  			1UL << order);
>  	if (index == expected || index == (ra->start + ra->size)) {
> -		ra->start += ra->size;
> -		ra->size = get_next_ra_size(ra, max_pages);
> +		ra->start += round_down(ra->size, min_nrpages);
> +		ra->size = get_next_ra_size(ra, min_nrpages, max_pages);

may set up the incorrect readahead range because the folio order is
larger than min_nrpages?
>  		ra->async_size = ra->size;
>  		goto readit;
>  	}
> @@ -603,13 +629,18 @@ static void ondemand_readahead(struct readahead_control *ractl,
>  				max_pages);
>  	rcu_read_unlock();
>
> +	start = round_down(start, min_nrpages);

	start = mapping_align_start_index(mapping, start);

> +
> +	VM_BUG_ON(folio->index & (folio_nr_pages(folio) - 1));
> +
>  	if (!start || start - index > max_pages)
>  		return;
>
>  	ra->start = start;
>  	ra->size = start - index;	/* old async_size */
> +
>  	ra->size += req_size;
> -	ra->size = get_next_ra_size(ra, max_pages);
> +	ra->size = get_next_ra_size(ra, min_nrpages, max_pages);

	ra->size = max(min_nrpages, get_next_ra_size(ra, max_pages));

>  	ra->async_size = ra->size;
>  	goto readit;
>  }
> @@ -646,7 +677,7 @@ static void ondemand_readahead(struct readahead_control *ractl,
>
>  initial_readahead:
>  	ra->start = index;
> -	ra->size = get_init_ra_size(req_size, max_pages);
> +	ra->size = get_init_ra_size(req_size, min_nrpages, max_pages);

	ra->size = max(min_nrpages, get_init_ra_size(req_size, max_pages));

-Dave.
-- 
Dave Chinner
david@fromorbit.com