Date: Mon, 25 Mar 2024 19:00:22 +0000
From: Matthew Wilcox
To: "Pankaj Raghav (Samsung)"
Cc: linux-xfs@vger.kernel.org, linux-fsdevel@vger.kernel.org,
	gost.dev@samsung.com, chandan.babu@oracle.com, hare@suse.de,
	mcgrof@kernel.org, djwong@kernel.org, linux-mm@kvack.org,
	linux-kernel@vger.kernel.org, david@fromorbit.com,
	akpm@linux-foundation.org, Pankaj Raghav
Subject: Re: [PATCH v3 05/11] readahead: allocate folios with mapping_min_order in readahead
References: <20240313170253.2324812-1-kernel@pankajraghav.com>
	<20240313170253.2324812-6-kernel@pankajraghav.com>
In-Reply-To: <20240313170253.2324812-6-kernel@pankajraghav.com>

On Wed, Mar 13, 2024 at 06:02:47PM +0100, Pankaj Raghav (Samsung) wrote:
> From: Pankaj Raghav
>
> page_cache_ra_unbounded() was allocating single pages (0 order folios)
> if there was no folio found in an index. Allocate mapping_min_order folios
> as we need to guarantee the minimum order if it is set.
> When read_pages() is triggered and if a page is already present, check
> for truncation and move the ractl->_index by mapping_min_nrpages if that
> folio was truncated. This is done to ensure we keep the alignment
> requirement while adding a folio to the page cache.
>
> page_cache_ra_order() tries to allocate folio to the page cache with a
> higher order if the index aligns with that order. Modify it so that the
> order does not go below the min_order requirement of the page cache.

This paragraph doesn't make sense. We have an assertion that there's no
folio in the page cache with a lower order than the minimum, so this
seems to be describing a situation that can't happen. Does it need to
be rephrased (because you're actually describing something else) or is
it just stale?

> @@ -239,23 +258,35 @@ void page_cache_ra_unbounded(struct readahead_control *ractl,
>  			 * not worth getting one just for that.
>  			 */
>  			read_pages(ractl);
> -			ractl->_index += folio_nr_pages(folio);
> +
> +			/*
> +			 * Move the ractl->_index by at least min_pages
> +			 * if the folio got truncated to respect the
> +			 * alignment constraint in the page cache.
> +			 *
> +			 */
> +			if (mapping != folio->mapping)
> +				nr_pages = min_nrpages;
> +
> +			VM_BUG_ON_FOLIO(nr_pages < min_nrpages, folio);
> +			ractl->_index += nr_pages;
>  			i = ractl->_index + ractl->_nr_pages - index;
>  			continue;
>  		}
>
> -		folio = filemap_alloc_folio(gfp_mask, 0);
> +		folio = filemap_alloc_folio(gfp_mask,
> +					    mapping_min_folio_order(mapping));
>  		if (!folio)
>  			break;
>  		if (filemap_add_folio(mapping, folio, index + i,
>  					gfp_mask) < 0) {
>  			folio_put(folio);
>  			read_pages(ractl);
> -			ractl->_index++;
> +			ractl->_index += min_nrpages;

Hah, you changed this here. Please move into previous patch.

>  			i = ractl->_index + ractl->_nr_pages - index;
>  			continue;
>  		}
> -		if (i == nr_to_read - lookahead_size)
> +		if (i == mark)
>  			folio_set_readahead(folio);
>  		ractl->_workingset |= folio_test_workingset(folio);
>  		ractl->_nr_pages += folio_nr_pages(folio);
> @@ -489,12 +520,18 @@ void page_cache_ra_order(struct readahead_control *ractl,
>  {
>  	struct address_space *mapping = ractl->mapping;
>  	pgoff_t index = readahead_index(ractl);
> +	unsigned int min_order = mapping_min_folio_order(mapping);
>  	pgoff_t limit = (i_size_read(mapping->host) - 1) >> PAGE_SHIFT;
>  	pgoff_t mark = index + ra->size - ra->async_size;
>  	int err = 0;
>  	gfp_t gfp = readahead_gfp_mask(mapping);
> +	unsigned int min_ra_size = max(4, mapping_min_folio_nrpages(mapping));
>
> -	if (!mapping_large_folio_support(mapping) || ra->size < 4)
> +	/*
> +	 * Fallback when size < min_nrpages as each folio should be
> +	 * at least min_nrpages anyway.
> +	 */
> +	if (!mapping_large_folio_support(mapping) || ra->size < min_ra_size)
>  		goto fallback;
>
>  	limit = min(limit, index + ra->size - 1);
> @@ -505,9 +542,19 @@ void page_cache_ra_order(struct readahead_control *ractl,
>  			new_order = MAX_PAGECACHE_ORDER;
>  		while ((1 << new_order) > ra->size)
>  			new_order--;
> +		if (new_order < min_order)
> +			new_order = min_order;

I think these are the two lines you're describing in the paragraph that
doesn't make sense to me? And if so, I think they're useless?

> @@ -515,7 +562,7 @@ void page_cache_ra_order(struct readahead_control *ractl,
>  		if (index & ((1UL << order) - 1))
>  			order = __ffs(index);
>  		/* Don't allocate pages past EOF */
> -		while (index + (1UL << order) - 1 > limit)
> +		while (order > min_order && index + (1UL << order) - 1 > limit)
>  			order--;

This raises an interesting question that I don't know if we have a test
for. POSIX says that if we mmap, let's say, the first 16kB of a 10kB
file, then we can store into offset 0-12287, but stores to offsets
12288-16383 get a signal (I forget if it's SEGV or BUS).
Thus far, we've declined to even create folios in the page cache that
would let us create PTEs for offset 12288-16383, so I haven't paid too
much attention to this. Now we're going to have folios that extend into
that range, so we need to be sure that when we mmap(), we only create
PTEs that go as far as 12287.

Can you check that we have such an fstest, and that we still pass it
with your patches applied and a suitably large block size?
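
In case no existing fstest covers this, here is a minimal user-space sketch of the check. It is not taken from fstests, and store_signal() is a hypothetical helper name. It creates a file of 2.5 pages (10kB with 4kB pages, which is what the 12287/12288 offsets above assume), maps 4 pages of it MAP_SHARED, and reports which signal, if any, a one-byte store at a given offset raises. POSIX specifies SIGBUS for references to whole pages beyond the end of the object, so that is what the store at offset 12288 should produce. To exercise the patches, dir would have to sit on a filesystem whose block size is larger than the page size; that setup is assumed and not done here.

```c
#define _POSIX_C_SOURCE 200809L
#include <setjmp.h>
#include <signal.h>
#include <stdio.h>
#include <stdlib.h>
#include <sys/mman.h>
#include <unistd.h>

static sigjmp_buf fault_env;
static volatile sig_atomic_t fault_signo;

/* Record which signal arrived and unwind back into try_store(). */
static void fault_handler(int signo)
{
	fault_signo = signo;
	siglongjmp(fault_env, 1);
}

/* Store one byte at p. Returns 0 on success, else the signal raised. */
static int try_store(volatile char *p)
{
	if (sigsetjmp(fault_env, 1))
		return fault_signo;
	*p = 1;
	return 0;
}

/*
 * Hypothetical helper, not an existing fstest: create a 2.5-page file
 * in dir, map 4 pages of it MAP_SHARED and store one byte at offset.
 * Returns 0 if the store succeeded, the signal number if it faulted,
 * or -1 on a setup error.
 */
int store_signal(const char *dir, size_t offset)
{
	size_t ps = (size_t)sysconf(_SC_PAGESIZE);
	size_t file_len = 2 * ps + ps / 2, map_len = 4 * ps;
	struct sigaction sa, old_bus, old_segv;
	char path[4096];
	char *map;
	int fd, n, ret;

	if (offset >= map_len)
		return -1;
	n = snprintf(path, sizeof(path), "%s/mmap-eof-XXXXXX", dir);
	if (n < 0 || n >= (int)sizeof(path))
		return -1;
	fd = mkstemp(path);
	if (fd < 0)
		return -1;
	unlink(path);	/* the open fd keeps the file alive */
	if (ftruncate(fd, (off_t)file_len) < 0) {
		close(fd);
		return -1;
	}
	map = mmap(NULL, map_len, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
	close(fd);	/* the mapping holds its own reference */
	if (map == MAP_FAILED)
		return -1;

	/* Catch both signals so a wrong one is reported, not fatal. */
	sa.sa_handler = fault_handler;
	sa.sa_flags = 0;
	sigemptyset(&sa.sa_mask);
	sigaction(SIGBUS, &sa, &old_bus);
	sigaction(SIGSEGV, &sa, &old_segv);

	ret = try_store(map + offset);

	sigaction(SIGBUS, &old_bus, NULL);
	sigaction(SIGSEGV, &old_segv, NULL);
	munmap(map, map_len);
	return ret;
}
```

Called from a main() with dir pointing at the filesystem under test, the interesting pair is offset 3 * page_size - 1 (the last byte of the page containing EOF, where the store should succeed) and offset 3 * page_size (the first byte of the first whole page past EOF, where it should fault). If a min-order folio ends up mapped past the EOF page, the second store would succeed instead of raising SIGBUS.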