Date: Wed, 9 Aug 2023 18:44:46 +0200
From: Jan Kara
To: Haibo Li
Cc: linux-kernel@vger.kernel.org, "Matthew Wilcox (Oracle)", Andrew Morton,
	Matthias Brugger, AngeloGioacchino Del Regno,
	linux-fsdevel@vger.kernel.org, linux-mm@kvack.org,
	linux-arm-kernel@lists.infradead.org,
	linux-mediatek@lists.infradead.org, xiaoming.yu@mediatek.com
Subject: Re: [PATCH] mm/filemap.c:fix update prev_pos after one read request done
Message-ID: <20230809164446.uwxryhrfbjka2lio@quack3>
References: <20230628110220.120134-1-haibo.li@mediatek.com>
In-Reply-To: <20230628110220.120134-1-haibo.li@mediatek.com>

On Wed 28-06-23 19:02:20, Haibo Li wrote:
> ra->prev_pos tracks the last visited byte in the previous read request.
> It is used to check whether it is sequential read in
> ondemand_readahead and thus affects the readahead window.
>
> From commit 06c0444290ce ("mm/filemap.c: generic_file_buffered_read()
> now uses find_get_pages_contig"), update logic of prev_pos is changed.
> It updates prev_pos after each return from filemap_get_pages.
> But the read request from the user may not be fully completed
> at this point.
> The updated prev_pos impacts the subsequent readahead window.
>
> The real problem is a performance drop of fsck_msdos between linux-5.4
> and linux-5.15 (also linux-6.4).
> Compared to linux-5.4, it spends about 110% of the time and reads 140%
> of the pages.
> The read pattern of fsck_msdos is not fully sequential.
>
> A simplified read pattern of fsck_msdos looks like this:
> 1. read at page offset 0xa, size 0x1000
> 2. read at other page offset like 0x20, size 0x1000
> 3. read at page offset 0xa, size 0x4000
> 4. read at page offset 0xe, size 0x1000
>
> Here is the read status on linux-6.4:
> 1. after read at page offset 0xa, size 0x1000
>    ->page ofs 0xa goes into pagecache
> 2. after read at page offset 0x20, size 0x1000
>    ->page ofs 0x20 goes into pagecache
> 3. read at page offset 0xa, size 0x4000
>    ->filemap_get_pages reads ofs 0xa from pagecache and returns
>    ->prev_pos is updated to 0xb and goto next loop
>    ->filemap_get_pages tends to read ofs 0xb, size 0x3000
>    ->initial_readahead case in ondemand_readahead since prev_pos is
>      the same as request ofs.
>    ->read 8 pages while async size is 5 pages
>      (PageReadahead flag at page 0xe)
> 4. read at page offset 0xe, size 0x1000
>    ->hit page 0xe with PageReadahead flag set, double the ra_size.
>      read 16 pages while async size is 16 pages
>    Now it reads 24 pages while actually using 5 pages
>
> on linux-5.4:
> 1. the same as 6.4
> 2. the same as 6.4
> 3. read at page offset 0xa, size 0x4000
>    ->read ofs 0xa from pagecache
>    ->read ofs 0xb, size 0x3000 using page_cache_sync_readahead
>      read 3 pages
>    ->prev_pos is updated to 0xd before generic_file_buffered_read
>      returns
> 4. read at page offset 0xe, size 0x1000
>    ->initial_readahead case in ondemand_readahead since
>      request ofs-prev_pos==1
>    ->read 4 pages while async size is 3 pages
>
> Now it reads 7 pages while actually using 5 pages.
>
> In the above demo, the initial_readahead case is triggered by the offset
> of the user request on linux-5.4,
> while it may be triggered by the update logic of prev_pos on linux-6.4.
>
> To fix the performance drop, update prev_pos after finishing one read
> request.
>
> Signed-off-by: Haibo Li

Sorry for the delayed reply. This seems to have fallen through the cracks.
So if I understand your analysis right, you are complaining that a random
read larger than 1 page gets misdetected as a sequential read and so
"larger than necessary" readahead happens. I tend to agree with your
opinion and your solution looks good to me. Feel free to add:

Reviewed-by: Jan Kara

Willy, any opinion? Andrew, can you pick up the patch if Willy doesn't
object?

								Honza

> ---
>  mm/filemap.c | 9 +++++----
>  1 file changed, 5 insertions(+), 4 deletions(-)
>
> diff --git a/mm/filemap.c b/mm/filemap.c
> index 83dda76d1fc3..16b2054eee71 100644
> --- a/mm/filemap.c
> +++ b/mm/filemap.c
> @@ -2670,6 +2670,7 @@ ssize_t filemap_read(struct kiocb *iocb, struct iov_iter *iter,
>  	int i, error = 0;
>  	bool writably_mapped;
>  	loff_t isize, end_offset;
> +	loff_t last_pos = ra->prev_pos;
>  
>  	if (unlikely(iocb->ki_pos >= inode->i_sb->s_maxbytes))
>  		return 0;
> @@ -2721,8 +2722,8 @@ ssize_t filemap_read(struct kiocb *iocb, struct iov_iter *iter,
>  		 * When a read accesses the same folio several times, only
>  		 * mark it as accessed the first time.
>  		 */
> -		if (!pos_same_folio(iocb->ki_pos, ra->prev_pos - 1,
> -							fbatch.folios[0]))
> +		if (!pos_same_folio(iocb->ki_pos, last_pos - 1,
> +				    fbatch.folios[0]))
>  			folio_mark_accessed(fbatch.folios[0]);
>  
>  		for (i = 0; i < folio_batch_count(&fbatch); i++) {
> @@ -2749,7 +2750,7 @@ ssize_t filemap_read(struct kiocb *iocb, struct iov_iter *iter,
>  
>  			already_read += copied;
>  			iocb->ki_pos += copied;
> -			ra->prev_pos = iocb->ki_pos;
> +			last_pos = iocb->ki_pos;
>  
>  			if (copied < bytes) {
>  				error = -EFAULT;
> @@ -2763,7 +2764,7 @@ ssize_t filemap_read(struct kiocb *iocb, struct iov_iter *iter,
>  	} while (iov_iter_count(iter) && iocb->ki_pos < isize && !error);
>  
>  	file_accessed(filp);
> -
> +	ra->prev_pos = last_pos;
>  	return already_read ? already_read : error;
>  }
>  EXPORT_SYMBOL_GPL(filemap_read);
> -- 
> 2.25.1
>
-- 
Jan Kara
SUSE Labs, CR