From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by smtp.lore.kernel.org (Postfix) with ESMTP id 88CF0C32771 for ; Wed, 17 Aug 2022 15:25:49 +0000 (UTC) Received: by kanga.kvack.org (Postfix) id 9151A8D0001; Wed, 17 Aug 2022 11:25:48 -0400 (EDT) Received: by kanga.kvack.org (Postfix, from userid 40) id 8C3B56B0074; Wed, 17 Aug 2022 11:25:48 -0400 (EDT) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id 7B25E8D0001; Wed, 17 Aug 2022 11:25:48 -0400 (EDT) X-Delivered-To: linux-mm@kvack.org Received: from relay.hostedemail.com (smtprelay0014.hostedemail.com [216.40.44.14]) by kanga.kvack.org (Postfix) with ESMTP id 6CDEE6B0073 for ; Wed, 17 Aug 2022 11:25:48 -0400 (EDT) Received: from smtpin15.hostedemail.com (a10.router.float.18 [10.200.18.1]) by unirelay04.hostedemail.com (Postfix) with ESMTP id 13D8F1A15A0 for ; Wed, 17 Aug 2022 15:25:48 +0000 (UTC) X-FDA: 79809459576.15.BD8ECE7 Received: from casper.infradead.org (casper.infradead.org [90.155.50.34]) by imf10.hostedemail.com (Postfix) with ESMTP id F29BDC01BF for ; Wed, 17 Aug 2022 15:25:46 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; q=dns/txt; c=relaxed/relaxed; d=infradead.org; s=casper.20170209; h=In-Reply-To:Content-Type:MIME-Version: References:Message-ID:Subject:Cc:To:From:Date:Sender:Reply-To: Content-Transfer-Encoding:Content-ID:Content-Description; bh=edlU3BExzsRsoCeVpTIwLsOGPOEwh0l0t+arlBBi9+w=; b=dx/mfpfp6rJWVEcc9hWrY+db3v Ze83S+bkWfMqxScbAFFDCkzWlHt9miBuQ3Y2ctLa49acGvknqmXae+W504bnEbxVGsLPi+WuoFafy Nm6SPKlt8dARBLYA6ZIVVqrGUGYCAuQru/1IzJ/sAT8ADDyvCBOw2VCagHZUGMsZDuKjtCO6xZJpR zgGyLPlknGFxmh02IDic++MFQsduHSplp/F9thjbq8xWmoBN1wpvgTE4r2d3bdFoeEGFWDDn8WW5v 70ymaZF6ETGfIm51PO35dwSWz3RGCUqoMLLU7bQ5Qzsr5Gu99iJny33mzZR5sfbjoFHmxECAFq79B iHIfkK/Q==; Received: from willy by casper.infradead.org with local (Exim 4.94.2 #2 (Red Hat Linux)) id 1oOKvK-008M4m-0T; Wed, 17 Aug 2022 15:25:38 +0000 Date: Wed, 17 Aug 2022 16:25:37 +0100 From: Matthew Wilcox To: Guixin Liu Cc: akpm@linux-foundation.org, linux-fsdevel@vger.kernel.org, linux-mm@kvack.org Subject: Re: [RFC PATCH] mm/filemap.c: fix the timing of asignment of prev_pos Message-ID: References: <1660744317-8183-1-git-send-email-kanie@linux.alibaba.com> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <1660744317-8183-1-git-send-email-kanie@linux.alibaba.com> ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=hostedemail.com; s=arc-20220608; t=1660749947; h=from:from:sender:reply-to:subject:subject:date:date: message-id:message-id:to:to:cc:cc:mime-version:mime-version: content-type:content-type:content-transfer-encoding: in-reply-to:in-reply-to:references:references:dkim-signature; bh=edlU3BExzsRsoCeVpTIwLsOGPOEwh0l0t+arlBBi9+w=; b=yr8IvvKm4+rHg1Ko6+yS8y2sHB95z6IZmesi7mzr4nZqy+a0/0kNotWMQ3NvN522YGN0gQ 1PUC3vI+9kKUH/Xegm0L5WlrwN4V9ppqRUpeCrSZQ06+9u12DdvffqVy6Kv+TKIVrLTFkR ahepqijSCMXs4/MA08IxxcW7KcAHmaA= ARC-Authentication-Results: i=1; imf10.hostedemail.com; dkim=pass header.d=infradead.org header.s=casper.20170209 header.b="dx/mfpfp"; dmarc=none; spf=none (imf10.hostedemail.com: domain of willy@infradead.org has no SPF policy when checking 90.155.50.34) smtp.mailfrom=willy@infradead.org ARC-Seal: i=1; s=arc-20220608; d=hostedemail.com; t=1660749947; a=rsa-sha256; cv=none; b=nOPDf+Wq77fktpOY2uqbsEG7odnhkYzozp74TB4uXdaxK384xej/MkU95tI+zH3kqH3/bb 3G6gYlfrDy2Y+rl7oxXza+5qiw3qBVxNVe7aPLAZjWeMdWDvcsQ5uvSgjnMIWUrc4Sp9Dl yUfWJy32ocUEIjYeUZQCez9/oRogJwU= Authentication-Results: imf10.hostedemail.com; dkim=pass header.d=infradead.org header.s=casper.20170209 header.b="dx/mfpfp"; dmarc=none; spf=none (imf10.hostedemail.com: domain of willy@infradead.org has no SPF policy when checking 90.155.50.34) smtp.mailfrom=willy@infradead.org X-Rspam-User: X-Stat-Signature: p98ospfy7yr6gq74ouy6gtgterwmoxye X-Rspamd-Queue-Id: F29BDC01BF X-Rspamd-Server: rspam11 X-HE-Tag: 1660749946-832183 X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: On Wed, Aug 17, 2022 at 09:51:57PM +0800, Guixin Liu wrote: > The prev_pos should be assigned before the iocb->ki_pos is incremented, > so that the prev_pos is the exact location of the last visit. > > Fixes: 06c0444290cec ("mm/filemap.c: generic_file_buffered_read() now > uses find_get_pages_contig") > Signed-off-by: Guixin Liu > > --- > Hi guys, > When I`m running repetitive 4k read io which has same offset, > I find that access to folio_mark_accessed is inevitable in the > read process, the reason is that the prev_pos is assigned after the > iocb->ki_pos is incremented, so that the prev_pos is always not equal > to the position currently visited. > Is this a bug that needs fixing? I think you've misunderstood the purpose of 'prev_pos'. But this has been the source of bugs, so let's go through it in detail. In general, we want to mark a folio as accessed each time we read from it. So if we do this: read(fd, buf, 1024 * 1024); we want to mark each folio as having been accessed. But if we're doing lots of short reads, we don't want to mark a folio as being accessed multiple times (if you dive into the implementation, you'll see the first time, the 'referenced' flag is set and the second time, the folio is moved to the active list, so it matters how often we call mark_accessed). IOW: for (i = 0; i < 1024 * 1024; i++) read(fd, buf, 1); should do the same amount of accessed/referenced/activation as the single read above. So when we store ki_pos in prev_pos, we don't want to know "Where did the previous read start?" We want to know "Where did the previous read end". That's why when we test it, we check whether prev_pos - 1 is in the same folio as the offset we're looking at: if (!pos_same_folio(iocb->ki_pos, ra->prev_pos - 1, fbatch.folios[0])) folio_mark_accessed(fbatch.folios[0]); I'm not super-proud of this code, and accept that it's confusing. But I don't think the patch below is right. If you could share your actual test and show what's going wrong, I'm interested. I think what you're saying is that this loop: for (i = 0; i < 1000; i++) pread(fd, buf, 4096, 1024 * 1024); results in the folio at offset 1MB being marked as accessed more than once. If so, then I think that's the algorithm behaving as designed. Whether that's desirable is a different question; when I touched this code last, I was trying to restore the previous behaviour which was inadvertently broken. I'm not taking a position on what the right behaviour is for such code. > mm/filemap.c | 2 +- > 1 file changed, 1 insertion(+), 1 deletion(-) > > diff --git a/mm/filemap.c b/mm/filemap.c > index 660490c..68fd987 100644 > --- a/mm/filemap.c > +++ b/mm/filemap.c > @@ -2703,8 +2703,8 @@ ssize_t filemap_read(struct kiocb *iocb, struct iov_iter *iter, > copied = copy_folio_to_iter(folio, offset, bytes, iter); > > already_read += copied; > - iocb->ki_pos += copied; > ra->prev_pos = iocb->ki_pos; > + iocb->ki_pos += copied; > > if (copied < bytes) { > error = -EFAULT; > -- > 1.8.3.1 >