From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by smtp.lore.kernel.org (Postfix) with ESMTP id AE0E9C25B08 for ; Thu, 18 Aug 2022 03:13:32 +0000 (UTC) Received: by kanga.kvack.org (Postfix) id 00DD08D0002; Wed, 17 Aug 2022 23:13:32 -0400 (EDT) Received: by kanga.kvack.org (Postfix, from userid 40) id EFFD88D0001; Wed, 17 Aug 2022 23:13:31 -0400 (EDT) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id DC7BD8D0002; Wed, 17 Aug 2022 23:13:31 -0400 (EDT) X-Delivered-To: linux-mm@kvack.org Received: from relay.hostedemail.com (smtprelay0012.hostedemail.com [216.40.44.12]) by kanga.kvack.org (Postfix) with ESMTP id CB2F78D0001 for ; Wed, 17 Aug 2022 23:13:31 -0400 (EDT) Received: from smtpin05.hostedemail.com (a10.router.float.18 [10.200.18.1]) by unirelay08.hostedemail.com (Postfix) with ESMTP id 95E91140879 for ; Thu, 18 Aug 2022 03:13:31 +0000 (UTC) X-FDA: 79811243022.05.9B567F2 Received: from out30-45.freemail.mail.aliyun.com (out30-45.freemail.mail.aliyun.com [115.124.30.45]) by imf05.hostedemail.com (Postfix) with ESMTP id B5D8C100061 for ; Thu, 18 Aug 2022 03:13:29 +0000 (UTC) X-Alimail-AntiSpam:AC=PASS;BC=-1|-1;BR=01201311R171e4;CH=green;DM=||false|;DS=||;FP=0|-1|-1|-1|0|-1|-1|-1;HT=ay29a033018045170;MF=kanie@linux.alibaba.com;NM=1;PH=DS;RN=4;SR=0;TI=SMTPD_---0VMZ06Bv_1660792404; Received: from 30.227.140.19(mailfrom:kanie@linux.alibaba.com fp:SMTPD_---0VMZ06Bv_1660792404) by smtp.aliyun-inc.com; Thu, 18 Aug 2022 11:13:25 +0800 Message-ID: Date: Thu, 18 Aug 2022 11:13:24 +0800 MIME-Version: 1.0 User-Agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10.15; rv:91.0) Gecko/20100101 Thunderbird/91.8.1 Subject: Re: [RFC PATCH] mm/filemap.c: fix the timing of asignment of prev_pos To: Matthew Wilcox Cc: akpm@linux-foundation.org, linux-fsdevel@vger.kernel.org, linux-mm@kvack.org References: <1660744317-8183-1-git-send-email-kanie@linux.alibaba.com> From: Guixin Liu In-Reply-To: Content-Type: text/plain; charset=UTF-8; format=flowed Content-Transfer-Encoding: 8bit ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=hostedemail.com; s=arc-20220608; t=1660792410; h=from:from:sender:reply-to:subject:subject:date:date: message-id:message-id:to:to:cc:cc:mime-version:mime-version: content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=QWWtjMczp5sYosohIfIkeDQsOwLMNlKp1+FeF2GU5hI=; b=zuIbpfkXke4wn6yKT28k4ymEq3vEPAYmVgOxDua/FrRTxLXYHZxaSA7mOonsXwtSQ6Aj1W vKpZAj5ils5/wxfEH6XO4c1KRQccWT7pa3Fq36dGWdFbkw49a3xAq+vYLqy8KjoebC4+ks KETGTTn37jA5ZVCC7Ro9WU/c1VS3MS0= ARC-Authentication-Results: i=1; imf05.hostedemail.com; dkim=none; dmarc=pass (policy=none) header.from=alibaba.com; spf=pass (imf05.hostedemail.com: domain of kanie@linux.alibaba.com designates 115.124.30.45 as permitted sender) smtp.mailfrom=kanie@linux.alibaba.com ARC-Seal: i=1; s=arc-20220608; d=hostedemail.com; t=1660792410; a=rsa-sha256; cv=none; b=j1wnMcPv7D1vpNqvSTQwyWuol86oC3bUjAhV/h21TGxFKv+gx78kJ+WWlanA347rPDUrpm /pPhYOmSfLayVLgoENC5iZ7f0cMr7/fy+4vioRG5Ru6VIFYybKFamrexJJWKHZCKlemt5K wgGLE1571SgUVyrJ9p2F5WCbUJwTh1E= Authentication-Results: imf05.hostedemail.com; dkim=none; dmarc=pass (policy=none) header.from=alibaba.com; spf=pass (imf05.hostedemail.com: domain of kanie@linux.alibaba.com designates 115.124.30.45 as permitted sender) smtp.mailfrom=kanie@linux.alibaba.com X-Stat-Signature: pnop36io5pz8cy1zy3nn5k3z7sh77wmf X-Rspamd-Queue-Id: B5D8C100061 X-Rspam-User: X-Rspamd-Server: rspam01 X-HE-Tag: 1660792409-292293 X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: 在 2022/8/17 23:25, Matthew Wilcox 写道: > On Wed, Aug 17, 2022 at 09:51:57PM +0800, Guixin Liu wrote: >> The prev_pos should be assigned before the iocb->ki_pos is incremented, >> so that the prev_pos is the exact location of the last visit. >> >> Fixes: 06c0444290cec ("mm/filemap.c: generic_file_buffered_read() now >> uses find_get_pages_contig") >> Signed-off-by: Guixin Liu >> >> --- >> Hi guys, >> When I`m running repetitive 4k read io which has same offset, >> I find that access to folio_mark_accessed is inevitable in the >> read process, the reason is that the prev_pos is assigned after the >> iocb->ki_pos is incremented, so that the prev_pos is always not equal >> to the position currently visited. >> Is this a bug that needs fixing? > I think you've misunderstood the purpose of 'prev_pos'. But this has > been the source of bugs, so let's go through it in detail. > > In general, we want to mark a folio as accessed each time we read from > it. So if we do this: > > read(fd, buf, 1024 * 1024); > > we want to mark each folio as having been accessed. > > But if we're doing lots of short reads, we don't want to mark a folio as > being accessed multiple times (if you dive into the implementation, > you'll see the first time, the 'referenced' flag is set and the second > time, the folio is moved to the active list, so it matters how often > we call mark_accessed). IOW: > > for (i = 0; i < 1024 * 1024; i++) > read(fd, buf, 1); > > should do the same amount of accessed/referenced/activation as the single > read above. > > So when we store ki_pos in prev_pos, we don't want to know "Where did > the previous read start?" We want to know "Where did the previous read > end". That's why when we test it, we check whether prev_pos - 1 is in > the same folio as the offset we're looking at: > > if (!pos_same_folio(iocb->ki_pos, ra->prev_pos - 1, > fbatch.folios[0])) > folio_mark_accessed(fbatch.folios[0]); > > I'm not super-proud of this code, and accept that it's confusing. > But I don't think the patch below is right. If you could share > your actual test and show what's going wrong, I'm interested. > > I think what you're saying is that this loop: > > for (i = 0; i < 1000; i++) > pread(fd, buf, 4096, 1024 * 1024); > > results in the folio at offset 1MB being marked as accessed more than > once. If so, then I think that's the algorithm behaving as designed. > Whether that's desirable is a different question; when I touched this > code last, I was trying to restore the previous behaviour which was > inadvertently broken. I'm not taking a position on what the right > behaviour is for such code. > My thanks for your detailed description, I am wrong about this, I test not on the newest code, My fault. The 5ccc944dce3d actually solved this problem. Best regards, Guixin Liu >> mm/filemap.c | 2 +- >> 1 file changed, 1 insertion(+), 1 deletion(-) >> >> diff --git a/mm/filemap.c b/mm/filemap.c >> index 660490c..68fd987 100644 >> --- a/mm/filemap.c >> +++ b/mm/filemap.c >> @@ -2703,8 +2703,8 @@ ssize_t filemap_read(struct kiocb *iocb, struct iov_iter *iter, >> copied = copy_folio_to_iter(folio, offset, bytes, iter); >> >> already_read += copied; >> - iocb->ki_pos += copied; >> ra->prev_pos = iocb->ki_pos; >> + iocb->ki_pos += copied; >> >> if (copied < bytes) { >> error = -EFAULT; >> -- >> 1.8.3.1 >>