From: Jan Kara <jack@suse.cz>
To: Jiayuan Chen <jiayuan.chen@linux.dev>
Cc: linux-mm@kvack.org, Jiayuan Chen <jiayuan.chen@shopee.com>,
syzbot+6880f676b265dbd42d63@syzkaller.appspotmail.com,
Theodore Ts'o <tytso@mit.edu>,
Andreas Dilger <adilger.kernel@dilger.ca>,
Konstantin Komarov <almaz.alexandrovich@paragon-software.com>,
Steven Rostedt <rostedt@goodmis.org>,
Masami Hiramatsu <mhiramat@kernel.org>,
Mathieu Desnoyers <mathieu.desnoyers@efficios.com>,
"Matthew Wilcox (Oracle)" <willy@infradead.org>,
Andrew Morton <akpm@linux-foundation.org>,
Hugh Dickins <hughd@google.com>,
Baolin Wang <baolin.wang@linux.alibaba.com>,
Jan Kara <jack@suse.cz>,
linux-ext4@vger.kernel.org, linux-kernel@vger.kernel.org,
ntfs3@lists.linux.dev, linux-trace-kernel@vger.kernel.org,
linux-fsdevel@vger.kernel.org
Subject: Re: [PATCH v1] mm: annotate data race of f_ra.prev_pos
Date: Thu, 26 Feb 2026 14:21:57 +0100
Message-ID: <2xzc3lp6ehtjwbzip4i5muh4g6oep4l72zh3j6sablfghbvbau@kh7famgorzrh>
In-Reply-To: <20260226084020.163720-1-jiayuan.chen@linux.dev>
On Thu 26-02-26 16:40:07, Jiayuan Chen wrote:
> From: Jiayuan Chen <jiayuan.chen@shopee.com>
>
> KCSAN reports a data race when concurrent readers access the same
> struct file:
>
> BUG: KCSAN: data-race in filemap_read / filemap_splice_read
>
> write to 0xffff88811a6f8228 of 8 bytes by task 10061 on cpu 0:
> filemap_splice_read+0x523/0x780 mm/filemap.c:3125
> ...
>
> write to 0xffff88811a6f8228 of 8 bytes by task 10066 on cpu 1:
> filemap_read+0x98d/0xa10 mm/filemap.c:2873
> ...
>
> Both filemap_read() and filemap_splice_read() update f_ra.prev_pos
> without synchronization. This is a benign race since prev_pos is only
> used as a hint for readahead heuristics in page_cache_sync_ra(), and a
> stale or torn value merely results in a suboptimal readahead decision,
> not a correctness issue.
>
> Use WRITE_ONCE/READ_ONCE to annotate all accesses to prev_pos across
> the tree for consistency and silence KCSAN.
>
> Reported-by: syzbot+6880f676b265dbd42d63@syzkaller.appspotmail.com
> Link: https://syzkaller.appspot.com/bug?extid=6880f676b265dbd42d63
> Signed-off-by: Jiayuan Chen <jiayuan.chen@shopee.com>
Given this, I think it would be much less intrusive and also more
explanatory to just mark prev_pos with __data_racy, together with the
appropriate reason you mention in the changelog.
Honza
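
For illustration only, a minimal userspace sketch of what the suggested
annotation could look like. It assumes the kernel's __data_racy qualifier
(from <linux/compiler_types.h>, which expands to "volatile" under KCSAN and
to nothing otherwise). The struct and helper names are hypothetical
stand-ins, not the real struct file_ra_state or any proposed patch.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Assumption: in the kernel, __data_racy is provided by
 * <linux/compiler_types.h>. The empty fallback below only lets this
 * sketch build in userspace.
 */
#ifndef __data_racy
#define __data_racy
#endif

/* Hypothetical, trimmed-down stand-in for struct file_ra_state. */
struct ra_state_sketch {
	unsigned int ra_pages;
	/*
	 * Only a hint for the readahead heuristics. Concurrent readers of
	 * the same struct file may update it without synchronization; a
	 * stale or torn value merely leads to a suboptimal readahead
	 * decision.
	 */
	__data_racy long long prev_pos;
};

/* Mirrors what file_ra_state_init() does to prev_pos in the quoted diff. */
static void ra_state_sketch_init(struct ra_state_sketch *ra,
				 unsigned int ra_pages)
{
	assert(ra != NULL);
	ra->ra_pages = ra_pages;
	ra->prev_pos = -1;	/* plain store, no WRITE_ONCE() needed here */
}

/* Call sites keep plain loads and stores; the annotation lives on the field. */
static long long ra_state_sketch_advance(struct ra_state_sketch *ra,
					  long long pos)
{
	long long old = ra->prev_pos;

	ra->prev_pos = pos;
	return old;
}
```

With the qualifier on the field, the call sites touched in the diff below
could stay unchanged, and the reason for tolerating the race would sit in
one place next to the declaration.
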
> ---
> fs/ext4/dir.c | 2 +-
> fs/ntfs3/fsntfs.c | 2 +-
> include/trace/events/readahead.h | 2 +-
> mm/filemap.c | 6 +++---
> mm/readahead.c | 4 ++--
> mm/shmem.c | 2 +-
> 6 files changed, 9 insertions(+), 9 deletions(-)
>
> diff --git a/fs/ext4/dir.c b/fs/ext4/dir.c
> index 28b2a3deb954..1ddf7acce5ca 100644
> --- a/fs/ext4/dir.c
> +++ b/fs/ext4/dir.c
> @@ -200,7 +200,7 @@ static int ext4_readdir(struct file *file, struct dir_context *ctx)
> sb->s_bdev->bd_mapping,
> &file->f_ra, file, index,
> 1 << EXT4_SB(sb)->s_min_folio_order);
> - file->f_ra.prev_pos = (loff_t)index << PAGE_SHIFT;
> + WRITE_ONCE(file->f_ra.prev_pos, (loff_t)index << PAGE_SHIFT);
> bh = ext4_bread(NULL, inode, map.m_lblk, 0);
> if (IS_ERR(bh)) {
> err = PTR_ERR(bh);
> diff --git a/fs/ntfs3/fsntfs.c b/fs/ntfs3/fsntfs.c
> index 0df2aa81d884..d1232fc03c08 100644
> --- a/fs/ntfs3/fsntfs.c
> +++ b/fs/ntfs3/fsntfs.c
> @@ -1239,7 +1239,7 @@ int ntfs_read_run_nb_ra(struct ntfs_sb_info *sbi, const struct runs_tree *run,
> if (!ra_has_index(ra, index)) {
> page_cache_sync_readahead(mapping, ra, NULL,
> index, 1);
> - ra->prev_pos = (loff_t)index << PAGE_SHIFT;
> + WRITE_ONCE(ra->prev_pos, (loff_t)index << PAGE_SHIFT);
> }
> }
>
> diff --git a/include/trace/events/readahead.h b/include/trace/events/readahead.h
> index 0997ac5eceab..63d8df6c2983 100644
> --- a/include/trace/events/readahead.h
> +++ b/include/trace/events/readahead.h
> @@ -101,7 +101,7 @@ DECLARE_EVENT_CLASS(page_cache_ra_op,
> __entry->async_size = ra->async_size;
> __entry->ra_pages = ra->ra_pages;
> __entry->mmap_miss = ra->mmap_miss;
> - __entry->prev_pos = ra->prev_pos;
> + __entry->prev_pos = READ_ONCE(ra->prev_pos);
> __entry->req_count = req_count;
> ),
>
> diff --git a/mm/filemap.c b/mm/filemap.c
> index 63f256307fdd..d3e2d4b826b9 100644
> --- a/mm/filemap.c
> +++ b/mm/filemap.c
> @@ -2771,7 +2771,7 @@ ssize_t filemap_read(struct kiocb *iocb, struct iov_iter *iter,
> int i, error = 0;
> bool writably_mapped;
> loff_t isize, end_offset;
> - loff_t last_pos = ra->prev_pos;
> + loff_t last_pos = READ_ONCE(ra->prev_pos);
>
> if (unlikely(iocb->ki_pos < 0))
> return -EINVAL;
> @@ -2870,7 +2870,7 @@ ssize_t filemap_read(struct kiocb *iocb, struct iov_iter *iter,
> } while (iov_iter_count(iter) && iocb->ki_pos < isize && !error);
>
> file_accessed(filp);
> - ra->prev_pos = last_pos;
> + WRITE_ONCE(ra->prev_pos, last_pos);
> return already_read ? already_read : error;
> }
> EXPORT_SYMBOL_GPL(filemap_read);
> @@ -3122,7 +3122,7 @@ ssize_t filemap_splice_read(struct file *in, loff_t *ppos,
> len -= n;
> total_spliced += n;
> *ppos += n;
> - in->f_ra.prev_pos = *ppos;
> + WRITE_ONCE(in->f_ra.prev_pos, *ppos);
> if (pipe_is_full(pipe))
> goto out;
> }
> diff --git a/mm/readahead.c b/mm/readahead.c
> index 7b05082c89ea..de49b35b0329 100644
> --- a/mm/readahead.c
> +++ b/mm/readahead.c
> @@ -142,7 +142,7 @@ void
> file_ra_state_init(struct file_ra_state *ra, struct address_space *mapping)
> {
> ra->ra_pages = inode_to_bdi(mapping->host)->ra_pages;
> - ra->prev_pos = -1;
> + WRITE_ONCE(ra->prev_pos, -1);
> }
> EXPORT_SYMBOL_GPL(file_ra_state_init);
>
> @@ -584,7 +584,7 @@ void page_cache_sync_ra(struct readahead_control *ractl,
> }
>
> max_pages = ractl_max_pages(ractl, req_count);
> - prev_index = (unsigned long long)ra->prev_pos >> PAGE_SHIFT;
> + prev_index = (unsigned long long)READ_ONCE(ra->prev_pos) >> PAGE_SHIFT;
> /*
> * A start of file, oversized read, or sequential cache miss:
> * trivial case: (index - prev_index) == 1
> diff --git a/mm/shmem.c b/mm/shmem.c
> index 5e7dcf5bc5d3..03569199baf4 100644
> --- a/mm/shmem.c
> +++ b/mm/shmem.c
> @@ -3642,7 +3642,7 @@ static ssize_t shmem_file_splice_read(struct file *in, loff_t *ppos,
> len -= n;
> total_spliced += n;
> *ppos += n;
> - in->f_ra.prev_pos = *ppos;
> + WRITE_ONCE(in->f_ra.prev_pos, *ppos);
> if (pipe_is_full(pipe))
> break;
>
> --
> 2.43.0
>
--
Jan Kara <jack@suse.com>
SUSE Labs, CR