linux-mm.kvack.org archive mirror
* [PATCH -RFC 0/2] mm/ext4: avoid data corruption when extending DIO write race with buffered read
@ 2023-12-02  9:14 Baokun Li
From: Baokun Li @ 2023-12-02  9:14 UTC
  To: linux-mm, linux-ext4
  Cc: tytso, adilger.kernel, jack, willy, akpm, ritesh.list,
	linux-kernel, yi.zhang, yangerkun, yukuai3, libaokun1

Hello everyone!

Recently, while running some stress tests on MySQL, I noticed that
occasionally a "corrupted data in log event" error would be reported.
After analyzing the error, I found that an extending DIO write and a
buffered read were racing, resulting in the zero-filled end of a page
being returned to the reader as file data.
Since the ext4 buffered read doesn't hold the inode lock, and there is no
field in the page to indicate the valid data size, it seems to me that
it is impossible to solve this problem perfectly without changing these
two things.

In this series, the first patch reads the inode size twice and takes the
smaller of the two values as the copyout limit, to avoid copying data that
was not actually read (0-padding) into the user buffer and causing data
corruption. This greatly reduces the probability of the problem with 4k
pages. However, the problem is still easily triggered with 64k pages.
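
For illustration only, the clamping idea can be sketched in userspace C.
This is not the code from the patch: copyout_len() and its parameters are
hypothetical names, and the two size samples stand in for reading the
inode size once before the page is looked up and once after.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical userspace model, not the kernel change itself.
 *
 * size_before: file size sampled before the page was looked up
 * size_after:  file size sampled after the page was found up to date
 * pos, want:   read position and requested byte count
 *
 * Returns how many bytes may be copied to the user buffer.
 */
static size_t copyout_len(int64_t size_before, int64_t size_after,
                          int64_t pos, size_t want)
{
    assert(pos >= 0);

    /* Use the smaller sample, so bytes past the older size are never copied. */
    int64_t limit = size_before < size_after ? size_before : size_after;

    if (pos >= limit)
        return 0;

    /* avail is positive here, so the cast to an unsigned type is safe. */
    uint64_t avail = (uint64_t)(limit - pos);

    return want < avail ? want : (size_t)avail;
}
```

With a size that grows from 4096 to 8192 between the two samples, a read
at offset 0 is limited to the first 4096 bytes, so the zero-filled tail of
the page is not copied out.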

The second patch waits for any existing DIO write to complete and to
invalidate the stale page cache before performing a new buffered read
in ext4, avoiding data corruption caused by copying the stale page cache
to the user buffer. This makes it much less likely that the problem will
be triggered with 64k pages.

Do we have a plan to add a lock to the ext4 buffered read or a field in
the page that indicates the size of the valid data in the page? Or does
anyone have a better idea?

Comments and questions are, as always, welcome.

Baokun Li (2):
  mm: avoid data corruption when extending DIO write race with buffered
    read
  ext4: avoid data corruption when extending DIO write race with
    buffered read

 fs/ext4/file.c | 3 +++
 mm/filemap.c   | 5 +++--
 2 files changed, 6 insertions(+), 2 deletions(-)

-- 
2.31.1





Thread overview: 23+ messages
2023-12-02  9:14 [PATCH -RFC 0/2] mm/ext4: avoid data corruption when extending DIO write race with buffered read Baokun Li
2023-12-02  9:14 ` [PATCH -RFC 1/2] mm: " Baokun Li
2023-12-02  9:14 ` [PATCH -RFC 2/2] ext4: " Baokun Li
2023-12-04 12:11 ` [PATCH -RFC 0/2] mm/ext4: " Jan Kara
2023-12-04 13:50   ` Baokun Li
2023-12-04 14:41     ` Jan Kara
2023-12-05 12:50       ` Baokun Li
2023-12-06 19:37         ` Jan Kara
2023-12-07  3:01           ` Baokun Li
2023-12-07 14:15           ` Baokun Li
2023-12-11 17:49             ` Jan Kara
2023-12-12  2:15               ` Baokun Li
2023-12-12  4:36           ` Matthew Wilcox
2023-12-12 14:25             ` Jan Kara
2023-12-05  4:17     ` Theodore Ts'o
2023-12-05 13:19       ` Baokun Li
2023-12-06 21:55         ` Theodore Ts'o
2023-12-07  6:41           ` Baokun Li
2023-12-06  8:35     ` Dave Chinner
2023-12-06  9:02       ` Christoph Hellwig
2023-12-06 10:34         ` Dave Chinner
2023-12-06 12:20           ` Christoph Hellwig
2023-12-06 11:57       ` Baokun Li
